Linux Traffic Control
Linux Traffic Control Essentials
TCNG Overview
Study of a Token Bucket Scenario
Papadimitriou Panagiotis
17/06/2004
Components of Linux Traffic Control
The basic components of the Linux QoS
architecture are:
• Queuing Disciplines
• Classes
• Filters
Queuing Disciplines
Queuing Disciplines (qdiscs) have:
• an enqueue function, called whenever the network layer of the operating system wants to transmit a packet, and
• a dequeue function, called when the device is able to transmit the next packet.
The available qdiscs can be divided into two groups:
• The simple qdiscs which have no inner structure, known as queues. These can be used to shape traffic for an entire interface, without any subdivisions.
• The qdiscs which have classes, known as schedulers. These are very useful when there are different kinds of traffic which should have differing treatment.
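To make the enqueue/dequeue pair concrete, here is a minimal sketch of a classless queue in Python. It is only an illustration of the two entry points described above, not kernel code; the class name FifoQdisc and the packet-count limit are assumptions made for the example.

```python
from collections import deque


class FifoQdisc:
    """Toy classless queue exposing the two entry points of a qdisc."""

    def __init__(self, limit_packets):
        self.limit_packets = limit_packets  # hypothetical capacity, in packets
        self.queue = deque()

    def enqueue(self, packet):
        """Called when the network layer hands over a packet; False means dropped."""
        if len(self.queue) >= self.limit_packets:
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        """Called when the device can transmit; returns the next packet or None."""
        return self.queue.popleft() if self.queue else None
```

A scheduler would keep one such queue per class and decide in dequeue which class to serve next.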
Path of a Data Socket through
the Linux Network Stack
Queues
The first group of queuing disciplines includes:
• pfifo_fast: a 3-band priority FIFO queue (default)
• sfq: a stochastic fair queuing discipline
• tbf: a Token Bucket Filter queue
• red: implements the Random Early Detection (RED) behavior
• gred: a generalized RED implementation used for DiffServ support
• ingress: a queue used for policing ingress traffic
Schedulers
The second group of queuing disciplines includes:
• cbq: implementation of the class based queuing link-sharing scheme
• atm: a special qdisc which supports the re-direction of flows to ATM virtual channels
• csz: a Clark-Shenker-Zhang scheduling discipline
• dsmark: qdisc for DiffServ support (uses DSCP)
• wrr: a Weighted Round Robin scheduler
Sample qdisc which has inner
classes, filters and qdiscs
Sample Scenario for Traffic Control
• A small company has a 10 Mbit/s link which connects its workstations and one FTP server to an Internet service provider.
• Since bandwidth is a scarce resource, the company wants to limit the share of the FTP traffic to 20%. At times when FTP needs less bandwidth, the rest should be available for the workstations.
• On the other hand, FTP traffic must never exceed its 20% share, even if the rest of the bandwidth is currently unused, because the company's ISP charges extra for any bandwidth consumed above a rate of 2 Mbit/s.
In order to solve this problem, a Linux router is installed at the edge of the corporate network.
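The policy in this scenario can be written down as a small calculation: FTP is capped at 20% of the 10 Mbit/s link (2 Mbit/s) and may not borrow, while the workstations may use whatever FTP leaves free. The sketch below only illustrates that rule; the function name allocate_kbit and the demand parameters are made up for the example and are not part of any tc configuration.

```python
def allocate_kbit(ftp_demand, workstation_demand, link=10_000, ftp_percent=20):
    """Split a link (all values in kbit/s) between FTP and workstation traffic.

    FTP is bounded by its share and never borrows; the workstations can
    use any bandwidth that FTP leaves unused.
    """
    ftp_cap = link * ftp_percent // 100          # 2000 kbit/s for the defaults
    ftp = min(ftp_demand, ftp_cap)
    workstations = min(workstation_demand, link - ftp)
    return ftp, workstations
```

For example, with FTP asking for 5 Mbit/s and the workstations for 3 Mbit/s, FTP still gets only 2 Mbit/s even though 5 Mbit/s of the link stay idle.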
Traffic Control Configuration
The first Ethernet interface of the Linux router (eth0) is connected to the ISP, the second interface (eth1) is the link towards the internal network. Since it is only possible to limit outgoing traffic, the setup consists of two parts:
• the CBQ configuration limiting the outgoing traffic on eth1 (the “downstream” traffic from the internal network’s point of view), and
• a second part limiting outgoing traffic on eth0 (the “upstream”).
Introduction to TCNG
The Traffic Control Next Generation (TCNG) project focuses on:
• providing a compact and user-friendly configuration language, in which traffic control systems can be expressed in an intuitive way
• supporting hardware accelerators in traffic control
TCNG consists of two major components:
• the Traffic Control Compiler (TCC)
• the Traffic Control Simulator (TCSIM)
Traffic Control Compiler
• The TCNG language is closely modelled after common programming languages, such as C, Perl or Java.
• Consequently, the learning effort is reduced for anyone who is familiar with one of these languages.
• The Traffic Control Compiler translates configuration scripts from the TCNG language into a multitude of output formats used to configure traffic control subsystems.
TCC in Operation
Traffic Control Compiler:
• gets its input from a script program
• invokes the appropriate input parser to translate the configuration data into a common internal data structure
• invokes one or more output generators (named “targets”) to issue commands to the corresponding output processor(s)
Finally, output processors translate the output from tcc into actions understood by lower-level components.
TCC Internal Structure & Interface
Traffic Control Simulator
Traffic Control Simulator is used to simulate the behavior of Linux Traffic Control at a very high level of detail.
Traffic Control Simulator has been developed mainly for the following purposes:
• validation of configurations generated by tcc
• development of configuration scripts
• testing of traffic control components
TCSIM in Operation (1)
Traffic Control Simulator:
• directly supports configuration using the standard traffic control language (tc), and
• supports the new TCNG language by automatically invoking TCC and integrating its output
Furthermore, Traffic Control Simulator:
• combines the original traffic control code from the Linux kernel with the user-space code of the configuration utility tc, and
• adds the framework for communication among them, plus an event-driven simulation engine
TCSIM in Operation (2)
• The resulting program runs entirely in user space, but executes almost exactly the same code as a “real system”, approximating the behavior of traffic control in a Linux system much more accurately than a more general simulator (e.g. NS-2) would.
Traffic Control Simulator:
• processes a script defining the system configuration and the data to send, and
• generates a message trace, which can then be processed to obtain statistics or graphs
TCSIM Internals and Helper Programs
TCNG Example: Steps 1-2
Step 1: We write the following TCNG code in the file: example.tc
dev eth0 {
    egress {
        drop if tcp_sport != PORT_HTTP;
    }
}
Step 2: We run tcc to convert the TCNG configuration to tc
commands. We save the output in the file: example.sh
tcc example.tc > example.sh
TCNG Example: After Step 2
After Step 2 the file example.sh contains the following tc configuration:
tc qdisc add dev eth0 handle 1:0 root dsmark indices 1 default_index 0
tc filter add dev eth0 parent 1:0 protocol all prio 1 handle 1:0:0 u32 divisor 1
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 1:0:0
tc filter add dev eth0 parent 1:0 protocol all prio 1 handle 1:0:1 u32 ht 1:0:0 match u16 0x50 0xffff at 0 classid 1:0
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 classid 1:0 police index 1 rate 1bps burst 1 action drop/drop
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0x0 0x0 at 0 classid 1:0
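The u32 filters generated by tcc are hard to read, so the following Python sketch mimics what the chain appears to do to a raw IPv4 packet. It is an interpretation of the commands above rather than a reimplementation of the kernel classifier, and the function name would_drop is made up for the example.

```python
import struct


def would_drop(ip_packet: bytes) -> bool:
    """Mimic the filter chain: drop TCP packets whose source port is not 80."""
    # "match u8 0x6 0xff at 9": the IP protocol byte must equal 6 (TCP).
    if ip_packet[9] & 0xFF != 0x06:
        # Non-TCP traffic only hits the catch-all "match u32 0x0 0x0" rule.
        return False
    # "offset at 0 mask 0f00 shift 6": take the 16-bit word at byte 0, keep
    # the header-length nibble and shift it into a byte count (0x0500 >> 6 = 20).
    (first_word,) = struct.unpack_from("!H", ip_packet, 0)
    tcp_start = (first_word & 0x0F00) >> 6
    # "match u16 0x50 0xffff at 0", relative to the TCP header: source port 80.
    (sport,) = struct.unpack_from("!H", ip_packet, tcp_start)
    # TCP packets that did not match port 80 reach the policing rule and are dropped.
    return sport != 0x50
```

The shift by 6 instead of 8 is what multiplies the header length field (counted in 32-bit words) by four.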
TCNG Example: Step 3
Step 3: We define a simulation scenario in the file: example.tcsim
with one interface called eth0, running at 100 Mbps. The
simulation scenario consists of sending two packets.
#include "packet.def"
#include "ports.tc"
dev eth0 100 Mbps {
#include "example.tc"
}
send TCP_PCK($tcp_sport = PORT_HTTP);
send TCP_PCK($tcp_sport = PORT_SSH);
end
TCNG Example: Step 4
Step 4: We run the simulation with tcsim:
tcsim -s 22 example.tcsim
The output looks like this:
0.000000 E : 0x80bd560 40 : eth0: 45000028 00000000 40060000 0a000001 0a000002 0050 ...
0.000000 D : 0x80bd560 40 : eth0: 45000028 00000000 40060000 0a000001 0a000002 0050 ...
0.000000 E : 0x80bd870 40 : eth0: 45000028 00000000 40060000 0a000001 0a000002 0016 ...
0.000000 * : 0x80bd870 40 : eth0: enqueue returns POLICED (3)
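The hex words in each trace record are the raw packet, starting with the IPv4 header. As a reading aid, this small Python sketch decodes the fields that matter for the example; the function name decode_ipv4_tcp is made up, and it assumes the record holds an IPv4 packet followed by at least the first two bytes of a TCP header.

```python
import ipaddress
import struct


def decode_ipv4_tcp(hex_words: str) -> dict:
    """Decode the leading hex words of a trace record (IPv4 + TCP source port)."""
    data = bytes.fromhex(hex_words.replace(" ", ""))
    version_ihl, _tos, total_length = struct.unpack_from("!BBH", data, 0)
    header_len = (version_ihl & 0x0F) * 4        # IHL is counted in 32-bit words
    (sport,) = struct.unpack_from("!H", data, header_len)
    return {
        "total_length": total_length,
        "protocol": data[9],                     # 6 means TCP
        "src": str(ipaddress.IPv4Address(data[12:16])),
        "dst": str(ipaddress.IPv4Address(data[16:20])),
        "sport": sport,
    }
```

Applied to the first record, this gives a 40-byte TCP packet from 10.0.0.1 to 10.0.0.2 with source port 80 (0x0050); the rejected packet carries source port 22 (0x0016).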
TCNG Example: Steps 5-6
Step 5: We verify that the configuration did indeed work: The first
packet was enqueued (“E”), and then dequeued (“D”).
When trying to enqueue the second packet, it is rejected.
Step 6: We can try this example on a live system. We execute the
tc commands to create the configuration in the kernel:
sh example.sh
A more comprehensive TCNG example (1)
This example illustrates most of the elements found in a typical TCNG configuration:
dev "eth0" {
    egress {
        class (<$high>) if tcp_dport == PORT_HTTP;
        class (<$low>) if 1;
        prio {
            $high = class (1) {
                fifo (limit 20kB);
            }
            $low = class (2) {
                fifo (limit 100kB);
            }
        }
    }
}
A more comprehensive TCNG example (2)
The dev and egress lines
dev "eth0" {
    egress {
determine what is being configured, i.e. the egress (outbound) side of the network interface eth0.
The configuration consists of two parts:
• the classification:
class (<$high>) if tcp_dport == PORT_HTTP;
class (<$low>) if 1;
• the setup of the queuing system:
prio {
    $high = class (1) {
        fifo (limit 20kB);
    }
    $low = class (2) {
        fifo (limit 100kB);
    }
}
In this example, we use a priority scheduler with two classes for the priorities “high” and “low”.
A more comprehensive TCNG example (3)
In this configuration, packets:
• with TCP destination port 80 (HTTP) are sent to the high priority class,
• while all other packets (if 1;) are sent to the low priority class.
The queuing part defines the queuing discipline for static priorities, with the two classes:
• Inside the high priority class, there is another queuing discipline: a simple FIFO with a capacity of 20 KB.
• Likewise, the low priority class contains a FIFO with 100 KB.
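The behavior described above can be sketched in a few lines of Python: classification by destination port, two byte-limited FIFOs, and strict priority on dequeue. This is a toy model for illustration, not what tcc generates; the class names and the packet representation (a dict with "tcp_dport" and "len") are assumptions.

```python
from collections import deque

PORT_HTTP = 80


class ByteFifo:
    """FIFO whose capacity is counted in bytes."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used = 0
        self.queue = deque()

    def enqueue(self, pkt):
        if self.used + pkt["len"] > self.limit_bytes:
            return False                      # queue full: packet is dropped
        self.queue.append(pkt)
        self.used += pkt["len"]
        return True

    def dequeue(self):
        if not self.queue:
            return None
        pkt = self.queue.popleft()
        self.used -= pkt["len"]
        return pkt


class TwoClassPrio:
    """Static priorities: the low class is served only when high is empty."""

    def __init__(self):
        self.high = ByteFifo(20 * 1024)       # fifo (limit 20kB)
        self.low = ByteFifo(100 * 1024)       # fifo (limit 100kB)

    def enqueue(self, pkt):
        band = self.high if pkt.get("tcp_dport") == PORT_HTTP else self.low
        return band.enqueue(pkt)

    def dequeue(self):
        pkt = self.high.dequeue()
        return pkt if pkt is not None else self.low.dequeue()
```

With this model an HTTP packet enqueued after an SSH packet still leaves first, which is the point of the static priority scheduler.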
A more comprehensive TCNG example (4)
The compilation of this TCNG code results in the following tc configuration:
tc qdisc add dev eth0 handle 1:0 root dsmark indices 4 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 prio
tc qdisc add dev eth0 handle 3:0 parent 2:1 bfifo limit 20480
tc qdisc add dev eth0 handle 4:0 parent 2:2 bfifo limit 102400
tc filter add dev eth0 parent 2:0 protocol all prio 1 tcindex mask 0x3 shift 0
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 2 tcindex classid 2:2
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 1 tcindex classid 2:1
tc filter add dev eth0 parent 1:0 protocol all prio 1 handle 1:0:0 u32 divisor 1
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 1:0:0
tc filter add dev eth0 parent 1:0 protocol all prio 1 handle 1:0:1 u32 ht 1:0:0 match u16 0x50 0xffff at 2 classid 1:1
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0x0 0x0 at 0 classid 1:2
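The generated configuration classifies in two stages: the u32 filters on the dsmark qdisc (1:0) pick class 1:1 or 1:2, and the tcindex filters on the prio qdisc (2:0) then map that choice to 2:1 or 2:2. The Python sketch below shows this mapping as it can be read from the commands; it assumes that the minor number of the class chosen in the first stage is what the tcindex filter sees, and the function names are made up for the example.

```python
TCINDEX_MASK = 0x3      # "tcindex mask 0x3 shift 0"
TCINDEX_SHIFT = 0
TCINDEX_HANDLES = {1: "2:1", 2: "2:2"}   # "handle 1 ... classid 2:1", "handle 2 ... classid 2:2"


def u32_stage(is_tcp, tcp_dport):
    """First stage (filters on 1:0): return the minor number of the chosen class."""
    # "match u16 0x50 0xffff at 2" checks the TCP destination port for 80.
    return 1 if is_tcp and tcp_dport == 0x50 else 2


def tcindex_stage(tc_index):
    """Second stage (filters on 2:0): map the index to a prio class."""
    key = (tc_index & TCINDEX_MASK) >> TCINDEX_SHIFT
    return TCINDEX_HANDLES.get(key)
```

HTTP traffic therefore ends up in 2:1 (the 20480-byte bfifo) and everything else in 2:2 (the 102400-byte bfifo).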
Simulation Output
By default, TCSIM prints a message:
• whenever a packet is enqueued or dequeued, or
• when some exceptional condition (e.g. an error) occurs.
This output can be post-processed:
• to extract statistical data, or
• to generate a graphical representation of traffic characteristics.
TCSIM can also provide more detailed information on the inner workings of the traffic control subsystem, which is useful:
• for testing configurations, and
• for the development of new traffic control elements.
Pretty-printing Traces (1)
The script tcsim_pretty can be used to format traces in a more
human-readable way.
Running the simulation script example.tcsim with the command syntax:
tcsim example.tcsim
produces the following output:
0.000000 E : 0x93a87c8 40 : eth0: 45000028 00000000 40060000
0a000001 0a000002 00500000 00000000 00000000 50000000 00000000
0.000000 D : 0x93a87c8 40 : eth0: 45000028 00000000 40060000
0a000001 0a000002 00500000 00000000 00000000 50000000 00000000
0.000000 E : 0x93a88c0 40 : eth0: 45000028 00000000 40060000
0a000001 0a000002 00160000 00000000 00000000 50000000 00000000
0.000000 * : 0x93a88c0 40 : eth0: enqueue returns POLICED (3)
Pretty-printing Traces (2)
Running the same simulation script with the following command syntax:
tcsim example.tcsim | tcsim_pretty
produces a more readable output:
----- 0.000000 -----------------------------------------------------------------------
0x9d207c8 E 40: eth0: 45000028 00000000 40060000 0a000001 0a000002
+ 00500000 00000000 00000000 50000000 00000000
= D 40: eth0: 45000028 00000000 40060000 0a000001 0a000002
+ 00500000 00000000 00000000 50000000 00000000
0x9d208c0 E 40: eth0: 45000028 00000000 40060000 0a000001 0a000002
+ 00160000 00000000 00000000 50000000 00000000
= * eth0: enqueue returns POLICED (3)
Output Filtering (1)
Enqueue and dequeue records can be selected in trace output with the tcsim_filter script.
Additional filtering is supported, according to a selection of fields.
The following fields are recognized:
• tos: TOS byte
• len: Total length field
• src: Source IP address
• dst: Destination IP address
• sport: Source port (TCP or UDP)
• dport: Destination port (TCP or UDP)
• dev: Device name (e.g. eth0)
Output Filtering (2)
When printing records, each line contains:
• the time
• the ID string
• the packet length in bytes
The tcsim_filter script supports counting the results instead of printing data points on standard output.
In this case, the records with the same ID string are counted.
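To show what the counting mode boils down to, here is a small Python stand-in that counts enqueue and dequeue records in a raw trace. It is not the tcsim_filter script itself; it assumes the record layout seen in the example traces (time, event letter, ":", packet ID, length, ...) and ignores everything else.

```python
from collections import Counter


def count_events(trace_lines):
    """Count enqueue (E) and dequeue (D) records in raw tcsim trace lines."""
    counts = Counter()
    for line in trace_lines:
        fields = line.split()
        # Records look like: "<time> <event> : <id> <len> : <dev>: <hex words>".
        # Wrapped hex continuation lines and "*" messages are skipped.
        if len(fields) >= 2 and fields[1] in ("E", "D"):
            counts[fields[1]] += 1
    return counts
```

Fed with the four records of the example.tcsim run, it reports two enqueue attempts and one dequeue.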
Examples of Output Filtering
Running the simulation script dsmark+policing with the command syntax:
tcsim dsmark+policing | tcsim_filter -c tos
produces the following output:
D:00 201
D:b8 139
E:00 201
E:01 201
Likewise,
tcsim dsmark+policing | tcsim_filter -c tos=0xb8
produces the output: D 139
Graphical Output
Filtered output can be further processed with the script tcsim_plot, which uses gnuplot to generate plots.
The following plot types are available:
• rate: Bit rate (based on the inter-arrival time)
• iat: Packet inter-arrival time
• cumul: Cumulative amount of data
• delay: Queuing delay, measured at dequeue time
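The iat, rate and cumul quantities can be derived from (time, length) pairs as sketched below in Python. The exact formulas used by tcsim_plot are not given here, so treat this as one plausible reading: the rate of a packet is taken as its size in bits divided by the time since the previous packet. Timestamps are in microseconds to keep the arithmetic exact, and the function name is made up.

```python
def plot_series(records):
    """Derive plot data from (time_us, length_bytes) pairs sorted by time.

    Returns (iat_us, rate_bps, cumul_bytes) lists. The first record has no
    predecessor, so iat and rate start with the second record.
    """
    iat_us, rate_bps, cumul_bytes = [], [], []
    total = 0
    prev_time = None
    for time_us, length in records:
        total += length
        cumul_bytes.append(total)
        if prev_time is not None and time_us > prev_time:
            gap = time_us - prev_time
            iat_us.append(gap)
            rate_bps.append(length * 8 * 1_000_000 / gap)
        prev_time = time_us
    return iat_us, rate_bps, cumul_bytes
```

Three 100-byte packets spaced 500 microseconds apart give an inter-arrival time of 500 and a rate of 1.6 Mbit/s for the second and third packet.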
Token Bucket Scenario
#define RATE 1Mbit
#define BURST 3kB
#define LIMIT 20kB
#define NOTHING
/* 100-sizeof(iphdr) = 80 bytes. */
#define PACKET IP_PCK(NOTHING) 0 x 80
dev eth0 10000 /* 10 Mbps */
tc qdisc add dev eth0 root handle 1:0 tbf limit LIMIT rate RATE burst BURST
every 0.0005s send PACKET /* 1.6 Mbps */
time 1s
end
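The 1.6 Mbps figure in the script follows directly from the packet size and the sending interval: one 100-byte packet (20-byte IP header plus 80 payload bytes) every 0.0005 s. A short Python check, using integer microseconds to avoid rounding; the variable names are only for this example.

```python
PACKET_BYTES = 100        # 20-byte IP header + 80 payload bytes
INTERVAL_US = 500         # "every 0.0005s"
TBF_RATE_BPS = 1_000_000  # RATE 1Mbit, read here as 10^6 bit/s (an assumption)

offered_bps = PACKET_BYTES * 8 * 1_000_000 // INTERVAL_US
excess_bps = offered_bps - TBF_RATE_BPS

print(offered_bps)  # 1600000
print(excess_bps)   # 600000
```

The source therefore offers noticeably more than the token bucket rate, so once the initial burst is used up the queue fills and packets are eventually dropped.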
Packet Losses
             Rate       Burst   Limit
Scenario 1   1 Mbps     3KB     20KB
Scenario 2   1 Mbps     3KB     10KB
Scenario 3   1 Mbps     5KB     20KB
Scenario 4   1.1 Mbps   3KB     20KB
Running the simulation script tbf with the command syntax:
tcsim tbf | tcsim_filter -c
which counts the packets enqueued and dequeued, produces the following outputs for each scenario:
             D      E
Scenario 1   1590   2002
Scenario 2   1488   2002
Scenario 3   1612   2002
Scenario 4   1727   2002
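To see why the four scenarios rank the way they do, the following Python sketch runs an idealized token bucket against the same constant bit-rate source. It is a simplified model, not the kernel's tbf: it assumes 1 Mbit = 10^6 bit/s and 1 kB = 1024 bytes, serves the queue only at packet arrival instants, and ignores implementation details of the real qdisc, so it is not expected to reproduce the exact counts reported by tcsim. All names are made up for the example.

```python
def simulate_tbf(rate_bps, burst_bytes, limit_bytes,
                 pkt_bytes=100, interval_us=500, duration_us=1_000_000):
    """Idealized token bucket fed by a constant bit-rate source.

    Tokens are stored as bytes * 10^6 so that all arithmetic stays in integers.
    """
    scale = 1_000_000
    rate_bytes_per_s = rate_bps // 8
    cost = pkt_bytes * scale          # tokens needed to send one packet
    cap = burst_bytes * scale         # bucket depth
    tokens = cap                      # the bucket starts full
    queued = accepted = dropped = 0
    first_drop_us = None

    def drain():
        nonlocal tokens, queued
        while queued and tokens >= cost:
            tokens -= cost
            queued -= 1

    for t in range(0, duration_us + 1, interval_us):
        if t:
            tokens = min(cap, tokens + rate_bytes_per_s * interval_us)
        drain()                       # send what the new tokens allow
        if (queued + 1) * pkt_bytes > limit_bytes:
            dropped += 1              # queue limit reached
            if first_drop_us is None:
                first_drop_us = t
        else:
            queued += 1
            accepted += 1
        drain()                       # an idle bucket forwards immediately
    return {"accepted": accepted, "dropped": dropped, "first_drop_us": first_drop_us}


KB = 1024  # assumption: 1 kB = 1024 bytes
SCENARIOS = {
    1: (1_000_000, 3 * KB, 20 * KB),
    2: (1_000_000, 3 * KB, 10 * KB),
    3: (1_000_000, 5 * KB, 20 * KB),
    4: (1_100_000, 3 * KB, 20 * KB),
}
RESULTS = {n: simulate_tbf(*params) for n, params in SCENARIOS.items()}
```

Even in this simplified model the qualitative picture matches the table: halving the limit (scenario 2) loses the most packets, a larger burst (scenario 3) helps a little, and a higher rate (scenario 4) helps the most.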
Time of 1st Packet Loss (1)
Scenario 1
0.356500 E : 0x9ecf858 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.357000 E : 0x9ed3c58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.357000 * : 0x9ed3c58 100 : eth0: enqueue returns DROP (1)
0.357500 E : 0x9ed3c58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.357500 * : 0x9ed3c58 100 : eth0: enqueue returns DROP (1)
Scenario 2
0.189500 E : 0x82b0ee8 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.190000 E : 0x82b0ff8 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.190000 * : 0x82b0ff8 100 : eth0: enqueue returns DROP (1)
0.190000 D : 0x82aece8 100 : eth0: 45000064 00000000 40000000 0a0000 ...
Time of 1st Packet Loss (2)
Scenario 3
0.388000 E : 0x96dbb48 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.388500 E : 0x96dbc58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.388500 * : 0x96dbc58 100 : eth0: enqueue returns DROP (1)
0.389000 E : 0x96dbc58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.389000 * : 0x96dbc58 100 : eth0: enqueue returns DROP (1)
Scenario 4
0.448500 E : 0x8583b48 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.449000 E : 0x8583c58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.449000 * : 0x8583c58 100 : eth0: enqueue returns DROP (1)
0.449500 E : 0x8583c58 100 : eth0: 45000064 00000000 40000000 0a0000 ...
0.449500 * : 0x8583c58 100 : eth0: enqueue returns DROP (1)
Cumulative Amount of Data for Scenarios 1-4 (four plots, one per scenario)
Queuing Delay for Scenarios 1-4 (four plots, one per scenario)
TCSIM Restrictions & Extensions
• TCSIM only includes a small part of the network stack, and does not support full routing or firewalling. Therefore, the route classifier is not available in tcsim, and the usability of the fw classifier is limited.
• tc bugs may crash TCSIM.
• TCSIM supports only the simulation of constant bit-rate flows (using the every keyword) and the sending of single packets at a specified point in time.
• In order to support the simulation of Poisson distributed and bursty flows, a simple tool, the TCSIM Traffic Generator (TrafGen) was developed, which creates trace files to be used in a simulation.
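As an illustration of what a Poisson distributed flow means in practice, the Python sketch below draws packet send times with exponentially distributed gaps. It is not TrafGen and does not produce TrafGen's trace file format (which is not described here); the function name and parameters are assumptions made for the example.

```python
import random


def poisson_arrival_times(rate_pps, duration_s, seed=0):
    """Return send times (seconds) of a Poisson flow with the given mean rate.

    In a Poisson process the gaps between packets are exponentially
    distributed with mean 1 / rate_pps.
    """
    rng = random.Random(seed)      # fixed seed so that runs are repeatable
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_pps)
        if t >= duration_s:
            return times
        times.append(t)
```

Unlike the every keyword, which spaces packets evenly, such a flow has the same average rate but produces short bursts and gaps that exercise the burst and limit parameters of a token bucket differently.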
References
• TCNG Home Page, URL: http://tcng.sourceforge.net
• Linux Advanced Routing & Traffic Control, URL: http://lartc.org/
• L. Wischhof and J. W. Lockwood, “Packet Scheduling for Link-Sharing and Quality of Service Support in Wireless Local Area Networks”, November 2001
• Linux IP, URL: http://linux-ip.net/
• Practical QoS, URL: http://www.opalsoft.net/qos/index.html