3rd Edition: Chapter 2
Week 11
TCP Congestion Control
Principles of Congestion Control
Congestion:
informally: “too many sources sending too much data too fast for the network to handle”
different from flow control!
manifestations:
lost packets (buffer overflow at routers)
long delays (queueing in router buffers)
a top-10 problem!
Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
[Figure: Host A and Host B send original data at rate λ_in into one router with unlimited shared output link buffers; λ_out is the delivered throughput]
large delays when congested
maximum achievable throughput
Causes/costs of congestion: scenario 2
one router, finite buffers
sender retransmission of lost packets
[Figure: Host A and Host B send λ_in (original data); λ'_in (original data plus retransmitted data) enters the router's finite shared output link buffers; λ_out is the delivered throughput]
Causes/costs of congestion: scenario 2
always: λ_in = λ_out (goodput)
“perfect” retransmission only when loss: λ'_in > λ_out
retransmission of a delayed (not lost) packet makes λ'_in larger (than the perfect case) for the same λ_out
[Figure: three plots of λ_out versus λ_in, axes running to R/2: (a) λ_out = λ_in; (b) goodput levels off near R/3; (c) goodput levels off near R/4]
“costs” of congestion:
more work (retransmissions) for a given “goodput”
unneeded retransmissions: link carries multiple copies of a packet
Causes/costs of congestion: scenario 3
Q: what happens as λ_in and λ'_in increase?
four senders
multihop paths
timeout/retransmit
[Figure: Host A and Host B among four senders; λ_in (original data) and λ'_in (original data plus retransmitted data) cross finite shared output link buffers; λ_out is the delivered throughput]
Example
What happens when each demand peaks at the unity rate? Throughput T = 1.52? (How?) And at twice the unity rate, T = 1.07?
Max-min fair allocation
Given a network and a set of sessions, we would like to find a maximal flow that is fair.
We will see different definitions of max-min fairness and will learn a flow control algorithm.
The tutorial will give an understanding of what max-min fairness is.
How to define fairness?
Any session is entitled to as much network use as any other.
Allocate the same share to all.
Max-Min Flow Control Rule
The rule: maximize the network use allocated to the sessions with the minimum allocation.
An alternative definition: maximize the allocation of each session i under the constraint that an increase in i's allocation does not cause a decrease in the allocation of some other session with the same or a smaller rate than i.
Example
[Figure: sessions 0, 1, 2, and 3 over two links, each with capacity C=1]
The max-min fair flow division gives sessions 0, 1, and 2 a rate of 1/3 each, and session 3 a rate of 2/3.
Notation
G = (N, A) – directed network graph (N is the set of vertices, A is the set of links)
C_a – the capacity of link a
F_a – the flow on link a
P – the set of sessions
r_p – the rate of session p
We assume a fixed, single-path routing method.
Definitions
F_a = Σ_{p crossing a} r_p (the flow on link a is the sum of the rates of all sessions p crossing it)
We have the following constraints on the vector r = {r_p | p ∈ P}:
r_p ≥ 0, for all p ∈ P
F_a ≤ C_a, for all a ∈ A
A vector r satisfying these constraints is said to be feasible.
Definitions
A vector of rates r is said to be max-min fair if it is feasible and, for each p ∈ P, r_p cannot be increased while maintaining feasibility without decreasing r_p' for some session p' for which r_p' ≤ r_p.
We want to find a rate vector that is max-min fair.
Bottleneck Link for a Session
Given some feasible flow r, we say that link a is a bottleneck link with respect to r for a session p crossing a if F_a = C_a and r_p ≥ r_p' for all sessions p' crossing link a.
[Figure: network with links a, b, c, d and five sessions, labeled session:rate (1:2/3, 2:1/3, 3:1/3, 4:1, 5:1/3)]
All link capacities are 1. The bottlenecks for sessions 1, 2, 3, 4, 5 are c, a, a, d, a, respectively.
Note: c is not a bottleneck for 5, and b is not a bottleneck for 1.
Max-Min Fairness Definition Using Bottleneck
Theorem: A feasible rate vector r is max-min fair if and only if each session has a bottleneck link with respect to r.
Algorithm for Computing Max-Min Fair Rate Vectors
The idea of the algorithm: bring all sessions to a state in which each has a bottleneck link; by the theorem, the result is then the max-min fair flow.
We start with an all-zero rate vector and increase the rates on all paths together until F_a = C_a for one or more links a.
At this point, each session using a saturated link has the same rate as every other session using this link. Thus, these saturated links serve as bottleneck links for all sessions using them.
Algorithm for Computing Max-Min Fair Rate Vectors
At the next step, all sessions not using the saturated links are incremented equally in rate until one or more new links become saturated.
Note that the sessions using the previously saturated links might also be using these newly saturated links (at a lower rate).
The algorithm continues from step to step, always equally incrementing all sessions not passing through any saturated link, until all sessions pass through at least one such link.
Algorithm for Computing Max-Min Fair Rate Vectors
Init: k=1, F_a^0 = 0, r_p^0 = 0, P^1 = P, A^1 = A
1. For all a ∈ A^k: n_a^k := number of sessions p ∈ P^k crossing link a
2. Δr := min_{a ∈ A^k} (C_a − F_a^{k−1}) / n_a^k (find increment size)
3. For all p ∈ P^k: r_p^k := r_p^{k−1} + Δr (increment); for all other p: r_p^k := r_p^{k−1}
4. F_a^k := Σ_{p crossing a} r_p^k (update flow)
5. A^{k+1} := the set of unsaturated links
6. P^{k+1} := all p that cross only links in A^{k+1}
7. k := k+1
8. If P^k is empty then stop, else go to 1
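The following Python sketch implements this algorithm under the slides' assumptions (fixed single-path routing, known link capacities). The function and variable names and the saturation tolerance are illustrative choices; the example topology is a hypothesized one that reproduces the rates and bottlenecks stated for the figure.

```python
def compute_max_min_fair(capacity, paths):
    """capacity: {link: C_a}; paths: {session: set of links it crosses}.
    Returns {session: max-min fair rate r_p}."""
    rate = {p: 0.0 for p in paths}                # r_p^0 = 0
    flow = {a: 0.0 for a in capacity}             # F_a^0 = 0
    active = set(paths)                           # P^k: sessions not yet bottlenecked
    unsat = set(capacity)                         # A^k: unsaturated links
    while active:
        # step 1: n_a^k, number of active sessions crossing each unsaturated link
        n = {a: sum(1 for p in active if a in paths[p]) for a in unsat}
        # step 2: smallest equal increment that saturates some link
        dr = min((capacity[a] - flow[a]) / n[a] for a in unsat if n[a] > 0)
        for p in active:                          # step 3: equal increment
            rate[p] += dr
        for a in capacity:                        # step 4: update link flows
            flow[a] = sum(rate[p] for p in paths if a in paths[p])
        # steps 5-6: drop saturated links and the sessions crossing them
        unsat = {a for a in unsat if capacity[a] - flow[a] > 1e-9}
        active = {p for p in active if paths[p] <= unsat}
    return rate

# Example consistent with the five-session figure (all capacities 1):
caps = {'a': 1.0, 'b': 1.0, 'c': 1.0, 'd': 1.0}
paths = {1: {'b', 'c'}, 2: {'a'}, 3: {'a'}, 4: {'d'}, 5: {'a', 'c'}}
print(compute_max_min_fair(caps, paths))  # 1: 2/3, 2: 1/3, 3: 1/3, 4: 1, 5: 1/3
```

Running it reproduces the three saturation steps traced on the next slide: link a first, then c, then d.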
Example of Algorithm Running
[Figure: the same network as before; links a, b, c, d, all of capacity 1, with sessions labeled session:rate (1:2/3, 2:1/3, 3:1/3, 4:1, 5:1/3)]
Step 1: All sessions get a rate of 1/3, because of link a, and link a is saturated.
Step 2: Sessions 1 and 4 get an additional rate increment of 1/3, for a total of 2/3. Link c is now saturated.
Step 3: Session 4 gets an additional rate increment of 1/3, for a total of 1. Link d is saturated.
End
Example revisited
Max-min fair vector if T_ij = ∞:
r = (½, ½, ½, ½)
T = 2 > 1.52
What if the demands are T13 = T31 = ¼, T24 = ½, T42 = ∞?
r = (¼, ½, ¼, ¾)
Causes/costs of congestion: scenario 3
[Figure: multihop topology from scenario 3; Host A's delivered throughput λ_out depends on capacity consumed upstream by Host B's traffic]
Another “cost” of congestion:
when a packet is dropped, any upstream transmission capacity used for that packet was wasted!
Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
no explicit feedback from network
congestion inferred from end-system observed loss, delay
approach taken by TCP
Network-assisted congestion control:
routers provide feedback to end systems
single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
explicit rate at which the sender should send
Case study: ATM ABR congestion control
ABR: available bit rate:
“elastic service”
if sender's path “underloaded”: sender should use available bandwidth
if sender's path congested: sender throttled to minimum guaranteed rate
RM (resource management) cells:
sent by sender, interspersed with data cells
bits in RM cell set by switches (“network-assisted”)
NI bit: no increase in rate (mild congestion)
CI bit: congestion indication
RM cells returned to sender by receiver, with bits intact
Case study: ATM ABR congestion control
two-byte ER (explicit rate) field in RM cell
congested switch may lower the ER value in the cell
sender's send rate is thus the minimum supportable rate on the path
EFCI bit in data cells: set to 1 by a congested switch
if the data cell preceding an RM cell has EFCI set, the receiver sets the CI bit in the returned RM cell
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission:
LastByteSent − LastByteAcked ≤ CongWin
Roughly, rate = CongWin / RTT bytes/sec
CongWin is dynamic, a function of perceived network congestion
How does the sender perceive congestion?
loss event = timeout or 3 duplicate ACKs
TCP sender reduces rate (CongWin) after a loss event
three mechanisms:
AIMD
slow start
conservative after timeout events
TCP AIMD
multiplicative decrease: cut CongWin in half after a loss event
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
[Figure: congestion window of a long-lived TCP connection, a sawtooth over time between 8 Kbytes, 16 Kbytes, and 24 Kbytes]
Additive Increase
increase CongWin by 1 MSS every RTT in the absence of loss events: probing
cwnd += SMSS*SMSS/cwnd (*)
This adjustment is executed on every incoming non-duplicate ACK.
Equation (*) provides an acceptable approximation to the underlying principle of increasing cwnd by 1 full-sized segment per RTT.
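A numeric sketch of why equation (*) yields roughly 1 MSS per RTT: about cwnd/SMSS non-duplicate ACKs arrive per RTT, each adding SMSS²/cwnd. The SMSS value and window size below are illustrative.

```python
SMSS = 1460.0                    # sender maximum segment size, bytes (illustrative)
cwnd = 10 * SMSS
for _ in range(10):              # ~one window of non-duplicate ACKs ≈ one RTT
    cwnd += SMSS * SMSS / cwnd   # equation (*)
print(cwnd / SMSS)               # ≈ 10.95 segments: about +1 MSS after one RTT
```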
TCP Slow Start
When connection begins, CongWin = 1 MSS
Example: MSS = 500 bytes & RTT = 200 msec, so initial rate = 20 kbps
available bandwidth may be >> MSS/RTT
desirable to quickly ramp up to a respectable rate
When connection begins, increase rate exponentially fast until first loss event
TCP Slow Start (more)
When connection begins, increase rate exponentially until first loss event:
double CongWin every RTT
done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
[Figure: timing diagram between Host A and Host B; one segment in the first RTT, two in the second, four in the third]
Refinement
After 3 dup ACKs:
CongWin is cut in half; Threshold is set to CongWin
window then grows linearly
But after a timeout event:
Threshold set to CongWin/2, and CongWin instead set to 1 MSS;
window then grows exponentially to a threshold, then grows linearly
Philosophy:
3 dup ACKs indicate the network is capable of delivering some segments
a timeout before 3 dup ACKs is “more alarming”
Refinement (more)
Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.
Implementation:
Variable Threshold
At a loss event, Threshold is set to 1/2 of CongWin just before the loss event
Summary: TCP Congestion Control
When CongWin is below Threshold, sender is in slow-start phase; window grows exponentially.
When CongWin is above Threshold, sender is in congestion-avoidance phase; window grows linearly.
When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold.
When a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.
TCP sender congestion control
Event: ACK receipt for previously unacked data
State: Slow Start (SS)
TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold), set state to “Congestion Avoidance”
Commentary: Results in a doubling of CongWin every RTT

Event: ACK receipt for previously unacked data
State: Congestion Avoidance (CA)
TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT

Event: Loss event detected by triple duplicate ACK
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = Threshold; set state to “Congestion Avoidance”
Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.

Event: Timeout
State: SS or CA
TCP Sender Action: Threshold = CongWin/2; CongWin = 1 MSS; set state to “Slow Start”
Commentary: Enter slow start

Event: Duplicate ACK
State: SS or CA
TCP Sender Action: Increment duplicate ACK count for segment being ACKed
Commentary: CongWin and Threshold not changed
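A compact Python sketch of the sender behavior in this table. The class shape, the initial Threshold value, and working in units of whole segments are illustrative simplifications, not mandated by the slides.

```python
class TcpSender:
    """Congestion state machine from the table above (illustrative sketch)."""
    def __init__(self, mss=1):
        self.mss = mss
        self.cong_win = 1 * mss        # CongWin
        self.threshold = 64 * mss      # Threshold (arbitrary large initial value)
        self.state = "SS"              # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):              # ACK for previously unacked data
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss  # doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:                          # CA: +MSS*(MSS/CongWin), i.e. +1 MSS per RTT
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_dup_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:         # triple duplicate ACK: fast recovery
            self.threshold = max(self.cong_win / 2, self.mss)
            self.cong_win = self.threshold
            self.state = "CA"

    def on_timeout(self):              # timeout: re-enter slow start at 1 MSS
        self.threshold = max(self.cong_win / 2, self.mss)
        self.cong_win = 1 * self.mss
        self.state = "SS"
        self.dup_acks = 0
```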
TCP Futures
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate: throughput = 1.22 · MSS / (RTT · √p)
⇒ requires p = 2·10⁻¹⁰
New versions of TCP needed for high speed!
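As a worked check of these numbers (values from the slide; the script is only a calculation aid):

```python
# Check the slide's numbers with throughput = 1.22 * MSS / (RTT * sqrt(p)).
MSS = 1500 * 8            # segment size in bits
RTT = 0.100               # seconds
target = 10e9             # 10 Gbps

W = target * RTT / MSS                    # in-flight segments needed
p = (1.22 * MSS / (RTT * target)) ** 2    # loss rate the formula allows
print(round(W), p)                        # ~83333 segments, p ≈ 2e-10
```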
Macroscopic TCP model
Deterministic packet losses
1/p packets transmitted in a cycle
[Figure: sawtooth window evolution; each cycle is a run of successes ended by one loss]
TCP Model Contd
Equate the trapezoid area (3/8)W² under one sawtooth cycle to 1/p:
(3/8) W² = 1/p ⇒ W = √(8/(3p))
Average rate = (3/4) W / RTT = C / (RTT √p) packets/sec, where C = √(3/2) ≈ 1.22
TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 traverse a bottleneck router of capacity R]
Why is TCP fair?
Two competing sessions:
additive increase gives a slope of 1 as throughput increases
multiplicative decrease decreases throughput proportionally
[Figure: phase plot of Connection 1 throughput versus Connection 2 throughput, both axes up to R; repeated cycles of “congestion avoidance: additive increase” and “loss: decrease window by factor of 2” converge toward the equal-bandwidth-share line]
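The convergence argument can be sketched numerically: both flows add equal increments until the link is full, then both halve, so their difference is cut in half every cycle. Starting rates, capacity, and round count below are arbitrary illustrations.

```python
# Two AIMD flows sharing a link of capacity R converge to equal shares.
R = 100.0
x, y = 80.0, 10.0                  # deliberately unequal starting throughputs
for _ in range(200):
    if x + y > R:                  # loss: multiplicative decrease for both
        x, y = x / 2, y / 2
    else:                          # congestion avoidance: additive increase
        x, y = x + 1.0, y + 1.0
print(round(x, 1), round(y, 1))    # nearly equal after many cycles
```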
Fairness (more)
Fairness and UDP
Multimedia apps often do not use TCP
do not want rate throttled by congestion control
Instead use UDP: pump audio/video at constant rate, tolerate packet loss
Research area: TCP friendly, DCCP
Fairness and parallel TCP connections
nothing prevents an app from opening parallel connections between 2 hosts
Web browsers do this
Example: link of rate R supporting 9 connections;
new app asks for 1 TCP, gets rate R/10
new app asks for 10 TCPs, gets R/2!
Queuing Disciplines
Each router must implement some queuing discipline
Queuing allocates both bandwidth and buffer space:
Bandwidth: which packet to serve (transmit) next
Buffer space: which packet to drop next (when required)
Queuing also affects latency
Typical Internet Queuing
FIFO + drop-tail
Simplest choice
Used widely in the Internet
FIFO (first-in-first-out): implies a single class of traffic
Drop-tail: arriving packets get dropped when the queue is full, regardless of flow or importance
Important distinction:
FIFO: scheduling discipline
Drop-tail: drop policy
FIFO + Drop-tail Problems
Leaves responsibility for congestion control completely to the edges (e.g., TCP)
Does not separate between different flows
No policing: send more packets, get more service
Synchronization: end hosts react to the same events
FIFO + Drop-tail Problems
Full queues
Routers are forced to have large queues to maintain high utilization
TCP detects congestion from loss
Forces the network to have long standing queues in steady state
Lock-out problem
Drop-tail routers treat bursty traffic poorly
Traffic gets synchronized easily, allowing a few flows to monopolize the queue space
Active Queue Management
Design active router queue management to aid congestion control
Why?
Router has a unified view of queuing behavior
Routers see actual queue occupancy (can distinguish queueing delay from propagation delay)
Routers can decide on transient congestion, based on workload
Design Objectives
Keep throughput high and delay low
High power (throughput/delay)
Accommodate bursts
Queue size should reflect the ability to accept bursts rather than steady-state queuing
Improve TCP performance with minimal hardware changes
Lock-out Problem
Random drop
A packet arriving when the queue is full causes some random packet to be dropped
Drop front
On a full queue, drop the packet at the head of the queue
Random drop and drop front solve the lock-out problem, but not the full-queues problem
Full Queues Problem
Drop packets before the queue becomes full (early drop)
Intuition: notify senders of incipient congestion
Example: early random drop (ERD):
if qlen > drop level, drop each new packet with fixed probability p
does not control misbehaving users
Random Early Detection (RED)
Detect incipient congestion
Assume hosts respond to lost packets
Avoid window synchronization: randomly mark packets
Avoid bias against bursty traffic
RED Algorithm
Maintain a running average of the queue length
If avg < min_th, do nothing: low queueing, send packets through
If avg > max_th, drop the packet: protection from misbehaving sources
Else mark the packet with a probability proportional to the queue length: notify sources of incipient congestion
RED Operation
[Figure: RED drop probability versus average queue length; P(drop) is 0 below min_th, rises linearly to maxP at max_th, and jumps to 1.0 above max_th]
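A sketch of the RED decision from the last two slides, assuming the usual exponentially weighted moving average for the queue length and the linear ramp of the figure; the weight and threshold values are illustrative.

```python
import random

W = 0.002                 # EWMA weight for the average queue length (illustrative)
MIN_TH, MAX_TH = 5, 15    # thresholds, in packets (illustrative)
MAX_P = 0.1               # mark/drop probability at avg == MAX_TH

avg = 0.0                 # running average of the queue length

def red_arrival(queue_len):
    """Return True if the arriving packet should be marked/dropped."""
    global avg
    avg = (1 - W) * avg + W * queue_len               # running average
    if avg < MIN_TH:
        return False                                  # low queueing: admit
    if avg >= MAX_TH:
        return True                                   # protect from misbehaving sources
    p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)    # linear ramp from the figure
    return random.random() < p
```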
Improving QOS in IP Networks
Thus far: “making the best of best effort” (a simple model for sharing and congestion studies)
Future: next-generation Internet with QoS guarantees
RSVP: signaling for resource reservations
Differentiated Services: differential guarantees
Integrated Services: firm guarantees
Principles for QOS Guarantees
Example: a 1 Mbps IP phone and an FTP transfer share a 1.5 Mbps link.
bursts of FTP can congest the router and cause audio loss
want to give priority to audio over FTP
Principle 1: packet marking is needed for the router to distinguish between different classes, and a new router policy to treat packets accordingly
Principles for QOS Guarantees (more)
what if applications misbehave (audio sends at a higher rate than declared)?
policing: force source adherence to bandwidth allocations
marking and policing at the network edge:
similar to ATM UNI (User Network Interface)
Principle 2: provide protection (isolation) for one class from others
Principles for QOS Guarantees (more)
Allocating fixed (non-sharable) bandwidth to a flow: inefficient use of bandwidth if the flow doesn't use its allocation
Principle 3: while providing isolation, it is desirable to use resources as efficiently as possible
Principles for QOS Guarantees (more)
Basic fact of life: cannot support traffic demands beyond link capacity
Principle 4: call admission: a flow declares its needs; the network may block the call (e.g., busy signal) if it cannot meet them
Summary of QoS Principles
Let's next look at mechanisms for achieving this ...
Scheduling And Policing Mechanisms
scheduling: choose the next packet to send on the link
FIFO (first in first out) scheduling: send in order of arrival to the queue
real-world example?
discard policy: if a packet arrives to a full queue, which packet to discard?
Tail drop: drop the arriving packet
priority: drop/remove on a priority basis
random: drop/remove randomly
Scheduling Policies: more
Priority scheduling: transmit the highest-priority queued packet
multiple classes, with different priorities
class may depend on marking or other header info, e.g., IP source/dest, port numbers, etc.
Scheduling Policies: still more
round robin scheduling:
multiple classes
cyclically scan class queues, serving one from each class (if available)
Scheduling Policies: still more
Weighted Fair Queuing:
generalized round robin
each class gets a weighted amount of service in each cycle
Deficit Weighted Round Robin (DWRR)
DWRR addresses the limitations of the WRR model by accurately supporting the weighted fair distribution of bandwidth when servicing queues that contain variable-length packets.
DWRR addresses the limitations of the WFQ model by defining a scheduling discipline that has lower computational complexity and that can be implemented in hardware. This allows DWRR to support the arbitration of output port bandwidth on high-speed interfaces in both the core and at the edges of the network.
DRR
In DWRR queuing, each queue is configured with a number of parameters:
A weight that defines the percentage of the output port bandwidth allocated to the queue.
A DeficitCounter that specifies the total number of bytes that the queue is permitted to transmit each time it is visited by the scheduler. The DeficitCounter allows a queue that was not permitted to transmit in the previous round (because the packet at the head of the queue was larger than the value of the DeficitCounter) to save transmission “credits” and use them during the next service round.
DWRR
In the classic DWRR algorithm, the scheduler visits each non-empty queue and determines the number of bytes in the packet at the head of the queue.
The variable DeficitCounter is incremented by the value quantum. If the size of the packet at the head of the queue is greater than the variable DeficitCounter, then the scheduler moves on to service the next queue.
If the size of the packet at the head of the queue is less than or equal to the variable DeficitCounter, then the variable DeficitCounter is reduced by the number of bytes in the packet and the packet is transmitted on the output port.
DWRR
A quantum of service that is proportional to the weight of the queue and is expressed in bytes. The DeficitCounter for a queue is incremented by the quantum each time the queue is visited by the scheduler.
The scheduler continues to dequeue packets and decrement the variable DeficitCounter by the size of the transmitted packet until either the size of the packet at the head of the queue is greater than the variable DeficitCounter or the queue is empty.
If the queue is empty, the value of DeficitCounter is set to zero.
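A minimal Python sketch of the scheduler loop just described, covering one visit per queue per round; the function shape, the transmit callback, and keeping the DeficitCounter values in a list are illustrative choices. Packet sizes and quanta follow the example on the next slide.

```python
from collections import deque

def dwrr_round(queues, quanta, deficit, transmit):
    """One scheduler round. queues: list of deques of packet sizes (bytes);
    quanta: per-queue quantum (bytes); deficit: per-queue DeficitCounter,
    carried across rounds; transmit: callback(queue_index, packet_size)."""
    for i, q in enumerate(queues):
        if q:
            deficit[i] += quanta[i]          # add this queue's quantum
            # send head packets while the accumulated credits cover them
            while q and q[0] <= deficit[i]:
                pkt = q.popleft()
                deficit[i] -= pkt
                transmit(i, pkt)
        if not q:
            deficit[i] = 0                   # empty queue: reset its credits

# Example from the next slide: quanta of 1000/500/500 bytes for queues 1/2/3
queues = [deque([300, 400, 600]), deque([400, 300, 400]), deque([400, 300, 600])]
deficit = [0, 0, 0]
dwrr_round(queues, [1000, 500, 500], deficit,
           lambda i, pkt: print(f"queue {i+1}: sent {pkt} bytes"))
```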
DWRR Example
Queue 1: 50% BW, quantum[1] = 1000; packets: 300, 400, 600
Queue 2: 25% BW, quantum[2] = 500; packets: 400, 300, 400
Queue 3: 25% BW, quantum[3] = 500; packets: 400, 300, 600
Modified Deficit Round Robin gives priority to one class, say VoIP.
Policing Mechanisms
Goal: limit traffic so it does not exceed declared parameters
Three commonly used criteria:
(Long-term) average rate: how many packets can be sent per unit time (in the long run)
crucial question: what is the interval length? 100 packets per sec and 6000 packets per min have the same average!
Peak rate: e.g., 6000 packets per min (ppm) average; 1500 ppm peak rate
(Max.) burst size: max number of packets sent consecutively (with no intervening idle)
Policing Mechanisms
Token bucket: limit input to a specified burst size and average rate.
bucket can hold b tokens
tokens generated at rate r tokens/sec unless bucket is full
over an interval of length t, the number of packets admitted is less than or equal to (r·t + b)
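A small sketch of the token-bucket policer just described, spending one token per packet; the class name and the caller-supplied clock are illustrative, and token accounting is fractional for simplicity.

```python
class TokenBucket:
    def __init__(self, r, b):
        self.r = r                  # token generation rate (tokens/sec)
        self.b = b                  # bucket depth (max burst)
        self.tokens = b             # start with a full bucket
        self.last = 0.0             # time of last update (seconds)

    def admit(self, now):
        """Return True if a packet arriving at time `now` conforms."""
        # accumulate tokens since the last arrival, capped at b
        self.tokens = min(self.b, self.tokens + self.r * (now - self.last))
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1        # spend one token per admitted packet
            return True
        return False                # non-conforming: drop or mark
```

Over any interval of length t this admits at most r·t + b packets, matching the bound above.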
Policing Mechanisms (more)
token bucket and WFQ combine to provide a guaranteed upper bound on delay, i.e., a QoS guarantee!
[Figure: arriving traffic passes through a token bucket (token rate r, bucket size b) into a WFQ scheduler that serves the flow at per-flow rate R]
D_max = b/R
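For example (illustrative numbers, not from the slides): with b = 8000 bits and a WFQ per-flow rate R = 1 Mbps, the worst-case delay is D_max = b/R = 8000 bits / 10⁶ bits/sec = 8 ms.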