Transcript ppt
Lecture 4: Congestion Control
Challenge: how do we efficiently share network resources among billions of hosts?
Today: TCP
– Hosts adjust rate based on packet losses
Next time: Alternative solutions
– Fair queueing, RED
– Vegas, packet pair
– ECN, rate control, credits
A brief Internet history...
1969: ARPANET created
1972: TELNET (RFC 318)
1973: FTP (RFC 454)
1977: MAIL (RFC 733)
1982: TCP & IP (RFC 793 & 791)
1984: DNS (RFC 883)
1986: NNTP (RFC 977)
1990: ARPANET dissolved
1991: WWW/HTTP
1992: MBONE
1995: Multi-backbone Internet
TCP: This is your life...
1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
1982: TCP & IP (RFC 793 & 791)
1983: BSD Unix 4.2 supports TCP/IP
1984: Nagle's algorithm to reduce overhead of small packets; predicts congestion collapse
1986: Congestion collapse observed
1987: Karn's algorithm to better estimate round-trip time
1988: Van Jacobson's algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
1990: 4.3BSD Reno: fast retransmit, delayed ACK's
TCP: After 1990
1993: TCP Vegas (Brakmo et al): real congestion avoidance
1994: T/TCP (Braden): Transaction TCP
1994: ECN (Floyd): Explicit Congestion Notification
1996: SACK TCP (Floyd et al): Selective Acknowledgement
1996: Hoe: Improving TCP startup
1996: FACK TCP (Mathis et al): extension to SACK
1999: TCP Rainier???
TCP Flow Control
TCP is a sliding window protocol
– For window size n, can send up to n bytes without receiving an acknowledgement
– When the data is acknowledged then the window slides forward
Each packet advertises a window size
– Indicates number of bytes the receiver is willing to accept
Original TCP always sent the entire window
Visualizing the window
[Figure: sequence numbers 1–12 partitioned into four regions]
– 1–3: sent and acknowledged
– 4–6: sent, not ACKed
– 7–9: usable window (can send ASAP)
– 10–12: can't send until window moves
– offered window (advertised by receiver) spans 4–9
Left side of window advances when data is acknowledged.
Right side controlled by size of window advertisement.
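As a sketch, the window bookkeeping above can be written with two counters; the names `snd_una` and `snd_nxt` are borrowed from common TCP descriptions, not from the slides:

```python
# Sketch of sliding-window bookkeeping (illustrative).
# snd_una: oldest unacknowledged byte; snd_nxt: next byte to send.

def window_regions(snd_una, snd_nxt, offered_window):
    """Return (sent-but-unACKed, usable) region sizes."""
    in_flight = snd_nxt - snd_una          # sent, not yet ACKed
    usable = offered_window - in_flight    # can send ASAP
    return in_flight, max(usable, 0)

# Matching the figure: receiver offered 6 bytes; bytes 4-6 are sent
# but unACKed (snd_una=4, snd_nxt=7), so bytes 7-9 are usable.
in_flight, usable = window_regions(4, 7, 6)
```

When the ACK for bytes 4–6 arrives, `snd_una` advances and the usable region grows, which is the "window slides forward" behavior above.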
Observed TCP Problems
Too many small packets
– silly window syndrome
– Nagle's algorithm
– delayed acks
Too many retransmissions
– dynamic retransmission timer
– congestion collapse
Silly Window Syndrome
Problem: (Clark, 1982)
– If receiver advertises small increases in the receive window then the sender may waste time sending lots of small packets
Solution:
– Receiver must not advertise small window increases
– Increase window by min(MSS, RecvBuffer/2)
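A minimal sketch of the receiver-side rule, assuming a hypothetical `advertised_window` helper and MSS of 536 bytes:

```python
# Sketch of Clark's silly-window avoidance (illustrative): only
# advertise newly freed space once it reaches min(MSS, buffer/2).

MSS = 536   # assumed maximum segment size, in bytes

def advertised_window(free_space, last_advertised, buffer_size):
    if free_space - last_advertised >= min(MSS, buffer_size // 2):
        return free_space          # large enough increase: advertise it
    return last_advertised         # otherwise keep the old advertisement

# 100 freed bytes of a 4096-byte buffer: too small to advertise yet.
small = advertised_window(100, 0, 4096)
big = advertised_window(600, 0, 4096)
```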
Nagle's algorithm
Small packet problem:
– Don't want to send a 41 byte packet for each keystroke
– How long to wait for more data?
Solution:
– Allow only one outstanding small segment that has not yet been acknowledged
– Self-adaptive: the faster ACKs arrive, the more tinygrams can be sent
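The one-outstanding-small-segment rule can be sketched as follows (class and field names are hypothetical):

```python
# Sketch of Nagle's algorithm (illustrative): hold small writes while
# a small segment is in flight; flush the buffer when its ACK returns.

MSS = 536

class NagleSender:
    def __init__(self):
        self.buffer = b""
        self.small_segment_outstanding = False
        self.sent = []                 # segments handed to the network

    def write(self, data):
        self.buffer += data
        self._try_send()

    def ack_received(self):
        self.small_segment_outstanding = False
        self._try_send()

    def _try_send(self):
        # Full segments always go out; a small tail goes out only if
        # no small segment is currently unacknowledged.
        while len(self.buffer) >= MSS:
            self.sent.append(self.buffer[:MSS])
            self.buffer = self.buffer[MSS:]
        if self.buffer and not self.small_segment_outstanding:
            self.sent.append(self.buffer)
            self.buffer = b""
            self.small_segment_outstanding = True

s = NagleSender()
s.write(b"a")        # first keystroke: sent immediately
s.write(b"b")        # buffered: a small segment is outstanding
s.write(b"c")
s.ack_received()     # ACK arrives: "bc" coalesced into one segment
```

This shows the self-adaptive property: the sooner the ACK returns, the sooner the coalesced tinygram goes out.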
Retransmission
How long a timeout to set?
Estimate round-trip time R
– R = αR + (1 − α)M
– α is a smoothing factor of 0.9
– M is measured round-trip time (ACK's to data)
Timeout at βR, where β is a delay variance factor of 2.0
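The estimator above is a plain exponential average; a sketch with the slide's constants:

```python
# Sketch of the original (pre-Jacobson) RTT estimator:
# R = alpha*R + (1 - alpha)*M, timeout = beta*R.

ALPHA, BETA = 0.9, 2.0

def update(R, M):
    """Fold measured round-trip time M into smoothed estimate R."""
    return ALPHA * R + (1 - ALPHA) * M

R = 100.0                       # smoothed RTT estimate, ms
for M in [100, 100, 200]:       # a sudden RTT spike on the last sample
    R = update(R, M)
timeout = BETA * R              # the spike moves R only slightly
```

Note how slowly the estimate reacts: one 200 ms sample moves R from 100 to only 110 ms, which is exactly the sluggishness Jacobson's estimator (below in these slides) addresses.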
Karn's algorithm
Problem:
– How to estimate RTT when data is retransmitted?
Solution:
– Don't sample RTT when you retransmit data
– Also added exponential backoff to retransmit interval
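Both rules fit in a few lines; the 64-second cap is an assumption borrowed from common implementations, not from the slides:

```python
# Sketch of Karn's rule (illustrative): an ACK for a retransmitted
# segment is ambiguous, so skip that RTT sample; on each
# retransmission, double the retransmit interval.

def rtt_sample_valid(was_retransmitted):
    return not was_retransmitted

def backoff(rto, max_rto=64.0):      # cap is an assumed value
    return min(rto * 2, max_rto)

rto = 1.0
for _ in range(3):                   # three successive retransmissions
    rto = backoff(rto)
```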
Router Congestion
What if rate of packets arriving > rate of packets departing?
[Figure: four input streams (aaaaa, bbbbb, ccccc, ddddd) feed a crossbar switch; one output carries the interleaved mix (adcba)]
Routers drop any overflow packets
The Problem
Original TCP sent full window of data
When links become loaded, queues fill up, and this can lead to:
– Congestion collapse: when round-trip time exceeds retransmit interval this can create a stable condition in which every packet is being retransmitted many times
– Synchronized behavior: network oscillates between loaded and unloaded
Jacobson Solution
Modify retransmission timer to adapt to variations in queueing delay
Infer network bandwidth from packet loss
– drops => congestion => reduce rate
– but drops are also caused by link noise!
– no drops => no congestion => increase rate
Limit send rate based on network bandwidth as well as receiver space
Jacobson's RTT estimator
Problem:
– Can't deal with wide variance in RTT
Solution:
– Smooth estimator of variance as well
Err = M − R
R = R + g·Err (gain for avg: g = 1/8)
D = D + h·(|Err| − D) (gain for dev: h = 1/4)
Timeout at R + 4D
When variance is small, R dominates
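A direct transcription of the update rules above:

```python
# Sketch of Jacobson's mean + deviation RTT estimator:
# Err = M - R; R += g*Err; D += h*(|Err| - D); timeout = R + 4D.

G, H = 1 / 8, 1 / 4     # gains from the slide

def update(R, D, M):
    err = M - R
    R = R + G * err
    D = D + H * (abs(err) - D)
    return R, D

R, D = 100.0, 0.0
R, D = update(R, D, 180.0)   # one 180 ms sample
timeout = R + 4 * D
```

Unlike the fixed βR = 2R timeout, the 4D term lets the timeout widen automatically when RTT samples become noisy and shrink back when they settle.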
TCP Congestion Control
Adjust rate to match network bandwidth
– Additive increase/multiplicative decrease: oscillate around bottleneck capacity
– Slow start: quickly identify bottleneck capacity
– Fast retransmit
– Fast recovery
Tracking the Bottleneck Bandwidth
Throughput = window size / RTT
Multiplicative decrease
– Timeout => dropped packet => cut window size in half
Additive increase
– Ack arrives => no drop => increase window size by one packet per window
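The two rules above can be sketched as a pair of window updates (measured in segments, per the slides' figures):

```python
# Sketch of AIMD (illustrative): one full window of acks adds one
# segment; a timeout halves the window.

def on_window_acked(cwnd):
    return cwnd + 1             # additive increase: +1 segment per RTT

def on_timeout(cwnd):
    return max(cwnd // 2, 1)    # multiplicative decrease: halve

cwnd = 8
for _ in range(4):              # four loss-free round trips
    cwnd = on_window_acked(cwnd)
cwnd = on_timeout(cwnd)         # then a drop
```

Iterating these two rules produces exactly the sawtooth shown on the next slide.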
TCP "Sawtooth"
Oscillates around bottleneck bandwidth
– adjusts to changes in competing traffic
[Figure: window (in segs) vs. round-trip times; sawtooth oscillation]
Additive Increase/Multiplicative Decrease
18
16
14
12
window 10
(in segs) 8
6
4
2
0
0
Slow start
How do we find bottleneck bandwidth?
Start by sending a single packet
– start slow to avoid overwhelming network
Multiplicative increase until get packet loss
– quickly find bottleneck
Previous window < bottleneck rate
– shift into linear increase/multiplicative decrease
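A toy sketch of the search described above, with a hypothetical bottleneck of 100 segments:

```python
# Sketch of slow start (illustrative): double the window each RTT
# until it crosses the bottleneck, then fall back to half the
# overshooting window and continue with linear increase.

def slow_start(bottleneck_segments):
    cwnd, rtts = 1, 0
    while cwnd < bottleneck_segments:   # multiplicative increase
        cwnd *= 2
        rtts += 1
    return cwnd // 2, rtts              # back off to last good window

window, rtts = slow_start(100)
```

Doubling reaches a 100-segment bottleneck in 7 round trips, which is why slow start finds capacity in O(log n) RTTs rather than the O(n) that pure additive increase would take.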
Slow Start
Quickly find the bottleneck bandwidth
[Figure: window (in segs), 0–300, vs. round-trip times, 0–8; exponential growth]
Slow Start Problems
Slow start usually overshoots bottleneck
– will lose many packets in window
Bursty traffic source
– will cause bursty losses for other flows
Short flows
– Can spend entire time in slow start!
Avoiding burstiness: ack pacing
[Figure: Sender sends packets through a bottleneck link to Receiver; the bottleneck spreads packets out, and the returning acks preserve that spacing]
Window size = round trip delay × bit rate
Problems with Ack Pacing
Ack compression
– Variations in queueing delays on return path change spacing between acks
– Example: ack waits for single long packet
– Worse with bursty cross-traffic
What happens after a timeout?
Ack Pacing After Timeout
[Figure: timeline of packets 1–5 with a loss]
Packet loss disrupts ack pacing
– slow start/additive increase cause packet loss!
After loss, use slow start to regain ack pacing
– switch to linear increase at last successful rate: "congestion avoidance"
Putting It All Together
Slow Start + Congestion Avoidance
[Figure: window (in segs), 0–18, vs. round-trip times, 0–39]
Timeouts dominate performance!
Fast Retransmit
Can we detect packet loss without a timeout?
Duplicate acks imply either
– packet reordering (route change)
– packet loss
TCP Tahoe
– resend if see three dup acks
– eliminates timeout delay
[Figure: loss of packet 2 causes the receiver to repeat ack 1 for each later packet]
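The three-dup-ack rule can be sketched as a simple scan over the incoming ack stream (cumulative acks, as in TCP):

```python
# Sketch of Tahoe-style fast retransmit (illustrative): retransmit
# after three duplicate acks instead of waiting for a timeout.

DUP_THRESHOLD = 3

def fast_retransmit(acks):
    """Return the sequence number to resend, or None."""
    dups, last = 0, None
    for ack in acks:
        dups = dups + 1 if ack == last else 0
        last = ack
        if dups == DUP_THRESHOLD:
            return ack + 1      # first unacknowledged segment
    return None

# Receiver keeps acking 1 because segment 2 was lost.
resend = fast_retransmit([1, 1, 1, 1])
```

Requiring three duplicates (not one) is what tolerates mild packet reordering, the first of the two causes listed above.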
Fast Retransmit Caveats
Requires in-order packet delivery
– Dynamically adjust number of dup acks needed for retransmit
Doesn't work with small windows
– ex: modems!
Doesn't work if packets lost in burst
– ex: at peak of slow start
Fast Retransmit
Slow Start + Congestion Avoidance + Fast Retransmit
[Figure: window (in segs), 0–18, vs. round-trip times, 0–28]
Regaining ack pacing limits performance
Fast Recovery
Use duplicate acks to maintain ack pacing
– dup ack => packet left network
– every other ack => send packet
Doesn't work if lose packets in burst
– timeout and slow start to reestablish ack pacing
[Figure: duplicate acks for packet 1 clock out new packets while the loss is repaired]
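A minimal sketch of the dup-ack-as-clock idea, following Reno-style window inflation (the `+3` adjustment is from the Reno algorithm, not stated on this slide):

```python
# Sketch of Reno-style fast recovery (illustrative): after three dup
# acks, halve the window and inflate it by one segment per further
# dup ack, since each dup ack means a packet left the network.

def enter_fast_recovery(cwnd):
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh + 3   # +3 for the dup acks already seen

def on_dup_ack(cwnd):
    return cwnd + 1                 # one packet left the network

ssthresh, cwnd = enter_fast_recovery(16)
cwnd = on_dup_ack(cwnd)
```

Because the window was just halved, only every other returning ack opens enough window to send, which is the "every other ack => send packet" pacing above.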
Fast Recovery
Slow Start + Congestion Avoidance + Fast Retransmit + Fast Recovery
[Figure: window (in segs), 0–18, vs. round-trip times, 0–24]
Delayed ACKs
Problem:
– In request/response programs, you send separate ACK and Data packets for each transaction
Solution:
– Don't ACK data immediately
– Wait 200ms (must be less than 500ms)
– Must ACK every other packet
– Must not delay duplicate ACKs
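The ack-every-other-packet rule can be sketched as follows; the 200 ms timer path is elided, so one segment may be left pending:

```python
# Sketch of delayed ACKs (illustrative): ack every second in-order
# segment immediately; a real stack would also ack any pending
# segment when a ~200 ms timer fires.

def acks_for(segments_received):
    acks, pending = [], 0
    for seq in segments_received:
        pending += 1
        if pending == 2:           # must ACK every other packet
            acks.append(seq)
            pending = 0
    return acks, pending           # pending segment waits for the timer

acks, pending = acks_for([1, 2, 3, 4, 5])
```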
Delayed Ack Impact
TCP congestion control triggered by acks
– if receive half as many acks => window grows half as fast
Slow start with window = 1
– will trigger delayed ack timer
What if two TCP connections share link?
Reach equilibrium independent of initial bandwidth (assuming equal RTTs)
[Figure: window (in segs), 0–16, vs. round-trip times, 0–19; two flows converge to equal windows]
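The convergence claim can be illustrated with a toy simulation (capacity and starting windows are hypothetical): additive increase preserves the difference between the two flows, while each shared multiplicative decrease halves it, so unequal starting rates equalize.

```python
# Sketch of two AIMD flows sharing a link (illustrative): both add
# one segment per round trip and both halve when their combined
# window exceeds the link capacity.

def simulate(w1, w2, capacity=100, rounds=500):
    for _ in range(rounds):
        w1, w2 = w1 + 1, w2 + 1         # additive increase
        if w1 + w2 > capacity:          # shared drop
            w1, w2 = w1 / 2, w2 / 2     # multiplicative decrease
    return w1, w2

w1, w2 = simulate(10, 80)               # very unequal start
```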
Equilibrium Proof
What if TCP and UDP share link?
Independent of initial rates, UDP will get priority! TCP will take what's left.
[Figure: window (in segs), 0–18, vs. round-trip times, 0–18; UDP and TCP flows]
What if two different TCP implementations share link?
If cut back more slowly after drops => will grab bigger share
If add more quickly after acks => will grab bigger share
Incentive to cause congestion collapse!
– Many TCP "accelerators"
– Easy to improve perf at expense of network
Solution: per-flow fair queueing at router
What if TCP connection is short?
Slow start dominates performance
– What if network is unloaded?
– Burstiness causes extra drops
Packet losses unreliable indicator
– can lose connection setup packet
– can get drop when connection near done
– signal unrelated to sending rate
In limit, have to signal every connection
– 50% loss rate as increase # of connections
Example: 10KB document
10 Mb/s Ethernet, 70ms RTT, 536 MSS
– Ethernet ~ 10 Mb/s
– 64KB window, 70ms RTT ~ 7.5 Mb/s
– can only use 10KB window ~ 1.2 Mb/s
– 5% drop rate ~ 275 Kb/s (Mathis)
– model timeouts ~ 228 Kb/s (Padhye)
– slow start, no losses ~ 140 Kb/s
– slow start, with 5% drop ~ 75 Kb/s
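The two window-limited numbers above follow directly from throughput = window / RTT:

```python
# Checking the window-limited rates above: throughput = window / RTT,
# converted to Mb/s.

RTT = 0.070   # seconds

def window_limited_mbps(window_bytes):
    return window_bytes * 8 / RTT / 1e6

full_window = window_limited_mbps(64 * 1024)    # 64KB window
small_window = window_limited_mbps(10 * 1024)   # 10KB document
```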
Short flow bandwidth
[Figure: Bandwidth (Kbps), 0–140, vs. packet loss rate (%), 0–15; median and average curves. Flow length = 10 Kbytes, RTT = 70ms]
Sharing congestion information
Intra-host sharing
– Multiple web connections from a host [Padmanabhan98, Touch97]
Inter-host sharing
– For a large server farm or a large client population
How much potential is there?
– Destination locality
Destination Host Locality
(time since host last accessed)
[Figure: cumulative fraction, 0–1, vs. seconds, 0.1–100; curves for all flows and inter-host only]
Sharing Congestion Information
[Figure: an enterprise/campus network with subnets behind a congestion gateway at the border router, connecting to the Internet]
Mathis et al.
Problem: TCP has complex behavior
Can we model its performance?
Make lots of simplifying assumptions
– one drop per window
– long flow behavior (ignore slow start)
– large receiver window
– no timeouts
Result:
BW = (MSS / RTT) × (C / √p), with C ≈ 0.9
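Plugging in the numbers from the 10KB-document example (MSS = 536 bytes, RTT = 70 ms, 5% loss) gives a bandwidth in the same range as the ~275 Kb/s quoted earlier:

```python
# Evaluating the Mathis model: BW = (MSS / RTT) * (C / sqrt(p)),
# with the slide's C = 0.9; sample inputs are from the earlier
# 10KB-document example.

from math import sqrt

def mathis_bw_bps(mss_bytes, rtt_s, p, C=0.9):
    return (mss_bytes * 8 / rtt_s) * (C / sqrt(p))

bw = mathis_bw_bps(536, 0.070, 0.05)   # roughly 250 Kb/s
```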
What does this mean?
BW
MSS C
, C 0.9
RTT
p
Bigger RTT => worse performance
For
shared links, TCP penalizes WAN traffic
Queueing delay hurts performance
Higher drop rate => worse performance
correlated
vs. non-correlated drops
– does my behavior cause more drops for me, or
just more drops for other flows?
Does adding buffers help?
++ Increases ability of router to handle bursts => fewer drops
-- Allows greater overshoot in slow start => bursty traffic => more drops
-- Potentially increases RTT
Balakrishnan et al.
Observation: poor performance on wireless links (<= 10% of peak)
Problem: Interaction with TCP
– Wireless links drop pkts due to noise, handoffs
– TCP interprets all drops as congestion => TCP will reduce sending rate
Paper's goal: Evaluate proposed solutions
Alternatives
Modify TCP implementation
– better handling of multiple drops per window
– even faster retransmit (one dup ack)
– explicit bit for non-congestion packet loss
Retransmission at link layer
– Can result in packet reordering
– TCP-aware suppresses duplicate acks
Split connection
Modify TCP Implementation
E2E: standard TCP
NewReno: if multiple drops in window, first ack after fast recovery will be "partial"
Selective ACK: list recent missing pkts
– IETF: wait for 3 dup acks
– SMART: resend after one dup ack
ELN: explicit bit for non-congestion loss
– ELN + SMART not simulated
Link Layer Solutions
Link layer: transparent retransmission
– end host sees packets arrive out of order
TCP-aware link layer
– suppress duplicate acks
Link layer with SACK
– better handling of multiple drops
Link layer with SACK and TCP aware
– bingo!
Split TCP
Install proxy at wireless link
– allows local retransmissions
But
– proxy failure visible to end hosts
– end to end xput limited to min across hops