Congestion Control

download report

Transcript Congestion Control

An Introduction
to
Computer Networks
Lecture 11: Congestion Control
University of Tehran
Dept. of EE and Computer Engineering
By:
Dr. Nasser Yazdani
Univ. of Tehran
Introduction to Computer Network
1
Outline

Allocating resource among competing
users.






Bandwidth on the line
Buffers on the routers
Congestion control and resource
allocation, two side of the same coin.
Different congestion control Policies.
Reacting to Congestion
Avoiding Congestion
Univ. of Tehran
Introduction to Computer Network
2
Congestion
10 Mbps
1.5 Mbps
100 Mbps


Different sources compete for resources
inside network
Why is it a problem?



Sources are unaware of current state of resource
Sources are unaware of each other
In many situations will result in < 1.5 Mbps of
throughput (congestion collapse)
Univ. of Tehran
Introduction to Computer Network
3
Issues

Two sides of the same coin


pre-allocate resources so to avoid congestion
control congestion if (and when) is occurs
Router
1.5-Mbps T1 link
Destination
Source
2

Two points of implementation



hosts at the edges of the network (transport protocol)
routers inside the network (queuing discipline)
Underlying service model


best-effort (assume for now)
Univ. of Tehran
Computer Network
multiple
qualities ofIntroduction
serviceto(later)
4
Why is Congestion Bad?


Wasted bandwidth: retransmission of dropped
packets.
Poor user service : unpredictable delay, reduced
throughput. Increased load can even result in lower
network throughput.
» Switched nets: heavy traffic -> long queues -> lost
packets ->retransmits
» Ethernet: high demand -> many collisions
» compare with highways: too much traffic slows down
throughput

Solutions? Redesign the network.
» Add capacity to congested links
» Reroute traffic.
» What are the options?
Univ. of Tehran
Introduction to Computer Network
5
Evaluation
We need to know how good is a resource
allocation/congestion aviodness
mechanism.




Fairness
Power of network (ratio of throughput to
delay), Maximize this ratio.
Distributedness
Efficiency
Cost
Throughput/delay

Optimal load
Univ. of Tehran
Introduction to Computer Network
Load
6
Evaluation (Fairness)

What is fair resource allocation?




Equal share of bandwidth
How about the length of paths?
Given a set of flow throughput (x1,x2,…, xn)
f(x1,x2,…, xn)= (Σi=1n xi)2/n Σi=1nxi2
Fairness is always between 0, 1 and 1
completely fair and 0 completely unfair.
Univ. of Tehran
Introduction to Computer Network
7
Possible Solutions

Redesign the network.
» Add capacity to congested links
» Very slow solution: takes days to months!

Reroute traffic.
» Alternate paths are not always available
» Also too slow: takes 10s of seconds
» In practice, most routing algorithms are static

Adjust the load in the network. Load
balancing
» What are the options?
Univ. of Tehran
Introduction to Computer Network
8
Principles of Congestion Control
Congestion:




informally: “too many sources sending too much
data too fast for network to handle”
different from flow control!
manifestations:
 lost packets (buffer overflow at routers)
 long delays (queuing in router buffers)
a top-10 problem!
Univ. of Tehran
Introduction to Computer Network
9
Causes/costs of congestion: scenario
1



two senders, two
receivers
one router, infinite
buffers
no retransmission


Univ. of Tehran
Introduction to Computer Network
large delays
when congested
maximum
achievable
throughput
10
Another scenario



four senders
multihop paths
timeout/retransmit
Univ. of Tehran
Q: what happens as lin
and lin increase ?
Introduction to Computer Network
11
Causes/costs of congestion: scenario
3
Another “cost” of congestion:
 when packet dropped, any “upstream
transmission capacity used for that packet
was wasted!
Univ. of Tehran
Introduction to Computer Network
12
Framework

Connectionless flows


sequence of packets sent between
source/destination pair
maintain soft state at the routers
Source
1
Router
Destination
1
Router
Source
2
Router
Destination
2
Source
3
Univ. of Tehran
Introduction to Computer Network
13
Framework (cont)

Taxonomy: different approaches for congestion Ctrl.

router-centric versus host-centric:



Router-Centric: each router takes the responsibility for dropping
packets, and informing the generating host the amount of traffic
allowed to be sent.
Host-Centric: the end hosts observe the amount of traffic that is
successfully getting through the network and adjust their
transmission rates accordingly.
reservation-based versus feedback-based


Reservation-Based: the end host asks the network for a certain
amount of bandwidth when requesting a connection (or flow). If
the network does not have enough bandwidth it will reject the
connection.
Feedback-Based: the end hosts send data without first reserving
any capacity and then adjust their sending rate according to the
feedback they received


Explicit feedback
Implicit feedback.
Univ. of Tehran
Introduction to Computer Network
14
Framework (cont)

window-based versus rate-based:




Window-Based: the receiver advertises a window to the
sender. The window corresponds to how much buffer space
the receiver has and it limits how much data the sender can
transmit. Network also can do that, like X.25.
Rate-Based: How many bit the sender can send or the
network can absorb.
Rate-Based can support video.
Rate-Based still is an open problem
Univ. of Tehran
Introduction to Computer Network
15
Where to Prevent Collapse?

Can end hosts prevent problem?



Yes, but must trust end hosts to do right
thing
E.g., sending host must adjust amount of
data it puts in the network based on
detected congestion
Can routers prevent collapse?
No, not all forms of collapse
 Doesn’t mean they can’t help
 Sending accurate congestion signals
 Isolating well-behaved from ill-behaved
sources
Univ. of Tehran
Introduction to Computer Network

16
Knee – point after
which
throughput increases
very slow
 delay increases fast
Cliff – point after which
 throughput starts to
decrease very fast to
zero (congestion
collapse)
 delay approaches infinity


knee
Delay

Throughput
What’s Really Happening?
packet
loss
cliff
congestion
collapse
Load
Load
Univ. of Tehran
Introduction to Computer Network
17
Goals
Operate near the knee point
 Remain in equilibrium
 How to maintain equilibrium?

Don’t put a packet into network until
another packet leaves. How do you do it?
 Use ACK: send a new packet only after
you receive and ACK (self-clocking)
 This maintains the number of packets in
network “constant”

Univ. of Tehran
Introduction to Computer Network
18
Self-clocking
Pb
Pr
Sender
Receiver
As
Univ. of Tehran
Ab
Ar
Introduction to Computer Network
19
How Do You Do It?

Detect when network approaches/reaches
knee point


Stay there
Questions
How do you get there?
 What if you overshoot (i.e., go over knee
point) ?


Possible solution:
Increase window size until you notice
congestion
 Decrease window size if network congested

Univ. of Tehran
Introduction to Computer Network
20
TCP Congestion Control

Idea




assumes best-effort network (FIFO or FQ
routers)each source determines network
capacity for itself
uses implicit feedback
ACKs pace transmission (self-clocking)
Challenge


determining the available capacity in the first
place
adjusting to changes in the available capacity
Univ. of Tehran
Introduction to Computer Network
21
Additive Increase/
Multiplicative Decrease


Objective: adjust to changes in the available capacity
New state variable per connection:
CongestionWindow (cwnd)
MaxWindow = min(cwnd, AdvertisedWindow)
EffectiveWindow = MaxWindow – (LastByteSent – LastByteAcked)
sequence number increases

Idea:


LastByteAcked
LastByteSent
increase Cwnd when congestion goes down
decrease Cwnd when congestion goes up
Univ. of Tehran
Introduction to Computer Network
22
AIMD (cont)


Question: how does the source determine whether
or not the network is congested?
Answer: a timeout occurs



timeout signals that a packet was lost
packets are seldom lost due to transmission error
lost packet implies congestion
Univ. of Tehran
Introduction to Computer Network
23
AIMD (cont)
Algorithm



increment Cwnd by one packet
per RTT (linear or additive
increase)
divide Cwnd by two whenever a
timeout occurs (multiplicative
decrease)
…

Destination
Source
In practice: increment a little for each ACK
Increment = MSS * (MSS/cwnd)
cwnd += Increment
MSS= Maximum segment size
Univ. of Tehran
Introduction to Computer Network
24
AIMD (cont)
KB

Trace: saw tooth behavior
70
60
50
40
30
20
10
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
Time (seconds)
Univ. of Tehran
Introduction to Computer Network
25
TCP: Slow Start

Question? How much is the initial cwnd size?




Small cwnd implies less network utilization
Big cwnd means congestion again
What is optimal of cwnd.
Solution: Start with cwnd =1 and Quickly
increase cwnd until network congested  get a
rough estimate of the optimal of cwnd


Each time a segment is acknowledged increment cwnd by
one (cwnd++)
Increase of cwnd is exponential
26
Slow Start Example



TCP slows down the
increase of cwnd
when
cwnd >=ssthresh
Ssthresh - slow
start threshold
value.
After ssthresh, TCP
change from slow
start to congestion
avoidance!
Univ. of Tehran
source
destination
cwnd = 1
cwnd = 2
cwnd = 4
cwnd = 8
Introduction to Computer Network
27
Slow Start/Congestion
Avoidance Example

Assume that
ssthresh = 8
Cwnd (in segments)
14
Client
Server
cwnd = 1
cwnd = 2
cwnd = 4
12
10
cwnd = 8
8
ssthresh
6
4
2
cwnd = 9
0
Roundtrip times
Univ. of Tehran
cwnd = 10
Introduction to Computer Network
28
Putting Everything Together:
TCP Pseudocode
Initially:
cwnd = 1;
ssthresh = infinite;
New ack received:
if (cwnd < ssthresh)
/* Slow Start*/
cwnd = cwnd + 1;
else
/* Congestion Avoidance */
cwnd = cwnd + 1/cwnd;
Timeout:
/* Multiplicative decrease */
ssthresh = win/2;
cwnd = 1; /* Again slow start */
29
The big picture
cwnd
Timeout
Congestion
Avoidance
Slow Start
Time
30
Slow Start (cont)


Exponential growth, but slower than all at
once
Used…


Trace
KB

when first starting connection
when connection goes dead waiting for timeout
70
60
50
40
30
20
10
1.0

2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
Problem: lose up to half a Cwnd’s worth of
data
Univ. of Tehran
Introduction to Computer Network
31
Packet Loss Detection
Wait for Retransmission Time Out (RTO)
 What’s the problem with this?
 Because RTO is performance killer
 In BSD TCP implementation, RTO is
usually more than 1 second

the granularity of RTT estimate is 500 ms
 retransmission timeout is at least two times
of RTT


Solution: Don’t wait for RTO to expire
Univ. of Tehran
Introduction to Computer Network
32
Fast Retransmit

Resend a segment
after 3 duplicate
ACKs


source
destination
cwnd = 1
Remember, a
duplicate ACK means
that an out-of
sequence segment
was received
cwnd = 2
cwnd = 4
Notes:


3 duplicate
Duplicate ACKs due
ACKs
packet reordering!
If window is small
won’t get duplicate
ACKs!
• Set cwnd to 1 after each retransmit!!
Univ. of Tehran
Introduction to Computer Network
33
KB
Results
70
60
50
40
30
20
10
1.0


2.0
3.0
4.0
5.0
6.0
7.0
Compare to slow start, it avoids long time
out waiting in 4.0 – 5.0 period.
How avoid slow start at each fast
retransmission?
Univ. of Tehran
Introduction to Computer Network
34
Fast Recovery

After a fast-retransmit set cwnd to
ssthresh/2

i.e., don’t reset cwnd to 1, then, avoid slow
start again!.
But when RTO expires still do cwnd = 1
 Fast Retransmit and Fast Recovery 
implemented by TCP Reno; most widely
used version of TCP today

Univ. of Tehran
Introduction to Computer Network
35
Fast Retransmit and Fast
Recovery
cwnd
Slow Start
Congestion
Avoidance
Fast retransmit



Prevent expensive timeouts
No need to slow start again
At steady state, cwnd oscillates
around the optimal window size.
Time
36
Congestion Control Summary


Architecture: end system detects congestion
and slow down
Starting point:

Slow start/congestion avoidance


Fast retransmission/fast recovery


packet drop detected by retransmission timeout RTO
as congestion signal
packet drop detected by (three) duplicate acks
Router support

Binary feedback scheme: explicit signaling

Today Explicit Congestion Notification [RF99]
Univ. of Tehran
Introduction to Computer Network
37
Congestion Control vs.
Congestion Avoidance
Congestion control goal


Stay left of cliff
Congestion avoidance knee
goal

Stay left of knee
Throughput

cliff
congestion
collapse
Load
Univ. of Tehran
Introduction to Computer Network
38
Congestion Avoidance

TCP’s strategy



Alternative strategy




control congestion once it happens
repeatedly increase load in an effort to find the
point at which congestion occurs, and then back
off
predict when congestion is about to happen
reduce rate before packets start being discarded
call this congestion avoidance, instead of
congestion control
Two possibilities


router-centric: DECbit and RED Gateways
host-centric: TCP Vegas
Univ. of Tehran
Introduction to Computer Network
39
DECbit


Add binary congestion bit to each packet header
Router

monitors average queue length over last busy+idle cycle
Queue length
Current
time
Previous
cycle
Averaging
interval


Current
cycle
Time
set congestion bit if average queue length > 1
attempts to balance throughout against delay
Univ. of Tehran
Introduction to Computer Network
40
End Hosts



Destination echoes bit back to source
Source records how many packets resulted in
set bit
If less than 50% of last window’s worth had bit
set


increase Cwnd by 1 packet
If 50% or more of last window’s worth had bit
set

decrease CongestionWindow by 0.875 times
Univ. of Tehran
Introduction to Computer Network
41
Random Early Detection (RED)

Notification is implicit



just drop the packet (TCP will timeout)
could make explicit by marking the packet
Early random drop

rather than wait for queue to become full, drop
each arriving packet with some drop probability
whenever the queue length exceeds some drop
level
Univ. of Tehran
Introduction to Computer Network
42
RED Details

Compute average queue length AvgLen
AvgLen = (1 - Weight) * AvgLen +
Weight * SampleLen
0 < Weight < 1 (usually 0.002)
SampleLen is queue length each time a packet arrives
MaxThreshold
MinThreshold
Queue
AvgLen
Univ. of Tehran
Introduction to Computer Network
43
RED Details (cont)

Two queue length thresholds
if AvgLen <= MinThreshold then
enqueue the packet
if MinThreshold < AvgLen < MaxThreshold
then
calculate probability P
drop arriving packet with probability P
if ManThreshold <= AvgLen then
drop arriving packet
Univ. of Tehran
Introduction to Computer Network
44
RED Details (cont)

Computing probability P
TempP = MaxP * (AvgLen - MinThreshold)/
(MaxThreshold - MinThreshold)
P = TempP/(1 - count * TempP)

Drop Probability Curve
Count is the # of queued from
last drop.
P(drop)
1.0
MaxP
AvgLen
MinThresh MaxThresh
Univ. of Tehran
Introduction to Computer Network
45
Tuning RED





P of a particular flow’s packet(s) is roughly
proportional to the share of the flow’s bandwidth.
MaxP is typically 0.02, meaning when is halfway
between the two thresholds, router drops roughly
one out of 50 packets.
If traffic is bursty, then MinTh. should be sufficiently
large to allow link utilization to be acceptably high.
Difference between two thresholds should be larger
than the typical increase in the calculated average
queue length in one RTT; setting MaxThreshold to
twice MinThreshold is reasonable.
Penalty Box for Offenders
Univ. of Tehran
Introduction to Computer Network
46
Sourced-based Cong. Avoidance

End hosts adapt traffic before any lost in the net.


Watch for router queue’s lengths by checking RTT.
A collection of related algorithms




In each 2 RTT check if RTT > (minRTT + maxRTT)/2 ->
decrease cwnd, cwnd -= cwnd/8. Increase otherwise.
Check (currentWind –oldWind)x (CurrentRTT- oldRTT), if
the result is positive, decrease cwnd, cwnd -= cwnd/8.
Increase 1 otherwise.
Check flattening of sending rate. Increase Cwnd by 1 in
each RTT, compare throughput with previous one, if it is
less than half of one packet, decrease cwnd by one.
TCP Vegas is like the last one with a difference to compare
to expected rate.
Univ. of Tehran
Introduction to Computer Network
47
TCP Vegas
Idea: source watches for some sign that router’s
queue is building up and congestion will happen
too; e.g.,

RTT grows

sending rate flattens
KB

Queue size in router
Sending KBps
•Top is cwn size
•Middle is ave. rate in source
•Bottom is ave. queue length
in the bottleneck router
•In Shade, Cwnd increase,
but ave. rate stay flat.
Univ. of Tehran
70
60
50
40
30
20
10
0.5
1.0 1.5
2.0
2.5 3.0
3.5 4.0 4.5
Time (seconds)
5.0
5.5
6.0
6.5
7.0 7.5
8.0 8.5
0.5
1.0 1.5
2.0
2.5 3.0
3.5 4.0 4.5
Time (seconds)
5.0
5.5
6.0
6.5
7.0 7.5
8.0 8.5
0.5
1.0 1.5
2.0
2.5 3.0
3.5 4.0 4.5
Time (seconds)
5.0
5.5
6.0
6.5
7.0 7.5
8.0 8.5
1100
900
700
500
300
100
10
5
Introduction to Computer Network
48
TCP Vegas

Keep track of extra data in the network.


Extra data is the amount more than available
bandwidth.
Maintain “Right” amount of extra data.
Univ. of Tehran
Introduction to Computer Network
49
Algorithm


Let BaseRTT be the minimum of all measured RTTs
(commonly the RTT of the first packet)
If not overflowing the connection, then
ExpectRate = Cwnd/BaseRTT


Source calculates sending rate (ActualRate) once
per RTT
Source compares ActualRate with ExpectRate
Diff = ExpectedRate - ActualRate
if Diff < a
increase Cwnd linearly
else if Diff > b
decrease Cwnd linearly
else
leave
Univ.
of TehranCwnd unchanged
Introduction to Computer Network
50
Algorithm (cont)
Parameters



a = 1 packet
b = 3 packets
KB

Top Cwn size
70
60
50
40
30
20
10
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
5.0
5.5
6.0
6.5
7.0
7.5
8.0
5.0
5.5
6.0
6.5
7.0
7.5
8.0

• black line is
Actual rate
•Colored expected
rate
•Shade area is a
and b
CAM KBps
Time (seconds)
240
200
160
120
80
40
0.5
Even faster retransmit


1.0
1.5
2.0
2.5
3.0
3.5 4.0 4.5
Time (seconds)
keep fine-grained timestamps for each packet
check for timeout on first duplicate ACK
Univ. of Tehran
Introduction to Computer Network
51