Transport Overview

Transcript Transport Overview

Computer Networks
Lecture 17
Congestion Control; AIMD; TCP Reno
10/31/2013
1
Admin.: Assignment 4
Part 1: Design discussion with instructor or a TF: Nov. 11;
Code check point: 11:55 pm, Nov. 15
Part 2: Design discussion with instructor or a TF: Nov. 14;
Code and report: 11:55 pm, Nov. 19.
proj-sol (part 1):
proj:
129 400 3045 FishThread.java
129
388 1457 12873 Node.java
341
51 167 1145 PingRequest.java
51
83 250 2106 SimpleTCPSockSpace.java
181 605 5248 TCPManager.java
50
889 3088 26381 TCPSock.java
132
60 149 1316 TCPSockID.java
123
123 382 3866 TransferClient.java
147
147 500 5059 TransferServer.java
973
2051 6998 61039 total
400 3045 FishThread.java
1301 11313 Node.java
167 1145 PingRequest.java
128 909 TCPManager.java
460 3146 TCPSock.java
382 3866 TransferClient.java
500 5059 TransferServer.java
3338 28483 total
2
Recap: Connection Management
Connection setup:
agree on initial sequence numbers to avoid
accepting duplicate/out of order packets
Internet solution: Three-Way-Handshake (TWH)
Connection close
both sides can release allocated resources
Internet solution: time wait
3
%netstat -t -a
CLOSED
SYN
SENT
CLOSED
LISTEN
SYN
RCVD
ESTABLSIHED
ESTABLSIHED
ESTABLSIHED
ESTABLSIHED
FIN
WAIT 1
CLOSE
WAIT
FIN
WAIT 2
LAST
ACK
TIME
WAIT
4
A Summary of Questions
 How to improve the performance of rdt3.0?

sliding window protocols
What if there are duplication and
reordering?



network guarantee: max packet life time
transport guarantee: sender not reuses a seq#
before life time
seq# management and connection management
How to determine the “right” parameters?
5
History
Key parameters for TCP in mid-1980s
fixed window size W
timeout value = 2 RTT
Network collapse in the mid-1980s
UCB  LBL throughput dropped by 1000X !
Blamed timeout and fixed window size
6
Timeout
Ideal timeout = RTT
too short: premature timeout
• unnecessary retransmissions/duplicates
too long: slow reaction to segment loss
Measure RTT: measure
time from segment
transmission until ACK
receipt
7
From Ideal to Implementation
RTT: gaia.cs.umass.edu to fantasia.eurecom.fr
350
300
RTT (milliseconds)
Problem: SampleRTT
vary, want a “smoother”
estimated RTT
o use several recent
measurements, not
just current SampleRTT
250
200
150
100
1
8
15
22
29
36
43
50
57
64
71
78
85
92
99
106
time (seconnds)
SampleRTT
Estimated RTT
EstimatedRTT = (1- )*EstimatedRTT + *SampleRTT
Exponential weighted moving average
influence of past sample decreases exponentially fast
typical value:  = 0.125
8
Setting Timeout
Problem:
using the average of
SampleRTT will generate
many timeouts due to
network variations
RTT
Solution:
EstimtedRTT plus “safety
margin”
Q: what is margin in original TCP
Margin should depend on
variation: large variation in
EstimatedRTT -> larger
safety margin
9
TCP Timeout
freq.
RTT
DevRTT = (1-)*DevRTT + *|SampleRTT-EstimatedRTT|
(typically,  = 0.25)
Then set timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
10
An Example TCP Session
11
A Summary of Questions
 How to improve the performance of rdt3.0?

sliding window protocols
What if there are duplication and
reordering?



network guarantee: max packet life time
transport guarantee: sender not reuses a seq#
before life time
seq# management and connection management
How to determine the “right” parameters?
 timeout: mean + variation
o sliding window size
12
Principles of Congestion Control
Big picture:
How to determine a flow’s sending rate?
a top-10 problem !
Congestion:
informally: “too many sources sending too much data
too fast for the network to handle”
different from flow control !
manifestations:
lost packets (buffer overflow at routers)
long delays (queueing in router buffers)
wasted bandwidth
13
Some General Questions
How can congestion happen?
What is congestion control?
Why is congestion control difficult?
14
Cause/Cost of Congestion: Single Bottleneck
flow 1
10 Mbps
flow 2 (5 Mbps)
router 1
router 2
- Flow 2 has a fixed sending rate of 5 Mbps
- We vary the sending rate of flow 1 from 0 to 20 Mbps
- Assume
no retransmission; link from router 1 to router 2 has infinite buffer
throughput: e2e packets
delivered in unit time
Delay?
delay at central link
throughput of
flow 1 & 2 (Mbps)
10
sending rate
by flow 1 (Mbps)
sending rate
by flow 1 (Mbps)
5
0
5
0
5
15
Cause/Cost of Congestion: Single Bottleneck
router 5
router 3
flow 1
10 Mbps
flow 2 (5 Mbps)
router 2
router 1
router 4
router 6
Assume
no retransmission
the link from router 1 to router 2 has finite buffer
throughput: e2e packets delivered in unit time
throughput of
flow 1 & 2 (Mbps)
x
x 5
min(
10,5) 
10
sending rate
by flow 1 (Mbps)
5
0
5
x
5
x 5
10
Zombie packet: a packet
dropped at the link from
router 2 to router 5; the
upstream transmission
from router 1 to router 2
used for that packet was
wasted!
16
wasted upstream
bandwidth when a pkt is
discarded at
downstream
wasted bandwidth due to
retransmission (a pkt
goes through a link
multiple times)
Delay
Cost
Packet loss
Throughput
Summary: The Cost of Congestion
knee
packet
loss
cliff
congestion
collapse
Load
High delay
Load
17
Outline
Admin and recap

Transport congestion control


what is congestion
congestion control
18
Rate-based vs. Window-based
Rate-based:
Congestion control by
explicitly controlling
the sending rate of a
flow, e.g., set sending
rate to 128Kbps
Example: ATM
Window-based:
Congestion control by
controlling the window
size of a transport
scheme, e.g., set
window size to
64KBytes
Example: TCP
Discussion: rate-based vs. window-based
19
Window-based Congestion Control
A main advantage of window-based congestion
control is self-clocking: considers flow
conservation, and adjusts to RTT variation
automatically
20
Sliding Window Congestion Control
Transmission rate limited by congestion
window size, cwnd, over segments:
cwnd
Assume cwnd segments, each with MSS bytes sent
in one RTT:
throughput 
MSS: Minimum Segment Size
cwnd * MSS
Bytes/sec
RTT
21
The Desired Properties of a
Congestion Control Scheme
Efficiency: close to full utilization but low
delay
- fast convergence after disturbance
Fairness (resource sharing)
Distributedness (no central knowledge for
scalability)
22
A Simple Model
User 1
x1
User 2
x2

d=
 xi > Xgoal?
xn
User n
Flows observe congestion signal d, and locally take
actions to adjust rates.
23
Linear Control
Proposed by Chiu and Jain (1988)
The simplest control strategy
 aI  bI xi (t )
xi (t  1)  
aD  bD xi (t )
if d(t)  no cong.
if d(t)  cong.
Discussion: values of the parameters?
24
State Space of Two Flows
 aI  bI xi (t )
xi (t  1)  
aD  bD xi (t )
if d(t)  no cong.
if d(t)  cong.
fairness
line: x1=x2
x2
x(0)
overload
efficiency line:
x1+x2=C
underload
x1
25
congestion
x0
 a  bI xi (t )
xi (t  1)   I
aD  bD xi (t )
if d(t)  no cong.
if d(t)  cong.
x0
efficiency
x0
efficiency: distributed linear rule
x0
fairness
intersection
26
Implication: Congestion (overload) Case
In order to get closer to efficiency and
fairness after each update, decreasing of
rate must be multiplicative decrease (MD)
aD = 0
bD < 1
aI  bI xi (t )
xi (t  1)  
 bD xi (t )
if d(t)  no cong.
if d(t)  cong.
27
no-congestion
x0
 a  bI xi (t )
xi (t  1)   I
aD  bD xi (t )
if d(t)  no cong.
if d(t)  cong.
x0
efficiency
x0
fairness
efficiency: distributed linear rule
x0
convergence
28
Implication: No Congestion Case
In order to get closer to efficiency and
fairness after each update, additive and
multiplicative increasing (AMI), i.e.,
aI > 0, bI > 1
aI  bI xi (t )
xi (t  1)  
 bD xi (t )
if d(t)  no cong.
if d(t)  cong.
Simply additive increase gives better
improvement in fairness (i.e., getting closer
to the fairness line)
Multiplicative increase is faster
29
Four Special Cases
Additive
Increase
Multiplicative
Increase
Additive
Decrease
Multiplicative
Decrease
AIAD
(bI=bD=1)
AIMD
(bI=1, aD=0)
MIAD
(aI=0, bI>1, bD=1)
MIMD
(aI=aD=0)
 aI  bI xi (t )
xi (t  1)  
aD  bD xi (t )
Discussion: state transition trace.
if d(t)  no cong.
if d(t)  cong.
30
AIMD: State Transition Trace
x2
fairness line:
x1=x2
overload
x0
underload
efficiency line:
x1+x2=C
x1
31
Another Look
Consider the difference or ratio of the rates
of two flows
AIAD
MIMD
MIAD
AIMD
32
Mapping A(M)I-MD to Protocol
What do we need to apply the A(M)I-MD
algorithm to a sliding window protocol?
aI  xi (t )
xi (t  1)  
 bD xi (t )
if d(t)  no cong.
if d(t)  cong.
33
Mapping A(M)I-MD to Protocol
What do we need to apply the A(M)I-MD
algorithm to a sliding window protocol?
aI  xi (t )
xi (t  1)  
 bD xi (t )
if d(t)  no cong.
if d(t)  cong.
34
Implicit vs. Explicit
Explicit:
routers provide feedback to
end systems
explicit rate sender
should send at
single bit indicating
congestion (SNA,
DECbit, TCP ECN, ATM)
Implicit:
congestion inferred by end
systems through observed
loss, delay
35
Outline
Admin and recap

Transport congestion control
what is congestion
 congestion control principle


TCP congestion control: Reno
36
TCP Congestion Control
Closed-loop, end-to-end, window-based congestion
control
Designed by Van Jacobson in late 1980s, based on
the AIMD alg. of Dah-Ming Chu and Raj Jain
Works well when the bandwidth of the Internet has
increased by more than 200,000 times
Many versions
TCP/Tahoe: this is a less optimized version
TCP/Reno: many OSs today implement Reno type
congestion control
TCP/Vegas: not currently used
TCP CUBIC
For more details: see TCP/IP illustrated; or read
http://lxr.linux.no/source/net/ipv4/tcp_input.c for linux implementation
37
TCP/Reno Congestion Detection
Detect congestion
(d) in two cases
and react
differently:
3 dup ACKs
timeout event
Philosophy:
• 3 dup ACKs indicates
network capable of
delivering some segments
• timeout is “more
alarming”
38
Basic Structure
Two “phases”
slow-start: MI
congestion avoidance: AIMD
Important variables:
cwnd: congestion window size
ssthresh: threshold between the slowstart phase and the congestion avoidance
phase
39
Visualization of the Two Phases
40
Slow Start: MI
What is the goal?
getting to equilibrium gradually but quickly
Implements the MI algorithm
double cwnd every RTT until network congested
 get a rough estimate of the optimal of cwnd
41
Slow-start
Initially:
cwnd = 1;
ssthresh = infinite (e.g., 64K);
For each newly ACKed segment:
if (cwnd < ssthresh)
/* slow start*/
cwnd = cwnd + 1;
cwnd = 1
cwnd = 2
cwnd = 4
cwnd = 6
cwnd = 8
42
Startup Behavior with Slow-start
See [Jac89]
43
TCP/Reno Congestion Avoidance
Maintains equilibrium and reacts around
equilibrium
Implements the AIMD algorithm
increases window by 1 per round-trip time (how?)
cuts window size
• to half when detecting congestion by 3DUP
• to 1 if timeout
• if already timeout, doubles timeout
44
TCP/Reno Congestion Avoidance
Initially:
cwnd = 1;
ssthresh = infinite (e.g., 64K);
For each newly ACKed segment:
if (cwnd < ssthresh)
/* slow start*/
cwnd = cwnd + 1;
else
/* congestion avoidance; cwnd increases (approx.)
by 1 per RTT */
cwnd += 1/cwnd;
Triple-duplicate ACKs:
/* multiplicative decrease */
cwnd = ssthresh = cwnd/2;
Timeout:
ssthresh = cwnd/2;
cwnd = 1;
(if already timed out, double timeout value; this is called exponential backoff)
45
TCP/Reno: Big Picture
cwnd
TD
TD
TD
TO
ssthresh
ssthresh
ssthresh
ssthresh
Time
slow
start
congestion
avoidance
congestion
avoidance
congestion
avoidance
slow congestion
start avoidance
TD: Triple duplicate acknowledgements
TO: Timeout
46
A Session
Question: when cwnd is cut to half, why sending rate is not?
47
Backup Slides
48
High-level Idea
Ideally, we set timeout = RTT, but RTT is not a
fixed value (why?)
Set timeout = average + safe margin
49
High-level Idea
Ideally, we set timeout = RTT, but RTT is not a
fixed value (why?)
Set timeout = average + safe margin
50

Transport Overview

Transcript Transport Overview

Directory