Set with TCP in ppt

Download Report

Transcript Set with TCP in ppt

CMPE 252A:
Computer Networks
Lecture 15
CMPE252A Fall 2004
1
Announcements


Solution homework 4 posted.
Midterm: Wednesday, 11.17.


In class, closed book.
Final: Wednesday, 12.08, 7:3010:30pm.
CMPE252A Fall 2004
2
TCP protocol overview (cont’d)
CMPE252A Fall 2004
3
Flow- versus congestion
control?
CMPE252A Fall 2004
4
TCP flow control

Sliding window.

Receiver’s advertised window.


Size of advertised window related to receiver’s
buffer space.
Sender can send data up to receiver’s
advertised window.
CMPE252A Fall 2004
5
TCP flow control: example
App. writes
2K of data
App. does
3K write
Sender
blocked
Sender
may send up
to 2K
4K
2K;SEQ=0
2K
ACK=2048; WIN=2048
2K; SEQ=2048
0
App. reads
2K of data
ACK=4096; WIN=0
ACK=4096; WIN=2048
1K; SEQ=4096
2K
1K
CMPE252A Fall 2004
6
TCP flow control: observations

TCP sender not required to transmit
data as soon as it comes in from
application.


Example: when first 2KB of data comes in,
could wait for more data since window is
4KB.
Receiver not required to send ACKs as
soon as possible.

Wait for data so ACK is piggybacked.
CMPE252A Fall 2004
7
Delayed ACKs



Tries to optimize ACK transmission.
Delay ACKs and window update (500msec)
hoping to piggyback on data segment.
Example: telnet to interactive editor:




Send 1 character at a time: 20-byte TCP header+
1-byte data+20-byte IP header.
Receiver ACKs immediately: 40-byte ACK.
When editor reads character, window update:
40-byte datagram.
Then echoes character back: 41-byte datagram.
CMPE252A Fall 2004
8
Nagle’s algorithm


Tries to optimize sending of small data
chunks.
Example: telnet to interactive editor.


Send first byte and buffer the rest until
outstanding byte is ACKed; then send all
buffered data in one segment; buffer until
next ACK.
Disabled in some cases (e.g., window
application: mouse movements).
CMPE252A Fall 2004
9
Silly window syndrome

Caused by receiver sending window updates of
very small values.

Example:



Receiver application reads 1 byte at a time and receiver TCP
sends 1-byte window update.
Sender TCP has large blocks to send but can only send 1
byte at a time.
Solution: force receiver to wait until it can
handle its advertised window size or until its
buffer is half empty.
CMPE252A Fall 2004
10
Congestion Control
CMPE252A Fall 2004
11
What’s congestion control?
CMPE252A Fall 2004
12
What is congestion control?

“Efforts made by network nodes to prevent or
respond to overload condition”,
[Peterson&Davie].


Flow control?


Prevent senders from sending too much data into
network.
Prevents (faster) sender from overrunning
(slower) receiver.
Both concepts are often confused as they
share common mechanisms.
CMPE252A Fall 2004
13
Why do congestion control?




Use network resources efficiently.
Preserve fair network resource
allocation.
Prevent or avoid congestion collapse.
Congestion collapse is not just a theory!

Has been frequently observed in many
networks.
CMPE252A Fall 2004
14
Network dynamics
delay
throughput
load
CMPE252A Fall 2004
load
15
Taxonomy [Peterson&Davie]

Where?


Host- or router-based.
Router-based: routers do most of the work.



Host-based:


Decide what to forward/drop.
Inform hosts.
Hosts observe network and act accordingly.
Not mutually exclusive.


Network-level cc: hosts are expected to adhere.
E2E cc: routers are expected to act. (how?)
CMPE252A Fall 2004
16
Taxonomy (cont’d)

How?



Reservations or feedback?
Reservation-based: negotiation and setup.
Feedback-based: send and adjust.

Explicit versus implicit feedback.
CMPE252A Fall 2004
17
Taxonomy (cont’d)

Control parameter: (how much?)

Window- or rate-based.
CMPE252A Fall 2004
18
Congestion control approaches


When?
Avoidance versus detection/recovery.


TCP: acts upon congestion indication.
“Creates losses to find available bandwidth.”


Avoidance: doesn’t let congestion happen.



Then recover.
Try to predict it, and
Backoff.
Typically congestion avoidance need router
support: network-level cc.
CMPE252A Fall 2004
19
Router congestion notification



Router has unified view of queuing
behavior.
Routers can distinguish between
propagation and persistent queuing
delays.
Routers can decide on transient
congestion, based on workload.
CMPE252A Fall 2004
20
Router mechanisms

Congestion notification



DEC-bit scheme [Ramakrishnan and Jain].
Explicit congestion notifications (ECN)
[Floyd et al.].
Queuing disciplines (RED, etc).
CMPE252A Fall 2004
21
The DEC-bit scheme


Developed for the Digital Network
Architecture (DNA).
Basic ideas:




On congestion, router sets congestion indication
(CI) bit on packet.
Receiver relays bit to sender.
Sender adjusts sending rate.
Key design questions:


When to set CI bit?
How does sender respond to CI?
CMPE252A Fall 2004
22
Setting the CI bit



Compute average queue length.
Set bit on arriving packet when AVG_Q
> 1; if AVG_Q > 2, set bit in all
packets.
Threshold of 1 is a trade-off between
queuing and delay.


Optimizes power function (throughput /
delay).
Compromise: sensitivity and stability.
CMPE252A Fall 2004
23
Computing average queue
length
Q
current time
time
Previous cycle
Current cycle
Averaging interval
Busy: router transmitting.
AVG_Q = Q(previous busy+idle + current interval)/(averaging interval)
CMPE252A Fall 2004
24
Sender’s response

Sender policy:


Monitor packets within window.
Make change if more than 50% of packets
had CI set:



If < 50% had CI set, then increase window by
1
Else new window = window * 0.875.
Additive increase, multiplicative decrease
for stability
CMPE252A Fall 2004
25
DEC-bit evaluation






Relatively easy to implement.
No per-connection state.
Stable.
Assumes cooperative sources.
Conservative window increase policy.
More conservative window decrease.
CMPE252A Fall 2004
26
Avoidance or control?

Avoidance keeps system at knee of curve.


But, to do that, need accurate congestion signals
(feedback).
Sending host must adjust amount of data it
puts in the network based on detected
congestion.


TCP uses window.
But what’s the right strategy to increase/decrease
window?
CMPE252A Fall 2004
27
Feedback control model



Reduce window when congestion is
perceived.
Increase window otherwise.
Constraints:



Efficiency.
Fairness.
Stability or convergence (the system must
not oscillate significantly).
CMPE252A Fall 2004
28
Linear control
X (t + 1) = a + b X (t)
 Formulation allows for the feedback signal:



To be increased/decreased additively (a).
To be increased/decreased multiplicatively (b).
What class of control simultaneously satisfies
efficiency+fairness+stability?

Additive increase, multiplicative decrease.
CMPE252A Fall 2004
29
E2E congestion control

Why do it at the transport layer?


Use law of “conservation of packets”.



Real fix to congestion is to slow down sender.
Keep number of packets in the network constant.
Don’t inject new packet until old one leaves.
Congestion indicator: packet loss.
CMPE252A Fall 2004
30
TCP Congestion Control

Underlying design principle: packet
conservation


At equilibrium, inject packet into network
only when one is removed.
Basis for stability of physical systems.
CMPE252A Fall 2004
31
Self-clocking

If we have large actual window, should
we send data in one shot?

No, use ACKs to clock sending new data.
CMPE252A Fall 2004
32
..Self-clocking
Pb
Pr
receiver
sender
As
Ab
Ar
CMPE252A Fall 2004
33
TCP congestion control basics

Like, flow control, also window based.




Sender keeps congestion window (cwin).
Each sender keeps 2 windows: receiver’s
advertised window and congestion window.
Sender’s maximum window:
min(advertised window, cwin).
Sender’s actual window:

Max window - unacknowledged segments.
CMPE252A Fall 2004
34
TCP congestion control
mechanisms

Collection of interrelated mechanisms:





Slow start.
Congestion avoidance.
Accurate retransmission timeout
estimation.
Fast retransmit.
Fast recovery.
CMPE252A Fall 2004
35
Slow start [Jacobson 1988]

How do we get this self-clocking behavior to
start?



Initialize cwnd = 1.
Upon receipt of every ack, cwnd = cwnd + 1
Implications


Window actually increases in RTT * log2W
(exponential increase).
Can overshoot window and cause packet loss.
 Connection’s congestion window starts at 1
segment.
CMPE252A Fall 2004
36
Slow start example
one RTT
0R
1
one pkt time
1R
1
2
3
2R
2
3
4
5
3R
4
6
7
5
8
9
6
10
11
7
12
13
14
15
CMPE252A Fall 2004
37
Slow start sequence plot
Data (KB)
time
CMPE252A Fall 2004
38
Congestion avoidance


Timeout as loss indicator.
If loss occurs:


Set cwnd to 0.5W (multiplicative decrease).
Upon receiving ACK


Increase cwnd by 1/cwnd (additive
increase).
Multiplicative increase -> non-convergence.
CMPE252A Fall 2004
39
Congestion avoidance
sequence plot
CMPE252A Fall 2004
40
Slow start and congestion
avoidance

If packet is lost, self clocking is lost.



Slow-start and congestion avoidance in
concert.
Third parameter: sshresh.
When timeout occurs set ssthresh to
0.5cwnd.


cwnd=1 and use slow start until ssthresh.
Then use congestion avoidance.
CMPE252A Fall 2004
41
cwin
TCP congestion window
timeout
threshold
threshold
time
CMPE252A Fall 2004
42
Steady state
Congestion
window
time
CMPE252A Fall 2004
43
TCP retransmission timer

When segment sent, retransmission
timer starts.


If segment ACKed, timer stops.
If time out, segment retransmitted and
timer starts again.
CMPE252A Fall 2004
44
How to set timer?



Based on round-trip time: time between
a segment is sent and ACK comes back.
If timer is too short, unnecessary
retransmissions.
If timer is too long, long retransmission
delay.
CMPE252A Fall 2004
45
Jacobson’s algorithm (1)

Determining the round-trip time:





TCP keeps RTT variable.
When segment sent, TCP measures how
long it takes to get ACK back (M).
RTT = alpha*RTT + (1-alpha)M.
alpha: smoothing factor; determines
weight given to previous estimate.
Typically, alpha=7/8.
CMPE252A Fall 2004
46
Jacobson’s algorithm (2)

Determining timeout value:




Measure RTT variation, or |RTT-M|.
Keeps smoothed value of cumulative
variation D=alpha*D+(1-alpha)|RTT-M|.
Alpha may or may not be the same as
value used to smooth RTT.
Timeout = RTT+4*D.
CMPE252A Fall 2004
47
Karn’s algorithm

How to account for ACKs of
retransmitted segments?



Count it for first or second transmission?
Karn proposed not to update RTT on any
retransmitted segment.
Instead RTT is doubled on each failure until
segments get through.
CMPE252A Fall 2004
48
Tuning TCP’s original
congestion control


Timeouts can cause connections to be idle for
long time waiting for timer to expire.
Fast retransmit: may trigger retransmission of
dropped packet sooner.


Complements regular timeouts.
Fast recovery: avoids unnecessary slowstart’s.
CMPE252A Fall 2004
49
Fast retransmit

When can duplicate ACKs occur?




Loss.
Packet re-ordering.
Sender waits for some number of duplicate ACKs
before retransmitting.
Assume packet re-ordering is infrequent.


Use receipt of 3 or more duplicate acks as
indication of loss.
Retransmit that segment before timeout.
CMPE252A Fall 2004
50
Fast recovery

Goal: reduces number of times
connection is slow-started.



Uses ACKs in the pipe for self-clocking.
In congestion avoidance mode, after
fast retransmit, reduce cwnd to half
(rather than dropping it to 1).
Slow start only used in the beginning of
connection or when timeout occurs.
CMPE252A Fall 2004
51
Fast recovery
CMPE252A Fall 2004
52
Fast retransmit and recovery

If we get 3 duplicate acks for segment N:




For every subsequent duplicate ACK:


Retransmit segment N.
Set ssthresh to 0.5*cwnd.
Set cwnd to ssthresh + 3.
Increase cwnd by 1 segment.
When new ACK received:

Reset cwnd to ssthresh (resume congestion
avoidance).
CMPE252A Fall 2004
53
TCP flavors


Tahoe, Reno, New-Reno, SACK.
TCP Tahoe (distributed with 4.3BSD Unix)


Original implementation of van Jacobson’s
mechanisms (VJ paper).
Includes:



Slow start (exponential increase of initial window).
Congestion avoidance (additive increase of window).
Fast retransmit (3 duplicate acks).
CMPE252A Fall 2004
54
TCP Reno


1990.
Includes:




All mechanisms in Tahoe.
Addition of fast-recovery (opening up
window after fast retransmit).
Delayed ACKs (to avoid silly window
syndrome).
Most widely deployed variant.
CMPE252A Fall 2004
55
TCP New-Reno

In Reno’s fast recovery, if multiple
packet drops within window.


Can cause window to deflate prematurely.
In New-Reno:


Remember outstanding packets at start of
fast recovery.
If new ACK is only a partial ACK, assume
following segment was lost and resend,
don’t exit fast recovery.
CMPE252A Fall 2004
56
TCP SACK


Reno suffers timeouts with more than 2
losses per window.
New-Reno avoids that, but can only re-send
one dropped packet per RTT.


Because it can learn of multiple losses only once
per RTT.
TCP SACK


Implements the SACK option in TCP.
Can transmit more than one dropped packet
because the sender now knows which packet was
dropped.

Sends dropped packets in preference to new data.
CMPE252A Fall 2004
57
Wireless TCP (1)


According to layered system design
principles, transport protocol should be
independent of underlying technology.
However, wireless networks invalidate
this principle.


Ignoring properties of wireless medium can
lead to poor TCP performance.
Problem: TCP’s congestion control.
CMPE252A Fall 2004
58
Wireless TCP (2)

Problem: packet loss as congestion
indicator.


When retransmission timer times out,
sender slows down.
Wireless links are lossy!

Dealing with losses in this case should be
re-sending lost segments asap.
CMPE252A Fall 2004
59
Indirect TCP (I-TCP)


[Bakne and Badrinath, 1995].
Split TCP connection in 2: one from sender to
base station and the other from base station to
receiver.



Base station serves as “repeater”: copies segments
between connections in both directions.
Connections are homogeneous; timeouts on 1st.
connection, slow down sender.
Problem: violates TCP’s e2e’ness.

Example: ACKs to sender mean base station received
segments, not necessarily receiver.
CMPE252A Fall 2004
60
Snoop TCP



[Balakrishnan et al., 1995].
Does not break connection.
Modifications to base station’s network layer
code.



Snooping agent on base station observes and caches
TCP segments sent to mobile host and ACKs coming
back.
If it doesn’t see an ACK for a segment or sees
duplicate ACKs, it times out and retransmits.
But source may time out anyway.
CMPE252A Fall 2004
61
Explicit Congestion Notification
CMPE252A Fall 2004
62
ECN

Router-assisted scheme.



DECbit as example of ECN scheme.
In TCP/IP networks most routers have no
way of detecting incipient congestion.
Advantages:



Congestion avoidance.
Detect incipient congestion.
Avoid unnecessary packet drops.
CMPE252A Fall 2004
63
Application’s perspective

Benefits bulk-data and interactive
traffic.


Delay sensitive traffic: packet drops and
retransmissions mean more delay.
Bulk-data traffic: responsiveness.

Even if fast retransmit is used.
CMPE252A Fall 2004
64
Existing ECN mechanisms

In TCP/IP networks.



ICMP Source Quench as the only existing
ECN mechanism.
Generated by routers when they run out of
buffers [RFC 1009].
But Source Quench have not been
recommended [Requirements for IP
Routers] because:


Consume network resources.
Fairness?
CMPE252A Fall 2004
65
TCP’s response

According to RFC 1009, TCP responds
by triggering slow-start.


If packet drop, then timeout or fast
retransmit follows.
ssthresh is set low, therefore, congestion
avoidance becomes too conservative.
CMPE252A Fall 2004
66
Router’s role

Decides when to generate ECN
messages.



Influenced by router queuing policy.
In the paper, they consider RED routers.
Monitoring of average queue size.


If above threshold, probabilistic algorithm to
choose packets to mark (drop or set ECN field).
ECN and Drop Tail.
CMPE252A Fall 2004
67
Proposed TCP response




TCP should be less conservative to ECN
than to drops.
It should respond to single ECN
messages.
It should react to ECN at most once per
RTT.
Normal procedure after timeout.

No further decrease of ssthresh.
CMPE252A Fall 2004
68
Characterizing TCP connection
bandwidth

What does TCP bandwidth depend upon?



Packet loss rate
Round-trip time!
Useful to mathematically characterize this

Can help determine TCP-friendliness

A connection is not TCP-friendly if it uses more
bandwidth than a conformant TCP would under the same
circumstances
CMPE252A Fall 2004
69
The TCP-friendly equation





B(p) is the long-term bandwidth allocated to a
connection
Wmax is the determined by maximum receiver
advertised buffer size.
p is the packet loss rate
RTT is the round-trip time of the connection
T0 is the retransmission timeout
CMPE252A Fall 2004
70