TCP - Design and Analysis of Algorithms

Download Report

Transcript TCP - Design and Analysis of Algorithms

Transmission Control Protocol (TCP),
Tahir Azim
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
1
The Need for TCP



Network Layer (IP) provides a best-effort service
Need to build a reliable layer on top of IP
Possible problems sending messages from one host
to another:






May be lost
May be reordered
May be delivered multiple times
May be delivered after a long delay
May be broken into smaller messages
May be corrupted
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
2
Need for TCP 2

Support services needed by the application








Multiple connections per host
Guaranteed delivery
Messages delivered in the order they were sent
Messages delivered at most once
No limit on message size
Messages not corrupted
Synchronization between sender and receiver
Flow control
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
3
Can this be done hop-by-hop
instead of end-to-end?


Think of this like the postal service
One major problem: Nodes would not know what
happened two or more hops ahead of them
 Routing tables incorrect, packets dropped, machine crashed,
link failed…
 Only ends can be sure of receipt/non-receipt of a message

For others: read “End-to-end arguments for
System Design”… Classic, must-read paper, could
be on exam
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
4
TCP Usage Model

Connection setup
 􀁻 3-way handshake

Data transport
 Sender writes data
 TCP
 Breaks data into segments
• Sends each segment over IP
• Retransmits, reorders and removes duplicates as necessary
• Receiver reads some data

Teardown
 4 step exchange
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
5
TCP Characteristics

TCP is connection-oriented.




TCP provides a stream-of-bytes service.
TCP is full-duplex; data can flow in both directions
simultaneously
TCP is reliable:







3-way handshake used for connection setup.
Acknowledgements indicate delivery of data.
Checksums are used to detect corrupted data.
Detects missing sequence numbers or mis-sequenced data.
Corrupted data is retransmitted after a timeout.
Out of order bytes of data are reordered.
(Window-based) Flow control prevents over-run of receiver.
TCP uses congestion control to share network capacity
among users. We’ll study this in the next lectures.
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
6
TCP is connection-oriented
(Active)
Client
(Passive)
Server
Syn (ISNA)
(Active)
Client
Syn (ISNB) + Ack (ISNA+1)
(Passive)
Server
Fin (SNA)
(Data +) Ack (SNA+1)
Ack (ISNB+1)
Fin (SNB)
Ack (SNB+1)
Connection Setup
3-way handshake
Connection Close/Teardown
2 x 2-way handshake
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
7
TCP supports a “stream of
bytes” service
Host A
Host B
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
8
…which is emulated using TCP
“segments”
Host A
Segment sent when:
TCP Data
Host B
1.
TCP Data
Segment full (MSS bytes,
default 352),
2. Not full, but times out, or
3. “Pushed” by application.
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
9
The TCP Segment Format
IP Data
TCP Data
0
TCP Hdr
15
Src port
IP Hdr
31
Dst port
Sequence #
Ack Sequence #
RSVD
6
Flags
URG
ACK
PSH
RST
SYN
FIN
HLEN
4
Checksum
Window Size
Urg Pointer
(TCP Options)
TCP Data
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
10
TCP Segment Header



16-bit source and destination ports
32-bit send and ACK sequence numbers
4-bit header length in 4-byte words
 􀁻 Minimum value of 5
 􀁻 Offset to first data byte

6 1-bit flags






􀁻 URG: Segment contains urgent data
􀁻 ACK: ACK sequence number is valid
􀁻 PSH: Do not delay delivery of data
􀁻 RST: Reset connection (reject or abn. termination)
􀁻 SYN: Synchronize segment for setup
􀁻 FIN: Final segment for teardown
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
11
TCP Segment header

16-bit advertised window
 􀁻 Space remaining in receive window

16-bit checksum
 􀁻 Uses IP checksum algorithm
 􀁻 Computed on header and data

16-bit urgent data pointer
 􀁻 If URG = 1
 􀁻 Index of last byte of urgent data in segment
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
12
TCP Options

Negotiate maximum segment size (MSS)
 Each host suggests a value
 Minimum of two values is chosen
 Prevents IP fragmentation over first and last hops

Packet timestamp
 Allows RTT calculation for retransmitted packets
 Extends sequence number space for identification of stray
packets

Negotiate advertised window granularity
 Allows windows to be scaled to larger sizes
 Good for routes with large bandwidth-delay products
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
13
Sequence Numbers
Host A
ISN (initial sequence number)
Sequence
number = 1st
byte
Host B
TCP Data
TCP
HDR
TCP Data
Ack sequence
number = next
expected byte
TCP
HDR
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
14
TCP State Transition Diagram

Original ASCII art diagram in RFC 793
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
15
TCP State Descriptions
CLOSED
LISTEN
Disconnected
Waiting for incoming connection
SYN_RCVD
Connection request received
SYN_SENT
Connection request sent
ESTABLISHED Connection ready for data transport
CLOSE_WAIT
Connection closed by peer
LAST_ACK
Connection closed by peer, closed locally, await ACK
FIN_WAIT_1
FIN_WAIT_2
Connection closed locally
Connection closed locally and ACK’d
CLOSING
Connection closed by both sides simultaneously
TIME_WAIT
Wait for network to discard related packets
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
16
TCP State Transition Diagram
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
17
TCP State Transition Diagram
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
18
TCP Sliding Window
How much data can a TCP sender have
outstanding in the network?
 How much data should TCP retransmit
when an error occurs? Just selectively
repeat the missing data?
 How does the TCP sender avoid overrunning the receiver’s buffers?

Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
19
TCP Sliding Window
Window Size
Data ACK’d
Outstanding
Un-ack’d data
Data OK
to send
Data not OK
to send yet
Window is meaningful to the sender.
 Current window size is “advertised” by receiver
(usually 4k – 8k Bytes when connection set-up).
 TCP’s Retransmission policy is “Go Back N”.

Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
20
TCP Sliding Window Protocol
– Sender Side




Variables:
 LastByteAcked: The last byte acked by receiver
 LastByteSent: The last byte sent by sender
 LastByteWritten: The last byte written by application layer onto the transport
layer
LastByteAcked <= LastByteSent
LastByteSent <= LastByteWritten
Buffer bytes between LastByteAcked and LastByteWritten
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
21
TCP Sliding Window Protocol
– Receiver Side

Variables:






LastByteRcvd: The last byte received by receiver
NextByteExpected : The next byte still to be received from the sender
LastByteRead: The last byte passed to the application layer by the transport layer
LastByteRead < NextByteExpected
NextByteExpected <= LastByteRcvd + 1
Buffer bytes between LastByteRead and LastByteRcvd
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
22
TCP Flow Control

Receiving side
 Receive buffer size = MaxRcvBuffer
 LastByteRcvd - LastByteRead < = MaxRcvBuffer
 AdvertisedWindow= MaxRcvBuffer - (NextByteExpected LastByteRead)
• Shrinks as data arrives and
• Grows as the application consumes data

Sending side
 Send buffer size = MaxSendBuffer
 LastByteSent - LastByteAcked < = AdvertisedWindow
 EffectiveWindow = AdvertisedWindow - (LastByteSent –
LastByteAcked)
• EffectiveWindow > 0 to send data
 LastByteWritten - LastByteAcked < = MaxSendBuffer
 block sender if (LastByteWritten – LastByteAcked) >
MaxSenderBuffer
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
23
TCP Sliding Window
Round-trip time
Round-trip time
Window Size
Host A
Host B
???
Window Size
ACK
(1) RTT > Window size
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
Window Size
ACK
ACK
(2) RTT = Window size
24
TCP: Retransmission and
Timeouts
Round-trip time (RTT)
Retransmission TimeOut (RTO)
Guard
Band
Host A
Estimated RTT
Data1
Data2
ACK
ACK
Host B
TCP uses an adaptive retransmission timeout value:
Congestion RTT changes
Changes in Routing frequently
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
25
TCP: Retransmission and Timeouts
Picking the RTO is important:


Pick a values that’s too big and it will wait too long to
retransmit a packet,
Pick a value too small, and it will unnecessarily retransmit
packets.
The original algorithm for picking RTO:
1. EstimatedRTTk=  EstimatedRTTk-1 + (1 - ) SampleRTT
2. RTO = 2 * EstimatedRTT
Determined
empirically
Characteristics of the original algorithm:


Variance is assumed to be fixed.
But in practice, variance increases as congestion increases.
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
26
TCP: Retransmission and Timeouts


Router queues grow when there is more
traffic, until they become unstable.
As load grows, variance of delay grows
rapidly.
Average Queueing Delay

There will be some (unknown) distribution
of RTTs.
We are trying to estimate an RTO to
minimize the probability of a false timeout.
Probability

Variance
grows rapidly
with load
variance
mean
RTT
Load
(Amount of traffic
arriving to router)
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
27
TCP: Retransmission and Timeouts
Newer Algorithm includes estimate of variance in RTT:
Same as
before
Difference = SampleRTT - EstimatedRTT
 EstimatedRTTk = EstimatedRTTk-1 + (*Difference)
 Deviation = Deviation + *( |Difference| - Deviation )


RTO =  * EstimatedRTT +  * Deviation
1
4
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
28
TCP: Retransmission and Timeouts
Karn’s Algorithm
Host A
Host B
Host A
Retransmission
Wrong RTT
Sample
Host B
Retransmission
Wrong RTT
Sample
Problem:
How can we estimate RTT when packets are retransmitted?
Solution:
On retransmission, don’t update estimated RTT (and double RTO).
Courtesy: Nick McKeown, Stanford
Umar Kalim, NIIT
Robin Kravets, UIUC
29