TCP_lec_DR13

ECE544: Communication Networks-II
Spring 2013
D. Raychaudhuri
Includes teaching materials from L. Peterson, Sumathi Gopal, and Sumit Rangwala
Today’s Lecture
• Introduction to transport protocols
• UDP
• TCP
• RTP
The Disconnect
[Figure: Host1 and Host8 communicate across routers R1, R2, and R3 over ETH, FDDI, and PPP links. The applications expect guaranteed service; the IP layer at each hop offers only best-effort delivery.]
• Applications running on hosts need to
communicate
– Require some guarantees from the underlying layer
• Network Layer (IP) provides only best-effort
communication services
– Only between hosts (not applications)
Transport Protocol
[Figure: the Host1-to-Host8 path through R1, R2, and R3 over ETH, FDDI, and PPP links, with a TCP/UDP layer between the application and IP on the two end hosts only.]
• Transport protocol
– Provides services required by applications using the
services provided by the network layer
– The transport layer is the lowest layer in the network
stack that operates end-to-end
Transport Protocols
• Application requirements vs. IP layer limitations
– Guarantee message delivery
• Network may drop messages.
– Deliver messages in the same order they are sent
• Messages may be reordered in the network and may incur long delays.
– Deliver at most one copy of each message
• Messages may be duplicated in the network.
– Support arbitrarily large messages
• Network may limit message size.
– Support synchronization between sender and receiver
– Allow the receiver to apply flow control to the sender
– Support multiple application processes on each host
• Network only supports communication between hosts
– Many more
• Design just a few transport protocols to meet most of the current and
future application requirements
– Each satisfies the requirements for a class of applications
– Many applications=>few transport protocols
Most Popular Transport Protocols
• User Datagram Protocol (UDP)
– Supports multiple application processes on each host
– Option to check messages for correctness with a checksum
• Transmission Control Protocol (TCP)
– Ensures reliable delivery of packets between source and destination
processes
– Ensures in-order delivery of packets to destination process
– Other services
• Real-time Transport Protocol (RTP)
– Serves real-time multimedia applications
– Moves decision making to the applications
– Runs over UDP
• TCP, UDP and RTP satisfy needs of the most common applications
– Applications requiring other functionality usually use UDP for transport
protocol, and implement additional features as part of the application
User Datagram Protocol (UDP): Demultiplexing
• Service: Support for multiple processes on each host to communicate
– Issue: IP only provides communication between hosts (IP addresses)
• Solution
– Add port number and associate a process with a port number
– 4-Tuple Unique Connection Identifier: [SrcPort, SrcIPAddr, DestPort,
DestIPAddr ]
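[Figure: application processes attach to UDP through ports; UDP runs over IP on each host, and the hosts are joined by the network.]
UDP Packet Format
[Figure: 32-bit-wide header. Row 1: SrcPort (bits 0-15), DestPort (bits 16-31). Row 2: Length, Checksum. The Payload follows.]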
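As a rough illustration of port-based demultiplexing, the sketch below packs and parses the 8-byte UDP header with Python's `struct` module and hands the payload to whichever handler is bound to the destination port. The `PortTable` class and the handler labels are hypothetical names made up for this example; a real stack does this inside the kernel.

```python
import struct

# SrcPort, DestPort, Length, Checksum: four 16-bit fields in network byte order
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prepend the 8-byte UDP header; Length covers header plus payload."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram):
    """Split a datagram into (src_port, dst_port, length, checksum, payload)."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return src, dst, length, checksum, datagram[UDP_HEADER.size:length]


class PortTable:
    """Hypothetical demultiplexer: maps a destination port to a receive callback."""

    def __init__(self):
        self.handlers = {}

    def bind(self, port, handler):
        self.handlers[port] = handler

    def deliver(self, datagram):
        src, dst, _, _, payload = parse_datagram(datagram)
        handler = self.handlers.get(dst)
        if handler is None:
            return False  # no process is bound to this port, so drop
        handler(src, payload)
        return True


table = PortTable()
received = []
table.bind(53, lambda src, data: received.append(("dns", src, data)))
table.bind(123, lambda src, data: received.append(("ntp", src, data)))
delivered = table.deliver(build_datagram(40000, 123, b"time?"))
dropped = table.deliver(build_datagram(40000, 9999, b"nobody home"))
```

Only the datagram addressed to a bound port reaches a handler; the one for port 9999 is dropped because no process registered for it.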
User Datagram Protocol (UDP): Error Detection
• Service: Ensure message correctness
– Issue: Packet corruption in transit
• Solution
– Use Checksum. Why isn’t IP checksum enough?
– Includes UDP header, payload, pseudo header
– Pseudo header
• Protocol number, source IP address, destination IP address,
and UDP length
[Figure: UDP packet format. Row 1: SrcPort (bits 0-15), DestPort (bits 16-31). Row 2: Length, Checksum. The Payload follows.]
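One way to see why the pseudo header matters: the IPv4 header checksum covers only the IP header, so it says nothing about the UDP payload or about whether the segment reached the right host. The sketch below computes the 16-bit one's-complement checksum over pseudo header, UDP header, and payload for UDP over IPv4. The function names and the addresses are assumptions chosen for illustration.

```python
import socket
import struct


def ones_complement_sum(data):
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def pseudo_header(src_ip, dst_ip, udp_length):
    """Source IP, destination IP, zero byte, protocol number 17 (UDP), UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    """Checksum over pseudo header + UDP header (checksum field zeroed) + payload."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF  # an all-zero field means "no checksum" in UDP over IPv4


def verify(src_ip, dst_ip, segment):
    """A segment is accepted if the sum, including its checksum field, is 0xFFFF."""
    return ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0xFFFF


payload = b"hello"
csum = udp_checksum("10.0.0.1", "10.0.0.2", 1234, 5678, payload)
segment = struct.pack("!HHHH", 1234, 5678, 8 + len(payload), csum) + payload

intact = verify("10.0.0.1", "10.0.0.2", segment)
misdelivered = verify("10.0.0.1", "10.0.0.3", segment)       # wrong destination host
corrupted = verify("10.0.0.1", "10.0.0.2", segment[:-1] + b"p")  # payload changed
```

In this sketch the intact segment verifies, while both the misdelivered and the corrupted copies fail the check.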
Transmission Control Protocol (TCP)
• First proposed by Vinton Cerf and Robert
Kahn, 1974
– TCP/IP enabled computers of all sizes, from
different vendors, different OSs, to
communicate with each other.
– Carries about 80% of all traffic on the Internet
• Reliable, in-order delivery, connection-oriented, byte-stream service
TCP: Connection-oriented
• Service: Connection-oriented
– Application states the destination once
– Issue: IP is connection-less
• Solution: TCP maintains the connection
state
– Connection Establishment
– Connection Termination
A Simple File Transfer
• Connection Establishment
– Server:
• passive open and wait for connection (on a port)
– Client:
• Active open and initiate connection establishment
• After connection establishment
– Data transport (more later)
• Terminate connection
– Both sides independently close their half of the
connection
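The steps above map fairly directly onto the Berkeley sockets API. The following is a minimal loopback sketch using Python's standard `socket` and `threading` modules; the "file" is just an in-memory byte string, and the helper name `serve_once` is made up for this example.

```python
import socket
import threading


def serve_once(listener, chunks):
    """Passive side: accept one connection and read until the client closes its half."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:           # empty read: the client closed its sending half
                break
            chunks.append(data)
        conn.sendall(b"done")      # the server-to-client half is still open


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
listener.listen(1)                 # passive open: wait for a connection on the port
port = listener.getsockname()[1]

chunks = []
server = threading.Thread(target=serve_once, args=(listener, chunks))
server.start()

file_bytes = b"example file contents\n" * 100
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))    # active open: connection establishment
client.sendall(file_bytes)             # data transport
client.shutdown(socket.SHUT_WR)        # close only the client-to-server half

reply = b""
while True:                            # the other half still delivers data
    part = client.recv(16)
    if not part:
        break
    reply += part

client.close()
server.join()
listener.close()
received = b"".join(chunks)
```

The `shutdown(SHUT_WR)` call is what lets each side close its half independently: the client stops sending but can still read the server's reply.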
TCP: Packet Format
• Flags
– SYN, FIN, ACK, RESET, URG,
PUSH
• Sequence number
– Sequence number of the
first byte of data in the
segment
• It is an abstract number (more
later)
• Acknowledgement
– Next sequence number
expected from the sender
Connection Establishment
• Server
– Informs TCP about the listening port
• Up-call registration
• Client
– Performs three-way handshake
– SYN and ACK flags in the header are used
– Initial sequence numbers x and y selected at random
[Figure: timeline between the active participant (client) and the passive participant (server), showing connection establishment followed by data transport.]
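To make the handshake concrete, here is a small sketch of the three segments and how x and y show up in the sequence and acknowledgement fields. It relies on the fact that a SYN consumes one sequence number, so each side acknowledges the other's initial number plus one. The dictionary layout is an illustrative choice, not a real packet format.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def three_way_handshake(rng):
    """Return the three handshake segments as dictionaries of header fields."""
    x = rng.randrange(SEQ_SPACE)  # client's initial sequence number
    y = rng.randrange(SEQ_SPACE)  # server's initial sequence number
    syn = {"from": "client", "flags": {"SYN"}, "seq": x}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": y, "ack": (x + 1) % SEQ_SPACE}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (x + 1) % SEQ_SPACE, "ack": (y + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


segments = three_way_handshake(random.Random(544))
for seg in segments:
    print(seg["from"], sorted(seg["flags"]), "seq =", seg["seq"], "ack =", seg.get("ack"))
```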
Connection Termination
• Any side can terminate the
connection
• Each side closes its half of
the connection
independently
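– A connection may be half-closed
[Figure: after one side closes its half, it can only receive data; the other side keeps issuing data writes, and each is answered with a data ACK.]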
TCP State-Transition
• Max segment lifetime (MSL): 120 sec (recommended)
TCP: Byte-stream
• Service: Byte-stream
– Application reads or writes a stream of bytes to the transport
– Issue: IP is packet-oriented
• Solution: TCP maintains a local buffer
– Chop the stream into packets and transmit (sender)
– Coalesce data from packets to form a stream (receiver)
TCP: Reliable and Ordered Delivery
• Service: Reliable Delivery of byte-stream
• Solution: Sliding Window Protocol
– Studied earlier
– Buffer size at the receiver should be at least receiver window size
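[Figure: the sending application writes into the sender's TCP buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the receiver's TCP buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd. The receiver window size is marked on both buffers.]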
But what if the receiving application cannot read data
fast enough?
Slow Receiver
• Receiver cannot read bytes at the speed the network is
delivering data
– Requires a buffer > receiver window size
• If receiver window size is kept constant, in worst case requires
infinite buffer
[Figure: sender and receiver TCP buffers with the same pointers (LastByteAcked, LastByteSent, LastByteWritten; LastByteRead, NextByteExpected, LastByteRcvd). The sender side is marked "< Receiver Window Size"; the receiver side is marked "Receiver Window Size".]
TCP: Flow Control
• Flow Control
– “Prevent sender from overrunning the capacity (buffer) of the receiver”
• Solution: Use adaptive receiver window size
– Goal is to keep LastByteRcvd (C) – LastByteRead (A) <= MaxRcvBuffer
– Every packet carries ACK and AdvertisedWindow
[Figure: sender buffer with LastByteAcked (J), LastByteSent (K), and LastByteWritten (I); receiver buffer with LastByteRead (A), NextByteExpected (B), and LastByteRcvd (C).]
• Sender side
– LastByteSent (K) – LastByteAcked (J) <= AdvertisedWindow
– EffWin = AdvertisedWindow – (LastByteSent – LastByteAcked)
– LastByteWritten – LastByteAcked <= MaxSendBuffer
• Receiver side
– AdvertisedWindow = MaxRcvBuffer – ((NextByteExpected – 1) – LastByteRead)
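The window arithmetic can be checked with a few numbers. The sketch below uses the pointer names from the slide; the buffer size and byte counts are made-up values for illustration, and the two small classes are not meant to model a real TCP implementation.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    max_rcv_buffer: int
    last_byte_read: int = 0       # (A)
    next_byte_expected: int = 1   # (B)

    def advertised_window(self):
        # AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
        return self.max_rcv_buffer - ((self.next_byte_expected - 1) - self.last_byte_read)


@dataclass
class Sender:
    last_byte_acked: int = 0      # (J)
    last_byte_sent: int = 0       # (K)

    def effective_window(self, advertised_window):
        # EffWin = AdvertisedWindow - (LastByteSent - LastByteAcked)
        return advertised_window - (self.last_byte_sent - self.last_byte_acked)


# 3000 bytes arrived in order, the application has read only 1000 of them.
receiver = Receiver(max_rcv_buffer=4096, last_byte_read=1000, next_byte_expected=3001)
adv = receiver.advertised_window()

# 3000 bytes acknowledged, 500 more bytes still in flight.
sender = Sender(last_byte_acked=3000, last_byte_sent=3500)
eff = sender.effective_window(adv)

# Slow receiver: data keeps arriving but the application reads nothing more.
stalled = Receiver(max_rcv_buffer=4096, last_byte_read=1000, next_byte_expected=5097)
closed = stalled.advertised_window()
```

With these numbers the receiver advertises 2096 bytes, the sender may send 1596 more, and the stalled receiver ends up advertising a window of 0.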
Sequence Number Wrap Around
• Protect against SequenceNum wrap around
– Sliding window
• Seq # space >= 2 x WinSize
• For TCP: 2^32 >> 2 x 2^16
– Seq # should not wraparound within a MSL (120 sec)
period of time
– For OC-48 (2.5 Gbps), time until wraparound: 14 sec
• TCP extension to the sequence # space for
protecting against seq # wrapping around
– Add 32-bit timestamp as optional header
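The 14-second figure follows from dividing the 2^32-byte sequence space by the link rate in bytes per second. A quick check, with the 10 Mbps row added only for comparison:

```python
SEQ_SPACE_BYTES = 2 ** 32   # one sequence number per byte
MSL_SECONDS = 120           # maximum segment lifetime from the slide


def wraparound_seconds(bandwidth_bps):
    """Time to send 2**32 bytes at the given link rate in bits per second."""
    return SEQ_SPACE_BYTES / (bandwidth_bps / 8)


links = {"10 Mbps": 10e6, "OC-48 (2.5 Gbps)": 2.5e9}
for name, bps in links.items():
    seconds = wraparound_seconds(bps)
    verdict = "longer than" if seconds > MSL_SECONDS else "shorter than"
    print(f"{name}: wraps after {seconds:,.1f} s, {verdict} one MSL")
```

At 2.5 Gbps the sequence space wraps in under 14 seconds, well inside one MSL, which is why the timestamp extension is needed on fast links.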
Keep the Pipe Full
• AdvertisedWindow: 2^16 => 64 KB
– Must be big enough to allow the sender to keep the pipe full
(assume that the receiver has enough buffer to handle
the data)
– If RTT = 100 ms,
• Delay x Bandwidth = 122 KB for 10 Mbps link
• Delay x Bandwidth = 1.2 MB for 100 Mbps link
(AdvertisedWindow is not large enough)
• TCP Extension:
– Scaling factor option for AdvertisedWindow,
• e.g., use 16-byte units of data
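The delay x bandwidth numbers can be reproduced directly, along with the smallest power-of-two scaling unit that would let a 16-bit window field cover them. The helper names are made up for this sketch, and the scaling model (window value times 2^shift bytes) is a simplified reading of the scaling-factor option.

```python
MAX_WINDOW_FIELD = 2 ** 16 - 1  # largest value a 16-bit AdvertisedWindow can carry


def delay_bandwidth_bytes(rtt_ms, bandwidth_bps):
    """Bytes needed in flight to keep the pipe full for one RTT."""
    return rtt_ms * bandwidth_bps // 8000


def min_scale_shift(required_bytes):
    """Smallest shift s such that the window field, in units of 2**s bytes, covers the pipe."""
    shift = 0
    while (MAX_WINDOW_FIELD << shift) < required_bytes:
        shift += 1
    return shift


for label, bps in (("10 Mbps", 10_000_000), ("100 Mbps", 100_000_000)):
    pipe = delay_bandwidth_bytes(100, bps)
    shift = min_scale_shift(pipe)
    print(f"{label}: pipe = {pipe / 1024:.0f} KB, needs units of {2 ** shift} bytes")
```

Under these assumptions the 10 Mbps pipe (about 122 KB) needs 2-byte units and the 100 Mbps pipe (about 1.2 MB) needs 32-byte units; the 16-byte units in the example above would cover roughly 1 MB.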
TCP Error Control
• Cumulative ACK: ACK the highest contiguous
bytes received
– Same as studied before
– Extension: Selective ACK (SACK), ACK additional blocks of
received data in TCP optional header
• Timeout Timer
– If timeout too soon
• unnecessarily retransmit → adds load to network
– If timeout too late
• Increases latency
• Limits the throughput.
TCP Timeout
• Issue: RTT in a wide area network varies substantially
• Solution: Adaptive Timeout
• Original Algorithm:
– EstimatedRTT = a x EstimatedRTT + (1-a) x SampleRTT
0 ≤ a ≤ 1
– Timeout = β x EstimatedRTT (β = 2)
• Problem
– Does not distinguish whether the ACK is for original transmission
or retransmission (suggestions?)
– Constant β is not good.
• Assumes constant variance
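A short numerical sketch of the original algorithm shows the second problem. The value a = 0.875 and the RTT samples (in milliseconds) are assumptions chosen for illustration; the slide only constrains a to lie between 0 and 1.

```python
def original_timeout(samples, alpha=0.875, beta=2.0):
    """Return (EstimatedRTT, Timeout) after each SampleRTT, starting from the first sample."""
    estimated = samples[0]
    results = []
    for sample in samples:
        estimated = alpha * estimated + (1 - alpha) * sample
        results.append((estimated, beta * estimated))
    return results


# RTT is steady at 100 ms, then jumps to 300 ms.
trace = original_timeout([100, 100, 100, 300])
for estimated, timeout in trace:
    print(f"EstimatedRTT = {estimated:.1f} ms, Timeout = {timeout:.1f} ms")
```

After the jump the estimate has only moved to 125 ms, so the timeout is 250 ms while the real RTT is 300 ms: with a fixed β, a sudden increase in RTT triggers an unnecessary retransmission.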
TCP Timeout
• Karn/Partridge Algorithm
– Whenever TCP retransmits a segment, it stops taking samples of
the RTT
• Only measure SampleRTT for segments that have been sent
only once
– Each time TCP retransmits, set the next timeout to be twice the
last timeout
• Relieves congestion
• Jacobson/Karels Algorithm: Adaptive variance (uses mean
deviation)
Difference = SampleRTT - EstimatedRTT
EstimatedRTT = EstimatedRTT + (d x Difference) → (same as in original, with d = 1 - a)
Deviation = Deviation + d x (|Difference| - Deviation)
Timeout = m x EstimatedRTT + f x Deviation
0 ≤ d ≤ 1
(default: set m = 1 and f = 4)
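The two algorithms combine naturally in one estimator. The sketch below is one possible arrangement, not a reference implementation: the class name, the choice d = 0.125, and the rule that a fresh valid sample resets the backoff are all assumptions made for this example.

```python
class RttEstimator:
    """Jacobson/Karels timeout with Karn/Partridge sampling and backoff."""

    def __init__(self, first_sample, d=0.125, m=1.0, f=4.0):
        self.estimated_rtt = float(first_sample)
        self.deviation = 0.0
        self.d, self.m, self.f = d, m, f
        self.backoff = 1  # doubled on each retransmission

    def on_ack(self, sample_rtt, was_retransmitted):
        if was_retransmitted:
            return  # Karn/Partridge: the sample is ambiguous, so ignore it
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.d * difference
        self.deviation += self.d * (abs(difference) - self.deviation)
        self.backoff = 1  # assumption: a clean sample ends the backoff

    def on_retransmit(self):
        self.backoff *= 2  # next timeout is twice the last one

    def timeout(self):
        return self.backoff * (self.m * self.estimated_rtt + self.f * self.deviation)


est = RttEstimator(first_sample=100)
est.on_ack(100, was_retransmitted=False)
steady = est.timeout()
est.on_ack(180, was_retransmitted=False)   # one noisy sample widens the timeout
after_jitter = est.timeout()
est.on_retransmit()
backed_off = est.timeout()
est.on_ack(500, was_retransmitted=True)    # ignored: segment was sent twice
unchanged = est.timeout()
```

With a steady 100 ms RTT the timeout is 100 ms; one 180 ms sample raises it to 150 ms (estimate 110 ms plus 4 x 10 ms deviation), and a retransmission doubles that to 300 ms.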
Triggering Transmission
• When to transmit a segment:
– small segments subject to large overhead
• Reach max segment size (MSS): the size of the
largest segment TCP can send without causing the
local IP to fragment
– MSS = local MTU – IP & TCP header
• The sending process explicitly asks TCP to
transmit, “push”
TCP Silly Window Syndrome
• TCP Silly Window Syndrome
– Sender has MSS bytes of data to send, but window is closed
– ACK arrives with a small window
– Sender sends a small segment (high overhead)
– Receiver advertises a small window
– Sender sends another small segment
– Repeat the above
• To solve: Nagle’s Algorithm
– When the application has data to send
• If both available data and the window >= MSS
– Send a full segment
• Else
– If there is unACKed data in flight
» Buffer the new data until an ACK arrives
– Else
» Send all the new data now
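The decision rule above fits in a few lines. This is a sketch of the rule as stated on the slide, with one labeled addition: when nothing is in flight it sends at most what the window allows, which the slide does not spell out. The function name and the 1460-byte MSS are illustrative assumptions.

```python
def nagle_bytes_to_send(available_bytes, window_bytes, mss, unacked_in_flight):
    """Return how many bytes to send now; 0 means keep buffering."""
    if available_bytes >= mss and window_bytes >= mss:
        return mss                      # send a full segment
    if unacked_in_flight:
        return 0                        # buffer the new data until an ACK arrives
    # Nothing in flight: send the new data now (capped by the window, an assumption).
    return min(available_bytes, window_bytes)


MSS = 1460
full_segment = nagle_bytes_to_send(3000, 4000, MSS, unacked_in_flight=True)
held_back = nagle_bytes_to_send(10, 4000, MSS, unacked_in_flight=True)
sent_small = nagle_bytes_to_send(10, 4000, MSS, unacked_in_flight=False)
```

The effect is that at most one small segment is outstanding at a time, so an interactive sender pays the small-segment overhead once per round trip rather than once per keystroke.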
TCP Deadlock
• TCP Deadlock
– receiver advertises a window size of 0, the sender
stops sending data
– the window size update from the receiver is lost
• To solve it:
– the sender starts the persist timer when
AdvertisedWindow = 0
– When the persist timer expires, the sender sends a
small packet
TCP Services
• Connection-oriented
• Byte-stream service
• Reliable and In-order
• Flow Control
• Error Control
• Congestion Control (next session)
Congestion
[Figure: Source 1, Source 2, and Source 3 send through a shared network toward Dest 1 and Dest 2.]
Even with flow control, packets might not reach the destination
• When the network cannot support the sender’s
rate
– Queues at the network elements overflow
Congestion Control vs. Flow Control
• Congestion Control
– Mechanism to prevent sender from
overrunning the capacity of the network
• When network is the bottleneck
• Flow Control
– Mechanism to prevent sender from
overrunning the capacity of the receiver
• When receiver is the bottleneck
Congestion Control: Design Approach
• Maintain another window at the sender called
CongestionWindow (cwnd)
– CongestionWindow is the max number of packets allowed in the
network
• Number of unACKed packets at the sender.
• Key: How to calculate congestion window (cwnd)
– Various approaches possible
– TCP estimates it based on observed packet losses
• Assumes packet loss as indication of congestion
• Since we don’t know whether the network or the receiver is
the bottleneck
– MaxWindow = MIN(CongestionWindow, AdvertisedWindow)
– EffectiveWin = MaxWindow – (LastByteSent – LastByteAcked)
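A small numerical sketch of the two formulas, showing which window is the binding one in each case. The byte counts are made up, and clamping the result at zero is an addition for this example rather than something the slide states.

```python
def effective_window(cwnd, advertised_window, last_byte_sent, last_byte_acked):
    """EffectiveWin = MIN(cwnd, AdvertisedWindow) - (LastByteSent - LastByteAcked)."""
    max_window = min(cwnd, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    return max(0, max_window - in_flight)  # clamp at 0: an assumption of this sketch


# Network is the bottleneck: cwnd is smaller than what the receiver advertises.
network_limited = effective_window(cwnd=8000, advertised_window=16000,
                                   last_byte_sent=12000, last_byte_acked=6000)

# Receiver is the bottleneck: the advertised window is the smaller one.
receiver_limited = effective_window(cwnd=32000, advertised_window=4000,
                                    last_byte_sent=10000, last_byte_acked=6000)
```

In the first case the sender may send 2000 more bytes; in the second it must wait, because the 4000 bytes already in flight fill the receiver's advertised window.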
Congestion Avoidance: (AIMD)
• If no congestion in the network (increase conservatively)
– Increase the congestion window additively every RTT
• Every RTT: w = w + 1 (w = cwnd in segments)
• Every ACK reception: w = w + 1/w (w = cwnd in segments)
• Every ACK reception: cwnd = cwnd + MSS*(MSS/cwnd) (cwnd in bytes)
• If congestion in the network (decrease aggressively)
– Decrease the congestion window multiplicatively, immediately
• cwnd = cwnd/2 (cwnd in bytes)
• How is congestion detected?
– Estimated (more later)
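To see that the per-ACK rule adds up to roughly one MSS per RTT, the sketch below applies it for one window's worth of ACKs and then halves the window. The 1000-byte MSS and the starting window of 10 segments are illustrative assumptions.

```python
MSS = 1000  # bytes; illustrative value


def additive_increase(cwnd):
    """Per-ACK update in bytes: cwnd = cwnd + MSS * (MSS / cwnd)."""
    return cwnd + MSS * (MSS / cwnd)


def multiplicative_decrease(cwnd):
    """On congestion: cwnd = cwnd / 2."""
    return cwnd / 2


cwnd = 10 * MSS
for _ in range(10):   # about one RTT's worth of ACKs at a window of 10 segments
    cwnd = additive_increase(cwnd)
after_one_rtt = cwnd
after_loss = multiplicative_decrease(after_one_rtt)
print(f"after one RTT: {after_one_rtt:.0f} bytes, after a loss: {after_loss:.0f} bytes")
```

The window grows by slightly less than one MSS over the RTT because each ACK's increment shrinks as cwnd grows.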
Congestion Avoidance: (AIMD)
[Figure: CongestionWindow size vs. time, with the startup time marked.]
• TCP’s sawtooth pattern
• Issues with additive increase
– takes too long to ramp up a connection from the
beginning
– The entire advertised window may be reopened when
a lost packet is retransmitted and a single cumulative
ACK is received by the sender
TCP “Slow Start”: To start quickly!
• Maintain another variable slow
start threshold (ssthresh)
– Last known stable rate
– If (cwnd > ssthresh)
• State = congestion avoidance
– Else
• State = slow start
• In Slow start
– Increase the congestion window
exponentially every RTT
• Every ACK reception: w = w + 1 (w = cwnd in segments)
• Every ACK reception: cwnd = cwnd + MSS (cwnd in bytes)
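Adding one MSS per ACK doubles the window each RTT, since a window of w segments produces w ACKs. The sketch below traces cwnd at the start of each RTT, assuming no loss, one ACK per segment, and illustrative values for MSS and ssthresh; integer division is used for the congestion-avoidance step, so those values are rounded down.

```python
MSS = 1000  # bytes; illustrative value


def cwnd_per_rtt(ssthresh, rounds):
    """cwnd in bytes at the start of each RTT, starting from one segment."""
    cwnd = MSS
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        for _ in range(cwnd // MSS):       # one ACK per segment sent this RTT
            if cwnd > ssthresh:            # congestion avoidance
                cwnd += MSS * MSS // cwnd
            else:                          # slow start
                cwnd += MSS
    return history


history = cwnd_per_rtt(ssthresh=8 * MSS, rounds=5)
print(history)
```

The first four values double (1, 2, 4, 8 segments); once cwnd passes ssthresh, growth drops to about one MSS per RTT.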
TCP: Congestion Detection and Retransmit
• Loss of packet indicates congestion
– Timer Timeouts (No ACK)
• Set according to Jacobson/Karels algorithm
– On timer timeout
• ssthresh = max(2*MSS, effwin/2); cwnd = MSS
– Notice this will cause TCP to go into slow start
• Issue: takes a long time to detect a packet loss
– Affects throughput
– Any other quicker way of detecting a packet
loss?
Fast Retransmit
• Observation: A series of
duplicate ACKs might mean a
packet loss
• Solution
– Every time the receiver receives an
out-of-order packet, it sends a
duplicate ACK
– Sender retransmits the missing
packet after it receives some
number of duplicate ACKs (e.g.
3 duplicate ACKs)
• Fast Retransmit does not replace
timeouts
• Issue: Reduces latency (early
retransmit) but still incurs loss in
throughput (slow start after
packet loss )
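[Figure: sender transmits PKT 1 through PKT 6, and PKT 3 is lost. Receiver returns ACK 1 and ACK 2, then three duplicate ACK 2s as PKT 4, 5, and 6 arrive. Sender retransmits PKT 3, and receiver answers with ACK 6.]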
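The exchange in the figure can be replayed with a small simulation. It numbers packets rather than bytes, as the figure does, and the function names are made up for this sketch.

```python
DUP_ACK_THRESHOLD = 3


def receiver_acks(arrivals):
    """Cumulative ACK (highest in-order packet number) sent for each arriving packet."""
    received = set()
    highest_in_order = 0
    acks = []
    for pkt in arrivals:
        received.add(pkt)
        while highest_in_order + 1 in received:
            highest_in_order += 1
        acks.append(highest_in_order)
    return acks


def fast_retransmits(acks):
    """Packet numbers the sender retransmits on the third duplicate ACK."""
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                retransmitted.append(ack + 1)  # the packet right after the last ACKed one
        else:
            last_ack, dup_count = ack, 0
    return retransmitted


acks = receiver_acks([1, 2, 4, 5, 6])            # PKT 3 never arrives
resend = fast_retransmits(acks)
final_ack = receiver_acks([1, 2, 4, 5, 6, 3])[-1]  # retransmitted PKT 3 fills the gap
```

The receiver's ACK stream is 1, 2, 2, 2, 2; the third duplicate triggers retransmission of packet 3, after which the cumulative ACK jumps to 6.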
Fast Recovery
• Transmit a packet for
every ACK received till the
retransmitted packet is
ACK’d
– ssthresh = max(2*MSS, cwnd/2);
cwnd = ssthresh + 3
– On every ACK until the ACK of the
retransmitted packet
• cwnd = cwnd + 1
• On reception of ACK of
retransmitted packet
– Start congestion avoidance
instead of slow start
• cwnd = ssthresh
Putting it all together (TCP Reno)
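One way to picture how slow start, congestion avoidance, fast retransmit, and fast recovery fit together is as a small state machine driven by three events. The sketch below counts cwnd in segments, uses cwnd in place of the effective window when setting ssthresh on a timeout, and uses hypothetical class and state names; it is a teaching sketch under those assumptions, not a faithful Reno implementation.

```python
class RenoSender:
    """Toy Reno-style congestion state, with cwnd counted in segments."""

    def __init__(self, ssthresh=16):
        self.cwnd = 1.0
        self.ssthresh = float(ssthresh)
        self.state = "slow_start"
        self.dup_acks = 0

    def on_new_ack(self):
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh          # retransmitted packet ACKed: deflate
            self.state = "congestion_avoidance"
        elif self.cwnd > self.ssthresh:
            self.state = "congestion_avoidance"
            self.cwnd += 1.0 / self.cwnd       # about +1 segment per RTT
        else:
            self.state = "slow_start"          # also taken when cwnd == ssthresh
            self.cwnd += 1.0                   # doubles cwnd every RTT
        self.dup_acks = 0

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += 1.0                   # allow one new packet per duplicate ACK
            return
        self.dup_acks += 1
        if self.dup_acks == 3:                 # fast retransmit
            self.ssthresh = max(2.0, self.cwnd / 2)
            self.cwnd = self.ssthresh + 3
            self.state = "fast_recovery"

    def on_timeout(self):
        self.ssthresh = max(2.0, self.cwnd / 2)
        self.cwnd = 1.0                        # back to slow start
        self.state = "slow_start"
        self.dup_acks = 0


s = RenoSender(ssthresh=8)
for _ in range(7):
    s.on_new_ack()
grown = (s.cwnd, s.state)
for _ in range(3):
    s.on_dup_ack()
recovering = (s.cwnd, s.ssthresh, s.state)
s.on_new_ack()
recovered = (s.cwnd, s.state)
s.on_timeout()
after_timeout = (s.cwnd, s.ssthresh, s.state)
```

In this trace the window grows to 8 segments in slow start, three duplicate ACKs set ssthresh to 4 and cwnd to 7, the ACK for the retransmission drops cwnd to 4 in congestion avoidance, and a later timeout resets cwnd to 1.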
Homework
• 5.13 (3rd ed and 4th ed)
• 5.16
• 5.28
• 5.34
• 5.39
Due 4/5