TCP
1
Contents
• TCP
• TCP connection
• TCP flow control
• TCP congestion control
• TCP timer
• UDP
2
Transport Layer
• End-to-end data transfer
(cf.) DLP (data link protocol)
– data transfer between adjacent nodes
[Figure: two hosts connected through intermediate nodes. A DLP runs on each hop; the transport layer protocol runs between the two hosts.]
3
Transport Layer services
• Addressing the application process and delivering data
between processes
• What else should the transport layer do for application?
[Figure: application processes AP1, AP2, and AP3 on each host communicate end to end through the transport layer. Below it, IP and the network access layers (network access 1, network access 2) carry the data across subnet 1 and subnet 2.]
4
What the transport layer should do in the Internet(1)
• IP provides unreliable service to the upper layers.
– no error control
• IP only verifies the header checksum; it neither sends ACKs nor retransmits.
– no flow control / no congestion control
• IP has no function to control the transmission rate according to the state of the receiver or the network.
– no duplicate packet detection
• When packets are delayed beyond the predefined time limit by network congestion or a detour, the sender retransmits them even though the originals are not lost.
• Likewise, when the ACKs do not reach the sender within the predefined time limit, the sender times out and retransmits the same packets.
• The receiver's IP cannot detect those duplicate packets and delivers them to the upper layers.
– out-of-order packet delivery
• Because IP uses datagram mode, packets can take different paths and may therefore arrive out of order.
5
What the transport layer should do in the Internet(2)
• Application data delivered by IP might
– be lost due to errors or congestion, or
– arrive at the destination out of order, or
– be duplicated at the destination.
• Thus, the transport layer protocol in the Internet should
– provide reliable service to the application layer if the application requires it. Otherwise all the dirty work must be done by the application itself.
– There are two transport protocols in the Internet.
• TCP – provides reliable service.
• UDP – provides simple, streamlined delivery to applications that do not need reliable service.
6
Internet transport layer protocols
• TCP (Transmission Control Protocol)
– provides reliable service to the application layer.
• Multiplexing (addressing the application services)
• Error control (error detection and retransmission)
• Flow control
• Congestion control
• In-order delivery (no out-of-sequence packets)
• UDP (User Datagram Protocol)
– provides unreliable service.
– performs only very simple functions compared to TCP.
• Multiplexing (addressing the application services)
• Error detection (optional)
7
TCP service characteristics
• End-to-end reliable service
– guarantees reliable data transfer between application processes
– no errors, no loss, no out-of-sequence delivery
• Connection-oriented service
– consists of three phases: connection setup, data transfer, connection release
• Full-duplex transmission
– a TCP connection carries data in both directions.
• Stream-oriented transmission
– TCP views messages from application processes as a continuous byte stream, not as separate packets.
• Graceful connection release
– When the connection terminates, TCP releases it only after the data transfer has completed.
8
How to provide reliable services(1)
• The transmission unit is the segment.
– Data passed to TCP by an application process is divided into pieces of a size suitable for transmission. Each piece is called a segment, so the segment is the unit in which TCP sends application data.
– In contrast, UDP does not divide the application data; it sends the data as given by the application process.
• Management of the segment sequence
– Each segment carries a sequence number (counted in bytes of the stream), so the receiving TCP can recognize lost segments and segments arriving out of sequence.
• ACK transmission
– When TCP receives a correct segment, it replies with an ACK segment.
– To improve performance, it uses cumulative ACKs.
• Timer management
– When TCP sends a segment, it starts a timer. If the ACK for that segment does not arrive before the timer expires, TCP resends the segment.
9
How to provide reliable services(2)
• Error control (checksum)
– TCP checks each received segment for errors using the checksum field in the header. If it finds an error, it discards the segment.
– Using the sequence number of each segment, it also detects lost and out-of-sequence segments.
• Order control
– The receiver stores the segments it receives in a buffer, restores their order, and then delivers them to the application process.
• Detection and discard of duplicate segments
– When a segment arrives that has already been received, the receiver discards it.
10
How to provide reliable services(3)
• Clear connection management
– Clear connection setup using a 3-way handshake
– Clear connection release, also using a 3-way handshake
– When an end station reboots, it may set up a new TCP connection while the other end still holds the old one. In this case, TCP can distinguish segments of the previous connection from those of the newly established connection.
• Flow control
– TCP uses a receive buffer and notifies the TCP at the other end of the space available in it. The other TCP sends at most that amount of data and then stops.
• Congestion control
– TCP controls its transmission rate according to the congestion state of the network.
11
TCP Header
[Figure: an IP datagram consists of a 20-octet IP header followed by the TCP segment; the TCP segment consists of a 20-octet TCP header followed by the TCP data.]
12
TCP Header
TCP header layout (each row is 32 bits wide):
16-bit source port number | 16-bit destination port number
32-bit sequence number
32-bit acknowledgement number
4-bit header length | Reserved (6 bits) | URG ACK PSH RST SYN FIN | 16-bit window size
16-bit TCP checksum | 16-bit urgent pointer
Options (if any) | Padding (if any)
Data (if any)
13
TCP Segment Format(code Bits)
Bit position | Name | Function
11 | URG | urgent pointer field valid
12 | ACK | acknowledgment field valid
13 | PSH | deliver data on receipt of this segment
14 | RST | reset the connection
15 | SYN | synchronize sequence numbers
16 | FIN | end of byte stream from sender
14
Port number: addressing application
• A connection is identified uniquely by 5 elements.
– (sender IP address, receiver IP address, protocol number, sender application
process port number, receiver application process port number)
– The combination of an IP address and a port number is sometimes called a socket.
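A minimal Python sketch of how the 5-tuple can serve as a lookup key when incoming segments are demultiplexed. The `ConnectionId` type, the `connections` table, and the addresses are hypothetical names chosen for illustration; they are not part of any real TCP stack.

```python
from typing import NamedTuple, Optional


class ConnectionId(NamedTuple):
    """The five elements that identify a connection (illustrative type)."""
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number: 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int


# Hypothetical connection table: 5-tuple -> per-connection state.
connections = {}


def register(conn: ConnectionId, state: str) -> None:
    """Record the state kept for one connection."""
    connections[conn] = state


def demultiplex(conn: ConnectionId) -> Optional[str]:
    """Return the state of the matching connection, or None if unknown."""
    return connections.get(conn)


# Two connections from the same host to the same server port
# differ only in the (ephemeral) source port.
a = ConnectionId("10.0.0.1", "10.0.0.9", 6, 50001, 23)
b = ConnectionId("10.0.0.1", "10.0.0.9", 6, 50002, 23)
register(a, "telnet session A")
register(b, "telnet session B")
print(demultiplex(a), "|", demultiplex(b))
```

Because all five elements take part in the key, changing any one of them (here the protocol number or a port) selects a different connection or none at all.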
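[Figure: application processes (AP) attach to TCP or UDP through ports; a TCP connection runs between two APs. Each layer has its own address: the port selects the AP, the protocol field selects TCP or UDP above IP, the IP address selects the host, and the hardware address selects the network access interface on the subnet.]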
15
Connection Identification addresses
• IP address
– identifies a specific host in the Internet.
– maps 1:1 to the physical address of the host on the subnet to which it is connected.
• Protocol number
– identifies the upper-layer protocol to which IP in the destination host should deliver the data.
• Port number
– identifies the application process to which the receiving transport layer should deliver the data.
– Well-known port numbers
• port numbers reserved by IANA (operated under ICANN) for specific services, e.g., 21 for the FTP server and 23 for the Telnet server.
– Ephemeral port numbers
• port numbers that are assigned temporarily to application processes for their current connections.
16
Well Known TCP Ports(/etc/services)
Decimal | Keyword | UNIX keyword | Description
0 | - | - | Reserved
1 | TCPMUX | - | TCP Multiplexor
5 | RJE | - | Remote Job Entry
7 | ECHO | echo | Echo
9 | DISCARD | discard | Discard
11 | USERS | systat | Active Users
13 | DAYTIME | daytime | Daytime
15 | - | netstat | Network status program
17 | QUOTE | qotd | Quote of the day
19 | CHARGEN | chargen | Character Generator
20 | FTP-DATA | ftp-data | File Transfer Protocol (data)
21 | FTP | ftp | File Transfer Protocol
23 | TELNET | telnet | Terminal Connection
25 | SMTP | smtp | Simple Mail Transport Protocol
37 | TIME | time | Time
42 | NAMESERVER | name | Host Name Server
43 | NICNAME | whois | Who Is
53 | DOMAIN | nameserver | Domain Name Server
77 | - | rje | any private RJE service
79 | FINGER | finger | Finger
93 | DCP | - | Device Control Protocol
95 | SUPDUP | supdup | SUPDUP Protocol
17
Sequence Number
• The sequence number identifies a byte in the stream of data from the sending TCP to the receiving TCP. It is the number of the first byte of data in the segment.
• The unit is bytes, not segments.
• The sequence number space is 2^32, large enough to detect duplicate segments.
[Figure: a TCP user issues SEND (200 bytes of data), SEND (150 bytes of data), and SEND (100 bytes of data) to TCP.]
18
Acknowledgment Number
• Cumulative ACK
[Figure: segment and ACK exchange between sender TCP and receiver TCP.]
• By convention, the ACK number is the sequence number of the next byte that the receiver expects to receive.
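To make the byte-based numbering concrete, here is a small Python sketch that assigns sequence numbers to successive segments and derives the cumulative ACK each one should trigger. The function name and the initial sequence number of 1 are assumptions made for illustration; the segment sizes follow the 200, 150, and 100 byte SEND example.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers wrap around modulo 2^32


def segment_sequence_numbers(initial_seq, sizes):
    """Return a (sequence number, expected ACK number) pair per segment.

    The sequence number is the number of the first data byte in the
    segment; the ACK number is the next byte the receiver expects.
    """
    pairs = []
    seq = initial_seq
    for size in sizes:
        ack = (seq + size) % SEQ_SPACE
        pairs.append((seq, ack))
        seq = ack
    return pairs


# Three application writes of 200, 150, and 100 bytes,
# starting at a hypothetical initial sequence number of 1.
for seq, ack in segment_sequence_numbers(1, [200, 150, 100]):
    print(f"segment SN={seq} -> ACK={ack}")
```

With cumulative ACKs, a single ACK carrying the last value (451 in this run) acknowledges all three segments at once.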
19
Duplicate segments in the same connection
[Figure: transport entities A and B. A times out and retransmits SN 0, then times out and retransmits SN 1. Later an obsolete SN 0 and a new SN 0 both arrive at B.]
Assumptions:
- sequence numbers are modulo 8
- cumulative ACKs are used
Solution: the sequence number space should be large enough.
20
Duplicate segments in different connections(1)
[Figure: transport entities A and B. After the old connection is closed and a new connection is opened, an obsolete segment with SN = 2 is accepted, and the valid segment with SN = 2 is discarded as a duplicate.]
21
Duplicate segments in different connections(2)
• Global numbering
– If the sequence number of the last segment of the previous connection is N, the new connection starts from a sequence number that is far from N.
– TCP must remember the sequence number that was used in the last segment.
• 2 MSL timer
– When a TCP connection closes, a new connection with the same identification is not allowed to open immediately. It can open only after a waiting period has passed.
– Each TCP implementation chooses a value for the maximum segment lifetime (MSL), the maximum amount of time any segment can exist in the network before being discarded.
– The connection can be reused after the 2 MSL wait is over.
22
Window Field
• This field is used for TCP flow control (often called the “credit technique”).
• The receiver uses it to notify the sender of the free space in the receiver's TCP buffer.
• The unit is bytes.
• If the buffer is larger than 2^16 bytes, the field can be extended using the option field.
• Its use is independent of the acknowledgment number field, which indicates the success or failure of segment transmission.
23
PUSH
• Background
– Normally, when the sending TCP receives data from the sending application process, it does not send the data immediately. Instead it stores the data in its buffer and waits for additional data to arrive, in order to prevent silly window syndrome.
– In interactive applications, however, the sending TCP is required to send data immediately.
• PSH flag
– The sending application process tells its TCP when to set the PUSH flag.
– It notifies the sending TCP that the application does not want the data to sit in the TCP buffer waiting for additional data to fill the buffer.
– When the receiving TCP receives a segment with the PUSH flag, it passes the data to the receiving application process without waiting for additional data.
– The socket API does not provide a way for the application to tell its TCP to set the PUSH flag. Setting this flag is up to the TCP implementation.
– BSD implementations ignore a received PUSH flag because they normally never delay the delivery of received data to the application.
24
URGENT Bit & Urgent Pointer
• Urgent mode
– The sending TCP tells the other TCP that urgent data of some form has been placed into the normal stream of data.
– The receiving TCP notifies the receiving application of the arrival of urgent data. The application process decides what to do with it.
– The URG bit is turned on and the urgent pointer is set to a positive offset that must be added to the sequence number field in the TCP header to obtain the sequence number of the last byte of urgent data.
– In the socket API, the sending application process requests urgent mode by sending data with the MSG_OOB flag.
• What is urgent mode used for?
– The two most common uses are Telnet and Rlogin, when an interactive user types the interrupt key (e.g., ^C). Another is FTP, when an interactive user aborts a file transfer.
25
TCP Option Fields
• MSS (Maximum Segment Size) option
– The maximum size of the data carried in one segment
– When a connection is established, each end can announce the MSS it expects to receive. An MSS option can appear only in a SYN segment. If one end does not receive an MSS option from the other end, a default of 536 bytes is assumed.
• 576 (IP datagram default size) - 40 (fixed IP and TCP header size) = 536
– In general, the larger the MSS the better, until fragmentation occurs.
• Window scale option
– It increases the window size beyond the 2^16 limit of the 16-bit window field.
• New window size = window size in the header x 2^(window scale factor)
• The scale factor is limited to 14, so the maximum window is about 2^30 bytes.
– The window scale factor can be negotiated only during the connection setup phase.
• Timestamp option
– The sender fills in the timestamp value when the segment leaves. When the receiver sends an ACK for this segment, it echoes the timestamp value that it received from the sender. When the sender receives the ACK, it can calculate the round-trip time for this segment.
26
Checksum
[Figure: checksum scope, drawn 32 bits wide (bits 0 to 31). The pseudo-header (source IP address, destination IP address, a zero byte, protocol id, segment length) is followed by the TCP segment (TCP header and user data).]
• The checksum covers three parts: the pseudo-header, the TCP header, and the data coming from the application process.
• Including the pseudo-header prevents segments from being delivered to the wrong host when the IP header is corrupted.
• The covered bits are divided into 16-bit words, and all words are added using one's complement arithmetic. The checksum is the one's complement of that sum.
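The checksum procedure can be sketched in a few lines of Python. This is an illustrative sketch rather than production code: the function names and the example addresses and ports are assumptions, the segment is expected to carry a zero in its checksum field while the checksum is computed, and the protocol id 6 is the IP protocol number of TCP.

```python
import struct


def ones_complement_sum16(data: bytes) -> int:
    """Add all 16-bit words of data using one's complement arithmetic."""
    if len(data) % 2:
        data += b"\x00"                     # pad to a whole number of words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def pseudo_header(src_ip: bytes, dst_ip: bytes, segment_length: int) -> bytes:
    """Source address, destination address, zero, protocol id, length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, 6, segment_length)


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over pseudo-header + TCP header + data (checksum field = 0)."""
    covered = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ~ones_complement_sum16(covered) & 0xFFFF


def verify_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A segment with a correct checksum sums to 0xFFFF."""
    covered = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum16(covered) == 0xFFFF


# Hypothetical 20-byte header (ports 1234 -> 80, SYN flag) plus 5 data bytes.
header = struct.pack("!HHIIBBHHH", 1234, 80, 1, 0, 5 << 4, 0x02, 65535, 0, 0)
segment = header + b"hello"
src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 9])
checksum = tcp_checksum(src, dst, segment)
# The checksum field occupies bytes 16 and 17 of the TCP header.
filled = segment[:16] + struct.pack("!H", checksum) + segment[18:]
print(hex(checksum), verify_checksum(src, dst, filled))
```

The receiver runs the same sum over the received segment including the checksum field; any single corrupted word, in the addresses or in the segment, makes the result differ from 0xFFFF.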
27
TCP summary
• Connection establishment
– 3 way handshake
• Connection termination
– support graceful close using the 3 way handshake.
– also support abrupt close using ABORT primitive.
• Data transfer
– Each segment is assigned a sequence number with the unit of byte.
– Error control by retransmission: selective repeat
– Flow control by credit allocation
– PUSH
– URGENT POINTER
• Reset service
– RST
28
TCP Primitives
Primitive | Type | Client/Server | Parameters
UNSPECIFIED_PASSIVE_OPEN | Request | S | Source port, timeout, timeout-action, precedence, security range
FULL_PASSIVE_OPEN | Request | S | Source port, destination port, destination address, timeout, timeout-action, precedence, security range
ACTIVE_OPEN | Request | C | Source port, destination port, destination address, timeout, timeout-action, precedence, security range
ACTIVE_OPEN_WITH_DATA | Request | C | Source port, destination port, destination address, data, data length, push flag, urgent flag, timeout, timeout-action, precedence, security range
OPEN_ID | Local response | C | Local connection name, source port, destination port, destination address
OPEN_SUCCESS | Confirm | C | Local connection name
OPEN_FAILURE | Confirm | C | Local connection name
SEND | Request | C/S | Local connection name, data, data length, push flag, urgent flag, timeout, timeout-action
DELIVER | Indication | C/S | Local connection name, data, data length, push flag, urgent flag
ALLOCATE | Request | C/S | Local connection name, data length
CLOSE | Request | C/S | Local connection name
CLOSING | Indication | C/S | Local connection name
TERMINATE | Request | C/S | Local connection name, reason code
ABORT | Request | C/S | Local connection name
STATUS | Request | C/S | Local connection name
STATUS_RESPONSE | Local response | C/S | Local connection name, source port, source address, destination address, connection state, receive window, send window, waiting ack, waiting accept, urgent, precedence, security, timeout
ERROR | Indication | C/S | Local connection name, reason code
29
Usage of TCP Service Primitives
[Figure: exchange of service primitives between the initiating (client) protocol and the responding (server) protocol, in four phases. Connection establishment: UNSPECIFIED_PASSIVE_OPEN, FULL_PASSIVE_OPEN, ACTIVE_OPEN, ACTIVE_OPEN_WITH_DATA, OPEN_ID, OPEN_RECEIVED, OPEN_SUCCESS, OPEN_FAILURE. Data transfer: SEND, DELIVER, ALLOCATE. Status/error reporting: STATUS, STATUS_REPORT, ERROR. Connection clearing: CLOSE, CLOSING, TERMINATE, ABORT.]
30
TCP Connection setup and release
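[Figure: message sequence between client and server.
Setup: the client sends SYN; the server replies SYN, ACK; the client sends ACK.
Release: the application on one side closes and its TCP sends FIN; the peer replies ACK and delivers EOF to its application; when that application closes, its TCP sends FIN; the first side replies ACK.]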
32
TCP Connection Setup : 3 Way Handshake
• Client-Server model
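[Figure: message sequence between client TCP and server TCP. Primitives shown: ACTIVE_OPEN (client), PASSIVE_OPEN (server), OPEN_RECEIVED, OPEN_SUCCESS. Segments shown, in order: Send SYN, Send SYN, Send ACK.]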
33
TCP Connection Setup : 3 Way Handshake
• Simultaneous open
[Figure: both sides issue ACTIVE_OPEN at the same time; each TCP sends SYN, each sends ACK, and both sides report OPEN_SUCCESS.]
34
Robustness of 3 Way Handshake
[Figure: two cases between A and B.
(a) Delayed SYN: an obsolete SYN arrives at B; B accepts and acknowledges; A rejects B's connection.
(b) Delayed (SYN, ACK): A initiates a connection; an old SYN arrives at A and A rejects it; B accepts and acknowledges; A acknowledges and begins transmission.]
35
TCP Half-Close: Graceful Disconnection
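[Figure: half-close between client and server. The client application calls shutdown and the client TCP sends FIN; the server returns ACK of FIN and delivers EOF to its application. The server application keeps writing data, which the client application reads and the client TCP acknowledges (ACK of data). When the server application closes, the server TCP sends FIN; the client delivers EOF to its application and returns ACK of FIN.]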
36
TCP Connection Release: 3 Way Handshake
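[Figure: connection release between client TCP and server TCP.
(a) Graceful release: one side issues CLOSE and its TCP sends FIN; the peer TCP sends ACK and signals CLOSING; the peer issues CLOSE and its TCP sends FIN; the first TCP sends ACK; both sides end with TERMINATE.
(b) Abrupt release: ABORT causes RST to be sent, and the connection ends with TERMINATE.]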
37
Connection Release: 3 Way Handshake
• Graceful disconnection – 3-way handshake
– Since a TCP connection is full-duplex, when one end requests termination, only that direction of the connection is terminated. The other direction can be maintained while the other end keeps sending data.
• Abrupt disconnection
– One-sided termination because of a network failure, etc. In this case data can be lost.
38
Connection Release: 3 Way Handshake
• Graceful disconnection – 3-way handshake
– Problem due to out-of-sequence delivery
• One end sends FIN after sending its last data segment, but the FIN segment arrives ahead of that last segment.
• In this case, if the receiving TCP terminates as soon as it receives the FIN, it loses the segment that arrives after the connection is closed.
• To prevent this kind of loss, TCP assigns the FIN segment a sequence number that follows the sequence number of the last data segment.
• When the other end does not cooperate with the termination request,
– the requesting end terminates the connection when its timer expires.
39
Crash & Connection Release
• A half-open connection arises when one end of the connection crashes, since the other end cannot know about the failure.
• In the half-open state, the surviving end keeps retransmitting the segments it is allowed to send. If no reply arrives before the keepalive timer expires, it terminates the connection.
• The end that crashed can terminate the connection using RST segments after rebooting.
– Since the rebooted TCP has lost all state information, it sends an RST segment in response to every segment it receives for the old connection, and the other end must terminate the connection immediately on receiving an RST.
40
TCP Entity State Diagram
[Figure: TCP entity state diagram with the states CLOSED, LISTEN, SYN SENT, SYN RECEIVED, ESTAB, FIN WAIT, FIN WAIT 2, CLOSE WAIT, CLOSING, LAST ACK, and TIME WAIT. Each transition is labeled with an event and an action, e.g., Active Open: initialize SV, send SYN; Close: send FIN; Receive FIN: send ACK; Timeout (2MSL) in TIME WAIT: return to CLOSED.]
LEGEND
SV = state vector
MSL = maximum segment lifetime
41
TCP Traffic Control
• Traffic control
– There are two reasons for a sender to reduce the rate at which it sends packets:
– when the receiver's buffer space is insufficient → flow control
– when the network is congested → congestion control
[Figure: two bottlenecks on the path, network congestion and a small-capacity receiver.]
43
Sliding Window Flow Control
[Figure: sliding windows over sequence numbers 0 to 3.
(a) Sender's window: separates segments sent but not acknowledged from segments that can be sent. The window shrinks as segments are sent and expands as ACKs are received.
(b) Receiver's window: separates segments that were received (up to the last segment that was acked) from segments that will be received. The window shrinks as segments are received and expands as ACKs are sent.]
44
Is the sliding window scheme enough?
[Figure: sliding window exchange with window size = 3 and sequence numbers 0 to 3. The sender transmits I(0), I(1), I(2) and its window closes; the receiver returns ACK(2); the sender transmits I(3), I(0), I(1). The receiver's window is closed (busy condition) and it sends no ACKs, so the sender times out and retransmits I(3), I(0), I(1), which makes the receiver's state worse.]
45
What is wrong with the sliding window?
• The sliding window makes no distinction between the ACK and the currently available buffer size.
– Suppose the receiving TCP receives segments uncorrupted and stores them in its buffer, but has not finished processing them.
• If it sends no ACK, the sender's timer expires and the sender retransmits the segments. ==> This loads the network unnecessarily!
• If instead it sends ACKs, the sender transmits new segments, which may eventually be discarded. ==> This aggravates the receiver's condition!
• Solution: credit allocation protocol
– It separates the ACK from the credit information. The ACK informs the sender of successful transmission, while the credit notifies the sender of the receiver's currently free buffer size.
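The separation of ACK and credit can be sketched as a small Python class that tracks only the sender's side. The class and method names are hypothetical; the numbers in the usage lines mirror a sender that starts at octet 1001 with 1400 octets of credit and transmits 200-octet segments.

```python
class CreditSender:
    """Sender-side view of a credit allocation scheme (illustrative)."""

    def __init__(self, next_seq: int, credit: int):
        self.next_seq = next_seq            # next octet to be sent
        self.limit = next_seq + credit      # first octet that may NOT be sent

    def can_send(self) -> int:
        """Number of octets the sender may still transmit."""
        return self.limit - self.next_seq

    def send(self, nbytes: int) -> None:
        """Transmit nbytes; every transmission shrinks the window."""
        if nbytes > self.can_send():
            raise ValueError("credit exhausted")
        self.next_seq += nbytes

    def on_ack(self, ack: int, credit: int) -> None:
        """ack: next octet the receiver expects; credit: octets it accepts
        starting from ack. The two values are reported independently."""
        self.limit = ack + credit


sender = CreditSender(next_seq=1001, credit=1400)   # may send 1001..2400
for _ in range(3):
    sender.send(200)                                # octets 1001..1600 sent
sender.on_ack(ack=1601, credit=1000)                # may now send 1601..2600
print(sender.can_send())
```

An update such as `on_ack(ack, 0)` acknowledges data while granting no new credit, which is exactly what a busy receiver needs and what a plain sliding window cannot express.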
46
Credit Allocation Protocol
[Figure: credit allocation exchange with window size = 3 and sequence numbers 0 to 3. Events shown, in order: I(0), I(1), I(2) sent (window closing); ACK 2, CDT 3 returned; I(3), I(0), I(1) sent; ACK 1, CDT 0 returned by the busy receiver (window closing), so when the sender's timer expires it does not retransmit I(3), I(0), I(1); ACK 1, CDT 2 returned (window opens); I(2), I(3) sent (idle condition).]
47
Example of TCP Credit Allocation Mechanism.
[Figure: credit allocation between transport entity A and transport entity B, in order:
1. B is prepared to receive 1400 octets, beginning with 1001; A may send 1400 octets (1001 through 2400).
2. A shrinks its transmit window with each transmission (the figure shows the segment with SN = 1401).
3. B acknowledges 3 segments (600 octets) but is only prepared to receive 200 additional octets beyond the original budget (i.e., B will accept octets 1601 through 2600).
4. A adjusts its window with each credit and, after sending through octet 2600, exhausts its credit.
5. B acknowledges 5 segments (1000 octets) and restores the original amount of credit; A receives the new credit and may send octets 2601 through 4000.]
48
Too Small Data & Immediate Window Update
• Example of TELNET
[Figure: a keystroke arrives and is sent in a 41-byte IP packet; the receiver returns a 40-byte ACK; the application reads the 1-byte keystroke and a 40-byte window update is sent; the application echoes the keystroke in another 41-byte IP packet.]
– If the sending TCP transmits data as soon as it arrives from the application, or the receiving TCP sends a window update right after its buffer changes, the two ends exchange segments frequently while carrying very little data.
49
Silly Window Syndrome(caused by Receiver)
Receiver’s buffer is full
Application reads 1 byte
Room for one more byte
Header
Header
1 byte
Window update segment sent
New byte arrives
Receiver’s buffer is full
50
Silly Window Syndrome(caused by Sender)
Application writes 1 byte
Sender’s buffer has 1 byte.
Header
1 byte
TCP sends 1 byte.
Sender’s Buffer is empty
51
Avoiding SWS from the sender
• Background
– Suppose data from the application arrives at TCP 1 byte at a time. TCP does not need to send a small segment immediately every time it receives data.
• Nagle's algorithm
– If data arrives 1 byte at a time, TCP sends the first byte in a small segment and collects the following bytes in its buffer.
– When the ACK for the first segment arrives, TCP sends the buffered data as a single segment.
– TCP then again stores the following bytes in its buffer until it receives the ACK for that segment.
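The behaviour described above can be sketched as a tiny Python model. The class name and its attributes are hypothetical, and the rule that a full MSS-sized segment may be sent without waiting for an ACK is the usual companion rule of Nagle's algorithm, added here as an assumption.

```python
class NagleSender:
    """Toy model of Nagle's algorithm (illustrative, no real networking)."""

    def __init__(self, mss: int):
        self.mss = mss
        self.buffer = bytearray()   # data waiting to be sent
        self.unacked = False        # True while a segment is in flight
        self.sent = []              # segments handed to the network

    def write(self, data: bytes) -> None:
        """Called by the application; data may be as small as 1 byte."""
        self.buffer += data
        self._try_send()

    def on_ack(self) -> None:
        """Called when the ACK for the outstanding segment arrives."""
        self.unacked = False
        self._try_send()

    def _try_send(self) -> None:
        # Send when nothing is in flight, or when a full segment is ready.
        while self.buffer and (not self.unacked or len(self.buffer) >= self.mss):
            segment = bytes(self.buffer[:self.mss])
            del self.buffer[:self.mss]
            self.sent.append(segment)
            self.unacked = True


tcp = NagleSender(mss=536)
for key in b"abcd":
    tcp.write(bytes([key]))   # four 1-byte writes from the application
tcp.on_ack()                  # ACK for the first small segment arrives
print(tcp.sent)
```

In this run the first byte leaves immediately and the remaining three bytes leave together once the ACK arrives, so two segments are sent instead of four.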
52
Avoiding SWS from the receiver
• Clark's solution
– The receiving TCP does not send a window update until its buffer is half empty or the free space is at least as large as the MSS.
• Delayed ACK
– TCP does not send an ACK the moment it receives a segment. Instead, it delays the ACK, hoping to have data going in the same direction on which the ACK can be piggybacked.
– Most implementations use a 200 ms delay.
53
Congestion Control
• Background
– Congestion means that too much traffic has been injected into the network: the inflow exceeds the capacity that the network can accommodate.
• The solution is simple in principle: the traffic inflow should be pulled down below the network capacity. But the rate should be reduced well before the full capacity is reached (very early action is needed!).
• How can the early symptoms of congestion be detected?
– Monitoring the buffer occupancy of network nodes (e.g., routers)
– Keeping track of the round-trip time of packets
55
TCP and congestion control
• In the Internet, TCP is responsible for congestion control. (It is somewhat odd!)
• Then, how does TCP detect congestion?
– Timeout: no ACK has arrived when the timer expires.
– A timeout can have two causes: a transmission error, or packet loss due to congestion. In current networks transmission errors are very rare, so TCP attributes the timeout to congestion.
• TCP congestion control methods
– Slow start
– Congestion avoidance
– Fast retransmit
– Fast recovery
56
Slow Start
• Control parameters
– awnd (window advertised by the receiver)
• At connection setup, the receiver informs the sender of its maximum buffer size, which is the initial value of awnd.
• Every time the receiver transmits an ACK, it advertises its currently available buffer size.
– cwnd (congestion window)
• determines how many segments can be sent without waiting for ACKs.
• Slow Start
Initialize: cwnd = 1 MSS (max. segment size);
Every time an ACK arrives:
cwnd = cwnd + 1 MSS; the sender may have at most min(cwnd, awnd) outstanding /* exponential growth */
The initial rate is slow, but it ramps up exponentially fast.
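Why a one-MSS increase per ACK yields exponential growth is easiest to see per round trip. The sketch below counts windows in segments; the function name and the simplifying assumptions (one ACK per segment, no loss, the whole window acknowledged within one round trip) are mine, not the slide's.

```python
def slow_start(awnd_segments: int, rounds: int) -> list:
    """Return the congestion window (in segments) at the start of each round.

    Each ACK adds one segment to cwnd. A full window of cwnd segments
    produces cwnd ACKs per round trip, so cwnd doubles every round until
    it reaches the window advertised by the receiver.
    """
    cwnd = 1
    history = [cwnd]
    for _ in range(rounds):
        acks = min(cwnd, awnd_segments)            # one ACK per segment sent
        cwnd = min(cwnd + acks, awnd_segments)     # +1 segment per ACK
        history.append(cwnd)
    return history


print(slow_start(awnd_segments=64, rounds=4))
print(slow_start(awnd_segments=10, rounds=5))
```

The second call shows the growth being capped by the advertised window rather than by the network.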
57
Effect of Slow Start
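[Figure: effect of slow start between sender and receiver. With cwnd = 1 the sender transmits segment 1; ACK 2 raises cwnd to 2 and segments 2 and 3 are sent; ACK 3 and ACK 4 raise cwnd to 3 and 4, and segments 4 to 7 are sent; after ACK 5 through ACK 8, cwnd = 8.]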
58
Congestion Avoidance
• If no ACK arrives before the timeout, TCP starts the congestion avoidance algorithm.
• Congestion avoidance algorithm
If (segment timeout) {
1. Set ssthresh = cwnd / 2 /* slow start threshold */
2. Set cwnd = 1 MSS
Restart “slow start” until (cwnd = ssthresh)
3. If (cwnd >= ssthresh)
cwnd = cwnd + 1 MSS every round-trip time
}
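A round-by-round simulation may help to see how slow start and congestion avoidance interact. The following is a sketch under simplifying assumptions: windows are counted in whole segments, a timeout is injected at chosen round indices, `initial_ssthresh` is an arbitrary starting value, and the floor of 2 segments for ssthresh is a common safeguard that the pseudocode above does not state.

```python
def simulate_cwnd(rounds: int, timeout_rounds, initial_ssthresh: int = 16) -> list:
    """Return cwnd (in segments) used in each round-trip time."""
    cwnd, ssthresh = 1, initial_ssthresh
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in timeout_rounds:
            ssthresh = max(cwnd // 2, 2)   # step 1: halve the threshold
            cwnd = 1                       # step 2: restart slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh) # slow start: double per round
        else:
            cwnd += 1                      # step 3: linear growth
    return history


# One timeout in round 6 (rounds are counted from 0).
print(simulate_cwnd(rounds=12, timeout_rounds={6}))
```

After the timeout the window climbs quickly back to half of its previous value and only then switches to adding one segment per round trip.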
59
Slow Start and Congestion Avoidance
[Figure: cwnd growth between A and B.
(a) Slow start, ending with a timeout: CWND grows from 1 to 16.
(b) Slow start followed by congestion avoidance: CWND grows from 1 to 10.]
60
Slow Start and Congestion Avoidance
[Figure: congestion window (vertical axis, 0 to 20) versus round-trip times (horizontal axis, 1 to 16), marking the threshold and the point where the timeout occurs.]
61
Fast Retransmission and Fast Recovery
• Background
– TCP is required to generate an immediate acknowledgement (a duplicate ACK) when an out-of-order segment is received.
– We don't know whether a duplicate ACK is caused by a lost segment or just by a reordering of segments.
– If three or more duplicate ACKs are received in a row, it is a strong indication that a segment has been lost. It also implies that segments are still flowing through the network, since each duplicate ACK is triggered by a segment that reached the receiver.
– Therefore congestion avoidance, which falls back to slow start, is too conservative an approach in this case.
• Fast retransmission
– If three duplicate ACKs (four identical ACKs in a row) are received before the timeout, TCP does not wait for the timeout and retransmits the missing segment immediately.
62
Fast Retransmit
[Figure: fast retransmit exchange between A and B.]
63
Fast Recovery
• Fast recovery algorithm (avoiding initial slow start phase)
1. When the third duplicate ACK is received,
Set ssthresh = cwnd / 2;
Retransmit the missing segment;
cwnd = ssthresh + 3 ;
2. Each time another duplicate ACK arrives,
Increment cwnd by the segment size;
Transmit a new segment (if allowed by the new cwnd value);
3. When the next ACK arrives that acknowledges new data,
cwnd = ssthresh ;
cwnd = cwnd + 1 every roundtrip time ;
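The three steps can be followed in a small Python state holder. This is an illustrative sketch: windows are counted in segments, the class and attribute names are hypothetical, the retransmission itself is only counted, and the floor of 2 segments for ssthresh is an added assumption.

```python
class FastRecovery:
    """Tracks cwnd and ssthresh through fast retransmit and fast recovery."""

    def __init__(self, cwnd: int, ssthresh: int):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_recovery = False
        self.retransmissions = 0

    def on_duplicate_ack(self) -> None:
        self.dup_acks += 1
        if self.in_recovery:
            self.cwnd += 1                           # step 2: inflate window
        elif self.dup_acks == 3:
            self.ssthresh = max(self.cwnd // 2, 2)   # step 1
            self.retransmissions += 1                # resend missing segment
            self.cwnd = self.ssthresh + 3
            self.in_recovery = True

    def on_new_ack(self) -> None:
        """An ACK that acknowledges new data ends the recovery phase."""
        self.dup_acks = 0
        if self.in_recovery:
            self.cwnd = self.ssthresh                # step 3: deflate window
            self.in_recovery = False


conn = FastRecovery(cwnd=16, ssthresh=32)
for _ in range(5):
    conn.on_duplicate_ack()      # third one triggers the retransmission
print(conn.cwnd, conn.ssthresh)
conn.on_new_ack()
print(conn.cwnd, conn.ssthresh)
```

After `on_new_ack` the connection continues with congestion avoidance from cwnd = ssthresh, so the slow start phase is skipped.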
64
Fast Retransmission and Fast Recovery
[Figure: sequence number and cwnd plotted against send time (sec).]
65
Round Trip Time & Timeout
• RTT is important because the timeout value is determined
based on RTT.
• RTT can change over time as route might change and as
network traffic changes.
• So, TCP should track these changes and modify its
timeout accordingly.
67
Round Trip Time & Timeout
• Original TCP specification
RTT(n+1) = a * RTT(n) + (1-a) * RTT_SAMPLE(n)
/* recommendation: a = 0.9 */
RTO = b * RTT(n+1)
/* recommendation: b = 2 */
RTO: retransmission timeout value
RTT_SAMPLE: measured RTT
• Karn's algorithm
– The RTT estimate cannot be updated when an ACK for a retransmitted segment arrives, because it is unknown whether the ACK corresponds to the original transmission or to the retransmission.
– Do not calculate a new RTO until an acknowledgement is received for a segment that was not retransmitted.
– Set the timeout after each retransmission as follows:
Timeout = 2 * previous Timeout /* exponential growth */
– After the ACK for the retransmitted segment arrives, restart the measurement of RTT_SAMPLE with the following segments.
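The original estimator and Karn's rule fit in two short Python functions. This is a sketch under stated assumptions: times are in milliseconds, the function names are hypothetical, and the timer backoff is taken to double the current timeout on every retransmission.

```python
def on_ack(srtt, rto, sample, retransmitted, alpha=0.9, beta=2.0):
    """Update the smoothed RTT and the RTO when an ACK arrives.

    Karn's rule: a sample taken from a retransmitted segment is ambiguous,
    so it is ignored and both values are left unchanged.
    """
    if retransmitted:
        return srtt, rto
    srtt = alpha * srtt + (1 - alpha) * sample
    return srtt, beta * srtt


def on_timeout(rto):
    """Back the timer off exponentially after a retransmission."""
    return 2 * rto


srtt, rto = 100.0, 200.0
srtt, rto = on_ack(srtt, rto, sample=200.0, retransmitted=False)
print(round(srtt, 3), round(rto, 3))     # estimate moves toward the sample
rto = on_timeout(rto)                    # a segment timed out: back off
srtt, rto = on_ack(srtt, rto, sample=900.0, retransmitted=True)
print(round(srtt, 3), round(rto, 3))     # ambiguous sample ignored
```

With a = 0.9 a single sample moves the estimate by only a tenth of the difference, which is why the estimator reacts slowly to sudden RTT changes.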
68
Jacobson’s Algorithm
• Background
– Performance is better when the variation of the RTT is considered together with its average, rather than the average alone.
• Jacobson's algorithm
DIFF(n+1) = RTT_SAMPLE(n+1) - RTT(n)
DEV(n+1) = DEV(n) + h * (|DIFF(n+1)| - DEV(n)) /* typically h = 1/4 */
RTT(n+1) = RTT(n) + g * DIFF(n+1) /* typically g = 1/8 */
Timeout(n+1) = RTT(n+1) + 4 * DEV(n+1)
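The four update equations translate directly into code. In this sketch the function name is hypothetical, times are in milliseconds, and the gains default to g = 1/8 for the mean and h = 1/4 for the deviation, the values commonly cited for this algorithm; treat them as assumptions and substitute your own.

```python
def jacobson_update(rtt, dev, sample, g=1 / 8, h=1 / 4):
    """One step of the mean/deviation RTT estimator.

    Returns the new smoothed RTT, the new mean deviation, and the
    retransmission timeout derived from them.
    """
    diff = sample - rtt                    # DIFF(n+1)
    dev = dev + h * (abs(diff) - dev)      # DEV(n+1)
    rtt = rtt + g * diff                   # RTT(n+1)
    timeout = rtt + 4 * dev                # Timeout(n+1)
    return rtt, dev, timeout


rtt, dev = 100.0, 10.0
for sample in (140.0, 100.0, 100.0):
    rtt, dev, timeout = jacobson_update(rtt, dev, sample)
    print(f"sample={sample} rtt={rtt:.2f} dev={dev:.2f} timeout={timeout:.2f}")
```

A single outlier widens the timeout mainly through the deviation term, and the timeout tightens again as later samples agree with the mean.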
69
TCP Timers
• Retransmission timer
• Persist timer
• Keepalive timer
• 2MSL timer
70
Retransmission Timer
• It determines how long the TCP sender waits for an ACK before retransmitting (the timeout).
• Real implementations do not run a separate timer for each segment; there is only one retransmission timer per connection.
71
TCP Persist Timer
• Background
– When the TCP receiver advertises window = 0, the TCP sender stops sending temporarily. Afterwards, the receiver lets the sender know that it can receive segments again by sending a new window advertisement. But if this new window advertisement is lost, the sender will wait for it forever. (Deadlock!!)
• Solution
– After the sender learns that window = 0, it transmits a window probe segment periodically to check whether the receiver is ready to accept data. The window probes are sent according to the persist timer.
– The window probe is a segment carrying 1 byte of data.
– TCP allows the sender to transmit one byte even if the receiver's window is closed.
– The persist timer interval increases exponentially (exponential backoff).
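The exponential backoff of the persist timer can be sketched as follows. The initial value of 1.5 seconds and the bounds of 5 and 60 seconds are assumptions in the style of BSD-derived implementations; they are not stated in the slides, and real stacks may differ.

```python
def persist_intervals(initial=1.5, lower=5.0, upper=60.0, probes=8):
    """Return the waiting time (in seconds) before each window probe.

    The base value doubles after every probe (exponential backoff) and
    is clamped to the range [lower, upper].
    """
    intervals = []
    value = initial
    for _ in range(probes):
        intervals.append(min(max(value, lower), upper))
        value *= 2
    return intervals


print(persist_intervals())
```

Because the interval is capped, the sender keeps probing at the upper bound for as long as the window stays closed, which is what breaks the deadlock once the receiver reopens its window.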
72
TCP Persist Timer
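[Figure: the receiver advertises win=0; a later window update (win=256) is lost, which would cause deadlock. The sender therefore sends window probes at intervals set by the persist timer (normal TCP exponential backoff), and the receiver answers each probe with ACK (win=0) while its window stays closed.]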
73
TCP Keepalive Timer
• If there is no activity on a connection for a period of time, the server sends a probe segment to see whether the client is still alive.
• The keepalive timer specifies the interval after which the server checks whether the client's host has crashed or is down. The interval is normally 2 hours.
• When the keepalive timer expires, the server sends a probe segment:
– (1) if the client is still alive,
• it responds, and there is no further probe for the next 2 hours.
– (2) if the client is down,
• the probe times out after 75 seconds. The server sends a total of 10 probes, 75 seconds apart, and if there is no response it terminates the connection.
– (3) if the client has rebooted,
• there is a response to the probe, but the response is a reset → the server terminates the connection.
– (4) if the client is alive but unreachable,
• same as in case (2).
74
2 MSL Timer
• 2 MSL (MSL = Maximum Segment Lifetime)
– The MSL is the maximum amount of time any segment can exist in the network before being discarded.
– When TCP performs an active close and sends the final ACK (the response to the FIN), that connection must stay in the TIME_WAIT state for twice the MSL.
– If the final ACK is lost, the other TCP can resend the FIN segment.
– A new TCP connection on the same ports can open only after the 2 MSL wait. (Some systems prevent reuse of the port numbers during the 2 MSL wait.)
• Quiet time concept
– Suppose a host crashes while a connection is in the 2 MSL wait state and then reboots immediately. If the host opens a new TCP connection as soon as it reboots, old segments of the previous connection cannot be distinguished from new segments of the new connection.
– To avoid this confusion, TCP is not allowed to open new connections for 1 MSL right after rebooting. This 1 MSL period is called the quiet time.
75
2 MSL Timer
[Figure: 2 MSL wait.
(a) Closing connections sequentially: the side that closes first waits for 2 MSL and then terminates.
(b) Closing simultaneously: both sides wait for 2 MSL and then terminate.]
76
UDP
• Addressing and checksum
• Providing unreliable service to application
• Datagram-oriented
– one application data -> one UDP datagram
78
UDP Header
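[Figure: an IP datagram consists of a 20-octet IP header followed by the UDP datagram; the UDP datagram consists of an 8-octet UDP header followed by the UDP data.]
UDP header layout (8 octets, each row is 32 bits wide):
16-bit source port number | 16-bit destination port number
16-bit length | 16-bit checksum
Data (if any)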
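Since the UDP header has only four 16-bit fields, it can be packed and parsed with Python's `struct` module. This is an illustrative sketch: the function names and port values are hypothetical, and the checksum is left at 0, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")   # source port, dest port, length, checksum


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prepend the 8-octet UDP header; length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes):
    """Split a datagram into its header fields and its payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(datagram[:8])
    return src_port, dst_port, length, checksum, datagram[8:length]


datagram = build_udp_datagram(50000, 53, b"query")
print(len(datagram), parse_udp_datagram(datagram))
```

One call to `build_udp_datagram` corresponds to one application message, which reflects the datagram-oriented nature of UDP: the payload is neither split nor merged with other messages.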
79
A Few Well-known UDP Ports
Decimal | Keyword | UNIX keyword | Description
0 | - | - | Reserved
7 | ECHO | echo | Echo
9 | DISCARD | discard | Discard
11 | USERS | systat | Active Users
13 | DAYTIME | daytime | Daytime
15 | - | netstat | Who is up or NETSTAT
17 | QUOTE | qotd | Quote of the Day
19 | CHARGEN | chargen | Character Generator
37 | TIME | time | Time
42 | NAMESERVER | name | Host Name Server
43 | NICNAME | whois | Who is
53 | DOMAIN | nameserver | Domain Name Server
67 | BOOTPS | bootps | Bootstrap Protocol Server
68 | BOOTPC | bootpc | Bootstrap Protocol Client
69 | TFTP | tftp | Trivial File Transfer
111 | SUNRPC | sunrpc | Sun Microsystems RPC
123 | NTP | ntp | Network Time Protocol
161 | - | snmp | SNMP net monitor
162 | - | snmp-trap | SNMP traps
512 | - | biff | UNIX comsat
513 | - | who | UNIX rwho daemon
514 | - | syslog | system log
525 | - | timed | Time daemon
80
TCP and UDP
• TCP
– connection-oriented
– Reliable service provisioning
– Error control and flow control
– stream-oriented
– Good for stable transmission of long, persistent data
• UDP
– connectionless
– Unreliable service
– No error control and no flow control
– datagram-oriented
– Good for short data or data that tolerates errors
81