cpt5 - NDSU Computer Science

Download Report

Transcript cpt5 - NDSU Computer Science

Chapter 5: End-to-End (Transport) Protocols
• Summary of underlying best-effort network capabilities (host-host)
– drops packets or datagrams
– re-orders packets or datagrams (send order versus receive order)
– delivers duplicate copies of a given packet or datagram
– limits size of packets or datagrams
– Arbitrarily long delays for delivering packets or datagrams
• End-to-end services desired (process-to-process)
– guarantee delivery of messages
– Same-order delivery of messages (same order as they are sent)
– deliver one copy of each message
– arbitrarily large message support
– Synchronization support sender-to-receiver (ie, connection oriented
– flow control (allowing receiver to regulate sender’s rate)
– multiple application process support on each host
End-to-End (Transport) Protocols continued
RPC
RPC
=
Remote Procedure Call
Simple Demultiplexing support (UDP)
• Unreliable, unordered, datagram service
• Adds demultiplexing
• No flow control
• Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
• Optional checksum
• Header format
Reliable Byte-Stream (TCP) Overview
• Connection-oriented, Byte-stream
– sending process writes stream of bytes
– TCP breaks into segments and
sends them as IP_datagrams
– receiving process reads stream of bytes
(a) 4 512-byte segments sent as IP datagrams
(b) 2048B read as stream
• Full duplex channel
• Flow control provided (to keep sender from overrunning receiver)
• Congestion control provided (to keep sender from overrunning network
End-to-End Issues
Based on sliding window protocol similar to that used at data
link level, but the situation is very different.
• Potentially connects many different hosts
– need explicit connection establishment and termination
• Potentially different RTT
– need adaptive timeout mechanism
• Potentially long delay in network
– need to be prepared for arrival of very old packets
• Potentially different capacity at destination
– need to accommodate different amounts of buffering
• Potentially different network capacity
– need to be prepared for network congestion
Segment
Format
Advertised
• Each connection is identified by a 4-tuple as a demux key:
<SrcPort, SrcIPAddr, DestPort, DestIPAddr>
• Sliding window alg + flow control involve the following fields:
AcknowledgmentNumber, SequenceNumber, AdvertisedWindowSize
HdrLen is in 32-bit words
6-bit Flag: used to relay control information
URGent set when urgent data is pointed to by UrgPtr
ACK set when AckNum is valid,
PUSh signifies that sender has flushed buffers
ReSeT says receiver has become confused (start over!)
SYNchronize, FINish set to establish/terminate connection
Checksum: pseudo header(SrcAdr+DestAdr+Lengths) + TCP_header + data
Connection Establishment and Termination
(Active Client)
(Passive Server)
3-WAY Handshake:
Client sends a segment to
the server with its start Seq#
(SYN=1, SeqNum=x)
Server sends a segment with
(SYN=1,ACK=1,AckNum=x+1,
SeqNum=y (its own start Seq#))
Client sends ack segment
with (ACK=1, AckNum=y+1)
Normal case
Call collision
Sliding Window
1.
2.
3.
Guarantees the reliable delivery of data
Ensures that data is delivered in order
Enforces flow control (that sender does not overrun receiver)
Basically the same as in the sliding window algorithm at the link level
For 1. (guaranteed reliable delivery).
Where TCP sliding window differs is that it folds flow control in as well.
Rather than fixed size window, receiver advertises a window size thru the
AdvertiseWindow field (based on available buffers).
Sender then is limited to having no more than that window size.
Treatment of Sequence number wrap-around is essentially the same as link level.
Sliding Window
Each byte has a
Sequence number.
ACKs are
Cumulative.
Send buffer
Window
Window
Sending side
LastByteAcked  LastByteSent
LastByteSent  LastByteWritten
Bytes between LastByteAcked and LastByteWritten must be buffered
Receiving side
– LastByteRead < NextByteExpected
– bytes between NextByteRead and LastByteRcvd must be buffered
Keeping the Pipe Full
Bandwidth
T1 (1.5Mbps)
Ethernet (10Mbps)
T3 (45Mbps)
FDDI (100Mbps)
STS-3 (155Mbps)
STS-12 (622Mbps)
STS-24 (1.2Gbps)
Time til Wraparound
6.4 hours
57 minutes
13 minutes
6 minutes
4 minutes
55 seconds
28 seconds
Delay x Bandwidth Product
18KB
122KB
549KB
1.2MB
1.8MB
7.4MB
14.8MB
Adaptive Retransmission
•
Original Algorithm
– Measure SampleRTT for each segment/ACK pair
– Compute weighted average of RTT
• EstimatedRTT = a * EstimatedRTT + b * SampleRTT
• where a + b = 1
• a between 0.8 and 0.9
• b between 0.1 and 0.2
– Set timeout based on EstimatedRTT
• TimeOut = 2 * EstimatedRTT
Karn/Partridge Algorithm
• Do not sample RTT when retransmitting
• Double timeout after each retransmission
Jacobson/Karels Algorithm (used today)
• New calculation for average RTT
Diff =
EstimatedRTT =
Deviation =
SampleRTT - EstimatedRTT
EstimatedRTT + (d * Diff)
Deviation + d * (|Diff|- Deviation)
(where d is a fraction between 0 and 1)
• Setting timeout value
TimeOut = m x EstimatedRTT + f x Deviation
(where m = 1 and f = 4)
• Notes
– algorithm only as good as granularity of clock (500ms on Unix)
– accurate timeout mechanism important to congestion control (later)
• TCP Extensions proposed
– Store timestamp in outgoing segments
– Use 32-bit timestamp
– Make modifications to advertised window
Remote Procedure Call (RPC)
Protocol Stack
BLAST: fragments and reassembles large messages
CHAN: synchronizes request and reply messages
SELECT: dispatches message to correct process
Simple RPC Protocol Stack
SELECT
Caller
(client)
Callee
(server)
Return
value
Arguments
Arguments
Reply
RPC
protocol
CHAN
Reply
BLAST
Server
stub
Client
stub
Request
Return
value
Request
RPC
protocol
IP
ETH
Bulk Transfer (BLAST)
Sender
Unlike AAL and IP, BLAST tries to recover
from lost fragments
Strategy
– Accumulates acks
– selective retransmission
– aka partial acknowledgements
0
16
31
ProtNum
MID
Blast header format
MID protects against wraparound
Length
NumFrags
Type
FragMask
Data
NumFrags = number of fragments
TYPE = DATA or SRR
FragMask distinguishes fragments
if Type=DATA, identifies this frag
if Type=SRR, identifies missing frags
Receiver
Request/Reply (CHAN)
• Guarantees message delivery
• Synchronizes client with server
• Supports at-most-once semantics
Client
Server
Timeline using Implicit Acks
Client
Server
…
Simple timeline
CHAN Header Format
typedef struct {
u_short Type;
u_short CID;
int
MID;
int
BID;
int
Length;
int
ProtNum;
} ChanHdr;
/*
/*
/*
/*
/*
/*
REQ, REP, ACK, PROBE */
unique channel id */
unique message id */
unique boot id */
length of message */
high-level protocol */
CHAN Session State
typedef struct {
u_char
type;
u_char
status;
int
retries;
int
timeout;
XkReturn ret_val;
Msg
*request;
Msg
*reply;
Semaphore reply_sem;
int
mid;
int
bid;
} ChanState;
/*
/*
/*
/*
/*
/*
/*
/*
/*
/*
CLIENT or SERVER */
BUSY or IDLE */
number of retries */
timeout value */
return value */
request message */
reply message */
client semaphore */
message id */
boot id */
Dispatcher (SELECT)
• Dispatch to
appropriate
procedure
• Synchronous
counterpart to UDP
• Address Space for Procedures
– flat: unique id for each possible procedure
– hierarchical: program + procedure number
SunRPC
SunRPC
• IP implements ~BLAST-equivalent
UDP
• SunRPC implements ~CHAN-equivalent
• UDP + SunRPC implement SELECT-equivalent
IP
– UDP dispatches to program (ports bound to programs)
– SunRPC dispatches to procedure within program
• SUN RPC header:
– XID (transaction id) similar to CHAN’s MID
– Server does not remember last XID it serviced
– Problem if client retransmits request
while reply is in transit
0
ETH
31
0
31
XID
XID
MsgType = CALL
MsgType = REPLY
RPCVersion = 2
Status = ACCEPTED
Program
Data
Version
Procedure
Credentials (variable)
Verifier (variable)
Data
Presentation Formatting
• Data types considered
– integers
– floats
– strings
– arrays
– structs
Application
Application
data
data
Presentation
Presentation
encoding
decoding
…
Message
• Types of data not considered
– images
– video
– multimedia documents
Message
Message