Chapter 2: Application layer

Download Report

Transcript Chapter 2: Application layer

What we covered last week
How to calculate the delay in
packet delivery
 Queueing delay

A few application protocols
 HTTP
HTTP 1.0
 HTTP 1.1

and congestion losses
 Transmission delay
• Pipelining
• Without pipelining
 Propagation delay
 Round Trip Time
 FTP
 Bandwidth-delay product
 Email and SMTP
 Delay in delivering multiple
packets across multiple hops
4/19/05
1
CS118
Why we need a naming database for the Internet
Domain Name System
 Hostname to IP address
People: many identifiers:

SSN, name, passport #
translation
Internet hosts, routers:

IP address (32 bit) - used
for addressing datagrams
 “name”, e.g.,
ww.yahoo.com - used by
humans

4/19/05
And vice versa
 has also been used for
other purposes

2
E.g. Load distribution
among replicated Web
servers: one website name
maps to a set of IP
addresses
CS118
DNS: Domain Name System
DNS consists of
 A hierarchical name space
 A distributed database
implemented in hierarchy of
many name servers
 An application-layer protocol
used by hosts, routers, and
name servers to communicate
to resolve names
(address/name translation)

4/19/05
Why not a centralize DNS?
 single point of failure
 traffic volume
 distant centralized
database
 Maintenance
 The most important
factor: the need for
distributed management
A core Internet function,
implemented as an applicationlayer protocol
3
CS118
A hierarchical name space
root
TLD (top level
domains)
com
edu
gov
...
mit
ucla xerox
dec
nasa nsf
org
.....
acm ieee
us
uk
fr
.....
.....
cs seas cad
Foo Bar
 each non-leaf node in the tree is a domain
 Each domain belongs to an administrative authority
 any domain can assign sub-domains below it
 no limit on the depth along any branch
 DNS name hierarchy is completely independent from the
Internet's topological structure
4/19/05
4
CS118
DNS: Implemented as a distributed database
Root DNS Servers
.com DNS servers
yahoo.com
amazon.com
DNS servers DNS servers
.org DNS servers
.edu DNS servers
ucla.edu
DNS servers
pbs.org
DNS servers
cs.ucla.edu
umass.edu
DNS servers
art art.ucla.edu
CS
netsec.cs.ucla.edu
The entire DNS name space is divided to a hierarchy of zones

a zone: a continuous sub-space in the DNS name tree
may contain domains at different levels
4/19/05
5
CS118
What makes a zone
 each zone is controlled by its own administrator,
served by its own name server(s)
 One
master server keeps a master zone file,
distributes it to multiple secondary servers
 Both are called authoritative servers for the zone
 Each server must be able to
 Resolve
all the names in its own zone
 Know where to direct queries for names belonging to
its sub-zones
CS
cs.ucla.edu
netsec.cs.ucla.edu
4/19/05
6
CS118
What's in the zone's master file:
 data that defines the top node of the zone
 including a list of all the servers for the zone
 authoritative data for all nodes in the zone
 for all of the nodes from the top node to leaf nodes (that are
outside of any sub-zone)
 data that describes delegated sub-zones
 Domain name, owner, etc
 “glue data”: IP address(es) for each sub-zone's name
server(s)
cs.ucla.edu
CS
netsec.cs.ucla.edu
4/19/05
7
CS118
How to resolve a DNS name?
EX: your browser needs IP address for www.amazon.com:
 Your host sends a query to a local DNS server
 The local server either finds the answer in its cache, or
otherwise sends a query to a root server
 The root server replies with pointers to .com DNS
servers
 The local server queries .com DNS server which replies
with pointer to amazon.com DNS server
 The local server queries amazon.com DNS server to get
the IP address for www.amazon.com, and sends the
answer back to your host
4/19/05
8
CS118
Local Name Server
 Each ISP (residential ISP, company, university)
has one.
 Also
called “default name server”, "local cache
server"
 Every host knows the IP address(es) of its local
DNS server(s)
 When a host makes a DNS query, query is sent to
its local DNS server
4/19/05
9
CS118
Example
root DNS server
2
 A host at cs.ucla.edu
3
wants IP address for
gaia.umass.edu
.edu DNS server
4
5
local DNS server
Toucan.CS.UCLA.EDU
1
8
requesting host
7
6
authoritative DNS server
dns.umass.edu
lixia.cs.ucla.edu
gaia.umass.edu
4/19/05
10
CS118
Recursive queries
root DNS server
recursive query:
puts burden of name
resolution on
contacted name
server.
heavy load?
2
3
7
4/19/05
.edu DNS server
local DNS server
Toucan.CS.UCLA.EDU
iterated query:
contacted server
replies with name of
server to contact
“I don’t know this
name, but ask this
server”
6
1
5
4
8
requesting host
authoritative DNS server
dns.umass.edu
lixia.cs.ucla.edu
gaia.umass.edu
11
CS118
DNS: caching and replication
 Virtually each and all Internet applications invoke DNS
lookup
 Redundant servers for each zone

“13” root servers
 once a name server learns a DNS name to IP address
mapping, it caches the mapping
 cache entries timeout (deleted) after some time
(specified in the DNS query reply)
 TLD servers typically cached in local name servers
• Thus root name servers not often visited
4/19/05
12
CS118
DNS records
DNS: all DNS servers storing Resource Records (RR)
RR format: (name,
Type=A
name is hostname
value is IP address
 Type=NS
 name is a domain (e.g.
foo.com)
 value is hostname of
authoritative name server
for this domain
4/19/05
value, type, ttl)
Type=CNAME
name is a alias name for some
“canonical” (the real) name
www.ibm.com is really
servereast.backup2.ibm.com
value is canonical name
Type=MX
value is name of mailserver
associated with name
13
e.g. name = cs.ucla.edu
value= mailman.cs.ucla.edu
type = MX
ttl = 172800
CS118
DNS protocol, messages
DNS protocol : query and reply messages, with same message format
msg header
identification: 16 bit # for
query, reply to query
uses same #
flags:
query or reply
recursion desired
recursion available
reply is authoritative
4/19/05
14
CS118
DNS protocol, messages
Name, type fields
for a query
RRs in response
to query
records for
authoritative servers
additional “helpful”
info that may be used
4/19/05
15
CS118
Inserting records into DNS
 Example: just created startup “Network Utopia”
 Register name networkuptopia.com at a registrar (e.g.,
Network Solutions)
Need to provide registrar with names and IP addresses of
your authoritative name servers (primary and secondary)
 Registrar inserts two RRs into the com TLD server:

(networkutopia.com, dns1.networkutopia.com, NS)
(dns1.networkutopia.com, 212.212.212.1, A)
 Put in authoritative server Type A record for
www.networkuptopia.com and Type MX record for
networkutopia.com
How do people get the IP address of Web site
www.networkutopia.com ?
4/19/05
16
CS118
How to use DNS in practice?
Two popular programs you can play on a unix:
 “host” – look up host names using domain servers


Command: host [-l] [-v] [-w] [-r] [-d] [-t query type] host [server]
Manual page: man host
 “nslookup” – query Internet name servers interactively
 Command: nslookup [-options…] [host-to-find | –[server] ]
 Manual page: man nslookup
4/19/05
17
CS118
> nslookup cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16
Name:
cs.ucla.edu
Address: 131.179.128.22
> nslookup -q=MX cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16
cs.ucla.edu pref. = 3, mail exchanger=Mailman.cs.ucla.edu
cs.ucla.edu pref. = 3, mail exchanger=Toucan.cs.ucla.edu
cs.ucla.edu
nameserver = NS0.cs.ucla.edu
cs.ucla.edu
nameserver = NS1.cs.ucla.edu
cs.ucla.edu
nameserver = NS2.cs.ucla.edu
cs.ucla.edu
nameserver = NS3.cs.ucla.edu
Mailman.cs.ucla.edu
internet address = 131.179.128.30
Toucan.cs.ucla.edu
internet address = 131.179.128.16
NS0.cs.ucla.edu internet address = 131.179.128.30
NS1.cs.ucla.edu internet address = 131.179.128.16
NS2.cs.ucla.edu internet address = 131.179.128.17
NS3.cs.ucla.edu internet address = 131.179.128.18
Chapter 2: Application layer
 2.1 Principles of network
applications
app architectures
 app requirements

 2.2 Web and HTTP
 2.4 Electronic Mail
 SMTP, POP3, IMAP
 2.5 DNS
4/19/05
 2.6 P2P file sharing
 2.7 Socket programming
with TCP
 2.8 Socket programming
with UDP
 2.9 Building a Web
server
19
CS118
P2P file sharing
Example
 Alice runs P2P client
application on her notebook
computer
 Intermittently connects to
Internet; gets new IP address
for each connection
 Asks for “Hey Jude”
 Application displays other
peers that have copy of Hey
Jude.
4/19/05
 Alice chooses one of the
peers, Bob.
 File is copied from Bob’s
PC to Alice’s notebook:
HTTP
 While Alice downloads,
other users uploading
from Alice.
 Alice’s peer is both a Web
client and a transient Web
server.
All peers are servers = highly
scalable!
20
CS118
P2P: centralized directory
original “Napster” design
1) when peer connects, it
informs central server:
Bob
centralized
directory server
1
peers
IP address
 content

1
3
1
2) Alice queries for “Hey
Jude”
3) Alice requests file from
Bob
2
1
Alice
4/19/05
21
CS118
P2P: problems with centralized directory
 Single point of failure
file transfer is
decentralized, but
locating content is
highly centralized
 Performance bottleneck
 Copyright infringement
4/19/05
22
CS118
Query flooding: Gnutella
Overlay network: graph
 edge between peer X and
Y if there’s a TCP
connection
 all active peers and edges
is overlay net
 Edge is not a physical
link
 Given peer will typically
be connected with < 10
overlay neighbors
 fully distributed
 no central server
 public domain protocol
 many Gnutella clients
implementing protocol
4/19/05
23
CS118
Gnutella: protocol
File transfer:
HTTP
 Query message
sent over existing TCP
connections
Query
 peers forward
QueryHit
Query message
 QueryHit
sent over
reverse
path
4/19/05
Query
QueryHit
Scalability:
limited scope
flooding
24
CS118
Gnutella: Peer joining
1. Joining peer X must find some other peer in
2.
3.
4.
5.
4/19/05
Gnutella network: use list of candidate peers
X sequentially attempts to make TCP with peers
on list until connection setup with Y
X sends Ping message to Y; Y forwards Ping
message.
All peers receiving Ping message respond with
Pong message
X receives many Pong messages. It can then
setup additional TCP connections
25
CS118
Exploiting heterogeneity: KaZaA
 Each peer is either a group
leader or assigned to a
group leader.
TCP connection between
peer and its group leader.
 TCP connections between
some pairs of group leaders.

 Group leader tracks the
content in all its children.
ordinary peer
group-leader peer
neighoring relationships
in overlay network
4/19/05
26
CS118
KaZaA: Querying
 Each file has a hash and a descriptor
 Client sends keyword query to its group leader
 Group leader responds with matches:
 For each match: metadata, hash, IP address
 If group leader forwards query to other group
leaders, they respond with matches
 Client then selects files for downloading
 HTTP requests
using hash as identifier sent to peers
holding desired file
4/19/05
27
CS118
KaZaA tricks
 Limitations on simultaneous uploads
 Request queuing
 Incentive priorities
 Parallel downloading
4/19/05
28
CS118
Chapter 2: Application layer
 2.1 Principles of network
applications
 2.2 Web and HTTP
 2.3 FTP
 2.4 Electronic Mail

SMTP, POP3, IMAP
 2.5 DNS
4/19/05
 2.6 P2P file sharing
 2.7 Socket programming
with TCP
 2.8 Socket programming
with UDP
 2.9 Building a Web
server
29
CS118
Socket-programming using TCP
Socket: a door between application process and end-endtransport protocol (UCP or TCP)
TCP service: reliable transfer of bytes from one process to
another
controlled by
application
developer
controlled by
operating
system
process
process
socket
TCP with
buffers,
variables
internet
controlled by
operating
system
host or
server
host or
server
4/19/05
socket
TCP with
buffers,
variables
controlled by
application
developer
30
CS118
Socket programming with TCP
Client must contact server
 server process must first
be running
 server must have created
socket (door) that
welcomes client’s contact
 When contacted by client,
server TCP creates new
socket for server process to
communicate with client
 allows server to talk
with multiple clients
 source port numbers
used to distinguish
clients (more in Chap 3)
Client contacts server by:
 creating client-local TCP
socket
 specifying IP address, port
number of server process
 When client creates
socket: client TCP
establishes connection to
server TCP
4/19/05
application viewpoint
TCP provides reliable, in-order
transfer of bytes (“pipe”)
between client and server
31
CS118
Client/server socket interaction: TCP
Server (running on hostid)
Client
create socket,
port=x, for
incoming request:
welcomeSocket =
ServerSocket()
TCP
wait for incoming
connection request connection
connectionSocket =
welcomeSocket.accept()
setup
send request using
clientSocket
read request from
connectionSocket
write reply to
connectionSocket
read reply from
clientSocket
close
connectionSocket
4/19/05
create socket,
connect to hostid, port=x
clientSocket =
Socket()
close
clientSocket
32
CS118
Socket programming with UDP
UDP: no “connection” between
client and server
 no handshaking
 sender explicitly attaches IP
address and port of
destination to each packet
 server must extract IP
address, port of sender from
received packet
application viewpoint
UDP provides unreliable transfer
of chunks of bytes (“datagrams”)
between client and server
UDP: transmitted data may be
lost, or received out of order
4/19/05
33
CS118
Client/server socket interaction: UDP
Server (running on hostid)
Client
create socket,
port=x, for
incoming request:
serverSocket =
DatagramSocket()
create socket,
clientSocket =
DatagramSocket()
Create, address (hostid, port=x,
send datagram request
using clientSocket
read request from
serverSocket
write reply to
serverSocket
specifying client
host address,
port number
4/19/05
read reply from
clientSocket
close
clientSocket
34
CS118
Chapter 2: Summary
 Application architectures
 client-server
 P2P
Learned about protocols
 typical request/reply message
exchange:
 Specific application protocols
 HTTP, FTP, SMTP, POP, IMAP,
DNS


 application service
 Typical message formats:
 headers: fields giving info about
data
 data: info being communicated
requirements:

reliability, bandwidth, delay
 Internet transport service
model


4/19/05
client sends request
server responds with status
code, data
 In-band vs. out-of-band
connection-oriented, reliable:
TCP
unreliable, datagrams: UDP
control messages
 Stateless vs. stateful protocols
35
CS118
identifying servers and services
 Each service is assigned a unique well-known port #
 HTTP: TCP/80, FTP: TCP/21, smtp: TCP/25, DNS: UDP/53
 server application process registers with local protocol
software with that port #
 a client requests a service by sending request to a
specific server host with the well-known port #
 server handles multiple requests concurrently
master process accepts incoming requests and creates a child
server process for each client, then goes to wait for future
request
 the child server process handles all msgs from the same client
process, each incoming msg identifies its server process by
(source addr + port, destination addr + port)

4/19/05
36
CS118
Transport services and protocols
 provide logical communication
application
transport
network
data link
physical
between app processes running on
different hosts
 transport protocols run in end
systems
 send side: breaks app
messages into segments,
passes to network layer
 rcv side: reassembles segments
into messages, passes to app
layer
 more than one transport protocol
available to apps
 Internet: TCP and UDP
4/19/05
network
data link
physical
network
data link
physical
network
data link
physical
network
data link
physical
network
data link
physical
application
transport
network
data link
physical
37
CS118
Transport vs. network layer
 network layer: logical
Household analogy:
12 kids sending letters to 12
kids
 processes = kids
 app messages = letters in
envelopes
 hosts = houses
 transport protocol = Ann
and Bill
 network-layer protocol =
postal service
communication between
hosts
 transport layer: logical
communication between
processes

4/19/05
relies on, enhances,
network layer services
38
CS118
Internet transport-layer protocols
 reliable, in-order
application
transport
network
data link
physical
delivery (TCP)
congestion control
 flow control
 connection setup

network
data link
physical
network
data link
physical
 unreliable, unordered
no-frills extension of
“best-effort” IP
application
transport
network
data link
physical
 services not available:
 delay guarantees
 bandwidth guarantees
4/19/05
network
data link
physical
network
data link
physical
delivery: UDP

network
data link
physical
39
CS118
Multiplexing/demultiplexing
Multiplexing at send host:
Demultiplexing at rcv host:
delivering received segments
to correct socket
= socket
application
transport
network
link
gathering data from multiple
sockets, enveloping data with
header (later used for
demultiplexing)
= process
P3
P1
P1
application
transport
network
P2
P4
application
transport
network
link
link
physical
host 1
physical
host 2
physical
host 3
Each process is identified by IP address and port#
A transport association is identified by [source addr, port#; destination addr, port#]
4/19/05
40
CS118
Multiplexing/demultiplexing: examples
 host receives IP datagrams
each datagram has source IP
address, destination IP address
 each datagram carries 1 transportlayer segment
 each segment has source,
destination port number
 host uses IP addresses & port numbers
to direct segment to appropriate socket

Web client
host A
4/19/05
Source IP: A
Dest IP: B
sour port:1180
dest. port: 80
Web clients
host C
Source IP: C
Dest IP: B
sour port:2211
dest. port: 80
Source IP: C
Dest IP: B
sour port:1180
dest. port: 80
Web
server B
port use: Web server
41
CS118
UDP: User Datagram Protocol [RFC 768]
 “best effort” service, UDP
segments may be:
 lost
 delivered out of order to
application processes
 connectionless:
 no prior handshaking
between UDP sender,
receiver
 each UDP segment
handled independently
of others
4/19/05
Why is there a UDP?
 no connection establishment
(which can add delay)
 simple: no connection state at
sender, receiver
 small segment header
 no congestion control: UDP can
blast away as fast as desired
42
CS118
UDP: more
 often used for streaming
multimedia apps
 loss tolerant
 rate sensitive
32 bits
Length, in
bytes of UDP
segment,
including
header
 other UDP uses
 DNS
 SNMP
 reliable transfer over
UDP: add reliability at
application layer
 application-specific
error recovery!
4/19/05
source port #
dest port #
length
checksum
Application
data
(message)
UDP segment format
43
CS118
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted
segment
Sender:
Receiver:
 treat segment contents as
 compute checksum of
received segment
 check if computed checksum
equals checksum field value:
 NO - error detected
 YES - no error detected.
But maybe errors
nonetheless? More later
….
sequence of 16-bit integers
 checksum: addition (1’s
complement sum) of
segment contents
 sender puts checksum value
into UDP checksum field
4/19/05
44
CS118
Internet Checksum Example
 Note: When
adding numbers, a carryout
from the most significant bit needs to be
added to the result
 Example: add two 16-bit integers
1 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
1 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
4/19/05
45
CS118
How to Calculate UDP Checksum
 UDP header
 Length: # of bytes (including
both header & data)
 checksum: computed over
UDP header format
32 bits
source port #
dest port #
length
checksum
Application
data
(message)
• the pseudo header, and
• UDP header and data.
• if the field is 0, no checksum
 pseudo header: UDP's self-protection against misdelivered
IP packets
pseudo header is not
carried in UDP packet,
nor counted in the
length field
source IP address

4/19/05
destination IP address
zero
46
protocol
UDP length
CS118
Chapter 3 outline
 3.1 Transport-layer
 3.5 Connection-oriented
services
transport: TCP
 3.2 Multiplexing and
 segment structure
demultiplexing
 reliable data transfer
 3.3 Connectionless
 flow control
transport: UDP
 connection management
 3.4 Principles of reliable
 3.6 Principles of
data transfer
congestion control
 3.7 TCP congestion
control
4/19/05
47
CS118
Principles of Reliable data transfer
 characteristics of unreliable channel determines complexity of
reliable data transfer protocol (rdt)
We’ll:
 incrementally develop sender, receiver sides of reliable data
transfer protocol (rdt)
 consider only unidirectional data transfer

4/19/05
but control info will flow in both directions!
48
CS118
Reliable data transfer: getting started
rdt_send(): called from above,
(e.g., by app.). Passed data to
deliver to receiver upper layer
deliver_data(): called by
rdt to deliver data to upper layer
send
side
receive
side
rdt_rcv(): called when packet
arrives on rcv-side of channel
udt_send(): called by rdt,
to transfer packet over
unreliable channel to receiver
4/19/05
49
CS118
Reliable data transfer: getting started
 use finite state machines (FSM) to specify
sender, receiver
event causing state transition
actions taken on state transition
state
1
event
actions
state: when in this “state”,
the next state is
uniquely determined by
next event
4/19/05
state
2
State 3
50
CS118
Rdt1.0: reliable transfer over a reliable channel
 underlying channel perfectly reliable
 no
bit errors
 no loss of packets
 separate FSMs for sender, receiver:
 sender
sends data into underlying channel
 receiver read data from underlying channel
Wait for
call from
above
rdt_send(data)
Wait for
call from
below
packet = make_pkt(data)
udt_send(packet)
sender
4/19/05
rdt_rcv(packet)
extract (packet,data)
deliver_data(data)
receiver
51
CS118
Rdt2.0: channel with bit errors
 underlying channel may flip bits in packet
 checksum to detect bit errors
 the question: how to recover from errors:
 acknowledgements (ACKs): receiver explicitly tells sender that pkt
received OK
 negative acknowledgements (NAKs): receiver explicitly tells sender
that pkt had errors
 sender retransmits pkt on receipt of NAK
 new mechanisms in rdt2.0 (beyond rdt1.0):


4/19/05
error detection
receiver feedback: control msgs (ACK,NAK) rcvr->sender
52
CS118
rdt2.0: FSM specification
rdt_send(data)
snkpkt = make_pkt(data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
isNAK(rcvpkt)
Wait for
Wait for
call from
ACK or
udt_send(sndpkt)
above
NAK
rdt_rcv(rcvpkt) && isACK(rcvpkt)
L
receiver
rdt_rcv(rcvpkt) &&
corrupt(rcvpkt)
udt_send(NAK)
Wait for
call from
below
sender
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
udt_send(ACK)
4/19/05
53
CS118
rdt2.0: operation with no errors
rdt_send(data)
snkpkt = make_pkt(data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
isNAK(rcvpkt)
Wait for
Wait for
call from
ACK or
udt_send(sndpkt)
above
NAK
rdt_rcv(rcvpkt) && isACK(rcvpkt)
rdt_rcv(rcvpkt) &&
corrupt(rcvpkt)
udt_send(NAK)
Wait for
call from
below
L
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
udt_send(ACK)
sender FSM
4/19/05
receiver FSM
54
CS118
rdt2.0: error scenario
rdt_send(data)
snkpkt = make_pkt(data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
isNAK(rcvpkt)
Wait for
Wait for
call from
ACK or
udt_send(sndpkt)
above
NAK
rdt_rcv(rcvpkt) && isACK(rcvpkt)
rdt_rcv(rcvpkt) &&
corrupt(rcvpkt)
udt_send(NAK)
Wait for
call from
below
L
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
udt_send(ACK)
sender FSM
4/19/05
receiver FSM
55
CS118
rdt2.0 has a fatal flaw!
Handling duplicates:
What happens if
ACK/NAK corrupted?  sender retransmits current pkt
if ACK/NAK garbled
 sender adds sequence number
to each pkt
 receiver discards (doesn’t
deliver up) duplicate pkt
 sender doesn’t know what
happened at receiver!
 can’t just retransmit:
possible duplicate
stop and wait
Sender sends one packet,
then waits for receiver
response
4/19/05
56
CS118
rdt2.1: sender, handles garbled ACK/NAKs
rdt_send(data)
sndpkt = make_pkt(0, data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
Wait
for
Wait for
isNAK(rcvpkt) )
ACK or
call 0 from
udt_send(sndpkt)
NAK 0
above
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt)
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt)
L
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
isNAK(rcvpkt) )
udt_send(sndpkt)
4/19/05
L
Wait for
ACK or
NAK 1
Wait for
call 1 from
above
rdt_send(data)
sndpkt = make_pkt(1, data, checksum)
udt_send(sndpkt)
57
CS118
rdt2.1: receiver, handles garbled ACK/NAKs
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
&& has_seq0(rcvpkt)
rdt_rcv(rcvpkt) && (corrupt(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) && (corrupt(rcvpkt)
sndpkt = make_pkt(NAK, chksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
not corrupt(rcvpkt) &&
has_seq1(rcvpkt)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
sndpkt = make_pkt(NAK, chksum)
udt_send(sndpkt)
Wait for
0 from
below
Wait for
1 from
below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
&& has_seq1(rcvpkt)
rdt_rcv(rcvpkt) &&
not corrupt(rcvpkt) &&
has_seq0(rcvpkt)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(ACK, chksum)
udt_send(sndpkt)
4/19/05
58
CS118
rdt2.1: discussion
Sender:
 seq # added to pkt
 two seq. #’s (0,1) will
suffice. Why?
 must check if received
ACK/NAK corrupted
 twice as many states

4/19/05
Receiver:
 must check if received
packet is duplicate

state indicates whether 0
or 1 is expected pkt seq #
 note: receiver cannot
know if its last
ACK/NAK received OK
at sender
state must “remember”
whether “current” pkt has
0 or 1 seq. #
59
CS118
rdt2.2: a NAK-free protocol
 same functionality as rdt2.1, using ACKs only
 instead of NAK, receiver sends ACK for last pkt
received OK
 receiver
must explicitly include seq # of pkt being
ACKed
 duplicate ACK at sender results in same action as
NAK: retransmit current pkt
4/19/05
60
CS118
rdt2.2: sender, receiver fragments
rdt_send(data)
sndpkt = make_pkt(0, data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
Wait for
Wait for
isACK(rcvpkt,1) )
ACK
call 0 from
0
udt_send(sndpkt)
above
sender FSM
fragment
rdt_rcv(rcvpkt) &&
(corrupt(rcvpkt) ||
has_seq1(rcvpkt))
udt_send(sndpkt)
Wait for
0 from
below
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt,0)
receiver FSM
fragment
L
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
&& has_seq1(rcvpkt)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(ACK1, chksum)
udt_send(sndpkt)
4/19/05
61
CS118
rdt3.0: channels with bit errors and packet loss
Approach: sender waits
“reasonable” amount of time
for ACK
 retransmits if no ACK
received in this time
 if pkt (or ACK) just delayed
(not lost):
New assumption: underlying
channel can also lose
packets (data or ACKs)

checksum, seq. #, ACKs,
retransmissions will be of
help, but not enough


retransmission will be
duplicate, but use of seq. #’s
already handles this
receiver must specify seq # of
pkt being ACKed
 requires countdown timer
4/19/05
62
CS118
rdt3.0 sender
rdt_send(data)
sndpkt = make_pkt(0, data, checksum)
udt_send(sndpkt)
start_timer
rdt_rcv(rcvpkt)
L
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt,1)
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
isACK(rcvpkt,0) )
4/19/05
timeout
udt_send(sndpkt)
start_timer
rdt_rcv(rcvpkt)
&& notcorrupt(rcvpkt)
&& isACK(rcvpkt,0)
stop_timer
stop_timer
timeout
udt_send(sndpkt)
start_timer
L
Wait
for
ACK0
Wait for
call 0from
above
L
rdt_rcv(rcvpkt) &&
( corrupt(rcvpkt) ||
isACK(rcvpkt,1) )
Wait
for
ACK1
Wait for
call 1 from
above
rdt_send(data)
rdt_rcv(rcvpkt)
L
sndpkt = make_pkt(1, data, checksum)
udt_send(sndpkt)
start_timer
63
CS118
rdt3.0 in action
4/19/05
64
CS118
rdt3.0 in action
4/19/05
65
CS118
Performance of rdt3.0
 example: 1 Gbps link, 15 ms prop. delay, 1KB packet:
Ttransmit =
U
L (packet length in bits)
8kb/pkt
=
= 8 microsec
R (transmission rate, bps)
10**9 b/sec
=
sender
L/R
RTT + L / R
=
.008
30.008
= 0.00027
microsec
onds
U sender: utilization – fraction of time sender busy sending
 1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps
link
 network protocol limits use of physical resources!

4/19/05
66
CS118
rdt3.0: stop-and-wait operation
sender
receiver
first packet bit transmitted, t = 0
last packet bit transmitted, t = L / R
first packet bit arrives
last packet bit arrives, send ACK
RTT
ACK arrives, send next
packet, t = RTT + L / R
U
4/19/05
=
sender
L/R
RTT + L / R
67
=
.008
30.008
= 0.00027
microsec
onds
CS118
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yetto-be-acknowledged pkts
range of sequence numbers must be increased
 buffering at sender and/or receiver

4/19/05
68
CS118
Pipelining: increased utilization
sender
receiver
first packet bit transmitted, t = 0
last bit transmitted, t = L / R
first packet bit arrives
last packet bit arrives, send ACK
last bit of 2nd packet arrives, send ACK
last bit of 3rd packet arrives, send ACK
RTT
ACK arrives, send next
packet, t = RTT + L / R
Increase utilization
by a factor of 3!
U
4/19/05
sender
=
3*L/R
RTT + L / R
=
69
.024
30.008
= 0.0008
microsecon
ds
CS118
What if some packets get lost?
 Two generic forms of pipelined protocols: go-
Back-N, selective repeat
4/19/05
70
CS118
Go-Back-N
Sender:
 k-bit seq # in pkt header
 “window” of up to N, consecutive unack’ed pkts allowed
 ACK(n): ACKs all pkts up to, including seq # n (cumulative ACK)
may receive duplicate ACKs (see receiver)
 timer for each in-flight pkt
 timeout(n): retransmit pkt n and all higher seq # pkts in window

4/19/05
71
CS118
GBN: sender extended FSM
Call from application
rdt_send(data)
L
base=1
nextseqnum=1
Call from network
if (nextseqnum < base+N) {
sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
udt_send(sndpkt[nextseqnum])
if (base == nextseqnum)
start_timer
nextseqnum++
}
else
refuse_data(data)
timeout
start_timer
udt_send(sndpkt[base])
udt_send(sndpkt[base+1])
…
udt_send(sndpkt[nextseqnum-1])
Wait
rdt_rcv(rcvpkt)
&& corrupt(rcvpkt)
rdt_rcv(rcvpkt) &&
notcorrupt(rcvpkt)
base = getacknum(rcvpkt)+1
If (base == nextseqnum)
stop_timer
else
start_timer
4/19/05
72
CS118
GBN: receiver extended FSM
default
udt_send(sndpkt)
L
Wait
expectedseqnum=1
sndpkt =
make_pkt(expectedseqnum,ACK,chksum)
rdt_rcv(rcvpkt)
&& notcurrupt(rcvpkt)
&& hasseqnum(rcvpkt,expectedseqnum)
extract(rcvpkt,data)
deliver_data(data)
sndpkt = make_pkt(expectedseqnum,ACK,chksum)
udt_send(sndpkt)
expectedseqnum++
ACK-only: always send ACK for correctly-received pkt with
highest in-order seq #


may generate duplicate ACKs
need only remember expectedseqnum
 out-of-order pkt:
 discard (don’t buffer) -> no receiver buffering!
 Re-ACK pkt with highest in-order seq #
4/19/05
73
CS118
GBN in action
4/19/05
74
CS118
Selective Repeat
 receiver individually acknowledges all correctly
received pkts

buffers pkts (which may have arrived out-of-order), as
needed, for eventual in-order delivery to upper layer
 sender only resends pkts for which ACK not
received

sender timer for each unACKed pkt
 sender window
 N consecutive seq #’s
 again limits seq #s of sent, unACKed pkts
4/19/05
75
CS118
Selective repeat: sender, receiver windows
4/19/05
76
CS118
sender
Selective repeat
receiver
data from above :
pkt n in [rcvbase, rcvbase+N-1]
 if next available seq # in
 send ACK(n)
window, send pkt
timeout(n):
 out-of-order: buffer
 resend pkt n, restart timer
 in-order: deliver (also
deliver buffered, in-order
pkts), advance window to
next not-yet-received pkt
ACK(n) in [sendbase,sendbase+N]:
 mark pkt n as received
 if n smallest unACKed pkt,
pkt n in [rcvbase-N,rcvbase-1]
advance window base to next
unACKed seq #
 ACK(n)
otherwise:
 ignore
4/19/05
77
CS118
Selective repeat in action
4/19/05
78
CS118
Selective repeat: dilemma
Example:
 seq #’s: 0, 1, 2, 3
 window size=3
 receiver sees no
difference in two
scenarios!
 incorrectly passes
duplicate data as new in
(a)
Q: what relationship
between seq # size and
window size?
4/19/05
79
CS118
Sequence number: how many bits needed?
Example: Window size = 4, is 2 bits enough?
sender
1 2 3
0
1 2
3
0
1 2 3
0 1 2
3
0
reciver
(Max. seq# + 1) / 2  window-size
4/19/05
80
CS118
Three basic components
in reliable data delivery by retransmission
 sequence number: used by both sender and
receiver to uniquely identify individual frames
 Acknowledgment (ACK): reception report sent
by receiver
 Retransmission by the sender upon TIMEOUT
 must
4/19/05
know how long to wait before retry
81
CS118
Always Keeps the Big Picture in Mind
M
Ht M
Hn Ht M
Hl Hn Ht M
application
transport
network
link
physical
Web
server
Web
browser
HTTP
Socket interface
TCP
HTTP
Socket interface
TCP
Unreliable network
data packet
delivery
Application process
Application process
Write
bytes
TCP
TCP
Send buffer
Receive buffer
segment
4/19/05
Read
bytes
82
segment
CS118