Transcript Server

Chapter 2
Application Layer

A note on the use of these ppt slides:
The notes used in this course are substantially based on PowerPoint slides developed and copyrighted by J.F. Kurose and K.W. Ross, 2007.

Computer Networking: A Top Down Approach, 4th edition. Jim Kurose, Keith Ross. Addison-Wesley, July 2007.

2: Application Layer

Chapter 2: Application Layer

- 2.1 Principles of network applications
- 2.2 Web and HTTP
- 2.3 FTP
- 2.4 Electronic Mail
  - SMTP, POP3, IMAP
- 2.5 DNS
- 2.6 P2P file sharing
- 2.7 Socket programming with TCP
- 2.8 Socket programming with UDP
- 2.9 Building a Web server

Application Architectures

- Client-server
- Peer-to-peer (P2P)
- Hybrid of client-server and P2P

Processes Communicating

Process: program running within a host.
- Within the same host, two processes communicate using inter-process communication (defined by the OS)
- Processes in different hosts communicate by exchanging messages

Client process: process that initiates communication
Server process: process that waits to be contacted
- Note: applications with P2P architectures have both client processes and server processes

Addressing Processes

- To receive messages, a process must have an identifier
- A host device has a unique 32-bit IP address
- Q: Does the IP address of the host on which the process runs suffice for identifying the process?
  - Answer: No, many processes can be running on the same host
- The identifier includes both the IP address and the port number associated with the process on the host
- Example port numbers:
  - HTTP server: 80
  - Mail server: 25
- To send an HTTP message to the www.cs.ust.hk web server:
  - IP address: 143.89.40.4
  - Port number: 80
- More on this later

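The addressing rule above can be sketched as a lookup keyed by the (IP address, port) pair. This is a toy model, not real networking code: the `processes` table and the `deliver` function are hypothetical names, and placing a mail server on the same host as the Web server is an assumption made only to show two ports on one IP address.

```python
from typing import Dict, Tuple

Address = Tuple[str, int]  # (IP address, port number)

# Hypothetical registry of listening processes. The mail server entry is an
# assumption for illustration; the slides only name the Web server host.
processes: Dict[Address, str] = {
    ("143.89.40.4", 80): "HTTP server",
    ("143.89.40.4", 25): "mail server",
}


def deliver(ip: str, port: int) -> str:
    """Return the process that would receive a message sent to (ip, port)."""
    return processes.get((ip, port), "no process listening")
```

The IP address alone selects the host; only the pair selects one process on it.
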
App-Layer Protocol Defines

- Types of messages exchanged
  - e.g., request, response
- Message syntax:
  - What fields in messages and how fields are delineated
- Message semantics
  - Meaning of information in fields
- Rules for when and how processes send and respond to messages

Public-domain protocols:
- Defined in RFCs
- Allow for interoperability
- e.g., HTTP, SMTP

Proprietary protocols:
- e.g., KaZaA

What Transport Service Does an App Need?

Data loss
- Some apps (e.g., audio, video) can tolerate some loss
- Other apps (e.g., file transfer, telnet) require 100% reliable data transfer

Timing
- Some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”

Bandwidth
- Some apps (e.g., multimedia) require a minimum amount of bandwidth to be “effective”
- Other apps (“elastic apps”) make use of whatever bandwidth they get (best effort)

Internet Transport Protocols Services

TCP service:
- Connection-oriented: setup required between client and server processes
- Reliable transport between sending and receiving process
- Flow control: sender won’t overwhelm receiver
- Congestion control: throttle sender when network overloaded
- Does not provide: timing, minimum bandwidth guarantees

UDP service:
- Unreliable data transfer between sending and receiving process
- Does not provide: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee

Q: Why bother? Why is there a UDP?
A: Low latency, low overhead, simple

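To make "no connection setup" concrete, here is a minimal sketch that sends one UDP datagram over the loopback interface using Python's standard `socket` module. The function name `udp_round_trip` and the 2-second timeout are assumptions for illustration; on loopback the datagram normally arrives, but UDP itself does not promise that.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
        receiver.settimeout(2.0)         # no delivery guarantee, so do not wait forever
        # No handshake: the sender simply addresses the datagram and sends it.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

A TCP version would need `listen`, `accept`, and `connect` before any data could flow, which is the setup cost UDP avoids.
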
HTTP Overview

HTTP: hypertext transfer protocol
- Web’s application-layer protocol
- Client/server model
  - Client: browser that requests, receives, “displays” Web objects
  - Server: Web server sends objects in response to requests
- HTTP 1.0: RFC 1945
- HTTP 1.1: RFC 2068

[Figure: PC running Explorer and Mac running Navigator as clients; server running Apache Web server]

HTTP Overview (cont.)

Uses TCP:
- Client initiates TCP connection (creates socket) to server, port 80
- Server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed

HTTP is “stateless”
- Server maintains no information about past client requests

Aside: protocols that maintain “state” are complex!
- Past history (state) must be maintained
- If server/client crashes, their views of “state” may be inconsistent and must be reconciled

HTTP Connections

Nonpersistent HTTP
- At most one object is sent over a TCP connection
- Each TCP connection is closed after the server sends the object
- HTTP/1.0 uses nonpersistent HTTP

Persistent HTTP
- Multiple objects can be sent over a single TCP connection between client and server
- HTTP/1.1 uses persistent connections in default mode

Non-Persistent HTTP: Response Time

Definition of RTT: time for a small packet to travel from client to server and back

Response time:
- One RTT to initiate TCP connection
- One RTT for HTTP request and first few bytes of HTTP response to return
- File transmission time

total = 2 RTT + transmit time

[Figure: client/server timeline showing one RTT to initiate the TCP connection, one RTT to request the file, and the time to transmit the file until the file is received]

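The total above can be checked with a few lines of arithmetic. In this sketch the function name and the choice of units (seconds, bits, bits per second) are assumptions, and the transmit time is modeled simply as file size divided by link rate.

```python
def nonpersistent_response_time(rtt: float, file_bits: float, rate_bps: float) -> float:
    """Response time in seconds: 2 RTT + transmit time."""
    transmit_time = file_bits / rate_bps  # time to push the file onto the link
    return 2 * rtt + transmit_time
```

For example, with an RTT of 0.1 s and a 1,000,000-bit file on a 10,000,000 bit/s link, the transmit time is 0.1 s and the total is 0.3 s.
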
Persistent HTTP

Nonpersistent HTTP issues:
- Requires 2 RTTs per object
- OS overhead for each TCP connection
- Browsers often open parallel TCP connections to fetch referenced objects

Persistent HTTP
- Server leaves connection open after sending response
- Subsequent HTTP messages between same client/server sent over open connection

Persistent without pipelining:
- Client issues new request only when previous response has been received
- One RTT for each referenced object

Persistent with pipelining:
- Default in HTTP/1.1
- Client sends requests as soon as it encounters a referenced object
- As little as one RTT for all the referenced objects

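A rough RTT count makes the three modes comparable. The sketch below rests on simplifying assumptions that are not in the slides: one base page plus `n_refs` referenced objects, a single connection at a time (no parallel connections), transmission time ignored, and the best case for pipelining. The function names are hypothetical.

```python
def rtts_nonpersistent(n_refs: int) -> int:
    """Each object, including the base page, costs a TCP setup RTT plus a request RTT."""
    return 2 * (1 + n_refs)


def rtts_persistent(n_refs: int, pipelining: bool) -> int:
    """One TCP setup RTT and one RTT for the base page, then the referenced objects."""
    base = 2
    if n_refs == 0:
        return base
    # Pipelined: all requests go out back to back, so about one more RTT in total.
    # Not pipelined: one RTT per referenced object.
    return base + (1 if pipelining else n_refs)
```

With 10 referenced objects this gives 22 RTTs nonpersistent, 12 persistent without pipelining, and 3 persistent with pipelining.
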
User-Server State: Cookies

Why do we need cookies?
- HTTP server is stateless
  - Simplifies server design and lets the server handle thousands of simultaneous TCP connections
- It is often desirable to identify users
  - Restrict user access
  - Serve content as a function of the user identity
- Cookies [RFC 2109]
  - Allow sites to keep track of users

User-Server State: Cookies

Many major Web sites use cookies

Four components:
1) Cookie header line of HTTP response message
2) Cookie header line in HTTP request message
3) Cookie file kept on user’s host, managed by user’s browser
4) Back-end database at Web site

Example:
- Susan always accesses the Internet from the same PC
- She visits a specific e-commerce site for the first time
- When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in its back-end database for that ID

Cookies: Keeping “State” (cont.)

[Figure: message exchange between client and server]
- Client’s cookie file initially holds: ebay: 8734
- Client sends usual HTTP request msg; server creates ID 1678 for user
- Server returns usual HTTP response + Set-cookie: 1678; client’s cookie file now holds: amazon: 1678, ebay: 8734
- Client sends usual HTTP request msg with cookie: 1678; server takes cookie-specific action and returns usual HTTP response msg
- One week later: client (cookie file: amazon: 1678, ebay: 8734) sends usual HTTP request msg with cookie: 1678; server takes cookie-specific action and returns usual HTTP response msg

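The exchange in the figure can be sketched as a small in-memory model. This is not a real HTTP server: the class `CookieSite`, its method names, and the response strings are hypothetical, and the starting ID 1678 is borrowed from the figure only to make the output recognizable.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class CookieSite:
    """Toy Web site that identifies returning users by a cookie ID."""

    def __init__(self, first_id: int = 1678) -> None:
        self._next_id = itertools.count(first_id)
        self.backend: Dict[int, List[str]] = {}  # back-end database: ID -> pages visited

    def handle_request(self, path: str, cookie: Optional[int] = None) -> Tuple[str, Optional[int]]:
        """Return (response body, Set-cookie value or None)."""
        if cookie is None or cookie not in self.backend:
            new_id = next(self._next_id)       # first visit: create an ID
            self.backend[new_id] = [path]      # and a database entry for it
            return "usual response", new_id
        self.backend[cookie].append(path)      # cookie-specific action
        return f"response tailored to user {cookie}", None
```

The HTTP exchange itself stays stateless; the state lives in the browser's cookie file and the site's back-end database.
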
Cookies (continued)

What cookies can bring:
- Authorization
- Shopping carts
- Recommendations
- User session state (Web email)

Cookies and privacy:
- Cookies permit sites to learn a lot about you
- You may supply name and e-mail to sites

Aside: how to keep “state”:
- Protocol endpoints: maintain state at sender/receiver over multiple transactions
- Cookies: HTTP messages carry state

Web Caches (Proxy Server)

Goal: satisfy client request without involving origin server

- User sets browser: Web accesses via cache
- Browser sends all HTTP requests to cache
  - Object in cache: cache returns object
  - Else cache requests object from origin server, then returns object to client

[Figure: clients, proxy server, and origin servers]

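The hit-or-miss logic above fits in a few lines. In this sketch the class `WebCache` and the `fetch_from_origin` callback are hypothetical stand-ins for the proxy and the origin server; eviction and staleness are left out on purpose.

```python
from typing import Callable, Dict


class WebCache:
    """Toy proxy cache: serve from the local store, otherwise ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            self.hits += 1              # object in cache: return it directly
            return self._store[url]
        self.misses += 1
        obj = self._fetch(url)          # cache acts as a client toward the origin
        self._store[url] = obj
        return obj                      # and as a server toward the browser
```

Only the first request for a URL reaches the origin server; later ones are answered locally.
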
More about Web Caching

- Cache acts as both client and server
- Typically cache is installed by ISP (university, company, residential ISP)

Why Web caching?
- Reduce response time for client request
- Reduce traffic on an institution’s access link
- Internet dense with caches: enables “poor” content providers to effectively deliver content (but so does P2P file sharing)

Conditional GET

- The copy of an object residing in the cache may be stale
- Goal: don’t send object if cache has up-to-date cached version
- Cache: specify date of cached copy in HTTP request
    If-modified-since: <date>
- Server: response contains no object if cached copy is up-to-date:
    HTTP/1.0 304 Not Modified

[Figure: message exchange between cache and server]
- Object not modified: cache sends HTTP request msg with If-modified-since: <date>; server sends HTTP response HTTP/1.0 304 Not Modified
- Object modified: cache sends HTTP request msg with If-modified-since: <date>; server sends HTTP response HTTP/1.0 200 OK <data>

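The server's side of the conditional GET can be sketched as one comparison of dates. The function names `conditional_get` and `http_date` are hypothetical, and the sketch assumes timezone-aware UTC datetimes and a well-formed date header; it uses the standard `email.utils` helpers to parse and format HTTP-style dates.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def http_date(moment: datetime) -> str:
    """Format a UTC datetime the way it appears in an HTTP date header."""
    return format_datetime(moment, usegmt=True)


def conditional_get(last_modified: datetime, body: bytes,
                    if_modified_since: Optional[str]) -> Tuple[str, bytes]:
    """Return (status line, body) for a GET that may carry If-modified-since."""
    if if_modified_since is not None:
        cached_date = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_date:
            return "HTTP/1.0 304 Not Modified", b""  # cached copy is up to date
    return "HTTP/1.0 200 OK", body                   # send the object
```

The 304 response carries no object, which is what saves bandwidth when the cached copy is still current.
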
FTP: the File Transfer Protocol

[Figure: user at host with FTP user interface, FTP client, and local file system; file transfer between FTP client and FTP server, which has the remote file system]

- Transfer file to/from remote host
- Client/server model
  - Client: side that initiates transfer (either to/from remote)
  - Server: remote host
- FTP: RFC 959
- FTP server: port 21

FTP: Separate Control, Data Connections

- FTP client contacts FTP server at port 21, specifying TCP as transport protocol
- Client obtains authorization over control connection
- Client browses remote directory by sending commands over control connection
- When server receives file transfer command, server opens 2nd TCP connection (for file) to client
- After transferring one file, server closes data connection
- Server opens another TCP data connection to transfer another file
- Control connection: “out of band”
- FTP server maintains “state”: current directory, earlier authentication

[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (port 20)]

Electronic Mail

Three major components:
- User agents
- Mail servers
- Simple mail transfer protocol: SMTP

User Agent
- a.k.a. “mail reader”
- Composing, editing, reading mail messages
- e.g., Eudora, Outlook, elm, Netscape Messenger
- Outgoing, incoming messages stored on server

[Figure: user agents attached to mail servers; each mail server holds user mailboxes and an outgoing message queue; mail servers are linked by SMTP]

Electronic Mail: Mail Servers

Mail Servers
- Mailbox contains incoming messages for user
- Message queue of outgoing (to be sent) mail messages
- SMTP protocol between mail servers to send email messages
  - Client: sending mail server
  - Server: receiving mail server

[Figure: user agents attached to mail servers; each mail server holds user mailboxes and an outgoing message queue; mail servers are linked by SMTP]

Scenario: Alice Sends Message to Bob

1) Alice uses UA to compose message and “to” [email protected]
2) Alice’s UA sends message to her mail server; message placed in message queue
3) Client side of SMTP opens TCP connection with Bob’s mail server
4) SMTP client sends Alice’s message over the TCP connection
5) Bob’s mail server places the message in Bob’s mailbox
6) Bob invokes his user agent to read message

[Figure: user agent, mail server, mail server, user agent in a row, annotated with steps 1 to 6]

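The six steps can be traced with a toy model of queues and mailboxes. This sketch does not speak SMTP and opens no TCP connection; the class `MailServer`, its methods, and the `a.example` and `b.example` domains used with it are all hypothetical and only show where the message sits at each step.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class MailServer:
    """Toy mail server with an outgoing message queue and per-user mailboxes."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self.mailboxes: Dict[str, List[str]] = {}
        self.queue: Deque[Tuple[str, str]] = deque()

    def submit(self, recipient: str, message: str) -> None:
        """Step 2: the user agent hands the message to its own mail server."""
        self.queue.append((recipient, message))

    def flush(self, servers: Dict[str, "MailServer"]) -> None:
        """Steps 3 to 5: act as the sending side and deliver each queued message."""
        while self.queue:
            recipient, message = self.queue.popleft()
            user, domain = recipient.split("@")
            # The receiving server places the message in the user's mailbox.
            servers[domain].mailboxes.setdefault(user, []).append(message)

    def read(self, user: str) -> List[str]:
        """Step 6: the recipient's user agent reads the mailbox."""
        return list(self.mailboxes.get(user, []))
```

Note that the message waits in the sender's queue until the sending server contacts the receiving server, which is why mail still gets through if the receiver is briefly unreachable.
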
Mail Access Protocols

[Figure: sender’s user agent to sender’s mail server via SMTP; sender’s mail server to receiver’s mail server via SMTP; receiver’s mail server to receiver’s user agent via an access protocol]

- SMTP: delivery/storage to receiver’s server
- Mail access protocol: retrieval from server
  - POP: Post Office Protocol [RFC 1939]
    - Authorization (agent <--> server) and download
  - IMAP: Internet Mail Access Protocol [RFC 1730]
    - More features (more complex)
    - Manipulation of stored msgs on server
  - HTTP: Hotmail, Yahoo! Mail, etc.

DNS: Domain Name System

People: many identifiers:
- SSN, name, passport #

Internet hosts, routers:
- IP address (32 bit) - used for addressing datagrams
- “Name”, e.g., www.yahoo.com - used by humans

Q: How to map between IP addresses and names?

Domain Name System:
- Distributed database implemented in hierarchy of many name servers
- Application-layer protocol: hosts, routers, name servers communicate to resolve names (address/name translation)
  - Note: core Internet function, implemented as application-layer protocol
  - Complexity at network’s “edge”

Distributed, Hierarchical Database

[Figure: tree of DNS servers]
- Root DNS servers
  - com DNS servers: yahoo.com DNS servers, amazon.com DNS servers
  - org DNS servers: pbs.org DNS servers
  - edu DNS servers: poly.edu DNS servers, umass.edu DNS servers

Client wants IP for www.amazon.com; 1st approx:
- Client queries a root server to find com DNS server (top-level domain)
- Client queries com DNS server to get amazon.com DNS server (authoritative)
- Client queries amazon.com DNS server to get IP address for www.amazon.com

DNS: Root Name Servers

- Contacted by local name server that cannot resolve name
- Root name server:
  - Contacts authoritative name server if name mapping not known
  - Gets mapping
  - Returns mapping to local name server

13 root name servers worldwide:
a Verisign, Dulles, VA
b USC-ISI Marina del Rey, CA
c Cogent, Herndon, VA (also Los Angeles)
d U Maryland College Park, MD
e NASA Mt View, CA
f Internet Software C. Palo Alto, CA (and 36 other locations)
g US DoD Vienna, VA
h ARL Aberdeen, MD
i Autonomica, Stockholm (plus 3 other locations)
j Verisign, (21 locations)
k RIPE London (also Amsterdam, Frankfurt)
l ICANN Los Angeles, CA
m WIDE Tokyo

TLD and Authoritative Servers

- Top-level domain (TLD) servers: responsible for com, org, net, edu, etc., and all top-level country domains, e.g., uk, fr, ca, jp
  - Network Solutions maintains servers for com TLD
  - Educause for edu TLD
- Authoritative DNS servers: organization’s DNS servers, providing authoritative hostname to IP mappings for organization’s servers (e.g., Web and mail)
  - Can be maintained by organization or service provider

Local Name Server

- Does not strictly belong to hierarchy
- Each ISP (residential ISP, company, university) has one
  - Also called “default name server”
- When a host makes a DNS query, query is sent to its local DNS server
  - Acts as a proxy, forwards query into hierarchy

Example

Host at cis.poly.edu wants IP address for gaia.cs.umass.edu

[Figure: requesting host cis.poly.edu, local DNS server dns.poly.edu, root DNS server, TLD DNS server, authoritative DNS server dns.cs.umass.edu, and target gaia.cs.umass.edu; messages numbered 1 to 8, with a recursive query from the host to the local DNS server and iterated queries from the local DNS server]

Recursive Queries

Recursive query:
- Puts burden of name resolution on contacted name server
- Heavy load?

Iterated query:
- Contacted server replies with name of server to contact
- “I don’t know this name, but ask this server”

[Figure: recursive resolution among requesting host cis.poly.edu, local DNS server dns.poly.edu, root DNS server, TLD DNS server, and authoritative DNS server dns.cs.umass.edu for gaia.cs.umass.edu; messages numbered 1 to 8]

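The iterated style ("I don't know this name, but ask this server") can be sketched with a table of referrals. This is a toy model with no real DNS messages: the `SERVERS` table, the server label `edu-tld`, and the address 192.0.2.10 (a documentation-only address) are hypothetical placeholders, while the host names come from the slides.

```python
from typing import Dict, List, Tuple

# Each server maps a name suffix to ("referral", next server) or ("answer", IP address).
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"edu": ("referral", "edu-tld")},
    "edu-tld": {"umass.edu": ("referral", "dns.cs.umass.edu")},
    "dns.cs.umass.edu": {"gaia.cs.umass.edu": ("answer", "192.0.2.10")},
}


def query(server: str, name: str) -> Tuple[str, str]:
    """One iterated query: the server answers or names the next server to ask."""
    for suffix, record in SERVERS[server].items():
        if name == suffix or name.endswith("." + suffix):
            return record
    raise KeyError(f"{server} knows nothing about {name}")


def resolve_iteratively(name: str) -> Tuple[str, List[str]]:
    """What the local DNS server does: follow referrals, starting at the root."""
    server = "root"
    contacted: List[str] = []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, contacted
        server = value  # referral: ask the named server next
```

In the recursive style the same chain of servers is involved, but each contacted server does the next lookup itself instead of sending a referral back.
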
P2P: Centralized Directory

Original “Napster” design
1) When peer connects, it informs central server:
   - IP address
   - Content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob

[Figure: peers, including Alice and Bob, around a centralized directory server; arrows labeled 1 (peers inform server), 2 (Alice’s query), and 3 (Alice’s request to Bob)]

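The three steps map onto a small index kept by the central server. The class `DirectoryServer` and its methods are hypothetical, and the peer address 198.51.100.7 is a documentation-only placeholder; the actual file transfer (step 3) happens directly between peers and is not modeled.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class DirectoryServer:
    """Toy centralized directory: maps a content title to the peers that hold it."""

    def __init__(self) -> None:
        self._index: DefaultDict[str, Set[str]] = defaultdict(set)

    def register(self, peer_ip: str, titles: List[str]) -> None:
        """Step 1: a connecting peer reports its IP address and content."""
        for title in titles:
            self._index[title].add(peer_ip)

    def query(self, title: str) -> List[str]:
        """Step 2: return the peers that hold the requested title."""
        return sorted(self._index.get(title, set()))
```

Every lookup goes through this one object, which is exactly the single point of failure and bottleneck discussed on the next slide.
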
P2P: Problems with Centralized Directory

- Single point of failure
- Performance bottleneck
- Copyright infringement

File transfer is decentralized, but locating content is highly centralized

Query Flooding: Gnutella

- Fully distributed
  - No central server
- Public domain protocol
- Many Gnutella clients implementing protocol

Overlay network: graph
- Edge between peer X and Y if there’s a TCP connection
- All active peers and edges form the overlay network
- Edge is not a physical link
- A given peer will typically be connected with < 10 overlay neighbors

Gnutella: Protocol

- Query message sent over existing TCP connections
- Peers forward Query message
- QueryHit sent over reverse path
- File transfer: HTTP

Scalability: limited scope flooding

[Figure: Query messages forwarded from peer to peer across the overlay; QueryHit messages returned along the reverse path]

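Limited-scope flooding can be sketched as a breadth-first search over the overlay graph with a hop limit. The slides do not say how the scope is limited; the `max_hops` parameter here is an assumption, and the function and variable names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple


def flood_query(overlay: Dict[str, List[str]], content: Dict[str, Set[str]],
                origin: str, wanted: str, max_hops: int) -> List[str]:
    """Return the peers within max_hops overlay hops of origin that hold `wanted`."""
    seen = {origin}
    frontier: Deque[Tuple[str, int]] = deque([(origin, 0)])
    hits: List[str] = []
    while frontier:
        peer, hops = frontier.popleft()
        if peer != origin and wanted in content.get(peer, set()):
            hits.append(peer)   # this peer would send a QueryHit along the reverse path
        if hops == max_hops:
            continue            # limited scope: do not forward the Query any further
        for neighbor in overlay.get(peer, []):
            if neighbor not in seen:   # forward each Query to a peer only once
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits
```

A small hop limit keeps traffic bounded but can miss content that sits farther away in the overlay.
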
Hierarchical Overlay

- Between centralized index, query flooding approaches
- Each peer is either a group leader or assigned to a group leader
  - TCP connection between peer and its group leader
  - TCP connections between some pairs of group leaders
- Group leader tracks content in its children

[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]

Comparing Client-Server, P2P Architectures

Question: How much time does it take to distribute a file initially at one server to N other computers?

- us: server upload bandwidth
- ui: client/peer i upload bandwidth
- di: client/peer i download bandwidth
- F: file size

[Figure: server with upload bandwidth us and a file of size F; clients/peers 1 to N with upload bandwidths u1, u2, ..., uN and download bandwidths d1, d2, ..., dN, all attached to a network with abundant bandwidth]

Comparing Client-Server, P2P Architectures

[Figure: Minimum Distribution Time (y-axis, 0 to 3.5) versus N (x-axis, 0 to 35), with one curve for P2P and one for Client-Server]

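The slides show the curves but not the formulas behind them. The sketch below uses the standard lower bounds from the textbook these slides are based on: client-server time is at least max(N·F/us, F/dmin), and P2P time is at least max(F/us, F/dmin, N·F/(us + sum of ui)), where dmin is the smallest download bandwidth. The function names and the sample numbers in the note are assumptions chosen for illustration.

```python
from typing import Sequence


def client_server_time(F: float, u_s: float, d: Sequence[float]) -> float:
    """Lower bound: the server uploads N copies; the slowest client must download F."""
    n = len(d)
    return max(n * F / u_s, F / min(d))


def p2p_time(F: float, u_s: float, u: Sequence[float], d: Sequence[float]) -> float:
    """Lower bound: one copy leaves the server; N copies use the total upload capacity."""
    n = len(d)
    return max(F / u_s, F / min(d), n * F / (u_s + sum(u)))
```

With F = 1, every peer uploading at rate 1, us = 10, and every peer downloading at rate 10, N = 30 gives 3.0 for client-server and 0.75 for P2P: the first grows linearly in N while the second stays below 1.
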
P2P Case Study: BitTorrent

- P2P file distribution
- tracker: tracks peers participating in torrent
- torrent: group of peers exchanging chunks of a file

[Figure: a peer obtains a list of peers from the tracker, then trades chunks with peers in the torrent]

BitTorrent (1)

- File divided into 256KB chunks
- Peer joining torrent:
  - Has no chunks, but will accumulate them over time
  - Registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- While downloading, peer uploads chunks to other peers
- Peers may come and go
- Once peer has entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent (2)

Pulling Chunks
- At any given time, different peers have different subsets of file chunks
- Periodically, a peer (Alice) asks each neighbor for the list of chunks that they have
- Alice issues requests for her missing chunks
  - Rarest first

Sending Chunks: tit-for-tat
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  - Re-evaluate top 4 every 10 secs
- Every 30 secs: randomly select another peer, start sending it chunks
  - Newly chosen peer may join top 4

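Both selection rules reduce to simple rankings. The sketch below is a simplified reading of the slide, not BitTorrent's actual implementation: the function names are hypothetical, the tie-breaking rules (lowest chunk index, alphabetical peer name) are assumptions, and the 10-second and 30-second timers are left out.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def rarest_first(my_chunks: Set[int], neighbor_chunks: Dict[str, Set[int]]) -> Optional[int]:
    """Pick the missing chunk held by the fewest neighbors (ties: lowest index)."""
    counts = Counter(
        chunk
        for chunks in neighbor_chunks.values()
        for chunk in chunks
        if chunk not in my_chunks
    )
    if not counts:
        return None  # nothing missing that any neighbor can supply
    return min(counts, key=lambda chunk: (counts[chunk], chunk))


def top_uploaders(rates: Dict[str, float], k: int = 4) -> List[str]:
    """Neighbors currently sending at the highest rates; these are sent chunks in return."""
    return sorted(rates, key=lambda peer: (-rates[peer], peer))[:k]
```

Requesting rare chunks first spreads them through the torrent before the few peers holding them leave.
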
P2P Case Study: Skype

- P2P (pc-to-pc, pc-to-phone, phone-to-pc) Voice-over-IP (VoIP) application
  - Also IM
- Proprietary application-layer protocol (inferred via reverse engineering)
- Hierarchical overlay

[Figure: Skype clients (SC), supernodes (SN), and the Skype login server]

Skype: Making a Call

- User starts Skype
- SC registers with SN
  - List of bootstrap SNs
- SC logs in (authenticates)
- Call: SC contacts SN with callee ID
  - SN contacts other SNs (unknown protocol, maybe flooding) to find addr of callee; returns addr to SC
- SC directly contacts callee, over TCP

[Figure: Skype login server]

Chapter 2: Summary

Most importantly: learned about protocols

- Typical request/reply message exchange:
  - Client requests info or service
  - Server responds with data, status code
- Message formats:
  - Headers: fields giving info about data
  - Data: info being communicated

Important themes:
- Control vs. data msgs
  - In-band, out-of-band
- Centralized vs. decentralized
- Stateless vs. stateful
- Reliable vs. unreliable msg transfer
- “Complexity at network edge”