Transcript Lecture3
ELEG 651 and CPEG 419
Lecture 3 – Chapter 2
The Application Layer
Goals of this Chapter
• To understand how common application protocols work
– Web (http)
– Email (smtp)
– FTP
– DNS
– P2P
– IM
• To understand the design alternatives for application design
– A network application runs on many hosts; it is a distributed application
– This chapter discusses several designs of distributed applications
Road Map
• Application basics
• Web
• Email
• FTP
• DNS
• P2P
– Graph theory
– State diagrams
– P2P design
• IM
Creating a network app
write programs that
– run on (different) end
systems
– communicate over network
– e.g., web server software
communicates with browser
software
No need to write software for
network-core devices
– Network-core devices do not
run user applications
– keeping applications on end systems allows for rapid app development and propagation
[Figure: the protocol stack (application, transport, network, data link, physical) shown on each end system]
An App-layer protocol defines
• Types of messages
exchanged,
– e.g., request, response
• Message syntax:
– what fields in messages &
how fields are delineated
• Message semantics
– meaning of information in
fields
• Rules for when and how
processes send & respond
to messages
Public-domain protocols:
• defined in RFCs
• allows for
interoperability
• e.g., HTTP, SMTP
Proprietary protocols:
• e.g., Skype
Ports
• An application is identified by the host’s IP address, transport protocol, and port
– E.g., a web server has a particular IP address and listens with TCP on port 80.
– A web browser on a host will connect and request a file from the web server. The browser is identified by the host’s IP address and a TCP port.
[Figure: two hosts, each with separate TCP and UDP port spaces numbered 0 to 2^16-1. Port 80 is marked on the web server host; ports 4567 and 4568 are marked on the other host.]
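The identification of each side by an (IP address, port) pair can be seen with a minimal sketch on a single machine. This is an illustration only: it assumes the loopback address 127.0.0.1 is usable and lets the operating system pick a free server port instead of port 80, so it is not how a real web server is configured.

```python
import socket

# Server side: bind to the loopback address and let the OS pick a free port
# (port 0). A real web server would listen on TCP port 80 instead.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()      # (IP address, port) of the server

# Client side: connect; the OS assigns the client its own TCP port.
client = socket.create_connection(server_addr)
conn, peer_addr = server.accept()
client_addr = client.getsockname()      # (IP address, port) of the client

print("server endpoint:", server_addr)
print("client endpoint:", client_addr)

# The server sees exactly the (IP, port) pair the client reports for itself.
same_endpoint = (peer_addr == client_addr)

for s in (conn, client, server):
    s.close()
```

The client port printed here will differ from run to run, since it is chosen by the operating system.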
What transport service does an app
need?
Data reliability
• some apps (e.g., audio) can
tolerate some loss
• other apps (e.g., file
transfer, telnet) require
100% reliable data transfer
Timing
• some apps (e.g., Internet
telephony, interactive
games) require low delay to
be “effective”
Throughput
• some apps (e.g., multimedia)
require minimum amount of
throughput to be “useful” (i.e., in
order for the user to gain utility)
• other apps (“elastic apps”) make
use of whatever throughput they
get
Security
• Encryption, data integrity, …
Transport service requirements of common
apps
Application            Data loss      Throughput                               Time sensitive
file transfer          no loss        elastic                                  no
e-mail                 no loss        elastic                                  no
Web documents          no loss        somewhat elastic                         not really
real-time audio/video  loss-tolerant  audio: 5kbps-1Mbps, video: 10kbps-5Mbps  yes, 100’s msec
stored audio/video     loss-tolerant  same as above                            yes, few secs
interactive games      loss-tolerant  few kbps up                              yes, 100’s msec
instant messaging      no loss        elastic                                  yes and no
Internet transport protocols services
TCP service:
• connection-oriented: setup required between client and server processes
• reliable transport between sending and receiving process
• flow control: sender won’t overwhelm receiver
• congestion control: throttle sender when network overloaded
• does not provide: timing, minimum throughput guarantees, security
UDP service:
• unreliable data transfer between sending and receiving process
• does not provide: reliability, flow control, congestion control, timing, throughput guarantee, or security
• does not require connection set-up
• packets can be sent at any rate desired (but this might cause considerable congestion)
Internet apps: application, transport protocols
Application             Application layer protocol             Underlying transport protocol
e-mail                  SMTP [RFC 2821]                        TCP
remote terminal access  Telnet [RFC 854]                       TCP
Web                     HTTP [RFC 2616]                        TCP
file transfer           FTP [RFC 959]                          TCP
streaming multimedia    HTTP (e.g., YouTube), RTP [RFC 1889]   TCP or UDP
Internet telephony      SIP, RTP, proprietary (e.g., Skype)    typically UDP
Web and HTTP
• Web page consists of objects
• Object can be HTML file, JPEG image, Java applet, audio file,…
• Web page consists of base HTML-file which includes several referenced objects
• The browser first requests the base file
• The base file specifies text and URLs of objects
• The browser requests these objects, wherever they are (not always on the same server)
• HTTP is used to request the base file and all the other files
• Note that HTTP can be used for other applications besides the web
• Each object is addressable by a URL
• Example URL: www.someschool.edu/someDept/pic.gif
– host name: www.someschool.edu
– path name: /someDept/pic.gif
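The split of a URL into host name and path name can be checked with Python's standard library. This is a small sketch: the "http://" prefix is added here as an assumption, because the standard parser needs a scheme to tell the host apart from the path.

```python
from urllib.parse import urlsplit

# The scheme is prepended so the parser can separate host from path.
url = "http://www.someschool.edu/someDept/pic.gif"
parts = urlsplit(url)

host_name = parts.hostname   # the server the browser connects to
path_name = parts.path       # the object requested from that server

print(host_name)
print(path_name)
```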
HTTP overview
HTTP: hypertext
transfer protocol
• Web’s application layer
protocol
• client/server model
– client: browser that
requests, receives,
“displays” Web objects
– server: Web server
sends objects in
response to requests
[Figure: a PC running Explorer and a Mac running Navigator, both communicating with a server running the Apache Web server]
HTTP overview (continued)
Uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed
HTTP is “stateless”
• server maintains no information about past client requests
aside
Protocols that maintain “state” are complex!
• past history (state) must be maintained
• if server/client crashes, their views of “state” may be inconsistent, must be reconciled
HTTP connections
Nonpersistent HTTP
• At most one object is
sent over a TCP
connection.
Persistent HTTP
• Multiple objects can
be sent over single
TCP connection
between client and
server.
Nonpersistent HTTP
Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg images)
1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
1b. HTTP server at host www.someSchool.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket
4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects
Non-Persistent HTTP: Response time
Definition of RTT: time for a small
packet to travel from client to
server and back.
Response time:
• one RTT to initiate TCP
connection
• one RTT for HTTP request and
first few bytes of HTTP
response to return
• file transmission time
total = 2RTT+transmit time
[Figure: client/server timing diagram. Initiate TCP connection (one RTT), request file (one RTT), time to transmit file, file received.]
Persistent HTTP
• Nonpersistent HTTP issues:
– requires 2 RTTs per object
– OS overhead for each TCP connection
– browsers often open parallel TCP connections to fetch referenced objects
• Persistent HTTP
– server leaves connection open after sending response
– subsequent HTTP messages between same client/server sent over open connection
– client sends requests as soon as it encounters a referenced object
– as little as one RTT for all the referenced objects
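The response-time arithmetic can be sketched as two small functions. The assumptions are mine and simplify the real situation: every object has the same transmit time, all objects are on one server, the nonpersistent client fetches objects one after another (no parallel connections), and the persistent client sends all requests for referenced objects back to back.

```python
def nonpersistent_time(rtt, transmit, n_objects):
    """Base file plus n_objects referenced objects, fetched serially,
    each on its own TCP connection: 2 RTT + transmit time per object."""
    return (1 + n_objects) * (2 * rtt + transmit)


def persistent_time(rtt, transmit, n_objects):
    """One TCP connection: 2 RTT + transmit time for the base file, then
    one more RTT for all referenced objects plus their transmit times."""
    return (2 * rtt + transmit) + (rtt + n_objects * transmit)


# Hypothetical numbers: RTT = 100 msec, 10 msec to transmit each object.
rtt, transmit, n_objects = 0.1, 0.01, 10
t_nonpersistent = nonpersistent_time(rtt, transmit, n_objects)
t_persistent = persistent_time(rtt, transmit, n_objects)
print(t_nonpersistent, t_persistent)
```

With these hypothetical numbers the RTT terms dominate, which is why removing one connection setup per object matters so much.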
HTTP request message
• two types of HTTP messages: request, response
• HTTP request message:
– ASCII (human-readable format)
request line (GET, POST, HEAD commands):
GET /somedir/page.html HTTP/1.1
header lines:
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
end of message: an extra carriage return, line feed after the last header line
HTTP request message: general format
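The layout of a request (request line, header lines, then an extra carriage return and line feed) can be sketched as a pair of helper functions. The function names are hypothetical, and the parser only handles the simple "Name: value" headers shown above.

```python
CRLF = "\r\n"


def build_request(method, path, headers, version="HTTP/1.1"):
    """Return the request as text: request line, header lines, blank line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    return CRLF.join(lines) + CRLF + CRLF


def parse_request(message):
    """Split a request back into its parts (no body handling beyond a split)."""
    head, _, body = message.partition(CRLF + CRLF)
    request_line, *header_lines = head.split(CRLF)
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers, body


request = build_request(
    "GET",
    "/somedir/page.html",
    [("Host", "www.someschool.edu"),
     ("User-agent", "Mozilla/4.0"),
     ("Connection", "close"),
     ("Accept-language", "fr")],
)
print(request)
```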
HTTP response message
status line (protocol, status code, status phrase):
HTTP/1.1 200 OK
header lines:
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html
data, e.g., requested HTML file:
data data data data data ...
HTTP response status codes
In first line in server->client response message.
A few sample codes:
200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in this message
(Location:)
400 Bad Request
– request message not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
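A client reads the status code from the first line of the response. A minimal sketch of that step follows; the function names are hypothetical, and the grouping by first digit reflects the general HTTP convention rather than anything specific to this lecture.

```python
def parse_status_line(line):
    """Split 'HTTP/1.1 200 OK' into (protocol, status code, status phrase)."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Group a status code by its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "other")


for line in ("HTTP/1.1 200 OK",
             "HTTP/1.1 301 Moved Permanently",
             "HTTP/1.1 404 Not Found"):
    version, code, phrase = parse_status_line(line)
    print(code, phrase, "->", status_class(code))
```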
Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
telnet cis.poly.edu 80
Opens TCP connection to port 80 (default HTTP server port) at cis.poly.edu. Anything typed in is sent to port 80 at cis.poly.edu
2. Type in a GET HTTP request:
GET /~ross/ HTTP/1.1
Host: cis.poly.edu
By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to HTTP server
3. Look at response message sent by HTTP server!
Wireshark (ethereal)
• Wireshark captures all packets that pass through the host’s interface
• To run Wireshark, libpcap (Linux) or WinPcap (Windows) must be installed. It comes with the Wireshark package
• Then, run Wireshark
• Select Capture
• Find the active interface
– E.g., not generic dialup, nor VPN, nor packet scheduler, but wireless …. with IP address
– Then select prepare
– Let’s watch TCP packets on port 80
• Next to capture filter, enter tcp port 80
– Select update in realtime and autoscroll
– Might need to enable or disable “capture in promiscuous mode”
– Press start
– Press close
• Load www.eecis.udel.edu page in browser
• Press stop in Wireshark
• Find http request to 128.4.40.10.
– Right click and select follow TCP stream
Web caches (proxy server)
Goal: reduce network utilization by satisfying client request
without involving origin server
• user sets browser: Web
accesses via cache
• browser sends all
HTTP requests to
cache
– object in cache: cache
returns object
– else cache requests
object from origin server,
then returns object to
client
[Figure: two clients send requests to the proxy server, which contacts the origin servers when needed]
More about Web caching
• cache acts as both client
and server
• typically cache is installed
by ISP (university,
company, residential ISP)
Why Web caching?
• reduce response time for
client request
• reduce traffic on an
institution’s access link.
• Internet dense with
caches: enables “poor”
content providers to
effectively deliver content
(but so does P2P file
sharing)
Caching example
Assumptions
• average object size = 100,000 bits
• avg. request rate from institution’s browsers to origin servers = 15/sec
• delay from institutional router to any origin server and back to router = 2 sec
Consequences
• utilization on LAN = 15%
• utilization on access link = 100%
• total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds
[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 1.5 Mbps access link to the public Internet and the origin servers]
Caching example (cont)
possible solution
• increase bandwidth of access link to, say, 10 Mbps
consequence
• utilization on LAN = 15%
• utilization on access link = 15%
• total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
• often a costly upgrade
[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 10 Mbps access link to the public Internet and the origin servers]
Caching example (cont)
possible solution: install cache
• suppose hit rate is 0.4
consequence
• 40% requests will be satisfied almost immediately
• 60% requests satisfied by origin server
• utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
• total avg delay = Internet delay + access delay + LAN delay = .6*(2.01) secs + .4*milliseconds < 1.4 secs
[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 1.5 Mbps access link to the public Internet and the origin servers]
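The numbers in the three caching slides can be reproduced with a few lines of arithmetic. This is a sketch under the stated assumptions; the 10 msec delay for a cache hit is my own stand-in for the "milliseconds" term on the slide.

```python
OBJECT_BITS = 100_000        # average object size
REQUEST_RATE = 15            # requests per second
LAN_BPS = 10_000_000         # 10 Mbps LAN
SLOW_LINK_BPS = 1_500_000    # 1.5 Mbps access link
FAST_LINK_BPS = 10_000_000   # upgraded 10 Mbps access link


def utilization(request_rate, link_bps):
    """Fraction of the link capacity used by the requested objects."""
    return request_rate * OBJECT_BITS / link_bps


lan_util = utilization(REQUEST_RATE, LAN_BPS)              # 15%
access_util = utilization(REQUEST_RATE, SLOW_LINK_BPS)     # 100%
upgraded_util = utilization(REQUEST_RATE, FAST_LINK_BPS)   # 15%

# With a cache, only misses cross the access link.
hit_rate = 0.4
cached_util = utilization(REQUEST_RATE * (1 - hit_rate), SLOW_LINK_BPS)

# Average delay: misses take about 2.01 sec; hits are assumed to take 10 msec.
miss_delay, hit_delay = 2.01, 0.01
avg_delay = (1 - hit_rate) * miss_delay + hit_rate * hit_delay

print(lan_util, access_util, upgraded_util, cached_util, avg_delay)
```

The cache brings the access link down to 60% utilization without paying for a faster link.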
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
• cache: specify date of cached copy in HTTP request
If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified
[Figure: two exchanges between cache and server.
Object not modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 304 Not Modified.
Object modified: HTTP request msg with If-modified-since: <date>; HTTP response HTTP/1.0 200 OK <data>.]
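The server's side of the conditional GET can be sketched as a single comparison of dates. The function name and the example dates are hypothetical; the sketch assumes both dates arrive as HTTP-style date strings in GMT.

```python
from email.utils import parsedate_to_datetime


def conditional_get(last_modified, if_modified_since, body):
    """Return (status line, data) for a request carrying If-modified-since."""
    modified = parsedate_to_datetime(last_modified)
    cached = parsedate_to_datetime(if_modified_since)
    if modified <= cached:
        # The cached copy is up to date: send no object.
        return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body


page = b"<html>new version</html>"

# Object not modified since the cached copy was stored.
unchanged = conditional_get("Mon, 22 Jun 1998 09:23:24 GMT",
                            "Mon, 22 Jun 1998 09:23:24 GMT", page)

# Object modified after the cached copy was stored.
changed = conditional_get("Tue, 23 Jun 1998 10:00:00 GMT",
                          "Mon, 22 Jun 1998 09:23:24 GMT", page)
print(unchanged[0])
print(changed[0])
```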
FTP: the file transfer protocol
[Figure: user at host with FTP user interface and FTP client (local file system); file transfer to and from the FTP server (remote file system)]
• transfer file to/from remote host
• client/server model
– client: side that initiates transfer (either to/from remote)
– server: remote host
• ftp: RFC 959
• ftp server: listens on port 21
FTP is weird: separate control and data connections
• FTP client contacts FTP server at port 21, TCP is transport protocol
• client authorized over control connection
– This is done in “clear text” (i.e., unencrypted)
– So if someone is sniffing packets, your password might be learned.
– Sniffing packets is difficult on Ethernet, encrypted wifi, and DSL, but is possible on cable modems
• client browses remote directory by sending commands over control connection.
• Data is transferred over different connections. Two approaches
– Active
– Passive
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (port 20)]
• Active
– The client opens a TCP socket on some port (port number >1024)
– The client sends the server the port
– The server connects to the client’s port, where the server’s source port is 20
• Active mode is a problem for firewalls
– If my desktop is not a server, it should not receive any requests for connections.
– But FTP servers will make such requests
FTP Passive mode
• When a file is to be transferred, the server opens a port (number >1024 and not 20)
• The server sends this port number information over the command connection
• The client connects to the server over this port.
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (high port)]
• Drawback of passive
– Some enterprises (companies) like to control which applications are used
• E.g., web browsing is ok, but Skype is not
– One way to do this is to block outgoing connections based on the port.
– However, this will cause FTP to fail, unless the device that blocks connections is smart
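In both modes a port number travels over the control connection as text. In RFC 959 it is written as six decimal numbers, h1,h2,h3,h4,p1,p2, where the port is p1*256 + p2. A minimal sketch of that encoding follows; the helper names and the example addresses are hypothetical.

```python
import re


def encode_port_argument(ip, port):
    """Active mode: argument of the client's PORT command (h1,h2,h3,h4,p1,p2)."""
    return ",".join(ip.split(".") + [str(port // 256), str(port % 256)])


def parse_pasv_reply(reply):
    """Passive mode: extract (ip, port) from the server's 227 reply."""
    numbers = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply).groups()
    ip = ".".join(numbers[:4])
    port = int(numbers[4]) * 256 + int(numbers[5])
    return ip, port


# Active: the client tells the server where to connect.
port_argument = encode_port_argument("192.0.2.5", 4567)
print("PORT " + port_argument)

# Passive: the server tells the client where to connect.
server_endpoint = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)")
print(server_endpoint)
```

A "smart" firewall in the sense above has to read these messages on the control connection to learn which data port to allow.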
Email Protocol Design
• Basic assumption: weak user agents and strong mail servers
– The user wants to send the mail and leave
– The user wants to get the mail
– The user may come and go whenever (e.g., roaming laptop)
– It should be possible to send mail to a user even if the two users are never online at the same time.
– We conclude that there must be a middle man/mail server.
• Servers are not that strong: The protocol must be as robust as possible to servers being offline
– No single server. Why?
• Single point of failure
• The server would have to be too big (congestion)
– We conclude that there should be many mail servers
• Two types of hosts
– Users
– Mail servers
• Each user has a mail box in its mail server
– Users retrieve mail from their mail server at their convenience
• Users give mail to their mail servers to deliver the mail
• Mail servers communicate with
– The users that have mail boxes in the server
– Other mail servers
[Figure: user agent, mail server, mail server, user agent]
Email Protocol Design
• Two types of hosts
– Users
– Mail servers
• Each user has a mail box in its mail server
– Users retrieve mail from their mail server at their convenience
• Users give mail to their mail servers to deliver the mail
• Mail servers communicate with
– The users that have mail boxes in the server
– Other mail servers
[Figure: user agent, mail server, mail server, user agent, annotated as follows.
User composes mail and sends it to its mail server (or a mail server that will send mail for it).
Mail server finds the destination mail server and attempts to send the mail.
Destination user requests emails from mailbox.
Destination server gives mails to user.]
Email Protocol Design
[Figure: the same mail flow with the protocols labeled.
User composes mail and sends it to its mail server (or a mail server that will send mail for it): SMTP.
Mail server finds the destination mail server and attempts to send the mail: SMTP.
Destination user requests emails from mailbox, and destination server gives mails to user: POP3, IMAP, …]
Electronic Mail: Details
Three major components:
• user agents
• mail servers
• simple mail transfer protocol: SMTP
User Agent
• a.k.a. “mail reader”
• composing, editing, reading mail messages
• e.g., Eudora, Outlook, elm, Mozilla Thunderbird
• Put outgoing messages on server (with SMTP)
• Get incoming messages from server
[Figure: user agents attached to mail servers; each mail server holds user mailboxes and an outgoing message queue, and the mail servers are connected to each other by SMTP]
Electronic Mail: mail servers
Mail Servers
• mailbox contains incoming messages for user
• message queue of outgoing (to be sent) mail messages
• SMTP protocol between mail servers to send email messages
– client: sending mail server
– “server”: receiving mail server
• Reliable: makes several delivery attempts and provides notification if delivery fails
[Figure: user agents attached to mail servers, with the mail servers connected to each other by SMTP]
Electronic Mail: SMTP [RFC 2821]
• uses TCP to reliably transfer email message from client to
server, port 25
• direct transfer: sending server to receiving server
• Emails are pushed to servers (but users pull messages from
servers)
• three phases of transfer
– handshaking (greeting)
– transfer of messages
– closure
• command/response interaction
– commands: ASCII text
– response: status code and phrase
• messages must be in 7-bit ASCII
– Makes it difficult to send attachments
Scenario: Alice sends message to Bob
1) Alice uses UA to compose message and “to” [email protected]
2) Alice’s UA sends message to her mail server; message placed in message queue
3) Client side of SMTP opens TCP connection with Bob’s mail server
4) SMTP client sends Alice’s message over the TCP connection
5) Bob’s mail server places the message in Bob’s mailbox
6) Bob invokes his user agent to read message
[Figure: user agent, mail server, mail server, user agent, with steps 1 to 6 marked along the path]
Sample SMTP interaction
Client connects to server
S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <[email protected]>
S: 250 [email protected]... Sender ok
C: RCPT TO: <[email protected]>
S: 250 [email protected] ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection
Try SMTP interaction for yourself:
• telnet mail.eecis.udel.edu 25
• see 220 reply from server
• enter HELO, MAIL FROM, RCPT TO, DATA, QUIT
commands
above lets you send email without using email client
(reader)
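The client side of the dialogue above follows a fixed order, which can be sketched as a function that produces the lines a client would send. The function name and the addresses are hypothetical placeholders. One detail not shown on the slide is included: a body line that itself starts with "." gets an extra "." in front (dot-stuffing), so it is not mistaken for the end of the message.

```python
def smtp_client_lines(helo_domain, sender, recipient, body_lines):
    """Return the client's lines, in order, for one message."""
    lines = [
        f"HELO {helo_domain}",
        f"MAIL FROM: <{sender}>",
        f"RCPT TO: <{recipient}>",
        "DATA",
    ]
    for line in body_lines:
        # Dot-stuffing: protect body lines that begin with a period.
        lines.append("." + line if line.startswith(".") else line)
    lines += [".", "QUIT"]   # "." on a line by itself ends the message
    return lines


client_lines = smtp_client_lines(
    "crepes.fr",
    "sender@example.org",      # placeholder address
    "recipient@example.com",   # placeholder address
    ["Do you like ketchup?", "How about pickles?"],
)
for line in client_lines:
    print("C:", line)
```

On the wire each of these lines would be terminated by a carriage return and line feed, and the client would wait for the server's reply code between commands.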
SMTP: final words
• SMTP uses persistent
connections
• SMTP requires message
(header & body) to be in 7-bit
ASCII
• SMTP server uses
CRLF.CRLF to determine end
of message
Comparison with HTTP:
• HTTP: pull
• SMTP: push
• both have ASCII
command/response
interaction, status codes
• HTTP: each object
encapsulated in its own
response msg
• SMTP: multiple objects sent in
multipart msg
Mail access
• POP3 and IMAP are two protocols for accessing mail on a mail server
• Web-based mail works differently: the web mail server and the mail server can be integrated, so that there is no user agent.
Mail access protocols
[Figure: user agent to sender’s mail server (SMTP), sender’s mail server to receiver’s mail server (SMTP), receiver’s mail server to user agent (access protocol)]
• SMTP: delivery/storage to receiver’s server
• Mail access protocol: retrieval from server
– POP: Post Office Protocol [RFC 1939]
• authorization (agent <-->server) and download
– IMAP: Internet Message Access Protocol [RFC 1730]
• more features (more complex)
• manipulation of stored msgs on server
– HTTP: gmail, Hotmail, Yahoo! Mail, etc.
DNS – domain name system
• Translates names, like www.yahoo.com, into IP addresses.
• Services provided by DNS
– Name to address translation
– Host aliasing
• A host relay1.west-coast.yahoo.com could have two aliases, yahoo.com and
www.yahoo.com.
• In this case, the canonical hostname is relay1.west-coast.yahoo.com.
• DNS can provide canonical host names
– Mail server aliasing
• When a mail server wants to send a mail to [email protected], it does not send
it to www.udel.edu, but to mail.udel.edu. Or maybe udmail.udel.edu. DNS
can translate udel.edu to mail.udel.edu
– (Cheap) Load distribution
• Cnn.com has several servers.
• DNS will respond with all addresses,
• but it will reorder the addresses every time.
• If each client uses the first address listed, then different clients will use different servers.
• Content distribution networks (CDN) are better ways of load balancing
DNS - structure
• Centralized DNS?
– Pros – somewhat easy to maintain (there is only one
system). But it must always be online
– Cons
• Single point of failure (the system crashes -> no web)
• Congestion
• Server would be far from some hosts (delay)
• Database would be too big
• Registering bohacek-pc1.pc.udel.edu would require interacting with the big server
• Instead, a distributed hierarchical database is
used.
Domain Hierarchy
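[Figure: domain name tree. Top-level domains shown include edu, com, gov, mil, org, net, uk, and arpa. Under edu are UD and upenn; under UD are eecis and art; the hosts bohacek_pc1 and bohacek_pc10 are under eecis. Other labels in the figure: yahoo, cisco, whitehouse, nasa, navy, acm, in.]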
Administrative Zones in the Domain Hierarchy
[Figure: the domain name tree, starting at root, divided into administrative zones]
• It is possible that .edu and .gov are administered together
• Note that UD administers art but not eecis
• Sometimes a single service provider will administer the domains for a large number of .coms
Root servers
• Each layer in the hierarchy knows about the domain names below it
• The highest level is the root.
– There are 13 root “servers”
– Each of these servers is actually several servers, and some of the
machines that comprise a server are distributed geographically.
13 root name servers worldwide:
a Verisign, Dulles, VA
b USC-ISI Marina del Rey, CA
c Cogent, Herndon, VA (also LA)
d U Maryland College Park, MD
e NASA Mt View, CA
f Internet Software C. Palo Alto, CA (and 36 other locations)
g US DoD Vienna, VA
h ARL Aberdeen, MD
i Autonomica, Stockholm (plus 28 other locations)
j Verisign, ( 21 locations)
k RIPE London (also 16 other locations)
l ICANN Los Angeles, CA
m WIDE Tokyo (also Seoul, Paris, SF)
overview
• Top-level domain (TLD) servers
– There are around 200 top-level domains
– These include com, edu, mil, info, in, uk, and cn
– Currently,
• Network Solutions maintains the TLD servers for com
• Educause maintains the TLD servers for edu
– The root servers know the addresses and
names of all top level servers
• Organizations have a hierarchy of DNS
servers
DNS queries
• Suppose a host needs the IP address of bohacek-pc1.eecis.udel.edu
• If this IP address is not in cache, the host asks its local DNS server.
• If the DNS server does not have it in cache, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
• If not, it checks if it has the IP address of the DNS server of udel.edu in cache
• If not, it checks if it has the IP address of the top-level domain server of edu in cache
• If not, it asks the root server for the IP address of the edu TLD server
– The DNS server always has the IP address of the root servers
• The local DNS server asks the edu TLD server for the address of bohacek-pc1.eecis.udel.edu.
• The TLD server does not know that IP address, but instead gives the IP address of the DNS server for UD
• The local DNS server asks the UD DNS server for the address of bohacek-pc1.eecis.udel.edu.
• The UD DNS server does not know the address, but instead returns the address of the eecis DNS server.
• The local DNS server asks the eecis DNS server for the address of bohacek-pc1.eecis.udel.edu
• The eecis DNS server replies with the address.
• This address is returned to the host that originally asked the question.
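The lookup procedure above can be sketched as a small simulation of a local DNS server that caches both answers and the servers it has learned about. Everything here is a toy model: the server names ("root", "edu-tld", and so on) are hypothetical labels, and the address given for bohacek-pc1 is a made-up documentation address.

```python
# Each server either knows the address or refers the asker to a server
# responsible for a more specific domain.
SERVERS = {
    "root": {"referrals": {"edu": "edu-tld"}, "addresses": {}},
    "edu-tld": {"referrals": {"udel.edu": "ud-dns"}, "addresses": {}},
    "ud-dns": {"referrals": {"eecis.udel.edu": "eecis-dns"}, "addresses": {}},
    "eecis-dns": {"referrals": {}, "addresses": {
        "bohacek-pc1.eecis.udel.edu": "192.0.2.7",   # made-up address
        "www.eecis.udel.edu": "128.4.1.2",
    }},
}


def ask(server, name):
    """One query to one server: an answer, a referral, or nothing."""
    zone = SERVERS[server]
    if name in zone["addresses"]:
        return "answer", zone["addresses"][name]
    for suffix, next_server in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", (suffix, next_server)
    return "unknown", None


class LocalDnsServer:
    def __init__(self):
        self.address_cache = {}   # host name -> IP address
        self.server_cache = {}    # domain -> DNS server for that domain

    def resolve(self, name):
        """Return (address, list of servers that had to be asked)."""
        if name in self.address_cache:
            return self.address_cache[name], []
        # Start at the most specific cached DNS server, else at the root.
        labels = name.split(".")
        server = "root"
        for i in range(1, len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in self.server_cache:
                server = self.server_cache[suffix]
                break
        asked = []
        while True:
            asked.append(server)
            kind, data = ask(server, name)
            if kind == "answer":
                self.address_cache[name] = data
                return data, asked
            if kind == "referral":
                suffix, server = data
                self.server_cache[suffix] = server
            else:
                return None, asked


local = LocalDnsServer()
first = local.resolve("bohacek-pc1.eecis.udel.edu")   # walks down from the root
second = local.resolve("www.eecis.udel.edu")          # eecis server is cached
third = local.resolve("www.eecis.udel.edu")           # answer is cached
print(first, second, third)
```

The second and third lookups show why the root servers are rarely needed once the caches are warm.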
DNS Queries
Root server (IP addresses are always known)
Browser wants to show www.eecis.udel.edu
Browser needs the IP address of www.eecis.udel.edu
Host asks local DNS server for IP address of www.eecis.udel.edu
• Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
• If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
• If not, it checks if it has the IP address of the DNS server of udel.edu in cache
• If not, it checks if it has the IP address of the top-level domain server of edu in cache
• If not, …..
Local DNS server to root server: What is the IP address of www.eecis.udel.edu?
Root server does not know. Instead, it responds with a DNS server that might, specifically, the TLD server for .edu
Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
TLD server does not know. Instead it replies with the name and IP address of the UD DNS server
Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server to local DNS server: It is 128.4.1.2
Local DNS server to host: It is 128.4.1.2
DNS Queries
Browser wants to show www.eecis.udel.edu
Browser needs the IP address of www.eecis.udel.edu
Host asks local DNS server for IP address of www.eecis.udel.edu
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If yes, then return it
Local DNS server to host: It is 128.4.1.2
DNS Queries
Browser wants to show www.eecis.udel.edu
Browser needs the IP address of www.eecis.udel.edu
Host asks local DNS server for IP address of www.eecis.udel.edu
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If yes, query it…
Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server to local DNS server: It is 128.4.1.2
Local DNS server to host: It is 128.4.1.2
DNS Queries
Browser wants to show www.eecis.udel.edu
Browser needs the IP address of www.eecis.udel.edu
Host asks local DNS server for IP address of www.eecis.udel.edu
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If so, then query it…
Local DNS server to TLD server for .edu: What is the IP address of www.eecis.udel.edu?
TLD server does not know. Instead it replies with the name and IP address of the UD DNS server
Local DNS server to UD DNS server: What is the IP address of www.eecis.udel.edu?
UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
Local DNS server to eecis DNS server: What is the IP address of www.eecis.udel.edu?
eecis DNS server to local DNS server: It is 128.4.1.2
Local DNS server to host: It is 128.4.1.2
Attack on DNS
• Hackers have tried to bring down
DNS by performing a DoS on the
root servers
– DoS (denial of service): the attacker sends more packets or requests for service than the server can accommodate, resulting in poor service for normal users.
• This failed because
– There are many very strong root servers, and they have firewalls/filters
• The attacks used ICMP ping packets
• DNS requests would have been more effective
– It is rare that a root server is needed
• Usually only the TLD server is needed
• Or only a domain server.
DNS Message Details
• DNS Record
– (Name, Value, Type, Class, TTL)
– If Type = A
• Name is the host name
• Value is the IP address of the host
– If Type = NS
• Name is a domain name
• Value is the name of the DNS server for the domain
• E.g., (udel.edu, dns.udel.edu, NS, …, …)
– Type = MX
• Name is the domain name
• Value is the name of the mail server for the domain
• E.g., (udel.edu, mail.udel.edu, MX, …, …)
– Type = CName
• Name is a host name
• Value is the canonical name of the host
• E.g., (www.yahoo.com, relay-east.yahoo.com, CName, …, …)
– TTL is the time to live, so DNS caches can be timed out
– Class is no longer used, it is set as IN
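The record types above can be sketched as a small table of (Name, Value, Type, Class, TTL) tuples with a lookup function. The IP addresses and TTL values are made up for illustration, and the helper names are hypothetical.

```python
# (Name, Value, Type, Class, TTL); addresses and TTLs are made-up examples.
RECORDS = [
    ("udel.edu", "dns.udel.edu", "NS", "IN", 3600),
    ("udel.edu", "mail.udel.edu", "MX", "IN", 3600),
    ("mail.udel.edu", "192.0.2.25", "A", "IN", 3600),
    ("www.yahoo.com", "relay-east.yahoo.com", "CNAME", "IN", 3600),
    ("relay-east.yahoo.com", "192.0.2.80", "A", "IN", 3600),
]


def lookup(name, rtype):
    """Return the values of all records matching a (Name, Type) query."""
    return [value for (n, value, t, _cls, _ttl) in RECORDS
            if n == name.lower() and t == rtype]


def resolve_address(name):
    """Follow CNAME records until an A record is found."""
    for _ in range(8):                   # bound the length of the alias chain
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]                # continue with the canonical name
    return None


def mail_server_address(domain):
    """MX gives the mail server's name; A gives that server's address."""
    mail_servers = lookup(domain, "MX")
    return resolve_address(mail_servers[0]) if mail_servers else None


print(resolve_address("www.yahoo.com"))
print(mail_server_address("UDel.edu"))
```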
DNS query
• (Name, Type, Class)
• (UDel.edu, MX, IN)
– Please provide the name of the UD’s mail
server
• (mail.UDel.edu, A, IN)
– Please provide the IP address for mail.udel.edu
DNS message format
DNS protocol : query and reply messages, both with
same message format
msg header
• identification: 16 bit #
for query, reply to
query uses same #
• flags:
– query or reply
– recursion desired
– recursion available
– reply is
authoritative
DNS message format
• questions: name, type fields for a query
• answers: RRs in response to query
• authority: records for authoritative servers
• additional info: additional “helpful” info that may be used
DNS Queries
[Figure: the same lookup as before, showing the records carried in each message. The figure also shows the header count fields of each message.]
Browser wants to show www.eecis.udel.edu
Browser needs the IP address of www.eecis.udel.edu
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If not, …..
Query sent by the local DNS server to each server in turn:
(www.eecis.udel.edu, A, IN)
Reply from root server (IP addresses are always known):
(edu, edu-serverA.net, NS, IN)
(edu-serverA.net, 124.5.1.1, A, IN)
(edu, edu-serverB.net, NS, IN)
(edu-serverB.net, 124.5.1.2, A, IN)
Reply from TLD server for .edu:
(udel.edu, dns1.udel.edu, NS, IN)
(dns1.udel.edu, 128.173.2.1, A, IN)
(udel.edu, dns2.udel.edu, NS, IN)
(dns2.udel.edu, 128.178.2.2, A, IN)
Reply from UD DNS server:
(eecis.udel.edu, dns1.eecis.udel.edu, NS, IN)
(dns1.eecis.udel.edu, 128.4.1.10, A, IN)
(eecis.udel.edu, dns2.udel.edu, NS, IN)
(dns2.udel.edu, 128.4.1.11, A, IN)
Reply from eecis DNS server:
(www.eecis.udel.edu, 128.4.1.1, A, IN)
Reply from local DNS server to host:
(www.eecis.udel.edu, 128.4.1.1, A, IN)
DNS Flags
• The DNS header has a query ID
– The query has this ID and the server copies this ID
into the response
• Flag indicating query or answer
• Flag indicating whether the server is the
authoritative server for the answer (as opposed to
a cached answer)
• A recursion-desired flag indicating that the
host/server would like the server to perform the
recursive DNS lookup
• A recursion-available flag indicating whether the
server is willing to do the recursive lookup
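A minimal sketch of how these four flags could be read out of the 16-bit flags word of the header. The bit positions follow RFC 1035 (QR is the most significant bit); the struct DnsFlags and the function DecodeFlags are names made up for this illustration, and the value is assumed to already be in host byte order.

```cpp
#include <cstdint>

// Hypothetical container for the flags discussed on the slide.
struct DnsFlags {
    bool IsReply;             // QR: 0 = query, 1 = reply
    bool Authoritative;       // AA: reply comes from the authoritative server
    bool RecursionDesired;    // RD: sender asks the server to recurse
    bool RecursionAvailable;  // RA: server is willing to recurse
};

// Decode the flags word (second 16-bit word of the DNS header).
DnsFlags DecodeFlags(std::uint16_t flags) {
    DnsFlags f;
    f.IsReply            = (flags & 0x8000) != 0;
    f.Authoritative      = (flags & 0x0400) != 0;
    f.RecursionDesired   = (flags & 0x0100) != 0;
    f.RecursionAvailable = (flags & 0x0080) != 0;
    return f;
}
```

For example, a flags word of 0x0100 is a query with recursion desired, and 0x8180 is a reply with recursion desired and recursion available.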
DNS
• Which transport protocol should DNS use?
• Why?
Peer-to-peer file sharing
• About P2P
– 30% or more of the bytes transferred on the Internet are from
P2P users
– Skype is a very successful P2P VoIP app
• Written in 3-4 months
• Topics covered
– Scalability
– P2P querying
– Case study
• BitTorrent
• Skype
Pure P2P architecture
• Review: What is the difference between peer-to-peer and client/server?
– Each host acts as both a server and a client.
• no always-on server
• arbitrary end systems directly communicate
• peers are intermittently connected and may change IP addresses
• Pure P2P has significant drawbacks.
• P2P-like systems with some central servers are more common.
• But in all cases, the file transfer is between peers, not from servers.
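File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file from one server to N peers?
• $F$: file size
• $u_s$: server upload bandwidth
• $u_i$: peer i upload bandwidth
• $d_i$: peer i download bandwidth
[Figure: a server with upload bandwidth $u_s$ holds a file of size $F$; peers 1..N, with upload bandwidths $u_1, \dots, u_N$ and download bandwidths $d_1, \dots, d_N$, are attached to a network with abundant bandwidth.]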
File distribution time: server-client
• Time for the server to send a copy to a single client
– $F/u_s$
• Time for the server to send N copies:
– $NF/u_s$
• Client i takes $F/d_i$ time to download
Time to distribute F to N clients using the client/server approach:
$d_{cs} = \max\{ NF/u_s,\ F/\min_i d_i \}$
This increases linearly in N (for large N).
File distribution time: P2P
• server must send one copy:
– $F/u_s$ time
• client i download time
– $F/d_i$
• Total data to be downloaded
– $NF$
• fastest possible total upload rate: $u_s + \sum_i u_i$
$d_{P2P} = \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$
Can you make a schedule for the download that takes this amount of time?
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
[Figure: Minimum Distribution Time (in hours, axis from 0 to 3.5) versus N (axis from 0 to 35), with one curve for P2P and one for Client-Server.]
Conclusion: P2P systems are scalable. But the load is distributed to all
users, so P2P users have more load than clients in the client-server model.
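The two distribution-time formulas can be evaluated directly for the example parameters. The sketch below is an illustration only; the function names ClientServerTime and P2PTime are assumptions, and both functions expect at least one peer.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Client-server: max{ N*F/us, F/min_i(d_i) }, where N = d.size().
double ClientServerTime(double F, double us, const std::vector<double>& d) {
    double dmin = *std::min_element(d.begin(), d.end());
    return std::max(d.size() * F / us, F / dmin);
}

// P2P: max{ F/us, F/min_i(d_i), N*F/(us + sum_i u_i) }.
double P2PTime(double F, double us, const std::vector<double>& d,
               const std::vector<double>& u) {
    double dmin = *std::min_element(d.begin(), d.end());
    double totalUpload = us + std::accumulate(u.begin(), u.end(), 0.0);
    return std::max({F / us, F / dmin, d.size() * F / totalUpload});
}
```

With F = 1 and u = 1 (so F/u = 1 hour), us = 10, every d_i = 10, and N = 30, the client-server time is 30/10 = 3 hours, while the P2P time is 30/(10 + 30) = 0.75 hours. As N grows, the P2P time approaches but never exceeds 1 hour for these parameters.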
Peer-to-peer Querying
• The file is transferred from a peer, but how do we find the file?
• Options
– Centralized directory
• Napster
• Single point of failure
• Performance bottleneck
• Target for the RIAA
• Always up
• Easy to find
• Easy protocol
– Query flooding
• Gnutella
• Hosts find other hosts and form a network of neighbors (overlay network)
• Search for a file (covered next)
• How to set up the network – bootstrap?
– Have a central list of peers
– Have distributed lists of peers
– Search out a peer by scanning – like in project
• Once the file is found,
– the host could respond directly to the searcher,
– or it could send the response along the reverse path.
– In the latter case, the peers along the way would learn where the file is located (cache) and could
answer more quickly the next time the search is performed. But then we must worry about stale
information.
Query Flooding State Diagram: Querying Host
Query Flooding State Diagram: Listening Host
Expanding Ring Query Flooding
Expanding Ring Query Flooding State Diagram: Querying Host
Expanding Ring Query Flooding State Diagram: Listening Host
Query Flooding State Diagram
[Figure: two state diagrams. The transitions below are reconstructed from the labels that survive on the slide.]
Querying host
• User request for file: set AttemptCounter = 0
• Send out a request for the file to all neighbors; set Timer = 0; wait
• Reply from peer: inform user of file location
• Timer > TO: AttemptCounter++
– AttemptCounter > MaxAttempts: inform user that query failed
– else: send out the request again
Listening peer
• wait
• Request arrives: get request ID
– Have seen request before: go back to wait
– else: check for file in directory
• File is in local dir: send response to peer that requested file
• else: send request to all neighbors
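The listening peer's decision can be sketched as a small function that remembers which request IDs it has already seen. This is a sketch under assumptions, not the project's code: the names Peer, Action, and HandleRequest are invented here, and the network I/O (actually sending the response or forwarding to neighbors) is left out.

```cpp
#include <set>
#include <string>

// What the listening peer decides to do with an incoming request.
enum class Action { Drop, Respond, Forward };

// Hypothetical listening peer: keeps the IDs of requests already seen
// and the names of the files in its local directory.
struct Peer {
    std::set<int> SeenRequestIDs;
    std::set<std::string> LocalFiles;

    Action HandleRequest(int requestID, const std::string& fileName) {
        // insert() reports whether the ID was new; a repeat is dropped,
        // which is what stops the flood from looping forever.
        if (!SeenRequestIDs.insert(requestID).second) return Action::Drop;
        if (LocalFiles.count(fileName)) return Action::Respond;
        return Action::Forward;  // not here: send request to all neighbors
    }
};
```

Remembering request IDs is the important design choice: without it, a request would bounce around any cycle in the overlay network indefinitely.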
Expanding ring
(hierarchical peer-to-peer network)
• KaZaA
• Not all peers are equal – super peers (?)
– Super peers (group leaders) have higher bit-rate connections, are more stable,
etc.
• Peers connect to group leaders
• The group leaders keep a list of the files shared by all their child peers.
• Group leaders connect to a small number of other group leaders
• A child host asks its group leader for a file. If the group leader does not
know where it is, it floods the network of group leaders. The response
from other group leaders follows the reverse path to the asking group leader
(so other leaders can cache the response).
• A file is identified by an ID (e.g., an MD5 hash), computed by a function that
takes a string (the file) and produces the ID. A small change in the file causes
a large change in the ID. It should be computationally infeasible to construct
two files that have the same ID (for MD5 specifically, practical collisions have
since been found). The ID is a fingerprint.
• Since files are ID-ed, multiple copies of the same file can be found, and
these copies can be downloaded from multiple hosts in parallel.
• Note that if you are downloading while others are uploading from you, the
uploading slows down the downloading, but only a little bit.
BitTorrent
• Centralized P2P
– A centralized server, or tracker, tracks the clients
involved in the P2P transfer
– This is similar to Napster
– Companies that host these sites get sued and are
attacked with DDoS
• Components of BitTorrent System
– Torrent Files
– Trackers
– Seeders
– Peers
Torrent File
• Required to download
• Can be found on web sites or sent by email
• Contains information about the file and the tracker
– Announce: the URL of the tracker
– Creation date
– Info
• Length of file
• Name of file
• Length of each piece (except for the last)
• Pieces – the 20-byte SHA-1 hash of each piece
– Note, the number of pieces can be determined by counting the number of
bytes in the pieces field and dividing by 20
• If the download contains multiple files, then a single
torrent file will contain information about all files.
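The piece count can be computed two ways, and the two should agree for a well-formed torrent file. The sketch below assumes the pieces field has already been extracted as a raw byte string; the function names are invented for this illustration.

```cpp
#include <cstddef>
#include <string>

// Number of pieces from the pieces field: one 20-byte SHA-1 hash per piece.
std::size_t NumPiecesFromHashes(const std::string& pieces) {
    return pieces.size() / 20;
}

// Number of pieces from the file length: every piece has pieceLength bytes
// except possibly the last, so round the division up.
std::size_t NumPiecesFromLength(std::size_t fileLength, std::size_t pieceLength) {
    return (fileLength + pieceLength - 1) / pieceLength;
}
```

For example, a 600,000-byte file with 262,144-byte (256 KB) pieces has 3 pieces, so its pieces field should be 60 bytes long.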
Tracker
• The client makes an HTTP GET request to the tracker
specifying the SHA-1 hash that identifies the torrent
(computed over the Info section of the torrent file)
– The request also includes the number of bytes
downloaded and the number uploaded
– If the client does not upload enough, the tracker might
not provide a reply
• The reply contains
– The time when the tracker information should be
refreshed (usually 30 minutes)
– A list of the peers
• IP address and port (usually 6881)
• Peer ID
File distribution with BitTorrent
• tracker: tracks peers participating in torrent
• a peer obtains the list of peers from the tracker
• peers trade chunks
BitTorrent (1)
• file divided into 256KB chunks.
• peer joining torrent:
– has no chunks, but will accumulate them over time
– registers with tracker to get list of peers, connects to subset of
peers (“neighbors”)
• while downloading, peer uploads chunks to other peers.
• peers may come and go
• once peer has entire file, it may (selfishly) leave or (altruistically)
remain
BitTorrent (2)
Pulling Chunks
• at any given time, different peers have different subsets of file chunks
• periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
• Alice sends requests for her missing chunks
– rarest first
– so the rarest chunks are spread, and chunks become uniformly common
Sending Chunks: tit-for-tat
• Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
– re-evaluate top 4 every 10 secs
• every 30 secs: randomly select another peer, start sending chunks
– newly chosen peer may join top 4
– “optimistically unchoke”
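Rarest-first selection can be sketched as a count over the neighbors' chunk lists. This is a simplified illustration, not the BitTorrent client algorithm: the function name PickRarestMissingChunk is an assumption, chunk lists are modeled as vectors of booleans, and ties go to the lowest chunk index for simplicity.

```cpp
#include <cstddef>
#include <vector>

// have[c] is true if Alice already has chunk c.
// neighborHas[n][c] is true if neighbor n reported having chunk c.
// Returns the index of the missing chunk held by the fewest neighbors
// (but at least one), or -1 if no missing chunk is available.
int PickRarestMissingChunk(const std::vector<bool>& have,
                           const std::vector<std::vector<bool>>& neighborHas) {
    int best = -1;
    std::size_t bestCount = 0;
    for (std::size_t c = 0; c < have.size(); ++c) {
        if (have[c]) continue;  // Alice only requests missing chunks
        std::size_t count = 0;
        for (const auto& n : neighborHas)
            if (c < n.size() && n[c]) ++count;
        if (count == 0) continue;  // nobody can supply this chunk
        if (best == -1 || count < bestCount) {
            best = static_cast<int>(c);
            bestCount = count;
        }
    }
    return best;
}
```

Requesting the rarest chunk first makes another copy of it available quickly, which is why chunks tend toward being uniformly common.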
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate,
can find better trading
partners & get file faster!
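The "top four" part of tit-for-tat amounts to sorting neighbors by the rate at which they are currently sending to Alice. The sketch below is an illustration under assumptions: NeighborRate and PickUnchoked are invented names, the rates are taken as already measured, and the random optimistic unchoke every 30 seconds is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of how fast a neighbor is sending chunks to us.
struct NeighborRate {
    std::string PeerID;
    double BytesPerSec;
};

// Returns the IDs of the (up to) `slots` neighbors with the highest rates,
// fastest first. The slides use 4 slots, re-evaluated every 10 seconds.
std::vector<std::string> PickUnchoked(std::vector<NeighborRate> neighbors,
                                      std::size_t slots = 4) {
    std::sort(neighbors.begin(), neighbors.end(),
              [](const NeighborRate& a, const NeighborRate& b) {
                  return a.BytesPerSec > b.BytesPerSec;
              });
    std::vector<std::string> unchoked;
    for (std::size_t i = 0; i < neighbors.size() && i < slots; ++i)
        unchoked.push_back(neighbors[i].PeerID);
    return unchoked;
}
```

A peer with a higher upload rate is more likely to land in other peers' top four, which is how it finds better trading partners.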
BitTorrent Pros/Cons
• Centralized server
• Slow to get the transfer started
– Web transfers start much faster and will achieve a
sustained rate
• Peers must upload
– Some peers might not be in a position to upload (e.g.,
mobile phone)
• Chunks can be corrupted
– HBO distributed fake chunks
– Since the SHA-1 hash does not match what is given
in the Torrent File, the chunk is dropped after it is
downloaded
• This wastes bandwidth and can greatly increase download
time
Building and maintaining a peer-to-peer network /
overlay network
• Find potential neighbors
– If the current number of neighbors < N, then pick an address at
random and try to connect
– We will pick an address at random or from a file that is retrieved
from a server. One could also follow some other scheme, e.g., pick an
address at random among those that might be close in the network
(in terms of hops; such a host might actually be geographically far).
– This stage must run periodically. E.g., one could have a thread
that checks the number of neighbors
Building and maintaining a peer-to-peer network /
overlay network
Tasks
1. Make new neighbor
– Perhaps with an upper limit
2. Tell neighbors that this host is still alive
Building and maintaining a peer-to-peer network /
overlay network
• Make new neighbors (handshaking)
– Active:
• Send a message (but they, of course are not yet a neighbor)
• When a reply comes they are a neighbor
– Passive
• Get a message (but again, they are not yet a neighbor)
• Send a message
• When the reply comes, they are a neighbor
– The passive acceptance must be active all the time (unless a restriction
on the maximum number of neighbors is made)
– Connections with neighbors must be bidirectional!
• A node is a neighbor only if communication with that node is verified to be
bidirectional
– This node can hear the neighbor and this node has proof that the neighbor can
hear it
• Standard approach
– Send a hello message with a list of nodes that this node has heard hellos from
– If this node receives a hello message from a node, and this hello indicates that the
sender has received a hello from this node, then the communication is bidirectional
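The bidirectional test in the standard approach reduces to a membership check on the list carried inside the hello. A minimal sketch, assuming nodes are identified by a string such as "IP:port"; the names Hello and IsBidirectional are invented for this illustration and are not from the provided code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical hello message: who sent it, and whom the sender has heard.
struct Hello {
    std::string Sender;
    std::vector<std::string> HeardFrom;
};

// True if the sender of `h` lists `self` among the nodes it has heard
// hellos from. Receiving `h` proves we can hear the sender; finding
// ourselves in the list proves the sender can hear us.
bool IsBidirectional(const Hello& h, const std::string& self) {
    return std::find(h.HeardFrom.begin(), h.HeardFrom.end(), self)
           != h.HeardFrom.end();
}
```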
Building and maintaining a peer-to-peer network /
overlay network
• Maintaining neighbors
– Many approaches; this one is used in OSPF, a very popular routing
protocol
– Periodically send a heartbeat/hello every 10 sec
• The hello is unidirectional – there is no acknowledgement that it was
received.
• Inside the hello is a list of the sender's neighbors
– If a node does not receive a hello from a neighbor within 40 sec (so at
least 3 heartbeats have been missed), then the neighbor is declared to no
longer be a neighbor and is removed from the neighbor list.
– Or even better, if a node does not receive, within 40 sec, a hello from a
neighbor that lists this node as a neighbor, then the neighbor is declared
to be no longer a neighbor and is removed from the neighbor list.
State Diagram
[Figure: neighbor state diagram. Only the labels below can be recovered from the slide; which transition connects which states is not recoverable.]
States
• Node is not a neighbor of any type
• Have received a hello from node, but this node is not listed as a neighbor
• Have sent a hello to neighbor, but have not received a hello back indicating bidirectional
• Bidirectional neighbor
Transition labels
• HTimer = 0: sent hello to neighbor; HTimer := 10 sec
• Timer = 0
• Received a hello from node and this node is listed as a neighbor: Timer = 40, HTimer = 10 sec
• Hello pkt arrives with this node in list of neighbors: Timer := 40 sec
• Timer = 40, HTimer = 10: send hello to neighbor
Maintaining neighbors
• while (1)
– Listen for hellos from neighbors
• When a hello arrives, check if this node is listed in the list of the
sender's neighbors (See provided code)
• If so,
– Set the time when a hello was last received from that neighbor to the current time.
– Check if 10 sec have passed since the last set of hellos was
sent
• If so, then
– For each node in the neighbor list,
» make a hello packet
» Include in the hello packet the list of neighbors
» Send the hello packet
– Check to see if any neighbors have timed out
• For each node in the neighbor list
– Check if the time between the last hello's arrival and now is greater than 40 sec
– If so,
» Remove neighbor from list
• See assignment description
– http://www.eecis.udel.edu/~bohacek/Classes/ELEG651Spring2008/Neighbors.htm
• C++ review
• Look at provided code
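The two timer checks in the loop above (send hellos every 10 sec, drop neighbors silent for more than 40 sec) can be sketched as follows. This is not the provided code: Neighbor, HelloDue, and RemoveTimedOut are hypothetical names, and the current time is passed in as a number of seconds so the logic can be exercised without a clock or sockets.

```cpp
#include <list>
#include <string>

// Hypothetical neighbor entry; times are in seconds.
struct Neighbor {
    std::string IP;
    int Port;
    long LastHelloRec;  // when a hello was last received from this neighbor
};

// True if at least periodSec seconds have passed since hellos were last sent.
bool HelloDue(long now, long lastHelloSent, long periodSec = 10) {
    return now - lastHelloSent >= periodSec;
}

// Removes every neighbor whose last hello is more than timeoutSec old.
// Returns how many neighbors were removed.
int RemoveTimedOut(std::list<Neighbor>& neighbors, long now, long timeoutSec = 40) {
    int removed = 0;
    for (auto it = neighbors.begin(); it != neighbors.end(); ) {
        if (now - it->LastHelloRec > timeoutSec) {
            it = neighbors.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Passing the time in as a parameter is a choice made for the sketch; the project code would read the clock inside the loop.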
std::list
• The C++ standard template library includes many useful classes.
• In the projects, we will make extensive use of standard template
library lists.
• It saves a huge amount of time over writing and debugging your own
lists
#include <cstdio> // for printf
#include <list> // must have this
std::list<int> MyListOfInts; // declare a list of integers
// add some things to my list
for (int i=0; i<10; i++) {
    MyListOfInts.push_back(i);
}
// search for the element equal to 5 and change it to 15
std::list<int>::iterator it;
for (it=MyListOfInts.begin(); it!=MyListOfInts.end(); ++it) {
    if (*it == 5) {
        printf(" the number 5 is in the list\n");
        *it = 15;
        break;
    }
}
Removing elements from the middle of a list is a bit tricky.
After an element is erased, its iterator is invalid, so it is not possible to increment that iterator.
// delete all the elements equal to 5
// (restart the scan from the beginning after every erase)
std::list<int>::iterator it;
int Fd5 = 1;
while (Fd5==1) {
    Fd5 = 0;
    for (it=MyListOfInts.begin(); it!=MyListOfInts.end(); ++it) {
        if (*it == 5) {
            MyListOfInts.erase(it);
            Fd5 = 1;
            break;
        }
    }
}
// delete all the elements equal to 5 in a single pass
// (advance the iterator first, then erase the element it just left)
std::list<int>::iterator it, dit;
it = MyListOfInts.begin();
while (it!=MyListOfInts.end()) {
    dit = it;
    ++it;
    if (*dit == 5) {
        MyListOfInts.erase(dit); // only dit is invalidated; it is still valid
    }
}
#include <cstring> // for strcmp, strcpy
#include <list>
struct HostID {
    char IP[80];
    int Port;
    int LastHelloRec;
    int LastHelloSent;
};
std::list<struct HostID> ActiveNeighbors;
int CheckIfInList(struct HostID HID, std::list<struct HostID> &List) {
    // returns 1 if HID is in List (same IP and port), 0 otherwise
    std::list<struct HostID>::iterator it;
    for (it=List.begin(); it!=List.end(); ++it)
        if (strcmp(it->IP,HID.IP)==0 && it->Port == HID.Port)
            return 1;
    return 0;
}
struct PktStruct {
    int Type;
    char SenderIP[16];
    int SenderPort;
    int NumberOfRecentlyHeardNeighbors;
    // RecentNeighborEntry is not defined on the slide; see the provided code
    struct RecentNeighborEntry RecentNeighbors[100];
};
struct HostID GetSendersID(struct PktStruct Pkt) {
    // builds a HostID from the sender fields of a packet
    // (LastHelloRec and LastHelloSent are left unset)
    struct HostID HID;
    strcpy(HID.IP, Pkt.SenderIP);
    HID.Port = Pkt.SenderPort;
    return HID;
}
• See provided code and function ideas
• http://www.eecis.udel.edu/~bohacek/Classes/ELEG651Spring2008/Neighbors.htm
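Detailed StateChart
[Figure: fragment of the detailed StateChart for receiving a hello. Labels recovered from the slide:]
• RecHelloMessage(Pkt): struct HostID SenderHID = MakeHostIDFromPkt(Pkt);
• CheckIfInList(SenderHID, ActiveNeighborList) == 1:
– it = GetListEntry(ActiveNeighborList, pkt.SenderID, pkt.SendPort);
– it->LastHelloRec = GetCurrentTime();
• else: (branch not recoverable from the slide)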
Maintaining neighbors
• Homework
– Read project 2 and make a detailed
StateChart that describes the program