Lecture: Applications


CPEG 419
Introduction to Data Networking
Review of Lecture 1 and
continuation of chapter 1
Announcements
• Homework 1 due next week
• Project 1 due next week
Today
• Review and complete Chapter 1
• Start Chapter 2
Packet Switching Case
What is the probability of more than 100 users being active?
The probability of 101 users being active, plus 102 users being
active, plus …, plus 200 users being active, which is
$$\sum_{k=101}^{200} \binom{200}{k}\, 0.2^{k}\, (1-0.2)^{200-k} < 10^{-14}$$
This is the binomial complementary cumulative distribution.
We conclude that if there are 200 users, then “pretty much always” things will work fine.
Suppose that there are 300 users:
$$\sum_{k=101}^{300} \binom{300}{k}\, 0.2^{k}\, (1-0.2)^{300-k} \approx 10^{-8}$$
Still pretty good
Suppose that there are 400 users:
$$\sum_{k=101}^{400} \binom{400}{k}\, 0.2^{k}\, (1-0.2)^{400-k} \approx 0.004$$
Might be acceptable performance
Therefore: circuit switching could support 100 users, while
packet switching can support 400 users. A factor of 4 more!!!
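The tail probabilities above can be checked numerically. The following is a minimal sketch, assuming each user is active independently with probability 0.2 and that the link can carry 100 simultaneously active users; the function name prob_more_than is made up for this illustration.

```python
from math import comb


def prob_more_than(n_users: int, p_active: float, capacity: int) -> float:
    """P(more than `capacity` of `n_users` are active at once), binomial model."""
    return sum(
        comb(n_users, k) * p_active**k * (1 - p_active) ** (n_users - k)
        for k in range(capacity + 1, n_users + 1)
    )


if __name__ == "__main__":
    # Same cases as the slides: 200, 300 and 400 users, each active 20% of the time.
    for n in (200, 300, 400):
        print(n, prob_more_than(n, 0.2, 100))
```

Running it prints one probability per user count, which can be compared with the orders of magnitude quoted above.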
Losses and delay in packet switched
networks
•
Losses
– Transmission losses
• In fiber links, the bit-error rate is 10^-12 or better (i.e., lower).
– What is the probability of packet error when there are 1400 bytes in a packet?
• In wireless links, the bit-error rate can be very high
– Congestion losses.
• If too many packets arrive at the same time, then the buffers will fill up and packets are
lost.
• Increasing the link speeds or reducing the number of users can reduce the probability of
loss.
• Increasing the size of the buffer reduces losses, but also increases delay.
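The packet-error question for fiber links can be worked out directly. This is a sketch under two assumptions that the slides do not state: bit errors are independent, and a packet counts as lost if any one of its bits is in error. The function name is hypothetical.

```python
import math


def packet_error_probability(bit_error_rate: float, packet_bytes: int) -> float:
    """1 - (1 - BER)^(number of bits), computed in a numerically stable way."""
    bits = 8 * packet_bytes
    # log1p/expm1 avoid the rounding loss of evaluating (1 - 1e-12) ** bits directly.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    # 1400-byte packet on a link with bit-error rate 1e-12
    print(packet_error_probability(1e-12, 1400))
```

For such a small bit-error rate the result is very close to (number of bits) × (bit-error rate), i.e. 11,200 × 10^-12.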
• Delay
– Queuing delay
– Transmission delay
– Propagation delay
– Processing delay
[Figure: packets travel from A to B. The packet being transmitted experiences delay; packets queueing in the buffer experience delay; arriving packets are dropped (loss) if there are no free (available) buffers.]
In the news
News sources
www.lightreading.com (general networks)
www.unstrung.com (wireless and mobile)
www.darkreading.com (network security)
www.alleyinsider.com (general tech business news)
arstechnica.com (general tech news)
The Protocol Stack
application
transport
network
link
physical
• The application layer includes
network applications and network
application protocols
– e.g. of applications: web, IM, email
– e.g., application protocols: OSCAR,
http, smtp, ftp, DNS.
• Provide a service to a user or
another application.
• Require service from the lower
layers, but typically only interact
with the transport layer.
The Protocol Stack
application
transport
network
link
physical
• The transport layer (typically) transports messages from and to applications
• Different transport layer protocols provide different types of services.
• Types of services MAY include
– Reliability: the sender application can be assured that the data is correctly received, or receives an error message.
– Congestion and flow control: attempt to send data quickly but not so quickly as to cause congestion in the network or at the receiving host
– Error detection / correction
– In order delivery
– Break long messages into small chunks suitable for transmission over the network
– Multiplexing so that multiple transport layer connections can occur simultaneously
• Note that when a transport protocol provides these services, the application does not have to.
– This makes implementation of applications easier.
– This allows careful design of transport protocols, following the divide and conquer approach
• The transport layer uses the network layer to deliver packets, but does not require any type of service guarantees from the network layer
– In practice, the transport layer hopes for in order delivery.
Transport layer protocols: TCP and UDP
application
transport
network
link
physical
• TCP and UDP are the most widely used transport protocols.
• Other protocols include SCTP (UD and Cisco are active in developing SCTP) and RTP (for multimedia such as VoIP)
• TCP and UDP will be covered in great detail later. But for now:
• TCP provides many services
– Congestion control
– Flow control
– Reliability
– Multiplexing
– Error detection
• UDP provides few services
– Error detection
– Multiplexing
– The application must implement any other services that it requires.
• TCP requires a connection to be established, UDP does not
Transport Multiplexing
• Transport layers use
ports to provide
multiplexing
– Two hosts can have
multiple simultaneous
connections by using ports.
– Well known ports can be
used to specify a particular
application
• E.g., web servers will
accept TCP connections
on port 80
• A host can have two
connections with a web
server by using different
ports
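[Figure: a host and a second host (web server). Each has a TCP and a UDP port range running from 0 to 2^16-1. The first host uses TCP ports 4567 and 4568; the web server uses TCP port 80.]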
Sockets – gateway between the app layer and the
transport layer
• process sends/receives
messages to/from its socket
• socket analogous to door
– sending process shoves
message out door
– sending process relies on
transport infrastructure on
other side of door which brings
message to socket at
receiving process
[Figure: two hosts or servers connected through the Internet. On each, a process (controlled by the app developer) talks through a socket to TCP with its buffers and variables (controlled by the OS).]
TCP Sockets
• An application accesses TCP and UDP through sockets.
• TCP is connection based, so one host must be listening and the other must be connecting (calling)
• The basic steps for a TCP listener
– Define socket variable as a TCP socket
– Bind socket to a port (the bind function)
• If some other application is or was recently (120 sec) listening on this port, this function will fail.
• The application must check that this command succeeds.
– Listen on this port (the listen function)
– When the other host connects, the listen function completes and data can be sent or received.
– Close socket
• Basic steps for TCP caller
– Define socket variable as a TCP socket
• No port is given; the OS will assign whichever port is available. The application has no control over the port
– Connect
– Send data
– Close socket
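The listener and caller steps above can be sketched with Python's standard socket module. This is an illustration, not the project's required code. Assumptions: both ends run on the loopback address, and the listener binds to port 0 so that the OS picks a free port (a real server would bind a known port and handle the OSError raised when bind fails). Note that in this API the call that completes when the other host connects is accept, which follows listen.

```python
import socket
import threading


def run_listener(ready: threading.Event, result: dict, host: str = "127.0.0.1") -> None:
    """TCP listener: socket, bind, listen, accept, receive, reply, close."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.settimeout(5)
        srv.bind((host, 0))           # raises OSError if the port cannot be bound
        srv.listen(1)
        result["port"] = srv.getsockname()[1]
        ready.set()                   # tell the caller which port to use
        conn, _addr = srv.accept()    # completes when the caller connects
        with conn:
            result["data"] = conn.recv(1024)
            conn.sendall(b"ack")


def call(host: str, port: int, payload: bytes) -> bytes:
    """TCP caller: socket, connect (OS picks the local port), send, receive, close."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
        return sock.recv(1024)


if __name__ == "__main__":
    ready, result = threading.Event(), {}
    listener = threading.Thread(target=run_listener, args=(ready, result))
    listener.start()
    ready.wait(5)
    print(call("127.0.0.1", result["port"], b"hello"))
    listener.join(5)
    print(result["data"])
```

The with blocks close the sockets, which corresponds to the final "Close socket" step on each side.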
UDP Sockets
• UDP is connectionless.
– A host sends a packet when it wants.
– There is no concept of one host connecting to another.
– There is only the concept of one host sending a packet and the other host receiving the packet. And either host can send or receive
• Steps to send and then receive a UDP message
– Define socket as a UDP socket
– Bind socket to a port
• If this port is in use, bind will fail
– Send message
– Wait for message
• There are two ways to wait for messages, blocking or non-blocking
• A blocking function will wait for a message to arrive. It might wait forever.
• A non-blocking function will return immediately, but if no message was waiting in the transport layer, then no message is returned
• The select function allows a timeout to be set, so the function will wait until a message arrives or the timeout elapses.
– Close socket
Steps to receive a UDP message
– Define socket as a UDP socket
– Bind socket to a port
• If this port is in use, bind will fail
– Send message
– Wait for response
– Close socket
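The UDP steps, including waiting with a timeout, can be sketched as follows. This is a minimal example using Python's standard socket and select modules; the helper names, the loopback address, and the use of port 0 (let the OS choose a free port) are assumptions made for the illustration.

```python
import select
import socket


def open_udp_socket(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Define a UDP socket and bind it; bind raises OSError if the port is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_message(sock: socket.socket, timeout_s: float):
    """Wait until a datagram arrives or the timeout elapses.

    Returns (data, sender_address), or None if nothing arrived in time.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recvfrom(2048)


if __name__ == "__main__":
    a, b = open_udp_socket(), open_udp_socket()
    print(wait_for_message(b, 0.1))      # nothing sent yet, so the wait times out
    a.sendto(b"hello", b.getsockname())  # no connection is set up before sending
    print(wait_for_message(b, 2.0))
    a.close()
    b.close()
```

Using select keeps the receive call from blocking forever, which is the main risk of the plain blocking wait described above.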
Project 1
Due 9/16
• In this project messages will be sent over TCP and UDP.
• The project description is currently at
– http://www.eecis.udel.edu/~bohacek/Classes/CPEG419_2005/Proj1/project1_part1.htm
• All the required information should be online.
• This project can be completed by cutting and pasting from
the web site. But try to understand the steps.
• Let me know if there are typos.
The Protocol Stack
application
transport
network
link
physical
• The network layer routes packets
(datagrams) through the network
• The network layer gets packets
from the transport layer or from the
link layer.
• Depending on the destination
address, the network layer will give
the packet to the transport protocol
or to a specific link layer to send on
a specific link
• The network layer also provides
fragmenting of a large packet into
chunks suitable for the link layer
The Protocol Stack
application
transport
network
link
physical
• The link layer moves packets
(frames) between two hosts
• However, the link layer may
provide a wide range of services
including
– Media access control
– Error detection / correction
– Routing over layer 2 networks
– Reliability (where the network layer is informed if the transmission fails)
The Protocol Stack
application
transport
network
link
physical
• The physical layer moves packets
(frames) between two connected
hosts
• This requires putting the bits onto a
physical medium and decoding
them from the medium.
• In this course we mostly neglect
the physical layer and assume that
it works correctly (each layer
always assumes that the other
layers work correctly)
• But the performance of a protocol
at one layer often depends on the
other layers.
– One approach is cross-layer design
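Encapsulation
[Figure: at the source, the application message M is passed down the stack (application, transport, network, link, physical). The transport layer adds header Ht to form a segment (Ht M), the network layer adds Hn to form a datagram (Hn Ht M), and the link layer adds Hl to form a frame (Hl Hn Ht M). A switch implements only the link and physical layers; a router implements the network, link, and physical layers. At the destination the headers are removed layer by layer until the application receives M.]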
Chapter 2
The Application Layer
Goals of this Chapter
• To understand how common application protocols work
– Web (http)
– Email (smtp)
– FTP
– DNS
– P2P
– IM
• To understand the design alternatives for application
design
– A network application runs on many hosts, it is a distributed
application
– This chapter discusses several designs of distributed
applications
Road Map
• Application basics
• Web
• Email
• FTP
• DNS
• P2P
– Graph theory
– State diagrams
– P2P design
• IM
Creating a network app
write programs that
– run on (different) end
systems
– communicate over network
– e.g., web server software
communicates with browser
software
No need to write software for
network-core devices
– Network-core devices do not
run user applications
– applications on end systems
allows for rapid app
development, propagation
[Figure: three end systems, each running the full stack: application, transport, network, data link, physical]
An App-layer protocol defines
• Types of messages
exchanged,
– e.g., request, response
• Message syntax:
– what fields in messages &
how fields are delineated
• Message semantics
– meaning of information in
fields
• Rules for when and how
processes send & respond
to messages
Public-domain protocols:
• defined in RFCs
• allows for
interoperability
• e.g., HTTP, SMTP
Proprietary protocols:
• e.g., Skype
Ports
• An application is
identified by the host’s IP
address, transport
protocol, and port
– E.g., a web server has a
particular IP address and
listens with TCP on port 80.
– A web browser on a host
will connect and request a file
from the web server. The
browser is identified by the
host’s IP address and a
TCP port.
[Figure: a host and a second host (web server). Each has a TCP and a UDP port range running from 0 to 2^16-1. The first host uses TCP ports 4567 and 4568; the web server uses TCP port 80.]
What transport service does an app need?
Data reliability
• some apps (e.g., audio) can
tolerate some loss
• other apps (e.g., file
transfer, telnet) require
100% reliable data transfer
Timing
• some apps (e.g., Internet
telephony, interactive
games) require low delay to
be “effective”
Throughput
• some apps (e.g., multimedia)
require minimum amount of
throughput to be “useful” (i.e., in
order for the user to gain utility)
• other apps (“elastic apps”) make
use of whatever throughput they
get
Security
• Encryption, data integrity, …
Transport service requirements of common apps
Application | Data loss | Throughput | Time Sensitive
file transfer | no loss | elastic | no
e-mail | no loss | elastic | no
Web documents | no loss | somewhat elastic | not really
real-time audio/video | loss-tolerant | audio: 5kbps-1Mbps, video: 10kbps-5Mbps | yes, 100’s msec
stored audio/video | loss-tolerant | same as above | yes, few secs
interactive games | loss-tolerant | few kbps up | yes, 100’s msec
instant messaging | no loss | elastic | yes and no
Internet transport protocols services
TCP service:
• connection-oriented: setup
required between client and
server processes
• reliable transport between
sending and receiving process
• flow control: sender won’t
overwhelm receiver
• congestion control: throttle
sender when network overloaded
• does not provide: timing,
minimum throughput guarantees,
security
UDP service:
• unreliable data transfer
between sending and
receiving process
• does not provide: reliability,
flow control, congestion
control, timing, throughput
guarantee, or security
• Does not require
connection set-up
• Packets can be sent at any
rate desired (but this might
cause considerable
congestion)
Internet apps: application, transport protocols
Application | Application layer protocol | Underlying transport protocol
e-mail | SMTP [RFC 2821] | TCP
remote terminal access | Telnet [RFC 854] | TCP
Web | HTTP [RFC 2616] | TCP
file transfer | FTP [RFC 959] | TCP
streaming multimedia | HTTP (e.g., YouTube), RTP [RFC 1889] | TCP or UDP
Internet telephony | SIP, RTP, proprietary (e.g., Skype) | typically UDP
Web and HTTP
• Web page consists of objects
• Object can be HTML file, JPEG image, Java applet, audio file,…
• Web page consists of base HTML-file which includes several
referenced objects
• The browser first requests the base file
• The base file specifies text and URLs of objects
• The browser requests these objects, wherever they are (not
always on the same server)
• HTTP is used to request the base file and all the other files
• Note that HTTP can be used for other applications besides the web
• Each object is addressable by a URL
• Example URL:
www.someschool.edu/someDept/pic.gif
host name: www.someschool.edu
path name: /someDept/pic.gif
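The split of a URL into host name and path name can be tried with Python's standard urllib.parse module. A small sketch: the helper name host_and_path is made up, and prepending "http://" when the scheme is missing is an assumption added so that the parser recognises the host part.

```python
from urllib.parse import urlsplit


def host_and_path(url: str):
    """Return (host name, path name) of a URL."""
    if "://" not in url:
        url = "http://" + url  # assumption: treat scheme-less URLs as HTTP
    parts = urlsplit(url)
    return parts.hostname, parts.path


if __name__ == "__main__":
    print(host_and_path("www.someschool.edu/someDept/pic.gif"))
```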
HTTP overview
HTTP: hypertext
transfer protocol
• Web’s application layer
protocol
• client/server model
– client: browser that
requests, receives,
“displays” Web objects
– server: Web server
sends objects in
response to requests
[Figure: a PC running Explorer and a Mac running Navigator exchange HTTP requests and responses with a server running the Apache Web server]
HTTP overview (continued)
Uses TCP:
• client initiates TCP connection
(creates socket) to server, port
80
• server accepts TCP connection
from client
• HTTP messages (application-layer protocol messages)
exchanged between browser
(HTTP client) and Web server
(HTTP server)
• TCP connection closed
HTTP is “stateless”
• server maintains no
information about past
client requests
aside
Protocols that maintain “state”
are complex!
• past history (state) must be
maintained
• if server/client crashes, their
views of “state” may be
inconsistent, must be
reconciled
HTTP connections
Nonpersistent HTTP
• At most one object is
sent over a TCP
connection.
Persistent HTTP
• Multiple objects can
be sent over single
TCP connection
between client and
server.
Nonpersistent HTTP
Suppose user enters URL
www.someSchool.edu/someDepartment/home.index
(contains text, references to 10 jpeg images)
1a. HTTP client initiates TCP connection to
HTTP server (process) at
www.someSchool.edu on port 80
1b. HTTP server at host
www.someSchool.edu waiting for TCP
connection at port 80. “accepts”
connection, notifying client
2. HTTP client sends HTTP request
message (containing URL) into TCP
connection socket. Message
indicates that client wants object
someDepartment/home.index
3. HTTP server receives request message,
forms response message containing
requested object, and sends message
into its socket
4. HTTP server closes TCP connection.
5. HTTP client receives response message
containing html file, displays html.
Parsing html file, finds 10 referenced
jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg
objects
Non-Persistent HTTP: Response time
Definition of RTT: time for a small
packet to travel from client to
server and back.
Response time:
• one RTT to initiate TCP
connection
• one RTT for HTTP request and
first few bytes of HTTP
response to return
• file transmission time
total = 2RTT+transmit time
[Figure: timeline between client and server. One RTT to initiate the TCP connection, one RTT to request the file, then the time to transmit the file until the file is received.]
Persistent HTTP
• Nonpersistent HTTP issues:
• requires 2 RTTs per object
• OS overhead for each TCP
connection
• browsers often open parallel
TCP connections to fetch
referenced objects
• Persistent HTTP
• server leaves connection open
after sending response
• subsequent HTTP messages
between same client/server
sent over open connection
• client sends requests as soon
as it encounters a referenced
object
• as little as one RTT for all the
referenced objects
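The two response-time rules (2 RTTs plus transmit time per object for nonpersistent HTTP, and as little as one RTT for all referenced objects with persistent HTTP) can be compared with a short calculation. This is a sketch under assumptions not stated on the slides: every object takes the same transmit time, nonpersistent requests are made one after another (no parallel connections), and the persistent case is the best case in which all referenced objects are requested back to back.

```python
def nonpersistent_time(rtt_s: float, transmit_s: float, n_referenced: int) -> float:
    """Base file plus n referenced objects, each costing 2 RTT + transmit time."""
    return (1 + n_referenced) * (2 * rtt_s + transmit_s)


def persistent_best_case_time(rtt_s: float, transmit_s: float, n_referenced: int) -> float:
    """Base file costs 2 RTT + transmit; all referenced objects share one further RTT."""
    base_file = 2 * rtt_s + transmit_s
    referenced = rtt_s + n_referenced * transmit_s
    return base_file + referenced


if __name__ == "__main__":
    # Hypothetical numbers: 100 ms RTT, 10 ms to transmit each object, 10 images.
    print(nonpersistent_time(0.1, 0.01, 10))
    print(persistent_best_case_time(0.1, 0.01, 10))
```

With these made-up numbers the RTT terms dominate, which is why removing the per-object connection setup helps so much.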
HTTP request message
• two types of HTTP messages: request, response
• HTTP request message:
– ASCII (human-readable format)
request line (GET, POST, HEAD commands):
GET /somedir/page.html HTTP/1.1
header lines:
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra carriage return, line feed)
A carriage return and line feed on a line by themselves indicate the end of the message.
HTTP request message: general format
HTTP response message
status line (protocol, status code, status phrase):
HTTP/1.1 200 OK
header lines:
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html
data, e.g., requested HTML file:
data data data data data ...
HTTP response status codes
In first line in server->client response message.
A few sample codes:
200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in this message
(Location:)
400 Bad Request
– request message not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
telnet cis.poly.edu 80
Opens TCP connection to port 80 (default HTTP server port) at cis.poly.edu.
Anything typed in is sent to port 80 at cis.poly.edu
2. Type in a GET HTTP request:
GET /~ross/ HTTP/1.1
Host: cis.poly.edu
By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server
3. Look at response message sent by HTTP server!
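The same experiment can be scripted instead of typed into telnet. A minimal sketch using Python's standard socket module: the function names are made up for this example, the Connection: close header is added so the server ends the connection after one response, and whether any particular host still answers on port 80 is not guaranteed.

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Minimal GET request: request line, headers, then an empty line (CRLF CRLF)."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes):
    """Split the first response line into (protocol, status code, status phrase)."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, phrase = first_line.split(" ", 2)
    return version, int(code), phrase


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Open a TCP connection, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(parse_status_line(fetch("cis.poly.edu", "/~ross/")))
```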
Wireshark (ethereal)
• Wireshark captures all packets that pass through the host’s interface
• To run Wireshark, libpcap (Linux) or WinPcap (Windows) must be installed. It comes with the Wireshark package
• Then, run Wireshark
• Select Capture
• Find the active interface
– E.g., not generic dialup, nor VPN, nor packet scheduler, but wireless …. with IP address
– Then select prepare
– Let’s watch TCP packets on port 80
• Next to capture filter, enter tcp port 80
– Select update in realtime and autoscroll
– Might need to enable or disable “capture in promiscuous mode”
– Press start
– Press close
• Load www.eecis.udel.edu page in browser
• Press stop in Wireshark
• Find http request to 128.4.40.10.
– Right click and select follow TCP stream
Web caches (proxy server)
Goal: reduce network utilization by satisfying client request
without involving origin server
• user sets browser: Web
accesses via cache
• browser sends all
HTTP requests to
cache
– object in cache: cache
returns object
– else cache requests
object from origin server,
then returns object to
client
[Figure: two clients send HTTP requests to a proxy server, which forwards misses to two origin servers]
More about Web caching
• cache acts as both client
and server
• typically cache is installed
by ISP (university,
company, residential ISP)
Why Web caching?
• reduce response time for
client request
• reduce traffic on an
institution’s access link.
• Internet dense with
caches: enables “poor”
content providers to
effectively deliver content
(but so does P2P file
sharing)
Caching example
Assumptions
• average object size = 100,000 bits
• avg. request rate from institution’s browsers to origin servers = 15/sec
• delay from institutional router to any origin server and back to router = 2 sec
[Figure: origin servers, public Internet, 1.5 Mbps access link, institutional network with 10 Mbps LAN, institutional cache]
Consequences
• utilization on LAN = 15%
• utilization on access link = 100%
• total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds
Caching example (cont)
possible solution
• increase bandwidth of access link to, say, 10 Mbps
[Figure: origin servers, public Internet, 10 Mbps access link, institutional network with 10 Mbps LAN, institutional cache]
consequence
• utilization on LAN = 15%
• utilization on access link = 15%
• total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
• often a costly upgrade
Caching example (cont)
possible solution: install cache
• suppose hit rate is 0.4
[Figure: origin servers, public Internet, 1.5 Mbps access link, institutional network with 10 Mbps LAN, institutional cache]
consequence
• 40% requests will be satisfied almost immediately
• 60% requests satisfied by origin server
• utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
• total avg delay = Internet delay + access delay + LAN delay = .6*(2.01) secs + .4*milliseconds < 1.4 secs
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
• cache: specify date of cached copy in HTTP request
If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified
[Figure: two exchanges between cache and server. Object not modified: the cache sends an HTTP request msg with If-modified-since: <date> and the server answers HTTP/1.0 304 Not Modified. Object modified: the cache sends the same request and the server answers HTTP/1.0 200 OK with <data>.]
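The server's decision for a conditional GET can be sketched in a few lines. This is an illustration only: the function name is made up, the comparison rule (send 304 when the object was last modified no later than the date in the request) is the simplest reading of the description above, and real servers handle more cases. The date header is parsed with the standard email.utils module.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since: str, last_modified: datetime) -> str:
    """Status line a server would send for a request carrying If-modified-since."""
    cached_copy_date = parsedate_to_datetime(if_modified_since)
    if last_modified <= cached_copy_date:
        return "HTTP/1.0 304 Not Modified"  # cached copy is up to date, no object sent
    return "HTTP/1.0 200 OK"                # object changed, the data follows


if __name__ == "__main__":
    header = "Thu, 06 Aug 1998 12:00:15 GMT"
    print(conditional_get_status(header, datetime(1998, 6, 22, tzinfo=timezone.utc)))
    print(conditional_get_status(header, datetime(1998, 8, 7, tzinfo=timezone.utc)))
```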
FTP: the file transfer protocol
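[Figure: a user at a host works through the FTP user interface and FTP client, which has access to the local file system; files are transferred between the FTP client and the FTP server, which has access to the remote file system]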
• transfer file to/from remote host
• client/server model
– client: side that initiates transfer (either to/from remote)
– server: remote host
• ftp: RFC 959
• ftp server: listens on port 21
FTP is weird: separate control and data connections
• FTP client contacts FTP server at port 21, TCP is transport protocol
• client authorized over control connection
– This is done in “clear text” (i.e., unencrypted)
– So if someone is sniffing packets, your password might be learned.
– Sniffing packets is difficult on ethernet, encrypted wifi, and DSL, but is possible on cable modems
• client browses remote directory by sending commands over control connection.
• Data is transferred over different connections. Two approaches
– Active
– Passive
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (port 20)]
Active
– The client opens a TCP socket on some port (port number >1024)
– The client sends the server the port
– The server connects to the client’s port, where the server’s source port is 20
• Active mode is a problem for firewalls
– If my desktop is not a server, it should not receive any requests for connections.
– But FTP servers will make such requests
FTP Passive mode
• When a file is to be transferred, the server opens a port (number >1024 and not 20)
• The server sends this port number information over the command connection
• The client connects to the server over this port.
• Drawback of passive
– Some enterprises (companies) like to control which applications are used
• E.g., web browsing is ok, but skype is not
– One way to do this is to block outgoing connections based on the port.
– However, this will cause FTP to fail, unless the device that blocks connections is smart
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection to a high port on the server]
Email Protocol Design
• Basic assumption: weak user agents and strong mail servers
– The user wants to send the mail and leave
– The user wants to get the mail
– The user may come and go whenever (e.g., roaming laptop)
– It should be possible to send mail to a user even if the sender and the recipient are never online at the same time.
– We conclude that there must be a middle man/mail server.
• Servers are not that strong: The protocol must be as robust as possible to servers being offline
– No single server – why
• Single point of failure
• The server would have to be too big (congestion)
– We conclude that there should be many mail servers
• Two types of hosts
– Users
– Mail servers
• Each user has a mail box in its mail server
– Users retrieve mail from their mail server at their convenience
• Users give mail to their mail servers to deliver the mail
• Mail servers communicate with
– The users that have mail boxes in the server
– Other mail servers
[Figure: user agent, mail server, mail server, user agent]
Email Protocol Design
[Figure: mail flow from user agent to mail server to mail server to user agent, annotated with the following steps]
• User composes mail and sends it to its mail server (or a mail server that will send mail for it)
• Mail server finds the destination mail server and attempts to send the mail
• Destination user requests emails from mailbox
• Destination server gives mails to user
Email Protocol Design
[Figure: the mail flow labeled with protocols. The user agent sends mail to its mail server with SMTP; the mail server sends it to the destination mail server with SMTP; the destination user agent retrieves mail from its mailbox with POP3, IMAP, …]
Electronic Mail: Details
Three major components:
• user agents
• mail servers
• simple mail transfer protocol: SMTP
User Agent
• a.k.a. “mail reader”
• composing, editing, reading mail messages
• e.g., Eudora, Outlook, elm, Mozilla Thunderbird
• Put outgoing messages on server (with SMTP)
• Get incoming messages from server
[Figure: several user agents attached to three mail servers; each mail server holds an outgoing message queue and user mailboxes, and the mail servers exchange mail with SMTP]
Electronic Mail: mail servers
Mail Servers
• mailbox contains incoming messages for user
• message queue of outgoing (to be sent) mail messages
• SMTP protocol between mail servers to send email messages
– client: sending mail server
– “server”: receiving mail server
• Reliable: makes several attempts and provides notification if delivery fails
[Figure: several user agents attached to three mail servers; the mail servers exchange mail with SMTP]
Electronic Mail: SMTP [RFC 2821]
• uses TCP to reliably transfer email message from client to
server, port 25
• direct transfer: sending server to receiving server
• Emails are pushed to servers (but users pull messages from
servers)
• three phases of transfer
– handshaking (greeting)
– transfer of messages
– closure
• command/response interaction
– commands: ASCII text
– response: status code and phrase
• messages must be in 7-bit ASCII
– Makes it difficult to send attachments
Scenario: Alice sends message to Bob
1) Alice uses UA to compose a message addressed “to” [email protected]
2) Alice’s UA sends message to her mail server; message placed in message queue
3) Client side of SMTP opens TCP connection with Bob’s mail server
4) SMTP client sends Alice’s message over the TCP connection
5) Bob’s mail server places the message in Bob’s mailbox
6) Bob invokes his user agent to read message
[Figure: Alice’s user agent (1) hands the message to her mail server (2), which transfers it to Bob’s mail server (3, 4); Bob’s mail server stores it (5) and Bob’s user agent reads it (6)]
Sample SMTP interaction
Client connects to server
S: 220 hamburger.edu
C: HELO crepes.fr
S: 250 Hello crepes.fr, pleased to meet you
C: MAIL FROM: <[email protected]>
S: 250 [email protected]... Sender ok
C: RCPT TO: <[email protected]>
S: 250 [email protected] ... Recipient ok
C: DATA
S: 354 Enter mail, end with "." on a line by itself
C: Do you like ketchup?
C: How about pickles?
C: .
S: 250 Message accepted for delivery
C: QUIT
S: 221 hamburger.edu closing connection
Try SMTP interaction for yourself:
• telnet mail.eecis.udel.edu 25
• see 220 reply from server
• enter HELO, MAIL FROM, RCPT TO, DATA, QUIT
commands
above lets you send email without using email client
(reader)
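The client side of the dialogue above is just a sequence of text lines, so it is easy to generate. The sketch below builds the lines a client would type; it does not talk to a real server. The function name and the example.org / example.com addresses are hypothetical placeholders. One detail beyond the slides: because a line containing only "." ends the message, a body line that itself starts with "." is sent with an extra leading "." (dot-stuffing).

```python
def smtp_client_lines(helo_domain: str, sender: str, recipient: str, body_lines: list) -> list:
    """Commands and message lines an SMTP client sends, each ending in CRLF."""
    lines = [
        f"HELO {helo_domain}",
        f"MAIL FROM: <{sender}>",
        f"RCPT TO: <{recipient}>",
        "DATA",
    ]
    for line in body_lines:
        # Dot-stuffing: protect body lines that begin with "."
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")   # a line with only "." ends the message
    lines.append("QUIT")
    return [line + "\r\n" for line in lines]


if __name__ == "__main__":
    for line in smtp_client_lines(
        "crepes.fr", "alice@example.org", "bob@example.com",
        ["Do you like ketchup?", "How about pickles?"],
    ):
        print(line, end="")
```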
SMTP: final words
• SMTP uses persistent
connections
• SMTP requires message
(header & body) to be in 7-bit
ASCII
• SMTP server uses
CRLF.CRLF to determine end
of message
Comparison with HTTP:
• HTTP: pull
• SMTP: push
• both have ASCII
command/response
interaction, status codes
• HTTP: each object
encapsulated in its own
response msg
• SMTP: multiple objects sent in
multipart msg
Mail access
• POP3 and IMAP are two protocols for
accessing mail on a mail server
• Web-based mail works differently: the web
mail server and the mail server can be
integrated, so that there is no user agent.
Mail access protocols
[Figure: user agent to sender’s mail server (SMTP), sender’s mail server to receiver’s mail server (SMTP), receiver’s mail server to user agent (access protocol)]
• SMTP: delivery/storage to receiver’s server
• Mail access protocol: retrieval from server
– POP: Post Office Protocol [RFC 1939]
• authorization (agent <-->server) and download
– IMAP: Internet Message Access Protocol [RFC 1730]
• more features (more complex)
• manipulation of stored msgs on server
– HTTP: gmail, Hotmail, Yahoo! Mail, etc.
DNS – domain name system
• Translates names, like www.yahoo.com, into IP addresses.
• Services provided by DNS
– Name to address translation
– Host aliasing
• A host relay1.west-coast.yahoo.com could have two aliases, yahoo.com and
www.yahoo.com.
• In this case, the canonical hostname is relay1.west-coast.yahoo.com.
• DNS can provide canonical host names
– Mail server aliasing
• When a mail server wants to send a mail to [email protected], it does not send
it to www.udel.edu, but to mail.udel.edu. Or maybe udmail.udel.edu. DNS
can translate udel.edu to mail.udel.edu
– (Cheap) Load distribution
• Cnn.com has several servers.
• DNS will respond with all addresses, but it will reorder the addresses every time.
• If the client uses the first address listed, then different clients will use different servers.
• Content distribution networks (CDN) are better ways of load balancing
DNS - structure
• Centralized DNS?
– Pros – somewhat easy to maintain (there is only one
system). But it must always be online
– Cons
• Single point of failure (the system crashes -> no web)
• Congestion
• Server would be far from some hosts (delay)
• Database would be too big
• To register bohacek-pc1.pc.udel.edu would require interacting with the big server
• Instead, a distributed hierarchical database is
used.
Domain Hierarchy
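[Figure: domain name tree. Labels at the top level: edu, com, gov, mil, org, net, uk, arpa, in. Labels below: UD, upenn, yahoo, cisco, whitehouse, nasa, navy, acm. Under UD: eecis and art. Hosts at the leaves: bohacek_pc1, bohacek_pc10.]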
Administrative Zones in the Domain Hierarchy
[Figure: the domain name tree (root; edu, com, gov, mil, org, net, uk, arpa, in; UD, upenn, yahoo, cisco, whitehouse, nasa, navy, acm; eecis, art; bohacek_pc1, bohacek_pc10) partitioned into administrative zones]
It is possible that .edu and .gov are administered together
Note that UD administers art but not eecis
Sometimes a single service provider will administer the domains for a large number of .coms
Root servers
• Each layer in the hierarchy knows about the domain names below it
• The highest level is the root.
– There are 13 root “servers”
– Each of these servers is actually several servers, and some of the
machines that comprise a server are distributed geographically.
13 root name servers worldwide:
a Verisign, Dulles, VA
b USC-ISI Marina del Rey, CA
c Cogent, Herndon, VA (also LA)
d U Maryland College Park, MD
e NASA Mt View, CA
f Internet Software C. Palo Alto, CA (and 36 other locations)
g US DoD Vienna, VA
h ARL Aberdeen, MD
i Autonomica, Stockholm (plus 28 other locations)
j Verisign, (21 locations)
k RIPE London (also 16 other locations)
l ICANN Los Angeles, CA
m WIDE Tokyo (also Seoul, Paris, SF)
overview
• Top-level domain (TLD) servers
– There are around 200 top-level domains
– These include com, edu, mil, info, in, uk, cn, and others
– Currently,
• Network Solutions maintains the TLD servers for com
• Educause maintains the TLD servers for edu
– The root servers know the addresses and names of all top-level domain servers
• Organizations have a hierarchy of DNS servers
DNS queries
• Suppose a host needs the IP address of bohacek-pc1.eecis.udel.edu
• If this IP address is not in its cache, the host asks its local DNS server.
• If the local DNS server does not have it in cache, it checks whether it has the IP address of the DNS server of eecis.udel.edu in cache
• If not, it checks whether it has the IP address of the DNS server of udel.edu in cache
• If not, it checks whether it has the IP address of the top-level domain server of edu in cache
• If not, it asks the root server for the IP address of the edu TLD server
– The DNS server always has the IP addresses of the root servers
• The local DNS server asks the edu TLD server for the address of bohacek-pc1.eecis.udel.edu.
• The TLD server does not know that IP address, but instead gives the IP address of the DNS server for UD
• The local DNS server asks the UD DNS server for the address of bohacek-pc1.eecis.udel.edu.
• The UD DNS server does not know the address, but instead returns the address of the eecis DNS server.
• The local DNS server asks the eecis DNS server for the address of bohacek-pc1.eecis.udel.edu
• The eecis DNS server replies with the address.
• This address is returned to the host that originally asked the question.
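The lookup procedure above can be simulated with a small in-memory model. This is a sketch under stated assumptions: the server names ("root", "edu-tld", "ud-dns", "eecis-dns"), the `LocalDnsServer` class, and the address given to bohacek-pc1 are all hypothetical; only 128.4.1.2 for www.eecis.udel.edu comes from the slides.

```python
# Hypothetical servers: each maps a name to an answer ("A", ip)
# or to a referral ("NS", name of the next server to ask).
SERVERS = {
    "root": {"edu": ("NS", "edu-tld")},
    "edu-tld": {"udel.edu": ("NS", "ud-dns")},
    "ud-dns": {"eecis.udel.edu": ("NS", "eecis-dns")},
    "eecis-dns": {
        "www.eecis.udel.edu": ("A", "128.4.1.2"),
        "bohacek-pc1.eecis.udel.edu": ("A", "128.4.1.3"),  # made-up address
    },
}


def suffixes(name):
    """Yield the name and then each parent domain, most specific first."""
    labels = name.split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


class LocalDnsServer:
    def __init__(self):
        self.answers = {}       # cached host name -> IP address
        self.servers = {}       # cached domain -> DNS server for that domain
        self.queries_sent = []  # servers asked, in order

    def resolve(self, name):
        if name in self.answers:
            return self.answers[name]
        # Use the most specific cached DNS server; fall back to the root,
        # whose address is always known.
        server = "root"
        for domain in suffixes(name):
            if domain in self.servers:
                server = self.servers[domain]
                break
        while True:
            self.queries_sent.append(server)
            table = SERVERS[server]
            for domain in suffixes(name):
                if domain in table:
                    kind, value = table[domain]
                    break
            else:
                raise LookupError(name)
            if kind == "A":
                self.answers[name] = value
                return value
            # Referral: remember the server for this domain and ask it next.
            self.servers[domain] = value
            server = value


local = LocalDnsServer()
print(local.resolve("www.eecis.udel.edu"), local.queries_sent)
print(local.resolve("bohacek-pc1.eecis.udel.edu"), local.queries_sent)
```

The first lookup walks root, TLD, UD, and eecis servers; the second goes straight to the eecis server because its address was cached during the first.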
DNS Queries
[Figure: message flow when nothing is cached. The root server's IP addresses are always known.]
1. Browser wants to show www.eecis.udel.edu, so it needs the IP address of www.eecis.udel.edu.
2. Host asks local DNS server for the IP address of www.eecis.udel.edu.
3. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache. If not, it checks if it has the IP address of the DNS server of udel.edu in cache. If not, it checks if it has the IP address of the top-level domain server of edu in cache. If not, it asks the root server.
4. Local DNS server asks the root server: "What is the IP address of www.eecis.udel.edu?" Root server does not know. Instead, it responds with the name and address of a server that might, specifically the TLD server for .edu.
5. Local DNS server asks the TLD server for .edu the same question. TLD server does not know. Instead it replies with the name and IP address of the UD DNS server.
6. Local DNS server asks the UD DNS server the same question. UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
7. Local DNS server asks the eecis DNS server the same question. The eecis DNS server replies: "It is 128.4.1.2".
8. Local DNS server replies to the host: "It is 128.4.1.2".
DNS Queries
[Figure: the answer is already in the local DNS server's cache.]
• Browser wants to show www.eecis.udel.edu and needs its IP address.
• Host asks local DNS server for the IP address of www.eecis.udel.edu.
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If yes, then return it: "It is 128.4.1.2"
DNS Queries
[Figure: the IP address of the eecis DNS server is in cache.]
• Browser wants to show www.eecis.udel.edu and needs its IP address.
• Host asks local DNS server for the IP address of www.eecis.udel.edu.
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If yes, query it: "What is the IP address of www.eecis.udel.edu?" The eecis DNS server replies "It is 128.4.1.2", and the local DNS server returns this answer to the host.
DNS Queries
[Figure: the IP address of the TLD server for .edu is in cache.]
• Browser wants to show www.eecis.udel.edu and needs its IP address.
• Host asks local DNS server for the IP address of www.eecis.udel.edu.
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If so, then query it: "What is the IP address of www.eecis.udel.edu?"
6. TLD server for .edu does not know. Instead it replies with the name and IP address of the UD DNS server.
7. Local DNS server asks the UD DNS server the same question. UD DNS server does not know. Instead it replies with the name and IP address of the eecis DNS server.
8. Local DNS server asks the eecis DNS server the same question. The eecis DNS server replies "It is 128.4.1.2", and this answer is returned to the host.
Attack on DNS
• Hackers have tried to bring down
DNS by performing a DoS on the
root servers
– DoS (denial of service): the attacker sends more packets or requests for service than the server can accommodate, resulting in poor service for normal users.
• This failed because
– There are many very strong root servers, and they have firewalls/filters
• The attacks used ICMP ping packets
• DNS requests would have been more effective
– It is rare that a root server is needed
• Usually only the TLD server is needed
• Or only a domain server.
DNS Message Details
• DNS Record
– (Name, Value, Type, Class, TTL)
– If Type = A
• Name is the host name
• Value is the IP address of the host
– If Type = NS
• Name is a domain name
• Value is the name of the DNS server for the domain
• E.g., (udel.edu, dns.udel.edu, NS, …, …)
– Type = MX
• Name is the domain name
• Value is the name of the mail server for the domain
• E.g., (udel.edu, mail.udel.edu, MX, …, …)
– Type = CName
• Name is a host name
• Value is the canonical name of the host
• E.g., (www.yahoo.com, relay-east.yahoo.com, CName, …, …)
– TTL is the time to live, so DNS caches can be timed out
– Class is no longer used; it is set to IN
DNS query
• (Name, Type, Class)
• (UDel.edu, MX, IN)
– Please provide the name of the UD’s mail
server
• (mail.UDel.edu, A, IN)
– Please provide the IP address for mail.udel.edu
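The record and query tuples above map naturally onto a small lookup table. The following is a minimal sketch, not a DNS server: the `Record` type, the `answer` helper, the TTL values, and the 192.0.2.25 address are assumptions for illustration, while the names come from the slides.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    value: str
    type: str
    cls: str
    ttl: int  # seconds; the values below are placeholders


RECORDS = [
    Record("udel.edu", "dns.udel.edu", "NS", "IN", 3600),
    Record("udel.edu", "mail.udel.edu", "MX", "IN", 3600),
    Record("www.yahoo.com", "relay-east.yahoo.com", "CNAME", "IN", 3600),
    Record("mail.udel.edu", "192.0.2.25", "A", "IN", 3600),  # made-up address
]


def answer(query):
    """Return all records matching a (Name, Type, Class) query."""
    name, rtype, cls = query
    # Domain names are compared case-insensitively.
    return [
        r for r in RECORDS
        if r.name == name.lower() and r.type == rtype and r.cls == cls
    ]


# Step 1: which host handles mail for the domain?
mx = answer(("UDel.edu", "MX", "IN"))[0]
# Step 2: what is that host's IP address?
a = answer((mx.value, "A", "IN"))[0]
print(mx.value, a.value)
```

Delivering mail therefore takes two queries: an MX query for the domain, then an A query for the mail server it names.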
DNS message format
DNS protocol : query and reply messages, both with
same message format
msg header
• identification: 16-bit number for the query; the reply to the query uses the same number
• flags:
– query or reply
– recursion desired
– recursion available
– reply is authoritative
DNS message format
• questions: name, type fields for a query
• answers: RRs in response to query
• authority: records for authoritative servers
• additional info: additional “helpful” info that may be used
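As a rough illustration of the message format, the sketch below packs a query header and one question with Python's `struct` module. The helper names (`encode_name`, `build_query`, `parse_header`) are made up for this example; the 12-byte header layout and flag bit positions follow the standard DNS wire format.

```python
import struct

# Flag bits within the 16-bit flags field of the header.
QR_REPLY = 0x8000                # 0 = query, 1 = reply
AA_AUTHORITATIVE = 0x0400        # reply is authoritative
RD_RECURSION_DESIRED = 0x0100
RA_RECURSION_AVAILABLE = 0x0080

TYPE_A, CLASS_IN = 1, 1


def encode_name(name):
    """Encode a host name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def build_query(query_id, name, recursion_desired=True):
    flags = RD_RECURSION_DESIRED if recursion_desired else 0
    # Header: id, flags, then the number of questions, answer RRs,
    # authority RRs, and additional RRs.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", TYPE_A, CLASS_IN)
    return header + question


def parse_header(message):
    query_id, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": query_id,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AA_AUTHORITATIVE),
        "recursion_desired": bool(flags & RD_RECURSION_DESIRED),
        "recursion_available": bool(flags & RA_RECURSION_AVAILABLE),
        "counts": (qd, an, ns, ar),
    }


msg = build_query(0x1234, "www.eecis.udel.edu")
print(len(msg), parse_header(msg))
```

A reply reuses the same layout: the server copies the identification, sets the reply bit, and fills in the answer, authority, and additional counts.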
DNS Queries
[Figure: the same message flow with the contents of each DNS message. The root server's IP addresses are always known.]
1. Local DNS server checks if it has the IP address of www.eecis.udel.edu in cache.
2. If not, it checks if it has the IP address of the DNS server of eecis.udel.edu in cache
3. If not, it checks if it has the IP address of the DNS server of udel.edu in cache
4. If not, it checks if it has the IP address of the top-level domain server of edu in cache
5. If not, it asks the root server.
• Query sent by the local DNS server to each server in turn: (www.eecis.udel.edu, A, IN)
• Root server reply:
– (edu, edu-serverA.net, NS, IN)
– (edu-serverA.net, 124.5.1.1, A, IN)
– (edu, edu-serverB.net, NS, IN)
– (edu-serverB.net, 124.5.1.2, A, IN)
• TLD server for .edu reply:
– (udel.edu, dns1.udel.edu, NS, IN)
– (dns1.udel.edu, 128.173.2.1, A, IN)
– (udel.edu, dns2.udel.edu, NS, IN)
– (dns2.udel.edu, 128.178.2.2, A, IN)
• UD DNS server reply:
– (eecis.udel.edu, dns1.eecis.udel.edu, NS, IN)
– (dns1.eecis.udel.edu, 128.4.1.10, A, IN)
– (eecis.udel.edu, dns2.udel.edu, NS, IN)
– (dns2.udel.edu, 128.4.1.11, A, IN)
• eecis DNS server reply, which is then passed back to the host: (www.eecis.udel.edu, 128.4.1.1, A, IN)
DNS Flags
• The DNS header has a query ID
– The query has this ID and the server copies this ID
into the response
• Flag indicating query or answer
• Flag indicating whether the server is the authoritative server for the answer (as opposed to a cached answer)
• A recursion-desired flag indicating that the host/server would like the server to perform the recursive DNS lookup
• A recursion-available flag indicating whether the server is able to do the recursive lookup
DNS
• Which transport protocol should DNS use?
• Why?
Peer-to-peer file sharing
• About P2P
– 30% or more of the bytes transferred on the Internet are from
P2P users
– Skype is a very successful P2P VoIP app
• Written in 3-4 months
• Topics covered
– Scalability
– P2P querying
– Case study
• BitTorrent
• Skype
Pure P2P architecture
• Review: What is the difference between peer-to-peer and client/server?
– Each host acts as both a server and a client.
• no always-on server
• arbitrary end systems directly communicate
• peers are intermittently connected and may change IP addresses
• Pure P2P has significant drawbacks.
• P2P-like systems with some central servers are more common.
• But in all cases, the file transfer is between peers, not from servers.
File Distribution: Server-Client vs P2P
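Question: How much time does it take to distribute a file from one server to N peers?
• F: file size
• us: server upload bandwidth
• ui: peer i upload bandwidth
• di: peer i download bandwidth
• The network is assumed to have abundant bandwidth
[Figure: a server and peers 1..N attached to the network, each labeled with its upload and download bandwidth.]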
File distribution time: server-client
• Time for the server to send a copy to a single client
– F/us
• Time for the server to send N copies:
– NF/us
• Client i takes F/di time to download
• Time to distribute F to N clients using the client/server approach:
$d_{cs} = \max\{\, NF/u_s,\ F/\min_i d_i \,\}$
• This increases linearly in N (for large N)
File distribution time: P2P
• server must send one copy:
– F/us time
• client i download time
– F/di
• Total data to be downloaded
– NF
• fastest possible transfer rate: $u_s + \sum_i u_i$
$d_{P2P} = \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$
Can you make a schedule for the download that takes this amount of time?
Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us
[Figure: Minimum Distribution Time (vertical axis, 0 to 3.5) versus N (horizontal axis, 0 to 35), with one curve for P2P and one for Client-Server.]
Conclusion: P2P systems are scalable. But the load is distributed to all
users, so P2P users have more load than clients in the client-server model.
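The two distribution-time bounds can be evaluated directly for the example parameters. This is a small sketch: the function names are chosen for illustration, and the download rate is set equal to us, which is one value allowed by the condition dmin ≥ us.

```python
def client_server_time(F, us, downloads):
    """Lower bound on distribution time when only the server uploads."""
    N = len(downloads)
    return max(N * F / us, F / min(downloads))


def p2p_time(F, us, uploads, downloads):
    """Lower bound on distribution time when peers also upload."""
    N = len(downloads)
    return max(F / us, F / min(downloads), N * F / (us + sum(uploads)))


# Example from the slide: F/u = 1 hour and us = 10u.
u = 1.0
F = 1.0          # file size chosen so that F/u is 1 hour
us = 10 * u
d = us           # assumed download rate (any value >= us gives the same result)

for N in (5, 10, 20, 30):
    cs = client_server_time(F, us, [d] * N)
    p2p = p2p_time(F, us, [u] * N, [d] * N)
    print(f"N={N:2d}  client-server={cs:.2f} h  P2P={p2p:.2f} h")
```

With these values the client/server time grows as N/10 hours, while the P2P time N/(10 + N) stays below one hour for every N.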
Peer-to-peer Querying
• The file is transferred directly from a peer, but how do we find the file?
• Options
– Centralized directory
• Napster
• Single point of failure
• Performance bottleneck
• Target for the RIAA
• Always up
• Easy to find
• Easy protocol
– Query flooding
• Gnutella
• Hosts find other hosts and form a network of neighbors (overlay network)
• Search for a file (covered next)
• How to set up the network (bootstrap)?
– Have a central list of peers
– Have distributed lists of peers
– Search out a peer by scanning, like in the project
• Once the file is found,
– the host could respond directly to the searcher,
– or it could send the response along the reverse path.
– In the latter case, the peers along the way would learn where the file is located (cache) and could answer more quickly the next time the search is performed. But then we must worry about stale information.
Query Flooding State Diagram
Requesting peer
• User request for file: set AttemptCounter = 0
• AttemptCounter++; send out a request for the file to all neighbors; set Timer = 0; wait
• Reply from peer: inform user of file location
• Timer > TO: if AttemptCounter > MaxAttempts, inform user that query failed; else send the request again
Listening peer
• wait
• Request arrives: get request ID
• Have seen request before: return to wait
• Otherwise, check for file in directory
• File is in local dir: send response to peer that requested file
• Otherwise, send request to all neighbors
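The listening-peer behavior can be sketched as a flood over a toy overlay. This is an illustration under simplifying assumptions: the overlay graph, peer names, and file name are hypothetical, the flood is simulated in one process instead of by exchanging messages, and timers and retries are omitted.

```python
from collections import deque

# Hypothetical overlay network: peer -> list of neighbors.
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
# Hypothetical local directories: peer -> files it shares.
FILES = {"E": {"song.mp3"}}


def flood_query(origin, filename):
    """Flood a query from origin; return (path to a holder, messages sent)."""
    seen = {origin}                      # peers that already saw this request ID
    queue = deque([(origin, [origin])])
    messages = 0
    while queue:
        peer, path = queue.popleft()
        if filename in FILES.get(peer, set()):
            # A response could retrace this path in reverse.
            return path, messages
        for neighbor in OVERLAY[peer]:
            messages += 1
            if neighbor in seen:
                continue                 # duplicate request is dropped
            seen.add(neighbor)
            queue.append((neighbor, path + [neighbor]))
    return None, messages


path, messages = flood_query("A", "song.mp3")
print(path, messages)
```

The `seen` set plays the role of the request ID check: without it, the cycle A-B-D-C-A would forward the same query forever.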
Expanding ring
(hierarchical peer-to-peer network)
• KaZaA
• Not all peers are equal: super peers (?)
– Super peers (group leaders) have higher bit-rate connections, are more stable, etc.
• Peers connect to group leaders
• The group leaders keep a list of the files shared by all their child peers.
• Group leaders connect to a small number of other group leaders
• A child host asks its group leader for a file. If the group leader does not know where it is, it floods the network of group leaders. The response from other group leaders follows the reverse path to the asking group leader (so other leaders can cache the response)
• A file is identified with an ID (e.g., MD5), computed by a hash function that takes a string (the file) and produces a fixed-size ID. A small change in the file causes a large change in the ID. It is intended to be computationally infeasible to construct two files that have the same ID, so the ID is a fingerprint. (Practical collisions are now known for MD5 itself.)
• Since files are ID-ed, multiple copies of the same file can be found and these copies can be downloaded from multiple hosts in parallel.
• Note that if you are downloading while others are uploading, the uploading slows down the downloading, but only a little bit.
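The fingerprint idea can be tried with Python's `hashlib`. The sketch below uses MD5 only because the slide names it; the helper `file_id` and the sample byte strings are assumptions for illustration.

```python
import hashlib


def file_id(data: bytes) -> str:
    """Return the MD5 digest of the file contents as a hex string."""
    return hashlib.md5(data).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"
changed = b"The quick brown fox jumps over the lazy cog"   # one letter differs

id_original = file_id(original)
id_changed = file_id(changed)

# Count how many of the 32 hex digits differ between the two IDs.
differing = sum(1 for x, y in zip(id_original, id_changed) if x != y)
print(id_original)
print(id_changed)
print("hex digits that differ:", differing)
```

Two peers holding identical copies compute the same ID regardless of file name, which is what lets a downloader fetch pieces of one file from several hosts in parallel.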
BitTorrent
• Centralized P2P
– A centralized server, or tracker, tracks the clients
involved in the P2P transfer
– This is similar to Napster
– Companies that host these sites get sued and are attacked by DDoS
• Components of BitTorrent System
– Torrent Files
– Trackers
– Seeders
– Peers
Torrent File
• Required to download
• Can be found on web sites or sent by email
• Contains information about the file and the tracker
– Announce: the URL of the tracker
– Creation date
– Info
• Length of file
• Name of file
• Length of each piece (except for the last)
• Pieces: the 20B SHA-1 value of each piece
– Note, the number of pieces can be determined by counting the number of bytes in the pieces field and dividing by 20
• If the download contains multiple files, then a single
torrent file will contain information about all files.
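The pieces field and the piece count can be illustrated with a short sketch. The function names and the tiny piece length are chosen for the example; this builds only the concatenated SHA-1 values, not a complete torrent file.

```python
import hashlib

SHA1_LEN = 20  # bytes per SHA-1 value


def build_pieces_field(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 value of each piece of the file."""
    pieces = b""
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]   # last piece may be shorter
        pieces += hashlib.sha1(piece).digest()
    return pieces


def piece_count(pieces: bytes) -> int:
    """Recover the number of pieces from the length of the pieces field."""
    return len(pieces) // SHA1_LEN


data = bytes(1000)                       # a 1000-byte file of zeros
pieces = build_pieces_field(data, 256)   # three full pieces plus a 232-byte one
print(len(pieces), piece_count(pieces))
```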
Tracker
• The client makes an HTTP GET request to the tracker, specifying the SHA-1 hash of the file to be downloaded
– The request also includes the number of bytes
downloaded and the number uploaded
– If the client does not upload enough, the tracker might
not provide a reply
• The reply contains
– The time when the tracker information should be
refreshed (usually 30 minutes)
– A list of the peers
• IP address and port (usually 6881)
• Peer ID
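A tracker request of this kind could be assembled as follows. This is a sketch with assumptions: the tracker URL and the hashed bytes are placeholders, and the query parameter names (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) are the ones commonly used by BitTorrent trackers, which the slides do not spell out. No request is actually sent.

```python
import hashlib
from urllib.parse import urlencode, urlsplit, parse_qs


def announce_url(tracker, info_hash, peer_id, port, uploaded, downloaded, left):
    """Build the GET URL a client would request from the tracker."""
    params = {
        "info_hash": info_hash,     # 20-byte SHA-1 value, percent-encoded
        "peer_id": peer_id,         # 20-byte identifier chosen by the client
        "port": port,
        "uploaded": uploaded,       # bytes uploaded so far
        "downloaded": downloaded,   # bytes downloaded so far
        "left": left,               # bytes still needed
    }
    return tracker + "?" + urlencode(params)


url = announce_url(
    "http://tracker.example.org/announce",        # hypothetical tracker
    hashlib.sha1(b"placeholder info").digest(),   # stand-in for the real hash
    b"-XX0001-123456789012",                      # made-up 20-byte peer ID
    6881, 0, 0, 1000,
)
print(url)
```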
File distribution with BitTorrent
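[Figure: a new peer contacts the tracker and then exchanges chunks with other peers.]
• tracker: tracks peers participating in torrent
• peer obtains list of peers from the tracker
• peers trade chunks with each other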
BitTorrent (1)
• file divided into 256KB chunks.
• peer joining torrent:
– has no chunks, but will accumulate them over time
– registers with tracker to get list of peers, connects to subset of
peers (“neighbors”)
• while downloading, peer uploads chunks to other peers.
• peers may come and go
• once peer has entire file, it may (selfishly) leave or (altruistically)
remain
BitTorrent (2)
Pulling Chunks
• at any given time, different
peers have different subsets
of file chunks
• periodically, a peer (Alice)
asks each neighbor for list of
chunks that they have.
• Alice sends requests for her
missing chunks
– rarest first
– So rarest chunks are
spread, and chunks are
uniformly common
Sending Chunks: tit-for-tat
• Alice sends chunks to four
neighbors currently sending her
chunks at the highest rate
– re-evaluate top 4 every 10
secs
• every 30 secs: randomly select
another peer, starts sending
chunks
– newly chosen peer may join
top 4
– “optimistically unchoke”
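Both policies can be sketched as two small selection functions. This is a simplified illustration: the peer names, chunk sets, and rates are invented, and the 10-second and 30-second timers are not modeled; each call represents one re-evaluation.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice is missing, rarest among her neighbors first."""
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    # Ties are broken by chunk index to keep the order deterministic.
    return sorted(missing, key=lambda c: (counts[c], c))


def choose_unchoked(rates, top_n=4, rng=random):
    """Pick the top_n fastest senders plus one random optimistic unchoke."""
    top = sorted(rates, key=rates.get, reverse=True)[:top_n]
    others = [p for p in rates if p not in top]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


neighbor_chunks = {"Bob": {0, 1, 2}, "Carol": {1, 2}, "Dave": {2, 3}}
print(rarest_first({2}, neighbor_chunks))

# Rates at which each neighbor is currently sending chunks to Alice.
rates = {"Bob": 50, "Carol": 10, "Dave": 40, "Eve": 30, "Frank": 20, "Grace": 5}
top, optimistic = choose_unchoked(rates)
print(top, optimistic)
```

The optimistic choice is what gives a newly joined peer, who has nothing to trade yet, a chance to receive its first chunks.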
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate,
can find better trading
partners & get file faster!
BitTorrent Pros/Cons
• Centralized server
• Slow to get the transfer started
– Web transfers start much faster and will achieve a
sustained rate
• Peers must upload
– Some peers might not be in position to upload (e.g.,
mobile phone)
• Chunks can be corrupted
– HBO distributed fake chunks
– Since the SHA-1 hash does not match what is given
in the Torrent File, the chunk is dropped after it is
downloaded
• This wastes bandwidth and can greatly increase download
time
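The corruption check described above amounts to one hash comparison per downloaded chunk. A minimal sketch, assuming the pieces field is the concatenation of 20-byte SHA-1 values; the function name and the sample chunks are made up.

```python
import hashlib

SHA1_LEN = 20


def chunk_is_valid(chunk: bytes, index: int, pieces: bytes) -> bool:
    """Compare a downloaded chunk against the hash listed in the torrent file."""
    expected = pieces[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(chunk).digest() == expected


# Pieces field as it would appear in the torrent file for a two-chunk download.
good_chunks = [b"first chunk of the file", b"second chunk of the file"]
pieces = b"".join(hashlib.sha1(c).digest() for c in good_chunks)

fake_chunk = b"bogus data sent by an attacker"
print(chunk_is_valid(good_chunks[1], 1, pieces))   # genuine chunk
print(chunk_is_valid(fake_chunk, 1, pieces))       # fake chunk is rejected
```

The check can only run after the whole chunk has arrived, which is why fake chunks still cost the downloader bandwidth and time.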