Transcript slides
Chapter 1
Introduction
A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers).
They’re in PowerPoint form so you can add, modify, and delete slides
(including this one) and slide content to suit your needs. They obviously
represent a lot of work on our part. In return for use, we only ask the
following:
If you use these slides (e.g., in a class) in substantially unaltered form,
that you mention their source (after all, we’d like people to use our book!)
If you post any slides in substantially unaltered form on a www site, that
you note that they are adapted from (or perhaps identical to) our slides, and
note our copyright of this material.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2005
J.F. Kurose and K.W. Ross, All Rights Reserved
Computer Networking:
A Top Down Approach
Featuring the Internet,
3rd edition.
Jim Kurose, Keith Ross
Addison-Wesley, July
2004.
What’s the Internet: “nuts and bolts” view
• millions of connected computing devices: hosts = end systems
• running network apps
• communication links
– fiber, copper, radio, satellite
– transmission rate = bandwidth
• routers: forward packets (chunks of data)
[Figure labels: router, server, mobile, workstation, local ISP, regional ISP, company network]
What’s the Internet: “nuts and bolts” view
• protocols control sending, receiving of msgs
– e.g., TCP, IP, HTTP, FTP, PPP
• Internet: “network of networks”
– loosely hierarchical
– public Internet versus private intranet
• Internet standards
– RFC: Request for comments
– IETF: Internet Engineering Task Force
What’s a protocol?
human protocols:
• “what’s the time?”
• “I have a question”
• introductions
… specific msgs sent
… specific actions
taken when msgs
received, or other
events
network protocols:
• machines rather than
humans
• all communication
activity in Internet
governed by protocols
protocols define format,
order of msgs sent and
received among network
entities, and actions taken
on msg transmission,
receipt
What’s a protocol?
a human protocol and a computer network protocol:
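[Figure: two exchanges drawn side by side, time running downward]
human protocol: Hi / Hi / Got the time? / 2:00
network protocol: TCP connection request / TCP connection response / Get http://www.awl.com/kurose-ross / <file>
Q: Other human protocols?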
Some network apps
• E-mail
• Web
• Instant messaging
• Remote login
• P2P file sharing
• Multi-user network games
• Streaming stored video clips
• Internet telephone
• Real-time video conference
• Massive parallel computing
“Cool” internet appliances
Web-enabled toaster +
weather forecaster
IP picture frame
http://www.ceiva.com/
World’s smallest web server
http://www-ccs.cs.umass.edu/~shri/iPic.html
Internet phones
Internet protocol stack
• application: (L7 & L6 of OSI) supporting
network applications
– FTP, SMTP, HTTP
• transport: (L5 & L4 of OSI) host-host data
transfer
– TCP, UDP
• network: routing of datagrams from
source to destination
– IP, routing protocols
• link: data transfer between neighboring
network elements
– PPP, Ethernet
• physical: bits “on the wire”
Encapsulation
[Figure: message M travels from source to destination through a switch and a router]
source (application, transport, network, link, physical):
– application: message M
– transport: segment Ht M
– network: datagram Hn Ht M
– link: frame Hl Hn Ht M
switch (link, physical): frame Hl Hn Ht M
router (network, link, physical): datagram Hn Ht M, frame Hl Hn Ht M
destination (application, transport, network, link, physical): Hl Hn Ht M, Hn Ht M, Ht M, M going up the stack
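The encapsulation figure can be sketched in a few lines of Python. This is only an illustration: the headers are modelled as the text labels Hl, Hn, and Ht from the slide, whereas real headers are binary structures, and the function names are made up for this example.

```python
# Hypothetical text labels standing in for real (binary) headers.
HEADERS = {"transport": "Ht", "network": "Hn", "link": "Hl"}

def encapsulate(message):
    """Wrap an application message on its way down the source's stack."""
    segment = [HEADERS["transport"], message]   # transport layer adds Ht
    datagram = [HEADERS["network"]] + segment   # network layer adds Hn
    frame = [HEADERS["link"]] + datagram        # link layer adds Hl
    return frame

def decapsulate(frame):
    """Strip headers in reverse order on the way up the destination's stack."""
    assert frame[0] == HEADERS["link"]
    datagram = frame[1:]
    assert datagram[0] == HEADERS["network"]
    segment = datagram[1:]
    assert segment[0] == HEADERS["transport"]
    return segment[1]

frame = encapsulate("M")
print(frame)               # ['Hl', 'Hn', 'Ht', 'M']
print(decapsulate(frame))  # M
```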
Background: Addressing
• For a process to receive
messages, it must have an
identifier
• A host has a unique 32-bit
IP address
• Q: does the IP address of
the host on which the
process runs suffice for
identifying the process?
• Answer: No, many
processes can be running
on same host
• Identifier includes
both the IP address and
port numbers
associated with the
process on the host.
• Example port
numbers:
– HTTP server: 80
– Mail server: 25
• More on this later
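A small sketch of why the port number is needed, assuming the loopback address 127.0.0.1 is available: two sockets on the same host share one IP address and are told apart only by their ports. Port 0 is used here so the operating system picks free ports, rather than well-known ones such as 80 or 25.

```python
import socket

# Two "processes" on the same host: same IP address, different ports.
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
b.bind(("127.0.0.1", 0))

addr_a = a.getsockname()  # (IP address, port) identifies the endpoint
addr_b = b.getsockname()
print(addr_a, addr_b)

a.close()
b.close()
```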
Web and HTTP
First some jargon
• Web page consists of objects
• Object can be HTML file, JPEG image, Java
applet, audio file,…
• Each object is addressable by a URL
• Web page consists of base HTML-file which
includes several referenced objects
• Example URL:
www.someschool.edu/someDept/pic.gif
host name: www.someschool.edu
path name: /someDept/pic.gif
HTTP overview
• HTTP: hypertext transfer protocol
• Web’s application layer protocol
• client/server model
– client: browser that requests, receives, “displays” Web objects
– server: Web server sends objects in response to requests
• HTTP 1.0: RFC 1945
– http://www.rfc-editor.org/rfc/rfc1945.txt
• HTTP 1.1: RFC 2068
– http://www.rfc-editor.org/rfc/rfc2068.txt
• HTTP state management (cookies): RFC 2109
– http://www.rfc-editor.org/rfc/rfc2109.txt
[Figure: PC running Explorer and Mac running Navigator as clients; server running Apache Web server]
HTTP overview (continued)
Uses TCP:
• client initiates bi-directional
TCP connection (via
socket) to server, port 80
• server accepts TCP
connection from client
• HTTP messages
(application-layer protocol
messages) exchanged
between browser (HTTP
client) and Web server
(HTTP server)
– Messages encoded in text
• TCP connection closed
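The four steps above can be tried without a real Web server. The sketch below is a toy, not a real HTTP implementation: it runs a one-shot server on the loopback interface in a background thread, uses an OS-chosen port instead of port 80, and sends a fixed response whose body ("hello") is invented for the example.

```python
import socket
import threading

def serve_once(listener):
    """Accept one TCP connection, read one request, send a fixed response."""
    conn, _ = listener.accept()                # server accepts TCP connection
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:         # read until end of headers
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello")
    # leaving the with-block closes the TCP connection

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))                # any free port, not port 80
listener.listen(1)
port = listener.getsockname()[1]
server = threading.Thread(target=serve_once, args=(listener,))
server.start()

# client initiates the TCP connection and sends a text-encoded request
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
    response = b""
    while True:                                # read until server closes
        chunk = client.recv(4096)
        if not chunk:
            break
        response += chunk

server.join()
listener.close()
print(response.decode("ascii").splitlines()[0])
```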
HTTP is “stateless”
• server maintains no
information about
past client requests
aside
Protocols that maintain
“state” are complex!
• past history (state) must
be maintained
• if server/client crashes,
their views of “state” may
be inconsistent, must be
reconciled
URL
(Uniform Resource Locator)
Way of identifying and accessing a web
page:
Example
http://www.lclark.edu/~jmache/index.html
“how”: http (type of transaction, i.e. protocol)
“where”: www.lclark.edu (address or name of server)
“what”: /~jmache/index.html (resource requested)
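As a quick check of the three parts, Python's standard urllib.parse module can split the example URL. The mapping of its field names (scheme, netloc, path) onto “how”, “where”, and “what” is this example's reading of the slide.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.lclark.edu/~jmache/index.html")

how = parts.scheme     # type of transaction (protocol)
where = parts.netloc   # address or name of server
what = parts.path      # resource requested

print(how)    # http
print(where)  # www.lclark.edu
print(what)   # /~jmache/index.html
```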
URI
(Uniform Resource Identifier)
Identifies a resource and includes URLs, but
broader in context.
See http://www.w3.org/Addressing/ for more
details
Mark-up Languages
• A way of describing information in a document.
• Standard Generalized Mark-Up Language
(SGML) - a specification for a mark-up
language ratified in 1986.
• Key aspect - using pairs of tags that surround
information - a begin tag <tag_name> and a
matching end tag </tag_name> .
Example
<title> CS 393 home page </title>
HyperText Markup Language
(HTML)
A mark-up language used in web pages.
“Hypertext” refers to the text’s ability to link
to other documents.
“Markup” refers to providing information to
tell browser how to display page and other
things.
HTML page format
<HTML>    Signifies an HTML document
<HEAD>    Head section includes information about document - “metadata”
</HEAD>
<BODY>    Body section contains text and references to images to be displayed
</BODY>
</HTML>   End of document
HTML Tags
• Tags specify details such as type of text.
Example
<B> to start bold text
</B> to end bold text
<I> to start italic text
</I> to end italic text
HTML page
<HTML>
<HEAD>
</HEAD>
<BODY>
Hello world
<I> My name is <B>Jens</B> </I>
</BODY>
</HTML>
Question
What does the previous HTML page display?
Answer
Hello world My name is Jens
(“My name is” in italic, “Jens” in bold italic)
HTML page
<HTML>
<HEAD>
</HEAD>
<BODY>
<BR>    Line break tag - some tags in HTML are not in pairs
Hello world <P>
<I> My name is <B>Jens</B> </I>
</BODY>
</HTML>
Attributes
Many tags can have attributes which specify
something about the body between tag pair.
Example
Attributes
<FONT COLOR=red SIZE=3 FACE=Times>
This text is displayed in red in Times font,
about 12 pt.
</FONT>
HTTP request message
• two types of HTTP messages: request, response
• HTTP request message:
– ASCII (human-readable format)
request line (GET, POST, HEAD commands):
GET /somedir/page.html HTTP/1.1
header lines:
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra carriage return, line feed)
Carriage return, line feed indicates end of message
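The layout above (request line, header lines, then an extra carriage return and line feed) can be assembled with a small helper. The function name build_request is made up for this sketch; the values are the ones from the slide.

```python
def build_request(method, path, headers):
    """Assemble an HTTP/1.1 request: request line, header lines, blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends in CRLF; the extra CRLF marks the end of the message.
    return "\r\n".join(lines) + "\r\n\r\n"

request = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
print(request)
```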
HTTP request message: general format
HTTP request line (methods)
HTTP/1.0
• GET
– Return object specified by
URI
• POST
– Send data to server (forms)
• HEAD
– asks server to leave
requested object out of
response
– Return headers only of GET
response
HTTP/1.1
• GET, POST, HEAD
• PUT
– uploads file in entity body
to path specified in URL
field
• DELETE
– deletes file specified in the
URL field
• OPTIONS, TRACE,
CONNECT
HTTP request line (cont.)
• URI
– Object to retrieve
• E.g. http://www.cs.pdx.edu/index.html with a proxy
• E.g. /index.html if no proxy
• HTTP version
– Version being used
– HTTP 1.1
• Host: header required
• Connection: header supported
Common HTTP request headers
• Accept
– Acceptable document types, encodings, languages, character
sets
• If-Modified-Since
– For use with caching
• Referer
– URL which caused this page to be requested
• User-Agent
• Host
– For multiple web sites hosted on same server
• Connection
– Keep connection alive for subsequent request or close
Other HTTP request headers
• Authorization
– Authentication info for HTTP authentication
• From
– User email (when privacy is disabled)
Rest of HTTP request
Blank-line
Separate request headers from POST information
End of request
Body
If POST, send POST information
Handling user input (forms)
GET method:
• Input is uploaded in
URL field of request
line
POST method:
• Input is uploaded to
server in entity body
GET /search?name=george&animal=monkey HTTP/1.1
Host: www.somesite.com

POST /search HTTP/1.1
Host: www.somesite.com
Content-type: application/x-www-form-urlencoded

name=george&animal=monkey
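A sketch of how the same form input ends up in the two places. It uses urlencode from Python's standard library. The leading slash on the path and the Content-length header are assumptions added here; a server generally needs the length to know where the entity body ends.

```python
from urllib.parse import urlencode

fields = {"name": "george", "animal": "monkey"}
encoded = urlencode(fields)  # name=george&animal=monkey

# GET: input travels in the URL field of the request line.
get_request = (
    f"GET /search?{encoded} HTTP/1.1\r\n"
    "Host: www.somesite.com\r\n"
    "\r\n"
)

# POST: input travels in the entity body, after the blank line.
post_request = (
    "POST /search HTTP/1.1\r\n"
    "Host: www.somesite.com\r\n"
    "Content-type: application/x-www-form-urlencoded\r\n"
    f"Content-length: {len(encoded)}\r\n"  # assumed header, not on the slide
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```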
HTTP response message
status line (protocol, status code, status phrase):
HTTP/1.1 200 OK
header lines:
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html
(blank line)
data, e.g., requested HTML file:
data data data data data ...
HTTP response message: general format
HTTP response format
• Status-line
– HTTP version
– 3 digit response code
• 1XX – informational
• 2XX – success
• 3XX – redirection
• 4XX – client error
• 5XX – server error
– Reason phrase
HTTP response status codes
A few sample codes:
200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in
this message (Location:)
400 Bad Request
– request message not understood by server
404 Not Found
– requested document not found on this server
505 HTTP Version Not Supported
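The first digit of the code gives its class, which a short sketch can make concrete. The helper names below are made up for the example; the status lines are built from the sample codes above.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split 'HTTP/1.1 200 OK' into version, integer code, and reason phrase."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Map a 3-digit code to its class using the first digit."""
    return STATUS_CLASSES[code // 100]

for line in [
    "HTTP/1.1 200 OK",
    "HTTP/1.1 301 Moved Permanently",
    "HTTP/1.1 404 Not Found",
    "HTTP/1.1 505 HTTP Version Not Supported",
]:
    version, code, reason = parse_status_line(line)
    print(code, reason, "->", status_class(code))
```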
Common HTTP response headers
• Server
– server software
• Content-Encoding
– x-gzip
• Content-Length
• Content-Type
• Expires
• Last-Modified
• ETag
Other HTTP response headers
• Location
– redirection
• WWW-Authenticate
– request for authentication
• Allow
– list of methods supported (GET, HEAD, etc)
Rest of HTTP response
Blank-line
Separate headers from data
Body
Data being returned to client
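Because the blank line separates headers from data, a response can be split at the first empty line. The sketch below assumes the whole response is already in memory; the sample reuses headers from the earlier response slide, with a shortened body and a matching Content-Length chosen for the example.

```python
def split_response(raw):
    """Split raw response bytes into status line, header dict, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")     # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body

raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Date: Thu, 06 Aug 1998 12:00:15 GMT\r\n"
    b"Server: Apache/1.3.0 (Unix)\r\n"
    b"Content-Length: 9\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"data data"
)
status_line, headers, body = split_response(raw)
print(status_line)
print(headers["content-type"], len(body))
```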
HTTP headers by function
• Authentication
– Client
• Authorization, Proxy-Authorization
– Server
• WWW-authenticate,
Proxy-Authenticate
• User, server tracking
– Client
• Cookie, Referer, From,
User-agent
– Server
• Set-cookie, Server
• Caching
– General
• Cache-control, Pragma
– Client
• If-Modified-Since, If-Unmodified-Since, If-Match
– Server
• Last-Modified, Expires,
ETag, Age
Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
telnet cis.poly.edu 80
Opens TCP connection to port 80 (default HTTP server port) at cis.poly.edu. Anything typed in is sent to port 80 at cis.poly.edu
2. Type in a GET HTTP request:
GET /~ross/ HTTP/1.1
Host: cis.poly.edu
By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to HTTP server
3. Look at response message sent by HTTP server!
User-server state: cookies
Many major Web sites use
cookies
Four components:
1) cookie header line of HTTP
response message
Set-cookie:
2) cookie header line in HTTP
request message
Cookie:
3) cookie file kept on user’s
host, managed by user’s
browser
4) back-end database at Web
site
Example:
– Susan always accesses the Internet from the same PC
– She visits a specific e-commerce site for the first time
– When the initial HTTP request arrives at the site, the site creates a unique ID and creates an entry in its backend database for that ID
Cookies: keeping “state” (cont.)
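[Figure: client/server exchange; the client’s cookie file initially holds “ebay: 8734”]
client → server: usual http request msg
server creates ID 1678 for user
server → client: usual http response + Set-cookie: 1678
client’s cookie file now holds: amazon: 1678, ebay: 8734
client → server: usual http request msg + cookie: 1678
server: cookie-specific action
server → client: usual http response msg
one week later:
client’s cookie file still holds: amazon: 1678, ebay: 8734
client → server: usual http request msg + cookie: 1678
server: cookie-specific action
server → client: usual http response msg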
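The exchange can be mimicked with two toy classes. Everything here is a simplification: headers are plain dictionaries, the cookie is a bare ID as on the slide (real cookies are name=value pairs with attributes), and the class names and the "visits" counter are invented for this example.

```python
import itertools

class Site:
    """Toy server: hands out IDs via Set-cookie and keeps a back-end 'database'."""
    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.database = {}                       # cookie ID -> per-user state

    def handle(self, request_headers):
        cookie = request_headers.get("Cookie")
        response_headers = {}
        if cookie is None or cookie not in self.database:
            cookie = str(next(self._ids))        # first visit: create unique ID
            self.database[cookie] = {"visits": 0}
            response_headers["Set-cookie"] = cookie
        self.database[cookie]["visits"] += 1     # cookie-specific action
        return response_headers

class Browser:
    """Toy client: keeps a cookie file and replays cookies per site."""
    def __init__(self):
        self.cookie_file = {}                    # site name -> cookie ID

    def request(self, name, site):
        headers = {}
        if name in self.cookie_file:
            headers["Cookie"] = self.cookie_file[name]
        response = site.handle(headers)
        if "Set-cookie" in response:
            self.cookie_file[name] = response["Set-cookie"]
        return response

amazon = Site(first_id=1678)
browser = Browser()
browser.cookie_file["ebay"] = "8734"             # left over from an earlier visit

first = browser.request("amazon", amazon)        # response carries Set-cookie
second = browser.request("amazon", amazon)       # request carries Cookie
print(first, second, browser.cookie_file)
```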
Cookies (continued)
What cookies can bring:
• authorization
• shopping carts
• Site preferences
• recommendations
• user session state
(Web e-mail)
aside
Cookies and privacy:
• cookies permit sites to
learn a lot about you
• you may supply name
and e-mail to sites
• search engines use
redirection & cookies
to learn yet more
• advertising companies
obtain info across sites
Web caches (proxy server)
Goal: satisfy client request without involving origin
server (i.e. do not send content that has not changed)
Why Web caching?
• Reduce response time for client request.
• Reduce traffic on an institution’s access link.
• Reduce load on servers.
• Enables “poor” content providers to effectively deliver content (but so does P2P file sharing)
Info on web caching
http://www.ircache.net/
http://www.squid.org
ICP
http://www.rfc-editor.org/rfc/rfc2186.txt
http://www.rfc-editor.org/rfc/rfc2187.txt
More about Web caching
• Browser sends all HTTP requests to cache
– object in cache: cache returns object
– else cache requests object from origin server, then returns object to client
• Done directly at client
– Via browser web cache
• Along path from client to origin server
– Via proxy web cache
– Proxy acts as both client and server
– Typically cache is installed by ISP (university, company,
[Figure labels: client, proxy server, origin server]
Caching example
Assumptions
• average object size = 100,000
bits
• avg. request rate from
institution’s browsers to
origin servers = 15/sec
• delay from institutional
router to any origin server
and back to router = 2 sec
Consequences
• utilization on LAN = 15%
• utilization on access link = 100%
• total delay = Internet delay +
access delay + LAN delay
= 2 sec + minutes + milliseconds
[Figure labels: origin servers, public Internet, 1.5 Mbps access link, institutional network, 10 Mbps LAN, institutional cache]
Caching example (cont)
Possible solution
• increase bandwidth of access
link to, say, 10 Mbps
Consequences
• utilization on LAN = 15%
• utilization on access link = 15%
• Total delay = Internet delay +
access delay + LAN delay
= 2 sec + msecs + msecs
• often a costly upgrade
[Figure labels: origin servers, public Internet, 10 Mbps access link, institutional network, 10 Mbps LAN, institutional cache]
Caching example (cont)
Install cache
• suppose hit rate is .4
Consequence
• 40% requests will be satisfied almost immediately
• 60% requests satisfied by origin server
• utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
• total avg delay = Internet delay + access delay + LAN delay = .6*(2.01) secs + .4*milliseconds < 1.4 secs
[Figure labels: origin servers, public Internet, 1.5 Mbps access link, institutional network, 10 Mbps LAN, institutional cache]
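The numbers in the three caching slides can be reproduced with a little arithmetic. The one value not given on the slides is the delay for a cache hit ("milliseconds"); the sketch assumes 10 msec for it.

```python
OBJECT_BITS = 100_000       # average object size
REQUESTS_PER_SEC = 15       # avg. request rate from institution's browsers
INTERNET_DELAY = 2.0        # secs, router to origin server and back
LAN_BPS = 10_000_000        # 10 Mbps LAN
ACCESS_BPS = 1_500_000      # 1.5 Mbps access link

offered = OBJECT_BITS * REQUESTS_PER_SEC            # 1.5 Mbps of traffic

def utilization(load_bps, link_bps):
    """Fraction of a link's capacity consumed by the offered load."""
    return load_bps / link_bps

lan_util = utilization(offered, LAN_BPS)            # 15%
access_util = utilization(offered, ACCESS_BPS)      # 100%
upgraded_util = utilization(offered, 10_000_000)    # 15% after the upgrade

HIT_RATE = 0.4
cached_access_util = utilization((1 - HIT_RATE) * offered, ACCESS_BPS)  # 60%

MISS_ACCESS_DELAY = 0.01    # "say 10 msec"
HIT_DELAY = 0.01            # assumed value for "milliseconds"
avg_delay = ((1 - HIT_RATE) * (INTERNET_DELAY + MISS_ACCESS_DELAY)
             + HIT_RATE * HIT_DELAY)

print(f"LAN {lan_util:.0%}, access link {access_util:.0%}, "
      f"with cache {cached_access_util:.0%}, avg delay {avg_delay:.2f} s")
```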
Conditional GET
• Goal: don’t send object if cache has up-to-date cached version
• cache: specify date of cached copy in HTTP request
If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified
[Figure: two cache/server exchanges]
object not modified:
cache → server: HTTP request msg, If-modified-since: <date>
server → cache: HTTP response, HTTP/1.0 304 Not Modified
object modified:
cache → server: HTTP request msg, If-modified-since: <date>
server → cache: HTTP response, HTTP/1.0 200 OK <data>
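A minimal sketch of the server-side decision, assuming the server knows when the object last changed. The function name and the placeholder body are made up; date parsing uses parsedate_to_datetime from Python's standard email.utils module.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_get(last_modified, if_modified_since=None):
    """Return (status line, body) for an object last changed at last_modified."""
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached:          # cached copy is up to date
            return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", b"<data>"      # send the object itself

last_modified = datetime(2001, 1, 29, 17, 54, 18, tzinfo=timezone.utc)

print(conditional_get(last_modified, "Mon, 29 Jan 2001 17:54:18 GMT"))
print(conditional_get(last_modified, "Sun, 28 Jan 2001 17:54:18 GMT"))
```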
HTTP caching
• Additional caching methods
– ETag with If-Match / If-None-Match
• HTTP 1.1 has file signature as well
• When/how often should the original be checked
for changes?
– Check every time?
– Check each session? Day? Etc?
– Use Expires header
• If no Expires, often use Last-Modified as estimate
Example Cache Check Request
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
If-None-Match: "7a11f-10ed-3a75ae4a"
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.cs.pdx.edu
Connection: Keep-Alive
Example Cache Check Response
HTTP/1.1 304 Not Modified
Date: Tue, 27 Mar 2001 03:50:51 GMT
Server: Apache/1.3.14 (Unix) (RedHat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
ETag: "7a11f-10ed-3a75ae4a"
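The request and response above agree because the ETag sent in If-None-Match matches the server's current one. A rough sketch of that comparison follows; it ignores the distinction between strong and weak validators, and the function name and the second tag value are invented for the example.

```python
def check_etag(current_etag, if_none_match):
    """Return the status code a server might send for an If-None-Match request."""
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    if "*" in candidates or current_etag in candidates:
        return 304          # client's copy is current: Not Modified
    return 200              # copy is stale: send the object

current = '"7a11f-10ed-3a75ae4a"'
print(check_etag(current, '"7a11f-10ed-3a75ae4a"'))  # 304
print(check_etag(current, '"some-older-tag"'))       # 200
```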