Chapter 28
Applications: World Wide Web (HTTP)
Guoying Yang
Liangqi Guo
Song Ye
Introduction
World Wide Web (WWW)
HTTP is the primary protocol used to transfer a
Web page from a server to a Web browser.

Importance of The Web
History
* During the early days of the Internet, FTP data
transfers accounted for one third of Internet
traffic.
* By 1995, Web traffic had become the largest
consumer of Internet backbone bandwidth.
More people know about and use the Web.
Most companies have Web sites to do business.

Architectural Components
Web pages: the Web consists of a large set of
documents that are accessible over the Internet.
Each Web page is classified as a hypermedia
document.
* Suffix media: indicates that a document can
contain items other than text.
* Prefix hyper: indicates that a document can
contain selectable links that refer to other,
related documents.

Architectural Components
A Web browser is an application that a user
invokes to access and display a Web page.
A Web server obtains a copy of the specified page
and returns it in response to the client’s request.
HyperText Markup Language (HTML)
* Tags: give guidelines for display. Some tags
come in pairs that apply to all items between the
pair.
* For example: <center> ... </center>

Uniform Resource Locators
Uniform Resource Locator (URL)
* Each Web page is assigned a unique name (URL).
* A URL that follows the http scheme has the
following form:
http:// hostname [:port] / path [; parameters] [? query]
* port: an optional protocol port number.
* path: a string that identifies one particular
document on the server.
* parameters: an optional string supplied by the
client.
* ?query: an optional string used when the browser
sends a question.
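
The URL form above can be taken apart with Python's standard urllib.parse module. This is only a minimal sketch: the port, parameters, and query in the sample URL are made up for illustration and do not appear on the slides.

```python
from urllib.parse import urlparse

# Hypothetical URL: the ":80", ";type=a" and "?year=2000" parts are
# illustrative assumptions, added only to show every optional field.
url = "http://www.cs.purdue.edu:80/people/comer;type=a?year=2000"
parts = urlparse(url)

# Map the parsed result onto the field names used in the URL form above.
url_fields = {
    "scheme": parts.scheme,
    "hostname": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "parameters": parts.params,
    "query": parts.query,
}

for name, value in url_fields.items():
    print(f"{name}: {value}")
```

When the optional port is omitted, parts.port is None and the client is expected to fall back to the default port for the scheme.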

Uniform Resource Locators
The absolute form of a URL
http://www.cs.purdue.edu/people/comer
The relative URL
* Used once communication has been established
with a specific server.
* Omits the address of the server.
* For example: only the string /people/comer/ is
needed to specify the document named by the
absolute URL above.
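
One way to see how a relative URL is combined with the server address is urljoin from Python's standard library. The sketch below assumes the absolute URL from this slide serves as the base.

```python
from urllib.parse import urljoin

# Base: the absolute URL from the slide. Relative: the server-less form.
base = "http://www.cs.purdue.edu/people/comer"
relative = "/people/comer/"

# urljoin keeps the scheme and host of the base and takes the path
# from the relative URL.
resolved = urljoin(base, relative)
print(resolved)
```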

An Example Document
<HTML>
The author of this text is
<A HREF="http://nas.cl.uh.edu/perkins"> TCP/IP </A>
</HTML>
Hypertext Transfer Protocol
What is HTTP?
The protocol used for communication
between a browser and a Web server, or
between intermediate machines and
Web servers.
HTTP has the following characteristics:
Hypertext Transfer Protocol
Application Level.
Request/Response.
Stateless.
Bi-Directional Transfer.
Capability Negotiation.
Support For Caching.
Support For Intermediaries.

HTTP GET Request
The browser sends a GET request, to which the server
responds by sending the requested item.
Browser
Sends a GET command followed by a URL and an
HTTP version number.
Examples:
GET http://www.cs.purdue.edu/people/comer/ HTTP/1.1
GET /people/comer/ HTTP/1.0

Server
Responds by sending a copy of the page.
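
The request line described above can be assembled by hand. The following is a rough sketch, not a full client: the helper names build_get_request and parse_status_line are hypothetical, and the Host header is an addition that HTTP/1.1 requires but the slide does not show. No network connection is opened.

```python
def build_get_request(host: str, path: str, version: str = "HTTP/1.1") -> bytes:
    """Return the bytes of a minimal GET request (hypothetical helper)."""
    request_line = f"GET {path} {version}"
    # The Host header is an assumption added here; it is not on the slide.
    lines = [request_line, f"Host: {host}", "", ""]
    # Lines end with CRLF, and an empty line terminates the header block.
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str):
    """Split a response status line into (version, code, reason)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.cs.purdue.edu", "/people/comer/")
print(request.decode("ascii"))
print(parse_status_line("HTTP/1.1 200 OK"))
```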
Some Useful URLs
http://www.w3.org/Protocols/
http://www.w3.org/Protocols/rfc2616/rfc2616.html
http://www.w3.org/Library/Examples/
http://www.w3.org/Library/User/Applications.html
Error Messages
When a Web server receives an illegal request, it
usually generates error messages in valid HTML.
The browser will display the error message like this:
Error Message Codes and Meanings
400 Wrong request syntax.
401 Authorization required. A list of allowed authorization schemes will
also be sent.
402 No Chargeto field on the request for a paid service.
403 Forbidden resource.
404 The server cannot find the URL requested.
405 Accessing the resource using a method that is not allowed.
406 Resource type incompatible with the client.
410 Resource no longer available and no forwarding information exists.
500 The server has encountered an internal error and cannot continue with
the request.
501 The server does not support the method of a legal request.
502 Secondary server did not return a valid response.
503 The service is unavailable because the server is too busy.
504 Secondary server took too long to respond.
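
The codes above fall into two ranges: 4xx for problems with the request and 5xx for problems on the server side. As a small sketch, Python's standard http.HTTPStatus enum can supply the standard reason phrase for a code; the describe helper is a hypothetical name, and the phrases come from the current standard rather than from the wording on the slide.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Classify a status code by range and attach its standard phrase."""
    if 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "not an error"
    # HTTPStatus(code) raises ValueError for codes it does not know.
    return f"{code} {HTTPStatus(code).phrase} ({kind})"


for code in (400, 404, 500, 503):
    print(describe(code))
```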
Persistent Connections And
Lengths
What is a persistent connection?
Once a client opens a TCP connection to a
particular server, the client leaves the connection
in place across multiple requests and responses.
When either the client or the server is ready to close
the connection, it informs the other side, and the
connection is closed.
Advantage: Reduced overhead.
Disadvantage: Need to identify the beginning and
end of each item sent over the connection.
Data Length And Program Output
To allow a TCP connection to persist through
multiple requests and responses, HTTP sends a
length before each response. To provide for
dynamic Web pages, whose length is not known in
advance, the HTTP standard specifies that if the
server does not know the length of an item, the
server can inform the browser (client) that it will
close the connection after transmitting the item.
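
To illustrate why the length matters, the sketch below separates two responses that arrive back to back on one connection. It is a simplified model under stated assumptions: the whole byte stream is already in memory, the function name read_items is hypothetical, and a response without a length is assumed to run until the connection closes (the end of the stream here).

```python
def read_items(stream: bytes):
    """Split a byte stream of HTTP responses into their bodies."""
    items = []
    pos = 0
    while pos < len(stream):
        # A blank line (CRLF CRLF) separates the headers from the body.
        head_end = stream.index(b"\r\n\r\n", pos)
        header_lines = stream[pos:head_end].decode("ascii").split("\r\n")
        headers = {}
        for line in header_lines[1:]:  # skip the status line
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        body_start = head_end + 4
        if "content-length" in headers:
            # Known length: the body ends exactly that many bytes later.
            body_end = body_start + int(headers["content-length"])
        else:
            # Unknown length: the item runs until the connection closes.
            body_end = len(stream)
        items.append(stream[body_start:body_end])
        pos = body_end
    return items


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nworld!"
)
print(read_items(stream))
```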
Length Encoding and Headers
HTTP borrows its header format from e-mail: it uses
the RFC 822 format with MIME extensions.
Example:
Keyword: information
Content-Length: 34
Content-Language: en
Content-Encoding: ascii
Other header:
Connection: close (used when the server does not know
the length)
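
Because the header format is borrowed from e-mail, an e-mail header parser can read it. As a minimal sketch, Python's standard email.parser module is applied to the example headers above; it shows the shared format and is not a claim about how any particular browser parses headers.

```python
from email.parser import Parser

# The example headers from the slide, followed by the blank line that
# ends the header block.
raw_headers = (
    "Content-Length: 34\r\n"
    "Content-Language: en\r\n"
    "Content-Encoding: ascii\r\n"
    "\r\n"
)

headers = Parser().parsestr(raw_headers, headersonly=True)

# Header names are matched without regard to case.
length = int(headers["content-length"])
print(length, headers["Content-Language"])

# No "Connection: close" header here, so the length marks the end of the item.
print(headers.get("Connection"))
```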
Negotiation
In addition to specifying details about an item being
sent, HTTP uses headers to permit a client and server to
“negotiate” capabilities.
Capabilities include characteristics of the:
* connection
* representation
* control
* content
Two basic types of negotiation:
* Server-driven (the server selects)
* Agent-driven (2 steps)
Agent-driven negotiation (continued)
Agent-driven negotiation takes 2 steps:
1. The browser first sends a request to the server asking
what is available, and the server returns a list of
possibilities.
2. The browser selects one of the possibilities and sends
a second request to obtain the item.
Advantage: the browser has full control over the
choice.
Disadvantage: obtaining one item requires sending two
requests.
Accept header
Accept header: the browser uses this to specify which
media or representations are acceptable.
Example:
Accept: text/html, text/plain; q=0.5, text/x-dvi; q=0.8
q is the preference level (a type with no q value
defaults to 1).
Variants of the Accept header:
Accept-Encoding
Accept-Charset
Accept-Language
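
The preference levels can be made concrete by parsing the example Accept header and sorting the media types by q. This is a simplified sketch: the function name parse_accept is hypothetical, and wildcard types and other parameters are not handled.

```python
def parse_accept(value: str):
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        q = 1.0  # a type without an explicit q value defaults to 1
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                q = float(val)
        prefs.append((media_type, q))
    # Highest preference level first.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


accept = "text/html , text/plain; q=0.5 , text/x-dvi ;q=0.8"
print(parse_accept(accept))
```

In server-driven negotiation, a server could walk this list in order and return the first representation it has available.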
Conditional request
HTTP allows a sender to make a request conditional.
Example:
If-Modified-Since: Sat, 01 Jan 2000 05:00:01 GMT
(avoids a transfer if the item has not been modified
since January 1, 2000.)
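
The server-side decision behind a conditional request can be sketched as a date comparison. The function name needs_transfer and the sample modification times are assumptions for illustration; the date string is the one from the example above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_transfer(if_modified_since: str, last_modified: datetime) -> bool:
    """True if the item changed after the date given by the client."""
    threshold = parsedate_to_datetime(if_modified_since)
    return last_modified > threshold


header_value = "Sat, 01 Jan 2000 05:00:01 GMT"

# Hypothetical modification times for two items on the server.
changed_later = datetime(2000, 6, 1, tzinfo=timezone.utc)
changed_earlier = datetime(1999, 12, 31, tzinfo=timezone.utc)

print(needs_transfer(header_value, changed_later))
print(needs_transfer(header_value, changed_earlier))
```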
Proxy server
A local server that is configured to cache copies of
Web pages obtained from their original source.
Advantage:
1. Decreases latency
2. Reduces the load on servers
To guarantee correctness, HTTP includes explicit
support for proxy servers. The standard specifies:
1. How a proxy handles each request
2. How headers should be interpreted
3. How a browser negotiates with a proxy
Caching
The goal of caching is to improve efficiency: reduce
both latency and network traffic.
How long should an item be kept in a cache?
HTTP allows caching to be controlled in two ways:
1. The server specifies caching details
2. HTTP allows a browser to force REVALIDATION
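
The two controls can be modelled with a toy cache. Everything here is an assumption made for illustration: the PageCache class, a lifetime in seconds supplied by the server, and a force_revalidation flag standing in for the browser's request. Real HTTP caching involves more headers and rules than this sketch covers.

```python
class PageCache:
    """Toy cache keyed by URL; times are plain numbers of seconds."""

    def __init__(self):
        self._entries = {}  # url -> (body, expires_at)

    def store(self, url, body, max_age, now):
        # The server-specified lifetime decides when the copy expires.
        self._entries[url] = (body, now + max_age)

    def lookup(self, url, now, force_revalidation=False):
        entry = self._entries.get(url)
        # A forced revalidation bypasses the cached copy entirely.
        if entry is None or force_revalidation:
            return None
        body, expires_at = entry
        return body if now < expires_at else None


cache = PageCache()
cache.store("/people/comer/", b"page", max_age=60, now=0)
print(cache.lookup("/people/comer/", now=30))
print(cache.lookup("/people/comer/", now=61))
print(cache.lookup("/people/comer/", now=30, force_revalidation=True))
```

A result of None means the cache must contact the original server again.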
SUMMARY
The World Wide Web consists of hypermedia
documents stored on a set of Web servers and
accessed by browsers. Each document is assigned a
URL that uniquely identifies it.
A browser and a server use HTTP to communicate.
HTTP is an application-level protocol with explicit
support for negotiation, proxy servers, caching, and
persistent connections.