Web Architecture I


World Wide Web
Basics
Jan.2001
C.Watters
1
What is an internet anyway?
• 2 or more networks that can communicate
Historical View: Internet
• 1969 - Telnet
• 1970 - 4 computers
– Stanford, UCLA, UC Santa Barbara, U Utah
• 1971 - FTP
• 1983 - 562 computers on the internet
• 1993 - 1.2 million computers on the internet
• 1999 - ?? Kazillions
What is the World Wide Web ?
• Hypertext connectivity of “documents”
Size of the Internet
The web
• The Web is a protocol that uses the internet as its communication structure
• links documents stored on computers communicating over the internet
• main authority: W3 consortium
• www.w3.org
Historical View: WWW
• 1989 - Berners-Lee - web doc proposal
• 1990 - Berners-Lee - text browser (physicists)
• 1992 - public access to web docs at CERN
• 1993 - 60 web servers & Mosaic (graphics)
– (500 servers by year end)
• 1995 - more Internet email than US post
• 1999 - x million docs & y million servers
http://www.lexis-nexis.com
Basics
• Web server - machine that services internet requests
• Web client - machine that initiates internet requests
• Browser - software to interact with internet data at the client
• TCP/IP - internet data protocol
• FTP - internet file transfer protocol
• HTTP - hypertext transfer protocol
• HTML - hypertext markup language
Client-Server Model
Looking in the Cloud
• /opt/sbin/traceroute
CA*net2 layer 2 links
CA*net3 Physical Links
1. Client-Server & Web
• Cloud model
• TCP/IP
• HTTP and MIME types
• FTP
• protocol stacks
Servers and Clients
• Servers - computer systems at the end of a
network that store files and provide other
services
• Clients - computer systems that are end
points for users of the data
Network Architectures
• ISO’s OSI model
• 1970’s
• International Organization for Standardization
• Open Systems Interconnection reference model
• 7 layer architecture
ISO - OSI Model
• Application layer - ftp, telnet, etc
• Presentation layer - data compression, format
• Session layer - set up connections
• Transport layer - end-to-end transmission of packets
• Network layer - guide packets along links
• Data link layer - send packets between nodes
• Physical layer - deliver bits between nodes
ISO OSI model
INTERNET MODEL
• 4 layers
• Application layer
– communication services (ftp, telnet, email)
• transport layer
– transmission of messages end-to-end
• network layer
– transmission of messages across a sequence of links
• link layer
– transmission of packet across one link
Internet layers
Application Layer
• FTP
• HTTP
• SMTP
• telnet
• etc
TCP/IP
• Suite of protocols adopted as the standard for the Internet
• facilitates communication between heterogeneous and similar networks that are connected together
• TCP itself: reliable, connection-oriented, byte stream protocol
Transport layer: TCP and UDP
• TCP
– transmission control protocol
– full duplex byte stream
– virtual path (connected)
– error free
– uses acknowledgements
– 16 bit address of ports
• UDP
– user datagram protocol
– connectionless
– no acknowledgements
– no flow control
– no resending of erroneous packets
– some error detection
– 16 bit port addresses
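The TCP/UDP contrast can be seen with Python's standard `socket` module. This is a minimal sketch, assuming both ends run on the loopback address of one machine. The messages and variable names are made up for illustration.

```python
import socket

# TCP: connection-oriented byte stream (SOCK_STREAM).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
listener.listen(1)
tcp_port = listener.getsockname()[1]      # a 16 bit port number

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", tcp_port))   # sets up the virtual path first
server_side, _ = listener.accept()
client.sendall(b"hello over TCP")
tcp_data = server_side.recv(1024)
for s in (client, server_side, listener):
    s.close()

# UDP: connectionless datagrams (SOCK_DGRAM), no connect/accept step.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)                  # UDP gives no delivery guarantee
udp_port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", ("127.0.0.1", udp_port))
udp_data, _ = receiver.recvfrom(1024)
sender.close()
receiver.close()

print(tcp_data, udp_data)
```

On loopback the datagram normally arrives, but nothing in UDP would resend it if it were lost; the timeout is there for that case.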
TCP/IP
Transmission Control Protocol
Internet Protocol
TCP and IP
Network Layer: IP
• Delivers packets up to 64kbytes, 1 at a time
• Each packet has a header
– sending host and intended host network addresses
– 32 bit addresses
• IP layer (like UDP)
– unreliable
– connectionless
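A short sketch of what "32 bit addresses" means, using Python's standard `ipaddress` module. The address 192.0.2.10 is an arbitrary documentation-range address chosen for illustration, not one from the slides.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.0.2.10")  # illustrative address only
as_int = int(addr)        # the same address as one 32 bit integer
as_bytes = addr.packed    # the 4 bytes carried in the packet header

# The IPv4 header stores the packet length in a 16 bit field,
# which is where the "up to 64 kbytes" limit comes from.
max_packet = 2**16 - 1

print(as_int, as_bytes.hex(), max_packet)
```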
Link Layer: links
• Connect computer to Internet
• SLIP
– serial line IP (asynchronous, 1 char at a time)
– move IP packets to common link (phone line)
• PPP
– point-to-point protocol
– also synchronous transfer for packets
Data encapsulation using TCP on Ethernet
TCP/IP apps
• TCP/IP software usually includes:
– remote terminal client using TELNET protocol
for remote login
– electronic mail client using SMTP protocol to
transfer e-mail to remote system
– file transfer client using FTP protocol to
transfer files between 2 machines
HTTP
HyperText Transfer Protocol
• Native protocol for WWW
• sits on top of internet’s TCP/IP protocol
• HTTP is a 4 step process per transaction
• uses a predefined set of document formats
from MIME
MIME
• MIME - multipurpose internet mail extensions
– defines file formats (images, video, text, etc)
– e.g. Content-type: text/html
– Data type/subtype
» text/html
» text/plain
» image/gif
» video/mpeg
» application/msword
» etc!!!
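The type/subtype structure can be explored with Python's standard `mimetypes` module. A minimal sketch: `split_mime` is a hypothetical helper written for this example, and the file names are invented.

```python
import mimetypes

def split_mime(content_type):
    """Split a 'type/subtype' string into its two parts."""
    main, _, sub = content_type.strip().partition("/")
    return main.lower(), sub.lower()

# A fresh MimeTypes object uses Python's built-in extension table.
guesser = mimetypes.MimeTypes()

for name in ("index.html", "notes.txt", "logo.gif"):
    mime, _encoding = guesser.guess_type(name)
    print(name, "->", mime, split_mime(mime))
```

A server does roughly this lookup from file extension to MIME type before it writes the Content-type line of a response.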
HTTP Connection
• 1. Client
– makes a TCP/IP connection
– makes an HTTP request for a web page
• 2. Server accepts request
– sends page as an HTTP response
• 3. Client downloads page
• 4. Server breaks the connection
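The four steps above can be traced with a raw socket against a throwaway local server. This is a sketch under stated assumptions: the server is Python's standard `http.server` bound to loopback, and `PageHandler`, the path, and the page body are invented for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class PageHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that returns one fixed page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), PageHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Step 1: client makes a TCP/IP connection and sends an HTTP request.
sock = socket.create_connection(("127.0.0.1", port), timeout=5)
sock.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")

# Steps 2 and 3: server sends the page, client downloads it.
chunks = []
while True:
    data = sock.recv(4096)
    if not data:   # Step 4: empty read means the server closed the connection
        break
    chunks.append(data)
sock.close()
server.shutdown()
server.server_close()

response = b"".join(chunks)
status_line = response.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```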
HTTP is Stateless!!!!
• Each operation or transaction makes a new
connection
• each operation is unaware of any other
connection
• each click is a new connection
• So how do they do those shopping carts??
What does it look like?
• Header + object file
• Header
– plain text
– info about the object (MIME etc)
– methods allowed
– etc
– browser sends a header to server each time you ask for information
– server sends a header and possibly content
HTTP Transaction Example
GET /catalog/ip/ip.htm HTTP/1.0
Accept: text/plain
Accept: text/html
Referer: http://www.june.com/catalog.html
User-Agent: Mozilla/2.0
<CR/LF>
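The same request can be assembled as bytes. A minimal sketch: `build_request` is a hypothetical helper, not part of any library, and it ends each line with CRLF and the header block with an empty line.

```python
def build_request(method, path, headers, version="HTTP/1.0"):
    """Return a request line plus header lines, terminated by a blank line."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_request("GET", "/catalog/ip/ip.htm", [
    ("Accept", "text/plain"),
    ("Accept", "text/html"),
    ("Referer", "http://www.june.com/catalog.html"),
    ("User-Agent", "Mozilla/2.0"),
])
print(request.decode("ascii"))
```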
HTTP REQUEST PROTOCOL
Request = Simple | Full
Simple = GET <URI> CRLF
Full = Method URI ProtVersion CRLF
       [<HTRQ Header>*] [CRLF <data>]
Method = GET | POST | HEAD | ….
<HTRQ Header> = <Fieldname>:<Value>CRLF
<data> = MIME conforming message
www.w3.org/Protocols/HTTP/
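The grammar above can be turned into a small parser. This is a rough sketch that only separates the Simple and Full forms; `parse_request` and the dictionary keys are names chosen for this example, and no validation of methods or header names is attempted.

```python
def parse_request(raw):
    """Parse a request string into kind, method, URI, version, headers, data."""
    head, _, data = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    parts = lines[0].split()
    if len(parts) == 2:        # Simple = GET <URI> CRLF
        method, uri = parts
        return {"kind": "simple", "method": method, "uri": uri,
                "version": None, "headers": [], "data": ""}
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % lines[0])
    method, uri, version = parts   # Full = Method URI ProtVersion CRLF
    headers = []
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")   # <Fieldname>:<Value>
        headers.append((name.strip(), value.strip()))
    return {"kind": "full", "method": method, "uri": uri,
            "version": version, "headers": headers, "data": data}

simple = parse_request("GET /index.html\r\n")
full = parse_request("POST /cgi HTTP/1.0\r\nContent-Length: 3\r\n\r\nabc")
print(simple["kind"], full["method"], full["headers"], full["data"])
```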
HTTP Header fields
• General-header fields
– used for both requests and responses
• Request-header fields
– used for requests
– extra client information for use by server
– optional
General-header fields
• Date: Mon, 11 Jan 1999 08:14:32 GMT
• MIME-version: 1.0
• Pragma: no-cache
– directives
Request-header fields
• acceptable MIME types for response
– Accept: text/html
– Accept: */*
• client reply to a 401 response
– Authorization: Basic abcdef (base64-encoded username and password)
• From: client-email-addr
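HTTP Basic authentication encodes the string `username:password` with base64, which is an encoding and not encryption. A minimal sketch follows; both function names are hypothetical, and the credentials are the well-known example pair from the HTTP authentication specification.

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header line for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"

def decode_basic(header):
    """Recover (username, password) from a Basic Authorization header line."""
    token = header.split("Basic ", 1)[1]
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password

header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(decode_basic(header))
```

Because decoding is this easy, anyone who can read the request can read the password.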
More Request-header fields
• If-Modified-Since:date
– conditional get
• source of current requested Url
– Referer:URL
• robot/browser identification
– User-Agent:Mozilla/2.0
Looking at the HTTP Header Values
• In Perl
– $ENV{"HTTP_FROM"} (in a CGI script)
• In Netscape
– www.cs.dal.ca/~watters/cgi-bin/webcourse/env.html
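For comparison, a Python sketch of the same lookup. It assumes the script runs under CGI, where the server copies each request header into an environment variable named `HTTP_` plus the header name in upper case with dashes turned into underscores. Both helper names are invented for this example.

```python
import os

def cgi_variable_name(header_name):
    """Map a request header name to its CGI environment variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")

def read_header(header_name, environ=os.environ):
    """Return the header value the server passed in, or None if absent."""
    return environ.get(cgi_variable_name(header_name))

# Outside a real CGI run these are usually unset, so None is expected.
for name in ("From", "User-Agent", "Referer"):
    print(cgi_variable_name(name), "=", read_header(name))
```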
HTTP Methods
• Client requests either
– simple request
– full request
Request-line= method Request-URI HTTP-version CRLF
GET /catalog/ip.html HTTP/1.0
Simple requests
• Only for HTTP 0.9
• only uses the GET method
• causes the server to locate and transfer the
object specified
• client responsible for handling the object
GET <uri> CRLF
Full Request
• Uses HTTP version and more methods
• method tells server what to do to the
resource requested
• Methods
– GET
– POST
– HEAD
GET Method
• Request server to retrieve object specified
• conditional GET
– request message includes If-Modified-Since in header
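On the server side, a conditional GET comes down to a date comparison: if the object has not changed since the date the client sent, the server can answer 304 (Not Modified) with no body. A minimal sketch, assuming the server knows the object's last-modified time; `conditional_status` is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def conditional_status(last_modified, if_modified_since):
    """Return 304 if the object is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200                      # plain GET: always send the object
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200                      # unreadable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200

header_value = "Mon, 11 Jan 1999 08:14:32 GMT"
older = datetime(1999, 1, 10, 12, 0, 0, tzinfo=timezone.utc)
newer = datetime(1999, 1, 12, 0, 0, 0, tzinfo=timezone.utc)
print(conditional_status(older, header_value))
print(conditional_status(newer, header_value))
```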
HEAD Method
• Like GET but does not return the object
• returns a header about the resource
requested (metainformation)
• good way to test link validity
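A link check with HEAD might look like the sketch below. It runs against a throwaway local server so it is self-contained; `DemoHandler`, `link_status`, and the two paths are invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server that knows exactly one page."""

    def do_HEAD(self):
        found = self.path == "/exists.html"
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "text/html")
        self.end_headers()              # header only, no object follows

    def log_message(self, *args):
        pass  # keep the demo quiet

def link_status(host, port, path):
    """Send a HEAD request and return the numeric status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()                 # empty for HEAD
        return response.status
    finally:
        conn.close()

server = HTTPServer(("127.0.0.1", 0), DemoHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

good = link_status("127.0.0.1", port, "/exists.html")
bad = link_status("127.0.0.1", port, "/missing.html")
server.shutdown()
server.server_close()
print(good, bad)
```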
POST Method
• Include an object in the request
• server should use that object in processing
the request
• must include a Content-Length in header
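A sketch of a POST request carrying an object. The Content-Length value is the byte length of the body. `build_post`, the path `/cgi-bin/order`, and the form fields are assumptions made for illustration.

```python
def build_post(path, body, content_type="application/x-www-form-urlencoded"):
    """Return a POST request as bytes; body must already be bytes."""
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body

request = build_post("/cgi-bin/order", b"item=book&qty=2")
print(request.decode("ascii"))
```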
HTTP Response Message
• HTTP protocol version
• 3 digit status code
• reason phrase
• CRLF
• optional header fields
• CRLF
HTTP Response Header Fields
• Additional information about the server
• such as:
– LOCATION: exact URI address
– SERVER: server software (CERN/3.0)
– WWW-AUTHENTICATE:
» status 401 responses (unauthorized request)
» server challenges client
» client may use to send authorization info to server
Understanding STATUS Codes
• 1xx - informational (not used in HTTP/1.0)
• 2xx - action successful
• 3xx - further action needed
• 4xx - client request error
• 5xx - server error
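The first digit of the status code is enough to classify a response. A minimal sketch that splits a status line into version, code, and reason phrase, then looks up the class; both function names are hypothetical, and a status line with no reason phrase would raise `ValueError` here.

```python
STATUS_CLASSES = {
    1: "not used in HTTP/1.0",
    2: "action successful",
    3: "further action needed",
    4: "client request error",
    5: "server error",
}

def parse_status_line(line):
    """Split 'HTTP-version status-code reason-phrase' into three parts."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

def status_class(code):
    """Classify a 3 digit status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")

version, code, reason = parse_status_line("HTTP/1.0 404 Not Found\r\n")
print(version, code, reason, "->", status_class(code))
```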
HTTP Transaction
• 1. Client and server establish a connection
• 2. Client makes a request
• 3. Server makes a response
• 4. Server terminates connection
• Step 1 establish connection
– TCP/IP connection set up
– uses a port number as application reference
– usually port 80
– ports < 1024 are privileged (>= 1024 are open)
• Step 2 client request
– HTTP message sent with a request line
– request-line = method URL HTTP-version
Web Port Assignments
• 21 FTP
• 23 Telnet
• 25 SMTP (mail)
• 70 gopher
• 79 finger
• 80 HTTP
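The port table can be written as a lookup, together with the privileged-port rule and the "default to 80" behaviour of web clients. A sketch only: the dictionary holds just the ports listed above, and `split_host_port` is a hypothetical helper that does not handle IPv6 literals.

```python
WELL_KNOWN_PORTS = {
    "FTP": 21, "Telnet": 23, "SMTP": 25,
    "gopher": 70, "finger": 79, "HTTP": 80,
}

def is_privileged(port):
    """Ports below 1024 traditionally require administrator rights to bind."""
    return port < 1024

def split_host_port(netloc, default=WELL_KNOWN_PORTS["HTTP"]):
    """Split 'host' or 'host:port', falling back to the HTTP port."""
    host, sep, port = netloc.partition(":")
    return host, (int(port) if sep else default)

print(split_host_port("www.w3.org"))
print(split_host_port("localhost:8080"))
print(is_privileged(80), is_privileged(8080))
```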
• Step 3 Server response
– server sends HTTP message and optionally requested data
– resp-message = HTTP-version status-code reason-phrase [optional stuff]
• Step 4 connection terminated
– usually the server
– sometimes the client “stops” it
– anything else, whoever notices terminates