Unravelling the Web: How Does it All Work?
Unraveling the Web:
How Does it All Work?
1
Web Enabling Technologies
TCP/IP network (Internet & others)
URLs
HTTP protocol and HTTP servers
HTML & MIME type system
Other network service protocols and servers
Web browsers
2
Uniform Resource Locators
Naming scheme: unambiguous way of telling where to find “things”
Indicates protocol, host, port, and “path”
Syntax: protocol://hostname:port/path
Examples:
http://whitney:8000/lectures/index.html
http://www.cs.purdue.edu/
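The syntax above can be checked against the two example URLs. The following is a minimal sketch that uses Python's standard `urllib.parse` module; the helper name `url_parts` is an assumption made for illustration, not something from the slides.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the four parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not name a port
        "path": parts.path,
    }


if __name__ == "__main__":
    for example in ("http://whitney:8000/lectures/index.html",
                    "http://www.cs.purdue.edu/"):
        print(example, "->", url_parts(example))
```

When the port is omitted, as in the second example, the client is expected to fall back to the protocol's default port.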
3
Valid Characters in URLs
Valid: upper & lower case letters, numerals and $, _, @, .
Special: = ; / # ? : % & +
Others must be escaped: %xx, where xx is the two-digit hex code of the character
Example: CR “%0D”, space “%20”, percent “%25”
Browsers help handle this
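The %xx escaping rule can be tried out with Python's standard `urllib.parse.quote` and `unquote`. This is only a sketch: the wrapper name `escape` is an assumption, and `quote` is stricter than the slide's list (with `safe=""` it also escapes characters such as `$`, `@` and `/`).

```python
from urllib.parse import quote, unquote


def escape(text):
    """Percent-escape every character that quote() does not treat as safe."""
    # safe="" means that even "/" is escaped.
    return quote(text, safe="")


if __name__ == "__main__":
    for label, char in (("CR", "\r"), ("space", " "), ("percent", "%")):
        print(label, "->", escape(char))
    # unquote() reverses the escaping.
    print(unquote("my%20file.html"))
```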
4
URL Specificity
Complete URL: all parts of URL
Partial URL: no protocol/host, contains full path. Browsers must interpret relative to current page
Relative URL: only last part of path, similar to relative paths in OSs
Use: relative URLs for related docs, partial URLs for docs on same server and complete URLs for remote docs
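How a browser might interpret the three kinds of URL relative to the current page can be sketched with `urllib.parse.urljoin`. The current page is taken from the earlier example; the link targets `/labs/lab1.html` and `notes.html` are hypothetical.

```python
from urllib.parse import urljoin

CURRENT_PAGE = "http://whitney:8000/lectures/index.html"


def resolve(link, current_page=CURRENT_PAGE):
    """Turn a complete, partial, or relative URL into a complete one."""
    return urljoin(current_page, link)


if __name__ == "__main__":
    print(resolve("http://www.cs.purdue.edu/"))  # complete: used as-is
    print(resolve("/labs/lab1.html"))            # partial: same host, new full path
    print(resolve("notes.html"))                 # relative: same directory as current page
```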
5
Types of URLs
(Local) File: file:///path_to_the_file
HTTP: http://hostname[:port]/path
FTP: ftp://[user[:passwd]@]hostname[:port]/path_to_the_file
Gopher: gopher://hostname:port/path
Telnet: telnet://hostname:port/
6
Types of URLs
WAIS: wais://hostname:port/database_name?query
News: news:name.of.group/[article_selection]
MailTo: mailto:email_address
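The scheme-specific forms listed on these two slides can be taken apart with the same standard-library parser. A small sketch follows; the host names, user name, password and paths in the usage lines are made-up examples, and `describe` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def describe(url):
    """Return the scheme-specific pieces of a URL as a dictionary."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,      # only present in forms like ftp://user:passwd@host/
        "password": parts.password,
        "host": parts.hostname,      # None for host-less forms such as mailto:
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
    }


if __name__ == "__main__":
    for url in ("ftp://anonymous:guest@ftp.example.org:21/pub/readme.txt",
                "wais://wais.example.org:210/directory?web",
                "mailto:someone@example.org"):
        print(describe(url))
```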
7
URLs
Uniform Resource Locators
Universal Resource Locators
General concept of Universal Resource Identifiers (URIs): URLs and Uniform Resource Names (URNs) are the implemented kinds of URI
8
HyperText Transfer Protocol
Simple, readable request-reply protocol
Five requests: GET, HEAD, POST, PUT, DELETE
Request includes document URL + (possibly) additional info: user info, browser info, capabilities, wishes, ...
Reply includes status information, a reply header and data
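Because the protocol is readable text, a request and a reply can be built and taken apart with plain string handling. The sketch below assumes HTTP/1.0 framing; the function names and the header values in the usage lines are illustrative assumptions, and a real client would need more careful error handling.

```python
def build_request(method, path, headers=None):
    """Return the text of a simple HTTP/1.0 request."""
    lines = [f"{method} {path} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")  # additional info: browser, capabilities, ...
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_reply(raw):
    """Split a reply into (status code, header dict, data)."""
    head, _, data = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Assumes the status line has the form "HTTP/1.0 200 OK".
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, data


if __name__ == "__main__":
    print(build_request("GET", "/lectures/index.html", {"Accept": "text/html"}))
    print(parse_reply("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Hi</h1>"))
```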
9
HTTPD
Implements HTTP
Two popular free implementations: NCSA HTTPD and CERN HTTPD
Commercial: Netscape Server, ...
Special: Netscape Commerce Server, ...
Basic one is simple!
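To give a feel for how simple a basic server can be, here is a sketch of the core of one: a function that maps request text to reply text. It is not how NCSA or CERN HTTPD work internally; the in-memory `DOCUMENTS` table, the function name `handle`, and the choice to support only GET and HEAD are all assumptions, and networking is left out.

```python
# Hypothetical in-memory document tree standing in for files on disk.
DOCUMENTS = {"/index.html": "<h1>Welcome</h1>"}


def handle(request_text, documents=DOCUMENTS):
    """Produce an HTTP/1.0 reply for one request (GET and HEAD only)."""
    request_line = request_text.split("\r\n", 1)[0]
    parts = request_line.split()
    if len(parts) < 2 or parts[0] not in ("GET", "HEAD"):
        return "HTTP/1.0 501 Not Implemented\r\n\r\n"
    method, path = parts[0], parts[1]
    body = documents.get(path)
    if body is None:
        return "HTTP/1.0 404 Not Found\r\n\r\n"
    head = ("HTTP/1.0 200 OK\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body.encode())}\r\n\r\n")
    # HEAD gets the status and headers only; GET also gets the data.
    return head if method == "HEAD" else head + body


if __name__ == "__main__":
    print(handle("GET /index.html HTTP/1.0\r\n\r\n"))
```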
10
HyperText Markup Language
Based on SGML, Standard Generalized Markup Language
Markup is not layout: no fonts, points, ...
Abstract styles indicate parts of docs: level 1 header, level 2 header, paragraph, table, citations
HTML: Specifies structure of document, not format!
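One way to see that HTML describes structure rather than layout is to read a document back as an outline of its parts. The sketch below uses Python's standard `html.parser`; the class name, the small tag-to-label table and the sample document are assumptions for illustration, and only a few structural tags are covered.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (structural role, text) pairs from an HTML document."""

    STRUCTURAL = {"h1": "level 1 header", "h2": "level 2 header",
                  "p": "paragraph", "cite": "citation"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._open = None  # structural tag currently collecting text, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.STRUCTURAL:
            self._open = tag
            self.outline.append([self.STRUCTURAL[tag], ""])

    def handle_data(self, data):
        if self._open:
            self.outline[-1][1] += data

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None


def outline(html_text):
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return [tuple(entry) for entry in parser.outline]


if __name__ == "__main__":
    print(outline("<h1>Web</h1><h2>URLs</h2><p>Naming scheme.</p>"))
```

Nothing in the outline says which font or size to use; that choice is left to whatever displays the document.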
11
MIME
Multipurpose Internet Mail Extensions
Mechanism for sending multimedia data over email
Content-Type: field indicates the type
Additional fields for length, encoding, compression etc.
Metamail: MIME reference implementation for a range of email systems
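The role of the Content-Type field can be seen by building a small multimedia message with Python's standard `email` package. This is a sketch only: the subject, the attachment file name `logo.gif` and the helper names are made up, and the "GIF" bytes in the usage lines are a placeholder rather than a real image.

```python
from email.message import EmailMessage


def build_message(text, gif_bytes):
    """Build a message with a text part and an image/gif attachment."""
    msg = EmailMessage()
    msg["Subject"] = "MIME sketch"
    msg.set_content(text)  # becomes a text/plain part
    msg.add_attachment(gif_bytes, maintype="image", subtype="gif",
                       filename="logo.gif")
    return msg


def part_types(msg):
    """List the Content-Type of each part of a multipart message."""
    return [part.get_content_type() for part in msg.iter_parts()]


if __name__ == "__main__":
    message = build_message("Hello", b"GIF89a")
    print(message.get_content_type())
    print(part_types(message))
    print(message.as_string())  # shows the Content-Type and encoding fields
```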
12
MIME Types
Indicates the semantic interpretation of some content
Describes a document by referring to a standardized list of types organized by type and subtype
Examples: text/plain, text/html, video/mpeg, image/gif, image/jpeg, application/postscript, */x-*, ...
13
MIME Types and WWW
HTTP request lists preferred doc types
HTTP reply indicates doc type supplied
Browser uses type-specific handler to “display” document
Some types are natively supported, some not
Helper applications for non-native types
Easy to add new content types!
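The dispatch described on this slide can be sketched as two lookup tables: one for natively supported types and one for helper applications. Everything named here is an assumption for illustration (the handler descriptions, the helper program names, and the "save to disk" fallback); real browsers are configured differently.

```python
# Types this hypothetical browser displays itself.
NATIVE_HANDLERS = {
    "text/html": lambda data: f"render HTML ({len(data)} bytes)",
    "text/plain": lambda data: f"show text ({len(data)} bytes)",
    "image/gif": lambda data: f"draw GIF ({len(data)} bytes)",
}

# Types handed to an external helper application (example mapping only).
HELPER_APPS = {"application/postscript": "ghostview"}


def accept_header():
    """Value for the request's Accept field: every type we can handle."""
    return ", ".join(list(NATIVE_HANDLERS) + list(HELPER_APPS))


def display(content_type, data):
    """Pick a handler from the type named in the reply."""
    if content_type in NATIVE_HANDLERS:
        return NATIVE_HANDLERS[content_type](data)
    if content_type in HELPER_APPS:
        return f"launch helper {HELPER_APPS[content_type]}"
    return "save to disk"


if __name__ == "__main__":
    print(accept_header())
    print(display("text/html", b"<p>"))
    print(display("application/postscript", b""))
```

Adding a new content type is then a one-line change, for example `HELPER_APPS["video/mpeg"] = "mpeg_play"`.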
14
MIME Types and Filename Extensions
Useful mechanism for identifying MIME type of a file object
Server relies on this heavily!
Example: .ps, .eps, .epsi, .epsf => application/postscript
Example: .gif => image/gif
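A server-side lookup of this kind can be sketched as a table keyed by extension. The table below contains only the mappings from the slide; the function name and the `application/octet-stream` fallback for unknown extensions are assumptions. Python's standard `mimetypes` module offers a fuller version of the same idea.

```python
import os.path

# Extension-to-type table taken from the examples on the slide.
EXTENSION_TYPES = {
    ".ps": "application/postscript",
    ".eps": "application/postscript",
    ".epsi": "application/postscript",
    ".epsf": "application/postscript",
    ".gif": "image/gif",
}


def mime_type_for(filename, default="application/octet-stream"):
    """Guess a file's MIME type from its extension (case-insensitive)."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(extension, default)


if __name__ == "__main__":
    for name in ("paper.ps", "figure.EPS", "logo.gif", "notes.xyz"):
        print(name, "=>", mime_type_for(name))
```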
15
Other Network Service Protocols and Servers
FTP client and ftpd
Telnet client and telnetd
NNTP client and nntpd
Gopher client and gopherd
...
Hooked to the Web by smart browsers
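One way a browser could hook these services to the Web is to route each URL by its scheme to the matching kind of server. The sketch below pairs each scheme with the daemon named on the slide and its conventional default port; the `route` helper and the host names in the usage lines are hypothetical.

```python
from urllib.parse import urlsplit

# scheme -> (server daemon, conventional default port)
SERVICES = {
    "http": ("httpd", 80),
    "ftp": ("ftpd", 21),
    "telnet": ("telnetd", 23),
    "nntp": ("nntpd", 119),
    "gopher": ("gopherd", 70),
}


def route(url):
    """Return (server kind, host, port) that a client would contact."""
    parts = urlsplit(url)
    server, default_port = SERVICES[parts.scheme]
    return server, parts.hostname, parts.port or default_port


if __name__ == "__main__":
    print(route("gopher://gopher.example.org/1/docs"))
    print(route("telnet://host.example.org:2323/"))
```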
16
Web Browsers
Support HTTP and possibly other protocols
Display HTML documents natively
Graphical browsers show in-line images/graphics
Some support “special” HTML extensions
Do all of the above to browse the Web!
17