Unravelling the Web: How Does it All Work?

Web Enabling Technologies
• TCP/IP network (Internet & others)
• URLs
• HTTP protocol and HTTP servers
• HTML & MIME type system
• Other network service protocols and servers
• Web browsers

Uniform Resource Locators
• Naming scheme: unambiguous way of telling where to find “things”
• Indicates protocol, host, port, and “path”
• Syntax: protocol://hostname:port/path
• Examples:
  http://whitney:8000/lectures/index.html
  http://www.cs.purdue.edu/

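The protocol://hostname:port/path syntax can be illustrated with a minimal sketch that uses Python's standard urllib.parse module on the two example URLs above. The helper name url_parts is made up for this illustration.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a complete URL into the four parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not give a port
        "path": parts.path,
    }

print(url_parts("http://whitney:8000/lectures/index.html"))
print(url_parts("http://www.cs.purdue.edu/"))
```

When the port is left out, as in the second example, the client falls back to the protocol's default port.
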
Valid Characters in URLs
• Valid: upper- and lower-case letters, numerals, and $, _, @, .
• Special: = ; / # ? : % & +
• Others must be escaped: %xx, where xx is the two-digit hex code of the character
• Examples: CR “%0D”, space “%20”, percent “%25”
• Browsers help with the escaping

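The %xx escaping rule can be tried out with a short sketch, here using quote and unquote from Python's urllib.parse. The file name is invented for the example, and Python's own set of characters that it leaves unescaped is not identical to the list on the slide.

```python
from urllib.parse import quote, unquote

# The slide's three examples: carriage return, space, and percent.
for ch in ("\r", " ", "%"):
    print(repr(ch), "->", quote(ch, safe=""))

# Escape a whole (hypothetical) file name, then reverse the escaping.
escaped = quote("my notes 100%.txt", safe="")
print(escaped)
print(unquote(escaped))
```
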
URL Specificity
• Complete URL: all parts of the URL
• Partial URL: no protocol/host, contains full path. Browsers must interpret it relative to the current page
• Relative URL: only the last part of the path, similar to relative paths in OSs
• Use: relative URLs for related docs, partial URLs for docs on the same server, and complete URLs for remote docs

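How a browser might resolve the three kinds of URL against the current page can be sketched with urljoin from Python's urllib.parse. The current page is the earlier example URL; the paths /labs/lab1.html and week2.html are hypothetical.

```python
from urllib.parse import urljoin

current_page = "http://whitney:8000/lectures/index.html"

# Complete URL: nothing is taken from the current page.
complete = urljoin(current_page, "http://www.cs.purdue.edu/")

# Partial URL: protocol, host, and port come from the current page.
partial = urljoin(current_page, "/labs/lab1.html")

# Relative URL: the directory of the current page is reused as well.
relative = urljoin(current_page, "week2.html")

print(complete)
print(partial)
print(relative)
```
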
Types of URLs
• (Local) File: file:///path_to_the_file
• HTTP: http://hostname[:port]/path
• FTP: ftp://[user[:passwd]@]hostname[:port]/path_to_the_file
• Gopher: gopher://hostname:port/path
• Telnet: telnet://hostname:port/

Types of URLs
• WAIS: wais://hostname:port/database_name?query
• News: news:name.of.group/[article_selection]
• MailTo: mailto:email_address

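The URL types fall into two shapes: those with a //hostname part (HTTP, FTP, Gopher, Telnet, WAIS) and those with only a protocol name followed by an address (News, MailTo). A small sketch with Python's urllib.parse shows the difference. The host, user, password, group, and address below are placeholders, not real accounts.

```python
from urllib.parse import urlsplit

# FTP form with the optional user, password, and port all present.
ftp = urlsplit("ftp://anonymous:guest@ftp.example.org:2121/pub/readme.txt")
print(ftp.scheme, ftp.username, ftp.password, ftp.hostname, ftp.port, ftp.path)

# MailTo and News have no //hostname part: everything after the colon is the path.
mail = urlsplit("mailto:someone@example.org")
news = urlsplit("news:comp.infosystems.www")
print(mail.scheme, repr(mail.netloc), mail.path)
print(news.scheme, repr(news.netloc), news.path)
```
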
URLs
• Uniform Resource Locators
• Universal Resource Locators
• General concept of universal resource identifiers (URIs): URLs and Universal Resource Names (URNs) are implemented

HyperText Transfer Protocol
• Simple, readable request-reply protocol
• Five requests: GET, HEAD, POST, PUT, DELETE
• Request includes document URL + (possibly) additional info: user info, browser info, capabilities, wishes, ...
• Reply includes status information, a reply header, and data

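Because the protocol is readable text, a request and a reply can be built and taken apart with plain string handling. The sketch below assumes HTTP/1.0-style messages. The browser name, the reply text, and the helper names build_request and parse_reply are invented for the illustration.

```python
def build_request(method, path, headers):
    """Format a request: request line, header lines, then a blank line."""
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_reply(raw):
    """Split a reply into (status code, reason, header dict, data)."""
    head, _, data = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), reason, headers, data

# Request: document path plus additional info about the browser and its wishes.
request = build_request(
    "GET",
    "/lectures/index.html",
    {"User-Agent": "ExampleBrowser/0.1", "Accept": "text/html"},
)

# A hand-written reply: status information, reply header, and data.
reply = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 14\r\n"
    "\r\n"
    "<h1>Hello</h1>"
)

print(request)
print(parse_reply(reply))
```
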
HTTPD
• Implements HTTP
• Two popular free implementations: NCSA HTTPD and CERN HTTPD
• Commercial: Netscape Server, ...
• Special: Netscape Commerce Server, ...
• A basic one is simple!

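To give a feel for how little a basic server needs, here is a minimal sketch built on Python's standard http.server module. It is not how NCSA or CERN HTTPD work internally. It answers every GET and HEAD with one fixed page, and the names TinyHandler, PAGE, and fetch are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<h1>Hello from a tiny httpd</h1>"

class TinyHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        # Status line plus the reply header.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_HEAD(self):
        self._send_head()  # HEAD: header only, no data

    def do_GET(self):
        self._send_head()
        self.wfile.write(PAGE)  # GET: header followed by the data

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch(port, method):
    """Send one request to the local server and return (status, type, data)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, "/")
    reply = conn.getresponse()
    result = (reply.status, reply.getheader("Content-Type"), reply.read())
    conn.close()
    return result

server = HTTPServer(("127.0.0.1", 0), TinyHandler)  # port 0: let the OS pick one
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_result = fetch(port, "GET")
head_result = fetch(port, "HEAD")
server.shutdown()
server.server_close()

print(get_result)
print(head_result)
```
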
HyperText Markup Language
• Based on SGML, Standard Generalized Markup Language
• Markup is not layout: no fonts, points, ...
• Abstract styles indicate parts of docs: level 1 header, level 2 header, paragraph, table, citations
• HTML: specifies structure of document, not format!

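One way to see that the markup carries structure rather than layout is to read a document back as an outline. The sketch below uses Python's standard html.parser module on a tiny invented document. The class name OutlineParser is hypothetical, and the sketch ignores nested elements.

```python
from html.parser import HTMLParser

class OutlineParser(HTMLParser):
    """Collect [tag, text] pairs for a few structural elements."""
    STRUCTURAL = {"h1", "h2", "p"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._open = None  # structural tag currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.STRUCTURAL:
            self._open = tag
            self.outline.append([tag, ""])

    def handle_data(self, data):
        if self._open is not None:
            self.outline[-1][1] += data

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None

doc = "<h1>Lectures</h1><h2>Week 1</h2><p>Introduction to the Web.</p>"
parser = OutlineParser()
parser.feed(doc)
parser.close()

for tag, text in parser.outline:
    print(tag, "|", text)
```

Nothing in the document says which font or size to use for the level 1 header; that choice is left to the browser.
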
MIME
• Multipurpose Internet Mail Extensions
• Mechanism for sending multimedia data over email
• Content-Type: field indicates the type
• Additional fields for length, encoding, compression, etc.
• Metamail: MIME reference implementation for a range of email systems

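The role of the Content-Type: field can be seen by building a small multipart message. This sketch uses Python's standard email package rather than Metamail. The addresses, subject, and file name are placeholders, and the attachment bytes are not a real image.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "author@example.org"
msg["To"] = "reader@example.org"
msg["Subject"] = "Lecture figure"

# Plain-text body; the library adds a text/plain Content-Type field.
msg.set_content("The figure is attached.")

# Placeholder bytes standing in for image data, labelled as image/gif.
fake_gif = b"GIF89a" + bytes(10)
msg.add_attachment(fake_gif, maintype="image", subtype="gif", filename="figure.gif")

# Walk the message: the container first, then each part with its type and encoding.
parts = list(msg.walk())
for part in parts:
    print(part.get_content_type(), part.get("Content-Transfer-Encoding"))
```
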
MIME Types
• Indicates the semantic interpretation of some content
• Describes a document by referring to a standardized list of types organized by type and subtype
• Examples: text/plain, text/html, video/mpeg, image/gif, image/jpeg, application/postscript, */x-*, ...

MIME Types and WWW
• HTTP request lists preferred doc types
• HTTP reply indicates doc type supplied
• Browser uses type-specific handler to “display” document
• Some types are natively supported, some not
• Helper applications for non-native types
• Easy to add new content types!

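The handler idea can be sketched as two lookup tables keyed by MIME type: one for natively supported types and one for helper applications. Everything here is an assumption made for illustration: the function names, the helper program names, and the fallback for unknown types do not describe any particular browser.

```python
def show_text(data):
    return "native text view: " + data.decode("ascii")

def show_gif(data):
    return f"native image view: {len(data)} bytes"

# Types the (hypothetical) browser displays itself.
NATIVE_HANDLERS = {
    "text/plain": show_text,
    "text/html": show_text,
    "image/gif": show_gif,
}

# Types handed to an external helper application (hypothetical mapping).
HELPER_APPS = {"application/postscript": "ghostview"}

def accept_header():
    """List the preferred doc types to send along with a request."""
    return ", ".join(sorted(set(NATIVE_HANDLERS) | set(HELPER_APPS)))

def display(content_type, data):
    """Pick a handler based on the doc type named in the reply."""
    if content_type in NATIVE_HANDLERS:
        return NATIVE_HANDLERS[content_type](data)
    if content_type in HELPER_APPS:
        return f"hand off to helper: {HELPER_APPS[content_type]}"
    return "unknown type: offer to save to disk"

print(accept_header())
print(display("text/plain", b"hi"))
print(display("application/postscript", b"%!PS"))
print(display("video/mpeg", b""))
```

Adding a new content type is then one more table entry, for example HELPER_APPS["video/mpeg"] = "mpeg_play".
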
MIME Types and Filename Extensions
• Useful mechanism for identifying MIME type of a file object
• Server relies on this heavily!
• Example: .ps, .eps, .epsi, .epsf => application/postscript
• Example: .gif => image/gif

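A server-side lookup of this kind can be as simple as a table from extension to type. The sketch below encodes only the two examples above. The function name mime_type_for and the application/octet-stream default for unknown extensions are assumptions, not something the slide specifies.

```python
import os.path

# Extension table taken from the slide's two examples.
EXTENSION_TYPES = {
    ".ps": "application/postscript",
    ".eps": "application/postscript",
    ".epsi": "application/postscript",
    ".epsf": "application/postscript",
    ".gif": "image/gif",
}

def mime_type_for(filename, default="application/octet-stream"):
    """Guess the MIME type from the file name's extension, ignoring case."""
    _, ext = os.path.splitext(filename)
    return EXTENSION_TYPES.get(ext.lower(), default)

print(mime_type_for("figure.EPS"))
print(mime_type_for("logo.gif"))
print(mime_type_for("README"))
```

Python's standard mimetypes module provides a fuller table through mimetypes.guess_type.
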
Other Network Service Protocols and Servers
• FTP client and ftpd
• Telnet client and telnetd
• NNTP client and nntpd
• Gopher client and gopherd
• ...
• Hooked to the Web by smart browsers

Web Browsers
• Support HTTP and possibly other protocols
• Display HTML documents natively
• Graphical browsers show in-line images/graphics
• Some support “special” HTML extensions
• Do all of the above to browse the Web!