Operation System II Presentation Project The World Wide


World Wide Web (WWW)
A Distributed Document-Based
System
Group E
Ricky Tong (D-A0-1611)
Eddy Leong (D-A0-1623)
Dick Lei (D-A0-1658)
Schedule of Presentation

Overview of World Wide Web
Document Model
HTML
DOM
XML
Document Type
MIME
Architectural Overview
Discussion Time
The World Wide Web

The WWW is a document-based system.
 It can be viewed as a huge distributed system
consisting of millions of clients and servers
for accessing linked documents.
 Servers maintain collections of documents,
while clients provide users with an easy-to-use
interface for presenting and accessing those
documents.
Overview of
World Wide Web

Documents are stored as files in the servers.
 Servers receive requests and send the requested
files to the clients.
 The client usually interacts with the web
server through a browser.
The overall organization of the
Web
Document Model

Some documents are represented as plain ASCII
text files.
 Some are expressed as a collection of scripts
that run in the browser automatically.
 Some contain references to other documents,
such as hyperlinks.
 The new document may replace the current
one or open in a new browser window.
HTML

Most web documents are expressed in
HTML.
 An HTML file contains small markup tags
telling the Web browser how to display the
page.
 An HTML file conventionally has a .htm or
.html extension.
 An HTML file can be created with a simple
text editor.
Example of HTML
<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>
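To make the role of the markup tags concrete, here is a minimal sketch that reads the example page with Python's standard html.parser module and records which text sits inside which tag. The class name TagCollector and its attributes are made up for this illustration; a real browser does far more than this.

```python
from html.parser import HTMLParser

PAGE = """<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>"""


class TagCollector(HTMLParser):
    """Records the text found directly inside each kind of tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # stack of currently open tags
        self.text_by_tag = {}    # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.text_by_tag.setdefault(self.open_tags[-1], []).append(text)


collector = TagCollector()
collector.feed(PAGE)
collector.close()
print(collector.text_by_tag)
```

The title text ends up under "title" and the bold text under "b", which is the information a browser uses to decide how each fragment is displayed.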
Document Object Model

DOM provides a standard programming
interface to parsed web documents.
 The interface is specified in CORBA IDL.
 The interface is used by the scripts
embedded in a document.
 Scripts can be used to inspect and modify
the document that they are part of.
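The inspect-and-modify idea can be sketched outside a browser with Python's xml.dom.minidom, which implements a subset of the DOM interface. This is only an illustration: in a Web page the same kind of calls would be made by a script embedded in the document, and the added paragraph text is invented for the example.

```python
from xml.dom import minidom

# The HTML example, written as one well-formed string so minidom can parse it.
doc = minidom.parseString(
    "<html><head><title>Title of page</title></head>"
    "<body>This is my first homepage. <b>This text is bold</b></body></html>"
)

# Inspect: read the text node inside <title>.
title = doc.getElementsByTagName("title")[0]
title_text = title.firstChild.data

# Modify: change the title and append a new <p> element to <body>.
title.firstChild.data = "New title"
body = doc.getElementsByTagName("body")[0]
para = doc.createElement("p")
para.appendChild(doc.createTextNode("Added through the DOM"))
body.appendChild(para)

print(title_text)
print(body.toxml())
```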
XML
(Extensible Markup Language)

XML is a meta-markup language providing
a format for describing structured data
 This facilitates more precise declarations of
content and more meaningful search results
across multiple platforms.
XML Example
<?xml version="1.0" ?>
<?xml-stylesheet href="greeting.xsl"
type="text/xsl"?>
<message>
<greeting>Hi</greeting>
<target>you all</target>
</message>
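Because the tags describe the data rather than its layout, a program can pull the fields out by name. The sketch below does this for the message above with Python's xml.etree.ElementTree; the variable names are chosen for the example only, and the stylesheet instruction is simply ignored by this parser.

```python
import xml.etree.ElementTree as ET

MESSAGE = """<?xml version="1.0" ?>
<?xml-stylesheet href="greeting.xsl" type="text/xsl"?>
<message>
<greeting>Hi</greeting>
<target>you all</target>
</message>"""

root = ET.fromstring(MESSAGE)

# Map each child element's tag name to its text content.
fields = {child.tag: child.text for child in root}
print(root.tag, fields)
```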
Other Document Types
There are many types of documents besides
HTML and XML:

Image: .gif and .jpeg
 Audio: .mp3
 Others: .pdf, etc.
MIME (Multipurpose Internet
Mail Extensions)

It was originally developed to provide
information on the content of a message
body that was sent as part of E-mail.
 It is a specification for enhancing the
capabilities of standard Internet E-mail.
 It offers a simple standardized way to
represent and encode a wide variety of
media types for transmission via Internet
mail.
The 7 Content-types defined
in MIME

Text - represents textual information
Image - transmits still images
Audio - transmits audio or voice data
Video - transmits video data or moving image data
Message - encapsulates an entire RFC 822 format
message
Multipart - combines several body parts of possibly
different types & subtypes
Application - transmits application or binary data
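As a rough illustration of how these content types appear in practice, the following sketch builds a small mail message with Python's standard email package. The subject, the text, and the six-byte stand-in for an image are invented for the example; the point is only that a text part plus an image part yields a Multipart message.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Greeting"
msg.set_content("Hi, you all")            # a Text part (text/plain)

# Attaching binary data turns the message into a Multipart container.
# The bytes below are a placeholder, not a complete GIF image.
msg.add_attachment(b"GIF89a", maintype="image", subtype="gif",
                   filename="dot.gif")

# walk() visits the container first, then each body part.
types = [part.get_content_type() for part in msg.walk()]
print(types)
```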
CGI
(Common Gateway Interface)

It is a standard for interfacing external
applications with information servers, such
as HTTP or Web servers.
 A CGI program is executed in real time, so it
can output dynamic information.
The principle of using server-side CGI programs
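A minimal sketch of what a CGI program does, assuming the server passes the request in environment variables such as QUERY_STRING and sends the program's output back to the client. The function name run_cgi and the "name" parameter are hypothetical; a real CGI script would read os.environ and write to standard output instead of returning a string.

```python
import html
from urllib.parse import parse_qs


def run_cgi(environ):
    """Builds the output a CGI program would write to standard output."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = "<html><body>Hello, " + html.escape(name) + "</body></html>"
    # A CGI response starts with header lines, then a blank line, then the body.
    return "Content-Type: text/html\r\n\r\n" + body


output = run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Group+E"})
print(output)
```

Because the page is generated for each request, different query strings produce different documents.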
Server-side script

It is executed by the server after the
document has been fetched locally and before
the result is sent to the client.
Client-side using JavaScript
<script language="JavaScript">
<!--
// script code here
// -->
</script>
Server-side using ASP
<%
'
' script code here
'
%>
Client-side script

A client-side script is simply software designed
to be run by the browser.
Applet

It is another method to pass precompiled
programs to a client.
 An applet is a small Java program embedded
in an HTML page.
 For security reasons, applets cannot read or
write data on the client computer.
 The applet can only be executed if your
browser supports Java.
Servlet

A servlet is a precompiled program that is
executed in the address space of the server.
 Servlets are Java technology's answer to CGI
programming.
 They are useful when the Web page is based on
data submitted by the user.
 They are also useful when the underlying data
changes frequently.
Architectural details of a client
and server in the Web
HTTP Connections

HTTP is a client-server protocol by which
two machines can communicate over a
TCP/IP connection.
 HTTP is the protocol used for document
exchange in the World-Wide-Web.
 Everything that happens on the web
happens over HTTP transactions.
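One HTTP transaction can be sketched end to end with Python's standard library: a tiny server hands out a document and a client fetches it over a TCP connection on the local machine. The handler class, the page content, and the path /index.html are all invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"


class DocumentHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same small document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port,
                                          timeout=5)
        conn.request("GET", "/index.html")      # request message
        resp = conn.getresponse()               # response message
        result = (resp.status, resp.getheader("Content-Type"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
print(status, content_type, body.decode())
```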
HTTP Headers

General Header Fields (Used in both request and
response messages)
 Request Header Fields (Used in request messages
only)
 Response Header Fields (Used in response
messages only)
 Entity Header Fields (Used in both request and
response messages, containing information
about the entity-body of the message)
Request Header Example
GET /articles/news/today.asp HTTP/1.1
Accept: */*
Accept-Language: en-us
Connection: Keep-Alive
Host: localhost
Referer: http://localhost/links.asp
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Accept-Encoding: gzip, deflate
Response Header Example
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Thu, 13 Jul 2000 05:46:53 GMT
Content-Length: 2291
Content-Type: text/html
Set-Cookie: ASPSESSIONIDQQGGGNCG=LKLDFFKCINFLDMFHCBCBMFLJ; path=/
Cache-control: private
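The messages above have a simple shape: one start line, then "Name: value" header lines, then a blank line. The sketch below splits a shortened copy of the request example into those pieces; the function name parse_request is made up for the illustration and it ignores details such as repeated or folded headers.

```python
REQUEST = (
    "GET /articles/news/today.asp HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Accept-Language: en-us\r\n"
    "Connection: Keep-Alive\r\n"
    "Host: localhost\r\n"
    "\r\n"
)


def parse_request(text):
    """Splits a request into method, target, version and a header dict."""
    head = text.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


method, target, version, headers = parse_request(REQUEST)
print(method, target, version)
print(headers)
```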
Web Server

A Web server uses the client/server model
and the WWW Hypertext Transfer Protocol.
 Every computer on the Internet that
contains a Web site must have a Web server
program.
 Two leading Web servers are Apache, the
most widely installed Web server, and
Microsoft's Internet Information Server
(IIS).
Apache Server
Processing HTTP Requests in
Apache Server

1. Resolving the document reference to a local file
name.
2. Client authentication.
3. Client access control.
4. Request access control.
5. MIME type determination of the response.
6. General phase for handling leftovers.
7. Transmission of the response.
8. Logging data on the processing of the request.
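The eight phases can be pictured as a chain of small handlers that each add something to the request. The sketch below is not Apache's module API; every function name, the document root, and the access policy are assumptions chosen only to show the order of the steps.

```python
import os

DOCUMENT_ROOT = "/var/www"   # hypothetical document root
MIME_TYPES = {".html": "text/html", ".gif": "image/gif"}
REASONS = {200: "OK", 403: "Forbidden"}


def resolve(req):            # 1. document reference -> local file name
    req["file"] = DOCUMENT_ROOT + req["path"]

def authenticate(req):       # 2. client authentication
    req["user"] = req.get("user", "anonymous")

def client_access(req):      # 3. client access control (placeholder: allow all)
    req["allowed"] = True

def request_access(req):     # 4. request access control (placeholder rule)
    req["allowed"] = req["allowed"] and ".." not in req["path"]

def mime_type(req):          # 5. MIME type of the response
    ext = os.path.splitext(req["file"])[1]
    req["content_type"] = MIME_TYPES.get(ext, "application/octet-stream")

def leftovers(req):          # 6. general phase for handling leftovers
    req["status"] = 200 if req["allowed"] else 403

def transmit(req):           # 7. transmission of the response (headers only here)
    req["response"] = "HTTP/1.1 %d %s\r\nContent-Type: %s\r\n\r\n" % (
        req["status"], REASONS[req["status"]], req["content_type"])

def log(req):                # 8. logging
    req["log"] = "%s %s %d" % (req["user"], req["path"], req["status"])


PHASES = [resolve, authenticate, client_access, request_access,
          mime_type, leftovers, transmit, log]


def process(path):
    req = {"path": path}
    for phase in PHASES:
        phase(req)
    return req


result = process("/index.html")
print(result["log"])
```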
Server Cluster
The principle of TCP handoff
Scalable content-aware
cluster of web servers
Uniform Resource Identifiers
(URI)

A URI (Uniform Resource Identifier) is a way to
identify a point of content, such as a document.
 The most common form of URI is the Web page
address.
 A URI typically describes:
The mechanism used to access the resource
The specific computer that the resource is housed in
The specific name of the resource (a file name) on the computer
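These three parts can be read off a Web address with Python's urllib.parse. The address used here is assembled from the host and path in the request header example, purely as an illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://localhost/articles/news/today.asp")

mechanism = parts.scheme      # how to access the resource
computer = parts.hostname     # where the resource is housed
name = parts.path             # name of the resource on that computer
print(mechanism, computer, name)
```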
Uniform Resource Locator
(URL)

A URL contains information on how and
where to access a document.
Uniform Resource Name
(URN)

A URN is an Internet resource with a name that
has persistent significance.
A URN looks something like a Web page address
or URL.
Example: urn:def://blue_laser
Both URNs and URLs are types of URI.
The URN is still being developed by members of
the Internet Engineering Task Force (IETF).
Web Distributed Authoring and
Versioning (WebDAV)

WebDAV is an extension to HTTP.
 WebDAV provides a simple means to lock a
shared document, and to create, delete, copy,
and move documents from remote Web
servers.
 WebDAV supports a simple locking
mechanism.
 There are two types of write locks, the
exclusive write lock, and the shared write
lock.
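The difference between the two write locks can be sketched with an in-memory table: an exclusive lock is granted only when nobody else holds a lock, while several clients may hold shared locks on the same document at once. The class LockTable and its methods are hypothetical and leave out real WebDAV details such as lock tokens and timeouts.

```python
class LockTable:
    """Toy model of exclusive and shared write locks on documents."""

    def __init__(self):
        self._locks = {}   # resource -> (scope, set of owners)

    def lock(self, resource, owner, scope):
        if scope not in ("exclusive", "shared"):
            raise ValueError("scope must be 'exclusive' or 'shared'")
        held = self._locks.get(resource)
        if held is None:
            self._locks[resource] = (scope, {owner})
            return True
        held_scope, owners = held
        # A new shared lock is compatible only with existing shared locks.
        if scope == "shared" and held_scope == "shared":
            owners.add(owner)
            return True
        return False

    def unlock(self, resource, owner):
        held = self._locks.get(resource)
        if held is None or owner not in held[1]:
            return False
        held[1].discard(owner)
        if not held[1]:
            del self._locks[resource]
        return True


table = LockTable()
print(table.lock("/report.doc", "ricky", "exclusive"))
print(table.lock("/report.doc", "eddy", "exclusive"))
print(table.lock("/notes.doc", "eddy", "shared"))
print(table.lock("/notes.doc", "dick", "shared"))
```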
Web Proxy Caching

Simple caching facility in the browser
 Web-proxy caching
 Caches can cover a region or even a country,
which leads to hierarchical caching.
Neighbor Proxy Caching
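Hierarchical caching can be sketched as a chain of caches in which each level asks its parent on a miss and keeps a copy on the way back. The class Cache, the level names, and the ORIGIN dictionary standing in for the Web server are all assumptions for this illustration; expiry and validation of cached copies are left out.

```python
ORIGIN = {"/index.html": "<html>home</html>"}   # stand-in for the origin server


class Cache:
    """One cache level that forwards misses to its parent, if any."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, url, trace):
        if url in self.store:
            trace.append(self.name + " hit")
            return self.store[url]
        trace.append(self.name + " miss")
        if self.parent is not None:
            doc = self.parent.get(url, trace)
        else:
            trace.append("origin fetch")
            doc = ORIGIN[url]
        self.store[url] = doc      # keep a copy for later requests
        return doc


national = Cache("national proxy")
regional = Cache("regional proxy", parent=national)
browser = Cache("browser", parent=regional)

first, second = [], []
browser.get("/index.html", first)    # travels up the whole hierarchy
browser.get("/index.html", second)   # answered by the browser cache
print(first)
print(second)
```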
Server Replication

Fault tolerance in the Web is mainly
achieved through client-side caching and
server replication.
 High availability in the Web is achieved
through redundancy that makes use of
generally available techniques in crucial
services such as DNS.
Security

Most of the security issues in the Web deal
with setting up a secure channel between a
client and server.
 The predominant approach for setting up a
secure channel in the Web is to use the
Secure Sockets Layer (SSL).
 Transport Layer Security (TLS) is an update
of SSL.
The position of TLS in the
Internet protocol stack
TLS with mutual
authentication
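As a rough sketch of how mutual authentication is configured, the following uses Python's ssl module to prepare one context for the client and one for the server. No connection is made and no certificates are loaded; the file names in the comments are hypothetical.

```python
import ssl

# Client side: the default context verifies the server's certificate
# and checks that it matches the host name.
client_ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

# Server side: requiring a client certificate makes the authentication mutual.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.verify_mode = ssl.CERT_REQUIRED

# A real server would also load its own certificate and the CA it trusts
# for client certificates, for example (hypothetical file names):
#   server_ctx.load_cert_chain("server.crt", "server.key")
#   server_ctx.load_verify_locations("trusted_client_ca.crt")

client_verifies_server = client_ctx.verify_mode == ssl.CERT_REQUIRED
server_verifies_client = server_ctx.verify_mode == ssl.CERT_REQUIRED
print(client_verifies_server, client_ctx.check_hostname, server_verifies_client)
```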
The End