Jigsaw
W3C’s Java Web Server
CHEN Ge CSIS, HKU March 9, 2000.
What is Jigsaw?
• Jigsaw is a Web server developed by
W3C. The source code examined here is version 2.0.4.
• Jigsaw is written in pure Java, and
Jigsaw’s documentation claims “Jigsaw will
run on any platform that supports Java,
with no changes.”
Jigsaw’s Internal Design
• Basic Concepts in Jigsaw
– Resource
– Frame
– Filter
– Indexer
Jigsaw’s Internal Design
– A Resource is a full Java object, containing
only information that the raw Resource (a
file, a directory...) can provide (e.g., for a file,
the size, last modification date...)
Jigsaw’s Internal Design
– A Frame is a full Java Object, containing all
the information needed to serve this
Resource using a specific Protocol (e.g.,
HTTPFrame for HTTP).
Jigsaw attaches Frames to Resources to
handle protocol related activities.
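The split between Resource and Frame can be sketched as follows. This is only an illustration under assumed names (SketchResource, SketchFrame, HttpSketchFrame are hypothetical and are not Jigsaw's real classes): the resource holds what the file itself can report, and the attached frame adds what one protocol needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

/** Holds only what the raw object (here, a file) can tell us about itself. */
class SketchResource {
    final String name;
    final long sizeInBytes;
    final long lastModifiedMillis;
    private final List<SketchFrame> frames = new ArrayList<>();

    SketchResource(String name, long sizeInBytes, long lastModifiedMillis) {
        this.name = name;
        this.sizeInBytes = sizeInBytes;
        this.lastModifiedMillis = lastModifiedMillis;
    }

    /** Attach a protocol-specific frame to this resource. */
    void attach(SketchFrame frame) { frames.add(frame); }

    /** Find the attached frame that speaks the given protocol, or null. */
    SketchFrame frameFor(String protocol) {
        for (SketchFrame f : frames) {
            if (f.protocol().equals(protocol)) return f;
        }
        return null;
    }
}

/** Holds what is needed to serve a resource over one protocol. */
interface SketchFrame {
    String protocol();
    String describe(SketchResource resource);
}

/** A frame for HTTP: it adds protocol data (a content type) to the resource. */
class HttpSketchFrame implements SketchFrame {
    private final String contentType;
    HttpSketchFrame(String contentType) { this.contentType = contentType; }
    public String protocol() { return "HTTP"; }
    public String describe(SketchResource r) {
        return "Content-Type: " + contentType + ", Content-Length: " + r.sizeInBytes;
    }
}

class ResourceFrameSketch {
    public static void main(String[] args) {
        SketchResource file = new SketchResource("index.html", 1024L, 0L);
        file.attach(new HttpSketchFrame("text/html"));
        System.out.println(file.frameFor("HTTP").describe(file));
    }
}
```

Keeping protocol data out of the resource means the same resource could, in principle, carry frames for more than one protocol.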
Jigsaw’s Internal Design
– A Filter is a full Java Object, associated with a
Frame, that can modify the Request and/or
the Reply. For example, authentication is
handled by a special filter.
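A minimal sketch of a filter that inspects the request before the frame serves it, assuming hypothetical types (FilterRequest, FilterReply, SketchFilter, AuthSketchFilter) and a made-up token check. It is not Jigsaw's filter API; it only shows how a filter can stop or pass a request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

class FilterRequest {
    final String path;
    final Map<String, String> headers = new HashMap<>();
    FilterRequest(String path) { this.path = path; }
}

class FilterReply {
    final int status;
    final String body;
    FilterReply(int status, String body) { this.status = status; this.body = body; }
}

/** A filter sees the request before the frame and the reply after it. */
interface SketchFilter {
    /** Return a reply to short-circuit the request, or null to let it continue. */
    FilterReply before(FilterRequest request);
    /** Optionally replace the reply produced by the frame. */
    FilterReply after(FilterRequest request, FilterReply reply);
}

/** Rejects requests that lack the expected (made-up) credential header. */
class AuthSketchFilter implements SketchFilter {
    private final String expectedToken;
    AuthSketchFilter(String expectedToken) { this.expectedToken = expectedToken; }

    public FilterReply before(FilterRequest request) {
        String token = request.headers.get("Authorization");
        if (!expectedToken.equals(token)) {
            return new FilterReply(401, "Unauthorized");
        }
        return null; // authenticated: continue to the frame
    }

    public FilterReply after(FilterRequest request, FilterReply reply) {
        return reply; // this filter leaves replies unchanged
    }
}

/** A frame that runs its filter around the actual serving step. */
class FilteredFrameSketch {
    private final SketchFilter filter;
    FilteredFrameSketch(SketchFilter filter) { this.filter = filter; }

    FilterReply perform(FilterRequest request) {
        FilterReply early = filter.before(request);
        if (early != null) return early;
        FilterReply reply = new FilterReply(200, "contents of " + request.path);
        return filter.after(request, reply);
    }

    public static void main(String[] args) {
        FilteredFrameSketch frame = new FilteredFrameSketch(new AuthSketchFilter("secret"));
        FilterRequest anonymous = new FilterRequest("/private/a.html");
        System.out.println(frame.perform(anonymous).status);
        FilterRequest known = new FilterRequest("/private/a.html");
        known.headers.put("Authorization", "secret");
        System.out.println(frame.perform(known).status);
    }
}
```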
Jigsaw’s Internal Design
– An Indexer is also a full Java Object, which
tries to create and set up resources
automatically. Resources can be created
depending on their name or their extension.
Once a resource has been created, the
Indexer is also in charge of attaching the
right frames to it, such as the HTTP
frame, the filters, and so on.
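The Indexer's job can be sketched as a lookup from file extension to the frame that should be attached. The class names and the extension table below are assumptions made for illustration, not Jigsaw's actual indexer configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

class IndexedResource {
    final String name;
    final List<String> frames = new ArrayList<>(); // frame names, kept as strings for brevity
    IndexedResource(String name) { this.name = name; }
}

/** Creates resources from file names and attaches frames chosen by extension. */
class ExtensionIndexerSketch {
    private final Map<String, String> contentTypeByExtension = new HashMap<>();

    ExtensionIndexerSketch() {
        // Assumed example mappings.
        contentTypeByExtension.put("html", "text/html");
        contentTypeByExtension.put("gif", "image/gif");
    }

    /** Returns a configured resource, or null when the extension is unknown. */
    IndexedResource index(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return null;
        String contentType = contentTypeByExtension.get(fileName.substring(dot + 1));
        if (contentType == null) return null;
        IndexedResource resource = new IndexedResource(fileName);
        resource.frames.add("HTTP frame (" + contentType + ")");
        return resource;
    }

    public static void main(String[] args) {
        ExtensionIndexerSketch indexer = new ExtensionIndexerSketch();
        System.out.println(indexer.index("index.html").frames);
        System.out.println(indexer.index("notes.xyz"));
    }
}
```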
Jigsaw’s Internal Design
• A sample FileResource:
Jigsaw’s Internal Design
• The inheritance tree of Jigsaw:
Jigsaw’s Internal Design
• The ResourceStoreManager
– Jigsaw uses a ResourceStoreManager to
manage all the resources used while
the server is running.
– The ResourceStoreManager’s implementation
uses a simple cache policy to make
resource references more efficient.
Jigsaw’s Internal Design
• The ResourceStoreManager
– It seems that, if we start multiple Jigsaw servers on
one machine, all the servers will share the
same ResourceStoreManager.
How Jigsaw Works?
• Jigsaw uses Java threads extensively.
All the major classes of Jigsaw
run as Java threads.
• Jigsaw separates serving a document into
two processing stages:
– Indexing Stage
– Serving Stage
How Jigsaw Works?
• Separating document serving into two
stages makes resource lookup and
resource sharing more efficient.
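One way to picture why two stages help sharing, as a sketch only: the slides do not spell out what each stage does, so the division below (stage one resolves a path to a resource object once, stage two serves from that object) and all names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

class StagedResource {
    final String path;
    StagedResource(String path) { this.path = path; }
}

class TwoStageSketch {
    private final Map<String, StagedResource> loaded = new HashMap<>();
    int indexingRuns = 0; // counts how often stage one had to build a resource

    /** Stage one: resolve a path to a resource object, building it only once. */
    synchronized StagedResource index(String path) {
        StagedResource resource = loaded.get(path);
        if (resource == null) {
            indexingRuns++;
            resource = new StagedResource(path);
            loaded.put(path, resource);
        }
        return resource;
    }

    /** Stage two: produce a reply from an already indexed resource. */
    String serve(StagedResource resource) {
        return "200 OK: " + resource.path;
    }

    public static void main(String[] args) {
        TwoStageSketch server = new TwoStageSketch();
        StagedResource first = server.index("/archives/index.html");
        StagedResource second = server.index("/archives/index.html");
        System.out.println(first == second);   // both requests share one object
        System.out.println(server.indexingRuns);
        System.out.println(server.serve(second));
    }
}
```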
How Jigsaw Works?
• We use a sample HTTP request-handling
process to explain how Jigsaw
works and to look at some important classes
in Jigsaw.
How Jigsaw Works?
• When Jigsaw starts up:
– An instance of the class httpd is created.
How Jigsaw Works?
[Diagram: httpd after Initialize(), shown with its Indexer, manager, and root]
How Jigsaw Works?
• After initialization, httpd calls
initializeServerSocket() to create the
server socket and the SocketClientFactory,
which maintains a pool of SocketClients.
[Diagram: httpd after initializeServerSocket(), shown with its Indexer, manager, root, factory, and server port]
How Jigsaw Works?
• After creating the server socket
successfully, httpd creates a thread,
assigns itself to the thread, and runs as a
thread.
this.thread = new Thread(this);
…
this.thread.start();  // start(), not run(): run() would execute on the calling thread
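For a Runnable to execute on its own thread, it has to be launched with Thread.start(); calling Thread.run() directly just executes the body on the calling thread. The sketch below (the class name and thread name are made up for illustration) records which thread actually ran the task in each case.

```java
// A sketch of the difference between Thread.start() and Thread.run().
class StartVersusRunSketch implements Runnable {
    private volatile String ranOn; // name of the thread that executed run()

    public void run() {
        ranOn = Thread.currentThread().getName();
    }

    /** Runs the task via start() and reports which thread executed it. */
    static String viaStart() {
        StartVersusRunSketch task = new StartVersusRunSketch();
        Thread thread = new Thread(task, "server-thread");
        thread.start();
        try {
            thread.join(); // wait so that ranOn has been written
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return task.ranOn;
    }

    /** Runs the task via run() and reports which thread executed it. */
    static String viaRun() {
        StartVersusRunSketch task = new StartVersusRunSketch();
        Thread thread = new Thread(task, "server-thread");
        thread.run(); // plain method call: no new thread is created
        return task.ranOn;
    }

    public static void main(String[] args) {
        System.out.println(viaStart()); // the new thread's name
        System.out.println(viaRun());   // the caller's thread name
    }
}
```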
How Jigsaw Works?
public void run () {
    …
    // Accept connections until the server is shutting down or the socket is closed.
    while ( ( ! finishing) && ( socket != null ) ) {
        Socket ns = null ;
        try {
            ns = socket.accept() ;   // blocks until a client connects
            ns.setTcpNoDelay(true);
        } catch (IOException e) {
            …
        }
        // Hand the accepted connection to the client pool.
        if ( (socket != null) && (ns != null) && (factory != null) )
            factory.handleConnection (ns) ;
    }
    // Our socket has been closed, perform associated cleanup.
    cleanup(restarting) ;
}
How Jigsaw Works?
• When there’s an incoming connection
request, the httpd thread uses the client
pool to handle the request.
[Diagram: a request for /archives/index.html reaches httpd, which calls handleConnection() on its factory, the SocketClientFactory]
How Jigsaw Works?
• SocketClientFactory maintains a pool of
SocketClients. It either finds a free
SocketClient thread in the pool, if one is
available, or kills some old connections to
get a free SocketClient, if the load has
not exceeded the maximum load.
How Jigsaw Works?
• When the Factory finds a free SocketClient,
it binds the incoming socket to the
SocketClient, and the SocketClient
starts a thread to perform the request.
• Because all the SocketClients run as
threads, when a connection is bound to a
SocketClient, the server can keep on
listening on the net for new incoming
requests.
How Jigsaw Works?
• The SocketClient then processes the
incoming request and calls httpd’s
perform(Request) method to look up
the necessary resources. The perform
method returns an object
containing all the resources needed to
answer the request. The SocketClient uses this
object to send back the result.
How Jigsaw Works?
• After the request has been answered, the
SocketClient closes the connection and
returns itself to the free SocketClient pool
in the SocketClientFactory.
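The take, bind, serve, and return cycle described above can be sketched with a small fixed pool. All names here (ClientPoolSketch, PooledClient) are hypothetical, connections are plain strings instead of sockets, and the sketch omits the step of killing old connections under load.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

/** A fixed pool of reusable clients, each serving one connection at a time. */
class ClientPoolSketch {
    private final Deque<PooledClient> free = new ArrayDeque<>();
    private final int size;
    private int served = 0;

    ClientPoolSketch(int size) {
        this.size = size;
        for (int i = 0; i < size; i++) free.push(new PooledClient(this));
    }

    /** Bind a connection to a free client; returns false when none is free. */
    synchronized boolean handleConnection(String connection) {
        PooledClient client = free.poll();
        if (client == null) return false;
        client.bind(connection);
        return true;
    }

    /** Called by a client when its connection is finished. */
    synchronized void release(PooledClient client) {
        served++;
        free.push(client);
        notifyAll();
    }

    /** Block until every client is back in the pool. */
    synchronized void awaitIdle() {
        while (free.size() < size) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    synchronized int freeCount() { return free.size(); }
    synchronized int servedCount() { return served; }
}

class PooledClient {
    private final ClientPoolSketch pool;
    PooledClient(ClientPoolSketch pool) { this.pool = pool; }

    /** Serve the connection on a separate thread so the caller can keep accepting. */
    void bind(String connection) {
        Thread worker = new Thread(() -> {
            // A real client would read the request and write the reply here.
            pool.release(this);
        });
        worker.start();
    }
}

class ClientPoolDemo {
    public static void main(String[] args) {
        ClientPoolSketch pool = new ClientPoolSketch(2);
        pool.handleConnection("connection-1");
        pool.handleConnection("connection-2");
        pool.awaitIdle();
        System.out.println(pool.freeCount() + " free, " + pool.servedCount() + " served");
    }
}
```

Because handleConnection() returns as soon as the client is bound, the accepting thread is free to go back to listening while the client works.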
Some Notices on Jigsaw
• Jigsaw uses a thread pool (cache) to
handle incoming requests, instead of creating
a new process or thread for each one.
• Jigsaw caches requested resources in a
resource hashtable according to the
resource id, in order to reduce file system
access.
Some Notices on Jigsaw
• When httpd looks up the
requested resource for a read-only
request, it only increases the lock
count (reference count) of that object,
so one resource can be accessed
simultaneously by multiple requests.
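A minimal sketch of the lock-count idea, with hypothetical names: readers only bump a counter, so several requests can hold the same resource at once. The canUnload() check is an added assumption about how such a count might be used, not something the slides state.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names for illustration; these are not Jigsaw's real classes.

/** A shared resource that counts how many requests currently hold it. */
class CountedResourceSketch {
    private final AtomicInteger lockCount = new AtomicInteger();
    private final String content;

    CountedResourceSketch(String content) { this.content = content; }

    /** A read-only request takes a reference instead of an exclusive lock. */
    String acquireForRead() {
        lockCount.incrementAndGet();
        return content;
    }

    /** Drop the reference once the request has been served. */
    void release() { lockCount.decrementAndGet(); }

    int lockCount() { return lockCount.get(); }

    /** Assumption: the resource may be discarded only while nobody references it. */
    boolean canUnload() { return lockCount.get() == 0; }

    public static void main(String[] args) {
        CountedResourceSketch page = new CountedResourceSketch("<html>...</html>");
        page.acquireForRead(); // first request
        page.acquireForRead(); // second request, served at the same time
        System.out.println(page.lockCount());
        page.release();
        page.release();
        System.out.println(page.canUnload());
    }
}
```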
Some Notices on Jigsaw
• Jigsaw also uses an LRU algorithm to
discard long-unused resources from memory.
It also kills long-idle connections and
SocketClients to decrease the server’s
workload.
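The slides do not say how Jigsaw implements its LRU policy. As a generic illustration of least-recently-used eviction, here is a small cache built on the standard LinkedHashMap in access order; the class name and capacity are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Least-recently-used cache built on LinkedHashMap's access ordering. */
class LruResourceCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruResourceCacheSketch(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }

    public static void main(String[] args) {
        LruResourceCacheSketch<String, String> cache = new LruResourceCacheSketch<>(2);
        cache.put("/a.html", "A");
        cache.put("/b.html", "B");
        cache.get("/a.html");      // touch /a.html so /b.html becomes the eldest
        cache.put("/c.html", "C"); // exceeds capacity, evicts /b.html
        System.out.println(cache.keySet()); // prints [/a.html, /c.html]
    }
}
```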
Some Notices on Jigsaw
• Jigsaw does not provide a direct configuration
option for running it cooperatively across multiple
machines. But its internal design
makes it possible to replace or add some
of its classes to gain new functions.
End of Jigsaw
High-performance Web
Servers in other institutes
• JAWS: a Web server from the CS Dept.,
Washington University
– JAWS’s findings:
• “Factoring out I/O, the primary determinant to
server performance is the concurrency
strategy”
• “For single CPU machines, single-threaded
solutions are acceptable and perform well.
However, they do not scale for multi-processor
platforms.”
High-performance Web
Servers in other institutes
– JAWS’s findings:
• Process-based concurrency implementations
perform reasonably well when the network is
the bottleneck. However, on high-speed
networks like ATM, the cost of spawning a
new process per request is relatively high.
• Multi-threaded designs appear to be the
choice of the top Web server performers. The
cost of spawning a thread is much cheaper
than that of a process.
High-performance Web
Servers in other institutes
– JAWS’ Framework Overview:
High-performance Web
Servers in other institutes
– “... the key to developing high performance Web
systems is through a design which is flexible
enough to accommodate different strategies for
dealing with server load and is configurable from
a high level specification describing the
characteristics of the machine and the expected
use load of the server.”
– More on JAWS:
• http://www.cs.wustl.edu/~jxh/research/research.html
High-performance Web
Servers in other institutes
• Scalable Web Server Architecture from
Lucent & UT Austin
High-performance Web
Servers in other institutes
• Scalable Web Server Architecture from
Lucent & UT Austin
– Redirection Server is used
– Data is distributed among the servers
– Problems occur when documents are constantly moved
High-performance Web
Servers in other institutes
• Web Server Clusters in UCSB
– Master/Slave Architecture
High-performance Web
Servers in other institutes
• Web Server Clusters in UCSB
– Masters handle static requests.
– Dynamic content requests may be processed
locally at masters, or redirected to a slave
node or another master.
– Slaves may be either dedicated or nondedicated.