notes - Academic Csuohio - Cleveland State University

EEC-681/781
Distributed Computing Systems
Lecture 8
Wenbing Zhao
Cleveland State University
Outline
• Midterm#1 results
• Processes and threads
• Clients and Servers
Fall Semester 2006
EEC-681: Distributed Computing Systems
Wenbing Zhao
Midterm#1 Results
• P1 mean: 37/40
• P2 mean: 34/40
• P3 mean: 18/20
• Max: 98
• Min: 72
• Mean: 89
[Figure: histogram of Number of Students by Grade Range (60-69, 70-79, 80-89, 90-100)]
Process
• Communication takes place between processes
• Process is a program in execution
• For an OS, process management and
scheduling are most important
• For distributed systems, other issues are equally
or more important
– Multithreading
– Client-Server organization
– Code migration
– Software agent
Process
• An operating system creates a number of virtual
processors, each one for running a different
program
• To keep track of these virtual processors, OS
maintains a process table
– CPU register values, memory maps, open files,
accounting info, privileges, etc.
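As a rough illustration of the process table described above, the sketch below models one table entry per process. The class name, field names, and the `create_process` helper are hypothetical; a real kernel keeps this information in its own internal structures.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessTableEntry:
    """One row of a (hypothetical) process table."""
    pid: int
    registers: dict = field(default_factory=dict)    # saved CPU register values
    memory_map: dict = field(default_factory=dict)   # virtual-to-physical mappings
    open_files: list = field(default_factory=list)   # open file descriptors
    cpu_time_used: float = 0.0                       # accounting info
    privileges: set = field(default_factory=set)     # what the process may do


# The table itself: process ID -> entry.
process_table = {}


def create_process(pid, privileges=()):
    """Add a new entry to the table and return it."""
    entry = ProcessTableEntry(pid=pid, privileges=set(privileges))
    process_table[pid] = entry
    return entry
```

Each entry gets its own register set, memory map, and file list, which mirrors the idea that every process runs on its own virtual processor.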
Process
• OS ensures concurrency transparency for
different processes that share the same CPU
and other hardware resources
• Each process has its own address space
• Switching the CPU between two processes is
expensive
– The OS must save the CPU context, modify the registers
of the memory management unit (MMU), and invalidate
address translation caches such as the translation
lookaside buffer (TLB)
Motivation to Use a Finer Granularity
• It is hard to program a single-threaded process for efficient
distributed computing
– Difficult to use non-blocking system calls
• Could have used a pool of processes, but
– Creation/deletion of a process is expensive
– Inter-process communication (IPC) is expensive
Introduction to Threads
• Thread: A minimal software processor in whose
context a series of instructions can be executed
• Saving a thread context implies stopping the
current execution and saving all the data needed
to continue the execution at a later stage
• A process can have one or more threads
• Threads share the same address space.
=> Thread context switching can be done
entirely independently of the operating system
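To make the idea of switching contexts without the operating system concrete, here is a minimal sketch that uses Python generators as a stand-in for user-level threads. This is an assumption made for illustration, not how a real thread package is built: the saved "context" is simply the generator's frame (local variables and resume point), and `worker` and `run_round_robin` are hypothetical names.

```python
from collections import deque


def worker(name, steps, log):
    """A cooperative 'thread': does one step, then yields the processor."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary context switch; no kernel involvement


def run_round_robin(tasks):
    """A user-space scheduler that resumes each runnable task in turn."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task where it last stopped
            ready.append(task)  # still runnable: back of the ready queue
        except StopIteration:
            pass                # the task has finished


log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Both workers append to the same `log` list, which reflects the point that threads of one process share a single address space.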
Context Switching
• Creating and destroying threads is much
cheaper than doing so for processes
• Process switching is generally more
expensive as it involves getting the OS in
the loop, i.e., trapping to the kernel
Threads and Distributed Systems
• Multithreaded clients:
– Hiding network latency
• Multithreaded servers:
– Improved performance and
– Better structure
Multithreaded Clients
• Multithreaded clients: hiding network latency
• Multithreaded Web client:
– Web browser scans an incoming HTML page, and
finds that more files need to be fetched
– Each file is fetched by a separate thread, each doing
a (blocking) HTTP request
– As files come in, the browser displays them
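The multithreaded Web client described above can be sketched as follows. The `fetch` function is a hypothetical stand-in for a blocking HTTP request (it only sleeps to simulate network latency), so the example runs without a network.

```python
import threading
import time


def fetch(url):
    """Hypothetical stand-in for a blocking HTTP request."""
    time.sleep(0.05)  # simulated network latency
    return f"<contents of {url}>"


def fetch_all(urls):
    """Fetch every URL in its own thread and collect the results."""
    results = {}
    lock = threading.Lock()

    def fetch_one(url):
        body = fetch(url)  # blocks only this thread, not the others
        with lock:
            results[url] = body

    threads = [threading.Thread(target=fetch_one, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each blocking call runs in a separate thread, the waiting times overlap instead of adding up, which is the latency hiding the slide refers to.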
Multithreaded Servers
• Improve performance:
– Starting a thread to handle an incoming request is
much cheaper than starting a new process
– A multithreaded server can scale well on a
multiprocessor system
– Hide network latency by reacting to the next request while
the reply to the previous one is being sent
Multithreaded Servers
• Better server structure:
– Using simple, well-understood blocking calls
simplifies the overall structure
– Multithreaded programs can be smaller and easier to
understand due to simplified flow of control
Multithreaded Servers
• Dispatcher/worker model
– Thread-per-object
– Thread-per-request
– Thread-per-client
– Thread pool
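One of the variants listed above, the thread pool, might look like the sketch below. The `handle` function and the `serve` signature are hypothetical; requests are plain strings so the example stays self-contained, and the requests are assumed to be distinct because replies are keyed by request.

```python
import queue
import threading


def handle(request):
    """Hypothetical request handler; real work would go here."""
    return request.upper()


def serve(requests, num_workers=3):
    """Dispatch requests to a fixed pool of worker threads."""
    work = queue.Queue()
    replies = {}
    lock = threading.Lock()

    def worker():
        while True:
            req = work.get()     # blocks until the dispatcher hands over work
            if req is None:      # sentinel: no more requests
                break
            result = handle(req)
            with lock:
                replies[req] = result

    pool = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in pool:
        t.start()
    for req in requests:         # the dispatcher role
        work.put(req)
    for _ in pool:               # one sentinel per worker
        work.put(None)
    for t in pool:
        t.join()
    return replies
```

The pool is created once and reused, so the cost of thread creation is not paid per request, in contrast to a thread-per-request design.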
Multithreaded Servers
• Three ways to construct a server:
Model                      Characteristics
Threads                    Parallelism, blocking system calls
Single-threaded process    No parallelism, blocking system calls
Finite-state machine       Parallelism, nonblocking system calls
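The finite-state machine model is the least intuitive of the three, so a small sketch may help. It assumes Python's standard `selectors` module and already-connected sockets; the function name `run_fsm_server`, the state names, and the uppercase-echo "service" are hypothetical. A single thread serves all connections and records, per connection, whether it is waiting to read or to write.

```python
import selectors
import socket


def run_fsm_server(conns):
    """Serve all connections in `conns` from ONE thread with nonblocking I/O."""
    sel = selectors.DefaultSelector()
    for c in conns:
        c.setblocking(False)
        # Per-connection record: this is the state of the state machine.
        sel.register(c, selectors.EVENT_READ, {"state": "READ", "out": b""})
    open_conns = len(conns)
    while open_conns:
        for key, _events in sel.select():
            conn, st = key.fileobj, key.data
            if st["state"] == "READ":
                data = conn.recv(1024)
                if not data:                  # peer closed the connection
                    sel.unregister(conn)
                    conn.close()
                    open_conns -= 1
                    continue
                st["out"] = data.upper()      # the (trivial) request processing
                st["state"] = "WRITE"
                sel.modify(conn, selectors.EVENT_WRITE, st)
            else:                             # state == "WRITE"
                sent = conn.send(st["out"])
                st["out"] = st["out"][sent:]
                if not st["out"]:             # reply fully sent
                    st["state"] = "READ"
                    sel.modify(conn, selectors.EVENT_READ, st)
    sel.close()
```

No call ever blocks on one particular client, so several requests make progress at once, but the price is that the program must save and restore each request's state explicitly instead of relying on a thread's stack.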
Client-Side Software
• User interface
– X-window system
– Model-View-Controller Pattern
• Providing distribution transparency
The X-Window System
• X distinguishes two types of applications
– Normal application
• Can request creation of a window
• Mouse and keystroke events are captured when a window is
active
– X window manager
• Given special permission to manipulate the entire screen
• Determines the look and feel
• X applications and the X kernel interact through the
X protocol
– Supports Unix and TCP/IP sockets
– X terminals
Model-View-Controller Pattern
• Invented in a Smalltalk context for decoupling
the graphical interface of an application from the
code that actually does the work
• MVC was originally developed to map the
traditional input, processing, output roles into the
GUI realm:
Input --> Processing --> Output
Controller --> Model --> View
Model-View-Controller
• Model - manages one or more data elements,
responds to queries about its state, and
responds to instructions to change state
• View - responsible for mapping graphics onto a
device. Multiple views might be attached to the
same model
• Controller - responsible for mapping end-user
action to application response
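The three roles above can be sketched with a tiny counter application. All class and method names here are hypothetical, and the "device" is just a string so the example needs no GUI toolkit.

```python
class Model:
    """Manages the data, answers queries, and accepts state changes."""

    def __init__(self):
        self._count = 0
        self._views = []

    def attach(self, view):
        self._views.append(view)

    def get_count(self):      # query about the state
        return self._count

    def increment(self):      # instruction to change the state
        self._count += 1
        for view in self._views:
            view.refresh(self)


class TextView:
    """Maps the model's state onto an output device (here, a string)."""

    def __init__(self, label):
        self.label = label
        self.rendered = ""

    def refresh(self, model):
        self.rendered = f"{self.label}: {model.get_count()}"


class Controller:
    """Maps end-user actions to operations on the model."""

    def __init__(self, model):
        self.model = model

    def on_key(self, key):
        if key == "+":
            self.model.increment()
```

Attaching two `TextView` objects to one `Model` shows the "multiple views on the same model" case: both are refreshed by the same state change.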
Client-Side Software:
Providing Distribution Transparency
• Access transparency: client-side stubs for RPCs and RMIs
• Location/migration transparency: let client-side software
keep track of actual location
• Replication transparency: multiple invocations handled by
client stub
• Failure transparency: mask server and communication failures
Server-Side Software
• Basic model: A server is a process that waits
for incoming service requests at a specific
transport address
• A server typically listens on a well-known port:
ftp-data    20    File Transfer [Default Data]
ftp         21    File Transfer [Control]
ssh         22    Secure Shell
telnet      23    Telnet
smtp        25    Simple Mail Transfer
Server-Side Software
• Superservers: Servers that listen to several
ports, i.e., provide several independent services
– When a service request comes in, they start a
subprocess to handle the request
Server-Side Software
• Iterative vs. concurrent servers:
– Iterative servers can handle only one client at
a time
– Concurrent servers can handle multiple
clients at the same time
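The difference between the two kinds of server can be seen in Python's standard `socketserver` module, where `TCPServer` handles one connection at a time and `ThreadingTCPServer` starts a thread per connection. The sketch below is only an illustration: `EchoHandler` and `make_server` are hypothetical names, and port 0 asks the OS for any free port.

```python
import socketserver


class EchoHandler(socketserver.StreamRequestHandler):
    """Reads one line from the client and sends it back."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line)


def make_server(concurrent):
    """Build an iterative or a concurrent echo server on the loopback address."""
    if concurrent:
        cls = socketserver.ThreadingTCPServer  # one thread per connection
    else:
        cls = socketserver.TCPServer           # one connection at a time
    return cls(("127.0.0.1", 0), EchoHandler)
```

A possible use is `srv = make_server(concurrent=True)` followed by `srv.serve_forever()` in a background thread. With the iterative variant, a slow client delays every client queued behind it; with the concurrent variant it does not.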
Servers and State
• Stateless servers: Never keep accurate
information about the status of a client after
having handled a request
• Consequences:
– Clients and servers are completely independent
– State inconsistencies due to client or server crashes
are reduced
– Possible loss of performance because, e.g., a server
cannot anticipate client behavior
• Question: Does connection-oriented
communication fit into a stateless design?
Servers and State
• Stateful servers: Keep track of the status of their clients
– Record that a file has been opened, so that prefetching
can be done
– Know which data a client has cached, and allow
clients to keep local copies of shared data
• The performance of stateful servers can be
extremely high (from a particular client’s point of view)
• Drawbacks
– Crash recovery is a lot more challenging
– Less scalable
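To contrast the two designs, the sketch below implements the same "read from a file" service twice. It is a toy model under stated assumptions: the file store `FILES` and both class names are hypothetical, and no networking is involved.

```python
# Hypothetical in-memory file store shared by both servers.
FILES = {"notes.txt": b"abcdefghij"}


class StatelessFileServer:
    """Each request carries everything needed: name, offset, and length."""

    def read(self, name, offset, length):
        return FILES[name][offset:offset + length]


class StatefulFileServer:
    """Remembers, per client, which file is open and the current position."""

    def __init__(self):
        self._open = {}  # client ID -> [file name, position]

    def open(self, client, name):
        self._open[client] = [name, 0]

    def read(self, client, length):
        name, pos = self._open[client]
        data = FILES[name][pos:pos + length]
        self._open[client][1] = pos + len(data)  # advance the saved position
        return data
```

If the stateful server crashes and restarts, its `_open` table is empty and a client's next `read` fails until the state is rebuilt, which is the crash-recovery drawback noted above. The stateless server has nothing to lose, but every request must be longer.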
Practical Implementation of Servers
• Servers need to maintain clients' state
• Where to store such state? In database systems
• Solution – three-tier architecture
– Application servers interface directly to clients and
execute according to business logic
– Data (state) is stored in the data access tier so that the
application servers can be made stateless
– E.g., Web-page personalization using cookies
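A minimal sketch of this arrangement follows, assuming a cookie value identifies the client's session. `DataTier` is a hypothetical in-memory stand-in for a database and `AppServer` for an application server; neither reflects a specific product.

```python
class DataTier:
    """Hypothetical stand-in for the database in the data access tier."""

    def __init__(self):
        self._rows = {}

    def load(self, session_id):
        return dict(self._rows.get(session_id, {}))

    def store(self, session_id, state):
        self._rows[session_id] = dict(state)


class AppServer:
    """Runs the business logic but keeps no client state between requests."""

    def __init__(self, db):
        self.db = db

    def handle(self, cookie, item):
        state = self.db.load(cookie)             # fetch state for this client
        cart = state.get("cart", []) + [item]    # business logic: add to cart
        state["cart"] = cart
        self.db.store(cookie, state)             # write state back
        return cart
```

Because all state lives in the data tier, any application server instance can handle any request from a given client, which is what makes the application servers interchangeable.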
Reasons for Migrating Code
• Load balancing
– Migrate processes from heavily loaded machines to
lightly loaded machines
• Minimize communication
– Move code from client to server
– Move code from server to client
• Parallel execution
– Web crawlers
• Flexibility
– Dynamically configure distributed systems
Strong and Weak Mobility
• Process components:
– Code segment: set of instructions that make up the
program
– Resource/data segment: contains references to
external resources needed by the process, such as
files, devices, other processes
– Execution segment: contains the current execution
state of a process such as private data, stack, program
counter
Strong and Weak Mobility
• Weak mobility: Move only the code and data
segments, and start execution from the
beginning after migration
• Strong mobility: Move the entire component, including
its execution state
– Migration: move entire process from one machine to
another
– Cloning: start a clone, and set it in the same
execution state
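Weak mobility is the simpler case to illustrate. In the sketch below, the code segment is shipped as source text together with its data, and the target starts it from the beginning; nothing resembling a stack or program counter is transferred. The names `CODE_SEGMENT`, `migrate_weak`, and `run_at_target` are hypothetical, JSON stands in for the network transfer, and executing received code with `exec` is unsafe outside a toy example.

```python
import json

# The code segment, shipped as plain source text.
CODE_SEGMENT = '''
def main(data):
    return sum(data["numbers"])
'''


def migrate_weak(code, data):
    """Package code and data only; no execution state is included."""
    return json.dumps({"code": code, "data": data})


def run_at_target(wire_message):
    """At the target machine: load the code and start from the beginning."""
    package = json.loads(wire_message)
    namespace = {}
    exec(package["code"], namespace)            # load the shipped code segment
    return namespace["main"](package["data"])   # always starts at main()
```

Strong mobility would additionally have to capture and restore the running state, which plain Python does not expose in a portable way, so it is not attempted here.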
Process-to-Resource Binding
• By identifier: the process requires a specific instance of a
resource
– A specific web page or a remote file
– A local communication endpoint
• By value: the process requires the value of a resource
– Shared library
– Memory
• By type: the process requires only that a resource of a
given type is available
– A color monitor
– A printer
Resource-to-Machine Binding
• Fixed: the resource cannot be migrated
– Local devices
– Local communication endpoints
• Fastened: the resource can, in principle, be
migrated but only at high cost
– Local databases
– Complete web sites
• Unattached: the resource can easily be moved
along with the process
– A cache
– Files
Migration in Heterogeneous Systems
• Challenges:
– The target machine may not be suitable to execute the
migrated code
– The definition of process/thread/processor context is
highly dependent on local hardware, operating system
and runtime system
• Solution: Make use of an abstract machine that is
implemented on different platforms
– Interpreted languages running on a virtual machine
(Java/JVM; scripting languages)
– Virtual machine
What’s an Agent?
• An agent is an autonomous process capable of
reacting to, and initiating changes in its
environment, possibly in collaboration with users
and other agents
– collaborative agent: collaborate with others in a
multiagent system
– mobile agent: can move between machines
– interface agent: assist users at user-interface level
– information agent: manage information from
physically different sources
Agent Technology
• The general model of an agent platform
[Figure: general model of an agent platform, including intra-platform communication]
– Management: keeps track of where the agents on
this platform are (mapping agent ID to port)
– Directory: mapping of agent names & attributes
to agent IDs
– ACC: Agent Communication Channel, used to
communicate with other platforms
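The two lookups performed by the management and directory components can be sketched as a pair of dictionaries. The class `AgentPlatform` and its methods are hypothetical and only illustrate the mappings named above; the ACC is left out because it concerns communication between platforms.

```python
class AgentPlatform:
    """Toy model of the directory and management components."""

    def __init__(self):
        self._directory = {}    # agent name -> (agent ID, attributes)
        self._management = {}   # agent ID -> port on this platform

    def register(self, agent_id, name, port, **attributes):
        self._directory[name] = (agent_id, attributes)
        self._management[agent_id] = port

    def lookup_by_name(self, name):
        """Directory: agent name -> agent ID."""
        return self._directory[name][0]

    def lookup_by_attribute(self, key, value):
        """Directory: attribute -> sorted list of matching agent IDs."""
        return sorted(
            agent_id
            for agent_id, attrs in self._directory.values()
            if attrs.get(key) == value
        )

    def locate(self, agent_id):
        """Management: agent ID -> port where the agent can be reached."""
        return self._management[agent_id]
```

Reaching an agent is then a two-step process: resolve a name or attribute to an agent ID through the directory, then resolve the ID to a port through the management component.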