bit8_DC_Lecture_2

Tuesday, November 25, 2008
The illiterate of the 21st
century will not be those
who cannot read and write,
but those who cannot
learn, unlearn, and relearn.
--Unknown
Single Processor OS
P2
P1
Process management
Memory management
IPC -- Naming, Concur
Storage
OS Kernel
Processor
Memory
Distributed OS
P2
P1
Process management
Memory management
IPC -- Naming, Concur
Storage
OS Kernel
Processor
Memory
Processor
Memory
Issues
• Same but different: Distributed
P2
P1
Network
Delay
Lossy
Why Distribute?
• To access resources
• Improve availability
• Enhance fault-tolerance
• Consider Yahoo Mail vs. Airline Ticketing
System
Serial computing
• To be run on a single computer having a
single Central Processing Unit (CPU);
• A problem is broken into a discrete series of
instructions.
• Instructions are executed one after another.
• Only one instruction may execute at any
moment in time.
Parallel computing
• Simultaneous use of multiple compute resources to
solve a computational problem on multiple CPUs
• A problem is broken into discrete parts that can be
solved concurrently
• Instructions from each part execute simultaneously
on different CPUs
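A minimal sketch of the contrast between serial and parallel computing, assuming a hypothetical problem (summing squares over a range) and a hypothetical split into four parts. Threads are used here only to show the decomposition; in CPython they do not guarantee truly simultaneous execution on different CPUs.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares(lo, hi):
    """Solve one discrete part of the problem: sum of i*i for lo <= i < hi."""
    return sum(i * i for i in range(lo, hi))


def solve_serial(n):
    """Serial computing: one instruction stream handles the whole range."""
    return sum_squares(0, n)


def solve_parallel(n, parts=4):
    """Parallel computing: break the range into parts and solve them concurrently."""
    # Ceiling division so that no element is lost; at least 1 to keep range() valid.
    step = max(1, (n + parts - 1) // parts)
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = pool.map(lambda b: sum_squares(*b), bounds)
        return sum(partials)
```

Both functions return the same value; only the way the work is divided differs.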
Grand Challenge Problems
• Traditionally, parallel and distributed computing has
been considered to be "the high end of computing"
– Motivated by numerical simulations of complex systems
and "Grand Challenge Problems":
• Global change
• Fluid turbulence
• Vehicle dynamics
• Ocean circulation
• Viscous fluid dynamics
• Superconductor modeling
• ….
BlueGene/L2 Supercomputer
IBM remains the dominant vendor of supercomputers
Agenda
• Basic Concepts
• Revision
– Distributed systems
– Characteristics
– Issues/challenges
• System Models
– Introduction
– Architectural Models
Revision Topics
• Introduction and History
• Examples of distributed systems
– The internet
– Intranets
– Mobile and ubiquitous computing
• Resource sharing and the web
– The world wide web
• HTML
• URL
• HTTP
• Dynamic Pages
• Web Services
• Discussion of the web
• Challenges
– Heterogeneity
– Openness
– Security
– Scalability
– Failure Handling
– Concurrency
– Transparency
Q1
Objectives of the lecture
• To provide models for helping to
perceive different distributed system
designs.
• To understand the characteristics of the
most common architectural models of
distributed systems.
• To understand some of the factors that
produce variety to these models.
Basic Elements
• Resources in a distributed system are shared between
users. They are normally encapsulated within one of
the computers and can be accessed from other
computers by communication.
• Each resource is managed by a program, the resource
manager, which offers a communication interface enabling the
resource to be accessed by users.
• Resource managers can in general be modeled as
processes. In object-oriented systems, resources are
encapsulated in objects.
Distributed Systems Architecture:
• Multi-tier architecture
– Alternatives (a) thru (e)
Monday, 01, 2008
"If you were plowing a
field, what would you
rather use,
two strong oxen or 1024
chickens?"
(Commenting on parallel architectures)
- Seymour Cray, Founder of Cray Research
Distributed system models
• Architectural models (definition):
The way in which the components of systems interact with one
another and the way in which they are mapped onto an
underlying network of computers
• Architectural models:
– Placements of parts
– Relationship between parts
• Examples
– Client /server
– Peer-to-peer model
– Proxy server
• Fundamental models:
– Formal description of system properties common in all
architectural models
– reliability, security, performance
Architectural models
• Architecture:
– Structure of separately specified components
• Overall goal:
– Structure should meet the present and future
requirements on
• Reliability
• Manageability
• Adaptability
• Cost-effectiveness
Architectural models
• Simplification by distinction between client, server and
peer processes
– Assessment of workload (responsibilities)
– Failure impact analysis
• Architectural models can be used to determine placement
of processes
• More dynamic systems can be built as variations on the
client-server model
– Moving code (e.g. applets) to reduce delays and traffic
– Adding/removing nodes or components
– Discovery and advertisement of services (e.g. for mobile devices)
Software Architecture
& Service Layers
• Software architecture
– Structuring software as
layers or modules
– Services offered and
requested by processes
on same or different
computers (more
recently)
• Service layers (ch. 3 to 6)
– A distributed service can
be provided by one or
more interacting server
processes
– Network Time Protocol
[Figure: Software and hardware service layers in DS. From top to bottom: Applications, services; Middleware; Operating system; Computer and network hardware. The operating system and the hardware together form the platform.]
Service Layers (contd..)
• Platform
• Operating system, hardware
– The lowest layers provide services to the layers
above them and are implemented
independently in each computer
– Facilitate communication and coordination among
processes
– Supply the system programming interface
Service Layers (contd..)
• Middleware
– Masks heterogeneity
• Represented by processes or objects in a set
of computers that interact with each other
– To implement communication and
resource sharing
• RMI
• Group communication (Isis)
– Supplies application programming interface
Service Layers: Middleware
• Middleware
– Examples:
• Sun RPC
– Simple, remote procedure calling package
• OMG CORBA
• Microsoft DCOM
• Java RMI
– Provides
• Building blocks for software components
• Infrastructural services for application programs
Service Layers: Middleware
• Middleware services
– Naming
– Security
– Transactions
– Persistent storage
– Event notification
• The end-to-end argument, e.g.
reliable file transfer (Saltzer, 1984):
correct behaviour in distributed
programs depends upon checks
at various levels
• Limitations
– Many distributed applications rely entirely on the services
provided by the available middleware to support their needs for
communication & data sharing
• TCP is reliable for packets (basic error detection and correction)
• Mail transfer service increases fault tolerance (maintains a record of
progress)
• Some functions can only be correctly implemented with the help of
application level information
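A minimal sketch of the end-to-end argument for file transfer. The `channel` callable is a hypothetical stand-in for every lower layer (disk, TCP, middleware); the point is that the application repeats its own check on the whole file, whatever guarantees those layers give. The retry limit is also an assumption.

```python
import hashlib


def digest(data: bytes) -> str:
    """Application-level fingerprint of the complete file contents."""
    return hashlib.sha256(data).hexdigest()


def reliable_file_transfer(data: bytes, channel, max_attempts=3):
    """Transfer data through `channel` and verify it end to end.

    Returns (received_bytes, attempts_used). Raises IOError if every
    attempt delivers contents that fail the end-to-end check.
    """
    expected = digest(data)
    for attempt in range(1, max_attempts + 1):
        received = channel(data)          # lower layers do their own checks
        if digest(received) == expected:  # the check only the application can make
            return received, attempt
    raise IOError("end-to-end check failed after %d attempts" % max_attempts)
```

A channel that corrupts the first copy and delivers the second intact makes the function return on attempt 2.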
System Architectures: Client-Server Model
[Figure: clients send invocations to servers and receive results; a server may itself invoke another server. Key: process, computer.]
• Server may in turn be client of other
servers
– Web server may be client of local file server
– DNS Servers
– Search engine is both a server and client
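A minimal sketch of one invocation/result exchange between a client and a server. It assumes `socket.socketpair()` as a stand-in for a real network connection, and the "service" (upper-casing the request text) is purely hypothetical.

```python
import socket
import threading


def serve_one_request(conn):
    """Server process: receive one invocation, send back one result."""
    with conn:
        request = conn.recv(1024).decode("utf-8")
        result = request.upper()  # hypothetical service: upper-case the text
        conn.sendall(result.encode("utf-8"))


def invoke(conn, request):
    """Client process: send an invocation and block until the result arrives."""
    with conn:
        conn.sendall(request.encode("utf-8"))
        return conn.recv(1024).decode("utf-8")


def demo(request="hello"):
    """Run one client and one server, connected by a socket pair."""
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one_request, args=(server_end,))
    server.start()
    result = invoke(client_end, request)
    server.join()
    return result
```

A server that is in turn a client of another server would simply call `invoke` on a second connection from inside `serve_one_request`.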
Client Server
Communication
Three – tiered Architecture
System Architectures: Multiple Servers Model
[Figure: a service provided by several servers, each of which is accessed by clients]
• Web servers manage their resources independently
– A user can access any resource
• Sun NIS (Network Information Service)
– Each NIS has its own replica of the password file
System Architectures:
Multiple Servers Model
• Services may be provided by multiple servers
• Partitioned or replicated service-related objects
• Replication provides
– Increased performance
– Increased availability
– Increased fault-tolerance
• But requires replica coordination / consistency
preservation
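A minimal sketch of a replicated service, assuming hypothetical `Replica` and `ReplicatedService` classes. Reads succeed while any replica is up (availability); writes must reach every replica, which is where the coordination cost comes from.

```python
class Replica:
    """One server holding a full copy of the service data (hypothetical)."""

    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)
        self.up = up

    def read(self, key):
        if not self.up:
            raise ConnectionError(self.name + " is unavailable")
        return self.data[key]

    def write(self, key, value):
        # A replica that is down silently misses the update.
        if self.up:
            self.data[key] = value


class ReplicatedService:
    def __init__(self, replicas):
        self.replicas = list(replicas)

    def read(self, key):
        """Availability: any live replica can answer."""
        for replica in self.replicas:
            try:
                return replica.read(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica available")

    def write(self, key, value):
        """Consistency cost: every update must reach every replica."""
        for replica in self.replicas:
            replica.write(key, value)
```

In this sketch a replica that is down during a write keeps a stale copy, which is exactly the inconsistency a replica coordination protocol has to repair.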
System Architectures:
Proxy Server Model
[Figure: clients reach web servers through a proxy server]
• Cache: store of recently used data objects
– Cache
• collocated at each client (web browser)
• proxy server (shared cache)
– Web browser
• recently visited pages
• HTTP to check the consistency
System Architectures:
Proxy Server Model
• Cache: a nearby store of recently used
data
– Considerably increases performance in
many applications
– But requires cache coherence protocols
• Proxy server: a shared cache of
resources
– Most commonly used for web access
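A minimal sketch of a shared cache in a proxy server. `OriginServer` and `CachingProxy` are hypothetical classes, and the integer version number stands in for whatever validator a real protocol would use to check that a cached copy is still up to date.

```python
class OriginServer:
    """Hypothetical web server: maps URL -> (version, body)."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.full_fetches = 0

    def version(self, url):
        """Cheap validation request: return only the current version."""
        return self.pages[url][0]

    def fetch(self, url):
        """Expensive request: return the version and the whole body."""
        self.full_fetches += 1
        return self.pages[url]


class CachingProxy:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # url -> (version, body)

    def get(self, url):
        cached = self.cache.get(url)
        # Consistency check: serve the cached copy only if it is still current.
        if cached is not None and cached[0] == self.origin.version(url):
            return cached[1]
        entry = self.origin.fetch(url)
        self.cache[url] = entry
        return entry[1]
```

Every client that goes through the same proxy benefits from a page fetched by any other client, at the price of one validation request per access.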
System Architectures:
Peer-to-Peer Model
• Peer processes: processes that play similar roles
◦ No absolute distinction between client/server
◦ May still assume client/server roles from time to time
◦ Peer processes maintain the synchronization & consistency of
resources and actions
• Example
◦ Napster
• Elimination of server processes reduces inter-process communication
delay for local object access
• Increased fault-tolerance and scalability
• Coordinating and maintaining replicas is difficult
System Architectures:
Peer-to-Peer Model
• A P2P architecture exploits the resources of a large number of
participating computers for the fulfillment of a given task
[Figure: three peers, each running an application together with coordination code]
Agenda
• Basic Concepts
• Revision
• System Models
– Introduction
– Architectural Models
• Software Layers
• System Architectures
– Variations
– Design requirements for distributed architectures
Variations on the Client-Server
Model
• Variations on the client-server model
arise from the following factors:
– the use of mobile code and mobile agents
– lightweight clients, based on users’ need for
low-cost computers and easy management
– the requirement to add and remove mobile
devices in a convenient manner
Variations on the Client-Server
Model
• Example of mobile code
a) client request results in the downloading of applet code
[Figure: the web server sends applet code to the client]
b) client interacts with the applet
[Figure: the client, now running the applet, and the web server]
Variations on the Client-Server
Model
• Example of mobile code (cont.)
– Server push model
• Server initiates dialogue
• “Pushes” information to client
• Client needs application that listens for server
pushes
• Example
– stockbroker
Variations on the Client-Server
Model
• Example of a mobile agent
– a) The code and data are moved to the
server
[Figure: users send code + data to the web server, which returns results]
Variations on the Client-Server
Model
• Mobile agents
– A running program that travels between computers in a
network
• Carries out tasks on someone’s behalf
– E.g., collect information, install programs
• Has internal knowledge, beliefs and goals
– Advantage: local access everywhere
• Reduction in communication costs
• Disconnected operation
– Potential security threat
• Limited applicability/ web crawlers
Variations on the Client-Server
Model
• Network computer
– All files related to the management of applications are stored
and managed remotely; applications run locally.
– Local disk may be used as cache
– It downloads OS and application software from a remote
file server
• Minimum of local software is downloaded
– As data and files are stored remotely, a user can migrate
from one NC to another NC
– Applications are run locally but the files are managed by a
remote file server
• e.g., NFS
Variations of Client-Server
Model (cont.)
Thin client
– is a PC with less of everything
– Does not even run its own applications
– Programs are run by a powerful computer server
– Citrix WinFrame, teleporting and PCAnyWhere: thin client
process for Windows
– Advantage
• Low cost
– Disadvantage
• Not good for highly interactive applications like CAD
• Delay in communication can affect application operation
• e.g., X11 Server
Thin Clients
[Figure: a thin client on a network computer or PC connects over the network to a compute server that runs the application process]
• VNC
• WinFrame
Variations of Client-Server
Model (cont.)
• Mobile devices
– capable of wireless networking
• e.g. personal digital assistants (PDAs)
• Spontaneous networking
– the form of distribution that integrates mobile devices and
other devices into a given network
– Key features
• easy connection to a local network
– A device brought into a new network environment is transparently
reconfigured to obtain connectivity there
• easy integration with local services
– automatic discovery of available services with no special configuration
– discovery services
Mobile Devices and Spontaneous
networking
[Figure: a hotel wireless network, connected to the Internet through a gateway, offers a music service, an alarm service and a discovery service to guests' devices: camera, TV/PC, laptop, PDA]
Variations of Client-Server
Model (cont.)
• Spontaneous networking (cont.)
– Issues
• supporting convenient connection and integration
• limited connectivity
– how the system can support the user so that they can continue to work while
disconnected
• security and privacy
– track of users’ location
– Discovery services
• to accept and store details of services that are available on the network and
to respond to queries from clients about them
• Interfaces of discovery services
– registration service : accept registration requests from servers, stores properties
in database of currently available services
– lookup service : match requested services with available servers
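A minimal sketch of the two discovery-service interfaces, registration and lookup. The class name, the service names, the addresses and the property keys are all hypothetical; a real discovery service would also have to handle services that disappear without unregistering.

```python
class DiscoveryService:
    def __init__(self):
        # Database of currently available services: name -> (address, properties)
        self._services = {}

    def register(self, name, address, **properties):
        """Registration interface: a server announces itself and its properties."""
        self._services[name] = (address, properties)

    def unregister(self, name):
        """Remove a service that is no longer available."""
        self._services.pop(name, None)

    def lookup(self, **required):
        """Lookup interface: return (name, address) of every matching service."""
        matches = []
        for name, (address, props) in self._services.items():
            if all(props.get(key) == value for key, value in required.items()):
                matches.append((name, address))
        return sorted(matches)
```

A guest's device arriving on the hotel network would call `lookup(kind="printer")` without knowing any server address in advance.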
Agenda
• Basic Concepts
• Revision
• System Models
– Introduction
– Architectural Models
• Software Layers
• System Architectures
– Variations
– Design requirements for distributed architectures
Design requirements for distributed
architectures
Performance issues
Quality of Service
Use of caching and replication
Dependability issues
Performance issues
• Responsiveness
– Users of interactive applications require a fast and consistent
response to interaction
– Speed of response depends on the load and performance of the server and on
delays in all software components
– Software layers must be optimized
• Throughput
– The rate at which computational work is done
– Data transfer between processes
– Throughput of software layers and network
• Balancing computer loads
– The ability to run applets at the client removes load from the web
server
– In some cases load balancing may involve moving partially completed work as the loads on hosts change
Quality of Service (QoS)
• Quality of service
– The ability to meet users' deadlines
– Reliability
• Failure model
– Security
• Security model
– Performance
– Adaptability
• Example
– Time-critical data: movie service
Use of caching and replication
• Performance issues are major
obstacles to the successful deployment
of DS
• Much progress has been made in the
design of systems that overcome them through
– data replication and caching
Dependability issues
• The dependability of computer systems
– correctness,
– security
– fault tolerance
• Correctness
• Fault tolerance
– reliability is achieved through redundancy
• Replication of data and process
• Redundancy in communication protocols
• Security
– the architectural impact of the requirement for security concerns the
need to locate sensitive data and other resources only in computers
that can be effectively secured against attack
Next Time:
Fundamental Models of
Distributed Systems
• Conclude Chapter 2
– Interaction
• Communication and coordination
– Failure
• Classification of faults and tolerance methods
– Security
• Classification of attacks and protection mechanisms
• Lab 2: Socket Programming
Teaching Assistant
• Zaeem Ahmad Abbasi
– (BIT7)
• [email protected]
Reading Material
Q.1 From the book, answer the
following questions
2.11
2.13
2.14
2.17
Paper Reading:
Read first 5 pages only !
Discussion date: 16/12/2008
A. D. Birrell and B. J. Nelson. Implementing
remote procedure calls. ACM Transactions on
Computer Systems 2(1):39-59, February 1984
Monday, 02, 2008
"If an infinite number of computer
programmers programmed for an
infinite number of years, they
would eventually come up with a
working operating system.
Bill Gates, being impatient, gave
them two days and took the first
one that was finished."
— Robert X. Cringely, InfoWorld
Revision
• Architectural Model
– Goal
• Reliability
• Manageability
• Adaptability
• Cost-effectiveness
– Service Layers
• Platform
• Middleware
– System Architecture
• Client/Server
• Proxy
• Peer to Peer
– Variations on Client/Server
• Mobile code and mobile agent
– Design requirements for distributed systems
Objectives
• To provide fundamental models that
reflect common properties for
distributed system designs.
• To understand the characteristics of the
most common fundamental models of
distributed systems.
System models – what and
why?
• System model:
– Abstract, consistent description of a relevant aspect of a
distributed system.
• A system model could address:
– What are the main entities in the system?
– How do they interact?
– What are the characteristics that affect their individual
and collective behavior?
• Aid
– Design, analysis, discussion, etc.
• The purpose of a system model:
– To make all assumptions explicit.
– To make generalizations concerning what is possible
or impossible (general-purpose algorithms, mathematical proofs).
Fundamental models
Will our design work in this system ?
Yes, if our assumptions hold in the system
• Interaction model:
◦ Performance of processes and communication channels, absence
of a global clock, timing problems, coordination, delays
• Failure model:
◦ Failures of processes and communication channels
◦ Defines and classifies the faults and their effects
◦ So we can design systems that tolerate faults of each type
• Security model:
◦ The modular nature of DS and their openness expose them to attack
◦ Defines and classifies the forms that such attacks may take
◦ So we can design systems that are able to resist them
Interaction model - basics
• Interaction
– Multiple server processes may cooperate to provide
service. ( DNS)
– A set of peer processes may cooperate to achieve
common goal (voice conferencing system)
• Communication & Coordination
• Distributed Algorithm
– Definition of the steps to be taken by each of the processes of
which the DS is composed, including the transmission of messages.
– The rate at which each process proceeds and the timing of
message transmission cannot in general be predicted.
– Each process has its own state.
• Significant factors affecting interacting processes:
– Communication performance. (latency, bandwidth, jitter)
– Lack of global notion of time.
Interaction model – Significant
factors
• Performance of communication channels:
– Latency.
• Delay between sending of a message by one process and its
receipt by another.
• Transmission time
– Time taken for the first of a string of bits transmitted
through a network to reach its destination.
• Network access delay
– Increases significantly with increase in network load.
• Operating system communication services time
– Spent in sending and receiving messages; varies with load on the OS
– Bandwidth.
• Total amount of information that can be transmitted in a given time
– Jitter.
• Variation in the time taken to deliver a series of messages, e.g.
multimedia data
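A minimal worked example of how latency and bandwidth combine, using the common approximation transfer time = latency + length / bandwidth. Measuring jitter as the spread between the slowest and fastest delivery is an assumption made here for simplicity; other definitions exist. All numbers are hypothetical.

```python
def transfer_time(latency_s, length_bits, bandwidth_bps):
    """Approximate time to deliver one message: fixed delay plus transmission."""
    return latency_s + length_bits / bandwidth_bps


def jitter(delivery_times_s):
    """Jitter taken here as the spread (max - min) of a series of delivery times."""
    return max(delivery_times_s) - min(delivery_times_s)


# A 1 MB (8,000,000 bit) message over a 100 Mbit/s link with 5 ms latency:
# 0.005 s + 8e6 / 1e8 s = 0.085 s, dominated by bandwidth.
large = transfer_time(0.005, 8_000_000, 100_000_000)

# A 100-byte (800 bit) message over the same link:
# 0.005 s + 0.000008 s, dominated by latency.
small = transfer_time(0.005, 800, 100_000_000)
```

Short messages are limited by latency, long ones by bandwidth, which is why both figures are needed to describe a channel.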
Interaction model – Significant
factors (cont.)
• Computer clocks and timing events.
– Local processes use time service
– Different time values for processes at different
systems
– Drift rate
• The relative amount by which a computer clock differs
from a perfect reference clock
– Computers may use radio receivers to get
time from GPS
• Considerations: cost, accuracy, no reception indoors, not available to every computer
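A minimal worked example of what a drift rate means in practice. The drift rate of 10^-6 (one microsecond gained or lost per second) and the 1 ms tolerance are hypothetical values chosen for illustration.

```python
def accumulated_drift(drift_rate, elapsed_s):
    """Worst-case offset from a perfect clock after elapsed_s seconds."""
    return drift_rate * elapsed_s


def resync_interval(drift_rate, max_offset_s):
    """How often a clock must be corrected to stay within max_offset_s."""
    return max_offset_s / drift_rate


# With a drift rate of 1e-6, a clock is off by up to 0.0864 s after one day.
per_day = accumulated_drift(1e-6, 24 * 60 * 60)

# To stay within 1 ms of the reference, it must be corrected every 1000 s.
interval = resync_interval(1e-6, 0.001)
```

Two such clocks can drift in opposite directions, so their mutual difference can grow at up to twice this rate.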
Interaction model –
synchronous vs. asynchronous
• Synchronous distributed systems:
1. The time to execute each step of a process has known
lower and upper bounds.
2. Each message transmitted over a channel is received
within a known bounded time.
3. Each process has a local clock whose drift rate from real
time has a known bound.
◦ Difficult to arrive at realistic values and to provide
guarantees
◦ Synchronous distributed systems can be built when there are
• Known resource requirements
• Guaranteed resources
Interaction model – synchronous vs.
asynchronous
• Asynchronous distributed systems (web)
• no bounds on:
1. Process execution speed.
2. Message transmission delays. (email can take days)
3. Arbitrary clock drift rates.
– Design of web browsers
– Actual distributed systems are very often
asynchronous
• Sharing processors
• Sharing network
– Any solution valid for an asynchronous system is also valid for a synchronous one
Interaction model – event
ordering
[Figure: real-time ordering of events. Processes X, Y, Z and A exchange messages m1, m2 and m3; the send and receive events are plotted against physical time, with times t1, t2 and t3 marked.]
Failure Models
• Failure
– System doesn’t give desired behavior
• Component-level failure
• System-level failure (incorrect result)
• Fault
– Cause of failure (component-level)
• Transient: Not repeatable
• Intermittent: Repeats, but (apparently) independent of
system operations
• Permanent: Exists until component repaired
• Failure Model
– How the system behaves when it is not working properly
Failure models - Types
• Omission failures.
• Arbitrary failures.
• Timing failures.
Failure model - omission failures
(1)
• A process or communication channel fails to perform
actions that it is supposed to do.
• Process omission failures:
– Crash (and clean crash)
• Use timeouts to detect.
– A process crash is called fail-stop
• if other processes can detect with certainty that
the process has crashed (it fails to respond)
• Fail-stop behaviour can be produced only in synchronous systems,
where message delivery is guaranteed.
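A minimal sketch of timeout-based crash detection under the synchronous-system assumptions: known upper bounds on message transmission time and on processing time. The function names and the bound values are hypothetical.

```python
def reply_deadline(max_transmit_s, max_process_s):
    """Latest time a reply can arrive in a synchronous system:
    request transmission + processing + reply transmission."""
    return 2 * max_transmit_s + max_process_s


def suspect_crash(waited_s, reply_seen, max_transmit_s, max_process_s):
    """Return True once the absence of a reply proves that the process has crashed.

    The conclusion is only certain because the bounds are guaranteed;
    in an asynchronous system the same timeout could merely mean "slow".
    """
    if reply_seen:
        return False
    return waited_s >= reply_deadline(max_transmit_s, max_process_s)
```

With bounds of 0.1 s per message and 0.05 s of processing, a client that has waited 0.3 s without a reply can conclude that the server has crashed.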
Failure model – omission failure
(2)
• Communication omission failures:
– Communication primitives are send and receive.
• Send-omission failures.
• Receive-omission failures.
• Channel-omission failures.
– Also known as dropping messages
• Generally caused by
– Lack of buffer space at receiving end or intervening gateway
– Network transmission error, detected by a checksum
[Figure: process p performs send m into its outgoing message buffer; the message crosses the communication channel to the incoming message buffer of process q, which performs receive]
Failure model –
Arbitrary failure
• Arbitrary or Byzantine failures
– Described as the worst possible failure semantics, in which
any type of error may occur
• Process/channel exhibits arbitrary behavior
• Process may return a wrong value in response to an
invocation
– Arbitrary Process failure
• Process may omit steps or perform unintended
steps
– Arbitrary Communication Failure
• Message contents can be corrupted, a duplicate
message can be sent, or a message can be lost on its way
• Rare, and can be detected by checksums or message
numbering
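A minimal sketch of how a receiver can detect corrupted and duplicated messages with a checksum and message numbering. The packet layout and the `Receiver` class are hypothetical, and CRC-32 is used only as an example checksum: it catches accidental corruption, not deliberate tampering.

```python
import zlib


def make_packet(seq, payload: bytes):
    """Attach a sequence number and a checksum to the payload."""
    return {"seq": seq, "payload": payload, "crc": zlib.crc32(payload)}


class Receiver:
    def __init__(self):
        self.seen = set()      # sequence numbers already delivered
        self.delivered = []    # payloads passed up to the application

    def receive(self, packet):
        # Checksum mismatch: discard, turning corruption into an omission.
        if zlib.crc32(packet["payload"]) != packet["crc"]:
            return "corrupt"
        # Message numbering: discard anything delivered before.
        if packet["seq"] in self.seen:
            return "duplicate"
        self.seen.add(packet["seq"])
        self.delivered.append(packet["payload"])
        return "delivered"
```

Discarding a corrupt packet converts an arbitrary failure of the channel into an omission failure, which retransmission can then deal with.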
Failure model – overview of omission
failures
Class of failure | Affects | Description
Crash | Process | Process halts and remains halted. Other processes may not be able to detect this state.
Fail-stop | Process | Process halts and remains halted. Other processes may detect this state.
Omission | Channel | A message inserted in an outgoing message buffer never arrives at the other end’s incoming message buffer.
Send-omission | Process | A process completes a send, but the message is not put in its outgoing message buffer.
Receive-omission | Process | A message is put in a process’s incoming message buffer, but that process does not receive it.
Arbitrary (Byzantine) | Process or channel | Process/channel exhibits arbitrary behaviour: it may send/transmit arbitrary messages at arbitrary times, commit omissions; a process may stop or take an incorrect step.
Failure model - timing
failures
• Applicable in synchronous distributed systems.
• Time limits
– Process execution time
– Message delivery time
– Clock drift rate
Class | Affects | Description
Performance | Process | Process exceeds the bounds on the interval between two steps.
Performance | Channel | A message’s transmission takes longer than the stated bound.
Clock | Process | Process’s local clock exceeds the bounds on its rate of drift from real time.
• Real Time Operating System
– Provides timing guarantee
• Multimedia
Failure model - remedies
• Masking failures:
– Knowledge of the failure characteristics of a component enables us
to build a reliable service from components that can fail.
– Techniques: checksums, message retransmission, replication, restoring
information (e.g. converting an arbitrary failure into an omission failure)
– A service masks a failure, either by hiding it altogether or by converting it
into a more acceptable type of failure.
• Reliability of one-to-one communication:
– Correct message delivery in presence of failure
– Validity: Any message in the outgoing message buffer is eventually
delivered to the incoming message buffer.
– Integrity: The message received is identical to one sent, and no
messages are delivered twice.
– Threats: Malicious Users and Protocols
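A minimal sketch of masking omission failures by retransmission. `LossyChannel` is a hypothetical channel that drops transmissions according to a fixed pattern, `None` models "no acknowledgement before the timeout", and acknowledgements themselves are assumed never to be lost.

```python
class LossyChannel:
    """Hypothetical channel that drops transmissions per a fixed pattern."""

    def __init__(self, drop_pattern):
        self.drop_pattern = list(drop_pattern)  # True = drop this transmission
        self.sent = 0

    def send(self, message):
        drop = self.sent < len(self.drop_pattern) and self.drop_pattern[self.sent]
        self.sent += 1
        return None if drop else message


def send_reliably(channel, seq, payload, delivered, max_tries=5):
    """Retransmit until the message arrives; return the number of tries used.

    `delivered` plays the role of the receiver's incoming buffer, keyed by
    sequence number so that a retransmitted message is never delivered twice.
    """
    for attempt in range(1, max_tries + 1):
        arrived = channel.send((seq, payload))
        if arrived is None:
            continue                        # timeout: retransmit (validity)
        if arrived[0] not in delivered:     # already seen? then ignore (integrity)
            delivered[arrived[0]] = arrived[1]
        return attempt
    raise TimeoutError("message %d not delivered" % seq)
```

Retransmission provides validity over a channel that only drops messages; the sequence-number check keeps the "no message delivered twice" half of integrity.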
Security model - basics
• The security of a distributed system:
– securing the processes and the channels
– protecting the objects against unauthorized access.
• Protecting objects.
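[Figure: a client, acting for a principal (user), sends an invocation over the network to a server, acting for a principal (server); the server checks the access rights on the object and returns a result]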
• Access rights:
• Who is allowed to perform which operations on an object
• Principal:
• The authority on whose behalf each invocation and each result
is issued
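A minimal sketch of protecting an object with access rights. The `ProtectedObject` class, the principals and the two operations are hypothetical, and the sketch assumes the principal's identity has already been established reliably, which in a real system is the job of authentication.

```python
class ProtectedObject:
    def __init__(self, value, access_rights):
        self._value = value
        # Access rights: principal -> set of operations it may perform.
        self._rights = access_rights

    def invoke(self, principal, operation, new_value=None):
        """Check the principal's rights before carrying out the invocation."""
        if operation not in self._rights.get(principal, set()):
            raise PermissionError("%s may not %s" % (principal, operation))
        if operation == "read":
            return self._value
        if operation == "write":
            self._value = new_value
            return None
        raise ValueError("unknown operation: " + operation)
```

For a mailbox object, the owner might hold both `read` and `write` rights while another user holds only `read`.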
Summary
• Models in general.
• Architectural models:
• Fundamental models:
– Interaction.
– Failure.
– Security.
Networking and Internetworking (Self Study)
(Discussion date: 26th August)
Objectives
• To provide fundamental models that
reflect common properties for
distributed system designs.
• To understand the characteristics of the
most common fundamental models of
distributed systems.
• To understand the code for writing
simple sockets --Next Time
Reading Material
Paper Reading:
Home Assignment 1:
Discussion date: 16th December
Discussion date: 16th December
Development of the Domain Name System, Paul
V. Mockapetris,Kevin J. Dunlap. Computer
Communication Review Vol. 18, No. 4, August
1988, pp. 123–133.
Extra
Notification Transmission
Mechanisms (Permanent Connections)
[Figure: the server sends a service update to Client 1, Client 2 and Client 3 over permanent connections]
Notification Transmission
Mechanisms (Connect and Notify)
[Figure: the server connects to Client 1, Client 2 and Client 3 in turn to notify each of a service update]
Notification Transmission
Mechanisms (Multicast)
[Figure: the server multicasts a service update to Client 1, Client 2 and Client 3]
