Distributed Systems


Distributed Systems
(Credit Hours: 3)
This course covers advanced topics in distributed
systems, with a special emphasis on distributed
computing and the services provided by distributed
operating systems. Important topics include parallel
processing, remote procedure call, concurrency,
transactions, shared memory, message passing, and
scalability.
Reference Books:
1. Distributed Systems: Concepts and Design by
Coulouris, Dollimore, and Kindberg
2. Distributed Operating Systems by Andrew S.
Tanenbaum
Course Evaluation
Attendance & Class Participation: 05
Assignments: 10
Critical Reviews: 10
Mid Term: 25
Final Term: 50
Total Marks: 100
Overview
• Multiprocessing (parallel processing).
• Tightly coupled processors.
• Distributed systems (DS).
• Loosely coupled processors (distributed).
• Key features of DS.
• Pros and cons of DS.
Parallel Processing
From the beginning, computer scientists have
challenged computers with larger and larger problems.
Eventually, multiple processors were combined on the
same board to work on the same task in parallel,
sharing the same memory.
This is called parallel processing.
Parallel Processing
Types of parallel processing:
• MISD
• SIMD
• MIMD
Cont…
In each case multiple processors are involved:
• MISD – Multiple Instruction streams, Single Data stream
• SIMD – Single Instruction stream, Multiple Data streams
• MIMD – Multiple Instruction streams, Multiple Data streams
MISD
One piece of data is broken up and sent to many processors,
e.g., for searching for a specific record.
Fig: One data set, with a Search operation, distributed to four CPUs
Example: An unsorted dataset is broken up into sections of
records and assigned to several different processors; each
processor searches its section of the database for a specific key.
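
As a rough illustration (not from the course material), this partitioned search can be sketched in Python with the multiprocessing module. The worker count, key value, and function names below are assumptions made only for the example.

```python
# Minimal sketch of the partitioned-search example above, assuming the
# "database" is just a Python list and the key is an integer.
from multiprocessing import Pool
import random

NUM_WORKERS = 4          # number of cooperating processors (assumed)
KEY = 123456             # record we are looking for (assumed)

def search_chunk(chunk):
    """Each worker scans its own section of the unsorted data for the key."""
    return KEY in chunk

if __name__ == "__main__":
    data = [random.randint(0, 1_000_000) for _ in range(100_000)]
    data[777] = KEY                      # make sure the key is present
    size = len(data) // NUM_WORKERS
    chunks = [data[i:i + size] for i in range(0, len(data), size)]

    with Pool(NUM_WORKERS) as pool:
        results = pool.map(search_chunk, chunks)   # sections searched in parallel

    print("key found:", any(results))
```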
MISD
Another example may be:
Multiple cryptography algorithms attempting to decrypt a single
coded message
SIMD
SIMD (Single Instruction, Multiple Data) is a
technique used to achieve parallel execution
across a set of processors through data-level
parallelism.
SIMD
Multiple processors execute the same instruction on separate data.
Fig: The same Multiply instruction executed by four CPUs, each on its own data
Example: A SIMD machine with 100 processors could multiply 100
numbers, each by the number three (3), at the same time.
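
A rough software analogue of this example (an assumption for illustration, not course code) is a NumPy vectorized multiply: one operation is applied across a whole array, which is the single-instruction, multiple-data idea, whether or not the underlying hardware actually uses SIMD instructions.

```python
# One multiply applied to 100 data elements "at once" from the programmer's view.
import numpy as np

values = np.arange(100)      # 100 data elements
result = values * 3          # one operation applied to every element

print(result[:5])            # -> [ 0  3  6  9 12]
```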
SIMD
• Single instruction: all processing units execute the same
instruction at any given clock cycle.
• Multiple data: each processing unit can operate on a
different data element.
MIMD
Multiple processors execute different instructions on separate data.
Fig: Four CPUs, each executing a different operation (Multiply, Search, Add, Subtract) on its own data
This is the most complex form of parallel processing. It is used
for complex simulations, such as modeling the growth of cities.
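
A minimal Python sketch of the MIMD idea, assuming each worker process stands in for a CPU and runs a different function (multiply, search, add, subtract) on its own data; the task list and values are invented for illustration.

```python
# Each process runs a *different* instruction stream on its *own* data,
# mirroring the figure above.
from multiprocessing import Pool

def multiply(xs):  return [x * 3 for x in xs]
def search(xs):    return 42 in xs
def add(xs):       return sum(xs)
def subtract(xs):  return xs[0] - xs[1]

TASKS = [
    (multiply, [1, 2, 3]),
    (search,   [7, 42, 9]),
    (add,      [10, 20, 30]),
    (subtract, [100, 55]),
]

def run(task):
    func, data = task
    return func(data)

if __name__ == "__main__":
    with Pool(len(TASKS)) as pool:
        print(pool.map(run, TASKS))   # each worker executes a different function
```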
MIMD
• Currently the most common type of parallel computer.
• Multiple instruction: every processor may be executing a
different instruction stream.
• Multiple data: every processor may be working with a
different data stream.
Tightly Coupled Processors (H/W concepts)
– e.g., multiprocessors, in which two or more CPUs share a main memory.
– More difficult to build than multicomputers.
– Easier to program (desktop programming).
Multiprocessing system
• Each processor is assigned a specific duty, but the
processors work in close association, possibly sharing
one memory module.
• These CPUs have local caches and have access to a
central shared memory. The IBM p690 Regatta is an
example of a multiprocessing system (mainframe).
Multiprocessors
Consist of some number of CPUs, all connected to a
common bus along with a memory module.
Bus-based multiprocessors require caching; with caching,
memory incoherence becomes an issue.
Write-through cache (updating): any update goes through to
the actual memory (not only the cache).
Snooping (snoopy) cache (reading): every cache monitors the
bus, picks up any write-through to memory, and applies it to
itself if necessary.
It is possible to put about 32, or possibly 64, CPUs on a
single bus.
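
The write-through and snooping behaviour can be illustrated with a small, purely sequential Python model. The Bus and Cache classes below are teaching toys invented for this sketch, not a description of real cache-coherence hardware.

```python
# Toy model: the Bus calls every other cache's snoop() whenever one cache
# writes through to main memory.
class Bus:
    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def write_through(self, sender, addr, value):
        self.memory[addr] = value                  # update main memory
        for cache in self.caches:                  # every cache snoops the bus
            if cache is not sender:
                cache.snoop(addr, value)

class Cache:
    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write_through(self, addr, value)  # write-through to memory

    def snoop(self, addr, value):
        if addr in self.lines:                     # update only if we hold a copy
            self.lines[addr] = value

    def read(self, addr):
        return self.lines.get(addr, self.bus.memory.get(addr))

memory = {}
bus = Bus(memory)
c1, c2 = Cache(bus), Cache(bus)

c1.write(0x10, 7)       # CPU 1 writes; memory sees the write-through
print(c2.read(0x10))    # -> 7, served from memory (c2 holds no copy yet)
c2.write(0x10, 9)       # CPU 2 writes; c1's cached copy is updated by snooping
print(c1.read(0x10))    # -> 9
```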
Fig: A bus-based multiprocessor – three CPUs, each with its own cache, connected to a shared memory over a common bus
Why Multiprocessors?
1. Microprocessors are the fastest CPUs; collecting
several is much easier than redesigning one.
2. Instruction-level parallelism and multithreading on a
single processor are limited by data dependencies.
3. Improvements in parallel software (scientific apps,
databases, OS) need multiprocessors.
Introduction to distributed systems
Definitions
"A distributed system is a collection of independent
computers that appear to the users of the system as a
single computer." – Tanenbaum
"A distributed system is one in which hardware or software
components located at networked computers communicate
and coordinate their actions only by passing messages."
– Coulouris, Dollimore, Kindberg
Distributed Computing
• Distributed computing is the process of
aggregating the power of several computing
entities to collaboratively run a computational
task in a transparent and coherent way, so that it
appears as a single, centralized system.
• A distributed computer system is a loosely
coupled collection of autonomous computers
connected by a network using system software
to produce a single integrated computing
environment.
Features of DS
• Distributed computing consists of a network of
autonomous nodes.
• Loosely coupled.
• Nodes do not share primary or secondary storage.
• A well-designed distributed system does not crash if a
node goes down.
Cont..
• If you have to perform a computing task that is parallel
in nature, scaling your system is much cheaper by adding
extra nodes than by getting a faster single machine.
• Of course, if your processing task is highly non-parallel
(every result depends on the previous one), using a
distributed computing system may not be very beneficial.
Cont…
• Network connections are the key feature.
• Remote access is established by message passing
between nodes.
• Messages are sent from CPU to CPU.
• Protocols are designed for reliability, flow control,
failure detection, etc.
• The mode of communication between nodes is sending
and receiving network messages.
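
As a bare-bones sketch of message passing between two nodes, the Python snippet below assumes plain TCP sockets on a single machine; the port number and payload are arbitrary, and real systems layer the reliability, flow-control, and failure-detection protocols mentioned above on top of this.

```python
# One "node" listens on a TCP socket, another connects and sends a message.
import socket
import threading

HOST, PORT = "127.0.0.1", 50007     # assumed values for this sketch
ready = threading.Event()

def node_b():
    """Receiving node: waits for a message from the network."""
    with socket.socket() as s:
        s.bind((HOST, PORT))
        s.listen()
        ready.set()                              # signal that we are accepting
        conn, _ = s.accept()
        with conn:
            print("node B received:", conn.recv(1024).decode())

def node_a():
    """Sending node: opens a connection and sends one message."""
    ready.wait()                                 # don't connect before B listens
    with socket.socket() as s:
        s.connect((HOST, PORT))
        s.sendall(b"hello from node A")

t = threading.Thread(target=node_b)
t.start()
node_a()
t.join()
```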
Distributed OS vs Networking OS
• The machines supporting a distributed operating
system run under a single operating system that
spans the network.
• With a network operating system, each machine
runs an entire operating system.
Cont…
• Thus, in a DOS, the print task might be running on one
machine and the file system on another; each machine
cooperates by running part of the software of the single
distributed OS.
• In a NW OS, by contrast, each node runs the entire
operating system itself rather than a part of one spread
across the network.
Advantages of DS over Centralized Systems
• Better price/performance than mainframes.
• More computing power (parallel and distributed).
• Some applications inherently require distribution.
• Improved reliability, because the system can survive
the crash of one processor.
• Incremental growth can be achieved by adding one
processor at a time.
• Shared ownership is facilitated.
Disadvantages of DS.
• Network performance parameters:
– Latency: the delay that occurs after a send operation is
executed before data starts to arrive at the destination
computer.
– Data transfer rate: the speed at which data can be
transferred between two computers once transmission
has begun.
– Total network bandwidth: the total volume of traffic that
can be transferred across the network in a given time.
Disadvantages of DS.
• Dependency on reliability of the underlying
network.
• Higher security risk due to more possible
access points for intruders and possible
communication with insecure systems.
• Software complexity.
Loosely Coupled Processors (H/W concepts)
– e.g., multicomputers, in which each of the processors has its own memory.
– Easy to build and commercially available (PCs).
– More complex to program (desktop + socket programming).
DS consists of workstations on a LAN
Fig: Three workstations, each with its own CPU and local memory, connected by a network
Software Concepts
• Network Operating Systems (NOS)
– Loosely-coupled software on loosely-coupled
hardware
– e.g., a network of workstations connected by a LAN
– Each user has a workstation for his exclusive use
– Offers local services to remote clients
• Distributed Operating Systems (DOS)
– Tightly-coupled software on loosely-coupled
hardware
– Creating an illusion to a user that the entire network
of computers is a single timesharing system, rather
than a collection of distinct machines (single-system
image)
Software Concepts (Cont’d)
• DOS
– Users should not have to be aware of the existence of
multiple CPUs in the system
– No current system fulfills this requirement entirely
• Multiprocessor Operating Systems
– Tightly-coupled software on tightly-coupled hardware
– e.g., UNIX timesharing system with multiple CPUs
– Key characteristic is the existence of a single run
queue (same memory)
– Basic design is mostly the same as a traditional OS;
however, issues of process synchronization, task
scheduling, memory management, and security
become more complex as memory is shared by many
processors
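
The single run queue can be pictured with a small Python sketch in which worker threads stand in for CPUs, all pulling tasks from one shared, synchronized queue; the worker count and task values are assumptions made only for illustration.

```python
# All "CPUs" (threads) schedule work from the same shared run queue.
import queue
import threading

run_queue = queue.Queue()            # the single, shared run queue
NUM_CPUS = 4                         # assumed number of CPUs

def cpu(cpu_id):
    while True:
        task = run_queue.get()       # synchronized access to the shared queue
        if task is None:             # sentinel: no more work
            break
        print(f"CPU {cpu_id} runs task {task}")

workers = [threading.Thread(target=cpu, args=(i,)) for i in range(NUM_CPUS)]
for w in workers:
    w.start()

for t in range(10):                  # "processes" ready to run
    run_queue.put(t)
for _ in workers:                    # one sentinel per worker to shut down
    run_queue.put(None)
for w in workers:
    w.join()
```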
Summary
Comparison of three different ways of organizing n CPUs:

Item | NOS | DOS | Multiprocessor OS
Does it look like a virtual uniprocessor? | No | Yes | Yes
Do all have to run the same OS at a time? | No | Yes | Yes
How many copies of the OS are there? | N | 1 (replicated) | 1
How is communication achieved? | Shared files | Messages | Shared memory
Are agreed-upon network protocols required? | Yes | Yes | No
Is there a single run queue? | No | No | Yes