chapter 2 AOS INTRODUCTION


Distributed Systems
Introduction to Distributed Systems
• Why do we develop distributed systems?
– availability of powerful yet cheap microprocessors
(PCs, workstations), continuing advances in
communication technology.
• What is a distributed system?
• A distributed system is a collection of independent
computers connected by a high-speed network that
appears to its users as a single system.
Examples of DS
– Network of workstations: personal workstations +
a pool of processors + a single file system
– Distributed manufacturing system (e.g. an automated
assembly line): robots on the assembly line +
robots in the parts department
– Network of branch office computers: a large bank
with hundreds of branch offices all over the world
Advantages of Distributed Systems
over Centralized Systems
• Economics: a collection of microprocessors offers a
better price/performance ratio than a mainframe, so a
distributed system is a cost-effective way to increase
computing power.
• Speed: a distributed system may have more total
computing power than any single mainframe. Ex: 10,000
CPU chips, each running at 50 MIPS, give 500,000 MIPS;
a single 500,000-MIPS processor is impossible to build,
since it would require a 0.002 ns instruction cycle.
Load distribution across machines also enhances
performance.
Advantages of Distributed Systems
over Centralized Systems
• Inherent distribution: some applications are
inherently distributed. Ex: a supermarket chain;
cooperative work (a group of people at distant
sites working together to create a joint report).
• Reliability: if one machine crashes, the system as
a whole can still survive, giving higher availability and
improved reliability. Ex: nuclear reactors, aircraft.
• Incremental growth: computing power can be
added in small increments; modular expandability
means gradual expansion is possible.
• Another driving force: the existence of large
numbers of personal computers and the need for
people to collaborate and share information.
Advantages of Distributed Systems
over Independent PCs
– Data sharing: allow many users to access a common
database. Ex: an airline/railway reservation system.
– Resource sharing: expensive peripherals such as color
printers, phototypesetters, and storage devices can be
shared in a DS.
– Communication: make human-to-human communication
easier, for example by e-mail or chat.
– Flexibility: spread the workload over the available
machines in the most cost-effective way.
Disadvantages of Distributed
Systems
– Software: it is difficult to develop software, programming
languages, and applications for distributed systems.
– Network: the network can saturate or cause other
problems, i.e. it may lose messages or become
overloaded.
– Security: easy access also applies to secret
data.
Hardware Concepts of DS
• MIMD (Multiple-Instruction Multiple-Data)
• Tightly Coupled versus Loosely Coupled
MIMD
 Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
 Loosely coupled systems (multicomputers), e.g. machines on a LAN
o private memory
o intermachine delay long, data rate low
Bus versus Switched MIMD
• Bus: a single network, backplane, bus, cable, or other
medium that connects all machines. E.g., cable TV.
• Switched: individual wires from machine to machine, with
many different wiring patterns in use. Messages are
passed from machine to machine, and the routes can be
changed by switching. Ex: the worldwide telephone system.
• Categories of MIMD( DS ) :
– Multiprocessors (shared memory)
• Bus multiprocessor
• Switched multiprocessor
– Multicomputers (private memory)
• Bus Multicomputers
• Switched Multicomputers
 Write-through cache: if a word is written to the
cache it is written to memory as well,
i.e. the write is propagated immediately.
 Snoopy cache: whenever a cache sees a write
occurring to a memory address it holds, it
either removes that entry from the cache or
updates it with the new value.
Switched Multiprocessor
• A crossbar switch connecting n CPUs to n
memories needs n² crosspoint switches, which
becomes impractical for large n.
• The solution is to use an omega network:
– the 4×4 version contains four 2×2 switches,
– each with 2 inputs and 2 outputs,
– and each switch can route either input to either
output.
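The switch settings follow a simple rule, often called destination-tag routing: at each stage the next bit of the destination address (most significant first) selects the upper or lower output of the 2×2 switch. A minimal sketch, with function names of my own choosing:

```python
# Destination-tag routing through an n x n omega network (n a power of two).
def omega_route(src, dst, n):
    """Return the sequence of positions a request visits while travelling
    from CPU `src` to memory `dst`; there are log2(n) switching stages."""
    bits = n.bit_length() - 1              # number of stages = log2(n)
    pos, path = src, [src]
    for stage in range(bits):
        # perfect shuffle wiring: rotate the position's bits left by one
        pos = ((pos << 1) | (pos >> (bits - 1))) & (n - 1)
        # the stage-th bit of dst (MSB first) picks the upper (0)
        # or lower (1) output of the 2x2 switch at this stage
        bit = (dst >> (bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path

# In the 4x4 network (four switches, two stages) any CPU reaches any
# memory in exactly log2(4) = 2 stages:
print(omega_route(2, 1, 4))                # the final position is memory 1
```

The path always ends at `dst`, for every source/destination pair, which is why the omega network gets by with n/2 · log2(n) switches instead of the crossbar's n² crosspoints.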
Multicomputers
• Bus-Based Multicomputers
– Each CPU has a direct connection to its own
local memory.
– Easy to build.
– Communication volume is much smaller, so a
relatively slow LAN suffices (10-100 Mbps,
compared to 300 Mbps and up for a backplane
bus).
Bus Multicomputer
Switched Multicomputers
Switched Multicomputers:
- Each CPU has direct and exclusive access to its
own, private memory.
- Interconnection networks: e.g., grid,
hypercube.
- Grids: easy to understand and lay out on a PCB;
best suited for problems with a two-dimensional nature.
Ex: graph theory.
- Hypercube: an n-dimensional cube in which each CPU
has n connections to other CPUs.
- Message passing is easy because nearest
neighbours are directly connected.
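The hypercube's wiring has a neat binary description: label the CPUs with n-bit numbers, and two CPUs are neighbours exactly when their labels differ in one bit. Routing is then "flip one differing bit per hop". A short sketch (the function names are illustrative):

```python
# n-dimensional hypercube: labels are n-bit numbers, one link per dimension.
def neighbours(cpu, n):
    """All CPUs directly wired to `cpu` in an n-cube (one per dimension)."""
    return [cpu ^ (1 << d) for d in range(n)]

def route(src, dst, n):
    """Greedy hop-by-hop path: correct one differing address bit per hop."""
    path = [src]
    while src != dst:
        diff = src ^ dst                 # bits still to correct
        src ^= diff & -diff              # flip the lowest differing bit
        path.append(src)
    return path

# 3-cube: CPU 0 has exactly 3 neighbours, and a message needs at most
# 3 hops (the Hamming distance between the source and destination labels).
print(neighbours(0, 3))   # [1, 2, 4]
print(route(0, 7, 3))     # [0, 1, 3, 7]
```

Note that the worst-case path length grows only as log2 of the number of CPUs, which is what makes large hypercube multicomputers practical.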
Switched Multicomputer
Software Concepts
• Software is more important to users
• Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing
Network Operating Systems
 Loosely-coupled software on loosely-coupled hardware.
 A network of workstations connected by a LAN.
 Each user has a workstation for his or her exclusive use, running
its own operating system.
 All commands normally run locally.
 Sometimes a user can log into another machine remotely:
o rlogin machine
o rcp machine1:file1 machine2:file2
 To avoid communication problems, one approach is to provide
a shared, global file system accessible from all workstations.
 File servers: the machines that support the shared file system
are called file servers, giving a client-server model. Refer fig …
 Clients mount directories exported by file servers.
 Best-known network OS:
o Sun's NFS (Network File System) for shared file systems.
 A situation where each machine has a high degree of autonomy and
there are few system-wide requirements is called a NOS.
NFS
• NFS Architecture
– Server exports directories
– Clients mount exported directories
• NFS Protocols
– For handling mounting
– For read/write: no open/close, stateless
• NFS Implementation
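The "no open/close, stateless" point deserves a concrete illustration: every NFS read request carries the file handle, offset, and byte count itself, so the server keeps no per-client state and can crash and restart between requests without clients noticing. A toy model of that idea (not the real NFS RPC interface; the class and handle names are my own):

```python
# Sketch of stateless reads: each request is fully self-describing.
class StatelessFileServer:
    def __init__(self, files):
        self.files = files               # file handle -> file contents

    def read(self, handle, offset, count):
        # No open/close calls and no cursor kept on the server:
        # the client supplies everything needed for this one request.
        return self.files[handle][offset:offset + count]

server = StatelessFileServer({"h1": b"hello, distributed world"})
print(server.read("h1", 7, 11))          # each call stands alone

# A server "restart" loses nothing the client depends on:
server = StatelessFileServer({"h1": b"hello, distributed world"})
print(server.read("h1", 0, 5))           # the same reads still succeed
```

Contrast this with a stateful design, where the server would hold an open-file table and a per-client cursor; a crash would then strand every client mid-session.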
(True) Distributed Systems
 tightly-coupled software on loosely-coupled hardware
 provide a single-system image or a virtual uniprocessor
 a single, global interprocess communication mechanism,
process management, file system; the same system call
interface everywhere
 Ideal definition:
“ A distributed system runs on a collection of computers
that do not have shared memory, yet looks like a
single computer to its users.”
Multiprocessor Operating Systems
• Tightly-coupled software on tightly-coupled
hardware
• Examples: high-performance servers
• Shared memory
• Single run queue
• Traditional file system as on a single-processor
system: central block cache
Design Issues of Distributed
Systems
• Transparency
• Flexibility
• Reliability
• Performance
• Scalability
1. Transparency
• How to achieve the single-system image, i.e., how
to make a collection of computers appear as a
single computer. A system that realizes this goal is
said to be transparent.
• Hiding all distribution from the users as well as from
the application programs can be achieved at two
levels:
1) hide the distribution from users;
2) at a lower level, make the system look
transparent to programs.
Both levels require uniform interfaces, such as for
access to files and communication.
Types of transparency
– Location transparency: users cannot tell where hardware and
software resources such as CPUs, printers, files, and databases are
located. The name of a resource must not encode its location,
so machine1:prog.c or /machine1/prog.c are not
acceptable.
– Migration transparency: resources must be free to move from one
location to another without their names changing.
E.g., /usr/lee, /central/usr/lee
– Replication transparency: the OS can make additional copies of files
and resources on its own without the users noticing.
– Concurrency transparency: users do not notice the existence of
other users. One mechanism to achieve this is to lock a resource
automatically when someone starts to use it and unlock it when the
access is finished: lock and unlock for mutual exclusion.
– Parallelism transparency: automatic use of parallelism without
having to program it explicitly.
There are times when users do not want complete transparency.
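The lock/unlock mechanism mentioned under concurrency transparency is easy to demonstrate with threads standing in for concurrent users. A minimal sketch, assuming a shared counter as the contended resource (the `SharedAccount` name is my own):

```python
# Mutual exclusion hides concurrent users from one another: each
# read-modify-write happens atomically, so no user ever observes
# (or destroys) another user's half-finished update.
import threading

class SharedAccount:
    def __init__(self):
        self.balance = 0
        self._lock = threading.Lock()    # mutual-exclusion lock

    def deposit(self, amount):
        with self._lock:                 # lock on entry, unlock on exit
            current = self.balance       # the read-modify-write below
            self.balance = current + amount  # is now atomic

account = SharedAccount()
threads = [
    threading.Thread(target=lambda: [account.deposit(1) for _ in range(1000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(account.balance)                   # 4000: no deposit was lost
```

Without the lock, two users could both read the same `balance` and one deposit would silently overwrite the other, which is precisely the interference that concurrency transparency is meant to hide.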
2. Flexibility
• Make it easier to change.
• Monolithic kernel: each machine runs a traditional
kernel that provides most services itself.
• System calls are trapped and executed by the kernel; all
system calls are served by the kernel, e.g., UNIX.
• Adv: performance.
• Ex: today's centralized operating systems with networking
facilities and remote services, such as the Sprite OS.
• Microkernel: provides minimal services:
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level I/O
E.g., Amoeba and Mach can support multiple file systems and
multiple system interfaces. Most DSs use this kind of
kernel.
3. Reliability
• A goal behind designing a DS is to make
it more reliable than a single-processor
system.
• Aspects of reliability:
– Availability: the fraction of time the system is
usable. A key tool to improve availability is
redundancy, i.e. key pieces of hardware and
software can be replicated.
– Security: files and other resources must be
protected from unauthorized users.
– Fault tolerance: the need to mask failures and
recover from errors.
4. Performance
• The performance problem is compounded by the
fact that communication, which is essential
in a DS, is slow.
• Performance loss due to communication
delays:
– fine-grained parallelism: a high degree of
interaction, not good for a DS;
– coarse-grained parallelism: the best fit for a DS
(jobs that involve large computations, few
interactions, and little data).
• Performance is also lost by making the system
fault tolerant.
5. Scalability
• Systems grow with time or become obsolete.
Techniques that require resources linear in the size
of the system are not scalable; e.g., a
broadcast-based query won't work for large
distributed systems.
• Examples of bottlenecks:
o Centralized components: a single mail server
o Centralized tables: a single URL address book
o Centralized algorithms: routing based on
complete information. Only decentralized
algorithms should be used in a DS.
Characteristics of Decentralized Algorithms
• No machine has complete information about the
system state.
• Machines make decisions based on local
information only.
• Failure of one machine does not ruin the algorithm.
• There is no implicit assumption that a global
clock exists.
Experiments & Assignments
1. Simulating NOS commands:
1. Implementation of a stack
2. Implementation of a queue
2. RMI supporting distributed computing in Java:
1. Design a GUI-based calculator
2. Retrieve time & date from a server to a client
3. Develop a program for a client chat server
4. Remote objects for database access
5. Amoeba, V-System, Mach, Chorus: case studies
6. Centralized & distributed algorithms for clock synchronization
7. Deadlock prevention & deadlock detection algorithms
8. Election algorithms for a coordinator
9. Load-balancing algorithms