Distributed Computing
Why Networks?
Network connectivity is increasing.
Availability of powerful yet cheap microprocessors
(PCs, workstations, PDAs, embedded systems,
etc.)
Continuing advances in communication technology
Development of denser and cheaper memory/storage
Distributed Computing Systems
A collection of independent computers that appears
to its users as a single coherent system.
Example: the Internet
Distributed Computing Systems
A collection of (perhaps) heterogeneous nodes
connected by one or more interconnection
networks which provides access to system-wide
shared resources and services.
It is basically a collection of interconnected processors covering a wide geographical area, in which each processor has its own local memory and other peripherals. Communication between any two processors takes place by message passing over the communication network.
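As a rough illustration of message passing between two processors, here is a minimal Python sketch using TCP sockets; the host name and port are placeholders, not part of the original material:

    # Minimal message-passing sketch over TCP sockets (illustrative only).
    # HOST and PORT are placeholder values.
    import socket

    HOST, PORT = "localhost", 5000

    def receiver():
        # Runs on one processor: waits for a message from the network.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.bind((HOST, PORT))
            srv.listen(1)
            conn, _ = srv.accept()
            with conn:
                data = conn.recv(1024)
                print("received:", data.decode())

    def sender(payload: str):
        # Runs on another processor: sends a message to the receiver.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
            cli.connect((HOST, PORT))
            cli.sendall(payload.encode())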
Distributed Computing Systems
A type of computing in which different components
and objects comprising an application can be
located on different computers connected to a
network. So, for example, a word processing
application might consist of an editor component on
one computer, a spell-checker object on a second
computer, and a thesaurus on a third computer. In
some distributed computing systems, each of the
three computers could even be running a different
operating system.
The data used in a distributed processing
environment is also distributed across platforms.
Centralized Multi-user System
(Diagram: a mainframe or minicomputer accessed over a network)
Problems:
Single point of failure
Difficult to expand
Distributed Systems
(Diagram: heterogeneous computers, servers and databases connected by a network)
Distributed vs. Centralized Systems
Why distribute?
– Information sharing among distributed users
(groupware)
– Resource sharing
– Shorter response time & higher throughput
– Flexibility to spread load
– Incremental growth (extensibility)
– Better cost/performance ratio
– Higher reliability/availability (greater fault tolerance)
– Support for inherently distributed applications
– Flexibility in meeting users' requirements
Disadvantages
More Software Components
– The more software components a system comprises, the greater the chance of errors occurring.
Security
– Providing easy distributed access increases the risk of a
security breach occurring.
Networking
– The underlying network can saturate or cause other
problems.
Hardware Considerations
Architectures with multiple interconnected processors are of two types:
Tightly Coupled System
– Single system-wide primary memory.
– Communication takes place through the shared memory.
– Systems are limited by the bandwidth of the shared memory.
– Also known as parallel processing systems.
Loosely Coupled System
– Each processor has its own local memory.
– It can have an unlimited number of processors.
– Communication is done by passing messages across the network.
– It is known as a Distributed Computing System.
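The contrast between the two coupling styles can be sketched with Python's standard multiprocessing module on a single machine; the shared value and the queue are only analogies for system-wide primary memory and the communication network:

    # Illustrative contrast between the two coupling styles.
    from multiprocessing import Process, Queue, Value

    def shared_memory_worker(counter):
        # "Tightly coupled" style: communicate by updating shared memory.
        with counter.get_lock():
            counter.value += 1

    def message_passing_worker(mailbox):
        # "Loosely coupled" style: communicate by sending a message.
        mailbox.put("result computed on a remote processor")

    if __name__ == "__main__":
        counter = Value("i", 0)      # stands in for shared primary memory
        mailbox = Queue()            # stands in for the network

        p1 = Process(target=shared_memory_worker, args=(counter,))
        p2 = Process(target=message_passing_worker, args=(mailbox,))
        p1.start(); p2.start()

        print(mailbox.get())         # blocks until the message arrives
        p1.join(); p2.join()
        print("shared counter:", counter.value)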
Parallel vs. Distributed Architecture
Parallel vs. Distributed Systems
Memory
– Parallel systems: tightly coupled shared memory.
– Distributed systems: distributed memory; message passing, RPC, and/or use of distributed shared memory.
Control
– Parallel systems: global clock control.
– Distributed systems: no global clock control; synchronization algorithms needed.
Processor interconnection
– Parallel systems: order of Tbps; bus, mesh, tree, mesh of tree, and hypercube network.
– Distributed systems: order of Gbps; Ethernet (bus), token ring, and SCI (ring).
Main focus
– Parallel systems: performance (scientific computing).
– Distributed systems: performance, cost and scalability, reliability, information/resource sharing.
Distributed Operating System
Operating systems used for distributed computing systems can be of two types:
– Network Operating System
– Distributed Operating System
NOS vs. DOS
Features compared: single system image, autonomy, fault tolerance.
Single System Image
– Network OS: No. Users are aware that multiple computers are being used. Selection of the machine for executing a job is manual.
– Distributed OS: Yes. Provides a virtual uniprocessor image to the user; the DOS dynamically and automatically allocates jobs to the various machines.
Autonomy
– Network OS: High. A local OS at each computer; computers communicate via a common communication protocol and a shared file system. Each computer functions independently and manages its own processes and resources. System calls for different computers may be different.
– Distributed OS: Low. A single system-wide OS; processes and resources are managed globally. Single set of globally valid system calls.
Fault Tolerance
– Network OS: Unavailability grows as faulty machines increase.
– Distributed OS: Unavailability remains low even if faulty machines increase.
Network Operating System
Distributed Operating Systems
Evolution of DCS
Batch Processing System
– Batching together jobs with similar needs.
– Automatic sequencing of jobs with control cards.
– Off-line processing (By using buffering, spooling)
– User does not directly interact with the computer system.
Disadvantages
– Less user interaction.
– No sharing of resources
– Job set up time still significant for new batch of jobs.
– CPU remains idle during transition.
– Speed mismatch between the CPU and I/O devices.
Time-Sharing Systems
– Several dumb terminals are attached to main computer.
– Multiple users could now simultaneously execute interactive jobs and share the resources.
– The CPU is multiplexed among several jobs that are kept in memory.
• Advantages
– Reduces CPU idle time.
– Avoid duplication of software
• Disadvantages
– Increased overhead. (Time to swap page in and out)
– Terminals could not be placed far from main computer.
• Due to advances in networking technologies, LANs and WANs came into existence, leading to the evolution of distributed computing.
Distributed Computing
System Models
Minicomputer Model
(Diagram: several minicomputers, each with its own users, connected by a communication network such as the ARPAnet)
Extension of Time sharing system
– User must log on to his home minicomputer.
– Thereafter, he can log on to a remote machine by telnet.
– Does not reflect uniprocessor image.
Used basically for resource sharing (databases, high-performance devices)
Workstation Model
(Diagram: several workstations connected by a communication network)
– A powerful, single-user computer, like a personal computer but with a more powerful microprocessor. Each has its own local disk and a local file system (a diskful workstation).
Process migration
– User first logs on to his/her personal workstation.
– If there are idle remote workstations, may migrate one or
more processes to one of them.
– Result of execution migrated back to user’s workstation.
Issues to be resolved:
– How to find an idle workstation
– How to migrate a job
– What if a user logs on to the remote machine executing
process of another machine – run two processes
simultaneously, kill remote process, migrate process back
to its home workstation ?
Examples – Sprite System, Xerox PARC
Workstation-Server Model
(Diagram: client workstations connected by a high-speed LAN to minicomputer servers: a file server, a database server and a print server)
Client workstations
– Largely Diskless
– Local disk of diskful workstation used for storage of
temporary files etc.
Server minicomputers
– Each minicomputer is dedicated to one or more different
types of services, for managing & providing access to
shared resources.
– Multiple servers used for a service for better scalability
and higher reliability.
The user logs on to his machine. Normal computation activities are carried out at the home workstation, but services are provided by special servers.
No process migration involved.
Advantages
– Cheaper – few minicomputers vs. large no. of diskful
workstations
– Backup and hardware maintenance easier
– Flexibility to access files from any file server
– No process migration
– Guaranteed response time
Disadvantage
– Does not exploit idle workstations
Client-Server model of communication
– RPC (Remote Procedure Call)
– RMI (Remote Method Invocation)
Example: V system
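A minimal sketch of the client-server (RPC) style of communication, using Python's standard xmlrpc modules; the service name, host and port are illustrative placeholders, not the actual V system interface:

    # RPC sketch using the standard xmlrpc modules.

    # --- on the server minicomputer ---
    from xmlrpc.server import SimpleXMLRPCServer

    def read_file(name):
        # Stand-in for a real file service that would read from disk.
        return "contents of " + name

    server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
    server.register_function(read_file, "read_file")
    # server.serve_forever()      # start handling client requests

    # --- on the client workstation ---
    import xmlrpc.client

    proxy = xmlrpc.client.ServerProxy("http://file-server-host:8000/")
    # text = proxy.read_file("report.txt")   # looks local, executes remotely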
Processor-Pool Model
(Diagram: user terminals connected by a LAN to a run server and a pool of processors, server 1 to server N)
Processors (microcomputers and minicomputers ) are pooled
together to be shared by the users as needed.
Each processor has its own memory to load and run a system
program or an application program of the DCS.
Clients:
– They log on to one of the terminals (diskless workstations
or graphic terminals)
– All services are dispatched to servers.
Servers:
– The necessary number of processors is allocated to each user from the pool by the run server.
No concept of a home machine. The user logs on to the system as a whole.
Better utilization of processing power but less interactivity
Greater flexibility – processors can act as extra servers
Unsuitable for high-performance interactive applications, as communication between processor and terminal is slow
Example – Amoeba, Cambridge Distributed System
Hybrid Model
Combines the advantages of both the workstation-server and the processor-pool models.
Based on the workstation-server model with an additional pool of processors.
The processors in the pool can perform large computations, while the workstation-server part handles user-interactive jobs.
Hybrid model is more expensive to implement.
Issues in Distributed
Computing System
Transparency
How to achieve the single-system image, i.e., how
to make a collection of computers appear as a
single computer.
Transparency in a Distributed System
Access Transparency
– Hide differences in data representation and in how a resource (local or global) is accessed. Use a global set of system calls and a global resource naming facility (e.g., URLs).
Location Transparency
– Hides where a resource is located
– Name transparency – Name of resource should not reveal
its physical location. Resource names must be unique
system wide.
– User Mobility – User should be able to freely log on to any
machine in the system and access a resource with the
same name.
Replication Transparency
– Naming of replicas – map user supplied name of resource
to appropriate replica.
– Replication control: how, where and when to replicate
Failure Transparency
– Partial failure transparency
– Complete failure transparency
Migration Transparency
– Movement of object is handled automatically by system &
following issues are taken care of –
• Migration decision made automatically by system.
• Name of resource remains same on migration from
one node to another
• IPC ensures proper receipt of message by process,
even if it further migrates.
Concurrency Transparency
– Hide that a resource may be shared by several
competitive users. It is achieved by
• Event ordering property (see the logical-clock sketch after this list)
• Mutual exclusion property
• No starvation property
• No deadlock property
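The event ordering property is commonly supported with logical clocks; below is a minimal Lamport clock sketch in Python, offered as an illustration of the idea rather than something prescribed by the slides:

    # Minimal Lamport logical clock: timestamps give a consistent ordering
    # of events without any global clock.
    class LamportClock:
        def __init__(self):
            self.time = 0

        def tick(self):
            # Local event or message send: advance the clock.
            self.time += 1
            return self.time

        def receive(self, msg_timestamp):
            # Message receipt: jump past both clocks, then tick.
            self.time = max(self.time, msg_timestamp) + 1
            return self.time

    # Process P sends a message carrying its timestamp to process Q.
    p, q = LamportClock(), LamportClock()
    ts = p.tick()          # P's clock becomes 1
    q.receive(ts)          # Q's clock becomes max(0, 1) + 1 = 2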
Performance Transparency
– The system is automatically reconfigured as the load on it varies.
Scaling Transparency
– System can expand in scale without disrupting activities of
users
Persistence Transparency
– Hides whether a (software) resource is in memory or on
disk
Reliability
Faults
– Fail stop : system stops functioning
– Byzantine failure : system produces wrong result
Fault avoidance
– Occurrence of faults is minimized by making components
more reliable
Fault tolerance
– Redundancy techniques
• K-fault tolerance needs K + 1 replicas
• K-Byzantine failures need 2K + 1 replicas.
– Distributed control
• Avoiding single point of failure
Fault detection and recovery
– Atomic transaction
– Stateless servers
– Ack & timeout based retransmissions of messages
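A hedged sketch of ack & timeout based retransmission over UDP in Python; the server address, timeout and retry limit are placeholder values:

    # Send a datagram, wait for an acknowledgement, retransmit on timeout.
    import socket

    SERVER_ADDR = ("server-host", 9000)
    TIMEOUT_SECONDS = 2.0
    MAX_RETRIES = 3

    def send_reliably(payload: bytes) -> bytes:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(TIMEOUT_SECONDS)
            for _ in range(MAX_RETRIES):
                sock.sendto(payload, SERVER_ADDR)
                try:
                    ack, _ = sock.recvfrom(1024)
                    return ack                    # acknowledged: done
                except socket.timeout:
                    continue                      # lost message or ack: retry
        raise RuntimeError("no acknowledgement; report failure to caller")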
Flexibility
Ease of modification
Ease of enhancement
Choosing appropriate kernel
– Monolithic kernel: the entire operating system runs in kernel space, in supervisor mode.
– Microkernel: the kernel is reduced to the minimal necessary facilities, and the other system services reside in user space as normal processes (so-called servers). Because these servers no longer run in kernel space, additional context switches are needed to let user processes enter privileged mode (and to exit it again).
Monolithic kernel vs. Micro kernel
Performance
Various performance metrics:
– response time
– throughput
– system utilization
– network capacity utilization
Design issues to increase performance
– Batch if possible
– Cache whenever possible (see the sketch after this list)
– Minimize copying of data
– Minimize network traffic
– Fine-grained parallelism (a large number of small computations with frequent interaction) vs. coarse-grained parallelism (large computations, low interaction rates and little data)
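As a small illustration of the caching hint, the sketch below memoizes a hypothetical remote call so that repeated requests avoid network traffic:

    # Client-side caching of a (hypothetical) remote call.
    from functools import lru_cache

    def fetch_from_server(key):
        # Placeholder for an expensive request that crosses the network.
        return "value for " + key

    @lru_cache(maxsize=128)
    def cached_fetch(key):
        # Repeated requests for the same key are answered locally.
        return fetch_from_server(key)

    cached_fetch("config")   # first call goes to the server
    cached_fetch("config")   # answered from the local cache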
Scalability
Capability of a system to adapt to increased service load.
– Avoid centralized entities
– Avoid centralized algorithms
– Perform most operations on client workstations
Geographical scalability is also difficult: LANs are usually based on synchronous communication, while WANs are inherently unreliable. It is also difficult to scale a system across multiple, independent administrative domains.
Examples of centralization that limits scalability:
– Centralized services: a single server for all users.
– Centralized data: a single on-line telephone book.
– Centralized algorithms: doing routing based on complete information.
Heterogeneity
Caused by interconnected sets of dissimilar hardware or
software systems (Ex: different topologies, protocols, word
lengths etc)
Data and instruction formats depend on each machine
architecture
If a system consists of K different machine types, each sender/receiver needs K–1 pieces of translation software (one for every other machine type).
Using an intermediate standard data format reduces this to a single conversion routine per machine.
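A minimal sketch of an intermediate standard data format using Python's struct module, where the "!" prefix forces network (big-endian) byte order regardless of the local machine architecture:

    # Sender and receiver agree on a standard wire format.
    import struct

    def encode_record(seq_no, value):
        # Convert from the sender's native representation to the wire format.
        return struct.pack("!If", seq_no, value)

    def decode_record(data):
        # Convert from the wire format to the receiver's native representation.
        return struct.unpack("!If", data)

    wire = encode_record(42, 1.5)
    print(decode_record(wire))    # (42, 1.5) regardless of machine type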
Security
Lack of a single point of control & use of insecure networks for
data communication
Security concerns:
– Messages may be stolen, plagiarized (copied and passed off as your own) or changed by an intruder.
– It must be ensured that a message is received by the intended receiver and was sent by the genuine sender.
Cryptography used for security.
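One standard cryptographic building block for these concerns is a message authentication code; the sketch below uses Python's hmac module with a placeholder shared key:

    # Detecting tampering with an HMAC from the standard library.
    import hashlib, hmac

    SHARED_KEY = b"placeholder-secret-key"

    def sign(message: bytes) -> bytes:
        return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()

    def verify(message: bytes, tag: bytes) -> bool:
        expected = hmac.new(SHARED_KEY, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)   # constant-time comparison

    msg = b"transfer 100 to account 7"
    tag = sign(msg)
    assert verify(msg, tag)                               # genuine, unchanged
    assert not verify(b"transfer 900 to account 7", tag)  # change detected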
Emulation of Existing OS
Middleware
Middleware is an additional layer of software used on top of a NOS to (more or less) hide the heterogeneity of the collection of underlying platforms and to improve distribution transparency.
It offers a higher level of abstraction.
It is placed in the middle between applications & NOS.
Distributed System as Middleware
Distributed Computing Environment
(DCE)
• It is an integrated set of services and tools that can be installed as a coherent environment on top of existing operating systems and serve as a platform for building and running distributed applications.
• It runs on many different kinds of computers, operating systems, and networks produced by different vendors.
• It hides differences between machines by automatically performing data-type conversions, thus making the heterogeneous nature of the system transparent to application programmers.
(Diagram: DCE applications layered on top of DCE software, which runs on top of the operating system and networking)
DCE Components
– Thread package: used in concurrent applications.
– RPC facility: necessary to build client-server applications; forms the basis for communication.
– Distributed Time Service (DTS): synchronizes the clocks of all computers in the system (a rough synchronization sketch follows this list).
– Name Services: allow resources to be uniquely named and accessed in a location-transparent manner.
– Security Services: provide tools for authentication and authorization.
– Distributed File Service (DFS): provides a system-wide file system.
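As an illustration of what a time service must do, here is a rough Cristian-style clock synchronization sketch in Python; this is not the actual DTS protocol, and get_server_time() is a hypothetical stand-in for the remote call:

    # Estimate the server's clock, compensating for half the round-trip delay.
    import time

    def get_server_time():
        # Placeholder: in a real system this is a request to the time server.
        return time.time()

    def synchronized_time():
        t0 = time.monotonic()
        server_time = get_server_time()        # one round trip to the server
        t1 = time.monotonic()
        round_trip = t1 - t0
        return server_time + round_trip / 2    # assume symmetric network delay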
DCE Cells
• A cell is a group of users, machines, or resources that have a common purpose and share common DCE services.
• It helps to break down a large system into smaller, manageable units.
The minimum cell configuration requires:
• A cell directory server
• A security server
• A distributed time server
• One or more client machines
Factors for deciding cell boundaries:
– Purpose
• Users working on a common goal should be put in the same cell.
– Administration
• Machines known to and manageable by an administrator are put in one cell.
– Security
• Users of machines that trust each other are put in the same cell.
– Overhead
• Avoid communication overhead by putting users that interact frequently in the same cell.
1. Suppose a component of a distributed system suddenly
crashes. How will this cause inconvenience to the users when
one of the following happens:
• The system uses the processor-pool model and the crashed component is a processor in the pool.
• In the processor-pool model, a user terminal crashes.
• The system uses the workstation-server model and a server crashes.
• In the workstation-server model, one of the clients crashes.