Distributed Systems

Definition
Definition of a Distributed System
A distributed system is a collection of independent computers that
appears to its users as a single system.
A definition (Coulouris et al.)
 System of networked
computers that
 communicate and coordinate their
actions only by passing messages
 concurrent execution of programs
 components fail independently of
one another
A definition (Lamport)
 “You know you have a distributed
system when the crash of a
computer you’ve never heard of
stops you from getting any work
done.”
 inter-dependencies
 shared state
Distributed System
Operating System types
 Centralized Systems
o Process management
 Network System
o Share resources
o Remote access
o Telnet / FTP
o No direct control from one machine to another
 Distributed system
o Global view of the file system
o Global name
o Global time…
Distributed system properties
 Connect users and resources
 Transparency
 Scalability
 Openness
Transparency
 To hide the fact that machines are
physically distributed, so that the
distributed system appears as a
single system
Transparency in a Distributed System
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
Different forms of transparency in a distributed system.
Types of transparency
– Location Transparency: users cannot tell where hardware
and software resources such as CPUs, printers, files, and
databases are located.
– Migration Transparency: resources must be free to move
from one location to another without their names changed.
E.g., /usr/lee, /central/usr/lee
– Replication Transparency: OS can make additional copies of
files and resources without users noticing.
– Concurrency Transparency: The users are not aware of the
existence of other users. Need to allow multiple users to
concurrently access the same resource. Lock and unlock for
mutual exclusion.
– Parallelism Transparency: Automatic use of parallelism
without having to program explicitly. The holy grail for
distributed and parallel system designers.
Users do not always want complete transparency, e.g., having
a print job sent to a fancy printer 1000 miles away.
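The lock-and-unlock idea mentioned under concurrency transparency can be sketched in Python. This is only a single-machine illustration, assuming threads stand in for competing users. `SharedCounter` and `run_users` are hypothetical names made up for this example, and a real distributed system would need a distributed locking mechanism rather than `threading.Lock`.

```python
import threading


class SharedCounter:
    """A resource shared by several users (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()  # guards self.value

    def increment(self):
        # Lock before touching the shared state; the lock is
        # released automatically when the with-block exits.
        with self._lock:
            self.value += 1


def run_users(num_users, increments_per_user):
    """Let num_users threads update one counter concurrently."""
    counter = SharedCounter()

    def user():
        for _ in range(increments_per_user):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(num_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_users(4, 1000))
```

Each user simply calls `increment()` and never sees the other users or the lock, which is the point of concurrency transparency.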
Scalability
 It is the ability to send anything to
anyone, anywhere
 Scalability with respect to size means it is
easy to add more users and resources
 Geographic scalability
Scalability
 A distributed system is scalable if it
remains effective as the number of users
and/or resources increases
 Challenges:
o Controlling resource costs
o Controlling performance loss
o Preventing resources from running out
o Avoiding performance bottlenecks
Scalability Problems
Decentralized algorithms avoid these problems. Their characteristics:
• No machine has complete information
about the system state.
• Machines make decisions based only on
local information.
• Failure of one machine does not ruin the
algorithm.
• There is no implicit assumption that a
global clock exists.
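One well-known way to order events without a global clock is a Lamport logical clock. The slides do not name this technique, so the sketch below is only an illustration of deciding from local information. `LamportClock` and its method names are assumptions made for this example.

```python
class LamportClock:
    """Logical clock kept by one machine using only local information."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the local counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Jump past the sender's timestamp, then count the receive event.
        self.time = max(self.time, message_time) + 1
        return self.time


if __name__ == "__main__":
    a, b = LamportClock(), LamportClock()
    a.tick()
    stamp = a.send()
    print(b.receive(stamp))
```

No machine reads another machine's clock directly; the only shared information is the timestamp carried in each message.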
Openness / Flexibility
 An open distributed system has clear
rules that govern its services
 An open distributed system is flexible:
it is easy to configure the system for
different developers
Openness
 When protocols are known to developers,
extensibility and maintainability become
possible
 Openness allows re-implementation of
different components of the system
 Important factors:
o Specification
o Documentation
o Published interfaces (often bypassing standards
organizations)
Reliability
• A distributed system should be more reliable
than a single system. Example: 3 machines,
each up with probability 0.95. Assuming
independent failures, the probability that at
least one is up is 1 - 0.05**3 = 0.999875.
– Availability: fraction of time the system is
usable. Redundancy improves it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover
from errors.
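The redundancy arithmetic in the example can be checked with a few lines of Python. This is a minimal sketch, assuming that machines fail independently and that the system counts as up when at least one machine is up. The function name `availability` is made up for this example.

```python
def availability(p_up, n):
    """Probability that at least one of n machines is up.

    Assumes machines fail independently, each up with probability p_up.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    p_all_down = (1.0 - p_up) ** n  # every machine is down at once
    return 1.0 - p_all_down


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(availability(0.95, n), 6))
```

If the system instead needs every machine to be up, the relevant figure is `p_up ** n`, which falls as machines are added.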
Security
 Three components:
o Confidentiality (protection against disclosure to
unauthorized individuals)
o Integrity (protection against alteration or
corruption)
o Availability (protection against interference with
the means of accessing the resources)
 The challenge: sending sensitive information
in a network message in a secure manner
efficiently
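As an illustration of the integrity component only, the sketch below attaches a message authentication code to a network message using Python's standard `hmac` and `hashlib` modules. It assumes sender and receiver already share a secret key, and it does not provide confidentiality. The function names, the key, and the message are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for message under a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; False means altered or wrong key."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    key = b"shared-secret-for-demo-only"  # hypothetical key
    msg = b"grade=A"                      # hypothetical message
    tag = sign(key, msg)
    print(verify(key, msg, tag))         # unmodified message
    print(verify(key, b"grade=F", tag))  # altered in transit
```

The receiver recomputes the tag and rejects the message when it does not match, so an alteration in transit is detected rather than prevented.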
Security
 Scenario 1: Accessing exam information via a
network file system
o Authentication: how do we know for sure that the
user is a teacher who should have access to the
data?
 Scenario 2: Sending a credit card number over
the Internet
o Confidentiality: no one other than the recipient should
be able to read the data
Hardware concepts of DS
 A distributed system may be built from
multiprocessors or multicomputers
Hardware Concepts
 Tightly Coupled versus Loosely
Coupled
 Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
 Loosely coupled systems (multicomputers)
o private memory
o intermachine delay long, data rate low
Multicomputers
 Bus-Based Multicomputers
o easy to build
o communication volume much smaller
o relatively slow speed LAN (10-100 Mbps,
compared to 300 Mbps and up for a
backplane bus)
 Switched Multicomputers
o interconnection networks: e.g., grid,
hypercube
o hypercube: n-dimensional cube
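To make the hypercube topology concrete: in an n-dimensional hypercube the 2**n nodes can be labeled with n-bit numbers, and two nodes are linked when their labels differ in exactly one bit. The Python sketch below lists a node's neighbors under that labeling. The function name `hypercube_neighbors` is made up for this example.

```python
def hypercube_neighbors(node, dimensions):
    """Return the nodes directly linked to `node` in a hypercube.

    Nodes are numbered 0 .. 2**dimensions - 1; two nodes are linked
    when their binary labels differ in exactly one bit.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node label out of range")
    # Flip each of the `dimensions` bits in turn.
    return [node ^ (1 << bit) for bit in range(dimensions)]


if __name__ == "__main__":
    for node in range(8):
        print(node, hypercube_neighbors(node, 3))
```

Every node has exactly n links, so the wiring per node grows only logarithmically with the number of nodes.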
11/14/00
[Figure 1.6: Hardware Concepts (multiprocessors and multicomputers)]
[Figure 1.7: Multiprocessors]
Switched Multiprocessors
 for connecting a large number (over 64) of
processors
 crossbar switch: n**2 switch points
 omega network: 2x2 switches for n CPUs
and n memories, log n switching stages
(log base 2), each with n/2 switches,
for a total of (n log n)/2 switches
 building a large, tightly-coupled, shared
memory multiprocessor is possible, but is
difficult and expensive
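The switch counts quoted above can be compared with a short Python sketch. It assumes n is a power of two so that log n (base 2) is a whole number of stages. The function names are made up for this example.

```python
def crossbar_switch_points(n):
    """Crossbar connecting n CPUs to n memories: n**2 switch points."""
    return n * n


def omega_network(n):
    """Return (stages, switches_per_stage, total_switches) for an
    omega network built from 2x2 switches, n CPUs and n memories.

    Assumes n is a power of two.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1  # log2(n) for a power of two
    per_stage = n // 2
    return stages, per_stage, stages * per_stage


if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, crossbar_switch_points(n), omega_network(n))
```

For 64 processors this gives 4096 crossbar switch points against 192 omega switches, which is why the omega network is the cheaper of the two at scale.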
[Figure 1.8: Multiprocessors (2)]
[Figure 1-9: Multicomputer Systems (grid and hypercube)]
Software Concepts
System       Description                                                               Main Goal
DOS          Tightly-coupled operating system for multiprocessors and multicomputers   Tries to maintain a global view of resources
NOS          Loosely-coupled operating system for multicomputers (LAN and WAN)         Manages a collection of machines with local services and OS
Middleware   Additional layer on top of NOS implementing general-purpose services      Provides distribution transparency
 DOS (Distributed Operating Systems)
 NOS (Network Operating Systems)
 Middleware
Software Concepts
• Software is more important to users than hardware
• Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing
Uniprocessor Operating Systems
[Figure 1.11: Separating applications from operating system code through a microkernel]
Network Operating Systems
 loosely-coupled software on loosely-coupled
hardware
 A network of workstations connected by LAN
 each machine has a high degree of autonomy
 File servers: client and server model
 Clients mount directories on file servers
 Best known network OS:
o Sun’s NFS (Network File System) for shared file
systems
Multicomputer Operating Systems (1)
[Figure 1.14: General structure of a multicomputer operating system]
Positioning Middleware
[Figure 1-22: General structure of a distributed system as middleware]
(True) Distributed Systems
 tightly-coupled software on loosely-coupled
hardware
 provide a single-system image or a virtual
uniprocessor
 a single, global interprocess communication
mechanism, process management, file system;
the same system call interface everywhere
 Ideal definition:
“A distributed system runs on a collection of computers that
do not have shared memory, yet looks like a single
computer to its users.”
Comparison between Systems
Item                      Distributed OS (Multiproc.)   Distributed OS (Multicomp.)   Network OS   Middleware-based OS
Degree of transparency    Very High                     High                          Low          High
Same OS on all nodes      Yes                           Yes                           No           No
Number of copies of OS    1                             N                             N            N
Basis for communication   Shared memory                 Messages                      Files        Model specific
Resource management       Global, central               Global, distributed           Per node     Per node
Scalability               No                            Moderately                    Yes          Varies
Openness                  Closed                        Closed                        Open         Open
 A comparison between multiprocessor operating
systems, multicomputer operating systems, network
operating systems, and middleware-based distributed
systems.
Distributed object architecture
[Figure: objects o1 to o6, with services S (o1) to S (o6), connected by a software bus]
A data mining system
[Figure: Database 1, Database 2, Database 3, Integrator 1, Integrator 2, Report gen., Visualiser, Display]
Data mining system
 The logical model of the system is not
one of service provision where there
are distinguished data management
services
 It allows the number of databases
that are accessed to be increased
without disrupting the system
 It allows new types of relationship to
be mined by adding new integrator
objects