Transcript SHARCnet

Introduction
Outline





Definitions
Examples
Hardware concepts
Software concepts
Readings: Chapter 1
Quotable quote
“I think there is a world market for
maybe five computers”
Thomas J. Watson
Chairman of IBM, 1943
Definition of a Distributed System
(Tanenbaum and van Steen)
A distributed system is a piece of software that
ensures that a collection of independent computers
appears to its users as a single coherent
system.
Definition of a Distributed System
(Coulouris)
A distributed system is one in which hardware and
software components located at networked computers
communicate and coordinate their actions
only by passing messages.
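A minimal sketch of "coordinate only by passing messages", assuming a single process in which a thread stands in for a remote computer and a queue stands in for the network. The names `worker` and `run_demo` are hypothetical. Threads do share memory, so this only models the discipline: the node's state is private and changes only in response to messages.

```python
import queue
import threading


def worker(inbox, outbox):
    """Hypothetical node: keeps private state and reacts only to messages."""
    total = 0  # private state, never read or written by the sender
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        total += msg
    outbox.put(total)  # the result also travels back as a message


def run_demo():
    inbox, outbox = queue.Queue(), queue.Queue()
    node = threading.Thread(target=worker, args=(inbox, outbox))
    node.start()
    for value in (1, 2, 3):
        inbox.put(value)
    inbox.put("stop")
    node.join()
    return outbox.get()


if __name__ == "__main__":
    print(run_demo())
```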
Definition of a Distributed System
(Lamport)
A distributed system is one in which I cannot get
something done because a machine I've never
heard of is down.
Primary Characteristics of a
Distributed System




Multiple computers
  • Concurrent execution
  • Independent operation and failures
Communications
  • Ability to communicate
  • No tight synchronization
Relatively easy to expand or scale
Transparency
Example: A typical intranet
(Coulouris)
[Figure: desktop computers, an email server, a file server, a Web server, and print and other servers on a local area network, connected through a router/firewall to the rest of the Internet.]
Example: A typical portion of the
Internet (Coulouris)
[Figure: intranets and ISPs joined by a backbone and a satellite link. Legend: desktop computer, server, network link.]
Example: Portable and handheld
devices in a distributed system
(Coulouris)
[Figure: the Internet, a host intranet with a wireless LAN at the host site, a home intranet, and a WAP gateway; devices shown are a mobile phone, laptop, printer, and camera.]
Why Build Distributed Systems?



Economics
  • Share resources
  • Relatively easy to expand or scale
  • Speed: a distributed system may have more total
    computing power than a mainframe.
Reliability
  • If a machine crashes, the system as a whole can survive.
People are distributed, information is distributed.
Key Design Goals







Connectivity
Transparency
Reliability
Consistency
Security
Openness
Scalability
Connectivity



It should be easy for users to access remote resources and to
share them with other users in a controlled fashion.
Resources that can be shared include printers, storage
facilities, data, files, web pages, etc.
  • Why? It is economical.
Connecting users and resources makes collaboration and the
exchange of information easier.
  • Just look at e-mail.
Transparency



A distributed system that is able to present itself to users and
applications as if it were only a single computer system is said
to be transparent.
Very difficult to make distributed systems completely
transparent.
You may not want to, since transparency often comes at the
cost of performance.
Transparency in a Distributed System
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is located
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
Different forms of transparency in a distributed system.
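Location and migration transparency are often approached by letting clients use a logical name and look up the current address on each use. A minimal sketch, assuming an in-memory registry; the class `NameService`, the name "printer", and the addresses are all hypothetical.

```python
class NameService:
    """Hypothetical registry mapping logical names to current addresses."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, address):
        # Rebinding an existing name models the resource migrating.
        self._bindings[name] = address

    def resolve(self, name):
        return self._bindings[name]


if __name__ == "__main__":
    names = NameService()
    names.bind("printer", ("10.0.0.5", 631))
    print(names.resolve("printer"))
    # The printer moves; clients keep using the same logical name.
    names.bind("printer", ("10.0.0.9", 631))
    print(names.resolve("printer"))
```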
Degree of Transparency




Full transparency is not always a desirable goal.
Users may be located on different continents; distribution is apparent and
not something you want to hide.
Completely hiding failures of networks and nodes is (theoretically and
practically) impossible:
  • You cannot distinguish a slow computer from a failing one.
  • You can never be sure that a server actually performed an operation
    before a crash.
Full transparency will cost in performance. Examples:
  • Keeping Web caches exactly up-to-date with the master copy.
  • Immediately flushing write operations to disk for fault tolerance.
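The "slow versus failed" point can be illustrated with a timeout-based monitor. This is a sketch under assumptions: the class `HeartbeatMonitor`, the node names, and the 2-second timeout are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Suspects a node once no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected(self, node, now):
        last = self.last_seen.get(node)
        return last is None or now - last > self.timeout


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=2.0)
    monitor.heartbeat("crashed", now=0.0)
    monitor.heartbeat("slow", now=0.0)
    # At t=3.0 neither node has reported; the monitor cannot tell them apart.
    print(monitor.suspected("crashed", now=3.0), monitor.suspected("slow", now=3.0))
    # The slow node's heartbeat finally arrives, so that suspicion was wrong.
    monitor.heartbeat("slow", now=5.0)
    print(monitor.suspected("slow", now=5.5))
```

A timeout can only produce a suspicion, never a proof of failure; a longer timeout means fewer false suspicions but slower detection.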
Openness


An open distributed system allows for interaction with
services from other open systems, irrespective of the
underlying environment.
  • Systems should conform to well-defined interfaces.
  • Systems should support portability of applications.
  • Systems should easily interoperate. Interoperability is
    characterized by the extent to which two implementations
    of systems or components from different manufacturers
    can co-exist and work together.
Example: In computer networks there are rules that govern the
format, contents and meaning of messages sent and received.
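To make "rules that govern the format of messages" concrete, here is a sketch of a tiny wire format. The layout (1-byte version, 1-byte message type, 2-byte payload length, all in network byte order) is an assumption invented for illustration, not a real protocol. Fixing the byte order is what lets machines with different native representations interoperate.

```python
import struct

# Hypothetical header: version (1 byte), type (1 byte), payload length (2 bytes).
# "!" selects network (big-endian) byte order with no padding.
HEADER = struct.Struct("!BBH")


def encode(msg_type, payload, version=1):
    return HEADER.pack(version, msg_type, len(payload)) + payload


def decode(data):
    version, msg_type, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return version, msg_type, payload


if __name__ == "__main__":
    wire = encode(7, b"hi")
    print(wire)          # b'\x01\x07\x00\x02hi'
    print(decode(wire))  # (1, 7, b'hi')
```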
Scalability


There are three dimensions to scalability:
  • The number of users and processes (size scalability)
  • The maximum distance between nodes (geographical
    scalability)
  • The number of administrative domains (administrative
    scalability)
Most systems focus only on size scalability
  • Well, sort of: the typical solution is to buy powerful
    servers.
Techniques for Scaling



Partition data and computations across multiple machines
  • Move computations to clients (Java applets)
  • Decentralized naming services (DNS)
  • Decentralized information systems (WWW)
Make copies of data available at different machines
  • Replicated file servers (for fault tolerance)
  • Replicated databases
  • Mirrored web sites
Allow client processes to access local copies
  • Web caches (browser/Web proxy)
  • File caching (at server and client)
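One common way to partition data across machines is to hash each key to a node. The sketch below assumes a fixed number of nodes and uses a hypothetical helper `partition_for`; it is one possible scheme, not the one used by DNS or the Web.

```python
import hashlib


def partition_for(key, num_nodes):
    """Map a key to a node index in [0, num_nodes) using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


if __name__ == "__main__":
    for key in ("alice", "bob", "carol"):
        print(key, "-> node", partition_for(key, num_nodes=4))
```

Every client computes the same placement without consulting a central server. The trade-off of plain modulo hashing is that changing `num_nodes` moves most keys to a different node.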
Scaling – The problem


Applying scaling techniques is easy, except for the following:
  • Having multiple copies (cached or replicated) leads to
    inconsistencies: modifying one copy makes that copy
    different from the rest.
  • Always keeping copies consistent requires global
    synchronization.
  • Global synchronization is expensive with respect to
    performance.
We have learned to tolerate some inconsistencies.
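The inconsistency problem can be seen in a few lines. This sketch assumes a cache that trusts each copy for a fixed time-to-live; the class `TtlCache`, the key "page", and the 10-second TTL are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class TtlCache:
    """Serves a cached copy until it is `ttl` seconds old, then refetches."""

    def __init__(self, origin, ttl):
        self.origin = origin  # dict standing in for the master copy
        self.ttl = ttl
        self.entries = {}     # key -> (value, time fetched)

    def read(self, key, now):
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]   # may be stale
        value = self.origin[key]
        self.entries[key] = (value, now)
        return value


if __name__ == "__main__":
    origin = {"page": "v1"}
    cache = TtlCache(origin, ttl=10)
    print(cache.read("page", now=0))   # v1, fetched from the origin
    origin["page"] = "v2"              # the master copy changes
    print(cache.read("page", now=5))   # v1, a tolerated inconsistency
    print(cache.read("page", now=10))  # v2, the copy expired and was refetched
```

No synchronization with the origin is needed on reads, which is where the performance comes from; the price is a bounded window of stale data.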
Challenges

Heterogeneity
  • Networks
  • Hardware
  • Operating systems
  • Programming languages
Challenges

Failure Handling
  • Partial failures
    Can non-failed components continue operation?
    Can the failed components easily recover?
  • Detecting failures
  • Recovery
  • Replication


Hardware Concepts



Multiprocessors
Multicomputers
Networks of computers
Multiprocessors and
Multicomputers
Different basic organizations and memories in distributed
computer systems
Multiprocessors



Coherent memory
  • Each CPU writes through; the write is reflected at the other CPUs immediately.
Note the use of cache memory for efficiency.
Limited to a small number of processors.
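A toy model of write-through caches kept coherent over a shared bus. The slide only says that writes are reflected at the other CPUs immediately; invalidating the other caches' copies on each write is an assumption made here, and the classes `Bus` and `CpuCache` are hypothetical.

```python
class Bus:
    """Shared bus: owns main memory and knows every attached cache."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, writer, addr, value):
        self.memory[addr] = value            # write-through to main memory
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)  # invalidate stale copies


class CpuCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from main memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.write(self, addr, value)


if __name__ == "__main__":
    bus = Bus()
    cpu_a, cpu_b = CpuCache(bus), CpuCache(bus)
    cpu_a.write(0x10, 1)
    print(cpu_b.read(0x10))  # 1
    cpu_b.write(0x10, 2)
    print(cpu_a.read(0x10))  # 2
```

Every write crosses the single bus, which hints at why this design is limited to a small number of processors.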
Homogeneous Multicomputer Systems
• Grid
• Hypercube
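The two interconnects can be described by their neighbor relations. A sketch, assuming nodes in a d-dimensional hypercube are numbered 0 to 2^d - 1 so that neighbors differ in exactly one bit, and grid nodes are addressed by (row, column) without wrap-around. Both function names are hypothetical.

```python
def hypercube_neighbors(node, dimension):
    """Neighbors of `node`: flip each of the `dimension` address bits once."""
    return [node ^ (1 << bit) for bit in range(dimension)]


def grid_neighbors(row, col, rows, cols):
    """Up, down, left, right neighbors that fall inside the grid."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(hypercube_neighbors(0, 4))   # [1, 2, 4, 8]
    print(hypercube_neighbors(5, 3))   # [4, 7, 1]
    print(grid_neighbors(0, 0, 3, 3))  # [(1, 0), (0, 1)]
```

Each hypercube node has exactly d links, while a grid node has at most four regardless of the grid's size.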
Multiprocessor Usage
• Scientific and engineering applications often require loops
  over large vectors, e.g., matrix elements or points in a grid or
  3D mesh. Applications include:
  • computational fluid dynamics
  • dynamic structures
  • scheduling (airline)
  • health and biological modeling
  • economics and financial modeling (e.g., option pricing)
• Other applications include:
  • Transaction processing
  • Video on demand
Networks of Computers




High degree of node heterogeneity
  • Nodes include PCs, workstations, multimedia
    workstations, palmtops, laptops
High degree of network heterogeneity
  • This includes local-area Ethernet, ATM and wireless
    connections.
A distributed system should try to hide these differences.
In this course, the focus is on networks of computers.
Software Concepts
System      Description                                      Main Goal
DOS         Tightly-coupled operating system for             Hide and manage
            multi-processors and homogeneous multicomputers  hardware resources
NOS         Loosely-coupled operating system for             Offer local services to
            heterogeneous multicomputers (LAN and WAN)       remote clients
Middleware  Additional layer atop NOS implementing           Provide distribution
            general-purpose services                         transparency
An overview of
  • DOS (Distributed Operating Systems)
  • NOS (Network Operating Systems)
  • Middleware
Distributed Operating System
OS on each computer knows about the other computers
OS on different computers is generally the same
Services are generally (transparently) distributed across computers.
Distributed Operating System


This is harder to implement than a traditional operating
system. Why?
  • Memory is not shared
  • No simple global communication
  • No simple systemwide synchronization mechanisms
  • May require that the OS maintain a global memory map in
    software.
  • No central point where resource allocation decisions can
    be made.
Only very few truly multicomputer operating systems exist.
Network Operating System




Each computer has its own operating system with networking
facilities
Computers work independently, i.e., they may even have
different operating systems
Services are tied to individual nodes (ftp, telnet, www)
Highly file oriented
Middleware
OS on each computer need not know about the other computers
OS on different computers may be different
Services are generally (transparently) distributed across computers.
Middleware and Openness

In an open middleware-based distributed system, the protocols used by
each middleware layer should be the same, as well as the interfaces they
offer to applications.
Middleware Services


Communication Services
  • Hide “primitive” socket programming
Data management in a distributed system
  • Naming services
  • Directory services (e.g., LDAP, search engines)
  • Location services for tracking mobile objects
  • Persistent storage facilities
  • Data caching and replication
Middleware Services



Services giving applications control over when, where and
how they access data.
  • Distributed transaction processing
  • Code migration
Services for securing processing and communication:
  • Authentication and authorization services
  • Simple encryption services
  • Auditing services
There are varying levels of success in being able to provide
these types of middleware services.