Lecture1-intro - SFU computer science


Distributed Systems
CMPT 431 2008
Dr. Alexandra Fedorova
What is a Distributed System?
• Coulouris, et al:
– A system in which hardware or software components located at
networked computers communicate and coordinate their actions only by
passing messages
CMPT 431 © A. Fedorova
What is a Distributed System?
• Andrew Tanenbaum:
– A collection of independent computers that appears to
its users as a single coherent system
– autonomous computers connected by a network
– software specifically designed to provide an integrated
computing facility
What is a Distributed System?
• Leslie Lamport:
– “You know you have a distributed system when the
crash of a computer you’ve never heard of stops you
from getting any work done.”
What is a Distributed System?
• A broader definition:
– A collection of processors executing independent
instruction streams that communicate and synchronize
their actions
– Communication may be done via messages or shared
memory
– Includes multi-process and multithreaded programs
running on a monolithic multiprocessor hardware
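The message-based style of communication can be sketched on a single machine with two threads that share no variables and interact only through queues. This is a minimal illustration in Python, not course material; the names `worker` and `run_pipeline` and the "square the number" task are assumptions chosen for the example.

```python
import queue
import threading

STOP = None  # sentinel message telling the worker to shut down (illustrative choice)


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive numbers as messages and reply with their squares."""
    while True:
        msg = inbox.get()          # blocks until a message arrives
        if msg is STOP:
            break
        outbox.put(msg * msg)      # the reply is also just a message


def run_pipeline(values):
    """Send each value to the worker and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for value in values:
        inbox.put(value)
    inbox.put(STOP)
    thread.join()
    return [outbox.get() for _ in values]


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Because the two threads never touch the same variable directly, no explicit lock is needed; the queues carry all coordination.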
Distributed System: the Internet
©Pearson Education 2001
Distributed System: an Intranet
©Pearson Education 2001
Distributed System: Mobile Devices
©Pearson Education 2001
Other Examples of Distributed Systems
• Distributed Multimedia Systems
– Teleconferencing
– Distance learning
• Cellular phone systems
• IP Telephony
• Flight management system in an aircraft
• Automotive control systems (50+ embedded processors in a
Mercedes S-class)
• Distributed file systems (NFS, Samba)
• P2P file sharing
• The World Wide Web
Reasons for Distributing Systems I
• The need to share data across remote geographies
– Online Encyclopedia Britannica is accessed by users all over the world
– Computer users in different geographies send messages to each other
• Replication of processing power
– Independent processors working on the same task
– Distributed systems consisting of collections of microcomputers may have
processing power of a large supercomputer
• Use of heterogeneous components
– Compute-intensive sub-tasks of a problem are run on powerful computers
– Less resource-demanding sub-tasks run on less powerful computers
– More efficient use of resources
Reasons for Distributing Systems II
• Cost of hardware and management
– A collection of cheap computers may be less expensive than one large
supercomputer
– Small simple computers may be easier to manage than one large one
• Administrative/functional issues
– Payroll database is separate from registrar’s database
– Each is managed according to the needs of the organization
– Each is equipped with hardware that answers the needs of an organization
• Resilience to failures
– If one component fails, others can proceed with work on the task
• Scalability
– The system can be extended by adding more components (e.g., the WWW)
Properties of Distributed Systems
• Heterogeneity
– Systems consist of heterogeneous hardware and software
components
• Concurrency
– Multiple programs run together
• Shared data
– Data is accessed simultaneously by multiple entities
• No global clock
– Each component has a local notion of time
• Interdependencies
– Independent components depend on each other
Challenges of DS: Heterogeneity
• Different network infrastructures (Ethernet, 802.11 – wireless)
• Hardware and software (e.g., operating systems, processors):
how can an Intel/Windows system understand messages sent
by a Macintosh OS X system?
• Programming languages – how can a Java program and a C
program communicate?
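One common answer to these questions is for both sides to agree on a wire format that is independent of any machine or language. The sketch below uses Python's standard `struct` module to encode a message in network byte order. The message layout (a 4-byte account id followed by an 8-byte amount in cents) is a hypothetical example, not something defined in these slides.

```python
import struct

# Hypothetical layout, assumed for illustration:
#   I = 4-byte unsigned account id, q = 8-byte signed amount in cents.
# "!" selects network (big-endian) byte order with fixed sizes and no padding,
# so the bytes are the same no matter which CPU or OS produced them.
WIRE_FORMAT = "!Iq"


def encode(account_id: int, amount_cents: int) -> bytes:
    """Turn the two fields into 12 bytes that any receiver can parse."""
    return struct.pack(WIRE_FORMAT, account_id, amount_cents)


def decode(payload: bytes) -> tuple:
    """Recover (account_id, amount_cents) from the 12-byte payload."""
    return struct.unpack(WIRE_FORMAT, payload)


if __name__ == "__main__":
    wire = encode(256, -150)
    print(wire.hex())
    print(decode(wire))
```

A C or Java program that reads the same 12 bytes as a big-endian 32-bit unsigned integer followed by a big-endian 64-bit signed integer would recover the same values.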
Challenges of DS: Security
• Shared data must be protected
– Privacy – avoid unintentional disclosure of private data
– Security – data is not revealed to unauthorized parties
– Integrity – protect data and system state from
corruption
• Denial of service attacks – put significant load on
the system, prevent users from accessing it
Challenges of DS: Synchronization
• Concurrent cooperating tasks need to synchronize
– When accessing shared data
– When performing a common task
• Synchronization must be done correctly to prevent data corruption:
– Example: two account owners; one deposits the money, the other one
withdraws; they act concurrently
– How to ensure that the bank account is in “correct” state after these
actions?
• Synchronization implies communication
• Communication can take a long time
• Excessive synchronization can limit effectiveness and scalability of a
distributed system
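The bank account example can be sketched with two threads and a mutex. This is an illustrative Python version, not the course's C/pthreads assignment; the `Account` class and `run_demo` function are names invented for the sketch.

```python
import threading


class Account:
    """A balance protected by a lock so that updates cannot interleave."""

    def __init__(self, balance: int) -> None:
        self._balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        with self._lock:                 # read-modify-write happens atomically
            self._balance += amount

    def withdraw(self, amount: int) -> bool:
        with self._lock:                 # the check and the update share one critical section
            if self._balance < amount:
                return False
            self._balance -= amount
            return True

    @property
    def balance(self) -> int:
        with self._lock:
            return self._balance


def run_demo(initial: int = 10_000, rounds: int = 10_000) -> int:
    """One owner deposits 1 unit per round while the other withdraws 1 unit per round."""
    account = Account(initial)

    def depositor() -> None:
        for _ in range(rounds):
            account.deposit(1)

    def withdrawer() -> None:
        for _ in range(rounds):
            account.withdraw(1)

    threads = [threading.Thread(target=depositor), threading.Thread(target=withdrawer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


if __name__ == "__main__":
    print(run_demo())
```

With the default arguments the initial balance covers every withdrawal, so the deposits and withdrawals cancel out and the final balance equals the initial one. Without the lock, two interleaved read-modify-write sequences could lose an update. Note that every operation now waits for the lock, which is the cost the last bullet warns about.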
Challenges of DS: Absence of Global Clock
• Cooperating tasks need to agree on the order of events
• Each task has its own notion of time
• Clocks cannot be perfectly synchronized
• How to determine which event occurred first?
• Example: Bank account, starting balance = $100
– Client at bank machine A makes a deposit of $100
– Client at bank machine B makes a withdrawal of $150
– Which event happened first?
– Should the bank charge the overdraft fee?
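One well-known technique for agreeing on an order without a global clock is Lamport logical clocks. The slide does not name it, so treat the sketch below as one possible approach rather than the method this lecture prescribes. The class and the tie-breaking by machine name are illustrative assumptions.

```python
class LamportClock:
    """A counter that orders events without relying on wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def send(self) -> int:
        """Advance the clock for a send event and return the timestamp to attach."""
        self.time += 1
        return self.time

    def receive(self, msg_time: int) -> int:
        """Jump past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, msg_time) + 1
        return self.time


def order_events(events):
    """Total order: sort by (timestamp, machine name); the name only breaks ties."""
    return sorted(events, key=lambda e: (e[0], e[1]))


if __name__ == "__main__":
    machine_a, machine_b, bank = LamportClock(), LamportClock(), LamportClock()
    deposit_ts = machine_a.send()     # A sends "deposit $100"
    withdraw_ts = machine_b.send()    # B sends "withdraw $150"
    bank.receive(deposit_ts)
    bank.receive(withdraw_ts)
    events = [(withdraw_ts, "B", "withdraw 150"), (deposit_ts, "A", "deposit 100")]
    print(order_events(events))
```

Here both requests carry timestamp 1, which means neither causally preceded the other. The tie-break gives every replica the same order, but it cannot say which client pressed the button first in real time, so the overdraft question still needs a policy decision.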
Challenges of DS: Partial Failures
• Detection of failures – may be impossible
– Has a component crashed? Or is it just slow?
– Is the network down? Or is it just slow?
– If it’s slow – how long should we wait?
• Handling of failures
– Retransmission
– Tolerance for failures
– Roll back partially completed task
• Redundancy against failures
– Duplicate network routes
– Replicated databases
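Retransmission after a timeout can be sketched as a retry loop with exponential backoff. This is a simplified illustration under stated assumptions: `call_with_retries` and `make_flaky` are hypothetical helpers, and the remote call is simulated by a function that raises Python's built-in `TimeoutError`.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TimeoutError wait and retry, doubling the wait each time."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError as err:
            # We cannot tell whether the peer crashed or is merely slow,
            # so the only option is to wait a bounded time and try again.
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


def make_flaky(failures):
    """Build a simulated remote call that times out `failures` times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("no reply within the deadline")
        return "ok after %d calls" % state["calls"]

    return operation


if __name__ == "__main__":
    print(call_with_retries(make_flaky(2)))
```

Retrying is only safe when the operation is idempotent. If the first request actually reached the server and only the reply was lost, a retransmitted "withdraw $150" would be applied twice unless the server detects duplicates.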
Challenges of DS: Scalability
• Does the system remain effective as it grows?
• As you add more components:
– More synchronization
– More communication –> the system runs slowly.
• Avoiding performance bottlenecks:
– Everyone is waiting for a single shared resource
– In a centrally coordinated system, everyone waits for
the coordinator
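The coordinator bottleneck can be made concrete with a back-of-the-envelope model. The sketch below assumes every request needs a fixed amount of coordinator time plus a fixed amount of worker time, and that only the worker part can be spread across machines. The function name and the default costs (10 ms of work, 1 ms of coordination) are made-up numbers for illustration.

```python
def max_throughput(workers, work_ms=10, coord_ms=1):
    """Upper bound on requests per second under the simple model described above."""
    worker_capacity = workers * 1000 / work_ms    # grows with every worker added
    coordinator_capacity = 1000 / coord_ms        # fixed: there is only one coordinator
    return min(worker_capacity, coordinator_capacity)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "workers:", max_throughput(n), "requests/s")
```

In this model, adding workers helps until the workers can collectively outpace the coordinator (10 workers with the default costs). Beyond that point throughput stays flat, because every request still waits its turn at the single coordinator.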
Challenges of DS: Transparency I
• Concealing the heterogeneous and distributed nature of
the system so that it appears to the user like one system
• Transparency categories
– Access: access local and remote resources using identical
operations (NFS or Samba-mounted file systems)
– Location: access without knowledge of location of a resource
(URLs, e-mail)
– Concurrency: allow several processes to operate concurrently
using shared resources in a consistent fashion (two users
simultaneously accessing the bank account)
Challenges of DS: Transparency II
• Transparency categories (continued)
– Replication: use replicated resource as if there were just one
instance
– Failure: allow programs to complete their task despite failures
– Mobility: allow resources to move around
– Performance: adaptation of the system to varying load situations
without the user noticing it
– Scaling: allow system and applications to expand without need to
change structure of applications or algorithms
Course Objective
• Comprehensive introduction to distributed systems
• Theoretical aspects: models and architectures of
distributed systems
• Understand challenges in building distributed systems
• Practical aspect: implement advanced distributed systems
• Read latest research papers addressing distributed
systems
Topics Studied I
• Architecture models of distributed systems (client-server,
P2P, etc.)
• Operating system support: processes and threads,
synchronization and mutual exclusion
• Inter-process communication
• Distributed objects and remote invocation
• Distributed file systems
• Time and Global Clocks
• Coordination and agreement
Topics Studied II
• Transactions and concurrency control
• Replication
• Distributed multimedia systems
• Peer-to-Peer systems
• Mobile and ubiquitous computing
• Biologically inspired distributed systems
Course Structure
• Reading a textbook
– Reading is assigned for every class
– To be done before the lecture
– Midterm and final will test reading
• Programming assignments and a project
– Challenging assignments
– Require strong programming skills
– First assignment is in C; the rest are either C or Java
• Reading research papers
– Takes time
– Submit summaries of assigned articles
Programming Assignments
• Assignment #1 (already posted) – due January 29
– Solve a synchronization problem
– Multithreaded programming in C, using pthreads library and
mutexes
• Assignment #2 – due March 3
– Implement a distributed file system with a transactional interface
– Use C, C++ or Java
Grading
• One midterm exam: 20%
• Two homework assignments: 30%
• Project assignment: 20%
• Paper summaries: 10%
• Final exam: 20%
Course Web Site
• Linked at
http://www.cs.sfu.ca/CC/index-by-course.html
• Visit often
• Contains:
– Syllabus
– Assignments
– Deadlines
– Instructor office hours and office location