Lecture21 - Computer Science

Advanced Operating Systems - Spring 2009
Lecture 21 – Monday, April 6th, 2009
 Dan C. Marinescu
 Email: [email protected]
 Office: HEC 439 B.
 Office hours: M, Wd 3 – 4:30 PM.
 TA: Chen Yu
 Email: [email protected]
 Office: HEC 354.
 Office hours: M, Wd 1:00 – 3:00 PM.
Last, Current, Next Lecture
 Last time:
 Distributed File System
 Today:
 Andrew File System
 Network and Distributed Operating Systems
 Multiple Access Networks
 Next time:
 Interconnection Networks
Stateless vs. stateful servers
 Stateless service:
 longer request messages
 slower request processing
 additional constraints imposed on DFS design
 Some environments require stateful service
 A server employing server-initiated cache validation cannot
provide stateless service, since it maintains a record of which files
are cached by which clients
 UNIX use of file descriptors and implicit offsets is inherently
stateful; servers must maintain tables to map the file descriptors
to inodes, and store the current offset within a file
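The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `StatelessServer` and `StatefulServer`, the in-memory `files` dictionary, and the file name `notes.txt` are hypothetical and do not come from any real DFS. The stateless request carries the file name and the offset every time (a longer message); the stateful server keeps a descriptor table with the current offset.

```python
class StatelessServer:
    """Every request names the file and the offset; no per-client tables."""

    def __init__(self, files):
        self.files = files  # file name -> bytes

    def read(self, name, offset, count):
        # The request is self-contained, so the server remembers nothing.
        return self.files[name][offset:offset + count]


class StatefulServer:
    """Keeps a table mapping a descriptor to [file name, current offset]."""

    def __init__(self, files):
        self.files = files
        self.table = {}    # descriptor -> [name, offset]
        self.next_fd = 0

    def open(self, name):
        fd = self.next_fd
        self.next_fd += 1
        self.table[fd] = [name, 0]
        return fd

    def read(self, fd, count):
        # Shorter request: the offset is implicit, stored on the server.
        name, offset = self.table[fd]
        data = self.files[name][offset:offset + count]
        self.table[fd][1] = offset + len(data)
        return data

    def close(self, fd):
        del self.table[fd]


files = {"notes.txt": b"hello world"}
stateless = StatelessServer(files)
stateful = StatefulServer(files)

# Stateless: the client tracks the offset itself.
first = stateless.read("notes.txt", 0, 5)
second = stateless.read("notes.txt", 5, 6)

# Stateful: the server tracks the offset.
fd = stateful.open("notes.txt")
s_first = stateful.read(fd, 5)
s_second = stateful.read(fd, 6)
stateful.close(fd)
```

If the stateful server crashes, its table is lost and open descriptors become meaningless; the stateless server can restart and keep serving the same requests.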
File Replication
 Replicas reside on failure-independent machines
 Improves availability and can shorten service time
 Naming scheme maps a replicated file name to a particular replica
 Existence of replicas should be invisible to higher levels
 Replicas must be distinguished from one another by different lower-level names
 Updates: an update to any replica must be reflected on all replicas
 Demand replication – reading a nonlocal replica causes it to be cached
locally, thereby generating a new nonprimary replica.
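A minimal sketch of these ideas, assuming a single in-process table stands in for the naming scheme. The class `ReplicatedStore`, the machine names, and the `name#n` form of lower-level names are all hypothetical, chosen only for illustration. Higher levels see one file name; the store maps it to replicas with distinct lower-level names, pushes every update to all of them, and creates a new nonprimary replica on a nonlocal read.

```python
class ReplicatedStore:
    def __init__(self):
        self.replicas = {}  # file name -> list of (machine, lower-level name)
        self.storage = {}   # (machine, lower-level name) -> contents

    def _add_replica(self, name, machine, data):
        # Each replica gets its own lower-level name.
        low = (machine, f"{name}#{len(self.replicas[name])}")
        self.replicas[name].append(low)
        self.storage[low] = data

    def create(self, name, machines, data):
        self.replicas[name] = []
        for machine in machines:
            self._add_replica(name, machine, data)

    def read(self, name, client_machine):
        # Prefer a replica on the client's own machine.
        for low in self.replicas[name]:
            if low[0] == client_machine:
                return self.storage[low]
        # Demand replication: a nonlocal read leaves a local replica behind.
        data = self.storage[self.replicas[name][0]]
        self._add_replica(name, client_machine, data)
        return data

    def update(self, name, data):
        # An update must be reflected on all replicas.
        for low in self.replicas[name]:
            self.storage[low] = data


store = ReplicatedStore()
store.create("report", ["srvA", "srvB"], b"v1")
store.read("report", "ws1")      # nonlocal read creates a third replica
store.update("report", b"v2")    # all three replicas now hold v2
```

A real system would also have to handle replicas that are unreachable during an update; this sketch ignores failures entirely.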
Andrew File System (AFS)
 AFS tries to address issues such as:
 uniform name space
 location-independent file sharing
 client-side caching (with cache consistency)
 secure authentication (via Kerberos)
 server-side caching (via replicas)
 high availability
 scalability: can span 5,000 workstations
 History
 A distributed computing environment under development at CMU since 1983
 Purchased by IBM and released as Transarc DFS
 Now open source: OpenAFS
AFS (cont’d)
 Clusters
 clients and servers form clusters interconnected by a backbone LAN
 a cluster: several workstations and a cluster server, connected to the backbone by
a router
 Clients see a partitioned space of file names:
 a local name space and
 a shared name space
 the local name space is the root file system of a workstation, from which the shared
name space descends
 Servers collectively are responsible for the storage and management of the
shared name space.
 Opening a file causes it to be cached, in its entirety, on the local disk
AFS (cont’d)
 Vice: dedicated servers that present the shared name space to the clients as a
homogeneous, identical, and location-transparent file hierarchy.
 Workstations:
 run the Virtue protocol to communicate with Vice
 are required to have local disks where they store their local name space
 Andrew’s volumes: small units associated with the files of a single client
 fid (96 bits) identifies a Vice file or directory; three components:
 volume number
 vnode number – index into an array containing the inodes of files in a single volume
 uniquifier – allows reuse of vnode numbers, thereby keeping certain data structures
compact
 Fids are location transparent; therefore, file movements from server to server do not
invalidate cached directory contents
 Location information:
 kept on a volume basis
 replicated on each server
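The fid layout can be illustrated with a short packing routine. The slide gives only the total size (96 bits) and the three components; the equal 32-bit split used here is an assumption for the sketch, as are the names `Fid`, `pack_fid`, and `unpack_fid`.

```python
from typing import NamedTuple

MASK32 = 0xFFFFFFFF  # assumed width of each component: 32 bits


class Fid(NamedTuple):
    volume: int       # volume number
    vnode: int        # index into the volume's array of inodes
    uniquifier: int   # distinguishes successive reuses of a vnode number


def pack_fid(fid):
    """Pack the three components into one 96-bit integer."""
    for part in fid:
        if not 0 <= part <= MASK32:
            raise ValueError("each component must fit in 32 bits")
    return (fid.volume << 64) | (fid.vnode << 32) | fid.uniquifier


def unpack_fid(value):
    """Recover the three components from a packed 96-bit integer."""
    return Fid((value >> 64) & MASK32, (value >> 32) & MASK32, value & MASK32)


fid = Fid(volume=7, vnode=42, uniquifier=3)
packed = pack_fid(fid)
```

Note that nothing in the fid names a server. Finding the server is a separate lookup from volume number to location, which is why moving a volume does not invalidate cached fids.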

File Operations in AFS
 Andrew caches entire files from servers.
 A client workstation interacts with Vice servers only during opening and
closing of files
 A component called Venus:
 caches files from Vice when they are opened, and stores modified copies of
files back when they are closed
 caches contents of directories and symbolic links, for path-name translation
 Venus manages two separate caches:
 Status cache: kept in virtual memory to allow rapid servicing of stat
(file status returning) system calls
 Data cache: on the local disk; the UNIX I/O buffering mechanism does
some caching that is transparent to Venus
 LRU algorithm used to keep each of them bounded in size
 Exceptions to the caching policy are modifications to directories, which are
made directly on the server responsible for that directory
 Reading and writing to a file: done by the kernel on the cached copy, without
Venus intervention
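The open/close behavior described on this slide can be sketched as follows. This is a toy model, not the AFS code: the classes `ViceServer` and `Venus` borrow the slide's names but their methods are hypothetical, files live in dictionaries instead of on disk, and the status cache, directories, and cache validation are left out. The data cache is kept bounded with an LRU policy, here via `collections.OrderedDict`.

```python
from collections import OrderedDict


class ViceServer:
    def __init__(self, files):
        self.files = dict(files)
        self.fetches = 0   # whole-file transfers to clients
        self.stores = 0    # whole-file transfers back from clients

    def fetch(self, name):
        self.fetches += 1
        return self.files[name]

    def store(self, name, data):
        self.stores += 1
        self.files[name] = data


class Venus:
    def __init__(self, server, capacity):
        self.server = server
        self.capacity = capacity
        self.cache = OrderedDict()  # name -> bytearray, least recently used first
        self.dirty = set()          # files modified since they were fetched

    def open(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)       # cache hit: no server contact
            return
        self.cache[name] = bytearray(self.server.fetch(name))
        while len(self.cache) > self.capacity:
            # Evict the least recently used clean file, never the one just opened.
            victim = next((n for n in self.cache
                           if n not in self.dirty and n != name), None)
            if victim is None:
                break
            del self.cache[victim]

    def read(self, name):
        return bytes(self.cache[name])         # served from the cached copy

    def write(self, name, data):
        self.cache[name][:] = data             # local only, server not contacted
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:
            self.server.store(name, bytes(self.cache[name]))
            self.dirty.discard(name)


server = ViceServer({"a": b"old", "b": b"bbb", "c": b"ccc"})
venus = Venus(server, capacity=2)

venus.open("a")
venus.write("a", b"new")
before_close = server.files["a"]   # server still holds the old copy
venus.close("a")                   # modified copy is stored back on close
venus.open("a")                    # served from the cache, no new fetch
venus.open("b")
venus.open("c")                    # cache is full, so "a" is evicted
```

The server is contacted only in `open` (on a miss) and `close` (if the file was modified), which matches the slide's claim that reads and writes proceed on the cached copy.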
AFS implementation
 Client processes are interfaced to a UNIX kernel with the usual set of
system calls
 Venus carries out path-name translation component by component
 The UNIX file system is used as a low-level storage system for both servers
and clients
 The client cache is a local directory on the workstation’s disk
 Both Venus and server processes access UNIX files directly by their inodes
to avoid the expensive path name-to-inode translation routine
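Component-by-component translation can be sketched as a loop over the path. The `directories` table below is a hypothetical stand-in for cached directory contents, and small integers stand in for fids; neither reflects the actual Venus data structures.

```python
def translate(path, root_fid, directories):
    """Resolve a path one component at a time.

    directories maps a directory's fid to a {component name: fid} table.
    """
    fid = root_fid
    for component in path.strip("/").split("/"):
        if not component:
            continue                      # empty path or doubled slash
        entries = directories.get(fid)
        if entries is None:
            raise NotADirectoryError(component)
        if component not in entries:
            raise FileNotFoundError(path)
        fid = entries[component]          # descend one level
    return fid


# Hypothetical cached directory contents: root (1) -> usr (2) -> dan (3).
directories = {
    1: {"usr": 2},
    2: {"dan": 3},
    3: {"notes.txt": 4},
}
resolved = translate("/usr/dan/notes.txt", 1, directories)
```

Each step needs only the contents of one directory, so a client that has those directories cached can resolve the whole path without contacting a server.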
Distributed systems
 Distributed system: collection of heterogeneous systems (different
processor architecture, OS, libraries, applications) linked to each other by
an interconnection network.
 Communication: message passing.
 Advantages of distributed systems
 Resource sharing: better utilization of resources
 Fault tolerance: systems fail independently, increased redundancy
 Scalability: the system can grow over time
 Supports collaborative environments in enterprise computing, engineering
(e.g., CAD systems), science (e.g., GRID), etc.
 Problems
 Resource management more difficult.
 Hard to manage autonomous systems
 New services necessary, e.g., resource discovery
 Security
 Harder to construct distributed applications
Distributed system architecture
 Service-oriented architectures: set of services provided by
autonomous service providers. Based upon
 client-server paradigm and
 request-response communication
 GRID, semantic Web
 User-Coordinator-Executor architecture: multiple sites provide
computing resources; the coordinator acts as an agent of the user and
starts applications at participating sites and then monitors the
execution. Potential use in high performance computing.
 Peer-to-peer architectures: the systems function simultaneously as
client and server
Service-oriented distributed systems
Autonomous vs. non-autonomous systems
 Autonomous systems
 Resources of individual systems controlled by
the local operating systems.
 Often in distinct administrative domains.
 Open system: new resources added or removed continually
 Scalable. No one tries to maintain common state
 Non-autonomous systems: resources controlled by a
 Network operating system: users are aware of the multiplicity of machines.
 Access to resources of various systems is done explicitly by:
 Remote login to the appropriate remote machine (telnet, ssh)
 Remote Desktop (Microsoft Windows)
 Transferring data from remote machines to local machines, via the File Transfer
Protocol (FTP) mechanism
 Distributed operating system: users are not aware of the multiplicity of machines.
 Access to remote resources is similar to access to local resources
 Common state.

 Data migration – transfer the data
 Computation migration – transfer the computation, rather than the data
 Process migration
 Possible only with a homogeneous architecture.
 Load balancing.
 Synchronization across the network.
 Coordination