Lecture 21 - Computer Science
Advanced Operating Systems - Spring 2009
Lecture 21 – Monday, April 6th, 2009
Dan C. Marinescu
Email: [email protected]
Office: HEC 439 B.
Office hours: M, Wd 3 – 4:30 PM.
TA: Chen Yu
Email: [email protected]
Office: HEC 354.
Office hours: M, Wd 1.00 – 3:00 PM.
Last, Current, Next Lecture
Last time:
Distributed File System
Today
Andrew File System
Network and Distributed Operating Systems
Multiple Access Networks
Next time:
Interconnection Networks
Stateless vs. stateful servers
Drawbacks of stateless service:
longer request messages
slower request processing
additional constraints imposed on DFS design
Some environments require stateful service:
A server employing server-initiated cache validation cannot
provide stateless service, since it maintains a record of which files
are cached by which clients
UNIX use of file descriptors and implicit offsets is inherently
stateful; servers must maintain tables mapping file descriptors
to inodes and store the current offset within each file
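The per-client state such a stateful server must keep can be sketched as a table mapping file descriptors to inodes and implicit offsets. The following Python fragment is purely illustrative; the class and method names are hypothetical, not from any real DFS implementation.

```python
class StatefulFileServer:
    """Toy sketch of the server-side open-file table a stateful server keeps."""

    def __init__(self):
        self._next_fd = 0
        self._open_table = {}  # fd -> [inode, current offset]

    def open(self, inode):
        # Record server-side state for a client's open file.
        fd = self._next_fd
        self._next_fd += 1
        self._open_table[fd] = [inode, 0]
        return fd

    def read(self, fd, data, nbytes):
        # Read relative to the implicit offset, then advance it.
        inode, offset = self._open_table[fd]
        chunk = data[offset:offset + nbytes]
        self._open_table[fd][1] = offset + len(chunk)
        return chunk

    def close(self, fd):
        # Discarding the entry is exactly the state a crash would lose.
        del self._open_table[fd]
```

If the server crashes, the open table is lost, which is why recovery is harder for stateful designs than for stateless ones.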
File Replication
Replicas reside on failure-independent machines
Improves availability and can shorten service time
Naming scheme maps a replicated file name to a particular replica
Existence of replicas should be invisible to higher levels
Replicas must be distinguished from one another by different lower-level names
Updates – an update to any replica must be reflected on all replicas
Demand replication – reading a nonlocal replica causes it to be cached
locally, thereby generating a new nonprimary replica.
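Demand replication as described above can be sketched in a few lines: the first read of a nonlocal file pulls a copy into the local store, creating a new nonprimary replica. All names below are hypothetical, for illustration only.

```python
class DemandReplicator:
    """Toy sketch of demand replication: cache a remote replica on first read."""

    def __init__(self, remote_store):
        self.remote = remote_store  # authoritative (primary) replicas
        self.local = {}             # nonprimary replicas created on demand

    def read(self, name):
        if name not in self.local:
            # Cache miss: fetch the nonlocal replica and keep a local copy.
            self.local[name] = self.remote[name]
        return self.local[name]
```

Note that this sketch ignores the update problem from the previous bullet: once a local replica exists, writes to the primary must somehow be propagated to it.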
Andrew File System (AFS)
AFS tries to address issues such as:
uniform name space
location-independent file sharing
client-side caching (with cache consistency)
secure authentication (via Kerberos)
server-side caching (via replicas)
high availability
scalability – can span 5,000 workstations
History
A distributed computing environment developed since 1983 at CMU
Purchased by IBM and released as Transarc DFS;
now open source as OpenAFS
AFS (cont’d)
Clusters
clients and servers form clusters interconnected by a backbone LAN
a cluster consists of several workstations and a cluster server connected to the
backbone by a router
Clients see a partitioned space of file names:
a local name space and
a shared name space
the local name space is the root file system of a workstation, from which the shared
name space descends
Servers collectively are responsible for the storage and management of the
shared name space.
Opening a file causes it to be cached, in its entirety, on the local disk
AFS (cont’d)
Vice – dedicated servers that present the shared name space to the clients as a
homogeneous, identical, and location-transparent file hierarchy.
Workstations
run the Virtue protocol to communicate with Vice
are required to have local disks where they store their local name space
Andrew’s volumes – small units associated with the files of a single client
fid (96 bits) identifies a Vice file or directory; three components:
volume number
vnode number – index into an array containing the inodes of files in a single volume
uniquifier – allows reuse of vnode numbers, thereby keeping certain data structures
compact
Fids are location transparent; therefore, file movements from server to server do not
invalidate cached directory contents
Location information
kept on a volume basis
replicated on each server
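The 96-bit fid layout described above, three 32-bit fields, can be sketched with simple bit operations. The field ordering below is an assumption made for illustration; the slide only specifies the three components and the total width.

```python
MASK32 = 0xFFFFFFFF  # each fid component is a 32-bit field

def pack_fid(volume, vnode, uniquifier):
    """Pack three 32-bit components into a single 96-bit fid (assumed order)."""
    assert all(0 <= x <= MASK32 for x in (volume, vnode, uniquifier))
    return (volume << 64) | (vnode << 32) | uniquifier

def unpack_fid(fid):
    """Recover (volume number, vnode number, uniquifier) from a packed fid."""
    return (fid >> 64) & MASK32, (fid >> 32) & MASK32, fid & MASK32
```

Because the fid carries no server address, moving a volume between servers changes only the replicated location database, not any cached fid.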
File Operations in AFS
Andrew caches entire files from servers.
A client workstation interacts with Vice servers only during opening and
closing of files
A component called Venus
caches files from Vice when they are opened, and stores modified copies of
files back when they are closed
Caches contents of directories and symbolic links, for path-name translation
Venus manages two separate caches:
A status cache, kept in virtual memory, to allow rapid servicing of stat
(file-status) system calls
A data cache, on the local disk; the UNIX I/O buffering mechanism does
some caching that is transparent to Venus
An LRU algorithm keeps each cache bounded in size
Exception to the caching policy: modifications to directories are made
directly on the server responsible for that directory
Reading and writing a file are done by the kernel on the cached copy,
without Venus intervention
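An LRU cache of the kind Venus could use to keep its status and data caches bounded might look like the following sketch; the capacity handling and names are illustrative, not Venus's actual implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: bounded size, least-recently-used entry evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # insertion order tracks recency

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
```

Venus would run two such caches independently, one over status entries in memory and one over whole-file copies on disk.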
AFS implementation
Client processes are interfaced to a UNIX kernel with the usual set of
system calls
Venus carries out path-name translation component by component
The UNIX file system is used as a low-level storage system for both servers
and clients
The client cache is a local directory on the workstation’s disk
Both Venus and server processes access UNIX files directly by their inodes
to avoid the expensive path-name-to-inode translation routine
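Component-by-component path-name translation, as Venus performs it, can be sketched as one directory lookup per path component. Directory contents are modeled here as nested dicts; in AFS each lookup could also pull the directory into the cache. The names are hypothetical.

```python
def translate(root, path):
    """Resolve a path one component at a time, as in Venus's name translation."""
    node = root
    for component in path.strip("/").split("/"):
        if not isinstance(node, dict) or component not in node:
            raise FileNotFoundError(component)
        node = node[component]  # one lookup per path component
    return node
```

The per-component cost is what motivates caching directory contents and accessing cached files directly by inode, as the bullets above note.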
Distributed systems
Distributed system – a collection of heterogeneous systems (different
processor architectures, operating systems, libraries, applications) linked
to each other by an interconnection network.
Communication – message passing.
Advantages of distributed systems
Resource sharing – better utilization of resources
Fault tolerance – systems fail independently; increased redundancy
Scalability – the system can grow over time
Supports collaborative environments in enterprise computing, engineering
(e.g., CAD systems), science (e.g., GRID), etc.
Problems
Resource management is more difficult.
Autonomous systems are hard to manage.
New services are necessary, e.g., resource discovery
Security
Harder to construct distributed applications
Distributed system architecture
Service-oriented architectures – a set of services provided by
autonomous service providers. Based upon:
the client-server paradigm and
request-response communication
e.g., GRID, the semantic Web
User-Coordinator-Executor architecture – multiple sites provide
computing resources; the coordinator acts as an agent of the user,
starts applications at participating sites, and then monitors their
execution. Potential use in high-performance computing.
Peer-to-peer architectures – the systems function simultaneously as
client and server
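The request-response pattern underlying the client-server paradigm can be sketched as a toy exchange over in-process queues standing in for a real network; the sentinel-based shutdown and all names are illustrative.

```python
import queue
import threading

def server(requests, responses):
    """Service requests one at a time until a None sentinel arrives."""
    while True:
        req = requests.get()
        if req is None:
            break
        responses.put("echo:" + req)  # compute and return a response

requests, responses = queue.Queue(), queue.Queue()
worker = threading.Thread(target=server, args=(requests, responses))
worker.start()

requests.put("ping")      # client sends a request
reply = responses.get()   # client blocks for the response

requests.put(None)        # shut the server down
worker.join()
```

In a peer-to-peer architecture each node would run both sides of this exchange, acting as client and server simultaneously.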
Service oriented-distributed systems
Autonomous vs. non-autonomous systems
Autonomous systems:
Resources of individual systems controlled by the local operating systems.
Often in distinct administrative domains.
Open system – new resources added or removed continually
Scalable. No one tries to maintain common state
Non-autonomous systems – resources controlled by either:
A Network Operating System – users are aware of the multiplicity of machines.
Access to resources of the various systems is done explicitly by:
Remote login to the appropriate remote machine (telnet, ssh)
Remote Desktop (Microsoft Windows)
Transferring data from remote machines to local machines via the File Transfer
Protocol (FTP) mechanism
A Distributed Operating System – users are not aware of the multiplicity of machines
Access to remote resources is similar to access to local resources
Common state.
Data Migration – transfer data
Computation Migration – transfer the computation, rather than data
Process migration
Possible only with a homogeneous architecture.
Load balancing.
Synchronization across the network.
Coordination