AFS and Napster

Other File Systems:
AFS, Napster
Recap
• NFS:
– Server exposes one or more directories
• Clients access them by mounting the directories
– Stateless server
• Has problems with cache consistency, locking protocol
– Mounting protocol
• Automounting
• P2P File Systems:
– PAST, CFS
– Rely on DHTs for routing
Andrew File System (AFS)
• Named after Andrew Carnegie and Andrew Mellon
– Transarc Corp. and later IBM took over development of AFS
– In 2000 IBM made OpenAFS available as open source
• Features:
– Uniform name space
– Location-independent file sharing
– Client-side caching with cache consistency
– Secure authentication via Kerberos
– Server-side caching in the form of replicas
– High availability through automatic switchover of replicas
– Scalability to span 5,000 workstations
AFS Overview
• Based on the upload/download model
– Clients download and cache files
– Server keeps track of clients that cache the file
– Clients upload files at end of session
• Whole-file caching is the central idea behind AFS (see the sketch after this list)
– Later amended to block operations
– Simple, effective
• AFS servers are stateful
– Keep track of clients that have cached files
– Recall files that have been modified
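A minimal sketch of this upload/download model, assuming toy Server and Client classes (all names here are illustrative, not actual AFS interfaces): the whole file is fetched at open, reads and writes hit the local copy, and the file is written back at close while the server records who caches what.

```python
# Sketch of AFS-style whole-file caching; classes and method names are
# illustrative assumptions, not real AFS APIs.

class Server:
    def __init__(self):
        self.files = {}         # path -> contents
        self.cached_by = {}     # path -> set of clients caching the file (stateful)

    def fetch(self, client, path):
        # Remember which client caches the file so it can be recalled later.
        self.cached_by.setdefault(path, set()).add(client)
        return self.files.get(path, b"")

    def store(self, client, path, data):
        self.files[path] = data


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}         # path -> locally cached whole file

    def open(self, path):
        # Download the entire file once, at open time.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)

    def read(self, path):
        return self.cache[path]           # served locally, no server contact

    def write(self, path, data):
        self.cache[path] = data           # stays local until close

    def close(self, path):
        # Upload the (possibly modified) whole file at end of session.
        self.server.store(self, path, self.cache[path])
```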
AFS Details
• Has dedicated server machines
• Clients have a partitioned name space:
– Local name space and shared name space
– A cluster of dedicated servers (Vice) presents the shared name space
– Clients run the Virtue protocol to communicate with Vice
• Clients and servers are grouped into clusters
– Clusters connected through the WAN
• Other issues:
– Scalability, client mobility, security, protection, heterogeneity
AFS: Shared Name Space
• AFS’s storage is arranged in volumes
– Usually associated with files of a particular client
• An AFS directory entry maps Vice files/dirs to a 96-bit fid (see the sketch after this list)
– Volume number
– Vnode number: index into i-node array of a volume
– Uniquifier: allows reuse of vnode numbers
• Fids are location transparent
– File movements do not invalidate fids
• Location information kept in volume-location database
– Volumes migrated to balance available disk space, utilization
– Volume movement is atomic; operation aborted on server crash
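A sketch of the three fid components listed above packed into 96 bits; the field order, equal 32-bit widths, and byte order are assumptions for illustration, not the actual AFS on-disk or wire format.

```python
import struct
from collections import namedtuple

# 96-bit fid = volume number + vnode number + uniquifier (assumed 32 bits each).
Fid = namedtuple("Fid", ["volume", "vnode", "uniquifier"])

def pack_fid(fid):
    """Pack a fid into 12 bytes (96 bits); field layout is an assumption."""
    return struct.pack("!III", fid.volume, fid.vnode, fid.uniquifier)

def unpack_fid(raw):
    """Recover the three components from a 12-byte fid."""
    return Fid(*struct.unpack("!III", raw))

# A reused vnode slot gets a fresh uniquifier, so a stale fid held for the
# old file does not resolve to the new file occupying the same slot.
old = Fid(volume=7, vnode=42, uniquifier=1)
new = Fid(volume=7, vnode=42, uniquifier=2)
assert unpack_fid(pack_fid(old)) != new
```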
AFS: Operations and Consistency
• AFS caches entire files from servers
– Client interacts with servers only during open and close
• The OS on the client intercepts file system calls and passes them to Venus
– Venus is a client process that caches files from servers
– Venus contacts Vice only on open and close
• No contact needed if the file is already in the cache and has not been invalidated
– Reads and writes bypass Venus
• Works due to callbacks (sketched after this list):
– Server updates state to record caching
– Server notifies client before allowing another client to modify
– Clients lose their callback when someone writes the file
• Venus caches dirs and symbolic links for path translation
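A minimal sketch of the callback mechanism described above, using assumed toy classes (ViceServer and VenusClient are illustrative names): the server records which clients cache a file and breaks their callbacks before another client's update becomes visible.

```python
# Sketch of AFS-style callbacks; class and method names are illustrative.

class VenusClient:
    def __init__(self, name):
        self.name = name
        self.valid = set()        # cached files still covered by a callback

    def cache_file(self, path):
        self.valid.add(path)

    def break_callback(self, path):
        # Server notification: the cached copy may be stale.
        self.valid.discard(path)
        print(f"{self.name}: callback broken for {path}; re-fetch on next open")


class ViceServer:
    def __init__(self):
        self.callbacks = {}       # path -> clients holding a callback promise

    def open_fetch(self, client, path):
        # Record the callback when a client caches the file.
        self.callbacks.setdefault(path, set()).add(client)
        client.cache_file(path)

    def close_store(self, writer, path):
        # Break callbacks held by other clients before the write is visible.
        for client in self.callbacks.get(path, set()) - {writer}:
            client.break_callback(path)
        self.callbacks[path] = {writer}


server = ViceServer()
a, b = VenusClient("A"), VenusClient("B")
server.open_fetch(a, "/shared/notes.txt")
server.open_fetch(b, "/shared/notes.txt")
server.close_store(a, "/shared/notes.txt")   # B loses its callback
```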
AFS Implementation
• Client cache is a local directory on UNIX FS
– Venus and server processes access files directly by UNIX i-node
• Venus has 2 caches, one for status & one for data
– Uses LRU to keep them bounded in size (see the sketch below)
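A generic bounded LRU cache as a rough illustration of the policy mentioned above; this is a sketch, not the actual Venus code, and the capacities are arbitrary.

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

# One cache for status information, one for file data (capacities arbitrary).
status_cache = LRUCache(capacity=500)
data_cache = LRUCache(capacity=100)
status_cache.put("/shared/notes.txt", {"size": 120, "mtime": 1_000_000})
data_cache.put("/shared/notes.txt", b"...whole cached file...")
```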
Napster
• Flat FS: single-level FS with no hierarchy
– Multiple files can have the same name
• All storage done at edges:
– Hosts export a set of files stored locally
– Each host registers with a centralized directory
• Uses keepalive messages to check for connectivity
– Centralized directory notified of file names exported by the host
• File lookup: client sends request to central directory (see the sketch below)
– Directory server sends 100 files matching the request to the client
– Client pings each host, computes RTT and displays results
– Client transfers files from the closest host
• File transfers are peer-to-peer; the central directory is not involved
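A sketch of the lookup flow just described from the client's side, assuming a hypothetical directory object with a search() method and a hypothetical measure_rtt() helper:

```python
import random

def measure_rtt(host):
    # Hypothetical ping; a real client would time a probe packet to the host.
    return random.uniform(0.01, 0.3)

def download_from_peer(host, filename):
    # Placeholder for the direct peer-to-peer transfer.
    return f"downloaded {filename} from {host}"

def lookup_and_transfer(directory, query):
    # 1. Ask the central directory for hosts exporting matching files.
    matches = directory.search(query)                  # e.g. up to 100 entries
    # 2. Ping each candidate host and rank by measured RTT.
    ranked = sorted(matches, key=lambda entry: measure_rtt(entry["host"]))
    # 3. Transfer directly from the closest host; the directory is not
    #    involved in the transfer itself.
    best = ranked[0]
    return download_from_peer(best["host"], best["filename"])
```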
Napster Architecture
[Figure: hosts H1, H2, and H3 connect from behind firewalls across the network; an IP sprayer/redirector at Napster.com spreads their connections across Napster Directory Servers 1–3.]
Napster Protocol
[Figure, built up over four slides: (1) H1 and H2 register with a directory server, announcing "I have 'metallica / enter sandman'"; (2) H3 asks "who has metallica?" and the directory replies "check H1, H2"; (3) H3 pings H1 and H2 to measure RTT; (4) H3 transfers the file directly from the chosen host.]
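A sketch of the directory-server side of this exchange; the message names mirror the labels in the figure ("I have", "who has"), while the real Napster wire protocol used its own message formats.

```python
# Toy directory server mirroring the figure's messages; not the real protocol.

class DirectoryServer:
    def __init__(self):
        self.index = {}                          # keyword -> set of host names

    def handle_i_have(self, host, filename):
        # "I have 'metallica / enter sandman'": index the host's file names.
        for word in filename.lower().split():
            self.index.setdefault(word, set()).add(host)

    def handle_who_has(self, keyword):
        # "who has metallica?" -> "check H1, H2"
        return sorted(self.index.get(keyword.lower(), set()))


server = DirectoryServer()
server.handle_i_have("H1", "metallica / enter sandman")
server.handle_i_have("H2", "metallica / enter sandman")
print(server.handle_who_has("metallica"))        # ['H1', 'H2']
```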
Napster Discussion
• Issues:
– Centralized file location directory
– Load balancing
– Relies on keepalive messages
– Scalability an issue!
• Success: ability to create and foster an online community
– Built in ethics
– Built in faults
– Communication medium
• Had around 640,000 users in November 2000!
Other P2P File Systems
• Napster has a central database!
– Removing it will make regulating file transfers harder
• Freenet, Gnutella, Kazaa … all are decentralized
• Freenet: anonymous, files encrypted
– So nodes do not know which files are stored locally or which file is being searched for
• Kazaa: allows parallel downloads
• Torrents for faster download