P2P File Systems
Peer-to-Peer Filesystems
Slides originally created by Tom Roeder
Announcements
Welcome back from Spring break!
We are in the final stretch
Homework 4 and Project 4 design doc this week
Homework 5 and 6 due in April
Project 5 and 6 due in April and beginning of May
Prelim II, Thursday April 26th???
Final, Wednesday May 17th
Goals for Today
What is P2P?
P2P File Sharing Systems
Overlays
Case study: Napster
Structured
Unstructured
Distributed hash tables (DHTs)
Hash table on top of a structured overlay
Case studies: PAST, CFS, OpenDHT
Content-Addressable Storage (CAS)
Nature of P2P Systems
P2P: communicating peers in the system
In some sense, P2P is older than the name
normally an overlay in the network
Hot topic because of Napster, Gnutella, etc.
many protocols used symmetric interactions
not everything is client-server
What’s the real definition?
no one has a good one yet
depends on what you want to fit in the class
Nature of P2P Systems
Standard definition
symmetric interactions between peers
no distinguished server
Minimally: is the Web a P2P system?
We don’t want to say that it is
but it is, under this definition
I can always run a server if I want: no asymmetry
There must be more structure than this
Let’s try again
Nature of P2P Systems
Recent definition
No distinguished initial state
Each server has the same code
servers cooperate to handle requests
clients don’t matter: servers are the P2P system
Try again: is the Web P2P?
No, not under this definition: servers don’t interact
Is the Google server farm P2P?
Depends on how it’s set up? Probably not.
Case study: Napster
File Sharing service created by Shawn Fanning in 1999
Flat FS: single-level FS with no hierarchy
Multiple files can have the same name
All storage done at edges:
Hosts export set of files stored locally
Host is registered with centralized directory
Centralized directory notified of file names exported by the host
File lookup: client sends request to central directory
Uses keepalive messages to check for connectivity
Directory server sends up to 100 files matching the request to the client
Client pings each host, computes RTT and displays results
Client transfers files from the closest host
File transfers are peer-to-peer; the central directory is not involved
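A minimal sketch of this flow, with a toy in-process directory (all class and function names here are hypothetical; the real client spoke Napster's own protocol to the directory servers):

```python
# Hypothetical sketch of Napster-style lookup: central index,
# client-side pings, peer-to-peer transfer from the closest host.
import time


class DirectoryServer:
    """Central index mapping exported file names to hosts."""

    def __init__(self):
        self.index = {}  # file name -> set of host addresses

    def register(self, host, filenames):
        for name in filenames:
            self.index.setdefault(name, set()).add(host)

    def search(self, query, limit=100):
        # Flat namespace: several hosts may export the same name.
        return [(name, host)
                for name, hosts in self.index.items() if query in name
                for host in hosts][:limit]


def ping(host):
    """Estimate RTT to a host (the probe itself is elided here)."""
    start = time.monotonic()
    # ... send probe to host, wait for reply ...
    return time.monotonic() - start


def choose_source(directory, query):
    """Ping each hit and pick the closest; the transfer itself then
    happens host-to-host, without the directory."""
    hits = directory.search(query)
    return min(hits, key=lambda hit: ping(hit[1]))
```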
Napster Architecture
[Figure: hosts H1, H2, and H3 sit at the edges of the network, some behind firewalls. They connect to Napster.com, where an IP sprayer/redirector spreads load across Napster Directory Servers 1-3.]
Napster Protocol
[Figure sequence: (1) H1 and H2 register with a directory server: I have “metallica / enter sandman”. (2) H3 asks the directory “who has metallica?” and gets back “check H1, H2”. (3) H3 pings H1 and H2 to estimate RTTs. (4) H3 transfers the file directly from the closest host; the directory is not on the transfer path.]
Napster Discussion
Issues:
Centralized file location directory
Load balancing
Relies on keepalive messages
Scalability an issue!
Success: ability to create and foster an online community
Built-in ethics
Built-in faults
Communication medium
Had up to 40 million users in June 2000!
Actual numbers may have been lower: 26.4 million by February 2001
Ultimately found to be illegal service
Other P2P File Sharing Systems
Napster has a central database!
Removing it will make regulating file transfers harder
Freenet, Gnutella, Kazaa … all are decentralized
Freenet: anonymous, files encrypted
So it is not known which files are stored locally or which files are searched for
Kazaa: allows parallel downloads
(Bit)Torrents for faster download
Legality. Are there any good legal uses for P2P systems?
Overlays
P2P systems possess some degree of self-organization
Two types of overlays connect most P2P systems
Unstructured
each node finds its peers and helps maintain the system structure
No infrastructure set up for routing
Random walks, flood search (see the sketch after this list)
Structured
Small World Phenomenon: Kleinberg
Set up enough structure to get fast routing
We will see O(log n)
For special tasks, can get O(1)
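As referenced above, here is a minimal sketch of flood search in an unstructured overlay, in the style of Gnutella; the Node class and its fields are illustrative, not any real system’s API:

```python
# TTL-limited flood search over an illustrative unstructured overlay.
class Node:
    def __init__(self, files=()):
        self.files = set(files)
        self.neighbors = []


def flood_search(node, query, ttl, seen=None):
    """Forward the query to all neighbors until the TTL runs out;
    return the set of nodes holding a matching file."""
    seen = set() if seen is None else seen
    if node in seen:
        return set()  # suppress duplicate forwards
    seen.add(node)
    hits = {node} if query in node.files else set()
    if ttl > 0:
        for peer in node.neighbors:
            hits |= flood_search(peer, query, ttl - 1, seen)
    return hits
```

A random walk replaces the fan-out to all neighbors with a single randomly chosen neighbor per step, trading lookup latency for much lower message load.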
Overlays: Unstructured
[Figure (from Gribble): a snapshot of a common unstructured overlay. Looking at the connectivity, there is more structure than it seems at first.]
Overlays: Unstructured
Gossip: state synchronization technique
Convergence of state is reasonably fast
Instead of forced flooding, share state
Do so infrequently with one neighbor at a time
Original insight from epidemic theory
with high probability for almost all nodes
good probabilistic guarantees
Trivial to implement
Saves bandwidth and energy consumption
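A minimal sketch of one anti-entropy round under these ideas, with state modeled as a set and all names illustrative:

```python
# Anti-entropy gossip: each round, a node merges state with ONE
# randomly chosen peer instead of flooding all of its neighbors.
import random


class GossipNode:
    def __init__(self, state=()):
        self.state = set(state)


def gossip_round(nodes):
    """One round of push-pull gossip; assumes at least two nodes."""
    for node in nodes:
        peer = random.choice([n for n in nodes if n is not node])
        # Both sides end up with the union of what they knew; epidemic
        # analysis gives convergence to almost all nodes, with high
        # probability, in O(log n) rounds.
        union = node.state | peer.state
        node.state, peer.state = union, set(union)
```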
Overlays: Structured
Need to build up long distance pointers
think of routing within levels of a namespace
e.g. namespace is 10-digit numbers, base 4
0112032101
then you can hop levels to find other nodes
This is the most common structure imposed
Distributed Hash Tables
One way to do this structured routing
Assign each node an id from a large space
e.g. 128 bits: SHA-1 salted hash of IP address
build up a ring: circular hashing (sketched after this list)
assign nodes into this space
Value
diversity of neighbors
even coverage of space
less chance of attack?
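A minimal sketch of the id assignment referenced above, assuming SHA-1 truncated to 128 bits and a purely illustrative salt:

```python
# Circular hashing: nodes and objects share one 128-bit id ring.
import hashlib

RING = 2 ** 128  # size of the id space


def ring_id(data, salt=b""):
    digest = hashlib.sha1(salt + data).digest()
    return int.from_bytes(digest[:16], "big")  # truncate to 128 bits


def node_id(ip_address):
    # As on the slide: a salted SHA-1 hash of the node's IP address.
    return ring_id(ip_address.encode(), salt=b"node:")


def ring_distance(a, b):
    # Numeric distance around the ring, in either direction.
    return min((a - b) % RING, (b - a) % RING)


def responsible_node(object_name, node_ids):
    # Pastry's rule: an object lives at the numerically closest node.
    # (Chord instead uses the clockwise successor.)
    oid = ring_id(object_name.encode())
    return min(node_ids, key=lambda n: ring_distance(n, oid))
```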
Distributed Hash Tables
Why “hash tables”?
Store named objects by hash code
Route the object to the nearest location in space
key idea: nodes and objects share id space
How do you find an object without its name?
Cost of churn?
Close names don’t help because of hashing
In most P2P apps, many joins and leaves
Cost of freeloaders?
Distributed Hash Tables
Dangers
Sybil attacks: one node becomes many
id attacks: can place your node wherever
Solutions hard to come by
crypto puzzles / money for IDs?
Certification of routing and storage?
Many routing frameworks in this spirit
Very popular in the late ’90s and early ’00s
Pastry, Tapestry, CAN, Chord, Kademlia
Applications of DHTs
Almost anything that involves routing
illegal file sharing: obvious application
backup/storage
filesystems
P2P DNS
Good properties
O(log N) hops to find an id (but how good is this?)
Non-fate-sharing id neighbors
Random distribution of objects to nodes
Pastry: Node State
Pastry: Node Joins
Find another geographically nearby node
Hash IP address to get Pastry id
Try to route a join message to this id
get routing tables from each hop and dest
select neighborhood set from nearby node
get the leaf set from the destination
Give info back to nodes so they can add you
Assuming the Pastry ring is well set up, this procedure will give good parameters
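A minimal sketch of how the joining node assembles its state from the join path, with node state simplified to plain dicts (real Pastry keeps a routing table, leaf set, and neighborhood set, and the exchange happens over the network):

```python
# Assemble a new Pastry node's state from the nodes on its join path.
def assemble_join_state(path, nearby):
    """path: the nodes the join message traversed, ending at the node
    whose id is numerically closest to the new node's id."""
    state = {
        # Row i of the routing table comes from the i-th hop, which
        # shares (at least) an i-digit id prefix with the new node.
        "routing_table": [hop["routing_table"][i]
                          for i, hop in enumerate(path)],
        # The neighborhood set comes from the nearby (proximal) node...
        "neighborhood": list(nearby["neighborhood"]),
        # ...and the leaf set from the destination.
        "leaf_set": list(path[-1]["leaf_set"]),
    }
    # The new node then announces itself to every node in these sets
    # so they can add it to their own tables.
    return state
```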
Pastry: Node Joins
Consider what happens from node 0
bootstraps itself
next node to come adds itself and adds this node
Neighborhood information will be bad for a while
need a good way to discover network proximity
This is a current research problem
On node leaves, do the reverse
If a node leaves suddenly, must be detected
the detecting node removes it from the tables
Pastry: Routing
The key idea: grow common prefix
given an object id, try to send to a node with at least one more digit in common
if not possible, send to a node that is numerically closer
if not possible, then you are the destination
Gives O(log N) hops
Each step gets closer to destination
Guaranteed to converge
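A minimal sketch of this next-hop rule, writing ids as strings of base-4 digits as in the earlier example (the flattened `known` set stands in for the routing table plus leaf set):

```python
# Pastry-style prefix routing: extend the shared prefix if possible,
# otherwise move numerically closer, otherwise stop.
def shared_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def next_hop(me, key, known):
    """Choose where node `me` forwards a message addressed to `key`."""
    p = shared_prefix_len(me, key)

    def dist(n):
        return abs(int(n, 4) - int(key, 4))

    # 1. Prefer a node sharing at least one more digit with the key.
    longer = [n for n in known if shared_prefix_len(n, key) > p]
    if longer:
        return min(longer, key=dist)
    # 2. Otherwise a node numerically closer to the key than we are.
    closer = [n for n in known if dist(n) < dist(me)]
    if closer:
        return min(closer, key=dist)
    return None  # 3. No better node: we are the destination
```

Each hop either extends the matched prefix (there are only O(log N) digit positions) or strictly shrinks the numeric distance, which is why the route converges.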
How Does Routing/Lookup Work?
Assign IDs to nodes
Map hash values to the node with the closest ID
Leaf set is successors and predecessors
All that’s needed for correctness
Routing table matches successively longer prefixes
Allows efficient lookups
[Figure: a lookup ID routed from a source node through nodes whose ids match successively longer prefixes (0…, 10…, 110…, 111…) until it reaches the responsible node.]
PAST: Pastry Filesystem
Now a simple filesystem follows:
Punt on metadata/discovery
to get a file, hash its name and look up in Pastry
to store a file, store it in Pastry
Can implement directories as files
Then just need to know the name of root
Shown to give reasonable utilization of storage space
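A minimal sketch of this file layer, with a plain dict standing in for “route to the nodes whose ids are closest to the hash”:

```python
# PAST-style file access: hash the name, look the hash up in the DHT.
import hashlib


def file_id(name):
    return hashlib.sha1(name.encode()).digest()


def store_file(dht, name, data):
    dht[file_id(name)] = data


def fetch_file(dht, name):
    return dht[file_id(name)]


# Directories as files: a directory's contents map child names to the
# names those children are stored under, so only the root's name needs
# to be known in advance.
def resolve(dht, path):
    entry = fetch_file(dht, "/")  # the well-known root directory
    for component in path.strip("/").split("/"):
        entry = fetch_file(dht, entry[component])
    return entry
```

For example, after `store_file(dht, "/", {"readme": "/readme"})` and `store_file(dht, "/readme", b"hello")`, `resolve(dht, "/readme")` returns the file’s data.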
PAST: File Replication
Since any one node might fail, replicate
Uses the neighbor set for k-way storage
Keeps the same file at each neighbor
Diversity of neighbors helps avoid fate-sharing
Certification
Each node signs a certificate
Says that it stored the file
Client will retry storage if not enough certificates
OK guarantees
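A minimal sketch of the store-with-certificates idea; the per-node `store` method and the signature check are assumptions of this sketch, not PAST’s actual interfaces:

```python
# k-way replication with signed storage receipts ("certificates").
def verify(cert):
    # Stub: real PAST verifies the storing node's signature here.
    return cert is not None


def replicate(fid, data, leaf_set, k):
    """Ask the k leaf-set neighbors to store the file and collect
    their storage certificates; too few means the client retries."""
    certs = [node.store(fid, data) for node in leaf_set[:k]]
    certs = [c for c in certs if verify(c)]
    if len(certs) < k:
        raise RuntimeError("too few storage certificates; retry store")
    return certs
```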
PAST: Tradeoffs
No explicit FS structure:
Speed vs. storage
Could build any sort of system by storing files
Basically variable-sized block storage mechanism
This buys simplicity at the cost of optimization
See Beehive for this tradeoff
Makes it an explicit formula; can be tuned
Ease of use vs. security
Hashes make file discovery non-transparent
Rationale and Validation
Backing up on other systems
no fate sharing
automatic backup by storing the file
But
Cost much higher than regular filesystem
Incentives: why should I store your files?
How is this better than tape backup?
How is this affected by churn/freeloaders?
Will anyone ever use it?
PAST: comparison to CFS
CFS: a filesystem built on Chord/DHash
Pastry is MSR, Chord/DHash is MIT
Very similar routing and storage
PAST: comparison to CFS
PAST stores files, CFS blocks
Thus CFS can use more fine-grained space
lookup could be much longer
CFS claims: ftp-like speed
get each block: must go through routing for each
Could imagine much faster: get blocks in parallel (see the sketch below)
thus routing is slowing them down
Remember: hops here are overlay hops, not internet hops
Load balancing in CFS
predictable storage requirements per file per node
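As referenced above, a minimal sketch of parallel block fetch; `fetch_block` is an assumed callable that performs one per-block overlay lookup and retrieval:

```python
# Overlapping CFS-style block lookups in time: each block still costs
# O(log N) overlay hops, but the lookups are independent.
from concurrent.futures import ThreadPoolExecutor


def fetch_file_parallel(fetch_block, block_ids, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blocks = list(pool.map(fetch_block, block_ids))  # keeps order
    return b"".join(blocks)
```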
DHT Deployment Today
[Figure: a layered stack over IP connectivity. Applications (CFS from MIT, PAST from MSR/Rice, pSearch, Antiquity, OStore, PIER, Coral, i3) each run over their own DHT: Chord, Pastry, Tapestry, Bamboo, CAN, or Kademlia.]
Every application deploys its own DHT
(DHT as a library)
Why so many DHT implementations?
Distributed Hash Table
More generally, provide lookup functionality
Peer-to-peer algorithm offering a put/get interface
Associative map for peer-to-peer applications
Map application-provided hash values to nodes
(Just as local hash tables map hashes to memory locs.)
Put/get then constructed above lookup (sketched below)
Many proposed applications
File sharing, end-system multicast, aggregation trees
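As referenced above, a minimal sketch of put/get layered over a lookup primitive; `lookup` and the nodes’ `store`/`fetch` methods are assumed interfaces, not any particular DHT’s API:

```python
# put/get built on lookup: route to the responsible node, then act.
import hashlib


def key_hash(key):
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


def put(lookup, key, value):
    node = lookup(key_hash(key))      # O(log N) overlay routing...
    node.store(key_hash(key), value)  # ...then a local store there


def get(lookup, key):
    node = lookup(key_hash(key))
    return node.fetch(key_hash(key))
```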
DHT Deployment Tomorrow?
[Figure: the same applications (CFS, PAST, pSearch, Antiquity, PIER, OStore, Coral, i3), but now all sharing a single DHT through a common indirection layer over IP connectivity.]
OpenDHT: one DHT, shared across applications
(DHT as a service)
Issues
Faster lookup
Cornell’s Kelips and Beehive DHTs achieve 1-hop lookups
Churn
When nodes come and go, some algorithms perform repairs that involve disruptive overheads
For example, CFS and PAST are at risk of copying tons of data to maintain replication levels!
Robustness to network partitioning
Chord, for example, can develop a split-brain problem!
Legal uses
Conclusions
Tradeoffs are critical
Why are you using P2P to begin with?
What sort of security/anonymity guarantees?
DHT applications
Think of a good one and become famous
PAST
caches whole files
Saves some routing overhead
Harder to implement a true filesystem
CFS
Easier space management since it stores fixed-size blocks
But possibly higher management overheads
OpenDHT
All DHTs have the same put/get interface, but different implementations
Use the same implementation for all applications
References
A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object
location and routing for large-scale peer-to-peer systems".
IFIP/ACM International Conference on Distributed Systems
Platforms (Middleware), Heidelberg, Germany, pages 329-350,
November, 2001.
A. Rowstron and P. Druschel, "Storage management and caching in
PAST, a large-scale, persistent peer-to-peer storage utility", ACM
Symposium on Operating Systems Principles (SOSP'01), Banff,
Canada, October 2001.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and
Hari Balakrishnan, Chord: A Scalable Peer-to-peer Lookup Service
for Internet Applications, ACM SIGCOMM 2001, San Diego, CA,
August 2001, pp. 149-160.
Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and
Ion Stoica, Wide-area cooperative storage with CFS, ACM SOSP
2001, Banff, October 2001.
References
Stefan Saroiu, P. Krishna Gummadi, and Steven D. Gribble. A
Measurement Study of Peer-to-Peer File Sharing Systems,
Proceedings of Multimedia Computing and Networking 2002
(MMCN'02), San Jose, CA, January 2002.
C. G. Plaxton, R. Rajaraman, and A. W. Richa. Accessing nearby
copies of replicated objects in a distributed environment. In
Proceedings of the 9th Annual ACM Symposium on Parallel
Algorithms and Architectures, Newport, Rhode Island, pages 311-320, June 1997.
Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia
Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu. OpenDHT: A
Public DHT Service and Its Uses. Proceedings of ACM SIGCOMM,
Philadelphia, Pennsylvania, pages 73-84, August 2005.
Sean Rhea, Patrick Eaton, Dennis Geels, Hakim Weatherspoon,
Ben Zhao, and John Kubiatowicz. Pond: The OceanStore Prototype.
In Proceedings of the 2nd USENIX Conference on File and Storage
Technologies (FAST). San Francisco, California, pages 1 - 14,
March, 2003.