Chord: A Distributed P2P Network

Spencer Sevilla
CMPE 252A
Goals of Chord
• Chord is a distributed overlay network
• Chord stores key-value mappings
• For a given key, Chord returns the
value(s)
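To make this lookup primitive concrete, here is a minimal interface sketch; ChordNode, put, and get are illustrative names, not the paper's API:

```python
class ChordNode:
    """Illustrative sketch only; these names are assumptions, not the paper's API."""

    def put(self, key: str, value: bytes) -> None:
        """Store a key-value mapping somewhere in the ring."""
        raise NotImplementedError

    def get(self, key: str) -> bytes:
        """Return the value(s) mapped to key, wherever they are stored."""
        raise NotImplementedError
```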
Prior Work
• DNS is highly structured and centralized
• Freenet and Ohaha are anonymous, but offer no guarantee that stored data can be found
• Globe exploits network locality, but enforces a hierarchy on the overlay nodes
• CAN scales poorly: its lookups cost O(d · N^(1/d)), worse than O(log N) for fixed d
Chord Overview
• Any node joining a Chord is given a key
• This key is an m-bit hash of its IP
address
• Nodes arrange themselves into a ring over the 2^m identifier space based on this key
• This hash function must be uniform
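A minimal sketch of how a node might derive its ring identifier. SHA-1 is the paper's hash; truncating it to a small m = 8 via a modulus is an assumption here for readability (real deployments use m = 160):

```python
import hashlib

M = 8            # identifier bits; the paper uses m = 160 (full SHA-1)
RING = 2 ** M    # size of the identifier space

def node_id(ip: str) -> int:
    """Map an IP address to a point on the 2^m identifier ring."""
    digest = hashlib.sha1(ip.encode()).digest()
    return int.from_bytes(digest, "big") % RING

# e.g. node_id("10.0.0.1") and node_id("10.0.0.2") land at
# (pseudo-)uniformly distributed positions on the ring
```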
Joining A Chord
• A node must start with the IP address of
at least one node already in the Chord
• Node generates its key and goes
around the Chord until it finds its
successor
• Finally, it tells its predecessor to update its successor pointer to the new node (see the sketch below)
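A sketch of this join procedure on an in-memory ring, using the O(N) walk described above rather than finger-table routing; Node and the pointer names are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    id: int
    successor: Optional["Node"] = None
    predecessor: Optional["Node"] = None

def between(x: int, a: int, b: int) -> bool:
    """True if x lies in the half-open ring interval (a, b], wrapping at 2^m."""
    return a < x <= b if a < b else x > a or x <= b

def find_successor(start: Node, key_id: int) -> Node:
    """Walk node-to-node around the ring: O(N) hops, as on this slide."""
    n = start
    while not between(key_id, n.id, n.successor.id):
        n = n.successor
    return n.successor

def join(new: Node, bootstrap: Node) -> None:
    """Splice a new node in: find its successor, then fix both neighbors' pointers."""
    succ = find_successor(bootstrap, new.id)
    new.successor, new.predecessor = succ, succ.predecessor
    succ.predecessor.successor = new   # old predecessor now points at the new node
    succ.predecessor = new
```

In the real protocol these pointer updates happen over the network and are hardened by periodic stabilization; this sketch covers only the happy path.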
Storing Objects
• First, hash the object’s key to create
another m-bit identifier
• The node responsible for this object is the first node at or after the object's identifier on the ring, i.e. its successor
• This node stores the key-value pairing
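A self-contained sketch of this successor rule, assuming we can see the whole ring as a sorted list of node identifiers; in the real system this mapping is found by the distributed lookup:

```python
import bisect
import hashlib

M = 8
RING = 2 ** M

def key_id(key: str) -> int:
    """Hash an object's key into the same m-bit identifier space as the nodes."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big") % RING

def responsible_node(node_ids: list[int], kid: int) -> int:
    """The first node at or after kid on the ring stores the pair (wraps to node_ids[0])."""
    i = bisect.bisect_left(node_ids, kid)
    return node_ids[i % len(node_ids)]

nodes = [10, 52, 120, 200]                          # example ring positions, sorted
print(responsible_node(nodes, key_id("song.mp3")))  # the node that stores this key
```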
log(N) Lookups
• We don’t want to walk all the way around the Chord on every lookup: that costs O(N) hops
• Nodes should cache the addresses of other nodes to improve performance
• This caching is based on adding successive powers of two to the node’s own ID (see the sketch below)
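A sketch of this power-of-two caching scheme, i.e. the finger table named on the next slide; the small m = 8 is assumed here for readability:

```python
M = 8
RING = 2 ** M

def finger_targets(n: int) -> list[int]:
    """Entry i of node n's finger table points at successor(n + 2^i mod 2^m).
    Routing via the largest finger that does not overshoot at least halves
    the remaining ring distance each hop, giving O(log N) lookups."""
    return [(n + 2 ** i) % RING for i in range(M)]

print(finger_targets(10))   # [11, 12, 14, 18, 26, 42, 74, 138]
```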
Chord Properties
• The “finger table” ensures O(log N) lookups
• A uniform hash function plus the law of large numbers ensures load balancing
• When a new node joins the Chord, on average a 1/N fraction of the entries is transferred to it
• It is assumed that the number of objects O is much larger than the number of nodes: O >> N
Results: Load Balancing
• Excellent in a large-enough network (10^4 nodes)
• In smaller networks, the mean load is the expected 1/N fraction, but the variance is much too high!
• The variance comes from the uneven sizes of the “bins” (arcs) between consecutive nodes
• Solution: add “virtual nodes” until the network is effectively large enough (see the sketch below)
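A sketch of the virtual-node fix. Salting the hash input with a virtual-node index, as done here, is an illustrative assumption rather than the paper's exact construction:

```python
import hashlib

M = 8
RING = 2 ** M

def virtual_ids(ip: str, v: int) -> list[int]:
    """Place one physical host at v positions on the ring. Running several
    virtual nodes per host (the paper suggests O(log N)) shrinks the variance
    of per-node load toward the 1/N mean."""
    return [
        int.from_bytes(hashlib.sha1(f"{ip}#{i}".encode()).digest(), "big") % RING
        for i in range(v)
    ]
```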
Results: Continued
• Path length: proven and demonstrated to be O(log N), and this holds for both small and large N
• If a fraction p of the nodes fail, then a fraction p of the queries fail with them (their data is simply gone), but routing itself never breaks: no lookup failures!
• In a real-world topology spanning the entire USA, lookup latency ranges from 180 to 280 ms
Considerations
• Significant overhead when nodes join or leave the network, or when a node’s hash changes
• Chord assumes a fully-connected underlay network in which bandwidth is not a concern
• No guarantees are made for lookups during these stabilization or mobility periods
Conclusions
• Chord efficiently locates items in a
distributed/decentralized system
• Chord offers a powerful primitive to
support other systems (NFS, DNS, ...)
• Chord is highly scalable, failure-tolerant, remarkably simple, and provably correct
?