Transcript chord

Chord: A Scalable Peer-toPeer Lookup Service for
Internet Applications
Ion Stoica
David Liben-Nowell
M. Frans Kaashoek
Robert Morris
David R. Karger
Frank Dabek
Hari Balakrishnan
Presented By-
Manan Rawal, Bhaskar Gupta
Introduction



Efficient lookup of a node which stores data
items for a particular search key.
Provides only one operation: given a key, it
maps the key onto a node.
Example applications:




Co-operative Mirroring
Time-shared storage
Distributed indexes
Large-Scale combinatorial search
Problems addressed





Load Balance: Distributed hash function
spreads keys evenly over the nodes
Decentralization: Fully distributed
Scalability: Lookup grows as a log of number
of nodes
Availability: Automatically adjusts internal
tables to reflect changes.
Flexible Naming: No constraints on key
structure.
Chord Protocol




Assumes communication in underlying
network is both symmetric and transitive.
Assigns keys to nodes with consistent
hashing
Hash function balances the load
When Nth node joins or leaves only O(1/N)
fraction of keys moved.
Chord protocol




Consistent hashing function assigns each
node and key an m-bit identifier using SHA-1
base hash function.
Node’s IP address is hashed.
Identifiers are ordered on a identifier circle
modulo 2m called a chord ring.
succesor(k) = first node whose identifier is >=
identifier of k in identifier space.
Chord protocol
m=6
10 nodes
Theorem
For any set of N nodes and K keys, with
high probability:

1.
2.
Each node is responsible for at most (1+e)K/N
keys.
When an (N+1)st node joins or leaves the
network, responsibility for O(K/N) keys changes
hands.
e = O(log N)
Simple Key Location Scheme
N1
N8
K45
N48
N14
N42
N38
N32
N21
Scalable Lookup Scheme
N1
Finger Table for N8
N56
N8
N51
finger 6
N48
finger 1,2,3
N8+1
N14
N8+2
N14
N8+4
N14
N8+8
N21
N8+16
N32
N8+32
N42
N14
finger 5
N42
finger 4
N38
N32
k-1
m
N21 finger [k] = first node that succeeds (n+2 )mod2
Scalable Lookup Scheme
// ask node n to find the successor of id
n.find_successor(id)
if (id belongs to (n, successor])
return successor;
else
n0 = closest preceding node(id);
return n0.find_successor(id);
// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
for i = m downto 1
if (finger[i] belongs to (n, id))
return finger[i];
return n;
Lookup Using Finger Table
N1
lookup(54)
N56
N8
N51
N48
N14
N42
N38
N32
N21
Scalable Lookup Scheme


Each node forwards query at least halfway
along distance remaining to the target
Theorem: With high probability, the number of
nodes that must be contacted to find a
successor in a N-node network is O(log N)
Dynamic Operations and Failures
Need to deal with:




Node Joins and Stabilization
Impact of Node Joins on Lookups
Failure and Replication
Voluntary Node Departures
Node Joins and Stabilization

Node’s successor pointer should be up to
date


For correctly executing lookups
Each node periodically runs a “Stabilization”
Protocol

Updates finger tables and successor pointers
Node Joins and Stabilization

Contains 6 functions:






create()
join()
stabilize()
notify()
fix_fingers()
check_predecessor()
Create()

Creates a new Chord ring
n.create()
predecessor = nil;
successor = n;
Join()


Asks m to find the immediate successor of n.
Doesn’t make rest of the network aware of n.
n.join(m)
predecessor = nil;
successor = m.find_successor(n);
Stabilize()


Called periodically to learn about new nodes
Asks n’s immediate successor about successor’s predecessor p


Checks whether p should be n’s successor instead
Also notifies n’s successor about n’s existence, so that successor
may change its predecessor to n, if necessary
n.stabilize()
x = successor.predecessor;
if (x  (n, successor))
successor = x;
successor.notify(n);
Notify()

m thinks it might be n’s predecessor
n.notify(m)
if (predecessor is nil or m  (predecessor, n))
predecessor = m;
Fix_fingers()

Periodically called to make sure that finger table entries are
correct


New nodes initialize their finger tables
Existing nodes incorporate new nodes into their finger tables
n.fix_fingers()
next = next + 1 ;
if (next > m)
next = 1 ;
finger[next] = find_successor(n + 2next-1);
Check_predecessor()

Periodically called to check whether
predecessor has failed

If yes, it clears the predecessor pointer, which can
then be modified by notify()
n.check_predecessor()
if (predecessor has failed)
predecessor = nil;
Theorem 3

If any sequence of join operations is
executed interleaved with stabilizations, then
at some time after the last join the successor
pointers will form a cycle on all nodes in the
network
Stabilization Protocol


Guarantees to add nodes in a fashion to
preserve reach ability
By itself won’t correct a Chord system that
has split into multiple disjoint cycles, or a
single cycle that loops multiple times around
the identifier space
Impact of Node Joins on Lookups

Correctness

If finger table entries are reasonably current


If successor pointers are correct but finger tables
are incorrect


Lookup finds the correct successor in O(log N) steps
Correct lookup but slower
If incorrect successor pointers

Lookup may fail
Impact of Node Joins on Lookups

Performance

If stabilization is complete


Lookup can be done in O(log N) time
If stabilization is not complete

Existing nodes finger tables may not reflect the new nodes


Doesn’t significantly affect lookup speed
Newly joined nodes can affect the lookup speed, if the new
nodes ID’s are in between target and target’s predecessor

Lookup will have to be forwarded through the intervening nodes,
one at a time
Theorem 4

If we take a stable network with N nodes with
correct finger pointers, and another set of up
to N nodes joins the network, and all
successor pointers (but perhaps not all finger
pointers) are correct, then lookups will still
take O(log N) time with high probability
Failure and Replication


Correctness of the protocol relies on the fact
of knowing correct successor
To improve robustness



Each node maintains a successor list of ‘r’ nodes
This can be handled using modified version of
stabilize procedure
Also helps higher-layer software to replicate data
Theorem 5

If we use successor list of length r = O(log N)
in a network that is initially stable, and then
every node fails with probability ½, then with
high probability find_successor returns the
closest living successor to the query key
Theorem 6

In a network that is initially stable, if every
node fails with probability ½, then the
expected time to execute find_successor is
O(log N)
Voluntary Node Departures


Can be treated as node failures
Two possible enhancements


Leaving node may transfers all its keys to its
successor
Leaving node may notify its predecessor and
successor about each other so that they can
update their links
Conclusion


Efficient location of the node that stores a desired
data item is a fundamental problem in P2P networks
Chord protocol solves it in a efficient decentralized
manner




Routing information: O(log N) nodes
Lookup: O(log N) nodes
Update: O(log2 N) messages
It also adapts dynamically to the topology changes
introduced during the run
Future Work

Using Chord to detect and heal partitions
whose nodes know of each other.



Every node should know of some same set of
initial nodes
Nodes should maintain long-term memory of a
random set of nodes that they have visited earlier
Malicious set of Chord participants could
present an incorrect view of the Chord ring

Node n periodically asks other nodes to do a
lookup for n
THANK YOU