Chord: A Scalable Peer-to-peer Lookup
Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
Presented by Alexei Semenov
Introduction
• The main problem in peer-to-peer applications is to efficiently
locate the node that stores a particular data item.
• Chord supports only one operation: given a key, it maps the key
onto a node. It uses a variant of consistent hashing to assign
keys to Chord nodes.
• The advantages of using consistent hashing:
– balances the load
– little movement of keys when nodes join or leave the system
• What distinguishes Chord from many other peer-to-peer
lookup protocols?
– Simplicity
– Provable correctness
– Provable performance
Related Work
• Chord vs traditional name and location services
– Freenet provides anonymity, while Chord doesn’t
– Globe exploits network locality better than Chord
– Plaxton provides stronger guarantees than Chord
• Though Chord performs worse than these services in some
respects, it still performs well overall, does better in other
respects, and is considerably less complicated.
System Model
• Features of Chord:
– Load balance
– Decentralization
– Scalability
– Availability
– Flexible naming
• The Chord software takes the form of a library linked with the
client and server applications that use it. The application and
Chord interact in two ways (a minimal interface sketch follows
this slide):
– Chord provides a lookup(key) algorithm that yields the IP address of
the node responsible for the key.
– The Chord software on each node notifies the application of changes in
the set of keys that the node is responsible for.
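To make these two interaction points concrete, here is a minimal sketch of what such a library interface might look like. The class and callback names below are invented placeholders for illustration, not the actual API from the paper.

```python
# Illustrative sketch only: ChordNode and handle_keys_changed are invented
# names, not the library interface described in the paper.
from typing import Callable, Set


class ChordNode:
    """Chord library object linked into a client or server application."""

    def __init__(self, on_keys_changed: Callable[[Set[int]], None]):
        # Second interaction point: Chord calls back when the set of keys
        # this node is responsible for changes (e.g. after a join or leave).
        self.on_keys_changed = on_keys_changed

    def lookup(self, key: int) -> str:
        # First interaction point: would return the IP address of the node
        # responsible for `key` by routing the query around the Chord ring.
        raise NotImplementedError("sketch only")


def handle_keys_changed(keys: Set[int]) -> None:
    # The application reacts by transferring the values for these keys.
    print("now responsible for keys:", sorted(keys))


node = ChordNode(on_keys_changed=handle_keys_changed)
# node.lookup(some_key) would yield the responsible node's IP address.
```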
The Base Chord Protocol –
Consistent Hashing (1)
• Chord uses consistent hashing, but improves its scalability by
avoiding the requirement that every node knows about every
other node.
• The consistent hash function assigns each node and key an m-bit
identifier using a base hash function such as SHA-1. A node’s
identifier is chosen by hashing the node’s IP address, while a
key identifier is produced by hashing the key.
• Consistent hashing assigns keys to nodes as follows:
Identifiers are ordered in an identifier circle modulo 2^m. Key
k is assigned to the first node whose identifier is equal to or
follows k in the identifier space. This node is called the
successor node of k.
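A minimal sketch of this assignment rule, assuming Python and SHA-1 reduced modulo 2^m; a real Chord node would also track the IP address associated with each identifier, and the addresses below are made-up examples.

```python
# Minimal sketch of Chord-style consistent hashing (SHA-1, m-bit identifiers).
import hashlib
from bisect import bisect_left


def chord_id(value: str, m: int) -> int:
    """Hash a node's IP address or a key name to an m-bit identifier."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key_id: int, node_ids: list[int]) -> int:
    """First node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]      # wrap around past the highest identifier


m = 160                                       # SHA-1 gives 160-bit identifiers
nodes = [chord_id(ip, m) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
key = chord_id("my-data-item", m)
print("key is assigned to node", successor(key, nodes))
```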
The Base Chord Protocol –
Consistent Hashing (2)
• Example: m = 3, with nodes 0, 1, and 3 on the ring. The
successor of identifier 1 is node 1, so key 1 would be
located at node 1. Similarly, key 2 would be located at
node 3, and key 6 at node 0.
Consistent hashing enables nodes to enter and leave the
network with minimal disruption.
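The same successor rule can be checked against the m = 3 example above, taking the node identifiers 0, 1, and 3 directly from the slide rather than deriving them from hashed IP addresses:

```python
# Checking the m = 3 example: nodes 0, 1, 3 and keys 1, 2, 6.
def successor(key_id: int, node_ids: list[int]) -> int:
    for n in sorted(node_ids):
        if n >= key_id:
            return n
    return min(node_ids)            # wrap around the identifier circle

for k in (1, 2, 6):
    print(f"key {k} -> node {successor(k, [0, 1, 3])}")
# prints: key 1 -> node 1, key 2 -> node 3, key 6 -> node 0
```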
The Base Chord Protocol –
Scalable Key Location
• Using only consistent hashing may require traversing all
nodes to find the appropriate mapping. That is why Chord
maintains additional routing information.
• Each node n maintains a routing table with at most m entries,
where m is the number of bits in the key/node identifiers. This
table is called the finger table.
• A finger table entry includes both the Chord identifier and the
IP address (and port number) of the relevant node.
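In the paper, the i-th finger of node n is the first node that succeeds n by at least 2^(i-1) on the circle, i.e. successor((n + 2^(i-1)) mod 2^m). The sketch below builds such a table from a global view of the ring, which a real node does not have; it fills its fingers through the join and stabilization protocol and stores an IP address alongside each identifier.

```python
# Sketch of one node's finger table, assuming a global view of the ring.
from bisect import bisect_left


def successor(key_id: int, ring: list[int]) -> int:
    """First node identifier equal to or following key_id (with wrap-around)."""
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]


def finger_table(n: int, ring: list[int], m: int) -> list[int]:
    """Entry i (1-based in the paper) is successor((n + 2^(i-1)) mod 2^m)."""
    return [successor((n + 2 ** i) % (2 ** m), ring) for i in range(m)]


m = 3
ring = sorted([0, 1, 3])
print(finger_table(1, ring, m))   # node 1's fingers on the example ring: [3, 3, 0]
```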
The Base Chord Protocol –
Node Joins
• Nodes can leave or join at any time. Preserving the ability to
locate every key in the network may present a challenge.
Chord deals with this problem by making sure that:
– Each node’s successor is correctly maintained
– For every key k, node successor(k) is responsible for k.
– Each node’s predecessor is correctly maintained
• When a node n joins the network, Chord performs three
operations (a simplified sketch follows this list):
– Initializes the predecessor and fingers of node n
– Updates the fingers and predecessors of existing nodes to reflect the
addition of n
– Notifies the higher layer software so that it can transfer state associated
with keys that node n is now responsible for.
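A deliberately simplified sketch of these three steps, assuming a global view of the ring; the real protocol learns this information by exchanging messages through one existing node and transfers actual key/value state rather than printing.

```python
# Simplified join sketch: global ring view instead of the real message-based
# protocol; "notify the higher layer" is reduced to a print statement.
from bisect import bisect_left


def successor(key_id: int, ring: list[int]) -> int:
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]


def finger_table(n: int, ring: list[int], m: int) -> list[int]:
    return [successor((n + 2 ** i) % (2 ** m), ring) for i in range(m)]


def join(new_id: int, ring: list[int], m: int) -> None:
    ring.append(new_id)
    ring.sort()
    idx = ring.index(new_id)

    # 1. Initialize the new node's predecessor and fingers.
    predecessor = ring[idx - 1]       # wraps to the last node when idx == 0
    fingers = finger_table(new_id, ring, m)

    # 2. Update fingers/predecessors of existing nodes; here we simply
    #    recompute every table from the global view.
    tables = {n: finger_table(n, ring, m) for n in ring}

    # 3. Notify the higher-layer software: keys in (predecessor, new_id]
    #    now belong to the new node and should be transferred to it.
    print(f"node {new_id}: predecessor={predecessor}, fingers={fingers}")
    print(f"updated {len(tables)} finger tables")


ring = [0, 1, 3]
join(6, ring, m=3)    # key 6, previously held by node 0, would move to node 6
```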
Concurrent Operations and Failures
• Stabilization
– Needed in the case of concurrent joins. A basic “stabilization” protocol is
used to keep nodes’ successor pointers up to date, which is sufficient to
guarantee correctness of lookups. Successor pointers are then used to
verify and correct finger table entries, which allows lookups to be fast as
well as correct (a rough sketch follows this slide).
• Failures and Replication
– When a node n fails, nodes whose finger tables include n must find n’s
successor. In addition, the failure of n must not disrupt queries that are
in progress.
– To recover successfully from the failure, nodes need to maintain correct
successor pointers. To that end, each Chord node maintains a
“successor list” of its r nearest successors on the Chord ring.
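A rough, single-process sketch of the stabilize/notify idea and the successor list, using plain Python objects in place of remote procedure calls; the only subtle part is interval arithmetic on the identifier circle.

```python
# Sketch of periodic stabilization; real nodes run this over the network and
# fall back to the next live entry of successor_list when a successor fails.
class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor: "Node" = self            # initially points at itself
        self.predecessor: "Node | None" = None
        self.successor_list: list["Node"] = []   # r nearest successors

    def stabilize(self) -> None:
        # Periodically ask our successor for its predecessor; if that node
        # sits between us and our successor, adopt it as the new successor.
        x = self.successor.predecessor
        if x is not None and in_open_interval(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)              # tell the successor about us

    def notify(self, candidate: "Node") -> None:
        # candidate believes it might be our predecessor.
        if self.predecessor is None or in_open_interval(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


def in_open_interval(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b on the identifier circle."""
    return (a < x < b) if a < b else (x > a or x < b)
```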
Stabilization Example
Simulation Results
• Protocol Simulator
– Implemented in iterative style: the node resolving a lookup initiates all
communication, asking a series of nodes for information from their finger
tables, each time moving closer on the Chord ring to the desired successor.
• Load Balance
– The number of keys per node exhibits large variations that increase linearly with
the number of keys.
• Path Length
– The mean path length increases logarithmically with the number of nodes.
• Simultaneous Node Failures
– No significant lookup failures beyond those for keys stored on the failed nodes
Experimental Results
• A prototype implementation of Chord was deployed on the Internet: Chord
nodes at ten sites on a subnet of the RON test-bed in the USA, in California,
Colorado, Massachusetts, New York, North Carolina, and Pennsylvania.
The Chord software runs on UNIX, uses 160-bit keys obtained from the SHA-1
cryptographic hash function, and uses TCP to communicate between
nodes. Chord runs in the iterative style.
• The figure shows the measured latency of Chord lookups over a
range of numbers of nodes.
• Lookup latency grows slowly with the total number of nodes,
which is consistent with the simulation results and
demonstrates Chord’s scalability.
Conclusion
• Chord features simplicity, provable correctness and provable
performance even when there are concurrent node arrivals and
departures.
• It continues to function properly even when nodes’ routing information
is only partially correct.
• It scales well with the number of nodes, recovers from large
numbers of simultaneous node failures and joins, and answers most
lookups correctly even during recovery.
• Chord can be valuable for large-scale, peer-to-peer distributed
applications such as cooperative file sharing, time-shared available
storage systems, etc.