Chord - Networked Systems Laboratory
Distributed Hash Table Systems
Hui Zhang
University of Southern California
Outline
• History
• A brief description of DHT systems
  - General definition
  - Basic components
• Chord, a DHT system
• Proximity routing in DHTs
• Reading list for interested audience
History: client-server model

[Figure: a central server holds both the index table and the data; clients a-d send queries to the server and transfer data from it.]
History: peer-to-peer model (Napster)

[Figure: the server holds only the index table; clients a-d hold the data, send queries to the server, and transfer data directly between peers.]
History: peer-to-peer model (Gnutella)

[Figure: no server; each client holds its own index table and data, queries are flooded among the peers, and data is transferred directly between them.]
History: peer-to-peer model (DHT systems)

[Figure: fully decentralized like Gnutella, but each client holds a partition of a distributed index table, so queries are routed rather than flooded; data is transferred directly between peers.]
DHT systems - definition
• a new class of peer-to-peer routing infrastructures
  - CAN, Chord, Pastry, Tapestry, Viceroy, Kademlia, etc.
• support hash table-like functionality on an Internet-like scale
  - given a key, map it onto a node
• numerous applications, services, and infrastructures have been
  proposed to be built on top of a DHT system:
  - web caching
  - naming services
  - event notification
  - indirection services
  - DoS attack prevention
  - application-layer multicast
  - etc.
Basic DHT components
• an overlay space and its partition approach
• a routing protocol
  - local routing table
  - next-hop decision
• a base hash function
  - variants: proximity-aware, locality-preserving, etc.
Consistent hashing [Karger et al. STOC97]

[Figure: both the data items and the servers are hashed onto the same overlay space.]
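To make the hashing step concrete, here is a minimal Python sketch. It assumes SHA-1 as the base hash function and truncates identifiers to m bits; chord_id and M are illustrative names, not from the paper.

import hashlib

M = 8  # identifier bits in this toy example; Chord itself uses m = 160

def chord_id(name: str) -> int:
    """Hash any name (a server address or a data key) onto the overlay space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

# Servers and data items land on the same m-bit space:
print(chord_id("server-10.0.0.1:4000"), chord_id("song.mp3"))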
Chord [Stoica et al. Sigcomm2001]
• Overlay space
  - one-dimensional unidirectional key space 0 to 2^m - 1.
    Given two m-bit identifiers x and y, d(x, y) = (y - x + 2^m) mod 2^m.
  - Each node is assigned a random m-bit identifier.
  - Each node is responsible for all keys from (but excluding) its
    predecessor node's identifier up to (and including) its own
    identifier in the key space (see the sketch below).
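A toy sketch of the distance metric and the key-to-node assignment just defined; the node IDs match the 8-node example on the next slide, and successor is an illustrative helper name:

M = 8

def d(x: int, y: int) -> int:
    """Unidirectional (clockwise) distance from x to y on the 2^m ring."""
    return (y - x + 2 ** M) % (2 ** M)

def successor(key: int, node_ids: list[int]) -> int:
    """The node responsible for `key`: the node at the smallest clockwise
    distance from the key, i.e., the key's successor on the ring."""
    return min(node_ids, key=lambda n: d(key, n))

nodes = [0, 32, 64, 96, 128, 160, 192, 224]
print(successor(120, nodes))  # -> 128: node 128 stores key 120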
Chord – an example (m=8)

[Figure: a Chord network with 8 nodes (IDs 0, 32, 64, 96, 128, 160, 192, 224) on an 8-bit key space; a data item with key 120 is stored at node 128.]
Chord – routing protocol
• Routing table (finger table)
  - (at most) m entries. The i-th entry of node n contains the pointer
    to the first node that succeeds n by at least 2^(i-1) on the key
    space, 1 ≤ i ≤ m.
• Next-hop decision:
  - For a given target key k, find the closest finger preceding k and
    forward the request to it.
  - Ending condition: the request terminates when k lies between the
    current node's ID and its successor node's ID.
  - The routing path length is O(log n) for an n-node network with
    high probability (w.h.p.). (See the lookup sketch below.)
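The following toy sketch builds finger tables and runs the greedy lookup on the 8-node example ring. It is a sketch under the definitions above, not the paper's pseudocode; finger_table, ring_successor, and lookup are illustrative names.

M = 8
NODES = [0, 32, 64, 96, 128, 160, 192, 224]  # the example ring

def d(x, y):  # clockwise distance on the 2^m ring
    return (y - x + 2 ** M) % (2 ** M)

def successor(key):  # the node that owns `key`
    return min(NODES, key=lambda n: d(key, n))

def ring_successor(n):  # n's immediate successor on the ring
    return min((m for m in NODES if m != n), key=lambda m: d(n, m))

def finger_table(n):
    """i-th finger: first node succeeding n by at least 2^(i-1), 1 <= i <= m."""
    return [successor((n + 2 ** (i - 1)) % 2 ** M) for i in range(1, M + 1)]

def lookup(start, key):
    """Forward to the closest finger preceding the key; stop once the key
    lies between the current node and its immediate successor."""
    node, path = start, [start]
    while d(node, key) != 0 and d(node, key) > d(node, ring_successor(node)):
        # closest preceding finger: largest clockwise advance short of the key
        node = max((f for f in finger_table(node) if d(node, f) <= d(node, key)),
                   key=lambda f: d(node, f))
        path.append(node)
    return successor(key), path

print(lookup(0, 120))  # -> (128, [0, 64, 96]): 3 hops, O(log n) as expected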
Chord – routing table setup

[Figure: the Chord network with 8 nodes and an m(=8)-bit key space, showing the finger ranges of node 0: Range 1 = [1,2), Range 2 = [2,4), Range 3 = [4,8), Range 4 = [8,16), Range 5 = [16,32), Range 6 = [32,64), Range 7 = [64,128), Range 8 = [128,256).]
Chord – a lookup for key 120

[Figure: a Chord network with N(=8) nodes and an m(=8)-bit key space; the lookup for key 120 is routed along overlay hops, each of which traverses underlying physical links, and terminates at node 128, which owns the key.]
Chord – node joining
1. Initializing fingers of the new node.
2. Updating fingers of existing nodes.
3. Transferring keys between the new node and its successor node.
(A toy sketch of steps 1 and 3 follows.)
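A toy sketch of steps 1 and 3 on flat Python data structures; step 2 is deferred to the stabilization protocol on the next slide. The names join, ring, and store are illustrative, and this is not the paper's pseudocode.

M = 8

def d(x, y):  # clockwise distance on the 2^m ring
    return (y - x + 2 ** M) % (2 ** M)

def join(new_id, ring, store):
    """ring: sorted list of node IDs; store: {node_id: {key: value}}."""
    succ = min(ring, key=lambda n: d(new_id, n))  # 1. locate position via lookup
    ring.append(new_id)
    ring.sort()
    # 3. keys in (predecessor, new_id] move from the successor to the new node
    store[new_id] = {k: v for k, v in store[succ].items()
                     if d(k, new_id) < d(k, succ)}
    for k in store[new_id]:
        del store[succ][k]

ring = [0, 64, 128, 192]
store = {0: {}, 64: {20: "a", 60: "b"}, 128: {}, 192: {}}
join(32, ring, store)
print(store[32], store[64])  # -> {20: 'a'} {60: 'b'}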
Chord – stabilization mechanism
• Keep the pointer to the immediate successor up to date.
  - guarantees correctness of lookups
• Keep the pointers to the remaining (m-1) fingers up to date.
  - keeps lookups fast
(A condensed sketch of the successor-refresh task follows.)
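A condensed, runnable sketch of the successor-refresh half of stabilization, following the shape of the pseudocode in [Stoica et al. Sigcomm2001]; the finger-refresh task is omitted, and the Node class and two-node demo are illustrative.

M = 8

def d(x, y):  # clockwise distance on the 2^m ring
    return (y - x + 2 ** M) % (2 ** M)

def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a == b:
        return x != a  # the interval covers the whole ring
    return 0 < d(a, x) < d(a, b)

class Node:
    def __init__(self, id_):
        self.id, self.successor, self.predecessor = id_, self, None

    def notify(self, n):
        """n believes it might be our predecessor."""
        if self.predecessor is None or between(n.id, self.predecessor.id, self.id):
            self.predecessor = n

    def stabilize(self):
        """Periodically verify the immediate successor (lookup correctness)."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x  # a node joined between us and our old successor
        self.successor.notify(self)

a, b = Node(0), Node(64)
b.successor = a  # b joins the ring via a
for _ in range(3):
    a.stabilize(); b.stabilize()
print(a.successor.id, b.successor.id)  # -> 64 0: the ring has converged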
Chord – successors-list mechanism
• Each node keeps a list of its r nearest successors on the ring, so
  that lookups can still proceed when the immediate successor fails.
Chord – simulation result

[Figure: simulation results from [Stoica et al. Sigcomm2001].]
Chord – “failure” experiment

[Figure: the fraction of lookups that fail as a function of the fraction of nodes that fail; from [Stoica et al. Sigcomm2001].]
Outline
• History
• A brief description of DHT systems
  - General definition
  - Basic components
• Chord, a DHT system
• Proximity routing in DHTs
• Reading list for interested audience
Overlay performance of Chord
• Expected routing table size per node
- O(log N) entries
• Expected hops per request
- O(log N) hops
Latency stretch in Chord

[Figure: the Chord network with N(=8) nodes and an m(=8)-bit key space overlaid on the physical network, with nodes 0, 64, 80, 96, and 128 in the U.S.A. and the rest in China; successive overlay hops of a lookup (e.g., toward key 120) can repeatedly cross the long-latency physical links between the two regions.]
Latency stretch [Ratnasamy et al. 2001]
• stretch = (latency for each lookup on the overlay topology) /
            (average latency on the underlying topology)
• In Chord, Θ(log N) hops per lookup on average
  → Θ(log N) stretch in original Chord.
• Could Chord do better, e.g., O(1) stretch, without much change?
Three categories of proximity routing [Gummadi2003]
• Proximity identifier selection
  - By carefully assigning IDs to the nodes, make the overlay topology
    congruent to the underlying network topology.
• Proximity route selection
  - When forwarding a request message, the choice of the next hop
    depends on both the overlay distance and the underlying distance.
• Proximity neighbor selection
  - The neighbors in the routing table are chosen based on their
    proximity.
Underlying network topology decides the performance of proximity routing

[Figure: the worst-case scenario: all Chord nodes hang off a single router at equal distance, so no choice of neighbor or route is closer than any other.]
Latency expansion
• Let N_u(x) denote the number of nodes in the network G that are
  within latency x of node u.
  - power-law latency expansion: N_u(x) grows (i.e., “expands”)
    proportionally to x^d, for all nodes u.
    Examples: ring (d=1), mesh (d=2).
  - exponential latency expansion: N_u(x) grows proportionally to α^x
    for some constant α > 1.
    Examples: random graphs.
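The definition can be checked empirically: a small Dijkstra-based helper (illustrative, not from the paper) counts N_u(x) on a latency-weighted graph; on a unit-latency ring, N_u(x) roughly doubles when x doubles, i.e., d = 1.

import heapq

def latency_expansion(graph, u, radii):
    """Return N_u(x) for each x in radii.
    graph: {node: [(neighbor, link_latency), ...]} adjacency with latencies."""
    dist, heap, done = {}, [(0, u)], set()
    while heap:
        dx, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        dist[x] = dx
        for y, w in graph[x]:
            if y not in done:
                heapq.heappush(heap, (dx + w, y))
    return [sum(1 for v in dist.values() if v <= r) for r in radii]

# 64-node ring with unit link latencies: N_u(x) grows linearly in x (d = 1)
ring = {i: [((i - 1) % 64, 1), ((i + 1) % 64, 1)] for i in range(64)}
print(latency_expansion(ring, 0, [4, 8, 16]))  # -> [9, 17, 33]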
“Latency-stretch” theorem - I
• “Bad news” theorem
  If the underlying topology G is drawn from a family of graphs with
  exponential latency expansion, then the expected lookup latency of
  Chord is Ω(L·log N), where L is the expected latency between pairs
  of nodes in G.

[Figure: the worst-case equal-distance scenario again.]
“Latency-stretch” theorem - II
• “Good news” theorem
  If
  (1) the underlying topology G is drawn from a family of graphs with
  d-power-law latency expansion, and
  (2) each node u in the Chord network samples (log N)^d nodes in each
  range with uniform randomness and keeps the pointer to the nearest
  node for future routing,
  then the expected latency of a request is O(L), where L is the
  expected latency between pairs of nodes in G, i.e., the latency
  stretch of Chord is O(1).
Two remaining questions
• How does each node efficiently achieve (log N)^d samples from each
  range?
• Do real networks have the power-law latency expansion
  characteristic?
Uniform sampling in terms of ranges

[Figure: a routing path from Node 0 (the request initiator) through intermediate hops Node 1, Node 2, ... to Node t (the request terminator); node t falls into a different range of each hop along the path.]

For a routing request with Θ(log N) hops, the final node t will be a
random node in Θ(log N) different ranges.
Lookup-Parasitic Random Sampling
1. Recursive lookup.
2. Each intermediate hop appends its IP address to the lookup message.
3. When the lookup reaches its target, the target informs each listed
   hop of its identity.
4. Each intermediate hop then sends one (or a small number) of pings
   to get a reasonable estimate of the latency to the target, and
   updates its routing table accordingly.
(A schematic sketch of steps 2-4 follows.)
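A schematic sketch of the sampling bookkeeping in steps 2-4; ping is a stand-in for a real latency probe, and the nearest table layout is illustrative, not the paper's data structure.

import random
from collections import defaultdict

M = 8

def ping(a, b):
    """Stand-in for a one-off latency measurement between two nodes."""
    return random.uniform(1.0, 100.0)

def range_index(n, t):
    """The i such that t lies in n's range [n + 2^(i-1), n + 2^i)."""
    return ((t - n) % (2 ** M)).bit_length()

def lprs_update(path, target, nearest):
    """The lookup traversed `path` and terminated at `target`; every hop
    samples the target and keeps it if it is the nearest node seen so far
    for the corresponding range. nearest: {node: {range_i: (latency, node)}}."""
    for hop in path:
        rtt = ping(hop, target)
        i = range_index(hop, target)
        if i not in nearest[hop] or rtt < nearest[hop][i][0]:
            nearest[hop][i] = (rtt, target)  # route via the nearer node next time

nearest = defaultdict(dict)
lprs_update([0, 64, 96], 128, nearest)  # the lookup for key 120 from earlier
print(nearest[0])  # e.g. {8: (37.2, 128)}: node 0 sampled 128 in range [128,256)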
LPRS-Chord: convergence time

[Figure: convergence time of LPRS-Chord.]
LPRS-Chord: topology with power-law expansion

[Figure: ring stretch (at time 2logN).]
One remaining question
• Do real networks have the power-law latency expansion
  characteristic?
Internet router-level topology: latency measurement
• Approximate link latency by geographical latency
  - assign geo-locations to nodes using GeoTrack [Padmanabhan2001].
• A large router-level topology dataset
  - 320,735 nodes, mapped to 603 distinct cities all over the world.
  - 92,824 node pairs are sampled to tractably compute the latency
    expansion of this large topology.
Internet router-level topology: latency expansion

[Figure: latency expansion of the router-level topology.]
LPRS-Chord on router-level topology

[Figure: stretch on the router-level subgraphs (at time 2logN).]
Outline
• History
• A brief description of DHT systems
  - General definition
  - Basic components
• Chord, a DHT system
• Proximity routing in DHTs
• Reading list for interested audience
A few interesting papers on DHT
• K. Gummadi, R. Gummadi, S. Gribble, S. Ratnasamy, S.
Shenker, I. Stoica. “The Impact of DHT Routing
Geometry on Resilience and Proximity.” Sigcomm 2003.
• A. Gupta, B. Liskov, and R. Rodrigues. “Efficient Routing
for Peer-to-Peer Overlays.” NSDI 2004.
• H. Balakrishnan, K. Lakshminarayanan, S. Ratnasamy, S.
Shenker, I. Stoica, M. Walfish. “A Layered Naming
Architecture for the Internet.” Sigcomm 2004.
Thank you!
Backup slides
A simple random sampling solution

[Figure: a Chord network with an m-bit key space (identifiers 0 to 2^m - 1); a node keeps pointers to sampled nodes and measures its distance (latency) to each sample.]
Term definition (II)
• Range
  - for a given node in a Chord overlay with ID j, its i-th range
    R_i(j) is the interval [j + 2^(i-1), j + 2^i) on the key space,
    where 1 ≤ i ≤ m (see the sketch after this list).
• Frugal routing
  1. after each hop, the search space for the target reduces by a
     constant factor, and
  2. if w is an intermediate node in the route, v is the destination,
     and v ∈ R_i(w), then the node after w in the route depends only
     on w and i.
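A one-liner makes the range definition concrete (an illustrative sketch; note the wrap-around of the last range on the 2^m ring):

M = 8

def ranges(j):
    """R_i(j) = [j + 2^(i-1), j + 2^i) on the key space, for 1 <= i <= m."""
    return [((j + 2 ** (i - 1)) % 2 ** M, (j + 2 ** i) % 2 ** M)
            for i in range(1, M + 1)]

print(ranges(0))  # [(1, 2), (2, 4), (4, 8), ..., (128, 0)]: node 0's m ranges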
LPRS-Chord: simulation methodology
Phase 1. N nodes join the network one by one.
Phase 2. Each node inserts, on average, four documents into the
network.
Phase 3. Each node generates, on average, 3logN data requests one by
one.
  - LPRS actions are enabled only in Phase 3.
  - Performance measurement begins at Phase 3.
Comparison of 5 sampling strategies: definitions
• Consider a lookup that is initiated by node x0, then forwarded to
  nodes x1, x2, ..., and finally reaches the request terminator,
  node xn:
  1. Node xi samples node xn, 0 ≤ i < n
  2. Node xn samples nodes x0, ..., xn-1
  3. Node xi samples node xi-1, 1 ≤ i ≤ n
  4. Node x0 samples node xn
  5. Node xi samples node x0, 0 < i ≤ n
Comparison of 5 sampling strategies: simulation result

[Figure: simulation comparison of the five sampling strategies.]
Zipf-ian document popularity

[Figure: results under Zipf-ian document popularity.]
Impact of skewed request distributions

[Figure: results under skewed request distributions.]