Transcript ppt - ICS

P2P Group Meeting
(ICS/FORTH)
Monday, 28 March, 2005
A Scalable Content-Addressable Network
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard
Karp, Scott Shenker
In a nutshell...
CAN is a distributed hash table.
Consider a virtual world with specified
geometry.
Add some nodes uniformly.
Each node holds information about its zone.
Each node holds information about its
neighbors.
Zone: a chunk of the entire hash table.
How it works?
We want to store a (key, data) pair.
Use a uniform hash function to map the key in a point P in the
virtual world.
P lies in a zone.
The node which is the owner of the zone now holds the (key,
data) pair.
How it works?
We want to retrieve the (key, value) pair.
Apply the same hash function to key.
You'll get P.
Now you can travel from your node to the node-owner of the zone
which P belongs to, using simple geometry.
Now, the horry details...
CAN uses a virtual d-dimentional Cartesian coordinate
space on a d-torus.
The entire coordinate space is dynamically partitioned to
zones, owned by n nodes.
Add some black magic and you get:
Average Routing Path: (d/4)(n1/d) hops.
Nodes maintain: 2d neighbors.
(*If d=lgn/2, then we have O(lgn))
Design – Construction
Bootstrap process: via host caches.
After the initial connection the new node selects a random
point P and sends a JOIN request.
The node that owns P splits its zone and sets the new node
as a neighbor.
Periodically update messages are exchanged between nodes
located closed to the "neighborhood".
Node insertion is a local process and affects only
O(dimensions) existing nodes.
Node Departure
Explicit hand over: a node sends its zone to one of
its neighbors (the one with the smallest one), before
it leaves the system.
TAKEOVER: If a node doesn't receive an UPDATE
message from one of its neighbors for a long period,
then it takes the zone (recovery).
Design Improvements
Goal: decrease routing latency by adding some nice
features.
Feature addition tradeoff: Per node state and system
complexity is increased.
1. Multiple dimensions
2. Realities
Reality: An independent coordinate space with a
specific zone mapping.
Having multiple realities, a data object is replicated
to various locations in the system.
2. Realities
Realities vs Dimensions
3. RTT based routing
Each node forwards a message to the neighbor with higher RTT
(round-trip-time) ratio.
4. Zone Overloading
Multiple peers (MAXPEERS~3,4) share the same zone in
the system.
Advantages
- Reduced path length. Zone overloading has the same
affect as reducing the nodes of the system.
- Reduced per-hop latency. Each node has multiple choices.
- Improved fault tolerance. A zone is less possible to be
vacant.
5. Multiple hash functions
6. Topology Forcing
Goal: apply an ordering based on RTT measures between
each node and a set of landmarks (well known machines,
i.e. DNS root names servers).
Stretch: the ratio of the latency on the CAN network to the
average latency on the IP network.
6. Topology Forcing
7. Uniform Partitioning
When a new node enters in the system, another node
must split its zone. If uniform partitioning is
enabled, the node with the largest zone volume will
be selected.
Uniform partitioning is a load balancing technique.
8. Hot Spot Management
Caching: Each node maintains a cash with keys of
popular data objects.
Replication: Each node may replicate popular data
objects to its neighbors.
Design Review
Results with a fixed system size
System Size = n18
Transit-Stub Topology Generator
TS topologies model networks using a 2-level hierarchy of
routing domains with transit domains that interconnect
lower level stub domains.
H(intra-transit, transit-stub, intra-stub)
R(low_limit, up_limit) {random values }
Example
H(100, 10, 1): A Transit-Stub topology with a hierarchical
link delay assignment of 100ms on intra -transit links, 10ms
on transit-stub links and 1ms on intra-stub links.
Main Result
In a system with approximately 260,000 peers, CAN
may achieve routing with a latency that is well
within a factor of two of the underlying network
latency.
Thank you for your time. :-)
Elias Athanasopoulos
[email protected]
http://www.csd.uoc.gr/~elathan/