Transcript current

Plethora: Infrastructure and
System Design
Introduction
• Peer-to-Peer (P2P) networks:
– Self-organizing distributed systems
– Nodes receive and provide services cooperatively
– No predetermined client/server roles
• Key features:
– Scalable
– Adaptive and reconfigurable
– Leverage technology trends (network/processor/memory)
• Key problems:
– Locating and routing objects efficiently
– Consistency management
– Fault-Tolerance
Location and Routing - DHT
• Apply structure to the network:
– Inputs hashed to a key
– Each node responsible for a subset of keys
• Nodes maintain small routing tables
• Queries routed to neighboring nodes that ensure
progress towards the ultimate destination.
Location and Routing - DHT
0XXX
1XXX
2XXX
3XXX
0112
2321
START
0112 routes a
message to
key 2000.
2032
First hop fixes
first digit (2)
2001
END
2001 closest
live node to
2000.
Second hop fixes
second digit (20)
Motivation
• Virtualization destroys locality.
• Query responses do not contain locality information.
• Recent studies show that queries for multiple keys in P2P networks
follow a Zipf-like distribution.
• Exploit geographic locality.
• Build highly-distributed collaborative environments and applications:
–
–
–
–
information lifecycle
distributed file systems
software distribution
archival storage and disaster recover
IP Addresses as Virtual IDs
• Incorporate locality into overlay networks:
– Explore addressing scheme of the underlying network.
• In most cases, nodes with IP addresses that are numerically close
are also physically close.
• Organization of the Internet in ASs. By correcting a few bits in each
hop, the last hops would be inside an AS.
• Issues:
– IP space is not uniformly populated by peers.
– Load imbalance at the peers.
– The upper bound of O(log n) can no longer be guaranteed.
IP Addresses as Virtual IDs
IP Addresses as Virtual IDs
IP Addresses as Virtual IDs
• 2,420 nodes. 20 keys per node.
Plethora
• Two-level overlay
– One global overlay
– Several local overlays
• Global overlay is the main repository of data.
– Global overlay helps nodes organize themselves into local
overlays.
• Local overlays exploit the organization of the Internet in ASs.
– Size of the local overlay is controlled by an overlay leader.
– Uses efficient distributed algorithms for merging and splitting
local overlays.
Cache Organization
Simulation Setup
• Internet topology generated using GT-ITM
topology generator.
• 10,000 overlay nodes selected randomly from
the hosts.
• NLANR web proxy trace with 500,254 objects.
• Zipf distribution parameters: {0.75, 0.80, 0.85,
0.90, 0.95}
• Local cache size: 5MB (LRU replacement
policy).
IP Addresses as Virtual IDs
Simulation Results
Zipf-parameter
Cache Hit Ratio
Gain
0.75
76%
31.0%
0.80
79%
33.5%
0.85
81%
36.0%
0.90
83%
38.7%
0.95
86%
41.3%
Summary
• IP addresses as virtual IDs:
– Overlays with good locality properties.
– Non-uniform realworld distribution:
• severe load imbalance
• no bounded latency
• Plethora Routing Core:
– Two-level overlay architecture.
– Local overlays are created to cluster nodes that are close in the
underlying network.
• Significant performance gains
– Low maintenance overhead
Latency Hiding
• For large-scale collaborative and
distributed applications:
– latency effects are still an issue.
– need resiliency in the presence of network
failures.
• Record updates using a transactional
versioning system:
– Aggregate updates
– Distributed conflict resolution
Versioning and Transaction Model
Global home
Local Home
T1
T2
T3
commit
Tk
Tk+1
Tk+2
Development Issues
• Implementation of versioning trees
– Efficient update and commit protocols
– Dealing with failures (node, network)
• Object structure of the repository to exploit
versioning semantics
• Guarantees on object access and
consistency of updates