P2P Simulation Platform Enhancement

Shih Chin, Chai
Supervisor: Dr. Tim Moors
Assessor: Dr. Robert Malaney
INTRODUCTION
- Part of the Peer-to-Peer (P2P) Sharing of Networks Performance Measurements Project
- Focus on enhancing the current simulation platform for the development of meaningful simulations

P2P Network

A distributed system architecture:
- No centralized control
- Nodes are symmetric in function
- Large number of unreliable nodes
- Focus on the application level

[Figure: P2P networking, centralised versus fully decentralised, with peers Alice, Bob, Jane and Judy]
MOTIVATION

- Many Internet applications, currently based on the client-server architecture, will be more robust if built on decentralized, self-organizing overlays (i.e. peer-to-peer systems).

More Robust?
- Reliable: no single point of failure, many replicas.
- Scalable: evolves smoothly to millions of nodes.
- High capacity through parallelism: many disks, many network connections, many CPUs.

Latest Generation
- Chord (MIT)
- Tapestry (UCB)
- Pastry (Microsoft & Rice)
- CAN (UCB & ACIRI)
- ...

Basic lookup in Chord
- Consistent hashing: keyID = SHA_1(key), nodeID = SHA_1(IP)
- 7-bit ID space

[Figure: Chord ring with nodes N10, N32, N60, N90, N105 and N120. N32 holds keys K15 and K30; N90 holds K80. N10 asks "Where is key 80?" and the answer returned is "N90 has K80".]
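
The mapping on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `chord_id` and `successor` are hypothetical helper names, reducing the SHA-1 digest modulo 2^7 is an assumed way to fit the 7-bit ID space of the figure, and the lookup uses a global view of the ring rather than Chord's hop-by-hop routing.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 7  # 7-bit identifier space, as in the figure


def chord_id(value: str, bits: int = ID_BITS) -> int:
    """Hash a key or an IP address onto the ring with SHA-1.

    Reducing the digest modulo 2**bits is an assumption made for this sketch.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def successor(node_ids, key_id):
    """Return the first node ID >= key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    return ring[index % len(ring)]


# Node IDs taken from the figure.
nodes = [10, 32, 60, 90, 105, 120]
owner_of_k80 = successor(nodes, 80)    # 90, i.e. "N90 has K80"
owner_of_k15 = successor(nodes, 15)    # 32
owner_of_k125 = successor(nodes, 125)  # wraps around to 10
```

A key is stored at its successor, the first node whose ID is equal to or follows the key on the ring, which is why K15 and K30 both land on N32.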
Locality Awareness
- Minimize wide-area traffic, bandwidth utilization, congestion, and sensitivity to wide-area faults.
- Performance in the local area is particularly important when many paths can stay entirely within the local area.
- We want the lookup path to stay local whenever possible.

Search for Low Stretch
- The measure of locality efficiency is stretch: the ratio of the distance traveled to find a copy of an object to the distance to the closest copy.
- Two Distributed Hash Tables (DHTs) designed with locality in mind: Tapestry and Pastry
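
As a worked example of the definition above, the sketch below computes stretch from a list of per-hop distances. The function name `stretch` and the latency figures are made up for illustration; any consistent distance metric (latency, IP hops) could be used.

```python
def stretch(distance_travelled: float, distance_to_closest_copy: float) -> float:
    """Ratio of the distance actually travelled to the distance to the closest copy."""
    if distance_to_closest_copy <= 0:
        raise ValueError("distance to the closest copy must be positive")
    return distance_travelled / distance_to_closest_copy


# Hypothetical lookup: three overlay hops of 20, 35 and 50 ms,
# while the closest copy is 70 ms away on the direct path.
overlay_hops_ms = [20, 35, 50]
lookup_stretch = stretch(sum(overlay_hops_ms), 70)  # 105 / 70
```

A stretch of 1 means the lookup travelled no further than the direct path to the closest copy; larger values indicate detours through the overlay.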

Roles of Simulators
- Evaluate the performance of P2P systems in terms of cost (e.g. bandwidth) and value (e.g. reliability).
- Provide a "good" abstraction of the real network and application for experimental purposes.

The Dangers in Simulating P2P
- If different simulators are used, differences in performance may be due to the simulator rather than the P2P system.
- The model may be simplified to the point where key facets of the network behavior are lost.

Today’s P2P Research
- There was no common simulation platform until recently, when p2psim and peersim were developed and made publicly available.
- Current projects still mainly use their own simulators, which is bad for comparing results.
- Q: Could a small change in the model result in a large change in the outputs? That would be even more treacherous.

My Approach
- The most useful simulator for long-term research interests would be one that incorporates various proposals by different researchers, e.g. ns-2, OPNET.
- This thesis is a collaborative effort to contribute toward a common network simulator for P2P networking.

Intro to p2psim
- Developed by an MIT research group
- Written in C++
- Multi-threaded
- Discrete-event simulator
- Currently supports Chord, Accordion, Koorde, Kelips, Tapestry, and Kademlia
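
To make "discrete-event simulator" concrete, here is a minimal event loop in Python. It is a generic sketch of the technique, not p2psim's design or API: the class `EventQueue`, its methods and the example events are all hypothetical.

```python
import heapq
import itertools


class EventQueue:
    """Minimal discrete-event loop: time jumps from one event to the next."""

    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal timestamps

    def schedule(self, delay, action):
        """Schedule action() to run `delay` time units after the current time."""
        heapq.heappush(self._heap, (self.now + delay, next(self._counter), action))

    def run(self):
        """Process events in timestamp order until none remain."""
        while self._heap:
            self.now, _, action = heapq.heappop(self._heap)
            action()


log = []
sim = EventQueue()
# Events may be scheduled in any order; they are executed in time order.
sim.schedule(5.0, lambda: log.append(("lookup reply", sim.now)))
sim.schedule(2.0, lambda: log.append(("lookup sent", sim.now)))
sim.run()
```

The simulated clock advances only when an event is processed, so a long simulated period with few events costs little real time.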

Initial (Part A) simulation results


Aim: To evaluate the stretch performance of Tapestry
Setup:
- Hardware: Pentium 4 CPU, 3 GHz
- OS: Linux 9, gcc version 3.2.2
- Topology: King topology
- Nodes: 1740
- Run time: ~40 hours
- Method: Evaluation under churn
Simulation Results
Analysis
- The graph shows that stretch decreases as bandwidth per node increases. However, it does not show the actual path a query takes.
- Conclusion: a more aggressive approach is needed to evaluate the actual locality performance.

Goals for thesis part B
- Modify p2psim to output path information for individual queries so that complete stretch characteristics can be determined
- Implement DHTs that have good support for locality, proximity and stretch in p2psim, such as Pastry

Plan Part I
- Expected challenge: the current simulator only supports end-to-end latencies.
- Possible solution: use an IP-layer topology file from GT-ITM with p2psim. Because GT-ITM deals with IP-layer nodes, this may enable us to count the IP hops of a query.
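
Counting IP hops on an IP-layer topology amounts to a shortest-path search over the router graph. The sketch below uses breadth-first search on a small hand-written adjacency map; the topology, the node names and the function `ip_hops` are hypothetical, and neither GT-ITM's file format nor p2psim's internals are modelled.

```python
from collections import deque


def ip_hops(graph, source, target):
    """Fewest IP-layer hops from source to target, or None if unreachable."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return hops + 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


# Hypothetical topology: hosts A and B attached to routers r1..r3.
topology = {
    "A": ["r1"],
    "r1": ["A", "r2", "r3"],
    "r2": ["r1", "B"],
    "r3": ["r1"],
    "B": ["r2"],
}
hops_a_to_b = ip_hops(topology, "A", "B")  # A -> r1 -> r2 -> B
```

Summing `ip_hops` over consecutive overlay nodes on a query's path would give the total IP hops travelled, which could then be compared with the direct hop count between the endpoints.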

Plan Part II
- Expected challenge: a huge program, with more than 3000 lines of C++ code as the outcome.
- Proposed tools:
  - The Tapestry code
  - The paper "Pastry: Scalable, decentralised object location and routing for large-scale p2p systems" (Microsoft & Rice)
  - The p2psim mailing list
Task Schedule
- Week 1-4: Modify p2psim to output stretch
- Week 5-10: Implement Pastry
- Week 11-12: Evaluation and Testing
- Week 13-14: Final Report and Open Day
- Also scheduled: Debugging & Refinement; Documentation & Project Management
Summary
- Enhance p2psim
- Add new features:
  - Output query path
  - New protocol
- To evaluate the stretch issues our project is after
- Also for long-term research interest

References
- Li J.Y., Stribling J., Morris R., Kaashoek M.F., Gil T.M., A performance vs. cost framework for evaluating DHT design tradeoffs under churn, MIT.
- Stoica I., Morris R., Karger D., Kaashoek M.F., Balakrishnan H., Chord, MIT and Berkeley.
- Zhao B.Y., Kubiatowicz J., Joseph A.D., Tapestry, UCB.
- Rowstron A., Druschel P., Pastry, Microsoft and Rice University.
- Kurose J., Levine B., Towsley D., Peer-peer and Application level networking, http://gaia.cs.umass.edu/cs791n
- Floyd S., Paxson V., 2001, Difficulties in simulating the Internet, ACIRI, Berkeley.
- Risson J., Moors T., Towards Robust Internet Applications: Self-Organizing Overlays, UNSW.
ANY QUESTIONS?
Tapestry Mesh
Incremental suffix-based routing

[Figure: Tapestry mesh of nodes with IDs 0x79FE, 0x23FE, 0x993E, 0x035E, 0x43FE, 0x73FE, 0x44FE, 0xF990, 0x555E, 0x73FF, 0xABFE, 0x04FE, 0x13FE, 0x9990, 0x423E, 0x239E and 0x1290; links are labelled with routing levels 1 to 4.]
Routing to Nodes
Example: Octal digits, 2^18 namespace, 005712 → 627510

[Figure: route 005712 → 340880 → 943210 → 834510 → 387510 → 727510 → 627510; each node is drawn with digit slots 0 to 7.]

Neighbor Map for "5712" (Octal)

Routing level   0     1     2     3     4     5     6     7
4               0712  1712  2712  3712  4712  5712  6712  7712
3               x012  x112  x212  x312  x412  x512  x612  5712
2               xx02  5712  xx22  xx32  xx42  xx52  xx62  xx72
1               xxx0  xxx1  5712  xxx3  xxx4  xxx5  xxx6  xxx7
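
The example route can be reproduced with a small sketch of suffix-based routing, in which each hop moves to a node that matches the destination in one more trailing digit. This is an illustration only: the functions `shared_suffix_len` and `route` are hypothetical, and the next hop is picked from a global node list, whereas a real Tapestry node would consult its own neighbor map.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(source, dest, nodes):
    """Resolve one more suffix digit of dest per hop and return the path."""
    path = [source]
    current = source
    while current != dest:
        level = shared_suffix_len(current, dest)
        # Candidates match the destination in exactly one more trailing digit.
        candidates = [n for n in nodes if shared_suffix_len(n, dest) == level + 1]
        if not candidates:
            raise LookupError(f"no node resolves digit {level + 1} of {dest}")
        current = candidates[0]
        path.append(current)
    return path


# Node IDs taken from the routing figure.
known_nodes = ["340880", "943210", "834510", "387510", "727510", "627510"]
path = route("005712", "627510", known_nodes)
```

With the node IDs from the figure, `path` visits 340880, 943210, 834510, 387510 and 727510 before reaching 627510, resolving the suffixes 0, 10, 510, 7510 and 27510 in turn.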
Object Location
Randomization and Locality
Pastry: Routing
Proximity Neighbor Selection
- Node is chosen based on the proximity metric

Routing step:
1. Check the leaf set
2. Then the routing table
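
The two routing steps can be sketched as follows. This is a simplified illustration, not Pastry's full algorithm or any library's API: node IDs are assumed to be equal-length hexadecimal strings, the routing table is modelled as a dictionary keyed by (row, digit), ring wrap-around is ignored, and the fallback used when the routing table entry is missing is omitted. All names and IDs are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def next_hop(node_id, key, leaf_set, routing_table):
    """Pick the next hop for key; returns None if no suitable entry is known."""
    if key == node_id:
        return node_id
    target = int(key, 16)
    # Step 1: leaf set. If the key falls inside the range it covers,
    # forward to the numerically closest member (possibly this node).
    members = list(leaf_set) + [node_id]
    values = [int(m, 16) for m in members]
    if min(values) <= target <= max(values):
        return min(members, key=lambda m: abs(int(m, 16) - target))
    # Step 2: routing table. Use the entry that shares one more
    # leading digit with the key than this node does.
    row = shared_prefix_len(node_id, key)
    return routing_table.get((row, key[row]))


# Hypothetical state of node 65a1.
leaf_set = ["6598", "659f", "65a3", "65b0"]
routing_table = {(0, "d"): "d13d", (2, "f"): "65f2"}
via_leaf_set = next_hop("65a1", "65a4", leaf_set, routing_table)  # "65a3"
via_table = next_hop("65a1", "d46a", leaf_set, routing_table)     # "d13d"
```

Proximity neighbor selection would influence which node is stored in each routing table entry; here the table contents are simply given.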