slides - Academia Sinica

Download Report

Transcript slides - Academia Sinica

Scalable Peer-to-peer Network for
Highly Synchronized Simulations
Shun-Yun Hu
Institute of Physics, Academia Sinica
2005/03/11
Outline




Introduction
Voronoi-based Overlay Network (VON)
Simulation Results
Conclusion
A Look at Simulations

Simulations are important tools in scientific research

Larger scale and higher resolution (more accurate
and detailed simulations) are constantly sought

However, computational resource can be limited
An Untapped Potential

300 Million PCs on the Internet (2000 est.)

Up to 80% to 90% of CPU is wasted

Large supply of computing resource, growing rapidly
An Example: SETI@Home


Search for Extraterrestrial Intelligence (SETI)
UC Berkeley Project launched in May 1999

PC User downloads a screen saver
Calculations are done using idle CPU time

2005/03 statistics (in 6 years)




5.3 M world-wide participants
2.2 M years of single-processor CPU
54 teraflop machine (current top 3: 70.72, 51.87, 35.86)
Simulation: Folding@Home




Stanford Project launched in Sept. 2000
Seeks to determine protein’s 3D structure
Screensaver that downloads “work units”
2002 Statistics:



30,000 volunteers
1 M days of single-processor CPU
Published 23 papers in: Science, Nature, Nature
Structural Biology, PNAS, JMB, etc.
The Grand Question

Can we build the ultimate simulator for large-scale
simulation utilizing millions of computers world-wide?

Potential applications:





Nuclear reaction
Star clusters
Atomic-scale modeling in material science
Weather, earthquakes
Biology (protein, ecosystem, brain, ...)
Current Limitations

Current methodology





Issues:



Centralized server + many clients
Client requests “work unit” to process
Communication is minimized
Clients do not communicate
Only suitable for “embarrassingly parallel” simulations
Sophisticated server-side algorithm and management required
An alternative: peer-to-peer (P2P) computing
What is Peer-to-Peer (P2P)?
[Stoica et al. 2003]
 Distributed systems without any centralized control
or hierarchical organization
 Runs software with equivalent functionality

Examples



File-sharing:
VoIP:
DHT:
Napster, Gnutella, eDonkey
Skype
Chord, CAN, Pastry
Peer-to-Peer Overlay
A P2P overlay network
source: [Keller & Simon 2003]
Promise & Challenge of P2P

Promises



Growing resource, decentralized  Scalable
Commodity hardware
 Affordable
Challenges


Topology maintenance
 dynamic join/leave
Efficient content retrieval  no global knowledge
A Simulation Scenario

How can we utilize P2P for simulation-purpose?
Answer: depends on what you want to simulate

We observe that many simulations…





are spatially-oriented (i.e. based on coordinate systems)
run in discrete time-steps
require synchronization at each time-step
exhibit localized interaction (i.e. short-range interaction)
example: molecular dynamics (MD) simulation
Scenario Defined for P2P



Many simulated entities (nodes) on a 2D plane ( > 1,000)
Positions (coordinates) may change at each time-step
How to synchronize positions with those in Area of Interest
(AOI)?
Area of Interest
P2P Design Goals

Observation:


the contents are information from AOI neighbors
P2P content discovery is a neighbor discovery problem

Solve the Neighbor Discovery Problem in a fullydistributed, message-efficient manner.

Specific goals:


Scalable
Fast
 Limit & minimize message traffics
 Direct connection with AOI neighbors
Outline




Introduction
Voronoi-based Overlay Network (VON)
Simulation Results
Conclusion
Voronoi Diagram


2D Plane partitioned into regions by sites, each
region contains all the points closest to its site
Can be used to find k-nearest neighbor easily
Neighbors
Region
Site
Design Concepts
Use Voronoi to solve the neighbor discovery problem




Identify enclosing and boundary neighbors
Each node constructs a Voronoi of all AOI neighbors
Enclosing neighbors are minimally maintained
Mutual collaboration in neighbor discovery
Circle
Area of Interest (AOI)
White
self
Yellow
enclosing neighbor (E.N.)
L. Blue
boundary neighbor (B.N.)
Pink
E.N. & B.N.
Green
AOI neighbor
D. Blue unknown neighbor
Procedure (JOIN)
1) Joining node sends coordinates to any existing node
Join request is forwarded to acceptor
2) Acceptor sends back its own neighbor list
joining node connects with other nodes on the list
Joining node
Acceptor’s region
Procedure (MOVE)
1) Positions sent to all neighbors, mark messages to B.N.
B.N. checks for overlaps between mover’s AOI and its E.N.
2) Connect to new nodes upon notification by B.N.
Disconnect any non-overlapped neighbor
Boundary
neighbors
New
neighbors
Non-overlapped
neighbors
Demonstration
Simulation video


General movements (30 nodes, 800x600 world)
Local vs. global view
Outline




Introduction
Voronoi-based Overlay Network (VON)
Simulation Results
Conclusion
Simulation Method

Condition





World-size:
AOI:
Trials:
Time-steps:
1000x1000
150
10 ~ 250 nodes
1000
Behavior model



Random movement:
Constant velocity:
Movement duration:
random direction
5 units/step
random (1-25 steps)
Consistency Metrics

Topology Consistency [Kawahara, 2004]
Number of observed AOI neighbors
Number of actual AOI neighbors

Drift Distance [Diot, 1999]
Distance between observed position and actual position
(average over all nodes)
Topology Consistency
Topology Consistency (%)
Topology Consistency Measurements
100
99
98
97
96
95
94
93
92
91
90
0
50
100
150
Number of Nodes
200
250
Drift Distance
Drift Distance
100
90
average
80
maximum
70
60
50
40
30
20
10
0
0
50
100
150
Number of Nodes
200
250
Scalability (1)
Transmission Size Per Node Per Second
6.0
5.0
Size (kb)
send (max)
4.0
send (avg)
recv (max)
3.0
recv (avg)
2.0
1.0
0.0
0
25
50
75
100 125 150 175 200 225 250
Number of Nodes
Scalability (2)
Average Neighbor Size Measurements
18
16
Neighbor Size
14
12
connected
10
AOI
8
6
4
2
0
0
50
100
150
Number of Nodes
200
250
Scalability (3)
Comparison of Voronoi-based P2P and Client-Server
180
160
Size (kb)
140
send (avg)
120
recv (avg)
100
CS-send (avg)
80
CS-recv (avg)
60
40
20
0
0
25
50
75
100 125 150 175 200 225 250
Number of Nodes
Outline




Introduction
Voronoi-based Overlay Network (VON)
Simulation Results
Conclusion
Summary

Idle CPU and networks are untapped potential
resources for large-scale simulation

Current approaches do not support simulations that
require frequent synchronization / updates

A promising solution: Voronoi-based P2P Overlay



Leverage knowledge of each peer to maintain topology
Properties: scalable, efficient, fully-distributed
Enable simulations with frequent localized synchronization
Future Works

3D Voronoi

Heterogeneous node capacities

Node failures

Application to actual research problems
Acknowledgements











Dr. Jui-Fa Chen
(陳瑞發老師)
Dr. Wei-Chuan Lin
(林偉川老師)
Members of the Alpha Lab, TKU CS
Guan-Ming Liao
(廖冠名)
Dr. Chin-Kun Hu
(胡進錕老師)
LSCP, Institute of Physics, Academia Sinica
Joaquin Keller
Bart Whitebook
Jon Watte
(France Telecomm R&D, Solipsis)
(butterfly.net)
(there.com)
Dr. Wen-Bing Horng
Dr. Jiung-yao Huang
(洪文斌老師)
(黃俊堯老師)
Protein Folding Problem

Find native state (lowest free energy) 3D structure
given a 1D sequence of amino acids

Timescale limitation of classical MD methods





Secondary structure folds in 0.1 ~ 10 ms
Small protein folds in tens of ms
Current record: 1ms (villin headpiece)
full-atomic simulation of 1 ns takes one CPU day
100 ~ 10,000 gap (it might take decades)
Folding@Home Parallelization

Dynamics of complex
system involves crossing of
free energy barriers

Most time is spent in free
energy minimum “waiting”

Possible to simulate using
trajectories much shorter
than folding time

“ensemble dynamics” (same
coords, different velocities)
Simulation Specifics

free energy barrier crossing is identified by spike in
energy variance

Fs peptide (5-residue)


(fold time 10ns and 160 +/-10ns)
Artificial mini-protein BBA5 (23-residue)


Tens of thousands of 5-20ns trajectories (total of 700us)
Mean folding time is 10ms, 10 out of 10,000 folds in 10ns
Procedure (LEAVE)
1) Simply disconnect
2) Others then update their Voronoi
new B.N. is discovered via existing B.N.
Leaving node
(also a B.N.)
New boundary neighbor
Scalability (1)
Average transmission size per node per second
5.0
send (basic)
4.5
recv (basic)
4.0
send (dAOI)
Size (kb)
3.5
recv (dAOI)
3.0
2.5
2.0
1.5
1.0
0.5
0.0
0
25
50
75
100
125
150
Number of Nodes
175
200
225
250
Scalability (2)
Maximum transmission size per second among all nodes
6.0
send (basic)
recv (basic)
5.0
send (dAOI)
recv (dAOI)
Size (kb)
4.0
3.0
2.0
1.0
0.0
0
25
50
75
100
125
150
Number of Nodes
175
200
225
250
Scalability (3)
Average neighbor size for basic and dynamic AOI models
18
connected (basic)
16
connected (dAOI)
14
AOI (basic)
Neighbor Size
AOI (dAOI)
12
10
8
6
4
2
0
0
50
100
150
Number of Nodes
200
250
Problems of Voronoi Approach

Message traffic


Circular round-up of nodes
Redundant message sending
(inherent to fully-distributed design)

Incomplete neighbor discovery


Can happen with inconsistent / incorrect neighbor list
Fast moving node