Agent and Spatial Based parallelization of Biological Network motif


March 2, 2016
UWB Shizuoka Univ. Workshop
AGENT-BASED PARALLELIZATION
OF MICRO-SIMULATION AND SPATIAL DATA ANALYSIS
Munehiro Fukuda
Computing & Software Systems
University of Washington Bothell
Background
Simulation
• Conventional models
  • Macroscopic
  • Mathematical flow models
  • Partial differential equations
• Agent-based models
  • Micro-simulation
  • Emergent collective group behavior
Data Analysis
• Text data analysis
  • Text, JSON, CSV formats
  • Keep reading line by line into memory
  • Examples: Hadoop, Spark, Storm, Kafka
• Spatial data analysis
  • Binary
  • Array, raster, grid data
  • Graph
  • Need to be structured in memory
Verification & Calibration
Challenges
Agent-Based Models
• CPU Scalability: The more accuracy we pursue, the more agents we need in a simulation.
• Examples:
  • MatSim: 20 minutes to simulate 200K cars driving through Bellevue on I-405
  • FluTE: 2 hours to simulate an epidemic in 10M individuals
Spatial Data Analysis
• Memory Scalability: The more accuracy we pursue, the longer time series or the wider spatial data we need.
• Examples:
  • WRF: generates a 250MB NetCDF file of regional climate data every six hours, which sums up to 120 files or 30GB of data per month, 360GB per year, and 18TB for a 50-year simulation.
Parallelization
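The WRF data volumes quoted above can be checked with a few lines of arithmetic. The sketch below assumes 30-day months and decimal units (1 GB = 1,000 MB), neither of which the slide states; the class and method names are illustrative only.

```java
// Back-of-the-envelope check of the WRF output volume.
// Assumptions (not from the slides): 30-day months, 12 months per year,
// and decimal units (1 GB = 1,000 MB, 1 TB = 1,000,000 MB).
class WrfDataVolume {
    static final long FILE_SIZE_MB = 250;     // one NetCDF file
    static final int FILES_PER_DAY = 24 / 6;  // one file every six hours

    static long filesPerMonth() { return FILES_PER_DAY * 30L; }
    static long megabytesPerMonth() { return filesPerMonth() * FILE_SIZE_MB; }
    static long megabytesPerYear() { return megabytesPerMonth() * 12; }
    static long megabytesFor(int years) { return megabytesPerYear() * years; }

    public static void main(String[] args) {
        System.out.println(filesPerMonth() + " files per month");
        System.out.println(megabytesPerMonth() / 1_000 + " GB per month");
        System.out.println(megabytesPerYear() / 1_000 + " GB per year");
        System.out.println(megabytesFor(50) / 1_000_000 + " TB for 50 years");
    }
}
```

Under these assumptions the results match the slide: 120 files and 30GB per month, 360GB per year, and 18TB over 50 years.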
Hypothesis and Research Goals
• Hypothesis
  • Users may choose agents if they are easy to program, despite a 10-20% performance penalty (due to agent management overheads).
• Agent-Based Simulation
  • Achieve the highest-speed ABM simulation using GPUs, which assists real-time cyber-physical computation.
• Spatial Data Analysis
  • Allow spatial data analysis to be coded intuitively with agents, which improves programmability in data analysis.
Multi-Agent Spatial Simulation Library
Figure: MASS architecture. Application layer: Places form a 2D space (X-axis, Y-axis), and a bag of Agents resides on places (x, y). MASS library layer: the space is mapped onto three cluster nodes (mnode0.uwb.edu, mnode1.uwb.edu, mnode2.uwb.edu; process ranks 0 to 2), each running threads 0 to 3 on CPU cores 0 to 3 with its own system memory. The nodes are connected by sockets over a LAN.
MASS Specification
public static void main( String[ ] args ) {
    MASS.init( args );
    // Create a 2D space of places and populate it with agents.
    Places space = new Places( handle, "MySpace", params, xSize, ySize );
    Agents agents = new Agents( handle, "MyAgents", params, space, population );
    space.callAll( MySpace.func1, params );          // func1 on all places
    space.exchangeAll( MySpace.func2, neighbors );   // func2 between neighboring places
    agents.exchangeAll( MyAgents.func3 );            // func3 on all agents
    agents.manageAll( );
    MASS.finish( );
}
Figure: func1( ), func2( ), and func3( ) executing in parallel across places and agents.
MASS Status
• Library
  • Java (implemented)
  • C++ (implemented)
  • CUDA (in design)
• Test programs
  • 2D wave dissemination
  • Sugarscape: artificial society
• Applications
  • Transport simulation (in Java)
  • Epidemiologic simulation (in C++)
  • Climate analysis (in Java)
  • Network motif search (in Java)
Figure: 2D wave dissemination over time steps t, t+1, and t+2; cell (x, y) interacts with its neighbors (x-1, y), (x+1, y), (x, y-1), and (x, y+1).
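The 2D wave dissemination test program can be pictured as a stencil update in which each cell reads its four neighbors. The sketch below uses plain Java arrays and the standard explicit finite-difference scheme for the 2D wave equation; it makes no MASS calls, and the constants, class name, and boundary handling are assumptions made for illustration, not the test program's actual code.

```java
// Plain-Java sketch of one 2D wave-dissemination step (no MASS calls).
class Wave2DSketch {
    // Hypothetical constants chosen for illustration only.
    static final double C = 1.0;   // wave speed
    static final double DT = 0.1;  // time step
    static final double DD = 2.0;  // cell spacing

    // Computes heights at the next time step from the current (curr)
    // and previous (prev) heights. Boundary cells are held at 0.
    static double[][] step(double[][] prev, double[][] curr) {
        int n = curr.length;
        int m = curr[0].length;
        double k = C * C * DT * DT / (DD * DD);
        double[][] next = new double[n][m];
        for (int x = 1; x < n - 1; x++) {
            for (int y = 1; y < m - 1; y++) {
                double neighbors = curr[x - 1][y] + curr[x + 1][y]
                                 + curr[x][y - 1] + curr[x][y + 1];
                next[x][y] = 2.0 * curr[x][y] - prev[x][y]
                           + k * (neighbors - 4.0 * curr[x][y]);
            }
        }
        return next;
    }

    public static void main(String[] args) {
        int size = 5;
        double[][] prev = new double[size][size];
        double[][] curr = new double[size][size];
        prev[2][2] = 1.0;  // a bump at the center, initially at rest
        curr[2][2] = 1.0;
        double[][] next = step(prev, curr);
        System.out.println("center = " + next[2][2] + ", neighbor = " + next[3][2]);
    }
}
```

In an agent-and-place formulation, the inner loop body would become the per-place function, and reading the four neighbors would become the neighbor exchange.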
Multi-Agent-Based Transport Simulation
Parallelized by Zach Ma and to be presented at WSC2015
• Agents represent vehicles.
• They move over a traffic network at every simulation tick.
• http://matsim.org/uploads/showcase/MATVis-Time.pdf
Porting MatSim to MASS Java
• Road network
  • Converted to an adjacency matrix
• Vehicle driving
  • Converted to agent migration
Figure: a road network with nodes NA to NE and links (roads) L1 to L6, the corresponding adjacency matrix, and travelers on the links; network parameters and details are elided in the figure.
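The two conversions above can be sketched in plain Java. The node names NA to NE and link names L1 to L6 come from the slide's figure, but which nodes each link joins is not recoverable from it, so the link table below is hypothetical; treating roads as two-way is also an assumption, and no MASS API is used.

```java
// Sketch: a road network as an adjacency matrix, and driving as migration.
class RoadNetworkSketch {
    static final String[] NODES = {"NA", "NB", "NC", "ND", "NE"};

    // Hypothetical link table: {link name, end node, end node}.
    static final String[][] LINKS = {
        {"L1", "NA", "NB"}, {"L2", "NB", "NC"}, {"L3", "NC", "ND"},
        {"L4", "ND", "NE"}, {"L5", "NE", "NA"}, {"L6", "NA", "NC"},
    };

    static int indexOf(String node) {
        for (int i = 0; i < NODES.length; i++) {
            if (NODES[i].equals(node)) return i;
        }
        throw new IllegalArgumentException("unknown node: " + node);
    }

    // Cell [from][to] holds the name of the connecting link, or null if none.
    static String[][] toAdjacencyMatrix() {
        String[][] matrix = new String[NODES.length][NODES.length];
        for (String[] link : LINKS) {
            int a = indexOf(link[1]);
            int b = indexOf(link[2]);
            matrix[a][b] = link[0];
            matrix[b][a] = link[0];  // two-way road (assumption)
        }
        return matrix;
    }

    // A vehicle "drives" by migrating to the next node if a link exists;
    // otherwise it stays where it is.
    static int migrate(String[][] matrix, int current, int next) {
        return matrix[current][next] != null ? next : current;
    }

    public static void main(String[] args) {
        String[][] matrix = toAdjacencyMatrix();
        int vehicle = indexOf("NA");
        vehicle = migrate(matrix, vehicle, indexOf("NB"));  // NA to NB over L1
        vehicle = migrate(matrix, vehicle, indexOf("NE"));  // no NB-NE link: stays at NB
        System.out.println("vehicle is at " + NODES[vehicle]);
    }
}
```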
Parallel MatSim Programmability and Performance
                                      LoC      Files
Original MatSim                       5,144    46
Code addition/modification by MASS    585      8
%                                     11.3%    17.4%
Figure: speedup ratio (axis 0 to 2.5). Labels: Single Thread, MATSim Parallel, 2 Thread, 4 Thread, MASS with 2 Node, MASS with 4 Node.
FluTE: Influenza Epidemic Simulation
from Univ. New Mexico
Figure labels: person, communities, infected, contagious.
http://www.cs.unm.edu/~dlchao/flute/
MASS C++ Parallelized FluTE
Programmability and Performance
By Osmond Gunarso and Zac Brownell
Figure panels: MPI Performance, MASS Performance.
UW Climate Analysis
Parallelized by Jason Woodring and Submitted to ICPADS 2015
Agent-Based ToE Analysis (in Java)
UWCA Programmability and Performance
• Handling structured scientific data
• Dispatching agents to only data portions to analyze
• Supporting runtime analysis
Figure: two execution-time charts (in seconds). Axes: #cpus, #threads, #threads/cpu. Series: Synchronous Migration, Asynchronous Migration, UWCA, CDO/NCL.
The biggest overhead is reading files into a cluster system
Biological Network Motifs
Parallelized by Matt Kipps and Presented at HPCC 2015
• Biological network: protein-protein interaction.
• Network motifs: significantly and uniquely recurring sub-graphs.
• Motif search: sub-graph enumeration and statistical testing.
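The enumeration half of motif search can be illustrated with a brute-force sketch. It counts connected induced sub-graphs of a given size on a tiny hypothetical graph, using bitmasks for vertex sets. It is not the algorithm used in the presented work, and it omits the grouping of sub-graphs by shape and the statistical comparison against random networks.

```java
// Brute-force sketch: count connected induced sub-graphs of size k.
// Suitable only for tiny graphs (fewer than 31 vertices).
class SubgraphEnumerationSketch {
    // adj[v] is a bitmask of v's neighbors in an undirected graph.
    static boolean isConnected(int subset, int[] adj) {
        int start = Integer.lowestOneBit(subset);
        int reached = start;
        int frontier = start;
        while (frontier != 0) {
            int v = Integer.numberOfTrailingZeros(frontier);
            frontier &= frontier - 1;                // pop vertex v
            int fresh = adj[v] & subset & ~reached;  // unvisited neighbors inside the subset
            reached |= fresh;
            frontier |= fresh;
        }
        return reached == subset;
    }

    static int countConnectedSubgraphs(int[] adj, int k) {
        int n = adj.length;
        int count = 0;
        for (int subset = 1; subset < (1 << n); subset++) {
            if (Integer.bitCount(subset) == k && isConnected(subset, adj)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical graph: a square 0-1-2-3-0 plus the diagonal 0-2.
        int[] adj = {0b1110, 0b0101, 0b1011, 0b0101};
        System.out.println(countConnectedSubgraphs(adj, 3) + " connected 3-vertex sub-graphs");
    }
}
```

The number of candidate sub-graphs grows quickly with network and motif size, which is why enumeration is the part worth parallelizing.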
Parallelization of Network Programs
• Vertex-oriented approaches
  • Paradigm change from the original sequential algorithms
  • Examples
    • MapReduce
    • Pregel
    • GraphLab
• Flow-oriented approaches
  • Closer to the original sequential algorithms
  • Examples
    • Olden
    • MASS: our approach
MASS Agent-Based Parallelization
(in Java)
Figure: crawler agents traversing a network of nodes 1 to 9 that is distributed over ranks 0 to 2.
MASS Agent Execution Performance
• Agent explosion
  • 2365-node network with motif size 4
    • 400,000 agents
  • 400,000-node network with motif size 5
    • 5.5 million agents
• Needs to address
  • Memory allocation overheads
  • Agent management overheads
Performance Improvement Plans
for Simulation
Development of a MASS GPU Version
1. Parallel agent spawn and termination
   No use of heap space and no memory lock
2. Avoidance of agent collision
   No lock on memory space and no repetitive agent migration
3. UI to agents
   Micro agent tracking and intermittent space check-pointing
Figure: a timeline showing check-pointing of an entire simulation space and tracking of a given agent's footprint in memory.
Performance Improvement Plans
for Spatial Data Analysis
1. Supporting asynchronous agent migration
   Will mitigate synchronous master-worker communication overheads.
2. Pooling idle agents
   Will reduce memory and agent management overheads.
3. Restricting explosion of agent population
   Will reduce the heap space required.
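The idea behind pooling idle agents can be sketched as a generic object pool: finished agents are parked and reused rather than reallocated. The class and method names below are hypothetical and do not come from the MASS library; this illustrates the general technique, not the planned implementation.

```java
import java.util.ArrayDeque;

// Sketch of an agent pool: reuse idle agents instead of allocating new ones.
class AgentPoolSketch {
    // Hypothetical agent type; the real MASS agent class is not shown in the slides.
    static class CrawlerAgent {
        int position;
        void reset(int startPosition) { position = startPosition; }
    }

    private final ArrayDeque<CrawlerAgent> idle = new ArrayDeque<>();
    private int allocations = 0;

    // Reuses an idle agent when one exists; allocates only when the pool is empty.
    CrawlerAgent acquire(int startPosition) {
        CrawlerAgent agent = idle.poll();
        if (agent == null) {
            agent = new CrawlerAgent();
            allocations++;
        }
        agent.reset(startPosition);
        return agent;
    }

    // Parks a finished agent for later reuse.
    void release(CrawlerAgent agent) { idle.push(agent); }

    int allocations() { return allocations; }

    public static void main(String[] args) {
        AgentPoolSketch pool = new AgentPoolSketch();
        CrawlerAgent first = pool.acquire(1);
        pool.release(first);
        CrawlerAgent second = pool.acquire(7);  // reuses the released agent
        System.out.println("same object: " + (first == second)
                + ", allocations: " + pool.allocations());
    }
}
```

A pool trades a small amount of bookkeeping for fewer allocations, which matters when agent counts reach the hundreds of thousands.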
Our Current Group Members
• Parallel File I/O for MASS Java
• MASS Dev. Environment
• UWCA Performance Improvement
• UrbanSim Parallelization with MASS Java
• MASS Java Testing & Doc.
• Performance Improvement of MASS Java
• Testing and Performance Evaluation of MASS C++ on AWS
• MASS C++ Testing & Doc.
• Agent Collision Avoidance in MASS C++
Conclusions
• Programmability: We demonstrated intuitive parallelization using multi-agents.
• Execution performance:
  • Simulation: The MASS library should support CUDA for general-purpose, agent-based, real-time GPU computing.
  • Spatial data analysis: We need performance tune-ups: (1) asynchronous migration, (2) agent pooling, and (3) prevention of agent explosion.
• For more information, please visit:
  http://depts.washington.edu/dslab/MASS/
We'll release the MASS library to the public by March 2016 (hopefully).