Transcript document

Optimistic Parallel Discrete Event
Simulations of Physical Systems
using Reverse Computation
Yarong Tang
Kalyan S. Perumalla
Richard M. Fujimoto
Homa Karimabadi
Jonathan Driscoll
Yuri Omelchenko
College of Computing, Georgia Tech
SciberNet Inc.
06/01/2005
The Big Picture
Global Multi-Scale Kinetic
Simulations of the Earth's
Magnetosphere
Multi-physics; multi-scale
The goal: to understand how
solar wind interacts with the
Earth’s magnetosphere
(space weather)
People
PDES simulation: lead by Richard Fujimoto and Kalyan Perumalla
Compiler optimization: lead by Pande Santosh
Physics modeling: lead by Homa Karimabadi @ Scibernet Inc.
PADS 06/01/05
Georgia Institute of Technology
Outline
Motivation
Overview
The PIC model (Particle-in-Cell)
Implementation
Performance
Conclusion & Future work
PADS 06/01/05
Georgia Institute of Technology
Why optimistic simulation?
Synchronization in PDES: ensures LPs to
process events in time stamp order
Conservative approach
Avoids violation of local causality constraint
Inefficient for applications with little lookahead
(low “predictability”)
Plasma simulation highly dynamic, nearly-zero
lookahead in certain regions
Optimistic Approach
PADS 06/01/05
Georgia Institute of Technology
Why reverse computation?
Optimistic synchronization:
Allows causality errors to happen, but uses a
rollback mechanism to recover
State saving: saves state prior to computation; restores
saved state
Plasma simulation has a lots of events → lots of memory!
Event computation cost is small compared to state saving
overhead → significant overhead!
Reverse computation: rollback by performing the inverse
of the event computation
Reduces both memory and time needed for state saving
Can be automated
Excellent match for large, high-performance models
PADS 06/01/05
Georgia Institute of Technology
Outline
Motivation
Overview
The PIC model (Particle-in-Cell)
Implementation
Performance
Conclusion
Future work
PADS 06/01/05
Georgia Institute of Technology
Particle-in-Cell (PIC) Model
Charging a spacecraft due to a periodical beam
injection from its surface
Conceptually simple, yet sufficiently complex to
capture characteristics of plasma dynamics
What are we interested in the simulation?
Charge on spacecraft surface
Plasma dynamics
PADS 06/01/05
Georgia Institute of Technology
Challenges
Complicated interactions between physical
entities: cells (modeled by LPs) and particles
Cell1
Cell2
Cell1 sends a particle to cell2
Cell1 updates state and may “wake up” all its particles
Cell2 updates state and may “wake up” all its particles
PADS 06/01/05
Georgia Institute of Technology
Particle Movement Illustrated
Cell 1
Move
Cell 2
Add
Cell 3
Add
Add
Move
Move
Move
Add
Simulation Time
PADS 06/01/05
Georgia Institute of Technology
Challenges cont.d
Memory requirement
Particle state
Velocity, acceleration, position, direction, type,
exit_time, lastmove_time, etc.
Cell state
Center field, boundary fields, coordinates, particle queue
Q: How to minimize the number of auxiliary bits
for efficient reverse code?
PADS 06/01/05
Georgia Institute of Technology
Challenges cont.d
Floating point arithmetic throughout
simulation
Simple increment/decrement won’t work
Destructive operations are dominant
Particle’s exit_time is computed by solving a pair
of quadratic equations → not reversible
“Wakeup” event → recompute all particle states
Delete operations in particle queue
PADS 06/01/05
Georgia Institute of Technology
Our observations
“delete” in queue is the inverse of “insert”
and vise versa
Particles never “disappear”; one deletion in a cell
corresponds to an insertion in another cell
Partial particle state is carried over by msgs
No need to re-compute these state variables
Reversing physical states can resort to physics laws
No need for “brute-force” when reversing code
Application semantics is the magic for irreversible code
PADS 06/01/05
Georgia Institute of Technology
Implementation example
Shell::arrival( ParticleArrivalEvent *e ) {
if( this cell is active ) {
update cell state;
insert particle in cell;
} else if ( e is a beam particle ) {
activate cell;
}
}
Shell::departure( ParticleDepartureEvent *e ) {
if( particle bounced from right neighbor ){
bounce particle back; // no state change
} else {
update cell state;
}
}
Shell::inject( ParticleInjectEvent *e ) {
update cell state;
insert beam particles;
}
PADS 06/01/05
Shell::undo_arrival( ParticleArrivalEvent *e ) {
if ( cell was activated ) {
undo_activate cell;
} else if ( cell already active ) {
delete particle in cell;
undo_update cell state;
}
}
Shell::undo_departure( ParticleDepartureEvent *e ) {
if ( particle was bounced from right neighbor ) {
undo_bounce particle;
} else {
undo_update cell state;
}
}
Shell::undo_inject( ParticleInjectEvent *e ) {
delete beam particles;
undo_update cell state;
}
Georgia Institute of Technology
Reverse Computation Illustrated
Forward
AddParticleToPQ()
Q = Q'+q/c.w2
a = Q.q/m
v = v'+a.dt
x = x'+v.dt+½a.dt2
Cell 2
 x' 
 v' 
S'  
 a' 
 
Q '
?
Add
@ 10
Add
@ 20
Ion
E
Reverse
x' = x-v.dt-½a.dt2
v' = v-a.dt
Q' = Q-q/c.w2
a' = Q'.q/m
DelParticleFromPQ()
…
Move
x
v
S 
a
 
Q 
@ 30
Move
@ 40
E
E
dt
Cell 3 Move
@5
Ion
PADS 06/01/05
Simulation Time
Georgia Institute of Technology
Experiment
Simulation engine: μsik (by Kalyan
Perumalla)
General-purpose parallel/distributed sim engine
Supports multiple synchronization approaches
Hardware platform:
SMP machines running Red Hat Linux with a
customized 2.4.18-10smp kernel
8 PIII 550MHz Xeon processors sharing 4GB of
memory
PADS 06/01/05
Georgia Institute of Technology
Simulation configuration
[1] Karimabadi, H, et al. A New Asynchronous Methodology for Modeling of
Physical Systems: Breaking the Curse of Courant Condition. J. Computational
Physics, 2005
Normalized units
Simulate 7000 cells with 70 initially active cells
Spacecraft has a radius of 500 & cell has a width
of 0.24
Solar wind plasma initially loaded with uniformly
distributed electrons and protons (30 of each per
cell)
Injected positron beam has an energy of 10 keV
with a period of 4ms
PADS 06/01/05
Georgia Institute of Technology
Phase space comparison
Time-stepped
PADS 06/01/05
Sequential DES
Georgia Institute of Technology
Validation by phase space comparison
Sequential DES
PADS 06/01/05
PDES
Georgia Institute of Technology
PDES vs. sequential DES
PADS 06/01/05
Georgia Institute of Technology
Total Number of Committed Events
Events distribution
PADS 06/01/05
Georgia Institute of Technology
Conclusions
Being the First to apply reverse computation
to parallel simulations of physical systems
Propose application-level reverse computation
Encouraging results for 1D model, better
performance expected for models with a
higher dimension
PADS 06/01/05
Georgia Institute of Technology
Ongoing and future work
Performance comparison with checkpointing approaches
More complex applications on larger-scale
systems
Apply reverse computation to other
domains of physics applications involving
partial differential equations
PADS 06/01/05
Georgia Institute of Technology
Any questions?
Thank you!
PADS 06/01/05
Georgia Institute of Technology