Transcript PPT2

MFE Simulation Data Management
SLAC DMW 2004
March 16, 2004
W. W. Lee and S. Klasky
Princeton Plasma Physics Laboratory
Princeton, NJ
Spatial & Temporal Scales Present Major Challenge to Theory & Simulations
• Huge range of spatial and temporal scales.
• Overlap in scales often means strong (simplified) ordering not possible.
• Different codes/theory for different scales.
• 5+ years: integration of physics into Fusion Simulation Project.
[Figure: spatial scales (m), ~10^-6 to 10^0 -- Debye length, electron gyroradius, ion gyroradius, skin depth, tearing length, electron-ion mfp, atomic mfp, system size; temporal scales (s), ~10^-10 to 10^5 -- inverse electron plasma frequency, inverse ion plasma frequency, electron gyroperiod, ion gyroperiod, electron collision, ion collision, current diffusion, confinement, pulse length.]
Major Fusion Codes
Data Rates of Major Fusion Codes

Code         Data (GB)         Runtime (hr)   Processors    Mb/s
             now / 5yr         now / 5yr      now / 5yr     now / 5yr
GTC          4,000 / 100,000   300 / 150      2048          80 / 1600
Gyro         10 / 100          30 / 30        512 / 2048    0.8 / 8
GS2          10 / 100          30 / 30        512 / 2048    0.8 / 8
Degas2       0.1               1              10            0.2
Transp       0.05              3              1             0.04
Nimrod       5 / 50            20 / 20        128           0.6 / 6
M3D          10 / 100          20 / 20        128           1.1 / 11
NSTX         0.25/shot         0.25 * 40                    1 / 4
Total (TB)   4.3 / 101                                      9, 36
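As a rough check on the Mb/s column (assuming it is simply average output volume divided by runtime; not every entry, e.g. GTC's current 80 Mb/s, matches this simple ratio), the 5-year Gyro entry follows from

$\dfrac{100\,\mathrm{GB}\times 8\,\mathrm{bit/B}}{30\,\mathrm{hr}\times 3600\,\mathrm{s/hr}} \approx 7.4\ \mathrm{Mb/s} \approx 8\ \mathrm{Mb/s}$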
Plasma Turbulence Simulation
• Gyrokinetic Particle-In-Cell Simulation
-- Reduced Vlasov-Maxwell Equations (a generic PIC-cycle sketch follows this list)
• Simulations on MPP Platforms
-- Cray T3E & IBM SP (NERSC), Cray-X1 (ORNL),
SX6 (Earth Simulator, Japan)
• Simulation of Burning Plasmas
-- International Thermonuclear Experimental Reactor (ITER)
• Integrated Fusion Simulation Project (MFE)
• Visualization -- turbulence evolution & particle orbits
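Since everything upstream of the data problem is particle-in-cell output, here is a minimal sketch of the generic PIC cycle (scatter, field solve, gather, push). This is a 1D electrostatic toy in normalized units, not GTC's gyrokinetic scheme; the grid size, particle count, box length, and time step are arbitrary assumptions.

```python
import numpy as np

# Toy 1D electrostatic PIC cycle in normalized units -- an illustration of the
# scatter / field-solve / gather / push loop only, NOT GTC's gyrokinetic scheme.
# Grid size, particle count, box length, and time step are arbitrary assumptions.
ng, npart, L, dt = 64, 10000, 1.0, 0.05
dx = L / ng
rng = np.random.default_rng(0)
x = rng.uniform(0.0, L, npart)            # electron positions
v = rng.normal(0.0, 0.1, npart)           # electron velocities

for step in range(100):
    # 1) Scatter: deposit charge on the grid (nearest grid point), subtract
    #    a uniform neutralizing ion background so the mean density is zero.
    idx = (x / dx).astype(int) % ng
    rho = np.bincount(idx, minlength=ng) * (ng / npart) - 1.0
    # 2) Field solve: Poisson equation in Fourier space, then E = -dphi/dx.
    k = 2.0 * np.pi * np.fft.fftfreq(ng, d=dx)
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    E = np.fft.ifft(-1j * k * phi_k).real
    # 3) Gather: interpolate the grid field back to particle positions.
    E_p = E[idx]
    # 4) Push: advance velocities then positions, periodic boundaries (charge = -1).
    v -= E_p * dt
    x = (x + v * dt) % L
```

Each time step of such a loop produces the particle and field arrays whose volume drives the data rates in the table above.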
Gyrokinetic Approximation
• Gyromotion
• Polarization provides quasineutrality
[W. W. Lee, PF ‘83; JCP ‘87]
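The "polarization provides quasineutrality" point is usually expressed through the gyrokinetic Poisson equation. The schematic form below is written from memory in the spirit of the cited Lee papers (not transcribed from the slide), with $\tau = T_e/T_i$, $\lambda_D$ the electron Debye length, $\tilde{\phi}$ the doubly gyro-averaged potential, and $\bar{n}_i$ the gyro-averaged ion guiding-center density:

$\nabla^2\phi \;-\; \frac{\tau}{\lambda_D^{2}}\,(\phi - \tilde{\phi}) \;=\; -4\pi e\,(\bar{n}_i - n_e)$

The second term on the left is the ion polarization response, which dominates Debye shielding on gyroradius scales.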
[Figure: GTC performance on the Earth Simulator, 18% (Ethier)]
Ion Temperature Gradient Driven Turbulence
[Movie: Electrostatic Potential]
[Movie: Particle Trajectories]
Data Management challenges
• GTC is producing TBs of data
– Data rates: 80 Mb/s now, 1.6 Gb/s in 5 years.
– Need QoS to stream the data.
• This data needs to be post-processed
– Essential to parallelize the post-processing routines to handle
our larger datasets.
– We need a cluster to post-process this data.
• M (supercomputer processors) x N (cluster processors) problem.
• QoS becomes more important to sustain this post-processing.
• The post-processed data needs to be shared among
collaborators
– Different sections of the post-processed data may go to different users.
– Post-processed data, along with other metadata, should be archived into a relational database.
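One minimal way such an archive could be laid out is a small relational index over the reduced datasets. The schema, table names, and file paths below are hypothetical illustrations (here using SQLite), not an existing PPPL/GTC convention.

```python
import sqlite3

# Hypothetical relational archive for post-processed GTC output plus metadata.
# Table names, fields, and paths are illustrative assumptions only.
conn = sqlite3.connect("gtc_postprocessed.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS runs (
        run_id      INTEGER PRIMARY KEY,
        code        TEXT,       -- e.g. 'GTC'
        start_time  TEXT,
        processors  INTEGER,
        notes       TEXT
    )""")
conn.execute("""
    CREATE TABLE IF NOT EXISTS field_snapshots (
        run_id      INTEGER REFERENCES runs(run_id),
        step        INTEGER,
        quantity    TEXT,       -- e.g. 'electrostatic_potential'
        file_path   TEXT,       -- reduced dataset stored on disk, indexed here
        checksum    TEXT
    )""")

# Register one run and one reduced field snapshot.
cur = conn.execute(
    "INSERT INTO runs (code, start_time, processors, notes) VALUES (?, ?, ?, ?)",
    ("GTC", "2004-03-16T00:00:00", 2048, "example entry"))
run_id = cur.lastrowid
conn.execute(
    "INSERT INTO field_snapshots VALUES (?, ?, ?, ?, ?)",
    (run_id, 100, "electrostatic_potential", "/data/gtc/run1/phi_0100.h5", "md5:..."))
conn.commit()
conn.close()
```

Keeping only metadata and file references in the database, with the bulk arrays on disk or in the archive, keeps queries cheap while still letting collaborators locate the sections of data they need.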
Post-processing of GTC Data
• Particle Data
– No compression possible.
– Sent to 1 cluster for visualization/analysis.
– Work being done with K. Ma, U.C. Davis: Visualize a million
particles.
– Gain new insights into the theory.
• Field Data
– Geometric/temporal compression of the data is possible (see the sketch after this list).
– Data needs to be streamed to a local cluster at PPPL.
– Reduced subset needs to be sent to PPPL + collaborators.
• Use Logistical Networking. [Beck, UT-K]
• Data transfer needs to be automatic, and integrated into a
dataflow/webflow for use with parallel analysis routines.
– We want to see post-processed data during the simulation.
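Read literally, "geometric/temporal compression" can be as simple as strided subsampling of the field arrays before they are streamed. The array shape and reduction factors below are arbitrary assumptions for illustration, not GTC's actual output layout.

```python
import numpy as np

# Illustrative reduction of a field time series before streaming to PPPL.
# Assumed layout (not GTC's real one): phi[t, x, y] on a regular grid, keeping
# every 4th time step and every 2nd grid point for the reduced subset.
def reduce_field(phi, t_stride=4, s_stride=2):
    return np.ascontiguousarray(phi[::t_stride, ::s_stride, ::s_stride])

phi = np.random.rand(400, 256, 256).astype(np.float32)   # stand-in field data
reduced = reduce_field(phi)
print(phi.nbytes / 1e6, "MB ->", reduced.nbytes / 1e6, "MB")  # roughly 16x smaller
```

The full-resolution data would stay on the local cluster, while only the reduced subset travels to collaborators over the wide-area network.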
After the analysis
• Post-processed data needs to be saved
into a relational database
– How do we query this abstract data to
compare it with experiments?
– 3D correlation functions (a sketch of one such diagnostic follows this list).
– Processing of TBs of data/run now, 100s of TBs of data/run in 5 years.
– Data mining techniques will be necessary to
understand this data.
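As an example of the kind of derived quantity that would end up in such a database, here is a minimal 3D autocorrelation-function sketch. It assumes periodic boundaries and uses the Wiener-Khinchin (FFT) route; the random array is a stand-in for a GTC field snapshot, and real post-processing would work through much larger, chunked datasets.

```python
import numpy as np

# Two-point autocorrelation of a 3D field via FFT (Wiener-Khinchin theorem),
# assuming periodic boundaries.  The input array stands in for a GTC snapshot.
def autocorrelation_3d(field):
    f = field - field.mean()
    F = np.fft.fftn(f)
    corr = np.fft.ifftn(F * np.conj(F)).real
    return corr / corr.flat[0]          # normalize so C(0, 0, 0) = 1

phi = np.random.rand(64, 64, 64)        # stand-in field data
C = autocorrelation_3d(phi)
print(C[0, 0, 0], C[1, 0, 0])           # correlation at zero lag and one-cell lag
```

Storing reduced diagnostics like this, rather than only raw particle and field dumps, is what makes comparison with experiments and large-scale data mining tractable.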