Transcript: Ian Foster, Argonne, "Loosely Coupled", April 2008
From the Heroic to the Logistical:
Programming Model Implications of New Supercomputing Applications
Ian Foster
Computation Institute
Argonne National Laboratory & The University of Chicago
With thanks to: Miron Livny, Ioan Raicu, Mike Wilde, Yong Zhao, and many others.
What will we do with 1+ Exaflops and 1M+ cores?
1) Tackle Bigger and Bigger Problems
Computational scientist as hero
2) Tackle Increasingly Complex Problems
Computational scientist as logistics officer
“More Complex Problems”
Use ensemble runs to quantify climate model uncertainty
Identify potential drug targets by screening a database of ligand structures against target proteins
Study economic model sensitivity to key parameters
Analyze turbulence dataset from multiple perspectives
Perform numerical optimization to determine optimal resource assignment in energy problems
Mine collection of data from advanced light sources
Construct databases of computed properties of chemical compounds
Analyze data from the Large Hadron Collider
Analyze log data from 100,000-node parallel computations
Programming Model Issues
Massive task parallelism
Massive data parallelism
Integrating black box applications
Complex task dependencies (task graphs)
Failure, and other execution management issues
Data management: input, intermediate, output
Dynamic task graphs
Dynamic data access involving large amounts of data
Long-running computations
Documenting provenance of data products
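To make the first few issues concrete, here is a minimal Python sketch (not from the talk) of massive task parallelism with simple failure management: independent black-box tasks are farmed out to a process pool, and failed tasks are resubmitted a bounded number of times. run_task, PARAMS, and MAX_RETRIES are all hypothetical stand-ins.

from concurrent.futures import ProcessPoolExecutor, as_completed

def run_task(param):
    # Stand-in for one black-box application invocation.
    return param * param

PARAMS = list(range(1000))   # hypothetical parameter sweep
MAX_RETRIES = 3

def run_all(params):
    results = {}
    attempts = {p: 0 for p in params}
    pending = set(params)
    with ProcessPoolExecutor() as pool:
        while pending:
            futures = {pool.submit(run_task, p): p for p in pending}
            pending = set()
            for fut in as_completed(futures):
                p = futures[fut]
                try:
                    results[p] = fut.result()
                except Exception:
                    attempts[p] += 1
                    if attempts[p] < MAX_RETRIES:
                        pending.add(p)   # resubmit the failed task
    return results

if __name__ == "__main__":
    print(len(run_all(PARAMS)), "tasks completed")

Systems such as Falkon and DAGMan address the same concerns at far larger scales, adding scheduling, data staging, and provenance on top of this basic pattern.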
Problem Types
[Chart: problem space organized by input data size (low, medium, high) vs. number of tasks (1 to 1M).
Low data, few tasks: "heroic" MPI applications.
High data, few tasks: data analysis and mining.
Low data, many tasks: many loosely coupled tasks.
High data, many tasks: much data and complex tasks.]
An Incomplete and Simplistic View of Programming Models and Tools
Single task, modest data: MPI, etc., etc., etc.
Many tasks: DAGMan+Pegasus, Karajan+Swift
Much data: MapReduce/Hadoop, Dryad
Complex tasks, much data: Dryad, Pig, Sawzall, Swift+Falkon
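For the "many tasks" row above, a toy illustration of the task-graph idea that DAGMan+Pegasus and Karajan+Swift embody, using only the Python standard library; the task names and dependency graph are invented.

from graphlib import TopologicalSorter   # Python 3.9+

def run(task):
    # Stand-in for invoking one application stage.
    print("running", task)

# Each task maps to the set of tasks it depends on (all names hypothetical).
deps = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "analyze":    {"simulate_a", "simulate_b"},
}

for task in TopologicalSorter(deps).static_order():
    run(task)

In a real runner the independent branches (simulate_a, simulate_b) would execute concurrently; static_order() simply shows one valid execution order.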
Many Tasks:
Climate Ensemble Simulations (using FOAM, 2005)
Image courtesy Pat Behling and Yun Liu, UW Madison
NCAR computer + grad student: 160 ensemble members in 75 days
TeraGrid + “Virtual Data System”: 250 ensemble members in 4 days
Many Many Tasks:
Identifying Potential Drug Targets
Protein target(s) x 2M+ ligands
(Mike Kubal, Benoit Roux, and others)
[Workflow diagram: virtual screening pipeline.
Inputs: ZINC 3-D structures (2M structures, ~6 GB); PDB protein descriptions (1 protein, ~1 MB); manually prepared DOCK6 and FRED receptor files (1 per protein, each defining the pocket to bind to); NAB script template and parameters (defining flexible residues and number of MD steps), with scripts generated by BuildNABScript.
Stage 1, FRED (or DOCK6) docking: ~4M runs x 60 sec x 1 CPU, ~60K CPU-hours; select best ~5K.
Stage 2, Amber prep and score (1. AmberizeLigand, 2. AmberizeReceptor, 3. AmberizeComplex, 4. perl: gen nabscript, 5. RunNABScript): ~10K runs x 20 min x 1 CPU, ~3K CPU-hours; select best ~500.
Stage 3, GCMC: ~500 runs x 10 hr x 100 CPUs, ~500K CPU-hours.
Output: report of ligands and complexes.]
In total: 4 million tasks, 500K CPU-hours
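A hedged Python sketch of the staged filtering pattern in the diagram above: score every protein-ligand pair, keep the best ~5K, rescore those, and keep the best ~500 for the expensive final stage. dock_score and rescore are random-number stand-ins for the real FRED/DOCK6 and Amber codes, and the ligand list here is tiny compared with the 2M+ ZINC structures.

import random

def dock_score(protein, ligand):
    return random.random()          # placeholder for a FRED/DOCK6 docking run

def rescore(protein, ligand):
    return random.random()          # placeholder for an Amber rescoring run

proteins = ["target_protein"]                         # hypothetical receptor
ligands = [f"ligand_{i:07d}" for i in range(20000)]   # stand-in for ZINC ids

pairs = [(p, l) for p in proteins for l in ligands]
best_5k = sorted(pairs, key=lambda pl: dock_score(*pl))[:5000]   # stage 1 filter
best_500 = sorted(best_5k, key=lambda pl: rescore(*pl))[:500]    # stage 2 filter
print(len(best_500), "candidates forwarded to the GCMC stage")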
DOCK on SiCortex
CPU cores: 5,760
Tasks: 92,160
Elapsed time: 12,821 sec (does not include ~800 sec to stage input data)
Compute time: 1.94 CPU-years
Average task time: 660.3 sec
Ioan Raicu, Zhao Zhang
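The compute-time figure follows directly from the task counts above; a quick back-of-envelope check in Python, using only the values on the slide:

tasks, avg_task_s = 92_160, 660.3
cores, elapsed_s = 5_760, 12_821

cpu_seconds = tasks * avg_task_s
print(cpu_seconds / (3600 * 24 * 365))      # ~1.93 CPU-years, matching "1.94"
print(cpu_seconds / (cores * elapsed_s))    # ~0.82 average core utilization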
MARS Economic Model Parameter Study
[Chart: idle CPUs, busy CPUs, wait queue length, and completed micro-tasks over time. Left axis: CPU cores (0 to ~1,600); right axis: completed micro-tasks (0 to 8,000,000); x-axis: time (0 to 1,440 sec).]
Mike Wilde, Zhao Zhang
Micro-Tasks
2,048 BG/P CPU cores
Tasks: 49,152
Micro-tasks: 7,077,888
Elapsed time: 1,601 secs
CPU hours: 894
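The numbers above imply a fixed bundling factor of micro-tasks into tasks; a quick check in Python on the slide's values:

micro_tasks, tasks = 7_077_888, 49_152
cores, elapsed_s, cpu_hours = 2_048, 1_601, 894

print(micro_tasks / tasks)                      # 144 micro-tasks bundled per task
print(cpu_hours * 3600 / (cores * elapsed_s))   # ~0.98 core utilization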
Montage in MPI and Swift
[Chart: execution time (sec) per Montage component (mProject, mDiff/Fit, mBackground, mAdd(sub), mAdd, total) for GRAM/Clustering, MPI, and Falkon runs.]
MPI: ~950 lines of C for one stage
Pegasus: ~1200 lines of C + tools to generate DAG for a specific dataset
SwiftScript: ~92 lines for any dataset
Montage: B. Berriman, J. Good (Caltech); J. Jacob, D. Katz (JPL)
(Yong Zhao, Ioan Raicu, U.Chicago)
MapReduce/Hadoop
[Diagram: Hadoop DFS architecture. A namenode holds metadata (name, replicas, ...; e.g. /home/sameerp/foo, 3, ...; /home/sameerp/docs, 4, ...); clients issue metadata ops to the namenode and perform I/O directly against datanodes spread across racks (Rack 1, Rack 2).]
[Chart: word-count time (sec, log scale) for Swift+PBS vs. Hadoop at data sizes of 75 MB, 350 MB, and 703 MB; reported times range from 221 to 7,860 sec.]
ALCF: 80 TB memory, 8 PB disk, 78 GB/s I/O bandwidth
Soner Balkir, Jing Tie, Quan Pham
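For readers unfamiliar with the model, here is a self-contained word-count in the MapReduce style, written against no framework at all; real Hadoop jobs use the Hadoop API (or streaming), so this only illustrates the map, shuffle/sort, and reduce phases.

from itertools import groupby

def map_phase(doc_id, text):
    # Map: emit (word, 1) for every word in the document.
    for word in text.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    # Reduce: sum the counts for one word.
    yield word, sum(counts)

def mapreduce(docs):
    intermediate = []
    for doc_id, text in docs.items():
        intermediate.extend(map_phase(doc_id, text))
    intermediate.sort(key=lambda kv: kv[0])            # the shuffle/sort phase
    results = {}
    for word, group in groupby(intermediate, key=lambda kv: kv[0]):
        for w, total in reduce_phase(word, (count for _, count in group)):
            results[w] = total
    return results

print(mapreduce({"doc1": "to be or not to be"}))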
Extreme Scale Debugging:
Stack Trace Sampling Tool (STAT)
[Chart: cost per sample on BlueGene/L with 131,072 processes; latency (0 to 2.5 sec) vs. number of application tasks (0 to 140,000), for 1-deep, 2-deep, and 3-deep stack traces in VN mode.]
Bart Miller, Wisconsin
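The core idea behind STAT is that one sampled stack trace per process can be merged into a single prefix tree, so even 131,072 traces collapse into a compact picture of where the application is. A hedged sketch of that merge step in Python; the frame names and traces are invented, not real STAT output or its API.

from collections import defaultdict

def make_node():
    # Each node counts how many processes passed through this call path.
    return {"count": 0, "children": defaultdict(make_node)}

def merge(root, trace):
    node = root
    for frame in trace:                 # outermost frame first
        node = node["children"][frame]
        node["count"] += 1

def show(node, depth=0):
    for frame, child in sorted(node["children"].items()):
        print("  " * depth + f"{frame}: {child['count']}")
        show(child, depth + 1)

root = make_node()
samples = [                             # one invented trace per "process"
    ["main", "mpi_wait"],
    ["main", "mpi_wait"],
    ["main", "compute", "dgemm"],
]
for trace in samples:
    merge(root, trace)
show(root)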
Summary
Peta- and exa-scale computers enable us to tackle new types of problems at far greater scales than before
– Parameter studies, ensembles, interactive data analysis, “workflows” of various kinds
– Potentially an important source of new applications
Such apps frequently stress petascale hardware and software in interesting ways
New programming models and tools are required
– Mixed task and data parallelism, management of many tasks, complex data management, failure, …
– Tools for such problems (DAGMan, Swift, Hadoop, …) exist but need refinement
Interesting connections to the distributed systems community
More info: www.ci.uchicago.edu/swift
Amiga MARS – Swift+Falkon
1,024 tasks (147,456 micro-tasks)
256 CPU cores