Transcript: Ian Foster, Argonne, "Loosely Coupled", April 2008
From the Heroic to the Logistical:
Programming Model Implications of New Supercomputing Applications
Ian Foster
Computation Institute
Argonne National Laboratory & The University of Chicago
With thanks to: Miron Livny, Ioan Raicu, Mike Wilde, Yong Zhao, and many others.
What will we do with 1+ Exaflops and 1M+ cores?
1) Tackle Bigger and Bigger Problems
Computational scientist as hero
2) Tackle Increasingly Complex Problems
Computational scientist as logistics officer
“More Complex Problems”
Use ensemble runs to quantify climate model uncertainty
Identify potential drug targets by screening a database of ligand structures against target proteins
Study economic model sensitivity to key parameters
Analyze turbulence dataset from multiple perspectives
Perform numerical optimization to determine optimal resource assignment in energy problems
Mine collection of data from advanced light sources
Construct databases of computed properties of chemical compounds
Analyze data from the Large Hadron Collider
Analyze log data from 100,000-node parallel computations
Programming Model Issues
Massive task parallelism
Massive data parallelism
Integrating black box applications
Complex task dependencies (task graphs)
Failure, and other execution management issues
Data management: input, intermediate, output
Dynamic task graphs
Dynamic data access involving large amounts of data
Long-running computations
Documenting provenance of data products
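To make the first few issues concrete, here is a minimal Python sketch (not from the talk) of massive task parallelism with simple failure management: independent black-box tasks are farmed out to a process pool, and failed tasks are resubmitted a bounded number of times. run_task, PARAMS, and MAX_RETRIES are all hypothetical stand-ins.

from concurrent.futures import ProcessPoolExecutor, as_completed

def run_task(param):
    # Stand-in for one black-box application invocation.
    return param * param

PARAMS = list(range(1000))   # hypothetical parameter sweep
MAX_RETRIES = 3

def run_all(params):
    results = {}
    attempts = {p: 0 for p in params}
    pending = set(params)
    with ProcessPoolExecutor() as pool:
        while pending:
            futures = {pool.submit(run_task, p): p for p in pending}
            pending = set()
            for fut in as_completed(futures):
                p = futures[fut]
                try:
                    results[p] = fut.result()
                except Exception:
                    attempts[p] += 1
                    if attempts[p] < MAX_RETRIES:
                        pending.add(p)   # resubmit the failed task
    return results

if __name__ == "__main__":
    print(len(run_all(PARAMS)), "tasks completed")

Systems such as Falkon and DAGMan address the same concerns at far larger scales, adding scheduling, data staging, and provenance on top of this basic pattern.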
Problem Types
[Chart: problem space organized by input data size (low, medium, high) vs. number of tasks (1 to 1M).
Low data, few tasks: "heroic" MPI applications.
High data, few tasks: data analysis and mining.
Low data, many tasks: many loosely coupled tasks.
High data, many tasks: much data and complex tasks.]
An Incomplete and Simplistic View of Programming Models and Tools
Single task, modest data: MPI, etc., etc., etc.
Many tasks: DAGMan+Pegasus, Karajan+Swift
Much data: MapReduce/Hadoop, Dryad
Complex tasks, much data: Dryad, Pig, Sawzall, Swift+Falkon
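For the "many tasks" row above, a toy illustration of the task-graph idea that DAGMan+Pegasus and Karajan+Swift embody, using only the Python standard library; the task names and dependency graph are invented.

from graphlib import TopologicalSorter   # Python 3.9+

def run(task):
    # Stand-in for invoking one application stage.
    print("running", task)

# Each task maps to the set of tasks it depends on (all names hypothetical).
deps = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "analyze":    {"simulate_a", "simulate_b"},
}

for task in TopologicalSorter(deps).static_order():
    run(task)

In a real runner the independent branches (simulate_a, simulate_b) would execute concurrently; static_order() simply shows one valid execution order.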
Many Tasks:
Climate Ensemble Simulations (using FOAM, 2005)
Image courtesy Pat Behling and Yun Liu, UW Madison
NCAR computer + grad student: 160 ensemble members in 75 days
TeraGrid + “Virtual Data System”: 250 ensemble members in 4 days
Many Many Tasks:
Identifying Potential Drug Targets
Protein target(s) x 2M+ ligands
(Mike Kubal, Benoit Roux, and others)
[Workflow diagram: virtual screening pipeline.
Inputs: ZINC 3-D structures (2M structures, ~6 GB); PDB protein descriptions (1 protein, ~1 MB); manually prepared DOCK6 and FRED receptor files (1 per protein, each defining the pocket to bind to); NAB script template and parameters (defining flexible residues and number of MD steps), with scripts generated by BuildNABScript.
Stage 1, FRED (or DOCK6) docking: ~4M runs x 60 sec x 1 CPU, ~60K CPU-hours; select best ~5K.
Stage 2, Amber prep and score (1. AmberizeLigand, 2. AmberizeReceptor, 3. AmberizeComplex, 4. perl: gen nabscript, 5. RunNABScript): ~10K runs x 20 min x 1 CPU, ~3K CPU-hours; select best ~500.
Stage 3, GCMC: ~500 runs x 10 hr x 100 CPUs, ~500K CPU-hours.
Output: report of ligands and complexes.]
In total: 4 million tasks, 500K CPU-hours
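A hedged Python sketch of the staged filtering pattern in the diagram above: score every protein-ligand pair, keep the best ~5K, rescore those, and keep the best ~500 for the expensive final stage. dock_score and rescore are random-number stand-ins for the real FRED/DOCK6 and Amber codes, and the ligand list here is tiny compared with the 2M+ ZINC structures.

import random

def dock_score(protein, ligand):
    return random.random()          # placeholder for a FRED/DOCK6 docking run

def rescore(protein, ligand):
    return random.random()          # placeholder for an Amber rescoring run

proteins = ["target_protein"]                         # hypothetical receptor
ligands = [f"ligand_{i:07d}" for i in range(20000)]   # stand-in for ZINC ids

pairs = [(p, l) for p in proteins for l in ligands]
best_5k = sorted(pairs, key=lambda pl: dock_score(*pl))[:5000]   # stage 1 filter
best_500 = sorted(best_5k, key=lambda pl: rescore(*pl))[:500]    # stage 2 filter
print(len(best_500), "candidates forwarded to the GCMC stage")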
DOCK on SiCortex
CPU cores: 5,760
Tasks: 92,160
Elapsed time: 12,821 sec (does not include ~800 sec to stage input data)
Compute time: 1.94 CPU-years
Average task time: 660.3 sec
Ioan Raicu, Zhao Zhang
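The compute-time figure follows directly from the task counts above; a quick back-of-envelope check in Python, using only the values on the slide:

tasks, avg_task_s = 92_160, 660.3
cores, elapsed_s = 5_760, 12_821

cpu_seconds = tasks * avg_task_s
print(cpu_seconds / (3600 * 24 * 365))      # ~1.93 CPU-years, matching "1.94"
print(cpu_seconds / (cores * elapsed_s))    # ~0.82 average core utilization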
MARS Economic Model Parameter Study
[Chart: idle CPUs, busy CPUs, wait queue length, and completed micro-tasks over time. Left axis: CPU cores (0 to ~1,600); right axis: completed micro-tasks (0 to 8,000,000); x-axis: time (0 to 1,440 sec).]
Mike Wilde, Zhao Zhang
Micro-Tasks
2,048 BG/P CPU cores
Tasks: 49,152
Micro-tasks: 7,077,888
Elapsed time: 1,601 secs
CPU hours: 894
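The numbers above imply a fixed bundling factor of micro-tasks into tasks; a quick check in Python on the slide's values:

micro_tasks, tasks = 7_077_888, 49_152
cores, elapsed_s, cpu_hours = 2_048, 1_601, 894

print(micro_tasks / tasks)                      # 144 micro-tasks bundled per task
print(cpu_hours * 3600 / (cores * elapsed_s))   # ~0.98 core utilization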
Montage in MPI and Swift
[Chart: execution time (sec) per Montage component (mProject, mDiff/Fit, mBackground, mAdd(sub), mAdd, total) for GRAM/Clustering, MPI, and Falkon runs.]
MPI: ~950 lines of C for one stage
Pegasus: ~1200 lines of C + tools to generate DAG for a specific dataset
SwiftScript: ~92 lines for any dataset
Montage: B. Berriman, J. Good (Caltech); J. Jacob, D. Katz (JPL)
(Yong Zhao, Ioan Raicu, U.Chicago)
MapReduce/Hadoop
[Diagram: Hadoop DFS architecture. A namenode holds metadata (name, replicas, ...; e.g. /home/sameerp/foo, 3, ...; /home/sameerp/docs, 4, ...); clients issue metadata ops to the namenode and perform I/O directly against datanodes spread across racks (Rack 1, Rack 2).]
[Chart: word-count time (sec, log scale) for Swift+PBS vs. Hadoop at data sizes of 75 MB, 350 MB, and 703 MB; reported times range from 221 to 7,860 sec.]
ALCF: 80 TB memory, 8 PB disk, 78 GB/s I/O bandwidth
Soner Balkir, Jing Tie, Quan Pham
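For readers unfamiliar with the model, here is a self-contained word-count in the MapReduce style, written against no framework at all; real Hadoop jobs use the Hadoop API (or streaming), so this only illustrates the map, shuffle/sort, and reduce phases.

from itertools import groupby

def map_phase(doc_id, text):
    # Map: emit (word, 1) for every word in the document.
    for word in text.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    # Reduce: sum the counts for one word.
    yield word, sum(counts)

def mapreduce(docs):
    intermediate = []
    for doc_id, text in docs.items():
        intermediate.extend(map_phase(doc_id, text))
    intermediate.sort(key=lambda kv: kv[0])            # the shuffle/sort phase
    results = {}
    for word, group in groupby(intermediate, key=lambda kv: kv[0]):
        for w, total in reduce_phase(word, (count for _, count in group)):
            results[w] = total
    return results

print(mapreduce({"doc1": "to be or not to be"}))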
Extreme Scale Debugging:
Stack Trace Sampling Tool (STAT)
[Chart: cost per sample on BlueGene/L with 131,072 processes; latency (0 to 2.5 sec) vs. number of application tasks (0 to 140,000), for 1-deep, 2-deep, and 3-deep stack traces in VN mode.]
Bart Miller, Wisconsin
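The core idea behind STAT is that one sampled stack trace per process can be merged into a single prefix tree, so even 131,072 traces collapse into a compact picture of where the application is. A hedged sketch of that merge step in Python; the frame names and traces are invented, not real STAT output or its API.

from collections import defaultdict

def make_node():
    # Each node counts how many processes passed through this call path.
    return {"count": 0, "children": defaultdict(make_node)}

def merge(root, trace):
    node = root
    for frame in trace:                 # outermost frame first
        node = node["children"][frame]
        node["count"] += 1

def show(node, depth=0):
    for frame, child in sorted(node["children"].items()):
        print("  " * depth + f"{frame}: {child['count']}")
        show(child, depth + 1)

root = make_node()
samples = [                             # one invented trace per "process"
    ["main", "mpi_wait"],
    ["main", "mpi_wait"],
    ["main", "compute", "dgemm"],
]
for trace in samples:
    merge(root, trace)
show(root)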
Summary
Peta- and exa-scale computers enable us to tackle new types of problems at far greater scales than before
– Parameter studies, ensembles, interactive data analysis, “workflows” of various kinds
– Potentially an important source of new applications
Such apps frequently stress petascale hardware and software in interesting ways
New programming models and tools are required
– Mixed task and data parallelism, management of many tasks, complex data management, failure, …
– Tools for such problems (DAGMan, Swift, Hadoop, …) exist but need refinement
Interesting connections to the distributed systems community
More info: www.ci.uchicago.edu/swift
Amiga MARS – Swift+Falkon
1,024 tasks (147,456 micro-tasks)
256 CPU cores