Transcript Document: Paving the Road to Exascales with Many-Task Computing

[Figure: the MTC design space, plotting input data size (Low, Med, Hi) against number of tasks (1, 1K, 1M): HPC (heroic MPI tasks), HTC/MTC (many loosely coupled tasks), MapReduce/MTC (data analysis, mining), and MTC (big data and many tasks)]
• Bridge the gap between HPC and HTC
• Applications structured as DAGs
• Data dependencies will be files that are written to and read from a file system
• Loosely coupled apps with HPC orientations
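A minimal sketch of the app structure described above: a DAG whose edges are files, where a task becomes runnable once every file it reads exists. The Task class, file names, and scheduler loop are illustrative assumptions, not Swift or Falkon code.

import os
import tempfile
from pathlib import Path

class Task:
    # A task reads some files and produces others; the edges of the
    # DAG are the files themselves, as in the bullet above.
    def __init__(self, name, reads, run):
        self.name, self.reads, self.run = name, reads, run

def execute_dag(tasks):
    # Repeatedly run any task whose input files all exist on disk;
    # assumes the graph is acyclic, so some task is always ready.
    pending = list(tasks)
    while pending:
        for task in list(pending):
            if all(os.path.exists(f) for f in task.reads):
                task.run()
                pending.remove(task)

workdir = Path(tempfile.mkdtemp())
a_out, b_out = workdir / "a.out", workdir / "b.out"

tasks = [
    Task("B", reads=[a_out],
         run=lambda: b_out.write_text(a_out.read_text().upper())),
    Task("A", reads=[],
         run=lambda: a_out.write_text("data")),
]
execute_dag(tasks)  # A runs first, then B, regardless of list order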
• Falkon: Fast and Lightweight Task Execution Framework
  http://datasys.cs.iit.edu/projects/Falkon/index.html
• Swift: Parallel Programming System
  http://www.ci.uchicago.edu/swift/index.php
• Load balancing: the technique of distributing computational and communication loads evenly across the processors of a parallel machine, or across the nodes of a supercomputer
• Different scheduling strategies
– Centralized scheduling: poor scalability (Falkon, Slurm, Cobalt)
– Hierarchical scheduling: moderate scalability (Falkon, Charm++)
– Distributed scheduling: possible approach to exascales (Charm++)
• Work Stealing: a distributed load balancing strategy
– Starved processors steal tasks from overloaded ones
– Various parameters affect performance:
• Number of tasks to steal (half)
• Number of neighbors (square root of the total number of nodes)
• Static or Dynamic random neighbors (Dynamic random neighbors)
• Stealing poll interval (exponential back-off)
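A minimal sketch of this work-stealing loop with the parameter choices above (steal half the victim's queue, square root of N dynamic random neighbors, exponential back-off on failed steals). Names such as probe_load and steal_from are hypothetical stand-ins for the messaging layer, not MATRIX's actual interfaces.

import math
import random
import time

class WorkStealer:
    def __init__(self, node_id, all_nodes, initial_interval=0.001):
        self.node_id = node_id
        self.all_nodes = all_nodes
        # Number of neighbors: square root of the total node count.
        self.num_neighbors = max(1, int(math.sqrt(len(all_nodes))))
        self.initial_interval = initial_interval
        self.poll_interval = initial_interval

    def pick_neighbors(self):
        # Dynamic random neighbors: re-sample a fresh set each attempt.
        others = [n for n in self.all_nodes if n != self.node_id]
        return random.sample(others, min(self.num_neighbors, len(others)))

    def steal_once(self, probe_load, steal_from):
        # Probe the chosen neighbors and steal half of the most
        # loaded neighbor's waiting tasks.
        victim = max(self.pick_neighbors(), key=probe_load)
        load = probe_load(victim)
        return steal_from(victim, load // 2) if load > 0 else []

    def run(self, probe_load, steal_from, execute):
        while True:
            tasks = self.steal_once(probe_load, steal_from)
            if tasks:
                self.poll_interval = self.initial_interval  # reset back-off
                for t in tasks:
                    execute(t)
            else:
                # Failed steal: sleep, then double the poll interval
                # (exponential back-off) before trying again.
                time.sleep(self.poll_interval)
                self.poll_interval *= 2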
[Figure: SimMatrix node state machine with states Start, TaskDisp (dispatch tasks), TaskEnd, and TaskRec, with transitions on "no waiting tasks", "has waiting tasks", "available cores", "failed steal", and "first node needs tasks"; alongside it, a global event queue (Insert Event(time:t), sorted by time)]
• SimMatrix: light-weight and scalable discrete event SIMulator for MAny-Task computing execution fabRIc at eXascales
• supports centralized (FIFO) and distributed (work stealing) scheduling
• has great scalability (millions of nodes, billions of cores, trillions of tasks)
• future extensions: task dependency, workflow system simulation, different network topologies, data-aware scheduling
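The global event queue is the heart of a discrete event simulator such as SimMatrix: each event carries a timestamp, and the simulator repeatedly processes the earliest one. A minimal sketch using Python's heapq; the event names are illustrative.

import heapq
import itertools

class EventQueue:
    # Global event queue: Insert Event(time:t), kept sorted by time.
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times

    def insert(self, time, event):
        heapq.heappush(self._heap, (time, next(self._counter), event))

    def pop(self):
        time, _, event = heapq.heappop(self._heap)
        return time, event

    def __bool__(self):
        return bool(self._heap)

# Drive the simulation: pop the earliest event and let its handler
# schedule follow-on events (e.g. a dispatch leads to a completion).
q = EventQueue()
q.insert(0.0, "TaskDisp")
while q:
    now, event = q.pop()
    print(f"t={now}: {event}")
    if event == "TaskDisp":
        q.insert(now + 1.0, "TaskEnd")  # task finishes one time unit later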
• MATRIX: a real implementation of a distributed MAny-Task execution fabRIc at eXascales
[Figure: MATRIX architecture. A Client submits tasks using ZHT (4) and looks up task status using ZHT (5); an Index Server handles membership lists (1, 2) and registration (3); Compute nodes send task status info (6), request load (7), send load (8), request tasks (9), and send tasks (10)]
[Figure: SimMatrix vs. MATRIX throughput (tasks/sec) at scales of 64, 128, 256, and 512 nodes; the difference between simulated and measured throughput stays small at every scale (3.8%, 3.7%, 5.0%, 4.7%)]
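Steps (4) through (6) of the architecture diagram suggest how task submission and status lookup ride on a distributed hash table. In the sketch below a plain dict stands in for ZHT, and the key scheme and record layout are assumptions, not ZHT's real API.

import json

dht = {}  # stand-in for ZHT: a key/value store reachable from every node

def submit_task(task_id, command):
    # Step (4): the client submits a task by inserting it into the DHT.
    dht[f"task/{task_id}"] = json.dumps({"cmd": command, "status": "queued"})

def send_task_status(task_id, status):
    # Step (6): a compute node records task status info in the DHT.
    record = json.loads(dht[f"task/{task_id}"])
    record["status"] = status
    dht[f"task/{task_id}"] = json.dumps(record)

def lookup_task_status(task_id):
    # Step (5): the client looks up task status with a DHT get.
    return json.loads(dht[f"task/{task_id}"])["status"]

submit_task(42, "echo hello")
send_task_status(42, "finished")
print(lookup_task_status(42))  # -> finished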
• DataSys Laboratory: Ioan Raicu, Anupam Rajendran, Tonglin Li, Kevin Brandstatter
• University of Chicago: Zhao Zhang