Transcript Slides
Work Stealing for Irregular Parallel Applications on
Computational Grids
Vladimir Janjic
University of St Andrews
12th December 2011
In this talk…
Feudal Stealing algorithm for scheduling irregular parallel
applications
Combination of Grid-GUM and Cluster-aware Random
Stealing
Irregular parallel applications -- task trees highly unbalanced
November 21, 2006
What is work stealing?
Work Stealing -- passive, distributed, dynamic scheduling
method
Idle “thieves” steal work from busy “victims”
November 21, 2006
To steal or not to steal?
Why stealing?
Dynamic, adaptive and “cheap”
Does not require prior knowledge of task dependencies =>
good for irregular applications
Inherently distributed => scalability
Why not stealing?
Not optimal
Possibly slow work distribution
November 21, 2006
Work stealing on Computational Grids
Grid-GUM (GpH), Satin (Java d&c), Javelin (Java), Atlas (Java)
The main problem : Steal attempts can be expensive due to
high latencies
Especially for irregular applications, where all work may
be concentrated on a few nodes
The main questions are where to send steal attempts and how
to respond to them
Use load information (Grid-GUM)
November 21, 2006
Cluster-aware Random Stealing (CRS)
Local (within a cluster) and remote (outside of cluster) stealing
done in parallel
Works well for regular applications on heterogeneous
environments with a lot of parallelism
Not so well for irregular
November 21, 2006
Centralised and distributed work
stealing
November 21, 2006
Feudal Stealing
Use the CRS algorithm as a base
Local stealing done using Random Stealing
Remote stealing done via cluster head nodes
Only head nodes (and a victim) visited
Head nodes hold load information
November 21, 2006
3
2
1
4
5
6
7
8
9
Cluster 0
Local load
PE Load
2
2
3
3
4
0
5
7
6
0
7
0
8
1
9
5
Remote Load
Cl Time Load
1 1000 23
2
0
0
3 2000
0
Feudal Stealing
November 21, 2006
How is load information in head nodes
obtained?
Load of nodes inside the cluster periodically sent
Load of remote clusters updated from remote-steal messages
Cluster load information attached to remote-steal messages
(similar to Grid-GUM)
November 21, 2006
Evaluation of Feudal Work Stealing
Using simulations, on generic benchmarks for load balancing
algorithms (UTS -- Unbalanced Tree Search)
For regular and less-irregular applications, performs as well as
CRS and better than Grid-GUM
For highly-irregular applications, better than CRS and GridGUM
November 21, 2006
Comparison of Feudal Stealing, CRS and
Grid-GUM
QuickTime™ and a
decompressor
are needed to see this picture.
November 21, 2006
Improvements of Feudal Stealing over
CRS
November 21, 2006
Conclusions
Feudal Stealing works well for irregular parallel applications on
Computational Grids
Sacrifices some desirable features of “pure” work stealing in
order to make better selection of remote targets
Tested only using simulations. Implementation in Grid-GUM
under way
Tested only on artificial applications (unbalanced tree search)
November 21, 2006
More info
Vladimir Janjic, Load Balancing of Irregular Parallel
Applications On Heterogeneous Distributed Computing
Environments, PhD Thesis, University of St Andrews, 2011
Vladimir Janjic, Kevin Hammond, Think Locally Steal Globally :
Using Dynamic Load Information in Work-Stealing on
Computational Grids, Submitted to CCGrid 2012
Vladimir Janjic, Kevin Hammond, Feudal Work-Stealing, In
preparration, planned for submission to EuroPar 2012
November 21, 2006