zoltan - ODU Computer Science

Download Report

Transcript zoltan - ODU Computer Science

Fine-Grain Sparse Matrix
Partitioning with Constraints
Erik Boman
Sandia National Labs, Albuquerque, NM
Umit Catalyurek
Ohio State University
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,
for the United States Department of Energy’s National Nuclear Security Administration
under contract DE-AC04-94AL85000.
Outline
• Load-balancing & partitioning
• Sparse matrix-vector multiplication
• Models: bipartite graph, hypergraph
• 1d and 2d distributions
• Constrained matrix partitioning
• Hypergraph with fixed vertices
• Vertex cover model
• Numerical results
2
Load Balancing, Graph Partitioning
• Load Balancing
– Assign work to processors to distribute work evenly
and minimize communication
• Graph or hypergraph partitioning
– Vertices (weighted) = computation
– Edge (weighted) = data dependence
Proc 3
Proc 1
Proc 2
3
Parallel Sparse Matrix-Vector Multiply
• Compute y=Ax, where A is large and sparse
– A is distributed among processors
• Kernel in scientific computing
– Iterative methods (Krylov)
– Eigenvalue computations
• PageRank
• Nice model problem
– Other applications have similar computation and
communication pattern
4
Parallel Sparse Matrix-vector Algorithm
• Step 1: Expand (fan-out)
– Send xj to processors with nonzero in column j
• Step 2: Local multiply-add
– yi = yi + Aij xj
• Step 3: Fold (fan-in)
– Send partial results of y to relevant processors
• Step 4: Sum partial results
Two communication phases (expand and fold).
5
Communication (p=2)
•Row distribution
•Column distribution
2 1 1 4 3 x
2 1 1 4 3 x
6
9
22
41
64
y
3
1
4 1
5 9 2
6
5 3
5 8 9
A
6
9
22
41
64
y
3
1
4 1
5 9 2
6
5 3
5 8 9
A
6
1d Models
• 1d distribution is most common
– Assign either rows or columns
• Models to represent a sparse matrix
– Graph (only symmetric)
– Bipartite graph
– Hypergraph
7
Bipartite Graph Model
•G=(R,C,E), where
– R are row vertices
– C are column vertices
•Partition both R and C
– But only use R for row
distribution
•Works in nonsym. case
•Edge cut approximates
comm. volume
C2
C4
• Is NOT exact
C1
C3
C5
8
1d Hypergraph Model
•Hypergraph
– Hyperedge is a set of
vertices (1 or more)
– Rows = vertices
– Columns = hyperedges
•Partition
– Minimize hyperedges cut
•Edge cut is exactly comm
volume
– Aykanat & Catalyurek (’96)
9
Sparse Matrix Partitioning
•2D partitioning methods
– Reduces communication
volume further
– Many variations
•2D Mondriaan
– Recursive hypergraph
bisection
– Bisseling & Vastenhouw
•2D Cartesian (checkerboard)
– First partition rows
– Then partition columns
• Via multiconstraint hgraph
partitioning
– Catalyurek & Aykanat
Courtesy: Rob
Bisseling
10
Fine-Grain Matrix Partitioning
•Fine-grain partitioning
– Assign each nonzero in
matrix separately
– Ultimate flexibility
•Fine-grain hypergraph
model
– Catalyurek & Aykanat
(2001)
– Each nonzero is a vertex
– Each row and column is a
hyperedge
– Exact model for comm.
volume
11
Bipartite Fine-Grain Model
•Bipartite graph model
•Partition both
– vertices (rows, columns)
– edges (nonzeros)
•Minimize #vertices with incident
edge in other partition
12
Constrained Matrix Partitioning
• Given A, x, y
– where x and y already have a prescribed parallel
distribution
• Find optimal parallel distribution of A
• Application: Iterative solver for Ax=b
– Preconditioner code requires specific layout of the
vectors
– We are free to choose distribution for A, which is only
used for matrix-vector multiply
13
Bisection
•First consider bisection
(p=2).
•Suppose wlog the 2*2 block
structure to the right
– Can always permute to get
this form
•We only need solve for the
two off-diagonal blocks
– Independent subproblems
14
Hypergraph with fixed vertices
Add constraints to finegrain hypergraph:
•Each row-net should contain a
special blue vertex
•Each column-net should
contain a special red vertex
•These special vertices are
fixed, cannot change partition
•Partitioning with fixed vertices
available in Patoh and Zoltan
packages
15
Cover nonzeros by rows/columns
Columns (4)
Rows (4)
Both (3)
16
Graph Model: Vertex Cover
x
x
x
x
x
x
x
x
17
Bipartite Matching
König’s Theorem:
Let G be a bipartite graph,
let C be a minimum vertex cover for G,
and let M be a maximum-cardinality matching.
Then |C| = |M|.
Polynomial time algorithm for bipartite graphs.
18
K-way Generalization
•Partition into k sets
– Decouples into k(k-1) independent subproblems
19
Numerical Results
•Bisection (k=2), split using natural order
Name
Original
Rows/cols
Reduced
Rows/cols
Partitioner
Patoh/Zolt.
Matching
Hopcr.-Karp
Cage12
130K
130K
18K
33K
16,240
18,740
17,480
Xyce320k
321K
321K
4,951
5,699
2,264
2,264
2,264
Finan512
75K
75K
19K
15K
14,413
14,011
12,879
Tbdlinux
112K
20K
41K
10K
9,398
11,600
8,577
Matching code provided by Dobrian, Halappanavar, and Pothen.
20
Conclusions
• Two new algorithms for fine-grain matrix
partitioning with constraints
• Empirical results show similar quality
– VC/matching usually faster than partitioning
– More experiments needed
• Hypergraph partitioning:
– Exact model, heuristic solver
• Vertex cover via matching
– Exact algorithm, but no balance constraint
• K-way partitioning decouples, fully parallel
21