Transcript
1
Tools and Primitives for High Performance Graph Computation
John R. Gilbert, University of California, Santa Barbara
Aydin Buluç (LBNL), Adam Lugowski (UCSB)
SIAM Minisymposium on Analyzing Massive Real-World Graphs, July 12, 2010
Support: NSF, DARPA, DOE, Intel
2
An analogy?
As the “middleware” of scientific computing, linear algebra has supplied or enabled:
• Mathematical tools
• “Impedance match” to computer operations
• High-level primitives
• High-quality software libraries
• Ways to extract performance from computer architecture
• Interactive environments
[Diagram: continuous physical modeling → linear algebra → computers]
3
An analogy?
[Diagram: continuous physical modeling → linear algebra → computers; discrete structure analysis → graph theory → computers]
4
An analogy? Well, we’re not there yet ….
[Diagram: discrete structure analysis → graph theory → computers]
√ Mathematical tools
? “Impedance match” to computer operations
? High-level primitives
? High-quality software libs
? Ways to extract performance from computer architecture
? Interactive environments
5
The Primitives Challenge
• By analogy to numerical scientific computing . . .
• What should the combinatorial BLAS look like?
C = A*B
y = A*x
μ = xᵀ y
[Chart: Basic Linear Algebra Subroutines (BLAS), speed (MFlops) vs. matrix size (n)]
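The gap in that chart comes from data reuse: higher-level primitives do more arithmetic per word of memory traffic. A back-of-the-envelope sketch (the operation and traffic counts are idealized assumptions, not measurements):

```python
def arithmetic_intensity(n):
    """Idealized flops per word of memory traffic for three BLAS levels,
    assuming each matrix/vector operand is moved exactly once."""
    counts = {
        "dot":  (2 * n,       2 * n),           # BLAS-1: mu = x'y
        "gemv": (2 * n * n,   n * n + 2 * n),   # BLAS-2: y = A*x
        "gemm": (2 * n ** 3,  3 * n * n),       # BLAS-3: C = A*B
    }
    return {op: flops / words for op, (flops, words) in counts.items()}
```

For n = 1000 this gives roughly 1, 2, and 667 flops per word: C = A*B offers O(n) data reuse, which is why its MFlops curve sits far above the other two.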
6
Primitives should …
• Supply a common notation to express computations
• Have broad scope but fit into a concise framework
• Allow programming at the appropriate level of abstraction and granularity
• Scale seamlessly from desktop to supercomputer
• Hide architecture-specific details from users
7
The Case for Sparse Matrices
Many irregular applications contain coarse-grained parallelism that can be exploited by abstractions at the proper level.

Traditional graph computations:
• Data driven, unpredictable communication
• Irregular and unstructured, poor locality of reference
• Fine grained data accesses, dominated by latency

Graphs in the language of linear algebra:
• Fixed communication patterns
• Operations on matrix blocks exploit memory hierarchy
• Coarse grained parallelism, bandwidth limited
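To make the memory-hierarchy point concrete, here is a minimal sketch (plain Python, illustrative only, not the actual library data structure) of the classic compressed sparse row (CSR) layout and a matrix-vector product over it:

```python
def to_csr(triples, nrows):
    """Build CSR arrays (indptr, indices, data) from (row, col, val) triples.
    Storage is O(nrows + nnz) instead of O(nrows * ncols)."""
    triples = sorted(triples)
    indptr = [0] * (nrows + 1)
    indices, data = [], []
    for r, c, v in triples:
        indptr[r + 1] += 1          # count entries per row
        indices.append(c)
        data.append(v)
    for r in range(nrows):          # prefix-sum counts into row pointers
        indptr[r + 1] += indptr[r]
    return indptr, indices, data

def spmv(indptr, indices, data, x):
    """y = A*x touching only the stored nonzeros of each row."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y
```

The row loop reads each row's nonzeros contiguously, which is exactly the block-friendly, bandwidth-limited access pattern the slide contrasts with pointer-chasing graph code.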
8
Identification of Primitives
Sparse array-based primitives:
• Sparse matrix-matrix multiplication (SpGEMM)
• Sparse matrix-dense vector multiplication
• Sparse matrix indexing
• Element-wise operations (.*)
Matrices on various semirings: (×, +), (and, or), (+, min), …
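A minimal illustration of the semiring idea, using a hypothetical triple-based format (not the actual library interface): the same SpGEMM kernel gives the ordinary matrix product under (×, +) and two-hop shortest path lengths under the tropical (min, +) semiring:

```python
def spgemm(A, B, add, mul):
    """C = A*B over the semiring (add, mul).
    A, B: sparse matrices as {(row, col): value} dicts.
    Naive O(nnz(A)*nnz(B)) loop, purely for illustration."""
    C = {}
    for (i, k), a in A.items():
        for (kk, j), b in B.items():
            if k != kk:
                continue
            v = mul(a, b)
            C[(i, j)] = add(C[(i, j)], v) if (i, j) in C else v
    return C

A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
# Ordinary (x, +) semiring: the usual matrix product.
sq = spgemm(A, A, add=lambda x, y: x + y, mul=lambda x, y: x * y)

# Tropical (min, +) semiring on edge weights: two-hop path lengths.
W = {(0, 1): 1, (1, 2): 2, (0, 2): 5}
hops = spgemm(W, W, add=min, mul=lambda x, y: x + y)
```

Swapping the `add`/`mul` pair is all it takes to retarget the kernel, which is what lets one tuned SpGEMM serve many graph algorithms.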
9
Multiple-source breadth-first search
[Figure: Aᵀ times a sparse block X of search frontiers, on an example graph with vertices 1–7]
10
Multiple-source breadth-first search
[Figure: the product AᵀX on the example graph; each column of X holds one search's frontier]
11
Multiple-source breadth-first search
• Sparse array representation => space efficient
• Sparse matrix-matrix multiplication => work efficient
• Three possible levels of parallelism: searches, vertices, edges
[Figure: AᵀX on the example graph]
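A sketch of the idea in plain Python (boolean sparse matrices as dict-of-sets; illustrative, not the distributed implementation): one SpGEMM over the (or, and) semiring advances every search's frontier at once:

```python
def spgemm_or_and(A, B):
    """C = A*B over the (or, and) semiring.
    Sparse boolean matrices stored as {row: set_of_cols}."""
    C = {}
    for i, cols in A.items():
        acc = set()
        for k in cols:
            acc |= B.get(k, set())
        if acc:
            C[i] = acc
    return C

def multi_source_bfs(adjT, sources):
    """Level-synchronous BFS from all sources simultaneously.
    adjT[v] = set of neighbors of v (rows of A^T for an undirected graph).
    Returns dist[s][v] = hops from source s to vertex v."""
    X = {s: {s} for s in sources}         # frontier: vertex -> searches there
    visited = {s: {s} for s in sources}   # per-search visited sets
    dist = {s: {s: 0} for s in sources}
    level = 0
    while X:
        level += 1
        Y = spgemm_or_and(adjT, X)        # advance all frontiers in one SpGEMM
        X = {}
        for v, searches in Y.items():
            fresh = {s for s in searches if v not in visited[s]}
            if fresh:
                X[v] = fresh
                for s in fresh:
                    visited[s].add(v)
                    dist[s][v] = level
    return dist
```

The three parallelism levels in the bullets map directly onto this kernel: columns of X (searches), rows of Aᵀ (vertices), and nonzeros within a row (edges).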
12
A Few Examples
13
Betweenness Centrality (BC)
What fraction of shortest paths pass through this node?
Brandes’ algorithm
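For reference, a compact sequential sketch of Brandes' algorithm on an unweighted graph (plain Python; the talk's point is that this same computation can be phrased with the sparse-array primitives above and parallelized):

```python
from collections import deque

def betweenness(adj):
    """Brandes' betweenness centrality on an unweighted graph.
    adj: {vertex: iterable of neighbors}.  For undirected input each
    shortest path is counted from both endpoints; halve for the usual score."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS counting shortest paths (sigma) and predecessors.
        order = []
        pred = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        # Phase 2: back-propagate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

Phase 1 is exactly a BFS sweep, so batching many sources turns it into the multiple-source SpGEMM pattern of the previous slides.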
Combinatorial BLAS [Buluç, G]
A parallel graph library based on distributed-memory sparse arrays and algebraic graph primitives
[Figure: typical software stack]
14
BC performance in distributed memory
• TEPS = Traversed Edges Per Second
• One page of code using C-BLAS
[Chart: BC performance, TEPS score (millions) vs. number of cores (25 to 484), for RMAT power-law graphs at Scale 17, 18, 19, and 20; 2^Scale vertices, average degree 8]
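To connect the axes: TEPS is edges traversed divided by wall-clock time, so for an RMAT graph at a given scale one can back out the per-traversal time. A small sketch (the 8 edges-per-vertex figure comes from the slide; the sample TEPS rate below is an arbitrary assumption):

```python
def rmat_edge_count(scale, avg_degree=8):
    """Approximate edge count of an RMAT graph with 2**scale vertices."""
    return (2 ** scale) * avg_degree

def traversal_seconds(scale, teps):
    """Time for one full traversal at a given traversed-edges-per-second rate."""
    return rmat_edge_count(scale) / teps

# e.g. Scale 20 at a hypothetical 200 million TEPS:
t = traversal_seconds(20, 200e6)
```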
15
KDT: A toolbox for graph analysis and pattern discovery [G, Reinhardt, Shah]