An Interactive Environment for Combinatorial Scientific Computing

Jan 12, 2016

Transcript
Page 1: An Interactive Environment for  Combinatorial Scientific Computing

An Interactive Environment for Combinatorial Scientific Computing

Viral B. Shah, John R. Gilbert, Steve Reinhardt

With thanks to: Brad McRae, Stefan Karpinski, Vikram Aggarwal, Min Roh

Page 2: An Interactive Environment for  Combinatorial Scientific Computing

HPC today is exciting!

Page 3: An Interactive Environment for  Combinatorial Scientific Computing

Complex software stack

Distributed Sparse Matrices

Arithmetic, matrix multiplication, indexing, solvers (\, eigs)

Graph Analysis & PD Toolbox

Graph querying & manipulation, connectivity, spanning trees, geometric partitioning, nested dissection, NNMF, ...

Preconditioned Iterative Methods

CG, BiCGStab, etc. + combinatorial preconditioners (AMG, Vaidya)

Applications

Computational ecology, CFD, data exploration

Page 4: An Interactive Environment for  Combinatorial Scientific Computing

Star-P

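% Power iteration; in Star-P the *p suffix marks a dimension as distributed
A = rand(4000*p, 4000*p);
x = randn(4000*p, 1);
y = zeros(size(x));
while norm(x-y) / norm(x) > 1e-11
    y = x;             % remember the previous iterate
    x = A*x;           % apply the matrix
    x = x / norm(x);   % renormalize to unit length
end;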
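The loop above is a power iteration. As a rough serial illustration, here is a minimal Python sketch of the same loop on a small dense matrix. It is not Star-P code: the function name `power_iteration`, the `max_iter` guard, and the fixed random seed are assumptions added for the example.

```python
import math
import random


def power_iteration(A, tol=1e-11, max_iter=10_000, seed=0):
    """Return a unit vector approximating the dominant eigenvector of A.

    A is a dense square matrix given as a list of rows. max_iter and seed
    are illustrative additions that the slide's loop does not have.
    """
    n = len(A)
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]   # x = randn(n, 1)
    y = [0.0] * n                                 # y = zeros(size(x))

    def norm(v):
        return math.sqrt(sum(t * t for t in v))

    for _ in range(max_iter):
        # Same stopping test as the slide: relative change between iterates.
        if norm([a - b for a, b in zip(x, y)]) / norm(x) <= tol:
            break
        y = x
        x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
        s = norm(x)
        x = [t / s for t in x]
    return x
```

For the symmetric matrix `[[2, 1], [1, 2]]` the result is parallel to (1, 1), the eigenvector for the eigenvalue 3.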

Page 5: An Interactive Environment for  Combinatorial Scientific Computing

Star-P architecture

Page 6: An Interactive Environment for  Combinatorial Scientific Computing

Parallel sorting

• Simple, widely used combinatorial primitive

• [V, perm] = sort (V)

• Used in many sparse matrix and array algorithms: sparse(), indexing, concatenation, transpose, reshape, repmat, etc.

• Communication efficient

Unsorted: 3 6 8 1 5 4 7 2 9

Sorted: 1 2 3 4 5 6 7 8 9
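To make the two-output form `[V, perm] = sort(V)` concrete, here is a small serial Python sketch. The name `sort_with_perm` is hypothetical, and the permutation is 0-based, whereas MATLAB indices are 1-based.

```python
def sort_with_perm(values):
    """Return (sorted_values, perm) such that sorted_values[i] == values[perm[i]]."""
    # Sort the index range by the value each index points at (stable sort).
    perm = sorted(range(len(values)), key=values.__getitem__)
    return [values[i] for i in perm], perm
```

The permutation is what makes sorting reusable as a primitive: applying `perm` to a second array reorders it consistently with the sorted keys.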

Page 7: An Interactive Environment for  Combinatorial Scientific Computing

Sorting performance

[Figure: Time spent in different phases of Psort (192 processors on SGI Altix). Log-log plot of time (seconds, 0.0001 to 1000) against problem size (1E+06 to 1E+11). Series: sequential sorting, splitters using medians, communication, merging, total time.]
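The four phases named in the plot can be illustrated with a serial simulation in which each "processor" is just a Python list. This is only a sketch under stated assumptions: splitters are chosen here by regular sampling, not by the median-based method the slide refers to, and the function name `sample_sort` is hypothetical.

```python
from bisect import bisect_right
from heapq import merge


def sample_sort(chunks):
    """Simulate a parallel sort; chunks holds one list per simulated processor."""
    p = len(chunks)
    # Phase 1: sequential sorting of each local chunk.
    local = [sorted(c) for c in chunks]
    # Phase 2: choose p - 1 splitters (regular sampling stands in for medians).
    sample = sorted(x for c in local for x in c[::max(1, len(c) // p)])
    if not sample:
        return [[] for _ in chunks]
    splitters = [sample[(i * len(sample)) // p] for i in range(1, p)]
    # Phase 3: communication; each processor sends one sorted run to each destination.
    inbox = [[] for _ in range(p)]
    for c in local:
        start = 0
        cuts = [bisect_right(c, s) for s in splitters] + [len(c)]
        for dest, end in enumerate(cuts):
            inbox[dest].append(c[start:end])
            start = end
    # Phase 4: merging the sorted runs each processor received.
    return [list(merge(*runs)) for runs in inbox]
```

Concatenating the returned chunks in processor order gives the globally sorted sequence.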

Page 8: An Interactive Environment for  Combinatorial Scientific Computing

Distributed sparse arrays

[Figure: a sparse matrix distributed across processors P0, P1, P2, ..., Pn]

Each processor stores:

• # of local nonzeros (# local edges)

• range of local rows (local vertices)

• nonzeros in a compressed row data structure (local edges)
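The per-processor storage can be sketched as follows. This is a minimal Python illustration, not the Star-P layout itself: the contiguous row-block split, the function name `distribute_csr`, and the dictionary field names are all assumptions made for the example.

```python
def distribute_csr(n_rows, triples, n_procs):
    """Split (row, col, value) triples into contiguous row blocks, one per processor.

    Each block records its nonzero count, its row range, and the three
    compressed-row arrays (row_ptr, col_idx, values).
    """
    block = -(-n_rows // n_procs)  # ceiling division: rows per processor
    parts = []
    for p in range(n_procs):
        lo, hi = min(p * block, n_rows), min((p + 1) * block, n_rows)
        local = sorted(t for t in triples if lo <= t[0] < hi)
        # Count nonzeros per local row, then prefix-sum into row pointers.
        row_ptr = [0] * (hi - lo + 1)
        for r, _, _ in local:
            row_ptr[r - lo + 1] += 1
        for i in range(hi - lo):
            row_ptr[i + 1] += row_ptr[i]
        parts.append({
            "nnz": len(local),
            "rows": (lo, hi),  # half-open range of global row indices
            "row_ptr": row_ptr,
            "col_idx": [c for _, c, _ in local],
            "values": [v for _, _, v in local],
        })
    return parts
```

Reading the matrix as a graph, each block holds the outgoing edges of its local vertices, which is why the slide pairs nonzeros with edges and rows with vertices.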

Page 9: An Interactive Environment for  Combinatorial Scientific Computing

Sparse matrix operations

• dsparse layout, same semantics as ddense

• Matrix arithmetic: +, max, sum, etc.

• matrix * matrix and matrix * vector

• Matrix indexing and concatenation

A (1:3, [4 5 2]) = [ B(:, J) C ] ;

• Linear solvers: x = A \ b; using MUMPS/SuperLU (MPI)

• Eigensolvers: [V, D] = eigs(A); using PARPACK (MPI)

Page 10: An Interactive Environment for  Combinatorial Scientific Computing

Sparse matrix multiplication

C = A × B

for j = 1:n
    C(:, j) = A * B(:, j)
end

SPA (sparse accumulator): gather, scatter/accumulate

All matrix columns and vectors are stored compressed except the SPA.

See A. Buluc (MS42, Fri 10am)
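A serial sketch of the column-by-column product with a sparse accumulator may help. It is written in Python under assumptions: columns are stored as lists of `(row, value)` pairs, and the name `spgemm_columnwise` is hypothetical. Only the SPA is dense, matching the note above.

```python
def spgemm_columnwise(A_cols, B_cols, n_rows):
    """Compute C = A * B where A, B and C are lists of compressed columns.

    Each column is a list of (row, value) pairs sorted by row.
    """
    spa_val = [0.0] * n_rows       # dense accumulator values
    spa_used = [False] * n_rows    # which accumulator slots are occupied
    C_cols = []
    for b_col in B_cols:
        nz_rows = []
        # C(:, j) = sum over k of A(:, k) * B(k, j)
        for k, b_kj in b_col:
            for i, a_ik in A_cols[k]:
                if not spa_used[i]:
                    spa_used[i] = True
                    nz_rows.append(i)
                spa_val[i] += a_ik * b_kj       # scatter/accumulate
        nz_rows.sort()
        C_cols.append([(i, spa_val[i]) for i in nz_rows])   # gather
        for i in nz_rows:                       # reset only the touched slots
            spa_val[i] = 0.0
            spa_used[i] = False
    return C_cols
```

Because only touched slots are reset, the work per column is proportional to the number of scalar multiplications (plus a sort of the output rows) and not to the matrix dimension.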

Page 11: An Interactive Environment for  Combinatorial Scientific Computing

Interactive data exploration

A graph plotted with relaxed Fiedler co-ordinates

Page 12: An Interactive Environment for  Combinatorial Scientific Computing

A 2-D density spy plot

Density spy plot of an R-MAT power law graph

Page 13: An Interactive Environment for  Combinatorial Scientific Computing

Breadth-first search: sparse matvec

[Figure: a graph on vertices 1 to 7 with its transposed adjacency matrix A^T, and the vectors x, A^T x, and (A^T)^2 x]

• Multiply by the adjacency matrix to step to neighbor vertices

• Work-efficient implementation from sparse data structures
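The idea can be sketched in Python with the frontier vector x held as a set of vertices and the sparse matrix held as adjacency lists. The function name `bfs_levels` and the example graph in the usage note are hypothetical, since the slide's graph did not survive extraction.

```python
def bfs_levels(adj, source):
    """Breadth-first search phrased as repeated sparse matrix-vector products.

    adj[u] lists the out-neighbors of u. The frontier set plays the role of the
    sparse vector x, and one loop iteration corresponds to x <- A^T x with
    already visited vertices masked out.
    """
    level = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        reached = set()
        for u in frontier:                 # touch only the nonzeros of x
            reached.update(adj.get(u, ()))
        frontier = {v for v in reached if v not in level}
        for v in frontier:
            level[v] = depth
    return level
```

For example, `bfs_levels({1: [2, 4], 2: [3], 4: [7], 7: [5, 6]}, 1)` assigns level 1 to vertices 2 and 4, level 2 to vertices 3 and 7, and level 3 to vertices 5 and 6. Each edge is examined once overall, which is the work-efficiency the second bullet refers to.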

Page 14: An Interactive Environment for  Combinatorial Scientific Computing

Maximal independent set

[Figure: example graph on vertices 1 to 7]

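degree = sum(G, 2);
prob = 1 ./ (2 * degree);              % selection probability per vertex
select = rand(n, 1) < prob;            % tentative random selection
if any(select & (G * select))          % some selected vertices are adjacent
    % keep higher degree vertices
end
IndepSet = [IndepSet select];
neighbor = neighbor | (G * select);    % mark neighbors of selected vertices
remain = neighbor == 0;
G = G(remain, remain);                 % continue on the remaining subgraph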

Luby’s algorithm
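Since the slide shows only one round and leaves conflict resolution as a comment, here is a complete serial Python sketch of a Luby-style maximal independent set. It is an illustration under assumptions, not the authors' Star-P code: ties between equal-degree vertices are broken by vertex number, isolated vertices are always selected, and the name `luby_mis` is hypothetical.

```python
import random


def luby_mis(adj, seed=0):
    """Return a maximal independent set of an undirected graph.

    adj maps each vertex to the set of its neighbors (no self-loops).
    """
    rng = random.Random(seed)
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    indep = set()
    while remaining:
        # Select each vertex with probability 1 / (2 * degree).
        select = {v for v, nbrs in remaining.items()
                  if not nbrs or rng.random() < 1.0 / (2 * len(nbrs))}

        def rank(v):
            return (len(remaining[v]), v)

        # Where two selected vertices are adjacent, keep the higher-degree one.
        losers = {v for v in select
                  if any(u in select and rank(u) > rank(v) for u in remaining[v])}
        select -= losers
        indep |= select
        # Remove the selected vertices and their neighbors, then repeat.
        removed = set(select)
        for v in select:
            removed |= remaining[v]
        remaining = {v: nbrs - removed
                     for v, nbrs in remaining.items() if v not in removed}
    return indep
```

The result depends on the random seed, but every run returns a set with no two adjacent vertices in which every other vertex has a neighbor in the set.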

Page 15: An Interactive Environment for  Combinatorial Scientific Computing

A graph clustering benchmark

• Many tight clusters, loosely interconnected

• Vertices and edges permuted randomly

Fine-grained, irregular data access

Page 16: An Interactive Environment for  Combinatorial Scientific Computing

Clustering by BFS

% Grow each seed to vertices reached by at least k paths of length 1 or 2
C = sparse(seeds, 1:ns, 1, n, ns);
C = A * C;
C = C + A * C;
C = C >= k;

• Grow local clusters from many seeds in parallel

• Breadth-first search by sparse matrix * matrix

• Cluster vertices connected by many short paths
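The same computation can be sketched serially in Python, one seed (one column of C) at a time. The function name `grow_clusters` and the adjacency-list input are assumptions made for the example. As in the matrix version, the counts are of walks, so a seed can reach itself by stepping out and back.

```python
def grow_clusters(adj, seeds, k):
    """For each seed, return the vertices reached by at least k walks of length 1 or 2.

    adj[u] lists the neighbors of u. The loop over seeds mirrors the columns of C
    in C = A*C; C = C + A*C; C = C >= k.
    """
    clusters = []
    for s in seeds:
        one = {}                          # column of A * C: walks of length 1
        for v in adj[s]:
            one[v] = one.get(v, 0) + 1
        total = dict(one)                 # accumulates C + A * C
        for u, count in one.items():
            for v in adj[u]:              # extend each length-1 walk by one step
                total[v] = total.get(v, 0) + count
        clusters.append({v for v, c in total.items() if c >= k})
    return clusters
```

On a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2, seed 0 with k = 2 yields the triangle {0, 1, 2}: vertex 3 is reached by only one short walk and is left out.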

Page 17: An Interactive Environment for  Combinatorial Scientific Computing

Clustering by peer pressure

[Figure: example graph with vertex labels 5, 12, and 13]

[ignore, leader] = max(G);
S = sparse(leader,1:n,1,n,n) * G;
[ignore, leader] = max(S);

• Each vertex votes for its highest-numbered neighbor as its leader

• Number of leaders is roughly the same as number of clusters

• Matrix multiplication gathers neighbor votes

• S(i,j) is the number of votes for i from j’s neighbors
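A serial Python sketch of the two voting steps follows. It rests on assumptions the slide does not state: every vertex counts itself among its neighbors (a self-loop), and ties between equally popular leaders go to the higher-numbered one. The function names are hypothetical.

```python
from collections import Counter


def initial_leaders(adj):
    """Each vertex picks its highest-numbered neighbor (itself included) as leader."""
    return {v: max(nbrs) for v, nbrs in adj.items()}


def vote_round(adj, leader):
    """Each vertex adopts the leader that received the most votes among its neighbors.

    votes[i] corresponds to S(i, j): the number of j's neighbors that chose i.
    """
    new_leader = {}
    for j, nbrs in adj.items():
        votes = Counter(leader[u] for u in nbrs)
        new_leader[j] = max(votes, key=lambda i: (votes[i], i))
    return new_leader
```

On two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3, one round leaves vertices 0, 1, 2 with leader 2 and vertices 3, 4, 5 with leader 5, so the two leaders match the two clusters.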

Page 18: An Interactive Environment for  Combinatorial Scientific Computing

Scaling up

[Figure: SSCA #2 v1.1, scale 21. Plot of time (sec, log scale from 1 to 1000) against number of processors (8 to 120). Series: Data Generator, Kernel 1, Kernel 2, Kernel 3, Kernel 4.]

• Graph with 2 million nodes, 321 million directed edges, 89 million undirected edges, 32 thousand cliques

• Good scaling observed from 8 to 120 processors of an SGI Altix

Page 19: An Interactive Environment for  Combinatorial Scientific Computing

Graph Laplacian

Graph of Poisson’s equation on a 2D grid

G = grid5(10);
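For readers without the toolbox, the following Python sketch builds the graph Laplacian L = D - A of a k-by-k grid with the 5-point stencil. It is an assumption that this corresponds to what `grid5` produces: the slide does not say what the function returns, and the name `grid_laplacian` and the dictionary-of-entries storage are choices made for the example.

```python
def grid_laplacian(k):
    """Graph Laplacian of the k-by-k grid as a sparse dict {(i, j): value}.

    Vertices are numbered row by row; each is joined to its horizontal and
    vertical neighbors. Off-diagonal entries are -1 and each diagonal entry
    equals the vertex degree.
    """
    L = {}
    for r in range(k):
        for c in range(k):
            u = r * k + c
            for dr, dc in ((1, 0), (0, 1)):      # visit each edge once
                rr, cc = r + dr, c + dc
                if rr < k and cc < k:
                    v = rr * k + cc
                    L[u, v] = L[v, u] = -1
                    L[u, u] = L.get((u, u), 0) + 1
                    L[v, v] = L.get((v, v), 0) + 1
    return L
```

For k = 10 this gives 100 vertices and 180 edges; corner vertices have degree 2, interior vertices have degree 4, and every row sums to zero.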

Page 20: An Interactive Environment for  Combinatorial Scientific Computing

Spanning trees

Maximum weight spanning tree

T = mst(G, 'max');
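One way to compute a maximum weight spanning tree is Kruskal's algorithm with the edges taken heaviest first. The Python sketch below shows that approach. It is not necessarily how `mst` is implemented, and the function name and edge format are assumptions for the example.

```python
def max_spanning_tree(n, edges):
    """Return the edges of a maximum weight spanning forest.

    n is the number of vertices (numbered 0..n-1) and edges is a list of
    (weight, u, v) tuples. Uses Kruskal's algorithm with union-find.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges, reverse=True):   # heaviest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                              # edge joins two components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree
```

On a connected graph the result has n - 1 edges; on a disconnected graph it is a spanning forest.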

Page 21: An Interactive Environment for  Combinatorial Scientific Computing

A combinatorial preconditioner

V. Aggarwal

Augmented Vaidya’s preconditioner

V = vaidya_support(G);

Page 22: An Interactive Environment for  Combinatorial Scientific Computing

Quadtree meshes and AMG

V. Aggarwal and M. Roh

Page 23: An Interactive Environment for  Combinatorial Scientific Computing

Wireless traffic modeling

S. Karpinski

• Non-negative matrix factorizations (NNMF) for wireless traffic modeling

• NNMF algorithms combine linear algebra and optimization methods

• Basic and “improved” NMF algorithms implemented:

– Euclidean (Lee & Seung 2000)

– K-L divergence (Lee & Seung 2000)

– semi-nonnegative (Ding et al. 2006)

– left/right-orthogonal (Ding et al. 2006)

– bi-orthogonal tri-factorization (Ding et al. 2006)

– sparse Euclidean (Hoyer et al. 2002)

– sparse divergence (Liu et al. 2003)

– non-smooth (Pascual-Montano et al. 2006)
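As an illustration of the first variant in the list, here is a minimal Python sketch of the multiplicative updates for the Euclidean objective, which factor V ≈ W H with nonnegative W and H. It is a plain dense version for small matrices, not the implementation used in the talk. The helper names, the iteration count, the random initialization, and the small `eps` added to each denominator are assumptions.

```python
import random


def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def nmf_euclidean(V, r, iters=200, seed=0, eps=1e-12):
    """Rank-r nonnegative factorization V ~ W * H by multiplicative updates."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        # H <- H .* (W' V) ./ (W' W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(r)]
        # W <- W .* (V H') ./ (W H H')
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(r)]
             for i in range(m)]
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of V - W * H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative as long as V and the initial factors are.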

Page 24: An Interactive Environment for  Combinatorial Scientific Computing

A meta-algorithm

[Figure: diagram with the labels “spherical k-means”, “ANLS”, “K-L div.”, and “same as CDFs”]

Page 25: An Interactive Environment for  Combinatorial Scientific Computing

Landscape Connectivity

B. McRae

• Landscape connectivity governs the degree to which the landscape facilitates or impedes movement

• Need to model important processes like:

– Gene flow (to avoid inbreeding)

– Movement and mortality patterns

• Corridor identification, conservation planning

Page 26: An Interactive Environment for  Combinatorial Scientific Computing

Pumas in southern California

Joshua Tree National Park

Los Angeles

Palm Springs

Habitat quality model

Page 27: An Interactive Environment for  Combinatorial Scientific Computing

Model as a resistive network

Habitat

Nonhabitat

Reserve

Page 28: An Interactive Environment for  Combinatorial Scientific Computing

Processing landscapes

Combinatorial methods

• Graph construction

• Graph contraction

• Connected components

Numerical methods

• Linear systems

• Combinatorial preconditioners
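The linear-systems step can be illustrated on a toy circuit. The Python sketch below computes the effective resistance between two nodes of a resistive network by grounding one node and solving the reduced Laplacian system. It uses dense Gaussian elimination purely as a stand-in for the preconditioned iterative solvers named in the talk, assumes the network is connected and the two nodes differ, and the function name is hypothetical.

```python
def effective_resistance(n, edges, s, t):
    """Effective resistance between nodes s and t of a connected resistive network.

    edges is a list of (u, v, conductance). One unit of current is injected at s,
    node t is grounded, and the resulting voltage at s is returned.
    """
    # Assemble the weighted graph Laplacian.
    L = [[0.0] * n for _ in range(n)]
    for u, v, g in edges:
        L[u][u] += g
        L[v][v] += g
        L[u][v] -= g
        L[v][u] -= g
    # Ground node t: drop its row and column, and append the right-hand side.
    keep = [i for i in range(n) if i != t]
    M = [[L[i][j] for j in keep] + [1.0 if i == s else 0.0] for i in keep]
    m = len(keep)
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0.0] * m
    for r in reversed(range(m)):
        x[r] = (M[r][m] - sum(M[r][c] * x[c] for c in range(r + 1, m))) / M[r][r]
    return x[keep.index(s)]
```

Two unit resistors in series give a resistance of 2 and two in parallel give 0.5, which is a quick sanity check before moving to landscape-sized graphs.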

Page 29: An Interactive Environment for  Combinatorial Scientific Computing

Results

• Solution time reduced from 3 days to 5 minutes for typical problems

• Aiming for much larger problems: Yellowstone-to-Yukon (Y2Y)

Page 30: An Interactive Environment for  Combinatorial Scientific Computing

Multi-layered software tools

Distributed Sparse Matrices

Arithmetic, matrix multiplication, indexing, solvers (\, eigs)

Graph Analysis & PD Toolbox

Graph querying & manipulation, connectivity, spanning trees, geometric partitioning, nested dissection, NNMF, ...

Preconditioned Iterative Methods

CG, BiCGStab, etc. + combinatorial preconditioners (AMG, Vaidya)

Applications

Computational ecology, CFD, data exploration

Page 31: An Interactive Environment for  Combinatorial Scientific Computing

Thanks for coming

Thank You