A Common Application Platform (CAP) for SURAgrid

-Mahantesh Halappanavar, John-Paul Robinson, Enis Afgan, Mary Fran Yafchak and Purushotham Bangalore.

SURAgrid All-hands Meeting, 27 September, 2007, Washington D.C.

Introduction

Problem Statement:

“How to quickly grid-enable scientific applications to exploit SURAgrid resources?”

Goals for Grid-enabling Applications:
- Dynamic resource discovery
- Collective resource utilization
- Simple job management and accounting
- Minimal programming effort

Identifying Patterns

Algorithm Structures
Support Structures
Relationships

Basic Process

[Diagram labels: Problems, Programs, Algorithms, Implementation]

Algorithm Structures

Supporting Structures

Programming Environments

Algorithm Structures

1. Organize by Tasks: Task Parallelism, Divide and Conquer

2. Organize by Data Decomposition: Geometric Decomposition, Recursive Data

3. Organize by Flow of Data: Pipeline, Event-Based Coordination

How to organize? (Linear and Recursive)
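
The linear and recursive styles can be contrasted with a small sketch. It is illustrative only and not taken from the slides: the function names, the chunk size, and the threshold are assumptions. The linear version is an instance of Task Parallelism and the recursive version of Divide and Conquer.

```python
from concurrent.futures import ThreadPoolExecutor


def task_parallel_sum(chunks):
    """Linear organization: one independent task per chunk."""
    with ThreadPoolExecutor() as pool:
        partial_sums = list(pool.map(sum, chunks))
    # Combine the independent partial results.
    return sum(partial_sums)


def divide_and_conquer_sum(values, threshold=4):
    """Recursive organization: split until a piece is small enough to solve directly."""
    if len(values) <= threshold:
        return sum(values)
    mid = len(values) // 2
    # The two halves are independent subproblems; this sketch solves them
    # one after the other, but they could be handed to separate workers.
    left = divide_and_conquer_sum(values[:mid], threshold)
    right = divide_and_conquer_sum(values[mid:], threshold)
    return left + right


data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]  # hypothetical chunk size
linear_result = task_parallel_sum(chunks)
recursive_result = divide_and_conquer_sum(data)
```

Both calls compute the same sum; only the way the work is organized differs.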

Support Structures

Program Structures

1. SPMD

2. Master/Worker

3. Loop Parallelism

4. Fork/Join

Data Structures

1. Shared Data

2. Shared Queue

3. Distributed Array

[Diagram labels: Problems, Programs, Algorithms, Implementation]
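
As a rough illustration of how a program structure and a data structure from this slide combine, the sketch below pairs Master/Worker with a Shared Queue. It uses Python threads purely for demonstration. The names `master` and `worker`, the squaring task, and the `None` sentinel are assumptions and not part of the original material.

```python
import queue
import threading


def worker(tasks, results):
    """Worker: repeatedly pull a task from the shared queue until told to stop."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel placed by the master: no more work
            break
        results.put((item, item * item))


def master(work_items, num_workers=3):
    """Master: fill the shared queue, start the workers, collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for item in work_items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    collected = {}
    while not results.empty():  # safe here: all workers have finished
        key, value = results.get()
        collected[key] = value
    return collected


squares = master(range(1, 6))
```

Because workers pull tasks as they become free, faster workers naturally take on more items, which is one reason this pairing is often used for load balancing.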

Relationships: AS and SS

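| SS \ AS | Task Parallelism | Divide & Conquer | Geometric Decomposition | Recursive Data | Pipeline | Event-based Coordination |
|---|---|---|---|---|---|---|
| SPMD | **** | *** | **** | ** | *** | ** |
| Loop Parallelism | **** | ** | *** | | | |
| Master/Worker | **** | ** | * | * | * | * |
| Fork/Join | ** | **** | ** | | **** | **** |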

Relationship between Supporting Structures (SS) patterns and Algorithm Structure (AS) patterns.

Relationships: SS and PE

| SS \ PE | OpenMP | MPI | Java |
|---|---|---|---|
| SPMD | *** | **** | ** |
| Loop Parallelism | **** | * | *** |
| Master-Worker | ** | *** | *** |
| Fork-Join | *** | | **** |

Relationship between Supporting Structures (SS) patterns and Programming Environments (PE).

Important Observation

“This (SPMD) pattern is by far the most commonly used pattern for structuring parallel programs. It is particularly relevant for MPI programmers and problems using the Task Parallelism and Geometric Decomposition patterns. It has also proved effective for problems using the Divide and Conquer, and Recursive Data patterns.”

Single Program, Multiple Data (SPMD). This is the most common way to organize a parallel program, especially on MIMD computers. The idea is that a single program is written and loaded onto each node of a parallel computer. Each copy of the single program runs independently (aside from coordination events), so the instruction streams executed on each node can be completely different. The specific path through the code is in part selected by the node ID.
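
The SPMD idea can be sketched without a parallel machine by running one function several times with different node IDs. This is a simulation using threads, not an MPI program. The names `spmd_program` and `run_on_nodes` and the strided data split are assumptions made for illustration. In an MPI code the node ID would typically be the process rank.

```python
from concurrent.futures import ThreadPoolExecutor


def spmd_program(node_id, num_nodes, data):
    """The single program: every node runs this same code."""
    # The node ID selects which slice of the data this copy works on...
    local_data = data[node_id::num_nodes]
    partial = sum(local_data)
    # ...and which path through the code it takes.
    role = "root" if node_id == 0 else "worker"
    return role, partial


def run_on_nodes(num_nodes, data):
    """Launch one copy of the program per simulated node and combine the results."""
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        futures = [
            pool.submit(spmd_program, node_id, num_nodes, data)
            for node_id in range(num_nodes)
        ]
        outcomes = [future.result() for future in futures]
    total = sum(partial for _, partial in outcomes)
    return outcomes, total


outcomes, total = run_on_nodes(4, list(range(20)))
```

Each copy sees the same code and the same inputs apart from `node_id`, yet it works on a different part of the data and may follow a different branch.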

The Computing Continuum

OpenMP, MPI, Java

Scientific Applications

What are they?
How are they built?

Scientific Applications

Life Sciences: Biology, Chemistry, …

Engineering: Aerospace, Civil, Mechanical, Environmental

Physics: QCD, Black Holes, …

….

The SCaLeS Reports: http://www.pnl.gov/scales/

Building Blocks

- OpenMP; MPI; BLACS, …
- Metis/ParMetis, Zoltan, Chaco, …
- BLAS and LAPACK, …
- ScaLAPACK, MUMPS, SuperLU, …
- PETSc, Aztec, …

“A core requirement of many engineering and scientific applications is the need to solve linear and non-linear systems of equations, eigensystems and other related problems.” – The Trilinos Project

ScaLAPACK

[Software hierarchy diagram. Components: ScaLAPACK, PBLAS, LAPACK, BLACS, BLAS (Level 1/2/3), MPI/PVM/... Layer labels: Global, Local, platform specific]

ScaLAPACK: Linear systems, least squares, singular value decomposition, eigenvalues.

PBLAS: Parallel BLAS.

BLACS: Communication routines targeting linear algebra operations.

MPI/PVM/...: Communication layer (message passing).

http://acts.nersc.gov/scalapack

PETSc

Portable, Extensible Toolkit for Scientific Computation

[Layer diagram, top to bottom]

PETSc PDE Application Codes

ODE Integrators; Visualization; Interface

Nonlinear Solvers, Unconstrained Minimization

Linear Solvers: Preconditioners + Krylov Methods

Object-Oriented Matrices, Vectors, Indices; Grid Management

Profiling Interface

Computation and Communication Kernels: MPI, MPI-IO, BLAS, LAPACK

Observation

“Explicit message passing will remain the dominant programming model for the foreseeable future because of the huge investment in application codes.”

-Jim Tomkins, Bob Ballance, and Sue Kelly, ASC PI Meeting, Nevada, Feb 2007

Common Application Platform

Basic Architecture
Building Blocks
Conclusions

Basic Architecture

[Architecture diagram. Labels: SiteA; MetaScheduler (MS); Site1; Site2; Site3; SURAgrid Portal; Command-Line (GSISSH, etc.); Browser; A; B]

Building Blocks

MPICH-G2

Conclusions

Power shift! The onus is now on system administrators.

Load Balancing:
- Minimize communication costs
- Limits: ScaLAPACK (latency < 500 ms)
- Dynamic redistribution of work

Heterogeneous Environment:
- Issues with floating-point operations
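
The slide does not say which floating-point issues are meant. One that commonly appears when work is split differently across sites is that floating-point addition is not associative, so the order in which partial results are combined can change the last digits of the answer. The sketch below illustrates only that assumed case. The variable names and the tolerance are hypothetical.

```python
import math

values = [0.1, 0.2, 0.3]

# Two groupings of the same additions, as might result from two different
# ways of distributing the terms across nodes.
grouped_left = (values[0] + values[1]) + values[2]
grouped_right = values[0] + (values[1] + values[2])

# The two results differ in the last bits...
order_matters = grouped_left != grouped_right

# ...so comparisons between runs are better done with a tolerance.
close_enough = math.isclose(grouped_left, grouped_right, rel_tol=1e-12)
```

Comparing results across sites with a tolerance, rather than testing for exact equality, is one way to cope with this effect.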