High Performance Computing: Concepts, Methods & Means

HPC Libraries

Hartmut Kaiser PhD

Center for Computation & Technology

Louisiana State University

April 19th, 2007

Outline

• Introduction to High Performance Libraries
• Linear Algebra Libraries (BLAS, LAPACK)
• PDE Solvers (PETSc)
• Mesh manipulation and load balancing (METIS/ParMETIS, JOSTLE)
• Special purpose libraries (FFTW)
• General purpose libraries (C++: Boost)
• Summary – Materials for test

2

Puzzle of the Day

#include <stdio.h>

int main()
{
    int a = 10;
    switch(a)
    {
        case '1': printf("ONE\n");
                  break;
        case '2': printf("TWO\n");
                  break;
        defa1ut:  printf("NONE\n");
    }
    return 0;
}

4

If you expect the program above to print NONE, try running it and check!
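One way to read the puzzle: defa1ut (spelled with the digit 1) is not the keyword default, so the compiler accepts it as an ordinary goto label, and '1' and '2' are character constants (49 and 50 in ASCII) rather than the integers 1 and 2. With a = 10 no case matches and nothing is printed. A minimal sketch of a version that behaves the way a reader probably expects is shown below; the helper name classify is hypothetical and not part of the original slide.

```c
#include <stdio.h>

/* Hypothetical helper: returns the text a corrected switch would print. */
const char *classify(int a)
{
    switch (a) {
    case 1:  return "ONE";   /* integer 1, not the character constant '1' */
    case 2:  return "TWO";   /* integer 2, not the character constant '2' */
    default: return "NONE";  /* the keyword, spelled correctly */
    }
}
```

Calling printf("%s\n", classify(10)) with this version prints NONE, because the default branch now exists.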

Application domains

• Linear algebra
  – BLAS, ATLAS, LAPACK, ScaLAPACK, SLATEC, PIM
• Ordinary and partial differential equations
  – PETSc
• Mesh manipulation and load balancing
  – METIS, ParMETIS, CHACO, JOSTLE, PARTY
• Graph manipulation
  – Boost.Graph library
• Vector/Signal/Image processing
  – VSIPL, PSSL
• General parallelization
  – MPI, pthreads
• Other domain specific libraries
  – NAMD, NWChem, Fluent, Gaussian, LS-DYNA

5

Application Domain Overview

• Linear Algebra Libraries
  – Provide optimized methods for constructing sets of linear equations, performing operations on them (matrix-matrix products, matrix-vector products) and solving them (factoring, forward & backward substitution).
  – Commonly used libraries include BLAS, ATLAS, LAPACK, ScaLAPACK, PLAPACK

• PDE Solvers
  – General-purpose, parallel numerical PDE libraries
  – Usual toolsets include manipulation of sparse data structures, iterative linear system solvers, preconditioners, nonlinear solvers and time-stepping methods.
  – Commonly used libraries for solving PDEs include SAMRAI, PETSc, PARASOL, Overture, among others.

6

Application Domain Overview

• Mesh manipulation and Load Balancing
  – These libraries help in partitioning meshes into roughly equal sizes across processors, thereby balancing the workload while minimizing the size of separators and communication costs.
  – Commonly used libraries for this purpose include METIS, ParMETIS, Chaco, JOSTLE among others.

• Other packages:
  – FFTW: highly optimized Fourier transform package including both real and complex multidimensional transforms in sequential, multithreaded, and parallel versions.
  – NAMD: molecular dynamics library available for Unix/Linux, Windows, OS X
  – Fluent: computational fluid dynamics package, used for such applications as environment control systems, propulsion, reactor modeling etc.

7

BLAS

• (Updated set of) Basic Linear Algebra Subprograms

• The BLAS functionality is divided into three levels:
  – Level 1: contains vector operations of the form y ← αx + y, as well as scalar dot products and vector norms
  – Level 2: contains matrix-vector operations of the form y ← αAx + βy, as well as solving Tx = y for x with T being triangular
  – Level 3: contains matrix-matrix operations of the form C ← αAB + βC, as well as solving B ← αT⁻¹B for triangular matrices T. This level contains the widely used General Matrix Multiply (GEMM) operation.
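To make the three levels concrete, here is a minimal sketch of what one representative operation per level computes. These are plain reference loops, not the BLAS routines themselves: the function names ref_axpy, ref_gemv and ref_gemm are hypothetical, matrices are assumed to be dense and stored row-major, and no transposition or stride options are modeled.

```c
#include <stddef.h>

/* Level 1 (axpy-like): y <- alpha*x + y, vectors of length n. */
void ref_axpy(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* Level 2 (gemv-like): y <- alpha*A*x + beta*y, A is m-by-n, row-major. */
void ref_gemv(size_t m, size_t n, double alpha, const double *A,
              const double *x, double beta, double *y)
{
    for (size_t i = 0; i < m; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = alpha * sum + beta * y[i];
    }
}

/* Level 3 (gemm-like): C <- alpha*A*B + beta*C,
   A is m-by-k, B is k-by-n, C is m-by-n, all row-major. */
void ref_gemm(size_t m, size_t n, size_t k, double alpha, const double *A,
              const double *B, double beta, double *C)
{
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * sum + beta * C[i * n + j];
        }
}
```

The loop nesting mirrors the levels: one loop for vector-vector, two for matrix-vector, three for matrix-matrix work. Tuned BLAS implementations compute the same results but reorganize these loops for the memory hierarchy.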

9

BLAS

• Several implementations for different languages exist
  – Reference implementation (F77 and C): http://www.netlib.org/blas/
  – ATLAS, highly optimized for particular processor architectures
  – A generic C++ template class library providing BLAS functionality: uBLAS, http://www.boost.org
  – Several vendors provide libraries optimized for their architecture (AMD, HP, IBM, Intel, NEC, NVIDIA, Sun)

10

BLAS: F77 naming conventions

11

BLAS: C naming conventions

• F77 routine name is changed to lowercase and prefixed with cblas_

• All routines which accept two-dimensional arrays have an additional first parameter specifying the matrix memory layout (row major or column major)

• Character parameters are replaced by corresponding enum values

• Input arguments are declared const

• Non-complex scalar input parameters are passed by value

• Complex scalar input arguments are passed using a void*

• Arrays are passed by address

• Output scalar arguments are passed by address

• Complex functions become subroutines which return the result via an additional last parameter (void*), appending _sub to the name
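The layout parameter mentioned above decides how a two-dimensional index maps onto the flat array that is passed by address. The sketch below illustrates that mapping only; the enum layout_t and the helper elem_offset are hypothetical stand-ins for illustration and are not names from the C BLAS interface.

```c
#include <stddef.h>

/* Hypothetical stand-in for the layout enum a CBLAS-style routine takes. */
typedef enum { ROW_MAJOR, COL_MAJOR } layout_t;

/* Offset of element (i, j) in a flat array with leading dimension ld.
   Row major: rows are contiguous, so ld is at least the number of columns.
   Column major (the Fortran convention): columns are contiguous,
   so ld is at least the number of rows. */
size_t elem_offset(layout_t layout, size_t i, size_t j, size_t ld)
{
    return layout == ROW_MAJOR ? i * ld + j : j * ld + i;
}
```

For a 2-by-3 matrix, element (1, 2) sits at offset 5 in both conventions (ld = 3 for row major, ld = 2 for column major), but the elements in between are ordered differently, which is why the routine has to be told which layout the caller used.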

12

BLAS Level 1 routines

• Vector operations(xROT, xSWAP, xCOPY etc.)

• Scalar dot products (xDOT etc.)

• Vector norms (xNRM2, xASUM etc.; IxAMAX returns the index of the largest-magnitude element)

13

BLAS Level 2 routines

• Matrix-vector operations(xGEMV, xGBMV, xHEMV, xHBMV etc.)

• Solving Tx = y for x, where T is triangular (xTRSV, xTBSV, xTPSV)

• Rank-1 updates (xGER, xHER etc.)
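The triangular solve on this slide can be illustrated with forward substitution. This is a minimal sketch under stated assumptions: T is taken to be lower triangular, dense and row-major, and the function name lower_solve is hypothetical. A real BLAS routine additionally handles upper triangular, transposed and unit-diagonal variants.

```c
#include <stddef.h>

/* Solve L*x = y for x by forward substitution.
   L is n-by-n, lower triangular, row-major; only its lower part is read.
   Returns 0 on success, -1 if a zero diagonal entry is encountered. */
int lower_solve(size_t n, const double *L, const double *y, double *x)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = y[i];
        for (size_t j = 0; j < i; ++j)   /* subtract already-known terms */
            sum -= L[i * n + j] * x[j];
        if (L[i * n + i] == 0.0)
            return -1;                   /* singular: cannot divide */
        x[i] = sum / L[i * n + i];
    }
    return 0;
}
```

Each unknown depends only on the ones computed before it, so the cost is O(n²), the same order as a matrix-vector product, which is why this operation is grouped with Level 2.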

14

BLAS Level 3 routines

• Matrix-matrix operations(xGEMM etc.)

• Solving for triangular matrices (xTRSM; xTRMM is the triangular matrix-matrix multiply)

• Widely used matrix-matrix multiply (xSYMM, xGEMM)

15

Demo 1

• Shows solving a matrix multiplication problem using BLAS expressed in FORTRAN, C, and C++

• Shows genericity of uBLAS, by comparing generic and banded matrix versions

• Shows newmat, a C++ matrix library which uses operator overloading

16

LAPACK

• Linear Algebra PACKage
  – http://www.netlib.org/lapack/
  – Written in F77
  – Provides routines for
    • Solving systems of simultaneous linear equations,
    • Least-squares solutions of linear systems of equations,
    • Eigenvalue problems,
    • Householder transformation to implement QR decomposition on a matrix and
    • Singular value problems
  – Was initially designed to run efficiently on shared memory vector machines
  – Depends on BLAS
  – Has been extended for distributed-memory systems (ScaLAPACK and PLAPACK)

18

19

LAPACK (Architecture)

LAPACK naming conventions

20

Demo 2

• Shows how using a library might speed up the computation considerably

21

PETSc (pronounced PET-see)

• Portable, Extensible Toolkit for Scientific Computation (http://www-unix.mcs.anl.gov/petsc/petsc-as/)
  – Suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations (PDEs)
  – Employs the MPI standard for all message-passing communication
  – Intended for use in large-scale application projects
  – Includes a large suite of parallel linear and nonlinear equation solvers
  – Easily used in application codes written in C, C++, Fortran and Python

• Good introduction: http://www-unix.mcs.anl.gov/petsc/petsc-as/documentation/tutorials/nersc02/nersc02.ppt

23

PETSc (general features)

• Features include:
  – Parallel vectors
    • Scatters (handles communicating ghost point information)
    • Gathers
  – Parallel matrices
    • Several sparse storage formats
    • Easy, efficient assembly
  – Scalable parallel preconditioners
  – Krylov subspace methods
  – Parallel Newton-based nonlinear solvers
  – Parallel time stepping (ODE) solvers

24

PETSc (Architecture)

25

PETSc: Module architecture and layers of abstraction

PETSc: Component details

• Vector operations (Vec): Provides the vector operations required for setting up and solving large-scale linear and nonlinear problems. Includes easy-to-use parallel scatter and gather operations, as well as special-purpose code for handling ghost points for regular data structures.

• Matrix operations (Mat): A large suite of data structures and code for the manipulation of parallel sparse matrices. Includes four different parallel matrix data structures, each appropriate for a different class of problems.

• Preconditioners (PC): A collection of sequential and parallel preconditioners, including
  – (sequential) ILU(k) (incomplete factorization),
  – LU (lower/upper decomposition),
  – both sequential and parallel block Jacobi, overlapping additive Schwarz methods

• Time stepping ODE solvers (TS): Code for the time evolution of solutions of PDEs. In addition, provides pseudo-transient continuation techniques for computing steady-state solutions.

26

PETSc: Component details

• Krylov subspace solvers (KSP): Parallel implementations of many popular Krylov subspace iterative methods, including
  – GMRES (Generalized Minimal Residual method),
  – CG (Conjugate Gradient),
  – CGS (Conjugate Gradient Squared),
  – Bi-CG-Stab (BiConjugate Gradient Stabilized),
  – two variants of TFQMR (transpose-free QMR),
  – CR (Conjugate Residuals),
  – LSQR (least-squares method).
  All are coded so that they are immediately usable with any preconditioners and any matrix data structures, including matrix-free methods.

• Non-linear solvers (SNES): Data-structure-neutral implementations of Newton-like methods for nonlinear systems. Includes both line search and trust region techniques with a single interface. Employs by default the above data structures and linear solvers. Users can set custom monitoring routines, convergence criteria, etc.
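To give a feel for what a Krylov subspace method from the KSP list does, here is a minimal sketch of unpreconditioned Conjugate Gradient. It is a serial, dense illustration and not PETSc code: the names cg_solve, dot and CG_MAX_N are hypothetical, and the matrix is assumed to be symmetric positive definite and stored row-major.

```c
#include <stddef.h>

#define CG_MAX_N 64  /* illustrative fixed bound so the sketch needs no malloc */

static double dot(size_t n, const double *a, const double *b)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

/* Unpreconditioned CG for A*x = b with A dense, n-by-n, row-major, SPD.
   x holds the initial guess on entry and the approximate solution on exit.
   Returns the number of iterations used, or -1 if n is too large or the
   residual norm did not drop below tol within max_iter iterations. */
int cg_solve(size_t n, const double *A, const double *b, double *x,
             double tol, int max_iter)
{
    double r[CG_MAX_N], p[CG_MAX_N], Ap[CG_MAX_N];
    if (n > CG_MAX_N)
        return -1;

    for (size_t i = 0; i < n; ++i) {           /* r = b - A*x, p = r */
        double s = 0.0;
        for (size_t j = 0; j < n; ++j)
            s += A[i * n + j] * x[j];
        r[i] = b[i] - s;
        p[i] = r[i];
    }
    double rr = dot(n, r, r);

    for (int it = 0; it < max_iter; ++it) {
        if (rr <= tol * tol)                   /* ||r|| <= tol: converged */
            return it;
        for (size_t i = 0; i < n; ++i) {       /* Ap = A*p */
            double s = 0.0;
            for (size_t j = 0; j < n; ++j)
                s += A[i * n + j] * p[j];
            Ap[i] = s;
        }
        double alpha = rr / dot(n, p, Ap);     /* step length along p */
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rr_new = dot(n, r, r);
        double beta = rr_new / rr;             /* makes new p A-conjugate */
        for (size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return rr <= tol * tol ? max_iter : -1;
}
```

The only place the matrix appears is in the product A*p, which is what makes matrix-free variants possible: any routine that applies the operator to a vector could take the place of the inner loop.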

27

Mesh libraries

• Introduction
  – Structured/unstructured meshes
  – Examples

• Mesh decomposition

29

Introduction to Meshes and Grids

• Mesh/Grid : 2D or 3D representation of the computational domain.

• Common 2D meshes are composed of triangular or quadrilateral elements

• Common 3D meshes are composed of hexahedral, tetrahedral or pyramidal elements

30

2D mesh elements: triangle, quadrilateral

3D mesh elements: tetrahedron, hexahedron, prism

Structured/Unstructured Meshes

Structured Grids (Meshes)
• Cartesian grids, logically rectangular grids
• Mesh info accessed implicitly using grid point indices
  – Efficient in both computation and storage
• Typically use finite difference discretization

Unstructured Meshes
• Mesh connectivity information must be stored
  – Incurs additional memory and computational cost
• Handles complex geometries and grid adaptivity
• Typically use finite volume or finite element discretization
• Mesh quality becomes a concern

31
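The difference between implicit and stored connectivity can be sketched in a few lines. The names grid_index, triangle_t and share_edge are hypothetical and chosen for illustration only; the structured grid is assumed to be 2D with nx points per row, and the unstructured mesh is assumed to consist of triangles with three distinct vertex indices each.

```c
#include <stddef.h>

/* Structured grid: the position of point (i, j) in memory follows from
   index arithmetic alone, so neighbours (i+1, j), (i, j+1), ... need no
   stored connectivity. */
size_t grid_index(size_t i, size_t j, size_t nx)
{
    return j * nx + i;
}

/* Unstructured mesh: every element must store which vertices it uses. */
typedef struct {
    int v[3];  /* indices of the three vertices of a triangle */
} triangle_t;

/* Two triangles are neighbours if they share an edge, i.e. exactly two
   vertex indices. Finding this out requires comparing stored data. */
int share_edge(const triangle_t *a, const triangle_t *b)
{
    int common = 0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            if (a->v[i] == b->v[j])
                ++common;
    return common == 2;
}
```

On the structured grid the neighbour of point 32 in the i direction is simply point 33. On the unstructured mesh the same question costs memory (the v arrays) and a search, which is the overhead the slide refers to.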

Mesh examples

32

Meshes are used for Computation

33

Mesh Decomposition

• Goal is to maximize interior while minimizing connections between subdomains. That is, minimize communication.

• Such decomposition problems have been studied in load balancing for parallel computation.

• Lots of choices:
  – METIS, ParMETIS -- University of Minnesota,
  – PARTI -- University of Maryland,
  – CHACO -- Sandia National Laboratories,
  – JOSTLE -- University of Greenwich,
  – PARTY -- University of Paderborn,
  – SCOTCH -- Université Bordeaux,
  – TOP/DOMDEC -- NAS at NASA Ames Research Center.

http://www.hlrs.de

34

Mesh Decomposition

• Load balancing
  – Distribute elements evenly across processors.
  – Each processor should have equal share of work.

• Communication costs should be minimized.
  – Minimize sub-domain boundary elements.
  – Minimize number of neighboring domains.

• Distribution should reflect machine architecture.
  – Communication versus calculation.
  – Bandwidth versus latency.

• Note that optimizing load balance and communication cost simultaneously is an NP-hard problem.

http://www.epcc.ed.ac.uk/epcc-tec/documents/meshdecomp-slides/MeshDecomp-13.html
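The two competing goals above can be measured on a given partition. The sketch below assumes the mesh is described as a graph (vertices are elements, edges connect neighbouring elements) and that part[v] holds the processor assigned to vertex v. The function names edge_cut and load_imbalance are hypothetical; this only evaluates a partition and is not how METIS or JOSTLE compute one.

```c
#include <stdlib.h>

/* Edge cut: number of edges whose endpoints lie in different parts.
   edges is a flat array of n_edges vertex pairs: edges[2*e], edges[2*e+1].
   A smaller cut means less communication between processors. */
int edge_cut(int n_edges, const int *edges, const int *part)
{
    int cut = 0;
    for (int e = 0; e < n_edges; ++e)
        if (part[edges[2 * e]] != part[edges[2 * e + 1]])
            ++cut;
    return cut;
}

/* Load imbalance: largest part size divided by the average part size.
   1.0 means perfectly balanced. Returns -1.0 on bad input or if the
   temporary counter array cannot be allocated. */
double load_imbalance(int n_vertices, int n_parts, const int *part)
{
    if (n_vertices <= 0 || n_parts <= 0)
        return -1.0;
    int *count = calloc((size_t)n_parts, sizeof *count);
    if (!count)
        return -1.0;
    int max = 0;
    for (int v = 0; v < n_vertices; ++v)
        if (++count[part[v]] > max)
            max = count[part[v]];
    free(count);
    return (double)max * n_parts / n_vertices;
}
```

For a chain of four elements 0-1-2-3 split as {0, 1} and {2, 3}, the cut is 1 and the imbalance is 1.0. Moving element 2 to the first part keeps the cut at 1 but raises the imbalance to 1.5, which shows how the two measures can pull in different directions.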

35

Mesh decomposition

[Figure; source: http://www.hlrs.de]

36

Static Grids (Meshes)

• Decomposition need only be carried out once

• Static decomposition may therefore be carried out as a preprocessing step, often done in serial

Dynamic Meshes

• Decomposition must be adapted as underlying mesh or processor load changes.

• Dynamic decomposition therefore becomes part of the calculation itself and cannot be carried out solely as a pre-processing step.

37

http://www.epcc.ed.ac.uk/epcc-tec/documents/meshdecomp-slides/MeshDecomp-14.html

Static and Dynamic Meshes

HP J6700
1 CPU
Solve Time: 13:26
Baseline Time

38

src : Amy Apon, http://www.csce.uark.edu/~aapon/courses/concurrent/notes/marc-ddm.ppt

Linux Cluster
2 CPUs
Solve Time: 5:20
Speed-Up: 2.5X

39

src : Amy Apon, http://www.csce.uark.edu/~aapon/courses/concurrent/notes/marc-ddm.ppt

Linux Cluster
4 CPUs
Solve Time: 3:07
Speed-Up: 4.3X

40

src : Amy Apon, http://www.csce.uark.edu/~aapon/courses/concurrent/notes/marc-ddm.ppt

Linux Cluster
8 CPUs
Solve Time: 1:51
Speed-Up: 7.3X

41

src : Amy Apon, http://www.csce.uark.edu/~aapon/courses/concurrent/notes/marc-ddm.ppt

Linux Cluster
16 CPUs
Solve Time: 1:03
Speed-Up: 12.8X

42

src : Amy Apon, http://www.csce.uark.edu/~aapon/courses/concurrent/notes/marc-ddm.ppt

Speedup due to decomposition

# CPUs    Run time (s)
1         806
2         320
4         187
8         111
16        63
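The speed-up figures quoted on the preceding slides follow from this table by dividing the 1-CPU time by the p-CPU time. A minimal sketch, with hypothetical function names speedup and efficiency:

```c
/* Speed-up S(p) = T(1) / T(p), where T(p) is the run time on p CPUs. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Parallel efficiency E(p) = S(p) / p; 1.0 would be ideal scaling. */
double efficiency(double t1, double tp, int p)
{
    return speedup(t1, tp) / p;
}
```

With the values above, 806 / 63 gives roughly 12.8 on 16 CPUs, an efficiency of about 0.80. The ratios for 2 and 4 CPUs come out slightly above the CPU count; note that the baseline was measured on a different machine (the HP J6700) than the Linux cluster, so these numbers mix hardware differences with the effect of decomposition.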

43

Jostle and Metis

[Figure; source: http://www.hlrs.de]

44

Jostle

[Figure; source: http://www.hlrs.de]

45

Jostle

[Figure; source: http://www.hlrs.de]

46

Jostle

[Figure; source: http://www.hlrs.de]

47

Metis

[Figure; source: http://www.hlrs.de]

48

ParMetis

[Figure; source: http://www.hlrs.de]

49

Metis (serial)

[Figure; source: http://www.hlrs.de]

50

Comparison

[Figure; source: http://www.hlrs.de]

51

FFTW

• Fastest Fourier Transform in the West

• Portable C subroutine library for computing the discrete Fourier transform (DFT), including discrete cosine/sine transforms (DCT/DST)

• Computes arbitrary size discrete Fourier and Hartley transforms on real or complex data, in one or more dimensions

• Optimized for speed through application of special-purpose compiler genfft (codelet generator), originally written in OCaml; performance comparable even with vendor optimized libraries

• Free software, distributed under GPL; commercial licenses are also available from MIT

• Developed at MIT by Matteo Frigo and Steven G. Johnson

• Won J. H. Wilkinson Prize for Numerical Software in 1999

• Most recent stable version is 3.1.2 (http://www.fftw.org)

53

Main FFTW Features

• C and FORTRAN interfaces, C++ wrappers available
• Speed, including support for SSE, SSE2, 3DNow! and AltiVec
• Arbitrary size transforms with complexity of O(n·log(n)) (sizes which can be factored into 2, 3, 5 and 7 are most efficient by default, but custom code can also be generated for other sizes if required)
• Even/odd data (DCT/DST), types I-IV
• Can produce pure real output, or process pure real input data
• Efficient handling of multiple, strided transforms (e.g. transformation of multiple arrays at once; one dimension of multi-dimensional array; one field of multi-component array)
• Parallel code supporting Cilk, SMP platforms with threads, or MPI
• Ability to save and restore plans optimized for a given platform (through wisdom mechanism)
• Portable to any platform with a working C compiler

54

FFTW Sample Code

Source: http://www.fftw.org/fftw3.pdf

Computing 1-D complex DFT

55

#include <fftw3.h>
...
{
    fftw_complex *in, *out;
    fftw_plan p;
    ...
    in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
    /* populate in[] with input data */
    ...
    p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    ...
    fftw_execute(p); /* repeat as needed */
    /* transform now available in out[] */
    ...
    fftw_destroy_plan(p);
    fftw_free(in); fftw_free(out);
}

The Boost Libraries

• What’s Boost
  – What’s important
  – Other stuff

57

What is Boost?

• Data Structures, Containers, Iterators, and Algorithms

• String and Text Processing

• Function Objects and Higher-Order Programming

• Generic Programming and Template Metaprogramming

• Math and Numerics

• Input/Output

• Miscellaneous

• Mostly header only

58

What’s important

• OS abstraction
  – Thread: OS independent kernel level thread interface
  – Asio: asynchronous input output
  – Filesystem: file system operations such as file copy, delete, directory create, file path handling
  – System: OS error code abstraction and handling
  – Program options: handling of command line arguments and parameters
  – Streams: build your own C++ streams
  – DateTime: handling of dates, times and time periods
  – Timer: simple timer object

59

What’s important

• Data types, Container types, all extending STL
  – Pointer containers: allow for pointers in STL containers: vector<char *> becomes ptr_vector<char>
  – Multi index: data structures with multiple indices
  – Constant sized arrays: array<char, 10>, acts like vector or plain ‘C’ array
  – Any: can hold values of any type (if you need polymorphism)
  – Variant: can hold values of any of the types specified at compile time (‘C’ equivalent is discriminated union)
  – Optional: can hold a value or nothing
  – Tuple: like a vector or array, but every element may have a different type (similar to plain struct)
  – Graph library: very sophisticated collection of graph related data structures and algorithms
    • Parallel version exists (using MPI)

60

What’s important

• Helper classes
  – Smart pointers: working with pointers without having to worry about memory management
  – Memory pools: specialized memory allocation for containers
  – Iterator library: write your own iterator classes with ease (non trivial otherwise)

61

Other stuff in Boost

• String and Text processing
  – Regex, parsing, format, conversion etc.

• Algorithms
  – String algos, FOR_EACH, minmax etc.

• Math and numerics
  – Conversion, interval, random, octonion, quaternion, special functions, rational, uBLAS

• Functional and higher order programming
  – Bind, lambda, function, ref, signals etc.

• Generic and template metaprogramming
  – Proto, mpl, fusion, phoenix, enable_if etc.

• Testing
  – Unit tests, concept checks, static_assert

62

Conclusion

• Look at Boost first if you need something not available in the Standard library

• Even if it’s not in Boost, look around: there are a lot of libraries in preparation for Boost (Boost Sandbox, File Vault)

63

Links

• Boost, current release V1.33.1
  – Web: http://www.boost.org
  – CVS: http://sourceforge.net/projects/boost

• Boost Sandbox
  – CVS: http://sourceforge.net/projects/boost-sandbox
  – File Vault: http://boost-consulting.com/vault/

• Boost mailing lists
  – http://www.boost.org/more/mailing_lists.htm

64

Outlook

Functional specification with a Domain Specific Embedded Language (DSEL)

equation = sum<vertex_edge>
           [
               sumf<edge_vertex>(0.0, _e)
               [
                   pot * orient(_e, _1)
               ] * A / d * eps
           ] - V * rho

Elliptic PDE discretized by Finite Volume

References: [1]

65

References

1. Rene Heinzl, Modern Application Design using Modern Programming Paradigms and a Library-Centric Software Approach, OOPSLA 2006, Workshop on Library Centric Software Design, Portland, Oregon, October 2006.

66

Summary – Material for the Test

• High performance libraries: 5, 6, 7
• Linear algebra libraries: BLAS: 9, 11, 12
• Linear algebra libraries: LAPACK: 18
• PDE Solvers: 23, 24, 26, 27
• Mesh decomposition & load balancing: 30, 31, 34, 35, 37, 44, 45, 46, 48, 49
• FFTW: 53, 54
• Boost: 58, 59, 60, 61, 62
