The HPC Challenge (HPCC) Benchmark Suite · 2007. 10. 2.

Aug 21, 2020

Presented by

The HPC Challenge (HPCC) Benchmark Suite

Piotr Luszczek

The MathWorks, Inc.

http://icl.cs.utk.edu/hpcc/

2 Luszczek_HPCC_SC07

HPCC: Components

1. HPL (High Performance LINPACK): solves Ax = b

2. STREAM

-------------------------------------------------------
name    kernel                  bytes/iter  FLOPS/iter
-------------------------------------------------------
COPY:   a(i) = b(i)                 16          0
SCALE:  a(i) = q*b(i)               16          1
SUM:    a(i) = b(i) + c(i)          24          1
TRIAD:  a(i) = b(i) + q*c(i)        24          2
-------------------------------------------------------

3. PTRANS: A ← Aᵀ + B

4. RandomAccess: T[k] ← T[k] ⊕ a_i, where the entries of table T are 64 bits wide

5. FFT: z_k = Σ_j x_j exp(−2π√−1 jk/n)

6. Matrix-matrix multiply: C ← s*C + t*A*B

7. b_eff (effective bandwidth/latency): ping-pong
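
The bytes/iter and FLOPS/iter columns of the STREAM table follow from counting the 8-byte words each kernel reads and writes, and the floating-point operations it performs, per element. The sketch below spells out that count in plain Python rather than in the benchmark's own C. The function names, the KERNELS dictionary, and the 8-byte word size are illustrative assumptions, not part of the HPCC code.

```python
# Illustrative sketch of the four STREAM kernels (not the HPCC source code).
# Assumption: every array element is an 8-byte (64-bit) floating-point word.

WORD_BYTES = 8


def copy(b):
    """COPY: a(i) = b(i)"""
    return [bi for bi in b]


def scale(b, q):
    """SCALE: a(i) = q*b(i)"""
    return [q * bi for bi in b]


def add(b, c):
    """SUM: a(i) = b(i) + c(i)"""
    return [bi + ci for bi, ci in zip(b, c)]


def triad(b, c, q):
    """TRIAD: a(i) = b(i) + q*c(i)"""
    return [bi + q * ci for bi, ci in zip(b, c)]


# Per iteration: (arrays read, arrays written, floating-point operations).
KERNELS = {
    "COPY": (1, 1, 0),
    "SCALE": (1, 1, 1),
    "SUM": (2, 1, 1),
    "TRIAD": (2, 1, 2),
}


def bytes_per_iter(name):
    """Bytes moved per loop iteration: one word per array touched."""
    reads, writes, _ = KERNELS[name]
    return (reads + writes) * WORD_BYTES


def flops_per_iter(name):
    """Floating-point operations per loop iteration."""
    return KERNELS[name][2]


def bandwidth_gb_per_s(name, n, seconds):
    """Sustained bandwidth in GB/s for n iterations taking `seconds`."""
    return bytes_per_iter(name) * n / seconds / 1e9
```

Under these assumptions, a TRIAD pass over 10^9 elements that takes 2 seconds moves 24 × 10^9 bytes, which works out to 12 GB/s.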

HPCC: Motivation and measurement

[Figure (Measurement): "HPC Challenge Benchmarks / Select Applications", generated by PMaC @ SDSC. Scatter plot of temporal locality (vertical axis, 0.00 to 1.00) against spatial locality (horizontal axis, 0.00 to 1.00). Points: HPL, Test3D, CG, Overflow, Gamess, RandomAccess, AVUS, OOCore, RFCTH2, STREAM, HYCOM.]

Spatial and temporal data locality here is for one node/processor, i.e., locally or "in the small."

[Figure (Concept): temporal locality against spatial locality, each running from low to high. Labels: DGEMM, HPL, PTRANS, STREAM, FFT, RandomAccess, and mission partner applications.]

HPCC: Scope and naming conventions

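[Diagram: naming convention. Scope prefix: G (Global), EP (Embarrassingly Parallel), S (Single). Benchmark: HPL, STREAM, FFT, ..., RandomAccess(1 m), HPL (25%). Level: system, CPU, thread.]

[Diagram: one panel each for Global, Embarrassingly Parallel, and Single. Every panel shows four nodes, each with memory (M) and processors (P), attached to a network.]

Computational resources: CPU core(s), Memory, Interconnect

Software modules: MPI, OpenMP, Vectorize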

HPCC: Hardware probes

[Diagram: corresponding memory hierarchy. Levels: Registers, Cache, Local memory, Remote memory, Disk. Units transferred between levels: instruction operands, blocks, messages, pages. Arrows labeled bandwidth and latency.]

HPC Challenge benchmarks and HPCS performance targets (improvement):

• Top500: solves a system, Ax = b. Target: 2 Petaflops (8x)

• STREAM: vector operations, A = B + s × C. Target: 6.5 Petabytes/s (40x)

• FFT: 1D fast Fourier transform, Z = FFT(X). Target: 0.5 Petaflops (200x)

• RandomAccess: random updates, T(i) = XOR( T(i), r ). Target: 64,000 GUPS (2000x)

• HPCS program has developed a new suite of benchmarks (HPC Challenge).

• Each benchmark focuses on a different part of the memory hierarchy.

• HPCS program performance targets will flatten the memory hierarchy, improve real application performance, and make programming easier.
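
The RandomAccess update T(i) = XOR( T(i), r ) and its GUPS (giga-updates per second) metric can be sketched in a few lines. The Python below is a minimal illustration, not the HPCC implementation: the shift-and-XOR generator, the constant POLY = 0x7, the seed of 1, and all function names are assumptions chosen for the example, so consult the HPCC source for the exact random stream.

```python
# Illustrative sketch of a RandomAccess-style update loop (not the HPCC source).
MASK64 = (1 << 64) - 1   # table entries and random values are 64-bit words
POLY = 0x7               # assumption: feedback constant of the random stream


def next_random(r):
    """Advance a 64-bit shift-and-XOR pseudo-random sequence by one step."""
    high_bit_set = r >> 63
    r = (r << 1) & MASK64
    return r ^ POLY if high_bit_set else r


def random_access(log2_size, num_updates, seed=1):
    """Apply T[k] = XOR(T[k], r) at pseudo-random positions k."""
    size = 1 << log2_size
    table = list(range(size))      # assumption: T[i] starts out equal to i
    r = seed
    for _ in range(num_updates):
        r = next_random(r)
        k = r & (size - 1)         # low bits of r select the table entry
        table[k] ^= r
    return table


def gups(num_updates, seconds):
    """Giga-updates per second for a run of num_updates updates."""
    return num_updates / seconds / 1e9
```

Because each update lands on an effectively unpredictable table entry, the loop has almost no spatial or temporal locality, which is why this kernel stresses memory latency rather than bandwidth. At the 64,000 GUPS target listed above, a system would perform 6.4 × 10^13 such updates per second.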

HPCC: Official submission process

Prerequisites:
• C compiler
• BLAS
• MPI

1. Download

2. Install

3. Run

4. Upload results

5. Confirm via email

6. Tune

7. Run

8. Upload results

9. Confirm via email

Steps 6 to 9 (the tuned run) are optional. Rules for tuning:
• Only some routines can be replaced.
• Data layout needs to be preserved.
• Multiple languages can be used.

Provide a detailed description of the installation and execution environment.

Results are immediately available on the Web site:
● Interactive HTML
● XML
● MS Excel
● Kiviat charts (radar plots)

HPCC: Submissions over time

[Figure: four log-scale time-series panels of submitted results, each with two series labeled "Sum" and "#1". Panels: STREAM [GB/s], HPL [Tflop/s], FFT [Gflop/s], RandomAccess [GUPS]. The horizontal axes run from Nov-03 to Sep-06 in two panels and from Jul-04 to Oct-06 in the other two.]

HPCC: Comparing three interconnects

Kiviat chart (radar plot)

• 3 AMD Opteron clusters
  − Clock: 2.2 GHz
  − 64-processor cluster

• Interconnect types
  A. Vendor
  B. Commodity
  C. GigE

• Cannot be differentiated based on
  − G-HPL
  − Matrix-matrix multiply

• Available on HPCC Web site
  − http://icl.cs.utk.edu/hpcc/

HPCC: Analysis of sample results

[Figure: effective bandwidth (words/second), on a log scale from Kilo through Mega, Giga, and Tera to Peta, for systems in Top500 order. Annotations: HPCS ~10^2, HPC ~10^4, Clusters ~10^6.]

• All results in words/second

• Highlights memory hierarchy

• Clusters
  − Hierarchy steepens

• HPC systems
  − Hierarchy constant

• HPCS goals
  − Hierarchy flattens
  − Easier to program

HPCC: Augmenting June TOP500

[Table: TOP500 rating shown alongside data provided by the HPCC database.]

Contacts

Piotr Luszczek

The MathWorks, Inc.

(508) [email protected]