Surviving the End of Scaling of Traditional Micro Processors in HPC

Olav Lindtjørn (Schlumberger, Stanford), Robert G. Clapp (Stanford), Oliver Pell, Oskar Mencer, Michael J Flynn (Maxeler)

Apr 16, 2018

Page 1

Surviving the End of Scaling of Traditional Micro Processors in HPC

Olav Lindtjørn (Schlumberger, Stanford), Robert G. Clapp (Stanford), Oliver Pell, Oskar Mencer, Michael J Flynn (Maxeler)

Page 2

The Memory Wall and the Power Wall

• Moore’s Law continues to deliver double the transistors on a chip every 18-24 months

– But we are having trouble making those extra transistors deliver performance.

• Memory Wall

– Parallel processing elements on-chip must share the same off-chip bandwidth

• Power Wall

– Chips still need to be cooled in the same physical space

Page 3

CPUs vs. FPGA Processing

Streaming Data through a data flow machine

Page 4

Outline

• Oil and Gas HPC applications

• Maxeler FPGA Compiler and Accelerators

• Key Computational Kernels in Oil & Gas

– Sparse Matrix

– Convolution based methods

• Applications scalability – Technology trends

• Conclusions

Page 5

HPC – Its role in Oil & Gas exploration

• Identify resources

• Access resources

• Maximize recovery

Courtesy of Statoil

Page 6

Where to Drill

Seismic – Acoustic measurement

Electromagnetic

Gravity

Page 7

Seismic Data Acquisition

[Figure: marine seismic acquisition; labels: Recording Vessel, Seismic Source, Seismic Detectors, Sea Surface, Water Bottom, Seismic Wave Ray Paths, Hydrocarbon "trap", Oil, Gas]

Page 8

[Figure: panels with 1200m scale labels]

Page 9

Data Intensity and Complex Physics

Isotropic Anisotropic

Page 10

Data Rates and Computational needs

20 – 25,000 sensors

500 MB – 2 GB

50 – 200,000 shots

50 – 200 TB Data

1000s node

5 – 7 days

[Figure: relative computational cost vs. relative disk space, axis ticks at powers of two from 1 to 128; series: Isotropic, VTI, TTI for 30 Hz RTM]

Page 11

Data Rates and Computational needs

20 – 25,000 sensors

500 MB – 2 GB

50 – 200,000 shots

50 – 200 TB Data

1000s node

5 – 7 days

[Figure: relative computational cost vs. relative disk space, axis ticks at powers of two from 1 to 128; series: Isotropic, VTI, TTI for 30 Hz RTM and for 60 Hz RTM]

Page 12

Data Rates and Computational needs

20 – 25,000 sensors

500 MB – 2 GB

50 – 200,000 shots

50 – 200 TB Data

1000s node

5 – 7 days

15 -20,000 nodes

Days - weeks

[Figure: relative computational cost vs. relative disk space, axis ticks at powers of two from 1 to 128; series: Isotropic, VTI, TTI for 30 Hz RTM and for 60 Hz RTM]

Page 13

Cost of Imaging Algorithms

[Figure: relative computational cost (log scale, 1 to 1,000,000) vs. imaging complexity; algorithms shown: Shot WEM (VTI), Reverse Time Migration (RTM), FWI-Acoustic, FWI-Elastic; year labels '09, '10, '11, '?; legend: memory and disk space cost]

Page 14

HPC – Its role in Hydrocarbon exploration

• Identify resources

• Access resources

Geomechanics

Page 15

HPC – Its role in Hydrocarbon exploration

• Identify resources

• Access resources

• Maximize recovery

Geomechanics

Reservoir Flow Simulation

Page 16

Oil and Gas Computational Kernels

Wave propagation

Diffusion

Fluid Flow

Finite Difference

FFT

Finite Element

Sparse matrix

EM

Physics Kernels

Page 17

Oil and Gas Computational Kernels

Wave propagation

Diffusion

Fluid Flow

Convolution

FFT

Sparse Matrix

Sparse matrix

EM

Physics Kernels

Page 18

Oil and Gas Computational Kernels

Wave propagation

Diffusion

Fluid Flow

Convolution

FFT

Sparse Matrix

EM

Physics Kernels

Page 19

Outline

• Oil and Gas HPC applications

• Maxeler FPGA Compiler and Accelerators

• Key Computational Kernels in Geophysics

– Sparse Matrix

– Convolution based methods

• Applications scalability – Technology trends

• Conclusions

Page 20

Accelerating Convolution and Sparse Matrix in the Maxeler Environment

Page 21

Maxeler Accelerators

• Commodity silicon chips configurable to implement any digital circuit

– ~10^6 small processing elements that operate in parallel

– Several megabytes of on-chip memory

– Run at several hundred megahertz

– Support large on-board memory (24GB+)

Page 22

MaxNode with MAX3

Specifications:

Compute: 8x 2.8GHz Nehalem Cores, 4x Virtex 6-SX475T FPGAs

Interconnect: PCI-Express Gen. 2, MaxRing, Gigabit Ethernet

Storage: 3x 2TB Hard disks

Memory: 96GB DRAM

Form Factor: 1U

[Figure: MAX3 Node Architecture; labels: 4x FPGA, Nehalem Cores, 4x PCIe, 2x MaxRing]

Page 23

MAX3 System Bandwidths

[Figure: block diagram with the following labels]

• 4x SX475T FPGA (4.68MB), each with Mem 24GB at 38.4 GBytes/s

• PCIe x8 Gen 2: 8 GBytes/s, shown four times

• MaxRing: 8 GBytes/s, shown four times

• 8x Nehalem Cores

• Main Memory

Page 24

Maxeler Programming Paradigm

Page 25

Sparse Matrix Format

Structured Unstructured
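
The slide contrasts structured and unstructured sparsity, but the transcript does not name a storage format. As a neutral reference point, the sketch below assumes the widely used compressed sparse row (CSR) layout. The function name `csr_matvec` and the small tridiagonal example are illustrative only and are not taken from the talk.

```python
import numpy as np

def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x."""
    y = np.zeros(len(row_ptr) - 1)
    for row in range(len(y)):
        start, end = row_ptr[row], row_ptr[row + 1]
        # Gather the x entries this row touches, then do a short dot product.
        y[row] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# Illustrative 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]].
values = np.array([2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0])
col_idx = np.array([0, 1, 0, 1, 2, 1, 2])
row_ptr = np.array([0, 2, 5, 7])
x = np.array([1.0, 2.0, 3.0])

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)
```

The indirect read `x[col_idx[...]]` is regular for a structured (banded) matrix and irregular for an unstructured one, which is one reason the two cases behave differently in memory.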

Page 26

Typical scalability of SLB Sparse Matrix Applications

[Figure: Eclipse Benchmark (2 node Westmere 3.06 GHz), E300 2 Mcell Benchmark; relative speed (0 to 4) vs. # cores (0 to 12)]

[Figure: Visage – Geomechanics (2 node Nehalem 2.93 GHz), FEM Benchmark; relative speed (0 to 5) vs. # cores (0 to 8)]

Page 27

Sparse Matrix on FPGAs

[Figure labels: 624, 624]

• 4 MB BLK RAM

• Pipelining

• Addressing scheme optimized for Matrix structure

• Domain Specific Data Encoding
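
The talk does not spell out its addressing scheme or data encoding. One common way to shrink sparse-matrix index data, shown here purely as an assumed illustration, is to store column indices as small offsets: relative to the diagonal for the first entry of a row, and relative to the previous entry otherwise. All function names below are hypothetical.

```python
import numpy as np

def tridiagonal_pattern(n):
    """CSR column indices and row pointers for an n-by-n tridiagonal matrix."""
    col_idx, row_ptr = [], [0]
    for row in range(n):
        for col in (row - 1, row, row + 1):
            if 0 <= col < n:
                col_idx.append(col)
        row_ptr.append(len(col_idx))
    return np.array(col_idx), np.array(row_ptr)

def delta_encode(col_idx, row_ptr):
    """Replace absolute column indices with per-row offsets."""
    deltas = np.empty_like(col_idx)
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        if end > start:
            deltas[start] = col_idx[start] - row          # offset from the diagonal
            deltas[start + 1:end] = np.diff(col_idx[start:end])
    return deltas

def delta_decode(deltas, row_ptr):
    """Invert delta_encode by running sums within each row."""
    col_idx = np.empty_like(deltas)
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        col_idx[start:end] = row + np.cumsum(deltas[start:end])
    return col_idx

col_idx, row_ptr = tridiagonal_pattern(1000)
deltas = delta_encode(col_idx, row_ptr)
print(col_idx.max(), np.abs(deltas).max())
```

For this banded pattern the absolute indices range up to 999 while every stored offset has magnitude at most 1, so far fewer bits per index would need to cross the memory interface. Whether the encoding in the talk works this way is not stated in the slides.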

Page 28

Sparse Matrix on FPGAs

Domain Specific Address and Data Encoding

[Figure: speedup per 1U node (0 to 60) vs. compression ratio (0 to 10); matrix GREE0A1new01; labels 624, 624]

SPEEDUP is 20x-40x per 1U at 200MHz

Page 30

3D Convolution

[Figure: x-y grid with both dimensions labeled 800 -1200]

• Low Flop/Byte ratio

• Sparse structure requires large streaming memory buffers (14×nx×ny for 14th order in space).

• Data Structure >> Data Caches

• CPUs:

• Constrained by:

• Small L1/L2 cache

• Limited utilization of pipeline

• Limited by Streaming BW

• Limited data element reuse

• Fraction of peak performance
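
To put the 14×nx×ny buffer figure in concrete terms, the short calculation below evaluates it for the grid sizes suggested by the 800 -1200 labels on the slide. Two things are assumptions rather than facts from the talk: that those labels are the x and y grid dimensions, and that each value is a 4-byte single-precision float. The function name is hypothetical.

```python
def stencil_buffer_bytes(order, nx, ny, bytes_per_value=4):
    """Bytes needed to hold `order` x-y planes of an nx-by-ny grid."""
    return order * nx * ny * bytes_per_value

# Assumed grid sizes, taken from the 800 -1200 labels on the slide.
for n in (800, 1200):
    megabytes = stencil_buffer_bytes(14, n, n) / 1e6
    print(f"nx = ny = {n}: {megabytes:.2f} MB")
# prints:
# nx = ny = 800: 35.84 MB
# nx = ny = 1200: 80.64 MB
```

Under these assumptions the working set is tens of megabytes, which is consistent with the slide's point that the data structure is much larger than the data caches.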

Page 31

FPGA Opportunities

[Figure: x-y grid with both dimensions labeled 800 -1200]

• FPGA opportunities

• 4 MB on-chip Memory

• Hundreds of pipeline stages

• Optimal trade off between streams for BW utilization and pipeline depth

• CPU limits:

• Constrained by:

• Small L1/L2 cache

• Limited depth of pipeline

• Limited by Streaming BW

• Limited data element reuse

• Fraction of peak performance

Page 32

Performance

Speedup: 8-core Nehalem 2.93 GHz 1U server vs 1U MaxNode

Algorithm      Hardware   Design   Speedup
Star stencil   VIRTEX 5   3 pipe   20x
Star stencil   VIRTEX 6   9 pipe   73x

Page 33

Outline

• Oil and Gas HPC applications

• Maxeler FPGA Compiler and Accelerators

• Key Computational Kernels in Geophysics

– Sparse Matrix

– Convolution based methods

• Applications scalability – Technology trends

• Conclusions

Page 34

Application scalability and Technology trends

• Transistor count keeps increasing

• Memory BW continues to trail

• How will our algorithms scale?

• Convolution:

– Deeper pipelines:

• An example: Cascading multiple time steps

– Specialized macros on FPGAs

Page 35

FPGA: Time step Cascading

[Figure: x-z grid; label: Stencil width]

Page 36

FPGA: Time step Cascading

[Figure: x-z grid; labels: Stencil width, Advance t1]

Page 38

FPGA: Time step Cascading

[Figure: x-z grid; labels: Stencil width, Advance t1, point X]

All information needed to update t2 at X is now available

Page 39

FPGA: Time step Cascading

[Figure: x-z grid; labels: Stencil width, Advance t1, Advance t2]

All information needed to update t2 at X is now available

Page 40

FPGA: Time step Cascading

[Figure: x-z grid; labels: Stencil width, Advance t1, Advance t2, Advance t3, Advance t4]

One pass through the data advances multiple steps in time

Page 41

FPGA: Time step Cascading

[Figure: x-z grid; labels: Stencil width, Advance t1, Advance t2, Advance t3, Advance t4]

Requires more computational units per pass but reduces memory bandwidth requirements
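
The cascading idea in the preceding slides can be sketched in software. The example below is a minimal illustration, not the authors' FPGA design: it assumes a 1D grid, a 3-point stencil, fixed boundary values, and a made-up coefficient `c`. It advances two time steps in one streaming pass and keeps only a three-value window of the intermediate t1 field. All function names are hypothetical.

```python
import numpy as np

def step(u, c=0.25):
    """One time step of a 3-point stencil; the two boundary values stay fixed."""
    out = u.copy()
    out[1:-1] = u[1:-1] + c * (u[:-2] - 2.0 * u[1:-1] + u[2:])
    return out

def two_passes(u, c=0.25):
    """Reference: two full sweeps, with the intermediate field stored in full."""
    return step(step(u, c), c)

def cascaded_two_steps(u, c=0.25):
    """Advance two time steps in a single streaming pass over u."""
    n = len(u)
    out = u.copy()        # boundaries keep their input values
    window = [u[0]]       # sliding window of t1 values, at most 3 long
    for x in range(1, n):
        # Stage 1: t1 at position x (the boundary at x = n-1 is held fixed).
        if x < n - 1:
            t1 = u[x] + c * (u[x - 1] - 2.0 * u[x] + u[x + 1])
        else:
            t1 = u[x]
        window.append(t1)
        # Stage 2: once t1 is known at x-2, x-1 and x, t2 at x-1 can be produced.
        if len(window) == 3:
            left, centre, right = window
            out[x - 1] = centre + c * (left - 2.0 * centre + right)
            window.pop(0)
    return out

u0 = np.sin(np.linspace(0.0, 3.0, 50))
print(np.allclose(two_passes(u0), cascaded_two_steps(u0)))
```

In the two-pass version the field is read and written twice; in the cascaded version it is read once and written once, at the cost of doing both stencil evaluations per point within the pass. That is the trade of more compute units for less memory bandwidth that the slide describes.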

Page 42

Technology opportunities

[Figure: relative performance (1 to 256, ticks at powers of four) vs. year (2008 to 2018); series: Macros and freq., Memory, Transistor scaling]

Resource costs for a symmetric 15-point stencil:

                               LUT/FFs   DSPs
MaxGenFD on Virtex-5           207       8
MaxGenFD on Virtex-6           33        8
Resulting perf. improvement    50 %

Virtex-6 DSP enhanced with Pre-Adder

• Added resources (transistor scaling) translate directly into performance using multiple time step techniques

• Independent of memory BW increase
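
The table's count of 8 DSPs for a 15-point stencil is consistent with exploiting symmetry: when the weights on either side of the centre are equal, each pair of neighbours can be added first and multiplied once, which is the kind of operation a pre-adder in front of the multiplier supports. The sketch below shows the arithmetic only. It is an assumed illustration with made-up coefficients, not MaxGenFD code, and the function name is hypothetical.

```python
import numpy as np

def symmetric_stencil(u, coeffs):
    """Apply a symmetric 1D stencil.

    coeffs[0] weights the centre point; coeffs[k] weights both u[i-k] and u[i+k].
    Only interior points with a full set of neighbours are returned.
    """
    r = len(coeffs) - 1
    n = len(u)
    out = coeffs[0] * u[r:n - r]
    for k in range(1, r + 1):
        pre_add = u[r - k:n - r - k] + u[r + k:n - r + k]  # add the pair first
        out = out + coeffs[k] * pre_add                     # one multiply per pair
    return out

# Made-up weights: 8 coefficients describe a 15-point symmetric stencil.
coeffs = np.array([-3.0, 1.7, -0.3, 0.1, -0.04, 0.02, -0.01, 0.005])
rng = np.random.default_rng(0)
u = rng.standard_normal(64)

result = symmetric_stencil(u, coeffs)
full_kernel = np.concatenate([coeffs[:0:-1], coeffs])  # all 15 weights
print(len(full_kernel), len(coeffs))
```

Written this way, each output needs 8 multiplies instead of 15, with the pairwise additions done before the multiply.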

Page 43

Outline

• Oil and Gas HPC applications

• Maxeler FPGA Compiler and Accelerators

• Key Computational Kernels in Geophysics

– Sparse Matrix

– Convolution based methods

• Applications scalability – Technology trends

• Conclusions

Page 44

Surviving the End of Scaling of Traditional Micro Processors in HPC

[Figure: Conventional Road Map; relative CPU cost (log scale, 1 to 1,000,000) vs. imaging complexity; algorithms shown: RTM, FWI-Elastic; year labels '09, '10, '18, '20; legend: memory and disk space cost]

• Conclusions:

• FPGA Streaming has come of age

• Development Environment is here today

• Application will scale with predicted technology evolution

• Considerable upside for “smart macros”

Page 45

Surviving the End of Scaling of Traditional Micro Processors in HPC

[Figure: FPGA road maps; relative CPU cost (log scale, 1 to 1,000,000) vs. imaging complexity; algorithms shown: RTM, FWI-Elastic; year labels '09, '10, '13, '15; legend: memory and disk space cost]

• Conclusions:

• FPGA Streaming has come of age

• Development Environment is here today

• Application will scale with predicted technology evolution

• Considerable upside for “smart macros”

Page 46

Thank You

Page 47

GPU Comments