From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006

Dec 19, 2015

Page 1: From Organic Computing to Reconfigurable Computing

Reiner Hartenstein

TU Kaiserslautern

PASA, Frankfurt, March 16, 2006

Page 2: Reconfigurable Computing (RC) and FPGA* in the media

© 2005, [email protected] http://hartenstein.de

TU Kaiserslautern

Design starts until 2010: from 80,000 to 110,000 [Dataquest]

June 2005

fastest growing segment of the semiconductor market: ~6 billion US-$ [Dataquest]

*) Field-Programmable Gate Array

Google: 10 million hits

Page 3: The Pervasiveness of RC

[Figure: number of hits by Google for searches of the form “FPGA and ….”. The extracted counts have lost their labels: 162,000; 127,000; 158,000; 113,000; 171,000; 194,000; 1,620,000; 915,000; 398,000; 272,000; 647,000; 1,490,000.]

Page 4: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 5: The RC Paradox

Effective integration density much worse than the Gordon Moore curve: by a factor of more than 10,000

„very power-hungry“ [Rick Kornfeld*]

*) personal communication

application development: until recently still Logic Design on a very strange platform

The awful technology of FPGAs:

FPGAs run at lower clock frequencies, draw more power and are more expensive.

Page 6: fine-grained RC: low effective integration density

• immense area inefficiency
• reconfigurability overhead
• routing congestion
• wiring overhead

overhead: > 10 000

[Figure: transistors / microchip (10^0 to 10^9, log scale) versus year (1980 to 2010). Curves: microprocessor (Gordon Moore curve) and FPGA density: physical, logical, routed. Source: DeHon, Ph.D. 1996]

Page 7: published speed-up factors

[Figure: relative performance (10^0 to 10^9, log scale) versus year (1980 to 2010), with reference curves for microprocessor (8080 to P4) and memory, annotated 7%/yr, 50%/yr and X 2/yr. Source: http://xputers.informatik.uni-kl.de/faq-pages/fqa.html]

Application areas: DSP and wireless; image processing, pattern matching, multimedia; bioinformatics; astrophysics; crypto.

Speed-up factors shown:

• Los Alamos traffic simulation: 47
• real-time face detection: 6000
• video-rate stereo vision: 900
• pattern recognition: 730
• SPIHT wavelet-based image compression: 457
• Smith-Waterman pattern matching: 288
• BLAST: 52
• protein identification: 40
• molecular dynamics simulation: 88
• Reed-Solomon decoding: 2400
• Viterbi decoding: 400
• FFT: 100
• MAC: 1000
• GRAPE: 20

MoM Xputer architecture:

• Grid-based DRC (no FPGA: DPLA on MoM by TU-KL): 2000
• Grid-based DRC („fair comparison“): 15000
• 2-D FIR filter (no FPGA: DPLA by TU-KL): 39.4
• Lee Routing (DPLA by TU-KL): 160

Page 8: DeHon‘s Law

[Figure: MOPS / milliWatt (1 to 1000, log scale) versus feature size in µ (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07). Curves: RISC and FPGA.]

Page 9: However ....

Application migration [from supercomputer] resulting in performance increase up to 4 orders of magnitude

Reducing electricity bill by an order of magnitude

Hits the memory wall from a different direction

People think that high-performance must mean expensive

Page 10: why the RC paradigm shift is so important

Move the stool or the grand piano?

by Software

by Configware

Page 11: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 12: vN paradigm losing its dominance

Cray XD1: Xilinx inside!

[Figure: Cray XD1 with Xilinx FPGA]

Page 13: von Neumann is not the common model

[Diagram: the von Neumann instruction-stream-based machine: CPU (program counter, DPU) and RAM memory, connected through the von Neumann bottleneck]

mainframe age: CPU (instruction-stream-based; software)

microprocessor age: CPU (instruction-stream-based; software) plus accelerator co-processors (data-stream-based; hardware)

the tail is wagging the dog

vN paradigm dominance?

Page 14: Here is the common model

[Diagram: the von Neumann instruction-stream-based machine: CPU (program counter, DPU) and RAM memory, connected through the von Neumann bottleneck]

mainframe age: CPU (instruction-stream-based; software)

microprocessor age: CPU (instruction-stream-based; software) plus accelerator co-processors (data-stream-based; hardware)

configware age: CPU plus hardwired accelerator plus reconfigurable accelerator (morphware)

Page 15: Here is the common model

[Diagram: the von Neumann instruction-stream-based machine: CPU (program counter, DPU) and RAM memory, connected through the von Neumann bottleneck]

mainframe age: CPU (instruction-stream-based; software)

microprocessor age: CPU (instruction-stream-based; software) plus accelerator co-processors (data-stream-based; hardware)

configware age: CPU plus reconfigurable accelerator (morphware), both served by a software/configware co-compiler

Page 16: Fundamentally different mind set

no program counter

non-von-Neumann

completely different OS principles

no instruction fetch at run time

it’s configware: definitely it is not software

Page 17: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 18: Compilation: Software vs. Configware

Source languages: C, FORTRAN, MATLAB

Software Engineering: source program → software compiler → software code

Configware Engineering: source „program“ → configware compiler, consisting of:

• mapper (placement & routing) → configware code
• data scheduler → flowware code

Page 19: Nick Tredennick’s Paradigm Shifts explain the differences

Software Engineering (CPU):

• resources: fixed
• algorithm: variable (software)
• 1 programming source needed

Configware Engineering:

• resources: variable (configware)
• algorithm: variable (flowware)
• 2 programming sources needed

Page 20: Co-Compilation

Source languages: C, FORTRAN, MATLAB

Software / Configware Co-Compiler:

• automatic SW / CW partitioner (simulated annealing)
• software compiler → software code
• configware compiler: mapper (simulated annealing) → configware code; data scheduler → flowware code

Page 21: Organic Computing? Bio-inspired use of FPGAs

• evolvable „hardware“ community:

• crossover of chromosomes

• In love with genetic algorithms: darwinistic way to fitness thru generations of populations

• inefficient, but unexpected results possible

• simulated annealing (genetic morphing) - fitness by synthesis: highly efficient

Page 22: Software / Configware Co-Compilation

Juergen Becker’s CoDe-X, 1996

C language source → Analyzer / Profiler → Partitioner (simulated annealing), using Resource Parameters (supporting different platforms)

• SW compiler → SW code (“vN" machine paradigm)
• CW compiler → CW Code and FW Code (Kress/Kung machine paradigm)

Page 23: Co-Compiler for Hardwired Kress/Kung Machine [e. g. Brodersen]

Software / Flowware Co-Compiler:

• source → automatic SW / CW partitioner
• software compiler → software code
• flowware compiler (data scheduler) → flowware code

Page 24: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 25: The dual paradigm approach

von Neumann paradigm: Software Engineering (CPU)

Kress-Kung paradigm: Configware Engineering (ASM)

Page 26: Data streams (flowware)

Flowware defines: ... which data item at which time at which port

[Diagram: a DPA (pipe network) with input data streams and output data streams, each drawn as data items (x) over time and port #, surrounded by ASMs]

algebraic synthesis algorithms: H. T. Kung paradigm (systolic array)

ASM: Auto-Sequencing Memory (RAM with GAG), implemented by distributed memory
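The statement "which data item at which time at which port" can be illustrated with a minimal sketch. It assumes a systolic-style skew in which stream p starts p time steps after stream 0; the function name `flowware_schedule`, the `skew` parameter and the item labels are hypothetical and not taken from the talk.

```python
from typing import List, Tuple


def flowware_schedule(streams: List[List[str]], skew: int = 1) -> List[Tuple[int, int, str]]:
    """Return (time, port, item) triples for a set of input data streams.

    Assumption for illustration: item k of the stream on port p enters
    the array at time k + p * skew (a skewed, systolic-style schedule).
    """
    schedule = []
    for port, items in enumerate(streams):
        for k, item in enumerate(items):
            schedule.append((k + port * skew, port, item))
    return sorted(schedule)


# Two input ports, three data items each (labels are made up).
streams = [["x00", "x01", "x02"], ["x10", "x11", "x12"]]
schedule = flowware_schedule(streams)
for time, port, item in schedule:
    print(f"t={time} port={port} item={item}")
```

The point of the sketch is only that the schedule is data about data movement: no instruction stream is involved in deciding when an item arrives at a port.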

Page 27: DSP platform FPGA [courtesy Xilinx Corp.]

• 500 MHz Flexible Soft Logic Architecture
• 200K Logic Cells
• 500 MHz Programmable DSP Execution Units
• 0.6-11.1 Gbps Serial Transceivers
• 500 MHz PowerPC™ Processors (680 DMIPS) with Auxiliary Processor Unit
• 1 Gbps Differential I/O
• 500 MHz multi-port Distributed 10 Mb SRAM
• 500 MHz DCM Digital Clock Management

Page 28: Generalization of the systolic array ....

discard algebraic synthesis methods

[Rainer Kress]

use optimization algorithms instead

for example: simulated annealing

the achievement: non-linear and non-uniform pipes, and even wilder pipe structures, also become possible

now reconfigurability makes sense

remedy?

Page 29: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 30: Coarse grain is about computing, not logic

SNN filter on KressArray (mainly a pipe network) [Ulrich Nageldinger]

Example: mapping onto rDPA by DPSS: based on simulated annealing

array size: 10 x 16 = 160 rDPUs

rDPU: reconfigurable function block, e. g. 32 bits wide; no CPU

Legend: rDPU not used; used for routing only (route-through only); operator and routing; port location marker; backbus connect
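Mapping by simulated annealing can be sketched roughly as follows. This is not DPSS itself: the cost function (Manhattan wire length), the linear cooling schedule, the move set (swap two array cells) and all names are assumptions chosen to keep the example small.

```python
import math
import random


def wirelength(place, nets):
    """Sum of Manhattan distances between connected operators."""
    return sum(abs(place[a][0] - place[b][0]) + abs(place[a][1] - place[b][1])
               for a, b in nets)


def anneal_placement(ops, nets, rows, cols, steps=4000, t0=4.0, seed=0):
    """Place operators on a rows x cols array of cells by simulated annealing."""
    rng = random.Random(seed)
    slots = [(r, c) for r in range(rows) for c in range(cols)]
    rng.shuffle(slots)  # slots[i] holds ops[i]; the remaining slots stay unused

    def placement():
        return {op: slots[i] for i, op in enumerate(ops)}

    cost = wirelength(placement(), nets)
    best, best_cost = placement(), cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9   # linear cooling (assumed)
        i = rng.randrange(len(ops))               # an operator ...
        j = rng.randrange(len(slots))             # ... trades cells with any slot
        slots[i], slots[j] = slots[j], slots[i]
        new_cost = wirelength(placement(), nets)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost                       # accept the move
            if cost < best_cost:
                best, best_cost = placement(), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
    return best, best_cost


# A five-stage pipe on a 3 x 3 array (operator names are made up).
ops = ["in", "mul", "add", "sub", "out"]
nets = [("in", "mul"), ("mul", "add"), ("add", "sub"), ("sub", "out")]
best, best_cost = anneal_placement(ops, nets, rows=3, cols=3)
print(best, best_cost)
```

Because worse placements are occasionally accepted while the temperature is high, the search can leave local minima, which is why non-uniform pipe structures are not a problem for this kind of mapper.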

Page 31: coarse-grained RC: high integration density

The Reconfigurable Computing Paradox

[Figure: transistors / microchip (10^0 to 10^9, log scale) versus year (1980 to 2010). Curves: Gordon Moore curve, rDPA physical, rDPA logical, and FPGA routed (overhead > 10 000). Source: Hartenstein, ISIS 1996]

Page 32: Claassen‘s Law + Hartenstein‘s Amendment

[Figure: MOPS / milliWatt (0.001 to 1000, log scale) versus feature size in µ (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07). Curves: hardwired; hardwired and coarse-grained reconf. (rDPA); FPGAs (fine grained reconf.); instruction set processors: DSP and standard microprocessor]

Page 33: commercial rDPA example: PACT XPP - XPU128

XPP128 rDPA (buses not shown)

• Full 32 or 24 Bit Design, working silicon
• 2 Configuration Hierarchies
• Evaluation Board available, and
• XDS Development Tool with Simulator

[Diagram: (r)DPA built from rDPUs: PAE cores with ALU and ALU Ctrl, plus CFG units]

© PACT AG, http://pactcorp.com

Page 34: Outline

• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions

http://www.uni-kl.de

Page 35: Conclusions

RC is reducing cost without loss of performance and flexibility.

FPGAs may be configured like a micro-processor for C/C++ code. An FPGA can perform a specific algorithm at very high speed.

Using a high-level language, the FPGA can be programmed for a wide variety of algorithms without any deep knowledge of the underlying architecture.

RC is reducing the electricity bill and the required building floor area

Speed-up factors of up to 4 orders of magnitude have been reported

Compared to ASICs, prototyping time is on the order of hours rather than months, with a cost less than a tenth of that for an ASIC.

The personal supercomputer is near

Page 36: Conclusions (2)

We urgently need Reconfigurable Computing Education

An Update of CS curricula is overdue

Page 37: END

Page 38: thank you

Page 39: The first archetype machine model

Software Industry’s Secret of Success: simple basic Machine Paradigm

• “von Neumann”: instruction-stream-based mind set
• mainframe, CPU
• procedural personalization: compile or assemble
• personalization: RAM-based

Page 40: An Archetype Common Model needed

Useful simple archetype not widely accepted

Progress stalled by the software/configware chasm

Archetype common model should provide ....

• Guidance for organizing efficient solutions
• Make the project manageable
• Allow sharing lessons between applications and between application areas

Configware Industry from the

Page 41: The 2nd archetype machine model

Configware Industry’s Secret of Success: simple basic Machine Paradigm

• “Kress-Kung”: data-stream-based mind set
• accelerator reconfigurable
• structural personalization: compile
• personalization: RAM-based

Page 42: configware solution: computing in space

[Diagram: array of rDPUs; for demo: a tiny section of the pipe network, with a + operator producing S]

inter-rDPU-communication: no memory cycles needed

Page 43: Compare it to software solution on CPU

S = R + (if C then A else B endif);

Steps on a very simple CPU (C = 1), to be counted in memory cycles and nanoseconds:

• if C then read A: read instruction; instruction decoding; read operand*; operate & register transfers
• if not C then read B: read instruction; instruction decoding
• add & store: read instruction; instruction decoding; operate & register transfers; store result
• total

[Diagram: pipe-network version with inputs A, B, R, C, a decision (=1), a + operator and output S; clock 200]

Page 44: hypothetical branching example to illustrate software-to-configware migration

S = R + (if C then A else B endif);

simple conservative CPU example, C = 1:

step                          memory cycles   nanoseconds
if C then read A
  read instruction                  1             100
  instruction decoding
  read operand*                     1             100
  operate & reg. transfers
if not C then read B
  read instruction                  1             100
  instruction decoding
add & store
  read instruction                  1             100
  instruction decoding
  operate & reg. transfers
  store result                      1             100
total                               5             500

*) if no intermediate storage in register file

section of a major pipe network on rDPU (inputs A, B, R, C; decision =1; + operator; output S): clock 200 MHz (5 nanosec), no memory cycles: speed-up factor = 100
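The speed-up factor of 100 follows directly from the numbers on this slide. The sketch below redoes the arithmetic; it assumes, as the slide implies, that the pipe network delivers a result within one 200 MHz clock period. The variable names are made up for illustration.

```python
# Memory cycles per CPU step, copied from the table above;
# steps without an entry need no memory cycle.
MEMORY_CYCLES = {
    "read instruction (if C then read A)": 1,
    "read operand A": 1,
    "read instruction (if not C then read B)": 1,
    "read instruction (add & store)": 1,
    "store result": 1,
}
MEMORY_CYCLE_NS = 100.0   # nanoseconds per memory cycle, as in the table
RDPU_CLOCK_MHZ = 200.0    # pipe-network clock from the slide

total_cycles = sum(MEMORY_CYCLES.values())
cpu_ns = total_cycles * MEMORY_CYCLE_NS   # time of the software solution
rdpu_ns = 1000.0 / RDPU_CLOCK_MHZ         # one clock period in nanoseconds
speedup = cpu_ns / rdpu_ns

print(total_cycles, cpu_ns, rdpu_ns, speedup)
```

The gain comes from avoiding memory cycles, not from a faster clock: the 5 ns period only matters because no instruction or operand has to be fetched.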

Page 45: The wrong mind set ....

„but you can‘t implement decisions!“

S = R + (if C then A else B endif);

[Diagram: section of a very large pipe network: inputs A, B, R, C; the decision (=1) selects A or B, and the + operator adds the selected value to R]

not knowing this solution: symptom of the hardware / software chasm and the configware / software chasm
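How a decision becomes part of a pipe network can be sketched as a selection between two data streams. This is only a software model of the idea: the function name `mux_add_pipe` and the sample values are hypothetical, and C is assumed to carry only the values 0 and 1.

```python
from typing import Iterable, Iterator


def mux_add_pipe(r: Iterable[int], c: Iterable[int],
                 a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Model of the pipe section S = R + (if C then A else B).

    All four input streams advance in lockstep, one item per step.
    The decision is a multiplexer on the data path, written here
    without a branch: C = 1 passes A, C = 0 passes B.
    """
    for r_i, c_i, a_i, b_i in zip(r, c, a, b):
        selected = c_i * a_i + (1 - c_i) * b_i
        yield r_i + selected


# Three items per stream (values are made up).
s = list(mux_add_pipe(r=[10, 20, 30], c=[1, 0, 1], a=[1, 2, 3], b=[100, 200, 300]))
print(s)
```

Both A and B are always present at the selector; the decision only picks one of them, so no instruction has to be fetched to implement it.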

Page 46: The hardware / software chasm

If I use the term "software", a variety of images might appear in the engineering audience's mind.

Still we have "hardware" engineers and "software" engineers that go to different schools, attend different conferences, avoid each other's cocktail parties, and almost never play on the same volleyball teams at the company picnic. System designers begin to plan their creations around the skill sets and development processes of hardware engineers and software engineers. The two become oil and water.


Page 47: Blurred line between hardware and software

The line between "hardware" and "software" is rapidly blurring and even becoming irrelevant from a system design perspective. As this happens, the traditional roles and skillsets of hardware and software engineers are being challenged, and a new generation of designers is emerging as a result.

the obfuscation caused by the pervasiveness of softness.

Page 48: We need Reconfigurable Computing Education

We need a unification in dealing with problems, which are shared across many different application domains

There is an urgent need to cure severe qualification deficiencies of our graduates.

We need new curricula in CS and CE for providing an integrating dual paradigm mind set instead of vN-only

Page 49: Terminology clean-up

Programming sources:

• Software: for scheduling instruction streams (von Neumann)
• Flowware: for scheduling data streams (primarily non-von Neumann)
• Configware: for configuring morphware (primarily non-von Neumann)

Page 50: Why coarse grain

instead of FPGA use rDPA

instead of rLB (~1 bit wide) use rDPU (e. g. 32 bits wide): reconfigurable Data Path Unit (e. g. rALU)

• much more MOPS/milliWatt
• much more area-efficient
• much less reconfigurability overhead
• mind set close to classical computing background

[Diagram: array of rDPUs, Reconfigurable Computing (RC)]

Page 51: „data stream“: an ambiguous definition

Reconfigurable Computing is not instruction-stream-based

it‘s data-stream-based

it‘s different from the operation of the (indeterministic) „dataflow machine“

other definition also from multimedia area

usable definition from systolic array area

Page 52: Outline

• Reconfigurable Devices
• Coarse-grained Reconfigurable Devices
• Data-stream-based Computing
• The contemporary Common Model
• Reconfigurable Supercomputing
• Conclusions

http://www.uni-kl.de

Page 53: Why the speed-up ...

... although the FPGA clock is slower by a factor of 3 or even more

(most know-how from the „high level synthesis“ discipline)

decisions without memory cycles or clock cycles

most „data fetch“ without memory cycle

Page 54: data moved around by software

i.e. by memory-cycle-hungry instruction streams which fully hit the memory wall

extremely unbalanced

P&R: move locality of operation, not data!

[Figure: CPU, stolen from Bob Colwell]

Page 55: Replace Caches by ...

… by 16 x 16 reconfigurable data path array (rDPA) which fits on the same chip

[Figure: CPU caches, stolen from Bob Colwell]

Page 56: Similarly skilled

With hardware description languages, hardware engineers had to adopt the methodologies and techniques of software engineers. Increased softness has an impact on even our products themselves.

The required skills for your respective jobs are converging (against the grain in an age of increased specialization) and you'll soon be working with (and competing against) a new generation of embedded engineers that are similarly skilled in both disciplines.

Page 57: Using FPGAs

Reducing cost without loss of performance and flexibility.

An FPGA may be configured like a general flexible micro-processor executing conventional C/C++ code, or as a highly specific device. The programmability of FPGAs distinguishes them from ASICs.

An FPGA can perform a specific algorithm at very high speed. Compared to ASICs, prototyping time is on the order of hours rather than months, with a cost less than a tenth of that for an ASIC.

Using a high-level language, the FPGA can be programmed for a wide variety of algorithms without any deep knowledge of the underlying architecture.

Field-programmable FPGAs

Page 58: Co-Compiler Enabling Technology

is available from academia

only a small team needed for commercial re-implementation

on the road map to the Personal Supercomputer

Page 59: Conclusions (1)

We need a unification in dealing with problems, which are shared across many different application domains.

RC suffers from fragmentation into different cultures of the many application domains.

CS is the only domain qualified for such an effort

Page 60: Conclusions (2)

IEEE Computer Society should advocate improved application development methodologies and a common educational approach useful for the wide variety of application domains

inside IEEE Computer Society, a TC on RC should lobby for more

Page 61: Conclusions (3)

reverse the downtrend in CS enrolment

educate not only students …

increase membership

make CS more fascinating

Strategic issue for entire IEEE Computer Society

Page 62: Conclusions (4)

The personal supercomputer is near, not only for the desktop, but also for a new road map to large scale supercomputing of up to now unthinkable highest performance dimensions.

IEEE-CS should accept this fascinating challenge, by spearheading the paradigm shift.

IEEE-CS is needed as a translator to explain the impact to managers and to a wide public.

Page 63: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

RC education last week at Karlsruhe

Attendees declared ready to work for a task force

35 submissions from Australia, Brazil, India, the USA, and throughout Europe

But education is just one of several facets …

Page 64: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

However ....

“What did you say again that your company does?” my father asked. “Gate arrays,” I replied. “They’re chips used to…”

“Oh yes, that’s right, Gatorade.” ….. “I used to give that to my marching band members so they wouldn’t get dehydrated on hot days. Don’t remember it coming in chip form …..”

How do you explain to your grandmother what it means to be one of the world’s leading experts on optical proximity correction (OPC) for nanometer-scale semiconductor lithography?

Could you perhaps relate it to some difficulty she has with needlepoint and her cataracts?

Even those with a scientific or technical background often won’t understand precisely what we do. A PhD in molecular biology won’t help to understand VHDL and Verilog synthesis for FPGAs.

Trying to relate DNA sequences to LUT truth tables might offer a starting point, but somebody has to be able to bridge the technology and terminology gap, even to initiate that analogy. Try explaining FPGAs with the consumer electronics approach. “People tend to relate when you tell them what your part goes into. Today, finally, ‘chip’ seems universally understood. I never get people asking about potato chips anymore.”

Page 65: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

However ....

Abstract. Google’s jaw-dropping hit rates illustrate the pervasiveness of Reconfigurable Computing (RC), which has been mainstream in embedded systems for years and is now being adopted by supercomputing (Cray, SGI, etc.). With FPGAs used as accelerators, speed-up factors of up to two orders of magnitude are reported, along with floor space requirements and electricity bills reduced by one order of magnitude. Speed-ups of about three orders of magnitude and more are obtained with coarse-grained reconfigurable datapath arrays (rDPAs), available from a number of start-ups.

This is astonishing, since FPGAs and rDPAs have a substantially lower clock speed than microprocessors. Algorithmic cleverness is the secret of success, based on software-to-configware migration mechanisms that move away from memory-cycle-hungry, instruction-stream-based computing paradigms.

The main benefit of RC platforms, having replaced the use of hardwired accelerators, is their flexibility through non-procedural programmability. This also contributes to those concepts of Organic Computing which rely on processes of evolution, self-organization, adaptation and fault tolerance. The main hurdles on the way to heart-stopping new horizons of cheap, very high performance are CS-related educational deficits, which cause the configware/software chasm, and a fragmentation of methodology between the different cultures of application domains. Current CS curricula do not sufficiently meet their transdisciplinary responsibility. The talk gives a survey of fundamental issues in RC and of new directions in CS-related curricula, focused on a dual-paradigm organic computing approach.

Page 66: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

However ....

Application migration [from supercomputer] results in a performance increase of up to 4 orders of magnitude

“Saves more than $10,000 in electricity bills per year (7¢ / kWh) - .... per 64-processor 19" rack” [Herb Riley, R. Associates]

Reduces the electricity bill by an order of magnitude

Hits the memory wall from a different direction
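To put the quoted electricity saving into physical terms, the following back-of-the-envelope sketch converts it into an average power reduction. The dollar amount, the price per kWh and the 64-processor rack size come from the quote above. Continuous operation (8760 hours per year) is an assumption made here, and the function name is purely illustrative. Since the quote says "more than $10,000", the result is a lower bound.

```python
# Back-of-the-envelope check of the quoted electricity saving.
# From the slide: $10,000 saved per year at 7 cents per kWh, 64-processor rack.
# Assumption (not from the slide): the rack runs around the clock.

HOURS_PER_YEAR = 24 * 365  # assumed continuous operation, 8760 h


def implied_power_saving_kw(savings_usd_per_year: float,
                            price_usd_per_kwh: float) -> float:
    """Average power reduction (kW) implied by a yearly electricity saving."""
    energy_kwh_per_year = savings_usd_per_year / price_usd_per_kwh
    return energy_kwh_per_year / HOURS_PER_YEAR


saving_kw = implied_power_saving_kw(10_000, 0.07)
per_processor_w = saving_kw * 1000 / 64  # spread over the 64 processors

print(f"{saving_kw:.1f} kW per rack, {per_processor_w:.0f} W per processor")
# prints "16.3 kW per rack, 255 W per processor"
```

Under these assumptions the saving corresponds to roughly 16 kW less average draw per rack. Whether that figure includes cooling or other overhead is not stated in the quote.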

Page 67: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

However ....

Page 68: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

Conclusions

IEEE Computer Society should advocate introducing a dual-paradigm approach, away from the monopoly of the von Neumann (vN) mind set

IEEE Computer Society should advocate a common model useful for the wide variety of application domains

Page 69: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

Conclusions

We need a unified approach to problems that are shared across many different application domains.

RC suffers from fragmentation into different cultures of the many application domains.

Each domain uses its own trick box.

We should teach the world to think outside the box.

CS is the only domain qualified for this unification.

Page 70: From Organic Computing to Reconfigurable Computing Reiner Hartenstein TU Kaiserslautern PASA, Frankfurt, March 16, 2006.

An Archetype Common Model needed

from the Configware Industry

IEEE Computer Society should advocate introducing dual-paradigm transdisciplinary education, using Configware Engineering as the counterpart of Software Engineering. New curricula in CS and CE should provide an integrating dual-paradigm mind set that supports a unified approach to problems shared across many different application domains, in order to cure severe qualification deficiencies of our graduates.