A High Performance Application Representation for Reconfigurable Systems Wenrui Gong Gang Wang Ryan Kastner Department of Electrical and Computer Engineering University of California Santa Barbara, CA 93106-9560 {gong, wanggang, kastner}@ece.ucsb.edu http://express.ece.ucsb.edu June 22, 2004
Dec 20, 2015

A High Performance Application Representation for Reconfigurable Systems

Wenrui Gong, Gang Wang, Ryan Kastner
Department of Electrical and Computer Engineering
University of California
Santa Barbara, CA 93106-9560

{gong, wanggang, kastner}@ece.ucsb.edu
http://express.ece.ucsb.edu

June 22, 2004

6/21/2004

GONG et al: A High Performance Application Representation for Reconfigurable Systems 2

Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks

Outline

Reconfigurable computing systems
  Reconfigurable computing systems
  Challenges of application representations
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks

Reconfigurable Computing Systems

Standard programmable platforms
Post-manufacturing customization
Designs shift from physical chips to configuration files
A software design flow
Feature hardware speed with software flexibility
Enable higher productivity

Application Representations

A common application representation is needed to tame the complexity of system synthesis

Requirements
  Able to generate software code for microprocessors
  Able to be easily translated to hardware configuration files
  Allows a variety of transformations and optimizations to exploit the available performance

Parallelism Exploration

Fine grain parallelism
  Multiple functional units
  Issuing an operation to a free functional unit
  Operations executed independently
Coarse grain parallelism
  Executing multiple threads
  With occasional synchronization

Reconfigurable computing systems support both fine and coarse grain parallelism

PDG + SSA

The PDG + SSA representation can be used for both hardware synthesis and software generation

The PDG and SSA forms are common representations for software generation

Here we concentrate on hardware synthesis

Outline

Reconfigurable computing systems
Compilation process
  Overview
  Constructing the PDG
  Incorporating the SSA form
Synthesizing to hardware
Experimental results
Concluding remarks

Overview

Program Dependence Graph

PDG: Program Dependence Graph
ENTRY node: the root node of a PDG
PREDICATE nodes: produce predicate values from expressions
  Diamond-shaped nodes 2, 3, and 4
STATEMENTS nodes: an arbitrary set of operations
  Circle nodes 1, 4, 6, 7, and 8
REGION nodes: summarize all operations that share the same control conditions
  House-shaped nodes R2, R3, R4, …
  R3: the predicate value of 2 is True

Edges represent dependencies
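The node kinds listed above could be held in a structure like the following. This is only a minimal sketch: the type names, the fixed fan-out limit `PDG_MAX_SUCCS`, and the helper functions are hypothetical and are not taken from the authors' implementation.

```c
#include <stddef.h>

/* Node kinds named on the slide. */
typedef enum { PDG_ENTRY, PDG_PREDICATE, PDG_STATEMENTS, PDG_REGION } PdgKind;

#define PDG_MAX_SUCCS 8   /* hypothetical fixed fan-out, for illustration only */

typedef struct PdgNode {
    PdgKind kind;
    int id;                                 /* label used in the figure, e.g. 2 or R3 */
    struct PdgNode *succs[PDG_MAX_SUCCS];   /* outgoing dependence edges */
    size_t n_succs;
} PdgNode;

/* Add a dependence edge from -> to. Returns 0 on success, -1 if from is full. */
int pdg_add_edge(PdgNode *from, PdgNode *to) {
    if (from->n_succs >= PDG_MAX_SUCCS) return -1;
    from->succs[from->n_succs++] = to;
    return 0;
}

/* Count nodes of one kind below root. Assumes the edges form a tree;
 * a shared node would be counted once per incoming path. */
size_t pdg_count_kind(const PdgNode *root, PdgKind kind) {
    size_t n = (root->kind == kind) ? 1 : 0;
    for (size_t i = 0; i < root->n_succs; ++i)
        n += pdg_count_kind(root->succs[i], kind);
    return n;
}
```

A caller would create one ENTRY node, hang PREDICATE and REGION nodes below it with `pdg_add_edge`, and attach STATEMENTS nodes to the REGION node whose control condition they share.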

Constructing the PDG from the CDFG

Implemented based on Ferrante’s algorithm
Using the post-dominator tree

val = pred;
for (i = 0; i < len; ++i) {
    val += diff;
    if (val > 32767)
        val = 32767;
    else if (val < -32768)
        val = -32768;
}
return val;
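The control-dependence step that the post-dominator tree enables can be sketched as below. This is a simplified illustration and not the authors' code: the five-node CFG is a hypothetical if-then-else rather than the loop above, the immediate post-dominators are assumed to be precomputed, and region-node insertion is omitted. For each CFG edge (a, b), every node on the post-dominator-tree path from b up to, but not including, the immediate post-dominator of a is marked as control dependent on a.

```c
#include <stdbool.h>
#include <string.h>

#define N_NODES 5
#define N_EDGES 5

/* Hypothetical CFG for one if-then-else:
 * 0: branch, 1: then, 2: else, 3: join, 4: exit. */
static const int edges[N_EDGES][2] = { {0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4} };

/* Immediate post-dominator of each node, assumed precomputed; -1 marks the root. */
static const int ipdom[N_NODES] = { 3, 3, 3, 4, -1 };

/* On return, cd[n][a] is true when node n is control dependent on node a. */
void control_dependence(bool cd[N_NODES][N_NODES]) {
    memset(cd, 0, sizeof(bool) * N_NODES * N_NODES);
    for (int e = 0; e < N_EDGES; ++e) {
        int a = edges[e][0];
        /* Walk up the post-dominator tree from b; stop at ipdom(a). */
        for (int n = edges[e][1]; n != -1 && n != ipdom[a]; n = ipdom[n])
            cd[n][a] = true;
    }
}
```

In this small graph only the then and else blocks end up control dependent on the branch; the join and exit nodes post-dominate it and therefore depend on no predicate.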

Constructing the PDG (cont’d)

The Static Single Assignment Form

Each variable has exactly one assignment
A variable is always referenced using the same name
At join points of control flow, special φ-nodes are inserted

Original code:

val += diff;
if (val > 32767)
    val = 32767;
else if (val < -32768)
    val = -32768;

SSA form:

val_2 = val_1 + diff;
if (val_2 > 32767)
    val_3 = 32767;
else if (val_2 < -32768)
    val_4 = -32768;
val_5 = phi(val_2, val_3, val_4);
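To check that the renaming preserves behaviour, the two fragments can be wrapped in functions and compared. This is a sketch under assumptions: the function names and the predicate names `p1` and `p2` are illustrative, and the phi is written as a nested conditional expression, which is one way to give it an executable meaning in plain C.

```c
/* Original form from the slide: val is reassigned in place. */
int clamp_step(int val, int diff) {
    val += diff;
    if (val > 32767)
        val = 32767;
    else if (val < -32768)
        val = -32768;
    return val;
}

/* SSA form: every name is assigned exactly once. */
int clamp_step_ssa(int val_1, int diff) {
    const int val_2 = val_1 + diff;
    const int p1 = val_2 > 32767;     /* predicate of the first branch */
    const int p2 = val_2 < -32768;    /* predicate of the second branch */
    const int val_3 = 32767;
    const int val_4 = -32768;
    /* phi(val_2, val_3, val_4): pick the definition that reaches the join. */
    const int val_5 = p1 ? val_3 : (p2 ? val_4 : val_2);
    return val_5;
}
```

Both functions return the same result for any inputs whose sum does not overflow `int`, for example 32767 for `(32760, 10)` and 12 for `(5, 7)`.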

Extending the PDG with φ-Nodes

The Program Representation

Loop-independent φ-nodes
  Take two or more input values and a predicate value
  Commit one of the inputs depending on this predicate
Loop-carried φ-nodes
  Inputs: the initial value, the loop-carried value, and a predicate value
  Outputs: one to the iteration body, and the other to the loop exit
  Direct proper values to proper outputs
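One possible reading of the two node types, applied to the saturating loop from the earlier slide, is sketched below. The names `phi_select`, `phi_loop`, `LoopPhiOut`, and `saturating_accumulate` are hypothetical, and the explicit `first` flag used to choose between the initial and the loop-carried value is an assumption of this sketch rather than something the slides specify.

```c
#include <stdbool.h>

/* Loop-independent phi: commits one input according to the predicates. */
int phi_select(bool p1, int if_p1, bool p2, int if_p2, int otherwise) {
    return p1 ? if_p1 : (p2 ? if_p2 : otherwise);
}

/* Result of a loop-carried phi: a value plus the output it is routed to. */
typedef struct {
    bool to_body;   /* true: feeds the iteration body; false: goes to the loop exit */
    int value;
} LoopPhiOut;

/* Loop-carried phi: takes the initial value on the first pass and the
 * loop-carried value afterwards; the loop predicate picks the output. */
LoopPhiOut phi_loop(bool first, int init, int carried, bool loop_pred) {
    LoopPhiOut out;
    out.value = first ? init : carried;
    out.to_body = loop_pred;
    return out;
}

/* The loop from the earlier slide, driven by the two phi helpers. */
int saturating_accumulate(int pred, int diff, int len) {
    int carried = 0;
    for (int i = 0; ; ++i) {
        LoopPhiOut v = phi_loop(i == 0, pred, carried, i < len);
        if (!v.to_body)
            return v.value;                 /* value directed to the loop exit */
        int val_2 = v.value + diff;
        carried = phi_select(val_2 > 32767, 32767,
                             val_2 < -32768, -32768, val_2);
    }
}
```

With `len` equal to zero the initial value is routed straight to the exit, which matches the behaviour of the original `for` loop.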

Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
  Data-path elements
  φ-nodes
Experimental results
Concluding remarks

Synthesizing the Data-Path

A one-to-one mapping is used
  Different resource allocation and binding algorithms can be used (on-going work)
Each operation has an operator and several operands
  Operands are synthesized directly to wires in the circuit
  Each variable in the SSA form has only one definition point
PREDICATE nodes: synthesized to Boolean logic signals that control next-stage transitions and direct multiplexers to commit the correct value

Synthesizing φ-nodes

A loop-independent φ-node is synthesized to a multiplexer. The multiplexer selects among the input values depending on the predicate values.

For a loop-carried φ-node, an additional switch is generated to direct the loop-exiting values.
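At the bit level, the multiplexer can be modelled in software as a mask-and-merge of two words, and a three-input phi as two such multiplexers in cascade. This is an illustrative model only: the names `mux2` and `phi3` are hypothetical, the 32-bit width is an assumption, and the slides do not state how the authors' tool structures its multiplexers.

```c
#include <stdint.h>

/* 2-to-1 multiplexer on 32-bit words: sel = 1 picks a, sel = 0 picks b. */
uint32_t mux2(uint32_t sel, uint32_t a, uint32_t b) {
    uint32_t mask = 0u - (sel & 1u);    /* all ones when sel is 1, else zero */
    return (a & mask) | (b & ~mask);
}

/* Three-input phi built from two cascaded 2-to-1 multiplexers:
 * p1 has priority over p2, and other is committed when neither holds. */
uint32_t phi3(uint32_t p1, uint32_t v1, uint32_t p2, uint32_t v2, uint32_t other) {
    return mux2(p1, v1, mux2(p2, v2, other));
}
```

For the clamp example, `phi3(val_2 > 32767, val_3, val_2 < -32768, val_4, val_2)` would commit the same value as the phi in the SSA listing, once the signed values are carried as 32-bit words.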

Synthesize to Hardware

Simplifications and optimizations
  Removing unnecessary control dependencies
  Cascading/expanding multipliers to obtain better performance
Flip-flops are inserted
  Guarantee that correct values will be available no matter which execution path is taken

Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
  Setup and benchmarks
  Results
Concluding remarks

Setup and Benchmarks

Benchmark suites
  Functions from the MediaBench suite
  Profiled using sample data
  Only report conservative results
Estimated execution time
  Aggressive predicated execution
  Only report conservative results
Area
  One-to-one mapping without resource sharing
  Reported in numbers of FPGA slices

Estimated Execution Time

Estimated Execution Time (cont’d)

Estimated FPGA Area

Outline

Reconfigurable computing systems
Compilation process
Synthesizing to hardware
Experimental results
Concluding remarks
  On-going/future work

Concluding Remarks

The PDG+SSA form supports a variety of transformations and enables both coarse and fine grain parallelism

A method to synthesize this form to hardware

This form gives faster execution time using similar area when compared with CFG and PSSA forms

On-going/Future work

Investigate transformations to create coarse grained parallelism using the PDG+SSA form

Augment the PDG+SSA form with architectural information to provide fast estimation.

Integrate resource sharing and other architectural synthesis techniques

Thank You

Prof Ryan Kastner and Gang Wang
All audiences

Questions