Exploration of Pipelined FPGA Interconnect Structures Scott Hauck Akshay Sharma, Carl Ebeling University of Washington Katherine Compton University of Wisconsin - Madison

Dec 19, 2015


Exploration of Pipelined FPGA Interconnect Structures

Scott Hauck
Akshay Sharma, Carl Ebeling
University of Washington

Katherine Compton
University of Wisconsin - Madison


PipeRoute

• FPGA’2003: Pipelining-aware Router for FPGAs
  • Architecture-adaptive, based on Pathfinder
  • Uses optimal 2-terminal, 1-delay router
  • Greedy formulation for multi-delay, multi-terminal routing

[Figure: routing example with nodes labeled S, T1, and T2]


RaPiD

• Coarse-grained, 1D, 16-bit, w/DSP Units

• Carl Ebeling @ UW-CSE

• Pipelined interconnect via Bus Connectors (BCs)

[Figure: RaPiD cell; functional unit labels in extraction order: GPR, RAM, RAM, GPR, MULT, GPR, ALU, ALU, GPR, GPR, RAM, ALU, GPR]


Pipelined Routing Results

• Area expansion due to pipelining
  • Normalized to unpipelined circuit area

[Figure: normalized area (axis 0 to 3) vs. % pipelined signals (axis 0% to 70%)]

Ave: 75% cost


Contributions

• Optimized PipeRoute
  • Support multiple delays per BC (greedy preprocessor)
  • Timing driven – Pathfinder’s, worst-case criticality across signal
  • RouteCost = criticality * delay_cost + (1 - criticality) * area_cost
• Arch. Exploration of RaPiD Pipelined Interconnects
  • Registered logic block (input/output/none)
  • BC track length
  • Delays per register/BC
  • BC/non-BC routing mix
  • Register-only logic blocks
• Goal: More efficient support of pipelined interconnects
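
The RouteCost formula above can be sketched in a few lines of Python. This is only an illustration: the function names, the choice of the largest per-sink criticality as the "worst case" for a signal, and the sample numbers are assumptions, not taken from the PipeRoute implementation.

```python
def route_cost(criticality: float, delay_cost: float, area_cost: float) -> float:
    """Blend delay and area cost by criticality, following the slide's formula."""
    if not 0.0 <= criticality <= 1.0:
        raise ValueError("criticality must lie in [0, 1]")
    return criticality * delay_cost + (1.0 - criticality) * area_cost


def signal_criticality(sink_criticalities):
    """Assumed reading of 'worst-case criticality across signal':
    take the largest criticality over all sinks of the signal."""
    return max(sink_criticalities)


# Hypothetical values, for illustration only.
crit = signal_criticality([0.2, 0.5, 0.3])
cost = route_cost(crit, delay_cost=4.0, area_cost=10.0)
print(crit, cost)
```

A fully critical signal (criticality 1) is costed purely on delay, and a non-critical one (criticality 0) purely on area.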


Methodology

• Benchmarks
  • Retimed, not C-slowed
• Graphs
  • Increase arch to fit (cells, tracks/cell)
  • Variation around local minima

[Figure: area (axis 0 to 10) and delay (axis 0 to 12) vs. delays per BC/Reg (1 to 7); series: AREA, DELAY, AREA*DELAY]

[Figure: % pipelined signals (axis 0% to 80%) per netlist]


Registers in Logic Blocks

• Output Registers

• No Registers

• Input Registers

[Figure: area (axis 0 to 9) and delay (axis 0 to 12) vs. regs in functional units (Out, None, In); series: AREA, DELAY, AREA*DELAY]

AREA   DELAY   AREA*DELAY
5%     20%     23%


Delays per Register/BC

• 1 Delay/BC

• 2 Delays/BC

[Figure: area (axis 0 to 10) and delay (axis 0 to 12) vs. delays per BC/Reg (1 to 7); series: AREA, DELAY, AREA*DELAY]

AREA   DELAY   AREA*DELAY
15%    20%     30%


BC Track Length

• Length 16 BC wires

• Length 8 BC wires

[Figure: area (axis 0 to 9) and delay (axis 0 to 25) vs. BC track length (32, 16, 8, 4); series: AREA, DELAY, AREA*DELAY]

AREA   DELAY   AREA*DELAY
17%    64%     69%


Routing Resource Mix (BC vs. non-BC)

• 5/7

• 7/7

[Figure: area (axis 0 to 9) and delay (axis 0 to 12) vs. proportion BC tracks (7/7, 6/7, 5/7, 4/7, 3/7); series: AREA, DELAY, AREA*DELAY]

AREA   DELAY   AREA*DELAY
19%    17%     18%


GPRs per Cell

• GPR roles:
  • Registers from computation
  • Passthrough for changing tracks
• 6 per cell
• 9 per cell

[Figure: area (axis 0 to 8) and delay (axis 0 to 12) vs. GPRs per cell (5 to 10); series: AREA, DELAY, AREA*DELAY]

AREA   DELAY   AREA*DELAY
6%     23%     22%


Overall – vs. RaPiD-I

• RaPiD-I
  • 1 BC / cell (13 LBs long)
  • 5/7 BC tracks
  • 3 registers / BC
  • 6 GPRs / cell
  • registered outputs
• Post-Explore
  • 1 BC / cell (16 LBs long)
  • 5/7 BC tracks
  • 3 registers / BC
  • 9 GPRs / cell
  • registered inputs

[Figure: ratio Post/RaPiD-I (axis 0 to 1.4) per benchmark: firtm, fft16, cascade, matmult4, sobel, imagerapid, firsymeven, sort_g, sort_rb; series: AREA, DELAY, AREA*DELAY; x-axis label as extracted: "Proportion non-BC Tracks"]

       AREA   DELAY   AREA*DELAY
Ave:   1%     18%     19%


Overall – Pipelining Cost

[Figure: normalized area (axis 0 to 1.6) vs. % pipelined signals (axis 0% to 80%)]

Ave: 18% cost


Conclusions

• Router for arbitrary pipelined architectures
  • Timing-driven
  • Supports multiple delays at each register site
  • Good quality: <18% of pseudo-lower bound (non-pipelined) area
• Architecture Exploration of RaPiD
  • Parameters:
    • Registered inputs on functional units
    • Length 16 wires
    • 3 delays per BC/register
    • 2/7 non-registered, 5/7 registered wires
    • 9 GPRs/cell to improve flexibility
  • Delay: spacing of registers CRITICAL, too close better than too far
  • 19% area*delay improvement over RaPiD-I (primarily delay)


*** End of Talk Marker ***


1-Delay Two Terminal

• Can do optimal routing for 1-delay routes via BFS
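
One way to picture a BFS-based 1-delay search is to run BFS over (node, phase) states, where the phase records whether a register has been picked up yet. The sketch below is an illustration under stated assumptions, not the PipeRoute code: the graph is an unweighted adjacency dict, `registers` is a hypothetical set of delay sites, neither endpoint is itself a register, and the search does not enforce that the pre-register and post-register halves use disjoint wires, which a real router must.

```python
from collections import deque


def one_delay_route(graph, registers, source, sink):
    """Shortest source-to-sink path (in hops) through exactly one register.

    graph: dict mapping node -> list of neighbor nodes.
    registers: set of nodes that add one delay when traversed.
    """
    start = (source, 0)  # phase 0: no register yet, phase 1: one register
    parent = {start: None}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == sink and phase == 1:
            path, state = [], (node, phase)
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt in registers:
                if phase == 1:
                    continue  # a second register would add a second delay
                state = (nxt, 1)
            else:
                state = (nxt, phase)
            if state == (sink, 0):
                continue  # reaching the sink with no register is not a 1-delay route
            if state not in parent:
                parent[state] = (node, phase)
                queue.append(state)
    return None


# Hypothetical example: the direct path S-a-T has no register, so the
# 1-delay route has to detour through register R.
example = {
    "S": ["a", "b"], "a": ["S", "T"], "b": ["S", "R"],
    "R": ["b", "c"], "c": ["R", "T"], "T": ["a", "c"],
}
print(one_delay_route(example, {"R"}, "S", "T"))
```

Because every hop costs the same here, BFS over the doubled state space returns a fewest-hop route; with weighted wires, a priority queue would take the place of the plain queue.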



N-Delay Two Terminal

• Greedy Approximation via 1-Delay Router
  • Find 1-delay route
  • While not enough delay on route
    • Replace any 0-delay segment with cheapest 1-delay replacement
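
The greedy loop in the bullets above can be sketched as follows. This is a simplified illustration, not the PipeRoute implementation: wires are unit-cost, "cheapest" means fewest total hops, `registers` is a hypothetical set of delay sites, the endpoints are assumed not to be registers, and a replacement that would reuse a node is simply rejected rather than searched around. The block carries its own compact 1-delay helper so that it runs on its own.

```python
from collections import deque


def one_delay_route(graph, registers, src, dst, blocked=frozenset()):
    """BFS for a shortest src-to-dst path that picks up exactly one new register."""
    start = (src, 0)  # state = (node, new registers picked up so far)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst and phase == 1:
            path, state = [], (node, phase)
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt in blocked or nxt == src:
                continue
            if nxt == dst:
                if phase == 0:
                    continue  # must pick up a register before arriving
                state = (nxt, 1)
            elif nxt in registers:
                if phase == 1:
                    continue  # a second register would add a second delay
                state = (nxt, 1)
            else:
                state = (nxt, phase)
            if state not in parent:
                parent[state] = (node, phase)
                queue.append(state)
    return None


def n_delay_route(graph, registers, source, sink, n):
    """Greedy n-delay route: start from a 1-delay route, then add one register at a time."""
    if n < 1:
        raise ValueError("n must be at least 1")
    route = one_delay_route(graph, registers, source, sink)
    if route is None:
        return None
    while sum(node in registers for node in route[1:-1]) < n:
        # Cut the route at its endpoints and registers; each piece is a 0-delay segment.
        inner = [i for i in range(1, len(route) - 1) if route[i] in registers]
        cuts = [0] + inner + [len(route) - 1]
        best = None
        for i, j in zip(cuts, cuts[1:]):
            blocked = set(route[:i]) | set(route[j + 1:])  # rest of the route is off limits
            piece = one_delay_route(graph, registers, route[i], route[j], blocked)
            if piece is None or len(set(piece)) != len(piece):
                continue  # no replacement, or it would reuse a node
            candidate = route[:i] + piece + route[j + 1:]
            if best is None or len(candidate) < len(best):
                best = candidate
        if best is None:
            return None  # no segment can absorb another delay
        route = best
    return route


# Hypothetical fabric: wires a-b and b-c each have a registered bypass (R1, R2).
fabric = {
    "S": ["a"], "a": ["S", "R1", "b"], "R1": ["a", "b"],
    "b": ["a", "R1", "R2", "c"], "R2": ["b", "c"],
    "c": ["b", "R2", "T"], "T": ["c"],
}
print(n_delay_route(fabric, {"R1", "R2"}, "S", "T", 2))
```

Each pass through the loop adds exactly one register, so the loop stops either with exactly n delays on the route or with no segment left that can take another one.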
