Design Flow – Computation Flow

Computation Flow

• Applies to both run-time and compile-time reconfiguration

• For some applications, the steps must be iterated

If several reconfigurations are needed, some of the steps must be repeated according to the application's needs.

A synchronization mechanism is usually used between the processor and the RD.

Blocking access should also be used for the memory access between the two devices.
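The loop described above (configure, transfer data, start, wait, read back) can be sketched in Python. This is only a toy model under stated assumptions: the class `ReconfigurableDevice`, its methods, and `run_on_device` are hypothetical names, the RD is simulated by a thread, an `Event` stands in for the synchronization mechanism, and a `Lock` stands in for blocking memory access.

```python
import threading


class ReconfigurableDevice:
    """Toy stand-in for the RD (hypothetical, not a real device API)."""

    def __init__(self):
        self.memory_lock = threading.Lock()  # blocking access to shared memory
        self.done = threading.Event()        # processor/RD synchronization
        self.shared_memory = []
        self.function = None

    def configure(self, function):
        # "Reconfiguration": load the function the device will compute.
        self.function = function

    def start(self):
        self.done.clear()
        threading.Thread(target=self._run).start()

    def _run(self):
        # The device holds the memory while it computes.
        with self.memory_lock:
            self.shared_memory = [self.function(x) for x in self.shared_memory]
        self.done.set()


def run_on_device(device, functions, data):
    """Iterate the computation flow once per required reconfiguration."""
    for function in functions:
        device.configure(function)            # 1. configure the RD
        with device.memory_lock:              # 2. download the input data
            device.shared_memory = list(data)
        device.start()                        # 3. start the computation
        device.done.wait()                    # 4. wait for the RD to finish
        with device.memory_lock:              # 5. read the results back
            data = list(device.shared_memory)
    return data


result = run_on_device(ReconfigurableDevice(),
                       [lambda x: x + 1, lambda x: x * 2],
                       [1, 2, 3])
print(result)
```

Two functions are passed in, so the flow is iterated twice, once per reconfiguration.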

Computation flow

Devices such as the Xilinx Virtex II/II-Pro and the Altera Excalibur feature one or more soft or hard-macro processors.

− The complete system can be integrated in a single device.

The reconfiguration process can be:

− Full: The complete device has to be reconfigured.

− Partial: Only part of the device is reconfigured while the rest keeps running.

Computation flow

Full reconfiguration devices

− Functions to be downloaded at run-time are developed and stored in a database.

− No geometrical constraints are required for the functions.

Partial reconfiguration capabilities

− Modules, represented as rectangular boxes, are pre-computed and stored in a database.

− With relocation, the modules are assigned to a position on the device at run-time.

[Figure: Services (task 1, task 2, ..., task N) send task requests to the O.S. (scheduler and placer), which uses a module database (M1 to M4) to place tasks T1, T2, ..., TN on the reconfigurable device.]

Computation flow

RTR Challenges

• Management of the reconfigurable device:

− Usually part of the OS running on a processor

• Scheduler

− Decides when a task must be executed
− Tasks are stored in a database
− Each task is characterized by (bounding box, run time)

• Placer

− Temporal placement: management of tasks at run time
− Allocates a set of resources for the task
− If it cannot find a site, the task is rejected

• Challenges:

− Fragmentation
− Communication between new and old tasks
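The placer's job can be illustrated with a small Python sketch. It assumes a first-fit strategy on a grid of cells, which the slides do not prescribe; the function names `find_site` and `place` are hypothetical. A task is described only by the width and height of its bounding box.

```python
def find_site(occupied, width, height):
    """Return the first (row, col) where a width x height box fits, or None."""
    rows, cols = len(occupied), len(occupied[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(not occupied[r + dr][c + dc]
                   for dr in range(height) for dc in range(width)):
                return r, c
    return None


def place(occupied, width, height):
    """Allocate cells for a task; return its site, or None if it is rejected."""
    site = find_site(occupied, width, height)
    if site is None:
        return None  # no free site: the task is rejected
    r, c = site
    for dr in range(height):
        for dc in range(width):
            occupied[r + dr][c + dc] = True
    return site


# 4x4 device, all cells initially free.
device = [[False] * 4 for _ in range(4)]
first = place(device, 2, 2)    # fits in the top-left corner
second = place(device, 3, 3)   # rejected although 12 cells are still free
third = place(device, 2, 2)    # fits next to the first task
print(first, second, third)
```

The second request shows the fragmentation problem: enough cells are free in total, but no free 3x3 rectangle exists.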

Design Flow

• Implementation of a reconfigurable system: a hardware/software co-design process

• Software part (code segment to be executed on the processor):

− Development in a software language with common tools

• Hardware part (to be executed on the RD):

− Development in an HDL

• Interface:

− HDL or system-level languages

[Figure: Software (C, C++, Java, etc.), Hardware (VHDL, Verilog, Handel-C, etc.), and the Interface between them.]

Hardware/Software Partitioning

FPGA Architecture

• FPGA architecture from the CAD tools' point of view:

− N BLEs (Basic Logic Elements)
− K-LUT: K-input LUT
− I inputs, N outputs
− Inputs and outputs fully connected to the inputs of each LUT through MUXes
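The logic block described above can be modelled in a few lines of Python. This is a sketch under assumptions: the names `BLE` and `evaluate_cluster` are hypothetical, each BLE is reduced to its K-LUT, and the fed-back BLE outputs are taken from the previous evaluation step, as if they were registered.

```python
from dataclasses import dataclass


@dataclass
class BLE:
    lut_bits: list      # 2**K configuration bits of the K-LUT
    input_select: list  # K MUX settings: indices into the I + N signals


def evaluate_cluster(bles, cluster_inputs, previous_outputs):
    """Evaluate N BLEs whose LUT inputs can pick any of the I cluster
    inputs or any of the N (previous) BLE outputs."""
    signals = list(cluster_inputs) + list(previous_outputs)
    outputs = []
    for ble in bles:
        selected = [signals[i] for i in ble.input_select]       # input MUXes
        index = sum(bit << pos for pos, bit in enumerate(selected))
        outputs.append(ble.lut_bits[index])                     # LUT lookup
    return outputs


# I = 2 inputs, N = 2 BLEs, K = 2.
bles = [
    BLE(lut_bits=[0, 0, 0, 1], input_select=[0, 1]),  # AND of inputs 0 and 1
    BLE(lut_bits=[0, 1, 1, 0], input_select=[0, 2]),  # XOR of input 0 and BLE 0
]
step1 = evaluate_cluster(bles, [1, 1], [0, 0])
step2 = evaluate_cluster(bles, [1, 1], step1)
print(step1, step2)
```

Signal indices 0 to I-1 address the cluster inputs and indices I to I+N-1 address the BLE outputs, which is what "fully connected" means here.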

Design Flow for H/w Part

Almost the same as for any digital circuit design

• Synthesis differs, particularly in technology mapping:

− LUT technology mapping
− Specific to the target technology (device)

Design Flow for H/w Part

• Design Entry:

− Schematic
− Netlist
− HDL
− Waveform
− State diagram

Textual or Schematic

• Most designers today use textual languages rather than schematics. Drawbacks of schematic entry:

− Poor use of screen space
− Not appropriate for large designs
− Hard to process with tools (parsing)

What is Synthesis?

• Transformation of an abstract description into a more detailed description:

− The "+" operator is transformed into a gate netlist
− "if (VEC_A = VEC_B) then" becomes a comparator that controls a multiplexer

• The transformation depends on several factors:

− Algorithm, constraints, library

• Simple operators (such as AND, OR, and comparison) are converted into specific gates, but more complex operators such as multiplication are first converted into macrocells specific to the tool.
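To make the "+" example concrete, here is a small Python sketch of one possible gate-level result, a ripple-carry adder built only from XOR, AND, and OR. The adder structure is an assumption for illustration; a real synthesis tool chooses the structure from its algorithm, constraints, and library. All function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder expressed with XOR, AND, and OR gates."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def ripple_carry_add(a_bits, b_bits):
    """Add two equal-length bit vectors (least significant bit first)."""
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


sum_bits, carry_out = ripple_carry_add(to_bits(5, 4), to_bits(6, 4))
print(from_bits(sum_bits), carry_out)
```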

Synthesizability

• Only a subset of VHDL is synthesizable

• Different tools support different subsets:

− Records?
− Arrays of integers?
− Clock edge detection?
− Sensitivity lists?
− ...

• Compilation and optimization:

− All non-synthesizable data types and operations are converted into synthesizable code
− Translated into a set of Boolean equations
− Then minimized (technology-independent optimization)

• Technology mapping (technology-dependent optimization):

− Assign functional modules to library elements
− On FPGAs: mapping control logic and datapath to LUTs and BLEs
− On FPGAs: mapping optimized datapath to on-chip dedicated circuit structures (e.g. on-chip multipliers, adders with dedicated carry-chains, embedded memory blocks)
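For a function that already fits into one K-input LUT, the final mapping step amounts to filling in the LUT's truth table. The Python sketch below shows only that step and assumes the Boolean equation has at most K inputs; the function name `lut_contents` is hypothetical, and real LUT mappers must additionally decompose larger functions.

```python
def lut_contents(function, k):
    """Return the 2**k configuration bits of a k-input LUT that
    implements `function`; input i is bit i of the table index."""
    contents = []
    for index in range(2 ** k):
        inputs = [(index >> i) & 1 for i in range(k)]
        contents.append(function(*inputs) & 1)
    return contents


def majority(a, b, c):
    # Minimized Boolean equation: ab + ac + bc
    return (a & b) | (a & c) | (b & c)


majority_lut = lut_contents(majority, 3)
print(majority_lut)
```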

Synthesis

• Result:

Netlist: a list of components and their interconnections.

• Netlist Formats:

− EDIF (Electronic Design Interchange Format)
− Vendor-specific formats, for example XNF (Xilinx Netlist Format)
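A netlist can be pictured as a plain data structure. The Python sketch below uses an ad hoc dictionary layout that is an assumption for illustration only; it is not EDIF or XNF syntax, and the component and net names are made up. It describes a half adder and derives, for each net, the pins connected to it.

```python
# Components and the net attached to each of their pins.
netlist = {
    "u1": {"type": "XOR2", "pins": {"a": "A", "b": "B", "y": "SUM"}},
    "u2": {"type": "AND2", "pins": {"a": "A", "b": "B", "y": "CARRY"}},
}


def nets_to_pins(netlist):
    """Invert the component view: for each net, list the (component, pin)
    pairs it interconnects."""
    connections = {}
    for name, component in netlist.items():
        for pin, net in component["pins"].items():
            connections.setdefault(net, []).append((name, pin))
    return connections


connections = nets_to_pins(netlist)
for net, pins in connections.items():
    print(net, pins)
```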

Synthesis

• Place: assign locations to the components

− In hierarchical architectures, a separate clustering step may be needed to group BLEs into logic blocks
− Clustering is done prior to placement or during placement

• Route: provide communication paths for the interconnections

• Both are optimization problems: some cost must be minimized

• Important factors:

− Clock frequency
− Power consumption
− Routing congestion
− ...
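As an illustration of "some cost must be minimized", the Python sketch below uses half-perimeter wirelength as the placement cost and improves a placement by greedy pairwise swaps. Both choices are assumptions made for this example; the slides do not name a cost function or an algorithm, and production placers use more elaborate methods. The function names are hypothetical.

```python
from itertools import combinations


def half_perimeter_wirelength(placement, nets):
    """Sum over all nets of the half perimeter of the net's bounding box."""
    total = 0
    for net in nets:
        xs = [placement[component][0] for component in net]
        ys = [placement[component][1] for component in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def improve_by_swapping(placement, nets):
    """Swap pairs of components as long as a swap strictly lowers the cost."""
    placement = dict(placement)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(placement), 2):
            before = half_perimeter_wirelength(placement, nets)
            placement[a], placement[b] = placement[b], placement[a]
            if half_perimeter_wirelength(placement, nets) < before:
                improved = True
            else:
                # Undo a swap that does not help.
                placement[a], placement[b] = placement[b], placement[a]
    return placement


# Four components on a 2x2 array of logic blocks, two 2-pin nets.
initial = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
nets = [("a", "b"), ("c", "d")]
final = improve_by_swapping(initial, nets)
print(half_perimeter_wirelength(initial, nets),
      half_perimeter_wirelength(final, nets))
```

The loop terminates because every accepted swap strictly reduces the cost.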

Physical Design: Place and Route

FPGA Placement & Routing

Field Programmable Gate Array (FPGA)

• Bitstream:

− LUT contents
− Multiplexer control lines
− Interconnections
− ...

Configuration Bitstream

Design Flow

• Debug: the design cycle resembles the programming cycle:

− Programming: write the program, compile, run, edit
− Design: design entry, simulation, edit, then synthesis, simulation, edit

• Design: modulo-10 counter

• Target device: FPGA with 2x2 logic blocks (LBs)

• Each LB contains:

− Two 2-input LUTs
− Two edge-triggered T flip-flops

• Objectives:

− Area
− Latency

FPGA Design Flow – Example

• Truth table: state transitions and TFF inputs

FPGA Design Flow – Example

• Synthesis and optimization: Karnaugh maps
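The truth table for the modulo-10 counter can be generated with a short Python sketch. A T flip-flop toggles when its input is 1, so each T input is the XOR of the current and the next state bit. The minimized equations in `minimized_t_inputs` are an assumption: they are the usual result of treating the unused states 10 to 15 as don't-cares, and the Karnaugh maps from the slides are not part of this transcript. The function names are hypothetical.

```python
def t_inputs(state, modulus=10):
    """Return (next_state, T) where bit i of T is the input of flip-flop i."""
    next_state = (state + 1) % modulus
    return next_state, state ^ next_state


def minimized_t_inputs(state):
    """Assumed minimized equations (states 10..15 used as don't-cares)."""
    q0, q1, q2, q3 = [(state >> i) & 1 for i in range(4)]
    t0 = 1
    t1 = q0 & (1 - q3)
    t2 = q0 & q1
    t3 = (q0 & q1 & q2) | (q0 & q3)
    return t0 | (t1 << 1) | (t2 << 2) | (t3 << 3)


# Print the truth table: current state, next state, TFF inputs (Q3..Q0 order).
for state in range(10):
    next_state, toggle = t_inputs(state)
    print(f"{state:04b} -> {next_state:04b}  T = {toggle:04b}")

equations_match = all(t_inputs(s)[1] == minimized_t_inputs(s)
                      for s in range(10))
print(equations_match)
```

Note that the equation for T3 has more than two inputs, so on a device with 2-input LUTs it would have to be split across several LUTs.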

References

[Bobda07] C. Bobda, “Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications,” Springer, 2007.