Design Flow – Computation Flow
Computation Flow
• For both run-time and compile-time
• For some applications, must iterate
When an application needs many reconfigurations, some of the steps are repeated as often as the application requires.
A synchronization mechanism is usually used between the processor and the reconfigurable device (RD).
Memory shared by the two devices should also be accessed through blocking operations.
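The synchronization and blocking access described above can be sketched in software. This is only an illustration: the RD is stood in for by a thread, and the names `SharedMemory`, `rd_task` and `run_on_rd` are hypothetical, not part of any vendor API.

```python
import threading


class SharedMemory:
    """Memory shared by the processor and the reconfigurable device (RD)."""

    def __init__(self, size):
        self._data = [0] * size
        self._lock = threading.Lock()  # every access blocks until the lock is free

    def write(self, addr, value):
        with self._lock:
            self._data[addr] = value

    def read(self, addr):
        with self._lock:
            return self._data[addr]


def rd_task(mem, start, done):
    """Stand-in for the function configured on the RD: doubles mem[0]."""
    start.wait()                       # wait until the processor hands over
    mem.write(1, 2 * mem.read(0))
    done.set()                         # tell the processor the result is ready


def run_on_rd(value):
    mem = SharedMemory(2)
    start, done = threading.Event(), threading.Event()
    worker = threading.Thread(target=rd_task, args=(mem, start, done))
    worker.start()
    mem.write(0, value)                # processor prepares the input
    start.set()                        # synchronization: start the RD
    done.wait()                        # synchronization: wait for the RD
    worker.join()
    return mem.read(1)


print(run_on_rd(21))
```

The two events play the role of the synchronization mechanism, and the lock makes each memory access blocking, so neither side can observe a half-written value.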
Devices such as the Xilinx Virtex-II/Virtex-II Pro and the Altera Excalibur feature one or more soft or hard-macro processors.
− The complete system can be integrated in only one device.
The reconfiguration process can be:
− Full: The complete device has to be reconfigured.
− Partial: Only part of the device is configured while the rest keeps running.
• Full reconfiguration devices
− Functions to be downloaded at run-time are developed and stored in a database.
− No geometrical constraints are required for the functions.
• Partial reconfiguration capabilities
− Modules, represented as rectangular boxes, are pre-computed and stored in a database.
− With relocation, the modules are assigned to a position on the device at run-time.
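[Figure: run-time management of a reconfigurable device. Services (task 1, task 2, ..., task N) issue task requests to the O.S., where the scheduler and the placer select modules (M1 to M4) from the module database and place tasks (T1, T2, ..., TN) on the reconfigurable device.]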
RTR Challenges
• Management of the reconfigurable device:
− Usually part of the OS running on a processor
• Scheduler
− Decides when a task must be executed
− Tasks are kept in a database
− Each task is characterized by (bounding box, run time)
• Placer
− Temporal placement: management of tasks at run time
− Allocates a set of resources for the task
− If it cannot find a site, the task is rejected
• Challenges:
− Fragmentation
− Communication between new and old tasks
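The placer's behaviour and the fragmentation problem can be illustrated with a small sketch. It assumes a first-fit search over a grid of cells, which is only one of many possible placement strategies and is not taken from the slides; the class name `Placer` and its methods are hypothetical.

```python
class Placer:
    """First-fit placer for rectangular tasks on a width x height device."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.free = [[True] * width for _ in range(height)]

    def _mark(self, x, y, w, h, value):
        for row in range(y, y + h):
            for col in range(x, x + w):
                self.free[row][col] = value

    def _fits(self, x, y, w, h):
        return all(self.free[row][col]
                   for row in range(y, y + h)
                   for col in range(x, x + w))

    def place(self, w, h):
        """Return the (x, y) site of the task, or None if it is rejected."""
        for y in range(self.height - h + 1):
            for x in range(self.width - w + 1):
                if self._fits(x, y, w, h):
                    self._mark(x, y, w, h, False)
                    return (x, y)
        return None

    def remove(self, x, y, w, h):
        """Free the resources of a task that has finished."""
        self._mark(x, y, w, h, True)

    def free_cells(self):
        return sum(cell for row in self.free for cell in row)


placer = Placer(4, 2)
sites = [placer.place(1, 2) for _ in range(4)]   # four tasks, each 1 wide, 2 high
placer.remove(*sites[0], 1, 2)                   # first task finishes
placer.remove(*sites[2], 1, 2)                   # third task finishes
print(placer.free_cells(), placer.place(2, 2))
```

After the two removals four cells are free, which is exactly the area of a 2x2 task, yet the task is rejected because the free cells are not contiguous. That is the fragmentation challenge named above.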
Design Flow
Hardware/Software Partitioning
• Implementation of a reconfigurable system is a hardware/software co-design process:
− Software part (code segment to be executed on the processor): developed in a software language with common tools, e.g. C, C++, Java
− Hardware part (to be executed on the RD): developed in an HDL, e.g. VHDL, Verilog, Handel-C
− Interface: HDL or system-level languages
FPGA Architecture
• FPGA architecture from the CAD tools' point of view:
− N BLEs (Basic Logic Elements)
− K-LUT: K-input LUT
− I inputs, N outputs
− Inputs and outputs fully connected to the inputs of each LUT through MUXes
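A K-LUT can be modelled as a truth table with 2^K entries, and the input multiplexers as indices that select which signal drives each LUT input. The sketch below is a simplified illustration under those assumptions; `lut_eval`, `ble_eval` and `half_adder` are hypothetical names, and flip-flops and output feedback are left out.

```python
def lut_eval(truth_table, inputs):
    """Evaluate a K-LUT: inputs[0] is the least significant address bit."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return truth_table[address]


def ble_eval(truth_table, mux_select, signals):
    """Each LUT input is driven by a mux that picks one entry of `signals`."""
    return lut_eval(truth_table, [signals[s] for s in mux_select])


# Contents of two 2-LUTs (K = 2, so 4 entries each).
XOR2 = [0, 1, 1, 0]
AND2 = [0, 0, 0, 1]


def half_adder(a, b):
    signals = [a, b]                          # the I = 2 inputs of the logic block
    total = ble_eval(XOR2, (0, 1), signals)   # first BLE computes the sum bit
    carry = ble_eval(AND2, (0, 1), signals)   # second BLE computes the carry bit
    return total, carry


print([half_adder(a, b) for a in (0, 1) for b in (0, 1)])
```

Changing only the truth table contents and the mux selections changes the function, which is what the configuration of the device ultimately stores.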
Design Flow for H/w Part
• Almost the same as for any digital circuit design
• Synthesis differs, particularly in technology mapping
− LUT-technology mapping
− Specific to the target technology (device)
• Design Entry
− Schematic
− Netlist
− HDL
− Waveform
− State Diagram
Textual or Schematic
• Most designers today use textual languages rather than schematics. Schematics have several drawbacks:
− Poor use of screen space
− Not appropriate for large designs
− Hard to process with tools (e.g. parsing)
What is Synthesis?
• Transformation of an abstract description into a more detailed description
− The "+" operator is transformed into a gate netlist
− "if (VEC_A = VEC_B) then" becomes a comparator that controls a multiplexer
• The transformation depends on several factors:
− Algorithm, constraints, library
• Simple operators (such as AND, OR, and comparison) are converted to specific gates, whereas more complex operators such as multiplication are first converted to the tool's own macrocells.
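To make the "+" example concrete, the sketch below expands an addition into a chain of full adders built only from XOR, AND and OR operations. A ripple-carry structure is assumed here purely for illustration; a real synthesis tool may choose a different adder architecture depending on the constraints and library. All function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder expressed with XOR, AND and OR gates only."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (carry_in & partial)
    return total, carry_out


def ripple_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first)."""
    carry = 0
    result = []
    for a, b in zip(a_bits, b_bits):
        total, carry = full_adder(a, b, carry)
        result.append(total)
    return result + [carry]            # the final carry is the top result bit


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


print(from_bits(ripple_add(to_bits(9, 4), to_bits(5, 4))))
```

Each call to `full_adder` corresponds to a handful of gates in the netlist, so a 4-bit "+" becomes four such cells connected through their carry signals.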
Synthesizability
• Only a subset of VHDL is synthesizable
• Different tools support different subsets:
− Records?
− Arrays of integers?
− Clock edge detection?
− Sensitivity list?
− ...
Synthesis
• Compilation and optimization:
− All non-synthesizable data types and operations are converted to synthesizable code
− The code is translated into a set of Boolean equations
− The equations are then minimized (technology-independent optimization)
• Technology mapping:
− Assign functional modules to library elements
− On FPGAs, control logic and datapath are mapped to LUTs and BLEs
− On FPGAs, the optimized datapath is mapped to on-chip dedicated circuit structures (e.g. on-chip multipliers, adders with dedicated carry-chains, embedded memory blocks)
− Technology-dependent optimization
• Result:
− Netlist: a list of components and their interconnections
• Netlist formats:
− EDIF (Electronic Design Interchange Format)
− Vendor-specific formats, e.g. XNF (Xilinx Netlist Format)
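As a rough picture of what a netlist holds, the sketch below stores the components and interconnections of a half adder in plain Python dictionaries. The layout and the names (`components`, `nets`, `nets_of`, `fanout`) are hypothetical and do not follow EDIF or XNF syntax.

```python
# Components: instance name -> library element.
components = {"u_xor": "XOR2", "u_and": "AND2"}

# Interconnections: net name -> list of (owner, pin); the first pin drives the net.
nets = {
    "a": [("port", "a"), ("u_xor", "i0"), ("u_and", "i0")],
    "b": [("port", "b"), ("u_xor", "i1"), ("u_and", "i1")],
    "sum": [("u_xor", "o"), ("port", "sum")],
    "carry": [("u_and", "o"), ("port", "carry")],
}


def nets_of(component):
    """Names of all nets that touch the given component."""
    return sorted(name for name, pins in nets.items()
                  if any(owner == component for owner, _ in pins))


def fanout(net):
    """Number of pins driven by the net, excluding the driving pin."""
    return len(nets[net]) - 1


print(nets_of("u_xor"), fanout("a"))
```

Placement and routing work on exactly this kind of information: which components exist and which pins must be connected.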
Physical Design: Place and Route
• Place:
− Assign locations to the components
− Hierarchical architectures may need a separate clustering step to group BLEs into logic blocks
− Clustering is done prior to placement or during placement
• Route:
− Provide communication paths for the interconnections
• Both are optimization problems: some cost must be minimized
• Important factors:
− Clock frequency
− Power consumption
− Routing congestion
− ...
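One cost that placers commonly minimize is the half-perimeter wirelength (HPWL), the sum over all nets of the width plus height of the bounding box around each net's pins. The slides do not name a specific cost, so HPWL is used here only as an assumed example; the function name and the data are hypothetical.

```python
def hpwl(placement, nets):
    """Half-perimeter wirelength of a placement.

    placement: component name -> (x, y) location
    nets: list of nets, each a list of component names
    """
    total = 0
    for pins in nets:
        xs = [placement[name][0] for name in pins]
        ys = [placement[name][1] for name in pins]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


nets = [["a", "b"], ["b", "c", "d"]]
good = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
bad = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
print(hpwl(good, nets), hpwl(bad, nets))
```

A placer would explore many such assignments and keep the one with the lower cost, possibly combined with estimates of timing, power and congestion.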
FPGA Placement & Routing
Field Programmable Gate Array (FPGA)
Configuration Bitstream
• Bitstream:
− LUT contents
− Multiplexer control lines
− Interconnections
− ...
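The idea of a bitstream can be sketched by concatenating LUT contents and multiplexer select values into one string of bits. The layout below is entirely hypothetical; real bitstream formats are vendor-specific and far more involved.

```python
def pack_bitstream(luts, mux_selects, select_width):
    """Concatenate LUT contents, then mux select values (LSB first)."""
    bits = []
    for truth_table in luts:
        bits.extend(truth_table)
    for select in mux_selects:
        bits.extend((select >> i) & 1 for i in range(select_width))
    return "".join(str(bit) for bit in bits)


# Two 2-LUTs (XOR and AND) and two 2-bit mux select values.
bitstream = pack_bitstream([[0, 1, 1, 0], [0, 0, 0, 1]], [2, 1], 2)
print(bitstream)
```

Loading a different string into the same positions would configure different LUT functions and different connections on the same hardware.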
Design Flow
• Debugging a design resembles the programming cycle:
− Programming: edit, compile, run
− Design: design entry, simulation, edit; then synthesis, simulation, edit
FPGA Design Flow – Example
• Design: modulo-10 counter
• Target device: FPGA with 2x2 Logic Blocks (LBs)
− Each LB contains two 2-input LUTs and two edge-triggered T flip-flops
• Objectives:
− Area
− Latency
• Truth table:
− State transitions
− TFF inputs
• Synthesis and optimization:
− Karnaugh maps
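The truth table for the modulo-10 counter follows from one rule: a T flip-flop must toggle exactly where the current state and the next state differ, so the T inputs are the bitwise XOR of the two. The sketch below builds that table and then checks one possible set of minimized equations by simulation. The equations are an assumption worked out for this illustration (treating states 10 to 15 as don't-cares) and are not necessarily the ones on the original slides.

```python
def t_inputs(state, modulus=10):
    """T flip-flop inputs (as a 4-bit value) needed to leave `state`."""
    next_state = (state + 1) % modulus
    return state ^ next_state          # toggle the bits that must change


def minimized_t(state):
    """Candidate minimized equations for T0..T3 (assumed, see lead-in)."""
    q0, q1, q2, q3 = [(state >> i) & 1 for i in range(4)]
    t0 = 1
    t1 = q0 & (1 - q3)
    t2 = q0 & q1
    t3 = (q0 & q3) | (q0 & q1 & q2)
    return t0 | (t1 << 1) | (t2 << 2) | (t3 << 3)


def simulate(cycles):
    """Clock the counter from state 0 using the minimized equations."""
    state = 0
    sequence = [state]
    for _ in range(cycles):
        state ^= minimized_t(state)    # each TFF toggles when its T input is 1
        sequence.append(state)
    return sequence


for state in range(10):
    print(format(state, "04b"), format(t_inputs(state), "04b"))
print(simulate(10))
```

Each printed row pairs a state Q3..Q0 with its required inputs T3..T0, which is the table that the Karnaugh maps then minimize.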
References
[Bobda07] C. Bobda, “Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications,” Springer, 2007.