Lecture 7 FPGA technology
Dec 14, 2015
Implementation Platform Comparison
FPGA main components and features
- Logic block architecture
- Interconnect architecture
- Programming technology
- Power dissipation
- Reconfiguration model
FPGA model
Interconnect Network Topologies
- Island style
- Row-based
- Sea-of-gates
- Hierarchical
- One-dimensional structures
Island-Style Architecture
Row-Based Architecture
Sea-of-Gates Architecture
Hierarchical Architecture
One-Dimensional Architecture
Logic Cluster Parameters
- The size of (number of inputs to) a LUT.
- The number of CLBs in a cluster.
- The number of inputs to the cluster for use as inputs by the LUTs.
- The number of clock inputs to a cluster (for use by the registers).
Studies on the CLB structure
- Area optimal: 3-4 input LUTs
- For multiple-output LUTs:
  - Optimal area: 4-input LUTs
  - Optimal delay: 5-6 input LUTs
- Clusters of 4-input LUTs are about 10% more area-efficient than single 4-input LUTs
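
The LUT-size trade-off follows from how a LUT stores its function: a k-input LUT needs 2^k configuration bits, so every extra input doubles the storage. The following is a minimal Python sketch under the assumption of a plain truth-table model; the `LUT` class and its method names are illustrative and do not come from any vendor tool.

```python
from itertools import product


class LUT:
    """A k-input look-up table: 2**k configuration bits, one per input combination."""

    def __init__(self, k, config_bits):
        if len(config_bits) != 2 ** k:
            raise ValueError(f"a {k}-input LUT needs {2 ** k} configuration bits")
        self.k = k
        self.config_bits = list(config_bits)

    @classmethod
    def from_function(cls, k, func):
        # Enumerate all input combinations; the first input is the most
        # significant address bit, matching evaluate() below.
        bits = [int(bool(func(*inputs))) for inputs in product((0, 1), repeat=k)]
        return cls(k, bits)

    def evaluate(self, *inputs):
        # The inputs form the address of the configuration bit to read out.
        if len(inputs) != self.k:
            raise ValueError(f"expected {self.k} inputs")
        address = 0
        for bit in inputs:
            address = (address << 1) | (1 if bit else 0)
        return self.config_bits[address]


# A 4-input LUT configured as a parity (XOR) function.
parity = LUT.from_function(4, lambda a, b, c, d: a ^ b ^ c ^ d)
print(len(parity.config_bits), parity.evaluate(1, 0, 1, 1))
```

Moving from 4 to 6 inputs in this model raises the storage from 16 to 64 bits per LUT, which is one reason why larger LUTs cost area even when they reduce delay.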
Programming Technology
- Volatile (SRAM)
- Irreversible (Antifuse)
- EPROM, EEPROM and Flash
- The programming technology affects the FPGA area
SRAM Programming Technology
- Configuration storage on SRAM cells
- Volatile (FPGA has to be reprogrammed on power-up)
- Large area (SRAM cells)
- Allows dynamic and partial reconfiguration
Antifuse Programming Technology
- Programming element is an antifuse: normally high impedance (open circuit); applying a high programming voltage turns it permanently into a low-impedance connection
- Small area
- Non-volatile (no need for reprogramming on power-up)
- Irreversible (design errors cannot be corrected)
EPROM, EEPROM and Flash Programming Technology
- Non-volatile
- Reprogrammable: erased by exposure to ultraviolet light (EPROM) or by electrical signals (EEPROM/Flash)
- Slower programming than SRAM
FPGA Power Consumption
- FPGA power dissipation components:
  - Interconnection network
  - Clock network
  - Input/Output
  - Logic block
FPGA Power Consumption Breakdown (XC4003)
Dynamic vs Static Power Consumption
Dynamic power consumption is still dominant, even though the static power consumption component increases with the decrease in feature size.
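
As a rough illustration of the two components, dynamic power is commonly approximated as switching activity × switched capacitance × Vdd² × clock frequency, and static power as leakage current × Vdd. The sketch below assumes these textbook approximations; the function names and all numeric values are hypothetical and are not measurements from the lecture.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Approximate switching power in watts: alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def static_power(leakage_a, vdd_v):
    """Approximate leakage power in watts: I_leak * Vdd."""
    return leakage_a * vdd_v


# Hypothetical numbers chosen only to exercise the formulas.
p_dyn = dynamic_power(activity=0.125, capacitance_f=1e-9, vdd_v=1.2, freq_hz=100e6)
p_stat = static_power(leakage_a=2e-3, vdd_v=1.2)
print(f"dynamic: {p_dyn * 1e3:.1f} mW, static: {p_stat * 1e3:.1f} mW")
```

Because the supply voltage enters the dynamic term quadratically, halving Vdd in this model cuts dynamic power to a quarter, while the static term only halves.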
Reconfiguration Models
- Static Reconfiguration
- Dynamic Reconfiguration
- Single Context
- Multi-Context
- Partial Reconfiguration
- Pipeline Reconfiguration
Static Reconfiguration
- Compile-time Reconfiguration
- Most common approach
- One configuration per application
- System must be halted and then restarted with new program
Dynamic Reconfiguration
- Run-time Reconfiguration
- Based on virtual hardware
- Trade-off between time and space
Single Context
- One configuration at a time
- Programming using a serial bitstream
- High overhead for small configuration changes
- Not suitable for run-time reconfiguration
Multi-Context
- Multiple memory bits for each programming bit location
- Multiplexed set of single context devices
- One context can be reprogrammed when another is active
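
The multi-context idea can be sketched as several configuration planes with a selector that decides which plane drives the fabric. This is a toy model under stated assumptions: the class `MultiContextDevice`, its methods, and the rule that the active plane is never rewritten are all illustrative choices, not a description of a specific device.

```python
class MultiContextDevice:
    """Toy model: several configuration planes, one of which drives the fabric."""

    def __init__(self, num_contexts, bits_per_context):
        self.planes = [[0] * bits_per_context for _ in range(num_contexts)]
        self.active = 0  # index of the plane currently driving the fabric

    def _check(self, context):
        if not 0 <= context < len(self.planes):
            raise IndexError(f"no such context: {context}")

    def load(self, context, bits):
        """Write a full configuration into an inactive plane."""
        self._check(context)
        if context == self.active:
            # Assumption of this model: the active plane is never rewritten.
            raise RuntimeError("context is active; switch away before reloading")
        if len(bits) != len(self.planes[context]):
            raise ValueError("configuration has the wrong length")
        self.planes[context] = list(bits)

    def switch(self, context):
        """Select which plane drives the fabric (the multiplexer on the slide)."""
        self._check(context)
        self.active = context

    def active_configuration(self):
        return list(self.planes[self.active])


device = MultiContextDevice(num_contexts=2, bits_per_context=4)
device.load(1, [1, 0, 1, 1])   # background load while context 0 is active
before = device.active_configuration()
device.switch(1)               # context switch, no bitstream transfer needed
after = device.active_configuration()
print(before, after)
```

The point of the model is that `switch` moves no configuration data: the cost of loading was paid earlier, while another context was running.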
Partial Reconfiguration
- Addresses used to specify the target location of the configuration data
- Undisturbed portions of the array can continue execution during reconfiguration
- Reduces the amount of data that must be transferred to the FPGA
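
Address-based partial reconfiguration can be illustrated with a configuration memory split into individually addressable frames. The sketch below is a simplified model; `FrameAddressedDevice`, the frame size, and the dictionary-based update format are assumptions made for illustration only.

```python
class FrameAddressedDevice:
    """Toy configuration memory organised as individually addressable frames."""

    def __init__(self, num_frames, frame_bits):
        self.frame_bits = frame_bits
        self.frames = [[0] * frame_bits for _ in range(num_frames)]

    def full_bitstream_size(self):
        return len(self.frames) * self.frame_bits

    def write_partial(self, updates):
        """Apply {frame_address: frame_data}; return the number of bits transferred."""
        # Validate everything first so a bad update leaves the device untouched.
        for address, data in updates.items():
            if not 0 <= address < len(self.frames):
                raise IndexError(f"frame address out of range: {address}")
            if len(data) != self.frame_bits:
                raise ValueError("frame data has the wrong length")
        for address, data in updates.items():
            self.frames[address] = list(data)
        return len(updates) * self.frame_bits


device = FrameAddressedDevice(num_frames=8, frame_bits=4)
sent = device.write_partial({2: [1, 1, 0, 0], 5: [0, 1, 0, 1]})
print(sent, "of", device.full_bitstream_size(), "bits transferred")
```

Only the two addressed frames are rewritten; the other six keep their contents, which is the model's counterpart of the undisturbed portions of the array continuing to execute.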
Pipeline Reconfiguration
- Partial reconfiguration in increments of pipeline stages
- Used in datapath-style computations
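
One way to picture reconfiguration in increments of pipeline stages is a round-robin schedule in which a long virtual pipeline is time-multiplexed onto fewer physical stages. The sketch below assumes that exactly one physical stage is reconfigured per cycle; this scheduling rule and the function name `pipeline_schedule` are simplifications chosen for illustration, not a scheme stated in the lecture.

```python
def pipeline_schedule(num_virtual, num_physical, num_cycles):
    """List (cycle, physical_stage, virtual_stage) for a round-robin schedule.

    Assumption: exactly one physical stage is reconfigured per cycle, in order,
    and the virtual stages are loaded cyclically.
    """
    if num_virtual < 1 or num_physical < 1:
        raise ValueError("stage counts must be positive")
    return [
        (cycle, cycle % num_physical, cycle % num_virtual)
        for cycle in range(num_cycles)
    ]


# Five virtual stages time-multiplexed onto three physical stages.
for cycle, physical, virtual in pipeline_schedule(5, 3, 6):
    print(f"cycle {cycle}: load virtual stage {virtual} into physical stage {physical}")
```

In this schedule a freshly loaded stage can compute for the following cycles while its neighbours are being reconfigured, until it is overwritten in turn.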
Run-Time Reconfiguration
- Algorithmic Reconfiguration
- Architectural Reconfiguration
- Functional Reconfiguration
- Fast Configuration
- Configuration Prefetching
- Configuration Compression
- Relocation and Defragmentation in Partially Reconfigurable Systems
- Configuration Caching
Algorithmic Reconfiguration
- Reconfigure the system with a different algorithm that provides the same functionality under different requirements (for example performance, power, or resource usage)
- Adapt dynamically to environment or operational changes
Architectural Reconfiguration
- Modify hardware topology by reallocating resources to computations
Functional Reconfiguration
- Execute different functions on the same resources
- Time-share resources across computational tasks
Fast Configuration
- Reconfigure the device as fast as possible in order to minimize reconfiguration overhead
Configuration Prefetching
- Loading a configuration onto a device in advance, in order to overlap reconfiguration with useful computation
- The challenge is to determine future configurations
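
The benefit of overlapping reconfiguration with computation can be estimated with a simple timing model. The sketch below assumes that the next configuration is always predicted correctly and that it can be loaded in the background while the current task runs; the function names and the task timings are hypothetical.

```python
def total_time_no_prefetch(tasks):
    """Each task is (config_time, execute_time); everything runs back to back."""
    return sum(config + execute for config, execute in tasks)


def total_time_with_prefetch(tasks):
    """Load the next configuration while the current task executes.

    Assumptions: the next configuration is known in advance, and loading it
    does not slow down the running task.
    """
    if not tasks:
        return 0
    total = tasks[0][0]  # the first configuration cannot be hidden
    for index, (_, execute) in enumerate(tasks):
        next_config = tasks[index + 1][0] if index + 1 < len(tasks) else 0
        # The step ends when both the computation and the background load finish.
        total += max(execute, next_config)
    return total


tasks = [(4, 10), (6, 3), (5, 8)]  # hypothetical (config, execute) times
print(total_time_no_prefetch(tasks), total_time_with_prefetch(tasks))
```

With these numbers the second configuration is fully hidden behind the first task, while the third is only partly hidden because the second task is shorter than its load time.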
Configuration Compression
- Minimize the data that must be loaded to the device in a multi-context environment
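
Configuration data often contains long runs of identical values, which is what makes compression attractive. As a minimal illustration, the sketch below uses run-length encoding; this is a deliberately simple stand-in chosen for the example, not the scheme used by any particular device, and the sample bitstream is made up.

```python
def rle_encode(data):
    """Compress bytes into a list of (value, run_length) pairs."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([byte, 1])    # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original bytes."""
    return bytes(value for value, count in runs for _ in range(count))


# Made-up configuration data: mostly zeros with a few programmed bytes.
bitstream = bytes([0] * 12 + [0xFF] * 3 + [0] * 9 + [0x3C])
runs = rle_encode(bitstream)
print(len(bitstream), "bytes ->", len(runs), "runs")
```

The decoder on the device side has to restore the exact original data, so the round trip `rle_decode(rle_encode(x)) == x` is the property that matters.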
Configuration Caching
- Retain configurations on the device to reduce the amount of configuration data that must be transferred to it
- The challenge is to determine which configuration to retain and which to flush
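
The retain-or-flush decision can be sketched with a small cache of resident configurations. The example below assumes a least-recently-used replacement policy, which is only one possible choice; the class `ConfigurationCache`, the configuration names, and the `fetch` callback are hypothetical.

```python
from collections import OrderedDict


class ConfigurationCache:
    """Keeps up to `capacity` configurations resident; evicts the least recently used."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.resident = OrderedDict()  # name -> configuration data, oldest first
        self.loads = 0                 # number of transfers from external memory

    def request(self, name, fetch):
        if name in self.resident:
            self.resident.move_to_end(name)    # hit: mark as most recently used
            return self.resident[name]
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # flush the least recently used
        self.resident[name] = fetch(name)      # miss: transfer the configuration
        self.loads += 1
        return self.resident[name]


cache = ConfigurationCache(capacity=2)
for name in ["fir", "fft", "fir", "aes", "fft"]:
    cache.request(name, fetch=lambda n: f"<bitstream for {n}>")
print(cache.loads, list(cache.resident))
```

Five requests cause four transfers here; a policy that knew the future would have kept `fft` resident instead of `fir`, which is the kind of decision the slide calls the challenge.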
Commercial Fine-Grain Reconfigurable Architectures
- Xilinx: Spartan-3/Spartan-3L, Virtex-4, Virtex-5
- Altera: Cyclone, Cyclone II, Stratix II/Stratix II GX
- Actel: Fusion, ProASIC3/ProASICPLUS, Axcelerator, Varicore
- Atmel: AT40K/AT40KLV, AT6000
- QuickLogic: PolarPro, Eclipse II
- Lattice: LatticeECP2, LatticeXP
Xilinx Spartan-3
- CLB
  - Four slices
  - Two logic function generators/slice
  - Two storage elements/slice
- Interconnect
  - Long lines (one out of every six CLBs)
  - Hex lines (one out of every three CLBs)
  - Double lines (every other CLB)
  - Direct lines (each CLB with its neighbours)
- Advanced features
  - BlockRAM
  - Dedicated Multipliers
  - Digital Clock Managers
- Configuration
  - SRAM
Xilinx Spartan-3
Xilinx Virtex-4
- Three variations (LX, FX, SX)
- CLB
  - Four slices
  - Two logic function generators/slice
  - Two storage elements/slice
- Advanced features
  - BlockRAM
  - XtremeDSP slices
  - Digital Clock Managers
- Additional features in the FX family
  - 8–24 RocketIO Multi-Gigabit serial Transceivers
  - One or Two PowerPC cores
  - Two or Four Tri-MAC Cores
- Configuration
  - SRAM
Xilinx Virtex-5
- 65 nm ExpressFabric
  - 6-input LUTs
- Interconnect
  - Diagonal symmetric interconnect
- Advanced features
  - DCM and PLLs
  - BlockRAM
  - DSP48E slices
- Configuration
  - SRAM
  - Advanced Encryption Standard technology for bitstream protection
Altera Cyclone/Cyclone II
- Essentially the same architecture in 130 nm (Cyclone) and 90 nm (Cyclone II)
- LE (10 per LAB):
  - 4-input LUT
  - Register
  - Carry chain
- MultiTrack Interconnect
  - Row and column interconnects spanning fixed distances
- Advanced Features:
  - Embedded Memory
  - PLLs
  - External RAM interfacing
  - Embedded multipliers (Cyclone II only)
Cyclone Logic Element
Altera Stratix II/Stratix II GX
- Adaptive Logic Modules
- MultiTrack Interconnect
- Advanced Features:
  - TriMatrix Memory
Adaptive Logic Module
Review Questions
- Can you partially reconfigure a single-context FPGA?
- How often do you need to reconfigure an FPGA device with SRAM configuration memory?
- Two designs, one comprising 200 CLBs and one comprising 400 CLBs, are to be downloaded to the same device, which does not support dynamic reconfiguration. How large is the bitstream of the second design compared with that of the first?