-
L12: 6.111 Spring 2006 Introductory Digital Systems Laboratory
L12: Reconfigurable Logic Architectures
Acknowledgements:
Materials in this lecture are courtesy of the following sources and are used with permission.
  Prof. Randy Katz (United Microelectronics Corporation Distinguished Professor in Electrical Engineering and Computer Science at the University of California, Berkeley) and Prof. Gaetano Borriello (University of Washington Department of Computer Science & Engineering). From Chapter 2 of R. Katz, G. Borriello, Contemporary Logic Design, 2nd ed., Prentice-Hall/Pearson Education, 2005.
  Frank Honore
-
History of Computational Fabrics
Discrete devices: relays, transistors (1940s-50s)
Discrete logic gates (1950s-60s)
Integrated circuits (1960s-70s)
  e.g. TTL packages: Data Book for 100s of different parts
Gate Arrays (IBM 1970s)
  Transistors are pre-placed on the chip, and Place and Route software puts the chip together automatically
  Only the interconnect is programmed (mask programming)
Software Based Schemes (1970s-present)
  Run instructions on a general purpose core
Programmable Logic (1980s to present)
  A chip that can be reprogrammed after it has been fabricated
  Examples: PALs, EPROM, EEPROM, PLDs, FPGAs
  Excellent support for mapping from Verilog
ASIC Design (1980s to present)
  Turn Verilog directly into layout using a library of standard cells
  Effective for high-volume production and efficient use of silicon area
-
Reconfigurable Logic
Logic blocks
  To implement combinational and sequential logic
Interconnect
  Wires to connect inputs and outputs to logic blocks
I/O blocks
  Special logic blocks at periphery of device for external connections
Key questions:
  How to make logic blocks programmable? (after chip has been fabbed!)
  What should the logic granularity be?
  How to make the wires programmable? (after chip has been fabbed!)
  Specialized wiring structures for local vs. long distance routes?
  How many wires per logic block?
[Figure: logic block with n inputs, m outputs, configuration bits, and a D flip-flop with SET and CLR]
-
Programmable Array Logic (PAL)
Based on the fact that any combinational logic can be realized as a sum-of-products
PALs feature an array of AND-OR gates with programmable interconnect
[Figure: input signals feed the AND array (programming of product terms), which feeds the OR array (programming of sum terms), which drives the output signals]
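The sum-of-products structure can be sketched in software. This is a minimal illustrative model, not a description of any particular PAL part: the function names and the dictionary encoding of the programmed connections are assumptions made for this example.

```python
from itertools import product

def product_term(inputs, literals):
    """AND of the selected literals.

    `literals` maps input index -> required value (1 = true input,
    0 = complemented input). Inputs not listed are left unconnected.
    """
    return all(inputs[i] == v for i, v in literals.items())

def pal_output(inputs, terms):
    """OR of the programmed product terms feeding one output."""
    return int(any(product_term(inputs, t) for t in terms))

# Hypothetical programming: XOR of inputs 0 and 1 as a'b + ab'.
XOR_TERMS = [{0: 0, 1: 1}, {0: 1, 1: 0}]

truth_table = {bits: pal_output(bits, XOR_TERMS)
               for bits in product((0, 1), repeat=2)}
```

Each dictionary plays the role of one row of the AND array, and the list plays the role of the OR array connections for a single output.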
-
Inside the 22v10 PAL
Each input pin (and its complement) is sent to the AND array
OR gates for each output can take 8-16 product terms, depending on the output pin
Macrocell block provides additional output flexibility...
Image removed due to copyright restrictions.
-
Cypress PAL CE22V10
Outputs may be registered or combinational, positive or inverted
Images courtesy of Lattice Semiconductor Corporation. Used with
permission.
Image removed due to copyright restrictions.
From Lattice Semiconductor
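The four output options (registered or combinational, positive or inverted) can be pictured as two configuration bits on each macrocell. The sketch below is a simplified behavioral model under that assumption; the class name `Macrocell` and the flags `registered` and `inverted` are hypothetical and do not correspond to the device's actual fuse names.

```python
class Macrocell:
    """Hypothetical output macrocell with two configuration bits."""

    def __init__(self, registered, inverted):
        self.registered = registered  # True: output comes from the flip-flop
        self.inverted = inverted      # True: output polarity is inverted
        self.q = 0                    # flip-flop state

    def clock(self, sum_term):
        """Rising clock edge: the flip-flop captures the OR-gate result."""
        self.q = sum_term

    def output(self, sum_term):
        """Pin value: pick the registered or combinational path, then polarity."""
        value = self.q if self.registered else sum_term
        return value ^ 1 if self.inverted else value

comb = Macrocell(registered=False, inverted=True)
reg = Macrocell(registered=True, inverted=False)
reg.clock(1)  # capture a 1; the pin holds it until the next clock
```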
-
Anti-Fuse-Based Approach (Actel)
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse technology: program once
8-input, single-output combinational logic blocks
FFs constructed from discrete cross-coupled gates
Use anti-fuses to build up long wiring runs from short segments
[Figure: rows of logic modules alternating with wiring tracks, bordered on all four sides by I/O buffers, programming and test logic]
-
Actel Logic Module
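Combinational block does not have the output FF
Example Gate Mapping
[Figure: example gate mappings onto the multiplexer-based logic module (select codes 00, 01, 10, 11), with inputs tied to signals A to E, GND, or VDD to produce output Y; one mapping builds an S-R flip-flop with inputs S, R and output Q]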
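Gate mapping onto a multiplexer-based module can be tried out in a few lines. The sketch below assumes a module made of three 2:1 multiplexers, with the final select formed by ORing two inputs, which gives the 8 inputs and single output mentioned earlier. The figure itself is not reproduced in this text, so the exact structure and the pin names (`a0`, `sa`, `s0`, and so on) are assumptions.

```python
from itertools import product

def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def logic_module(a0, a1, sa, b0, b1, sb, s0, s1):
    """Assumed 8-input, 1-output module: two 2:1 muxes feed a third,
    whose select is the OR of s0 and s1."""
    return mux2(s0 | s1, mux2(sa, a0, a1), mux2(sb, b0, b1))

def and2(a, b):
    # Upper mux passes b only when a = 1; its other input is tied to GND.
    return logic_module(a0=0, a1=b, sa=a, b0=0, b1=0, sb=0, s0=0, s1=0)

def or2(a, b):
    # Final select is a OR b; upper mux tied to GND, lower mux tied to VDD.
    return logic_module(a0=0, a1=0, sa=0, b0=1, b1=1, sb=0, s0=a, s1=b)

def xor2(a, b):
    # a = 0 selects b (upper mux); a = 1 selects NOT b (lower mux).
    return logic_module(a0=0, a1=1, sa=b, b0=1, b1=0, sb=b, s0=a, s1=0)

results = [(a, b, and2(a, b), or2(a, b), xor2(a, b))
           for a, b in product((0, 1), repeat=2)]
```

Tying inputs to constants (GND = 0, VDD = 1) is what turns the general module into a specific gate.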
-
Actel Routing & Programming
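[Figure: logic module outputs drive output segments onto long vertical tracks; input segments connect the horizontal channel to logic module inputs]
[Figure: programming an antifuse. Labels: precharge phase with tracks at Vpp/2; Vpp and Gnd applied across the selected antifuse, which is shorted; other tracks remain at Vpp/2]
Programming is permanent (one time)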
Courtesy of Actel. Used with permission.
-
RAM Based Field Programmable Logic - Xilinx
[Figure: array of Configurable Logic Blocks (CLBs) joined by programmable interconnect and switch matrices, surrounded by I/O Blocks (IOBs)]
[Figure: IOB detail with pad, input buffer, output buffer, D flip-flops, slew rate control, passive pull-up/pull-down, delay, and Vcc]
[Figure: CLB detail with function generators F (inputs F1-F4), G (inputs G1-G4), and H; control inputs C1-C4 carrying H1, DIN, S/R, EC; two D flip-flops with S/R control and clock enable; clock K; outputs X and Y]
Courtesy of Xilinx. Used with permission.
-
The Xilinx 4000 CLB
Courtesy of Xilinx. Used with permission.
-
Two 4-input Functions, Registered Output, and a Two Input Function
Courtesy of Xilinx. Used with permission.
-
5-input Function, Combinational Output
Courtesy of Xilinx. Used with permission.
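One way to see how a 5-input function fits into two 4-input function generators plus a third is Shannon expansion on the fifth input: f(a,b,c,d,e) = e'·f(a,b,c,d,0) + e·f(a,b,c,d,1). The sketch below assumes F and G hold the two cofactors and H acts as a 2:1 multiplexer steered by the fifth input. The helper names and the example function `majority5` are hypothetical choices for illustration.

```python
from itertools import product

def make_lut(func, n):
    """Tabulate an n-input Boolean function into 2**n stored bits."""
    return [func(*bits) for bits in product((0, 1), repeat=n)]

def read_lut(table, *bits):
    """Use the input bits (first bit most significant) as a table address."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return table[address]

def majority5(a, b, c, d, e):
    """Example 5-input function: 1 when at least three inputs are 1."""
    return int(a + b + c + d + e >= 3)

# F holds the cofactor with e = 0, G the cofactor with e = 1.
F = make_lut(lambda a, b, c, d: majority5(a, b, c, d, 0), 4)
G = make_lut(lambda a, b, c, d: majority5(a, b, c, d, 1), 4)
# H is a 3-input function of (F', G', e): a 2:1 multiplexer.
H = make_lut(lambda f, g, e: g if e else f, 3)

def five_input(a, b, c, d, e):
    """Evaluate the 5-input function through F, G, and H only."""
    return read_lut(H, read_lut(F, a, b, c, d), read_lut(G, a, b, c, d), e)
```

F and G see the same four inputs, so any 5-input function can be covered this way, at the cost of using all three function generators.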
-
LUT Mapping
N-LUT: direct implementation of a truth table, any function of N inputs
An N-LUT requires 2^N storage elements (latches)
The N inputs select one latch location (like a memory)
4-LUT example
  Latches set by configuration bitstream
[Figure: 4-LUT with inputs selecting one latch to drive the output]
Why Latches and Not Registers?
Courtesy of Xilinx. Used with permission.
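A 4-LUT can be modeled as a 16-bit configuration word indexed by its inputs. This is a minimal sketch: the function name `lut4`, the bit ordering (input `a` as the least significant address bit), and the two example configuration constants are assumptions for illustration, not the actual bitstream format of any device.

```python
N = 4
STORAGE_BITS = 2 ** N  # a 4-LUT needs 16 storage elements

def lut4(config, a, b, c, d):
    """Read one of 16 configuration bits; the inputs form the address."""
    address = (d << 3) | (c << 2) | (b << 1) | a
    return (config >> address) & 1

# Bit i of the word is the function's output for address i.
XOR4_CONFIG = 0x6996  # bit i is the parity of i
AND4_CONFIG = 0x8000  # only address 15 (all inputs high) stores a 1
```

Changing the function means changing only the stored word; the lookup hardware stays the same.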
-
Configuring the CLB as a RAM
Memory is built using latches, not FFs
Read is the same as a LUT function!
[Figure: CLB configured as a 16x2 RAM]
Courtesy of Xilinx. Used with permission.
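The 16x2 organization can be pictured as two 16-entry latch arrays that share one 4-bit address. The sketch below is a behavioral model under that assumption; the class name and the choice of which array holds which data bit are hypothetical.

```python
class Clb16x2Ram:
    """Hypothetical model: two 16-entry latch arrays sharing a 4-bit address."""

    def __init__(self):
        self.f_bits = [0] * 16  # latches of the first function generator
        self.g_bits = [0] * 16  # latches of the second function generator

    def write(self, address, data):
        """Store a 2-bit word: bit 0 goes to F, bit 1 goes to G."""
        self.f_bits[address] = data & 1
        self.g_bits[address] = (data >> 1) & 1

    def read(self, address):
        """Same lookup a LUT performs: the address selects one latch per array."""
        return (self.g_bits[address] << 1) | self.f_bits[address]

ram = Clb16x2Ram()
ram.write(5, 0b10)
ram.write(9, 0b11)
```

The only difference from LUT operation is that `write` can change the stored bits at run time instead of only at configuration.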
-
Xilinx 4000 Interconnect
Courtesy of Xilinx. Used with permission.
-
Xilinx 4000 Interconnect Details
Wires are not ideal!
Courtesy of Xilinx. Used with permission.
-
Xilinx 4000 Flexible IOB
Adjust Transition Time
Adjust the Sampling Edge
Outputs through FF or bypassed
Courtesy of Xilinx. Used with permission.
-
Add Bells & Whistles
Hard processor
I/O
BRAM
Gigabit serial
Multiplier (18 bit x 18 bit inputs, 36 bit result)
Programmable termination (impedance control, Z, VCCIO)
Clock management
Courtesy of David B. Parlour, ISSCC 2004 Tutorial, "The Reality and Promise of Reconfigurable Computing in Digital Signal Processing," and Xilinx. Used with permission.
-
The Virtex II CLB (Half Slice Shown)
Courtesy of Xilinx. Used with permission.
-
Adder Implementation
Y = A ⊕ B ⊕ Cin
LUT: A ⊕ B
1 half-slice = 1-bit adder
Dedicated carry logic
[Figure: half-slice adder with inputs A, B, Cin and carry output Cout]
Courtesy of Xilinx. Used with permission.
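The split between the LUT and the dedicated carry logic can be sketched as follows. The sum bit is A xor B xor Cin, with the LUT computing A xor B. The carry is modeled here as a multiplexer controlled by the LUT output, which is a common way dedicated carry logic is built; that detail, and the function name, are assumptions rather than something stated in the text.

```python
from itertools import product

def half_slice_adder(a, b, cin):
    """One bit of addition, split between a LUT and carry logic."""
    propagate = a ^ b                # computed in the LUT
    y = propagate ^ cin              # sum bit: A xor B xor Cin
    cout = cin if propagate else a   # carry mux: pass Cin, or generate from A
    return y, cout

table = {bits: half_slice_adder(*bits) for bits in product((0, 1), repeat=3)}
```

When A and B differ, the incoming carry propagates; when they are equal, the carry out equals A regardless of Cin, so the carry path never passes through the LUT.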
-
Carry Chain
1 CLB = 4 slices = two 4-bit adders
64-bit adder: 16 CLBs
CLBs must be in the same column
[Figure: A[63:0] + B[63:0] = Y[63:0] with carry out Y[64], built as a column of CLB0 (A[3:0], B[3:0], Y[3:0]), CLB1 (A[7:4], B[7:4], Y[7:4]), ..., CLB15 (A[63:60], B[63:60], Y[63:60])]
Courtesy of Xilinx. Used with permission.
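The arithmetic behind "64-bit adder: 16 CLBs" is 64 bits / 4 bits per block = 16 blocks, with the carry rippling from one block to the next. The sketch below models that chain; the function names are hypothetical, and each 4-bit block is modeled arithmetically rather than gate by gate.

```python
def add4(a, b, cin):
    """4-bit adder block: returns (4-bit sum, carry out)."""
    total = a + b + cin
    return total & 0xF, total >> 4

def add64(a, b):
    """Chain 16 four-bit blocks; block i handles bits [4i+3:4i]."""
    y, carry = 0, 0
    for i in range(16):
        nibble_a = (a >> (4 * i)) & 0xF
        nibble_b = (b >> (4 * i)) & 0xF
        s, carry = add4(nibble_a, nibble_b, carry)
        y |= s << (4 * i)
    return y | (carry << 64)  # bit 64 is the final carry out, Y[64]
```

Because each block waits on the carry from the block below it, keeping the blocks adjacent in one column keeps the carry path short.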
-
Virtex II Features
Double Data Rate registers
Digital Clock Manager
Embedded Multiplier
Block SelectRAM
Courtesy of Xilinx. Used with permission.
-
The Latest Generation: Virtex-II Pro
High-speed I/O
Embedded PowerPC
Hardwired multipliers
Embedded memories
FPGA fabric
Courtesy of Xilinx. Used with permission.
-
FPGA Evolution Summary [Parlour04]
[Figure: transistor count (x 10^6, log scale from 0.1 to 1000) versus year, 1980 to 2005. Features added over time: Logic + FF, Distributed RAM, Block RAM, Arithmetic Support, Hard MAC, Hard CPU, High Speed Serial IO, DSP System Design Tools. Eras: Glue Logic, Core Functionality, Logic Platform, System Platform, Domain Specific Platform]
Courtesy of Xilinx. Used with permission.
-
Design Flow - Mapping
Technology Mapping: Schematic/HDL to physical logic units
Compile functions into basic LUT-based groups (function of target architecture)
always @(posedge Clock or negedge Reset)
begin
  if (! Reset)
    q
-
Design Flow - Placement & Route
Placement: assign each logic block a location on a particular device
[Figure: LUTs placed on the device]
Routing: iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay; can take hours or days for large, dense designs
Iterate placement if timing is not met
Satisfy timing? Generate bitstream to configure the device
Challenge! Cannot use the full chip at reasonable speeds (wires are not ideal). Typically no more than 50% utilization.
-
Example: Verilog to FPGA
64-bit Adder Example
// 64-bit adder: sum is 64 bits wide, so the carry out is discarded
module adder64 (a, b, sum);
  input [63:0] a, b;
  output [63:0] sum;
  assign sum = a + b;
endmodule
Flow: Synthesis, Tech Map, Place & Route
Target: Virtex II XC2V2000
Courtesy of Xilinx. Used with permission.
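A software reference model is a convenient way to check what the `adder64` module should produce. The sketch below is an assumption-level model, not part of the original flow: it mirrors `assign sum = a + b` with a 64-bit `sum`, so the result wraps modulo 2^64. The name `MASK64` is hypothetical.

```python
MASK64 = (1 << 64) - 1

def adder64(a, b):
    """Reference model of `assign sum = a + b` with a 64-bit `sum`."""
    return (a + b) & MASK64

wrapped = adder64(MASK64, 1)  # all ones plus one wraps around
```

If the carry out is needed, the output would have to be one bit wider than the inputs.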
-
How are FPGAs Used?
Prototyping
  Ensemble of gate arrays used to emulate a circuit to be manufactured
  Get more/better/faster debugging done than with simulation
Reconfigurable hardware
  One hardware block used to implement more than one function
Special-purpose computation engines
  Hardware dedicated to solving one problem (or class of problems)
  Accelerators attached to general-purpose computers (e.g., in a cell phone!)
-
Summary
FPGAs provide a flexible platform for implementing digital computing
A rich set of macros and I/Os is supported (multipliers, block RAMs, ROMs, high-speed I/O)
A wide range of applications, from prototyping (to validate a design before ASIC mapping) to high-performance spatial computing
Interconnects are a major bottleneck (physical design and locality are important considerations)
College students will study concurrent programming instead of C as their first computing experience.
-- David B. Parlour, ISSCC 2004 Tutorial