CSET 4650 Field Programmable Logic Devices Dan Solarek Overview of FPGA Interconnect.
Post on 25-Dec-2015
2
Programmable Interconnect
In addition to programmable logic cells, FPGAs must have programmable interconnect.
The structure and complexity of the interconnect are determined by the programming technology and the architecture of the logic cell.
Interconnect is typically built from aluminum-based metal layers:
Resistance of approximately 50 mΩ/square
Line capacitance of approximately 0.2 pF/cm
Early FPGAs had two metal interconnect layers, but current high-density parts may have three or more metal layers.
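To get a feel for these numbers, the sketch below estimates the lumped RC time constant of one straight metal wire from the sheet resistance and line capacitance quoted on this slide. The 1 cm length and 1 µm width are illustrative assumptions, not values from the slides, and the function and constant names are made up for this example.

```python
# Rough lumped-RC estimate for one metal wire.
R_SHEET_OHM_PER_SQ = 0.050   # 50 milliohm/square (from the slide)
C_LINE_F_PER_CM = 0.2e-12    # 0.2 pF/cm (from the slide)

def wire_rc(length_cm, width_um):
    """Return (resistance in ohms, capacitance in farads, RC in seconds)."""
    length_um = length_cm * 1e4
    squares = length_um / width_um          # number of squares along the wire
    resistance = R_SHEET_OHM_PER_SQ * squares
    capacitance = C_LINE_F_PER_CM * length_cm
    return resistance, capacitance, resistance * capacitance

# Assumed geometry: 1 cm long, 1 um wide.
r, c, rc = wire_rc(length_cm=1.0, width_um=1.0)
print(f"R = {r:.0f} ohm, C = {c * 1e12:.2f} pF, RC = {rc * 1e12:.0f} ps")
```

With this assumed geometry the wire alone contributes about 500 Ω and 0.2 pF, which is small next to the resistance of a programmable switch.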
3
Field-Programmable Gate Arrays
Requires some form of programmable interconnect at crossovers …
[Figure: oversimplified array of CLBs with interconnect running between them]
4
Tradeoffs in FPGA Interconnect
How are logic blocks arranged?
How “rich” is interconnect between channels?
How many wires will be needed between them?
Are wires evenly distributed across chip?
How should wires be segmented (short, long)?
How long is the average wire?
How much buffering do we add to wires?
5
Tradeoffs in FPGA Interconnect
Programmability slows signals down …
Are some wires specialized to long distances?
How many inputs/outputs must be routed to/from each configurable logic block?
What utilization are we willing to accept?
20%? 50%? 90%?
6
Interconnect Comes With a Cost
7
Routing: Choosing a Path
[Figure: two LEs connected through a wiring channel; labels: LE, wiring channel, wire, switch]
Routing is done by a software tool.
LEs hold previously placed functions.
8
Routing Considerations
Global routing: Which combination of channels?
Local routing: Which wire in each channel?
Routing metrics:
Net length
Delay
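As a rough illustration of global routing, the sketch below treats the chip as a grid of channel intersections and uses breadth-first search to find a shortest sequence of channel segments between two points, avoiding segments marked as full. The grid model, the function name `global_route`, and the `full` set are assumptions made for this example; real routers use richer cost functions that also account for delay.

```python
from collections import deque

def global_route(cols, rows, src, dst, full=frozenset()):
    """Shortest path of grid points from src to dst, or None if unroutable.

    Points are (x, y) channel intersections; `full` holds congested
    segments as frozenset({point_a, point_b}) that may not be used.
    """
    prev = {src: None}
    queue = deque([src])
    while queue:
        here = queue.popleft()
        if here == dst:
            path = []
            while here is not None:      # walk back to the source
                path.append(here)
                here = prev[here]
            return path[::-1]
        x, y = here
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < cols and 0 <= nxt[1] < rows
            if inside and nxt not in prev and frozenset({here, nxt}) not in full:
                prev[nxt] = here
                queue.append(nxt)
    return None

# Hypothetical 4 x 3 grid with one congested segment on the direct path.
path = global_route(4, 3, src=(0, 0), dst=(3, 0),
                    full={frozenset({(1, 0), (2, 0)})})
print(path, "net length:", len(path) - 1)
```

The net length metric here is simply the number of channel segments used; the congested segment forces a detour, so the route is longer than the direct three-segment path.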
9
Programmable vs. Fixed Interconnect
Switch adds delay
Transistor off-state is worse in advanced technologies
FPGA interconnect has extra length = added capacitance
10
Interconnect Strategies
Some wires will not be utilized
Congestion will not be the same throughout the chip
Types of wires:
Short wires: local LE connections
Global wires: long-distance, buffered communication
Special wires: clocks, etc.
11
Paths in Interconnect
Connections may be long and complex
Long wires can help simplify
[Figure: 3 × 5 array of LEs separated by horizontal and vertical wiring channels]
12
Interconnect Architecture
Connections from wiring channels to LEs.
Connections between wires in the wiring channels.
[Figure: two LEs attached to a wiring channel through switches]
13
Interconnect Richness
Within a channel:
How many wires
Length of segments
Connections from LE to interconnect channel
Between channels:
Number of connections between channels
Channel structure
14
Segmented Wiring
[Figure: wiring channel with segments of length 1 and length 2]
15
Offset Segments
16
Switchbox
[Figure: switchbox at the intersection of horizontal and vertical channels]
Multiple switch points
Increased flexibility
17
Actel FPGAs
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse technology: program once
8-input, single-output combinational logic blocks
FFs constructed from discrete cross-coupled gates
Use anti-fuses to build up long wiring runs from short segments
[Figure: chip floorplan with rows of logic modules and wiring tracks, bordered on all four sides by I/O buffers, programming and test logic]
18
Actel Programmable Interconnect
Actel interconnect is similar to a channeled gate array
Horizontal routing channels between rows of logic modules
Vertical routing channels on top of cells
Each channel has a fixed number of tracks, each of which holds one wire
Wires are divided into segments of various lengths (segmented channel routing)
Long vertical tracks (LVT) extend the entire height of the chip
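A minimal sketch of the bookkeeping behind segmented channel routing: given the break points of one track, count how many segments a connection spanning columns `left` to `right` must occupy. It assumes that adjacent segments on a track are joined by programming an antifuse, so a connection using s segments needs s - 1 antifuses along the track. The track layout and the function name are hypothetical.

```python
from bisect import bisect_right

def segments_used(breaks, left, right):
    """Number of track segments covering columns left..right (inclusive).

    `breaks` is a sorted list of columns at which a new segment starts
    (the first segment starts at column 0).
    """
    first = bisect_right(breaks, left)    # index of the segment holding `left`
    last = bisect_right(breaks, right)    # index of the segment holding `right`
    return last - first + 1

# Assumed track: segments cover columns 0-3, 4-7, 8-15, and 16 onward.
track = [4, 8, 16]
used = segments_used(track, left=2, right=9)
print(used, "segments,", used - 1, "antifuses on the track")
```

A router would prefer a track whose segment boundaries fit the connection closely, since every extra antifuse adds series resistance.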
19
Actel Programmable Interconnect
Each logic module has connections to its inputs and outputs called stubs
Input stubs extend vertically into the routing channels above and below the logic module
Output stub extends vertically 2 channels up and 2 channels down
Wires are connected by antifuses
20
Actel Interconnect
[Figure: interconnection fabric; labels: Logic Module, Horizontal Track, Vertical Track, Anti-fuse]
21
Actel Routing Example
Jogs cross an anti-fuse
Minimize the number of jogs for speed-critical circuits
2 - 3 jogs for most interconnections
[Figure: route from a logic module output to the inputs of other logic modules; labels: Logic Module, Output, Input]
22
Metal to Metal Antifuse
The metal-to-metal antifuse moved the antifuse out of the silicon, making the part denser and faster
23
Metal to Metal Antifuse
[Figure: two-dimensional sea of modules; labels: modules, tracks]
24
Actel Programmable Interconnect
25
Detail of ACT 1 Channel Architecture
ACT 1 horizontal and vertical channel architecture
26
Routing Resources
ACT 1 interconnection architecture:
22 horizontal tracks per channel for signal routing, plus 3 tracks dedicated to VDD, GND, and GCLK
8 vertical tracks per LM are available for inputs (4 from the LM above the channel, 4 from the LM below): the input stubs
4 vertical tracks per LM for outputs: the output stubs
An output vertical track extends across the two channels above the module and the two channels below
1 long vertical track (spans the entire height of the chip)
27
Elmore’s Constant
Approximation of the waveform at node $i$:
$$V_i(t) \approx e^{-t/\tau_{Di}}, \qquad \tau_{Di} = \sum_{k=1}^{n} R_{ki} C_k$$
where $R_{ki}$ is the resistance of the path to $V_0$ shared by node $k$ and node $i$.
Examples: $R_{24} = R_1$, $R_{22} = R_1 + R_2$, and $R_{31} = R_1$.
If the switching points are assumed to be at the 0.35 and 0.65 points, the delay at node $i$ can be approximated by $\tau_{Di}$.
Measuring the delay of a net. (a) An RC tree. (b) The waveforms as a result of closing the switch at t = 0.
28
Elmore’s Constant
$\tau_{Di}$ is the Elmore time constant.
The name serves as a reminder that, if we approximate $V_i$ by an exponential waveform, the delay of the RC tree using 0.35/0.65 trip points is approximately $\tau_{Di}$ seconds.
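The sum that defines the Elmore time constant can be evaluated mechanically for any RC tree. The sketch below is one way to do it, under stated assumptions: the tree shape (R1 feeds node 1, R2 and R3 branch from node 1, R4 follows node 3) is a guess chosen only because it reproduces the three shared-resistance examples given on the slides, and the component values and function names are illustrative.

```python
def shared_resistance(parent, res, k, i):
    """R_ki: resistance of the path to V0 shared by nodes k and i."""
    def path_to_root(node):
        edges = set()
        while node != 0:          # node 0 stands for the source V0
            edges.add(node)       # each resistor is named after its child node
            node = parent[node]
        return edges
    common = path_to_root(k) & path_to_root(i)
    return sum(res[n] for n in common)

def elmore(parent, res, cap, i):
    """Elmore time constant at node i: sum over k of R_ki * C_k."""
    return sum(shared_resistance(parent, res, k, i) * cap[k] for k in cap)

# Assumed tree: R1 feeds node 1; R2 and R3 branch from node 1; R4 follows node 3.
parent = {1: 0, 2: 1, 3: 1, 4: 3}
res = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}     # illustrative ohms
cap = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}     # illustrative farads
print("R_24 =", shared_resistance(parent, res, 2, 4))   # equals R1 in this tree
print("Elmore time constant at node 4 =", elmore(parent, res, cap, 4))
```

Note that every capacitor in the tree contributes to the time constant at node 4, including those on side branches, weighted by the resistance its charging path shares with node 4.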
29
RC Delay in Antifuse Connections
Actel routing model. (a) A four-antifuse connection. L0 is an output stub, L1 and L3 are horizontal tracks, L2 is a long vertical track (LVT), and L4 is an input stub. (b) An RC-tree model. Each antifuse is modeled by a resistance and each interconnect segment is modeled by a capacitance.
30
RC Delay in Antifuse Connections
$R_n$ is the resistance of antifuse $n$; $C_n$ is the capacitance of wire segment $n$.
$$\begin{aligned}
\tau_{D4} &= R_{14}C_1 + R_{24}C_2 + R_{34}C_3 + R_{44}C_4 \\
&= (R_1 + R_2 + R_3 + R_4)C_4 + (R_1 + R_2 + R_3)C_3 + (R_1 + R_2)C_2 + R_1C_1
\end{aligned}$$
If all antifuse resistances are approximately equal and much larger than the resistance of the wire segments, then $R_1 = R_2 = R_3 = R_4 = R$, and:
$$\tau_{D4} = 4RC_4 + 3RC_3 + 2RC_2 + RC_1$$
If the segment capacitances are also roughly equal ($C_n = C$), a connection with two antifuses will generate a 3RC time constant, a connection with three antifuses a 6RC time constant, and a connection with four antifuses a 10RC time constant.
Interconnect delay grows quadratically ($\propto n^2$) as the number of antifuses $n$ increases.
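The 3RC, 6RC, 10RC progression can be checked with a few lines of code. This is a minimal sketch that assumes every antifuse has the same resistance and every segment the same capacitance; the function name is made up for this example.

```python
def chain_time_constant(n, r=1.0, c=1.0):
    """Elmore time constant at the end of n identical antifuse/segment stages."""
    # Segment k is charged through the k antifuses between it and the driver,
    # so its shared resistance with the last node is k * r.
    return sum(k * r * c for k in range(1, n + 1))

for n in (1, 2, 3, 4):
    print(n, "antifuses ->", chain_time_constant(n), "RC")
```

The sum 1 + 2 + ... + n equals n(n + 1)/2, which is where the quadratic growth comes from.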
31
Actel Routing Resources
32
Xilinx LCA Interconnect
Xilinx LCA interconnect has a hierarchical architecture:
Vertical lines and horizontal lines run between CLBs
General-purpose interconnect joins switch boxes (also known as magic boxes or switching matrices)
Long lines run across the entire chip and can be used to form internal buses using the three-state buffers that are next to each CLB
Direct connections bypass the switch matrices and directly connect adjacent CLBs
Programmable Interconnect Points (PIPs) are programmable pass transistors that connect CLB inputs and outputs to the routing network
Bi-directional interconnect buffers (BIDI) restore the logic level and logic strength on long interconnect paths
33
Xilinx FPGA Internals
Portion of a Xilinx 4000 FPGA
Shows relative sizes of major elements
Need more detail about interconnect architecture
34
Xilinx 4000 Interconnect
A closer look
Programmable switch matrices (PSMs)
Single-length lines run between adjacent PSMs
Double-length lines skip a PSM
35
Switch Detail and Scale
CLBs in a sea of interconnect
Programmable Switch Matrix (PSM)
Connections are controlled by SRAM bits
Long lines
Global lines
36
Programmable Switch Matrix
37
Programmable Switch Matrix
38
Pass Transistor Control
39
Programmable Switch Matrix
[Figure: programmable switch element; a signal entering on one side can continue straight or turn the corner]
40
Xilinx LCA Interconnect (cont.)
Xilinx LCA interconnect. (a) The LCA architecture (notice the matrix element size is larger than a CLB). (b) A simplified representation of the interconnect resources. Each of the lines is a bus.
41
Xilinx Switching Matrix and Components of Interconnect Delay
Components of interconnect delay in a Xilinx LCA array. (a) A portion of the interconnect around the CLBs. (b) A switching matrix. (c) A detailed view inside the switching matrix showing the pass-transistor arrangement. (d) The equivalent circuit for the connection between nets 6 and 20 using the matrix. (e) A view of the interconnect at a Programmable Interconnection Point (PIP). (f) and (g) The equivalent schematic of a PIP connection. (h) The complete RC delay path.
42
Routing Connections
A connection is realized in an FPGA interconnect fabric by enabling routing switches in the connection boxes and switch boxes.
43
Routing Connections
The parasitic contributions from the switches (realized as pass transistors) and the metal traces make up the total resistive and capacitive components of the interconnect.
44
Routing Connections
Based on the switch and wire parasitics, interconnect routes can be modeled as RC networks.
For typical parasitic values, Rwire is negligible compared to Ron and can therefore be dropped.
45
Routing Connections
The capacitance of a route segment is given by:
Cseg = 10Cdiff + Cwire
This can be used to model the energy of the route as:
Energy (E) ∝ 50Cdiff + 4Cwire
The delay of the route can be computed as follows:
Delay (D) ∝ 10RonCwire + 125RonCdiff
This model of the interconnect can be used to compute the cost of architectural modifications.
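A small sketch of how this model might be used to compare architectural choices. It simply evaluates the three expressions from the slide; the parasitic values are assumptions for illustration only, and the energy and delay results should be read as relative figures for comparing options, not as absolute joules or seconds.

```python
def route_model(c_diff, c_wire, r_on):
    """Evaluate the slide's route model; results are relative figures."""
    c_seg = 10 * c_diff + c_wire                       # segment capacitance
    energy = 50 * c_diff + 4 * c_wire                  # relative route energy
    delay = 10 * r_on * c_wire + 125 * r_on * c_diff   # relative route delay
    return c_seg, energy, delay

# Assumed parasitics: 2 fF diffusion, 50 fF wire, 1 kilohm switch.
base = route_model(c_diff=2e-15, c_wire=50e-15, r_on=1e3)

# Hypothetical alternative: a wider switch with half the on-resistance
# but twice the diffusion capacitance.
wide = route_model(c_diff=4e-15, c_wire=50e-15, r_on=0.5e3)

print("delay ratio (wide / base):", wide[2] / base[2])
print("energy ratio (wide / base):", wide[1] / base[1])
```

Under these assumed numbers the wider switch reduces the delay figure at the price of a higher energy figure, which is the kind of tradeoff the model is meant to expose.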
46
Xilinx EPLD Interconnect
The Xilinx EPLD UIM (Universal Interconnection Module). (a) A simplified block diagram of the UIM. The UIM bus width, n, varies from 68 (XC7236) to 198 (XC73108). (b) The UIM is actually a large programmable AND array. (c) The parasitic capacitance of the EPROM cell.
The Xilinx EPLD family uses an interconnect bus called a Universal Interconnection Module (UIM)
The UIM is a programmable AND array with constant delay from any input to any output
C_G is the fixed gate capacitance of the EPROM device
C_D is the fixed drain capacitance of the EPROM device
C_B is the variable horizontal line capacitance
C_W is the variable vertical line capacitance
47
Altera MAX 5K & 7K Interconnect
A simplified block diagram of the Altera MAX interconnect scheme. (a) The PIA (Programmable Interconnect Array) is deterministic: delay is independent of the path length. (b) Each LAB (Logic Array Block) contains a programmable AND array. (c) Interconnect timing within a LAB is also fixed.
Altera MAX 5000 and 7000 devices use a Programmable Interconnect Array (PIA)
The PIA is also a programmable AND array with constant delay from any input to any output
48
Altera MAX 9K Interconnect Architecture
The Altera MAX 9000 interconnect scheme. (a) A 4 × 5 array of Logic Array Blocks (LABs), the same size as the EPM9400 chip. (b) A simplified block diagram of the interconnect architecture showing the connection of the FastTrack buses to a LAB.
Altera MAX 9000 devices use long row and column wires (FastTracks) connected by switches
49
Altera FLEX
The Altera FLEX interconnect scheme. (a) The row and column FastTrack interconnect. (b) A simplified diagram of the interconnect architecture showing the connections between the FastTrack buses and a LAB.
Altera FLEX devices also use FastTracks connected by switches, but the wiring is denser (as are the logic modules)
50
Summary
Antifuse FPGA architectures are dense and regular
SRAM architectures contain nested structures of interconnect resources
Complex PLD architectures use long interconnect lines but achieve deterministic routing