EXPLORING ALTERNATIVE 3D FPGA ARCHITECTURES: DESIGN METHODOLOGY
AND CAD TOOL SUPPORT
K. Siozios, K. Sotiriadis, V. F. Pavlidis†, and D. Soudris
Dept. of Electrical and Computer Engineering, Democritus University
of Thrace, Greece
email: {ksiop, kostsot, dsoudris}@ee.duth.gr
ABSTRACT
This paper introduces a software-supported methodology for
exploring and evaluating 3D FPGA architectures. Two new CAD tools
are developed: (i) 3DPRO, for placement and routing on 3D FPGAs, and
(ii) 3DPower, for power/energy estimation on such architectures.
We mainly focus our exploration on the total number of layers and
the amount of vertical interconnects (vias). The efficiency of the
proposed architecture is evaluated through an exhaustive
exploration of via connections under the Energy×Delay Product
criterion. Experimental results on the 20 largest MCNC benchmarks
demonstrate the effectiveness of our solution. Considering 3D
architectures with 4 layers and two scenarios of fabricated via
densities (30% and 70%), we achieve an average decrease in the
delay, the wire length, and the energy consumption of up to 18%,
17%, and 31%, respectively, as compared to 2D FPGAs. We also
achieve high utilization of the via links.
1. INTRODUCTION
In the real estate market, an often-stated truism is that as land
becomes more expensive, there is a tendency to build upward rather
than outward. This idea has some resonance in the domain of ICs,
where the sizes of the die are limited by yield and performance
constraints. 3D integration can mitigate many of these limitations.
For example, a considerable reduction in the number and length of
the global wires can be achieved [2]. This decrease results, in
turn, in performance enhancements and decreased power consumption
for 3D ICs as compared to 2D circuits.
Recently, many research groups from academia [4, 5, 6, 7, 8],
industry [9], and research institutes [1] have invested
significant effort in designing and manufacturing applications in
3D technologies. Several companies [9] develop 3D ICs for
commercial purposes by wafer stacking, where the distance between
the layers is mainly determined by the wafer thickness. Note that
the existing industrial research primarily concerns the
manufacturing and fabrication processes rather than the
development of tools to support the design of emerging 3D
technologies.
This paper is part of the 03ED593 research project, implemented
within the framework of the "Reinforcement Program of Human
Research Manpower" (PENED) and co-financed by National and
Community Funds (75% from E.U.-European Social Fund and 25% from
the Greek Ministry of Development-General Secretariat of Research
and Technology).
† Dept. of Electrical and Computer Engineering, University of
Rochester, USA
email: [email protected]
Although 3D integration promises considerable benefits, several
challenges need to be addressed. Among others, design space
exploration is essential to build high-performance and low-energy
architectures that exploit all of the advantages offered by 3D
integration. In addition, CAD tools that facilitate the design of
3D circuits are required. To date, there are only a few academic
CAD tools [4, 6] for mapping applications on 3D FPGA technologies,
while there is no complete CAD flow to promote the
commercialization of this promising design paradigm. Furthermore,
there is no commercial CAD tool for realizing applications on 3D
FPGAs, similar to the standalone tools and/or design flows (e.g.,
those provided by Cadence, Mentor Graphics, and Xilinx) for 2D
technologies. Consequently, there is a significant need to develop
algorithms and software tools that exploit the advantages of the
third dimension and solve time-consuming and complex tasks, such
as floorplanning, placement, and routing (P&R) for 3D FPGAs.
In [6], a P&R approach for island-style 3D FPGA architectures
is described. Partitioning-based placement and simulated
annealing-based refinement tools are used, which target the
reduction of the interconnection length. The authors report gains
in wire length compared to 2D architectures, without considering,
however, the wire power consumption and delay. Hence, these tools
(PR3D) cannot be used for exploring alternative 3D
architectures.
In [4], a similar P&R approach for 3D FPGAs is described. The
reconfigurable architecture consists of multiple stacked
functional layers, while the communication among layers is
realized by using 3D Switch Boxes (SBs). A tool named TPR was
developed for P&R in such devices. Although TPR is one of the
first attempts in academia to develop tools for 3D FPGAs, it
suffers from several limitations. The target architecture utilized
in this tool initially assumes an unlimited number of vias, while
TPR aims at minimizing this number. However, such a scenario is
not realistic, since the total number and the spatial distribution
of vias are important problems that need to be addressed. In
addition, this tool cannot estimate other important design
parameters, such as the power/energy consumption.
In this paper, a software-supported design methodology for
exploring several parameters of 3D FPGAs is introduced. We
evaluate a number of cost factors, such as delay, energy
consumption, and total wire length, over a wide range of 3D
architectures. Then, we perform exploration for different numbers
and various locations of the vias that connect circuits within the
3D FPGA. To the best of our knowledge, this is the first time that
a software-supported approach for exploring/evaluating 3D FPGAs
with different numbers of vias is presented. Using the 20 largest
MCNC benchmarks [11], we demonstrate the effectiveness of our
methodology.
1-4244-1060-6/07/$25.00 ©2007 IEEE. 652
Authorized licensed use limited to: EPFL LAUSANNE. Downloaded on
January 25, 2010 at 09:51 from IEEE Xplore. Restrictions apply.
2. THE PROPOSED 3D FPGA ARCHITECTURE AND TOOL FLOW
In order to realize the interlayer vias, we have to extend some
conventional 2D SBs so that they provide connections to the other
layers of the 3D FPGA. Although the utilized SBs are based on the
pattern found in the Xilinx XC-4000 FPGA architecture, the results
are applicable to any other SB pattern found in the literature.
Different SB topologies utilize a different number of pass
transistors, leading to different interconnection delay and power
consumption values. For example, in a 2D SB an incoming routing
track can be connected to three other wires (Fs = 3). Similarly,
in a 3D SB the incoming routing track can be connected to five
other tracks (Fs = 5). In the first case, the SB is formed by 6
transistors, while in the 3D approach 10 transistors are required.
As we target FPGAs, the power consumption is one of the most
important parameters to reduce and, therefore, the selection of
the appropriate connectivity across the 3D device layers is
essential for efficient designs. Also, a large number of vias
occupies a large portion of the Si area, from which active
circuits and interconnects must be excluded. Furthermore, the
effect of the distribution and length of these vias on the
performance and power consumption of 3D FPGAs needs to be
addressed.
The proposed 3D architecture can be constructed by stacking a
number of identical individual 2D layers and providing
communication through interlayer vias among vertically adjacent
SBs. Hence, the SBs are extended to the third dimension, while the
structure of the individual logic blocks remains unchanged.
Based on the number of interconnections required for the
successful implementation of an application onto FPGAs, the nets
can be routed by using various channel segments to enhance both
the delay/power efficiency and the resource utilization. For all
of the simulation/evaluation experiments presented in this work,
we use a multi-segment routing architecture similar to the one
that appears in the Xilinx Virtex devices for horizontal tracks
(composed of routing segments of lengths L1, L2, L6, and long
lines, with a distribution of the segments in each channel of 8%,
20%, 60%, and 12%, respectively). For the vias we use segment
tracks of length L1.
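
As an illustration of the segment distribution above, the following minimal sketch splits a channel of a given width among the four segment types. The function name `tracks_per_segment` and the largest-remainder rounding rule are assumptions made for this example; the text only states the percentages, not how fractional tracks are resolved.

```python
import math
from fractions import Fraction

# Segment mix quoted in the text for the horizontal channels.
SEGMENT_MIX = {
    "L1": Fraction(8, 100),
    "L2": Fraction(20, 100),
    "L6": Fraction(60, 100),
    "long": Fraction(12, 100),
}


def tracks_per_segment(channel_width, mix=SEGMENT_MIX):
    """Split channel_width tracks among segment types.

    Rounding uses the largest-remainder rule (an assumption of this
    sketch, not something stated in the paper).
    """
    exact = {name: share * channel_width for name, share in mix.items()}
    counts = {name: math.floor(value) for name, value in exact.items()}
    leftover = channel_width - sum(counts.values())
    # Give the remaining tracks to the types with the largest fractional part.
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    print(tracks_per_segment(100))
    print(tracks_per_segment(10))
```

For a channel of 100 tracks the split reproduces the quoted percentages directly; for narrower channels the rounding rule decides which segment type receives the leftover tracks.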
In order to model the vertical wires, we assume that each via is
electrically equivalent to a horizontal routing track of the same
length. This means that the vertical tracks of our 3D FPGA have
the same delay and power values as the horizontal segments of
length L1. This assumption is based on the fabrication process
[5], where interlayer vias with a length of 5 µm to 10 µm are
feasible. For such technologies, the delay of the wires dominates
the delay of the transistors (similarly to 2D architectures).
Two new software tools are developed to support the proposed
exploration/evaluation procedure for 3D architectures. These tools
are integrated into the existing MEANDER design flow [12] (Figure
1).
[Figure 1 diagram: Application description in HDL; Synthesis;
Technology Mapping; Existing design flow; Proposed design flow;
Bitstream generation]
Figure 1: The MEANDER Framework for 2D/3D FPGAs
The 3D branch adopts some existing CAD tools from the 2D toolset
[12, 13], which do not need to be adapted for the
three-dimensional topology. Only the tools related to P&R and
power estimation need to be replaced by new tools that consider
the particular traits of 3D FPGAs. More specifically, we develop a
3D Placement and Routing Optimizer (3DPRO). We also introduce
3DPower, a novel tool to model and estimate the power/energy
consumption in 3D architectures. To the best of our knowledge,
this toolset is the first complete framework in academia for
mapping applications onto 3D FPGAs, starting from a high-level
(HDL) description of the application and ending with configuration
file generation. More details about the 3D framework can be found
in [10].
3. EXPLORATION AND COMPARISON RESULTS
We performed a qualitative comparison between 3DPRO and TPR,
the only publicly available tool for P&R on 3D FPGAs (Table 1).
The comparison shows that 3DPRO performs architecture exploration
for a significantly larger number of parameters than TPR.
The effectiveness of the proposed methodology is exhibited by
exploring several 3D architectures for various parameters. We
performed the exploration under the following assumptions: (i) the
total number of layers is equal to four, (ii) the percentage of
vertical interconnects per layer ranges from 0% (i.e.,
conventional 2D FPGA) to 100% (TPR solution), (iii) the spatial
location (x, y, z) of each via per layer remains invariant, (iv) a
via connection between adjacent layers (with length L1) is
electrically equivalent to L1 wires formed on the 2D FPGA plane,
(v) the via width is W=4 in any layer, (vi) the hardware resources
of each layer are identical (i.e., identical number of Basic Logic
Elements (BLEs) among different layers), and (vii) the
applications are implemented onto the smallest number of BLEs per
FPGA layer that can be mapped.
Table 1: Qualitative comparison between TPR and 3DPRO
Feature                            TPR      3DPRO
Architecture exploration           Yes      Yes
Measure delay                      Yes      Yes
Measure wire length                Yes      Yes
Measure power                      No       Yes
Supported switch boxes             Subset   Wilton, Universal,
                                            designer specified
Heterogeneous interconnect
(simultaneously 2D/3D SBs)         No       Yes
Vias exploration                   No       Yes
Belongs to complete framework      No       Yes
Full custom 3D interconnections    No       Yes
Given the size of a layer and the number of available 3D SBs per
layer, the placement pattern of the 3D SBs is derived as follows.
A first 3D SB is assigned to a location (x, y, z) of a certain
layer; the neighboring 3D SBs are then uniformly assigned to the
locations (x+r+1, y, z), where r is derived from the layer size
and the number of available 3D SBs per layer. In other words, r
indicates the number of 2D SBs between two neighboring 3D SBs.
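
The uniform placement pattern described above can be sketched for a single row of switch boxes as follows. This is only an illustration: the spacing r is taken as an input because its derivation is not reproduced here, and the function names `place_3d_sbs_in_row` and `row_pattern`, as well as the choice to start at x0 = 0, are assumptions of this example.

```python
def place_3d_sbs_in_row(row_length, r, x0=0):
    """Return the x positions of 3D SBs in one row of SBs.

    Starting from x0, each next 3D SB is placed at x + r + 1, so that
    r 2D SBs lie between two neighboring 3D SBs.
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    return list(range(x0, row_length, r + 1))


def row_pattern(row_length, r, x0=0):
    """Render one row as text: '3' marks a 3D SB, '2' marks a 2D SB."""
    marked = set(place_3d_sbs_in_row(row_length, r, x0))
    return "".join("3" if x in marked else "2" for x in range(row_length))


if __name__ == "__main__":
    # Hypothetical row of 10 SBs with two 2D SBs between 3D SBs.
    print(place_3d_sbs_in_row(10, 2))
    print(row_pattern(10, 2))
```

With r = 0 every SB in the row is a 3D SB, which corresponds to the 100% case of the exploration.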
In order to evaluate our methodology, we performed an exhaustive
exploration with the 20 largest MCNC benchmarks. The results are
summarized in Figure 4. The horizontal axis corresponds to the
percentage of via connections in each layer of the 3D FPGA (which
is identical to the percentage of 3D SBs of an FPGA layer), while
the vertical axis shows the normalized value of the Energy×Delay
Product (EDP). These points correspond to Pareto points showing
all of the possible solutions. We normalize the results to the EDP
value of a conventional (i.e., 2D) FPGA. According to the
designer's requirements, curves similar to those in Figure 4 can
be derived considering, for instance, the energy consumption or
the performance as the optimization parameter of the system.
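
The normalization used for the vertical axis can be written down in a few lines. The sketch below is illustrative only: the function names are assumptions, and the numbers in the usage example are hypothetical placeholders rather than measured results.

```python
def edp(energy, delay):
    """Energy x Delay Product of one implementation."""
    return energy * delay


def normalized_edp(energy_3d, delay_3d, energy_2d, delay_2d):
    """EDP of a 3D implementation divided by the EDP of the 2D baseline.

    Values below 1.0 mean the 3D architecture improves on the 2D FPGA.
    """
    baseline = edp(energy_2d, delay_2d)
    if baseline == 0:
        raise ValueError("2D baseline EDP must be non-zero")
    return edp(energy_3d, delay_3d) / baseline


if __name__ == "__main__":
    # Hypothetical values (arbitrary units), not taken from the paper.
    print(normalized_edp(energy_3d=7, delay_3d=16, energy_2d=10, delay_2d=20))
```

Replacing the product with the energy or the delay alone gives the alternative curves mentioned in the text.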
Several conclusions can be drawn from Figure 4. First, as we
increase the number of layers, the applications are realized with
smaller critical-net delay and energy consumption in 3D FPGAs.
Second, the developed P&R tools provide promising results for
architectures where only a percentage of the SBs form 3D via
connections. More specifically, for the three-layer solution, the
EDP value increases as we increase the percentage of 3D SBs per
layer. Similarly, the EDP curve for four-layer devices exhibits
two local minima, at 30% and 70% of 3D SBs.
Choosing the 3D architectures with the two local-minimum EDP
values from Figure 4, we performed a detailed exploration in terms
of the delay, the wire length, and the energy requirement for the
chosen benchmarks, shown in Table 2. We compare the conventional
2D architecture with 3D FPGA architectures consisting of 4 layers,
with 30% and 70% of the SBs of each layer forming 3D connections.
The corresponding reductions in delay, wire length, and energy
consumption are 16%, 17%, and 30% for the 30% architecture, and
18%, 15%, and 31% for the 70% architecture. Indeed, the wire
length reduction due to 3D integration results in remarkable
improvements in delay and energy consumption.
Furthermore, the columns of Table 2 with 100% vias give the
calculated values of delay, wire length, and energy consumption
that correspond to the 3D architectures of [4]. It can be seen
that these average values are similar to those of the explored 3D
FPGA architectures (i.e., 30% and 70% vias). Specifically, a
decrease of up to 70% in the number of vertical interconnects
(from 100% down to 30% of the SBs) is observed. This last point is
very important, because we achieve the same improvements while
employing fewer vias.
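
The percentage figures quoted in this section are all relative reductions with respect to a baseline. As a small worked example, the sketch below computes such a reduction and applies it to the via count: going from 100% of the SBs carrying vias (the architecture of [4]) to 30% is a 70% decrease. The function name `percent_reduction` is an assumption of this example.

```python
def percent_reduction(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # 100% of SBs with vias versus 30% of SBs with vias.
    print(percent_reduction(100, 30))
```

The same function applies to the delay, wire length, and energy columns of Table 2 when the 2D value is used as the baseline.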
For a 3D system, a smaller number of vias means: (i) lower
fabrication cost and (ii) larger useful silicon area in each layer
(a via contact occupies much more silicon area than a simple metal
contact).
[Figure 4 plot: normalized EDP versus percentage of fabricated
vias (10% to 100%); curves for the 2D solution and for different
numbers of layers]
Figure 4: Average EDP over the 20 largest MCNC benchmarks for
different number of layers and vias.
[Figure 5 plot: via utilization versus percentage of fabricated
vias (10% to 100%); curves for 2, 3, and 4 layers]
Figure 5: Vertical interconnects utilization
The utilization degree of the fabricated vias is shown in Figure
5. We can infer that, for a given number of layers, the number of
actually utilized vertical interconnects deviates from the average
utilization degree by only a small fraction. For 2, 3, and 4
layers, the corresponding average values are 2.31%, 3.58%, and
4.98%, while the largest deviations from the average values are
0.44%, 0.45%, and 2.41%, respectively. More specifically, given a
certain number of layers, the via utilization degree remains
almost invariant, i.e., it is relatively independent of the
percentage of vias per layer. We observe in Figure 5 that the
utilized via links of the 4-layer architectures with 30% and 70%
fabricated vias are 4.63% and 4%, respectively, which means that
both architectures utilize almost the same number of vias.
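
The two statistics reported per layer count (average utilization and largest deviation from that average) can be computed as in the following minimal sketch. The function name `utilization_summary` is an assumption, and the sample values in the usage example are hypothetical, not the data behind Figure 5.

```python
from statistics import mean


def utilization_summary(utilizations):
    """Return (average, largest absolute deviation from the average)."""
    if not utilizations:
        raise ValueError("at least one utilization value is required")
    avg = mean(utilizations)
    largest_deviation = max(abs(u - avg) for u in utilizations)
    return avg, largest_deviation


if __name__ == "__main__":
    # Hypothetical utilization percentages for one layer count.
    avg, dev = utilization_summary([4.0, 5.0, 6.0])
    print(avg, dev)
```

A small largest deviation relative to the average is what the text means by the utilization degree being almost invariant across via percentages.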
4. CONCLUSIONS
A systematic methodology for exploring alternative 3D FPGA
architectures is presented. This methodology is software-supported
by two new tools, namely 3DPRO and 3DPower, which belong to the
first complete 3D FPGA Design Framework in academia. Comparison
results indicate improvements of up to 18% in the delay, 17% in
the wire length, and 31% in the energy consumption for the
proposed 3D FPGAs as compared to existing 2D FPGAs.
5. ACKNOWLEDGMENTS
The authors acknowledge the support of Prof. K. Bazargan and H.
Mogal (Univ. of Minnesota) regarding specific parts of the TPR
tool.
6. REFERENCES
[1] Eric Beyne, "The Rise of the 3rd Dimension for System
Integration", in Proc. of 8th EPTC, 2006.
[2] J. W. Joyner et al., "Impact of Three-Dimensional
Architectures on Interconnects in Gigascale Integration", IEEE
Trans. on VLSI, Vol. 9, No. 6, pp. 922-927, Dec. 2001.
[3] Kara Poon et al., "A Flexible Power Model for FPGA's", in
12th Int. Conf. FPL, 2002.
[4] Cristinel Ababei et al., "Placement and Routing in 3D
Integrated Circuits", IEEE Design and Test, Vol. 22, No. 6, pp.
520-531, Nov-Dec 2005.
[5] R. Reif et al., "Fabrication Technologies for
Three-Dimensional Integrated Circuits", in ISQED, pp. 33-37, 2002.
[6] Shamik Das et al., "Technology, Performance, and Computer
Aided Design of Three Dimensional Integrated Circuits", Int. Symp.
Physical Design, pp. 108-115, 2004.
[7] Arifur Rahman et al., "Wiring Requirement and
Three-Dimensional Integration Technology for Field Programmable
Gate Arrays", IEEE Trans. on VLSI, Vol. 11, No. 1, pp. 44-54, Feb.
2003.
[8] V. F. Pavlidis and E. G. Friedman, "Interconnect Delay
Minimization through Interlayer Via Placement in 3-D ICs", in
Proc. of Great Lakes Symp. on VLSI, pp. 20-25, 2005.
[9] 3D IC Industry Summary, available at
"http://www.tezzaron.com/technology/3D%20IC%20Summary.htm".
[10] K. Siozios et al., "A Software-Supported Methodology for
Designing High-Performance 3D FPGAs", in Proc. of 15th IFIP
VLSI-SoC, 2007.
[11] S. Yang, "Logic Synthesis and Optimization Benchmarks,
Version 3.0", Technical Report, 1991.
[12] http://vlsi.ee.duth.gr/amdrel
[13] K. Siozios et al., "An Integrated Framework for Architecture
Level Exploration of Reconfigurable Platform", in 15th FPL, pp.
658-661, 2005.
Table 2: Comparison results for the MCNC benchmarks:
implementation in 2D and 3D FPGA architectures (with 30%, 70%, and
100% via links, 4 layers, and minimal Energy×Delay Product).
            Delay                     Wire length                       Energy
Benchmark     2D   30%   70%  100%        2D     30%     70%    100%      2D   30%   70%  100%
bigkey      10.8  6.14  6.50  9.41     59.03   50.57   51.56   49.65    13.6  10.2  10.3  10.0
clma        63.2  31.3  29.1  31.1    379.42  287.23  283.19  283.44    72.6  45.0  45.0  44.5
des         14.7  8.74  9.87  8.67     94.07   54.90   53.94   55.03    22.6  13.0  12.9  13.0
diffeq      15.3  16.7  11.3  18.0     43.48   36.97   45.65   36.04    24.3  15.1  12.4  11.9
dsip        8.19  5.25  5.80  6.38     53.70   39.87   39.53   38.90    13.3  7.28  7.27  7.15
elliptic    26.2  24.3  22.8  25.6    116.14   93.54  111.04   96.04    20.1  12.9  13.3  13.3
ex1010      25.3  18.8  20.3  20.6    181.30  167.22  164.05  162.82    18.5  13.1  13.0  12.7
ex5p        10.5  10.7  10.4  10.6     42.53   37.17   38.19   36.95    5.45  4.29  4.77  4.14
frisc       31.6  30.7  32.6  32.3    122.70  110.30  109.19  108.89    35.6  25.1  25.9  26.4
misex3      10.9  12.0  11.1  10.1     48.83   39.08   39.25   37.94    8.37  5.72  5.98  5.64
pdc         27.4  27.4  24.6  26.8    257.77  226.94  222.49  228.05    25.7  23.5  19.9  19.2
s298        26.0  27.3  21.1  21.9     62.12   57.94   60.14   57.82    14.8  10.4  10.0  10.2
s38417      31.6  34.7  29.3  31.8    376.48  230.61  259.94  239.13    53.2  42.1  43.2  43.1
s38584      25.7  18.6  19.4  19.8    225.13  198.58 192.200  191.35    43.4  30.0  30.4  31.1
seq         15.6  12.7  14.8  11.6     64.36   52.12   56.85   50.69    9.84  7.29  8.92  7.15
spla        21.4  18.3  20.8  18.0    169.22  127.94  127.92  127.96    15.9  12.1  13.2  12.6