1
Timing-Driven Routing for FPGAs Based on Lagrangian Relaxation
Seokjin Lee*, D. F. Wong+
*Dept. of Electrical and Computer Engineering
+Dept. of Computer Sciences
The University of Texas at Austin
Dec 13, 2015
2
Outline
Overview
Introduction
  FPGA Architecture, Routing resources
  FPGA routing problem
Problem Formulation
  Routing graphs and Timing graphs
Algorithm Description
  Lagrangian Relaxation
  LR_ROUTE, NET_ROUTE
Experimental Results
Conclusion
3
Overview
A new timing-driven routing algorithm for FPGAs
  Find a routing with minimum critical path delay for a given placed circuit.
  Handling of the timing constraints in a mathematical programming framework.
Routing results are compared with those of the VPR router.
4
FPGA Architecture
Logic modules
  Implement logic functions
  LUTs, flip-flops
Routing resources
  Wire segments
  Programmable switches
I/O modules
[Figure: A typical FPGA architecture. Labels: L, S, wire segments, logic module, I/O module, programmable switch]
5
FPGA Routing Resources
Prefabricated wire segments
Routing constraints:
  A wire segment cannot be shared by different nets
  Limited routability
High RC delays and large area of switches
[Figure: routing resources example with pins a-h and logic modules L1-L4]
7
Routing Graph $G_r(V_r, E_r)$
$V_r$: I/O pins of logic modules, wire segments
$E_r$: feasible connections between the nodes
Routing problem: find vertex-disjoint trees $T = \{T_1, \dots, T_n\}$
[Figure: routing resources (pins a-h, wire segments 1-16, logic modules L1-L4) and the corresponding routing graph]
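The vertex-disjointness requirement can be checked mechanically. The sketch below is only an illustration: it assumes each routing tree is given as a set of routing-graph node names, and the function name `shared_nodes` and the node labels are hypothetical, not taken from the slides.

```python
def shared_nodes(trees):
    """Return the routing-graph nodes used by more than one net.

    trees: dict mapping a net name to the set of routing-graph nodes
    (pins and wire segments) used by that net's routing tree.
    An empty result means the trees are vertex disjoint.
    """
    owner = {}      # node -> first net seen using it
    shared = set()  # nodes claimed by two or more nets
    for net, nodes in trees.items():
        for node in nodes:
            if node in owner and owner[node] != net:
                shared.add(node)
            owner.setdefault(node, net)
    return shared


# Hypothetical example: two nets that both use wire segment "w3".
trees = {
    "net1": {"a", "w1", "w3", "c"},
    "net2": {"b", "w3", "d"},
}
print(shared_nodes(trees))
```

A routing is legal in the sense of the slide only when this set is empty.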
8
Timing Constraints
Source-to-sink delays of nets
  Delay of wire-switch chains
  Calculated from architecture-specific RC values based on the Elmore delay model
Timing constraints
  Specified by arrival times at primary inputs (outputs of storage elements) or required times at primary outputs (inputs of storage elements)
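As an illustration of the Elmore delay model for a wire-switch chain, the sketch below assumes a simple lumped RC ladder: stage i has a series resistance R_i followed by a capacitance C_i to ground. This lumped model and the function name `elmore_delay` are assumptions made for the example; the slides do not give the exact RC model used.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay from the driver to the end of an RC chain.

    Stage i contributes R_i times the total capacitance at or
    downstream of stage i: delay = sum_i R_i * sum_{j >= i} C_j.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    downstream = sum(capacitances)  # capacitance seen by the first resistor
    for r, c in zip(resistances, capacitances):
        delay += r * downstream
        downstream -= c             # the next resistor no longer sees this C
    return delay


# Two stages: 1 * (3 + 4) + 2 * 4 = 15
print(elmore_delay([1.0, 2.0], [3.0, 4.0]))
```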
9
Timing Graph $G_t(V_t, E_t)$
Constructed from input netlist
Captures timing constraints
$V_t$: inputs, outputs, logic module pins
$E_t$: source-sink pairs of nets, input-output pairs of logic modules
Fictitious nodes
  s: connects primary inputs, t: connects primary outputs
[Figure: timing graph with source s, primary inputs, logic modules, primary outputs, and sink t]
10
Timing-Driven FPGA Routing
Minimization of critical path delay under timing and routing constraints
Find vertex-disjoint routing trees $T = \{T_1, \dots, T_n\}$ for all the nets such that

\[
\text{Minimize } a_t \quad \text{subject to} \quad a_u + D_{uv} \le a_v \quad \forall (u,v) \in E_t
\]

where
  if $u = s$: $a_s = 0$ and $D_{sv}$ = arrival time of input $v$
  if $v = t$: $D_{ut} = 0$ and $a_u$ = arrival time of output $u$
  else: $a_u$ = arrival time at node $u$, $a_v$ = arrival time at node $v$, $D_{uv}$ = delay along path$(u,v)$
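For fixed delays, the smallest arrival times that satisfy every constraint $a_u + D_{uv} \le a_v$ are the longest-path lengths from s, so the minimum $a_t$ is the critical path delay. The sketch below illustrates this on a hypothetical five-node timing graph. It assumes $a_s = 0$ and an acyclic graph; the function name `arrival_times` and the example delays are made up for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def arrival_times(delay):
    """Smallest arrival times with a_u + D_uv <= a_v on every edge.

    delay: dict mapping a timing-graph edge (u, v) to its delay D_uv.
    Nodes without predecessors (the source s) get arrival time 0.
    """
    preds = {}
    for u, v in delay:
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set())
    a = {}
    # static_order() yields every node after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        a[v] = max((a[u] + delay[(u, v)] for u in preds[v]), default=0.0)
    return a


delay = {
    ("s", "i1"): 0.0, ("s", "i2"): 1.0,  # edges from s carry input arrival times
    ("i1", "g"): 2.0, ("i2", "g"): 2.0,  # net and logic-module delays
    ("g", "t"): 0.0,                     # edge into the fictitious sink
}
a = arrival_times(delay)
print(a["t"])  # critical path delay of this small example
```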
11
Lagrangian Relaxation
General technique for solving optimization problems with difficult constraints
Lagrangian subproblems
  New objective function: each constraint is multiplied by a constant (Lagrangian multiplier) and added to the original objective function
Iteratively update Lagrangian multipliers and solve Lagrangian subproblems
12
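Lagrangian Relaxation
Original problem:

\[
\min f(\mathbf{x}) \quad \text{s.t.} \quad g_1(\mathbf{x}) \le b_1,\; g_2(\mathbf{x}) \le b_2,\; \dots,\; g_k(\mathbf{x}) \le b_k
\]

Lagrangian subproblem:

\[
\min f(\mathbf{x}) + \lambda_1 (g_1(\mathbf{x}) - b_1) + \lambda_2 (g_2(\mathbf{x}) - b_2) + \dots + \lambda_k (g_k(\mathbf{x}) - b_k)
\]

Update $\lambda_1, \lambda_2, \dots, \lambda_k$ and solve the subproblem again.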
13
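LR for Our Problem
Original problem:

\[
\min a_t \quad \text{s.t.} \quad a_u + D_{uv} \le a_v \quad \forall (u,v) \in E_t
\]

Lagrangian subproblem:

\[
\min L_{\lambda}(a, T) = a_t + \sum_{(u,v) \in E_t} \lambda_{uv} (a_u + D_{uv} - a_v)
\]

Update $\lambda$ and solve the subproblem again.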
14
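Optimality Conditions
Optimality conditions on $\lambda$: $\partial L / \partial a_u = 0$ for each $u \in V_t$

\[
\sum_{(u,t) \in E_t} \lambda_{ut} = 1
\]

\[
\sum_{(u,w) \in E_t} \lambda_{uw} = \sum_{(w,v) \in E_t} \lambda_{wv} \quad \forall w \in V_t - \{s, t\}
\]

[Figure: example timing graph with edges (a,c), (b,c), (c,d), (d,t), (e,t)]
Example: $\lambda_{ac} + \lambda_{bc} = \lambda_{cd}$ and $\lambda_{dt} + \lambda_{et} = 1$

By rearranging terms (the $a_s$ term is omitted, taking $a_s = 0$),

\[
L_{\lambda}(a, T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} + \Bigl(1 - \sum_{(u,t) \in E_t} \lambda_{ut}\Bigr) a_t + \sum_{w \in V_t - \{s,t\}} \Bigl(\sum_{(w,u) \in E_t} \lambda_{wu} - \sum_{(u,w) \in E_t} \lambda_{uw}\Bigr) a_w
\]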
15
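Simplified Lagrangian Subproblem

\[
L_{\lambda}(a, T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} + \underbrace{\Bigl(1 - \sum_{(u,t) \in E_t} \lambda_{ut}\Bigr)}_{=\,0} a_t + \sum_{w \in V_t - \{s,t\}} \underbrace{\Bigl(\sum_{(w,u) \in E_t} \lambda_{wu} - \sum_{(u,w) \in E_t} \lambda_{uw}\Bigr)}_{=\,0} a_w
\]

Applying the optimality conditions on $\lambda$:

\[
L_{\lambda}(T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
\]

The Lagrangian subproblem becomes

\[
LS(\lambda): \quad \min L_{\lambda}(T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}
\]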
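The simplification can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the five example edges (a,c), (b,c), (c,d), (d,t), (e,t), adds hypothetical source edges from s, takes $a_s = 0$, and uses made-up multiplier, delay, and arrival-time values. When the multipliers satisfy the optimality conditions, the full Lagrangian equals the weighted delay sum for any arrival times.

```python
import math


def lagrangian(lam, delay, a, sink="t"):
    """Full Lagrangian: a_t + sum of lam_uv * (a_u + D_uv - a_v)."""
    return a[sink] + sum(l * (a[u] + delay[(u, v)] - a[v])
                         for (u, v), l in lam.items())


def weighted_delay(lam, delay):
    """Simplified objective: sum of lam_uv * D_uv."""
    return sum(l * delay[e] for e, l in lam.items())


def satisfies_conditions(lam, source="s", sink="t", tol=1e-9):
    """Multipliers into t sum to 1; inflow equals outflow at inner nodes."""
    inflow, outflow = {}, {}
    for (u, v), l in lam.items():
        outflow[u] = outflow.get(u, 0.0) + l
        inflow[v] = inflow.get(v, 0.0) + l
    if abs(inflow.get(sink, 0.0) - 1.0) > tol:
        return False
    inner = (set(inflow) | set(outflow)) - {source, sink}
    return all(abs(inflow.get(w, 0.0) - outflow.get(w, 0.0)) <= tol
               for w in inner)


# Hypothetical multipliers obeying both conditions, with a_s = 0.
lam = {("s", "a"): 0.25, ("s", "b"): 0.35, ("s", "e"): 0.4,
       ("a", "c"): 0.25, ("b", "c"): 0.35, ("c", "d"): 0.6,
       ("d", "t"): 0.6, ("e", "t"): 0.4}
delay = {("s", "a"): 0.0, ("s", "b"): 0.0, ("s", "e"): 0.0,
         ("a", "c"): 2.0, ("b", "c"): 3.0, ("c", "d"): 1.5,
         ("d", "t"): 0.0, ("e", "t"): 0.0}
a = {"s": 0.0, "a": 1.0, "b": 5.0, "c": 7.0, "d": 2.0, "e": 9.0, "t": 4.0}

print(satisfies_conditions(lam))
print(math.isclose(lagrangian(lam, delay, a), weighted_delay(lam, delay),
                   abs_tol=1e-9))
```

Because the arrival times drop out, the subproblem depends only on the routing through the delays $D_{uv}$.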
16
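Updating Lagrangian Multipliers
Subgradient Method

\[
\lambda_{uv}^{r+1} = \max\{0,\; \lambda_{uv}^{r} + \theta_r (a_u + D_{uv} - a_v)\}
\]

$\theta_r$: step size at iteration $r$
Convergence conditions:

\[
\lim_{r \to \infty} \theta_r = 0, \qquad \lim_{r \to \infty} \sum_{i=1}^{r} \theta_i = \infty
\]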
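A minimal sketch of this update in Python follows. The function names and the harmonic step-size schedule (an initial value divided by the iteration number) are assumptions chosen for illustration; the slides only require that the step size tends to zero while its running sum diverges, which the harmonic schedule satisfies.

```python
def step_size(r, initial=1.0):
    """Hypothetical schedule: initial / r tends to 0, and its sum diverges."""
    return initial / r


def update_multipliers(lam, a, delay, step):
    """One subgradient step on every timing-graph edge.

    A violated constraint (a_u + D_uv > a_v) raises its multiplier;
    a slack constraint lowers it, but never below zero.
    """
    return {
        (u, v): max(0.0, l + step * (a[u] + delay[(u, v)] - a[v]))
        for (u, v), l in lam.items()
    }


lam = {("u", "v"): 0.5}
delay = {("u", "v"): 3.0}
# Violated by 2.0, step 0.5: 0.5 + 0.5 * 2.0 = 1.5
print(update_multipliers(lam, {"u": 1.0, "v": 2.0}, delay, step=0.5))
# Slack by 6.0, step 0.5: 0.5 - 3.0 is clamped to 0.0
print(update_multipliers(lam, {"u": 1.0, "v": 10.0}, delay, step=0.5))
```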
17
LR_ROUTE
1. Initialize $\lambda$
2. Call NET_ROUTE to solve $LS(\lambda)$
3. Compute $a_u$ for each $u \in V_t$
4. Update $\lambda_{uv}$ for each $(u,v) \in E_t$
5. Repeat Steps 2-4 until no shared resource exists.
18
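Solving Lagrangian Subproblem
NET_ROUTE
Find routing trees $T$ for a set of given multipliers $\lambda$ such that

\[
\text{Minimize } \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} \quad \text{subject to} \quad \sum_{k} x_{ik} \le 1 \quad \forall i \in V_r
\]

where $x_{ik} = 1$ if $T_k$ for net $k$ uses node $i$, and $x_{ik} = 0$ otherwise.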
19
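Solving Lagrangian Subproblem

\[
LS(\lambda): \quad \min \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} \quad \text{s.t.} \quad \sum_{k} x_{ik} \le 1 \quad \forall i \in V_r
\]

Relaxing the routing constraints with multipliers $\mu_i$ gives

\[
\min L_{\mu}(\mathbf{x}) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i \Bigl(\sum_{k} x_{ik} - 1\Bigr)
\]

\[
= \sum_{k} \Bigl\{ \underbrace{\sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv}}_{\text{weighted sink delay for net } k} + \underbrace{\sum_{i \in V_r} \mu_i x_{ik}}_{\text{routing congestion cost for net } k} \Bigr\} - \underbrace{\sum_{i \in V_r} \mu_i}_{\text{constant}}
\]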
20
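Routing Nets
For net $k$,

\[
\text{minimize } \sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i x_{ik} = \sum_{(u,v) \in \text{net } k} \sum_{i \in \text{path}(u,v)} \lambda_{uv} d_i + \sum_{i \in V_r} \mu_i x_{ik}
\]

Cost for each node:

\[
c_i = \underbrace{\lambda_{uv} d_i}_{\text{delay cost}} + \underbrace{\mu_i}_{\text{congestion cost}}
\]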
21
NET_ROUTE
1. For each net $k$
2.   Rip up routing for net $k$
3.   For each sink $v$ of net $k$
4.     Maze route from source to sink with cost $c_i = \lambda_{uv} d_i + \mu_i$
5.     Update $\mu_i$ for all nodes $i$ in path$(u,v)$
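Step 4 can be sketched as a shortest-path search over the routing graph with node costs $c_i = \lambda_{uv} d_i + \mu_i$. The code below uses Dijkstra's algorithm as a stand-in for the maze router, which is an assumption rather than the authors' implementation. The function name, graph, and cost values are hypothetical. The update rule for $\mu_i$ in step 5 is not stated in the slides, so it is left out.

```python
import heapq


def maze_route(adj, source, sink, lam_uv, d, mu):
    """Cheapest source-to-sink path under node cost lam_uv * d[i] + mu[i].

    adj: dict mapping each routing-graph node to its neighbours.
    Assumes all node costs are non-negative, as Dijkstra requires.
    Returns (path, cost), or (None, inf) if the sink is unreachable.
    """
    def cost(i):
        return lam_uv * d[i] + mu[i]

    best = {source: cost(source)}
    prev = {}
    heap = [(best[source], source)]
    while heap:
        c, i = heapq.heappop(heap)
        if i == sink:
            path = [i]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return path[::-1], c
        if c > best[i]:
            continue  # stale heap entry
        for j in adj[i]:
            cj = c + cost(j)
            if cj < best.get(j, float("inf")):
                best[j] = cj
                prev[j] = i
                heapq.heappush(heap, (cj, j))
    return None, float("inf")


# Two candidate wires: w1 is fast but congested, w2 is slow but free.
adj = {"src": ["w1", "w2"], "w1": ["snk"], "w2": ["snk"], "snk": []}
d = {"src": 0.0, "w1": 1.0, "w2": 3.0, "snk": 0.0}
mu = {"src": 0.0, "w1": 5.0, "w2": 0.0, "snk": 0.0}

print(maze_route(adj, "src", "snk", 1.0, d, mu))  # low lambda: takes w2
print(maze_route(adj, "src", "snk", 4.0, d, mu))  # high lambda: takes w1
```

The two calls show the trade-off the node cost encodes: a timing-critical connection (large $\lambda_{uv}$) tolerates congestion to get the faster wire, while a non-critical one avoids the congested node.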
22
Experimental Results
FPGA model used
  Symmetrical-array-based FPGA
  Each logic block contains four 4-input LUTs and flip-flops
  Switch connections: Fs = 3, Fc = W
    Fs: number of connections per wire entering the switch box
    Fc: number of tracks to which each logic block pin can connect
    W: number of tracks in a channel
23
Experimental Results
Tested on large circuits from the MCNC benchmark
Routing with fixed channel width
  Minimum channel width obtained by running VPR in timing-driven mode
Better results for 13 circuits (out of 17)
Critical path delay improved up to 33% with comparable runtime
24
Experimental Results
Critical path delay and runtime comparison

Circuit    LUTs/FFs   Tracks   Delay (ns)          Runtime (s)
                               VPR     LR_ROUTE    VPR    LR_ROUTE
Alu4       1522       33       46.6    46.2        58     57
Apex2      1878       43       61.5    49.3        61     46
Apex4      1262       41       45.4    48.9        29     41
Bigkey     1707       24       41.7    27.8        53     62
Clma       8383       51       125.0   96.4        531    464
Des        1591       24       43.5    48.1        44     42
Diffeq     1497       29       48.8    48.6        32     31
Dsip       1370       25       29.6    27.6        53     78
Elliptic   3604       40       77.1    71.3        151    256
25
Experimental Results

Circuit    LUTs/FFs   Tracks   Delay (ns)          Runtime (s)
                               VPR     LR_ROUTE    VPR    LR_ROUTE
Ex1010     4598       44       83.5    75.2        248    351
Ex5p       1064       43       44.8    43.7        22     34
frisc      3556       43       81.5    84.3        121    171
misex3     1397       37       42.5    49.4        50     49
pdc        4575       61       96.5    95.0        304    465
s298       1931       28       98.7    91.5        71     85
seq        1750       35       55.9    47.0        55     67
spla       3690       56       94.7    74.0        203    234
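The summary figures quoted earlier (better results on 13 of 17 circuits, delay improved by up to 33%) can be recomputed from the delay columns of the two tables. The short script below does that arithmetic; the delay values are copied from the tables, and the function name `delay_summary` is only illustrative.

```python
# Critical path delay (ns) per circuit as (VPR, LR_ROUTE), from the tables.
delays = {
    "Alu4": (46.6, 46.2), "Apex2": (61.5, 49.3), "Apex4": (45.4, 48.9),
    "Bigkey": (41.7, 27.8), "Clma": (125.0, 96.4), "Des": (43.5, 48.1),
    "Diffeq": (48.8, 48.6), "Dsip": (29.6, 27.6), "Elliptic": (77.1, 71.3),
    "Ex1010": (83.5, 75.2), "Ex5p": (44.8, 43.7), "frisc": (81.5, 84.3),
    "misex3": (42.5, 49.4), "pdc": (96.5, 95.0), "s298": (98.7, 91.5),
    "seq": (55.9, 47.0), "spla": (94.7, 74.0),
}


def delay_summary(table):
    """Count circuits where LR_ROUTE is faster and find the largest gain."""
    improvement = {name: (vpr - lr) / vpr for name, (vpr, lr) in table.items()}
    wins = sum(1 for gain in improvement.values() if gain > 0)
    best = max(improvement, key=improvement.get)
    return wins, best, improvement[best]


wins, best, gain = delay_summary(delays)
print(f"{wins} of {len(delays)} circuits improved; best is {best} at {gain:.1%}")
```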