Large Scale Circuit Placement: Gap and Promise Jason Cong 1 , Tim Kong 2 , Joseph R. Shinnerl 1 , Min Xie 1 and Xin Yuan 1 UCLA VLSI CAD LAB 1 Magma Design Automation 2

Large Scale Circuit Placement: Gap and Promise - UCLAcadlab.cs.ucla.edu/~cong/slides/iccad03_placement_tutorial.pdf · Large Scale Circuit Placement: Gap and Promise Jason Cong1,

Jun 27, 2018

Large Scale Circuit Placement: Gap and Promise

Jason Cong1, Tim Kong2, Joseph R. Shinnerl1, Min Xie1 and Xin Yuan1

UCLA VLSI CAD LAB1

Magma Design Automation2

Outline

• Introduction
• Gap Analysis of Existing Placement Algorithms
• Scalable Paradigm
• Timing Optimization
• Routability Optimization
• Concluding Remarks
• Application
  - Multi-Million Gate FPGA Placement

Example of Logic Hierarchy in Final Layout

By courtesy of IBM (Tony Drumm)

Why Still Placement

• True, it has been studied for over 30 years, but …
• We need good solutions more than ever
  - One of the most important steps in the IC implementation flow
    - Directly defines interconnects
• Difficult
  - Problem size grows 2X every 18-24 months
    - Moore's Law
  - Cannot place hierarchically without quality degradation
• We are not very good at it …

Motivation

• Lack of significant progress in wirelength reduction
  - Rate of reduction is about 5-10% every 2-3 years
  - Latest developments in placement differ mainly in runtime
• Most work compares only with known heuristics
  - Use real-design-based benchmarks
  - Use synthetic benchmarks
• Little understanding of the divergence from the optimal

Placement Examples with Known Optimal Wirelength [Chang et al, 2003]

• All the modules are of equal size, and there is no space between rows or between adjacent modules
• For 2-pin nets, connect any two adjacent modules
• For each n-pin net, connect the n modules in a rectangular region close to a square, i.e., the length of each side is close to sqrt(n)
• The wirelength of each n-pin net is given by $\lceil \sqrt{n} \rceil + \lceil n / \lceil \sqrt{n} \rceil \rceil - 2$
• Given a (real) netlist N
• Construct a netlist N' with known optimal WL that matches the net distribution of N
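The per-net optimum can be checked with a few lines of Python. This is a minimal sketch, assuming unit-size cells on a grid and half-perimeter wirelength measured between cell centers; the function name `optimal_net_wirelength` is made up for illustration.

```python
import math


def optimal_net_wirelength(n: int) -> int:
    """Half-perimeter wirelength, in cell pitches, of an n-pin net whose
    cells are packed into a near-square block (assumed unit cells)."""
    if n < 2:
        raise ValueError("a net needs at least two pins")
    side = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) in exact integer arithmetic
    other = -(-n // side)         # ceil(n / side)
    return side + other - 2


for pins in (2, 3, 4, 5, 9):
    print(pins, optimal_net_wirelength(pins))
```

Summing this value over all nets of a constructed netlist gives its known optimal total wirelength under the same assumptions.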

Placement Examples with Known Upperbounds [Cong et al, 2003]

• Limitations of PEKO
  - All the nets are local
  - Wirelength contribution by global connections in real designs can be significant
• Extend PEKO by introducing non-local nets to mimic global connections
  - Method 1: Generate a subset of i-pin nets by randomly connecting i modules on the chip
  - Method 2: Generate a subset of i-pin nets according to a wirelength distribution vector (WDV)

Illustration: PEKU Example Construction

Input: t = 64, D = {d2=35, d3=21, d4=7, d5=4, d6=2, d7=1}, α = 0.2

W = {w1 … w3 = 0, w4=3, w5=3, w6=0, w7=2, w8=2, w9=1, w10=0, w11=1, w12=1}

• Generate 28 2-pin nets optimally and 7 randomly
• Generate 16 3-pin nets optimally and 5 randomly
• Generate 6 4-pin nets optimally and 1 randomly
• Generate 4 5-pin nets optimally
• Generate 2 6-pin nets optimally
• Generate 1 7-pin net optimally

Total WL = 184

Studied Five State-of-the-Art Placers

• Capo [Caldwell et al, 2000]
  - Based on multilevel partitioner
  - Aims to enhance the routability
• Dragon [Wang et al, 2000]
  - Uses hMetis for initial partition
  - SA with bin-based swapping
• mPL [Chan et al, 2000]
  - Nonlinear programming on the coarsest level
  - Discrete relaxation at finer levels
• mPG [Chang et al, 2002]
  - Uses FC clustering and hierarchical density control
  - Incremental A-tree for routability
• Qplace [Cadence Inc.]
  - Leading edge industrial placer
  - Component of Silicon Ensemble

Experimental Results on PEKO

[Figure: two plots versus #cells (0 to 250,000) for Capo 8.6, Dragon 2.20, mPG 1.0, mPL 3.0, and Qplace 5.1. Left: runtime in seconds (0 to 50,000). Right: quality ratio (1.20 to 2.60).]

• Existing algorithms can be 59% to 140% away from the optimal on PEKO
• On examples with pads
  - mPG and Qplace show improvements of 12% and 10%, respectively
  - Dragon, mPL, and Capo do not benefit much from the additional information
• There is significant room for improvement in placement algorithms

Experimental Results on PEKO

• Capo, Qplace, and mPL scale well in runtime
• Average solution quality of each tool deteriorates by an additional 9% to 17% when the problem size increases by a factor of 10

[Figure: two plots versus #cells (10,000 to 10,000,000, logarithmic axis) for Capo 8.6, Dragon 2.20, mPG 1.0, mPL 3.0, and Qplace 5.1. Left: runtime in seconds (0 to 100,000). Right: quality ratio (1.20 to 2.80).]

Experimental Results on PEKU

• The effectiveness of existing placers can vary significantly for circuits of similar size but different characteristics
• Comparing QRs helps to identify the technique that works best under each scenario

[Figure: quality ratio (1.10 to 2.30) versus % of non-local nets (0.00 to 10.00) for Dragon 2.20, Capo 8.6, mPG 1.0, mPL 3.0, and Qplace 5.1.]

QR (Placed Wirelength vs Upperbound) may not be tight

High Interest in the Community

Timing-driven Placement Examples with Known Optimal (TPEKO)

• Obtain a placement for the circuit from any available tool
• Perform timing analysis on the circuit
• Create an artificial combinational path with equal or larger delay than the longest path
• Guarantee the cells in the path are adjacent to each other
• Make necessary modifications

[Figure labels: Original longest path; Artificial path]

Evaluating Timing-Driven Placement Algorithms Using TPEKO

• Evaluating two state-of-the-art FPGA placement algorithms
  - VPR [Marquardt et al. 2000]
  - PATH [Kong 2002]
• Can be far away from the optimal for difficult examples
  - 35% on average
  - 54% in the worst case

[Figure: quality ratio (1.00 to 1.40) versus #longest path (1 to 5) for VPR and PATH.]

Observations from Gap Analysis

• Significant opportunity in placement
  - Existing algorithms may produce solutions far away from the optimal
  - The result quality of the same placer varies for circuits of similar size but different characteristics
  - Scalability problem in runtime and solution quality
• Significant ROI
  - Benefit equal to one to two generations of process scaling
  - But without requiring multi-billion dollar investment (hopefully!)

Scalable Paradigms for Placement

• Assertion: some form of hierarchy is essential
• Three main paradigms:
  1) Top-down
     - Generalized recursive partitioning defines the hierarchy
  2) Bottom-up (multilevel)
     - Generalized recursive clustering defines the hierarchy
  3) Flat
     - Flat on the outside, but hierarchical internally
• Caveats
  - Scalable may be slower than O(N), due to Moore's law
  - Focus on global placement; assume scalability of legalization and detailed placement

Paradigm 1: Top-Down Placement

• Hierarchy Construction
  - Cutsize-Minimizing Partitioning
    - E.g., Capo, Feng-Shui, Dragon.
  - Partitioning guided by wirelength-driven placements
    - Start with a loosely constrained WL-driven solution; a quadratic objective function approximates weighted wirelength
    - E.g., Gordian-L, BonnPlace
• Hierarchy Refinement
  - The order in which subregions are partitioned matters, especially under terminal propagation
  - Can cells migrate across partition boundaries?

Cutsize-Driven Partitioning-Based Placement

• Cutsize = the number of nets not contained in just one side of the partition
  - Rent's rule shows that wirelength and cutsize correlate to within about X2 log N [Wang et al, 2000].
  - Fast FM-style iterations with terminal propagation
  - Careful cutline selection and multiway partitions can help
• e.g. Capo, Feng-Shui, Dragon
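The cutsize definition above translates directly into code. The sketch below is illustrative only: the netlist, the cell names, and the `cutsize` helper are hypothetical, and nets are modeled as tuples of cell names.

```python
def cutsize(nets, side_of):
    """Count nets whose cells do not all lie on one side of the partition."""
    return sum(1 for net in nets if len({side_of[cell] for cell in net}) > 1)


# Hypothetical 4-cell netlist and a bipartition {a, b} | {c, d}.
nets = [("a", "b"), ("b", "c", "d"), ("c", "d"), ("a", "d")]
side_of = {"a": 0, "b": 0, "c": 1, "d": 1}
print(cutsize(nets, side_of))  # nets (b, c, d) and (a, d) are cut
```

FM-style partitioners evaluate the change in this quantity for single-cell moves rather than recomputing it from scratch.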

Cutsize-Driven Recursive Top-Down Partitioning

• Initially, there is only netlist connectivity; no spatial information is available.
• Apply a standard partitioning algorithm to the given netlist. Multilevel partitioning algorithms are the most effective.
• After two stages, each cell has been assigned to one of four possible subregions. As few nets as possible have been cut.
• After three stages, each cell has been assigned to one of eight possible subregions. As few nets as possible have been cut.
• Iterative improvement by repartitioning with terminal propagation is essential.

Partitioning Guided by Approximate Placements

• Minimize a quadratic approximation to global wirelength
  - Solve one large symmetric positive-definite linear system
  - Pads prevent cells from collapsing to a single point
• Use the given placement to recursively partition cells
  - Gordian-L:
    - Minimize cutsize, but use the given placement to form initial partitions (e.g., using x- or y-coordinate median for cutline)
    - New subregions generate new center-of-mass constraints for subsequent iterations
  - BonnPlace:
    - To assign cells to subregions, minimize displacement from the given locations, not cutsize.
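To make the quadratic step concrete, here is a minimal one-dimensional sketch. It is not Gordian-L or BonnPlace: it assumes two-pin connections only, and it solves the optimality conditions by plain Gauss-Seidel sweeps instead of the sparse solver a real placer would use. The function name, the cell names, and the pad positions are hypothetical.

```python
def quadratic_place_1d(edges, fixed, movable, sweeps=200):
    """Minimize sum of w * (x_i - x_j)**2 over two-pin connections.

    Each sweep moves every movable cell to the weighted average of its
    neighbors, which is the optimality condition of the quadratic objective.
    Every movable cell is assumed to have at least one connection.
    """
    x = dict(fixed)
    x.update({cell: 0.0 for cell in movable})  # arbitrary starting point
    neighbors = {cell: [] for cell in movable}
    for u, v, w in edges:
        if u in neighbors:
            neighbors[u].append((v, w))
        if v in neighbors:
            neighbors[v].append((u, w))
    for _ in range(sweeps):
        for cell in movable:
            total_w = sum(w for _, w in neighbors[cell])
            x[cell] = sum(w * x[n] for n, w in neighbors[cell]) / total_w
    return x


# Hypothetical chain: pad p0 at x=0, three cells, pad p1 at x=10.
edges = [("p0", "c1", 1.0), ("c1", "c2", 1.0), ("c2", "c3", 1.0), ("c3", "p1", 1.0)]
x = quadratic_place_1d(edges, {"p0": 0.0, "p1": 10.0}, ["c1", "c2", "c3"])
print({c: round(x[c], 3) for c in ("c1", "c2", "c3")})
```

The two fixed pads are what spread the cells out; with no fixed positions at all, every cell at the same coordinate would already give zero cost.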

Example: Gordian-L-style Placement

Hierarchy Adjustments in Top-Down Placement by Iterative Refinement

• K-way partitioning followed by localized repartitioning with terminal propagation (Feng-Shui).
• Initial cutsize-driven quadrisection followed by bin swapping, each bin being a block of the partition, with wirelength-based annealing only at the finest level (Dragon).
• Unconstrained quadratic wirelength minimization over 2x2 windows of overlapping subregions, followed by repartitioning inside each window (BonnPlace).

Paradigm 2: Multilevel Placement

• Coarsening: build the hierarchy by recursive aggregation (generalized clustering)
• Relaxation: improve the placement at each level by localized optimization
• Interpolation: transfer coarse-level solution to adjacent, finer level (generalized declustering)
• Multilevel Flow: multiple traversals over multiple hierarchies (V-cycle variations)

Multi-Level Optimization Framework

[Figure labels: Given problem; Coarsening (Clustering); Interpolation & Relaxation (optimization); Problem size decreases]

• Multilevel coarsening generates smaller problem sizes at coarser levels → faster optimization at coarser levels
• May explore different aspects of the solution space at different levels
• Gradual refinement on good solutions from coarser levels is very efficient
• Successful in many applications
  - Originally developed for PDEs
  - Recent success in VLSI CAD: partitioning, placement, routing

Multilevel Coarse Placement

[Figure labels: Coarsening by clustering; Initial Placement; Refinement by placement]

• A bin grid structure at each level
• Hierarchical area density control
• Optimization by SA, QP, RDFL, etc.

Multilevel Methods: Coarsening by Recursive Aggregation

• Recursive aggregation defines the hierarchy.
• Different aggregation algorithms can be used on different levels and/or in different V-cycles.
• Clustering methods
  - First-Choice Clustering (hMetis [Karypis 1999]).
  - AMG based aggregation
    - An aggregate need not be a cluster. A cell can be fractionally associated to more than one aggregate

[Figure labels: Merge each vertex with its "best" neighbor; Merged Nets]
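The "merge each vertex with its best neighbor" idea can be sketched as a single clustering pass. This is a simplified illustration, not hMetis: it assumes an affinity of 1/(|net| - 1) summed over shared nets, ignores cluster area limits, and breaks ties by cell name. All names are hypothetical.

```python
from collections import defaultdict


def first_choice_clusters(nets, order):
    """One pass of a simplified first-choice-style clustering.

    Each unclustered cell joins the cluster of its highest-affinity
    neighbor, even if that neighbor has already been clustered.
    """
    nets_of = defaultdict(list)
    for net in nets:
        for cell in net:
            nets_of[cell].append(net)
    cluster_of = {}
    next_id = 0
    for v in order:
        if v in cluster_of:
            continue
        affinity = defaultdict(float)
        for net in nets_of[v]:
            for u in net:
                if u != v:
                    affinity[u] += 1.0 / (len(net) - 1)
        if not affinity:  # isolated cell becomes its own cluster
            cluster_of[v] = next_id
            next_id += 1
            continue
        best = max(sorted(affinity), key=affinity.get)
        if best not in cluster_of:
            cluster_of[best] = next_id
            next_id += 1
        cluster_of[v] = cluster_of[best]
    return cluster_of


nets = [("a", "b"), ("a", "b", "c"), ("c", "d"), ("d", "e"), ("d", "e")]
clusters = first_choice_clusters(nets, order="abcde")
print(clusters)
```

On this toy netlist the pass groups {a, b} and {c, d, e}; applying it again to the clustered netlist would build the next, coarser level.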

Multilevel Methods: Relaxation (Intralevel Optimization)

• Iterative improvement at each level by fast, localized computation
  - Discrete permutation enumerations; swapping
  - Unconstrained quadratic wirelength minimization on subsets
  - Network-flow based improvement on subsets (RDFL)
• Local relaxation is sufficient. Global improvement comes from the multilevel hierarchy.
• Relaxations at finer levels may be quite different, e.g., more discrete, than relaxations at coarser levels.

Relaxation on Local Subsets

Move the red cells to their optimal positions, holding all other cells fixed and (perhaps) ignoring overlap

[Figure labels: Original Subnetlist with Subproblem; Unrelated Cell; Fixed Neighbor; Movable Cell]

Example: Goto-based Discrete Relaxation

• Each cell's optimal location is readily calculated when all other cells are held fixed.
• Compute a chain A, B, C, D, E, where B is a randomly selected neighbor of A's optimal location, etc.
• Examine all permutations of the chain and take the best one.
• Problem: the chain is not closed (A is not necessarily near any other cell's optimal location).
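The permutation step can be sketched as follows. This covers only "examine all permutations of the chain and take the best one"; how the chain is selected is not modeled. It assumes half-perimeter wirelength as the cost, and the cells, pads, and coordinates are hypothetical.

```python
from itertools import permutations


def hpwl(nets, pos):
    """Total half-perimeter wirelength of all nets."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def best_chain_permutation(chain, nets, pos):
    """Try every assignment of the chain's cells to the chain's own slots."""
    slots = [pos[c] for c in chain]
    best_cost, best_pos = hpwl(nets, pos), dict(pos)
    for perm in permutations(chain):
        trial = dict(pos)
        trial.update(zip(perm, slots))
        cost = hpwl(nets, trial)
        if cost < best_cost:
            best_cost, best_pos = cost, trial
    return best_cost, best_pos


# Hypothetical row: pads P and Q are fixed, cells A, B, C form the chain.
pos = {"P": (0, 0), "A": (3, 0), "B": (1, 0), "C": (2, 0), "Q": (4, 0)}
nets = [("P", "A"), ("A", "B"), ("B", "C"), ("C", "Q")]
cost, placed = best_chain_permutation(["A", "B", "C"], nets, pos)
print(cost, placed)
```

Because the cells only trade slots among themselves, no overlap is introduced; the cost of enumeration grows as k! in the chain length k, which is why chains stay short.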

Example: Quadratic Relaxation on Noncontiguous Subsets (QRS)

• Select a subset M of cells to move
• Identify other cells and pads, F, connected to M by nets in $E_M = \{ e \in E \mid e \cap M \neq \emptyset \}$
• Decouple the horizontal and vertical problems.
• M is obtained as segments of length k along a DFS vertex traversal of the netlist

Solving the QRS subproblem

• Problem formulation (horizontal case):

  $$\min_x \sum_{e \in E_M} \sum_{v_k \in e} \frac{(x(v_k) - x_e)^2}{|x(v_k) - x_e| + \varepsilon}, \quad \text{where } x_e = \frac{1}{|e|} \sum_{v \in e} x(v) \text{ and } \varepsilon \text{ is a small number.}$$

• Iteratively solve the weighted quadratic minimization problem, using the current solution to determine the weight (as in Gordian-L)
• May result in cell overlap!

Ripple-move legalization [Hur and Lillis, 2000]

• Because many forms of subset relaxation ignore overlap, post-relaxation cell swaps may be needed to remove overlap.
• Define a DAG on neighboring bins. Edge cost reflects the best wirelength gain over all cell swaps between two bins.
• Calculate a max-gain monotone path on the bin-grid graph

Multilevel Methods: Interpolation (Generalized Declustering)

• Goal: transfer a partial solution from a coarser level to its adjacent finer level
• Simplest approach: place all components of a cluster at its center
• Better approach: place each component of an aggregate at the weighted average of the aggregates to which it is strongly connected.
• Optionally: impose constraints; e.g., the average location of the components can be held fixed.
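The difference between the simplest and the better approach can be sketched in a few lines. The connection weights below are hypothetical inputs; how "strongly connected" is measured is not specified here, so treat the numbers as placeholders.

```python
def interpolate(weights, coarse_pos):
    """Place each fine-level cell at the weighted average of the
    aggregates it is connected to (weights are assumed to be given)."""
    fine_pos = {}
    for cell, w in weights.items():
        total = sum(w.values())
        fine_pos[cell] = tuple(
            sum(wt * coarse_pos[agg][k] for agg, wt in w.items()) / total
            for k in range(2)
        )
    return fine_pos


coarse_pos = {"A": (0.0, 0.0), "B": (4.0, 2.0)}
weights = {
    "c1": {"A": 1.0},            # connected only to its own aggregate
    "c2": {"A": 1.0, "B": 3.0},  # pulled mostly toward aggregate B
}
fine_pos = interpolate(weights, coarse_pos)
print(fine_pos)
```

A cell with a single aggregate in its weight map, such as `c1`, reduces to the simplest approach: it lands at that aggregate's position.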

Interpolation (Declustering)

[Figure labels: Initial Coarsest Level Placement; Declustering; Placement; Final placement solution]

• Use the same grid structure at each level
• Variable cluster size (may be bigger than a bin): handled by hierarchical area density control
• Multilevel SA engine: SA engine starts with a low temperature at each level except the coarsest level

AMG-style Linear Interpolation

Place the C-Pt representatives

The inherited position of a cluster component can be determined by several cluster positions, not just its own.

Place the F-pts by weighted interpolation

AMG-based Linear Interpolation [A. Brandt 1986]

$$v_i = \sum_{v_j \in \text{C-points}} a_{ij} v_j + \sum_{v_j \in \text{F-points}} a_{ij} v_j$$

Within each cluster, select the one with maximum degree as C-point; others are considered as F-points

[Figure labels: constant interpolation; AMG interpolation; cluster; Next finer level cells; C-point]

Iterated Multilevel Flow

Make use of placement solution from 1st V-cycle

[Figure labels: First Choice (FC) clustering; Geometric-based FC clustering]

Iterated Multilevel Flow

[Figure labels: Iterated V-Cycles; F-Cycle; Backtracking V-Cycle]

Sample Impact of the Multilevel Components to mPL’s overall quality

• First-Choice Clustering: 3-4% reduced WL
• QRS Relaxation: 5-6% reduced WL
• AMG Interpolation: 2-3% reduced WL
• Iterated V-cycles: 2-8% reduced WL

Extension: Multilevel Mixed-size Placement

[Figure labels: Level 0; Level k; Coarsest level placement; Big objects legalization. Legend: big object; small object; fixed big object; cluster]

• Simultaneously place big and small objects
• Gradually fix the locations of big objects and generate overlap-free placement for big objects during multilevel placement

Example: Final Placement of ibm02 by mPG-ms

Paradigm 3: Embedded Multilevel Optimization

• Maintain a flat, nonhierarchical view of the placement problem
  - No explicit aggregation or partitioning
• Use advanced hierarchical computation algorithms to perform internal iterations
• Example: AMG-accelerated force-directed methods
  - Minimize weighted unconstrained quadratic wirelength
  - Incorporate area-distribution gradients iteratively in the quadratic optimality condition (Kraftwerk)
  - Employ algebraic multigrid (AMG) to solve the large linear systems for the optimality conditions at each step.

Timing Optimization

• Additional goal: to minimize longest-path delay or maximize the minimum slack.
• Difficulties:
  - Exponential number of paths.
  - Complex timing constraints: multi-clock domain, multi-cycle, etc.
• Existing Algorithms
  - Path-based algorithms
  - Net-based algorithms

Path-based Algorithms

• Directly minimize the longest path delay.
• Popular approaches:
  - Explicitly reduce the maximum length of a set of paths; the set could be pre-computed or dynamically adjusted.
    - [Burstein & Youssef, 1985, Swartz & Sechen, 1995]

[Figure: example timing graph with nodes 1 to 6 and edges a, b, c, d, e]

MAX { D(a)+D(c)+D(d), D(b)+D(c)+D(d), D(a)+D(c)+D(e), D(b)+D(c)+D(e) }

D(i): edge delay (cell delay included)
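The MAX expression can be reproduced by explicit path enumeration. This is a sketch under assumptions: the edge endpoints (a: 1→3, b: 2→3, c: 3→4, d: 4→5, e: 4→6) are inferred from the figure, and the delay values are borrowed from the edge labels used in the net-weighting examples later in the deck.

```python
# Assumed topology and delays for the five-edge example graph.
edges = {"a": (1, 3), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (4, 6)}
delay = {"a": 7, "b": 5, "c": 1, "d": 5, "e": 3}


def enumerate_paths(edges, start_nodes, end_nodes):
    """List every start-to-end path as a list of edge names."""
    out = {}
    for name, (u, v) in edges.items():
        out.setdefault(u, []).append((name, v))
    paths = []

    def walk(node, prefix):
        if node in end_nodes:
            paths.append(prefix)
            return
        for name, nxt in out.get(node, []):
            walk(nxt, prefix + [name])

    for s in start_nodes:
        walk(s, [])
    return paths


paths = enumerate_paths(edges, [1, 2], {5, 6})
path_delay = {"-".join(p): sum(delay[e] for e in p) for p in paths}
longest = max(path_delay.values())
print(path_delay, longest)
```

Enumeration is fine for four paths, but the number of paths can grow exponentially with circuit depth, which is the cost the path-based approaches have to manage.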

Mathematical Programming based Approaches

• Popular approaches (cont'd):
  - Mathematical programming by introducing auxiliary variables (arrival time)
    - [Jackson & Kuh, 1989, Srinivasan et al, 1991, Hamada et al, 1993, …]

[Figure: example timing graph with nodes 1 to 6 and edges a, b, c, d, e]

A(i): arrival time at i

A(1) + D(a) ≤ A(3) : edge(a)
A(2) + D(b) ≤ A(3) : edge(b)
… …
A(5) ≤ T(5) : endpoint 5
A(6) ≤ T(6) : endpoint 6
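For fixed delays, the tightest arrival times satisfying these constraints come from one forward pass, and edge slacks from one backward pass. The sketch below is an illustration, not a mathematical-programming formulation: it assumes the same inferred topology and delays as the net-weighting examples, and a required time T(5) = T(6) = 13.

```python
# Assumed topology, delays, and required times for the example graph.
edges = {"a": (1, 3), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (4, 6)}
delay = {"a": 7, "b": 5, "c": 1, "d": 5, "e": 3}
order = [1, 2, 3, 4, 5, 6]  # a topological order of the nodes
required = {5: 13, 6: 13}


def edge_slacks(edges, delay, order, required):
    """Return arrival times and per-edge slacks.

    Scanning all edges per node keeps the sketch short; adjacency lists
    would make both passes linear in the number of edges.
    """
    arrival = {n: 0 for n in order}
    for n in order:  # forward pass: A(v) >= A(u) + D(edge)
        for name, (u, v) in edges.items():
            if u == n:
                arrival[v] = max(arrival[v], arrival[u] + delay[name])
    latest = {n: required.get(n, float("inf")) for n in order}
    for n in reversed(order):  # backward pass: latest allowed arrival
        for name, (u, v) in edges.items():
            if v == n:
                latest[u] = min(latest[u], latest[v] - delay[name])
    slack = {
        name: latest[v] - arrival[u] - delay[name]
        for name, (u, v) in edges.items()
    }
    return arrival, slack


arrival, slack = edge_slacks(edges, delay, order, required)
print(arrival, slack)
```

Under these assumptions the computed slacks (0, 2, 0, 0, 2 for a through e) agree with the delay/slack edge labels shown in the net-weighting slides.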

Pros and Cons

• Advantage(s):
  - Accurate timing view during optimization.
• Disadvantage(s):
  - High computational cost.
  - Difficult to fit in certain placement frameworks.

Net-based Algorithms

• Timing constraints are translated into net weights (net-weighting) or length constraints (delay-budgeting).
• Delay budgeting: distribute slacks to all edges in the circuit to achieve zero-slack
  - [Hauge et al, 1987, Gao et al, 1991, Luk 1991, Youssef et al, 1992, Tellez et al, 1996, …]

[Figure: example timing graph with nodes 1 to 6 and edges a, b, c, d, e]

D(a) ≤ τ1
D(b) ≤ τ2
D(c) ≤ τ3
D(d) ≤ τ4
D(e) ≤ τ5

Timing-Driven Placement with Delay Budgeting

• Construct a placement to meet all budgets.
  - If all budgets are met, timing is GUARANTEED!
• Difficulty: there are too many possible budgeting solutions, and it is not known a priori which of them can be satisfied.
  - Budgeting is often done in the structural domain without physical feasibility consideration
  - Unify placement and delay budgeting?
    - [Sarrafzadeh et al, 1997; Halpin et al, 2001; Yang et al, 2002]

Net Weighting

• Timing criticalities are translated into net weights (soft constraints); then compute a placement which minimizes total weighted delay (or WL).

[Figure: example timing graph with nodes 1 to 6 and edges a, b, c, d, e]

Cost = w(a)*D(a) + w(b)*D(b) + w(c)*D(c) + w(d)*D(d) + w(e)*D(e)

Net Weighting Principles

• P1: smaller slack → higher weight.
  - For example: w(e) = (1 - slack(e)/T)^α
    - VPR [Marquardt et al, 2000]

[Figure: example timing graph. Edge label: delay/slack]

a: 7/0, b: 5/2, c: 1/0, d: 5/0, e: 3/2

α = 3.0
w(a) = 1.0
w(b) = 0.6058
w(c) = 1.0
w(d) = 1.0
w(e) = 0.6058
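The listed weights can be reproduced numerically. One assumption is needed: the value of T is not stated on the slide, so the sketch takes T = 13, the longest-path delay of the example (7 + 1 + 5), which yields the 0.6058 shown above. The helper name `vpr_weight` is made up for illustration.

```python
def vpr_weight(slack, T, alpha):
    """Slack-based net weight w(e) = (1 - slack(e)/T) ** alpha."""
    return (1.0 - slack / T) ** alpha


# Edge slacks taken from the delay/slack labels of the example graph.
slack = {"a": 0, "b": 2, "c": 0, "d": 0, "e": 2}
weights = {e: round(vpr_weight(s, T=13, alpha=3.0), 4) for e, s in slack.items()}
print(weights)
```

Note that edges a, c, and d all get weight 1.0 here, even though c lies on more paths than a or d; slack alone cannot tell them apart.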

Net Weighting Principles

• P2: more paths → higher weight.
  - For example: path-counting
    - [Senn et al, 2002]

[Figure: example timing graph. Edge label: delay/slack]

a: 7/0, b: 5/2, c: 1/0, d: 5/0, e: 3/2

w(a) = 2.0
w(b) = 2.0
w(c) = 4.0
w(d) = 2.0
w(e) = 2.0
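Path counts per edge do not require enumerating paths: the number of paths through an edge is the number of ways to reach its tail times the number of ways to leave its head. The sketch below assumes the edge endpoints inferred from the figure (a: 1→3, b: 2→3, c: 3→4, d: 4→5, e: 4→6); it is an illustration, not the cited algorithm.

```python
edges = {"a": (1, 3), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (4, 6)}
order = [1, 2, 3, 4, 5, 6]  # a topological order of the nodes


def count_paths_through_edges(edges, order):
    """Paths through edge (u, v) = (#paths source->u) * (#paths v->sink)."""
    has_in = {v for _, v in edges.values()}
    has_out = {u for u, _ in edges.values()}
    from_sources = {n: 0 for n in order}
    to_sinks = {n: 0 for n in order}
    for n in order:
        if n not in has_in:  # a source starts exactly one (empty) path
            from_sources[n] = 1
        for u, v in edges.values():
            if u == n:
                from_sources[v] += from_sources[u]
    for n in reversed(order):
        if n not in has_out:  # a sink ends exactly one (empty) path
            to_sinks[n] = 1
        for u, v in edges.values():
            if v == n:
                to_sinks[u] += to_sinks[v]
    return {name: from_sources[u] * to_sinks[v] for name, (u, v) in edges.items()}


path_counts = count_paths_through_edges(edges, order)
print(path_counts)
```

The counts match the weights listed above, and they show the weakness of counting alone: critical edge a and non-critical edge b receive the same weight.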

Ideal Net Weighting

• Need to consider both principles together

[Figure: example timing graph. Edge label: delay/slack]

a: 7/0, b: 5/2, c: 1/0, d: 5/0, e: 3/2

Ideally, we want w(c) > w(a) = w(d) > w(b) = w(e)

All Path Net Weighting Problem

n Challenge: can we compute the impact of all paths through an edge, properly scaled by their slacks? u i.e., for each edge e, compute

F w(e) = Σ∀ p ∋ e f(slack(p))

a:7/0

b:5/2

c:1/0d:5/0

e:3/2

w(a)= f(slack(a-c-d)) +f(slack(a-c-e))

w(c)= f(slack(a-c-d)) +f(slack(b-c-d)) +f(slack(a-c-e)) +f(slack(b-c-e))
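The definition can be checked directly by enumerating every path, which is exponential in general but fine for the five-edge example. This sketch assumes the hypothetical graph encoding a: 1→3, b: 2→3, c: 3→4, d: 4→5, e: 4→6 and takes slack(p) = T - delay(p), with T the longest path delay. The function f is left as a parameter.

```python
# Hypothetical encoding of the example graph: edge name -> (tail, head, delay).
EDGES = {"a": (1, 3, 7), "b": (2, 3, 5), "c": (3, 4, 1), "d": (4, 5, 5), "e": (4, 6, 3)}


def all_paths(edges):
    """Yield every source-to-sink path as a list of edge names."""
    heads = {head for _, head, _ in edges.values()}
    tails = {tail for tail, _, _ in edges.values()}

    def extend(node, prefix):
        outgoing = [name for name, (tail, _, _) in edges.items() if tail == node]
        if not outgoing:  # reached a sink
            yield prefix
        for name in outgoing:
            yield from extend(edges[name][1], prefix + [name])

    for source in sorted(tails - heads):
        yield from extend(source, [])


def all_path_weights(edges, f):
    """Brute-force w(e): sum of f(slack(p)) over all paths p containing e."""
    paths = list(all_paths(edges))
    delays = [sum(edges[name][2] for name in path) for path in paths]
    period = max(delays)  # T: longest path delay
    weights = {name: 0.0 for name in edges}
    for path, delay in zip(paths, delays):
        contribution = f(period - delay)
        for name in path:
            weights[name] += contribution
    return weights


# With f == 1 the definition reduces to plain path counting.
print(all_path_weights(EDGES, lambda slack: 1.0))
```

Passing a decreasing f makes low-slack paths contribute more, which combines principles P1 and P2 in a single weight.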

Page 60:

All Path Counting

- For certain functions f, the PATH algorithm [Kong 2002] can compute the exact weights of all edges in linear time:
  - w(e) = Σ_{p ∋ e} f(slack(p))
- For example:

[Figure: example timing graph; edge label: delay/slack]
a: 7/0, b: 5/2, c: 1/0, d: 5/0, e: 3/2

f(x) = 10^(-x/13)

w(a) = 1.7017, w(b) = 1.1940, w(c) = 2.8957, w(d) = 1.7017, w(e) = 1.1940

w(c) > w(a) = w(d) > w(b) = w(e)
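One way to see why linear time is plausible for an exponential f: with f(x) = 10^(-x/13) and slack(p) = T - delay(p), the term f(slack(p)) factors into a product of per-edge terms. The sum over all paths through e then splits into (sum over path prefixes reaching e) × (term of e) × (sum over path suffixes leaving e). The sketch below uses that factorization on the hypothetical graph encoding a: 1→3, b: 2→3, c: 3→4, d: 4→5, e: 4→6. It illustrates the idea under these assumptions and is not claimed to be the PATH algorithm of [Kong 2002].

```python
from collections import defaultdict

# Hypothetical encoding of the example graph: edge name -> (tail, head, delay).
EDGES = {"a": (1, 3, 7), "b": (2, 3, 5), "c": (3, 4, 1), "d": (4, 5, 5), "e": (4, 6, 3)}
TOPO_ORDER = [1, 2, 3, 4, 5, 6]  # nodes listed in topological order


def exp_all_path_weights(edges, order, base=10.0, scale=13.0):
    """w(e) = sum over paths p containing e of base**(-slack(p)/scale), in one sweep each way."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for tail, head, delay in edges.values():
        out_edges[tail].append((head, delay))
        in_edges[head].append((tail, delay))

    def factor(delay):  # per-edge multiplicative term
        return base ** (delay / scale)

    # prefix[v]: sum over source-to-v paths of the product of edge factors.
    prefix = {v: 0.0 if in_edges[v] else 1.0 for v in order}
    longest = {v: 0 for v in order}
    for v in order:
        for head, delay in out_edges[v]:
            prefix[head] += prefix[v] * factor(delay)
            longest[head] = max(longest[head], longest[v] + delay)

    # suffix[v]: sum over v-to-sink paths of the product of edge factors.
    suffix = {v: 0.0 if out_edges[v] else 1.0 for v in order}
    for v in reversed(order):
        for tail, delay in in_edges[v]:
            suffix[tail] += factor(delay) * suffix[v]

    period = max(longest.values())  # T: longest path delay
    norm = base ** (-period / scale)  # common factor base**(-T/scale)
    return {name: norm * prefix[tail] * factor(delay) * suffix[head]
            for name, (tail, head, delay) in edges.items()}


print(exp_all_path_weights(EDGES, TOPO_ORDER))
```

Each node and edge is visited a constant number of times. On the example the results agree with the slide's weights up to the last printed digit and preserve the ordering w(c) > w(a) = w(d) > w(b) = w(e).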

Page 61:

Results

- IT WORKS!
  - Incorporated into the state-of-the-art FPGA placer VPR.
  - Original weighting: w(e) = (1 - slack(e)/T)^α
  - 15.6% delay reduction.
  - No runtime overhead.
  - 4.1% wirelength increase.

Page 62:

Outline

- Introduction
- Gap Analysis of Existing Placement Algorithms
- Scalable Paradigm
- Timing Optimization
- Routability Optimization
- Concluding Remarks
- Application
  - Multi-Million Gate FPGA Placement

Page 63:

Routability Optimization

- Aggressive WL minimization != routability
- Routability-driven placement
  - Routability modeling
  - Solution techniques for routability control

Page 64:

Routability Modeling

- X × Y global routing grid in chip core region
- Model routing supply and demand for each bin and boundary on the grid structure

[Figure: routing supply and demand on the grid]

Page 65:

Categories of Routability Modeling

- Topology-free modeling
  - No routing topology generated
  - Fast
- Topology-based modeling
  - Steiner tree topology generation
  - Provide upper bound for routability estimation
  - High complexity

Page 66:

Topology-free Modeling

- Bounding-box (BBOX)-based modeling [Cheng 1994]
- Probabilistic analysis-based modeling [Lou et al, 2001]
- Rent's rule-based modeling [Yang et al, 2002]
- Pin density-based modeling [Hu & Marek-Sadowska 2002, Brenner & Rohe, 2003]

Page 67:

Topology-free Modeling: Stochastic Analysis [Lou et al, 2001]

- Probability of a 2-pin net crossing bin(i,j) in an m × n bin grid
  - P(i,j) = C(i,j)/F(m,n)
    - C(i,j): #routes crossing bin(i,j)
    - F(m,n): #routes from bin(1,1) to bin(m,n)
- Decompose multi-pin nets into 2-pin nets

[Figure: bin grid example with labels 1-6, s, and t]
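The ratio P(i,j) = C(i,j)/F(m,n) has a closed form if routes are taken to be monotone staircase paths that step one bin at a time, each equally likely. That route model is an assumption made here for illustration, and the model in [Lou et al, 2001] may differ in detail. Under it, F(m,n) is a binomial coefficient and C(i,j) is the number of routes from bin(1,1) to bin(i,j) times the number from bin(i,j) to bin(m,n).

```python
from math import comb


def route_count(m, n):
    """F(m, n): monotone routes between opposite corners of an m x n bin grid."""
    return comb(m + n - 2, m - 1)


def crossing_probability(i, j, m, n):
    """P(i, j) = C(i, j) / F(m, n), using 1-based bin indices."""
    # Routes through bin(i, j): routes into it times routes out of it.
    crossing = route_count(i, j) * route_count(m - i + 1, n - j + 1)
    return crossing / route_count(m, n)


# Probability map for a 3 x 4 bin grid, one row of the grid per line.
for i in range(1, 4):
    print([round(crossing_probability(i, j, 3, 4), 3) for j in range(1, 5)])
```

The two corner bins always have probability 1, and the probabilities over the whole grid sum to m + n - 1, the number of bins any single route visits.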

Page 68:

Topology-free Modeling: Pin Density Based

[Hu & Marek-Sadowska 2002, Brenner & Rohe, 2003]

- Calculate weighted wirelength
  - D_i: degree of net i
  - BB_i: bounding box of net i
  - P_s(b): heuristic function capturing pin density in bin b
- Can be combined with probabilistic analysis-based modeling

[Equation: weighted wirelength F in terms of C, w, D_i, and Σ_{b ∈ BB_i} P_s(b); exact form not recoverable]

F' = (1+1+1)/3*4 = 4

Let P(b) be linear with respect to #pins
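The worked example F' = (1+1+1)/3*4 = 4 can be read as: sum the pin-density terms P_s(b) over the bins in the net's bounding box, divide by the net degree D_i, and multiply by the net's wirelength. That reading, the function names, and the identity choice for a linear P_s are all assumptions for illustration, not the published formulation.

```python
def pin_density_weight(bbox_bins, pin_count, degree, density=float):
    """Assumed net weight: sum of P_s(b) over bounding-box bins, divided by net degree.

    `density` stands in for P_s; the default is linear in the number of pins.
    """
    return sum(density(pin_count.get(b, 0)) for b in bbox_bins) / degree


def weighted_wirelength(wirelength, bbox_bins, pin_count, degree):
    """Assumed weighted wirelength of one net: weight times plain wirelength."""
    return pin_density_weight(bbox_bins, pin_count, degree) * wirelength


# Hypothetical data matching the worked example: three bins holding one pin
# each, a 3-pin net, and a wirelength of 4.
pins_per_bin = {(0, 0): 1, (1, 0): 1, (2, 0): 1}
example = weighted_wirelength(4, list(pins_per_bin), pins_per_bin, degree=3)
print(example)
```

With denser bins inside the bounding box the weight grows, so the placer is pushed to shorten nets that cross pin-dense regions.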

Page 69:

Topology-based Modeling

- Precomputed Steiner tree topology for wiring demand estimation [Mayrhofer & Lauther, 1990]
- Congestion-avoidance two-bend routing for 2-pin nets [Chang et al, 2003]
- IncAtree with incremental update support for multi-pin nets [Chang et al, 2003]

Page 70:

Topology-based Modeling: Fast LZ-routing for Two-pin Nets

[Chang et al, 2003]

- Decide HVH or VHV:
  - Select the less congested layer
- Binary search on V-stem (or H-stem)
  - Initial left region and right region to cover bounding box
  - Repeat
    - Query wire usage on both regions
    - Select region with less congestion
- Wire usage query can be done in O(log grid_size)

[Figure: left region and right region of the bounding box; HVH and VHV route shapes]
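The binary search on the V-stem can be sketched as follows. This is a simplified illustration, not the router of [Chang et al, 2003]. It assumes a single usage grid, compares average usage per column so that uneven halves are treated fairly, and answers region queries with a 2D prefix-sum table (constant time per query after a linear build) in place of the O(log grid_size) query structure the slide mentions. All names are hypothetical.

```python
def build_prefix(usage):
    """2D prefix sums: prefix[r][c] is the total usage in rows < r and columns < c."""
    rows, cols = len(usage), len(usage[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            prefix[r + 1][c + 1] = (usage[r][c] + prefix[r][c + 1]
                                    + prefix[r + 1][c] - prefix[r][c])
    return prefix


def region_usage(prefix, r0, r1, c0, c1):
    """Total usage in rows r0..r1 and columns c0..c1 (inclusive)."""
    return (prefix[r1 + 1][c1 + 1] - prefix[r0][c1 + 1]
            - prefix[r1 + 1][c0] + prefix[r0][c0])


def choose_v_stem(usage, r0, r1, c0, c1):
    """Binary search for a V-stem column inside the bounding box."""
    prefix = build_prefix(usage)
    lo, hi = c0, c1
    while lo < hi:
        mid = (lo + hi) // 2
        # Average usage per column in the left and right candidate regions.
        left = region_usage(prefix, r0, r1, lo, mid) / (mid - lo + 1)
        right = region_usage(prefix, r0, r1, mid + 1, hi) / (hi - mid)
        if left <= right:
            hi = mid  # keep the less congested left region
        else:
            lo = mid + 1  # keep the less congested right region
    return lo


# Usage is heavy on the left, so the stem settles in the rightmost column.
grid = [[5, 4, 1, 0], [5, 4, 1, 0], [5, 4, 1, 0]]
print(choose_v_stem(grid, 0, 2, 0, 3))
```

The search halves the candidate columns each round, so the number of region queries is logarithmic in the bounding-box width. The H-stem case is symmetric with rows and columns swapped.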

Page 71:

Topology-based Modeling: Fast Incremental A-tree Routing

[Chang et al, 2003]

- Simple incremental A-tree
- Recursively quad-partition grids
- Each pin recursively connects to lower left corner of each level of partition
- For a net with bounding box length B, at most 2*log B edge updates for each pin move, except the root.
- Each edge routed by LZ-router

[Figure: first quadrant; root (source pin)]
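One way to picture the 2*log B bound: if each pin is linked to the root through the lower-left corners of its enclosing partition cells, the link chain has at most log B edges, and moving a pin only replaces its own chain. The sketch below assumes the root sits at (0, 0), pins lie in the first quadrant, and B is a power of two. The names are hypothetical and this is not the IncAtree implementation of [Chang et al, 2003].

```python
def corner_chain(pin, size):
    """Points linking a pin to the root at (0, 0) through lower-left cell corners.

    Cells double in side length at each level: 2, 4, ..., size.
    """
    x, y = pin
    chain = [(x, y)]
    cell = 2
    while cell <= size:
        corner = (x - x % cell, y - y % cell)
        if corner != chain[-1]:  # skip levels where the corner does not change
            chain.append(corner)
        cell *= 2
    return chain


def chain_edges(chain):
    """Tree edges along a chain, as (child, parent) pairs."""
    return set(zip(chain, chain[1:]))


def edge_updates(old_pin, new_pin, size):
    """Edges to remove and to add when a pin moves."""
    old = chain_edges(corner_chain(old_pin, size))
    new = chain_edges(corner_chain(new_pin, size))
    return old - new, new - old


print(corner_chain((5, 3), 8))
removed, added = edge_updates((5, 3), (6, 3), 8)
print(len(removed) + len(added))
```

Because both the old and the new chain have at most log2(B) edges, a move touches at most 2*log2(B) edges. Edges shared by both chains, such as those near the root, need no update.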

Page 72:

Optimization Techniques for Routability

- Net weighting
  - Transfer congestion picture into bin weights and optimize weighted WL
  - Used in iterative placement, such as SA-based placer [Hu & Marek-Sadowska, 2002, Chang et al, 2003]
- Cell weighting (a.k.a. cell inflation)
  - Weight cell size based on the congestion picture
  - Use partitioner or implicit/explicit bin density control to move inflated cells out of congested regions
  - Used in constructive placement and iterative placement [Parakh 1998, Brenner & Rohe, 2003, Yang et al, 2003]

Page 73:

Outline

- Introduction
- Gap Analysis of Existing Placement Algorithms
- Scalable Paradigm
- Timing Optimization
- Routability Optimization
- Concluding Remarks
- Application
  - Multi-Million Gate FPGA Placement

Page 74:

Concluding Remarks

- There is significant opportunity to improve placement technologies
- Three scalable paradigms
  1) Top-down
     - Generalized recursive partitioning defines the hierarchy
  2) Bottom-up (multilevel)
     - Generalized recursive clustering defines the hierarchy
  3) Flat
     - Flat on the outside, but hierarchical internally
- Timing and routability optimization can be achieved through weighted wirelength optimization

Page 75:

Outline

- Introduction
- Gap Analysis of Existing Placement Algorithms
- Scalable Paradigm
- Timing Optimization
- Routability Optimization
- Concluding Remarks
- Application
  - Multi-Million Gate FPGA Placement