Profile-Guided Microarchitectural Floorplanning for Deep Submicron Processor Design
Mongkol Ekpanyapong, Jacob R. Minz, Thaisiri Watewai*, Hsien-Hsin S. Lee, and Sung Kyu Lim
Georgia Institute of Technology, * University of California at Berkeley
Current Processor Design Paradigm
Exploit the availability of silicon area.
Exploit higher clock speeds to enhance performance.
Assume a unit delay model.
Architects do their own job well, assuming that smart CAD tools will do the rest of the work.
VLSI & Physical Design CAD
Minimize both gate and wire delay.
Minimize total die area.
Accomplish the above while knowing as little as possible about the design.
CAD designers design good tools, assuming that computer architects did their job well.
Next Generation Processor Design
Physical planning alone is not enough.
Employing some knowledge of the design can result in better performance.
Iteration between computer architecture design and CAD tools is necessary.
Smart CAD tools need some help from computer architects.
Terminology
Profiling
The techniques that compilers or computer architects use to collect statistical information that can result in better optimization.
Instructions Per Cycle (IPC)
Number of instructions that can be issued per cycle.
Billion Instructions Per Second (BIPS)
Number of instructions, in billions, that can be issued per second.
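The two metrics are linked by the clock frequency: BIPS equals IPC multiplied by the clock rate in GHz. The sketch below illustrates that relation with two hypothetical design points (the numbers are made up for illustration and do not come from the slides). It shows how a design with lower IPC can still deliver more instructions per second if it clocks faster.

```python
def bips(ipc: float, clock_ghz: float) -> float:
    """Billions of instructions per second for a given IPC and clock rate in GHz."""
    return ipc * clock_ghz


# Hypothetical design points, chosen only for illustration.
fast_clock = bips(ipc=1.2, clock_ghz=4.0)  # lower IPC, faster clock
slow_clock = bips(ipc=2.0, clock_ghz=2.0)  # higher IPC, slower clock

print(f"fast clock: {fast_clock:.2f} BIPS, slow clock: {slow_clock:.2f} BIPS")
```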
Outline
Introduction
Related Work
Wire Delay Issues
Profile-Guided Floorplanning
Simulation Infrastructure
Experimental Results
Conclusions
Related Work
Ho et al. [SRC 1999, IEEE 2001]
Discussed the impact of wire delay in deep submicron technology.
Agarwal et al. [ISCA 2000]
Raised the issue of the impact of wire length on conventional microarchitecture design in deep submicron processors.
Cong et al. [DAC 2003]
Proposed that BIPS should be used instead of IPC, the metric widely used in current processor design.
Ho et al. classify wires into three classes: local wires, global wires, and repeated wires.
For 30 nm technology, the repeated wire delay is approximately 80 ps/mm. An FO4 gate delay is approximately 17 ps.
To achieve the target high frequency, flip-flop insertion is required.
For example, the Pentium 4 processor design has 2 dedicated pipeline stages for moving signals across the chip.
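To give a feel for these numbers, the sketch below converts the quoted 80 ps/mm repeated wire delay into FO4 equivalents and into the distance a signal can travel in one clock cycle. The die edge length and the clock period are hypothetical values picked for illustration, and gate delay is ignored in the per-cycle reach.

```python
REPEATED_WIRE_PS_PER_MM = 80.0  # from the slide, 30 nm technology
FO4_PS = 17.0                   # from the slide


def wire_delay_ps(length_mm: float) -> float:
    """Delay of a repeated wire of the given length."""
    return REPEATED_WIRE_PS_PER_MM * length_mm


def reach_mm(cycle_ps: float) -> float:
    """Longest repeated wire a signal can cross in one cycle, ignoring gate delay."""
    return cycle_ps / REPEATED_WIRE_PS_PER_MM


die_edge_mm = 10.0  # hypothetical die edge
cycle_ps = 200.0    # hypothetical clock period (5 GHz)

delay = wire_delay_ps(die_edge_mm)
print(f"crossing {die_edge_mm} mm takes {delay} ps "
      f"({delay / FO4_PS:.1f} FO4, {delay / cycle_ps:.1f} cycles)")
print(f"reach per cycle: {reach_mm(cycle_ps)} mm")
```

Under these assumed values a cross-die signal needs several cycles, which is why flip-flops have to be inserted on long wires.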
When Wire Delay Becomes the Problem
$L$ = target cycle time (1 / clock frequency).
$g_i$ = gate delay of module $i$.
$w_{\max,i}$, $w_{\min,i}$ = maximum and minimum half width of module $i$.
$\lambda_{ij}$ = interconnect traffic information between modules $i$ and $j$.
$\alpha$ = repeated wire delay per mm.
Parameters:
$x_i$, $y_i$ = location of module $i$.
$w_i$ = half width of module $i$.
Output:
$z_{ij}$ = number of flip-flops between modules $i$ and $j$ before considering wire delay impact.
The relation between modules $i$ and $j$ can be left, right, above, or below, depending on the values of the binary variables $c_{ij}$ and $d_{ij}$.
$a_i = 2h_i \times 2w_i$
$x_i + w_i \le x_j - w_j$: $i$ is to the left of $j$
$x_i - w_i \ge x_j + w_j$: $i$ is to the right of $j$
$4 y_i w_i w_j + a_i w_j \le 4 y_j w_i w_j - a_j w_i$: $i$ is below $j$
$4 y_i w_i w_j - a_i w_j \ge 4 y_j w_i w_j + a_j w_i$: $i$ is above $j$
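The vertical constraints follow from $y_i + h_i \le y_j - h_j$ (below) and $y_i - h_i \ge y_j + h_j$ (above) after substituting $h = a / (4w)$ and multiplying through by $4 w_i w_j$. The sketch below checks which of the four relations hold for a pair of modules. The `Module` class and the `relations` helper are hypothetical names introduced for illustration, not part of the authors' tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    x: float  # center x
    y: float  # center y
    w: float  # half width
    a: float  # area, a = 2h * 2w


def relations(i: Module, j: Module) -> List[str]:
    """Return every relative position of i with respect to j that holds."""
    found = []
    if i.x + i.w <= j.x - j.w:
        found.append("left")
    if i.x - i.w >= j.x + j.w:
        found.append("right")
    # Vertical tests use the forms multiplied by 4 * w_i * w_j, so no h appears.
    if 4 * i.y * i.w * j.w + i.a * j.w <= 4 * j.y * i.w * j.w - j.a * i.w:
        found.append("below")
    if 4 * i.y * i.w * j.w - i.a * j.w >= 4 * j.y * i.w * j.w + j.a * i.w:
        found.append("above")
    return found


m1 = Module(x=1.0, y=1.0, w=1.0, a=4.0)
m2 = Module(x=4.0, y=1.0, w=1.0, a=4.0)  # same row, further right
m3 = Module(x=1.0, y=4.0, w=1.0, a=4.0)  # same column, higher up
print(relations(m1, m2), relations(m1, m3))
```

An empty result means none of the four relations holds, that is, the two modules overlap.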
The number of flip-flops between modules $i$ and $j$ must be at least the sum of the gate delay and the wire delay between the two modules, divided by the target cycle time.
[Figure: example with delay labels 3 ns, 2 ns, 2 ns and cycle time (L) = 4 ns]
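A minimal sketch of this bound follows. It assumes the smallest admissible integer is taken (a ceiling), and the delay values in the usage lines are hypothetical numbers chosen for illustration; the function name is also an assumption.

```python
import math


def min_flipflops(gate_delay_ns: float, wire_delay_ns: float,
                  cycle_time_ns: float) -> int:
    """Smallest integer z with z >= (gate delay + wire delay) / cycle time."""
    return math.ceil((gate_delay_ns + wire_delay_ns) / cycle_time_ns)


# Hypothetical values: 3 ns of gate delay, 2 ns of wire delay, 4 ns cycle time.
print(min_flipflops(3.0, 2.0, 4.0))
# A shorter wire brings the total down to exactly one cycle.
print(min_flipflops(3.0, 1.0, 4.0))
```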
(MINP) Objective
Minimize the weighted wire length, where the weights are the interconnect traffic information obtained from profiling.
Note that for the same target technology and clock frequency, $g_i$, $\alpha$, and $L$ are constants.
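The sketch below evaluates such an objective for two candidate placements. It assumes wire length is the Manhattan distance between module centers, which the slide does not state. The module names and traffic weights are hypothetical.

```python
def weighted_wirelength(centers, traffic):
    """Sum over module pairs of traffic weight times Manhattan center distance."""
    total = 0.0
    for (i, j), weight in traffic.items():
        xi, yi = centers[i]
        xj, yj = centers[j]
        total += weight * (abs(xi - xj) + abs(yi - yj))
    return total


# Hypothetical profile: fetch talks to the ALU far more often than to the L2.
traffic = {("fetch", "alu"): 0.9, ("fetch", "l2"): 0.1}

alu_near = {"fetch": (0.0, 0.0), "alu": (1.0, 0.0), "l2": (5.0, 0.0)}
l2_near = {"fetch": (0.0, 0.0), "alu": (5.0, 0.0), "l2": (1.0, 0.0)}

print(weighted_wirelength(alu_near, traffic))
print(weighted_wirelength(l2_near, traffic))
```

Placing the heavily used pair close together gives the lower cost, which is the behavior the profile-guided weights are meant to encourage.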
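Non-Linear Relaxation
For $w_{\min,i} \le w_i \le w_{\max,i}$, the nonlinear relation
$$h_i = \frac{a_i}{4 w_i}$$
is replaced by the linear approximation
$$h_i = m_i w_i + k_i,$$
where
$$m_i = -\frac{a_i}{4\, w_{\min,i}\, w_{\max,i}}, \qquad k_i = \frac{a_i}{4\, w_{\min,i}} + \frac{a_i}{4\, w_{\max,i}}.$$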
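A quick numerical check of this relaxation follows. It assumes the line $h_i = m_i w_i + k_i$ is the secant of $h_i = a_i / (4 w_i)$ through the two endpoints $w_{\min,i}$ and $w_{\max,i}$. The function names and sample values are hypothetical.

```python
def secant_coefficients(a: float, w_min: float, w_max: float):
    """Slope m and intercept k of the line through the curve's two endpoints."""
    m = -a / (4.0 * w_min * w_max)
    k = a / (4.0 * w_min) + a / (4.0 * w_max)
    return m, k


def exact_half_height(a: float, w: float) -> float:
    """Exact half height from a = 2h * 2w."""
    return a / (4.0 * w)


a, w_min, w_max = 16.0, 1.0, 4.0  # hypothetical module
m, k = secant_coefficients(a, w_min, w_max)

for w in (w_min, 2.0, w_max):
    print(w, exact_half_height(a, w), m * w + k)
```

The line matches the exact half height at both endpoints and overestimates it in between, because $a / (4w)$ is convex in $w$.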
Mixed Integer Linear Programming
Integer Relaxation
Solving a mixed integer program is NP-hard.
Bipartitioning is used for the relaxation.
Linear Programming
$r_j$, $l_j$, $t_j$, $b_j$ are the right, left, top, and bottom of the hard virtual box constraints imposed on our floorplanner.
The soft virtual box constraint allows a module to relocate (crossing between blocks) while maintaining the center-of-gravity constraints.
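One plausible reading of these two constraint types is sketched below. It assumes a hard virtual box requires the whole module to lie inside the box, and that the center of gravity is the area-weighted mean of module centers. Both interpretations, and the function names, are assumptions not confirmed by the slides.

```python
def inside_hard_box(x, y, w, h, left, right, bottom, top):
    """True if a module with center (x, y) and half sizes (w, h) fits in the box."""
    return (left <= x - w and x + w <= right
            and bottom <= y - h and y + h <= top)


def center_of_gravity(modules):
    """Area-weighted center of modules given as (x, y, area) tuples."""
    total_area = sum(area for _, _, area in modules)
    cx = sum(x * area for x, _, area in modules) / total_area
    cy = sum(y * area for _, y, area in modules) / total_area
    return cx, cy


# Hypothetical 4 x 4 box with its lower-left corner at the origin.
print(inside_hard_box(2.0, 2.0, 1.0, 1.0, left=0.0, right=4.0, bottom=0.0, top=4.0))
print(inside_hard_box(3.5, 2.0, 1.0, 1.0, left=0.0, right=4.0, bottom=0.0, top=4.0))
print(center_of_gravity([(1.0, 1.0, 2.0), (4.0, 1.0, 1.0)]))
```

Under this reading, a soft constraint would only pin the combined center of gravity of a block's modules, leaving individual modules free to cross the block boundary.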
Floorplanning Algorithm
Last iteration