
Voltage-drop and Electromigration in Chip Power Grids

Zahi Moudallal

A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy

The Edward S. Rogers Sr. Department of Electrical and Computer Engineering
University of Toronto

Abstract


A major challenge in modern chip design is the design and analysis of the chip's power grid, the network that relays power from the external supply pins of an integrated circuit (IC) to the chip's transistors. The power grid must deliver a source of power that is fairly free from large fluctuations over the chip's target lifetime. Large voltage variations on the grid can easily put circuit performance and reliability at risk. The large switching currents drawn by the logic circuitry may result in unacceptable voltage levels on the grid in the short term. In addition, Electromigration, a physical phenomenon that leads to the aging and deformation of the grid metal lines over an extended period of time, may result in unacceptable voltage levels in the long term, which places a limit on the useful lifetime of the chip. In light of these two problems, this thesis develops a suite of computer-aided design (CAD) techniques to help analyze, verify, and design the power grid. This thesis is in three parts. The focus of the first part is to develop a systematic approach that generates power budgets which, if adhered to by the underlying logic circuitry, would guarantee safe voltage levels on the grid. The second part of the thesis develops a power scheduling framework for power-gated power grids, a modern type of grid design that includes active devices that allow major circuit blocks to be turned OFF, for power and thermal management. During chip operation, a power scheduler monitors the on-chip hardware resources and manages chip workload (switching activity) to ensure the voltage levels on the grid remain within specifications. Finally, the third part of the thesis develops a power grid fixing scheme that introduces minimal modifications to the grid design to meet a target electromigration lifetime.

To whom I owe it all,

Marwan & Amina

Acknowledgements

Impossible is not a fact. It is an opinion.

Impossible is not a declaration. It is a dare.

Impossible is potential. Impossible is temporary.

Impossible is nothing. - Muhammad Ali

Now, the PhD journey is over. As exciting as it is, starting a new chapter of my life that holds countless opportunities to pursue, I will always look back at this period of my life with an eye of affection and pride. Now that this chapter is over, I cannot but think of how lucky I was to have been in the right place, at the right time, with the right people around me. Thank you God (Alhamdulillah) for guiding me through this journey, and for giving me the patience and strength to get where I am. Thank you for bringing to my life a whole lot of irreplaceable people, to whom I am deeply grateful.

First, I would like to express my sincere gratitude to my mentor and advisor Professor Farid Najm, who has been a true role model on both academic and personal levels. My words fail to describe the honor and pleasure I felt working with you for the past 6 years. I still fondly remember during the first few weeks of our work together when I cracked a small research problem and you looked at the solution I sketched on the little white board in your office and simply said "I am impressed", and another time when I told you in our weekly meeting that I finally proved something we had been working on for several weeks (one of the lemmas in this thesis) and you responded "why didn't you tell me before! I should have brought a bottle of champagne to celebrate." You always managed to keep me motivated and excited about my research. Professor, thank you for being a constant beacon of intellect, inspiration, and support. It would not have been possible to finish this work without you.

I would also like to thank Professors Vaughn Betz, Jason Anderson, and Piero Triverio from the ECE department at the University of Toronto, and Dr. Eli Chiprout for their time and effort in reviewing this work.

Many thanks to the people who have touched my life with or without knowing, with a simple sweet word they have said to me, a strong piece of advice, an overnight in the lab cracking a research problem, a teary-eyed laugh that took my mind away from work, a time we spent binge-watching, playing board games or basketball, a heated argument, planning the future, or a goodbye hug. While I may not do you justice in mentioning each and every single name of you, you know yourselves and you are very important to me. Thank you for existing in my life.

A special thanks from the bottom of my heart goes to my better half, my moral compass, and my number one lady Noha Sinno. Noha, thank you for keeping me sane throughout this journey. I am forever indebted to you, for your presence in my life, the unwavering support, and your ability to listen. Thank you for the precious moments that we spent together; I would not be the same without you.

My greatest gratitude goes to my wonderful parents Marwan and Amina, to whom I dedicate this work. Their continuous sacrifices and pride in my success have raised my ambition and provided me with unwavering inspiration. I owe everything to you.

Last but not least, I would like to thank the rest of my ever-growing family: my siblings Shadi, Nassma, and Hani; my family in-law Marwan, Khadija, Walid, Sara, Noura, Mohamed, and Maher; and as much, the youngest of us, Samir, Jad, Dahlia, and Marwan.

To all of you, a wholehearted thank you and God bless you.

Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Organization
1.4 Notation and Terminology

2 Background
2.1 Mathematical Background
2.1.1 Element-wise Operators
2.1.2 Irreducible Matrices
2.1.3 The Class of M-matrices
2.2 The Power Grid
2.3 Power Grid Parasitics
2.4 Power Grid Models
2.4.1 RC Model
2.4.2 DC Model
2.4.3 RLC Model
2.5 Power Grid Verification
2.5.1 Vector-Based Methods
2.5.2 Vectorless Methods
2.6 Generating Current Budgets for RC Grids
2.6.1 Safe Containers
2.6.2 Maximal Containers
2.7 Power-Gated Grids
2.8 Electromigration
2.8.1 The Physical Phenomenon
2.8.2 Effective-EM Current
2.8.3 Electromigration Failure Models
2.8.4 Power Grid EM Checking Approaches
2.8.5 Physics-based EM Checking

3 Generating Current Budgets for RC Grids Revisited
3.2 Redundant Constraints
3.3 Applications
3.3.1 Peak Power Dissipation
3.3.2 Uniform Current Distribution
3.3.3 Combined Objective
3.4 Results
3.5 Conclusion

4 Generating Current Budgets for RLC Grids
4.1 Motivational Example: RC versus RLC
4.3 Problem Definition
4.4 Maximal Containers
4.4.1 Extremal
4.4.2 Irreducible
4.4.3 Maximality
4.5 Applications
4.5.1 Peak Power Dissipation
4.5.2 Uniform Current Distribution
4.5.3 Combined Objective
4.6 Implementation
4.7 Results
4.8 Conclusion

5 Power Scheduling with Active RC Power Grids
5.1 High-Level Design Flow of a Chip
5.2 Overview
5.4 Proposed Approach - Theory
5.4.1 Isolated Block Analysis
5.4.2 Full Grid Analysis
5.5 Application
5.5.1 User-specified Constraints
5.5.2 Maximum Local Power
5.5.3 Maximum Working Modes
5.6 Results
5.7 Conclusion

6 Power Grid Fixing for EM-induced Voltage Failures
6.1 Problem Definition and Notation
6.2 Proposed Approach
6.2.1 Overview
6.2.2 Stepping Strategy
6.2.3 Step-size Selection
6.3 Design Rules
6.3.1 Maximum Metal Area Usage
6.3.2 Minimum Spacing
6.4 Implementation
6.5 Parallelization
6.6 Results
6.7 Conclusion

7 Conclusions and Future Work

Appendices

A Generating Current Budgets for RC Grids
A.1 Applications

B Generating Current Budgets for RLC Grids
B.1 Subset-Preserving (SP) Matrices
B.2 Proof of Lemma 4.6
B.3 Proof of Lemmas 4.7-4.9
B.4 Applications
B.5 Properties of the matrices

C Power Scheduling with Active RC Power Grids
C.1 Design Objectives

Bibliography

List of Tables

3.1 Details of power grids used in experiments
3.2 Runtime of the three approaches
3.3 Comparison of the three approaches
4.2 Runtime of the three approaches
4.3 Comparison of the three approaches
4.4 Number of variables and constraints for all three LPs
5.2 Power grid properties and the runtime breakdown
5.3 The user-specified constraints parameters and comparison of the two design objectives
6.2 Summary of results using proposed approach
6.3 Summary of results using the greedy approach

List of Figures

2.1 Example of $X$ in $\mathbb{R}^2$.

2.2 A cross-sectional view of a power distribution system of a high performance integrated circuit [68]

2.3 A power grid

2.4 Simple RC grid

2.5 Simple RLC grid

2.6 Example of a container F for $i_1(t)$ and $i_2(t)$.

3.1 An example of a current container with redundant constraints which was produced on a 14-node grid with 2 current sources. The redundant constraints are shown as dotted lines whereas the non-redundant constraints are shown as solid lines.

3.2 An example of a power grid with 4 nodes, 2 current sources, and $V_{th} = [110\ 100\ 95\ 105]^T$ (units of mV).

3.3 An example of $F(u_p)$, $F(u_s)$, and $F(u_c)$.

3.4 Contour plots for peak current density across the layout and the corresponding histograms using $F(u_s)$ from [50]. The color bar units are mA/cm².

histograms. The color bar units are mA/cm².

4.1 Simple example of an RLC power grid.

4.2 An example of a current waveform that leads to voltage violations.

4.3 A current container (represented as an empty polygon) generated for the RC circuit resulting from ignoring the inductance of Fig. 4.1 that includes the problematic current waveform of Fig. 4.2. Also, a current container (represented as a striped polygon) generated using one of the proposed algorithms presented below that avoids the problematic current stimulus presented in Fig. 4.2.

4.4 Example of a container F for $i_1(t)$ and $i_2(t)$.

4.5 Nodal voltage at nodes 1 and 3 of Fig. 4.1 due to the current waveform in Fig. 4.4. The grey line represents the $V_{dd}$ value. The dashed lines represent the voltage overshoot/undershoot thresholds. The blue dotted lines represent the values of Px(F), where F is the current container represented in Fig. 4.4.

4.6 Examples of Pu and F(u).

4.7 A graphical representation of the set U, vectors u that are irreducible, vectors u that are extremal in U, and vectors u that have F(u) maximal in S.

4.8 An example of $F(u_p)$, $F(u_s)$, and $F(u_c)$.

histograms. The color bar units are mA/cm².

5.1 Conceptual system-level representation of the proposed run-time workload scheduler in a power-gated chip. This figure was inspired from [60].

5.2 Schematic diagram of an active power grid with power gating transistors.

5.3 Schematic diagram of a power-gated grid using resistive switches, referred to as the original grid.

5.4 Schematic diagram of the equivalent passive grid.

5.5 Relative error of the maximum voltage drop using the original grid vs the equivalent passive grid, based on HSPICE simulations of a 400K-node grid with 49 blocks.

5.6 A block in isolation.

5.7 Simple example of a power grid with 2 blocks.

5.8 Simple example of a power grid with a supply value of $V_{dd} - (1 - \alpha)\gamma$.

5.9 (a) A current container $F_1(\alpha_1)$ for the left block in Fig. 5.7 for different values of $\alpha_1$, (b) A current container $F_2(\alpha_2)$ for the right block in Fig. 5.7 for different values of $\alpha_2$, (c) the set of safe working modes $W(\alpha)$ for different values of $\alpha = [\alpha_1\ \alpha_2]^T$ under the containers generated for each block in isolation, i.e. $F_1(\alpha_1)$ and $F_2(\alpha_2)$; the dashed polygons correspond to $\alpha_1 = 0.4$ and for different values of $\alpha_2$, the solid polygons correspond to $\alpha_2 = 0.4$ and for different values of $\alpha_1$, and (d) the set $W(\alpha)$ for different values of $\alpha$.

5.10 A 3-D plot in (a) and a contour plot in (b) of the percentage of safe working modes for different values of $\alpha$ on a 5k node grid with 16 blocks. The color bar represents the percentage of safe working modes.

5.11 The feasible space of $\alpha$ in Fig. 5.7 as a result of some user-specified constraints.

6.1 A simple example of a power grid with 2 interconnect trees.

6.2 TTF distributions for original and resized grids.

6.3 High-level representation of the proposed approach shown in a high-dimensional space of scaling factors for the width of each interconnect tree in the grid.

6.4 Minimum spacing between supply and ground interconnect trees.

6.5 Flow chart of the proposed algorithm.

6.6 A parallel implementation of the proposed algorithm.

6.7 Metal increase breakdown.

6.8 Runtime breakdown.

6.9 A contour plot of the worst-case (over all grid samples) voltage drop at time $T^*$ on a 37K-node grid with 8 metal layers. The color bar represents the voltage drop in mV. The red dots show the nodes that have voltage drop above $V_{th}$. The yellow stripes show the location of the interconnect trees that have been resized by the proposed approach. Both x and y axes are in µm.

6.10 A contour plot of the worst-case voltage drop at time $T^*$ on the fixed grid using the greedy approach (MTF = 12.8 years).

6.11 Top view of a 37K-node power grid with 8 metal layers. The red dots show the die region where the failed nodes are located. The yellow stripes show the die region where an interconnect tree was scaled using the proposed approach.

6.12 Top view of a 37K-node power grid with 8 metal layers. The red dots show the die region where the failed nodes are located. The yellow stripes show the die region where an interconnect tree was scaled using the greedy approach.

B.1 A high-level representation of G(Z).

B.2 A simple high-level representation of G(Z).

Chapter 1

Introduction

1.1 Motivation

Integrated circuits (ICs) are used in virtually all modern electronic equipment, ranging from high-performance systems, such as servers and supercomputers, to low-power systems, such as mobile devices and wearable technologies. As IC technology is continuously driven towards high-performance design, a major challenge in modern chip design is the design and analysis of the chip's power grid, which has a direct impact on the chip's performance and reliability.

The power grid in integrated circuits is the network that relays power from the external chip pins to the chip's transistors. The power grid thus acts as a voltage source that must supply appropriate voltage levels to the underlying logic blocks over the chip's lifetime. However, due to the parasitics of the grid transmission lines, circuit activity, coupling effects, electromigration, and other factors, the grid may experience excessive voltage variations at its end points that have serious effects on the logic and timing behavior of the integrated circuit, such as reducing the noise margins, slowing down the circuit, and causing soft errors. The designer has a formidable task to ensure that the grid design can support the high power requirements of the logic blocks while accounting for all aforementioned factors.

In this thesis, we consider two factors that may result in supply fluctuations: 1) the large switching currents drawn by the logic circuitry, and 2) the aging and deformation of the grid metal lines over time, a physical phenomenon known as Electromigration (EM), which increases the resistance of the grid. The voltage integrity of the grid must be verified under both factors. Verifying the grid voltages due to the activity of the logic circuitry is referred to as power grid verification, whereas verifying the grid due to EM degradation is referred to as EM checking. This thesis touches on the computer-aided design (CAD) side of both problems.

Existing power grid verification tools can be divided into two categories: vector-based (or simulation-based) and vectorless. Most traditional tools are vector-based in the sense that they require the user to specify current traces that represent the switching activity of the logic circuitry over time, and compute the exact voltages or voltage drops at the grid nodes. These current traces are difficult to obtain, unless the logic circuitry has already been specified. This makes vector-based methods unsuitable for verifying the grid early in the design process, when modifications can be most easily incorporated. Vectorless methods, which were developed as an alternative to vector-based methods, require the user to specify only constraints on the currents drawn by the logic circuitry. These methods are quite powerful, as they allow for efficient and early verification, but they assume that the circuit current constraints can be easily specified at early design stages by relying on engineering judgement or expertise from previous design activities. Obtaining/specifying these constraints has proven to be difficult and has become a hurdle to the adoption of these methods. Providing the current constraints is a burdensome task for users, and the first part of this thesis is aimed at addressing this problem.

While the power grid is mostly a passive structure with resistive, inductive and capacitive parasitics, power grids in modern chip designs often also include active devices (e.g. MOSFETs) that implement power-gating to allow the supply currents of major circuit blocks to be turned OFF by disconnecting them from the rest of the power grid. Typically, in a large die, for thermal and supply integrity reasons, one cannot have all the circuit blocks turned ON simultaneously, so that there will always be some circuit blocks that are turned OFF (so-called dark Silicon). With this modern type of power grid design, power scheduling has emerged as a new CAD problem. For example, many authors have studied the scheduling of chip workload to manage total power and temperature, including [46, 24, 37]. But power-gating also has an impact on the supply voltage levels across the die, because voltage drop is generated in the grid depending on the combinations of blocks that are ON. We consider the question of how to manage the chip workload so that supply voltage variations remain within specifications. The chip will typically include a design component, a power scheduler, that monitors the on-chip hardware resources and manages chip workload to ensure the voltage variations remain within specifications. This is a key problem that is addressed in this thesis.

Electromigration, on the other hand, is a long-term failure mechanism that describes the migration of metal atoms in a metal line due to high current density in the line. This forced redistribution of metal atoms eventually leads to void formation at the junctions of the power grid, which results in an increase in the grid resistance, leading to large voltage drops and grid failure. As such, even though a chip may perform as expected for a relatively long period of time, performance degradation and chip failure may eventually occur. In the final stages of design, if the expected grid lifetime is determined to be less than the target lifetime of the chip, the grid design is said to have an EM-lifetime violation and needs to be fixed. At this stage, the fix is most commonly done by introducing minimal modifications to the grid design, such as widening the grid metal lines. Many authors have studied this problem, including for example [22, 27, 66, 65, 70]. However, the main limitation of most proposed techniques is that they are based on inaccurate EM failure models and pessimistic power grid failure models. This thesis presents a power grid fixing scheme based on the recent work in [19], an efficient full-chip EM assessment approach that has been validated against experimental data [23].

1.2 Contributions

The thesis focuses on the following three major frontiers:

1. Generating circuit current constraints under RC and RLC grid models.

2. Introducing a power scheduling framework for power-gated grids.

3. Proposing a power grid fixing scheme for EM-induced voltage failures.

The major contributions of this thesis with respect to the above three frontiers are summarized below:

• Moudallal et al. [50] have proposed and laid down the theoretical foundation of the constraints generation framework assuming an RC model of the grid. Chapter 3 follows up on that previous work and develops some key theoretical results that reduce the number of constraints required to express safe currents without altering the current space. It further generalizes some theoretical results that were presented in [50] by characterizing the properties of design objectives that can provably produce maximal containers. The chapter also presents algorithms that target the same design objectives as in [50] but achieve large speedup over the algorithms in [50]. Finally, the chapter presents an algorithm that targets a combination of the design objectives presented in [50]. We show that this algorithm is superior to the work in [50].

• Chapter 4 extends the constraints generation framework to allow for inductance. It uses the same systematic way of defining the problem and generating the constraints as in the RC case. However, because inductive elements introduce overshoots on the power grid, the extension to RLC grids is not at all straightforward. The chapter addresses the problem in the context of RLC grids, establishes new theoretical results, and develops algorithms that target key design quality metrics.

• Chapter 5 proposes a power scheduling framework for RC grids with power-gating transistors. This framework manages the trade-off between how many blocks are ON simultaneously and how big the power budgets of the individual blocks are. Subject to user guidance, we generate block-level current constraints, as well as an implicit binary decision diagram (BDD) that helps identify the safe working modes.

• Chapter 6 proposes an efficient power grid fixing scheme for EM-induced voltage failures based on the recent work in [19]. This scheme takes into account the random nature of electromigration and iteratively resizes the metal lines of the grid while respecting the design rules and electrical guidelines.

1.3 Organization

The thesis is organized as follows. Chapter 2 provides the background material, including a review of some mathematical definitions and properties, a detailed description of the power grid, its models, its verification techniques, the electromigration phenomenon, and common EM checking approaches. Chapters 3 through 6 present the various contributions highlighted above. Each chapter includes a fair amount of theoretical contribution, algorithms, applications, and discussion of results. In addition, each chapter includes a section that summarizes the specific notation and terminology used throughout that chapter. Finally, Chapter 7 provides some concluding remarks and suggestions for future work.

1.4 Notation and Terminology

In addition to the chapter-specific set of notation and terminology provided at the beginning of Chapters 3 through 6, the following is a list of mathematical notations and conventions used throughout the thesis:

• For any two vectors $x$ and $y$, the notation $x \le y$ (or $x < y$) will be used to denote that $x_i \le y_i$ (or $x_i < y_i$), $\forall i$, respectively.

• For any matrix $X$, the notation $X \ge 0$ (or $X > 0$) will be used to denote that $X_{ij} \ge 0$ (or $X_{ij} > 0$), $\forall i, j$, respectively.

• Whenever the product of a number of matrices $A_i$ by a vector $v$ is followed by the notation $|_i$, as in $A_1 A_2 \cdots A_k v|_i$, the expression shall denote the $i$th entry of the vector resulting from the product $A_1 A_2 \cdots A_k v$.

• The $n \times n$ identity matrix will be denoted as $I_n$. The simpler notation $I$ will be used whenever the dimension is clear from the context.

• The $n \times 1$ vector of all 1s will be denoted as $1_n$. The simpler notation $1$ will be used whenever the dimension is clear from the context.

• The $n \times m$ matrix of zeros will be denoted as $0_{n \times m}$. The simpler notation $0$ will be used whenever the dimension is clear from the context.

• The symbols $\mathbb{R}$ and $\mathbb{B}$ will be used to denote the set of all real numbers and the set of all binary values, respectively. Also, we will use the symbol $\mathbb{R}_+$ to denote the set of all non-negative real numbers. These symbols will also appear with an integer superscript $\nu$ to denote a higher-dimensional space, e.g. $\mathbb{R}^\nu$ represents the set of all $\nu \times 1$ vectors of real numbers and $\mathbb{B}^\nu$ represents the set of all $\nu \times 1$ vectors of binary values.

Chapter 2

Background

2.1 Mathematical Background

This section provides a brief review of a few mathematical definitions and properties of certain classes of matrices that are needed to follow the rest of the thesis.

2.1.1 Element-wise Operators

We introduce three element-wise operators, namely emax(·), emin(·), and eopt(·), that will help us capture the worst-case voltage drop on the power grid. These operators perform element-wise maximization/minimization on a vector-valued argument. In a sense, they extend the standard max(·) and min(·) operators that perform maximization/minimization on a scalar-valued argument.

Definition 2.1. (emax operator) Let $X$ be a bounded and closed subset of $\mathbb{R}^m$ and let $f(\cdot)$ be a vector-valued function $f(\cdot) : X \to \mathbb{R}^n$. If $z \in \mathbb{R}^n$ is an $n \times 1$ vector such that, for every $i \in \{1, \ldots, n\}$:

$$z_i = \max_{x \in X} [f_i(x)]$$

then we capture this with the shorthand notation:

$$z = \operatorname{emax}_{x \in X} [f(x)] \tag{2.1}$$

with the convention that $\operatorname{emax}_{x \in X} [f(x)] = 0$, if $X = \emptyset$.

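Consider the simple example of $X \subset \mathbb{R}^2$, as shown in Fig. 2.1, and $f(x) = x$. Using the above notation, we have:

$$\operatorname{emax}_{x \in X} [f(x)] = \operatorname{emax}_{x \in X} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \max_{x \in X} x_1 \\ \max_{x \in X} x_2 \end{bmatrix} \tag{2.2}$$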


Figure 2.1: Example of $X$ in $\mathbb{R}^2$.

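Definition 2.2. (emin operator) Let $X$ be a bounded and closed subset of $\mathbb{R}^m$ and let $f(\cdot)$ be a vector-valued function $f(\cdot) : X \to \mathbb{R}^n$. If $z \in \mathbb{R}^n$ is an $n \times 1$ vector such that, for every $i \in \{1, \ldots, n\}$:

$$z_i = \min_{x \in X} [f_i(x)]$$

then we capture this with the shorthand notation:

$$z = \operatorname{emin}_{x \in X} [f(x)] \tag{2.3}$$

with the convention that $\operatorname{emin}_{x \in X} [f(x)] = 0$, if $X = \emptyset$.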

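For the example in Fig. 2.1 and $f(x) = x$, we have

$$\operatorname{emin}_{x \in X} [f(x)] = \operatorname{emin}_{x \in X} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \min_{x \in X} x_1 \\ \min_{x \in X} x_2 \end{bmatrix} \tag{2.4}$$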

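Definition 2.3. (eopt operator) Let $X$ be a bounded and closed subset of $\mathbb{R}^m$ and let $f(\cdot)$ be a vector-valued function $f(\cdot) : X \to \mathbb{R}^n$. If $z \in \mathbb{R}^{2n}$ is a $2n \times 1$ vector such that, for every $i \in \{1, \ldots, n\}$:

$$z_i = \max_{x \in X} [f_i(x)]$$

$$z_{n+i} = \min_{x \in X} [f_i(x)]$$

then we capture this with the shorthand notation:

$$z = \operatorname{eopt}_{x \in X} [f(x)] \tag{2.5}$$

with the convention that $\operatorname{eopt}_{x \in X} [f(x)] = 0$, if $X = \emptyset$.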

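For the example in Fig. 2.1 and $f(x) = x$, we have

$$\operatorname{eopt}_{x \in X} [f(x)] = \operatorname{eopt}_{x \in X} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \max_{x \in X} x_1 \\ \max_{x \in X} x_2 \\ \min_{x \in X} x_1 \\ \min_{x \in X} x_2 \end{bmatrix} \tag{2.6}$$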

Of course, the examples in (2.2), (2.4), and (2.6) are simple and, thus, computing the result of the element-wise operation is straightforward. As we will see later, finding the worst-case voltage drop is much more complicated, and so each individual maximization/minimization will be formulated as a linear program (LP).
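The three operators can be illustrated with a minimal Python sketch. It assumes that $X$ is a finite list of sample points rather than a general closed and bounded set, so each entry is found by enumeration instead of by an LP; the function names and the sample points are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def emax(points: Sequence[Vector], f: Callable[[Vector], Vector], n: int) -> List[float]:
    """Element-wise maximum of f over a finite set; zero vector if the set is empty."""
    values = [f(x) for x in points]
    if not values:
        return [0.0] * n
    return [max(v[i] for v in values) for i in range(n)]


def emin(points: Sequence[Vector], f: Callable[[Vector], Vector], n: int) -> List[float]:
    """Element-wise minimum of f over a finite set; zero vector if the set is empty."""
    values = [f(x) for x in points]
    if not values:
        return [0.0] * n
    return [min(v[i] for v in values) for i in range(n)]


def eopt(points: Sequence[Vector], f: Callable[[Vector], Vector], n: int) -> List[float]:
    """Stack the element-wise maxima (first n entries) on the minima (last n entries)."""
    return emax(points, f, n) + emin(points, f, n)


# Hypothetical sample of X in R^2, with f(x) = x as in the examples above.
sample_points = [(0.0, 1.0), (2.0, 0.0), (1.0, 3.0)]
identity = lambda x: x

print(emax(sample_points, identity, 2))  # each entry is maximized independently
print(eopt(sample_points, identity, 2))
```

Note that the vector returned by `emax` need not be attained at any single point of the set, because every entry is optimized separately.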

2.1.2 Irreducible Matrices

A square matrix $A$ is said to be irreducible if there does not exist a reordering, applied identically to the rows and columns of $A$, that results in a block upper-triangular matrix. We now give a formal mathematical definition of irreducible matrices.

Definition 2.4. (Directed Graph) A directed graph $G$ is the combination of a set of vertices $V(G)$ and a set of ordered pairs of vertices from $V(G)$, called directed edges, $E(G)$. If $(v_i, v_j)$ is a directed edge of $G$, then it is said to have a direction from $v_i$ to $v_j$.

Any square $n \times n$ matrix $A$ can be used to generate a graph $G(A)$, defined as the directed graph on $n$ vertices $v_1, \ldots, v_n$ in which $(v_i, v_j) \in E(G(A))$ if and only if $a_{ij} \ne 0$, where $a_{ij}$ is the $(i, j)$th element of $A$. The graph $G(A)$ is referred to as the associated graph of $A$.

Definition 2.5. (Directed Path) A directed path in a graph $G$ is a sequence of vertices $v_0, v_1, \ldots, v_k$ where $(v_{i-1}, v_i) \in E(G)$ for all $i \in \{1, \ldots, k\}$. The vertex $v_k$ is said to be reachable from $v_0$, denoted as $v_0 \to v_k$.

Definition 2.6. (Strongly Connected) A directed graph is said to be strongly connected if, for every pair of vertices $u, v$, we have $u \to v$.

Lemma 2.1. ([11], page 30) A square matrix $A$ is irreducible if and only if its associated graph $G(A)$ is strongly connected.
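Lemma 2.1 suggests a direct computational test for irreducibility. The following Python sketch builds the associated graph from the non-zero pattern of the matrix and checks strong connectivity with two breadth-first searches, one on the graph and one on its reverse. The function names are hypothetical, and the $1 \times 1$ case is not treated specially.

```python
from collections import deque
from typing import List, Sequence, Set


def reachable(adjacency: List[List[int]], start: int) -> Set[int]:
    """Vertices reachable from start by directed paths (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def is_irreducible(a: Sequence[Sequence[float]]) -> bool:
    """True if the associated graph G(A) is strongly connected (Lemma 2.1)."""
    n = len(a)
    if n == 0:
        raise ValueError("matrix must be non-empty")
    # Edge (i, j) exists if and only if a[i][j] != 0.
    forward = [[j for j in range(n) if a[i][j] != 0] for i in range(n)]
    backward = [[j for j in range(n) if a[j][i] != 0] for i in range(n)]
    # Strongly connected: vertex 0 reaches every vertex, and every vertex reaches vertex 0.
    return len(reachable(forward, 0)) == n and len(reachable(backward, 0)) == n


print(is_irreducible([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))  # connected chain
print(is_irreducible([[1, 1], [0, 1]]))                        # already upper-triangular
```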

Definition 2.7. (Positive Definite) A real square matrix $A$ is said to be positive definite if $x^T A x > 0$, $\forall x \ne 0$.

Lemma 2.2. ([69], page 91) If a square matrix $A$ is a real, symmetric, non-singular, and irreducible matrix, where $a_{ij} \le 0$, $\forall i \ne j$, then $A^{-1} > 0$ if and only if $A$ is positive definite.

2.1.3 The Class of M-matrices

Definition 2.8. (M-matrix) A square matrix $A$ is called an M-matrix if:

$$a_{ij} \le 0, \ \forall i \ne j \quad \text{and} \quad \operatorname{Re}(\lambda_i) > 0, \ \forall i \tag{2.7}$$

where $a_{ij}$ is the $(i, j)$th element of $A$, $\lambda_i$ is an eigenvalue of $A$, and $\operatorname{Re}(\lambda_i)$ is the real part of $\lambda_i$.

Definition 2.9. (Principal Submatrix) Let $A$ be an $n \times n$ matrix. A $k \times k$ submatrix of $A$ formed by deleting $n - k$ rows of $A$, and the same $n - k$ columns of $A$, is called a principal submatrix of $A$.

Lemma 2.3. ([11], page 140) If $A$ is an M-matrix, then each principal submatrix of $A$ is also an M-matrix.

Lemma 2.4. ([11], page 134) If $A$ is an M-matrix, then $A^{-1}$ exists and its entries are non-negative, which we denote by $A^{-1} \ge 0$.

Definition 2.10. (Diagonally Dominant) A matrix $A$ is said to be diagonally dominant if $|a_{ii}| \ge \sum_{j \ne i} |a_{ij}|$, $\forall i$. A matrix $A$ is said to be strictly diagonally dominant if $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$, $\forall i$.

Definition 2.11. (Irreducibly Diagonally Dominant) A square matrix $A$ is said to be irreducibly diagonally dominant if it is irreducible, it is diagonally dominant, and there is an $i \in \{1, 2, \ldots, n\}$ for which $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$, i.e., it is strictly diagonally dominant in at least one row.

Lemma 2.5. ([69], page 91) If $A$ is irreducibly diagonally dominant with $a_{ii} > 0$, $\forall i$, and $a_{ij} \le 0$, $\forall i \ne j$, then $A$ is an M-matrix and its inverse has strictly positive entries, which we denote by $A^{-1} > 0$.
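Lemma 2.5 can be checked by hand on a small matrix. The Python sketch below tests the diagonal dominance conditions of Definition 2.11 and computes an exact inverse with rational arithmetic. The $3 \times 3$ matrix is a hypothetical example (a resistor chain tied to a supply at both ends); its irreducibility is taken for granted here because the chain is visibly connected.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def dominance(a: Sequence[Sequence[int]]) -> Tuple[bool, bool]:
    """Return (diagonally dominant in every row, strictly dominant in at least one row)."""
    weak, strict = True, False
    for i, row in enumerate(a):
        off_diagonal = sum(abs(x) for j, x in enumerate(row) if j != i)
        weak = weak and abs(row[i]) >= off_diagonal
        strict = strict or abs(row[i]) > off_diagonal
    return weak, strict


def inverse(a: Sequence[Sequence[int]]) -> List[List[Fraction]]:
    """Exact inverse by Gauss-Jordan elimination; raises StopIteration if a is singular."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical conductance-like matrix: positive diagonal, non-positive off-diagonal.
G_EXAMPLE = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]

print(dominance(G_EXAMPLE))
for row in inverse(G_EXAMPLE):
    print([str(x) for x in row])  # every entry is strictly positive
```

Even though the matrix itself is sparse, its inverse is dense and entry-wise positive, which is the property the lemma guarantees.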

2.2 The Power Grid

The power distribution network (PDN) of an integrated circuit is a distributed system that is used to deliver the appropriate supply voltages from pad locations to all on-chip logic cells. The PDN consists of several stages, starting at the voltage regulator module (VRM), through the motherboard and package, and finally the on-die portion of the PDN. A cross-sectional view of the PDN is shown in Fig. 2.2. The term power grid is usually used to refer to the on-die part of the PDN; it is a multi-layered metallic mesh that is used to deliver power from the external supply pins of a chip to the underlying logic circuitry. On every layer, modern grids consist of long interleaved lines that carry supply and ground, with vias connecting them to the layers above and below, and with C4 bumps connecting the grid to wiring in the package and on the board. Typically, some supply and ground lines are removed to make room for signal routing, which introduces non-uniformities to the grid structure. An example of a power grid is shown in Fig. 2.3.

The power grid, if not carefully designed, can easily put the performance and reliability of a chip at risk. Ideally, the voltage levels on the grid are expected to be uniform and equal to the supply voltage ($V_{dd}$). However, this value is usually disturbed due to the parasitics of the grid, circuit activity, coupling effects, electromigration [8], etc. Consequently, the voltage levels on the grid might vary; they may drop below $V_{dd}$ (undershoot) or rise above $V_{dd}$ (overshoot). This voltage variation can have serious effects on the logic and timing behavior of the integrated circuit, such as reducing the noise margins, slowing down the circuit, and causing soft errors [38, 7, 57, 9].

Figure 2.2: A cross-sectional view of a power distribution system of a high performance integrated circuit [68]

One must ensure that the power grid can supply well-regulated voltages throughout the target lifetime of the chip while distributing large currents to millions (or billions) of logic cells. The difficulty in designing a power grid lies in the fact that there are many unknowns during the design process, until the very end of the design cycle. The design of the power grid is thus a progressive procedure with repeated refinements [68]. Typically, it begins before the physical design of the logic blocks, once power estimation for macros and top-level blocks is complete, and continues until tape-out and post-silicon validation. The initial candidate design of the power grid is often laid out as a uniform structure. At this stage, very little information is known about the power requirements of the logic circuitry. These requirements are typically estimated by scaling the power consumption of a chip from a previous technology node, considering the target die area, operating frequency, power supply voltage, and other characteristics. As the circuit design becomes better specified, a more accurate characterization of the power requirements is possible and, consequently, the design of the power grid becomes more precise. As such, designers have to gradually update the power grid design as more information becomes available, until the final design is ready for sign-off.

Power grid analysis is a necessity for modern chip designs. In fact, power grid analysis is required at every stage of the design process, as a way to guide the design flow of the grid and to verify whether the design meets the target specifications. The primary goal of the analysis and its complexity vary at different phases of the design process. Generally speaking, at early design stages, the design of the grid is abstracted and very little information is known about the logic circuitry. Thus, the main challenge is to guide the design flow based on information of limited accuracy. At later stages, the design of the grid becomes more detailed, which dramatically increases the complexity of the analysis. At this point, a significant portion of the chip circuitry has been specified, and so it is very difficult and costly to modify the grid design. Thus, the main purpose of the analysis is to discover locations on the die that violate the target specifications and to fix those violations by making minimal revisions to the grid design.

Figure 2.3: A power grid

In this thesis, we focus on two major problems that impact the quality of the power grid design: 1) voltage variations resulting from instantaneous transients due to the activity of the logic circuitry, and 2) voltage variations resulting from the aging and deformation of the grid metal lines over time, a phenomenon known as Electromigration (EM). Analyzing the grid for the former is referred to as power grid verification, whereas analyzing the grid for the latter is referred to as electromigration checking.

2.3 Power Grid Parasitics

The various parasitics of power grids have an effect on circuit behavior and consequently on the voltage variation. The main parasitic effects are resistive, capacitive, and inductive.

Resistive parasitics are due to the resistance of the metal lines of the power grid. In modern microprocessors, the number of metal layers has substantially increased due to the large volume of interconnect required for grid routing. As a result, the voltage drop induced by the metal resistance becomes significant. This voltage drop, referred to as IR drop, is a major component of the total voltage drop on the grid [56, 62], and has increased from one technology generation to the next, as metal line widths have decreased.

Capacitive effects refer to the parasitic capacitance between metal wires, as well as MOSFET capacitance and on-chip decoupling capacitance. The capacitance on power grids can be classified as either implicit or explicit. Implicit capacitance arises from the proximity of metal lines to one another, the intrinsic capacitance of non-switching devices, and the capacitance between N-well and substrate [56]. Alone, implicit capacitance is not enough for dealing with supply voltage noise. Decoupling capacitors, commonly referred to as decaps, behave as on-chip low-pass filters that reduce noise and counter the impact of fast switching currents, keeping the supply voltage drop within a safe margin. A common practice is to insert explicit decaps at strategic locations by filling on-die white spaces, i.e., empty areas in the chip floorplan, with devices whose gate oxide capacitance provides a decoupling effect. Any power grid model must account for effects arising from both types of capacitance.

Inductive effects are mainly due to the inductance of the package leads, resulting in power noise at the pad locations. This inductive noise is becoming a significant component of the total power supply noise [44, 51, 64], because of the high currents and frequencies of today's chips, creating $L\,di/dt$ noise. Due to inductive parasitics, the power grid might experience voltage overshoots, which can result in gate oxide breakdown and timing issues. Furthermore, the resistive, capacitive, and inductive parasitics form a complex RLC circuit which has a specific resonance frequency. If the chip operating frequency is close to this resonance frequency, the grid might experience large voltage variations that can be problematic. Therefore, inductive effects on the power grid must be included when verifying circuits operating at high frequencies.

2.4 Power Grid Models

Accurate power grid analysis relies heavily on the model describing its parasitics. A power grid is usually modelled as a linear circuit composed of a large number of lumped linear elements (resistive, capacitive, inductive). At its metal-1 and metal-2 terminals, the grid is loaded by the circuit blocks, where nonlinearities are encountered due to the circuit MOSFETs. It is practically impossible to jointly simulate or analyze both the full nonlinear circuit and the large grid all at once, and common practice is to decouple the two. This typically means that the circuit blocks are represented by some suitable model, consisting of a current source along with some parasitic network to ground. However, these parasitics are often neglected because of the larger impact that uncertainty of currents has on the grid response, and so the circuit current sources are often assumed ideal. This is what will be assumed in this thesis.

That said, there are three main electrical models of the power grid that are commonly used in the literature. Each model has a different level of complexity and serves a different role in power grid analysis: 1) a DC model that includes only resistive parasitics, 2) an RC model that includes resistive and capacitive parasitics, and 3) an RLC model that includes resistive, inductive, and capacitive parasitics. The DC model is often used to check the reliability of the grid against certain failure mechanisms, such as Electromigration, that happen over long time periods (several years); current transients over short time scales do not play a significant role there, and only cumulative long-term effects are significant. The RC and RLC models are generally used for power grid verification because the instantaneous voltage on the grid is more sensitive to the switching activities of the underlying logic components. Choosing one model over another is usually based on the purpose of the verification, the level of abstraction of the grid, the significance of inductive effects at the target operating frequency, etc.

As we will see later, in Chapters 3-6, the largest power grids that we ran our algorithms on consist of a few million nodes. The power grids in modern chip designs consist of around 1 billion nodes. However, a significant part of these grids is usually abstracted away to improve the runtime efficiency of the power grid analysis at the cost of accuracy. In fact, the level of detail at which the grid parasitics are extracted is typically based on the purpose of the analysis and how much error the analysis can tolerate. Furthermore, there are many techniques, such as in [75], that reduce the size of the grid at the cost of accuracy. The authors in [75] show that the size of a power grid can be reduced by 40%-80% at the cost of roughly 2 mV error when computing the voltage variations on the grid.

2.4.1 RC Model

Consider an RC model of the power grid in which there are three types of nodes:

Type 1: nodes that are connected to ideal current sources to ground, in parallel with capacitors to ground.

Type 2: nodes that are connected to resistors to other grid nodes and capacitors to ground.

Type 3: nodes that are connected to resistors to other grid nodes and ideal voltage sources to ground.

Any connected interconnection of R and C elements is allowed, provided every node is one of the above three types. A simple RC power grid is shown in Fig. 2.4, in which nodes 1 and 2 are of type 1, nodes 3, 4, 5 and 6 are of type 2, and nodes 7 and 8 are of type 3.

The ideal current sources (with their parallel capacitors) represent the currents drawn by the logic circuits tied to the grid at these nodes, and the ideal voltage sources represent the connection to the external voltage supply $V_{dd}$. It is important to note that this model assumes that every capacitor is connected to ground and that every node that is not a $V_{dd}$ node is connected to a capacitor. Models that include node-to-node capacitors are harder to analyze, and are beyond the scope of this thesis.

Excluding the ground node, let the power grid consist of $n + p$ nodes, where nodes $1, 2, \ldots, n$ are the nodes not connected to a voltage source, and the remaining nodes $(n+1), (n+2), \ldots, (n+p)$ are the nodes where the $p$ voltage sources are connected. Furthermore, let $m$ be the number of current sources connected to the grid, whose positive (reference) direction of current is from node to ground, assumed to be connected at nodes $1, 2, \ldots, m \le n$, and let $i(t) \ge 0$ be the $m \times 1$ vector of all source currents. Also, let $H$ be an $n \times m$ matrix of 0 and 1 entries that identifies (with a 1) which node is connected to which current source, i.e. $H_{ij} = 1$ if node $i$ is connected to current source $j$, and $H_{ij} = 0$, otherwise. Thus, $H$ can be expressed as follows:

$$H = \begin{bmatrix} I_m \\ 0_{(n-m) \times m} \end{bmatrix} \tag{2.8}$$

The nodes in the power grid of Fig. 2.4 are numbered according to the above convention, so that $m = 2$, $n = 6$, and $p = 2$.

Figure 2.4: Simple RC grid

Let $\vartheta_k(t)$ be the $k$th node voltage waveform, relative to ground, and let $\vartheta(t)$ be the vector of all $\vartheta_k(t)$ signals. By superposition, $\vartheta(t)$ may be found in three steps: 1) open-circuit all the current sources and find the response, which would be $\vartheta^{(1)}(t) = v_{dd}$, 2) short-circuit all the voltage sources and find the response $\vartheta^{(2)}(t)$, and 3) find $\vartheta(t) = \vartheta^{(1)}(t) + \vartheta^{(2)}(t)$. To find $\vartheta^{(2)}(t)$, Kirchhoff's Current Law (KCL) at every node $k \in \{1, \ldots, n\}$ provides the following:

$$G\vartheta^{(2)}(t) + C\dot{\vartheta}^{(2)}(t) = -Hi(t) \tag{2.9}$$

where $G$ and $C$ are $n \times n$ matrices resulting from the application of the traditional nodal analysis (NA) formulation [52], and $v_{dd}$ is the $n \times 1$ vector whose entries are all equal to the supply value $V_{dd}$. $C$ is the $n \times n$ diagonal non-negative capacitance matrix, which is non-singular because every node is attached to a capacitor; $G$ is the $n \times n$ conductance matrix, which is known to be symmetric and diagonally dominant with positive diagonal entries and non-positive off-diagonal entries. Under the standard assumption that the grid is connected (so that $G$ is irreducible) and has at least one voltage source (so that $G$ is strictly diagonally dominant in at least one row), $G$ is irreducibly diagonally dominant and, by Lemma 2.5, $G$ is an M-matrix with $G^{-1} > 0$.

For power grid verification in the RC case, we are mostly interested in the voltage drop experienced at grid nodes. Let $v(t)$ be the $n \times 1$ vector of all voltage drops, so that $v(t) = v_{dd} - \vartheta(t) = v_{dd} - \vartheta^{(1)}(t) - \vartheta^{(2)}(t) = -\vartheta^{(2)}(t)$, and

$$G(-v(t)) + C(-\dot{v}(t)) = -Hi(t) \tag{2.10}$$

or equivalently,

$$Gv(t) + C\dot{v}(t) = Hi(t) \tag{2.11}$$

One can use the revised system equation (2.11) to directly solve for the voltage drop values. Notice that the system equation (2.11) is equivalent to (2.9), with all the voltage sources set to zero (short circuit) and all the current source directions reversed.

The system in (2.11) is a first-order ordinary differential equation (ODE) that describes the dynamics of the power grid given a time-varying input current waveform $i(t)$, $\forall t$. Based on linear systems theory [20], at any time $t \ge 0$, the solution of (2.11) is

$$v(t) = e^{-C^{-1}Gt}\,v(0) + \int_0^t e^{-C^{-1}G(t-\tau)}\,C^{-1}Hi(\tau)\,d\tau \tag{2.12}$$

where $e^X = \sum_{k=0}^{\infty} \frac{1}{k!} X^k$ is the matrix exponential of a square matrix $X$. Of course, it is prohibitively expensive to compute (2.12). The above expression is only mentioned for theoretical completeness. In fact, solving (2.11) is usually done by discretizing time using a finite-difference approximation of the differential term $\dot{v}(t)$, as we will discuss next.
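As a sanity check on (2.12), consider the scalar case, which is an illustrative assumption and not an example taken from the thesis: a single node ($n = m = 1$) with conductance $g$ to the supply, capacitance $c$ to ground, $H = 1$, zero initial drop $v(0) = 0$, and a constant current $i(\tau) = I_0$. The expression then reduces to the familiar RC charging curve:

```latex
\begin{align*}
v(t) &= e^{-(g/c)t}\, v(0) + \int_0^t e^{-(g/c)(t-\tau)}\, \frac{I_0}{c}\, d\tau \\
     &= \frac{I_0}{c} \cdot \frac{c}{g} \left( 1 - e^{-(g/c)t} \right) \\
     &= \frac{I_0}{g} \left( 1 - e^{-(g/c)t} \right).
\end{align*}
```

The drop rises monotonically with time constant $c/g$ and settles at $I_0/g$, which is the value obtained from (2.11) when the derivative term is set to zero.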

Time-discretization

Using a backward Euler (BE) numerical scheme [42], the derivative is approximated, for small $\Delta t > 0$, as

$$\dot{v}(t) \approx \frac{v(t) - v(t - \Delta t)}{\Delta t} \tag{2.13}$$

Substituting (2.13) in (2.11), we can write a discrete-time version of the system in (2.11), for small $\Delta t > 0$, as

$$Gv(t) + C\,\frac{v(t) - v(t - \Delta t)}{\Delta t} \approx Hi(t) \tag{2.14}$$

or equivalently,

$$\left(G + \frac{C}{\Delta t}\right) v(t) \approx \frac{C}{\Delta t}\,v(t - \Delta t) + Hi(t) \tag{2.15}$$

Assuming that the time-step $\Delta t$ is chosen such that (2.15) is accurate enough irrespective of the current stimulus $i(t)$, (2.15) leads to the following recurrence relation that captures the evolution of the system over time

$$Av(t) = Bv(t - \Delta t) + Hi(t) \tag{2.16}$$

where $A = G + \frac{C}{\Delta t}$ and $B = \frac{C}{\Delta t}$. Because $G$ satisfies the conditions of Lemma 2.5, and $B$ is a diagonal matrix with positive diagonal entries, it is clear that $A = G + B$ also satisfies the same conditions, so that $A$ is an M-matrix with $A^{-1} > 0$. Let $M = A^{-1} > 0$ and define the $n \times m$ matrix $M' = MH > 0$. Furthermore, because $B = A - G$, then $I_n + G^{-1}B = I_n + G^{-1}(A - G) = G^{-1}A$. But $I_n \ge 0$, $G^{-1} > 0$, and $B \ge 0$ with $b_{ii} > 0$, $\forall i$, so that:

$$G^{-1}A > 0 \tag{2.17}$$

The fact that every entry of the matrices $G^{-1}$, $A^{-1}$, and $G^{-1}A$ is strictly positive will prove to be crucial later on.
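The recurrence (2.16) can be exercised on a toy grid. The following Python sketch is only an illustration under stated assumptions: the two-node grid, its element values (in siemens, farads, amperes, and seconds), and the dense Gaussian-elimination helper `solve` are all hypothetical, and a practical implementation would factor the sparse matrix $A$ once instead of re-solving at every step.

```python
from typing import List, Sequence


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting (dense, small systems)."""
    n = len(b)
    m = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def backward_euler_rc(g, c_diag, h, currents, dt, v0):
    """Iterate A v(t) = B v(t - dt) + H i(t), with A = G + C/dt and B = C/dt."""
    n = len(g)
    a = [[g[r][k] + (c_diag[r] / dt if r == k else 0.0) for k in range(n)] for r in range(n)]
    v = list(v0)
    trace = []
    for i_t in currents:
        rhs = [c_diag[r] / dt * v[r] + sum(h[r][j] * i_t[j] for j in range(len(i_t)))
               for r in range(n)]
        v = solve(a, rhs)
        trace.append(v)
    return trace


# Hypothetical grid: node 1 carries the current source, node 2 is tied to the supply.
G = [[1.0, -1.0], [-1.0, 2.0]]
C_DIAG = [1.0, 1.0]
H = [[1.0], [0.0]]

trace = backward_euler_rc(G, C_DIAG, H, [[0.1]] * 500, dt=0.1, v0=[0.0, 0.0])
print(trace[-1])  # approaches the DC solution of G v = H i
```

With a constant current the trace settles at the DC solution, and every computed drop stays non-negative, as expected from $A^{-1} > 0$, $B \ge 0$, and $i(t) \ge 0$.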

Voltage Safety Condition

We assume that a certain number of grid nodes $d \le n$ (the nodes of interest) are required to satisfy certain user-provided voltage drop threshold specifications, captured in the $d \times 1$ vector $V_{th}$. The nodes of interest are typically the bottom-most nodes of the grid where the chip circuitry is connected, i.e., to the current sources shown in the figures. Clearly, these are the only nodes whose voltage drop matters because they directly affect circuit operation. To avoid trivial cases, we will assume that $V_{th} > 0$. Let $P$ be a $d \times n$ matrix consisting of 0 and 1 elements only, specifying (with a 1 entry) the nodes that are subject to a voltage drop threshold specification. Note that $P \ge 0$ and has exactly one 1 entry in every row, otherwise 0s, and that no column of $P$ has more than a single 1 entry.

With this, the grid is said to be safe from voltage drop violations if:

$$Pv(t) \le V_{th}, \quad \forall t \tag{2.18}$$
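For a simulated trace, condition (2.18) can only be checked at the sampled time points. A minimal Python sketch of that check follows; the function names, the selection matrix, the threshold, and the traces are hypothetical values chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def select(p: Sequence[Sequence[int]], v: Sequence[float]) -> List[float]:
    """Compute P v: the voltage drops at the nodes of interest."""
    return [sum(p_kj * v_j for p_kj, v_j in zip(row, v)) for row in p]


def first_violation(p, trace, v_th) -> Optional[Tuple[int, int]]:
    """Return (time step, row of P) of the first threshold violation, or None if safe."""
    for step, v in enumerate(trace):
        for row, (drop, limit) in enumerate(zip(select(p, v), v_th)):
            if drop > limit:
                return step, row
    return None


def is_safe(p, trace, v_th) -> bool:
    """True if P v(t) <= V_th holds at every sampled time point of the trace."""
    return first_violation(p, trace, v_th) is None


# Hypothetical setup: n = 3 nodes, d = 1 node of interest (node 1), 50 mV threshold.
P = [[1, 0, 0]]
V_TH = [0.05]
safe_trace = [[0.01, 0.20, 0.00], [0.04, 0.30, 0.00]]
unsafe_trace = [[0.01, 0.20, 0.00], [0.06, 0.30, 0.00]]

print(is_safe(P, safe_trace, V_TH))             # node 2 is not a node of interest
print(first_violation(P, unsafe_trace, V_TH))
```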

2.4.2 DC Model

The DC model of the power grid is very similar to the RC model except that only grid resistance is relevant, and the voltage drop may be obtained by setting to zero the time-derivative in (2.11), i.e. $\dot{v}(t) = 0$, $\forall t$, which leads to:

$$Gv(t) = Hi(t) \tag{2.19}$$

As in the RC case, the grid is safe if $Pv(t) \le V_{th}$ is satisfied at every time point $t$.

2.4.3 RLC Model

An RLC power grid is very similar to the RC model except for the extra inductors added in series with the voltage sources. Consider an RLC model of the power grid in which there are three types of nodes:

Type 1: nodes that are connected to ideal current sources to ground, in parallel with capacitors to ground.

Type 2: nodes that are connected to resistors or inductors to other grid nodes and capacitors to ground.

Type 3: nodes that are connected to resistors or inductors to other grid nodes and ideal voltage sources to ground.

Any connected interconnection of R, L, and C elements is allowed, provided every node is one of the above three types. A simple RLC power grid is shown in Fig. 2.5, in which nodes 1 and 2 are of type 1, nodes 3-8 are of type 2, and nodes 9 and 10 are of type 3.

Figure 2.5: Simple RLC grid

As in the RC model, node-to-node capacitances are ignored and all capacitances are assumed to be connected to ground. Furthermore, mutual inductances are also ignored, i.e. only self inductance is considered. The current sources (with their parallel capacitors) represent the currents drawn by the logic circuits tied to the grid at these nodes. The ideal voltage sources represent the external voltage supply, $V_{dd}$.

Excluding the ground node, let the power grid consist of $n_v + p$ nodes, where nodes $1, \ldots, n_v$ are the nodes not connected to a voltage source, while the remaining nodes $(n_v + 1), (n_v + 2), \ldots, (n_v + p)$ are the nodes where the $p$ voltage sources are connected. Let $m$ be the number of current sources connected to the grid, whose positive (reference) direction of current is from node to ground, assumed to be connected at nodes $1, 2, \ldots, m \le n_v$, and let $i(t) \ge 0$ be the $m \times 1$ vector of all source currents. Also, let $H$ be an $n_v \times m$ matrix of 0 and 1 entries that identifies (with a 1) which node is connected to which current source, i.e. $H_{ij} = 1$ if node $i$ is connected to current source $j$, and $H_{ij} = 0$, otherwise. Thus, $H$ can be expressed as follows:

$$H = \begin{bmatrix} I_m \\ 0_{(n_v - m) \times m} \end{bmatrix} \tag{2.20}$$

Finally, let $n_l$ be the number of inductors in the grid.

Let $\vartheta(t)$ be the $n_v \times 1$ vector of node voltages, relative to ground. By superposition, $\vartheta(t)$ may be found in three steps: 1) open-circuit all the current sources and find the response, which would be $\vartheta^{(1)}(t) = V_{dd}$, 2) short-circuit all the voltage sources and find the response $\vartheta^{(2)}(t)$, and 3) find $\vartheta(t) = \vartheta^{(1)}(t) + \vartheta^{(2)}(t)$. To find $\vartheta^{(2)}(t)$, KCL at every node $k \in \{1, \ldots, n_v\}$ provides:

$$G\vartheta^{(2)}(t) + C\dot{\vartheta}^{(2)}(t) + Mi_l(t) + Hi(t) = 0 \tag{2.21}$$

where $i_l(t)$ is the $n_l \times 1$ vector of inductor branch currents; $G$ is the $n_v \times n_v$ conductance matrix, which is a sparse matrix with positive diagonal entries and non-positive off-diagonal entries; $C$ is an $n_v \times n_v$ non-singular diagonal matrix of the node capacitances; and $M$ is an $n_v \times n_l$ incidence matrix consisting of $\pm 1$ or 0 elements only, according to:

$$M_{jk} = \begin{cases} 1, & \text{if node } j \text{ is connected to inductor } k \text{ and the reference current direction in } k \text{ is away from } j; \\ -1, & \text{if node } j \text{ is connected to inductor } k \text{ and the reference current direction in } k \text{ is towards } j; \\ 0, & \text{otherwise.} \end{cases} \tag{2.22}$$

As in the RC case, $G$ is known to be symmetric and diagonally dominant with positive diagonal entries and non-positive off-diagonal entries. The only difference is that, due to the existence of inductors, the resistive mesh may become disconnected, i.e. $G$ may no longer be irreducible, and so $G$ may become singular. If the graph consisting of all grid nodes $1, 2, \ldots, n_v$ and all grid resistances between these nodes is a connected graph, then $G$ is irreducible. If this graph is not connected, then there is an easy and practical fix that maintains this useful property of $G$, which is to attach a large resistance in parallel with every inductor. These large resistors have a negligible effect on the circuit solution, but they make $G$ irreducible, under the standard assumption that the full RLC grid is connected. Considering the voltage drop $v(t) = V_{dd} - \vartheta(t) = -\vartheta^{(2)}(t)$, (2.21) becomes:

$$Gv(t) + C\dot{v}(t) - Mi_l(t) = Hi(t) \tag{2.23}$$

which is a system of $n_v$ equations in $n_v + n_l$ unknowns. Using the relationship between the inductor branch currents and the inductor voltages, we have the familiar inductor branch equation $M^T\vartheta^{(2)}(t) = L\,\dot{i}_l(t)$, from which:

$$M^T v(t) + L\,\dot{i}_l(t) = 0 \tag{2.24}$$

which is a system of $n_l$ equations in $n_v + n_l$ unknowns, in which $L$ is an $n_l \times n_l$ non-singular diagonal matrix consisting of the inductance values of the $n_l$ inductors in the circuit. The dynamics of the power grid are governed by the combined set of equations (2.23) and (2.24), which is a complete system of $n_v + n_l$ equations in $n_v + n_l$ unknowns.

Time-discretization

Backward Euler discretization [42], applied to this system, leads to:

$$Av(t) - Mi_l(t) = Bv(t - \Delta t) + Hi(t) \tag{2.25}$$

$$M^T v(t) + Ei_l(t) = Ei_l(t - \Delta t) \tag{2.26}$$

where $B = C/\Delta t$, $E = L/\Delta t$, and $A = G + B$. Multiplying (2.26) by $E^{-1}$ gives an expression for $i_l(t)$:

$$i_l(t) = -E^{-1}M^T v(t) + i_l(t - \Delta t) \tag{2.27}$$

and substituting that in (2.25) gives:

$$Dv(t) = Bv(t - \Delta t) + Mi_l(t - \Delta t) + Hi(t) \tag{2.28}$$

where:

$$D = G + \frac{C}{\Delta t} + \Delta t\,ML^{-1}M^T \tag{2.29}$$

It can be shown that $D$ is a symmetric positive-definite M-matrix [29], so that $D$ is non-singular [69]. Furthermore, notice that the matrix $D$ is real, because $G$, $C$, and $L$ are real matrices, and $D$ is irreducible due to Lemma B.14 (in the appendix), so that $D^{-1} > 0$, due to Lemma 2.2. Multiplying (2.28) by $D^{-1}$ gives:

$$v(t) = D^{-1}Bv(t - \Delta t) + D^{-1}Mi_l(t - \Delta t) + D^{-1}Hi(t) \tag{2.30}$$

then substituting this for $v(t)$ in (2.27), we get:

$$i_l(t) = -E^{-1}M^TD^{-1}Bv(t - \Delta t) + (I_{n_l} - E^{-1}M^TD^{-1}M)\,i_l(t - \Delta t) - E^{-1}M^TD^{-1}Hi(t) \tag{2.31}$$

Combining (2.30) and (2.31) gives the following recurrence relation:

$$x(t) = Fx(t - \Delta t) + Ri(t) \tag{2.32}$$

where

$$x(t) = \begin{bmatrix} v(t) \\ i_l(t) \end{bmatrix}, \qquad R = \begin{bmatrix} D^{-1}H \\ -E^{-1}M^TD^{-1}H \end{bmatrix} \tag{2.33}$$

$$F = \begin{bmatrix} D^{-1}B & D^{-1}M \\ -E^{-1}M^TD^{-1}B & (I_{n_l} - E^{-1}M^TD^{-1}M) \end{bmatrix} \tag{2.34}$$

Let $n = n_v + n_l$ so that $x(t)$ is $n \times 1$, $F$ is $n \times n$, and $R$ is $n \times m$.
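In practice the recurrence need not be formed through $F$ and $R$ explicitly: one step can be taken by solving (2.28) for $v(t)$ and then updating $i_l(t)$ with (2.27). The Python sketch below does this for a hypothetical two-node grid with a single inductor between node 2 and the supply. All element values, the function names, and the dense `solve` helper are illustrative assumptions, not part of the thesis.

```python
from typing import List, Sequence, Tuple


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting (dense, small systems)."""
    n = len(b)
    m = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def rlc_step(g, cap, ind, inc, h, dt, v_prev, il_prev, i_t) -> Tuple[List[float], List[float]]:
    """One backward Euler step: solve (2.28) for v(t), then apply (2.27) for i_l(t)."""
    nv, nl = len(g), len(ind)
    # D = G + C/dt + dt * M L^{-1} M^T, with diagonal C (cap) and diagonal L (ind).
    d = [[g[r][s] + (cap[r] / dt if r == s else 0.0)
          + dt * sum(inc[r][k] * inc[s][k] / ind[k] for k in range(nl))
          for s in range(nv)] for r in range(nv)]
    rhs = [cap[r] / dt * v_prev[r]
           + sum(inc[r][k] * il_prev[k] for k in range(nl))
           + sum(h[r][j] * i_t[j] for j in range(len(i_t)))
           for r in range(nv)]
    v = solve(d, rhs)
    # i_l(t) = i_l(t - dt) - E^{-1} M^T v(t), where E^{-1} = dt * L^{-1}.
    il = [il_prev[k] - dt / ind[k] * sum(inc[r][k] * v[r] for r in range(nv))
          for k in range(nl)]
    return v, il


# Hypothetical grid: current source at node 1, 1-ohm resistor between nodes 1 and 2,
# and one inductor from node 2 to the supply (reference direction away from node 2).
G = [[1.0, -1.0], [-1.0, 1.0]]
CAP = [1.0, 1.0]
IND = [1.0]
INC = [[0.0], [1.0]]
H = [[1.0], [0.0]]

v, il = [0.0, 0.0], [0.0]
for _ in range(3000):
    v, il = rlc_step(G, CAP, IND, INC, H, 0.1, v, il, [0.5])
print(v, il)  # settles where the inductor carries the full load current
```

In steady state the inductor behaves as a short circuit, so the drop at node 2 vanishes and the drop at node 1 equals the load current times the resistance between the two nodes. Note that $G$ alone is singular in this example, while $D$ is not.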

Voltage Safety Condition

A certain number of grid nodes $d \le n_v$ (the nodes of interest) are required to satisfy certain user-provided voltage drop threshold specifications. Contrary to the RC case, where only positive voltage drops (undershoots) are encountered, the nodes of an RLC grid may experience both positive (undershoots) and negative (overshoots) voltage drops. The voltage undershoot safety specifications and the voltage overshoot safety specifications are captured in the $d \times 1$ vectors $x_{ub} \ge 0$ and $x_{lb} \le 0$, respectively. Let $\Pi$ be a $d \times n$ matrix consisting of 0 and 1 elements only, specifying (with a 1 entry) the nodes that are subject to a voltage drop threshold specification. Note that $\Pi \ge 0$ and has exactly one 1 entry in every row, otherwise 0s, and that no column of $\Pi$ has more than a single 1 entry.

With this, the grid is safe if:

$$x_{lb} \le \Pi x(t) \le x_{ub}, \quad \forall t \tag{2.35}$$

2.5 Power Grid Verification

As mentioned previously, power grid verification consists of checking whether the voltage variations resulting from the current transients of the logic circuitry are within certain limits. To verify the voltage integrity of the power grid, one should check that the supply voltage is within an acceptable range under all possible current traces that result from the switching activity of the logic circuitry. Of course, exhaustive simulation of the grid is infeasible.

Given a specific current waveform (or a set of waveforms), the straightforward approach would be to compute the voltage drop waveform corresponding to each of the current waveforms using (2.16) for the RC case or (2.32) for the RLC case. The grid would be marked as safe if these voltage drop waveforms are within an acceptable range. The main problem with this approach is that one needs to specify the current traces that accurately represent the switching activity of the logic blocks; such current traces cannot be specified (with confidence) unless the logic circuitry has already been specified and, so, these methods are unsuitable for early grid verification. Several approaches have been proposed in the literature for verifying the grid using current waveforms. These methods build on this straightforward approach by either reducing its computational complexity or proposing ways to arrive at a representative set of current waveforms. These methods are commonly used in industrial tools and are referred to as vector-based methods.

Notably, vector-based methods require the user to obtain these current waveforms, which can only be done at a late stage of the design process when complete knowledge of the underlying logic circuitry is available. Vectorless methods, or pattern-independent methods, were developed as an alternative to vector-based methods. They assume that only incomplete information about the circuit currents is available, the type of information which may be easy to get early on in the design process. This incomplete information is usually in the form of constraints on the currents drawn by the logic circuitry, which represent some sort of imprecise power budgets of the logic circuitry. Vectorless methods use these current constraints to find conservative bounds on the voltage drops. The grid is then marked as safe if these bounds are within an acceptable range.
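The simplest instance of this idea can be sketched in Python for the DC model. The sketch assumes that the only constraints are per-source upper bounds $0 \le i \le i_{max}$ (a box), which is a simplification made here for illustration; more general constraints call for one LP per node, as noted in Section 2.1.1. Because $G^{-1} \ge 0$ and $H \ge 0$, every entry of $G^{-1}Hi$ is then largest at $i = i_{max}$, and the sketch confirms this by enumerating the corners of the box. The matrices and bounds are hypothetical, and $G^{-1}$ is supplied directly rather than computed.

```python
from itertools import product
from typing import List, Sequence


def matmul(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> List[List[float]]:
    """Dense matrix product a b."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def matvec(a: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Dense matrix-vector product a x."""
    return [sum(p * q for p, q in zip(row, x)) for row in a]


def worst_case_drop(g_inv, h, i_max) -> List[float]:
    """Element-wise maximum of G^{-1} H i over the box 0 <= i <= i_max (closed form)."""
    return matvec(matmul(g_inv, h), i_max)


def brute_force_emax(g_inv, h, i_max) -> List[float]:
    """Same bound by enumerating the box corners; a linear function peaks at a corner."""
    gh = matmul(g_inv, h)
    drops = [matvec(gh, corner) for corner in product(*[(0.0, hi) for hi in i_max])]
    return [max(d[r] for d in drops) for r in range(len(gh))]


# Hypothetical two-node grid with G = [[1, -1], [-1, 2]], so that G^{-1} = [[2, 1], [1, 1]],
# and a current source at each node (H is the 2 x 2 identity).
G_INV = [[2.0, 1.0], [1.0, 1.0]]
H = [[1.0, 0.0], [0.0, 1.0]]
I_MAX = [0.10, 0.05]

bound = worst_case_drop(G_INV, H, I_MAX)
print(bound)
print(brute_force_emax(G_INV, H, I_MAX))
```

The grid would then be declared safe if the bound at every node of interest stays below its threshold, without ever specifying a particular current waveform.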

2.5.1 Vector-Based Methods

Vector-based power grid verification consists of simulating the power grid under a sequence of input current vectors and computing the corresponding voltage drop. Naturally, it is difficult to capture all possible current traces using a small number of current vectors. Thus, the simulation process must be done efficiently to be able to verify as many vectors as possible in a timely manner. General circuit simulators like SPICE are not very useful for grid verification, as these simulators have significant complexity overhead. The power grids of modern chips consist of over a billion nodes and, so, require more efficient methods to simulate. On the other hand, the logic circuitry is very complex and, consequently, it is impossible to completely capture the activity of this circuitry in a finite number of current vectors. Vector-based methods simulate the grid under a large set of current vectors, which only provides a certain level of confidence in the safety of the grid. The major limitation of these methods is that they require current vectors, which are only available after the logic circuitry has been finalized, and, so, are not suitable for early power grid verification.

Multigrid Approach

For a given set of current waveforms, the authors in [41] propose a multigrid-like approach, inspired by the Algebraic Multigrid (AMG) and the Standard Multigrid (SMG) techniques. The proposed method is an iterative grid reduction algorithm that produces a significantly reduced coarser grid. The objective of the grid reduction algorithm is to selectively remove as many nodes as possible while maintaining the ability to estimate voltages at the removed nodes. Generally speaking, grid reduction algorithms, such as those based on SMG techniques, skip every other wire of the grid. However, typical power grids may be irregular, i.e., different edges may have different lengths and different separation distances. Thus, the reduction algorithm should present a systematic mechanism for reducing any general grid. After coarsening the grid, the proposed technique solves the reduced system and maps the solution back to the original system. The voltage at each removed node on the finer grid is interpolated from the voltages of the kept nodes that are strongly connected to it. Simulation results show that the proposed technique achieves up to 16X-20X speedups for static analysis and up to 600X for dynamic analysis.

Hierarchical Analysis

The authors in [77] proposed a method that exploits the hierarchical structure of the power grid. This method partitions the grid into small local grids that are connected to a global grid via port nodes at the interface. To model the interaction between different local grids, the authors employ the notion of macromodels, essentially models that abstract away the behavior of the sub-grids at their ports. Solving the original grid boils down to simulating the reduced global grid and, finally, computing the voltages at the nodes of each local partition. Simulation results show that this approach achieves a 2X-5X speedup, as well as a 10X-20X reduction in memory requirement, compared to the traditional flat approach.

Random Walk

A random walk approach for power grid analysis was proposed in [59]. The authors pose power grid analysis as a probabilistic problem where the voltage at each node is expressed as the expected value of the voltages on neighboring nodes, weighted by conductance values. Several Monte Carlo simulations are performed in the form of random walks on the grid to estimate (statistically) the voltage levels on the power grid. A major feature of this approach is that it allows efficient estimation of the voltage drop at selected nodes without the need to solve the entire system.
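The random-walk idea can be illustrated with a minimal DC sketch. It is not the implementation of [59]; the function name `random_walk_drop`, the dictionary-based grid description, and the three-node chain in the usage note are assumptions made here for illustration. The sketch relies on the nodal equation of a resistive grid, under which the voltage drop at a node equals the conductance-weighted average of its neighbors' drops plus the local load current divided by the total conductance at that node. Each walk therefore accumulates that local term at every visited node and stops when it reaches a supply pad, where the drop is zero.

```python
import random


def random_walk_drop(neighbors, loads, pads, start, walks=20000, seed=0):
    """Estimate the DC voltage drop at `start` by Monte Carlo random walks.

    neighbors: dict mapping node -> list of (neighbor, conductance) pairs
    loads:     dict mapping node -> load current drawn at that node
    pads:      set of nodes tied to the supply (voltage drop is zero there)
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(walks):
        node, cost = start, 0.0
        while node not in pads:
            nbrs, conds = zip(*neighbors[node])
            g_sum = sum(conds)
            # Local contribution of the load current at this node.
            cost += loads[node] / g_sum
            # Step to a neighbor with probability proportional to conductance.
            node = rng.choices(nbrs, weights=conds)[0]
        total += cost
    return total / walks
```

For a chain pad(0) - 1 - 2 - 3 with unit conductances and a 0.01 load at each node, the exact drop at node 3 is 0.03 + 0.02 + 0.01 = 0.06, and `random_walk_drop({1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0), (3, 1.0)], 3: [(2, 1.0)]}, {1: 0.01, 2: 0.01, 3: 0.01}, {0}, start=3)` should return an estimate close to that value. Only the node of interest is processed; no linear system is solved.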

2.5.2 Vectorless Methods

As discussed previously, vector-based methods rely heavily on the user to specify a set of current vectors that captures the activity of the logic circuitry. These methods are thus impractical and cannot be used early in the design process when most of the logic circuitry is still not available. Vectorless methods were developed as an alternative to vector-based methods. These methods only require a modest amount of information about the logic circuitry while producing conservative solutions that bound the voltage drops.

Statistical Techniques

In [58], the authors suggest a method that relies on statistical verification. The main objective of this method is to find the first-order statistics (mean and variance) of the voltage drop on the grid. In short, the authors model the currents drawn from the power grid as stochastic processes, then they propose a method to propagate the statistical parameters, which include correlations between different blocks both in time and space, through the linear system of the power grid to obtain the distribution of voltage drops at any node in the grid. As a first step, the grid is partitioned into large blocks, keeping the correlation between the currents drawn by these blocks minimal. The blocks are then simulated in order to obtain the mean, auto-correlation, and cross-correlation for each block current. After that, the power supply network is modelled as a linear system in order to obtain the transfer function from every circuit block to every node voltage. These impulse responses, along with the statistical parameters of the block currents, are then used to generate statistical parameters for the voltage drop.

The Constraints-Based Framework

The constraints-based framework was first introduced by Kouroussis et al. [40]. Although users may not know the exact current traces of the underlying logic circuitry, or may not yet know the exact logic circuitry, it is rarely the case that they know nothing about them. There is always some engineering judgement or expertise from previous design activities, in the same or similar technologies, that users can bring to bear. Based on this, and to overcome the uncertainty problem, the current constraints framework was introduced in [40]. These constraints are an abstraction which aims to capture design knowledge, and they fall somewhere between knowing everything and knowing nothing about the currents and circuits that load the grid.

These constraints are DC constraints in the sense that they are fixed over time. DC current constraints are peak current upper-bound limits, or envelopes on the current waveforms. For instance, constraints on single current sources are of the form

$$ 0 \le i(t) \le i_L, \quad \forall t \tag{2.36} $$

where $i_L$ is an m × 1 constant vector representing the upper bounds on individual current sources. Constraints in this form are referred to as local constraints. Although they are DC constraints, the method allows for any transient current waveform that fits below them. If only local constraints are provided, the current sources would be able to simultaneously draw their peak allowable currents, resulting in an overly pessimistic voltage drop; this is typically not the case and, hence, the need for global constraints. Global constraints impose limits on the sum of the currents drawn by groups of current sources, which can be thought of as power budgets on groups of logic blocks. If there are l global constraints, then these constraints may be expressed as

$$ 0 \le U i(t) \le i_G, \quad \forall t, \tag{2.37} $$

where $i_G$ is an l × 1 vector of upper bounds corresponding to global constraint values, and U is an l × n incidence matrix consisting of 1 and 0 elements only, where $U_{ij} = 1$ if current source j is included in the ith global constraint. Local and global constraints can be combined into a single matrix inequality as follows

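$$ 0 \le W i(t) \le i_c, \quad \forall t \tag{2.38} $$

where

$$ W = \begin{bmatrix} I \\ U \end{bmatrix}, \quad \text{and} \quad i_c = \begin{bmatrix} i_L \\ i_G \end{bmatrix}, \tag{2.39} $$

with $I$ here denoting the identity matrix.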

A transient current waveform i(t) is said to be feasible if it satisfies (2.38). In fact, the set of inequalities (2.38) defines a multidimensional polytope $\mathcal{F}$, also referred to as the feasible space of currents, that represents all possible feasible current traces as follows

$$ \mathcal{F} = \{\, i(t) : 0 \le W i(t) \le i_c \,\} \tag{2.40} $$
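As a small illustration of what membership in the feasible space means in practice, the following sketch checks sampled current vectors against local and global constraints. The function names and the representation of each row of U as a list of source indices are assumptions made here for illustration; they are not taken from [40].

```python
def is_feasible(i, i_local, groups, i_global):
    """Check 0 <= W i <= i_c for one time sample of the source currents.

    i:        list of source currents at one time point
    i_local:  per-source upper bounds (the vector i_L)
    groups:   one list of source indices per global constraint (rows of U)
    i_global: per-group upper bounds (the vector i_G)
    """
    # Local constraints: 0 <= i_j <= i_L[j] for every source j.
    if any(x < 0 or x > ub for x, ub in zip(i, i_local)):
        return False
    # Global constraints: the summed current of each group stays within budget.
    # The lower bound 0 <= U i holds automatically because every i_j >= 0.
    for members, ub in zip(groups, i_global):
        if sum(i[j] for j in members) > ub:
            return False
    return True


def waveform_is_feasible(samples, i_local, groups, i_global):
    """A sampled waveform is feasible if every one of its samples is feasible."""
    return all(is_feasible(i, i_local, groups, i_global) for i in samples)
```

With three sources bounded by 1.0 each and two budgets, 1.5 on sources {0, 1} and 1.2 on sources {1, 2}, the sample [1.0, 0.4, 0.6] is feasible, whereas [1.0, 1.0, 0.1] satisfies every local bound but violates the first budget. This is the situation global constraints are meant to exclude.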

With this, given a feasible space $\mathcal{F}$, one can formulate the vectorless power grid verification problem as follows: under all possible current traces that satisfy the (local and global) constraints, check whether the grid voltages satisfy certain user specifications, so that the grid is safe.

For an RC grid, where only positive voltage drops (undershoots) are observed, one can find the worst-case voltage drop, under all feasible current traces, by solving n optimization problems as follows

for $j = 1, \dots, n$ do

$$ v_j^*(\tau) = \text{Maximize} \quad v_j(\tau) \tag{2.41} $$

$$ \text{subject to} \quad i(t) \in \mathcal{F}, \ \forall t \tag{2.42} $$

Using the emax(·) notation introduced in Definition 2.1, we can combine the $v_j^*(\tau)$'s in one vector $v^*(\tau)$, as follows

$$ v^*(\tau) = \operatorname*{emax}_{i(t) \in \mathcal{F},\, \forall t} \left[ v(\tau) \right] \tag{2.43} $$

The grid is said to be safe if the worst-case voltage drops are within a user-specified threshold.
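To make the element-wise nature of the emax(·) operator concrete, here is a minimal sketch over a finite set of candidate voltage-drop vectors. It assumes, consistent with the per-node maximization above, that emax(·) maximizes each component independently. The finite list is only a stand-in for the polytope, over which each component would actually require its own optimization problem. The helper `is_safe` is a hypothetical name.

```python
def emax(vectors):
    """Element-wise maximum over a finite collection of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def is_safe(worst_case_drops, thresholds):
    """The grid is safe if every worst-case drop is within its threshold."""
    return all(v <= th for v, th in zip(worst_case_drops, thresholds))
```

For example, `emax([[1, 5, 2], [3, 0, 4]])` gives `[3, 5, 4]`. No single candidate attains all three maxima at once, which is why the worst-case vector is, in general, not the voltage drop produced by any one current trace.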

For RLC grids, both positive (undershoots) and negative (overshoots) voltage drops are possible. Thus, one would like to find the minimum (worst-case overshoot) and the maximum (worst-case undershoot) voltage drop. Using the eopt(·) notation introduced in Definition 2.3, we can capture these worst-case variations in one vector, as follows

$$ \operatorname*{eopt}_{i(t) \in \mathcal{F},\, \forall t} \left[ v(\tau) \right] = \begin{bmatrix} \operatorname*{emax}\limits_{i(t) \in \mathcal{F},\, \forall t} \left[ v(\tau) \right] \\ \operatorname*{emin}\limits_{i(t) \in \mathcal{F},\, \forall t} \left[ v(\tau) \right] \end{bmatrix} \tag{2.44} $$

Here, the grid is said to be safe if the worst-case voltage variations (in both directions) are within certain user-specified thresholds.

It turns out that finding the worst-case voltage drop (for RC and RLC grids) is prohibitively expensive as it requires an infinite sum of emax(·) operations (in the RC case) and eopt(·) operations (in the RLC case) [53]. Several improvements have been made to mitigate the runtime complexity of the constraints-based framework; we will review a few of these methods.

For RC grids, the authors in [31] and [53] derived an important upper-bound on $v^*$. This bound is much easier to compute, compared to $v^*$; it requires the computation of $A^{-1}$, which is a full matrix, and solving n linear programs (LPs). Because this bound requires as many LPs as the number of nodes in a grid, [2] proposed a sparse approximate inverse technique to reduce the size of the LPs. Due to topological properties of the grid, voltages of grid nodes are not completely independent, which was exploited by dominance relations among voltage drops in [5]. In [71], a restriction of the problem to a hierarchically structured set of power constraints was considered to achieve a significant runtime improvement. Another interesting technique was proposed in [72] that verifies each subgrid of the original network independently, imposing boundary conditions on the neighboring nodes of that subgrid, i.e. without explicitly involving nodes that are not directly connected to the subgrid. In [6], an incremental verification technique was proposed which allows a subset of the full grid nodes to be verified and employs a macromodel for the rest of the grid. In [75], a model order reduction (MOR) method was proposed that factors in grid topology as a way to reduce the size of the grid by eliminating layers that do not impact the accuracy of the verification process. In [28], the authors proposed a technique to specify the time-step that appears in the recurrence (2.16) in order to minimize the error between the exact worst-case voltage drop and the upper-bound.

For RLC grids, a conservative bound on the worst-case voltage drop was first proposed in [1], which was then followed by a similar bound by the same authors that can be computed more efficiently [3]. This bound is computed as a sum of r eopt(·) terms. The results in [3] showed that the value of r is typically between 1 and 7. In [29], the authors proposed a way to specify the time-step that appears in the recurrence (2.32), like its counterpart in the RC case, in order to minimize the error between the exact worst-case voltage drop and the computed bound. Furthermore, the authors showed that such a time-step would only require a single eopt(·) operation (i.e. r = 1), which leads to a more efficient bound computation.

2.6 Generating Current Budgets for RC Grids

Vectorless verification methods have been fully developed over the last decade [53], but a key question remains: how would one obtain/specify the current constraints? This point often comes up as a limitation of vectorless methods and it has become a hurdle to adoption of these methods. The constraints are meant to capture the peak power dissipation of circuit blocks. It is easy to see how to get the constraints for a logic block that is available (down to the cell level) and small enough to exhaustively simulate, by using an offline characterization process. Otherwise, if the block is unavailable or too large to simulate, one must rely on engineering judgment, and/or expertise from previous design activities, which places an undue burden on users.

Instead of the traditional approach of expecting users to provide current constraints that would be used to check if the grid is safe (what one might call the forward problem), Moudallal et al. [50] proposed to solve the inverse problem: given a grid and the allowed voltage drop thresholds at all grid nodes, we would like to generate circuit current constraints which, if satisfied by the underlying circuitry, would guarantee grid safety. These current constraints are also referred to as current containers. It turns out that there is an infinity of possible current containers that guarantee power grid safety, referred to as safe current containers. The authors in [50] provide a rigorous problem definition and develop some key theoretical results related to maximality of the current space defined by the constraints. Based on this, the authors develop two algorithms that generate specific current containers targeting key grid quality metrics such as the peak power dissipation a grid can safely support and the uniformity of current/temperature distribution across the die area.

In this thesis, we improve and extend the constraints generation framework introduced in [50]. In the following, we provide a brief review of some important terminology introduced in [50] as well as some of the main theoretical results established in [50] that are crucial to follow the rest of the thesis.

2.6.1 Safe Containers

The following definition introduces the notion of a current container for a vector of current waveforms.

Definition 2.12. (Container) Let $t \in \mathbb{R}$, let $i(t) \in \mathbb{R}^m$ be a function of time, and let $\mathcal{F} \subset \mathbb{R}^m$ be a closed subset of $\mathbb{R}^m$. If $i(t) \in \mathcal{F}$, $\forall t \in \mathbb{R}$, then we say that $\mathcal{F}$ contains $i(\cdot)$, represented by the shorthand $i(\cdot) \subset \mathcal{F}$, and we refer to $\mathcal{F}$ as a container of $i(\cdot)$.

Figure 2.6 illustrates the idea of a container for a simple case of two current waveforms. Because $i(t) = [i_1(t) \ i_2(t)]^T \in \mathcal{F}$ for all time instants, we say that $\mathcal{F}$ contains $i(t)$.

[Figure 2.6: waveform panels plotted against time (sec), and the container plotted with $I_1$ on the horizontal axis.]

Figure 2.6: Example of a container $\mathcal{F}$ for $i_1(t)$ and $i_2(t)$.

To check if a power grid is safe, one would typically be interested in the worst-case voltage drop at some grid node k, at some time point $\tau \in \mathbb{R}$, over a wide range of possible current waveforms. Using the above notation, and given a container $\mathcal{F}$ that contains a wide range of current waveforms of interest, this can be expressed as $\max_{i(\cdot) \subset \mathcal{F}} (v_k(\tau))$. Clearly, because $\mathcal{F}$ is the same irrespective of time, and applies at all time points $t \in \mathbb{R}$, this worst-case voltage drop must be time-invariant, independent of the chosen time point $\tau$. Therefore, one way to check grid safety is to compute the worst-case voltage drop attained by each component of v(t), denoted as $v^*(\mathcal{F}) = \operatorname{emax}_{i(\cdot) \subset \mathcal{F}} (v(\tau))$. The exact expression for $v^*(\mathcal{F})$ was derived in [53] to be

$$ v^*(\mathcal{F}) = \sum_{q=0}^{\infty} \operatorname*{emax}_{I \in \mathcal{F}} \left[ (MB)^q M' I \right] \tag{2.45} $$

where $I \in \mathbb{R}^m$ is a vector of artificial variables, with units of current, that is used to carry out the emax(·) operation. Computing the exact $v^*(\mathcal{F})$ is prohibitively expensive and so the authors in [50] instead use an upper-bound on $v^*(\mathcal{F})$ based on the following.

Definition 2.13. For any $\mathcal{F} \subset \mathbb{R}^m$, define:

$$ \bar{v}(\mathcal{F}) = G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}} \left( M' I \right) \tag{2.46} $$

In [53], it has been shown that $\bar{v}(\mathcal{F})$ is an upper-bound on $v^*(\mathcal{F})$:

$$ v^*(\mathcal{F}) \le \bar{v}(\mathcal{F}), \quad \forall \mathcal{F} \subset \mathbb{R}^m \tag{2.47} $$

Furthermore, in [28], the authors show that, for a certain range of the discretization time-step $\Delta t$ resulting from using BE discretization in (2.13), the accuracy of this upper-bound relative to $v^*(\mathcal{F})$ is quite good (a maximum error of 1 mV on a 100K-node grid).

Definition 2.14. (Safe Container) A container $\mathcal{F}$ is said to be safe if $P \bar{v}(\mathcal{F}) \le V_{th}$.

Thus, a safe container $\mathcal{F}$ is useful because, due to (2.47), it guarantees that $P v^*(\mathcal{F}) \le V_{th}$, so that the grid is safe for that container. A safe container $\mathcal{F}$ can be expressed as a set of constraints on the circuit currents that load the grid, thereby providing a set of linear current constraints that are sufficient to guarantee grid safety. In the constraints-based framework (described in Section 2.5.2), current containers were specified and the corresponding worst-case voltage drop was found by a process of optimization. In the constraints generation framework [50], these containers are generated so that, if the circuit is designed to respect these constraints, the grid becomes safe by design. Some of the major results in [50] are restated below.

2.6.2 Maximal Containers

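Let $u \in \mathbb{R}^n$ and define the sets $\mathcal{U}$, $\mathcal{F}(u)$, and $\mathcal{S}$ as follows:

$$ \mathcal{U} = \{\, u \in \mathbb{R}^n : u \ge 0, \ P u \le V_{th} \,\} \tag{2.48} $$

$$ \mathcal{F}(u) = \{\, I \in \mathbb{R}^m : I \ge 0, \ M' I \le M G u \,\} \tag{2.49} $$

$$ \mathcal{S} = \{\, \mathcal{F}(u) : u \in \mathcal{U} \,\} \tag{2.50} $$

where $\mathcal{U}$ is effectively a set of safe voltage drop assignments u, $\mathcal{F}(u)$ is a special kind of container constructed based on $u \in \mathcal{U}$, and $\mathcal{S}$ is the set of all containers $\mathcal{F}(u)$ corresponding to $u \in \mathcal{U}$. It turns out that it is enough to consider only containers of the form (2.49), due to the following necessary and sufficient condition.

Lemma 2.6. [50] A container $\mathcal{J} \subset \mathbb{R}^m_+$ is safe if and only if it is a member of $\mathcal{S}$ or a subset of a member of $\mathcal{S}$.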
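The set definitions (2.48) and (2.49) translate directly into membership tests. The sketch below is only an illustration: the helper names are hypothetical, and the small matrices in the usage note are placeholders chosen for readability, not the matrices M, M′, G and P of an actual grid.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def in_U(u, P, v_th):
    """u is a safe voltage-drop assignment if u >= 0 and P u <= V_th."""
    return all(x >= 0 for x in u) and all(
        p <= t for p, t in zip(matvec(P, u), v_th)
    )


def in_container(I, u, M_prime, M, G):
    """I belongs to F(u) if I >= 0 and M' I <= M G u (component-wise)."""
    lhs = matvec(M_prime, I)
    rhs = matvec(M, matvec(G, u))
    return all(x >= 0 for x in I) and all(l <= r for l, r in zip(lhs, rhs))
```

With placeholder data `M = M_prime = P = [[1, 0], [0, 1]]`, `G = [[2, -1], [-1, 2]]`, `v_th = [0.1, 0.1]` and `u = [0.1, 0.1]`, the assignment u lies in the safe set, the current vector `[0.05, 0.05]` lies in the container built from u, and `[0.2, 0.0]` does not. By Lemma 2.6, any current waveform that stays inside such a container keeps the grid safe.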

The importance of this lemma is two-fold: 1) $\mathcal{F}(u)$ is safe for any $u \in \mathcal{U}$ and 2) all interesting safe containers $\mathcal{J}$ may be found as either specific $\mathcal{F}(u)$ for some $u \in \mathcal{U}$, or as subsets of such $\mathcal{F}(u)$. Note that, if $\mathcal{J} \subseteq \mathcal{F}(u)$, for some $u \in \mathcal{U}$, with $\mathcal{J} \ne \mathcal{F}(u)$, then clearly $\mathcal{F}(u)$ is a better choice than $\mathcal{J}$. Choosing $\mathcal{J}$ would be unnecessarily limiting, while $\mathcal{F}(u)$ would allow more flexibility in the circuit loading currents. Therefore, it is enough to consider only containers of the form $\mathcal{F}(u)$ with $u \in \mathcal{U}$. Going further, if $\mathcal{F}(u_1) \subseteq \mathcal{F}(u_2)$ with $\mathcal{F}(u_1) \ne \mathcal{F}(u_2)$, then clearly $\mathcal{F}(u_2)$ is a better choice than $\mathcal{F}(u_1)$. Thus, in a sense, the larger the container, the better. Therefore, one is interested in safe containers that are not fully contained in any other safe container. These containers are referred to as maximal containers. Mathematically, the notion of maximality is captured in the following definition:

Definition 2.15. Let $\mathcal{E}$ be a collection of subsets of $\mathbb{R}^m$ and let $\mathcal{X} \in \mathcal{E}$. We say that $\mathcal{X}$ is maximal in $\mathcal{E}$ if there does not exist another $\mathcal{Y} \in \mathcal{E}$, $\mathcal{Y} \ne \mathcal{X}$, such that $\mathcal{X} \subseteq \mathcal{Y}$.

The authors in [50] derived a necessary and sufficient condition for maximality, which is restated below in Theorem 2.1. This result depends on important properties of u, namely the extremal and irreducible properties. The definitions of these properties are also restated below for completeness of presentation.

Definition 2.16. For any $u \in \mathcal{U}$, we say that u is extremal in $\mathcal{U}$ if $\exists k \in \{1, \dots, d\}$ such that $Pu|_k = V_{th,k}$.

Definition 2.17. We say that $u \in \mathbb{R}^n$ is reducible if there exists $u' \le u$, $u' \ne u$, with $\mathcal{F}(u') = \mathcal{F}(u)$; otherwise, u is said to be irreducible.

Theorem 2.1. $\mathcal{F}(u)$ is maximal in $\mathcal{S}$ if and only if u is irreducible and extremal in $\mathcal{U}$.

Based on this, the authors in [50] then develop two algorithms for constraints generation that target key grid quality metrics, such as the peak total chip power that is allowed by the grid and the uniformity of current distribution. These algorithms are formulated as a maximization of a certain design objective g(u), over all safe voltage assignments u.

2.7 Power-Gated Grids

Power gating [35, 74, 73] refers to design techniques that partition the logic circuitry of a chip into functional blocks that may be selectively powered ON or OFF. Modern high-performance chips include very large power grids. While the power grid is mostly a passive RLC structure, power grids often also include active devices (e.g. MOSFETs) that implement power-gating to allow the supply currents (including leakage) of major circuit blocks to be turned off by disconnecting them from the rest of the grid. Thus, such a circuit block has its own local grid (as we call it) that may be cut off from the rest of the grid (which we call the global grid). We refer to a power grid with active devices as an active power grid; otherwise it is a passive power grid.

Depending on which blocks are ON/OFF, the total power dissipation and temperature may exceed specifications, so that there is a need to schedule the chip workload (which blocks are ON/OFF) in order to remain within the allowed power/temperature specs. Several authors have looked at this question, including for example [46, 24, 37]. But the chip workload also impacts the voltage drop on the grid. Depending on the combination of blocks that are in operation, large amounts of current may flow through the grid, causing excessive voltage variations that put both circuit performance and reliability at risk. Proper design and operation of an active grid is crucial to ensure supply integrity to the circuit blocks, and so avoid timing and signal integrity problems.

So far, we have reviewed the state-of-the-art techniques to verify the voltage integrity of a passive power grid. With active grids, this verification becomes very difficult because of the many possible combinations of power states that the chip blocks can have. For example, a chip with 20 blocks, with 2 power states (i.e. ON and OFF) each, has over a million working modes ($2^{20} = 1{,}048{,}576$). A brute-force approach would require exhaustive transient simulation under all possible working modes, each covering a very large number of clock cycles to capture the dynamics of the circuit. The authors in [78] propose an efficient transient analysis approach of the power distribution network exploiting localized voltage variations near the active blocks. Such an approach requires full knowledge of the current waveforms drawn by every logic block attached to the grid. Thus, it does not allow for early grid verification, when grid modifications can be most easily incorporated. Furthermore, the number of current traces needed to cover the space of voltage drops exhibited on the grid is intractable for modern designs. In [76], the authors propose a technique to drastically reduce the number of full simulations, by modeling the local grids as switchable current sources. Assuming that the current waveforms representing the currents drawn by the underlying circuitry are available, the method determines an approximate set of working modes that generates the largest average current from the block's power-taps. Then, the full grid is simulated under this set of working modes for hundreds of clock cycles. A major problem in this work is that the worst-case working modes are determined based on the currents rather than the voltage drop. In Chapter 5, we consider the question of how to manage the chip workload so that supply voltage variations remain within specifications.

2.8 Electromigration

Electromigration (EM) is a major reliability concern in the design of large integrated circuits (ICs) in the wake of continuing technology scaling. The EM phenomenon is the directional migration of metal atoms in a metal line due to high current density in the line. While signal and clock lines also suffer from EM degradation, these lines carry bidirectional current and so have longer lifetimes due to so-called healing. In contrast, power grid lines carry mostly unidirectional current with no benefit of healing and thus are more susceptible to EM failure. Hence, our focus is on EM in power grids. As mentioned previously, modern grids consist of long interleaved lines that carry supply and ground to the underlying logic blocks. These long structures on every layer have been called interconnect trees in the reliability literature, but this name is unfortunate because it leads to confusion with the word interconnect as typically used for signal lines, and because even though these lines are often trees (acyclic graphs), they do not have to be so. Here, and later in Chapter 6, we will continue to use the term interconnect trees to refer to these structures, keeping in mind that our work (in Chapter 6) allows for both trees and for general (cyclic) graphs.

2.8.1 The Physical Phenomenon

Electromigration is the mass transport of metal atoms in a metal line due to momentum transfer between electrons and atoms, which eventually leads to void formation in the metal line. The process of EM degradation can be divided into two phases: void nucleation and void growth.

Under conditions of high current density, the force exerted by the flow of electrons can cause the metal atoms to move in the direction of the electron flow. If the in-flow of atoms is not equal to the out-flow, certain points within a metal segment may experience high tensile or compressive stresses. In modern chip manufacturing techniques, failure due to compressive stress is not usually observed. However, the build-up of tensile stress eventually leads to formation of a void when the stress reaches a critical threshold. This phase of EM degradation, when stress is increasing over time but no voids have yet nucleated, is called the void nucleation phase. In this phase, the resistance of a line remains roughly the same as that of a fresh (undamaged) line. Once a void nucleates, the void growth phase begins. The void starts to grow in the direction of the electron flow and the line resistance increases towards some high finite steady-state value. The conductance of the line decreases but never quite goes to zero.

2.8.2 Effective-EM Current

As mentioned earlier, there are three types of parasitic effects in power grids: resistive, capacitive and inductive. Because EM is a long-term failure mechanism, short-term transients in chip workload (circuit switching activity) do not play a significant role in EM degradation. Thus, a standard practice in the field is to use a (constant) effective-current model [67] to estimate EM degradation, so that the lifetime of a metal line when carrying the constant effective current and the time-varying transient current is roughly the same. Traditionally, the effective-EM current is computed based on some assumed periodic current waveform. For unidirectional waveforms, the effective-EM current is simply the time-averaged current. For bidirectional currents, the effective-EM current density is given as [67, 43]

$$ j_{ac,eff} = \frac{1}{T} \left( \int_0^T j^+(\tau)\, d\tau - \mu \int_0^T |j^-(\tau)|\, d\tau \right) \tag{2.51} $$

where $j^+(t)$ and $j^-(t)$ denote the current waveforms in the chosen positive and negative directions, respectively, T is the waveform period, and $\mu$ is the EM recovery factor that is determined experimentally. The positive direction is chosen such that $\int_0^T j^+(\tau)\, d\tau \ge \int_0^T |j^-(\tau)|\, d\tau$.
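As a numerical illustration of (2.51), the sketch below evaluates the effective-EM current density for a waveform given as uniformly spaced samples over one period. The function name, the sample-and-hold integration rule, and the value of the recovery factor used in the usage note are assumptions made here for illustration only.

```python
def effective_em_density(samples, dt, mu):
    """Effective-EM current density of one period of a sampled waveform.

    samples: current-density samples, each held constant for `dt`
    dt:      sampling interval, so the period is len(samples) * dt
    mu:      EM recovery factor (determined experimentally)
    """
    period = len(samples) * dt
    # Integrals of the forward and reverse parts of the waveform.
    forward = sum(j for j in samples if j > 0) * dt
    reverse = sum(-j for j in samples if j < 0) * dt
    # The positive direction is the one carrying the larger integral.
    if reverse > forward:
        forward, reverse = reverse, forward
    return (forward - mu * reverse) / period
```

For a waveform that equals 2 for half the period and -1 for the other half, with a recovery factor of 0.5, the result is (1.0 - 0.5 × 0.5) / 1 = 0.75. For a unidirectional waveform the reverse integral vanishes and the function reduces to the time average, as stated above.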

When voids nucleate due to EM, branch resistances change fairly quickly. Correspondingly, the branch currents also change fairly quickly to their new effective values. Hence, between any two successive void nucleations, the power grid has constant (effective) branch currents, voltages and conductances, and so can be modelled as a DC system consisting of only resistive parasitics. With this, the power grid model can be expressed as

$$ G(t)\, v(t) = u \tag{2.52} $$

where G(t) is the piecewise-constant conductance matrix (it varies over time, over large time-scales, as the lines age and deform, hence the time dependence), v(t) is the corresponding time-varying but piecewise-constant node voltage drop vector and u is the vector of constant effective source current values that model the underlying logic blocks.

2.8.3 Electromigration Failure Models

Black's Model

Black's model [12], proposed by J. R. Black in 1969, is one of the earliest empirical models for estimating the EM mean time to failure (MTF). This model applies to an isolated line carrying a fixed DC current and gives the MTF of the line as

$$ \mathrm{MTF} = \frac{A_{bl}}{j^n} \exp\!\left( \frac{E_a}{k_b T_m} \right) \tag{2.53} $$

where $A_{bl}$ is a proportionality constant, j is the constant current density (current per unit cross-sectional area) in the line, $k_b$ is the Boltzmann constant, $T_m$ is the temperature of the line in Kelvin, n > 0 is the current density exponent that depends on the material of the wire and the failure stage, and $E_a$ is the activation energy. The parameters $A_{bl}$, n, and $E_a$ are determined experimentally as follows: 1) isolate a metal line and apply a high current density through a large population of such lines at a higher than typical operating temperature, referred to as accelerated testing, 2) the times-to-failure (TTFs) obtained from the accelerated testing are then fitted to a lognormal distribution to estimate the MTF under these testing conditions, 3) the parameters $A_{bl}$, n and $E_a$ are then determined using regression analysis, and 4) the values of these parameters are then used to extrapolate the MTF to typical operating conditions. With this, the time-to-failure of a metal segment is known to be a lognormally distributed random variable (i.e. the logarithm of the time-to-failure has a normal (Gaussian) distribution) with mean given in (2.53) and standard deviation determined experimentally for a given metal technology.
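Black's equation (2.53) is straightforward to evaluate once its parameters are known. The following sketch is a direct transcription of the formula, with the activation energy assumed to be given in electron-volts so that the Boltzmann constant can be expressed in eV/K; the function name and the parameter values in the usage note are illustrative placeholders, not fitted data.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mtf(a_bl, j, n, e_a, t_m):
    """Mean time to failure from Black's equation.

    a_bl: proportionality constant (sets the time unit of the result)
    j:    constant current density in the line
    n:    current density exponent (n > 0)
    e_a:  activation energy in eV (assumed unit)
    t_m:  line temperature in Kelvin
    """
    return a_bl / j**n * math.exp(e_a / (K_B_EV * t_m))
```

Two qualitative properties follow immediately. With n = 2, doubling the current density at a fixed temperature divides the MTF by four, for instance `black_mtf(1.0, 2e10, 2, 0.9, 378.0)` is one quarter of `black_mtf(1.0, 1e10, 2, 0.9, 378.0)`. Raising the temperature at a fixed current density shortens the MTF, which is what makes accelerated testing possible.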

Blech et al. [13, 14, 15] discovered that sufficiently short lines have very long lifetimes and, in many cases, can be considered immortal; this is referred to as the Blech effect. Specifically, a line is found to be immune to EM failure if the product of its length and current density is less than the critical Blech product $(jL)_c$, defined as

$$ (jL)_c = \frac{\Omega\, \Delta\sigma_{max}}{q^* \rho} \tag{2.54} $$

where $\Omega$ is the atomic volume, $\Delta\sigma_{max} > 0$ is the maximum stress difference between the cathode and the anode before void nucleation occurs, $q^*$ is the absolute value of the effective charge of the migrating atoms and $\rho$ is the resistivity of the metal.
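The Blech criterion amounts to a single comparison per line. The sketch below computes the critical product of (2.54) and applies the immunity test; the function names are hypothetical, and the numbers in the usage note are illustrative placeholders in SI units rather than measured material data.

```python
def blech_critical_product(omega, delta_sigma_max, q_star, rho):
    """Critical Blech product (jL)_c = omega * delta_sigma_max / (q_star * rho)."""
    return omega * delta_sigma_max / (q_star * rho)


def is_em_immune(j, length, jl_crit):
    """A line is treated as immune to EM failure if |j| * L is below (jL)_c."""
    return abs(j) * length < jl_crit
```

With the placeholder values omega = 1.2e-29, delta_sigma_max = 4.0e7, q_star = 1.6e-19 and rho = 3.0e-8, the critical product is about 1.0e5. A line carrying a current density of 1e10 is then immune at a length of 5e-6 (product 5e4) but not at a length of 2e-5 (product 2e5).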

State-of-the-art EM checking techniques employ the Blech limit (2.54) combined with Black's equation (2.53) to assess the reliability of the grid. Equation (2.53), combined with the Blech effect, came to be known as Black's model. Recently, the use of Black's model has become highly controversial. Lloyd [47] pointed out that the values of the parameters $A_{bl}$, n and $E_a$ obtained under stressed conditions (elevated j and $T_m$) cannot be used for the accurate estimation of MTF at the use conditions. Furthermore, Hauschildt et al. [33] conducted experiments which showed that n and $E_a$ depend on j and $T_m$, which undermines the validity of Black's equation. Moreover, Black's model implicitly assumes that the lifetimes of neighboring lines are independent, which is simply not true [32].

Physics-based EM Models

Korhonen et al. [39] proposed a one-dimensional (1D) model to describe the hydrostatic stress (average pressure) $\sigma$ arising under the influence of electromigration. For a uniform metal line embedded in a rigid dielectric, Korhonen's model captures the change in $\sigma$, at a location x in a metal line, using the following partial differential equation (PDE):

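$$ \frac{\partial \sigma}{\partial t} = \frac{\partial}{\partial x} \left[ \frac{D_a B \Omega}{k_b T_m} \left( \frac{\partial \sigma}{\partial x} - \frac{q^* \rho}{\Omega}\, j \right) \right], \tag{2.55} $$

where j is the current density in the line, $D_a = D_0 e^{-Q/(k_b T_m)}$ is the lognormally distributed [48] effective atomic diffusivity with constant coefficient $D_0$, B is the effective bulk modulus, $\Omega$ is the atomic volume, $k_b$ is the Boltzmann constant, $T_m$ is the temperature in Kelvin, $q^*$ is the absolute value of the effective charge of the conductor, $\rho$ is the resistivity of the conductor, and Q is the activation energy for vacancy formation and diffusion. The corresponding atomic flux $J_a$ in the line can be written as

$$ J_a = \frac{D_a C \Omega}{k_b T_m} \left( \frac{\partial \sigma}{\partial x} - \frac{q^* \rho}{\Omega}\, j \right), \tag{2.56} $$

where C is the concentration of atoms in the line.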

Note that EM degradation is highly dependent on the specific microstructure of a given line, which is affected by random manufacturing variations. This randomness is primarily accounted for by the corresponding randomness in $D_a$. Finally, and as is typical in the field, a void is said to nucleate at some point x once the stress at that point exceeds a predefined threshold value $\sigma_{th} > 0$.

2.8.4 Power Grid EM Checking Approaches

Industrial Approach

Standard practice employed in industry for EM checking is to break up a grid into isolated metal branches, assess the reliability of each branch separately using Black's model, and then use the series model (earliest branch failure time) to determine the EM-lifetime of the whole grid. The general flow of the EM checking approach used in industry is as follows. First, one determines which branches are immune to EM failure, using the Blech limit (2.54). Then, for a given target EM-lifetime specification of the grid (MTFtarget), a maximum current density limit is computed for the branches that are not immune to EM failure, using the following relation [45]:

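\[
j_{max} = j_{acc} \left(\frac{MTF_{acc}}{MTF_{target}}\right)^{1/n} \exp\left[\frac{E_a}{n k_b}\left(\frac{1}{T_{m,use}} - \frac{1}{T_{m,acc}}\right)\right] \tag{2.57}
\]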

where MTFacc is the MTF observed under accelerated testing conditions, i.e. under current density jacc and temperature Tm,acc, Tm,use is the operating temperature at which the chip will be used, and the other symbols are as defined before. If the current density j in any of the branches violates the corresponding jmax limit, then the whole grid is said to fail before the target EM lifetime.
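The limit in (2.57) can be evaluated directly. The following is a minimal sketch, assuming the standard form of Black's model (MTF proportional to j^(-n) exp(Ea/(kb Tm))); the function name `j_max` and all numerical values are hypothetical and chosen only for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def j_max(j_acc, mtf_acc, mtf_target, t_use, t_acc, n, e_a):
    """Current-density limit from Black's model, following (2.57).

    j_acc       : current density used in accelerated testing
    mtf_acc     : MTF observed under accelerated conditions
    mtf_target  : target EM lifetime (same time unit as mtf_acc)
    t_use/t_acc : operating / accelerated temperatures in Kelvin
    n, e_a      : Black's current exponent and activation energy (eV)
    """
    lifetime_ratio = (mtf_acc / mtf_target) ** (1.0 / n)
    thermal = math.exp(e_a / (n * K_B_EV) * (1.0 / t_use - 1.0 / t_acc))
    return j_acc * lifetime_ratio * thermal


# Hypothetical test conditions and branch current densities (A/m^2).
limit = j_max(j_acc=2.0e10, mtf_acc=100.0, mtf_target=87600.0,
              t_use=378.0, t_acc=573.0, n=2.0, e_a=0.8)
branch_j = [1.0e9, 2.5e10, 6.0e10]

# Under the series model, a single violating branch fails the whole grid.
violations = [j for j in branch_j if j > limit]
grid_passes = not violations
```

Only branches that are not Blech-immune would be passed to such a check; the immunity filter itself is omitted here.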

These methods have two main limitations:

1) Black's model ignores the material flow between branches. In today's mesh structured power grids, many branches within the same metal layer may be connected, leading to a so-called interconnect tree structure, and atomic flux can flow freely between the branches of an interconnect tree.

2) The series model assumes that the grid is as weak as its weakest link. Modern power grids use a mesh structure, so that even if a branch fails the power grid may continue to conduct current properly. In fact, there are many paths for the current to flow from the C4 bumps to the underlying logic, a characteristic that we refer to as redundancy. Thus, it is highly pessimistic to assume that a single branch failure will always cause the grid to fail.

Mesh Model

Chatterjee et al. [16, 17] proposed the mesh model as an alternative to the series model. According to this model, the power grid is said to fail when enough branches have failed so that the voltage drop at some grid node(s) has exceeded the acceptable user-specified threshold. However, [16, 17] still used Black's model to find the reliability of individual branches, which is inaccurate.
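The difference between the two failure criteria can be illustrated on a toy grid. The sketch below assumes a hypothetical topology (one load node fed from Vdd through identical parallel branches) and treats a failed branch as an open circuit; the function names, branch lifetimes, and electrical values are made up for illustration.

```python
def series_ttf(branch_ttf):
    """Series model: the grid fails at the earliest branch failure."""
    return min(branch_ttf)


def mesh_ttf(branch_ttf, r_branch, i_load, v_th):
    """Mesh-style criterion on a toy grid.

    Assumed topology: a single load node drawing i_load is connected to Vdd
    through len(branch_ttf) identical parallel branches of resistance
    r_branch. The grid fails once the voltage drop at the node exceeds v_th.
    """
    alive = len(branch_ttf)
    if i_load * r_branch / alive > v_th:
        return 0.0                       # unsafe even before any failure
    for t in sorted(branch_ttf):
        alive -= 1                       # one more branch opens at time t
        if alive == 0 or i_load * r_branch / alive > v_th:
            return t
    return float("inf")


# Hypothetical branch failure times (years) and electrical parameters.
ttf_years = [7.0, 3.0, 12.0, 5.0]
t_series = series_ttf(ttf_years)
t_mesh = mesh_ttf(ttf_years, r_branch=1.0, i_load=0.1, v_th=0.04)
```

In this example the series model reports the first branch failure, whereas the voltage-drop criterion is only violated after the second one, which is the redundancy effect described above.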

Steady-State based EM Model

Huang et al. [34] proposed an EM model that accounts for the material flow and is based on the steady-state stress of a tree, which allows them to determine the potential void locations in a tree, but the actual time and sequence of void nucleations might vary considerably from the predicted ones. Furthermore, their approach does not estimate the MTF of the grid but rather assumes that a single TTF sample of the grid is enough to capture the EM-lifetime, thus ignoring the random nature of EM degradation.

2.8.5 Physics-based EM Checking

Recently, an efficient physics-based full-chip EM assessment approach was proposed in [19], which accounts for the material flow and the coupled stresses within an interconnect tree and employs a voltage-based failure criterion. Furthermore, the results of this approach were validated against experimental results as well as finite element analysis (FEA) simulations in [23], which makes this method more suitable for EM assessment.

Stress in an Interconnect Tree

Chatterjee et al. [19] augmented Korhonen's model by introducing boundary laws to track the material flow between the connected branches. This enables one to evaluate the EM degradation of an interconnect tree as a whole. A branch is a continuous straight metal line of uniform width and a junction is any point on the interconnect tree where a branch ends or where a via is located.

The authors show that the stress evolution within a tree can be represented as a Linear Time-Invariant (LTI) system:

\[
\dot{\sigma}(t) = A\sigma(t) + B\mu \tag{2.58}
\]

where µ is the input vector, which depends on the current densities in the tree's branches, and A and B are the system and input matrices, respectively, which can be constructed using state stamps. The reader is referred to [19] for a detailed description and derivation of the LTI formulation. Furthermore, the initial condition of the above system is the thermal stress at time t = 0:

\[
\sigma(0) = \sigma_T(0) \tag{2.59}
\]
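To make the LTI view concrete, the sketch below integrates a system of the form (2.58) with forward Euler until some state first reaches σth. It is only an illustration: the 2-state matrices are a toy stand-in (not state stamps from [19]), the units are normalized, and the function name `first_nucleation` is hypothetical.

```python
import math


def first_nucleation(A, b, sigma0, sigma_th, dt=1e-4, t_max=50.0):
    """Integrate d(sigma)/dt = A sigma + b with forward Euler.

    Returns (time, index) of the first state that reaches sigma_th,
    or None if no state reaches it before t_max. Here b stands for B*mu.
    """
    sigma = list(sigma0)
    n = len(sigma)
    t = 0.0
    while t < t_max:
        for k in range(n):
            if sigma[k] >= sigma_th:
                return t, k
        rate = [sum(A[r][c] * sigma[c] for c in range(n)) + b[r]
                for r in range(n)]
        sigma = [sigma[r] + dt * rate[r] for r in range(n)]
        t += dt
    return None


# Toy system: two junctions exchanging material (total stress is conserved),
# driven by a constant input that pushes stress toward junction 1.
A_toy = [[-1.0, 1.0], [1.0, -1.0]]
b_toy = [-2.0, 2.0]
t_nuc, junction = first_nucleation(A_toy, b_toy, sigma0=[0.2, 0.2],
                                   sigma_th=1.0)

# For this toy system sigma_1(t) = 0.2 + (1 - exp(-2 t)), so the threshold
# is reached at t = ln(5) / 2.
t_exact = math.log(5.0) / 2.0
```

A production flow would use a stiff or implicit integrator on much larger systems; forward Euler is used here only because it keeps the sketch short.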

The authors make the simplifying assumption, which is typical in the field, that the diffusivity is the same throughout a branch. As a result, voids nucleate only at junctions of a tree. Once the stress at any junction reaches σth, a void is said to nucleate at that point and affects all the connected branches. When a void nucleates at a junction, the junction is conceptually treated as a new junction for each of the connected branches such that there is no material flow between these new junctions. Thus, the tree is effectively divided into separate subtrees, where the stress evolution in each subtree can be captured by a new LTI system (in the form of (2.58), using suitable state stamps and initial condition). However, even though there is no material flow between the formed subtrees, the conductivity (electron flow) does not quite go to zero (due to conduction in the metal liner).

Voltage-aware EM Analysis

As mentioned previously, for a grid to function as intended, the voltage drop at each of its nodes should be smaller than a certain threshold because otherwise, timing violations and logic failures may occur. A node is said to be safe when its voltage drop meets the corresponding threshold condition. Let Vth be the vector of all the threshold values, which are typically user-specified, and assume that Vth > 0 to avoid trivial cases. The time-to-failure of a grid is the earliest time t for which a voltage violation occurs, i.e. v(t) ≤ Vth is no longer true.

Notice that the time-to-failure of the grid, denoted as TTF, is a random variable, because the stress evolution is highly dependent on the effective atomic diffusivities of the branches in the grid, which are random variables. Assigning a diffusivity value to each branch in the grid defines a grid sample $G^{(i)}$. The time-to-failure of $G^{(i)}$ is a deterministic value and will be referred to as the TTF sample of $G^{(i)}$ and denoted as $TTF^{(i)}$. To obtain a TTF sample of a given grid sample $G^{(i)}$, the authors [19] construct a set of LTI systems (each corresponding to an interconnect tree) as in (2.58), with initial thermal stress. Each of these systems is then solved numerically (simulated) to determine the earliest void nucleation among all trees. The simulation is then interrupted, a void growth model and a resistance model are applied, and the impact on voltage drop is found. Based on this, the grid is either declared failed or the next round of simulation is started to find the next nucleation time, and this process is repeated until a grid failure is found. This flow is part of an overall Monte Carlo (MC) loop that accounts for the randomness in the lines and ultimately provides the grid mean time-to-failure (MTF) as the average

\[
MTF \approx \frac{1}{N_{mc}} \sum_{i=1}^{N_{mc}} TTF^{(i)} \tag{2.60}
\]

where Nmc is the number of TTF samples required to satisfy a user-specified error tolerance.
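The Monte Carlo average in (2.60) can be sketched as follows. Everything grid-specific here is a hypothetical stand-in: `ttf_sample` replaces the LTI-based simulation of one grid sample by a toy rule (each branch fails at a time inversely proportional to its lognormal diffusivity, and the grid fails at the second branch failure), and the stopping rule based on a 95% confidence half-width is an assumption about how the error tolerance might be enforced.

```python
import math
import random


def ttf_sample(rng, n_branches=4, sigma_ln=0.5):
    """Toy TTF of one grid sample (stand-in for the physics-based flow)."""
    diffusivities = [rng.lognormvariate(0.0, sigma_ln)
                     for _ in range(n_branches)]
    branch_ttf = sorted(1.0 / d for d in diffusivities)
    return branch_ttf[1]            # grid fails at the second branch failure


def estimate_mtf(rel_tol=0.05, min_samples=30, max_samples=100000, seed=1):
    """Average TTF samples until the relative error tolerance is met."""
    rng = random.Random(seed)
    n, s1, s2 = 0, 0.0, 0.0         # sample count, sum, sum of squares
    while n < max_samples:
        x = ttf_sample(rng)
        n, s1, s2 = n + 1, s1 + x, s2 + x * x
        if n >= min_samples:
            mean = s1 / n
            var = max(s2 - s1 * s1 / n, 0.0) / (n - 1)
            half_width = 1.96 * math.sqrt(var / n)
            if half_width <= rel_tol * mean:
                break
    return s1 / n, n


mtf, n_mc = estimate_mtf()
```

Tightening `rel_tol` increases the number of samples roughly quadratically, which is why the cost of each TTF sample dominates the overall runtime of such a loop.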

Chapter 3

Generating Current Budgets for RC Grids Revisited

In this chapter, we follow up on the constraints generation framework developed in [50]. Specifically, we will focus on the set of circuit current constraints

\[
M' i(t) \le MGu, \quad \forall t
\]

which was shown to guarantee safe voltage levels on the grid for any safe voltage assignment u. Specific safe current containers are generated by means of maximization of a certain design objective g(u), over all safe voltage assignments u.

We will provide some important theoretical and algorithmic improvements to the work done in [50]:

1. We show that, under certain conditions on u, the current container has a simpler form; the container can be represented using significantly fewer inequalities.

2. We characterize the properties of design objectives g(·) that can provably produce maximal containers.

Furthermore, we develop algorithms that target the same design objectives as in [50] but are more computationally efficient (∼1000× speedup on average over the algorithms in [50], taking around 30 minutes on a 2.7-million-node grid), which makes these algorithms more suitable for real grid designs. We develop a new algorithm that generates a maximal container targeting a combination of the design objectives presented in [50]. Finally, we investigate the trade-offs between the different proposed algorithms.

We consider an RC power grid (as described in Section 2.4.1) with n non-Vdd nodes. Out of these n nodes, m nodes are attached to ideal current sources and d nodes have a voltage threshold specification. So, G is the n × n conductance matrix, A is the n × n matrix that appears as a result of the time-discretization in (2.16), $M = A^{-1}$ is the n × n matrix inverse of A, H is the n × m matrix that identifies which node is connected to which current source as in (2.8), M′ is the n × m matrix defined as M′ = MH, P is the d × n matrix which identifies the nodes with a voltage threshold specification, Vth is a d × 1 vector of voltage thresholds, and i(t) is the m × 1 vector of current waveforms of the m current sources. Furthermore, I is an m × 1 vector with units of current which will be used to represent i(t) at any arbitrary time point t, and u (which will appear with subscripts or superscripts) is an n × 1 vector with units of Volts which will be used to construct a current container. The set U represents the set of all safe voltage drop assignments, as defined in (2.48), the set F(u) is a special kind of current container constructed based on u, as defined in (2.49), and the set S represents the set of all safe current containers, as defined in (2.50). The mathematical expressions of these sets are restated below:

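\[
\mathcal{U} = \{u \in \mathbb{R}^n : u \ge 0,\ Pu \le V_{th}\} \tag{3.1}
\]
\[
\mathcal{F}(u) = \{I \in \mathbb{R}^m : I \ge 0,\ M'I \le MGu\} \tag{3.2}
\]
\[
\mathcal{S} = \{\mathcal{F}(u) : u \in \mathcal{U}\} \tag{3.3}
\]

Finally, v(F(u)) is the n × 1 vector of upper-bounds on the worst-case voltage drop at all grid nodes under all possible current waveforms contained in a current container F(u), as given in (2.46) and restated below:

\[
v(\mathcal{F}(u)) = G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}(u)} (M'I) \tag{3.4}
\]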

3.2 Redundant Constraints

One concern with F(u) is that the set of current constraints M′I ≤ MGu can be very large for large grids. In this section, we will show that, under certain conditions, some of these constraints are redundant, i.e. they can be removed from the system M′I ≤ MGu without altering the set F(u). Hence, these constraints should be discarded, which reduces the number of constraints without affecting the container F(u). As we will see, the redundant constraints are the rows of the system M′I ≤ MGu that correspond to the nodes in the grid that are not connected to current sources. The intuition behind this result is that the voltage drops on these nodes do not affect the grid safety; the nodes of interest are the nodes where current sources are connected (i.e. the nodes where the chip circuitry is connected and that usually have a voltage drop threshold specification), and so, the system M′I ≤ MGu can be reduced (under certain conditions) to constraints on those nodes only. In Fig. 3.1, we show an example of a current container which was produced on a 14-node grid with 2 current sources but can be represented using only 2 constraints. We introduce the following notation to express the familiar matrices G, A, and M in block form:

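\[
G = \begin{bmatrix} G_1 \\ G_2 \end{bmatrix}, \qquad
A = \begin{bmatrix} A_1 & A_2 \\ A_2^T & A_3 \end{bmatrix}, \qquad
M = \begin{bmatrix} M_1 & M_2 \\ M_2^T & M_3 \end{bmatrix} \tag{3.5}
\]

where A1 and M1 are m × m matrices; A2 and M2 are m × (n−m) matrices; A3 and M3 are (n−m) × (n−m) matrices; and G1 and G2 are m × n and (n−m) × n matrices, respectively. Define the n × 1 vector w and its sub-vectors $w^{(1)}$ and $w^{(2)}$ as follows:

\[
w = \begin{bmatrix} w^{(1)} \\ w^{(2)} \end{bmatrix} = MGu \tag{3.6}
\]

where $w^{(1)}$ and $w^{(2)}$ are m × 1 and (n−m) × 1, respectively.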

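Lemma 3.1. For any $u \in \mathbb{R}^n$, if $A_3^{-1} G_2 u \ge 0$, then, $\forall x \in \mathbb{R}^m_+$:

\[
M_1 x \le w^{(1)} \iff M'x \le w \tag{3.7}
\]

Proof. Clearly, if $M'x \le w$, then $M_1 x \le w^{(1)}$. To prove that $M_1 x \le w^{(1)} \implies M'x \le w$, consider the following:

\[
\begin{bmatrix} A_1 & A_2 \\ A_2^T & A_3 \end{bmatrix}
\begin{bmatrix} M_1 & M_2 \\ M_2^T & M_3 \end{bmatrix}
= \begin{bmatrix} I_m & 0 \\ 0 & I_{n-m} \end{bmatrix} \tag{3.8}
\]

Using (3.8), we have:

\[
A_2^T M_1 + A_3 M_2^T = 0 \tag{3.9}
\]
\[
A_2^T M_2 + A_3 M_3 = I_{n-m} \tag{3.10}
\]

or equivalently,

\[
A_2^T M_1 = -A_3 M_2^T \tag{3.11}
\]
\[
A_2^T M_2 = I_{n-m} - A_3 M_3 \tag{3.12}
\]

Assume that $M_1 x \le w^{(1)}$, that is,

\[
M_1 x \le w^{(1)} = M_1 G_1 u + M_2 G_2 u \tag{3.13}
\]

Because $A_2^T \le 0$, multiplying (3.13) by $A_2^T$ we get:

\[
A_2^T M_1 x \ge A_2^T M_1 G_1 u + A_2^T M_2 G_2 u \tag{3.14}
\]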

[Figure 3.1 plot; horizontal axis: I1 (mA)]

Figure 3.1: An example of a current container with redundant constraints which was produced on a 14-node grid with 2 current sources. The redundant constraints are shown as dotted lines whereas the non-redundant constraints are shown as solid lines.

Using (3.11) and (3.14), we get:

\[
-A_3 M_2^T x \ge -A_3 M_2^T G_1 u + A_2^T M_2 G_2 u \tag{3.15}
\]

Because A3 is a principal submatrix of A, A3 is a non-singular M-matrix, due to Lemma 2.3, so that $A_3^{-1}$ exists and $A_3^{-1} \ge 0$, due to Lemma 2.4. Therefore, we can multiply (3.15) by $-A_3^{-1}$ to get:

\[
M_2^T x \le M_2^T G_1 u - A_3^{-1} A_2^T M_2 G_2 u \tag{3.16}
\]

Now, using (3.12) and (3.16), we get:

\begin{align*}
M_2^T x &\le M_2^T G_1 u - A_3^{-1}\left(I_{n-m} - A_3 M_3\right) G_2 u \\
&= M_2^T G_1 u + M_3 G_2 u - A_3^{-1} G_2 u \\
&\le M_2^T G_1 u + M_3 G_2 u = w^{(2)}
\end{align*}

where we used the fact that $A_3^{-1} G_2 u \ge 0$. Therefore, $M'x \le w$ and the proof is complete.

In other words, if $A_3^{-1} G_2 u \ge 0$, then the system of inequalities M′I ≤ MGu can be reduced from n inequalities to m, where m is the number of current sources attached to the grid.
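The reduction can be checked numerically on a small example. The sketch below is an illustration only: the 3-node grid, the diagonal term added to G to form A, and the vector I0 are hypothetical values, with the two current-source nodes ordered first as in the block notation above. It picks u such that Gu = [I0; 0], so that G2u = 0 and the condition of the lemma holds, and then verifies on a grid of test points that the last row never cuts off a point that satisfies the first m rows.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


n, m = 3, 2
# Hypothetical conductance matrix: nodes 0 and 1 carry current sources,
# node 2 connects them to Vdd.
G = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0], [-1.0, -1.0, 4.0]]
# A = G + B with an assumed diagonal B = 0.5 * I (capacitance over time step).
A = [[G[i][j] + (0.5 if i == j else 0.0) for j in range(n)] for i in range(n)]

# Columns of M' = M H are the first m columns of M = A^{-1}.
cols = [solve(A, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(m)]
M_prime = [[cols[j][i] for j in range(m)] for i in range(n)]   # n x m

I0 = [0.02, 0.01]
u = solve(G, I0 + [0.0] * (n - m))     # G u = [I0; 0], hence G2 u = 0
w = solve(A, matvec(G, u))             # w = M G u


def satisfied(x, rows, tol=0.0):
    return all(sum(M_prime[r][j] * x[j] for j in range(m)) <= w[r] + tol
               for r in rows)


grid = [0.005 * k for k in range(11)]
feasible_top = [(x1, x2) for x1 in grid for x2 in grid
                if satisfied((x1, x2), range(m))]
# Small tolerance only to absorb floating-point round-off.
redundant_ok = all(satisfied(x, range(m, n), tol=1e-12) for x in feasible_top)
```

A sampled check like this does not replace the proof; it is merely a quick sanity test that can be run on any small grid before dropping rows.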

Corollary. For any $u \in \mathbb{R}^n$, if $Gu \ge 0$, then:

\[
M_1 x \le w^{(1)} \iff M'x \le w \tag{3.17}
\]

Proof. Clearly, if $M'x \le w$, then $M_1 x \le w^{(1)}$. We now prove that $M_1 x \le w^{(1)} \implies M'x \le w$. Let $u \in \mathbb{R}^n$ be such that $Gu \ge 0$, so that $G_2 u \ge 0$. Recall that A3 is a principal submatrix of A, so that $A_3^{-1}$ exists and $A_3^{-1} \ge 0$; it follows that $A_3^{-1} G_2 u \ge 0$. Using Lemma 3.1, it follows that $M_1 x \le w^{(1)} \implies M_2^T x \le w^{(2)}$, which gives $M'x \le w$.

The above corollary provides a sufficient algebraic condition under which some constraints in (3.2) are redundant. This will be useful in the next section.

3.3 Applications

So far, we have seen that a container F(u) is maximal in S if and only if u satisfies the conditions of Theorem 2.1. In this section, we will describe some design objectives and corresponding algorithms for finding specific maximal safe containers F(u). These algorithms will each be formulated as a maximization of a certain design objective g(u), over all u ∈ U. Lemma 3.5 establishes a sufficient condition on g(·) for which the algorithms proposed below will be shown to produce maximal containers. The algorithms in [50] were proven to generate maximal containers based on Theorem 2.1. In this section, we will provide simpler proofs based on Lemma 3.5.

The proof of Lemma 3.5 depends on Lemmas 3.2, 3.3, and 3.4 that were proved in [50] and are restated below for completeness of presentation.

Lemma 3.2. [50] For any $u \in \mathbb{R}^n_+$, we have 0 ≤ v(F(u)) ≤ u.

Lemma 3.3. [50] For any $u \in \mathbb{R}^n$, let u′ = v(F(u)); it follows that F(u′) = F(u).

Lemma 3.4. [50] For any $u \in \mathbb{R}^n_+$, u is irreducible if and only if F(u) is non-empty and v(F(u)) = u.

Lemma 3.5. Given a real-valued function $g(\cdot) : \mathbb{R}^n \to \mathbb{R}$ such that, for any u, u′ ∈ U, we have:

i) g(u′) = g(u) if F(u′) = F(u)

ii) g(u′) > g(u) if MGu′ > MGu

Furthermore, let:

\[
g^* \triangleq \max_{u \in \mathcal{U}}\, [g(u)] \tag{3.18}
\]

and let u∗ ∈ U be such that F(u∗) is not empty and g(u∗) = g∗. It follows that F(u∗) is maximal in S.

Proof. We will prove that u∗ is irreducible and extremal in U, so that F(u∗) is maximal in S, due to Theorem 2.1. The proof is in two parts.

First, we will prove that u∗ is extremal in U; the proof will be by contradiction. Let u ∈ U be such that F(u) is not empty and g(u) = g∗ and suppose that u is not extremal in U, so that u ≥ 0 and Pu < Vth. Let $\epsilon \triangleq \min_{\forall k}\,(V_{th,k} - Pu|_k) > 0$, let $1_n$ be the n × 1 vector whose every entry is 1, and let $u' = u + \epsilon 1_n \ge 0$. Because P has exactly one 1 in every row, it follows that $P 1_n = 1_d$, and $Pu' = Pu + \epsilon P 1_n = Pu + \epsilon 1_d \le V_{th}$ due to the definition of ε, from which u′ ∈ U. Note that $MGu' = MGu + \epsilon MG 1_n$ and, because G is irreducibly diagonally dominant with positive diagonal and non-positive off-diagonal entries, from which $G 1_n \ge 0$ with $G 1_n \ne 0$, then $\epsilon MG 1_n > 0$ due to $\epsilon M > 0$, and so MGu′ > MGu. It follows that g(u′) > g(u) = g∗ with u′ ≠ u, which contradicts (3.18). Therefore, u is extremal in U, so that u∗ is extremal in U, which completes the first part of the proof.

Next, we will prove that u∗ is irreducible; the proof will be by contradiction. Let u ∈ U be such that F(u) is not empty and g(u) = g∗ and suppose that u is reducible; then by Lemma 3.4 we must have v(F(u)) ≠ u. Let u′ = v(F(u)), so that F(u′) = F(u) due to Lemma 3.3. Because 0 ≤ u′ ≤ u due to Lemma 3.2, from which Pu′ ≤ Pu due to P ≥ 0, so that u′ ∈ U, the conditions of the lemma provide that g(u′) = g(u) = g∗. Let δ = MGu − MGu′. Note that $MGu' = MGv(\mathcal{F}(u)) = \operatorname{emax}_{I \in \mathcal{F}(u)}(M'I) \le MGu$, due to (3.2), and MGu ≠ MGv(F(u)), due to v(F(u)) ≠ u, from which δ ≥ 0 and δ ≠ 0. Combining this with $G^{-1}A > 0$, from (2.17), we have $0 < G^{-1}A\delta = u - u'$. Consequently, we have 0 ≤ u′ < u, so that Pu′ < Pu ≤ Vth, the final step due to P ≥ 0, P having no row with all zeros, and u ∈ U, so that u′ is not extremal in U. But this contradicts the first part of the proof. It follows that u is irreducible, so that u∗ is irreducible. Therefore, F(u∗) is maximal in S.

3.3.1 Peak Power Dissipation

An interesting quality metric for a power grid is the peak total power dissipation that it can safely support in the underlying circuit. We refer here to the instantaneous power dissipation, which is conservatively approximated by $V_{dd} \sum_{j=1}^{m} i_j(t)$. Thus, we are interested in a safe container that is maximal in S and that allows the highest possible $\sum_{\forall j} I_j$. Generally, one might be interested in the highest weighted sum of the individual currents, i.e. $\sum_{\forall j} q_j I_j$, where qj > 0 is a user-specified weight on the jth current source. This will allow certain areas of the die to support larger power dissipation than other areas. However, in the following, we assume that all current sources have equal weights and, hence, we will be finding the peak total power dissipation that the grid can safely support. The authors in [50] developed an algorithm targeting the same design objective, i.e. a container that allows the highest possible $\sum_{\forall j} I_j$. In fact, the algorithm developed below produces the exact same container as the one produced using the algorithm in [50]. The major contribution of this section is the result of Lemma 3.7, which allows us to simplify the algorithm, achieve a huge speedup over [50], and prove that the resulting container has redundant constraints.

For any u ∈ U, we define σ(u) to be the largest sum of current source values allowed under F(u):

\[
\sigma(u) \triangleq \max_{I \in \mathcal{F}(u)} \left(\sum_{j=1}^{m} I_j\right) \tag{3.19}
\]

and we define σ∗ to be the largest σ(u) achievable over all possible u ∈ U, i.e.:

\[
\sigma^* \triangleq \max_{u \in \mathcal{U}}\, (\sigma(u)) \tag{3.20}
\]

Let up ∈ U be such that σ(up) = σ∗, and I∗ ∈ F(up) be such that $\sum_{j=1}^{m} I^*_j = \sigma^*$. In general, up and I∗ may not be unique. Based on (3.1) and (3.2), we can express the combined (3.19) and (3.20) as the following linear program (LP):

\[
\begin{aligned}
\sigma^* = \text{Max} \quad & \sum_{j=1}^{m} I_j \\
\text{subject to} \quad & M'I \le MGu, \quad Pu \le V_{th}, \\
& I \ge 0, \quad u \ge 0
\end{aligned}
\tag{3.21}
\]

Let D be the feasible region of the LP (3.21):

\[
\mathcal{D} \triangleq \{(I, u) : I \ge 0,\ u \ge 0,\ M'I \le MGu,\ Pu \le V_{th}\} \tag{3.22}
\]

so that, from the above, we have:

\[
\sigma^* = \max_{(I,u) \in \mathcal{D}} \left(\sum_{j=1}^{m} I_j\right) \tag{3.23}
\]

Notice that (0, 0) ∈ D, so that D is not empty and all of σ∗, up, and I∗ are well-defined. Also, for every (I, u) ∈ D, we have M′I ≤ MGu and I ≥ 0 which, because M′ ≥ 0, gives 0 ≤ M′I ≤ MGu, so that 0 ∈ F(u) and F(u) is not empty. Therefore, F(up) is not empty, so that the container $\mathcal{F}(u_p) = \{I \in \mathbb{R}^m : I \ge 0,\ M'I \le MGu_p\} \ne \emptyset$ provides the desired current constraints:

\[
i(t) \ge 0, \quad \forall t \in \mathbb{R}
\]
\[
M'i(t) \le MGu_p, \quad \forall t \in \mathbb{R}
\]

The following lemma establishes the maximality of F(up), based on Lemma 3.5. Denote by cj and c′j the jth columns of M and M′, respectively, and notice that c′j = cj, for every j ∈ {1, 2, . . . , m}. Also, denote by mij the (i, j)th element of M.

Lemma 3.6. F(up) is maximal in S.

Proof. Recall that I∗ and up are well-defined and (I∗, up) ∈ D, so that M′I∗ ≤ MGup and I∗ ≥ 0 which, because M′ ≥ 0, gives 0 ≤ M′I∗ ≤ MGup and so F(up) is not empty. We will prove that σ(·) satisfies the conditions of Lemma 3.5, from which F(up) is maximal in S. First, notice that for any u, u′ ∈ U, if F(u′) = F(u), it follows that σ(u′) = σ(u), due to (3.19). It remains to prove that for any u, u′ ∈ U, if MGu′ > MGu, then σ(u′) > σ(u).

For any u ∈ U, there must exist a vector I ∈ F(u) such that $\sigma(u) = \sum_{j=1}^{m} I_j$. Let

\[
\lambda = \frac{\min_{\forall i}\left(MGu'|_i - MGu|_i\right)}{\max_{\forall i,j}\,(m_{ij})} \tag{3.24}
\]

Because MGu′ > MGu and M > 0, it follows that λ > 0. Also, let $e_1 \in \mathbb{R}^m$ be the vector whose 1st entry is 1 and all other entries are 0 and let I′ = I + λe1. Because λ > 0, we have λe1 ≥ 0, λe1 ≠ 0, I′ ≥ I ≥ 0, and I′ ≠ I, so that $\sum_{j=1}^{m} I'_j > \sum_{j=1}^{m} I_j$. Furthermore, we have I′ ∈ F(u′), because:

\begin{align}
M'I' &= M'I + \lambda M'e_1 = M'I + \lambda c'_1 \tag{3.25} \\
&= M'I + \frac{\min_{\forall i}\left(MGu'|_i - MGu|_i\right)}{\max_{\forall i,j}\,(m_{ij})}\, c_1 \tag{3.26} \\
&\le MGu + \min_{\forall i}\left(MGu'|_i - MGu|_i\right) 1_n \tag{3.27} \\
&\le MGu' \tag{3.28}
\end{align}

where in (3.27) we used I ∈ F(u) and $c_1 / \max_{\forall i,j}(m_{ij}) \le 1_n$. Therefore, we have $\sigma(u') \ge \sum_{j=1}^{m} I'_j > \sum_{j=1}^{m} I_j = \sigma(u)$, so that σ(·) satisfies the conditions of Lemma 3.5 and F(up) is maximal in S.

Maximality is an all-important property, and is guaranteed by the above lemma. However, we also want to ensure good computational performance, and the next lemma will help us achieve that. In fact, the importance of the following lemma is two-fold. First, it simplifies the LP (3.21) into (3.34), achieving a huge speedup, as we will see in Section 3.4. Second, it shows that, after solving for up, the resulting F(up) can be represented using only m rows of M′I ≤ MGup.

Lemma 3.7. Let $u^* = G^{-1}HI^*$; then u∗ ∈ U and σ(u∗) = σ∗.

Proof. Recall that M′ = MH ≥ 0 and I∗ ≥ 0, so that M′I∗ ≥ 0. Moreover, because I∗ ∈ F(up), we have:

\[
0 \le M'I^* = MHI^* \le MGu_p \tag{3.29}
\]

Multiplying (3.29) with $G^{-1}A \ge 0$, from (2.17), we get:

\[
0 \le G^{-1}HI^* \le u_p \tag{3.30}
\]

Therefore, we have $0 \le u^* = G^{-1}HI^* \le u_p$, so that Pu∗ ≤ Pup ≤ Vth, where the final step is due to up ∈ U. It follows that u∗ ∈ U. Moreover, we have that MGu∗ = MHI∗ = M′I∗, from which I∗ ∈ F(u∗), so that σ(u∗) = σ∗, due to (3.19), and the proof is complete.

Recall that up is defined to be any vector u ∈ U such that σ(u) = σ∗. Therefore, using Lemma 3.7, we can let $u_p = G^{-1}HI^*$. Define the set D′ as follows:

\[
\mathcal{D}' \triangleq \{(I, u) : I \ge 0,\ u \ge 0,\ Pu \le V_{th},\ u = G^{-1}HI\} \tag{3.31}
\]

Notice that, for any (I, u) ∈ D′, we have $u = G^{-1}HI$, so that MGu = MHI = M′I which, combined with I ≥ 0, u ≥ 0, and Pu ≤ Vth, gives (I, u) ∈ D. Therefore, we have D′ ⊆ D. Also, because (I∗, up) ∈ D′, then $\sigma^* = \max_{(I,u) \in \mathcal{D}'} \left(\sum_{j=1}^{m} I_j\right)$, which can be found using the LP:

\[
\begin{aligned}
\sigma^* = \text{Max} \quad & \sum_{j=1}^{m} I_j \\
\text{subject to} \quad & u = G^{-1}HI, \quad Pu \le V_{th}, \\
& I \ge 0, \quad u \ge 0
\end{aligned}
\tag{3.32}
\]

Recall that $H = \begin{bmatrix} I_m \\ 0 \end{bmatrix}$, so that for every (I, u) ∈ D′, we have:

\[
Gu = \begin{bmatrix} G_1 u \\ G_2 u \end{bmatrix} = HI = \begin{bmatrix} I \\ 0 \end{bmatrix} \tag{3.33}
\]

from which G1u = I and G2u = 0. Using (3.33), we can rewrite (3.32) as:

\[
\begin{aligned}
\sigma^* = \text{Max} \quad & 1_m^T G_1 u \\
\text{subject to} \quad & G_1 u \ge 0, \quad G_2 u = 0, \\
& Pu \le V_{th}, \quad u \ge 0
\end{aligned}
\tag{3.34}
\]

The LP in (3.34) is a remarkable simplification of (3.21) for two reasons: 1) it has only n variables and 2n constraints compared to n + m variables and 2n + m constraints, and 2) $I_n - MB = I_n - M(A - G) = MG$, which means that MG is a dense matrix, because In and B are diagonal matrices and M is a dense matrix, so that the constraints of (3.21) are dense whereas the constraints of (3.34) are sparse.

Furthermore, because $u_p = G^{-1}HI^*$, then $Gu_p = [I^*\ 0]^T \ge 0$. Hence, using the Corollary to Lemma 3.1 and for w = MGup, it follows that $M_1 I \le w^{(1)} \iff M'I \le w$, i.e. r′jI ≤ rjGup is redundant, ∀j ∈ {m + 1, . . . , n}, where r′j denotes the jth row of M′ and rj the jth row of M. This being said, the container F(up) can be expressed as $\mathcal{F}(u_p) = \{I \in \mathbb{R}^m : I \ge 0,\ r'_j I \le r_j G u_p,\ \forall j \in \{1, \ldots, m\}\}$, which provides the desired current constraints:

\[
i(t) \ge 0, \quad \forall t \in \mathbb{R}
\]
\[
r'_j\, i(t) \le r_j G u_p, \quad \forall j \in \{1, \ldots, m\},\ \forall t \in \mathbb{R}
\]

As an example, the LP (3.34) is run on the small grid in Fig. 3.2 and the resulting container is shown in Fig. 3.3, where up = [89 100 95 98]^T (units of mV). Notice that this method, in order to allow the maximum peak power, may generate a container that is skewed in a way that imposes a tight constraint on current in certain locations of the die (such as at i2(t), in Fig. 3.3) while allowing larger current in other locations (such as at i1(t)). Other approaches are possible to avoid this skew and even out the current budgets, as we will see next.

Figure 3.2: An example of a power grid with 4 nodes, 2 current sources, and Vth = [110 100 95 105]^T (units of mV).

Figure 3.3: An example of F(up), F(us), and F(uc).

3.3.2 Uniform Current Distribution

The design team may be interested in a grid that safely supports a uniform current distribution across the die, so as to allow a placement that provides a uniform temperature distribution. This design objective is important because it reduces thermal gradients on the die which may impact the reliability of the chip. We can generate constraints that allow that objective by searching for a safe maximal container F(u) that contains the hypercube in current space that has the largest volume, or the largest edge length L. In other words, this method aims to raise the minimum and avoid the skew indicated above. We will develop a method (3.43) which, when applied to the simple grid in Fig. 3.2, generates the container F(ul) shown in Fig. 3.3, where ul = [84 91 95 93]^T (units of mV). The authors in [50] developed an algorithm that targets the same quality metric, i.e. uniform current distribution across the die area, by searching for a container that contains the hypersphere in the current space that has the largest volume. The algorithm in [50] applied to the grid in Fig. 3.2 generates the container F(us) shown in Fig. 3.3, where us = [83 91 95 92]^T (units of mV). We can see from the figure that both containers allow roughly similar current distributions. We will see in Section 3.4 that this observation also holds for larger grids. The main reason for using the new algorithm is the significant runtime advantage it provides over the hypersphere-based algorithm, as we will see in Section 3.4.

Let $C(L) \subset \mathbb{R}^m$ denote the hypercube with edge length L, i.e. $C(L) = \{I : 0 \le I \le L 1_m\}$. We are interested in a non-empty F(u) such that C(L) ⊆ F(u). Let $\eta = M' 1_m \ge 0$, because M′ ≥ 0. In the following lemma, we will derive a necessary and sufficient algebraic condition for which C(L) ⊆ F(u); this will be useful to solve (3.35).

Lemma 3.8. For any L ≥ 0 and $u \in \mathbb{R}^n$, C(L) ⊆ F(u) if and only if Lη ≤ MGu.

Proof. To prove the if direction, assume that Lη ≤ MGu and let I ∈ C(L), i.e. $0 \le I \le L 1_m$, so that $0 \le M'I \le LM' 1_m = L\eta$, due to M′ ≥ 0. Therefore, we have M′I ≤ Lη ≤ MGu and I ≥ 0, so that I ∈ F(u). Conversely, assume that C(L) ⊆ F(u), let $I = L 1_m$ and notice that I ∈ C(L), so that I ∈ F(u). Therefore, $M'I = LM' 1_m = L\eta \le MGu$, and the proof is complete.
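The lemma replaces a containment test over the whole hypercube by n scalar inequalities. A small numerical cross-check is sketched below: it compares the algebraic test with a brute-force test over the 2^m vertices of C(L), which is sufficient because F(u) is convex. The matrix `M_prime` and the right-hand side `w` (standing in for MGu) are hypothetical non-negative values, not derived from an actual grid.

```python
from itertools import product


def cube_in_container(M_prime, w, L):
    """Brute force: every vertex of C(L) must satisfy M' I <= w."""
    m = len(M_prime[0])
    for vertex in product((0.0, L), repeat=m):
        for row, w_k in zip(M_prime, w):
            if sum(a * i for a, i in zip(row, vertex)) > w_k:
                return False
    return True


def lemma_condition(M_prime, w, L):
    """Algebraic test of Lemma 3.8: L * eta <= w, with eta = M' 1_m."""
    return all(L * sum(row) <= w_k for row, w_k in zip(M_prime, w))


# Hypothetical data: n = 3 rows, m = 2 current sources.
M_prime = [[2.0, 1.0], [1.0, 3.0], [0.5, 0.5]]
w = [6.0, 8.0, 3.0]

agreement = {L: (cube_in_container(M_prime, w, L),
                 lemma_condition(M_prime, w, L))
             for L in (0.5, 1.0, 2.0, 2.5, 4.0)}
```

The brute-force test costs on the order of n·2^m operations, whereas the algebraic condition needs only η and n comparisons, which is what makes the LP formulations that follow practical.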

For any u ∈ U, we define l(u) to be the largest L ≥ 0 for which C(L) ⊆ F(u), or equivalently, for which Lη ≤ MGu is satisfied, so that:

\[
l(u) \triangleq \max_{C(L) \subseteq \mathcal{F}(u)} (L) = \max_{0 \le L\eta \le MGu} (L) \tag{3.35}
\]

and we define L∗ to be the largest l(u) achievable over all possible u ∈ U, i.e.:

\[
L^* \triangleq \max_{u \in \mathcal{U}}\, (l(u)) \tag{3.36}
\]

Let ul be a vector at which the above maximization attains its maximum. In other words, ul ∈ U is such that l(ul) = L∗ and C(L∗) ⊆ F(ul). In general, ul may not be unique. Based on (3.1), we can express the combined (3.35) and (3.36) as the following linear program (LP):

\[
\begin{aligned}
L^* = \text{Maximize} \quad & L \\
\text{subject to} \quad & L\eta \le MGu, \quad Pu \le V_{th}, \\
& L \ge 0, \quad u \ge 0
\end{aligned}
\tag{3.37}
\]

Let T be the feasible region of the LP (3.37):

\[
\mathcal{T} \triangleq \{(L, u) : L \ge 0,\ u \ge 0,\ L\eta \le MGu,\ Pu \le V_{th}\} \tag{3.38}
\]

so that:

\[
L^* = \max_{(L,u) \in \mathcal{T}} (L) \tag{3.39}
\]

Notice that (0, 0) ∈ T, so that T is not empty and L∗ and ul are well-defined. Also, for every (L, u) ∈ T, we have Lη ≤ MGu and L ≥ 0. Because η ≥ 0, it follows that 0 ≤ Lη ≤ MGu, so that 0 ∈ F(u) and F(u) is not empty, meaning that the container $\mathcal{F}(u_l) = \{I \in \mathbb{R}^m : I \ge 0,\ M'I \le MGu_l\} \ne \emptyset$. Therefore, F(ul) provides the desired current constraints:

\[
i(t) \ge 0, \quad \forall t \in \mathbb{R}
\]
\[
M'i(t) \le MGu_l, \quad \forall t \in \mathbb{R}
\]

Lemma A.1 in Appendix A.1 establishes the maximality of F(ul), based on Lemma 3.5.

The importance of the following lemma is two-fold. First, it simplifies the LP (3.37) into (3.43), achieving a significant speedup. Second, it shows that, after solving for ul, the resulting F(ul) can be represented using only m rows of M′I ≤ MGul.

Lemma 3.9. Let $u^* = L^* G^{-1} H 1_m$; then u∗ ∈ U and l(u∗) = L∗.

The proof of Lemma 3.9 is available in Appendix A.1, as Lemma A.2.

Recall that ul is defined to be any vector u ∈ U such that l(u) = L∗. Therefore, using Lemma 3.9, we can let $u_l = L^* G^{-1} H 1_m$. Define the set T′ as follows:

\[
\mathcal{T}' \triangleq \{(L, u) : L \ge 0,\ u \ge 0,\ Pu \le V_{th},\ u = LG^{-1}H 1_m\} \tag{3.40}
\]

Notice that, for any (L, u) ∈ T′, we have $u = LG^{-1}H 1_m$, so that $MGu = LM' 1_m = L\eta$ which, combined with L ≥ 0, u ≥ 0, and Pu ≤ Vth, gives (L, u) ∈ T. Therefore, we have T′ ⊆ T. Also, because (L∗, ul) ∈ T′, then $L^* = \max_{(L,u) \in \mathcal{T}'} (L)$, which can be found using the LP:

\[
\begin{aligned}
L^* = \text{Maximize} \quad & L \\
\text{subject to} \quad & u = LG^{-1}H 1_m, \quad Pu \le V_{th}, \\
& L \ge 0, \quad u \ge 0
\end{aligned}
\tag{3.41}
\]

Recall that $H = \begin{bmatrix} I_m \\ 0 \end{bmatrix}$, so that for every (L, u) ∈ T′, we have:

\[
Gu = \begin{bmatrix} G_1 u \\ G_2 u \end{bmatrix} = LH 1_m = L \begin{bmatrix} 1_m \\ 0 \end{bmatrix} \tag{3.42}
\]

from which $G_1 u = L 1_m$ and $G_2 u = 0$. Using (3.42), we can rewrite (3.41) as:

\[
\begin{aligned}
L^* = \text{Maximize} \quad & L \\
\text{subject to} \quad & G_1 u = L 1_m, \quad G_2 u = 0, \\
& Pu \le V_{th}, \quad L, u \ge 0
\end{aligned}
\tag{3.43}
\]
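Since the equality constraints of (3.43) force Gu = L·H·1m, every feasible u is a scalar multiple of the fixed vector g = G^{-1}H1m. On a small example this means the optimum can be read off with a ratio test, which gives a convenient way to sanity-check an LP solver's answer. The sketch below assumes that G is non-singular with G^{-1} ≥ 0 (so that u ≥ 0 holds automatically) and that every thresholded node has a positive entry in g; the 3-node grid and the thresholds are hypothetical, and this is not presented as the way (3.43) is solved in practice.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def largest_hypercube(G, m, threshold_nodes, v_th):
    """Return (L*, u_l) for (3.43) via a ratio test along g = G^{-1} H 1_m.

    threshold_nodes lists the node indices selected by the rows of P and
    v_th holds the matching thresholds (both hypothetical inputs here).
    """
    n = len(G)
    g = solve(G, [1.0] * m + [0.0] * (n - m))   # G g = H 1_m
    L_star = min(v / g[k] for k, v in zip(threshold_nodes, v_th))
    return L_star, [L_star * gk for gk in g]


# Hypothetical grid: nodes 0 and 1 carry current sources (m = 2).
G = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0], [-1.0, -1.0, 4.0]]
L_star, u_l = largest_hypercube(G, m=2, threshold_nodes=[0, 1, 2],
                                v_th=[0.10, 0.08, 0.09])
```

The resulting edge length is set by the node whose threshold is tightest relative to its entry in g; the other thresholds are left with slack.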

The LP in (3.43) is a remarkable simplification of (3.37) for two reasons: 1) $I_n - MB = I_n - M(A - G) = MG$, which means that MG is a dense matrix, because In and B are diagonal matrices and M is a dense matrix, so that the constraints of (3.37) are dense whereas the constraints of (3.43) are sparse, and 2) it does not require the computation of $\eta = M' 1_m$ and, thus, saves the time required to compute an LU-factorization of A and perform a forward/backward substitution.

Let $w = \begin{bmatrix} w^{(1)} \\ w^{(2)} \end{bmatrix} = MGu_l$, where $w^{(1)}$ and $w^{(2)}$ are m × 1 and (n − m) × 1 vectors, respectively. Also, let $M' = \begin{bmatrix} M_1 \\ M_2 \end{bmatrix}$, where M1 and M2 are m × m and (n−m) × m matrices, respectively. Because $u_l = L^* G^{-1} H 1_m$, then $Gu_l = L^* H 1_m \ge 0$. Hence, using the Corollary to Lemma 3.1 and for w = MGul, it follows that $M_1 I \le w^{(1)} \iff M'I \le w$, i.e. r′jI ≤ rjGul is redundant, ∀j ∈ {m + 1, . . . , n}, where r′j denotes the jth row of M′ and rj the jth row of M. This being said, the container F(ul) can be expressed as $\mathcal{F}(u_l) = \{I \in \mathbb{R}^m : I \ge 0,\ r'_j I \le r_j G u_l,\ \forall j \in \{1, \ldots, m\}\}$, which provides the desired current constraints:

\[
i(t) \ge 0, \quad \forall t \in \mathbb{R}
\]
\[
r'_j\, i(t) \le r_j G u_l, \quad \forall j \in \{1, \ldots, m\},\ \forall t \in \mathbb{R}
\]

3.3.3 Combined Objective

Thus far, we have presented two algorithms for current constraints generation. The first algorithm aims to maximize the peak power dissipation that the grid can safely support in the underlying circuit; however, it generates a skewed container in a way that imposes a tight constraint on the currents in certain locations of the die. The second algorithm aims to uniformly distribute power budgets across the circuit by raising the minimum, but this approach does not necessarily allow for a large peak total power dissipation. One may be interested in a middle scenario: a container that is maximal in S, maximizes the peak power dissipation that the grid can safely support, and supports a uniform current distribution across the die. In this section, we will develop a constraints generation algorithm, essentially a combination of (3.23) and (3.39), that allows this type of design objective. The algorithm will generate a container such as the one shown in Fig. 3.3, where uc = [84 91 95 93]^T (units of mV). Note that F(ul) and F(uc) are the same in the simple example shown in Fig. 3.3. However, as we will see in Section 3.4, the containers are very different on larger grids and each will provide a unique trade-off.

Recall that (3.19) maximizes the sum of the m current sources attached to the grid, whereas (3.35) maximizes the current edge for which the hypercube is contained in F(u). Therefore, there is a clear disproportionality between the dimensions of both objective functions, which motivates the following. For any u ∈ U, we define ψ(u) to be the largest value of the following combined objective allowed under F(u):

\begin{align}
\psi(u) &\triangleq \max_{\substack{I \in \mathcal{F}(u) \\ C(L) \subseteq \mathcal{F}(u)}} \left(\sum_{j=1}^{m} I_j + mL\right) \tag{3.44} \\
&= \max_{\substack{M'I \le MGu \\ L\eta \le MGu \\ I, L \ge 0}} \left(\sum_{j=1}^{m} I_j + mL\right) \tag{3.45}
\end{align}

and we define ψ∗ to be the largest ψ(u) achievable under all possible u ∈ U, so that:

\[
\psi^* \triangleq \max_{u \in \mathcal{U}}\, (\psi(u)) \tag{3.46}
\]

Another way to combine the objective functions of (3.23) and (3.39) is to simply take the weighted sum of the individual objective functions, i.e. $\sum_{j=1}^{m} I_j + \alpha L$, for any scalar α ≥ 0. Notice that ψ(u) in (3.44) is defined to be the maximum value of this weighted sum for α = m. Clearly, if α is close to zero, then we will be searching for a container that targets the peak power dissipation, and if α is very large, then we will be searching for a container that targets the uniform current distribution. It may also be interesting to sweep over the values of α to find an α that is suitable for the design objective. However, in the following, we assume that α = m, as in (3.44).

Let u_c be a vector at which the maximization in (3.46) attains its maximum. In other words, u_c ∈ U is such that ψ(u_c) = ψ*. Also, let ζ and ω be such that ∑_{j=1}^{m} ζ_j + mω = ψ*, where ζ ∈ F(u_c) and 0 ≤ ωη ≤ MGu_c. In general, u_c, ζ, and ω may not be unique. Based on (3.1) and (3.2), we can express the combined (3.44) and (3.46) as the following linear program (LP):

\[
\begin{aligned}
\psi^* = \text{Max} \quad & \sum_{j=1}^{m} I_j + mL \\
\text{subject to} \quad & M'I \le MGu \\
& L\eta \le MGu \\
& Pu \le V_{th} \\
& I,\, L,\, u \ge 0
\end{aligned} \tag{3.47}
\]

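Let C be the feasible region of the LP (3.47):

\[
\begin{gathered}
\mathcal{C} \triangleq \big\{ (I, L, u) : 0 \le L\eta \le MGu,\ M'I \le MGu,\ I \ge 0,\ u \ge 0,\ Pu \le V_{th} \big\} \\
\psi^* = \max_{(I,L,u) \in \mathcal{C}} \Big( \sum_{j=1}^{m} I_j + mL \Big)
\end{gathered} \tag{3.48}
\]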

Notice that (0, 0, 0) ∈ C, so that C is not empty and all of ψ*, u_c, ζ, and ω are well-defined. Also, for every (I, L, u) ∈ C, we have M′I ≤ MGu and I ≥ 0 which, because M′ ≥ 0, gives 0 ≤ M′I ≤ MGu, so that 0 ∈ F(u) and F(u) is not empty. Therefore, F(u_c) is not empty and the container F(u_c) = {I ∈ R^m : I ≥ 0, M′I ≤ MGu_c} ≠ φ provides the desired current constraints:

\[
\begin{aligned}
i(t) &\ge 0, && \forall t \in \mathbb{R} \\
M'i(t) &\le MGu_c, && \forall t \in \mathbb{R}
\end{aligned}
\]

Lemma A.3 in Appendix A.1 establishes the maximality of F(u_c), based on Lemma 3.5.

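Another way to write the LP (3.47) is as follows:

\[
\begin{aligned}
\psi^* = \text{Max} \quad & \sum_{j=1}^{m} I_j + mL \\
\text{subject to} \quad & y \le w, \quad L\eta \le w, \\
& Ay = HI, \quad Aw = Gu, \\
& Pu \le V_{th}, \quad I,\, L,\, u \ge 0
\end{aligned} \tag{3.49}
\]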

Although the LP (3.49) has a larger number of variables than the LP (3.47), a huge runtime advantage is attained by solving (3.49), for two reasons: 1) computing all columns of M = A^{-1}, which (3.47) needs, is not required in (3.49), where η can be computed by a single linear system solve (solve for x in Ax = H1_m); and 2) notice that I_n − MB = I_n − M(A − G) = MG, which means that MG is a dense matrix, because I_n and B are diagonal matrices and M is a dense matrix, so that the constraints of (3.47) are dense whereas the constraints of (3.49) are sparse.
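The identity behind the second reason can be checked on a toy system. The following is a minimal sketch, assuming a hypothetical 3-node chain with a tridiagonal G and B equal to the identity; these values are illustrative only and are not taken from any grid in this chapter. It uses exact rational arithmetic so that the identity I_n − MB = MG can be tested with equality, and it counts zero entries to show that G is sparse while MG is dense.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse(X):
    """Gauss-Jordan inverse with exact Fractions (fine for tiny matrices)."""
    n = len(X)
    aug = [list(row) + [Fr(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical toy data: G is tridiagonal (sparse), B is diagonal.
G = [[Fr(2), Fr(-1), Fr(0)], [Fr(-1), Fr(2), Fr(-1)], [Fr(0), Fr(-1), Fr(2)]]
B = [[Fr(1), Fr(0), Fr(0)], [Fr(0), Fr(1), Fr(0)], [Fr(0), Fr(0), Fr(1)]]
A = [[G[i][j] + B[i][j] for j in range(3)] for i in range(3)]  # A = G + B
M = inverse(A)                                                 # M = A^{-1}

I3 = [[Fr(int(i == j)) for j in range(3)] for i in range(3)]
MB = matmul(M, B)
lhs = [[I3[i][j] - MB[i][j] for j in range(3)] for i in range(3)]  # I - MB
MG = matmul(M, G)

identity_holds = (lhs == MG)
zeros_in_G = sum(v == 0 for row in G for v in row)
zeros_in_MG = sum(v == 0 for row in MG for v in row)
```

In this toy case G has structural zeros while MG has none, which is the effect that makes the constraints of (3.47) dense.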

3.4 Results

The above three algorithms (3.34), (3.43), and (3.49) have been implemented in C++. Algorithm (3.49) requires one LU-factorization and one forward/backward substitution (to compute η). The maximizations were performed using the Mosek optimization package [49]. We conducted tests on a set of power grids with a 1.1 V supply voltage that were generated based on user specifications, including grid dimensions, metal layers, pitch and width per layer, and C4

Table 3.1: Details of power grids used in experiments

Power Grid   Layers   Nodes        Current Sources
G1a          9        8,413        552
G2a          9        18,678       1,119
G3a          9        32,554       2,070
G4a          9        50,444       3,192
G5a          9        113,304      7,140
G6a          9        200,828      12,656
G7a          9        312,232      19,460
G8a          9        449,189      28,056
G9a          9        1,006,625    63,001
G10a         9        1,791,294    111,890
G11a         9        2,432,119    151,710
G12a         9        2,795,602    174,306

Table 3.2: Runtime of the three approaches

                 Peak Power                                Uniform Distribution                      Combined Objective
Power Grid       Alg. in (3.34)  Alg. in [50]  Speedup     Alg. in (3.43)  Alg. in [50]  Speedup     Alg. in (3.49)
G1a              0.3 sec         2.45 min      490×        0.7 sec         3.35 min      287×        2.3 sec
G2a              0.9 sec         10.11 min     673×        1.6 sec         17.46 min     654×        9.2 sec
G3a              1.6 sec         36.89 min     1383×       3.2 sec         50.08 min     952×        15.6 sec
G4a              3.7 sec         56.98 min     924×        5.9 sec         1.14 hrs      695×        25.6 sec
G5a              8.0 sec         4.44 hrs      1998×       16.0 sec        5.6 hrs       1260×       1.5 min
G6a              23.7 sec        13.17 hrs     2000×       41.6 sec        18.7 hrs      1618×       4.9 min
G7a              50.5 sec        28.92 hrs     2061×       1.0 min         36.41 hrs     2184×       8.6 min
G8a              1.7 min         -             -           1.9 min         -             -           13.5 min
G9a              7.6 min         -             -           5.2 min         -             -           51.4 min
G10a             13.1 min        -             -           16.2 min        -             -           2.5 hrs
G11a             24.7 min        -             -           24.0 min        -             -           6.3 hrs
G12a             28.4 min        -             -           34.3 min        -             -           5.31 hrs
Geometric mean   32.5 sec        1.3 hrs       1194×       45.8 sec        1.8 hrs       918×        5.7 min

and current source distributions, consistent with 65 nm technology. The details for the grids are given in Table 3.1. All results were obtained using a 3.4 GHz Linux machine with 32 GB of memory.

The number of variables in (3.34) is n, the number of variables in (3.43) is n + 1, and the number of variables in (3.49) is 3n + m + 1, where n is the total number of nodes and m is the number of current sources. The total CPU times for solving (3.34), (3.43), and (3.49) are given in columns 2, 5, and 8 of Table 3.2, respectively. To study the runtime efficiency of (3.34) and (3.43)

Figure 3.4: Contour plots for peak current density across the layout and the corresponding histograms using F(u_s) from [50]. The color bar units are mA/cm². (Histogram axis: Number of Windows; mean = 1.3 mA/cm², std-dev = 0.2 mA/cm².)

Table 3.3: Comparison of the three approaches

        Peak Power                  Uniform Current Distribution   Combined Objective
Name    P(u_p) (mW)   l(u_p) (µA)   P(u_l) (mW)   l(u_l) (µA)      P(u_c) (mW)   l(u_c) (µA)
G1a     1.73          0.75          0.94          1.62             1.69          1.60
G2a     3.71          1.03          2.49          1.98             3.40          1.91
G3a     6.57          0.43          3.33          1.35             6.40          1.32
G4a     9.83          0.79          5.96          1.65             9.57          1.64
G5a     22.06         0.41          12.05         1.31             21.68         1.29
G6a     40.69         0.95          26.80         1.97             37.35         1.93
G7a     59.77         0.95          40.30         1.89             55.60         1.88
G8a     85.23         0.58          50.99         1.48             84.15         1.47
G9a     197.57        0.87          124.74        1.81             187.32        1.80
G10a    344.34        0.79          214.45        1.71             332.89        1.70
G11a    467.86        0.44          259.86        1.37             464.90        1.36
G12a    534.13        0.95          359.46        1.91             493.58        1.89

compared to the algorithms in [50], the algorithms in [50] were implemented on the machine described above. The total CPU times for the algorithms in [50] are shown in columns 3 and 6 of Table 3.2. On average, (3.34) achieves a 1194× speedup and (3.43) achieves a 918× speedup, compared to the corresponding algorithms in [50]. For example, on a 312K-node grid, the peak power dissipation algorithm in [50] took 28 hrs in total whereas (3.34) took 50 sec, and the uniform current distribution algorithm in [50] took 36 hrs in total whereas (3.43) took 1 min. Furthermore, (3.34) achieves better runtime efficiency than (3.43) and (3.49). On average, (3.34) is 1.4× faster than (3.43) and 9.7× faster than (3.49).
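The averages quoted above are geometric means, which can be recomputed from the per-grid entries of Table 3.2. The following is a small sketch that copies the speedup columns for the seven grids on which the algorithms of [50] completed (G1a to G7a); the helper name `geometric_mean` is introduced here for illustration only.

```python
import math

def geometric_mean(values):
    """Geometric mean computed via the mean of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Speedup columns of Table 3.2 (grids G1a..G7a).
speedup_peak = [490, 673, 1383, 924, 1998, 2000, 2061]      # (3.34) vs. [50]
speedup_uniform = [287, 654, 952, 695, 1260, 1618, 2184]    # (3.43) vs. [50]

gm_peak = geometric_mean(speedup_peak)
gm_uniform = geometric_mean(speedup_uniform)

# Ratio of the geometric-mean runtimes of (3.43) and (3.34), in seconds,
# taken from the last row of Table 3.2.
ratio_uniform_vs_peak = 45.8 / 32.5

print(f"peak power speedup (geo. mean):   {gm_peak:.0f}x")
print(f"uniform distr. speedup (geo. mean): {gm_uniform:.0f}x")
print(f"(3.43) runtime / (3.34) runtime:  {ratio_uniform_vs_peak:.1f}x")
```

The recomputed values agree with the last row of Table 3.2 up to the rounding of the tabulated entries.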

In Table 3.3, we present the results of the three LPs in columns 2-7. Denote by P(u) ≜ V_dd × σ(u) the peak power dissipation allowed under F(u). To study the difference between the containers generated using (3.34), (3.43), and (3.49), we used two methods. First, we computed the peak power dissipation achievable under each container, namely P(u_p), P(u_l), and P(u_c), and the largest current edge for which the hypercube in the first quadrant is contained in each container, namely l(u_p), l(u_l), and l(u_c). For instance, on a 449,189-node grid (G8a), the peak power dissipation achievable under F(u_p), F(u_l), and F(u_c) is 85 mW, 51 mW, and 84 mW, respectively, and the largest current edge for which the hypercube in the first quadrant is contained in F(u_p), F(u_l), and F(u_c) is 0.58 µA, 1.48 µA, and 1.47 µA, respectively. The results show that P(u_l) ≪ P(u_p) and l(u_p) ≪ l(u_l) on all grids. In fact, the peak power dissipation achievable under F(u_l) is at most 67% of that achievable under F(u_p). Also, the largest current edge for which the hypercube in the first quadrant is contained in F(u_p) is at most 52% of that contained in F(u_l). Thus, F(u_p) and F(u_l) each provide a distinct trade-off for the chip design team. Moreover, the results show that P(u_c) ≈ P(u_p) and l(u_c) ≈ l(u_l). In fact, P(u_c) is at most 9% less than P(u_p) and l(u_c) is at most 4% less than l(u_l). Therefore, the combined objective approach gives the best features of the peak power dissipation and the uniform current distribution approaches.
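The percentages quoted above follow directly from the columns of Table 3.3. As a quick check, the sketch below copies the six columns for grids G1a to G12a and computes the worst-case ratios; the variable names are chosen here for illustration.

```python
# Columns of Table 3.3 for grids G1a..G12a (P in mW, l in µA).
P_up = [1.73, 3.71, 6.57, 9.83, 22.06, 40.69, 59.77, 85.23, 197.57, 344.34, 467.86, 534.13]
l_up = [0.75, 1.03, 0.43, 0.79, 0.41, 0.95, 0.95, 0.58, 0.87, 0.79, 0.44, 0.95]
P_ul = [0.94, 2.49, 3.33, 5.96, 12.05, 26.80, 40.30, 50.99, 124.74, 214.45, 259.86, 359.46]
l_ul = [1.62, 1.98, 1.35, 1.65, 1.31, 1.97, 1.89, 1.48, 1.81, 1.71, 1.37, 1.91]
P_uc = [1.69, 3.40, 6.40, 9.57, 21.68, 37.35, 55.60, 84.15, 187.32, 332.89, 464.90, 493.58]
l_uc = [1.60, 1.91, 1.32, 1.64, 1.29, 1.93, 1.88, 1.47, 1.80, 1.70, 1.36, 1.89]

# Largest fraction of the peak-power optimum that the uniform container reaches.
max_power_ratio = max(pl / pp for pl, pp in zip(P_ul, P_up))
# Largest fraction of the uniform-edge optimum that the peak-power container reaches.
max_edge_ratio = max(lp / ll for lp, ll in zip(l_up, l_ul))
# Worst-case relative loss of the combined objective against each specialist.
max_power_loss = max(1 - pc / pp for pc, pp in zip(P_uc, P_up))
max_edge_loss = max(1 - lc / ll for lc, ll in zip(l_uc, l_ul))

print(f"max P(u_l)/P(u_p) = {max_power_ratio:.1%}")
print(f"max l(u_p)/l(u_l) = {max_edge_ratio:.1%}")
print(f"max loss in P under F(u_c) = {max_power_loss:.1%}")
print(f"max loss in l under F(u_c) = {max_edge_loss:.1%}")
```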

Another way to compare the three approaches (3.34), (3.43), and (3.49) is to look at the current density, i.e., the current dissipation per unit area of the die, allowed by the three resulting containers. To assess this, we maximize the allowed current within a small window of the die surface, and we do this for every position of that window across the die. We divide the die area into κ × κ of these windows and compute the peak current density inside each, as allowed by F(u_p), F(u_l), or F(u_c). In Figs. 3.5a, 3.5b, 3.5c, and Fig. 3.4, we present contour plots for κ = 35 for the peak current densities under F(u_p), F(u_l), F(u_c), and F(u_s) (generated by the uniform current distribution algorithm in [50] that is based on the hypersphere approach), respectively, on a 50K-node grid. Note that F(u_l) and F(u_s) provide very similar peak current densities across the die area. Furthermore, note that the current constraints based on F(u_p) allow higher current densities at certain spots but also include some spots with very small and restricted current density budgets. This large spread in current densities can lead to thermal hotspots. This may be avoided by using F(u_l) which, as expected and as seen in the figure, provides a more uniform distribution of current densities across the die area compared to F(u_p), which is reflected in a smaller standard deviation. Of course, F(u_p) supports a larger overall peak current dissipation than F(u_l), which is reflected in a larger mean. The current constraints based on F(u_c) provide a current density distribution over a smaller range compared to F(u_p) and allow for larger current dissipation compared to F(u_l). Clearly, F(u_c) is superior to F(u_p) and F(u_l), providing the best features of both containers.

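Figure 3.5: Contour plots for peak current density across the layout and the corresponding histograms. The color bar units are mA/cm². Panels: (a) using F(u_p), mean = 1.73 mA/cm²; (b) using F(u_l), mean = 1.4 mA/cm²; (c) using F(u_c), mean = 1.6 mA/cm².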

3.5 Conclusion

In this chapter, we revisited the inverse problem of vectorless verification, by generating circuit current constraints that ensure power grid safety. We developed some key theoretical results. We characterized the properties of design objectives that can provably produce maximal containers. Also, we showed that, under certain conditions, the produced container has redundant constraints and, thus, can be represented using a smaller number of inequalities without altering the space of safe currents. Furthermore, we developed two constraints generation algorithms that target the same quality metrics as in [50], achieving roughly 1000× speedup on average. Finally, we developed a new constraints generation algorithm that targets a combination of those quality metrics; this algorithm gives the best current distribution features of those two algorithms at the cost of computational efficiency.

Chapter 4

Generating Current Budgets for RLC Grids

So far, we have discussed the constraints generation framework under an RC model of the power grid. Inductive effects are becoming a significant component of the power supply noise and can no longer be ignored. Inductive parasitics are particularly important in chips operating at high frequencies, due to the resulting L di/dt noise. Generally, to guarantee power grid safety, one would have to account for the frequency content of the switching currents, which may result in constraints on the slopes of the currents drawn by the logic circuitry. For example, such a framework has been considered in [30]. In this chapter, we extend the constraints generation framework to allow for inductance. The generated constraints define a current envelope, as was the case in Chapter 3, which allows for any current waveform that fits in the envelope, and so these constraints are conservative. We use the same systematic way of defining the problem as in the RC case, in the sense that we are interested in discovering a safe container that allows as much flexibility as possible to the underlying logic circuitry and targets a specific design objective. The major departure from the RC case is that inductive elements introduce voltage overshoots on the power grid, which are captured by the bound x̄(·) introduced in [29] and replicated below in Definition 4.8, which represents the worst-case voltage undershoot and voltage overshoot. Thus, most of the theoretical results established for the RC case require new proofs in the context of RLC grids. Furthermore, we develop algorithms that target the same design objectives as in the RC case but require a different formulation, as these algorithms depend on the theoretical results.

4.1 Motivational Example: RC versus RLC

In this section, we demonstrate that inductive effects cannot be simply ignored. Consider the simple example in Figs. 4.1, 4.2, and 4.3. Fig. 4.1 shows a simple power grid consisting of a 4-node RLC circuit. The two current sources i1(t) and i2(t) represent circuit blocks whose switching

Figure 4.1: Simple example of an RLC power grid.

activity constitutes the load on the grid, inducing voltage swings on the grid nodes. Fig. 4.2 shows the voltage variations experienced on nodes 1 and 3 due to two current configurations of the same waveforms i1(t) and i2(t) but with different temporal alignment. It is clear from Fig. 4.2 that the second current configuration leads to large power supply noise and should be prohibited. Thus, current constraints provided to the design team should avoid such a configuration. Fig. 4.3 shows a current container (represented as an empty polygon) resulting from the algorithms in Chapter 3 on the RC circuit obtained from Fig. 4.1 after ignoring (short-circuiting) the inductance, and another current container (represented as a striped polygon) resulting from one of the algorithms presented in this chapter applied to the RLC circuit in Fig. 4.1. Only the latter one excludes the current trace of the problematic current configuration.

We consider an RLC power grid (as described in Section 2.4.3) with n_v non-V_dd nodes and n_l inductors in the grid, and n = n_v + n_l. Out of the n_v nodes, m nodes are attached to ideal current sources and d nodes have voltage threshold specifications. So, G is the n_v × n_v conductance matrix; A, B, and D are the n_v × n_v matrices that appear as a result of the time-discretization in (2.25) and (2.28); E is the n_l × n_l diagonal matrix that appears as a result of the time-discretization in (2.26); M is an n_v × n_l incidence matrix defined in (2.22); H is the n_v × m incidence matrix that identifies which node is connected to which current source as in (2.20); F is an n × n matrix defined in (2.34); R is an n × m matrix defined in (2.34); x(t) is the n × 1 response vector defined in (2.33); i(t) is the m × 1 vector of current waveforms of the m current sources; Π is the d × n matrix which identifies the nodes with voltage threshold specifications; and x_th = [x_ub^T x_lb^T]^T is the 2d × 1 vector of user-provided voltage safety specifications, where x_ub ≥ 0 and x_lb ≤ 0 are d × 1 vectors.

Figure 4.2: An example of a current waveform that leads to voltage violations. Panels: (a) Current waveform #1; (b) Voltage response for current waveform #1; (c) Current waveform #2; (d) Voltage response for current waveform #2. Horizontal axes: Time (nsec).

4.3 Problem Definition

First, we will review the notion of a container for a vector of current waveforms, which will help us express constraints that guarantee grid safety.

Definition 4.1. (Container) Let t ∈ R, let i(t) ∈ R^m be a bounded function of time, and let F ⊂ R^m be a closed subset of R^m. If i(t) ∈ F, ∀t ∈ R, then we say that F contains i(t), represented by the shorthand i(·) ⊂ F, and we refer to F as a container of i(·).

Figure 4.4 illustrates the idea of a container for a simple case of two current waveforms. Because i(t) = [i1(t) i2(t)]^T ∈ F for all time instants, we say that F contains i(t).

Definition 4.2. If u is an n × 1 vector and w is a 2n × 1 vector, then we say that u ∈ w if, ∀j ∈ {1, . . . , n}:

\[
u_j \le w_j \quad \text{and} \quad u_j \ge w_{j+n} \tag{4.1}
\]

Thus, u is upper-bounded by the top half of w and lower-bounded by the bottom half of w. We say that w is empty if there does not exist any x for which x ∈ w. Note that w is non-empty if and only if w_j ≥ w_{j+n}, ∀j ∈ {1, 2, . . . , n}. Notice that 0 ∈ x_th, because x_lb ≤ 0 ≤ x_ub, so that x_th is non-empty.

Figure 4.3: A current container (represented as an empty polygon) generated for the RC circuit resulting from ignoring the inductance of Fig. 4.1, which includes the problematic current waveform of Fig. 4.2. Also, a current container (represented as a striped polygon) generated using one of the proposed algorithms presented below, which avoids the problematic current stimulus presented in Fig. 4.2. Horizontal axis: I1 (mA).

Figure 4.4: Example of a container F for i1(t) and i2(t).

Lemma 4.1. Let u, u′ ∈ R^n and w, w′ ∈ R^{2n}. If u ∈ w and u′ ∈ w′, then u + u′ ∈ w + w′.

Proof. For any j ∈ {1, . . . , n}, we have u_j ≤ w_j, u_j ≥ w_{j+n}, u′_j ≤ w′_j, and u′_j ≥ w′_{j+n}. It follows that u_j + u′_j ≤ w_j + w′_j and u_j + u′_j ≥ w_{j+n} + w′_{j+n}, so that u + u′ ∈ w + w′.
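The membership relation of Definition 4.2 and the additivity stated in Lemma 4.1 can be illustrated with a few lines of code. This is a minimal sketch; the function names `contains` and `is_nonempty` and the numerical vectors other than the thresholds [50 50 −50 −50]^T are hypothetical choices made for illustration.

```python
def contains(u, w):
    """True if the n-vector u lies in the 2n-vector of bounds w (Definition 4.2):
    u_j <= w_j (top half) and u_j >= w_{j+n} (bottom half) for every j."""
    n = len(u)
    assert len(w) == 2 * n
    return all(w[j + n] <= u[j] <= w[j] for j in range(n))

def is_nonempty(w):
    """w is non-empty iff every upper bound is at least its lower bound."""
    n = len(w) // 2
    return all(w[j] >= w[j + n] for j in range(n))

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

x_th = [50, 50, -50, -50]        # thresholds in mV: top half upper, bottom half lower

u, w = [12, -30], [20, 0, 5, -40]      # hypothetical vectors with u in w
u2, w2 = [3, 7], [4, 10, -1, 6]        # hypothetical vectors with u2 in w2

zero_is_safe = contains([0, 0], x_th)              # 0 lies within the thresholds
lemma_holds = contains(add(u, u2), add(w, w2))     # Lemma 4.1 on this instance
empty_example = is_nonempty([1, 1, 2, 0])          # upper bound 1 < lower bound 2
```

Here `lemma_holds` is only a single numerical instance of Lemma 4.1, not a proof.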

Definition 4.3. (Safe Grid) A grid is said to be safe for a given i(t), defined ∀t ∈ R, if Πx(t) ∈ x_th, ∀t ∈ R.

Going back to the example of Fig. 4.1, where the nodes of interest are nodes 1 and 3 with voltage overshoot/undershoot thresholds of 50 mV, we have Πx(t) = [v1(t) v3(t)]^T and x_th = [50 50 −50 −50]^T (in mV). In Fig. 4.5, we show the voltage response at nodes 1 and 3 due to the current waveform i(t) = [i1(t) i2(t)]^T shown in Fig. 4.4. Notice that the voltage response is within the specified thresholds, so that Πx(t) ∈ x_th and the grid is safe under i(t).

Figure 4.5: Nodal voltage at nodes 1 and 3 of Fig. 4.1 due to the current waveform in Fig. 4.4. Panels: (a) Node 1; (b) Node 3. The grey line represents the V_dd value. The dashed lines represent the voltage overshoot/undershoot thresholds. The blue dotted lines represent the values of P x̄(F), where F is the current container represented in Fig. 4.4.

To check if a power grid is safe, one would typically be interested in the worst-case voltage variation at some grid node k ∈ {1, . . . , n_v}, at some time point τ ∈ R, over a wide range of possible current waveforms. Using the above notation, and given a container F that contains a wide range of current waveforms that are of interest, we can express this as max_{i(t)⊂F} (x_k(τ)) and min_{i(t)⊂F} (x_k(τ)). Clearly, because F is the same irrespective of time, and applies at all time points t ∈ R, these worst-case voltage variations must be time-invariant, independent of the chosen time point τ. Using the eopt(·) notation introduced in Definition 2.3, we can capture in a single vector all the separate worst-case voltage variations, as follows:

\[
x^{(opt)}(\mathcal{F}) \triangleq \operatorname*{eopt}_{i(\cdot) \subset \mathcal{F}} \, [x(t)] \tag{4.2}
\]

It should be clear that if F is not an empty set, then x^{(opt)}(F) is not empty (as a 2n × 1 vector) and x(t) ∈ x^{(opt)}(F), for all i(t) ⊂ F. The exact expression of the worst-case response vector x^{(opt)}(F) was derived in [4] to be:

\[
x^{(opt)}(\mathcal{F}) = \sum_{q=0}^{\infty} \operatorname*{eopt}_{I \in \mathcal{F}} \, (F^{q} R I) \tag{4.3}
\]

where I is an m × 1 vector dummy variable in units of current. One way to check grid safety is to compute (4.3), and then check if [Π 0]x^{(opt)}(F) ≤ x_ub and [0 Π]x^{(opt)}(F) ≥ x_lb because, clearly, [0 Π]x^{(opt)}(F) ≤ Πx(t) ≤ [Π 0]x^{(opt)}(F), ∀t. However, this is obviously too expensive to compute directly using (4.3), although it is possible to get an approximate value of the solution by solving for only a few terms of the summation. Instead, we will use some bounds on x^{(opt)} based on the following notation.

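Definition 4.4. If v = [v_t^T v_b^T]^T and w = [w_t^T w_b^T]^T are 2n × 1 vectors and v_t, v_b, w_t, and w_b are n × 1 vectors, then we say that v ⊆ w if:

\[
v_t \le w_t \quad \text{and} \quad v_b \ge w_b \tag{4.4}
\]

Definition 4.5. If v = [v_t^T v_b^T]^T and w = [w_t^T w_b^T]^T are 2n × 1 vectors and v_t, v_b, w_t, and w_b are n × 1 vectors, then we say that v ⊂ w if:

\[
v_t < w_t \quad \text{and} \quad v_b > w_b \tag{4.5}
\]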

A few simple properties can be stated without proof.

1) The subset relation among vectors is transitive:

u ⊆ v and v ⊆ w =⇒ u ⊆ w (4.6)

u ⊂ v and v ⊂ w =⇒ u ⊂ w (4.7)

2) The subset relation may be combined by summation:

u ⊆ v and w ⊆ z =⇒ u + w ⊆ v + z (4.8)

u ⊂ v and w ⊂ z =⇒ u + w ⊂ v + z (4.9)

3) For any two vectors u and v, we have:

u ⊆ v ⇐⇒ −v ⊆ −u (4.10)

u ⊂ v ⇐⇒ −v ⊂ −u (4.11)

4) For any two vectors u and v, we have:

u ⊆ v ⇐⇒ 0 ⊆ v − u (4.12)

u ⊂ v ⇐⇒ 0 ⊂ v − u (4.13)

Definition 4.6. (matrix extension) Let W be an n × n matrix, and let W+ = ½(W + |W|) and W− = ½(W − |W|), where |W| is the n × n matrix consisting of the absolute values of the elements of W. We define the extension of W as the 2n × 2n matrix W̄, given by:

\[
\overline{W} \triangleq \begin{bmatrix} W^{+} & W^{-} \\ W^{-} & W^{+} \end{bmatrix} \tag{4.14}
\]

Notice that W+ ≥ 0 consists of only the non-negative elements of W while W− ≤ 0 consists of only the non-positive elements of W, so that, with w_ij denoting the (i, j)th entry of W, we have for any (i, j):

\[
w^{+}_{ij} = \begin{cases} w_{ij}, & \text{if } w_{ij} \ge 0; \\ 0, & \text{otherwise.} \end{cases}
\qquad
w^{-}_{ij} = \begin{cases} w_{ij}, & \text{if } w_{ij} \le 0; \\ 0, & \text{otherwise.} \end{cases} \tag{4.15}
\]

Definition 4.7. (subset-preserving) A 2n × 2n matrix X is said to be subset-preserving (SP) if, for any two 2n × 1 vectors u, v, we have that u ⊆ v =⇒ Xu ⊆ Xv.

From Lemma B.1 in the Appendix, we have that for any matrix W, its extension W̄ is SP.
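To make the matrix extension and the subset-preserving property concrete, the following minimal sketch builds the extension of a small hypothetical matrix W and checks, on one pair of vectors, that u ⊆ v implies that the extension applied to u is a subset of the extension applied to v. The function names and the numerical values are illustrative assumptions; a single instance does not replace the proof in Lemma B.1.

```python
def extension(W):
    """Build the 2n x 2n extension [[W+, W-], [W-, W+]] of an n x n matrix W."""
    Wp = [[max(w, 0) for w in row] for row in W]   # W+ keeps non-negative entries
    Wm = [[min(w, 0) for w in row] for row in W]   # W- keeps non-positive entries
    top = [p + m for p, m in zip(Wp, Wm)]          # rows of [W+  W-]
    bottom = [m + p for p, m in zip(Wp, Wm)]       # rows of [W-  W+]
    return top + bottom

def subset(v, w):
    """v ⊆ w for 2n-vectors: top halves satisfy v_t <= w_t, bottom halves v_b >= w_b."""
    n = len(v) // 2
    return all(v[j] <= w[j] and v[j + n] >= w[j + n] for j in range(n))

def matvec(X, v):
    """Matrix-vector product for lists-of-lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

W = [[1, -2], [0, 3]]            # hypothetical 2 x 2 matrix
W_bar = extension(W)

u = [1, 1, -1, -1]               # hypothetical bounds with u ⊆ v
v = [2, 3, -2, -4]

Wu, Wv = matvec(W_bar, u), matvec(W_bar, v)
sp_on_example = subset(u, v) and subset(Wu, Wv)
```

Note that applying W itself to the top and bottom halves separately would not preserve the relation in general, because negative entries swap the roles of upper and lower bounds; the block structure of the extension accounts for that swap.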

Because x^{(opt)} is expensive to compute, the authors in [29] derive a bound x̄ on x^{(opt)}, which is stated in Definition 4.8 below. The authors show that for a certain range of the discretization time-step ∆t we have ρ(F̄) < 1 and (I_{2n} − F̄) is non-singular. Furthermore, they show that it is always possible to choose a ∆t in that range, that it is easy to find such a ∆t, and that this choice of ∆t is good for the accuracy of x̄ (a maximum error of 1 mV on a 4K-node grid). Throughout the rest of this chapter, we will assume that ∆t is in such a range, so that:

\[
\rho(\overline{F}) < 1 \tag{4.16}
\]

With this, define Q to be the 2n × 2n matrix:

\[
Q \triangleq (I_{2n} - \overline{F})^{-1} \tag{4.17}
\]

Definition 4.8. For any F ⊂ R^m, define:

\[
\overline{x}(\mathcal{F}) \triangleq Q \operatorname*{eopt}_{I \in \mathcal{F}} \, (RI) \tag{4.18}
\]

where I ∈ R^m is a vector of artificial variables, with units of current, that is used to carry out the eopt(·) operation.

In [29], the authors have shown that x̄ is a bound on x^{(opt)}, for any container F:

\[
x^{(opt)}(\mathcal{F}) \subseteq \overline{x}(\mathcal{F}) \tag{4.19}
\]
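As an illustration of how such a bound might be evaluated, the sketch below works under several stated assumptions that go beyond the text: that eopt(·) stacks the element-wise maximum on top of the element-wise minimum, that Q is the inverse of the identity minus the extension of F (Definition 4.6), and that the container is a simple box 0 ≤ I ≤ I_max, for which eopt(RI) has a closed form. Instead of inverting a matrix, it iterates x ← F̄x + e, which converges to Q e when the spectral radius of F̄ is below one. All names and numbers are hypothetical.

```python
def extension(W):
    """2n x 2n extension [[W+, W-], [W-, W+]] of an n x n matrix W."""
    Wp = [[max(w, 0.0) for w in row] for row in W]
    Wm = [[min(w, 0.0) for w in row] for row in W]
    return [p + m for p, m in zip(Wp, Wm)] + [m + p for p, m in zip(Wp, Wm)]

def eopt_box(R, I_max):
    """Stacked element-wise max and min of R*I over the box 0 <= I <= I_max.
    Assumption: eopt returns [max; min]; for a box, positive entries of R
    drive the max and negative entries drive the min."""
    top = [sum(max(r, 0.0) * i for r, i in zip(row, I_max)) for row in R]
    bottom = [sum(min(r, 0.0) * i for r, i in zip(row, I_max)) for row in R]
    return top + bottom

def bound(F, R, I_max, iters=200):
    """Fixed-point iteration x <- F_bar x + e, converging to (I - F_bar)^{-1} e
    when the spectral radius of F_bar is below one."""
    F_bar = extension(F)
    e = eopt_box(R, I_max)
    x = [0.0] * len(e)
    for _ in range(iters):
        x = [sum(a * b for a, b in zip(row, x)) + ei for row, ei in zip(F_bar, e)]
    return x

# Hypothetical scalar system: n = 1, m = 1, F = -0.5, R = 1, 0 <= I <= 2.
x_bar = bound([[-0.5]], [[1.0]], [2.0])

# Exact worst case from the series (4.3) for this scalar case: even powers of F
# contribute to the maximum and odd powers to the minimum.
exact_top = sum((0.5 ** q) * 2.0 for q in range(0, 200, 2))
exact_bottom = -sum((0.5 ** q) * 2.0 for q in range(1, 200, 2))
```

In this scalar case the bound coincides with the exact worst case; on general grids only the containment (4.19) should be expected.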

Let P be a 2d × 2n matrix defined as follows:

\[
P \triangleq \begin{bmatrix} \Pi & 0_{d \times n} \\ 0_{d \times n} & \Pi \end{bmatrix} \tag{4.20}
\]

Using Lemma B.1, and because Π ≥ 0, we have that P is SP, and of course P ≥ 0.

Lemma 4.2. For any u, u′ ∈ R^{2n}, if u ⊂ u′ then Pu ⊂ Pu′, where P is the 2d × 2n matrix defined in (4.20).

Proof. Let u = [u_t^T u_b^T]^T, u′ = [u′_t^T u′_b^T]^T, v = Pu = [v_t^T v_b^T]^T, and v′ = Pu′ = [v′_t^T v′_b^T]^T, where u_t, u′_t, u_b, and u′_b are n × 1 vectors, and v_t, v′_t, v_b, and v′_b are d × 1 vectors. Notice that u_t < u′_t and u_b > u′_b, because u ⊂ u′, so that Πu_t < Πu′_t and Πu_b > Πu′_b, because Π ≥ 0 and Π has no row with all zeros. It follows that v_t < v′_t and v_b > v′_b, so that Pu = v ⊂ v′ = Pu′.

Definition 4.9. (Safe Container) For a given container F, we say that F is safe if P x̄(F) ⊆ x_th.

For the example of Fig. 4.1, one can simply find x̄(F) defined in (4.18) for the current container F represented in Fig. 4.4. Because only nodes 1 and 3 are of interest, P x̄(F) = [v_1^{(ub)} v_3^{(ub)} v_1^{(lb)} v_3^{(lb)}]^T, where v_1^{(ub)} and v_3^{(ub)} are the worst-case voltage drop on nodes 1 and 3, respectively, and v_1^{(lb)} and v_3^{(lb)} are the worst-case voltage overshoot on nodes 1 and 3, respectively, under all possible current waveforms i(·) ⊂ F. In Fig. 4.5, we show the values of v_1^{(ub)}, v_3^{(ub)}, v_1^{(lb)}, and v_3^{(lb)}, which clearly satisfy the voltage thresholds, so that P x̄(F) ⊆ x_th, and F is safe.

Thus, we are interested in discovering a safe container F so that, due to (4.19) and P being SP, we get P x^{(opt)}(F) ⊆ x_th and the grid is safe. We will see below that a safe container F can be expressed as a set of constraints on the circuit currents that load the grid, thereby providing a set of constraints that are sufficient to guarantee grid safety.

4.4 Maximal Containers

This section contains the bulk of the theoretical contributions of this chapter, and is organized as follows. First, we will establish a necessary and sufficient condition for a container to be safe. We will find, however, that there is an infinity of possible safe containers. The question becomes, which safe container should we choose? Recall that a container can be used to drive parts of the design process such as automated floorplanning and placement, and hence, the larger a container is, the more flexibility is provided for the rest of the design stages. We are interested in containers that allow as much flexibility as possible in the circuit loading currents, which we will refer to as maximal containers. The following definition captures the notion of maximal containers in mathematical terms.

Definition 4.10. Let E be a collection of subsets of R^m and let X ∈ E. We say that X is maximal in E if there does not exist another Y ∈ E, Y ≠ X, such that X ⊆ Y.

Notice that a container F is a subset of R^m. In the following, we will identify a set E that is an infinite collection of safe containers. These containers will be of the form (4.23). In fact, we will show that these are the only interesting containers. Finally, we provide the necessary and sufficient conditions for a container X to be maximal in E. These conditions are given in Theorem 4.1 and depend on several results proved in Sections 4.4.1 and 4.4.2.

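Let T = Q^{-1}, so that:

\[
T = I_{2n} - \overline{F} \tag{4.21}
\]

Let u ∈ R^{2n} and define the sets U and F(u) as follows:

\[
\mathcal{U} \triangleq \{ u \in \mathbb{R}^{2n} : Pu \subseteq x_{th} \} \tag{4.22}
\]
\[
\mathcal{F}(u) \triangleq \{ I \in \mathbb{R}^{m} : I \ge 0,\ RI \in Tu \} \tag{4.23}
\]

and notice that:

\[
Tu \subseteq Tu' \implies \mathcal{F}(u) \subseteq \mathcal{F}(u'), \quad \forall u, u' \in \mathbb{R}^{2n} \tag{4.24}
\]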

To graphically illustrate the relation between the sets U and F(u), consider the circuit in Fig. 4.1 and suppose that only node 1 is required to satisfy the voltage safety specification |v1(t)| ≤ 50 mV. Notice that, in this case, we have n = 5 and d = 1, so that for any u ∈ R^{10}, Pu ∈ R^2 and, for any u ∈ U, we have Pu ⊆ x_th = [50 −50]^T mV. In Fig. 4.6, we show several instances of u and plot their corresponding Pu and F(u).

Figure 4.6: Examples of Pu and F(u). Panels: (a) Examples of Pu, where the top part of Pu is denoted as u_1^{(ub)} and the lower part of Pu is denoted as u_1^{(lb)}; (b) Examples of F(u) with F(u^{(4)}) = F(u′) = φ.

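For the instance u^{(2)}, the resulting current container is F(u^{(2)}) which, as defined in (4.23), can be represented using the following set of inequalities, with I = [I_1 I_2]^T ≥ 0:

\[
\begin{bmatrix} \vdots \\ -0.051 \\ \vdots \end{bmatrix}
\le
\begin{bmatrix} 0.76 & 0.26 \\ 0.30 & 0.27 \\ 0.26 & 1.53 \\ 0.04 & 0.03 \\ -0.86 & -0.78 \end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \end{bmatrix}
\le
\begin{bmatrix} \vdots \end{bmatrix}
\tag{4.25}
\]

The above set of inequalities on the current variable I defines a region in R^2, as shown in Fig. 4.6.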

Lemma 4.3. For any u ∈ R^{2n}, 0 ∈ F(u) if and only if 0 ∈ Tu.

Proof. Let u ∈ R^{2n} with 0 ∈ Tu, so that for I = 0 we have RI = 0 ∈ Tu, from which 0 ∈ F(u). Conversely, let u ∈ R^{2n} with 0 ∈ F(u); then R · 0 = 0 ∈ Tu due to (4.23).
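A membership test for a container of the form (4.23) takes only a few lines. The sketch below is illustrative: the 5 × 2 matrix is the coefficient matrix that appears in the example inequalities of this section, while the two bound vectors standing in for Tu are hypothetical values chosen here, since they are not the vectors of the actual example. It also shows the statement of Lemma 4.3 on these instances.

```python
def contains(v, bounds):
    """n-vector v lies in the 2n-vector of bounds (top half upper, bottom half lower)."""
    n = len(v)
    return all(bounds[j + n] <= v[j] <= bounds[j] for j in range(n))

def in_container(I, R, Tu):
    """Membership in F(u) = {I : I >= 0, R I in Tu}, with Tu given as a 2n-vector."""
    if any(i < 0 for i in I):
        return False
    RI = [sum(r * i for r, i in zip(row, I)) for row in R]
    return contains(RI, Tu)

# Coefficient matrix from the example inequalities (n = 5, m = 2).
R = [[0.76, 0.26], [0.30, 0.27], [0.26, 1.53], [0.04, 0.03], [-0.86, -0.78]]

# Hypothetical bound vectors (upper bounds first, then lower bounds).
Tu_ok = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0]
Tu_bad = [1.0, 1.0, 1.0, -0.002, 0.0, 0.0, 0.0, 0.0, -0.01, -1.0]

zero = [0.0, 0.0]
origin_in_ok = in_container(zero, R, Tu_ok)       # 0 in Tu_ok, so 0 in F(u)
origin_in_bad = in_container(zero, R, Tu_bad)     # 0 not in Tu_bad, so 0 not in F(u)
inside = in_container([0.5, 0.2], R, Tu_ok)
outside = in_container([2.0, 0.0], R, Tu_ok)      # first row gives 1.52 > 1.0
```

With `Tu_bad`, the fourth row requires 0.04 I_1 + 0.03 I_2 ≤ −0.002, which no I ≥ 0 can satisfy, so that container is empty.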

Definition 4.11. For any u ∈ R^{2n}, u is said to be feasible if F(u) is not empty; otherwise it is infeasible.

For illustration, notice that in Fig. 4.6, the instance u^{(4)} generates a current container F(u^{(4)}) represented using the following set of inequalities, with I = [I_1 I_2]^T ≥ 0:

\[
\begin{bmatrix} \vdots \\ -0.032 \\ -0.041 \\ \vdots \end{bmatrix}
\le
\begin{bmatrix} 0.76 & 0.26 \\ 0.30 & 0.27 \\ 0.26 & 1.53 \\ 0.04 & 0.03 \\ -0.86 & -0.78 \end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \end{bmatrix}
\le
\begin{bmatrix} \vdots \\ -0.002 \\ -0.01 \\ \vdots \end{bmatrix}
\tag{4.26}
\]

It is easy to check that the above set of inequalities is unsatisfiable for any I ≥ 0, so that F(u^{(4)}) = φ and u^{(4)} is infeasible, whereas the current container F(u^{(2)}), as shown in Fig. 4.6, is non-empty, so that u^{(2)} is feasible.

Lemma 4.4. For any feasible u ∈ R^{2n}, we have x̄(F(u)) ⊆ u.

Proof. For any feasible u ∈ R^{2n}, we have RI ∈ Tu, for all I ∈ F(u), due to (4.23), so that:

\[
\operatorname*{eopt}_{I \in \mathcal{F}(u)} (RI) \subseteq Tu \tag{4.27}
\]

Because Q is SP, due to Lemma B.18 (in the appendix), it follows that:

\[
Q \operatorname*{eopt}_{I \in \mathcal{F}(u)} (RI) \subseteq QTu = u \tag{4.28}
\]

Therefore, x̄(F(u)) = Q eopt_{I∈F(u)}(RI) ⊆ u, and the proof is complete.

Lemma 4.5. For any non-empty container J ⊂ R^m_+, J is safe if and only if ∃u ∈ U such that J ⊆ F(u).

Proof. The proof is in two parts.

Proof of the "if" direction: Let J ⊆ F(u) for some u ∈ U. It follows that eopt_{I∈J}(RI) ⊆ eopt_{I∈F(u)}(RI), from which x̄(J) ⊆ x̄(F(u)), due to Lemma B.18. Notice that F(u) is not empty, because J is not empty, so that u is feasible. Using Lemma 4.4 and (4.6), we get x̄(J) ⊆ u which, because P is SP and u ∈ U, gives P x̄(J) ⊆ x_th.

Proof of the "only if" direction: Let J ⊂ R^m_+ be a non-empty set with P x̄(J) ⊆ x_th, and let u = x̄(J), so that Pu ⊆ x_th, from which u ∈ U. Multiplying u = x̄(J) = Q eopt_{I∈J}(RI) by T, we get:

\[
\operatorname*{eopt}_{I \in \mathcal{J}} (RI) = Tu \tag{4.29}
\]

so that, ∀I ∈ J, we have RI ∈ Tu which, coupled with I ≥ 0, gives J ⊆ F(u).

Definition 4.12. Define the set of containers:

\[
\mathcal{S} \triangleq \{ \mathcal{F}(u) : u \in \mathcal{U} \} \tag{4.30}
\]

It should be clear from Lemma 4.5 that all containers of interest are members of S or subsets of members of S. Note that, if J ⊆ F(u), for some u ∈ U, with J ≠ F(u), then clearly F(u) is a better choice than J. Choosing J would be unnecessarily limiting, while F(u) would allow more flexibility in the circuit loading currents. Therefore, it is enough to consider only containers of the form F(u) with u ∈ U. This is why the definitions (4.22), (4.23), and (4.30) are important, and we refer to S as the set of safe containers.

Referring back to Fig. 4.6, the instance u^{(5)} ∉ U generates the current container F(u^{(5)}), represented using the following set of inequalities, with I = [I_1 I_2]^T ≥ 0:

\[
\begin{bmatrix} \vdots \\ -0.055 \\ \vdots \end{bmatrix}
\le
\begin{bmatrix} 0.76 & 0.26 \\ 0.30 & 0.27 \\ 0.26 & 1.53 \\ 0.04 & 0.03 \\ -0.86 & -0.78 \end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \end{bmatrix}
\le
\begin{bmatrix} \vdots \end{bmatrix}
\tag{4.31}
\]

It can be easily verified, by computing x̄(·) in Definition 4.8, that the current container defined by the above set of inequalities gives P x̄(F(u^{(5)})) = [⋯]^T mV ⊈ x_th, and thus is an unsafe container. However, u^{(2)} ∈ U generates the current container defined by the set of inequalities (4.25) and satisfies P x̄(F(u^{(2)})) ⊆ x_th, so that F(u^{(2)}) is a safe container.

Going further, if F(u_1) ⊆ F(u_2) with F(u_1) ≠ F(u_2), then clearly F(u_2) is a better choice than F(u_1). In a sense, the larger the container, the better. Thus, we are only interested in containers F(u) that are maximal in S.

In the example of Fig. 4.6, the current container F(u^{(1)}), represented using the following set of inequalities, for I = [I_1 I_2]^T ≥ 0:

\[
\begin{bmatrix} \vdots \\ -0.042 \\ \vdots \end{bmatrix}
\le
\begin{bmatrix} 0.76 & 0.26 \\ 0.30 & 0.27 \\ 0.26 & 1.53 \\ 0.04 & 0.03 \\ -0.86 & -0.78 \end{bmatrix}
\begin{bmatrix} I_1 \\ I_2 \end{bmatrix}
\le
\begin{bmatrix} \vdots \end{bmatrix}
\tag{4.32}
\]

and the current container F(u^{(2)}), represented using (4.25), satisfy F(u^{(1)}) ⊆ F(u^{(2)}), with F(u^{(1)}) ≠ F(u^{(2)}), so that F(u^{(1)}) is not maximal in S, whereas F(u^{(2)}) can be shown to be maximal in S.

Maximality is a highly desirable property, and so the purpose of the rest of this section is to give necessary and sufficient conditions for a container to be maximal in S. We will see that the maximality of F(u) depends on crucial properties of u. Note that, for any x_th such that 0 ∈ x_th, we have 0 ∈ U and 0 ∈ F(0), so that u = 0 is always feasible. It follows that S always contains a non-empty set as a member, so that φ (the empty set) is never maximal in S; this will be useful below.

4.4.1 Extremal

Definition 4.13. For any u ∈ U, we say that u is extremal in U if ∃k ∈ {1, . . . , 2d} such that Pu|_k = x_{th,k}.

Notice that in Fig. 4.6, Pu^{(2)}|_1 = x_{th,1}, so that u^{(2)} is extremal in U, whereas Pu^{(1)}|_k ≠ x_{th,k}, ∀k, so that u^{(1)} is not extremal in U.

Lemma 4.6. If F(u) is maximal in S then u is feasible and extremal in U.

The proof of Lemma 4.6 is available in Appendix B.2, as Lemma B.4.

4.4.2 Irreducible

Definition 4.14. We say that u ∈ R^{2n} is reducible if there exists u′ ⊆ u, u′ ≠ u, with F(u′) = F(u); otherwise, u is said to be irreducible.

We will see that irreducibility of u is a crucial property that is required for maximality of F(u). The proofs of Lemmas 4.7, 4.8, and 4.9 are available in Appendix B.3, as Lemmas B.6, B.7, and B.9.

Lemma 4.7. For any feasible u ∈ R^{2n}, let u′ = x̄(F(u)); it follows that F(u′) = F(u).

Lemma 4.8. For any u ∈ R^{2n}, u is irreducible if and only if it is feasible and x̄(F(u)) = u.

Note, if u is irreducible and extremal in U, then Pu|_k = x_{th,k} for some k, and so P x̄(F(u))|_k = x_{th,k}.

Lemma 4.9. For any u ∈ R^{2n}, u is irreducible if and only if:

\[
Tu \subseteq Tu' \iff \mathcal{F}(u) \subseteq \mathcal{F}(u'), \quad \forall u' \in \mathbb{R}^{2n} \tag{4.33}
\]

4.4.3 Maximality

As pointed out earlier, we are interested in safe ontainers that are maximal in S. We now

present our main result that gives the ne essary and su ient onditions for maximality.

Theorem 4.1. F(u) is maximal in S if and only if u is irredu ible and extremal in U .

Proof of the "if" direction: We give a proof by contradiction. Let u ∈ U be irreducible and extremal in U, but suppose that F(u) is not maximal in S, so that ∃u′ ∈ U such that F(u) ⊆ F(u′), with F(u) ≠ F(u′). Because F(u) ≠ F(u′), then clearly Tu ≠ Tu′, and using Lemma 4.9, we have Tu ⊆ Tu′. Let δ = Tu′ − Tu, so that 0 ⊆ δ and δ ≠ 0. Recall that Q is SP, from Lemma B.18, so that 0 ⊂ Qδ. Let w = Qδ. Denote by w_i, δ_j, and q_ij the ith entry of w, the jth entry of δ, and the (i, j)th entry of Q, respectively. Notice that,

$$ w_i = \sum_{j=1}^{2n} q_{ij} \delta_j \tag{4.34} $$

where q_ij ≠ 0, ∀i, j, due to Lemma B.16. For i ∈ {1, …, n}, we have q_ij δ_j ≥ 0, ∀j, because Q is SP and 0 ⊆ δ, which, combined with q_ij ≠ 0 and δ ≠ 0, gives w_i > 0. Similarly, for i ∈ {n + 1, …, 2n}, we have q_ij δ_j ≤ 0, ∀j, because Q is SP and 0 ⊆ δ, which, combined with q_ij ≠ 0 and δ ≠ 0, gives w_i < 0. Therefore, we have 0 ⊂ w = Qδ = u′ − u. Then u ⊂ u′, due to (4.13), and Pu ⊂ Pu′ ⊆ x_th, making use of Lemma 4.2 and the final step due to u′ ∈ U, so that u is not extremal in U, and we have a contradiction that completes the proof.

Proof of the "only if" direction: We give a proof by contradiction. Given that F(u) is maximal in S, we know from Lemma 4.6 that u is feasible and extremal in U. Suppose u is reducible, so that x(F(u)) ≠ u, because we already have that u is feasible. Recall that x(F(u)) ⊆ u, from which Px(F(u)) ⊆ Pu ⊆ x_th, because P is SP and u ∈ U. Let u′ = x(F(u)) ≠ u, so that u′ ∈ U and Tu′ = Tx(F(u)) = eopt_{I∈F(u)}(RI). Let δ = Tu − Tu′ = Tu − eopt_{I∈F(u)}(RI); then we have 0 ⊆ δ due to eopt_{I∈F(u)}(RI) ⊆ Tu, and δ ≠ 0 due to u′ ≠ u. Recall that Q is SP, from Lemma B.18, and every element of Q is non-zero, due to Lemma B.16, so that 0 ⊂ Qδ = u − u′; a more detailed explanation of this step was presented in the first part of the proof. Consequently, we have u′ ⊂ u, due to (4.13), so that Pu′ ⊂ Pu ⊆ x_th, making use of Lemma 4.2 and the final step due to u ∈ U, so that u′ is not extremal in U. Therefore, by Lemma 4.6, F(u′) is not maximal in S. However, F(u) = F(u′), due to Lemma 4.7, so that F(u) is not maximal in S, a contradiction that completes the proof.

In Fig. 4.7, we give a graphical representation of the set U for the same example as in Fig. 4.6. Sweeping over all values of u ∈ U and checking the conditions of Lemma B.15, we can discover the set of irreducible vectors u. We represent the set of vectors u that are irreducible as the dotted-boundary polygon. Notice that the vectors u that are extremal in U have Pu on the boundary of U. Thus, the set of vectors u that have F(u) maximal in S is the intersection of both irreducible and extremal sets, due to Theorem 4.1, represented in red on the boundary of U.

Figure 4.7: A graphical representation of the set U, vectors u that are irreducible, vectors u that are extremal in U, and vectors u that have F(u) maximal in S.

This important theoretical result forms the basis for our choice of practical constraints generation algorithms that are guaranteed to give maximal containers, as we will see in the next section. Recall that whenever u is irreducible and extremal in U, then Px(F(u))|_k = x_{th,k}, for some k, so that the bound on the voltage variation at the kth grid node would be equal to its maximum allowable voltage variation. In other words, a maximal container always causes some node(s) on the grid to experience the maximum allowable voltage variation, at least based on the x(·) bound.

4.5 Applications

So far, we have shown that a container F(u) is maximal in S if and only if u satisfies the conditions of Theorem 4.1. Note that it is possible to find a container F(u) that is maximal in S but does not include the zero state, i.e. the state I = 0. We believe that users are interested in containers that include the zero state (i.e. if the logic circuitry does not draw any current, then the grid must clearly be safe) and, thus, we will enforce this constraint when searching for maximal containers F(u). In this section, we will describe some design objectives and corresponding algorithms for finding specific maximal safe containers F(u), with the additional condition that 0 ∈ F(u). These algorithms will each be formulated as a maximization of a certain design objective g(u). Lemma B.10 in Appendix B.4 establishes a sufficient condition on g(·) for which the algorithms proposed below produce maximal containers, based on Theorem 4.1.

4.5.1 Peak Power Dissipation

An interesting quality metric for a power grid is the peak total power dissipation that it can safely support in the underlying circuit. We refer here to the instantaneous power dissipation, which is conservatively approximated by $V_{dd} \sum_{j=1}^{m} i_j(t)$. Thus, we are interested in a safe container that is maximal in S and that allows the highest possible $\sum_{\forall j} I_j$. Generally, one might be interested in the highest weighted sum of the individual currents, i.e. $\sum_{\forall j} q_j I_j$, where q_j > 0 is a user-specified weight on the jth current source. This will allow certain areas of the die to support larger power dissipation than other areas. However, in this chapter we assume that all current sources have equal weights and, hence, we will be finding the peak total power dissipation that the grid can safely support.

For any u ∈ U, we define σ(u) to be the largest sum of current source values allowed under F(u):

$$ \sigma(u) = \max_{I \in F(u)} \sum_{j=1}^{m} I_j \tag{4.35} $$

and we define σ∗ to be the largest σ(u) achievable over all possible u ∈ U with 0 ∈ F(u), i.e.:

$$ \sigma^* = \max_{\substack{u \in U \\ 0 \in F(u)}} \sigma(u) = \max_{\substack{u \in U \\ 0 \in Tu}} \sigma(u) \tag{4.36} $$

where the second equality is due to Lemma 4.3.
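For a fixed u, evaluating σ(u) in (4.35) is itself a small linear program over the container F(u). The following Python sketch illustrates this for two current sources only. It assumes, hypothetically, that the container has already been written in the explicit form {I ≥ 0 : A I ≤ b} and that it is bounded; the function name `peak_current_sum`, the matrix `A`, and the vector `b` are illustrative and do not come from the thesis. It enumerates polygon vertices instead of calling an LP solver, which is only practical in two dimensions.

```python
from itertools import combinations


def peak_current_sum(A, b, tol=1e-9):
    """Maximize I1 + I2 over {I >= 0 : A I <= b} by vertex enumeration (2-D only).

    Assumes the feasible set is a bounded polygon; returns (value, point),
    or (None, None) if no feasible vertex is found.
    """
    # Append the non-negativity constraints -I1 <= 0 and -I2 <= 0.
    rows = [list(r) for r in A] + [[-1.0, 0.0], [0.0, -1.0]]
    rhs = list(b) + [0.0, 0.0]
    best_val, best_pt = None, None
    for (a1, c1), (a2, c2) in combinations(zip(rows, rhs), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundaries do not intersect in a vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x = (c1 * a2[1] - a1[1] * c2) / det
        y = (a1[0] * c2 - c1 * a2[0]) / det
        feasible = all(r[0] * x + r[1] * y <= c + tol for r, c in zip(rows, rhs))
        if feasible and (best_val is None or x + y > best_val):
            best_val, best_pt = x + y, (x, y)
    return best_val, best_pt


# Toy container: 2*I1 + I2 <= 4 and I1 + 3*I2 <= 6, with I >= 0.
value, point = peak_current_sum([[2.0, 1.0], [1.0, 3.0]], [4.0, 6.0])
print(value, point)
```

In this toy case the maximum is attained where the two constraints intersect, which mirrors the observation that a maximal container pushes some bound to its limit. For realistic grids the maximization over u as well as I requires a proper LP solver.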

Let u_p ∈ U be such that σ(u_p) = σ∗, and I∗ ∈ F(u_p) be such that $\sum_{j=1}^{m} I^*_j = \sigma^*$. In general, u_p and I∗ may not be unique. Based on (4.22) and (4.23), and making use of Lemma 4.3, we can express the combined (4.35) and (4.36) as the following linear program (LP):

$$ \begin{aligned} \sigma^* = \text{Maximize} \quad & \sum_{j=1}^{m} I_j \\ \text{subject to} \quad & RI \in Tu \\ & 0 \in Tu \\ & Pu \subseteq x_{th} \\ & I \ge 0 \end{aligned} \tag{4.37} $$

Let D be the feasible region of the LP (4.37):

$$ D = \{ (I, u) : I \ge 0,\ RI \in Tu,\ 0 \in Tu,\ Pu \subseteq x_{th} \} \tag{4.38} $$

so that:

$$ \sigma^* = \max_{(I,u) \in D} \sum_{j=1}^{m} I_j \tag{4.39} $$

Notice that (0, 0) ∈ D so that D is not empty and all of σ∗, u_p, and I∗ are well-defined. Therefore, the container F(u_p) = {I ∈ R^m : I ≥ 0, RI ∈ Tu_p} ≠ φ provides the desired current constraints, ∀t ∈ R:

i(t) ≥ 0

Ri(t) ∈ Tu_p

The following lemma establishes the maximality of F(u_p), based on Lemma B.10.

Lemma 4.10. F(u_p) is maximal in S.

Proof. Recall that I∗ and u_p are well-defined and (I∗, u_p) ∈ D, so that RI∗ ∈ Tu_p and I∗ ≥ 0, from which u_p is feasible. We will prove that σ(·) satisfies the conditions of Lemma B.10, from which F(u_p) is maximal in S. First, notice that for any u, u′ ∈ U, if F(u′) = F(u), it follows that σ(u′) = σ(u), due to (4.35). It remains to prove that for any u, u′ ∈ U, with 0 ∈ Tu and 0 ∈ Tu′, if Tu′ ⊃ Tu, then σ(u′) > σ(u).

For any u ∈ U, let I ∈ F(u) be such that $\sigma(u) = \sum_{j=1}^{m} I_j$. Let

$$ \lambda = \frac{\min_{\forall i} \left| Tu'|_i - Tu|_i \right|}{\max_{\forall i,j} (|r_{ij}|)} \tag{4.40} $$

which is well-defined due to R ≠ 0, and λ > 0 because Tu ⊂ Tu′. Also, let e_1 ∈ R^m be the vector whose 1st entry is 1 and all other entries are 0, and let I′ = I + λe_1. Because λ > 0, we have λe_1 ≥ 0, λe_1 ≠ 0, I′ ≥ I ≥ 0, and I′ ≠ I, so that $\sum_{j=1}^{m} I'_j > \sum_{j=1}^{m} I_j$. Denote by c_j the jth column of R and notice that:

$$ RI' = RI + \lambda R e_1 = RI + \lambda c_1 \tag{4.41} $$

$$ = RI + \frac{\min_{\forall i} \left| Tu'|_i - Tu|_i \right|}{\max_{\forall i,j} (|r_{ij}|)} \, c_1 \tag{4.42} $$

Let ϑ be the 2n × 1 vector whose first n entries are 1 and the rest are −1. Because $c_1 / \max_{\forall i,j}(|r_{ij}|) \in \vartheta$, notice that:

$$ \frac{\min_{\forall i} \left| Tu'|_i - Tu|_i \right|}{\max_{\forall i,j} (|r_{ij}|)} \, c_1 \in \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \vartheta \tag{4.43} $$

which, combined with RI ∈ Tu because (I, u) ∈ D, and due to Lemma 4.1, gives:

$$ RI + \frac{\min_{\forall i} \left| Tu'|_i - Tu|_i \right|}{\max_{\forall i,j} (|r_{ij}|)} \, c_1 \in Tu + \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \vartheta \tag{4.44} $$

Therefore, using (4.42), it follows that:

$$ RI' \in Tu + \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \vartheta \tag{4.45} $$

Also, notice that, for any k ∈ {1, …, n}, because Tu ⊂ Tu′, we have:

$$ \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \le \left| Tu'|_k - Tu|_k \right| = Tu'|_k - Tu|_k \tag{4.46} $$

Likewise, for any k ∈ {n + 1, …, 2n}, we have:

$$ -\min_{\forall i} \left| Tu'|_i - Tu|_i \right| \ge -\left| Tu'|_k - Tu|_k \right| = Tu'|_k - Tu|_k \tag{4.47} $$

Combining (4.46) and (4.47), we get:

$$ \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \vartheta \subseteq Tu' - Tu \tag{4.48} $$

This, combined with Tu ⊆ Tu and making use of (4.8), gives:

$$ Tu + \min_{\forall i} \left| Tu'|_i - Tu|_i \right| \vartheta \subseteq Tu + Tu' - Tu = Tu' \tag{4.49} $$

Therefore, due to (4.45) and (4.49), we get:

$$ RI' \in Tu' \tag{4.50} $$

This, coupled with u′ ∈ U, means that (I′, u′) ∈ D, so that $\sigma(u') \ge \sum_{j=1}^{m} I'_j > \sum_{j=1}^{m} I_j = \sigma(u)$, from which σ(·) satisfies the conditions of Lemma B.10 and F(u_p) is maximal in S.

As an example, the LP (4.37) is run on a 17-node grid with two current sources and the resulting container is shown in Fig. 4.8. Notice that this method, in order to allow the maximum peak power, may generate a container that is skewed in a way that imposes a tight constraint on current in certain locations of the die (such as at i_2(t)) while allowing larger current in other locations (such as at i_1(t)). Other approaches are possible to avoid this skew and even out the current budgets, as we will see next.

Figure 4.8: An example of F(u_p), F(u_s), and F(u_c).

4.5.2 Uniform Current Distribution

The design team may be interested in a grid that safely supports a uniform current distribution across the die, so as to allow a placement that provides a uniform temperature distribution. We can generate constraints that allow that objective by searching for a safe maximal container F(u) that contains the hypersphere in current space that has the largest volume, or the largest radius θ. In other words, this method aims to raise the minimum and avoid the skew indicated above. We will develop a method (4.55) which, when applied to the 17-node grid mentioned earlier, generates the container F(u_s) as shown in Fig. 4.8. In Chapter 3, under the RC model of the grid, we proposed an algorithm that targets uniform current distribution across the die area by searching for a safe container that contains the hypercube in current space that has the largest volume. In the RC case, the hypercube-based algorithm achieved a significant speedup compared to the hypersphere-based approach. However, in the RLC case, the hypercube-based approach did not have a computational advantage over the hypersphere-based approach, and so, we will only present the hypersphere-based approach.

For any s × 1 or 1 × s vector γ, we will use the standard notation $\|\gamma\| = \sqrt{\sum_{j=1}^{s} \gamma_j^2} \ge 0$ to denote the Euclidean norm of γ. Let S(θ) ⊂ R^m denote the hypersphere with radius θ ≥ 0, centered at the origin, and let S+(θ) = S(θ) ∩ R^m_+ be the part of that hypersphere that is in the first quadrant of R^m, i.e., S+(θ) = {I ≥ 0 : ‖I‖ ≤ θ}. In the following lemma, we will derive a necessary and sufficient condition for S+(θ) ⊆ F(u), which will be useful in the rest of this section.

Let R^+ and R^− be the n × m matrices that consist of the non-negative elements and non-positive elements of R, respectively, so that, with r_ij, r^+_ij, and r^−_ij denoting the (i, j)th entries of R, R^+, and R^−, respectively, we have for any (i, j):

$$ r^+_{ij} = \begin{cases} r_{ij}, & \text{if } r_{ij} \ge 0; \\ 0, & \text{otherwise.} \end{cases} \qquad r^-_{ij} = \begin{cases} r_{ij}, & \text{if } r_{ij} \le 0; \\ 0, & \text{otherwise.} \end{cases} \tag{4.51} $$

Denote by r_i, r^+_i, and r^−_i the ith rows of R, R^+, and R^−, respectively. Let ν^+ and ν^− be n × 1 vectors, where ν^+_i = ‖r^+_i‖ ≥ 0 and ν^−_i = ‖r^−_i‖ ≥ 0, and let $\nu = \begin{bmatrix} \nu^+ \\ -\nu^- \end{bmatrix}$.

Lemma 4.11. For any θ ≥ 0 and u ∈ R^{2n} with 0 ∈ F(u), S+(θ) ⊆ F(u) if and only if θν ⊆ Tu.

The proof of Lemma 4.11 is available in Appendix B.4, as Lemma B.11.
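The row norms ν^+_i and ν^−_i have a simple interpretation, which is presumably what makes a condition like Lemma 4.11 possible: over the set {I ≥ 0 : ‖I‖ ≤ θ}, the largest value of r_i I is θ‖r^+_i‖ and the smallest is −θ‖r^−_i‖. The Python sketch below splits a small made-up matrix as in (4.51), computes the row norms, and checks this closed form by brute-force sampling for two current sources. The matrix values and all function names are illustrative assumptions, not data from the thesis.

```python
import math


def sign_split(R):
    """Split R into its non-negative part R+ and non-positive part R-, as in (4.51)."""
    R_plus = [[x if x >= 0 else 0.0 for x in row] for row in R]
    R_minus = [[x if x <= 0 else 0.0 for x in row] for row in R]
    return R_plus, R_minus


def row_norms(M):
    """Euclidean norm of every row of M."""
    return [math.sqrt(sum(x * x for x in row)) for row in M]


def sampled_extremes(row, theta, steps=2000):
    """Brute-force max and min of row . I over I >= 0, ||I|| <= theta (2 sources).

    A linear function attains its extremes at extreme points of the set,
    i.e. at the origin or on the quarter-circle of radius theta.
    """
    values = [0.0]  # I = 0 always belongs to the set
    for s in range(steps + 1):
        phi = 0.5 * math.pi * s / steps
        values.append(theta * (row[0] * math.cos(phi) + row[1] * math.sin(phi)))
    return max(values), min(values)


R = [[3.0, -4.0], [-1.0, 2.0], [3.0, 4.0]]  # hypothetical rows of R
R_plus, R_minus = sign_split(R)
nu_plus = row_norms(R_plus)    # [3.0, 2.0, 5.0]
nu_minus = row_norms(R_minus)  # [4.0, 1.0, 0.0]

theta = 2.0
for row, vp, vm in zip(R, nu_plus, nu_minus):
    hi, lo = sampled_extremes(row, theta)
    print(row, "max ~", round(hi, 4), "closed form", theta * vp,
          "| min ~", round(lo, 4), "closed form", -theta * vm)
```

Because the extremes scale linearly with θ, containment of S+(θ) reduces to one scalar comparison per row, which is why θ can enter the optimization as a single linear variable.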

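For any u ∈ U with 0 ∈ F(u), we define Θ(u) to be the largest θ ≥ 0 for which S+(θ) ⊆ F(u), or equivalently (by Lemma 4.11), for which θν ⊆ Tu is satisfied, so that:

$$ \Theta(u) = \max_{\substack{S_+(\theta) \subseteq F(u) \\ \theta \ge 0}} \theta = \max_{\substack{\theta\nu \subseteq Tu \\ \theta \ge 0}} \theta \tag{4.52} $$

and we define Θ∗ to be the largest Θ(u) achievable over all possible u ∈ U with 0 ∈ F(u), i.e.:

$$ \Theta^* = \max_{\substack{u \in U \\ 0 \in F(u)}} \Theta(u) = \max_{\substack{u \in U \\ 0 \in Tu}} \Theta(u) \tag{4.53} $$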

Let u_s ∈ U be a vector at which the above maximization attains its maximum. In other words, u_s ∈ U is such that Θ(u_s) = Θ∗ and S+(Θ∗) ⊆ F(u_s). In general, u_s may not be unique. Based on (4.22) and (4.23), we can express the combined (4.52) and (4.53) as the following linear program:

$$ \begin{aligned} \Theta^* = \text{Maximize} \quad & \theta \\ \text{subject to} \quad & \theta\nu \subseteq Tu \\ & 0 \in Tu \\ & Pu \subseteq x_{th} \\ & \theta \ge 0 \end{aligned} \tag{4.54} $$

Notice that 0 ∈ θν, for any θ ≥ 0, so that the constraints θ ≥ 0 and θν ⊆ Tu in (4.54) automatically guarantee that 0 ∈ Tu, and so the constraint 0 ∈ Tu in (4.54) is redundant. Therefore, (4.54) can be expressed as follows:

$$ \begin{aligned} \Theta^* = \text{Maximize} \quad & \theta \\ \text{subject to} \quad & \theta\nu \subseteq Tu \\ & Pu \subseteq x_{th} \\ & \theta \ge 0 \end{aligned} \tag{4.55} $$

Let R be the feasible region of the LP (4.55):

$$ R = \{ (\theta, u) : \theta \ge 0,\ \theta\nu \subseteq Tu,\ Pu \subseteq x_{th} \} \tag{4.56} $$

so that:

$$ \Theta^* = \max_{(\theta,u) \in R} \theta \tag{4.57} $$

Notice that (0, 0) ∈ R so that R is not empty and both Θ∗ and u_s are well-defined. Therefore, the container F(u_s) = {I ∈ R^m : I ≥ 0, RI ∈ Tu_s} ≠ φ provides the desired current constraints, ∀t ∈ R:

i(t) ≥ 0

Ri(t) ∈ Tu_s

Lemma B.12 in Appendix B.4 establishes the maximality of F(u_s), based on Lemma B.10.

4.5.3 Combined Objective

Thus far, we have presented two algorithms for current constraints generation. The first algorithm (4.37) aims to maximize the peak power dissipation that the grid can safely support in the underlying circuit; however, it generates a container that is skewed in a way that imposes a tight constraint on the currents in certain locations on the die. The second algorithm (4.55) aims to uniformly distribute the power budgets across the circuit by raising the minimum, but this approach does not necessarily allow for a large peak total power dissipation. One may be interested in a middle scenario: a container that is maximal in S, tries to maximize the peak power dissipation that the grid can safely support, and tries to support a uniform current distribution across the die. In this section, we will develop a constraints generation algorithm, essentially a combination of (4.39) and (4.57), that allows this type of design objective. The algorithm will generate a container F(u_c) such as the one shown in Fig. 4.8.

Recall that (4.35) maximizes the sum of the m current sources attached to the grid, while (4.52) maximizes the current radius for which the part of the hypersphere in the first quadrant is contained in F(u). Therefore, there is a clear disproportionality between the dimensions of both objective functions, which motivates the following. For any u ∈ U, we define ξ(u) to be the largest value of the following combined objective allowed under F(u):

$$ \xi(u) = \max_{\substack{I \in F(u) \\ S_+(\theta) \subseteq F(u)}} \left( \sum_{j=1}^{m} I_j + m\theta \right) \tag{4.58} $$

$$ = \max_{\substack{RI \in Tu \\ \theta\nu \subseteq Tu \\ I, \theta \ge 0}} \left( \sum_{j=1}^{m} I_j + m\theta \right) \tag{4.59} $$

and we define ξ∗ to be the largest ξ(u) achievable under all possible u ∈ U with 0 ∈ F(u), so that:

$$ \xi^* = \max_{\substack{u \in U \\ 0 \in F(u)}} \xi(u) = \max_{\substack{u \in U \\ 0 \in Tu}} \xi(u) \tag{4.60} $$

Another way to combine the objective functions of (4.39) and (4.57) is to simply take the weighted sum of the individual objective functions of (4.39) and (4.57), i.e. $\sum_{j=1}^{m} I_j + \alpha\theta$, for any scalar α ≥ 0. Notice that ξ(u) in (4.59) is defined to be the maximum value of this weighted sum for α = m. Clearly, if α is close to zero, then we will be searching for a container that targets the peak power dissipation, and if α is very large, then we will be searching for a container that targets the uniform current distribution. It may also be interesting to sweep over the values of α to find an α that is suitable for the design objective. However, in the following, we assume that α = m, as in (4.59).

Let u_c ∈ U be a vector at which the above maximization attains its maximum. In other words, u_c ∈ U is such that ξ(u_c) = ξ∗. Also, let ζ and ω be such that $\sum_{j=1}^{m} \zeta_j + m\omega = \xi^*$, where ζ ∈ F(u_c) and ων ⊆ Tu_c, where ν is as defined in Section 4.5.2. In general, u_c, ζ, and ω may not be unique. Based on (4.22) and (4.23), we can express the combined (4.59) and (4.60) as the following linear program (LP):

$$ \begin{aligned} \xi^* = \text{Maximize} \quad & \sum_{j=1}^{m} I_j + m\theta \\ \text{subject to} \quad & RI \in Tu \\ & \theta\nu \subseteq Tu \\ & 0 \in Tu \\ & Pu \subseteq x_{th} \\ & I, \theta \ge 0 \end{aligned} \tag{4.61} $$

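Notice that 0 ∈ θν, for any θ ≥ 0, so that the constraints θ ≥ 0 and θν ⊆ Tu in (4.61) automatically guarantee that 0 ∈ Tu, and so the constraint 0 ∈ Tu in (4.61) is redundant. Therefore, (4.61) can be expressed as follows:

$$ \begin{aligned} \xi^* = \text{Maximize} \quad & \sum_{j=1}^{m} I_j + m\theta \\ \text{subject to} \quad & RI \in Tu \\ & \theta\nu \subseteq Tu \\ & Pu \subseteq x_{th} \\ & I, \theta \ge 0 \end{aligned} \tag{4.62} $$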

Let C be the feasible region of the LP (4.62):

$$ C = \{ (I, \theta, u) : \theta\nu \subseteq Tu,\ RI \in Tu,\ I, \theta \ge 0,\ Pu \subseteq x_{th} \} \tag{4.63} $$

so that:

$$ \xi^* = \max_{(I,\theta,u) \in C} \left( \sum_{j=1}^{m} I_j + m\theta \right) \tag{4.64} $$

Notice that (0, 0, 0) ∈ C so that C is not empty, and all of ξ∗, u_c, ζ, and ω are well-defined. Therefore, the container F(u_c) = {I ∈ R^m : I ≥ 0, RI ∈ Tu_c} provides the desired current constraints, ∀t ∈ R:

i(t) ≥ 0

Ri(t) ∈ Tu_c

Lemma B.13 in Appendix B.4 establishes the maximality of F(u_c), based on Lemma B.10.

4.6 Implementation

In this section, we discuss the implementation of one of the above algorithms, namely (4.62), as the implementation of the other algorithms is similar.

One way to construct the feasible region of (4.62) is to compute T = I_{2n} − F. Recall from Definition 4.6 that F requires knowledge of the non-negative and non-positive elements in F, where F is defined in (2.34). Notice from (2.34) that the explicit computation of F in turn requires the computation of D^{-1}B, which has two major drawbacks: 1) this requires the computation of all the columns of D^{-1}, which is prohibitively expensive, and 2) D^{-1} is a dense matrix, so that F, F, and T are also dense, and the constraints of the LP (4.62) are dense. A key observation is that the explicit representation of T is unnecessary and we can avoid the computation of D^{-1}, as we will show below, using the simple change of variables in (4.65) and (4.66).

For any u ∈ R^{2n} and I ∈ R^m, let z ∈ R^{2n_v} and y ∈ R^{n_v} be defined as follows:

Kz = Bu (4.65)

Dy = HI (4.66)

where:

B 0 0 0

0 0 B 0

(4.67)

Recall that D is non-singular, so that K is also non-singular and its inverse is given by:

$$ K^{-1} = \begin{bmatrix} D^{-1} & 0 \\ 0 & D^{-1} \end{bmatrix} \tag{4.68} $$

Dene the following matri es:

, H ′

= F − HK−1B, R

= R−H ′D−1H (4.69)

where H is a 2n× 2nv matrix and H ′is an n× nv matrix.

Notice that Kz = Bu ⇐⇒ z = K^{-1}Bu. But z = K^{-1}Bu =⇒ Hz = HK^{-1}Bu, and Hz = HK^{-1}Bu =⇒ z = K^{-1}Bu, because H^T H = I_{2n_v}. Therefore,

Kz = Bu ⇐⇒ Hz = HK−1Bu (4.70)

⇐⇒ F u+ Hz = (F + HK−1B)u (4.71)

⇐⇒ F u+ Hz = F u (4.72)

Also, notice that Dy = HI ⇐⇒ y = D^{-1}HI. But y = D^{-1}HI =⇒ H′y = H′D^{-1}HI, and H′y = H′D^{-1}HI =⇒ y = D^{-1}HI, because H′^T H′ = I_{n_v}. Therefore,

Dy = HI ⇐⇒ H ′y = H ′D−1HI (4.73)

⇐⇒ RI +H ′y = (R+H ′D−1H)I (4.74)

⇐⇒ RI +H ′y = RI (4.75)

Therefore, for any u ∈ R2n, I ∈ R

m, and θ ∈ R, we have:

RI ∈ Tu

θν ⊆ Tu

⇐⇒

RI ∈ (I2n − F )u

θν ⊆ (I2n − F )u(4.76)

⇐⇒

RI +H ′y ∈ (I2n − F )u− Hz

θν ⊆ (I2n − F )u− Hz

Kz = Bu

Dy = HI

(4.77)

where in (4.76) we have used the fact that T = I_{2n} − F, and in (4.77) we have used (4.72) and (4.75). With this, LP3 in (4.62) can be expressed as follows:

ξ∗ = Maximize

∑mj=1 Ij

subje t to RI +H ′y ∈ (I2n − F )u− Hz

θν ⊆ (I2n − F )u− Hz

Kz = Bu

Dy = HI

Pu ⊆ xth

I, θ ≥ 0

(4.78)

Notice that K and B are sparse matrices that can be constructed easily from the matrices D and B. Furthermore, notice that constructing the matrices R and F requires the computation of D^{-1}M, the computation of the inverse of the diagonal matrix E, and some matrix multiplications. The computation of D^{-1}M does not require the full inverse of D; it only requires an LU factorization of the matrix D and n_l forward/backward solves. The result of D^{-1}M is used to compute the matrices E^{-1}M^T D^{-1}B and E^{-1}M^T D^{-1}H in F and R, respectively. With this, it is easy to see that the constraints of (4.78) do not require the full inverse of D; thus, (4.78) is easier to construct than (4.62), and its constraints are much sparser than those of (4.62).

Finally, notice that the computation of ν requires the computation of the rows of R. More precisely, finding the upper part of ν, denoted as ν_u, requires the elements of D^{-1}H, which can be computed using m ≪ n_v linear system solves. Notice that the full D^{-1}H does not have to be stored in memory. Finding the lower part of ν, denoted as ν_l, requires the computation of −E^{-1}M^T D^{-1}H. Recall that D^{-1}M is already computed and stored in memory to construct F, so that −E^{-1}M^T D^{-1}H = −E^{-1}(D^{-1}M)^T H, which holds because D^{-1} is symmetric, can be easily computed using matrix transpose and matrix multiplications.
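The two computational ideas in this section, factoring D once and reusing the factors for several solves, and exploiting the symmetry of D^{-1} to reuse D^{-1}M, can be illustrated on a toy example. The following Python sketch uses a small dense LU factorization without pivoting purely for illustration; a production implementation would use a sparse direct solver. The matrices `D`, `M`, and `H` below are made-up stand-ins with the right shapes, not the thesis matrices, and all helper names are hypothetical.

```python
def lu_factor(A):
    """Doolittle LU factorization without pivoting (assumes nonzero pivots)."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[float(x) for x in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U


def lu_solve(L, U, b):
    """Solve (L U) x = b by one forward and one backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve_columns(L, U, M):
    """Return D^{-1} M, one forward/backward solve per column of M."""
    return transpose([lu_solve(L, U, list(col)) for col in zip(*M)])


D = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]  # symmetric, non-singular
M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
H = [[1.0], [2.0], [0.0]]

L, U = lu_factor(D)           # factor once
X = solve_columns(L, U, M)    # D^{-1} M, never forming D^{-1} itself

# Because D (hence D^{-1}) is symmetric, M^T D^{-1} H equals (D^{-1} M)^T H.
lhs = matmul(transpose(M), solve_columns(L, U, H))
rhs = matmul(transpose(X), H)
print(lhs, rhs)
```

The second product needs no additional solves once D^{-1}M is available, which is the saving described above for the lower part of ν.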

4.7 Results

To compare the different trade-offs of the current containers generated by each of the above three algorithms and their runtime efficiency, we implemented LP1, LP2, and LP3 given in (4.37), (4.55), and (4.62), respectively, using C++. Recall that these algorithms were transformed into simpler forms by avoiding the computation of the full D^{-1} matrix. The implementation details of LP1 and LP2 are similar to those of LP3, which is explained in Section 4.6. We tested them on a number of power grids with a 1.1 V supply voltage that were generated based on user specifications, including grid dimensions, metal layers, pitch and width per layer, and C4 pads and current source distributions, consistent with 65 nm technology. With these specifications, the grids are automatically generated, after which we introduce non-uniformity in the grid to model the real-world scenario. The details for the grids are given in Table 4.1. The maximizations were performed using the Mosek optimization package [49]. All results were obtained using a 3.4 GHz Linux machine with 32 GB of RAM.

Table 4.1

Power Grid
Name    Layers    Nodes      Current Sources
G1b     9         2,027      156
G2b     9         4,499      306
G3b     9         7,774      552
G4b     9         17,160     1,190
G5b     9         30,117     2,070
G6b     9         46,701     3,192
G7b     9         104,530    7,140
G8b     9         184,523    12,656
G9b     9         412,927    28,056
G10b    9         561,344    38,220
G11b    9         662,708    45,156

The number of variables and constraints required for each LP are shown in Table 4.4. Furthermore, the total CPU runtimes for setting up and solving LP1, LP2, and LP3 are given in columns 2-4 of Table 4.2. Note that the CPU times in columns 3 and 4 of Table 4.2 include the time required for computing ν. For example, on a 600K-node grid (G11b), LP1 took 5.8 hrs, LP2 took 9.6 hrs, and LP3 took 10.8 hrs.

In Table 4.3, we present the results of the three LPs. Denote by P(u) = V_dd × σ(u) the peak power dissipation allowed under F(u). To study the difference between the containers generated using LP1, LP2, and LP3, we used the following method. First, we computed the peak power dissipation achievable under all containers, which are P(u_p), P(u_s), and P(u_c), and the largest current radius for which the part of the hypersphere in the first quadrant is contained in all containers, which are Θ(u_p), Θ(u_s), and Θ(u_c). For instance, on a 560K-node grid (G10b), the peak power dissipation achievable under F(u_p), F(u_s), and F(u_c) is 97.24 mW, 44.54 mW, and 83.96 mW, respectively, and the largest current radius for which the part of the hypersphere in the first quadrant is contained in F(u_p), F(u_s), and F(u_c) is 1.57 µA, 3.57 µA, and 3.53 µA, respectively. The results show that P(u_s) ≪ P(u_p) and Θ(u_p) ≪ Θ(u_s) on all grids. Thus, both F(u_p) and F(u_s) provide a distinct trade-off for the chip design team. Moreover, the results show that P(u_c) ≈ P(u_p) and Θ(u_c) ≈ Θ(u_s). Therefore, the combined objective approach in LP3 gives the best features of the peak power dissipation and the uniform current distribution approaches.

Table 4.2: Runtime of the three approaches

        Peak Power     Uniform Current    Combined
        Dissipation    Distribution       Objective
Name    Total Time     Total Time         Total Time
G1b     1.7 sec        1.3 sec            2.3 sec
G2b     5.2 sec        5.0 sec            5.8 sec
G3b     10.2 sec       11.5 sec           11.9 sec
G4b     31.0 sec       33.6 sec           36.5 sec
G5b     1.3 min        1.2 min            1.6 min
G6b     2.6 min        3.1 min            3.7 min
G7b     14.8 min       16.0 min           14.3 min
G8b     21.7 min       1.0 hr             48.8 min
G9b     1.6 hr         6.3 hr             4.2 hr
G10b    3.0 hr         8.6 hr             6.0 hr
G11b    5.8 hr         9.6 hr             10.8 hr

Table 4.3: Comparison of the three approaches

        Peak Power             Uniform Current        Combined
        Dissipation            Distribution           Objective
Name    P(u_p)     Θ(u_p)      P(u_s)     Θ(u_s)      P(u_c)     Θ(u_c)
        in mW      in µA       in mW      in µA       in mW      in µA
G1b     0.48       0.98        0.11       2.88        0.43       2.58
G2b     0.96       1.77        0.31       3.58        0.85       3.42
G3b     1.45       1.04        0.45       2.86        1.31       2.74
G4b     3.09       1.22        1.15       3.22        2.72       3.12
G5b     5.77       1.43        2.35       3.43        5.05       3.36
G6b     8.55       1.58        3.61       3.56        7.42       3.50
G7b     18.69      1.28        8.38       3.29        16.75      3.25
G8b     33.52      1.11        15.59      3.11        30.86      3.06
G9b     70.04      0.75        33.95      2.73        66.21      2.71
G10b    97.24      1.57        44.54      3.57        83.96      3.53
G11b    118.53     1.58        54.86      3.55        103.13     3.54

Table 4.4: Number of variables and constraints for all three LPs

                         LP1 (4.37)    LP2 (4.55)    LP3 (4.62)
Number of Variables      5N+2n+m       4N+2n+1       4N+2n+m+1
Number of Constraints    6N+2n         5N+2n         6N+2n

Figure 4.9: Contour plots of the peak current densities and the corresponding histograms: (a) using F(u_p); (b) using F(u_s); (c) using F(u_c). The color bar units are mA/cm^2.
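The claim that the combined objective retains most of both metrics can be checked directly from the numbers in Table 4.3. The short Python sketch below recomputes, for every grid, the ratio P(u_c)/P(u_p) and the ratio Θ(u_c)/Θ(u_s); the data are copied from Table 4.3, while the variable and function names are illustrative.

```python
# (grid, P(u_p) mW, Theta(u_p) uA, P(u_s) mW, Theta(u_s) uA, P(u_c) mW, Theta(u_c) uA)
TABLE_4_3 = [
    ("G1b", 0.48, 0.98, 0.11, 2.88, 0.43, 2.58),
    ("G2b", 0.96, 1.77, 0.31, 3.58, 0.85, 3.42),
    ("G3b", 1.45, 1.04, 0.45, 2.86, 1.31, 2.74),
    ("G4b", 3.09, 1.22, 1.15, 3.22, 2.72, 3.12),
    ("G5b", 5.77, 1.43, 2.35, 3.43, 5.05, 3.36),
    ("G6b", 8.55, 1.58, 3.61, 3.56, 7.42, 3.50),
    ("G7b", 18.69, 1.28, 8.38, 3.29, 16.75, 3.25),
    ("G8b", 33.52, 1.11, 15.59, 3.11, 30.86, 3.06),
    ("G9b", 70.04, 0.75, 33.95, 2.73, 66.21, 2.71),
    ("G10b", 97.24, 1.57, 44.54, 3.57, 83.96, 3.53),
    ("G11b", 118.53, 1.58, 54.86, 3.55, 103.13, 3.54),
]


def tradeoff_ratios(table):
    """Map grid name -> (P(u_c)/P(u_p), Theta(u_c)/Theta(u_s))."""
    return {
        name: (p_c / p_p, t_c / t_s)
        for name, p_p, _t_p, _p_s, t_s, p_c, t_c in table
    }


ratios = tradeoff_ratios(TABLE_4_3)
for name, (power_ratio, radius_ratio) in ratios.items():
    print(f"{name:5s} power kept: {power_ratio:.3f}  radius kept: {radius_ratio:.3f}")
```

On these data, the combined container keeps more than 85% of the peak power of F(u_p) and more than 85% of the radius of F(u_s) on every grid.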

Another way to compare the three approaches LP1, LP2, and LP3 is to look at the current density, i.e. the current dissipation per unit area of the die, allowed by the three resulting containers. To assess this, we maximize the allowed current within a small window of the die surface, and we do this for every position of that window across the die. We divide the die area into κ × κ of these windows and compute the peak current dissipation inside each, as allowed by F(u_p), F(u_s), and F(u_c). In Fig. 4.9, we present contour plots for κ = 35 for the peak current densities under F(u_p), F(u_s), and F(u_c), respectively, on a 100K-node grid (G7b). Note that the current constraints based on F(u_p) allow higher current densities at certain spots but also include some spots with very small and restricted current density budgets. This large spread in current densities can lead to thermal hotspots. This may be avoided by using F(u_s) which, as expected and as seen in the figure, provides a more uniform distribution of current densities across the die area compared to F(u_p), which is reflected in a smaller standard deviation. Of course, F(u_p) supports larger overall peak current dissipation than F(u_s), which is reflected in a larger mean. The current constraints based on F(u_c) provide a current density distribution over a smaller range compared to F(u_p) and allow for larger current dissipation compared to F(u_s). In this sense, F(u_c) is superior to F(u_p) and F(u_s), providing the best features of both containers.

4.8 Conclusion

In this chapter, we extended the applicability of the constraints generation framework to allow for inductance. We developed some key theoretical results to allow the generation of constraints that correspond to maximal current spaces. We then applied these results to provide two constraints generation algorithms that target key quality metrics of the grid: maximum power dissipation and uniformity of the power spread across the die. Finally, we presented a combination of both quality metrics that proved to be superior to the other algorithms.

Chapter 5

Power Scheduling with Active RC Power Grids

In active power grids, the worst-case voltage drop is the result of two things: the power budgets that were allocated to the various circuit blocks during the design process and the combination of blocks that are turned ON in a given operational mode. Intuitively, more blocks can be turned ON simultaneously if the blocks are constrained to have low current levels, and vice versa. In this chapter, we propose a framework to manage this trade-off between how many blocks are ON simultaneously and how big the power budgets of the individual blocks are. Subject to user guidance, we generate block-level circuit current constraints as well as an implicit binary decision diagram (BDD) that helps identify the safe working modes. If the blocks are designed to respect these constraints, then the BDD can be used during normal operation to check whether a candidate working mode is safe or not. Figure 5.1 shows a conceptual representation of a chip scheduler. This chip component monitors the on-chip hardware resources required to execute an incoming application issued by the Application Repository to ensure that the voltage variations remain within specs. Developing a scheduler requires, at the very least, up-front analysis to identify the elements or patterns of workload that represent safe operation; this is a key problem that is addressed in this chapter.

Figure 5.1: Conceptual system-level representation of the proposed run-time workload scheduler in a power-gated chip. This figure was inspired by [60].


5.1 High-Level Design Flow of a Chip

The design of a chip typically starts with a chip architect determining the total power dissipation specification of the chip being designed. This is usually done based on some engineering judgment, and/or expertise from previous design activities, with this or similar technologies. The chip architect will then consult with the different logic design teams to assign a power budget for each of the teams. Roughly speaking, this is when the power grid design starts. The initial power grid design is usually arrived at by adapting a previous grid design, perhaps with minor modifications. However, there are no tools available today to try and optimize the grid at the high level, early in the design flow, to accommodate the power budgets assigned for individual logic blocks and high-level macros. There is no way to tell if the grid will support the allocated power budgets, or how much power it can support without generating a voltage drop violation. Our work is aimed at helping overcome all these types of problems. The power scheduling framework proposed in this chapter, if incorporated in the current design flow of a chip, allows us to determine the maximum peak power dissipation that each block can handle. If these power budgets impose too severe a budget on a certain block in some corner of the die, for example, then one can address the problem early on by modifying the grid design, while it is still easy to do so, and generate a fresh set of power budgets for the individual logic blocks. Furthermore, if there are certain functional dependencies among blocks, then we will be able to tell early on whether certain chip workloads will result in voltage drop violations on the grid, and so, perhaps modify the grid design or even reconsider the floorplan. Of course, this still leaves room for final revisions of the grid, once the circuit has been fully defined and the full grid has been specified, by the standard cycle of: discover violation, make revision, and resimulate.

5.2 Overview

In a power-gated design, functional blocks have their own local grids that are connected to the global grid via wide multi-fingered transistors, referred to as sleep transistors or power-gating switches. A schematic diagram of a power-gated grid is shown in Fig. 5.2. Typically, a power-gating transistor may be modeled as an ideal switch in series with a resistor, as in Fig. 5.3. We will refer to the power grid model in Fig. 5.3 as the original grid.

In general, every block may have multiple power states, which may be as simple as: high-performance, low-power, standby, or OFF. We assume that each block can either be turned ON or OFF; this can be easily extended to multiple power states and is not a limitation of our work. If every circuit block is in a certain power state, we say that the chip overall is in a certain working mode. If some circuit blocks are transitioning from one power state to another, we say that the chip is in a transition mode. A power-gated grid should be verified under both working and transition modes. In this chapter, we focus on analyzing the grid under different working modes.

Figure 5.2: Schematic diagram of an active power grid with power gating transistors.

Verifying the original grid for voltage drop is difficult because of the large number of working modes that the grid can have. For example, a chip with 20 blocks, with 2 power states (i.e. ON and OFF) each, has 2^20 = 1,048,576 working modes, i.e. over a million. A brute-force approach would be to verify the passive power grid corresponding to every possible working mode. Clearly, this method is prohibitively expensive as it requires the verification of an exponential number of passive grids, corresponding to the exponential number of possible working modes. Instead, in this work, we verify a slightly simplified model of the grid, which we call the equivalent passive grid, shown in Fig. 5.4. The simplification consists of simply moving the switches down to the bottom of the grid, as shown in the figure. The key benefit of this simplification is that, as we will see in Section 5.4.2, the voltage integrity verification of the equivalent passive grid requires only one verification run for each local grid in isolation, combined by means of a type of superposition in order to identify the set of safe working modes for the full grid.
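The growth in the number of working modes is easy to make concrete. The Python sketch below counts the modes for a given number of blocks and enumerates them explicitly for a small case; the function names are illustrative, and the brute-force loop it hints at is exactly what the equivalent passive grid is meant to avoid.

```python
from itertools import product


def count_working_modes(num_blocks, states_per_block=2):
    """Number of distinct working modes when every block picks one power state."""
    return states_per_block ** num_blocks


def enumerate_working_modes(num_blocks):
    """Yield every ON/OFF assignment as a tuple of booleans (True = ON)."""
    return product((False, True), repeat=num_blocks)


print(count_working_modes(20))                    # 1048576
print(len(list(enumerate_working_modes(3))))      # 8
print(count_working_modes(20, states_per_block=4))
```

A brute-force verifier would need one passive-grid verification per enumerated mode, whereas the approach described above needs only one verification run per local grid.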

These benefits of using the equivalent passive grid come with a very small accuracy cost. To illustrate this, Fig. 5.5 shows a scatter plot of the relative errors, in percent, versus the maximum voltage drop on the nodes of the original grid. The relative errors are on the maximum voltage drop on the nodes of interest of the ON blocks, resulting from using the equivalent passive grid instead of the original grid. The figure also shows the curves corresponding to an error of ±0.6 mV, where a point on the +0.6 mV curve represents a node's maximum voltage drop in the equivalent passive grid that is exactly 0.6 mV away from the maximum voltage drop on the corresponding node in the original grid. On the other hand, a point with an error larger than 0.6 mV will lie in the region above the +0.6 mV curve. From the resulting scatter plot, it is clear that such errors are very small (within ±0.6 mV).

Figure 5.3: Schematic diagram of a power-gated grid using resistive switches, referred to as the original grid.

Figure 5.4: Schematic diagram of the equivalent passive grid.

Taken in isolation, a block (local grid), as shown in Fig. 5.6, can be analyzed separately using the inverse problem (constraints generation) approach for passive grids, as described in previous chapters, to give a container (or set of containers) that respects the maximum allowable voltage drop at all the nodes of interest in the block. Because we expect lower levels of the grid to have less than ideal voltages, suppose that the supply value applied at every block's power taps is parameterized by an artificial variable α. Specifically, for a block k with a uniform voltage drop threshold at all its nodes of interest, i.e. the nodes of interest in that block have the same voltage drop threshold γk, suppose the supply value is Vdd − (1 − αk)γk. We have chosen this expression for the supply value for mathematical convenience, as we will see later. There is no need to relate this supply value to any actual supply value that the full chip may experience at certain layers. In fact, we will see that these variables α1, α2, ..., αq (corresponding to block 1, ..., block q) can be viewed as parameters that become knobs of sorts by which we can have the local containers expand when the supply voltage is increased or contract when it is decreased. The safety of these containers is not assumed based on the choice of α1, α2, ..., αq. Rather, safety will be enforced as part of the subsequent analysis of the full grid, from which we will capture the set of safe working modes of the grid, represented by a set of safe assignments of a Boolean vector β corresponding to any α1, α2, ..., αq. This safe space of β will be captured with a BDD.

[Scatter plot; axis labels: "Voltage drop (mV)" and "Change in the maximum voltage drop"; reference curves at +0.6 mV and −0.6 mV.]

Figure 5.5: Relative error of the maximum voltage drop using the original grid vs. the equivalent passive grid, based on HSPICE simulations of a 400K-node grid with 49 blocks.

Figure 5.6: A block in isolation.

Consider an active power grid that has q blocks. In this chapter, we assume that the power grid, aside from the active elements (or ideal switches), contains only resistive and capacitive (RC) parasitics. So, the topologies of the local grids (in isolation) and the full grid follow the RC model described in Section 2.4.1. Notice that the RC model in Section 2.4.1, as well as the results of the constraints generation framework described in Section 2.6, apply to any passive RC grid, and so they will be invoked to describe the power grid of each block in isolation as well as the full grid. Thus, for ease of extension and to avoid repetition, we define a passive power grid problem P(·) that includes the description of the grid model in Section 2.4.1 and the results presented in Section 2.6, as follows.

Suppose an RC power grid (as described in Section 2.4.1) has n non-Vdd nodes. Out of these n nodes, m nodes are attached to ideal current sources and d nodes have a voltage threshold specification. So, G is the n × n conductance matrix, A is the n × n matrix that appears as a result of the time-discretization in (2.16), M = A^{-1} is the n × n matrix inverse of A, H is the n × m matrix that identifies which node is connected to which current source, M′ is an n × m matrix defined as M′ = MH, P is the d × n matrix which identifies the nodes with a voltage threshold specification, Vth is a d × 1 vector of voltage thresholds, and i(t) is the m × 1 vector of current waveforms of the m current sources. Furthermore, I is an m × 1 vector with units of current which will be used to represent i(t) at an arbitrary time point t, and u (with subscripts or superscripts) is an n × 1 vector with units of volts which will be used to construct a current container. In addition, the set U represents the set of all safe voltage drop assignments, as defined in (2.48), the set F(u) represents a current container, as defined in (2.49), and the set S represents the set of all safe current containers, as defined in (2.50). The mathematical definitions of these sets are restated below:

\[ \mathcal{U} = \{\, u \in \mathbb{R}^n : u \ge 0,\ Pu \le V_{th} \,\} \tag{5.1} \]

\[ \mathcal{F}(u) = \{\, I \in \mathbb{R}^m : I \ge 0,\ M'I \le MGu \,\} \tag{5.2} \]

\[ \mathcal{S} = \{\, \mathcal{F}(u) : u \in \mathcal{U} \,\} \tag{5.3} \]

Finally, v(F(u)) is the n × 1 vector of upper-bounds on the worst-case voltage drop at all grid nodes under all possible current waveforms contained in the current container F(u), i.e.

\[ v(\mathcal{F}(u)) = G^{-1}A \operatorname*{emax}_{I \in \mathcal{F}(u)} (M'I) \tag{5.4} \]

With this, let P(n, m, d, G, C, H, P) denote a passive power grid problem as described above.

5.4 Proposed Approach: Theory

Given the equivalent passive model in Fig. 5.4, our approach consists of two stages: 1) we perform isolated block analysis to generate block-level current containers by adapting the standard inverse problem (constraints generation) approach described in Section 2.6 (this will be discussed in Section 5.4.1); and 2) these block-level containers are then used to identify the behavioral patterns of the whole chip that are safe based on the voltage analysis of the full grid, which we capture as an implicit binary decision diagram (BDD) (this will be discussed in Section 5.4.2). Our approach uses an internal parameter αk for every block k. These parameters become knobs of sorts by which we can have these block-level containers expand or contract, and in turn, the BDD will allow fewer or more blocks to operate simultaneously.

This section includes the bulk of our theoretical contribution, culminating in the result of Theorem 5.1 that leads to the computational efficiency of our approach. This theorem follows from the results of Lemma 5.4, which establishes a scalability property for the upper-bound on the worst-case voltage drop in terms of the internal parameters α1, α2, ..., αq, and Lemma 5.5, which establishes the principle of superposition for the equivalent passive grid. In addition, we show (in Lemma 5.2) that the block-level current containers also have a scalability property in terms of these internal parameters. These results allow us to easily manage the trade-off between the power budgets of the blocks and the number of blocks that are ON simultaneously. Throughout the rest of this section, we will refer to the example in Fig. 5.7 to help the reader better understand our approach.

Figure 5.7: Simple example of a power grid with 2 blocks.

5.4.1 Isolated Block Analysis

In this section, we prove some key results that are applicable to any passive grid, and thus will be used for every block in isolation. Every block k has a uniform voltage drop threshold γk and an internal parameter αk. Throughout the remainder of this section, we will omit the subscript k for notational simplicity, as we are considering a block in isolation.

Safety Condition

Grid safety relates to the voltage drop at every node, i.e., the difference between the ideal supply voltage value Vdd and the true node voltage, denoted vi(t) at every node i. Note that the voltage drop Vdd − vi(t) is relative to the ideal Vdd, and that when we say that node i has a user-specified voltage drop threshold γ, we implicitly mean that γ is the threshold relative to Vdd, so that the node is safe if Vdd − vi(t) ≤ γ. For a block in isolation, and because we expect lower levels of the grid to have less than ideal voltages, suppose its power taps are connected to a parameterized ideal voltage supply of Vdd − (1 − α)γ, with 0 ≤ α ≤ 1, as shown in the example in Fig. 5.8. When α = 1, this supply value is Vdd, and it decreases all the way to Vdd − γ for α = 0. For any node i in that block, [Vdd − (1 − α)γ] − vi(t) is the voltage drop relative to Vdd − (1 − α)γ, and it is easy to see that the node safety condition Vdd − vi(t) ≤ γ is equivalent to [Vdd − (1 − α)γ] − vi(t) ≤ αγ. Thus, the voltage drop threshold relative to the supply value Vdd − (1 − α)γ is simply αγ. It is in this sense that the α parameter is simply a knob that, when reduced, exerts a more stringent safety condition on grid nodes, which naturally results in a smaller container for the local block, allowing more blocks to be turned ON simultaneously, and vice versa. This α becomes an internal parameter that represents the trade-off between the sizes of local grid containers and the number of full grid working modes that will be deemed to be safe.
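The equivalence used above follows by subtracting (1 − α)γ from both sides of the node safety condition. Written out as a worked step (a restatement of the argument in the text, using only the symbols already defined there):

```latex
\begin{align*}
V_{dd} - v_i(t) \le \gamma
  &\iff V_{dd} - v_i(t) - (1-\alpha)\gamma \le \gamma - (1-\alpha)\gamma \\
  &\iff \bigl[V_{dd} - (1-\alpha)\gamma\bigr] - v_i(t) \le \alpha\gamma .
\end{align*}
```

At α = 1 the right-hand side is the full threshold γ measured from Vdd, and at α = 0 it collapses to zero, so no drop below the reduced supply is tolerated.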

Figure 5.8: Simple example of a power grid with a supply value of Vdd − (1− α)γ.

We can then easily extend and re-derive the theory of passive grids from Section 2.6 so that it is parameterized by 0 ≤ α ≤ 1. Consider the generic passive power grid problem, denoted earlier as P(n, m, d, G, C, H, P), which we will apply to an isolated block. We assume that the voltage drop threshold specification is uniform within every block, i.e. all the nodes of interest in that block have the same voltage drop threshold γ > 0, relative to Vdd. We capture this by the d × 1 vector γ1_d, where 1_d is a d × 1 vector whose every entry is 1. Assuming that the power taps of the isolated passive grid are connected to an ideal voltage source of Vdd − (1 − α)γ, let v(t) be the vector of voltage drops relative to [Vdd − (1 − α)γ] at all nodes in the block. Then, as we saw above, a safe voltage drop assignment for the block in isolation must satisfy:

\[ P v(t) \le \alpha\gamma 1_d \tag{5.5} \]

For any α ∈ [0, 1], define the sets U(α), L(u), and S(α) as follows, motivated by (5.5):

\[ \mathcal{U}(\alpha) = \{\, u \in \mathbb{R}^n : 0 \le Pu \le \alpha\gamma 1_d \,\} \tag{5.6} \]

\[ \mathcal{L}(u) = \{\, I \in \mathbb{R}^m : I \ge 0,\ M'I \le MGu \,\} \tag{5.7} \]

\[ \mathcal{S}(\alpha) = \{\, \mathcal{L}(u) : u \in \mathcal{U}(\alpha) \,\} \tag{5.8} \]

The following lemma shows that, for any α > 0, S(α) always has a current container that allows a non-zero current. This will be useful later on.

Lemma 5.1. For any α ∈ (0, 1], S(α) always has a non-empty member L(u) with L(u) ≠ {0}.

Proof. For any α ∈ (0, 1], let u = αγ1_n > 0. Notice that Pu = αγP1_n = αγ1_d, where we used the fact that P has exactly one 1 in each row, so that P1_n = 1_d. It follows that u ∈ U(α). Because G is irreducibly diagonally dominant with positive diagonal and non-positive off-diagonal entries, G1_n ≥ 0, with G1_n ≠ 0. Furthermore, because M > 0, it follows that MGu = αγMG1_n > 0. Finally, let

\[ I = \frac{\min_{\forall i}\,(MGu|_i)}{\max_{\forall i,j}\,(m_{ij})}\, e_1 , \]

so that I ≥ 0 and I ≠ 0, where e_1 is an m × 1 vector whose first entry is 1 and all other entries are zero, and m_ij is the (i, j)th entry of M′. Let c_i denote the ith column of M′ and notice that:

\begin{align}
M'I &= \frac{\min_{\forall i}\,(MGu|_i)}{\max_{\forall i,j}\,(m_{ij})}\, M'e_1
     = \frac{\min_{\forall i}\,(MGu|_i)}{\max_{\forall i,j}\,(m_{ij})}\, c_1 \tag{5.9}\\
    &\le \min_{\forall i}\,(MGu|_i)\, 1_n \tag{5.10}
\end{align}

where the last step is due to c_1 / max_{∀i,j}(m_ij) ≤ 1_n, so that M′I ≤ MGu. It follows that I ≥ 0, I ≠ 0, and I ∈ L(u). Therefore, for any α ∈ (0, 1], we have u ∈ U(α), L(u) ≠ ∅, and L(u) ≠ {0}.
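The construction in the proof can be traced numerically. The sketch below uses a hypothetical 2-node grid (the matrices G and B, the values of α and γ, and the choice H = identity are all assumptions made for illustration, not data from the thesis), builds the current vector I from the proof, and checks that it is non-zero and lies in L(u).

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical 2-node grid; values chosen only for illustration.
G = [[3.0, -1.0], [-1.0, 2.0]]   # conductance matrix (diagonally dominant)
B = [[1.0, 0.0], [0.0, 1.0]]     # C / dt term from time discretization
A = [[G[i][j] + B[i][j] for j in range(2)] for i in range(2)]
M = inverse_2x2(A)               # M = A^{-1}; entrywise positive here
M_prime = M                      # assumes H = identity (one source per node)

alpha, gamma = 0.5, 0.1          # assumed knob value and threshold (volts)
u = [alpha * gamma] * 2          # u = alpha * gamma * 1_n
rhs = matvec(M, matvec(G, u))    # right-hand side M G u of the container

# Current from the proof: e_1 scaled by min_i(MGu|_i) / max_ij(m_ij).
scale = min(rhs) / max(max(row) for row in M_prime)
I = [scale, 0.0]

lhs = matvec(M_prime, I)
is_member = all(x >= 0 for x in I) and all(
    l <= r + 1e-12 for l, r in zip(lhs, rhs)
)
print(is_member)  # True
```

The scaling factor is deliberately conservative: it guarantees membership without solving any optimization problem, which is all the lemma needs.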

Scalability of Current Containers

In Chapter 3, we discussed several algorithms for passive grids that generate a container L(u) ⊆ R^m_+ that is both safe and maximal. These algorithms target specific design objectives such as the total peak power that a grid can safely support, the uniformity of current distribution across the die area, or a combination of both objectives. The peak power algorithm, for example, once extended and parameterized by α as above, then applied to the grid in Fig. 5.8 for different values of α, generates the current containers shown in Fig. 5.9(a). Generating current containers for different values of α requires solving an optimization problem for every required value of α, which is computationally expensive. In this section, we show that, under a certain mild condition on the design objective, the resulting containers can be found by scaling the container corresponding to α = 1, as we will see in Lemma 5.2, which is clearly much faster than generating the containers for every required value of α.

Typically, these algorithms, such as those in Chapter 3, can be expressed in the following general form:

\[ \max_{u \in \mathcal{U}(\alpha)} \Bigl( \max_{I \in \mathcal{L}(u)} f(I, u) \Bigr) \tag{5.11} \]

where f(I, u) : R^m × R^n → R is some real-valued objective function. For example, the peak power algorithm can be expressed in the form of (5.11) with f(I, u) = Σ_{∀j} I_j. Notice that, for any u ∈ R^n, the inner maximization finds the maximum value of f(I, u) over all possible current assignments I ∈ L(u). Thus, the result of the inner maximization is a function of u, denoted as:

\[ g(u) = \max_{I \in \mathcal{L}(u)} f(I, u) \tag{5.12} \]

and referred to as the design objective. The largest g(u) achievable over all possible safe voltage drop assignments u ∈ U(α) is found using the outer maximization, the result of which is a function of α, denoted as g*(α), i.e.

\[ g^*(\alpha) = \max_{u \in \mathcal{U}(\alpha)} g(u) = \max_{\substack{I \in \mathcal{L}(u)\\ u \in \mathcal{U}(\alpha)}} f(I, u) \tag{5.13} \]

For any α ∈ [0, 1], let u*(α) be a vector function that evaluates to a value of u for which the outer maximization attains its maximum, i.e. g(u*(α)) = g*(α), ∀α ∈ [0, 1]. In general, u*(α) may not be unique. The vector u*(α) produced in (5.13) can be used to construct the current container L(u*(α)), where L(·) is defined in (5.7). Note that the optimization problem (5.13) is always feasible, because 0 ∈ U(α) and 0 ∈ L(0), so that u*(α) is well-defined and the resulting container L(u*(α)) is non-empty.

The lemma below is a key theoretical result that gives a sufficient condition under which L(u*(α)), for any supply value Vdd − (1 − α)γ, can be found by simply scaling u*(1) to get u*(α), which is then used to construct L(u*(α)) as in (5.7). This will be useful for the full grid analysis.

Lemma 5.2. If g(cu) = cg(u) for any real number c > 0 and u ∈ R^n, then u*(α) = αu*(1), ∀α ∈ [0, 1].

Proof. Recall that for any α ∈ [0, 1], g*(α) can be found using the following optimization problem:

\[
\begin{aligned}
g^*(\alpha) = \text{Maximize}\quad & g(u)\\
\text{subject to}\quad & Pu \le \alpha\gamma 1_d,\\
& u \ge 0
\end{aligned}
\tag{5.14}
\]

where u ∈ R^n is a vector of artificial variables with units of volts that is used to carry out the above maximization.

Notice that if α = 0, then the constraints of the optimization problem (5.14) become Pu ≤ 0 and u ≥ 0; because P ≥ 0 and has exactly one 1 in each row, it follows that u = 0 is the only vector satisfying those constraints. Also, recall that for any α ∈ [0, 1], u*(α) is defined to be a vector function that evaluates to a value of u for which (5.14) attains its maximum. It follows that u*(α) = 0 = αu*(1), where the last step is due to α = 0.

Consider the case where α > 0. Using the following change of variable:

\[ u = \alpha u' \tag{5.15} \]

we can rewrite (5.14) as:

\[
\begin{aligned}
g^*(\alpha) = \text{Maximize}\quad & g(\alpha u')\\
\text{subject to}\quad & P\alpha u' \le \alpha\gamma 1_d,\\
& \alpha u' \ge 0
\end{aligned}
\tag{5.16}
\]

Notice that α > 0, so that g(αu′) = αg(u′), because g(cu′) = cg(u′) for any c > 0. Furthermore, αu′ ≥ 0 is equivalent to u′ ≥ 0, and Pαu′ ≤ αγ1_d is equivalent to Pu′ ≤ γ1_d. With this, we can rewrite (5.16) as follows:

\[
\begin{aligned}
g^*(\alpha) = \text{Maximize}\quad & \alpha g(u)\\
\text{subject to}\quad & Pu \le \gamma 1_d,\\
& u \ge 0
\end{aligned}
\tag{5.17}
\]

It follows that g*(α) = αg*(1).

Let u = αu*(1) ≥ 0, because α > 0 and u*(1) ≥ 0. Notice that:

\[ Pu = \alpha P u^*(1) \le \alpha\gamma 1_d \tag{5.18} \]

where the second step is due to Pu*(1) ≤ γ1_d and α > 0, so that u ∈ U(α), and:

\[ g^*(\alpha) = \alpha g^*(1) = \alpha g(u^*(1)) = g(\alpha u^*(1)) \tag{5.19} \]

where in the last step we used the fact that cg(u) = g(cu) for any c > 0 and u ∈ R^n. Thus, g*(α) = g(u). Therefore, we can let u*(α) = u = αu*(1), and the proof is complete.

Thus, for any α ∈ [0, 1], we have

\[ \mathcal{L}(u^*(\alpha)) = \{\, I \ge 0 : M'I \le \alpha\, M G u^*(1) \,\} \tag{5.20} \]
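To make (5.20) concrete, the following minimal sketch represents a container by its matrix M′ and its right-hand side, scales only the right-hand side by α, and checks membership. The numbers are hypothetical placeholders (they are not the values of the grid in Fig. 5.8), and the helper names `in_container` and `scaled_rhs` are illustrative.

```python
def in_container(M_prime, rhs, I, tol=1e-12):
    """True if I >= 0 and M' I <= rhs hold element-wise."""
    if any(x < -tol for x in I):
        return False
    return all(
        sum(m * x for m, x in zip(row, I)) <= b + tol
        for row, b in zip(M_prime, rhs)
    )


def scaled_rhs(rhs_at_one, alpha):
    """Right-hand side of (5.20): alpha times the alpha = 1 right-hand side."""
    return [alpha * b for b in rhs_at_one]


# Hypothetical data: M' and the vector standing in for M G u*(1).
M_prime = [[2.0, 1.0], [1.0, 2.0]]
rhs_at_one = [6.0, 6.0]

I_full = [2.0, 2.0]                   # on the boundary of the alpha = 1 container
alpha = 0.5
rhs_half = scaled_rhs(rhs_at_one, alpha)
I_half = [alpha * x for x in I_full]  # the same current, scaled by alpha

print(in_container(M_prime, rhs_at_one, I_full))  # True
print(in_container(M_prime, rhs_half, I_half))    # True
print(in_container(M_prime, rhs_half, I_full))    # False
```

Because the constraints are linear in I, scaling a member of the α = 1 container by α always yields a member of the scaled container, so no optimization has to be re-run per value of α.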

Lemmas C.1 to C.3 in Appendix C.1 show that the design objectives used in Chapter 3 satisfy the condition of the above lemma, so that the condition of the lemma is indeed mild and practical, leading to the above very useful scalability property. Referring to the grid in Fig. 5.8, the peak power algorithm in Chapter 3 for α = 1 gives u*(1) = [100 100]^T mV. Thus, for α = 0.6, we immediately have u*(α) = αu*(1) = [60 60]^T mV. This gives us a scaled container L(u*(α)), so that for any α ∈ [0, 1], we have

\[ \mathcal{L}(u^*(\alpha)) = \left\{ I \ge 0 : \begin{bmatrix} 1.01 & 0.65\\ 0.65 & 1.01 \end{bmatrix} I \le \alpha\, M G u^*(1) \right\} \]
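As an illustration of why the condition of Lemma 5.2 is mild, the following sketch checks it for the peak power objective f(I, u) = Σ_j I_j. This is only an informal argument based on the change of variable I = cI′, in the same spirit as the proof above; the formal statements are those of Lemmas C.1 to C.3 in Appendix C.1, which are not reproduced here.

```latex
\begin{align*}
g(cu) &= \max\Bigl\{ \textstyle\sum_j I_j \;:\; I \ge 0,\ M'I \le c\,MGu \Bigr\} \\
      &= \max\Bigl\{ \textstyle\sum_j c\,I'_j \;:\; cI' \ge 0,\ c\,M'I' \le c\,MGu \Bigr\}
         && (I = cI',\ c > 0) \\
      &= c \max\Bigl\{ \textstyle\sum_j I'_j \;:\; I' \ge 0,\ M'I' \le MGu \Bigr\}
       = c\,g(u).
\end{align*}
```

The argument relies only on the objective being linear in I and on the constraints of L(u) being linear in both I and u.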

5.4.2 Full Grid Analysis

In this section, we apply the results of Section 5.4.1 to every block of the grid. Every block k has its own current container that has the above scalability property in terms of its parameter αk. The importance of this section is two-fold: 1) we show that the worst-case voltage drop contribution at the nodes of interest in the full grid due to the activity of each individual block k also has a scalability property in terms of αk, as presented in Lemma 5.4, and 2) we show that the upper-bound on the worst-case voltage drop on the nodes of interest in the full grid due to the activity of a set of blocks is equal to the sum of the individual contributions of each block in that set, as presented in Lemma 5.5. Thus, an upper-bound on the worst-case voltage drop contribution on the nodes of interest in the full grid due to the activity of a set of blocks, for some value of their internal parameters, can be found simply by adding the scaled contribution of every block k in that set for αk = 1, culminating in the result of Theorem 5.1. For example, consider the example of Fig. 5.7 and suppose that an upper-bound on the worst-case voltage drop at node 1 due to the activity of block 1 for α1 = 1 is 189 mV. Also, suppose that an upper-bound on the worst-case voltage drop at node 1 due to the activity of block 2 for α2 = 1 is 71 mV. Then, an upper-bound on the worst-case voltage drop at node 1 due to the activity of both blocks, for α1 = 0.5 and α2 = 0.25, is simply 189 × 0.5 + 71 × 0.25 = 112.25 mV.

Figure 5.9: (a) A current container F1(α1) for the left block in Fig. 5.7 for different values of α1, (b) a current container F2(α2) for the right block in Fig. 5.7 for different values of α2, (c) the set of safe working modes W(α) for different values of α = [α1 α2]^T under the containers generated for each block in isolation, i.e. F1(α1) and F2(α2); the dashed polygons correspond to α1 = 0.4 and different values of α2, and the solid polygons correspond to α2 = 0.4 and different values of α1, and (d) the set W(α) for different values of α.

Definitions

In isolation, each block is a separate passive power grid, and P(nk, mk, dk, Gk, Ck, Hk, Pk) denotes its passive grid problem. Furthermore, let Bk = Ck/Δtk be the nk × nk capacitance matrix resulting from the Backward Euler numerical integration scheme on block k, so that Ak = Gk + Bk. Also, let Mk = Ak^{-1} ≥ 0 and M′k = MkHk.

We assume that the voltage drop threshold specification is uniform within a block, so that all nodes of interest within the same block have the same threshold specification, i.e. Vth,k = γk 1_{dk}, where γk > 0 and 1_{dk} is a dk × 1 vector of ones. This assumption does not limit our work but allows for several scalability properties, as we will see below, that lead to the computational efficiency of our approach.

For every block k in isolation, let uk be a voltage drop assignment (relative to Vdd − (1 − αk)γk) at all nodes in block k. For every isolated block k and for any αk ∈ [0, 1], define the sets Uk(αk), Lk(uk), and Sk(αk), based on the analysis in Section 5.4.1, as follows:

\[ \mathcal{U}_k(\alpha_k) = \{\, u_k \in \mathbb{R}^{n_k} : 0 \le P_k u_k \le \alpha_k V_{th,k} \,\} \tag{5.21} \]

\[ \mathcal{L}_k(u_k) = \{\, I_k \in \mathbb{R}^{m_k} : I_k \ge 0,\ M'_k I_k \le M_k G_k u_k \,\} \tag{5.22} \]

\[ \mathcal{S}_k(\alpha_k) = \{\, \mathcal{L}_k(u_k) : u_k \in \mathcal{U}_k(\alpha_k) \,\} \tag{5.23} \]

For every αk ∈ [0, 1] and for any uk ∈ Uk(αk), let gk(uk) be a design objective for block k satisfying the conditions of Lemma 5.2, and let g*_k(αk) be defined as follows:

\[ g^*_k(\alpha_k) = \max_{u_k \in \mathcal{U}_k(\alpha_k)} g_k(u_k) \tag{5.24} \]

Let u*_k(αk) be a vector function that evaluates to a value of uk for which the above maximization attains its maximum: gk(u*_k(αk)) = g*_k(αk), ∀αk ∈ [0, 1]. Then, using Lemma 5.2, u*_k(αk) can be expressed as:

\[ u^*_k(\alpha_k) = \alpha_k u^*_k(1), \quad \forall \alpha_k \in [0, 1] \tag{5.25} \]

It is important to note that u*_k(αk) depends on the choice of gk(uk), so that Lk(u*_k(αk)) depends on gk(uk) as well. For ease of notation, let Fk(αk) = Lk(u*_k(αk)), again keeping in mind that Fk(αk) depends on the choice of the design objective gk(uk).

During chip design, we can set the internal parameters αk, k ∈ {1, ..., q}, to ensure the chip currents respect the desired power budgets for the individual blocks. Thus, in the discussion below, we assume the chip is designed to respect these local containers, so that an ON block draws a current that is consistent with Fk(αk), i.e. Ik ∈ Fk(αk), and an OFF block does not draw any current, i.e. Ik = 0.

We will use the notation B and B^q to denote the Boolean spaces B = {0, 1} and B^q = {0, 1}^q. Let βk ∈ B denote the mode of operation of block k, i.e. βk = 1 if block k is ON, otherwise βk = 0. Also, let β = [β1 ··· βq] ∈ B^q denote a working mode for the chip, and α = [α1, ..., αq] ∈ R^q denote a vector where the kth entry represents the internal parameter for block k. Note that αk ∈ [0, 1], ∀k ∈ {1, ..., q}, so that 0 ≤ α ≤ 1_q, which will be denoted using the shorthand α ∈ [0, 1_q]. Furthermore, we will use the shorthand α ∈ (0, 1_q] to denote that 0 < α ≤ 1_q, i.e. αk > 0, ∀k ∈ {1, ..., q}.

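Define F(α, β) ⊂ R^m as follows:

\[
\mathcal{F}(\alpha, \beta) = \left\{
\begin{bmatrix} I_1 \\ \vdots \\ I_q \end{bmatrix} \in \mathbb{R}^m :
I_k \in
\begin{cases}
\mathcal{F}_k(\alpha_k), & \text{if } \beta_k = 1\\
\{0\}, & \text{if } \beta_k = 0
\end{cases}
\right\}
\tag{5.26}
\]

Notice that F(α, β) denotes a current container for all the current sources attached to the grid under the working mode β and for the parameter α.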

With this, we can define υ(α, β) to be an upper-bound on the worst-case voltage drop experienced by the nodes of interest in the equivalent passive grid under the given α and β, based on the passive grid analysis in (5.4), as follows:

\[ \upsilon(\alpha, \beta) = P\, v(\mathcal{F}(\alpha, \beta)) = P G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (M'I) \tag{5.27} \]

Notice that the current vector I that is used to carry out the maximization in (5.27) has the vector form defined in (5.26), i.e. its components I1, I2, ..., Iq correspond to the current sources attached to block 1, block 2, ..., and block q. The columns of M′ in (5.27) correspond to the different components of I, so that we can partition M′ as follows:

\[ M' = \begin{bmatrix} Z_1 & Z_2 & \cdots & Z_q \end{bmatrix} \tag{5.28} \]

where Zk is an n × mk matrix that is multiplied by Ik in (5.27).

For any α ∈ [0, 1_q], let

\[ \upsilon_k(\alpha_k) = P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.29} \]

\[ V(\alpha) = \begin{bmatrix} \upsilon_1(\alpha_1) & \cdots & \upsilon_q(\alpha_q) \end{bmatrix} \tag{5.30} \]

Notice that for any α ∈ [0, 1_q], we have Ik ≥ 0, ∀Ik ∈ Fk(αk), and Zk ≥ 0, because M′ ≥ 0. Furthermore, we have G^{-1}A = I_n + G^{-1}(A − G) = I_n + G^{-1}B ≥ 0, and P ≥ 0, so that υk(αk) ≥ 0. Therefore, V(α) ≥ 0, ∀α ∈ [0, 1_q]. Furthermore, the following lemma shows that if α > 0, then V(α) > 0. This will be useful in Section 5.5.

Lemma 5.3. For any α ∈ (0, 1_q], we have V(α) > 0.

Proof. For any α ∈ (0, 1_q], Fk(αk) is non-empty and Fk(αk) ≠ {0}, i.e. ∃I′ ∈ Fk(αk) such that I′ ≥ 0 and I′ ≠ 0, so that emax_{Ik ∈ Fk(αk)}(ZkIk) ≥ ZkI′ > 0, because Zk > 0, I′ ≥ 0, and I′ ≠ 0. Thus, ∀α ∈ (0, 1_q], we have:

\[ \upsilon_k(\alpha_k) = P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) > 0 \tag{5.31} \]

where we used the fact that G^{-1}A > 0, due to (2.17), and that P ≥ 0 has at least one 1 in each row. Therefore, for any α ∈ (0, 1_q], we have υk(αk) > 0, ∀k ∈ {1, ..., q}. It follows that V(α) > 0, ∀α ∈ (0, 1_q].

Scalability

It is expensive to compute V(α) for different values of α, as this would require solving q emax(·) operations as in (5.29), i.e. q × n linear programs. The lemma below shows that, under a certain mild condition on gk(·), V(α) has a scalability property in terms of α.

For any q × 1 vector x, let D(x) denote the q × q diagonal matrix with the vector x on the main diagonal, i.e.

\[
D(x) = \begin{bmatrix}
x_1 & 0 & \cdots & 0\\
0 & x_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & x_q
\end{bmatrix}
\tag{5.32}
\]

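Lemma 5.4. If gk(cu) = cgk(u) for any real number c > 0, u ∈ R^{nk}, and k ∈ {1, ..., q}, then V(α) = V(1_q)D(α), ∀α ∈ [0, 1_q].

Proof. Recall that, due to (5.25), we have:

\[ u^*_k(\alpha_k) = \alpha_k u^*_k(1) \tag{5.33} \]

for any α ∈ [0, 1_q], and notice that:

\begin{align}
\operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k)
&= \operatorname*{emax}_{\substack{M'_k I_k \le M_k G_k u^*_k(\alpha_k)\\ I_k \ge 0}} Z_k I_k \tag{5.34}\\
&= \operatorname*{emax}_{\substack{M'_k I_k \le \alpha_k M_k G_k u^*_k(1)\\ I_k \ge 0}} Z_k I_k \tag{5.35}
\end{align}

where in the last step we used (5.33).

Notice that if αk = 0, then the constraints of the optimization problem (5.35) become M′k Ik ≤ 0 and Ik ≥ 0; because M′k > 0, it follows that Ik = 0 is the only vector satisfying those constraints. Thus, emax_{Ik ∈ Fk(αk)}(ZkIk) = 0 and υk(αk) = PG^{-1}A emax_{Ik ∈ Fk(αk)}(ZkIk) = 0 = αkυk(1), where the last step is due to αk = 0.

Otherwise, if αk > 0, let Ik = αkI′ be a change of variable in (5.35). Then we can rewrite (5.35) as follows:

\begin{align}
\operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k)
&= \operatorname*{emax}_{\substack{\alpha_k M'_k I' \le \alpha_k M_k G_k u^*_k(1)\\ \alpha_k I' \ge 0}} \alpha_k Z_k I' \tag{5.36}\\
&= \alpha_k \operatorname*{emax}_{\substack{M'_k I' \le M_k G_k u^*_k(1)\\ I' \ge 0}} Z_k I' \tag{5.37}
\end{align}

where in the last step we used the fact that αk > 0. Multiplying both sides of (5.37) by PG^{-1}A, we get:

\[
P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k)
= \alpha_k\, P G^{-1} A \operatorname*{emax}_{\substack{M'_k I' \le M_k G_k u^*_k(1)\\ I' \ge 0}} Z_k I'
\tag{5.38}
\]

or equivalently,

\[ \upsilon_k(\alpha_k) = \alpha_k \upsilon_k(1) \tag{5.39} \]

Notice that (5.39) holds for every k, so that (5.39) can be represented compactly using matrix notation as follows:

\[ V(\alpha) = V(1_q) D(\alpha) \tag{5.40} \]

which completes the proof.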

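Based on the above lemma, for any α ∈ [0, 1_q], we have

\[ V(\alpha) = V(1_q) D(\alpha) \tag{5.41} \]

which is clearly much faster to compute than solving q instances of (5.29) for every required value of α.

For the example in Fig. 5.7, we have

\[
V(\alpha) =
\begin{bmatrix}
189 & 71\\
194 & 78\\
106 & 218\\
106 & 221
\end{bmatrix}
\begin{bmatrix}
\alpha_1 & 0\\
0 & \alpha_2
\end{bmatrix}
=
\begin{bmatrix}
189\alpha_1 & 71\alpha_2\\
194\alpha_1 & 78\alpha_2\\
106\alpha_1 & 218\alpha_2\\
106\alpha_1 & 221\alpha_2
\end{bmatrix}
\]

where the units are in mV.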
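The scaling V(α) = V(1_q)D(α) amounts to multiplying column k of V(1_q) by αk. A minimal sketch using the V(1_q) entries quoted above for the example of Fig. 5.7 (the function name `scale_columns` and the chosen α are illustrative):

```python
# V(1_q) for the example of Fig. 5.7, in mV (rows: nodes of interest,
# columns: blocks), as quoted in the text.
V_ONE = [
    [189.0, 71.0],
    [194.0, 78.0],
    [106.0, 218.0],
    [106.0, 221.0],
]


def scale_columns(V_one, alpha):
    """Return V(1_q) D(alpha): column k of V(1_q) multiplied by alpha[k]."""
    return [[v * a for v, a in zip(row, alpha)] for row in V_one]


V_alpha = scale_columns(V_ONE, [0.5, 0.25])
for row in V_alpha:
    print(row)
```

Only V(1_q) requires solving linear programs; every other V(α) is obtained by this element-wise scaling.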

Superposition

It is practically impossible to solve (5.27) for every required β, as this could lead to a combinatorial explosion in the required values of β. The lemma below establishes the principle of superposition for the equivalent passive grid.

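Lemma 5.5. For any α ∈ [0, 1_q] and β ∈ B^q, we have:

\[ \upsilon(\alpha, \beta) = \sum_{k=1}^{q} \beta_k \upsilon_k(\alpha_k) = V(\alpha)\beta \tag{5.42} \]

Proof. First, we will show that, for any I ∈ F(α, β), we have:

\[ \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (Z_k I_k) = \beta_k \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.43} \]

for any k ∈ {1, ..., q}. This will be useful to carry out the rest of the proof.

Notice that, for any I ∈ F(α, β), if βk = 0, then Ik = 0, due to (5.26). Thus, we have:

\[ \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (Z_k I_k) = 0 = \beta_k \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.44} \]

Furthermore, if βk = 1, then Ik ∈ Fk(αk), due to (5.26). Thus, we have:

\[ \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (Z_k I_k) = \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) = \beta_k \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.45} \]

Therefore, for any I ∈ F(α, β), (5.43) holds.

Now, recall that, due to (5.27), we have:

\[ \upsilon(\alpha, \beta) = P G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (M'I) \tag{5.46} \]

or equivalently, using (5.28), we have:

\[ \upsilon(\alpha, \beta) = P G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} \Bigl( \sum_{k=1}^{q} Z_k I_k \Bigr) \tag{5.47} \]

For any I ∈ F(α, β), notice that, ∀k, j ∈ {1, ..., q} with k ≠ j, Ik and Ij are independent. Thus, we can rewrite (5.47) as follows:

\[ \upsilon(\alpha, \beta) = \sum_{k=1}^{q} P G^{-1} A \operatorname*{emax}_{I \in \mathcal{F}(\alpha, \beta)} (Z_k I_k) \tag{5.48} \]

Substituting (5.43) into (5.48) gives:

\begin{align}
\upsilon(\alpha, \beta) &= \sum_{k=1}^{q} \beta_k\, P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.49}\\
&= \sum_{k=1}^{q} \beta_k \upsilon_k(\alpha_k) \tag{5.50}\\
&= V(\alpha)\beta \tag{5.51}
\end{align}

where (5.50) is due to (5.29) and (5.51) is due to (5.30), which completes our proof.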

The importance of the above lemma is that, for a given value of α ∈ [0, 1_q], υ(α, β) can be found for different working modes β by a simple matrix-vector multiplication between V(α) and β, which is significantly faster than solving (5.27) for every required β.

This leads to our main theoretical result and the main reason behind the computational efficiency of our work, stated in the theorem below.

Theorem 5.1. If gk(cu) = cgk(u) for any real number c > 0, u ∈ R^{nk}, and k ∈ {1, ..., q}, then for any α ∈ [0, 1_q] and β ∈ B^q, we have

\[ \upsilon(\alpha, \beta) = V(1_q) D(\alpha) \beta \tag{5.52} \]

Proof. Notice that, for any α ∈ [0, 1_q] and β ∈ B^q, we have:

\[ \upsilon(\alpha, \beta) = V(\alpha)\beta \tag{5.53} \]

due to Lemma 5.5. Furthermore, because gk(cu) = cgk(u) for any real number c > 0, u ∈ R^{nk}, and k ∈ {1, ..., q}, it follows that V(α) = V(1_q)D(α), due to Lemma 5.4. Thus, we have υ(α, β) = V(α)β = V(1_q)D(α)β, which completes our proof.

The importance of the above result is that it allows us to find an upper-bound on the worst-case voltage drop experienced by the nodes of interest in the full grid, υ(α, β), for any α ∈ [0, 1_q] and β ∈ B^q by solving υk(1), defined in (5.29), ∀k ∈ {1, ..., q}, constructing V(1_q), defined in (5.30), and performing two matrix-vector multiplications, as in (5.52). Note that we only need to find V(1_q) once, which will then be used to find υ(α, β) for any α ∈ [0, 1_q] and β ∈ B^q. This is clearly much faster than solving (5.27) for every required value of α and β. For the example in Fig. 5.7, we have

\[
\upsilon(\alpha, \beta) =
\begin{bmatrix}
189\alpha_1 & 71\alpha_2\\
194\alpha_1 & 78\alpha_2\\
106\alpha_1 & 218\alpha_2\\
106\alpha_1 & 221\alpha_2
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
\ \text{mV}
\tag{5.54}
\]
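Theorem 5.1 reduces the evaluation of υ(α, β) to arithmetic on V(1_q). The sketch below evaluates (5.52) for the example of Fig. 5.7, reusing the V(1_q) entries quoted in the text; the function name `drop_upper_bound` and the chosen α and β values are illustrative.

```python
# V(1_q) for the example of Fig. 5.7, in mV, as quoted in the text.
V_ONE = [
    [189.0, 71.0],
    [194.0, 78.0],
    [106.0, 218.0],
    [106.0, 221.0],
]


def drop_upper_bound(V_one, alpha, beta):
    """Evaluate upsilon(alpha, beta) = V(1_q) D(alpha) beta.

    D(alpha) beta is the element-wise product of alpha and beta, so an
    OFF block (beta_k = 0) contributes nothing and an ON block
    contributes its alpha_k-scaled column of V(1_q).
    """
    weights = [a * b for a, b in zip(alpha, beta)]
    return [sum(v * w for v, w in zip(row, weights)) for row in V_one]


alpha = [0.5, 0.25]
both_on = drop_upper_bound(V_ONE, alpha, [1, 1])
only_block_1 = drop_upper_bound(V_ONE, alpha, [1, 0])
print(both_on)       # first entry is 112.25, matching the text
print(only_block_1)
```

The first entry of `both_on` reproduces the 189 × 0.5 + 71 × 0.25 = 112.25 mV figure computed earlier for node 1.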

Safe Working Modes

In a power-gated grid, the power gating switches of a block are turned off when the logic circuitry underlying that block is in an idle or sleep state. Clearly, the voltage levels inside an OFF block do not affect the voltage integrity of the grid, and the only nodes whose voltage drop matters are the nodes of interest inside the ON blocks, as they are connected to switching logic circuitry. In this section, we provide a formal definition for the safety of the equivalent passive grid that is based on the voltage drops at the nodes of interest inside the ON blocks. Furthermore, we provide an equivalent mathematical condition that captures this safety criterion.

We start by defining the safety condition for the full grid:

Definition 5.1. The equivalent passive grid is said to be safe under F(α, β) if for every node of interest i that belongs to an ON block j, we have υi(α, β) ≤ γj.

In the following lemma, we provide an equivalent mathematical condition that captures the safety of the equivalent passive grid. We introduce a new voltage drop threshold vector that is a function of the working mode β, denoted as vth(β), which is then used to check if the grid is safe by comparing υ(α, β) to vth(β), as we will prove in Lemma 5.6. Based on the working mode β, the entries of vth(β) that correspond to the nodes of interest that belong to OFF blocks become very large, so that the voltage drop at those nodes does not impact the safety of the grid, whereas the entries of vth(β) that correspond to the nodes of interest that belong to ON blocks keep the original voltage drop threshold specification.

Let T be a d × q matrix of 0 and 1 entries that identifies (with a 1) which node of interest belongs to which block, i.e. Tij = 1 if the ith node of interest belongs to the jth block, otherwise Tij = 0. Also, let vth(β) = Vth + ρT(1_q − β), where ρ > 0 is a large number. It is enough for ρ to be larger than ‖V(1_q)1_q‖∞. Notice that for any β ∈ B^q, we have β ≤ 1_q, so that 1_q − β ≥ 0, which, because ρ > 0 and T ≥ 0, gives ρT(1_q − β) ≥ 0. Thus, we have:

\[ v_{th}(\beta) = V_{th} + \rho T (1_q - \beta) \ge V_{th} > 0 \tag{5.55} \]
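The effect of the ρT(1_q − β) term can be seen on a small instance. In the sketch below, the node-to-block assignment T, the 100 mV thresholds, and the value of ρ are assumptions made for illustration (they are not given in the text for Fig. 5.7); the function name `mode_thresholds` is likewise illustrative.

```python
def mode_thresholds(V_th, T, beta, rho):
    """Evaluate v_th(beta) = V_th + rho * T (1_q - beta), entry by entry."""
    off = [1 - b for b in beta]  # 1 for OFF blocks, 0 for ON blocks
    return [
        vt + rho * sum(t * o for t, o in zip(row, off))
        for vt, row in zip(V_th, T)
    ]


# Assumed data: four nodes of interest, nodes 1-2 in block 1 and
# nodes 3-4 in block 2, each with a 100 mV threshold.
V_TH = [100.0, 100.0, 100.0, 100.0]
T = [[1, 0], [1, 0], [0, 1], [0, 1]]
RHO = 1000.0  # any value above the infinity norm of V(1_q) 1_q would do

print(mode_thresholds(V_TH, T, [1, 0], RHO))  # block 2 OFF
print(mode_thresholds(V_TH, T, [1, 1], RHO))  # both blocks ON
```

Nodes in an OFF block receive a threshold inflated by ρ, which effectively removes them from the safety check, while nodes in ON blocks keep their original threshold.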

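Lemma 5.6. For any α ∈ [0, 1_q] and β ∈ B^q, the equivalent passive grid is safe if and only if V(α)β ≤ vth(β).

Proof. For any α ∈ [0, 1_q] and β ∈ B^q, let yk = βkIk, with Ik ∈ Fk(αk), ∀k. Also, let y ∈ R^m be defined as follows:

\[ y = \begin{bmatrix} y_1 \\ \vdots \\ y_q \end{bmatrix} \tag{5.56} \]

Notice that y ∈ F(α, β), because yk = 0 if βk = 0, and yk = Ik ∈ Fk(αk) if βk = 1. With this, we can rewrite (5.27) as follows:

\begin{align}
\upsilon(\alpha, \beta) &= P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k),\, \forall k} (M'y) \tag{5.57}\\
&= P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k),\, \forall k} \Bigl( \sum_{k=1}^{q} \beta_k Z_k I_k \Bigr) \tag{5.58}
\end{align}

Because the constraints of the emax(·) operation in (5.58) are decoupled, we can rewrite (5.58) as follows:

\[ \upsilon(\alpha, \beta) = P G^{-1} A \sum_{k=1}^{q} \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (\beta_k Z_k I_k) \tag{5.59} \]

or equivalently,

\[ \upsilon(\alpha, \beta) = \sum_{k=1}^{q} \beta_k\, P G^{-1} A \operatorname*{emax}_{I_k \in \mathcal{F}_k(\alpha_k)} (Z_k I_k) \tag{5.60} \]

Therefore, we have:

\[ \upsilon(\alpha, \beta) = \sum_{k=1}^{q} \bigl( \beta_k \upsilon_k(\alpha_k) \bigr) = V(\alpha)\beta \tag{5.61} \]

With this, notice the following:

\begin{align}
& V(\alpha)\beta \le v_{th}(\beta) \tag{5.62}\\
\iff\ & \upsilon(\alpha, \beta) \le v_{th}(\beta) = V_{th} + \rho T (1_q - \beta) \tag{5.63}\\
\iff\ & \upsilon_i(\alpha, \beta) \le \gamma_j + \rho (1 - \beta_j) \tag{5.64}
\end{align}

for any i ∈ {1, ..., d} such that the node of interest i belongs to block j, where in the last step we used the fact that Tik = 0, ∀k ≠ j, and Tij = 1. Notice that V(α)β ≤ V(α)1_q, because V(α) ≥ 0 and β ≤ 1_q, so that υi(α, β) = V(α)β|_i ≤ V(α)1_q|_i ≤ ‖V(α)1_q‖∞ ≤ ρ. Therefore, if βj = 0, then (5.64) is always satisfied, because γj ≥ 0. It follows that (5.64) is equivalent to the following:

\[ \upsilon_i(\alpha, \beta) \le \gamma_j, \quad \forall \beta \in \mathbb{B}^q \text{ with } \beta_j = 1 \tag{5.65} \]

for any i ∈ {1, ..., d} such that the node of interest i belongs to block j. Therefore, the equivalent passive grid is safe if and only if V(α)β ≤ vth(β), which completes the proof.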

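For any β ∈ B^q such that V(α)β ≤ vth(β), β is said to be a safe working mode. Define the set W(α) to be the set of all safe working modes under the blocks' containers Fk(αk), i.e.

\[ \mathcal{W}(\alpha) = \{\, \beta \in \mathbb{B}^q : V(\alpha)\beta \le v_{th}(\beta) \,\} \tag{5.66} \]

which is captured by a BDD.

For the example in Fig. 5.7, W(α) is

\[ \mathcal{W}(\alpha) = \{\, \beta \in \mathbb{B}^2 : V(\alpha)\beta \le V_{th} + \rho T (1_q - \beta) \,\} \tag{5.67} \]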
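For a grid with only a few blocks, the set W(α) in (5.66) can be listed by brute-force enumeration of β. The sketch below does this for the V(1_q) of Fig. 5.7; it is not the BDD-based representation used in this work, and the thresholds (100 mV per node), the node-to-block matrix T, and the helper name `safe_working_modes` are assumptions made for illustration.

```python
from itertools import product

# V(1_q) for the example of Fig. 5.7, in mV, as quoted in the text.
V_ONE = [
    [189.0, 71.0],
    [194.0, 78.0],
    [106.0, 218.0],
    [106.0, 221.0],
]
# Assumed: nodes 1-2 belong to block 1, nodes 3-4 to block 2,
# and every node of interest has a 100 mV threshold.
T = [[1, 0], [1, 0], [0, 1], [0, 1]]
V_TH = [100.0, 100.0, 100.0, 100.0]
# rho chosen larger than the infinity norm of V(1_q) 1_q.
RHO = 1.0 + max(sum(row) for row in V_ONE)


def safe_working_modes(V_one, alpha, V_th, T, rho):
    """Enumerate all beta in {0,1}^q with V(1_q) D(alpha) beta <= v_th(beta)."""
    safe = []
    for beta in product((0, 1), repeat=len(alpha)):
        weights = [a * b for a, b in zip(alpha, beta)]
        drops = [sum(v * w for v, w in zip(row, weights)) for row in V_one]
        limits = [
            vt + rho * sum(t * (1 - b) for t, b in zip(row, beta))
            for vt, row in zip(V_th, T)
        ]
        if all(d <= lim for d, lim in zip(drops, limits)):
            safe.append(beta)
    return safe


print(safe_working_modes(V_ONE, [0.5, 0.25], V_TH, T, RHO))
print(safe_working_modes(V_ONE, [0.4, 0.2], V_TH, T, RHO))
```

Under these assumed thresholds, the larger budgets α = [0.5, 0.25] rule out the mode with both blocks ON, while the smaller budgets α = [0.4, 0.2] admit all four modes. Enumeration costs 2^q checks, so it only serves to illustrate the definition on small examples.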

To better visualize the set W(α), consider again the example of Fig. 5.7. In Fig. 5.9(c), we show a relaxed version of the set W(α), defined as W′(α) = {x ∈ R^q : V(α)x ≤ vth(x)}, for different values of α. Notice that, for any α ∈ [0, 1_q], W(α) consists of the Boolean vectors β ∈ B^q that lie inside the space W′(α). So, as shown in Fig. 5.9(c),

(5.68)

(5.69)

So far, any $\alpha \in [0, \mathbf{1}_q]$ will give us the required block-level current containers $\mathcal{F}_k(\alpha_k) = \mathcal{L}_k(u_k^*(\alpha_k))$ and the corresponding set of safe working modes $\mathcal{W}(\alpha)$, as defined in (5.22) and (5.66).

Figure 5.10: A 3-D plot in (a) and a contour plot in (b) of the percentage of safe working modes for different values of $\alpha$ on a 5k node grid with 16 blocks. The color bar represents the percentage of safe working modes.

5.5 Application

Referring again to the example of Fig. 5.7, notice that a larger $\alpha_1$ corresponds to a larger $\mathcal{F}_1(\alpha_1)$ (as shown in Fig. 5.9(a)), and hence, a larger power budget for block 1. Similarly, a larger $\alpha_2$ corresponds to a larger $\mathcal{F}_2(\alpha_2)$ (as shown in Fig. 5.9(b)), and hence, a larger power budget for block 2. On the other hand, larger local power budgets result in larger voltage drops at the grid nodes, and hence, a smaller number of safe working modes (as shown in Fig. 5.9(d)). To better illustrate this, consider two different values of $\alpha$: $\alpha^{(1)} = [0.4\ \ 0.2]^T$ and $\alpha^{(2)} = [0.2\ \ 0.2]^T$. Notice that $\mathcal{F}_1(\alpha_1^{(1)}) \supset \mathcal{F}_1(\alpha_1^{(2)})$, as shown in Fig. 5.9(a), and $\mathcal{F}_2(\alpha_2^{(1)}) = \mathcal{F}_2(\alpha_2^{(2)})$, because $\alpha_2^{(1)} = \alpha_2^{(2)}$. Furthermore, $\mathcal{W}(\alpha^{(1)}) \subset \mathcal{W}(\alpha^{(2)})$, as shown in Fig. 5.9(d). Therefore, $\alpha^{(1)}$ allows a larger power budget for block 1 but allows less flexibility in terms of the number of safe working modes, as compared to $\alpha^{(2)}$. There is a clear trade-off for different values of $\alpha$. The trade-off is between the local power budgets allocated to individual blocks (based on the generated local containers) and the number of safe working modes. In fact, as we will see below, the local power budget of block $k$ is directly proportional to $\alpha_k$, and hence, we can think of $\alpha$ as the allocated power budgets for the individual blocks which, in turn, determine the safe working modes. In Fig. 5.10, we show the trade-off achieved for different values of $\alpha$ on a 5k node grid with 16 blocks. Fig. 5.10(a) and Fig. 5.10(b) correspond to different values of $\alpha_1$ and $\alpha_2$ between 0 and 0.4, while $\alpha_3, \alpha_4, \dots, \alpha_{16}$ are fixed to 0.85. Again, because the power budget of block $k$ is directly proportional to its parameter $\alpha_k$, we present the percentage of safe working modes as a function of the power budgets for blocks 1 and 2 (the corresponding values of $\alpha_1$ and $\alpha_2$ are shown at the right and top axes of Fig. 5.10(b), respectively). Some values of $\alpha$ allow for large local power budgets but a small number of safe working modes, whereas other values of $\alpha$ allow small local power budgets but a large number of safe working modes. Thus, the question becomes, which $\alpha$ should we choose?

In this section, we will describe two design objectives: 1) the maximum peak-power dissipation that each block can safely support and 2) the largest number of safe working modes. In Section 5.5.1, we will describe some types of user-specified constraints that our approach can handle, basically constraints on the peak power that each block can safely support and the allowable working modes, and we will see that these constraints can be represented as linear inequalities on $\alpha$, resulting in a feasible space of $\alpha$, denoted as $\mathcal{A}$. The proposed algorithms will each be formulated as a maximization of the corresponding design objective, over all $\alpha \in \mathcal{A}$, resulting in an $\alpha$ that allows large local power budgets at the cost of a small number of safe working modes, or an $\alpha$ that allows more blocks to turn ON simultaneously at the cost of smaller local power budgets. Or, as probably the most useful case, an intermediate value of $\alpha$ between the two limits will be chosen to achieve some objective on the size of the local containers or the percentage of safe working modes.

Figure 5.11: The feasible space of $\alpha$ in Fig. 5.7 as a result of some user-specified constraints.

5.5.1 User-specified Constraints

In this section, we will examine two approaches for users to influence the space of $\alpha$ based on any specifications that may be known about the design at an early stage, thus achieving different trade-offs for chip operation. In a sense, these specifications will help reduce the space of $\alpha$ to a space that reflects design knowledge.

The user can enforce some working modes to be allowed during chip operation, which we can incorporate as in (5.75). Also, the user can enforce any local current/power budgets to satisfy some constraints, which we can incorporate as in (5.95). Assuming that the working modes constraints and the current/power constraints are consistent and feasible, so that there exists an $\alpha \in [0, \mathbf{1}_q]$ that satisfies (5.75) and (5.95), then we can define the feasible space of $\alpha$ as follows:

\[ \mathcal{A} = \{ \alpha \in [0, \mathbf{1}_q] : W\alpha \le w,\ p_{lb} \le R\alpha \le p_{ub} \} \tag{5.70} \]

Fig. 5.11 shows an example of $\mathcal{A}$ for the simple grid in Fig. 5.7 corresponding to the user-specified constraints in (5.72), (5.73), and (5.92)-(5.94).

Working Modes Constraints

Suppose we have some knowledge about the working modes of the circuit, for example, if there exist some dependencies among the blocks, i.e. a subset of the blocks are required to be ON at the same time. In general, let $\mathcal{W}_0$ denote the set of user-specified working modes that are required to be safe. This type of constraint can be easily embedded into our framework by searching for an $\alpha$ that satisfies $\mathcal{W}_0 \subseteq \mathcal{W}(\alpha)$. We will see below that this constraint can be represented as a set of linear constraints on $\alpha$, i.e.

\[ W\alpha \le w \tag{5.71} \]

Referring to the example of Fig. 5.7, we can impose a constraint that each block is safe to turn on separately. In other words, we are interested in the values of $\alpha \in [0, \mathbf{1}_2]$ such that $\beta^{(1)}, \beta^{(2)} \in \mathcal{W}(\alpha)$, where $\beta^{(1)} = [1\ \ 0]^T$ and $\beta^{(2)} = [0\ \ 1]^T$. Thus, based on (5.67), we have:

\begin{align}
V(\alpha)\beta^{(1)} \le v_{th}(\beta^{(1)}) &\iff \alpha_1 \le 0.52 \tag{5.72} \\
V(\alpha)\beta^{(2)} \le v_{th}(\beta^{(2)}) &\iff \alpha_2 \le 0.32 \tag{5.73}
\end{align}

In Fig. 5.11, we show the above two constraints; (5.72) is numbered 1 and (5.73) is numbered 2.

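For any $\beta \in \mathcal{W}_0$, let $W(\beta) = V(\mathbf{1}_q)D(\beta)$. Assuming that a total of $\zeta$ working modes are required to be safe, i.e. $\mathcal{W}_0 = \{ \beta^{(1)}, \beta^{(2)}, \dots, \beta^{(\zeta)} \}$, let $W$ and $w$ be a $(\zeta d) \times q$ matrix and a $(\zeta d) \times 1$ vector, respectively, such that:

\[ W = \begin{bmatrix} W(\beta^{(1)}) \\ \vdots \\ W(\beta^{(\zeta)}) \end{bmatrix}, \qquad w = \begin{bmatrix} v_{th}(\beta^{(1)}) \\ \vdots \\ v_{th}(\beta^{(\zeta)}) \end{bmatrix} \tag{5.74} \]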

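The following lemma transforms the constraint $\mathcal{W}_0 \subseteq \mathcal{W}(\alpha)$ into a set of linear inequalities on $\alpha$.

Lemma 5.7. If $g_k(cu) = c\,g_k(u)$, for any real number $c > 0$, $u \in \mathbb{R}^{n_k}$ and $\forall k \in \{1, \dots, q\}$, then for any $\alpha \in [0, \mathbf{1}_q]$, $\mathcal{W}_0 \subseteq \mathcal{W}(\alpha)$ if and only if:

\[ W\alpha \le w \tag{5.75} \]

Proof. Notice the following:

\begin{align}
& W\alpha \le w \tag{5.76} \\
\iff\ & W(\beta)\alpha \le v_{th}(\beta), \quad \forall \beta \in \mathcal{W}_0 \tag{5.77} \\
\iff\ & V(\mathbf{1}_q)D(\beta)\alpha \le v_{th}(\beta), \quad \forall \beta \in \mathcal{W}_0 \tag{5.78}
\end{align}

Furthermore, notice that:

\begin{align}
D(\beta)\alpha &= [\beta_1\alpha_1\ \cdots\ \beta_q\alpha_q]^T \tag{5.79} \\
&= D(\alpha)\beta \tag{5.80}
\end{align}

With this, and benefiting from (5.78), we get:

\begin{align}
& V(\mathbf{1}_q)D(\alpha)\beta \le v_{th}(\beta), \quad \forall \beta \in \mathcal{W}_0 \tag{5.81} \\
\iff\ & V(\alpha)\beta \le v_{th}(\beta), \quad \forall \beta \in \mathcal{W}_0 \tag{5.82}
\end{align}

where the last step is due to Lemma 5.4, and the proof is complete.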
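The stacking in (5.74) is mechanical, and a short sketch may make it easier to see how a list of required working modes turns into rows of $W\alpha \le w$. The helper names `stack_constraints` and `satisfies`, the matrix $V(\mathbf{1}_q)$, and the value of $\rho$ below are all hypothetical and chosen only for illustration; the sketch assumes the scalability condition of Lemma 5.7 holds.

```python
def stack_constraints(V1, V_th, rho, T, W0):
    """Build (W, w) as in (5.74), with W(beta) = V(1_q) D(beta).

    V1 is the d x q matrix V(1_q); W0 is a list of 0/1 tuples (required modes).
    """
    d, q = len(V1), len(V1[0])
    W, w = [], []
    for beta in W0:
        for i in range(d):
            # Row i of V(1_q) D(beta): column k is kept only if block k is ON.
            W.append([V1[i][k] * beta[k] for k in range(q)])
            # Entry i of v_th(beta) = V_th + rho * T * (1_q - beta).
            w.append(V_th[i] + rho * sum(T[i][k] * (1 - beta[k]) for k in range(q)))
    return W, w

def satisfies(W, w, alpha, tol=1e-12):
    """Check W alpha <= w row by row."""
    return all(sum(r[k] * alpha[k] for k in range(len(alpha))) <= b + tol
               for r, b in zip(W, w))

# Hypothetical data: two blocks, one node of interest per block.
V1 = [[0.15, 0.05],
      [0.04, 0.25]]
V_th = [0.08, 0.08]
rho = 0.3
T = [[1, 0],
     [0, 1]]
W0 = [(1, 0), (0, 1)]          # each block must be safe to turn ON alone
W, w = stack_constraints(V1, V_th, rho, T, W0)
```

For this made-up data the binding rows reduce to $0.15\,\alpha_1 \le 0.08$ and $0.25\,\alpha_2 \le 0.08$, so `satisfies(W, w, [0.5, 0.3])` holds while `satisfies(W, w, [0.6, 0.3])` does not.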

Current/Power Constraints

A broad range of power bounds can be imposed on the resulting containers, given specifications about the design at an early stage. In the following, we will discuss several examples of such constraints that could be embedded in our framework and we will show in Lemma 5.9 that these constraints can be represented as a set of linear inequalities on $\alpha$, i.e.

\[ p_{lb} \le R\alpha \le p_{ub} \tag{5.83} \]

Define $\psi_k(\alpha_k)$ to be the largest instantaneous peak power dissipation achievable under $\mathcal{F}_k(\alpha_k)$, i.e.

\[ \psi_k(\alpha_k) = V_{dd} \max_{I_k \in \mathcal{F}_k(\alpha_k)} \left( \mathbf{1}_{m_k}^T I_k \right) \tag{5.84} \]

Recall that for any $\alpha \in [0, \mathbf{1}_q]$, $\mathcal{F}_k(\alpha_k)$ is non-empty, so that $\psi_k(\alpha_k) \ge 0$ is well-defined.

The simplest bounds are on the minimum and peak power, referred to as local constraints, such as $\psi_{lb} \le \psi(\alpha) \le \psi_{ub}$, where $\psi(\alpha) = [\psi_1(\alpha_1)\ \cdots\ \psi_q(\alpha_q)]^T$ is a $q \times 1$ vector of the peak power dissipation that each block can safely support and $\psi_{lb}$ and $\psi_{ub}$ are vectors of user-specified lower and upper bounds on the peak power dissipation of the blocks. Another bound commonly available from the design specification at an early stage is the peak total power dissipation of a group of blocks, referred to as global constraints. Assuming we have a total of $\kappa$ global constraints, we can incorporate these constraints as $c_{lb} \le F\psi(\alpha) \le c_{ub}$, where $F$ is a $\kappa \times q$ matrix that consists only of 0s and 1s which indicate which blocks are present in each constraint, so that $F \ge 0$ has no row with all zeros, and $c_{lb}$ and $c_{ub}$ are $\kappa \times 1$ vectors representing the lower and upper bounds on the peak power dissipation. We can represent the local and global constraints compactly as:

\[ p_{lb} \le U\psi(\alpha) \le p_{ub} \tag{5.85} \]

where $p_{lb} = \begin{bmatrix} \psi_{lb} \\ c_{lb} \end{bmatrix}$, $p_{ub} = \begin{bmatrix} \psi_{ub} \\ c_{ub} \end{bmatrix}$, and $U = \begin{bmatrix} I_q \\ F \end{bmatrix}$, with $I_q$ the $q \times q$ identity matrix.

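The following lemma establishes the scalability of $\psi(\alpha)$, which will be useful to prove Lemma 5.9.

Lemma 5.8. If $g_k(cu) = c\,g_k(u)$, for any real number $c > 0$, $u \in \mathbb{R}^{n_k}$ and $\forall k \in \{1, \dots, q\}$, then $\psi(\alpha) = D(\alpha)\psi(\mathbf{1}_q)$, $\forall \alpha \in [0, \mathbf{1}_q]$.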

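Proof. Recall that for any $\alpha \in [0, \mathbf{1}_q]$, $\psi_k(\alpha_k)$ can be found using the following optimization problem:

\[ \begin{array}{rll} \psi_k(\alpha_k) = & \text{Maximize} & \mathbf{1}_{m_k}^T I_k \\ & \text{subject to} & M'_k I_k \le \alpha_k M_k G_k u_k^*(1) \\ & & I_k \ge 0 \end{array} \tag{5.86} \]

where we used the fact that $u_k^*(\alpha_k) = \alpha_k u_k^*(1)$, due to (5.25), and $I_k \in \mathbb{R}^{m_k}$ is a vector of artificial variables with units of current that is used to carry out the above maximization.

Notice that if $\alpha_k = 0$, then the constraints of the optimization problem (5.86) become $M'_k I_k \le 0$ and $I_k \ge 0$ which, because $M'_k \ge 0$, means that $I_k = 0$ is the only vector satisfying those constraints. It follows that $\psi_k(\alpha_k) = 0 = \alpha_k \psi_k(1)$, where the last step is due to $\alpha_k = 0$.

Otherwise, if $\alpha_k > 0$, consider the following change of variable:

\[ I_k = \alpha_k I' \tag{5.87} \]

With this, we can rewrite (5.86) as follows:

\[ \begin{array}{rll} \psi_k(\alpha_k) = & \text{Maximize} & \alpha_k \mathbf{1}_{m_k}^T I' \\ & \text{subject to} & \alpha_k M'_k I' \le \alpha_k M_k G_k u_k^*(1) \\ & & \alpha_k I' \ge 0 \end{array} \tag{5.88} \]

which, due to $\alpha_k > 0$, gives:

\[ \begin{array}{rll} \psi_k(\alpha_k) = & \text{Maximize} & \alpha_k \mathbf{1}_{m_k}^T I' \\ & \text{subject to} & M'_k I' \le M_k G_k u_k^*(1) \\ & & I' \ge 0 \end{array} \tag{5.89} \]

so that $\psi_k(\alpha_k) = \alpha_k \psi_k(1)$, $\forall k$, which can be represented compactly using vector notation as $\psi(\alpha) = D(\alpha)\psi(\mathbf{1}_q)$, and the proof is complete.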

Based on the above lemma, for any $\alpha \in [0, \mathbf{1}_q]$, we have

\[ \psi(\alpha) = D(\alpha)\psi(\mathbf{1}_q) \tag{5.90} \]

which is clearly much faster to compute than solving $q$ instances of (5.84) for every required value of $\alpha$. For the example of Fig. 5.7, we have:

\[ \psi(\alpha) = \begin{bmatrix} 100\alpha_1 \\ 70\alpha_2 \end{bmatrix} \text{mW} \tag{5.91} \]

which allows us to impose power constraints on the blocks. For example, suppose the peak powers of block 1 and block 2 must be at least 15 mW and 7 mW, respectively, and the total peak power that both blocks can dissipate simultaneously must be at least 30 mW. In other words, we are only interested in the values of $\alpha \in [0, \mathbf{1}_2]$ such that:

\begin{align}
\psi_1(\alpha_1) \ge 15 \text{ mW} &\iff \alpha_1 \ge 0.15 \tag{5.92} \\
\psi_2(\alpha_2) \ge 7 \text{ mW} &\iff \alpha_2 \ge 0.1 \tag{5.93} \\
\psi_1(\alpha_1) + \psi_2(\alpha_2) \ge 30 \text{ mW} &\iff \alpha_1 + 0.7\alpha_2 \ge 0.3 \tag{5.94}
\end{align}

In Fig. 5.11, we show the above three constraints; (5.92) is numbered 3, (5.93) is numbered 4, and (5.94) is numbered 5.

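The following lemma transforms the user-specified power constraints into a set of linear inequalities on $\alpha$.

Lemma 5.9. If $g_k(cu) = c\,g_k(u)$, for any real number $c > 0$, $u \in \mathbb{R}^{n_k}$ and $\forall k \in \{1, \dots, q\}$, then we have $p_{lb} \le U\psi(\alpha) \le p_{ub}$ if and only if

\[ p_{lb} \le R\alpha \le p_{ub} \tag{5.95} \]

where $R = UD(\psi(\mathbf{1}_q))$.

Proof. Notice that:

\begin{align}
& p_{lb} \le U\psi(\alpha) \le p_{ub} \tag{5.96} \\
\iff\ & p_{lb} \le UD(\alpha)\psi(\mathbf{1}_q) \le p_{ub} \tag{5.97}
\end{align}

where the equivalence is due to Lemma 5.8. Furthermore, notice that:

\begin{align}
D(\alpha)\psi(\mathbf{1}_q) &= [\alpha_1\psi_1(1)\ \cdots\ \alpha_q\psi_q(1)]^T \tag{5.98} \\
&= D(\psi(\mathbf{1}_q))\alpha \tag{5.99}
\end{align}

With this, and benefiting from (5.97), we get:

\begin{align}
& p_{lb} \le UD(\psi(\mathbf{1}_q))\alpha \le p_{ub} \tag{5.100} \\
\iff\ & p_{lb} \le R\alpha \le p_{ub} \tag{5.101}
\end{align}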
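As a quick illustration of Lemma 5.9, the matrix $R = UD(\psi(\mathbf{1}_q))$ can be formed for the example constraints (5.92)-(5.94). The sketch below assumes $\psi(\mathbf{1}_2) = [100\ \ 70]^T$ mW as in (5.91); the helper names `build_R` and `within` are hypothetical, and the upper bounds are set to infinity because the example only states lower bounds.

```python
def build_R(U, psi_one):
    """R = U D(psi(1_q)): column k of U scaled by psi_k(1)."""
    return [[U[r][k] * psi_one[k] for k in range(len(psi_one))]
            for r in range(len(U))]

def within(R, p_lb, p_ub, alpha, tol=1e-9):
    """Check p_lb <= R alpha <= p_ub row by row."""
    vals = [sum(R[r][k] * alpha[k] for k in range(len(alpha)))
            for r in range(len(R))]
    return all(lo - tol <= v <= hi + tol for lo, v, hi in zip(p_lb, vals, p_ub))

psi_one = [100.0, 70.0]            # mW, peak power at alpha = 1_q
U = [[1, 0],                       # local constraint on block 1
     [0, 1],                       # local constraint on block 2
     [1, 1]]                       # global constraint on both blocks (F = [1 1])
p_lb = [15.0, 7.0, 30.0]           # mW, lower bounds of (5.92)-(5.94)
p_ub = [float("inf")] * 3          # assumption: no upper bounds in this example
R = build_R(U, psi_one)
```

Here `R` is `[[100, 0], [0, 70], [100, 70]]`, so $\alpha = [0.2\ \ 0.2]^T$ satisfies all three rows, whereas $\alpha = [0.15\ \ 0.1]^T$ meets both local bounds but violates the global one (22 mW < 30 mW).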

5.5.2 Maximum Local Power

The design team may be interested in a workload scheduler that allows as much local power dissipation as possible to the underlying circuit. We refer here to the instantaneous power dissipation, which is conservatively approximated by $V_{dd} \sum_{j=1}^{m_k} i_{k,j}(t)$ for every block $k$, where $i_{k,j}(t)$ is the time-varying current waveform representing the current drawn by the $j$th current source in block $k$. Recall that $\psi_k(\alpha_k)$ defines the peak power dissipation that block $k$ can safely support in the underlying circuit. Thus, we are interested in an $\alpha$ that allows the highest possible $\sum_{\forall k} \psi_k(\alpha_k)$, while satisfying the user-specified requirements on the resulting local containers and working modes, i.e. (5.70). We can formulate this as the following optimization problem:

\[ \begin{array}{rll} \sigma^* = & \text{Maximize} & \mathbf{1}_q^T \psi(\alpha) \\ & \text{subject to} & \alpha \in \mathcal{A} \end{array} \tag{5.102} \]

Let $\alpha^{(p)}$ be a vector at which the above maximization attains its maximum. In other words, $\alpha^{(p)} \in \mathcal{A}$ such that $\mathbf{1}_q^T \psi(\alpha^{(p)}) = \sigma^*$. Because $\mathcal{A}$ is non-empty, it follows that $\alpha^{(p)}$ is well-defined. Therefore, the resulting block-level containers are $\mathcal{F}_k(\alpha_k^{(p)})$, which describe the following current constraints:

\begin{align}
i_k(t) &\ge 0 \tag{5.103} \\
M'_k i_k(t) &\le \alpha_k^{(p)} M_k G_k u_k^*(1) \tag{5.104}
\end{align}

(5.105)

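for every $k \in \{1, \dots, q\}$, where $i_k(t)$ is the time-varying current waveform representing the current drawn by the $k$th block. Furthermore, the resulting set of safe working modes is:

\[ \mathcal{W}(\alpha^{(p)}) = \{ \beta \in \mathbb{B}^q : V(\mathbf{1}_q)D(\alpha^{(p)})\beta \le v_{th}(\beta) \} \tag{5.106} \]

In the following, we will show that the optimization problem in (5.102) is equivalent to the LP in (5.110), and hence, we will be solving (5.110) instead of (5.102). Notice that, due to Lemma 5.8, we have:

\[ \mathbf{1}_q^T \psi(\alpha) = \mathbf{1}_q^T D(\alpha)\psi(\mathbf{1}_q) \tag{5.107} \]

Also, notice that:

\begin{align}
D(\alpha)\psi(\mathbf{1}_q) &= [\alpha_1\psi_1(1)\ \cdots\ \alpha_q\psi_q(1)]^T \tag{5.108} \\
&= D(\psi(\mathbf{1}_q))\alpha \tag{5.109}
\end{align}

Therefore, we have $\mathbf{1}_q^T \psi(\alpha) = \mathbf{1}_q^T D(\psi(\mathbf{1}_q))\alpha = \psi^T(\mathbf{1}_q)\alpha$. Thus, we can rewrite (5.102) as follows:

\[ \begin{array}{ll} \text{Maximize} & \psi^T(\mathbf{1}_q)\alpha \\ \text{subject to} & \alpha \in \mathcal{A} \end{array} \tag{5.110} \]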

5.5.3 Maximum Working Modes

Another approach that the design team might be interested in is a workload scheduler that allows as much flexibility for the blocks to turn ON simultaneously as possible, while still satisfying the user-specified requirements. Let $|\mathcal{W}(\alpha)|$ denote the cardinality of the set $\mathcal{W}(\alpha)$. Thus, we are interested in an $\alpha$ that maximizes $|\mathcal{W}(\alpha)|$ and satisfies the user-specified requirements. We can find such an $\alpha$ by solving the following optimization problem:

\[ \begin{array}{ll} \text{Maximize} & |\mathcal{W}(\alpha)| \\ \text{subject to} & \alpha \in \mathcal{A} \end{array} \tag{5.111} \]

Solving (5.111) is computationally expensive. For one thing, $|\mathcal{W}(\alpha)|$ is a non-convex function of $\alpha$. Alternatively, we propose a simpler optimization problem in (5.115), in fact a linear program (LP), motivated by the following lemma. The following lemma establishes a sufficient condition that maximizes $|\mathcal{W}(\alpha)|$; to maximize $|\mathcal{W}(\alpha)|$, it is enough to minimize all the elements of $\alpha$. In a sense, it is sufficient to minimize $\alpha$. Typically, this can be achieved by minimizing some norm of $\alpha$ (e.g. Euclidean norm, sum-norm, infinity-norm). In this work, we will be minimizing the infinity-norm of $\alpha$, i.e. $\|\alpha\|_\infty$; this will be formulated as the LP in (5.115).

Lemma 5.10. If $g_k(cu) = c\,g_k(u)$ for any real number $c > 0$, $u \in \mathbb{R}^{n_k}$, and $k \in \{1, \dots, q\}$, then for any $\alpha, \alpha' \in [0, \mathbf{1}_q]$ with $\alpha \le \alpha'$, we have $|\mathcal{W}(\alpha)| \ge |\mathcal{W}(\alpha')|$.

Proof. For any $\alpha, \alpha' \in [0, \mathbf{1}_q]$ with $\alpha \le \alpha'$, we have $D(\alpha) \le D(\alpha')$, so that $V(\mathbf{1}_q)D(\alpha) \le V(\mathbf{1}_q)D(\alpha')$, because $V(\mathbf{1}_q) \ge 0$. Therefore, due to Lemma 5.4, we have $V(\alpha) \le V(\alpha')$. With this, notice that for any $\beta \in \mathcal{W}(\alpha')$, we have $V(\alpha)\beta \le V(\alpha')\beta \le v_{th}(\beta)$, so that $\beta \in \mathcal{W}(\alpha)$. It follows that $\mathcal{W}(\alpha') \subseteq \mathcal{W}(\alpha)$, and hence, $|\mathcal{W}(\alpha)| \ge |\mathcal{W}(\alpha')|$.

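For any $\alpha \in \mathcal{A}$, let $\xi(\alpha) = \|\alpha\|_\infty$ denote the infinity-norm of $\alpha$, i.e.

\[ \xi(\alpha) = \max_{\forall i} |\alpha_i| = \max_{\forall i} \alpha_i \tag{5.112} \]

where the last step is due to $\alpha \ge 0$. Notice that $\xi(\alpha)$ is the smallest real number greater than or equal to $\alpha_i$, $\forall i$, so

\[ \xi(\alpha) = \min_{\xi \mathbf{1}_q \ge \alpha} \xi \tag{5.113} \]

We define $\xi^*$ to be the smallest $\xi(\alpha)$ achievable over all possible $\alpha \in \mathcal{A}$, i.e.:

\[ \xi^* = \min_{\alpha \in \mathcal{A}} \left( \xi(\alpha) \right) \tag{5.114} \]

Let $\alpha^{(w)}$ be a vector at which the above minimization attains its minimum. In other words, $\alpha^{(w)} \in \mathcal{A}$ such that $\xi(\alpha^{(w)}) = \xi^*$. Because $\mathcal{A}$ is non-empty, it follows that $\alpha^{(w)}$ is well-defined. We can express the combined (5.113) and (5.114) as the following LP:

\[ \begin{array}{rll} \xi^* = & \text{Minimize} & \xi \\ & \text{subject to} & \xi \mathbf{1}_q \ge \alpha \\ & & \alpha \in \mathcal{A} \end{array} \tag{5.115} \]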

| Power Grid | Metal Layers in Full Grid | Metal Layers in Global Grid | Nodes | Current Sources |
|---|---|---|---|---|
| G1 | 8 | 2 | 3,776 | 568 |
| G2 | 8 | 2 | 3,228 | 477 |
| G3 | 8 | 2 | 2,296 | 384 |
| G4 | 8 | 2 | 41,652 | 6,768 |
| G5 | 8 | 2 | 39,530 | 6,300 |
| G6 | 8 | 2 | 33,840 | 5,112 |
| G7 | 8 | 2 | 518,276 | 85,952 |
| G8 | 8 | 2 | 1,224,892 | 204,200 |

Table 5.2: Power grid properties and the runtime breakdown

| Power Grid | Isolated Block Analysis | Full Grid Analysis | LP (5.110) | LP (5.115) |
|---|---|---|---|---|
| G1 | 70 msec | 5 sec | 23 msec | 20 msec |
| G2 | 169 msec | 1 sec | 25 msec | 19 msec |
| G3 | 293 msec | 607 msec | 33 msec | 23 msec |
| G4 | 414 msec | 2 min | 188 msec | 90 msec |
| G5 | 426 msec | 1 min | 147 msec | 85 msec |
| G6 | 462 msec | 17 sec | 124 msec | 68 msec |
| G7 | 4.3 sec | 1.8 hrs | 824 msec | 612 msec |
| G8 | 10.8 sec | 8.3 hrs | 1.3 sec | 1 sec |

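Therefore, the resulting block-level containers are $\mathcal{F}_k(\alpha_k^{(w)})$, which describe the following current constraints:

\begin{align}
i_k(t) &\ge 0 \tag{5.116} \\
M'_k i_k(t) &\le \alpha_k^{(w)} M_k G_k u_k^*(1) \tag{5.117}
\end{align}

for every $k \in \{1, \dots, q\}$. Furthermore, the resulting set of safe working modes is:

\[ \mathcal{W}(\alpha^{(w)}) = \{ \beta \in \mathbb{B}^q : V(\mathbf{1}_q)D(\alpha^{(w)})\beta \le v_{th}(\beta) \} \tag{5.118} \]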
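To see how the two objectives pull $\alpha$ in opposite directions, the following sketch evaluates both on the Fig. 5.7 example, using the feasible space given by (5.72), (5.73), and (5.92)-(5.94) with $\psi(\mathbf{1}_2) = [100\ \ 70]^T$ mW. It is only an illustration under stated assumptions: a brute-force search over a 0.01 grid stands in for an LP solver, $\alpha$ is stored in integer hundredths to avoid rounding issues, and the function name `feasible` is hypothetical.

```python
def feasible(a1, a2):
    """Membership in the feasible space A; a1, a2 are alpha values in hundredths."""
    return (a1 <= 52 and a2 <= 32               # working-mode constraints (5.72), (5.73)
            and a1 >= 15 and a2 >= 10           # local power constraints (5.92), (5.93)
            and 100 * a1 + 70 * a2 >= 3000)     # global power constraint (5.94), times 100

grid = [(a1, a2) for a1 in range(101) for a2 in range(101) if feasible(a1, a2)]

# Maximum local power: maximize psi^T(1_q) alpha = 100*alpha_1 + 70*alpha_2.
best_p = max(grid, key=lambda a: 100 * a[0] + 70 * a[1])
# Maximum working modes: minimize the infinity-norm of alpha.
best_w = min(grid, key=lambda a: max(a))

alpha_p = (best_p[0] / 100, best_p[1] / 100)
alpha_w = (best_w[0] / 100, best_w[1] / 100)
total_power_p = (100 * best_p[0] + 70 * best_p[1]) / 100   # mW
total_power_w = (100 * best_w[0] + 70 * best_w[1]) / 100   # mW
```

On this grid the first objective selects $\alpha = [0.52\ \ 0.32]^T$ (74.4 mW in total) and the second selects $\alpha = [0.18\ \ 0.18]^T$ (30.6 mW in total). The exact minimizer of the infinity-norm over the continuous feasible space is $\alpha_1 = \alpha_2 = 0.3/1.7 \approx 0.176$, which the 0.01 grid rounds up to 0.18.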

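Table 5.3: The user-specified constraints parameters and comparison of the two design objectives

| Power Grid | Power Constraints: Min. Avg. Peak Power (mW) | BDD Constraints: # of working modes | BDD Constraints: Max. ON (a) | Maximum Local Power: $P(\alpha^{(p)})$ (mW) | Maximum Local Power: $\omega(\alpha^{(p)})$ | Maximum Working Modes: $P(\alpha^{(w)})$ (mW) | Maximum Working Modes: $\omega(\alpha^{(w)})$ |
|---|---|---|---|---|---|---|---|
| G1 | 66 | 8 | 2 | 83 | 37.50 | 66 | 75.00 |
| G2 | 27 | 14 | 4 | 35 | 10.35 | 27 | 49.80 |
| G3 | 10 | 21 | 7 | 12 | 1.42 | 10 | 12.50 |
| G4 | 132 | 21 | 7 | 164 | 1.42 | 132 | 12.39 |
| G5 | 77 | 30 | 8 | 104 | 0.16 | 77 | 3.78 |
| G6 | 50 | 41 | 8 | 71 | - | 50 | - |
| G7 | 288 | 69 | 8 | 426 | - | 289 | - |
| G8 | 384 | 105 | 8 | 558 | - | 384 | - |

(a) The maximum number of ON blocks in the user-specified working modes.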

5.6 Results

The approach discussed in Section 5.4 has been implemented in C++. We conducted tests on a set of power grids that were generated based on user specifications, including grid dimensions, metal layers, number of blocks, number of metal layers in the global grid, pitch and width per layer, and C4 and current source distributions. The technology specifications were consistent with 1V 45 nm CMOS technology. Table 5.1 shows the characteristics of a number of test grids. All results were obtained using a hyperthreaded 12-core 3GHz Linux machine with 128GB of RAM. The optimizations were performed using the MOSEK optimization package [49]. All the linear systems are solved using Cholmod [21] from SuiteSparse [25]. In our implementation, we use Pthreads to parallelize the computation and take advantage of the 12-core machine. The runtime breakdown of our approach, i.e. the isolated block analysis, the full grid analysis, LP (5.110), and LP (5.115), is shown in columns 2-5 of Table 5.2, which represent the wall clock time for the parallel Pthreads implementation. Recall that in the isolated block analysis, the block-level containers are generated based on a choice of the design objective $g_k(\cdot)$. In our tests, we used the peak power algorithm and the uniform current distribution in Chapter 3 as the design objective for all the blocks.

Table 5.3 compares the results of using $\alpha^{(p)}$ (Section 5.5.2) and $\alpha^{(w)}$ (Section 5.5.3) based on user-specified constraints. In column 2, we describe the user-specified constraints on the local power. Specifically, we require the average of the peak powers of all the blocks to be larger than the specification in column 2. Furthermore, in columns 3-4, we describe the user-specified constraints on the working modes, i.e. the number of user-specified working modes, as well as the maximum number of blocks that are ON in those working modes. Denote by $P(\alpha)$ the average of the peak powers of all the blocks under the block containers $\mathcal{F}_k(\alpha_k)$. Also, denote by $\omega(\alpha)$ the percentage of the working modes that are safe under block containers $\mathcal{F}_k(\alpha_k)$. To study the difference between the generated block containers and $\mathcal{W}(\cdot)$ using $\alpha^{(p)}$ and $\alpha^{(w)}$, we found the average of the peak powers of all the blocks under $\mathcal{F}_k(\alpha_k^{(p)})$ and $\mathcal{F}_k(\alpha_k^{(w)})$, which are $P(\alpha^{(p)})$ and $P(\alpha^{(w)})$, and the percentage of safe working modes in $\mathcal{W}(\alpha^{(p)})$ and $\mathcal{W}(\alpha^{(w)})$, which are $\omega(\alpha^{(p)})$ and $\omega(\alpha^{(w)})$. For instance, on a 39K node grid with 25 blocks (G5), the averages of the peak powers for all blocks under $\mathcal{F}_k(\alpha_k^{(p)})$ and $\mathcal{F}_k(\alpha_k^{(w)})$ are 104 mW and 77 mW, respectively, and the percentages of safe working modes under $\mathcal{W}(\alpha^{(p)})$ and $\mathcal{W}(\alpha^{(w)})$ are 0.16% and 3.78%, respectively. The results show that $P(\alpha^{(p)}) \gg P(\alpha^{(w)})$ and $\omega(\alpha^{(p)}) \ll \omega(\alpha^{(w)})$. Therefore, each approach provides a distinct trade-off for the design team.

5.7 Conclusion

With active devices, most traditional techniques are ill-equipped to verify the power grid. The worst-case voltage drop is the result of two things: the power budgets that were allocated to the various circuit blocks during the design process and the combination of blocks that are turned ON in a given operational mode. In this chapter, we proposed a framework to generate block-level circuit current constraints as well as an implicit binary decision diagram that helps identify the safe working modes. Subject to user guidance, we then proposed two design objectives that exploit the trade-off between how many blocks are ON simultaneously and how big the power budgets of individual blocks are.

Chapter 6

Power Grid Fixing for EM-induced Voltage Failures

Electromigration (EM) is a major reliability concern in chip power grids in the wake of smaller feature sizes. EM degradation of grid metal lines can cause large voltage drops on the grid, leading to timing failures and logic errors. During the design process, modifications to the grid design may be required in order to protect from the risk of such EM-induced voltage drop failures. We consider this problem in light of recent efficient full-chip EM assessment techniques (reviewed in Section 2.8.5). We present a systematic approach that resizes the grid metal lines to meet a design target lifetime while requiring minimal increase in metal area of the grid.

6.1 Problem Definition and Notation

If the grid's MTF is less than the target lifetime, the grid design is said to have an EM-lifetime violation and needs to be fixed. There are several ways to fix an EM-lifetime violation. If the violation is discovered at early stages of the design process, then it is possible to either lower the current densities through the grid metal lines by modifying the grid design or lower the currents drawn by the logic circuitry by changing the circuit layout. If the violation is discovered at a late stage (closer to sign-off), it is very difficult and costly to modify the circuit layout and the fix is most commonly done by performing minimal changes to the grid design. Common practice in industry is to iteratively widen metal lines that are close to a void nucleation or voltage drop violation. However, this trial-and-error approach might iterate forever as it blindly tries to fix the grid and can lead to the over-design of the grid. Many authors have studied this particular problem, including for example [22, 27, 66, 65, 70]. However, the main limitation of these works is that they are based on Black's model and the series power grid failure model. In this chapter, we propose a power grid fixing scheme based on the recent work in [19]. We focus on fixing an EM-lifetime violation that is discovered close to design sign-off by widening the metal lines of the grid. Intuitively, this would increase the conductivity of the grid, which increases the $\text{TTF}^{(i)}$ for every grid sample $G^{(i)}$ and, in turn, increases the MTF of the grid. We aim to solve the following problem: given a power grid, the effective currents drawn by the underlying logic circuitry, and a target lifetime, we will resize the interconnect trees on the various layers in order to satisfy the target lifetime, by making minimal changes to the metal area.

Figure 6.1: A simple example of a power grid with 2 interconnect trees.

In our framework, we assume that the width of each interconnect tree $k$ can be scaled by a factor $s_k$, so that the conductance of each metal branch within that tree is multiplied by $s_k$. Furthermore, because a via's resistance usually corresponds to the equivalent resistance of a via array of the same width as the metal line, as shown in Fig. 6.1, we assume that the conductance of a via connected between interconnect trees $j$ and $k$ is a linear function of the overlapping area between trees $j$ and $k$, i.e. the via's conductance is a linear function of the product $s_j s_k$. We refer to the grid before scaling any of its interconnect trees, i.e. the grid where the width of each interconnect tree is the width initially set by the designer, as the original grid. Furthermore, let $s = [s_1\ \cdots\ s_{n_t}]^T$ be an $n_t \times 1$ vector of scaling factors, where $n_t$ is the number of interconnect trees in the grid. Thus, the original grid corresponds to $s = \mathbf{1}$, where $\mathbf{1}$ is a vector of all 1 entries, whose size will be clear from the context.

In Fig. 6.1, we show an example of a simple power grid with two interconnect trees, each of which is a single branch, so that $s = [s_1\ \ s_2]^T$. Suppose that the resistance values shown in the figure correspond to the original grid, i.e. for $s = [1\ \ 1]^T$. Notice that, if $s = [2\ \ 2]^T$, then both interconnect trees would have twice their original widths, so that the resistance between nodes 3 and 4 would be $0.5\,\Omega$ and its corresponding conductance would be $2\,\Omega^{-1}$. Furthermore, the via resistance (connected between nodes 2 and 3) would be $0.25\,\Omega$ and its corresponding conductance would be $4\,\Omega^{-1}$. Thus, for this simple example, the conductance matrix [52] can be expressed in terms of $s$ as follows:

\[ G(s) = \begin{bmatrix} 4 + 2s_2 & -2s_2 & 0 & 0 \\ -2s_2 & 2s_2 + s_1 s_2 & -s_1 s_2 & 0 \\ 0 & -s_1 s_2 & s_1 s_2 + s_1 & -s_1 \\ 0 & 0 & -s_1 & s_1 \end{bmatrix} \tag{6.1} \]
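The dependence of the conductance matrix on the scaling factors can be checked numerically for this example. The short sketch below encodes the matrix entries given above; the function name `conductance_matrix` is hypothetical, and the check at $s = [2\ \ 2]^T$ reads the branch and via conductances from the off-diagonal entries.

```python
def conductance_matrix(s1, s2):
    """G(s) for the two-tree example, for scaling factors s = [s1, s2]."""
    via = s1 * s2   # via conductance scales with the product of the two factors
    return [
        [4 + 2 * s2, -2 * s2,      0.0,      0.0],
        [-2 * s2,    2 * s2 + via, -via,     0.0],
        [0.0,        -via,         via + s1, -s1],
        [0.0,        0.0,          -s1,      s1],
    ]

G = conductance_matrix(2, 2)
branch_34 = -G[2][3]   # conductance of the branch between nodes 3 and 4
via_23 = -G[1][2]      # conductance of the via between nodes 2 and 3
```

At $s = [2\ \ 2]^T$ this gives a branch conductance of 2 and a via conductance of 4, i.e. resistances of 0.5 and 0.25, matching the values quoted in the text.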

Clearly, in general, the conductance matrix is a function of $s$. Again, because it is standard to model the effective atomic diffusivity of each branch in the grid as a random variable, different grid samples will experience different sequences of void nucleations. Hence, for a grid sample $G^{(i)}$, the conductance matrix at time $t$ and scaling factors $s$ will be denoted as $G^{(i)}(t, s)$. Here, and throughout the chapter, the superscript $(i)$ will be used to identify a grid sample.

To determine $G^{(i)}(t, s)$ at a specific time $t$ and for specific scaling factors $s$, we start with the conductance matrix where the line conductivities are scaled by $s$. Then, we construct the LTI system in (2.58) for each tree. As voids nucleate, the conductance matrix as well as the LTI systems are updated, as described in Section 2.8.5, until we reach time $t$. The conductance matrix obtained at time $t$ is $G^{(i)}(t, s)$.

The voltage drop at time $t$ and under scaling factors $s$ is expressed as $v^{(i)}(t, s)$ and can be obtained by solving the following linear system:

\[ G^{(i)}(t, s)\, v^{(i)}(t, s) = u \tag{6.2} \]

The TTF sample corresponding to $G^{(i)}$ is also a function of $s$, which we will denote as $\text{TTF}^{(i)}(s)$. Thus, the MTF of the grid is a function of $s$, in fact a nonlinear function of $s$, which we will denote as $\text{MTF}(s)$. In our work, we aim to find an $s$ such that:

\[ \text{MTF}(s) \ge T^* \tag{6.3} \]

where $T^*$ is a user-specified target lifetime.

6.2 Proposed Approach

There are certain design considerations that impose constraints on $s$, such as the minimum spacing between metal lines and the maximum metal area usage. We will see in Section 6.3 that these design constraints can be represented as a set of linear constraints on $s$ which define a feasible space for $s$, denoted as $\mathcal{S}$, as follows

\[ \mathcal{S} = \{ s \in \mathbb{R}^{n_t} : s \ge \mathbf{1},\ d_{lb} \le Ds \le d_{ub} \} \tag{6.4} \]

where the constant matrix $D$ and the constant vectors $d_{lb}$ and $d_{ub}$ are derived based on the design constraints. The details are in Section 6.3, but for now it is useful to note a key requirement that will be useful in this section, namely

\[ s \ge \mathbf{1} \tag{6.5} \]

which means that we will never resize a tree to below its original width.

In order to fix an EM-lifetime violation, one should search for an $s \in \mathcal{S}$ such that $\text{MTF}(s) \ge T^*$. This is difficult because, for one thing, $\text{MTF}(s)$ is an implicit nonlinear function of $s$. One way is to iteratively increase the value of $s_k$ for every interconnect tree $k$ that has a junction failure, or at which a voltage violation occurred, while satisfying $s \in \mathcal{S}$, and determine the corresponding $\text{MTF}(s)$, until $\text{MTF}(s) \ge T^*$ is satisfied. This approach, however, performs localized (greedy) improvements to the grid design without factoring in the response of the whole grid and, as such, fixing the problem in a specific area may simply move the problem to another area of the design. Furthermore, this approach does not provide any guidance on how much a tree needs to be widened. It is left to the user to decide, which, in many cases, may result in over-design of the grid. In fact, this trial-and-error approach may iterate forever as it searches blindly for a safe point $s$ in an intractable space of possible values. In this section, we describe our approach to fix an EM-lifetime violation using Successive Linear Programming (SLP) [55], an iterative nonlinear optimization method. As we will see, our approach provides a systematic way to fix an EM-lifetime violation by factoring in the grid design, both locally and globally, and the randomness in EM degradation.

Figure 6.2: TTF distributions for original and resized grids.

6.2.1 Overview

Starting from the original grid, one would like to find an $s$ that increases $\text{MTF}(s)$. Strictly speaking, it may be enough to only increase the TTFs of some, but not all, sample grids. However, when we find a new $s$ to fix one grid sample, this will also affect the TTFs of all other samples, so they have to be checked or updated as well. For this reason, and to get full confidence that the MTF will be improved, we search for an $s$ that increases $\text{TTF}^{(i)}(s)$ for every grid sample $G^{(i)}$ which, in turn, would increase $\text{MTF}(s)$. This can be done by searching for an $s$ that reduces the voltage drop of every grid sample, by enough to achieve $v^{(i)}(t, s) \le V_{th}$, $\forall t \le T^*$, $\forall i$. As an example of what is possible, Fig. 6.2 shows the TTF distribution of an original grid and the TTF distribution of the grid after being resized using our approach.

Ideally, one would like to find the shortest distance to a safe point in the space of $s$. In other words, one would like to find an $s$ that ensures the voltage drops at the nodes of every grid sample remain within the threshold value until time $t = T^*$ while requiring minimal increase in the total metal area. Mathematically, this can be formulated as the following nonlinear optimization problem:

\[ \begin{array}{ll} \text{Minimize} & a^T s \\ \text{subject to} & v^{(i)}(t, s) \le V_{th}, \quad \forall t \le T^*,\ \forall i \\ & s \in \mathcal{S} \end{array} \tag{6.6} \]

where $a = [w_1 l_1\ \cdots\ w_{n_t} l_{n_t}]^T$ is an $n_t \times 1$ vector which consists of the metal areas of each interconnect tree, and $\mathcal{S}$ is the linearly bound domain given in (6.4) which represents the feasible space of $s$ based on the design rules. Note that, for any $s \in \mathcal{S}$, we have $s \ge \mathbf{1}$ so that the above optimization problem only widens the interconnect trees relative to their original size. The above nonlinear optimization problem is solved by means of an iterative stepping strategy, which we will implement using a linearization of the voltage drop around the latest solution point; this leads to a linear program (LP) formulation in every iteration. The following provides a high-level description of the proposed stepping strategy:

while an EM-lifetime violation exists, do

  1. Find the MTF at the latest solution point, and quit if the MTF is within specification.

  2. Linearize the voltage drop of all grid samples around that point.

  3. Determine a descent direction that reduces the voltage drop of all grid samples (solve an LP).

  4. Update the solution point based on a step in the descent direction.
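The control flow of these four steps can be sketched as a short loop. This is only a skeleton under stated assumptions: the function `resize_grid` and its callbacks `evaluate_mtf` and `descent_direction` are hypothetical placeholders for the MTF estimation and the per-iteration LP, the fixed step size is an arbitrary choice, and the toy callbacks at the bottom exist only so that the loop can be exercised.

```python
def resize_grid(evaluate_mtf, descent_direction, s0, target, step=0.1, max_iter=100):
    """Skeleton of the stepping strategy; both callbacks are hypothetical.

    evaluate_mtf(s)      -> MTF estimate of the grid at scaling factors s
    descent_direction(s) -> direction that reduces the voltage drop
                            (stands in for the linearization and the LP)
    """
    s = list(s0)
    for _ in range(max_iter):
        if evaluate_mtf(s) >= target:        # step 1: quit when within specification
            return s
        direction = descent_direction(s)     # steps 2-3: linearize and solve the LP
        s = [max(1.0, si + step * di)        # step 4: take a step, keeping s >= 1
             for si, di in zip(s, direction)]
    raise RuntimeError("no solution found within max_iter iterations")

# Toy stand-ins, chosen only to exercise the loop; they model nothing physical.
toy_mtf = lambda s: 5.0 * sum(s)
toy_dir = lambda s: [1.0] * len(s)
s_final = resize_grid(toy_mtf, toy_dir, [1.0, 1.0], target=12.0, step=0.25)
```

With the toy callbacks, the loop starts at $s = [1\ \ 1]^T$ with an MTF of 10, takes one step to $s = [1.25\ \ 1.25]^T$, and stops there because the MTF of 12.5 meets the target of 12. The skeleton does not handle the other constraints in $\mathcal{S}$ or the case where a step overshoots the target.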

This stepping strategy is based on the result of the following lemma, which provides a descent direction for the voltage drop. The result of the lemma applies to any grid sample $G^{(i)}$ and so we will drop the superscript $(i)$ to simplify the notation. Under the standard assumption that the original undamaged grid (i.e. before any void nucleation) is connected and has at least one voltage source, the conductance matrix of the original grid $G(0, \mathbf{1})$ is non-singular [52] so that $G^{-1}(0, \mathbf{1})$ exists. As voids nucleate over time, the conductance of a branch does not quite go to zero, so that the grid remains connected and its conductance matrix remains non-singular. Furthermore, widening the grid metal lines also keeps the grid connected, so $G^{-1}(t, s)$ exists for any $t \ge 0$ and $s \ge \mathbf{1}$. Here is the lemma.

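Lemma 6.1. For any $s \ge \mathbf{1}$, we have:

$$\frac{\partial v(t, s)}{\partial s_k} = -G^{-1}(t, s) \frac{\partial G(t, s)}{\partial s_k} G^{-1}(t, s) u \tag{6.7}$$

Proof. For any $s \ge \mathbf{1}$, we can write:

$$v(t, s) = G^{-1}(t, s) u \tag{6.8}$$

so that:

$$\frac{\partial v(t, s)}{\partial s_k} = \frac{\partial G^{-1}(t, s)}{\partial s_k} u \tag{6.9}$$

where we used the fact that $u$ is independent of $s$. Starting with:

$$G(t, s) G^{-1}(t, s) = I \tag{6.10}$$

where $I$ is the $n \times n$ identity matrix, we can differentiate both sides with respect to $s_k$ to get:

$$G(t, s) \frac{\partial G^{-1}(t, s)}{\partial s_k} + \frac{\partial G(t, s)}{\partial s_k} G^{-1}(t, s) = 0 \tag{6.11}$$

or equivalently,

$$\frac{\partial G^{-1}(t, s)}{\partial s_k} = -G^{-1}(t, s) \frac{\partial G(t, s)}{\partial s_k} G^{-1}(t, s) \tag{6.12}$$

Substituting (6.12) in (6.9), we get:

$$\frac{\partial v(t, s)}{\partial s_k} = -G^{-1}(t, s) \frac{\partial G(t, s)}{\partial s_k} G^{-1}(t, s) u \tag{6.13}$$

and the proof is complete.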
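As a sanity check on Lemma 6.1, the following sketch compares the closed-form derivative (6.7) with a central finite difference on a two-node toy grid. The grid, the branch conductances `G1` and `G2`, the current vector `U`, and the assumption that a branch conductance scales linearly with its width scaling factor are all illustrative choices made here, not data from the thesis.

```python
G1, G2 = 2.0, 1.0      # hypothetical unit-width branch conductances
U = [0.1, 0.2]         # hypothetical current sources at nodes 1 and 2

def conductance(s):
    """Toy G(s): supply -(s1*G1)- node 1 -(s2*G2)- node 2 (linear width scaling assumed)."""
    return [[s[0] * G1 + s[1] * G2, -s[1] * G2],
            [-s[1] * G2, s[1] * G2]]

def d_conductance(k):
    """dG/ds_k for the toy grid (constant because G is linear in s)."""
    return [[G1, 0.0], [0.0, 0.0]] if k == 0 else [[G2, -G2], [-G2, G2]]

def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def voltage_drop(s):
    return matvec(inv2(conductance(s)), U)            # v = G^{-1} u, as in (6.8)

def lemma_derivative(s, k):
    g_inv = inv2(conductance(s))
    v = matvec(g_inv, U)                               # G^{-1} u
    return [-x for x in matvec(g_inv, matvec(d_conductance(k), v))]  # (6.7)

def finite_difference(s, k, h=1e-6):
    up, down = list(s), list(s)
    up[k] += h
    down[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(voltage_drop(up), voltage_drop(down))]

s = [1.0, 1.0]
for k in range(2):
    print(k, lemma_derivative(s, k), finite_difference(s, k))
```

Both derivatives are negative or zero in this toy grid, which matches the intuition that widening a tree lowers the voltage drop.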

Fig. 6.3 shows a high-level representation of the proposed approach. Each interconnect tree in the grid can be scaled in the width dimension, and so the figure represents a high-dimensional space of the scaling factors. Notice that in the figure we show an example of the region where the MTF of the grid will be within an acceptable margin above the target lifetime and an example of the region where the MTF of the grid overshoots the target lifetime. The goal of our approach is to determine scaling factors in the former region. Initially, the width of each interconnect tree in the grid is the width set by the designer, i.e. the scaling factors are set to one. We evaluate the MTF of the grid under the current scaling factors (i.e. $s^{(0)}$ in the figure). If the MTF of the grid is within an acceptable margin above the target lifetime, we exit. Otherwise, if the MTF of the grid is below the target lifetime, we find a descent direction for the voltage drop at the target lifetime in terms of the scaling factors (shown as a dashed arrow that starts from $s^{(0)}$ in the figure), and then we take a step in that direction (shown as a solid arrow that starts from $s^{(0)}$ in the figure). After we take a step, we update the scaling factors to the new solution point (i.e. $s^{(1)}$ in the figure) and find the MTF of the grid under the new scaling factors. We repeat this process (evaluate the MTF, find a descent direction, take a step in that direction, and update the scaling factors) until either the MTF of the grid is within the acceptable margin above the target lifetime, in which case we exit, or the MTF at the updated scaling factors overshoots the target lifetime (see the overshooting point in the figure), in which case the grid is somewhat overdesigned. If the latter case happens, then we have bracketed a solution: the MTF of the grid under the old scaling factors ($s^{(4)}$ in the figure) is below the target lifetime, and the MTF of the grid under the new scaling factors (the overshooting point) is above the target lifetime. So, we use line search methods to determine a point along the line segment connecting $s^{(4)}$ and the overshooting point that results in an MTF of the grid that is within the acceptable margin above the target lifetime (i.e. $s^{(5)}$ in the figure).

Figure 6.3: High-level representation of the proposed approach shown in a high-dimensional space of scaling factors for the width of each interconnect tree in the grid.

6.2.2 Stepping Strategy

This section introduces a linearization of the voltage drop, which allows us to provide a linearization of (6.6) into an LP around the latest solution point. Given a grid sample $G^{(i)}$, the first-order Taylor expansion of $v^{(i)}(t, s)$ in the neighborhood of $s = s^{(r)}$, denoted as $\hat{v}^{(i)}(t, s)$, is:

$$\hat{v}^{(i)}(t, s) \triangleq v^{(i)}(t, s^{(r)}) + J^{(i)}(t, s^{(r)})(s - s^{(r)}) \tag{6.14}$$

where $J^{(i)}(t, s^{(r)})$ is the $n \times n_t$ Jacobian matrix of $v^{(i)}(t, s)$ evaluated at $s = s^{(r)}$, defined as follows:

$$J^{(i)}(t, s^{(r)}) \triangleq \begin{bmatrix} \dfrac{\partial v^{(i)}(t, s^{(r)})}{\partial s_1} & \cdots & \dfrac{\partial v^{(i)}(t, s^{(r)})}{\partial s_{n_t}} \end{bmatrix} \tag{6.15}$$

The columns of $J^{(i)}(t, s^{(r)})$ can be computed using Lemma 6.1. With this, at each step $r$ where an EM-lifetime violation exists, we can construct the linearized voltage drop $\hat{v}^{(i)}$ around the latest solution point $s^{(r)}$, as in (6.14), to determine a descent direction that guarantees that the linearized voltage drop of all grid samples remains within specifications until time $T^*$, while requiring minimal increase in metal area. This can be formulated as the following LP:

$$\begin{aligned} \text{Minimize} \quad & a^T s \\ \text{subject to} \quad & \hat{v}^{(i)}(t, s) \le V_{th}, \quad \forall t \le T^*, \ \forall i \\ & s \in S \end{aligned} \tag{6.16}$$

Clearly, the number of constraints in (6.16) is intractable because of the continuous $t$ domain. In the following, we will make an assumption that simplifies the constraint space of (6.16) and allows us to get rid of the $\forall t \le T^*$ requirement. This assumption is not really a limitation to our work. It is used to guide our optimization, but it does not invalidate the result if the assumption does not hold, because in our flow we always check the MTF before we exit. But the assumption provides significant speed-up when it holds, and we will show an empirical result that confirms the validity of this assumption in the majority of cases.

Assumption 1. (Monotonicity) For a grid sample $G^{(i)}$, the voltage drop $v^{(i)}(t, s)$ is a monotonically increasing function with respect to time, i.e. $v^{(i)}(t_1, s) \le v^{(i)}(t_2, s)$, $\forall t_1, t_2$ such that $0 \le t_1 \le t_2$.

In other words, the creation of voids always causes the voltage drop to increase. This makes intuitive sense, because void nucleation causes a resistance increase; strictly speaking, however, an increase in branch resistance does not necessarily lead to a voltage drop increase. In practice, the assumption holds most of the time: empirical results for a 37k-node grid show that it holds in ≈ 90% of the cases. Based on this assumption, for any $s \in S$, we have:

$$v^{(i)}(t, s) \le v^{(i)}(T^*, s) \tag{6.17}$$

for any $t \le T^*$. As a result, it is enough to search for an $s \in S$ that decreases the voltage drop at time $T^*$. With this, we can simplify (6.16) into:

$$\begin{aligned} \text{Minimize} \quad & a^T s \\ \text{subject to} \quad & \hat{v}^{(i)}(T^*, s) \le V_{th}, \quad \forall i \\ & s \in S \end{aligned} \tag{6.18}$$

Note that the LP in (6.18) still has a large number of constraints (on the order of $N_{mc} \times n$, where $N_{mc}$ is the number of grid samples) and, thus, solving (6.18) is computationally expensive. However, one does not have to solve an LP that includes all the constraints at once, because we have found that fixing one grid sample will often automatically fix many others. Instead, we start by solving the LP (6.18) using the EM constraints of a single grid sample (e.g., the grid sample with the smallest TTF). The solution of this LP, denoted as $\bar{s}$, is then used to check whether $\hat{v}^{(i)}(T^*, \bar{s}) \le V_{th}$ is satisfied for the other grid samples. If not, we add the constraints of the most violated grid sample (the one for which $(\hat{v}^{(i)}_j - V_{th,j})$ is largest); we solve the resulting (larger) LP, and repeat until the constraints of all grid samples are satisfied. It is possible for this incremental approach (as we will refer to it) to become more expensive than solving the original LP (6.18); however, our experience is that this approach is much faster in general.
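The incremental approach described above follows a constraint-generation pattern, sketched below. Everything concrete in the sketch is an assumption made for illustration: `solve_lp` and `violation` are hypothetical callables, and the toy instance replaces the real LP with a box-constrained problem (each sample `i` only demands `s[k] >= NEED[i][k]`) whose minimizer can be written in closed form, so no LP solver is needed to run it.

```python
def incremental_solve(num_samples, solve_lp, violation, first):
    """Solve with the constraints of one sample, then add the most violated
    remaining sample and re-solve until every sample is satisfied."""
    active = [first]
    while True:
        s_bar = solve_lp(active)
        worst, worst_violation = None, 0.0
        for i in range(num_samples):
            if i in active:
                continue
            v = violation(i, s_bar)           # positive means sample i is violated
            if v > worst_violation:
                worst, worst_violation = i, v
        if worst is None:                     # all samples satisfied
            return s_bar, active
        active.append(worst)


# Toy instance (invented numbers): sample i requires s[k] >= NEED[i][k].
NEED = [[1.3, 1.0], [1.2, 1.0], [1.0, 1.1], [1.1, 1.05]]

def toy_solve_lp(active):
    # With positive area weights and s >= 1, the cheapest feasible point is
    # the component-wise maximum of the active lower bounds.
    return [max([1.0] + [NEED[i][k] for i in active]) for k in range(2)]

def toy_violation(i, s):
    return max(NEED[i][k] - s[k] for k in range(2))

s_bar, active = incremental_solve(len(NEED), toy_solve_lp, toy_violation, first=0)
print(s_bar, active)
```

In this toy run only two of the four samples ever enter the problem, which is the effect the text relies on: fixing one sample often fixes many others.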

6.2.3 Step-size Selection

Let $\bar{s}$ denote the vector that solves the above LP (6.18), found using our incremental approach. Because the LP is a linearization of the original nonlinear problem (6.6) around the latest solution, taking a large step-size towards $\bar{s}$ may be overkill; the farther we go from the current solution, the less accurate the linearization becomes. One typical way [26] of taking a partial step in a specific direction is to enforce a fraction of the full step that would respect some user criterion, e.g.

$$s^{(r+1)} = s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)}) \tag{6.19}$$

where the unitless $\lambda^{(r)} \in [0, 1]$ represents the fractional step-size at iteration $r$ that is chosen based on user criteria. Note that $\lambda^{(r)} \ge 0$ because we should move in the direction found by the LP, and $\lambda^{(r)} \le 1$ because there is no reason to take a larger step than the one returned by the LP.

The following lemma establishes that if we start with an original grid that satisfies the design constraints in Section 6.3, i.e. $s^{(0)} \in S$, then the scaled grid, at every step $r$, also satisfies the design constraints, i.e. $s^{(r)} \in S$. Thus, the final grid satisfies the design constraints.

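Lemma 6.2. Given $s^{(0)} \in S$, and with reference to (6.19), with $\lambda^{(r)} \in [0, 1]$, $\forall r \ge 0$, it follows that $s^{(r)} \in S$, $\forall r \ge 0$.

Proof. The proof is by induction. Notice that $s^{(0)} \in S$, due to the statement of the lemma. In the following, we will show that, given an $s^{(r)} \in S$, we have $s^{(r+1)} \in S$, which would complete the proof. Because $s^{(r)} \in S$, and $\bar{s} \in S$ due to (6.18), then $s^{(r)} \ge \mathbf{1}$ and $\bar{s} \ge \mathbf{1}$. And due to $\lambda^{(r)} \in [0, 1]$, we have $\lambda^{(r)} \ge 0$ and $(1 - \lambda^{(r)}) \ge 0$, so that $\lambda^{(r)} \bar{s} \ge \lambda^{(r)} \mathbf{1}$ and $(1 - \lambda^{(r)}) s^{(r)} \ge (1 - \lambda^{(r)}) \mathbf{1}$. This leads to

$$s^{(r+1)} = s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)}) = \lambda^{(r)} \bar{s} + (1 - \lambda^{(r)}) s^{(r)} \tag{6.20}$$

$$\ge \lambda^{(r)} \mathbf{1} + (1 - \lambda^{(r)}) \mathbf{1} = \mathbf{1} \tag{6.21}$$

so that $s^{(r+1)} \ge \mathbf{1}$. Likewise, because $s^{(r)} \in S$ and $\bar{s} \in S$, due to (6.18), then

$$d_{lb} \le D s^{(r)} \le d_{ub} \tag{6.22}$$

$$d_{lb} \le D \bar{s} \le d_{ub} \tag{6.23}$$

Due to $\lambda^{(r)} \in [0, 1]$, we have $\lambda^{(r)} \ge 0$ and $(1 - \lambda^{(r)}) \ge 0$, so that multiplying (6.22) with $(1 - \lambda^{(r)})$ gives

$$(1 - \lambda^{(r)}) d_{lb} \le (1 - \lambda^{(r)}) D s^{(r)} \le (1 - \lambda^{(r)}) d_{ub} \tag{6.24}$$

and multiplying (6.23) with $\lambda^{(r)}$ gives

$$\lambda^{(r)} d_{lb} \le \lambda^{(r)} D \bar{s} \le \lambda^{(r)} d_{ub} \tag{6.25}$$

Adding (6.24) and (6.25) gives:

$$d_{lb} \le D\left((1 - \lambda^{(r)}) s^{(r)} + \lambda^{(r)} \bar{s}\right) \le d_{ub} \tag{6.26}$$

or equivalently,

$$d_{lb} \le D s^{(r+1)} \le d_{ub} \tag{6.27}$$

so that $s^{(r+1)} \in S$, and the proof is complete.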

In our work, we choose $\lambda^{(r)}$ such that the incremental increase in the total metal area is within a certain user-specified value $\delta > 0$, i.e.

$$\frac{a^T s^{(r+1)} - a^T s^{(r)}}{a^T s^{(r)}} \le \delta \tag{6.28}$$

Note that the left-hand side of the above inequality can be negative, in which case step $r+1$ tries to reduce an unnecessary additional metal area that was introduced at step $r$. In the following, Lemma 6.3 provides a necessary and sufficient condition on $\lambda^{(r)}$ so that (6.28) is satisfied. In fact, Lemma 6.3, combined with the fact that $\lambda^{(r)} \in [0, 1]$, provides a range of feasible values of $\lambda^{(r)}$ as follows:

$$0 \le \lambda^{(r)} \le \min(\gamma^{(r)}, 1) \tag{6.29}$$

where the scalar $\gamma^{(r)}$ is defined below:

$$\gamma^{(r)} \triangleq \begin{cases} \dfrac{\delta a^T s^{(r)}}{a^T \bar{s} - a^T s^{(r)}} & \text{if } a^T \bar{s} > a^T s^{(r)}, \\ 1 & \text{otherwise.} \end{cases} \tag{6.30}$$

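Lemma 6.3. For any $\lambda^{(r)} \in [0, 1]$, we have $\gamma^{(r)} > 0$ and, with reference to (6.19) and (6.28), we have:

$$\frac{a^T s^{(r+1)} - a^T s^{(r)}}{a^T s^{(r)}} \le \delta \iff \lambda^{(r)} \le \gamma^{(r)} \tag{6.31}$$

Proof. First, we will show that $\gamma^{(r)} > 0$. Notice that if $a^T \bar{s} \le a^T s^{(r)}$, then $\gamma^{(r)} = 1$, due to (6.30), so that $\gamma^{(r)} > 0$. Otherwise, if $a^T \bar{s} > a^T s^{(r)}$, then $a^T \bar{s} - a^T s^{(r)} > 0$ which, combined with $\delta > 0$ and $a^T s^{(r)} > 0$ (because $a > 0$ and $s^{(r)} > 0$), gives $\gamma^{(r)} > 0$. Next, we will prove (6.31). Notice that, because $a^T s^{(r)} > 0$, then

$$\frac{a^T s^{(r+1)} - a^T s^{(r)}}{a^T s^{(r)}} \le \delta \tag{6.32}$$

$$\iff a^T s^{(r+1)} - a^T s^{(r)} \le \delta a^T s^{(r)} \tag{6.33}$$

$$\iff a^T\left(s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)})\right) - a^T s^{(r)} \le \delta a^T s^{(r)} \tag{6.34}$$

$$\iff \lambda^{(r)}\left(a^T \bar{s} - a^T s^{(r)}\right) \le \delta a^T s^{(r)} \tag{6.35}$$

We will now show that (6.35) $\iff \lambda^{(r)} \le \gamma^{(r)}$ by separately considering the two cases: $a^T \bar{s} > a^T s^{(r)}$ and $a^T \bar{s} \le a^T s^{(r)}$. Considering first the case $a^T \bar{s} > a^T s^{(r)}$, we have by definition $\gamma^{(r)} = \delta a^T s^{(r)} / (a^T \bar{s} - a^T s^{(r)})$ and, with $a^T \bar{s} - a^T s^{(r)} > 0$, then (6.35) $\iff \lambda^{(r)} \le \delta a^T s^{(r)} / (a^T \bar{s} - a^T s^{(r)}) = \gamma^{(r)}$. Considering now the case $a^T \bar{s} \le a^T s^{(r)}$, recall that, if $p$ and $q$ are two statements, then $p \iff q$ is true if and only if the logical statement $(pq + \bar{p}\bar{q})$ is always true (where $+$ denotes the boolean OR operator), so that $p$ and $q$ are always either both true or both false. In this case, with $a^T \bar{s} \le a^T s^{(r)}$, then $\gamma^{(r)} = 1$ so that $\lambda^{(r)} \le \gamma^{(r)}$ and, with $a^T \bar{s} - a^T s^{(r)} \le 0$, then $\lambda^{(r)}(a^T \bar{s} - a^T s^{(r)}) \le 0 \le \delta a^T s^{(r)}$. With both statements always true, it follows that (6.35) $\iff \lambda^{(r)} \le \gamma^{(r)}$ in this case as well, and the proof is complete.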
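The bound in (6.29) and the definition in (6.30) translate directly into code. The sketch below is a minimal illustration; the function names and the two-tree example (area weights, scaling factors, and a 0.2% budget) are assumptions chosen here to exercise both branches of (6.30).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(a, s_r, s_bar, delta):
    """The scalar of (6.30): largest fraction of the full step within the area budget."""
    area_now = dot(a, s_r)
    area_full = dot(a, s_bar)
    if area_full > area_now:
        return delta * area_now / (area_full - area_now)
    return 1.0

def max_step(a, s_r, s_bar, delta):
    """Upper end of the feasible range (6.29)."""
    return min(gamma(a, s_r, s_bar, delta), 1.0)

def take_step(s_r, s_bar, lam):
    """The update (6.19)."""
    return [x + lam * (y - x) for x, y in zip(s_r, s_bar)]

# Hypothetical example: two trees with areas 100 and 200, 0.2% budget per step.
a = [100.0, 200.0]
s_r = [1.0, 1.0]
s_bar = [1.03, 1.0]                      # full step would add 1% metal area
lam = max_step(a, s_r, s_bar, delta=0.002)
s_next = take_step(s_r, s_bar, lam)
increase = (dot(a, s_next) - dot(a, s_r)) / dot(a, s_r)
print(lam, s_next, increase)
```

Here the full step would exceed the budget, so the step is cut to one fifth of the way towards the LP solution and the relative area increase lands on the budget.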

Due to Lemma 6.3, it will always be possible to choose a $\lambda^{(r)}$ in the feasible range (6.29). So, in every step, we start with $\lambda^{(r)}$ set to the largest value in this range, i.e. $\lambda^{(r)} = \min(\gamma^{(r)}, 1)$, because it already satisfies all the requirements and there is no reason to take a smaller step. However, it is possible for such a step from the latest solution point $s^{(r)}$ to overshoot the target lifetime beyond a certain acceptable margin $\Delta$, i.e. $\text{MTF}(s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)})) > T^* + \Delta$. In this case, we have bracketed a solution, because $\text{MTF}(s^{(r)}) < T^*$ and $\text{MTF}(s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)})) > T^*$, and so we perform a line search to find a $\lambda^{(r)}$ with $0 \le \lambda^{(r)} \le \min(\gamma^{(r)}, 1)$ (referred to as a bracketed region) such that $\text{MTF}(s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)})) = T^*$ (or within the $\Delta$-margin above $T^*$). The process of finding such a $\lambda^{(r)}$ can be posed as a root-finding problem, because we are searching for a root of the nonlinear function

$$f(\lambda^{(r)}) \triangleq \text{MTF}(s^{(r)} + \lambda^{(r)}(\bar{s} - s^{(r)})) - T^*.$$

There are many common methods in the literature to find a root in a bracketed region, such as the bisection method, the false position method, the secant method, etc. The reader is referred to [63] for details on such methods. Some of these methods preserve the bracketing of a root while tracking it down, such as the bisection and the false position methods, whereas other methods do not preserve this property, such as the secant method. In our work, we use a method that preserves this important property, namely the false position method, as it guarantees convergence to a root.
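A minimal sketch of the false position search on a bracketed interval is shown below. The stopping rule (accept the first iterate whose MTF lies between the target and the target plus the margin) is this sketch's reading of the text, and the function `toy_f`, which plays the role of f(λ) = MTF − T*, is an invented concave curve rather than a real MTF evaluation.

```python
import math

def false_position(f, lo, hi, margin, max_iters=100):
    """Search [lo, hi] for lam with 0 <= f(lam) <= margin, given f(lo) < 0 < f(hi).

    The bracket [lo, hi] always contains a root: whichever endpoint has the
    same sign as the new iterate is replaced by it.
    """
    f_lo, f_hi = f(lo), f(hi)
    if not (f_lo < 0.0 < f_hi):
        raise ValueError("root is not bracketed")
    for _ in range(max_iters):
        # Intersection of the chord through (lo, f_lo) and (hi, f_hi) with zero.
        lam = hi - f_hi * (hi - lo) / (f_hi - f_lo)
        f_lam = f(lam)
        if 0.0 <= f_lam <= margin:
            return lam
        if f_lam < 0.0:
            lo, f_lo = lam, f_lam
        else:
            hi, f_hi = lam, f_lam
    raise RuntimeError("no acceptable step found within max_iters")


# Toy stand-in (invented): MTF along the step is 10 + 6*sqrt(lam) years, target 12 years.
toy_f = lambda lam: 10.0 + 6.0 * math.sqrt(lam) - 12.0

lam = false_position(toy_f, 0.0, 1.0, margin=1.0)   # margin plays the role of Delta
print(lam, toy_f(lam))
```

For a function whose iterates approach the root from below, the acceptance test above would not trigger until the iterates hit the root numerically, so a practical implementation would need an additional stopping tolerance; the text does not specify that detail.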

6.3 Design Rules

Several design rules must be considered when resizing the metal lines of the grid. These rules are constraints on how the grid may be modified, e.g., which interconnect trees can be scaled, and by how much. In this section, we discuss some of these design rules and show how they boil down to linear constraints on $s$ which can be compactly represented as

$$d_{lb} \le D s \le d_{ub} \tag{6.36}$$

Every interconnect tree in the grid has a minimum width limit that it should satisfy, usually derived based on the technology node and other design considerations. We assume that the width of each tree in the original grid is already set to its minimum, so that trees can only be scaled up relative to their original width, which defines a linear constraint on $s$, namely $s \ge \mathbf{1}$. As mentioned earlier, these linear constraints will define a feasible space for $s$, denoted as $S$, i.e.

$$S \triangleq \{ s \in \mathbb{R}^{n_t} : s \ge \mathbf{1}, \ d_{lb} \le D s \le d_{ub} \} \tag{6.37}$$

All the constraints discussed below, along with any additional similar constraint specified by the user, can be handled by our approach as long as they can be represented as linear inequalities.

6.3.1 Maximum Metal Area Usage

The design team might have specifications on the maximum allowable increase (relative to the original grid) in the total metal area. This specification can be expressed as follows

$$\frac{a^T s}{a^T \mathbf{1}} \le \beta \tag{6.38}$$

where $\beta > 0$ is the user-specified maximum allowable increase in the total metal area, or equivalently,

$$a^T s \le \beta a^T \mathbf{1} \tag{6.39}$$

because $a^T \mathbf{1} > 0$. Furthermore, the design team might have specifications on the maximum allowable increase in the metal area of a particular layer $p$ in the grid. These specs can also be expressed as linear inequalities as follows

$$\sum_{k \in K_p} s_k w_k l_k \le \beta_p \sum_{k \in K_p} w_k l_k \tag{6.40}$$

where $\beta_p > 0$ is the user-specified maximum allowable increase in the metal area of layer $p$ and $K_p$ is the set of indices of all the interconnect trees of metal layer $p$.

Figure 6.4: Minimum spacing between supply and ground interconnect trees.

6.3.2 Minimum Spacing

Typically, each metal layer consists of a set of alternating supply and ground interconnect trees as shown in Fig. 6.4. Of course, supply and ground interconnect trees within the same layer must not overlap. In fact, there is a minimum allowable spacing between supply and ground interconnect trees. Suppose that a supply interconnect tree $j$ is adjacent to a ground interconnect tree within metal layer $p$ with original spacing of $\kappa$ between the two trees, as shown in Fig. 6.4. Notice that after the interconnect tree $j$ of original width $w_j$ is scaled by $s_j$, it will have a width of $s_j w_j$, so that the tree width will increase by $(s_j w_j - w_j)/2$ at each of its two sides. Thus, under scaling factors $s$, the space between the two trees is:

$$\kappa - \frac{s_j w_j - w_j}{2} \tag{6.41}$$

Let $\kappa_p$ be the minimum allowable spacing between a supply and ground interconnect tree in metal layer $p$, then $s$ should satisfy:

$$\kappa - \frac{s_j w_j - w_j}{2} \ge \kappa_p \tag{6.42}$$

or equivalently,

$$s_j w_j \le 2(\kappa - \kappa_p) + w_j \tag{6.43}$$
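As a small worked example of (6.41) to (6.43), the sketch below computes the largest scaling factor a tree can take before violating the minimum spacing, and checks that the remaining space at that factor equals the minimum. The function names and the dimensions (in arbitrary length units) are assumptions chosen for illustration.

```python
def spacing_after_scaling(width, spacing, s):
    """Remaining space (6.41) once a tree of the given width is scaled by s."""
    return spacing - (s * width - width) / 2.0

def max_scale_for_spacing(width, spacing, min_spacing):
    """Largest s_j allowed by (6.43): s_j * w_j <= 2 * (kappa - kappa_p) + w_j."""
    if spacing < min_spacing:
        raise ValueError("original grid already violates the minimum spacing")
    return (2.0 * (spacing - min_spacing) + width) / width

# Hypothetical dimensions: w_j = 1.0, kappa = 0.5, kappa_p = 0.2 (same length unit).
s_max = max_scale_for_spacing(width=1.0, spacing=0.5, min_spacing=0.2)
print(s_max, spacing_after_scaling(1.0, 0.5, s_max))
```

Dividing (6.43) by the width gives a simple upper bound on one component of s, which is why the rule fits the form of (6.36) with a single non-zero entry in the corresponding row of D.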

6.4 Implementation

In this section, we discuss the implementation of the LP in (6.18) that includes the EM constraints of a single grid sample $G^{(i)}$, as the implementation of the LP using the constraints of multiple grid samples is similar. Thus, we will drop the superscript $(i)$ to simplify the notation. In other words, we will discuss the implementation of the following LP:

$$\begin{aligned} \text{Minimize} \quad & a^T s \\ \text{subject to} \quad & \hat{v}(T^*, s) \le V_{th} \\ & s \in S \end{aligned} \tag{6.44}$$

One way to construct the feasible region of (6.44) is to compute $\hat{v}(T^*, s)$, as in (6.14), which requires the computation of $J(T^*, s^{(r)})$ which, in turn, requires the computation of $\frac{\partial v}{\partial s_k}(T^*, s^{(r)})$, $\forall k \in \{1, \ldots, n_t\}$. By Lemma 6.1, with $v = G^{-1} u$, this is

$$\frac{\partial v}{\partial s_k}(T^*, s^{(r)}) = -G^{-1}(T^*, s^{(r)}) \frac{\partial G(T^*, s^{(r)})}{\partial s_k} v(T^*, s^{(r)}), \quad \forall k \tag{6.45}$$

which requires us to solve the following linear system:

$$G(T^*, s^{(r)}) \frac{\partial v}{\partial s_k}(T^*, s^{(r)}) = -\frac{\partial G(T^*, s^{(r)})}{\partial s_k} v(T^*, s^{(r)}) \tag{6.46}$$

for every $k \in \{1, \ldots, n_t\}$. In fact, to construct the full LP in (6.18), one needs to solve the above system for every grid sample $G^{(i)}$ and for every $k \in \{1, \ldots, n_t\}$, which is computationally expensive. In the following, we will show that, using a simple change of variable, we can construct the LP in (6.18) without performing any system solve.

Figure 6.5: Flow chart of the proposed algorithm.

Define the following $n \times n_t$ matrix:

$$J'(T^*, s^{(r)}) = \begin{bmatrix} -\dfrac{\partial G(T^*, s^{(r)})}{\partial s_1} v(T^*, s^{(r)}) & \cdots & -\dfrac{\partial G(T^*, s^{(r)})}{\partial s_{n_t}} v(T^*, s^{(r)}) \end{bmatrix} \tag{6.47}$$

Notice that we have

$$J(T^*, s^{(r)}) = G^{-1}(T^*, s^{(r)}) J'(T^*, s^{(r)}) \tag{6.48}$$

Consider the following change of variable:

$$G(T^*, s^{(r)}) y = J'(T^*, s^{(r)})(s - s^{(r)}) \tag{6.49}$$

where $y$ is an $n \times 1$ vector. With this, notice that

$$v(T^*, s^{(r)}) + y = v(T^*, s^{(r)}) + G^{-1}(T^*, s^{(r)}) J'(T^*, s^{(r)})(s - s^{(r)}) \tag{6.50}$$

$$= v(T^*, s^{(r)}) + J(T^*, s^{(r)})(s - s^{(r)}) = \hat{v}(T^*, s) \tag{6.51}$$

where the second step is due to (6.48). Thus, instead of solving the LP in (6.44), we will solve the following equivalent LP:

$$\begin{aligned} \text{Minimize} \quad & a^T s \\ \text{subject to} \quad & y \le V_{th} - v(T^*, s^{(r)}) \\ & G(T^*, s^{(r)}) y = J'(T^*, s^{(r)})(s - s^{(r)}) \\ & s \in S \end{aligned} \tag{6.52}$$

Figure 6.6: A parallel implementation of the proposed algorithm.

Although the LP (6.52) has a larger number of variables compared to the LP in (6.44), a huge runtime advantage is attained by solving (6.52) for two reasons: 1) the LP in (6.52) does not require any linear system solve, whereas the LP in (6.44) requires $n_t$ system solves (which consist of an LU-factorization of $G(T^*, s^{(r)})$ and $n_t$ forward/backward substitutions) for every grid sample; 2) $J(T^*, s^{(r)})$ is usually a dense matrix, because each of its columns $\frac{\partial v(T^*, s^{(r)})}{\partial s_k}$ is the product of the full matrix $G^{-1}(T^*, s^{(r)})$ with a non-zero vector, which makes the constraints of the LP (6.44) dense, whereas the constraints of (6.52) are sparse.
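The equivalence in (6.50) and (6.51) can be checked numerically on a small example. The sketch below reuses a two-node toy grid (an assumption of this example, with invented conductances and currents and with conductance taken to scale linearly with the width factor) and evaluates the linearized voltage drop in two ways: by building J column by column, and by the change of variable (6.49). Inside the LP, y is a decision variable and no solve is needed; the single solve for y here only serves to evaluate the constraint at a given s.

```python
G1, G2 = 2.0, 1.0      # hypothetical unit-width branch conductances
U = [0.1, 0.2]         # hypothetical current sources at the two nodes

def conductance(s):
    return [[s[0] * G1 + s[1] * G2, -s[1] * G2],
            [-s[1] * G2, s[1] * G2]]

def solve2(m, b):
    """Solve the 2x2 system m x = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - m[1][0] * b[0]) / det]

def j_prime_columns(v):
    """Columns of J' in (6.47): -(dG/ds_k) v for k = 1, 2 of the toy grid."""
    return [[-G1 * v[0], 0.0],
            [-G2 * (v[0] - v[1]), -G2 * (v[1] - v[0])]]

s_r = [1.0, 1.0]                      # latest solution point
s_new = [1.1, 1.0]                    # candidate point
ds = [a - b for a, b in zip(s_new, s_r)]
G = conductance(s_r)
v = solve2(G, U)                      # voltage drop at s_r
cols = j_prime_columns(v)

# Route 1: one solve per column to build J (6.48), then v + J ds as in (6.14).
J_cols = [solve2(G, c) for c in cols]
v_hat_jacobian = [v[i] + sum(J_cols[k][i] * ds[k] for k in range(2)) for i in range(2)]

# Route 2: form the right-hand side J' ds, then a single solve for y (6.49).
rhs = [sum(cols[k][i] * ds[k] for k in range(2)) for i in range(2)]
y = solve2(G, rhs)
v_hat_change_of_variable = [v[i] + y[i] for i in range(2)]

print(v_hat_jacobian, v_hat_change_of_variable)
```

Both routes give the same linearized voltage drop; the second never forms the dense matrix J.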

6.5 Parallelization

The overall flow of the proposed approach is shown in Fig. 6.5. Observing the proposed algorithm, the most expensive operations are:

1. Evaluating the MTF (MC analysis)
2. Solving an LP
3. Checking satisfiability (solving linear systems)

Recall that evaluating the MTF requires one to compute the TTF of a large number of grid samples and check for MC convergence after finding each TTF sample. Notice that finding the TTF for different grid samples can be done independently. Also, recall that solving the LP (6.18) using our incremental approach consists of solving an LP using the EM constraints of a subset of the grid samples and checking whether the constraint $\hat{v}^{(i)}(T^*, \bar{s}) \le V_{th}$ is satisfied for the rest of the grid samples. To check the satisfiability of this constraint for each grid sample, one needs to perform a linear system solve (as discussed in the previous section, to compute $y$ in (6.49)). However, the linear system solves corresponding to different grid samples can also be done independently. Thus, the above operations (except for solving an LP) can be parallelized. Furthermore, in our implementation, all LPs are solved using the interior-point optimizers of the MOSEK optimization package [49]. MOSEK has an intrinsic parallel implementation which automatically takes advantage of multi-core machines. Fig. 6.6 shows a high-level description of how the proposed algorithm can be parallelized.

Table 6.1

Power Grid   Layers   Nodes       Branches    Junctions   Trees    Current Sources   Area (in mm^2)
ibmpg1       2        6,085       10,853      11,562      709      5,387             21.1
ibmpg2       4        61,677      61,143      61,605      462      18,419            66.31
ibmpg4       6        474,524     465,416     475,069     9,653    132,916           187.4
ibmpg6       3        403,915     797,579     807,825     10,246   380,742           416.53
ibmpgnew1    6        315,951     698,101     717,629     19,528   178,965           940.38
PG1          8        36,862      36,189      36,862      673      2,448             8.80×10^-4
PG2          8        560,468     557,816     560,468     2,652    40,254            1.30×10^-2
PG3          8        1,232,260   1,226,703   1,232,260   5,557    89,508            2.85×10^-2
PG4          8        2,629,448   2,617,216   2,629,431   12,215   566,736           8.98×10^-2
PG5          8        4,094,704   4,082,039   4,094,704   12,665   886,124           0.13
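The independence of the per-sample TTF computations noted in this section can be illustrated with a thread pool. This is only a sketch of the pattern: the thesis implementation uses Pthreads in C++, whereas the code below swaps in Python's `concurrent.futures`; `toy_ttf` is an invented stand-in for the TTF computation of one grid sample; the MTF is taken here as the plain average of the TTF samples; and the Monte Carlo convergence check is omitted.

```python
from concurrent.futures import ThreadPoolExecutor

def toy_ttf(sample_id):
    """Hypothetical stand-in for one grid sample's TTF computation (years)."""
    return 8.0 + 0.5 * ((sample_id * 37) % 11)

def estimate_mtf(sample_ids, ttf, workers=4):
    """Evaluate the TTF of each sample concurrently and average the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ttfs = list(pool.map(ttf, sample_ids))   # results come back in input order
    return sum(ttfs) / len(ttfs), ttfs

mtf_parallel, ttfs_parallel = estimate_mtf(range(32), toy_ttf)
ttfs_serial = [toy_ttf(i) for i in range(32)]
print(mtf_parallel)
```

In CPython, threads do not speed up CPU-bound pure-Python work because of the global interpreter lock, so a real workload would use processes or native code; the sketch only shows that the samples can be dispatched independently and collected in order.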

6.6 Results

The approach discussed in Section 6.2 has been implemented in C++. We verified our approach using two types of test grids: IBM power grids [54] and our own (internal) grids. The internal grids were generated based on user specifications, including grid dimensions, metal layers, number of blocks, number of metal layers in the global grid, pitch and width per layer, and C4 and current source distributions. The technology specifications were consistent with 1 V 45 nm CMOS technology. The details for the grids are given in Table 6.1. The grids named PG1-PG5 are internal grids. All results were obtained using a hyperthreaded 12-core 3 GHz Linux machine with 128 GB of RAM. The LPs were solved using the MOSEK optimization package [49]. All the linear systems are solved using Cholmod [21], except for the voltage drop updates required to compute the MTF, which were done using the Preconditioned Conjugate Gradient (PCG) method described in [18]. In our implementation, we use Pthreads to parallelize the code as discussed in the previous section and to take advantage of the 12-core machine. All power grids are assumed to have a target MTF of 12 years and an acceptable overshoot margin of 1 year (i.e. $\Delta = 1$ year), and the incremental increase in the total metal area between two consecutive steps is required to be within 0.2% (i.e. the $\delta$ parameter in (6.28) is 0.2%).

Table 6.2: Summary of results using the proposed approach

Power Grid       Original MTF   Final MTF   Metal area increase   Num of scaled trees   Num of steps   Num of MTF eval.   Total time
ibmpg1           7.1 yrs        12.6 yrs    0.688%                3                     4              6                  11.6 min
ibmpg2           9.6 yrs        12.3 yrs    0.600%                15                    3              4                  13.0 min
ibmpg4           9.8 yrs        12.9 yrs    0.061%                8                     1              4                  44.3 min
ibmpg6           10.2 yrs       12.8 yrs    0.010%                2                     1              4                  3.9 hrs
ibmpgnew1        9.8 yrs        12.7 yrs    0.006%                2                     1              3                  34.6 min
PG1              9.3 yrs        12.3 yrs    0.134%                27                    1              3                  1.0 min
PG2              10.0 yrs       12.9 yrs    0.010%                10                    1              8                  25.1 min
PG3              10.5 yrs       12.2 yrs    0.021%                14                    1              4                  1.2 hrs
PG4              6.4 yrs        12.2 yrs    0.007%                6                     1              8                  2.9 hrs
PG5              8.8 yrs        13.0 yrs    0.018%                4                     1              4                  1.9 hrs
Geometric mean   -              -           0.04%                 -                     -              -                  33 min

Table 6.3: Summary of results using the greedy approach

Power Grid       Metal area increase   Num of scaled trees   Num of steps   Num of MTF eval.   Total time
ibmpg1           0.88%                 29                    5              9                  18.1 min
ibmpg2           2.6%                  145                   13             14                 32.9 min
ibmpg4           0.2%                  70                    1              2                  17.1 min
ibmpg6           0.04%                 18                    1              4                  3.2 hrs
ibmpgnew1        0.03%                 58                    1              5                  1.1 hrs
PG1              0.6%                  24                    3              4                  1.2 min
PG2              0.06%                 4                     1              12                 35.0 min
PG3              0.03%                 5                     1              5                  1.2 hrs
PG4              0.02%                 12                    1              9                  2.4 hrs
PG5              0.04%                 29                    1              5                  2.3 hrs
Geometric mean   0.12%                 -                     -              -                  38 min

Table 6.2 summarizes the results of the proposed approach. In columns 2-3, we show the MTF of the original grid and the MTF of the grid after widening the interconnect trees, respectively. Furthermore, columns 4-5 show the percentage increase in the total metal area of the grid as well as the number of trees that were scaled to fix the EM-lifetime violation. For example, on a 1.2 million node grid (PG3), we were able to increase the MTF of the grid from 10.5 years to 12.2 years using 0.02% metal area increase and by scaling 14 interconnect trees. It is important to note that 7 out of the 14 scaled trees did not have either a void nucleation or a voltage drop violation. To elaborate further on this point, Fig. 6.7 shows the breakdown of the percentage increase in metal area over 4 categories of trees: trees with neither a voltage violation nor a void nucleation, trees with only voltage violations, trees with only void nucleations, and trees with both a voltage violation and a void nucleation. We can notice that in 4 cases (PG1-PG3 and PG5) a significant portion of the metal area increase was allocated to interconnect trees that had neither a voltage drop violation nor a void nucleation. Also, although most of the metal area increase for the other cases was allocated to trees that had either a void nucleation or a voltage drop violation, these trees constituted a very small fraction of the total number of trees with a voltage violation or a void nucleation, meaning it is not straightforward to determine which of the critical trees to widen and by how much. This shows that fixing an EM-lifetime violation might not be intuitively easy or obvious, and demonstrates the value of our optimization based approach that factors in the behavior of the whole grid.

Figure 6.7: Metal increase breakdown.

Figure 6.8: Runtime breakdown. [Bar chart: percentage of total time (0 to 100) per grid, split into MTF Evaluations and Solving LPs.]

Figure 6.9: A contour plot of the worst-case (over all grid samples) voltage drop at time $T^*$ on a 37K-node grid with 8 metal layers. (a) The original grid (MTF ≈ 9 years). (b) The fixed grid (MTF ≈ 12 years). (c) Colorbar for the contour plots in (a) and (b). The color bar represents the voltage drop in mV. The red dots show the nodes that have voltage drop above $V_{th}$. The yellow stripes show the location of the interconnect trees that have been resized by the proposed approach. Both x and y axes are in µm.

Figure 6.10: A contour plot of the worst-case voltage drop at time $T^*$ on the fixed grid using the greedy approach (MTF = 12.8 years).

Figure 6.11: Top view of a 37K-node power grid with 8 metal layers. The red dots show the die region where the failed nodes are located. The yellow stripes show the die region where an interconnect tree was scaled using the proposed approach.

Figure 6.12: Top view of a 37K-node power grid with 8 metal layers. The red dots show the die region where the failed nodes are located. The yellow stripes show the die region where an interconnect tree was scaled using the greedy approach.

The total runtime of our approach, i.e. the total wall clock time of the parallel Pthreads implementation, including both the MTF evaluations and the incremental LP solving, is shown in column 8. Furthermore, column 6 shows the number of steps taken to fix an EM-lifetime violation. Also, column 7 shows the number of MTF evaluations performed to fix an EM-lifetime violation. For example, on a 1.2 million node grid (PG3), our approach fixed the violation using a single step and 4 MTF evaluations, which took 1.2 hours. Fig. 6.8 shows the runtime breakdown of our approach. Notice that around 70% of the total runtime to fix a grid is spent on finding the MTF of the grid.

To further illustrate the value of our optimization based approach, we also implemented a greedy approach that iteratively widens the interconnect trees at which a voltage violation has occurred. The overall flow of this greedy approach is similar to the proposed approach: at the current solution point, we determine the MTF of the grid; if the MTF is less than the target lifetime, we widen the trees at which a voltage violation has occurred, and repeat until the MTF is within the acceptable margin (1 year) above the target lifetime. The incremental increase in the total metal area of the grid between two consecutive steps is also required to be within 0.2%. At each step of the greedy approach, all trees at which a voltage violation has occurred are widened uniformly while respecting the 0.2% metal area increase limit. Table 6.3 summarizes the results of the greedy approach. In columns 2-3, we show the percentage increase in the total metal area of the grid as well as the number of trees that were scaled to fix the EM-lifetime violation. Notice that, on most of the grids, the total metal area increase and the number of scaled trees using the greedy approach are significantly larger as compared to the proposed approach. In fact, in terms of the geometric mean, the metal area increase using the greedy approach is around 3x the metal area increase using the proposed approach. So, the greedy approach can result in overdesigned grids.
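The geometric means reported in Tables 6.2 and 6.3, and the roughly 3x ratio between them, can be reproduced from the per-grid metal area increases. The sketch below uses the percentages exactly as they appear in the two tables; only the helper name `geometric_mean` is introduced here.

```python
import math

def geometric_mean(values):
    """Geometric mean via the average of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Metal area increase (%) per grid, in table order (ibmpg1 ... PG5).
proposed = [0.688, 0.600, 0.061, 0.010, 0.006, 0.134, 0.010, 0.021, 0.007, 0.018]
greedy = [0.88, 2.6, 0.2, 0.04, 0.03, 0.6, 0.06, 0.03, 0.02, 0.04]

gm_proposed = geometric_mean(proposed)
gm_greedy = geometric_mean(greedy)
print(round(gm_proposed, 2), round(gm_greedy, 2), round(gm_greedy / gm_proposed, 1))
```

The rounded values agree with the 0.04% and 0.12% entries in the tables.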

Fig. 6.9 shows a contour plot of the worst-case (taken over all grid samples) voltage drop at time $T^*$ on a 37K-node grid (PG1). This power grid consists of 8 metal layers (M1-M8) and 4 subgrids (spread over the bottom 4 metal layers M1-M4). Fig. 6.9a shows the contour plot corresponding to the original grid, Fig. 6.9b shows the contour plot corresponding to the final grid using the proposed approach, and Fig. 6.9c shows the colorbar of the two contour plots. The nodes at which a voltage violation occurs are shown as red dots. The original grid has an MTF of 9.3 years while the final grid has an MTF of 12.4 years. In Fig. 6.9b, we show the location of the interconnect trees that have been resized by the proposed approach. We can notice that most of the resized trees are located on the upper metal layers (M5-M7) whereas the voltage drop violations occurred on M1. On the other hand, Fig. 6.10 shows the contour plot corresponding to the final grid using the greedy approach (the colorbar is in Fig. 6.9c). Notice that the trees resized by the greedy approach are located on M1, where the voltage violations have occurred. In Fig. 6.11, we show a top view of the grid with all of its interconnect trees (shown as black lines). Again, the yellow stripes and the red dots show the locations of the resized trees using the proposed approach and the nodes with voltage drop violation, respectively. A key observation is that the resized trees are located on the south-east corner of the die, where the nodes with voltage drop violation are located. In Fig. 6.12, we show a top view of the grid with the yellow stripes representing the locations of the resized trees using the greedy approach. Clearly, the greedy approach performs localized modifications to the grid and does not factor in the behavior of the whole grid.

6.7 Conclusion

In this chapter, we proposed a power grid fixing scheme that fixes an EM-lifetime violation by widening the grid metal lines. Subject to design rules and other design considerations, the proposed approach iteratively improves the EM-lifetime of the grid, by factoring in both the grid design and the randomness in EM degradation, to meet a target EM-lifetime. We showed, based on IBM benchmarks and our own grids, that our approach can fix an EM-lifetime violation (2-6 years below the target lifetime) with less than 1% increase in total grid metal area. Furthermore, we showed that our approach is computationally efficient (roughly 2 hours to fix a 4.1 million node grid), which makes our approach scalable to real grid designs.

Chapter 7

Conclusions and Future Work

The power grid of an integrated circuit is the electric network that connects the external supply pins of a chip to the on-die transistors. A well-designed power grid should guarantee the proper logic functionality at the intended design speed over the chip's lifetime. Ideally, the power grid must act as a voltage source that supplies appropriate voltage levels to the underlying logic cells. For various reasons, the grid metal lines experience large deviations from the nominal voltage, which lead to timing violations and logic failures. The transients of the logic circuitry may result in unacceptable voltage variations in the short-term. In contrast, the aging and deformation of the grid metal lines over time, a physical phenomenon referred to as Electromigration, may result in unacceptable voltage variations in the long-term. EM degradation puts a limit on the useful lifetime of a chip, referred to as EM-lifetime.

Verifying the grid voltages under the transient currents of the logic circuitry is a common
step in the design flow of a chip, referred to as power grid verification. Power grid verification
tools generally belong to one of two categories: vector-based and vectorless. Vector-based
methods are commonly used in industry. These methods require the user to specify current
vectors (or waveforms) that represent the current load of the logic circuitry, and they find the
voltage drop at all grid nodes by means of simulation. The drawback of vector-based methods is that
they rely on current waveforms, which are not available until late stages of the design cycle, when
it is hard to modify the grid design. Vectorless methods, on the other hand, require the user
to specify only constraints on the currents drawn by the logic circuitry, and they find the worst-case
voltage drop at all grid nodes. Obtaining or specifying these current constraints is a burdensome
task for users and has become a hurdle to the adoption of these methods.
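The contrast between the two categories can be illustrated on a toy example. The sketch below is not the formulation used in this thesis: it assumes a purely resistive (DC) grid model Gv = i, a hypothetical three-node chain, per-node upper bounds on the currents, and a single total-current budget. Under these assumptions the worst-case drop at a node is a small linear program that a greedy fill solves exactly, because every entry of the inverse of G is non-negative.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

# Hypothetical chain: supply pad -1S- node0 -1S- node1 -1S- node2.
G = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]

# Vector-based: one user-supplied current vector, one simulation.
i_vec = [0.1, 0.2, 0.1]
v_vec = solve(G, i_vec)            # voltage drops for this vector only

def worst_case_drop(G, k, i_max, budget):
    """Vectorless (assumed constraints): maximize the drop at node k
    subject to 0 <= i_j <= i_max[j] and sum(i) <= budget."""
    n = len(G)
    e_k = [1.0 if j == k else 0.0 for j in range(n)]
    c = solve(G, e_k)              # row k of G^{-1} (G is symmetric), all >= 0
    remaining, drop = budget, 0.0
    for j in sorted(range(n), key=lambda j: -c[j]):   # largest sensitivity first
        take = min(i_max[j], remaining)
        drop += c[j] * take
        remaining -= take
    return drop

v_worst = [worst_case_drop(G, k, [0.3, 0.3, 0.3], 0.5) for k in range(3)]
```

The vector-based result is valid only for the one waveform supplied, whereas the vectorless result bounds the drop over every current assignment that satisfies the constraints.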

These verification techniques were extensively addressed over the past decade to provide an
efficient way to verify a passive grid (one that includes resistive, capacitive, and inductive
parasitics). In modern designs, power grids often include active devices that implement power gating
to allow the supply currents of major circuit blocks to be turned OFF. Such designs are widely used as
a way to manage the power dissipation and avoid overheating. Many authors have studied the
scheduling of a chip workload in order to remain within the power/temperature specifications.
But chip workload also impacts the voltage drop on the grid. With the advent of such designs,
most traditional techniques are ill-equipped to verify the grid. Thus, there is a need to manage
the workload so that voltage variations remain within specifications. This can be done using an
on-chip scheduler that ensures a safe schedule of chip workload.

Electromigration degradation, on the other hand, happens over an extended period of time.
Although the power grid may perform as desired for several years, slower chip performance and
incorrect logic functionality may start to occur after that. As grid lines degrade over time, their
resistance increases and, consequently, the power grid might fail to supply appropriate voltage
levels to the logic circuitry. The earliest time at which a voltage failure occurs is the EM-lifetime
of the chip. Typically, the design must meet a certain target lifetime before sign-off. Modifying
the power grid design is a common way to satisfy this design specification. This problem has
been addressed by many authors; however, existing methods are unreliable because they are
based on inaccurate EM models and pessimistic power grid failure models.

In this thesis, we developed a collection of techniques that help in the analysis, verification,
and design of power grids. We first developed, in Chapters 3 and 4, a systematic approach
that generates current constraints which, if adhered to by the underlying logic circuitry, would
guarantee safe voltage levels on the grid. We then introduced, in Chapter 5, a power scheduling
framework that targets power-gated grid designs. This framework generates current budgets for
individual blocks and identifies which blocks can be turned ON simultaneously without violating
the voltage integrity specification. Finally, we developed, in Chapter 6, a power grid fixing
scheme that resizes the grid metal lines in order to meet a target EM-lifetime.

There are numerous applications of the constraints generation framework that can be explored.
The generated current/power budgets encapsulate much useful information about the
grid. The availability of such budgets early in the design flow may drive the rest of the design
flow, so that automated floorplanning and grid-aware placement may become feasible, something
that has never been done before. The power scheduling framework developed in this thesis
identifies the chip workloads that result in safe voltage levels. Active power grids must be verified
under both working and transition modes. A desirable extension to this work is to identify all
safe chip state transitions. Typically, there are various options for how a block may be turned
ON (fast, with high current demands, or slower, with less current). A scheduler must be able
to determine the fastest way to start up a chip, as well as to transition between workloads, without
violating supply integrity specifications. Regarding the EM work, it might be possible to
develop an effective statistical model to directly evaluate the MTF of a grid and thus bypass
the Monte Carlo iterations; this would enable us to check for and fix an EM-lifetime violation in a
more computationally efficient manner. In fact, this effective statistical model might allow us
to address the more important question: given a power grid and a target MTF, can we generate
current budgets (or metal budgets) that guarantee the grid survival up to the target MTF?

Appendices

Appendix A

Generating Current Budgets for RC Grids

A.1 Applications

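Lemma A.1. $\mathcal{F}(u_l)$ is maximal in $\mathcal{S}$.

Proof. Recall that $L^*$ and $u_l$ are well-defined and $(L^*, u_l) \in \mathcal{T}$, so that $L^*\eta \le MGu_l$ and $L^* \ge 0$ which, because $\eta \ge 0$, gives $0 \le L^*\eta \le MGu_l$, so that $\mathcal{F}(u_l)$ is not empty. We will prove that $l(\cdot)$ satisfies the conditions of Lemma 3.5, from which $\mathcal{F}(u_l)$ is maximal in $\mathcal{S}$. First, notice that for any $u, u' \in \mathcal{U}$, if $\mathcal{F}(u') = \mathcal{F}(u)$, it follows that $l(u') = l(u)$, due to (3.35). It remains to prove that for any $u, u' \in \mathcal{U}$, if $MGu' > MGu$, then $l(u') > l(u)$.

Let $\lambda = \min_{\forall i}\left(MGu'|_i - MGu|_i\right)/\max_{\forall i}(\eta_i)$ and let $L' = l(u) + \lambda$. Because $MGu' > MGu$ and $M > 0$, it follows that $\lambda > 0$ and $L' > l(u)$. Furthermore, we have $(L', u') \in \mathcal{T}$, because $u' \in \mathcal{U}$ and:

\begin{align}
L'\eta &= l(u)\eta + \frac{\min_{\forall i}\left(MGu'|_i - MGu|_i\right)}{\max_{\forall i}(\eta_i)}\,\eta \tag{A.1}\\
&\le MGu + \min_{\forall i}\left(MGu'|_i - MGu|_i\right)\mathbf{1}_n \tag{A.2}\\
&\le MGu' \tag{A.3}
\end{align}

where in (A.2) we used $(l(u), u) \in \mathcal{T}$ and $\eta/\max_{\forall i}(\eta_i) \le \mathbf{1}_n$. Therefore, we have $l(u') \ge L' > l(u)$, so that $l(\cdot)$ satisfies the conditions of Lemma 3.5 and $\mathcal{F}(u_l)$ is maximal in $\mathcal{S}$.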

Lemma A.2. Let $u^* = L^* G^{-1} H \mathbf{1}_m$, then $u^* \in \mathcal{U}$ and $l(u^*) = L^*$.

Proof. Recall that $\eta \ge 0$ and $L^* \ge 0$, so that $L^*\eta \ge 0$. Moreover, because $\mathcal{C}(L^*) \subseteq \mathcal{F}(u_l)$, we have:

\[
0 \le L^*\eta = L^* M H \mathbf{1}_m \le MGu_l \tag{A.4}
\]

where we used the fact that $\eta = M'\mathbf{1}_m = MH\mathbf{1}_m$, and the final step is due to Lemma 3.8. Notice that $G^{-1}A = G^{-1}(G + B) = I_n + G^{-1}B \ge 0$, because $I_n \ge 0$, $G^{-1} \ge 0$, and $B \ge 0$. Multiplying (A.4) with $G^{-1}A \ge 0$, we get:

\[
0 \le L^* G^{-1} H \mathbf{1}_m \le u_l \tag{A.5}
\]

Therefore, we have $0 \le u^* = L^* G^{-1} H \mathbf{1}_m \le u_l$, so that $Pu^* \le Pu_l \le V_{th}$, because $P \ge 0$, and the final step is due to $u_l \in \mathcal{U}$. It follows that $u^* \in \mathcal{U}$. Moreover, we have that $MGu^* = L^* M H \mathbf{1}_m = L^*\eta$, from which $\mathcal{C}(L^*) \subseteq \mathcal{F}(u^*)$, due to Lemma 3.8, so that $l(u^*) = L^*$, due to (3.35), and the proof is complete.

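Lemma A.3. $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$.

Proof. Recall that $\zeta$, $\omega$, and $u_c$ are well-defined and $(\zeta, \omega, u_c) \in \mathcal{C}$, so that $M'\zeta \le MGu_c$ and $\zeta \ge 0$ which, because $M' \ge 0$, gives $0 \le M'\zeta \le MGu_c$, so that $\mathcal{F}(u_c)$ is not empty. We will prove that $\psi(\cdot)$ satisfies the conditions of Lemma 3.5, from which $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$. First, notice that for any $u, u' \in \mathcal{U}$, if $\mathcal{F}(u') = \mathcal{F}(u)$, it follows that $\psi(u') = \psi(u)$, due to (3.44). It remains to prove that for any $u, u' \in \mathcal{U}$, if $MGu' > MGu$, then $\psi(u') > \psi(u)$.

For any $u \in \mathcal{U}$, there must exist a vector $I \in \mathcal{F}(u)$ and $L$, where $0 \le L\eta \le MGu$, such that $\sum_{j=1}^{m} I_j + mL = \psi(u)$. Let $\lambda = \min_{\forall i}\left(MGu'|_i - MGu|_i\right)/\max_{\forall i,j}(m_{ij})$. Because $MGu' > MGu$ and $M > 0$, it follows that $\lambda > 0$. Also, let $e_1 \in \mathbb{R}^m$ be the vector whose 1st entry is 1 and all other entries are 0, and let $I' = I + \lambda e_1$. Because $\lambda > 0$, we have $\lambda e_1 \ge 0$, $\lambda e_1 \ne 0$, $I' \ge I \ge 0$, and $I' \ne I$, so that $\sum_{j=1}^{m} I'_j + mL > \sum_{j=1}^{m} I_j + mL = \psi(u)$. Furthermore, we have $I' \in \mathcal{F}(u')$, because:

\begin{align}
M'I' &= M'I + \lambda M'e_1 = M'I + \lambda c'_1 \tag{A.6}\\
&= M'I + \frac{\min_{\forall i}\left(MGu'|_i - MGu|_i\right)}{\max_{\forall i,j}(m_{ij})}\,c'_1 \tag{A.7}\\
&\le MGu + \min_{\forall i}\left(MGu'|_i - MGu|_i\right)\mathbf{1}_n \tag{A.8}\\
&\le MGu' \tag{A.9}
\end{align}

where in (A.8) we used $I \in \mathcal{F}(u)$ and $c'_1/\max_{\forall i,j}(m_{ij}) \le \mathbf{1}_n$. Also, we have $0 \le L\eta \le MGu < MGu'$. Therefore, we have $I' \in \mathcal{F}(u')$, and $L$ satisfying $0 \le L\eta \le MGu'$, with $\psi(u') \ge \sum_{j=1}^{m} I'_j + mL > \psi(u)$, so that $\psi(\cdot)$ satisfies the conditions of Lemma 3.5 and $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$.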

Appendix B

Generating Current Budgets for RLC Grids

B.1 Subset-Preserving (SP) Matrices

Lemma B.1. Let $X$ be a $2n \times 2n$ matrix represented as:

\[
X = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \tag{B.1}
\]

where $X_{11}$, $X_{12}$, $X_{21}$, and $X_{22}$ are $n \times n$ matrices. $X$ is subset-preserving (SP) if and only if:

\[
X_{11} \ge 0, \quad X_{12} \le 0, \quad X_{21} \le 0, \quad \text{and} \quad X_{22} \ge 0 \tag{B.2}
\]

Proof of the "if" direction: Let $X$ be a $2n \times 2n$ matrix that satisfies (B.2). Let $u = \begin{bmatrix} u_t \\ u_b \end{bmatrix}$ and $v = \begin{bmatrix} v_t \\ v_b \end{bmatrix}$ be any two $2n \times 1$ vectors, where $u_t$, $u_b$, $v_t$, and $v_b$ are $n \times 1$, and let $u \subseteq v$. Because $u_t \le v_t$, then $X_{11}u_t \le X_{11}v_t$, and because $u_b \ge v_b$, then $X_{12}u_b \le X_{12}v_b$, which gives $X_{11}u_t + X_{12}u_b \le X_{11}v_t + X_{12}v_b$. Likewise, because $u_t \le v_t$, then $X_{21}u_t \ge X_{21}v_t$, and because $u_b \ge v_b$, then $X_{22}u_b \ge X_{22}v_b$, which gives $X_{21}u_t + X_{22}u_b \ge X_{21}v_t + X_{22}v_b$, so that $Xu \subseteq Xv$ and $X$ is SP.

Proof of the "only if" direction: Let $X$ be subset-preserving, and let $u = \begin{bmatrix} u_t \\ u_b \end{bmatrix}$ and $v = \begin{bmatrix} v_t \\ v_b \end{bmatrix}$ be any two $2n \times 1$ vectors, where $u_t$, $u_b$, $v_t$, and $v_b$ are $n \times 1$ vectors, and let $u \subseteq v$ or, equivalently:

\[
u_t \le v_t \quad \text{and} \quad u_b \ge v_b \tag{B.3}
\]

Because $Xu \subseteq Xv$, we have that:

\begin{align}
X_{11}u_t + X_{12}u_b &\le X_{11}v_t + X_{12}v_b \tag{B.4}\\
X_{21}u_t + X_{22}u_b &\ge X_{21}v_t + X_{22}v_b \tag{B.5}
\end{align}

Let $u_t = u_b = 0$, $v_b = 0$, and $v_t = e_k$, where $e_k \in \mathbb{R}^n$ is the vector whose $k$th entry is 1 and all other entries are 0. Because this assignment satisfies (B.3), then (B.4) and (B.5) lead to $X_{11}e_k \ge 0$ and $X_{21}e_k \le 0$, for every $k \in \{1, \dots, n\}$. This means that $X_{11} \ge 0$ and $X_{21} \le 0$. Likewise, let $u_t = u_b = 0$, $v_t = 0$, and $v_b = -e_k$. Because this assignment satisfies (B.3), then (B.4) and (B.5) lead to $-X_{12}e_k \ge 0$ and $-X_{22}e_k \le 0$, for every $k \in \{1, \dots, n\}$. This means that $X_{12} \le 0$ and $X_{22} \ge 0$, which completes the proof.

Lemma B.2. If $X$ and $Y$ are SP, then $XY$ and $(X + Y)$ are SP.

Proof. Suppose that $X$ and $Y$ are $2n \times 2n$ SP matrices. For any two $2n \times 1$ vectors $u \subseteq v$, we have $Yu \subseteq Yv$, and $X(Yu) \subseteq X(Yv)$, so that $XYu \subseteq XYv$ and $XY$ is SP. Furthermore, $Xu \subseteq Xv$ and $Yu \subseteq Yv$, so that $(X + Y)u = Xu + Yu \subseteq Xv + Yv = (X + Y)v$ and so $(X + Y)$ is SP.
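Lemmas B.1 and B.2 can be checked numerically on small instances. The following is a minimal sketch; the matrices `X` and `Y`, the vectors `u` and `v`, and all helper names are hypothetical choices made for illustration. It tests the block sign pattern (B.2), the subset relation (B.3), and the closure of SP matrices under products and sums.

```python
def matvec(X, u):
    return [sum(x * y for x, y in zip(row, u)) for row in X]

def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def matadd(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_sp(X):
    """Block sign test (B.2): X11 >= 0, X12 <= 0, X21 <= 0, X22 >= 0."""
    n = len(X) // 2
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            diagonal_block = (i < n) == (j < n)
            if (diagonal_block and x < 0) or (not diagonal_block and x > 0):
                return False
    return True

def subset(u, v):
    """Relation (B.3): top halves u_t <= v_t, bottom halves u_b >= v_b."""
    n = len(u) // 2
    return (all(u[i] <= v[i] for i in range(n))
            and all(u[i] >= v[i] for i in range(n, 2 * n)))

# Hypothetical 4 x 4 examples (n = 2) that satisfy (B.2).
X = [[2, 0, -1, 0],
     [1, 3, 0, -2],
     [-1, 0, 1, 1],
     [0, -4, 0, 2]]
Y = [[1, 1, 0, -1],
     [0, 2, -3, 0],
     [-2, 0, 1, 0],
     [0, -1, 2, 1]]
u = [-1, 0, -1, -2]
v = [2, 1, -3, -2]            # u is a subset of v in the sense of (B.3)

product_is_sp = is_sp(matmul(X, Y))
sum_is_sp = is_sp(matadd(X, Y))
image_preserved = subset(matvec(X, u), matvec(X, v))
```

A single positive entry in an off-diagonal block is enough to break the property, which is why the sign test is applied entry by entry.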

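B.2 Proof of Lemma 4.6

Lemma B.3. [36] Let $\mathcal{G} = \{y \in \mathbb{R}^m : Zy \le w\}$ be a non-empty convex polytope, where $Z$ is an $r \times m$ matrix and $w$ is an $r \times 1$ vector. Also, let $z_i$ and $w_i$ be the $i$th row of $Z$ and the $i$th element of $w$, respectively. Then, there exists a $y \in \mathcal{G}$ such that $z_i y = w_i$, for some $i \in \{1, \dots, r\}$.

Lemma B.4. If $\mathcal{F}(u)$ is maximal in $\mathcal{S}$ then $u$ is feasible and extremal in $\mathcal{U}$.

Proof. We will prove the contrapositive. Let $u \in \mathcal{U}$ be either infeasible or not extremal in $\mathcal{U}$; we will prove that $\mathcal{F}(u)$ is not maximal in $\mathcal{S}$. If $u$ is infeasible then $\mathcal{F}(u) = \emptyset$, which we already know is not maximal in $\mathcal{S}$. Now consider the case when $u$ is feasible but not extremal in $\mathcal{U}$. In other words, we have $Pu \subset x_{th}$, so that $\epsilon \triangleq \min_{\forall i}\left(\left|Pu|_i - x_{th,i}\right|\right) > 0$. Let $\mathbf{1}_{2d}$ be the $2d \times 1$ vector whose first $d$ entries are 1 and the rest are $-1$, so that $Pu \subseteq x_{th} - \epsilon\mathbf{1}_{2d}$. Also, let $\mathbf{1}_{2n}$ be the $2n \times 1$ vector whose first $n$ entries are 1 and the rest are $-1$. Because $P$ has exactly one 1 in each row, it follows that $P\mathbf{1}_{2n} = \mathbf{1}_{2d}$. Also, let $q = Q\mathbf{1}_{2n}$, so that $\delta \triangleq \max_{\forall i}|q_i| > 0$ because $Q$ is non-singular, and let $u' = u + (\epsilon/\delta)q$. Notice that $(1/\delta)q \subseteq \mathbf{1}_{2n}$, due to the definition of $\delta$, so that $(\epsilon/\delta)Pq \subseteq \epsilon P\mathbf{1}_{2n}$ because $\epsilon P$ is SP, from which $Pu + (\epsilon/\delta)Pq \subseteq x_{th} - \epsilon\mathbf{1}_{2d} + \epsilon P\mathbf{1}_{2n} = x_{th}$, due to $P\mathbf{1}_{2n} = \mathbf{1}_{2d}$. Therefore, we have $Pu' \subseteq x_{th}$, so that $u' \in \mathcal{U}$. Also, notice that $Tu' = Tu + (\epsilon/\delta)Tq = Tu + (\epsilon/\delta)TQ\mathbf{1}_{2n} = Tu + (\epsilon/\delta)\mathbf{1}_{2n}$, so that $Tu \subset Tu'$, because $(\epsilon/\delta) > 0$. We have so far established that there exists $u' \in \mathcal{U}$ with $Tu \subset Tu'$, so that $\mathcal{F}(u) \subseteq \mathcal{F}(u')$, due to (4.24). It only remains to prove that $\mathcal{F}(u) \ne \mathcal{F}(u')$. Notice that $\mathcal{F}(u') \ne \emptyset$, because $u$ is feasible and $\mathcal{F}(u) \subseteq \mathcal{F}(u')$. Also, for any $y, z \in \mathcal{F}(u')$ and $0 \le \alpha \le 1$, we have $\alpha y + (1 - \alpha)z \ge 0$ and $R\left[\alpha y + (1 - \alpha)z\right] = \alpha Ry + (1 - \alpha)Rz \in \alpha Tu' + (1 - \alpha)Tu' = Tu'$, so that $\mathcal{F}(u')$ is convex. Therefore, due to Lemma B.3, there exists an $I \in \mathcal{F}(u')$ such that:

\[
r_i I = t_i u' \quad \text{or} \quad r_i I = t_{n+i} u' \tag{B.6}
\]

for some $i \in \{1, \dots, n\}$, where $r_i$ is the $i$th row of $R$, $t_i$ is the $i$th row of $T$, and $t_{n+i}$ is the $(n+i)$th row of $T$. Suppose, towards a contradiction, that $I \in \mathcal{F}(u)$, from which:

\[
r_i I \le t_i u \quad \text{and} \quad r_i I \ge t_{n+i} u, \quad \forall i \in \{1, \dots, n\} \tag{B.7}
\]

Therefore, due to (B.6) and (B.7), we have:

\[
t_i u' \le t_i u \quad \text{or} \quad t_{n+i} u' \ge t_{n+i} u \tag{B.8}
\]

which contradicts $Tu \subset Tu'$, so that $I \notin \mathcal{F}(u)$, $\mathcal{F}(u) \ne \mathcal{F}(u')$, $\mathcal{F}(u)$ is not maximal in $\mathcal{S}$, and the proof is complete.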

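B.3 Proof of Lemmas 4.7-4.9

Lemma B.5. For any feasible $u \in \mathbb{R}^{2n}$ and any $z \in \mathbb{R}^{2n}$ such that $0 \subseteq Tz \subseteq T\left(u - x(\mathcal{F}(u))\right)$, let $u' = u - z$; it follows that $\mathcal{F}(u') = \mathcal{F}(u)$.

Proof. For any $I \in \mathcal{F}(u')$, we have $I \ge 0$ and $RI \in Tu' = Tu - Tz \subseteq Tu$, due to (4.12) and $0 \subseteq Tz$, so that $I \in \mathcal{F}(u)$. It follows that $\mathcal{F}(u') \subseteq \mathcal{F}(u)$. Conversely, for any $I \in \mathcal{F}(u)$, we have $I \ge 0$ and:

\[
RI \in \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI) = T\,x(\mathcal{F}(u)) \tag{B.9}
\]

Notice that for any $z$ with $0 \subseteq Tz \subseteq T\left(u - x(\mathcal{F}(u))\right)$, we have $Tu' = Tu - Tz \supseteq Tu - T\left(u - x(\mathcal{F}(u))\right)$, due to (4.10) and (4.8), so that $Tu' \supseteq T\,x(\mathcal{F}(u))$. Combining this with (B.9), we get $RI \in Tu'$, so that $I \in \mathcal{F}(u')$. Therefore, $\mathcal{F}(u) \subseteq \mathcal{F}(u')$, from which $\mathcal{F}(u') = \mathcal{F}(u)$, and the proof is complete.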

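Lemma B.6. For any feasible $u \in \mathbb{R}^{2n}$, let $u' = x(\mathcal{F}(u))$; it follows that $\mathcal{F}(u') = \mathcal{F}(u)$.

Proof. Let $z = u - x(\mathcal{F}(u))$, so that $Tz = Tu - T\,x(\mathcal{F}(u)) = Tu - \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI) \supseteq 0$, the last step due to the definition of $\mathcal{F}(u)$ and (4.12). As a result, $z$ satisfies the conditions of Lemma B.5. Let $u' = u - z = x(\mathcal{F}(u))$. Then, by Lemma B.5, $\mathcal{F}(u') = \mathcal{F}(u)$.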

Lemma B.7. For any $u \in \mathbb{R}^{2n}$, $u$ is irreducible if and only if it is feasible and $x(\mathcal{F}(u)) = u$.

Proof of the "if" direction: The proof is by contradiction. Let $u$ be feasible with $x(\mathcal{F}(u)) = u$ and suppose that $u$ is reducible, so that there exists $u' \subseteq u$, $u' \ne u$, with $\mathcal{F}(u') = \mathcal{F}(u)$. Notice that $\mathcal{F}(u)$ is not empty, because $u$ is feasible, so that $\mathcal{F}(u')$ is not empty and $u'$ is feasible. Therefore, we get:

\[
u' - x(\mathcal{F}(u')) = u' - x(\mathcal{F}(u)) = u' - u + u - x(\mathcal{F}(u))
\]

Because $x(\mathcal{F}(u')) \subseteq u'$, due to Lemma 4.4, it follows that $u' - u + u - x(\mathcal{F}(u)) \supseteq 0$, due to (4.12), so that $u - x(\mathcal{F}(u)) \supseteq u - u' \supseteq 0$, the final step due to $u' \subseteq u$ and (4.12). But $u - x(\mathcal{F}(u)) = 0$ due to $x(\mathcal{F}(u)) = u$, so that $u' = u$ and we have a contradiction that completes the proof.

Proof of the "only if" direction: We will prove the contrapositive. Let $u$ be either infeasible or such that $x(\mathcal{F}(u)) \ne u$; we will prove that $u$ is reducible. If $u$ is infeasible then $\mathcal{F}(u) = \emptyset$ and $u \ne 0$ (recall, $u = 0$ is always feasible), and it is easy to find another infeasible $u'$ with $u' \subseteq u$ and $u' \ne u$, as follows. Let $u' = \frac{1}{2}u$, from which $Tu' = \frac{1}{2}Tu$. Suppose that there exists $I \in \mathcal{F}(u')$, i.e., $\exists I \ge 0$ such that $RI \in \frac{1}{2}Tu$; then $2I \ge 0$ and $R(2I) \in Tu$, so that $2I \in \mathcal{F}(u)$, which contradicts that $u$ is infeasible; it follows that $u'$ is infeasible. Therefore, we have found $u' \subseteq u$, $u' \ne u$, with $\mathcal{F}(u') = \mathcal{F}(u) = \emptyset$, which means that $u$ is reducible. If $u$ is feasible and $x(\mathcal{F}(u)) \ne u$, let $u' = x(\mathcal{F}(u))$, so that $x(\mathcal{F}(u)) \subseteq u$, due to Lemma 4.4, leads to $u' \subseteq u$, $u' \ne u$, with $\mathcal{F}(u') = \mathcal{F}(u)$ due to Lemma 4.7, and $u$ is reducible.

Lemma B.8. For any feasible $u \in \mathbb{R}^{2n}$, let $u' = x(\mathcal{F}(u))$; it follows that $u'$ is irreducible.

Proof. Because $u' = x(\mathcal{F}(u))$, it follows from Lemma 4.7 that $\mathcal{F}(u') = \mathcal{F}(u)$, so that $u'$ is feasible and $x(\mathcal{F}(u')) = x(\mathcal{F}(u))$. With this, notice that $u' - x(\mathcal{F}(u')) = u' - x(\mathcal{F}(u)) = 0$, from which $x(\mathcal{F}(u')) = u'$. Using Lemma 4.8, it follows that $u'$ is irreducible, and the proof is complete.

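Lemma B.9. For any $u \in \mathbb{R}^{2n}$, $u$ is irreducible if and only if:

\[
Tu \subseteq Tu' \iff \mathcal{F}(u) \subseteq \mathcal{F}(u'), \quad \forall u' \in \mathbb{R}^{2n} \tag{B.10}
\]

Proof of the "if" direction: We give a proof by contradiction. Assume that (B.10) holds and suppose that $u$ is reducible, so that it is either infeasible or $x(\mathcal{F}(u)) \ne u$. If $u$ is infeasible, then $\mathcal{F}(u) = \emptyset \subseteq \mathcal{F}(u')$, for any $u' \in \mathbb{R}^{2n}$, so that $Tu \subseteq Tu'$, for any $u' \in \mathbb{R}^{2n}$, due to (B.10). But this is impossible, because we can always find a $u' \in \mathbb{R}^{2n}$ that violates $Tu \subseteq Tu'$, as follows. Let $\mathbf{1}_{2n}$ be the $2n \times 1$ vector whose first $n$ entries are 1 and the rest are $-1$, and let $w = Q\mathbf{1}_{2n}$ so that $Tw = \mathbf{1}_{2n} \supseteq 0$, and let $u' = u - w$ so that $Tu - Tu' = Tw \supseteq 0$ and, hence, $Tu' \subseteq Tu$, due to (4.12), with $Tu' \ne Tu$, because $Tw = \mathbf{1}_{2n} \ne 0$. This violates $Tu \subseteq Tu'$. Therefore, it must be that $u$ is feasible and $x(\mathcal{F}(u)) \ne u$. Let $u' = x(\mathcal{F}(u))$, so that $\mathcal{F}(u') = \mathcal{F}(u)$ due to Lemma 4.7, with $Tu' = T\,x(\mathcal{F}(u))$. Recall that $T\,x(\mathcal{F}(u)) = \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI) \subseteq Tu$, and $T\,x(\mathcal{F}(u)) \ne Tu$ due to $x(\mathcal{F}(u)) \ne u$, so that $Tu' \subseteq Tu$, $Tu' \ne Tu$. This means that we have $\mathcal{F}(u) \subseteq \mathcal{F}(u')$ while $Tu \not\subseteq Tu'$, which contradicts (B.10), and the proof is complete.

Proof of the "only if" direction: Let $u$ be irreducible, so that $u$ is feasible with $x(\mathcal{F}(u)) = u$. Due to (4.24), it only remains to prove that $\forall u' \in \mathbb{R}^{2n}$, $\mathcal{F}(u) \subseteq \mathcal{F}(u') \implies Tu \subseteq Tu'$. Notice that $\mathcal{F}(u')$ is non-empty, because $\mathcal{F}(u) \ne \emptyset$ and $\mathcal{F}(u) \subseteq \mathcal{F}(u')$, from which $u'$ is feasible. Because $u$ and $u'$ are feasible, and using $u = x(\mathcal{F}(u))$, notice that:

\begin{align*}
Tu' - Tu &= Tu' - T\,x(\mathcal{F}(u))\\
&= Tu' - \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI)\\
&\supseteq Tu' - \operatorname{eopt}_{I \in \mathcal{F}(u')}(RI) \supseteq 0
\end{align*}

where we used $\operatorname{eopt}_{I \in \mathcal{F}(u')}(RI) \supseteq \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI)$, because $\mathcal{F}(u) \subseteq \mathcal{F}(u')$, making use of (4.10), (4.8), and (4.12). Therefore, $Tu' - Tu \supseteq 0$, so $Tu \subseteq Tu'$ due to (4.12), and the proof is complete.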

B.4 Applications

Lemma B.10. Given a real-valued function $g(\cdot) : \mathbb{R}^{2n} \to \mathbb{R}$ such that, for any $u, u' \in \mathcal{U}$, with $0 \in Tu$ and $0 \in Tu'$, we have: i) $g(u') = g(u)$ if $\mathcal{F}(u') = \mathcal{F}(u)$, and ii) $g(u') > g(u)$ if $Tu' \supset Tu$. Furthermore, let:

\[
g^* = \max_{\substack{u \in \mathcal{U} \\ 0 \in Tu}} \left[g(u)\right] \tag{B.11}
\]

and let $u^* \in \mathcal{U}$ be feasible with $0 \in Tu^*$ and $g(u^*) = g^*$. It follows that $\mathcal{F}(u^*)$ is maximal in $\mathcal{S}$.

Proof. We will prove that $u^*$ is irreducible and extremal in $\mathcal{U}$, so that $\mathcal{F}(u^*)$ is maximal in $\mathcal{S}$, due to Theorem 4.1. The proof is in two parts.

First, we will prove that $u^*$ is extremal in $\mathcal{U}$; the proof is by contradiction. Let $u \in \mathcal{U}$ be feasible with $0 \in Tu$ and $g(u) = g^*$, and suppose that $u$ is not extremal in $\mathcal{U}$, so that $Pu \subset x_{th}$. Let $\epsilon \triangleq \min_{\forall i}\left|Pu|_i - x_{th,i}\right| > 0$ and let $\mathbf{1}_{2n}$ be the $2n \times 1$ vector whose first $n$ entries are 1 and the rest are $-1$. Because $P$ has exactly one 1 in each row, it follows that $P\mathbf{1}_{2n} = \mathbf{1}_{2d}$, so that $Pu \subseteq x_{th} - \epsilon\mathbf{1}_{2d}$, due to the definition of $\epsilon$. Also, let $q = Q\mathbf{1}_{2n}$, so that $\delta \triangleq \max_{\forall i}|q_i| > 0$, because $Q$ is non-singular, and let $u' = u + (\epsilon/\delta)q$, for which clearly $u' \ne u$. Notice that $(1/\delta)q \subseteq \mathbf{1}_{2n}$, due to the definition of $\delta$, so that $(\epsilon/\delta)Pq \subseteq \epsilon P\mathbf{1}_{2n}$, because $\epsilon P$ is SP, from which $Pu + (\epsilon/\delta)Pq \subseteq x_{th} - \epsilon\mathbf{1}_{2d} + \epsilon P\mathbf{1}_{2n} = x_{th}$, due to $P\mathbf{1}_{2n} = \mathbf{1}_{2d}$. Therefore, we have $Pu' \subseteq x_{th}$, so that $u' \in \mathcal{U}$. Note that $Tu' = Tu + (\epsilon/\delta)Tq = Tu + (\epsilon/\delta)TQ\mathbf{1}_{2n} = Tu + (\epsilon/\delta)\mathbf{1}_{2n}$, and, because $(\epsilon/\delta) > 0$, we get $Tu' \supset Tu$, so that $0 \in Tu'$, due to $0 \in Tu$. It follows that $g(u') > g(u) = g^*$ with $u' \ne u$, which contradicts (B.11). Therefore, $u$ is extremal in $\mathcal{U}$, so that $u^*$ is extremal in $\mathcal{U}$, which completes the first part of the proof.

Next, we will prove that $u^*$ is irreducible; the proof is by contradiction. Let $u \in \mathcal{U}$ be feasible with $0 \in Tu$ and $g(u) = g^*$, and suppose that $u$ is reducible; then, by Lemma 4.8, we must have $x(\mathcal{F}(u)) \ne u$. Let $u' = x(\mathcal{F}(u))$, so that $\mathcal{F}(u') = \mathcal{F}(u)$ due to Lemma 4.7. Because $u' \subseteq u$ due to Lemma 4.4, from which $Pu' \subseteq Pu$ because $P$ is SP, then $u' \in \mathcal{U}$. Note that $Tu' = T\,x(\mathcal{F}(u)) = \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI)$. Furthermore, because $0 \in Tu$, we have $0 \in \mathcal{F}(u)$, due to Lemma 4.3, so that $0 \in \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI)$ due to (4.23), from which $0 \in Tu'$, and the conditions of the lemma provide that $g(u') = g(u) = g^*$. Let $\delta = Tu - Tu'$. Note that $Tu' = T\,x(\mathcal{F}(u)) = \operatorname{eopt}_{I \in \mathcal{F}(u)}(RI) \subseteq Tu$, due to (4.23), and $Tu \ne T\,x(\mathcal{F}(u))$, due to $x(\mathcal{F}(u)) \ne u$, from which $\delta \supseteq 0$ and $\delta \ne 0$. Combining this with $Q$ being SP (Lemma B.18) and every element of $Q$ being non-zero (Lemma B.16), we have $0 \subset Q\delta = u - u'$. Consequently, we have $u' \subset u$, due to (4.13), so that $Pu' \subset Pu \subseteq x_{th}$, making use of Lemma 4.2 and the final step due to $u \in \mathcal{U}$, so that $u'$ is not extremal in $\mathcal{U}$. But this contradicts the first part of the proof. It follows that $u$ is irreducible, so that $u^*$ is irreducible. Therefore, $\mathcal{F}(u^*)$ is maximal in $\mathcal{S}$.

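Lemma B.11. For any $\theta \ge 0$ and $u \in \mathbb{R}^{2n}$ with $0 \in \mathcal{F}(u)$, $\mathcal{S}^+(\theta) \subseteq \mathcal{F}(u)$ if and only if $\theta\nu \subseteq Tu$.

Proof. Let $t_k$ be the $k$th row of $T$. Because $0 \in \mathcal{F}(u)$, it follows that $t_i u \ge 0$ and $t_{(n+i)} u \le 0$, $\forall i \in \{1, \dots, n\}$, due to Lemma 4.3. Also, notice that $\theta\nu \subseteq Tu$ if and only if $\theta\nu_i^+ \le t_i u$ and $-\theta\nu_i^- \ge t_{(n+i)} u$, $\forall i \in \{1, \dots, n\}$. The proof is in two parts.

Proof of the "if" direction: Let $\theta\nu \subseteq Tu$. For any $I \in \mathcal{S}^+(\theta)$, we have $I \ge 0$ and $\|I\| \le \theta$, so that $\nu_i^+\|I\| \le \nu_i^+\theta \le t_i u$ and $-\nu_i^-\|I\| \ge -\nu_i^-\theta \ge t_{(n+i)} u$, $\forall i \in \{1, \dots, n\}$, where we have used the fact that $\nu_i^+ \ge 0$ and $-\nu_i^- \le 0$. Notice that:

\[
r_i I \le r_i^+ I = |r_i^+ I| \le \|r_i^+\|\,\|I\| = \nu_i^+\|I\| \tag{B.12}
\]

the first two steps due to $I \ge 0$ and the third step due to the Cauchy-Schwarz inequality (see [61]). Therefore, it follows that $r_i I \le t_i u$. Similarly, notice that:

\[
r_i I \ge r_i^- I = -|r_i^- I| \ge -\|r_i^-\|\,\|I\| = -\nu_i^-\|I\| \tag{B.13}
\]

so that $r_i I \ge t_{(n+i)} u$. Thus, $RI \in Tu$, so that $I \in \mathcal{F}(u)$ and $\mathcal{S}^+(\theta) \subseteq \mathcal{F}(u)$.

Proof of the "only if" direction: Let $\mathcal{S}^+(\theta) \subseteq \mathcal{F}(u)$. For any $i \in \{1, \dots, n\}$, notice that if $r_i^+ = 0$, then $\nu_i^+ = 0$ and $\theta\nu_i^+ \le t_i u$, because $t_i u \ge 0$. Otherwise, if $r_i^+ \ne 0$, then let $I = \theta\,\dfrac{(r_i^+)^T}{\nu_i^+} \ge 0$. Notice that $\|I\| = \theta$, because $\|(r_i^+)^T\| = \|r_i^+\| = \nu_i^+$, so that $I \in \mathcal{S}^+(\theta)$, from which $I \in \mathcal{F}(u)$, i.e., $r_i I \le t_i u$. Therefore, we have:

\[
r_i I = \theta\,\frac{r_i (r_i^+)^T}{\nu_i^+} \le t_i u \tag{B.14}
\]

But $r_i (r_i^+)^T = r_i^+ (r_i^+)^T = \|r_i^+\|^2 = (\nu_i^+)^2$, so that:

\[
\theta\nu_i^+ \le t_i u \tag{B.15}
\]

Similarly, if $r_i^- = 0$, then $\nu_i^- = 0$ and $-\theta\nu_i^- \ge t_{(n+i)} u$, because $t_{(n+i)} u \le 0$. Otherwise, if $r_i^- \ne 0$, then let $I' = -\theta\,\dfrac{(r_i^-)^T}{\nu_i^-} \ge 0$. Notice that $\|I'\| = \theta$, because $\|(r_i^-)^T\| = \|r_i^-\| = \nu_i^-$, so that $I' \in \mathcal{S}^+(\theta)$, from which $I' \in \mathcal{F}(u)$, i.e., $r_i I' \ge t_{(n+i)} u$. Therefore, we have:

\[
r_i I' = -\theta\,\frac{r_i (r_i^-)^T}{\nu_i^-} \ge t_{(n+i)} u \tag{B.16}
\]

But $r_i (r_i^-)^T = r_i^- (r_i^-)^T = \|r_i^-\|^2 = (\nu_i^-)^2$, so that:

\[
-\theta\nu_i^- \ge t_{(n+i)} u \tag{B.17}
\]

From (B.15) and (B.17), it follows that $\theta\nu \subseteq Tu$.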
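The two bounds (B.12) and (B.13), and the extremal currents used in the "only if" direction, can be checked numerically. The sketch below is an illustration only: the row `r`, the radius `theta`, and the test vector `I` are hypothetical, and it assumes that $r_i^+$ and $r_i^-$ denote the element-wise non-negative and non-positive parts of $r_i$, with $\nu_i^\pm = \|r_i^\pm\|$ (these definitions come from Chapter 4 and are not restated here).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

r = [3.0, -1.0, 4.0, 0.0]              # hypothetical row r_i of R
theta = 2.0                            # hypothetical radius of S+(theta)

r_plus = [max(x, 0.0) for x in r]      # assumed definition of r_i^+
r_minus = [min(x, 0.0) for x in r]     # assumed definition of r_i^-
nu_plus, nu_minus = norm(r_plus), norm(r_minus)

# Any non-negative I with ||I|| <= theta obeys (B.12) and (B.13).
I = [1.0, 1.0, 1.0, 1.0]
assert norm(I) <= theta
assert -nu_minus * norm(I) <= dot(r, I) <= nu_plus * norm(I)

# The extremal choices of the "only if" direction attain the two bounds.
I_hi = [theta * x / nu_plus for x in r_plus]
I_lo = [-theta * x / nu_minus for x in r_minus]
upper = dot(r, I_hi)                   # theta * nu_plus, as in (B.14)-(B.15)
lower = dot(r, I_lo)                   # -theta * nu_minus, as in (B.16)-(B.17)
```

Because the bounds are attained by members of $\mathcal{S}^+(\theta)$, the condition $\theta\nu \subseteq Tu$ cannot be relaxed, which is the content of the "only if" direction.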

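Lemma B.12. $\mathcal{F}(u_s)$ is maximal in $\mathcal{S}$.

Proof. Recall that $\Theta^*$ and $u_s$ are well-defined and $(\Theta^*, u_s) \in \mathcal{R}$, so that $\Theta^*\nu \subseteq Tu_s$ and $\Theta^* \ge 0$. We will prove that $\Theta(\cdot)$ satisfies the conditions of Lemma B.10, from which it would follow that $\mathcal{F}(u_s)$ is maximal in $\mathcal{S}$. First, notice that for any $u, u' \in \mathcal{U}$, if $\mathcal{F}(u') = \mathcal{F}(u)$, it follows that $\Theta(u') = \Theta(u)$, due to (4.52). It remains to prove that for any $u, u' \in \mathcal{U}$, with $0 \in Tu$ and $0 \in Tu'$, if $Tu' \supset Tu$, then $\Theta(u') > \Theta(u)$.

Let $\lambda = \min_{\forall i}\left|Tu'|_i - Tu|_i\right| / \max_{\forall i}(|\nu_i|)$, which is well-defined because $\nu \ne 0$ due to $R \ne 0$, and let $\theta' = \Theta(u) + \lambda$. Because $Tu' \supset Tu$, it follows that $\lambda > 0$ and $\theta' > \Theta(u) \ge 0$. Therefore:

\begin{align}
\theta'\nu &= \Theta(u)\nu + \frac{\min_{\forall i}\left|Tu'|_i - Tu|_i\right|}{\max_{\forall i}(|\nu_i|)}\,\nu \tag{B.18}\\
&\subseteq Tu + \min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \tag{B.19}
\end{align}

where $\mathbf{1}_{2n}$ is the $2n \times 1$ vector whose first $n$ entries are 1 and the rest are $-1$, and in (B.19) we used $(\Theta(u), u) \in \mathcal{R}$ and $\nu/\max_{\forall i}(|\nu_i|) \subseteq \mathbf{1}_{2n}$. Notice that, for any $k \in \{1, \dots, n\}$, because $Tu \subset Tu'$, the first $n$ entries of $Tu' - Tu$ are positive and the last $n$ entries are negative, so that:

\begin{align}
\min_{\forall i}\left|Tu'|_i - Tu|_i\right| &\le \left|Tu'|_k - Tu|_k\right| = Tu'|_k - Tu|_k \tag{B.20}\\
-\min_{\forall i}\left|Tu'|_i - Tu|_i\right| &\ge -\left|Tu'|_{n+k} - Tu|_{n+k}\right| = Tu'|_{n+k} - Tu|_{n+k} \tag{B.21}
\end{align}

Combining (B.20) and (B.21), we get:

\[
\min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \subseteq Tu' - Tu \tag{B.22}
\]

Therefore, due to (B.19) and (B.22) and making use of (4.8), we get:

\[
\theta'\nu \subseteq Tu + Tu' - Tu = Tu' \tag{B.23}
\]

This, coupled with $u' \in \mathcal{U}$, means that $(\theta', u') \in \mathcal{R}$, so that $\Theta(u') \ge \theta' > \Theta(u)$, from which $\Theta(\cdot)$ satisfies the conditions of Lemma B.10 and $\mathcal{F}(u_s)$ is maximal in $\mathcal{S}$.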

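Lemma B.13. $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$.

Proof. Recall that $\zeta$, $\omega$, and $u_c$ are well-defined and $(\zeta, \omega, u_c) \in \mathcal{C}$, so that $R\zeta \in Tu_c$, $\omega\nu \subseteq Tu_c$, and $\zeta, \omega \ge 0$. We will prove that $\xi(\cdot)$ satisfies the conditions of Lemma B.10, from which it would follow that $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$. First, notice that for any $u, u' \in \mathcal{U}$, if $\mathcal{F}(u') = \mathcal{F}(u)$, it follows that $\xi(u') = \xi(u)$, due to (4.59). It remains to prove that for any $u, u' \in \mathcal{U}$, with $0 \in Tu$ and $0 \in Tu'$, if $Tu' \supset Tu$, then $\xi(u') > \xi(u)$.

For any $u \in \mathcal{U}$, there must exist a vector $I \in \mathcal{F}(u)$ and $\theta$, where $\theta\nu \subseteq Tu$, such that $\sum_{j=1}^{m} I_j + m\theta = \xi(u)$. Let $\lambda = \min_{\forall i}\left|Tu'|_i - Tu|_i\right| / \max_{\forall i,j}(|r_{ij}|)$, which is well-defined due to $R \ne 0$. Because $Tu \subset Tu'$, it follows that $\lambda > 0$. Also, let $e_1 \in \mathbb{R}^m$ be the vector whose 1st entry is 1 and all other entries are 0, and let $I' = I + \lambda e_1$. Because $\lambda > 0$, we have $\lambda e_1 \ge 0$, $\lambda e_1 \ne 0$, $I' \ge I \ge 0$, and $I' \ne I$, so that $\sum_{j=1}^{m} I'_j + m\theta > \sum_{j=1}^{m} I_j + m\theta = \xi(u)$. Denote by $c_j$ the $j$th column of $R$ and notice that:

\begin{align}
RI' &= RI + \lambda Re_1 = RI + \lambda c_1 \tag{B.24}\\
&= RI + \frac{\min_{\forall i}\left|Tu'|_i - Tu|_i\right|}{\max_{\forall i,j}(|r_{ij}|)}\,c_1 \tag{B.25}
\end{align}

Let $\mathbf{1}_{2n}$ be the $2n \times 1$ vector whose first $n$ entries are 1 and the rest are $-1$. Because $c_1/\max_{\forall i,j}(|r_{ij}|) \in \mathbf{1}_{2n}$, notice that:

\[
\frac{\min_{\forall i}\left|Tu'|_i - Tu|_i\right|}{\max_{\forall i,j}(|r_{ij}|)}\,c_1 \in \min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \tag{B.26}
\]

which, combined with $RI \in Tu$ because $(I, \theta, u) \in \mathcal{C}$, and due to Lemma 4.1, gives:

\[
RI + \frac{\min_{\forall i}\left|Tu'|_i - Tu|_i\right|}{\max_{\forall i,j}(|r_{ij}|)}\,c_1 \in Tu + \min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \tag{B.27}
\]

Therefore, using (B.25), it follows that:

\[
RI' \in Tu + \min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \tag{B.28}
\]

Also, notice that, for any $k \in \{1, \dots, n\}$, because $Tu \subset Tu'$, the first $n$ entries of $Tu' - Tu$ are positive and the last $n$ entries are negative, so that:

\begin{align}
\min_{\forall i}\left|Tu'|_i - Tu|_i\right| &\le \left|Tu'|_k - Tu|_k\right| = Tu'|_k - Tu|_k \tag{B.29}\\
-\min_{\forall i}\left|Tu'|_i - Tu|_i\right| &\ge -\left|Tu'|_{n+k} - Tu|_{n+k}\right| = Tu'|_{n+k} - Tu|_{n+k} \tag{B.30}
\end{align}

Combining (B.29) and (B.30), we get:

\[
\min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \subseteq Tu' - Tu \tag{B.31}
\]

This, combined with $Tu \subseteq Tu$ and making use of (4.8), gives:

\[
Tu + \min_{\forall i}\left|Tu'|_i - Tu|_i\right|\mathbf{1}_{2n} \subseteq Tu + Tu' - Tu = Tu' \tag{B.32}
\]

Therefore, due to (B.28) and (B.32), we get:

\[
RI' \in Tu' \tag{B.33}
\]

Also, we have $\theta\nu \subseteq Tu \subset Tu'$. Therefore, we have $I' \in \mathcal{F}(u')$, and $\theta$ satisfying $\theta\nu \subseteq Tu'$, with $\xi(u') \ge \sum_{j=1}^{m} I'_j + m\theta > \xi(u)$, so that $\xi(\cdot)$ satisfies the conditions of Lemma B.10 and $\mathcal{F}(u_c)$ is maximal in $\mathcal{S}$.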

B.5 Properties of the matrices

In the following, we present key theoretical results that are useful to carry out some of the
proofs presented above and in Chapter 4. First, we prove that the $n_v \times n_v$ matrix $D$ given
in (2.29) is irreducible, which is required to prove $D^{-1} > 0$, a key result that is useful to prove
Lemma B.15. Second, we prove that every element of $Q$ is non-zero. This result is established in
Lemma B.16, depends on the result of Lemma B.15, and is a key result in proving Theorem 4.1
and Lemma B.10. Finally, we prove that $Q$ is SP in Lemma B.18, depending on Lemma B.17,
which is used in the proofs of Theorem 4.1 and Lemmas B.10, 4.4, and 4.5.

Lemma B.14. The $n_v \times n_v$ matrix $D$ given in (2.29) is irreducible.

Proof. Recall that $D = G + B + ME^{-1}M^T$, where $G$ is an irreducible matrix with positive diagonal entries and non-positive off-diagonal entries, and $B$ is a non-negative diagonal matrix with a positive diagonal. We will show that $\mathcal{G}(D)$ is strongly connected, so that $D$ is irreducible. We start by showing that $ME^{-1}M^T$ has non-negative diagonal entries and non-positive off-diagonal entries. Notice that:

\[
\left(E^{-1}\right)_{ij} =
\begin{cases}
\dfrac{\Delta t}{l_{ii}} & \text{if } i = j\\[4pt]
0 & \text{otherwise}
\end{cases}
\tag{B.34}
\]

Therefore, if $X = E^{-1}M^T$, we have:

\[
x_{ij} = \sum_{k=1}^{n_l} \left(E^{-1}\right)_{ik} m_{jk}, \qquad i = 1, \dots, n_l, \quad j = 1, \dots, n_v \tag{B.35}
\]

so that $x_{ij} = \dfrac{\Delta t}{l_{ii}}\,m_{ji}$. Also, if $W = ME^{-1}M^T$, then:

\[
w_{ij} = \sum_{k=1}^{n_l} m_{ik}x_{kj} = \sum_{k=1}^{n_l} m_{ik}\,\frac{\Delta t}{l_{kk}}\,m_{jk}, \qquad i = 1, \dots, n_v, \quad j = 1, \dots, n_v \tag{B.36}
\]

By definition, every column of the matrix $M$ contains either one non-zero entry, or two non-zero entries where one of them is $+1$ and the other is $-1$. It follows that, for any $i \ne j$, we have $m_{ik}m_{jk} \le 0$, for any $k$, so that $w_{ij} \le 0$, $\forall i \ne j$. Also, we have $m_{ik}m_{ik} \ge 0$, for any $k$, so that $w_{ii} \ge 0$, $\forall i$. Therefore, $W = ME^{-1}M^T$ has non-negative diagonal entries and non-positive off-diagonal entries, and $B + ME^{-1}M^T$ has positive diagonal entries and non-positive off-diagonal entries, similar to $G$. It follows that $d_{ij} \ne 0$ whenever $g_{ij} \ne 0$, so that $\mathcal{E}(\mathcal{G}(G)) \subseteq \mathcal{E}(\mathcal{G}(D))$. Because $G$ is irreducible, $\mathcal{G}(G)$ is strongly connected; therefore, $\mathcal{G}(D)$ is strongly connected, so that $D$ is irreducible, due to Lemma 2.1.
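The sign pattern established through (B.34)-(B.36) can be confirmed on a small instance. In the sketch below, the incidence-style matrix `M`, the diagonal entries `l_diag` (standing for $l_{kk}$), and the time step `dt` are hypothetical values chosen only so that every column of `M` satisfies the stated property (a single non-zero entry, or one $+1$ and one $-1$).

```python
def m_einv_mt(M, l_diag, dt):
    """Return W = M E^{-1} M^T with (E^{-1})_kk = dt / l_kk, as in (B.36)."""
    n_v, n_l = len(M), len(l_diag)
    return [[sum(M[i][k] * (dt / l_diag[k]) * M[j][k] for k in range(n_l))
             for j in range(n_v)]
            for i in range(n_v)]

# Hypothetical example: 3 nodes (rows) and 3 branches (columns).
# Columns 0 and 1 each hold one +1 and one -1; column 2 holds a single +1.
M = [[1, 0, 1],
     [-1, 1, 0],
     [0, -1, 0]]
l_diag = [2.0, 4.0, 1.0]     # hypothetical positive diagonal entries l_kk
dt = 0.5                     # hypothetical time step

W = m_einv_mt(M, l_diag, dt)
n = len(W)
diag_nonneg = all(W[i][i] >= 0 for i in range(n))
offdiag_nonpos = all(W[i][j] <= 0 for i in range(n) for j in range(n) if i != j)
```

Since the off-diagonal entries of `W` are never positive, adding `W` and a positive diagonal to $G$ cannot cancel any non-zero off-diagonal entry of $G$, which is the step the proof relies on.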

Lemma B.15. For the n × n matrix F given in (2.34), and its extension F a ording to

Denition 4.6, the dire ted graph G(F ) is strongly onne ted.

Proof. We an represent the 2n× 2n matrix Z

= F as follows:

Z11 Z12 Z13 Z14

Z21 Z22 Z23 Z24

Z31 Z32 Z33 Z34

Z41 Z42 Z43 Z44

(B.37)

(D−1B)+ (D−1M)+ (D−1B)− (D−1M)−

(−E−1MTD−1B)+ (Inl− E−1MTD−1M)+ (−E−1MTD−1B)− (Inl

− E−1MTD−1M)−

(D−1B)− (D−1M)− (D−1B)+ (D−1M)+

(−E−1MTD−1B)− (Inl− E−1MTD−1M)− (−E−1MTD−1B)+ (Inl

− E−1MTD−1M)+

(B.38)

where, using the notation introdu ed in Denition 4.6,

Z11 = Z33 = (D−1B)+ (B.39)

Z13 = Z31 = (D−1B)− (B.40)

Z12 = Z34 = (D−1M)+ (B.41)

Z14 = Z32 = (D−1M)− (B.42)

Z21 = Z43 = (−E−1MTD−1B)+ (B.43)

Z23 = Z41 = (−E−1MTD−1B)− (B.44)

Z22 = Z44 = (Inl− E−1MTD−1M)+ (B.45)

Z24 = Z42 = (Inl− E−1MTD−1M)− (B.46)

The matrix Z an be used to onstru t a graph G(Z) whose verti es are V = v1, v2, . . . , v2n

and whose dire ted edges are (vi, vj) for every zij 6= 0, where zij is the (i, j)th element of

Z. Let E denote the set of edges of G(Z). Also, onsider a partition V1 = v1, v2, . . . , vnv,

V2 = vnv+1, vnv+2, . . . , vn, V3 = vn+1, . . . , vn+nv, and V4 = vn+nv+1, . . . , v2n of V . For

any two verti es u, v ∈ V , dene a binary-valued fun tion β(u, v) as follows:

β(u, v) =

1, if u ↔ v;

0, otherwise.

(B.47)

where u ↔ v is a shorthand for u → v and v → u. It should be lear that β(·) is transitive, i.e.,

for any three verti es u, v, w ∈ V , if β(u, v) = 1 and β(v, w) = 1, then β(u,w) = 1, and β(·)

is ommutative, i.e., β(u, v) = β(v, u). In the following, we will show that for any u, v ∈ V we

have β(u, v) = 1, so that G(Z) is strongly onne ted.

We start by proving the following properties on the verti es of G(Z):

∀u, v ∈ V1 β(u, v) = 1 (B.48)

∀u, v ∈ V3 β(u, v) = 1 (B.49)

∀v ∈ V2, ∃u ∈ V1 su h that β(u, v) = 1 (B.50)

∀v ∈ V4, ∃u ∈ V3 su h that β(u, v) = 1 (B.51)

whi h will lead us to the desired result. To better visualize the proof, refer to Fig. B.1.

Re all that D−1 > 0, from whi h D−1B > 0, be ause B = C/∆t ≥ 0 is a diagonal matrix

with non-zero diagonal elements. Therefore, we have (D−1B)+ = D−1B > 0 whi h, due

Figure B.1: A high-level representation of G(Z).

to (B.39), gives:

u → v, ∀u, v ∈ V1 (B.52)

u → v, ∀u, v ∈ V3 (B.53)

This proves (B.48) and (B.49).

For any i ∈ 1, 2, . . . , nv and j ∈ 1, 2, . . . , nl, noti e that vi ∈ V1, vnv+j ∈ V2, vn+i ∈ V3,

and vn+nv+j ∈ V4, as shown in Fig. B.1. Also, for any i ∈ 1, 2, . . . , nv and j ∈ 1, 2, . . . , nl,

we dene the following indexing s heme and notation for ertain edges in E :

e12(i, j) = (vi, vnv+j) ∈ E ⇐⇒ (Z12)ij 6= 0 (B.54)

e21(i, j) = (vnv+j , vi) ∈ E ⇐⇒ (Z21)ij 6= 0 (B.55)

e23(i, j) = (vnv+j , vn+i) ∈ E ⇐⇒ (Z23)ij 6= 0 (B.56)

e32(i, j) = (vn+i, vnv+j) ∈ E ⇐⇒ (Z32)ij 6= 0 (B.57)

e34(i, j) = (vn+i, vn+nv+j) ∈ E ⇐⇒ (Z34)ij 6= 0 (B.58)

e43(i, j) = (vn+nv+j , vn+i) ∈ E ⇐⇒ (Z43)ij 6= 0 (B.59)

e41(i, j) = (vn+nv+j , vi) ∈ E ⇐⇒ (Z41)ij 6= 0 (B.60)

e14(i, j) = (vi, vn+nv+j) ∈ E ⇐⇒ (Z14)ij 6= 0 (B.61)

Z12 = Z34 from (B.41) leads to: e12(i, j) ∈ E ⇐⇒ e34(i, j) ∈ E (B.62)

Now, let X = D−1M be a nv × nl matrix and Y = −E−1MTD−1B be a nl × nv matrix, and

noti e that Y = −E−1XTB, where E−1 ≥ 0 and B ≥ 0 are diagonal matri es with non-zero

diagonal elements, so that yij = − 1ejj

(XT )jibii = − 1ejj

xijbii. Thus, the orresponding non-zero

elements of Y and XThave opposite signs. Two things follow from this:

1) Considering the positive elements of Y :

(Y +)ji 6= 0 ⇐⇒ yji > 0 ⇐⇒ xij < 0 ⇐⇒ (X−)ij 6= 0

But Y + = Z21 and X− = Z14, so that:

(Z21)ji 6= 0 ⇐⇒ (Z14)ij 6= 0

from whi h, due to (B.55) and (B.61):

e21(i, j) ∈ E ⇐⇒ e14(i, j) ∈ E (B.66)

2) Considering the negative elements of Y :

(Y −)ji 6= 0 ⇐⇒ yji < 0 ⇐⇒ xij > 0 ⇐⇒ (X+)ij 6= 0

But Y − = Z23 and X+ = Z12, so that:

(Z23)ji 6= 0 ⇐⇒ (Z12)ij 6= 0

from whi h, due to (B.56) and (B.54):

e23(i, j) ∈ E ⇐⇒ e12(i, j) ∈ E (B.67)

Furthermore, let xj and mj be the jth olumns of X and M , respe tively. Note that mj 6= 0,

by denition of M , and D−1is non-singular, so that xj = D−1mj 6= 0. Therefore, xij 6= 0 for

some i ∈ 1, 2, . . . , nv, so that either (Z12)ij 6= 0 or (Z32)ij 6= 0, depending on whether xij > 0

or xij < 0. By (B.55) and (B.57), it follows that either e12(i, j) ∈ E or e32(i, j) ∈ E .

This being said, and due to (B.62) - (B.67), for every i and j, we have either e12(i, j), e23(i, j),

e34(i, j), and e41(i, j) ∈ E or e32(i, j), e21(i, j), e14(i, j), and e43(i, j) ∈ E . Thus, for any vertex

Figure B.2: A simple high-level representation of G(Z).

in V2 and the orresponding vertex in V4, as indexed by j, there exists a y le onne ting all

the partitions of V and passing through these two verti es as shown in Fig. B.1. This ompletes

the proof of (B.50) and (B.51).

Now, we are ready to show that for any two vertices $u, v \in V$, we have $\beta(u, v) = 1$. Notice that, $\forall u, v \in V_1 \cup V_2$, $\beta(u, v) = 1$, due to the following:

- if $u$ or $v \in V_1$, then clearly $u \leftrightarrow v$, either due to (B.48) or due to (B.48) and (B.50), and due to transitivity of $\beta(\cdot)$.

- if $u, v \in V_2$, then there exist $w, w' \in V_1$ such that $\beta(w, u) = 1$ and $\beta(w', v) = 1$, due to (B.50), which, combined with $\beta(\cdot)$ being commutative and transitive and due to (B.48), gives $\beta(u, v) = 1$.

Therefore, we will combine $V_1$ and $V_2$ as $V_{12} = V_1 \cup V_2$, so that $V_{12}$ is strongly connected. Likewise, $V_{34} = V_3 \cup V_4$ is strongly connected, due to (B.49) and (B.51). We can now look at $G(Z)$ using the simple representation of Fig. B.2, where $V_{12}$ and $V_{34}$ are strongly connected. Moreover, because either $e_{23}(i, j)$ and $e_{41}(i, j) \in E$ or $e_{14}(i, j)$ and $e_{32}(i, j) \in E$, it follows that $G(Z)$ is strongly connected.
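The strong-connectivity property established above can be checked mechanically on small instances. The following is a minimal sketch, not the thesis's own procedure: it assumes the common convention that $G(Z)$ has an edge from vertex $i$ to vertex $j$ whenever the $(i, j)$th entry of $Z$ is non-zero (strong connectivity is unaffected if the opposite orientation is used), and the function name `is_strongly_connected` and both example matrices are hypothetical.

```python
def is_strongly_connected(Z):
    """Check whether the digraph G(Z) is strongly connected.

    Assumed convention: an edge i -> j exists iff Z[i][j] != 0.
    """
    n = len(Z)

    def reaches_all(has_edge):
        # Depth-first search from vertex 0 using the given edge predicate.
        seen, stack = {0}, [0]
        while stack:
            i = stack.pop()
            for j in range(n):
                if j not in seen and has_edge(i, j):
                    seen.add(j)
                    stack.append(j)
        return len(seen) == n

    # Strongly connected iff vertex 0 reaches every vertex both in G(Z)
    # and in the graph with all edges reversed.
    forward = reaches_all(lambda i, j: Z[i][j] != 0)
    backward = reaches_all(lambda i, j: Z[j][i] != 0)
    return forward and backward


# A signed 3-cycle: connectivity depends only on which entries are non-zero.
cycle = [[0, 1, 0], [0, 0, -2], [3, 0, 0]]
# An upper-triangular matrix: vertex 1 cannot reach vertex 0.
triangular = [[1, 1], [0, 1]]

cycle_connected = is_strongly_connected(cycle)
cycle_abs_connected = is_strongly_connected([[abs(x) for x in row] for row in cycle])
triangular_connected = is_strongly_connected(triangular)
```

Because only the zero pattern matters, the check returns the same answer for a matrix and for its entrywise absolute value.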

Lemma B.16. Every element of $Q$ is non-zero.

Proof. Let $Z = F$. Note that $Q = (I_{2n} - Z)^{-1} = \sum_{k=0}^{\infty} Z^k$, because $\rho(Z) < 1$ [61]. In the following, we will first show that $|Z^k| = |Z|^k$, for every integer $k \geq 1$, starting with the block form of $Z^i$ in the following notation:

\[ Z^i = \begin{bmatrix} Z^{(i)}_{11} & Z^{(i)}_{12} \\ Z^{(i)}_{21} & Z^{(i)}_{22} \end{bmatrix} \tag{B.68} \]

Recall that $Z = F$ is SP, due to Definition 4.6 and Lemma B.1, so that $Z^i$ is SP, due to Lemma B.2, which, due to Lemma B.1, gives $Z^{(i)}_{11} \geq 0$, $Z^{(i)}_{12} \leq 0$, $Z^{(i)}_{21} \leq 0$, and $Z^{(i)}_{22} \geq 0$. Let $Y = |Z|$, where $|Z|$ is the matrix consisting of the absolute values of the elements of $Z$, and represent $Y^i$ as follows:

\[ Y^i = \begin{bmatrix} Y^{(i)}_{11} & Y^{(i)}_{12} \\ Y^{(i)}_{21} & Y^{(i)}_{22} \end{bmatrix} \tag{B.69} \]

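We will prove by induction that $Y^i = |Z^i|$, for every $i \geq 1$. Notice that for $i = 1$, the result is trivially true, and suppose that $Y^{k-1} = |Z^{k-1}|$, so that:

\begin{align}
Y^k = Y^{k-1} Y = |Z^{k-1}||Z| &= \begin{bmatrix} Z^{(k-1)}_{11} & -Z^{(k-1)}_{12} \\ -Z^{(k-1)}_{21} & Z^{(k-1)}_{22} \end{bmatrix} \begin{bmatrix} Z_{11} & -Z_{12} \\ -Z_{21} & Z_{22} \end{bmatrix} \tag{B.70} \\
&= \begin{bmatrix} Z^{(k-1)}_{11} Z_{11} + Z^{(k-1)}_{12} Z_{21} & -Z^{(k-1)}_{11} Z_{12} - Z^{(k-1)}_{12} Z_{22} \\ -Z^{(k-1)}_{21} Z_{11} - Z^{(k-1)}_{22} Z_{21} & Z^{(k-1)}_{21} Z_{12} + Z^{(k-1)}_{22} Z_{22} \end{bmatrix} \tag{B.71}
\end{align}

while:

\begin{align}
Z^k = Z^{k-1} Z &= \begin{bmatrix} Z^{(k-1)}_{11} & Z^{(k-1)}_{12} \\ Z^{(k-1)}_{21} & Z^{(k-1)}_{22} \end{bmatrix} \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} \tag{B.72} \\
&= \begin{bmatrix} Z^{(k-1)}_{11} Z_{11} + Z^{(k-1)}_{12} Z_{21} & Z^{(k-1)}_{11} Z_{12} + Z^{(k-1)}_{12} Z_{22} \\ Z^{(k-1)}_{21} Z_{11} + Z^{(k-1)}_{22} Z_{21} & Z^{(k-1)}_{21} Z_{12} + Z^{(k-1)}_{22} Z_{22} \end{bmatrix} \tag{B.73}
\end{align}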

Recall that $Z^k$ is SP, which, due to Lemma B.1, means that the diagonal blocks in (B.73) are non-negative and the off-diagonal blocks are non-positive. Thus, comparing (B.71) and (B.73), we see that $Y^k = |Z^k|$, for every $k$. Furthermore, notice that $G(Z) = G(|Z|)$, due to the definition of $G(\cdot)$, and $G(Z)$ is strongly connected, due to Lemma B.15, so that $G(|Z|)$ is strongly connected and $|Z|$ is irreducible (see [11, pages 29-30]). This, combined with $|Z| \geq 0$, leads to $|Z|^p > 0$, for some integer $p \geq 1$ (see [11, page 29]). Therefore, $|Z^p| > 0$, due to $|Z^p| = |Z|^p$, so that every element of $Z^p$ is non-zero. Let $z^{(k)}_{ij}$ be the $(i, j)$th element of $Z^k$, so that the $(i, j)$th element of $Q$ is

\[ \sum_{k=0}^{\infty} z^{(k)}_{ij} = \sum_{k=0}^{p-1} z^{(k)}_{ij} + z^{(p)}_{ij} + \sum_{k=p+1}^{\infty} z^{(k)}_{ij} \neq 0 \]

where we used the fact that $z^{(k)}_{ij}$ have the same sign, $\forall k$, due to $Z^k$ being SP, $\forall k$, and Lemma B.1, and $z^{(p)}_{ij} \neq 0$, because every element of $Z^p$ is non-zero. It follows that every element of $Q$ is non-zero.
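The identity $|Z^k| = |Z|^k$ and the non-zero partial sums of the Neumann series can be illustrated on a small example. The sketch below is only an illustration under stated assumptions: the $4 \times 4$ matrix `M` (so $n = 2$) is hypothetical, chosen to have the block sign pattern used in the proof (non-negative diagonal blocks, non-positive off-diagonal blocks), a strongly connected graph, and absolute row sums below 1 after scaling, which implies $\rho(Z) < 1$. Exact rational arithmetic is used so that the equality tests are not affected by rounding.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_abs(A):
    """Entrywise absolute value |A|."""
    return [[abs(x) for x in row] for row in A]


def powers(A, count):
    """Return the list [A^0, A^1, ..., A^count]."""
    n = len(A)
    out = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    for _ in range(count):
        out.append(matmul(out[-1], A))
    return out


# Hypothetical example: diagonal 2x2 blocks >= 0, off-diagonal 2x2 blocks <= 0.
M = [[1, 2, -1, 0],
     [0, 1, 0, -3],
     [-2, 0, 1, 1],
     [0, -1, 2, 0]]
Z = [[Fraction(x, 10) for x in row] for row in M]

K = 12
Z_pows = powers(Z, K)
absZ_pows = powers(mat_abs(Z), K)

# |Z^k| = |Z|^k for k = 1, ..., K.
abs_identity_holds = all(mat_abs(Z_pows[k]) == absZ_pows[k] for k in range(1, K + 1))

# Smallest power p for which every entry of |Z|^p is strictly positive.
p = next(k for k in range(1, K + 1)
         if all(x > 0 for row in absZ_pows[k] for x in row))

# Partial Neumann sum Z^0 + Z^1 + ... + Z^K, an approximation of Q.
size = len(Z)
S = [[sum(Z_pows[k][i][j] for k in range(K + 1)) for j in range(size)]
     for i in range(size)]
partial_sum_nonzero = all(x != 0 for row in S for x in row)
```

Since the $(i, j)$th entries of all powers share the same sign, no cancellation can occur in the sum, which is why a single non-zero term $z^{(p)}_{ij}$ suffices in the argument above.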

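Lemma B.17. Let $u^{(k)}, v^{(k)} \in \mathbb{R}^{2n}$ be sequences of vectors. If $u^{(k)} \subseteq v^{(k)}$, $\forall k \geq 1$, and if $u = \lim_{k \to \infty} u^{(k)}$ and $v = \lim_{k \to \infty} v^{(k)}$ exist, then $u \subseteq v$.

Proof. Let $u^{(k)} = \begin{bmatrix} u^{(k)}_t \\ u^{(k)}_b \end{bmatrix}$, $v^{(k)} = \begin{bmatrix} v^{(k)}_t \\ v^{(k)}_b \end{bmatrix}$, $u = \begin{bmatrix} u_t \\ u_b \end{bmatrix}$, and $v = \begin{bmatrix} v_t \\ v_b \end{bmatrix}$, where $u^{(k)}_t$, $v^{(k)}_t$, $u_t$, $v_t$, $u^{(k)}_b$, $v^{(k)}_b$, $u_b$, and $v_b$ are $n \times 1$ vectors. In the following, we will show that $u_t \leq v_t$ and $u_b \geq v_b$, so that $u \subseteq v$.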

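Let $w^{(k)}_t = v^{(k)}_t - u^{(k)}_t \geq 0$, $\forall k$, and let

\[ w_t = \lim_{k \to \infty} w^{(k)}_t = \lim_{k \to \infty} \left( v^{(k)}_t - u^{(k)}_t \right) = \lim_{k \to \infty} v^{(k)}_t - \lim_{k \to \infty} u^{(k)}_t = v_t - u_t. \]

Suppose, for the sake of contradiction, that some entry of $w_t$ is negative, and let $w_t$ and $w^{(k)}_t$ denote that entry for the remainder of this argument. Because $w_t < 0$, there exists an integer $N \geq 1$ such that $|w^{(k)}_t - w_t| < -w_t$, $\forall k \geq N$, due to the definition of limits [10]. It follows that $w_t < w^{(k)}_t - w_t < -w_t$, $\forall k \geq N$, so that $w^{(k)}_t < 0$, $\forall k \geq N$, which contradicts $w^{(k)}_t \geq 0$. Therefore, $w_t = v_t - u_t \geq 0$ and $u_t \leq v_t$.

Similarly, let $w^{(k)}_b = u^{(k)}_b - v^{(k)}_b \geq 0$, $\forall k$, and let

\[ w_b = \lim_{k \to \infty} w^{(k)}_b = \lim_{k \to \infty} \left( u^{(k)}_b - v^{(k)}_b \right) = \lim_{k \to \infty} u^{(k)}_b - \lim_{k \to \infty} v^{(k)}_b = u_b - v_b. \]

Suppose that some entry of $w_b$ is negative, and let $w_b$ and $w^{(k)}_b$ denote that entry. Because $w_b < 0$, there exists an integer $N \geq 1$ such that $|w^{(k)}_b - w_b| < -w_b$, $\forall k \geq N$, due to the definition of limits [10]. It follows that $w_b < w^{(k)}_b - w_b < -w_b$, $\forall k \geq N$, so that $w^{(k)}_b < 0$, $\forall k \geq N$, which contradicts $w^{(k)}_b \geq 0$. Therefore, $w_b = u_b - v_b \geq 0$, $u_b \geq v_b$, and $u \subseteq v$.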

Lemma B.18. $Q$ is SP.

Proof. Let $Z = F$. Recall that, because $\rho(Z) < 1$, the summation $\sum_{k=0}^{\infty} Z^k$ exists, and we have $Q = (I_{2n} - Z)^{-1} = \sum_{k=0}^{\infty} Z^k$ [61]. The proof is by induction. Notice that $Z^0$ is SP, due to Lemma B.1. Suppose that $Z^{k-1}$ is SP, then $Z^k = Z^{k-1} Z$ is also SP, due to Lemma B.2. Therefore, $Z^k$ is SP, for any $k \geq 0$. Also, notice that, for any two $2n \times 1$ vectors $u \subseteq v$ and for any integer $p \geq 1$, we have:

\[ \left( \sum_{k=0}^{p} Z^k \right) u \subseteq \left( \sum_{k=0}^{p} Z^k \right) v \tag{B.74} \]

where we used the fact that $Z^k$ is SP, for any $k \geq 0$, and that the finite sum of SP matrices is SP, due to Lemma B.2. Because $\lim_{p \to \infty} \sum_{k=0}^{p} Z^k$ exists and converges to $(I_{2n} - Z)^{-1}$, taking the limits on both sides of (B.74), due to Lemma B.17, leads to:

\[ Qu = (I_{2n} - Z)^{-1} u \subseteq (I_{2n} - Z)^{-1} v = Qv \tag{B.75} \]

Hence, $Q$ is SP, which completes the proof.
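The order-preservation argument can be traced numerically through the partial sums of the series. This is a hedged sketch rather than part of the proof: the relation $u \subseteq v$ is coded as $u_t \leq v_t$ and $u_b \geq v_b$, following its use in the proof of Lemma B.17, and the matrix `M`, the vectors `u` and `v`, and the helper names are hypothetical.

```python
from fractions import Fraction


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_subset(u, v):
    """Relation 'u subset v': top halves u_t <= v_t and bottom halves u_b >= v_b."""
    n = len(u) // 2
    top_ok = all(u[i] <= v[i] for i in range(n))
    bottom_ok = all(u[i] >= v[i] for i in range(n, 2 * n))
    return top_ok and bottom_ok


# Hypothetical example with non-negative diagonal blocks, non-positive
# off-diagonal blocks, and absolute row sums below 1.
M = [[1, 2, -1, 0],
     [0, 1, 0, -3],
     [-2, 0, 1, 1],
     [0, -1, 2, 0]]
Z = [[Fraction(x, 10) for x in row] for row in M]

u = [Fraction(x) for x in (0, 1, 5, 4)]
v = [Fraction(x) for x in (2, 1, 3, 0)]

size = len(Z)
power = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]  # Z^0
partial = [[Fraction(0)] * size for _ in range(size)]

subset_preserved = []
for p in range(31):
    # partial becomes Z^0 + ... + Z^p
    partial = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(partial, power)]
    subset_preserved.append(is_subset(matvec(partial, u), matvec(partial, v)))
    power = matmul(power, Z)

all_preserved = all(subset_preserved)
```

Every partial sum preserves the relation, and Lemma B.17 is what carries this property over to the limit $Q$.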

Appendix C

Power Scheduling with Active RC Power Grids

C.1 Design Objectives

Lemma C.1. Let $g(u) : \mathbb{R}^n \to \mathbb{R}$ be a real-valued function defined as follows:

\[ g(u) = \max_{I \in L(u)} I_j \tag{C.1} \]

We have $g(cu) = c\,g(u)$, for any real number $c > 0$.

Proof. Notice that

\[ g(cu) = \max_{M' I \leq c M G u} I_j \tag{C.2} \]

Let $I' = (1/c) I$ be a change of variable, then we can rewrite (C.2) as follows:

\[ g(cu) = c \max_{c M' I' \leq c M G u} I'_j \tag{C.3} \]

or equivalently, because $c > 0$,

\[ g(cu) = c \max_{M' I' \leq M G u} I'_j = c\,g(u) \tag{C.4} \]
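The scaling property $g(cu) = c\,g(u)$ can be observed on a toy linear program. The sketch below makes several assumptions that are not from the thesis: the matrices `A` and `B` are small hypothetical stand-ins for $M'$ and $MG$, the variable is two-dimensional, and the maximum is found by brute-force vertex enumeration (valid only for a non-empty bounded polygon) rather than by an LP solver.

```python
from fractions import Fraction
from itertools import combinations


def max_coordinate(A, b, j):
    """Maximise x[j] over {x in R^2 : A x <= b} by enumerating vertices.

    Valid only for a non-empty bounded polygon; returns None if no
    feasible vertex is found.
    """
    rows = list(zip(A, b))
    best = None
    for (a1, b1), (a2, b2) in combinations(rows, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel constraints do not define a vertex
        # Cramer's rule for the intersection of the two constraint lines.
        x = ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - a2[0] * b1) / det)
        if all(r[0] * x[0] + r[1] * x[1] <= bi for r, bi in rows):
            if best is None or x[j] > best:
                best = x[j]
    return best


# Hypothetical stand-ins: A plays the role of M', B the role of MG.
A = [[Fraction(x) for x in row] for row in ([1, 1], [1, -1], [-1, 0], [0, -1])]
B = [[Fraction(x) for x in row] for row in ([1, 1], [2, 0], [0, 1], [1, 0])]


def g(u, j=0):
    """g(u) = max x_j subject to A x <= B u."""
    b = [sum(bik * uk for bik, uk in zip(row, u)) for row in B]
    return max_coordinate(A, b, j)


u = [Fraction(1), Fraction(2)]
base = g(u)
scaled = {c: g([c * ui for ui in u]) for c in (Fraction(1, 2), Fraction(3), Fraction(7, 4))}
homogeneous = all(value == c * base for c, value in scaled.items())
```

Scaling the right-hand side by $c > 0$ scales every vertex of the feasible polygon by $c$, which is the geometric counterpart of the change of variable $I' = (1/c) I$.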

\[ g(u) = \max_{C(L) \subseteq L(u)} L \tag{C.5} \]

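Proof. Notice that

\[ g(cu) = \max_{L \eta \leq c M G u} L \tag{C.6} \]

Let $L' = (1/c) L$ be a change of variable, then we can rewrite (C.6) as follows:

\[ g(cu) = c \max_{c L' \eta \leq c M G u} L' \tag{C.7} \]

or equivalently, because $c > 0$,

\[ g(cu) = c \max_{L' \eta \leq M G u} L' = c\,g(u) \tag{C.8} \]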

\[ g(u) = \max_{\substack{I \in L(u) \\ C(L) \subseteq L(u)}} \left( I_j + mL \right) \tag{C.9} \]

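Proof. Notice that

\[ g(cu) = \max_{\substack{M' I \leq c M G u \\ L \eta \leq c M G u}} \left( I_j + mL \right) \tag{C.10} \]

Let $L' = (1/c) L$ and $I' = (1/c) I$ be changes of variables, then we can rewrite (C.10) as follows:

\[ g(cu) = c \max_{\substack{c M' I' \leq c M G u \\ c L' \eta \leq c M G u}} \left( I'_j + mL' \right) \tag{C.11} \]

or equivalently, because $c > 0$,

\[ g(cu) = c \max_{\substack{M' I' \leq M G u \\ L' \eta \leq M G u}} \left( I'_j + mL' \right) = c\,g(u) \tag{C.12} \]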

Bibliography

[1] N. H. Abdul Ghani and F. N. Najm. Handling inductance in early power grid verification. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 127-134, San Jose, CA, November 5-9 2006.

[2] N. H. Abdul Ghani and F. N. Najm. Fast vectorless power grid verification using an approximate inverse technique. In ACM/IEEE 46th Design Automation Conference (DAC), pages 184-189, San Francisco, CA, July 26-31 2009.

[3] N. H. Abdul Ghani and F. N. Najm. Fast vectorless power grid verification under an RLC model. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 30(5):691-703, May 2011.

[4] N. H. Abdul Ghani and F. N. Najm. Fast vectorless power grid verification under an RLC model. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 30(5):691-703, May 2011.

[5] N. H. Abdul Ghani and F. N. Najm. Power grid verification using node and branch dominance. In ACM/IEEE 47th Design Automation Conference, pages 682-687, San Diego, CA, June 5-9 2011.

[6] Abhishek and F. N. Najm. Incremental power grid verification. In ACM/IEEE Design Automation Conference, pages 151-156, San Francisco, CA, June 3-7 2012.

[7] R. Ahmadi and F. N. Najm. Timing analysis in presence of power supply and ground voltage variations. In IEEE/ACM International Conference on Computer-Aided Design, pages 176-183, San Jose, CA, November 2003.

[8] E. Ajith and F. N. Najm. Failure mechanisms in semiconductor devices. J. Wiley, 1997.

[9] G. Bai, S. Bobba, and I. N. Hajj. Static timing analysis including power supply noise effect on propagation delay in VLSI circuits. In ACM/IEEE 38th Design Automation Conference (DAC), pages 295-300, Las Vegas, NV, June 18-22 2001.

[10] R. G. Bartle and D. R. Sherbert. Introduction to Real Analysis. Wiley, New York, NY,

[11] A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Science. Society for Industrial and Applied Mathematics, 1994.

[12] J. R. Black. Electromigration-A brief survey and some recent results. IEEE Trans. on Electronic devices, 16(4):338-347, Apr. 1969.

[13] I. A. Blech. Electromigration in thin aluminum of titanium nitride. Journal of Applied Physics, 47(4):1203-1208, 1976.

[14] I. A. Blech and C. Herring. Stress generation by electromigration. Applied Physics Letters, 29(3):131-133, 1976.

[15] I. A. Blech and K. L. Tai. Measurement of stress gradients generated by electromigration. Applied Physics Letters, 30(8):387-389, 1977.

[16] S. Chatterjee, M. Fawaz, and F. N. Najm. Redundancy-aware electromigration checking for mesh power grids. In IEEE/ACM International Conference on Computer-Aided Design, pages 540-547, San Jose, CA, Nov. 2013.

[17] S. Chatterjee, M. Fawaz, and F. N. Najm. Redundancy-aware power grid electromigration checking under workload uncertainties. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(9):1509-1522, Sept 2015.

[18] S. Chatterjee, V. Sukharev, and F. N. Najm. Fast physics-based electromigration assessment by efficient solution of linear time-invariant (LTI) systems. In Proceedings of the 36th International Conference on Computer-Aided Design (ICCAD-17), pages 659-666, 2017.

[19] S. Chatterjee, V. Sukharev, and F. N. Najm. Power grid electromigration checking using physics-based models. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Feb 2017.

[20] Chi-Tsong Chen. Linear System Theory and Design. Oxford University Press, Inc., New York, NY, USA, 3rd edition, 1998.

[21] Y. Chen et al. Algorithm 887: CHOLMOD, supernodal sparse cholesky factorization and update/downdate. ACM Transactions on Mathematical Software (TOMS), 35(3):22:1-22:14, 2008.

[22] S. Chowdhury and M. A. Breuer. Optimum design of IC power/ground nets subject to reliability constraints. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pages 787-796, July 1988.

[23] J. Choy, V. Sukharev, S. Chatterjee, F. N. Najm, A. Kteyan, and S. Moreau. Finite-difference methodology for full-chip electromigration analysis applied to 3d ic test structure: Simulation vs. experiment. In 2017 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD), pages 41-44, Sept 2017.

[24] A. K. Coskun et al. Static and dynamic temperature-aware scheduling for multiprocessors SoCs. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 16(9):1127-1140, September 2008.

[25] T. A. Davis. Suitesparse 4.4.6. [Online].

[26] J. E. Dennis, Jr. and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM, Philadelphia, PA, 1996.

[27] R. Dutta and M. Marek-Sadowska. Automatic sizing of power/ground (p/g) networks in VLSI. In 26th ACM/IEEE Design Automation Conference, pages 783-786, Las Vegas, NV, June 25-29 1989.

[28] M. Fawaz and F. N. Najm. Accurate verification of RC power grids. In Design, Automation, and Test in Europe (DATE), pages 814-817, Dresden, Germany, March 14-18, 2016.

[29] M. Fawaz and F. N. Najm. Fast vectorless RLC grid verification. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36(3):489-502, March 2017.

[30] I. A. Ferzli, E. Chiprout, and F. N. Najm. Verification and co-design of the package and die power delivery system using wavelets. In IEEE Conference on Electrical Performance of Electronic Packaging (EPEP), pages 7-10, San Jose, CA, October 27-29 2008.

[31] I. A. Ferzli, F. N. Najm, and L. Kruse. A geometric approach for early power grid verification using current constraints. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 40-47, San Jose, CA, November 5-8 2007.

[32] C. L. Gan, C. V. Thompson, K. L. Pey, and W. K. Choi. Experimental characterization and modeling of the reliability of three-terminal dual-damascene cu interconnect trees. Journal of Applied Physics, 94(2):1222-1228, 2003.

[33] M. Hauschildt, C. Hennesthal, G. Talut, O. Aubel, M. Gall, K. B. Yeap, and E. Zschech. Electromigration early failure void nucleation and growth phenomena in Cu and Cu(Mn) interconnects. In IEEE International Reliability Physics Symposium (IRPS), pages 2C.1.1-2C.1.6, April 2013.

[34] X. Huang, Y. Tan, V. Sukharev, and S. X. Tan. Physics-based electromigration assessment for power grid networks. In ACM/EDAC/IEEE Design Automation Conf., pages 1-6, June

[35] H. Jiang, M. Marek-Sadowska, and S. R. Nassif. Benefits and costs of power-gating technique. In International Conference on Computer Design, pages 559-566, Oct 2005.

[36] K. D. Joshi. Applied Discrete Structures. New Age International Pvt Ltd Publishers, 2001.

[37] H. Khdr et al. mDTM: Multi-objective dynamic thermal management for on-chip systems. In Proceedings of the Conference on Design, Automation & Test in Europe, pages 330:1-330:6, Dresden, Germany, March 24-28 2014.

[38] K. Killpack, S. Natarajan, and A. Krishnamachary. Case study on speed failure causes in a microprocessor. In IEEE Design and Test of Computers, pages 224-230, May/June 2008.

[39] M. A. Korhonen, P. Borgesen, K. N. Tu, and Che-Yu Li. Stress evolution due to electromigration in confined metal lines. Journal of Applied Physics, 73(8):3790-3799, 1993.

[40] D. Kouroussis and F. N. Najm. A static pattern-independent technique for power grid voltage integrity verification. In ACM/IEEE 40th Design Automation Conference (DAC-03), pages 99-104, Anaheim, CA, June 2-6 2003.

[41] J. N. Kozhaya, S. R. Nassif, and F. N. Najm. Multigrid-like technique for power grid analysis. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 480-487, San Jose, CA, November 4-8 2001.

[42] J. D. Lambert. Numerical Methods for Ordinary Differential Systems: The Initial Value Problem. John Wiley & Sons, Inc., New York, NY, 1991.

[43] K. Lee. Electromigration recovery and short lead effect under bipolar- and unipolar-pulse current. In IEEE Int. Rel. Phys. Symp., pages 6B.3.1-6B.3.4, April 2012.

[44] W.-H. Lee, S. Pant, and D. Blaauw. Analysis and reduction of on-chip inductance effects in power supply grids. In IEEE International Symposium on Quality Electronic Design (ISQED), pages 131-136, San Jose, CA, March 22-24 2004.

[45] B. Li, P. S. McLaughlin, J. P. Bickford, P. Habitz, D. Netrabile, and T. Sullivan. Statistical evaluation of electromigration reliability at chip level. IEEE Transactions on Device and Materials Reliability, 11(1):86-91, March 2011.

[46] J. Li et al. Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads. ACM Trans. Embed. Comput. Syst., 12(2):24:1-24:22, February

[47] J. R. Lloyd. Black's law revisited-nucleation and growth in electromigration failure. Microelectronics Reliability, 47(9-11):1468-1472, 2007.

[48] J. R. Lloyd and J. Kitchin. The electromigration failure distribution: The fine-line case. Journal of Applied Physics, 69(4):2117-2127, Feb 1991.

[49] MOSEK ApS. The MOSEK C Optimizer API Manual. Version 7.1 (Revision 28), 2015.

[50] Z. Moudallal and F. N. Najm. Generating circuit current constraints to guarantee power grid safety. In IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC), pages 358-365, Tokyo, Japan, January 19-22, 2015.

[51] A. Muramatsu, M. Hashimoto, and H. Ondera. Effects of on-chip inductance on power distribution grid. In ACM International Symposium on Physical Design, pages 63-69, San Francisco, CA, April 3-6 2005.

[52] F. N. Najm. Circuit Simulation. John Wiley & Sons, Inc., Hoboken, NJ, 2010.

[53] F. N. Najm. Overview of vectorless/early power grid verification. In Proceedings of the ACM/IEEE International Conference on Computer-Aided Design (ICCAD), pages 670-677, San Jose, CA, November 2012.

[54] S. R. Nassif. Power grid analysis benchmarks. In Proceedings of the 2008 Asia and South Pacific Design Automation Conference (ASPDAC), pages 376-381, Seoul, Korea, January 21-24 2008.

[55] F. Palacios-Gomez, L. Lasdon, and M. Engquist. Nonlinear optimization by successive linear programming. Management Science, 28(10):1106-1120, 1982.

[56] R. Panda, D. Blaauw, R. Chaudhry, V. Zolotov, B. Young, and R. Ramaraju. Model and analysis for combined package and on-chip power grid simulation. In Proceedings of the 2000 ACM/IEEE International Symposium on Low Power Electronics and Design, pages 179-184, Italy, July 26-27 2000.

[57] S. Pant, D. Blaauw, V. Zolotov, S. Sundareswaran, and R. Panda. Vectorless analysis of supply noise induced delay variation. In Proceedings of the 2003 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 184-191, San Jose, CA, November 9-13 2003.

[58] S. Pant, D. Blaauw, V. Zolotov, S. Sundareswaran, and R. Panda. A stochastic approach to power grid analysis. In ACM/IEEE Design Automation Conference, pages 171-176, San Diego, CA, June 7-11 2004.

[59] H. Quan, R. Nassif, and S. S. Sapatnekar. Random walks in a supply network. In Proceedings of the 40th ACM/IEEE Design Automation Conference (DAC), pages 93-98, Anaheim, CA, June 2-6 2003.

[60] A. M. Rahmani et al. The Dark Side of Silicon. Springer, Switzerland, 2016.

[61] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia, PA, 2003.

[62] S. Sapatnekar and H. Su. Analysis and optimization of power grids. IEEE Design & Test of Computers, pages 7-15, May-June 2003.

[63] M. C. Seiler and F. A. Seiler. Numerical recipes in C: The art of scientific computing. Risk Analysis, 9(3):415-416.

[64] N. Srivastava, X. Qi, and K. Banerjee. Impact of on-chip inductance on power distribution network design for nanometer scale integrated circuits. In IEEE International Symposium on Quality Electronic Design (ISQED), pages 341-351, San Jose, CA, March 21-23 2005.

[65] S. X.-D. Tan and C.-J. R. Shi. Efficient very large scale integration power/ground network sizing based on equivalent circuit modeling. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(3):277-284, March 2003.

[66] X.-D. Tan, C.-J. R. Shi, D. Lungeanu, J.-C. Lee, and L.-P. Yuan. Reliability-constrained area optimization of VLSI power/ground networks via sequence of linear programmings. In Proceedings of the 36th ACM/IEEE Design Automation Conference (DAC), pages 78-83, New Orleans, LA, June 21-25 1999.

[67] L. Ting, J. S. May, W. R. Hunter, and J. W. McPherson. AC electromigration characterization and modeling of multilayered interconnections. In International Reliability Physics Symposium (IRPS), pages 311-316, 1993.

[68] I. Vaisband, R. Jakushokas, M. Popovich, A. Mezhiba, S. Köse, and E. Friedman. On-Chip Power Delivery and Management. Springer, Cham, Switzerland, 2016.

[69] Richard S. Varga. Matrix Iterative Analysis. Prentice-Hall, Inc., Englewood Cliffs, NJ,

[70] Kai Wang and M. Marek-Sadowska. On-chip power-supply network optimization using multigrid-based technique. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pages 407-417, March 2005.

[71] Y. Wang, X. Hu, C.-K. Cheng, G.-K.-H. Pang, and N. Wong. A realistic early-stage power grid verification algorithm based on hierarchical constraints. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 31(1):109-120, January 2012.

[72] X. Xiong and J. Wang. Constraint abstraction for vectorless power grid verification. In Proceedings of the 50th ACM/EDAC/IEEE Design Automation Conference (DAC), pages 1-6, Austin, TX, June 2013.

[73] T. Xu and P. Li. Design and optimization of power gating for dvfs applications. In International Symposium on Quality Electronic Design (ISQED), pages 391-397, March 2012.

[74] T. Xu, P. Li, and B. Yan. Decoupling for power gating: Sources of power noise and design strategies. In Proceedings of the 48th Design Automation Conference, pages 1002-1007, June 2011.

[75] A. Yassine and F. N. Najm. A fast layer elimination approach for power grid reduction. In IEEE International Conference on Computer-Aided Design (ICCAD), pages 101:1-101:8, Austin, TX, November 7-10 2016.

[76] Z. Zeng, Z. Feng, and P. Li. Efficient checking of power delivery integrity for power gating. In 2011 IEEE 12th International Symposium on Quality Electronic Design (ISQED), pages 1-8, Santa Clara, CA, March 14-16 2011.

[77] M. Zhao, R. Panda, S. Sapatnekar, and D. Blaauw. Hierarchical analysis of power distribution networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 21(2):159-168, February 2002.

[78] H. Zhu et al. Efficient transient analysis of power delivery network with clock/power gating by sparse approximation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(3):409-421, March 2015.
