Energy Recovery from High-frequency Clocks using DC-DC Converters
Mehdi Alimadadi, Samad Sheikhaei, Guy Lemieux, Shahriar Mirabbasi, William Dunford (University of British Columbia, Canada); Patrick Palmer (University of Cambridge, UK)

Dec 19, 2015


Mehdi Alimadadi, Samad Sheikhaei,

Guy Lemieux, Shahriar Mirabbasi, William Dunford

University of British Columbia, Canada

Patrick Palmer

University of Cambridge, UK

Energy Recovery

from High-frequency Clocks

using DC-DC Converters


Problem

Clock power in high-performance CPUs

CPU                      Year  Clock    Power    % Power for Clock  Clock Power
Intel McKinley (180nm)   2002  1 GHz    130W     33%                43W
Intel Montecito (90nm)   2005  2.5 GHz  85W      30%                25W
IBM Power 6 (65nm)       2007  5 GHz    > 100W   22%                > 22W

• Cause
  – Charge big clock capacitor Cclk with energy
  – Discharge Cclk energy to GND (WASTE IT!!)
  – Repeat every clock cycle
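The charge-and-dump cycle above can be put into numbers with the standard switching-power relation P = C·V²·f. The sketch below is only an order-of-magnitude illustration: the function name and the operating point (20pF, 1V, 3GHz) are assumptions chosen for the example, not measurements reported in these slides.

```python
def clock_power_watts(c_clk_farads, vdd_volts, f_clk_hz):
    """Supply power when Cclk is charged to Vdd and dumped to GND every cycle."""
    # Energy drawn from the supply per full charge/discharge cycle (joules).
    energy_per_cycle = c_clk_farads * vdd_volts ** 2
    return energy_per_cycle * f_clk_hz


# Hypothetical operating point (assumed values, for illustration only).
p = clock_power_watts(c_clk_farads=20e-12, vdd_volts=1.0, f_clk_hz=3e9)
print(f"Clock power: {p * 1e3:.0f} mW")
```

All of this energy ends up as heat in the driver transistors, which is the waste the rest of the talk sets out to recover.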


Primary Contribution of This Work

• Primary contribution
  – Discharge Cclk using DC-DC converter instead of GND
    • Use converter to power useful load (Rload)
    • Integrated clock drivers with DC-DC converters
    • Net savings in power

[Figure: clock driver integrated with a DC-DC converter feeding a useful load, with voltage feedback (for regulation)]


Summary Results

• Explore 3 main DC-DC power converter topologies
  – Buck converter: our previous work [ISSCC 2007]
  – Boost converter: this paper [ISVLSI 2008]
  – Buck-boost converter: this paper [ISVLSI 2008]
• 90nm layouts, 3GHz operation, < 0.3mm²

Converter                    Clock-only power  Extra power to operate  Converter output  % clock energy
                             (input)           converter (input)       power             recovered
Buck converter [ISSCC 2007]  40mW              16mW                    26mW              50%
Boost converter              100mW             25mW                    28mW              20%
Buck-boost converter         100mW             72mW                    48mW              30%


Background


Background – Typical Clocking Architecture

[Figure: typical clock distribution network; labels: clock source, Level 1 & Level 2 H-tree, Level 3 gaters & final drivers, final H-tree, bottom mesh]


Background – Typical Clocking Architecture

• Clock distribution
  – Majority of energy used by final drivers
  – Levels 1, 2
    • H-trees
    • Tunable delays (CVDs) to eliminate skew
    • Low-swing, differential: low power, noise immunity
    • ~5W of power
  – Level 3
    • Gaters reduce clock activity 50-85% (Power6)
    • Can't eliminate all activity: still need a clock to compute
• Final clock drivers
  – Full-rail swing: tapered inverters drive hundreds of latches, high power
  – H-tree with ends shorted by mesh: low skew, high power
  – ~15W to 40W of power


Background – Reducing Clock Power

• Clock distribution
  – Low-swing (differential) signals
    • Final drivers need full-rail
  – Resonant clocking (saves 80%)
    • Final drivers need square clock
• Final clock drivers
  – Adiabatic switching
    • Low-performance, < 100MHz
  – Double-edge clocking
    • Feasible, but complex flip-flops, larger loads
    • Compatible with energy recovery in this paper


Background – Switch Mode Power Supplies

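• Basic DC-DC converter topologies
  – Buck
    • Step down
    • 0 ≤ Vout ≤ VDD
  – Boost
    • Step up
    • VDD ≤ Vout
  – Buck-boost
    • Negative step up/down
    • Vout ≤ 0

[Figure: buck, boost, and buck-boost converter schematics, each built from switch S, diode D, inductor LF, capacitor CF, and load RL]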
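The three output ranges listed above follow from the textbook ideal conversion ratios for lossless converters in continuous conduction with a DC input. The sketch below uses those standard relations; the function names are illustrative, and these are not the clock-fed zeroth-order results quoted later for the boost and buck-boost designs in this talk.

```python
def _check_duty(d):
    # Duty cycle must lie in [0, 1); d = 1 would make boost ratios blow up.
    if not 0.0 <= d < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= D < 1")


def buck_vout(vdd, d):
    """Ideal buck: steps down, 0 <= Vout <= VDD."""
    _check_duty(d)
    return d * vdd


def boost_vout(vdd, d):
    """Ideal boost: steps up, Vout >= VDD."""
    _check_duty(d)
    return vdd / (1.0 - d)


def buck_boost_vout(vdd, d):
    """Ideal inverting buck-boost: Vout <= 0, magnitude above or below VDD."""
    _check_duty(d)
    return -d * vdd / (1.0 - d)


for d in (0.25, 0.5, 0.75):
    print(d, buck_vout(1.0, d), boost_vout(1.0, d), buck_boost_vout(1.0, d))
```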


Background – Switch Mode Power Supplies

• DC-DC buck converter
  – CMOS inverter as power switches
• Implementation of zero-voltage switching (ZVS)
  – Turn on NMOS when Vinv = 0
  – Turn on PMOS when Vinv = Vdd

[Figure: buck converter drawn with switch S and diode D, and again with a CMOS inverter as the power switches; labels Vin, Vdd, Vgate, Vinv, IL, L, C, R, Vout]


Background

ISSCC 2007 Design

• ZVS delay circuit
• Integrated clock driver / power converter


Integration of Clock and SMPS

• CPU clock: 3GHz clock and large Cclk

• SMPS: large Mp, Mn drive chain


Integration of Clock and SMPS

• Combine the driver circuits

[Figure: combined clock driver and SMPS; CLK in drives Mp and Mn through a shared chain; labels Vclk, Cclk, Lf, Cf, Rload, Vout]


Key Concept: Energy Recycling

• Benefits
  – Shared driver chain
  – Cclk added to SMPS
• Red path
  – NMOS drains Cclk: wastes charge!
• Blue path
  – Delay NMOS turn-on: recovers clock charge!
  – ZVS (zero voltage switching) in power electronics

[Figure: integrated clock driver and converter with the red and blue discharge paths marked; labels CLK in, Vclk, Cclk, Lf, Cf, Rload, Vout]


ZVS Detailed Operation

• ZVS delay circuit
  – Delay only rising edge of Vn
  – Implemented inside the clock chain

[Figure: output stage with Mp (gate Vp) and Mn (gate Vn) between Vdd and GND; labels Vclk, Cclk, Lf, Cf, Rload, Vout]


ZVS Detailed Operation (Mode 1)

• Mode 1 (0 < t < DTsw)

– Mp is ON

– Current builds up in the inductor

– Cclk charges up

[Figure: output stage in Mode 1; labels Vdd, GND, Mp, Mn, Vp, Vn, Vclk, Cclk, Lf, Cf, Rload, Vout]

D = Duty cycle

Tsw = Switching period


ZVS Detailed Operation (Mode 2)

• Mode 2 (DTsw < t < DTsw+Tzvs)
  – Both power transistors are OFF
  – Inductor current discharges Cclk
  – Cclk charge is recycled to output load

[Figure: output stage in Mode 2; labels Vdd, GND, Mp, Mn, Vp, Vn, Vclk, Cclk, Lf, Cf, Rload, Vout]

D = Duty cycle

Tsw = Period

Tzvs = ZVS delay
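A first-order feel for the ZVS delay comes from treating the inductor current as constant while it discharges Cclk from Vdd to 0, which gives Tzvs ≈ Cclk·Vdd / I_L. This is a simplifying assumption introduced here, not a formula from the slides, and the inductor current used below is a hypothetical value.

```python
def zvs_delay_seconds(c_clk_farads, vdd_volts, i_inductor_amps):
    """Time for a constant current to discharge Cclk from Vdd to 0 (t = C*V/I)."""
    if i_inductor_amps <= 0:
        raise ValueError("inductor current must be positive to discharge Cclk")
    return c_clk_farads * vdd_volts / i_inductor_amps


# Cclk = 21pF matches the later schematics; Vdd = 1V and I_L = 0.2A are assumed.
t_zvs = zvs_delay_seconds(21e-12, 1.0, 0.2)
print(f"Tzvs is roughly {t_zvs * 1e12:.0f} ps")
```

If the NMOS turns on much earlier than this, the remaining clock charge is dumped to GND instead of being delivered to the load.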


ZVS Detailed Operation (Mode 3)

• Mode 3 (DTsw+Tzvs < t < Tsw)
  – Mn turns ON when Vclk reaches 0
    • ZVS for Mn
  – Inductor current decreases linearly

[Figure: output stage in Mode 3; labels Vdd, GND, Mp, Mn, Vp, Vn, Vclk, Cclk, Lf, Cf, Rload, Vout]

D = Duty cycle

Tsw = Period

Tzvs = ZVS delay
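The three operating modes partition one switching period by time. A minimal sketch of that partition follows; the function name is hypothetical, and assigning the exact boundary instants to the later mode is a choice made here, since the slides use strict inequalities.

```python
def zvs_mode(t, d, t_sw, t_zvs):
    """Return 1, 2, or 3 for the operating mode at time t.

    Mode 1: 0 <= t < D*Tsw               (Mp ON, Cclk charges)
    Mode 2: D*Tsw <= t < D*Tsw + Tzvs    (both OFF, inductor discharges Cclk)
    Mode 3: D*Tsw + Tzvs <= t < Tsw      (Mn ON with zero voltage across it)
    """
    if not 0.0 < d < 1.0:
        raise ValueError("duty cycle must satisfy 0 < D < 1")
    if t_zvs < 0 or d * t_sw + t_zvs > t_sw:
        raise ValueError("ZVS delay does not fit inside the switching period")
    t = t % t_sw  # fold any time into a single period
    if t < d * t_sw:
        return 1
    if t < d * t_sw + t_zvs:
        return 2
    return 3


# Example with a normalized period: D = 50%, Tzvs = 10% of Tsw (assumed values).
print([zvs_mode(t, 0.5, 1.0, 0.1) for t in (0.2, 0.55, 0.8)])
```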


Detailed Operation

• ZVS delay circuit for Mn

– Delay rising edge of Vn

[Figure: ZVS delay circuit (M1 to M4) generating Vn for Mn, with signal transitions numbered 1 to 4; labels Vdd, GND, Mp, Mn, Vp, Vn, Vm, Vclk, Cclk, Lf, Cf, Rload, Vout]


Detailed Operation

• ZVS delay circuit for Mn
  – Falling edges of Vp and Vn are synchronized

[Figure: ZVS delay circuit (M1 to M4) generating Vn for Mn, with signal transitions numbered 1 and 2; labels Vdd, GND, Mp, Mn, Vp, Vn, Vm, Vclk, Cclk, Lf, Cf, Rload, Vout]


Simulation Voltages

[Figure: simulated voltage (V) vs. time (ns) over 0 to 1 ns; traces Vclk, Vclk-ref, Vload; shown next to the ZVS delay circuit schematic]


Simulation Currents

[Figure: simulated voltage (V) vs. time (ns) over 0 to 1 ns; traces Vclk, Vclk-ref, Vload; shown next to the ZVS delay circuit schematic]

[Figure: simulated current (axis labeled mA) vs. time (ns) over 0 to 1 ns; traces Lf, Mn, Mp]


Effective Efficiency

• How to measure power efficiency after clock drivers are integrated with DC-DC converters?
  – Converter gets “free energy” from clock
  – Effective efficiency: how efficient a regular (stand-alone) power converter must be to equal the efficiency of the integrated clock/power converter

Raw efficiency:

$$\eta_{raw} = \frac{P_{out}}{P_{in1}} \times 100$$

Effective efficiency:

$$\eta_{effective} = \frac{P_{out}}{P_{in1} - P_{in2}} \times 100$$

[Figure: for raw efficiency, the integrated clock driver and power converter (or a stand-alone power converter) is one block with input Pin1 and output Pout. For effective efficiency, a dummy clock driver with input Pin2 represents the clock driver portion, so the power converter portion is charged only Pin1 – Pin2; recycled energy is not counted as input power]
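To see how the two definitions behave, the sketch below evaluates them on the Summary Results figures. It assumes that Pin2 is the "clock-only power" column and that Pin1 is clock-only power plus the "extra power to operate converter" column; that mapping is a reading of the table, not something the slides state explicitly.

```python
def raw_efficiency(p_out, p_in1):
    """Raw efficiency in percent: all input power is charged to the converter."""
    return 100.0 * p_out / p_in1


def effective_efficiency(p_out, p_in1, p_in2):
    """Effective efficiency in percent: clock-driver input Pin2 is not charged."""
    return 100.0 * p_out / (p_in1 - p_in2)


# (clock-only mW, extra converter mW, output mW) from the Summary Results table.
summary = {
    "buck": (40.0, 16.0, 26.0),
    "boost": (100.0, 25.0, 28.0),
    "buck-boost": (100.0, 72.0, 48.0),
}

for name, (p_clock, p_extra, p_out) in summary.items():
    p_in1 = p_clock + p_extra  # assumed: total input of the integrated design
    p_in2 = p_clock            # assumed: input of the dummy clock driver
    print(f"{name}: raw {raw_efficiency(p_out, p_in1):.1f}%, "
          f"effective {effective_efficiency(p_out, p_in1, p_in2):.1f}%")
```

An effective efficiency above 100% is possible because the recycled clock energy is delivered to the load without being counted as converter input.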


Buck Converter – Simulation Results

[Figure: effective efficiency (%) vs. Iout (mA), 40 to 100 mA; curves for D = 30%, 40%, 50%, 60%, 70%]

[Figure: Vout (V) vs. duty ratio (%), 10% to 80%; curves for Iout = 30, 50, 70, 100]

• Open loop converter (no regulation)
  – Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk


ISSCC 2007

• 90nm test chip 1mm², buck converter 0.27mm²


Buck Converter – Chip Measurement vs. Simulation Results

[Figure: effective efficiency (%) vs. Iout (mA). Panel titles: Chip Measurement; Simulation (3GHz); Fsw Sweep (D=50%). Curve families: D = 30%, 40%, 50%, 60%, 70%, and, for the Fsw sweep, 2GHz, 2.5GHz, 3GHz, 3.5GHz]


ISVLSI 2008: New Design 1

Boost Converter


Boost Converter

• Basic operation
  – Vclk provides power & timing
• 0th order result… Vout = D/(1-D)*Vdd


Boost Converter

[Figure: boost converter schematic with transistor W/L sizes. Recoverable labels: Cclk = Cshift = 21pF (clock load capacitance), CF = 378pF, LF = 310pH, 2.2pF, 1kΩ, Cclk_scaled; transistors Mp1 to Mp3 and Mn1 to Mn3; diode Dshift; nodes Vpulse, Vclk, Vclk_scaled, Vshift, Vout, VDD; current ILf]


Boost Converter – Simulation Results

• Open loop converter (no regulation)
  – Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk

[Figure: Vout (V) vs. duty ratio (%), 30% to 80%; curves for Iout = 10mA, 30mA, 50mA, 70mA, 100mA]

[Figure: effective efficiency (%) vs. Iout (mA), 0 to 100 mA; curves for D = 40%, 50%, 60%, 70%, 80%]


ISVLSI 2008: New Design 2

Buck-boost Converter


Buck-boost Converter

• Basic operation
  – Vclk provides power & timing
• 0th order result… Vout = -D^2/(1-D)*Vdd
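The zeroth-order results quoted for the two new designs can be tabulated side by side. The sketch below takes the boost expression D/(1-D)*Vdd as given on the boost slide and reads the "D2" in the buck-boost expression as D squared, which is an assumption about a lost superscript. Function names and the Vdd value are illustrative; losses and load current are ignored.

```python
def boost_vout_zeroth_order(vdd, d):
    """Clock-fed boost, zeroth-order result from the slides: D/(1-D)*Vdd."""
    if not 0.0 <= d < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= D < 1")
    return d / (1.0 - d) * vdd


def buck_boost_vout_zeroth_order(vdd, d):
    """Clock-fed buck-boost, assuming the garbled 'D2' means D squared."""
    if not 0.0 <= d < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= D < 1")
    return -(d ** 2) / (1.0 - d) * vdd


vdd = 1.0  # assumed supply voltage for illustration
for d in (0.3, 0.5, 0.7):
    print(f"D={d:.0%}: boost {boost_vout_zeroth_order(vdd, d):+.2f} V, "
          f"buck-boost {buck_boost_vout_zeroth_order(vdd, d):+.2f} V")
```

Both expressions equal the textbook ratio multiplied by an extra factor of D, consistent with the converter being fed from a clock node whose average value is D*Vdd rather than from a DC rail.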


Buck-boost Converter

[Figure: buck-boost converter schematic with transistor W/L sizes. Recoverable labels: Cclk = 21pF (clock load capacitance), Cshift = 21pF, CF = 356pF, LF = 310pH, Cbias, 1kΩ, 10.4kΩ, three diodes in series (each 128/0.1), deep N-well structures; transistors Mp1 to Mp5 and Mn1 to Mn3; diode Dshift; nodes Vpulse, Vinv, Vclk, Vshift, Vbias, Vout, VDD; current ILf]


Buck-boost Converter

[Figure: Vout (V), from -2 to 0, vs. duty ratio (%), 10% to 70%; curves for Iout = 10mA, 30mA, 50mA, 70mA, 90mA]

• Open loop converter (no regulation)
  – Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk

[Figure: effective efficiency (%) vs. Iout (mA), 0 to 100 mA; curves for D = 20%, 30%, 40%, 50%, 60%, 70%]


Results and Comparisons


Summary Results

• 90nm layouts, 3GHz operation, < 0.3mm²

Converter                    Clock-only power  Extra power to operate  Converter output  % clock energy
                             (input)           converter (input)       power             recovered
Buck converter [ISSCC 2007]  40mW              16mW                    26mW              50%
Boost converter              100mW             25mW                    28mW              20%
Buck-boost converter         100mW             72mW                    48mW              30%


Comparative Results

• IBM Power6: 100W @ 1V, 341mm², Cclk = 13pF/mm²
• Other work: fully on-chip DC-DC buck converter
  – S. Abedinpour, B. Bakkaloglu, and S. Kiaei, "A Multi-Stage Interleaved Synchronous Buck Converter with Integrated Output Filter in a 0.18µm SiGe Process," ISSCC 2006, pp. 356–357
  – 27mm², 45MHz
  – 65% power efficiency
• This work
  – 0.27, 0.26, 0.20 mm², including 0.1mm² inductor area, 3GHz
    • Cclk ≈ 20pF, equivalent to 1.6mm² of Power6 area
    • DC-DC converter adds 12.5% area overhead
  – LC filter: 310pH inductor, 350pF capacitor
    • L and C are similar in size and dominate layout area; can stack to cut area in half
  – Buck: 75 – 185% effective power efficiency (50% recovered)
  – Boost: 25 – 110% effective power efficiency (20% recovered)
  – Buck-boost: 20 – 66% effective power efficiency (30% recovered)
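The area figures above can be cross-checked with a few lines of arithmetic. The sketch below is a plausibility check under stated assumptions: it derives the Power6 clock capacitance density from the ">22W at 5GHz" clock power in the Problem table using P = C·V²·f, pairs the 12.5% overhead with the 0.20mm² converter, and adds the standard LC corner frequency 1/(2π√(LC)) for the quoted filter values. None of these pairings are spelled out in the slides, and all function names are illustrative.

```python
import math


def cclk_density_pf_per_mm2(p_clock_w, vdd_v, f_hz, die_area_mm2):
    """Clock capacitance per unit area implied by P = C * V^2 * f."""
    c_total_farads = p_clock_w / (vdd_v ** 2 * f_hz)
    return c_total_farads / die_area_mm2 * 1e12


def area_overhead_percent(converter_area_mm2, clocked_area_mm2):
    """Converter area as a percentage of the die area whose clock it drives."""
    return 100.0 * converter_area_mm2 / clocked_area_mm2


def lc_corner_hz(l_henries, c_farads):
    """Corner (resonant) frequency of a second-order LC filter."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))


density = cclk_density_pf_per_mm2(22.0, 1.0, 5e9, 341.0)  # Power6 figures
print(f"Cclk density: {density:.1f} pF/mm^2")              # slide quotes 13
print(f"Area for 20pF: {20.0 / density:.2f} mm^2")         # slide quotes 1.6
print(f"Overhead: {area_overhead_percent(0.20, 1.6):.1f}%")
print(f"LC corner: {lc_corner_hz(310e-12, 350e-12) / 1e6:.0f} MHz")
```

The corner frequency lands well below the 3GHz switching frequency, which is what lets such a small on-chip filter smooth the output.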


Conclusion

• Key concepts
  – High switching frequency saves area
  – Combined drivers save area and switching loss
  – Recycled charge: converter load discharges Cclk
  – ZVS delay circuit: lower power loss
• Limitations
  – Regulation needs variable duty cycle clock
    • May introduce additional clock jitter
    • Mostly suitable for edge-triggered blocks (no latches)
• Future work
  – Lots of improvements to make!


Thank you!

Questions?