1 Sun Labs, Sun Microsystems, Inc. R. Ho - 2006 Interconnection technologies Ron Ho VLSI Research Group Sun Microsystems Laboratories

Jul 22, 2018


Interconnection technologies
Ron Ho
VLSI Research Group
Sun Microsystems Laboratories


Acknowledgements
• Many contributors to the work described here
> Robert Drost, David Hopkins, Alex Chow, Tarik Ono, Jo Ebergen
> … and the entire Sun Labs VLSI Research Group
> Danny Cohen


Multicore chips today
• Niagara: 8 cores
> 32 threads (4 threads/core), 1 shared FP
> 90nm CMOS, 1.2GHz, 63W, 380mm²
> High-volume production now
• Niagara2: 8 (improved) cores
> 64 threads (8 threads/core), 1 FP per core
> 65nm CMOS, 342mm²
> Shipping systems in 2H07
Interconnect networks: an 8x9 xbar


Multicore chips tomorrow?
• How do you scale up to tens or hundreds of cores?
> …Assuming you want to (more total power for high performance)
> …And on a single chip: avoid the costs of chip-to-chip IO
    > Bumps are big (180μm), links are power-hungry (20pJ/bit = 20mW/Gbps)
• You need to do some combination of:
> Making the cores very small
    > But you lose functionality: what kind of programming models do you want?
> Making the die very big
    > But you lose the yield/cost battle very quickly

[Figure: cost per square mm (5 to 35) versus die size in square mm (100 to 600). Source: 2006 IDF]
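The yield/cost trade-off can be illustrated numerically. The following is a minimal sketch, assuming a simple Poisson yield model and an illustrative defect density; neither comes from the slides, and the function names are hypothetical.

```python
import math

# Assumed defect density in defects per square mm (illustrative, not from the talk).
DEFECTS_PER_MM2 = 0.005


def poisson_yield(die_area_mm2, defects_per_mm2=DEFECTS_PER_MM2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-defects_per_mm2 * die_area_mm2)


def relative_cost_per_good_mm2(die_area_mm2, defects_per_mm2=DEFECTS_PER_MM2):
    """Cost of one good square mm, relative to defect-free silicon.

    Every fabricated square mm costs the same, but only the yielding
    fraction is usable, so the cost scales as 1 / yield.
    """
    return 1.0 / poisson_yield(die_area_mm2, defects_per_mm2)


if __name__ == "__main__":
    for area in (100, 200, 300, 400, 500, 600):
        print(f"{area} mm2: yield {poisson_yield(area):.2f}, "
              f"relative cost {relative_cost_per_good_mm2(area):.1f}x")
```

Under this assumed model the cost per good square mm grows exponentially with die area, which is one way to read "you lose the yield/cost battle very quickly".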


A different direction…
• We can rethink the “single-chip” requirement…
> If we reduce (eliminate) the costs of chip-to-chip communication
    > Power, area (bandwidth density), latency, known-good-die
• Break a multi-core chip into a many-chip system
> Smaller chips lead to higher yields and lower cost
> Different chips lead to system adaptability and reconfigurability
> Aggregate systems of chips effectively break the reticle limit
• Enable a broad set of interconnection explorations


Proximity I/O
• Pioneered by Ivan Sutherland at Sun Labs
• The big idea:
[Figure: Chip1, Chip2, and Chip3 with Transmit and Receive pads; annotation: "Not a soldered connection!" Source: Drost, Sun, CICC 2003]


Proximity I/O benefits
• Bandwidth density 60x-100x greater than balls
• Much lower power
> No ESD required
> Use wide parallel links instead of narrow SerDes links
> Very small Tx and Rx circuits
[Figure: area bump bonds (labeled 200um and 180um) compared with Proximity I/O pads (labeled 20um)]
Source: Drost, Sun, CICC 2003
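The 60x-100x figure is consistent with simple pitch arithmetic. The sketch below is a rough check, assuming the 200um and 180um labels are bump pitches, the 20um label is the pad pitch, connections sit on a square grid, and each connection carries the same data rate. These are assumptions for illustration, and the function names are hypothetical.

```python
def pads_per_mm2(pitch_um):
    """Connections per square mm for a square grid at the given pitch."""
    pads_per_mm = 1000.0 / pitch_um
    return pads_per_mm ** 2


def density_gain(bump_pitch_um, pad_pitch_um):
    """How many more connections fit per unit area with the finer pitch."""
    return pads_per_mm2(pad_pitch_um) / pads_per_mm2(bump_pitch_um)


if __name__ == "__main__":
    for bump_pitch in (180.0, 200.0):
        print(f"{bump_pitch:.0f}um bumps vs 20um pads: "
              f"{density_gain(bump_pitch, 20.0):.0f}x more connections per mm2")
```

Because density scales with the square of the pitch ratio, a 9x to 10x finer pitch gives roughly 80x to 100x more connections per unit area.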


Proximity I/O challenges
• Misalignment is the proverbial monkey’s wrench
> Initial imperfect chip placement
> Dynamic chip movement from vibration or thermal expansion
• A robust solution is (at least) two-pronged
> Combination of specialized packaging and custom electronics
> Here, discuss some of the electronic/circuit strategy
[Figure: Chip1 and Chip2 with six misalignment axes: X, Y, Z, θ, Φ, Ψ]


Fixing misalignment
• Detect misalignment using Vernier-like arrays
> Measure capacitance between chips to sub-fF resolution
• Correct in-plane misalignment using data steering

Source: Hopkins, Sun, ISSCC 2007
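The detect-then-steer idea can be sketched with a toy model. This is not the circuit from the cited paper: it assumes a one-dimensional Vernier array in which bar k on the far chip is drawn k steps further along than its partner, that capacitance is proportional to bar overlap, and that steering shifts data by a whole number of micro-pads. All names and dimensions are hypothetical.

```python
def vernier_capacitances(offset_um, n_bars=11, step_um=1.0, bar_width_um=4.0):
    """Toy capacitance readings (arbitrary units) for a Vernier bar array.

    Bar k on the far chip is drawn k * step_um away from its partner, so the
    pair whose built-in shift best cancels the chip offset overlaps the most.
    Only non-negative offsets are modeled here.
    """
    caps = []
    for k in range(n_bars):
        mismatch_um = abs(k * step_um - offset_um)
        caps.append(max(0.0, bar_width_um - mismatch_um))
    return caps


def estimate_offset(caps, step_um=1.0):
    """Estimate the offset as the built-in shift of the best-coupled bar pair."""
    best = max(range(len(caps)), key=lambda k: caps[k])
    return best * step_um


def steering_shift(offset_um, micropad_pitch_um):
    """Whole number of micro-pads by which to shift the data (assumed scheme)."""
    return round(offset_um / micropad_pitch_um)


if __name__ == "__main__":
    readings = vernier_capacitances(offset_um=3.2)
    measured = estimate_offset(readings)
    print("estimated offset (um):", measured)
    print("steer by micro-pads:", steering_shift(measured, micropad_pitch_um=1.5))
```

In this toy model the measurement resolution equals the Vernier step, and the residual error after steering is at most half a micro-pad pitch.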


Dealing with noise
• Receivers are differential sense-amplifiers
> “Butterfly” scheme rejects noise (receivers are offset-limited)
> We employ a clock; the clock link itself uses unclocked receivers with larger pads

Source: Hopkins, Sun, ISSCC 2007


Silicon measurements
• 144b Proximity I/O datapath, 180nm TSMC chip
> 1.8Gbps per pin for 260Gbps in 0.5mm²
> Measured BER < 10^-15
> 3pJ/bit at a 1.8V power supply
> >0.7UI timing margin, >200mV voltage margin at speed
> Measured sensitivity to chip separation

Source: Hopkins, Sun, ISSCC 2007
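The headline numbers can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the slides (144 bits, 1.8Gbps per pin, 0.5mm², 3pJ/bit, and the 20pJ/bit quoted earlier for conventional links); the function names are hypothetical, and the power comparison assumes both link types run at the same aggregate rate.

```python
PINS = 144
GBPS_PER_PIN = 1.8
AREA_MM2 = 0.5
PROXIMITY_PJ_PER_BIT = 3.0
CONVENTIONAL_PJ_PER_BIT = 20.0  # "20pJ/bit = 20mW/Gbps" from the earlier slide


def aggregate_gbps(pins=PINS, gbps_per_pin=GBPS_PER_PIN):
    """Total datapath bandwidth in Gbps."""
    return pins * gbps_per_pin


def density_gbps_per_mm2(total_gbps, area_mm2=AREA_MM2):
    """Bandwidth per unit of pad area."""
    return total_gbps / area_mm2


def link_power_mw(total_gbps, pj_per_bit):
    """Link power in mW; 1 pJ/bit at 1 Gbps is exactly 1 mW."""
    return total_gbps * pj_per_bit


if __name__ == "__main__":
    total = aggregate_gbps()
    print(f"aggregate: {total:.1f} Gbps")
    print(f"density:   {density_gbps_per_mm2(total):.1f} Gbps/mm2")
    print(f"power at 3 pJ/bit:  {link_power_mw(total, PROXIMITY_PJ_PER_BIT):.1f} mW")
    print(f"power at 20 pJ/bit: {link_power_mw(total, CONVENTIONAL_PJ_PER_BIT):.1f} mW")
```

The product 144 × 1.8Gbps is 259.2Gbps, which matches the rounded 260Gbps on the slide.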


Proximity I/O as an enabler
• Low-cost chip-to-chip communication
> Off-chip I/O looks like an extension of on-chip wires
> Many chips look like a (big) single chip
> What kinds of interconnect networks should we consider?
[Figure: Proximity I/O at the overlap]


Proximity I/O-based grids
• Another, perhaps more realistic grid
> All big (high-power) chips are face-up on a cold plate
> Face-down chips merely transmit data
• Natural extension to various network topologies


A red flag?
• Such a system will have lots of VLSI wires
> The entire interconnection network consists of on-chip wires
• Latency and bandwidth characteristics well-known
> …And so are the energy costs: E=C*V*ΔV
    > Cap is about 0.45pF/mm (incl. repeaters), and not really scaling
    > Voltage is about 1V, and not really scaling
> Therefore, energy is about 0.45pJ/bit/mm of linear length
> So 100 64b buses at 4GHz, 10% activity over 20mm: 24W!

• Need an efficient wiring system
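The wire-energy estimate above can be reproduced step by step. This is a minimal sketch using the slide's own figures and the E=C*V*ΔV relation with a full-rail swing (ΔV = V); the function names are hypothetical. With these exact inputs the product comes to roughly 23W, in line with the slide's rounded 24W.

```python
CAP_PF_PER_MM = 0.45  # includes repeaters, per the slide
VDD_VOLTS = 1.0


def energy_pj_per_bit_mm(cap_pf_per_mm=CAP_PF_PER_MM, v=VDD_VOLTS, swing=VDD_VOLTS):
    """E = C * V * dV per mm of wire; pF times V squared gives pJ."""
    return cap_pf_per_mm * v * swing


def bus_power_watts(n_buses, bits_per_bus, freq_hz, activity, length_mm,
                    pj_per_bit_mm):
    """Total switching power for a set of parallel buses."""
    transitions_per_second = n_buses * bits_per_bus * freq_hz * activity
    joules_per_transition = pj_per_bit_mm * 1e-12 * length_mm
    return transitions_per_second * joules_per_transition


if __name__ == "__main__":
    e = energy_pj_per_bit_mm()
    p = bus_power_watts(n_buses=100, bits_per_bus=64, freq_hz=4e9,
                        activity=0.10, length_mm=20, pj_per_bit_mm=e)
    print(f"{e:.2f} pJ/bit/mm, {p:.1f} W")
```

Lowering the swing term is the only knob in this expression that does not require changing the supply or the wire capacitance.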


Efficient capacitively-driven wires

• Reduced power: low voltage swing on wires
> Swing is Vdd/(n+1) without requiring a second power supply
• Reduced power: small load seen by driver
> Allows use of a 1/n-th sized driver, saving power and area
• Reduced latency: capacitor pre-emphasizes edges
> Also, charge distribution effectively cuts wire delay in half
[Circuit labels: Rwire, Cwire, Cload, Cc]
Cc = (Cwire + Cload)/n, where n = 10 - 20
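The swing and load claims follow from treating Cc and (Cwire + Cload) as a capacitive divider. The sketch below works through that arithmetic, ignoring wire resistance and any other parasitics; the function names and the example values are illustrative assumptions.

```python
def coupling_cap(c_wire, c_load, n):
    """Cc = (Cwire + Cload) / n, as defined on the slide."""
    return (c_wire + c_load) / n


def wire_swing(vdd, n):
    """Divider ratio Cc / (Cc + Cwire + Cload) reduces to 1 / (n + 1)."""
    return vdd / (n + 1)


def driver_load(c_wire, c_load, n):
    """Capacitance seen by the driver: Cc in series with (Cwire + Cload)."""
    c_total = c_wire + c_load
    c_c = coupling_cap(c_wire, c_load, n)
    return c_c * c_total / (c_c + c_total)


if __name__ == "__main__":
    # Example values (assumed): 2.0 pF of wire, 0.1 pF of load, n = 20.
    print(f"swing at Vdd = 1.8 V, n = 20: {wire_swing(1.8, 20) * 1000:.1f} mV")
    print(f"driver sees {driver_load(2.0, 0.1, 20):.3f} pF instead of 2.1 pF")
```

The series combination works out to (Cwire + Cload)/(n+1), which is why a driver roughly 1/n the usual size is sufficient.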


Pre-emphasis extends bandwidth
• The inline capacitor acts as a high-frequency short
> Example of a 14mm minimum width wire (simulated)
> Bandwidth RC-limited to 50MHz
> Capacitor (of 1/20th total capacitance) extends it to 180MHz
[Figure: normalized voltage at end of wire (dB, -60 to 0) versus frequency (1e+06 to 1e+10), comparing full-swing voltage with a 50mV swing through Cc; annotation: -3dB bandwidth improves by 3.3x. Source: Ho, Sun, ISSCC 2007]
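The bandwidth extension can be explored with an idealized frequency-domain model. The sketch below is not the simulation behind the slide: it assumes a uniform distributed RC line with no load capacitance, an ideal voltage-source driver, and a series capacitor Cc = C/k, and it works in normalized frequency x = ωRC so that no wire parameters are needed. All names are hypothetical.

```python
import cmath
import math


def line_terms(x):
    """Return (cosh(theta), sinh(theta) / theta) with theta = sqrt(j * x).

    x is the normalized frequency w * R * C of a distributed RC line.
    """
    theta = cmath.sqrt(1j * x)
    return cmath.cosh(theta), cmath.sinh(theta) / theta


def gain_direct(x):
    """|Vend / Vin| when an ideal source drives the unloaded line directly."""
    cosh_t, _ = line_terms(x)
    return 1.0 / abs(cosh_t)


def gain_through_cap(x, k):
    """|Vend / Vin| through a series cap Cc = C / k, normalized to low frequency."""
    cosh_t, sinhc_t = line_terms(x)
    return (1.0 + k) / abs(cosh_t + k * sinhc_t)


def bandwidth_3db(gain, lo=1e-3, hi=100.0):
    """Bisect for the normalized frequency where the gain falls to 1/sqrt(2)."""
    target = 1.0 / math.sqrt(2.0)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gain(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bandwidth_ratio(k=20.0):
    """-3dB bandwidth with the series capacitor relative to direct drive."""
    direct = bandwidth_3db(gain_direct)
    capped = bandwidth_3db(lambda x: gain_through_cap(x, k))
    return capped / direct


if __name__ == "__main__":
    print(f"bandwidth ratio for Cc = C/20: {bandwidth_ratio(20.0):.2f}")
```

Under these simplifications the ratio for a 1/20th capacitor lands between 3x and 4x, the same range as the improvement annotated on the plot. A real wire adds load capacitance and finite driver resistance, so the exact number will differ.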


Making a better repeater
• Analysis of energy-efficiency shows benefits
> Compare against repeaters for a 10mm wire in a 180nm process
[Figure: energy in pJ (0 to 12) versus performance in 1/ns (0.8 to 1.8). Curves: capacitively-driven wire with coupling capacitance varied from Cwire/18 to Cwire/40; one repeater stage; two repeaters; three repeaters. Repeater curves are from varying driver sizes. Annotations: 30% faster, 10X lower power. Source: Ho, Sun, ISSCC 2007]


Building pitchfork capacitors
• Exploit a “problem” of wires: sidewall capacitance
> Can use more tines or multiple wire layers
• Coupling caps ideal for summing junctions
> Use them to build a cheap FIR filter
[Figure: pitchfork capacitor between the driver and the long wire; FIR circuit with Cc1, Cc2, a z^-1 delay, Cpar, Cwire, and Cload. Source: Ho, Sun, ISSCC 2007]
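The summing-junction idea can be sketched as charge sharing on the wire node. This is a minimal illustration, assuming two drivers couple onto the wire through Cc1 and Cc2, that Cpar, Cwire, and Cload all load the same node, and that the delayed tap is driven with inverted data so it subtracts. That wiring is an assumption for illustration, not taken from the slide, and the function names are hypothetical.

```python
def tap_weights(c_c1, c_c2, c_wire, c_load, c_par):
    """Voltage gain from each driver to the wire node by charge sharing."""
    c_total = c_c1 + c_c2 + c_wire + c_load + c_par
    return c_c1 / c_total, c_c2 / c_total


def two_tap_fir(levels, w_main, w_delayed):
    """y[n] = w_main * x[n] - w_delayed * x[n-1], with x[-1] taken as 0."""
    out = []
    previous = 0.0
    for level in levels:
        out.append(w_main * level - w_delayed * previous)
        previous = level
    return out


if __name__ == "__main__":
    # Example capacitances in pF (assumed values).
    w1, w2 = tap_weights(c_c1=0.10, c_c2=0.05, c_wire=2.0, c_load=0.05, c_par=0.05)
    pulse = [0.0, 0.0, 0.0, 1.8, 0.0, 0.0, 0.0, 0.0]  # a lone 1 at 1.8 V
    print([round(v, 3) for v in two_tap_fir(pulse, w1, w2)])
```

The tap weights are set purely by capacitor ratios, which is what makes the filter cheap: no multipliers or extra supplies are needed, only a delayed copy of the data.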


Some costs
• Differential wires cost area, power; twisted for noise
• Sense-amps have offset and biasing requirements
> Biasing can use refresh, with (n+1) channels for an n-bit bus
• Verification requires some care
> Correctness comes from being near, not from being connected!
[Figure: twisted differential wires; wire order A, B, A_b, B_b on one side and A, B_b, A_b, B on the other]


Silicon measurements
• Multiple datapaths on a 180nm TSMC 1.8V chip
> Measured 10x less energy at a 50mV swing (max. savings 18x)
> Measured 4x bandwidth extension from pre-emphasis
> Bit error rates < 10^-11 (limited by test time)
> 50% UI eye opening on each experiment
• 14mm, unrepeated, min. width
> RC-limited to 55MHz
> Capacitor pushes it to 200MHz
> Sub-optimal 2-tap FIR (wrong delay)
[Figure: voltage (1.7 to 1.9) versus time (-20 to 20 ns) for a 00010000 sequence, with traces: after capacitor; end of wire; end of wire, using FIR filter. Source: Ho, Sun, ISSCC 2007]


Look at new system architectures
• A pair of complementary enabling technologies
> Multi-chip grids connected using Proximity I/O and efficient wires
> Chip-to-chip latency, bandwidth, power equal to on-chip wires
> Long on-chip wires can be lower power and higher performance
• Multi-chip grids look like a big “virtual” chip
> With (re)configurability, cost, and “reticle-breaking” benefits
• A question: how to best interconnect these chips?
> Or: how to best arrange these chips for an interconnect network?


Interconnection technologies
Ron Ho, [email protected]