Interconnection technologies
Ron Ho
VLSI Research Group, Sun Microsystems Laboratories
Sun Labs, Sun Microsystems, Inc. R. Ho - 2006

Acknowledgements
• Many contributors to the work described here
  > Robert Drost, David Hopkins, Alex Chow, Tarik Ono, Jo Ebergen
  > … and the entire Sun Labs VLSI Research Group
  > Danny Cohen

Multicore chips today
• Niagara: 8 cores
  > 32 threads (4 threads/core), 1 shared FP
  > 90nm CMOS, 1.2GHz, 63W, 380mm²
  > High-volume production now
• Niagara2: 8 (improved) cores
  > 64 threads (8 threads/core), 1 FP per core
  > 65nm CMOS, 342mm²
  > Shipping systems in 2H07
Interconnect networks: an 8x9 xbar

Multicore chips tomorrow?
• How do you scale up to tens or hundreds of cores?
  > …Assuming you want to (more total power for high performance)
  > …And on a single chip: avoid the costs of chip-to-chip IO
    > Bumps are big (180μm), links are power-hungry (20pJ/bit = 20mW/Gbps)
• You need to do some combination of:
  > Making the cores very small
    > But you lose functionality: what kind of programming models do you want?
  > Making the die very big
    > But you lose the yield/cost battle very quickly
[Figure: cost per square mm (5 to 35) versus die size in square mm (100 to 600). Source: 2006 IDF]

A different direction…
• We can rethink the “single-chip” requirement…
  > If we reduce (eliminate) the costs of chip-to-chip communication
    > Power, area (bandwidth density), latency, known-good-die
• Break a multi-core chip into a many-chip system
  > Smaller chips lead to higher yields and lower cost
  > Different chips lead to system adaptability and reconfigurability
  > Aggregate systems of chips effectively break the reticle limit
• Enable a broad set of interconnection explorations
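
The yield argument for smaller chips can be illustrated with a rough sketch. The slides do not give a yield model, so this one assumes a simple Poisson defect model, a hypothetical defect density, and known-good-die testing before assembly; assembly and packaging costs are ignored.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of good die under a simple Poisson defect model (an assumption)."""
    return math.exp(-area_mm2 * defects_per_mm2)

def silicon_per_good_system(total_area_mm2: float, n_chips: int,
                            defects_per_mm2: float) -> float:
    """Silicon area fabricated per working system built from n equal chips.

    Assumes each chip is tested before assembly (known-good-die), so only
    bad chips are discarded, never whole systems.
    """
    chip_area = total_area_mm2 / n_chips
    return total_area_mm2 / poisson_yield(chip_area, defects_per_mm2)

D0 = 0.002     # defects per mm^2, hypothetical value for illustration only
TOTAL = 400.0  # total silicon area of the system in mm^2, hypothetical

for n in (1, 4, 16):
    area = silicon_per_good_system(TOTAL, n, D0)
    print(f"{n:2d} chip(s): {area:6.1f} mm^2 of silicon per good system")
```

Under these assumptions, splitting the same total area across more chips reduces the silicon spent per working system, which is the trend the slide describes.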

Proximity I/O
• Pioneered by Ivan Sutherland at Sun Labs
• The big idea:
[Figure: Chip1, Chip2, and Chip3 with facing Transmit and Receive pads. Not a soldered connection! Source: Drost, Sun, CICC 2003]

Proximity I/O benefits
• Bandwidth density 60x-100x greater than balls
• Much lower power
  > No ESD required
  > Use wide parallel links instead of narrow SerDes links
  > Very small Tx and Rx circuits
[Figure: area bump bonds (labeled 200um and 180um) next to Proximity I/O pads (labeled 20um and 20um). Source: Drost, Sun, CICC 2003]
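
As a rough check on the density claim, the sketch below compares connections per unit area for two square pad arrays. It assumes the figure's dimensions can be treated as array pitches and counts connections only; differences in per-connection data rate are not modeled.

```python
def pad_density_ratio(bump_pitch_um: float, prox_pitch_um: float) -> float:
    """Ratio of connections per unit area for two square arrays of pads.

    Density scales as 1/pitch^2, so the ratio is the pitch ratio squared.
    """
    return (bump_pitch_um / prox_pitch_um) ** 2

# Dimensions taken from the slide figure, interpreted here as pitches.
for bump in (180.0, 200.0):
    print(f"{bump:.0f}um vs 20um: {pad_density_ratio(bump, 20.0):.0f}x more connections per area")
```

With these inputs the ratio lands between 81x and 100x, the same order as the 60x-100x range quoted on the slide.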

Proximity I/O challenges
• Misalignment is the proverbial monkey wrench
  > Initial imperfect chip placement
  > Dynamic chip movement from vibration or thermal expansion
• A robust solution is (at least) two-pronged
  > Combination of specialized packaging and custom electronics
  > Here, discuss some of the electronic/circuit strategy
[Figure: Chip1 and Chip2 with misalignment axes X, Y, Z, θ, Φ, Ψ]

Fixing misalignment
• Detect misalignment using Vernier-like arrays
  > Measure capacitance between chips to sub-fF resolution
• Correct in-plane misalignment using data steering
Source: Hopkins, Sun, ISSCC 2007
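
The Vernier idea can be sketched in one dimension. This is a toy model, not the circuit from the cited paper: it assumes two pad arrays whose pitches differ by a small step, approximates the measured capacitance of each pad pair by its geometric overlap, and picks the best-aligned pair. All names and dimensions are hypothetical.

```python
def pad_overlap(misalignment_um: float, pad_width_um: float) -> float:
    """Overlap of two equal-width pads; a stand-in for measured capacitance."""
    return max(0.0, pad_width_um - abs(misalignment_um))

def simulate_vernier_readout(true_offset_um: float, step_um: float = 1.0,
                             pad_width_um: float = 10.0, half_span: int = 10) -> float:
    """Estimate the chip offset from which pad pair couples most strongly.

    Pair k is misaligned by k*step + offset because the two arrays have
    pitches that differ by `step_um`. The pair with the largest overlap
    has the smallest misalignment, so the offset is about -k*step.
    """
    best_k = max(
        range(-half_span, half_span + 1),
        key=lambda k: pad_overlap(k * step_um + true_offset_um, pad_width_um),
    )
    return -best_k * step_um

for offset in (-3.0, 0.0, 2.4):
    print(f"true offset {offset:+.1f}um -> estimate {simulate_vernier_readout(offset):+.1f}um")
```

The resolution of this scheme is set by the pitch difference rather than by the pad size, which is why a Vernier-like array can resolve offsets much smaller than a pad.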

Dealing with noise
• Receivers are differential sense-amplifiers
  > “Butterfly” scheme rejects noise (receivers are offset-limited)
  > We employ a clock; the clock link itself uses unclocked receivers with larger pads
Source: Hopkins, Sun, ISSCC 2007

Silicon measurements
• 144b Proximity I/O datapath, 180nm TSMC chip
  > 1.8Gbps per pin for 260Gbps in 0.5mm²
  > Measured BER < 10^-15
  > 3pJ/bit at a 1.8V power supply
  > Greater than 0.7UI timing margin and greater than 200mV voltage margin at speed
  > Measured sensitivity to chip separation
Source: Hopkins, Sun, ISSCC 2007
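
The headline numbers above follow from simple arithmetic, sketched here. The inputs are the slide's figures; the bandwidth density and total link power are derived values that the slide does not state, and the power figure assumes every pin runs at the full rate.

```python
PINS = 144            # datapath width in bits, from the slide
GBPS_PER_PIN = 1.8    # per-pin data rate, from the slide
AREA_MM2 = 0.5        # I/O area, from the slide
PJ_PER_BIT = 3.0      # energy per bit, from the slide

aggregate_gbps = PINS * GBPS_PER_PIN           # total bandwidth
density_gbps_per_mm2 = aggregate_gbps / AREA_MM2
# 1 pJ/bit * 1 Gbit/s = 1 mW, so the product is directly in milliwatts.
link_power_mw = PJ_PER_BIT * aggregate_gbps

print(f"aggregate bandwidth: {aggregate_gbps:.1f} Gbps")
print(f"bandwidth density:   {density_gbps_per_mm2:.1f} Gbps/mm^2")
print(f"link power:          {link_power_mw:.1f} mW")
```

The same unit identity explains the earlier "20pJ/bit = 20mW/Gbps" figure for conventional chip-to-chip links.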

Proximity I/O as an enabler
• Low-cost chip-to-chip communication
  > Off-chip I/O looks like an extension of on-chip wires
  > Many chips look like a (big) single chip
  > What kinds of interconnect networks should we consider?
[Figure: Proximity I/O at the overlap]

Proximity I/O-based grids
• Another, perhaps more realistic grid
  > All big (high-power) chips are face-up on a cold plate
  > Face-down chips merely transmit data
• Natural extension to various network topologies

A red flag?
• Such a system will have lots of VLSI wires
  > The entire interconnection network consists of on-chip wires
• Latency and bandwidth characteristics well-known
  > …And so are the energy costs: E = C·V·ΔV
    > Cap is about 0.45pF/mm (incl. repeaters), and not really scaling
    > Voltage is about 1V, and not really scaling
    > Therefore, energy is about 0.45pJ/bit/mm of linear length
    > So 100 64b buses at 4GHz, 10% activity over 20mm: 24W!
• Need an efficient wiring system
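
The power estimate above can be reproduced with a few lines of arithmetic. This sketch uses only the slide's numbers; the function names are illustrative. The straight product comes to roughly 23W, in line with the slide's rounded 24W.

```python
def wire_energy_pj_per_bit_mm(cap_pf_per_mm: float, vdd: float, swing: float) -> float:
    """E = C * V * dV per mm of wire; pF * V * V gives pJ."""
    return cap_pf_per_mm * vdd * swing

def bus_power_watts(n_buses: int, bits_per_bus: int, freq_hz: float,
                    activity: float, length_mm: float,
                    energy_pj_per_bit_mm: float) -> float:
    """Total switching power of a set of parallel buses."""
    switching_bits_per_s = n_buses * bits_per_bus * freq_hz * activity
    return switching_bits_per_s * length_mm * energy_pj_per_bit_mm * 1e-12

energy = wire_energy_pj_per_bit_mm(0.45, 1.0, 1.0)   # full-swing wire at 1V
power = bus_power_watts(100, 64, 4e9, 0.10, 20.0, energy)
print(f"{energy:.2f} pJ/bit/mm -> {power:.1f} W")
```

Because the swing ΔV enters linearly, reducing it is the most direct lever on this number.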

Efficient capacitively-driven wires
• Reduced power: low voltage swing on wires
  > Swing is Vdd/(n+1) without requiring a second power supply
• Reduced power: small load seen by driver
  > Allows use of a 1/n-th sized driver, saving power and area
• Reduced latency: capacitor pre-emphasizes edges
  > Also, charge distribution effectively cuts wire delay in half
[Figure: driver coupled through capacitor Cc onto a wire (Rwire, Cwire) terminated by Cload]
Cc = (Cwire + Cload)/n, where n = 10 to 20
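
Both the Vdd/(n+1) swing and the smaller driver load follow from treating Cc and (Cwire + Cload) as a series capacitive divider. The sketch below checks this numerically; it ignores Rwire (a DC view only), and the capacitance values are hypothetical.

```python
def wire_swing(vdd: float, c_couple: float, c_wire: float, c_load: float) -> float:
    """Steady-state swing on the wire side of a series coupling capacitor."""
    return vdd * c_couple / (c_couple + c_wire + c_load)

def driver_load(c_couple: float, c_wire: float, c_load: float) -> float:
    """Capacitance seen by the driver: Cc in series with (Cwire + Cload)."""
    c_far = c_wire + c_load
    return c_couple * c_far / (c_couple + c_far)

VDD = 1.8        # supply voltage in volts
C_WIRE = 3.0     # pF, hypothetical wire capacitance
C_LOAD = 0.4     # pF, hypothetical receiver load
n = 17
c_c = (C_WIRE + C_LOAD) / n   # coupling cap sized as on the slide

print(f"swing = {wire_swing(VDD, c_c, C_WIRE, C_LOAD) * 1e3:.0f} mV (Vdd/(n+1) = {VDD / (n + 1) * 1e3:.0f} mV)")
print(f"driver sees {driver_load(c_c, C_WIRE, C_LOAD):.3f} pF instead of {C_WIRE + C_LOAD:.1f} pF")
```

The driver load works out to Cc·n/(n+1), slightly less than Cc itself, which is what permits a driver about 1/n the usual size.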

Pre-emphasis extends bandwidth
• The inline capacitor acts as a high-frequency short
  > Example of a 14mm minimum width wire (simulated)
  > Bandwidth RC-limited to 50MHz
  > Capacitor (of 1/20th total capacitance) extends it to 180MHz
[Figure: normalized voltage at end of wire (dB, 0 to -60) versus frequency (1e+06 to 1e+10), comparing full-swing voltage with a 50mV swing through Cc. Annotation: -3dB bandwidth improves by 3.3x. Source: Ho, Sun, ISSCC 2007]

Making a better repeater
• Analysis of energy-efficiency shows benefits
  > Compare against repeaters for a 10mm wire in a 180nm process
[Figure: energy in pJ (0 to 12) versus performance in 1/nS (0.8 to 1.8). Curves: capacitively-driven wire, varying coupling capacitance from Cwire/18 to Cwire/40; one repeater stage; two repeaters; three repeaters (repeater curves are from varying driver sizes). Annotations: 30% faster, 10X lower power. Source: Ho, Sun, ISSCC 2007]

Building pitchfork capacitors
• Exploit a “problem” of wires: sidewall capacitance
  > Can use more tines or multiple wire layers
• Coupling caps ideal for summing junctions
  > Use them to build a cheap FIR filter
[Figure: driver and long wire joined by a pitchfork capacitor; FIR schematic with coupling capacitors Cc1 and Cc2, a z^-1 delay, Cpar, Cwire, and Cload. Source: Ho, Sun, ISSCC 2007]
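
A capacitive summing junction sets its node voltage to the capacitance-weighted average of the driving voltages, which is what makes a two-capacitor FIR cheap. The sketch below is an idealized illustration, not the circuit from the cited paper: it assumes the second tap is driven by the inverted, one-bit-delayed data, ignores wire resistance, and uses hypothetical capacitance values.

```python
def fir_node_voltage(bits, cc1, cc2, c_ground, vdd=1.8):
    """Per-bit voltage at the summing node, relative to its bias point.

    cc1 couples the current bit and cc2 couples the inverted previous bit
    (the z^-1 tap). c_ground lumps Cpar, Cwire, and Cload together.
    """
    c_sum = cc1 + cc2 + c_ground
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    prev = symbols[0]  # assume the line idled at the first bit's value
    for s in symbols:
        out.append(0.5 * vdd * (cc1 * s - cc2 * prev) / c_sum)
        prev = s
    return out

# The 00010000 pattern from the measurement slides; capacitances in arbitrary units.
sequence = [0, 0, 0, 1, 0, 0, 0, 0]
for bit, v in zip(sequence, fir_node_voltage(sequence, cc1=2.0, cc2=1.0, c_ground=37.0)):
    print(f"bit {bit}: {v * 1e3:+.1f} mV")
```

In this toy model, bits that differ from their predecessor produce three times the amplitude of repeated bits, which is the edge emphasis a 2-tap FIR is meant to provide.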

Some costs
• Differential wires cost area, power; twisted for noise
• Sense-amps have offset and biasing requirements
  > Biasing can use refresh, with (n+1) channels for an n-bit bus
• Verification requires some care
  > Correctness comes from being near, not from being connected!
[Figure: twisted differential wire pairs; wire order A, B, A_b, B_b on one side becomes A, B_b, A_b, B on the other]
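
One way to picture the (n+1)-channel refresh is a rotating schedule in which one channel is taken offline for biasing while the other n carry the bus. The slides do not describe the actual schedule, so the round-robin order and the bit-to-channel mapping below are assumptions for illustration.

```python
def channel_map(n_bits: int, step: int):
    """Return (refreshing_channel, {bit_index: channel_index}) for one step.

    With n_bits + 1 physical channels, the channel being refreshed rotates
    round-robin and the data bits are packed onto the remaining channels.
    """
    n_channels = n_bits + 1
    refreshing = step % n_channels
    active = [c for c in range(n_channels) if c != refreshing]
    return refreshing, dict(enumerate(active))

# A 4-bit bus over 5 channels: every channel is refreshed once per 5 steps.
for step in range(5):
    refreshing, mapping = channel_map(4, step)
    print(f"step {step}: refresh channel {refreshing}, bits -> channels {mapping}")
```

The cost is one extra channel plus the steering logic, rather than any interruption of the data stream.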

Silicon measurements
• Multiple datapaths on a 180nm TSMC 1.8V chip
  > Measured 10x less energy at a 50mV swing (max. savings 18x)
  > Measured 4x bandwidth extension from pre-emphasis
  > Bit error rates < 10^-11 (limited by test time)
  > 50% UI eye opening on each experiment
• 14mm, unrepeated, min. width
  > RC-limited to 55MHz
  > Capacitor pushes it to 200MHz
  > Sub-optimal 2-tap FIR (wrong delay)
[Figure: voltage (1.7 to 1.9) versus time (-20 to 20 nS) for a 00010000 sequence. Traces: after capacitor; end of wire; end of wire, using FIR filter. Source: Ho, Sun, ISSCC 2007]

Look at new system architectures
• A pair of complementary enabling technologies
  > Multi-chip grids connected using Proximity I/O and efficient wires
  > Chip-to-chip latency, bandwidth, power equal to on-chip wires
  > Long on-chip wires can be lower power and higher performance
• Multi-chip grids look like a big “virtual” chip
  > With (re)configurability, cost, and “reticle-breaking” benefits
• A question: how to best interconnect these chips?
  > Or: how to best arrange these chips for an interconnect network?

Interconnection technologies
Ron Ho, [email protected]