1
Loopback Architecture for Wafer-Level At-Speed Testing of Embedded HyperTransport™ Processor Links
Alvin Loke, Bruce Doyle, Michael Oshima¹, Wade Williams², Robert Lewis², Charles Wang¹, Audie Hanpachern³, Karen Tucker, Prashanth Gurunath¹, Gladney Asada¹, Chad Lackey, Tin Tin Wee, and Emerson Fang¹
AMD, Fort Collins, Colorado, USA
¹ AMD, Sunnyvale, California, USA
² AMD, Austin, Texas, USA
³ Cortina Systems, Sunnyvale, CA
Custom Integrated Circuits Conference
September 16, 2009
Processor dies now talk with each other using full-duplex, bidirectional point-to-point links
• High-bandwidth, low-latency communication
• Scalable vs. common FSB architecture
• e.g., HyperTransport™ (HT) in AMD products
The number of I/O ports per die is increasing
• Higher socket counts → more board connectivity
• MCM embedded links → more package connectivity
Cost benefit is increasing to sort for functional I/O before packaging, especially for MCMs
Implement on-chip I/O loopback for low-cost at-speed wafer-level testing
4
[Figure labels: Embedded NorthBridge (NB), shown twice]
Die-to-Die Processor Communication
Source synchronous
• Forward half-rate clock for RX data retiming
• Common-mode jitter rejection, low latency
0.4 to 6.4Gb/s (0.4Gb/s steps) – NRZ PAM-2
20 lanes per direction (split into 2 sublinks)
• 1 CLK & 9 data (CAD/CTL) lanes per sublink
HT1 (0.4–2.0Gb/s)
• CDR bypassed, data RX simply retimed by CLK RX
HT3 (2.4–6.4Gb/s)
• DLL-based CDR aligns received forwarded CLK to received data transitions for lower BER retiming
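The lane and rate figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated on the slide; the function and constant names are illustrative and do not come from the presentation. The forwarded-clock calculation assumes the usual meaning of "half-rate": a clock at half the bit rate, with data sampled on both edges.

```python
# Figures from the slide; all names here are illustrative.
SUBLINKS_PER_DIRECTION = 2
CLK_LANES_PER_SUBLINK = 1
DATA_LANES_PER_SUBLINK = 9  # CAD/CTL lanes


def lanes_per_direction():
    """Total lanes per direction: 2 sublinks x (1 CLK + 9 data)."""
    return SUBLINKS_PER_DIRECTION * (CLK_LANES_PER_SUBLINK + DATA_LANES_PER_SUBLINK)


def supported_rates_gbps():
    """0.4 to 6.4 Gb/s in 0.4 Gb/s steps (built from integers to avoid float drift)."""
    return [tenths / 10 for tenths in range(4, 65, 4)]


def link_mode(rate_gbps):
    """Classify a rate as HT1 (CDR bypassed) or HT3 (DLL-based CDR)."""
    tenths = round(rate_gbps * 10)
    if tenths % 4 != 0 or not 4 <= tenths <= 64:
        raise ValueError(f"unsupported rate: {rate_gbps} Gb/s")
    return "HT1" if tenths <= 20 else "HT3"


def forwarded_clock_ghz(rate_gbps):
    """Half-rate forwarded clock: frequency is half the per-lane bit rate."""
    return rate_gbps / 2


if __name__ == "__main__":
    print("lanes per direction:", lanes_per_direction())
    for rate in supported_rates_gbps():
        print(f"{rate:.1f} Gb/s  {link_mode(rate)}  CLK {forwarded_clock_ghz(rate):.1f} GHz")
```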
[Figure labels: DDCL; embedded full HT link; embedded half HT link]
7
HT Link Training (Handshaking)
Coordinated by NB-IOC in both dies
Each NB-IOC sends predefined training pattern to the other die
Training arms CDR to align clock to data & signals start of data transfer
# data lanes enabled depends on link traffic
[Figure: block diagram of two dies connected through a board or package channel. Labels: NorthBridge I/O Controller (NB-IOC), Processor Cores, Shared Memory, NorthBridge Core, Die1 HT Port, Die2 HT Port, Data Lane, CLK Lane, Board or Package Channel, TX PLL Clock, NB Clock, RX Clock, Training Pattern, TX, RX, FIFO, Encoder, 4:1 Serializer, CDR, 1:4 Deserializer, Decoder]
8
HT Data Transfer
Data transfer starts immediately after last bit of training
Once data transfer is completed, HT port is disabled into one of several possible sleep states for power saving
Data is scrambled by XOR or by 8b/10b to reduce ISI
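As an illustration of XOR scrambling, the sketch below XORs the data with an LFSR-generated bit sequence. The polynomial (x^7 + x^6 + 1, the common PRBS7 generator) and the seed are assumptions chosen for brevity; the slides do not state which scrambler polynomial HT uses. Because XOR is its own inverse, running the same function with the same seed at the receiver recovers the data.

```python
def lfsr_keystream(seed, nbits):
    """Bits from a 7-bit Fibonacci LFSR, x^7 + x^6 + 1.

    The polynomial is an illustrative choice, not taken from the slides.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, or the LFSR stays at zero")
    out = []
    for _ in range(nbits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # taps at stages 7 and 6
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def scramble(bits, seed=0x7F):
    """XOR each data bit with the keystream; applying it twice restores the data."""
    keystream = lfsr_keystream(seed, len(bits))
    return [b ^ k for b, k in zip(bits, keystream)]


descramble = scramble  # same operation when TX and RX share the seed


if __name__ == "__main__":
    idle = [0] * 32  # a long run with no transitions
    on_wire = scramble(idle)
    print("scrambled:", "".join(map(str, on_wire)))
    print("recovered:", descramble(on_wire) == idle)
```

A constant input such as the all-zero run above leaves the scrambler with both ones and zeros on the wire, which is the property that breaks up long runs.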
[Figure: block diagram of two dies connected through a board or package channel. Labels: NorthBridge I/O Controller (NB-IOC), Processor Cores, Shared Memory, NorthBridge Core, Die1 HT Port, Die2 HT Port, Data Lane, CLK Lane, Board or Package Channel, TX PLL Clock, NB Clock, RX Clock, Training Pattern, TX, RX, FIFO, Encoder, 4:1 Serializer, CDR, 1:4 Deserializer, Decoder]
9
Outline
Motivation
HyperTransport™ Links in AMD Processors
Enabling Internal Serial Loopback
TX → RX serial loopback via on-chip channel
No external channel required, hence test can be performed at wafer-level sort
NB-IOC initiates link by sending training bits, then user-specified test pattern
RX is self-trained using bits sent by own TX
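The test sequence described above (training bits, then a user-specified pattern, with the RX checking what its own TX sent) can be sketched as a small behavioural model. Everything here is hypothetical: the training pattern is a placeholder rather than the real HT training sequence, and `on_chip_channel` stands in for the on-chip TX-to-RX path with optional bit flips to emulate a defect.

```python
# Placeholder training pattern; the real HT training sequence is not given in the slides.
TRAINING_BITS = [1, 0] * 16


def on_chip_channel(bits, flip_positions=()):
    """Model the on-chip TX -> RX path; flip the listed bit positions to emulate a fault."""
    flips = set(flip_positions)
    return [b ^ 1 if i in flips else b for i, b in enumerate(bits)]


def run_internal_loopback(test_pattern, flip_positions=()):
    """Send training bits then the test pattern, and compare what the RX sees.

    Returns a dict with 'trained' (bool) and 'bit_errors' (int, or None if
    training failed and no data comparison was made).
    """
    test_pattern = list(test_pattern)
    sent = TRAINING_BITS + test_pattern
    received = on_chip_channel(sent, flip_positions)

    n_train = len(TRAINING_BITS)
    if received[:n_train] != TRAINING_BITS:
        return {"trained": False, "bit_errors": None}

    # Bit error counter: count mismatches over the test-pattern portion only.
    errors = sum(tx != rx for tx, rx in zip(test_pattern, received[n_train:]))
    return {"trained": True, "bit_errors": errors}


if __name__ == "__main__":
    pattern = [1, 1, 0, 0, 1, 0, 1, 0]
    print(run_internal_loopback(pattern))                            # healthy lane
    print(run_internal_loopback(pattern, flip_positions=(34, 37)))   # data errors
    print(run_internal_loopback(pattern, flip_positions=(0,)))       # training failure
```

The two failing cases in the usage lines correspond to the two kinds of outcome a lane can report: a training failure, or a non-zero bit error count.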
Wafer-Level Test Supply Noise
[Figure: simulated Bump Supply Voltage (V) waveforms; Probe Card Pin Model]
Comes primarily from TX driver switching high currents through probe card pin inductance
Can disable any TX driver per sublink during loopback
Simulated with 1, 8 & 16 TX drivers enabled
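To first order, the noise mechanism named above is the inductive voltage V = L · di/dt across the probe pin, which grows with the number of drivers switching together. The sketch below only illustrates that scaling. The inductance and current-slew values are made-up placeholders, not figures from the presentation, and the model ignores decoupling capacitance and everything else a real probe-card simulation would include.

```python
def supply_droop_v(n_drivers, di_dt_a_per_s, pin_inductance_h, n_supply_pins=1):
    """First-order inductive supply noise: V = N * (di/dt) * L_eff.

    Assumes all N drivers switch at the same instant and that the supply
    pins act as equal inductors in parallel (L_eff = L / n_supply_pins).
    """
    if n_drivers < 0 or n_supply_pins < 1:
        raise ValueError("n_drivers must be >= 0 and n_supply_pins >= 1")
    l_eff = pin_inductance_h / n_supply_pins
    return n_drivers * di_dt_a_per_s * l_eff


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT from the slides):
    PIN_INDUCTANCE_H = 1e-9   # 1 nH per probe pin
    DI_DT_A_PER_S = 1e8       # 10 mA switched in 100 ps, per driver
    for n in (1, 8, 16):      # driver counts used in the slide's simulation
        v = supply_droop_v(n, DI_DT_A_PER_S, PIN_INDUCTANCE_H)
        print(f"{n:2d} drivers enabled: {v:.2f} V")
```

The linear dependence on the number of active drivers is why being able to disable individual TX drivers during loopback is useful at wafer sort.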
Full-rate architecture
Equalization: 1-bit speculative DFE + analog DFR filter
[Figure labels: Alexander Phase Detector; Serial Loopback Signal (shown twice); Single-Ended to Differential Converter]
18
External Serial Loopback
Package-level sort test
Provides test coverage not exercised by internal serial loopback
• TX output driver
• RX analog front end
• TX & RX equalization
Can inject jitter into external channel for eye margining
[Figure: external serial loopback block diagram. Labels: NorthBridge I/O Controller (NB-IOC), Processor Cores, Shared Memory, NorthBridge Core, HT Port, Data Lane, CLK Lane, TX PLL Clock, NB Clock, RX Clock, START, FINISH, Bit Error Counter, Test Pattern, Training Pattern, HT Serial Loopback, TX, RX, FIFO, Encoder, 4:1 Serializer, CDR, 1:4 Deserializer, Decoder, Jitter]
19
Parallel Loopback Modes
Package-level sort test
RX → TX parallel loopback in HT or in NB-IOC
Requires another HT port or BERT to initialize link & provide test pattern to RX
Enables fault isolation
20
Outline
Motivation
HyperTransport™ Links in AMD Processors
Port0 Sublink0/Sublink1 @ 5.2, 6.4Gb/s – 1.1, 1.3V | Training failure in all CTL/CAD lanes | 3
Port0 Sublink0 @ 6.4Gb/s – 1.1V | CAD2 bit error count = 2 | 2
Transceiver loopback enables wafer-level at-speed testing of HyperTransport I/O
Demonstrated 6.4Gb/s test functionality
Entirely digital architecture for simple implementation & verification
Significantly improves package-level yield, especially for more expensive MCM packages
Adds no extra sort infrastructure cost
Established test for wafer-level screen of AMD 45nm products