The Dissertation Committee for Ji Hwan Chun certifies that this is the approved version of the following dissertation:
Cost Effective Tests for High Speed I/O Subsystems
Committee:
Jacob A. Abraham, Supervisor
Nur A. Touba
Ranjit Gharpurey
David Z. Pan
Ghani Kanawati
Cost Effective Tests for High Speed I/O Subsystems
by
Ji Hwan Chun, B.S.; M.S.E.
DISSERTATION
Presented to the Faculty of the Graduate School of
The University of Texas at Austin
in Partial Fulfillment
of the Requirements
for the Degree of
DOCTOR OF PHILOSOPHY
THE UNIVERSITY OF TEXAS AT AUSTIN
December 2011
Acknowledgments
First of all, I would like to express my sincere gratitude to my advisor, Dr.
Jacob A. Abraham, for his support and insightful advice. He always encourages
me to think outside the box and at a higher level while considering the
fundamentals of the problem. With his enthusiasm and breadth of knowledge, I
was able to view many technical problems from various perspectives and solve
them all.
I would also like to thank Dr. Nur A. Touba, Dr. Ranjit Gharpurey,
Dr. David Z. Pan, and Dr. Ghani Kanawati for serving on my dissertation
committee and for their insightful advice and guidance.
My gratitude extends to Hak-soo Yu, Jae Wook Lee, Hongjoong Shin,
Byoungho Kim, Hyun Jin Kim, Joonsung Park, Joonsoo Kim, Eun Jung Jang,
Jaeyong Chung, Junyoung Park, and Ashwin Raghunathan for their time and
valuable discussions.
I would also like to thank all CERC members and ECE friends:
Minyoung Park, Changyong Shin, Soonhyuk Choi, Byungchul Jang, Jinkyu
Lee, Bong Wan Jun, Yonghyun Kim, Joon-Sung Yang, Dam Sunwoo, Jungho
Jo, Donghyuk Shin, Wonsoo Kim, Taesoo Jun, Kihyuk Han, Hyunsun Um,
Ickjae Yoon, Jiseon Park, Jae Hong Min, Adam Tate, Romi Datta, Ramtilak
Vemu, Ravi Gupta, Sankar Gurumurthy, Tung-Yeh Wu, Whitney Wadlow,
Rajeshwary Tayade, Chaoming Zhang, Melissa Campos, Debi Prather, and
Andrew Kieschnick.
I would like to thank my former and current managers, Jeremy Scofield,
Nancy Wang-lee, Ghani Kanawati, Puneet Singh, Sam Chiang, and Nilesh
Bhagat at Intel Corporation, for their mentoring and management support while
I pursued my Ph.D. I have benefited from my colleagues at Intel who provided
valuable discussions, guidance, and collaboration, and I would like to thank
them all for their support. To name a few: Harsha Narravula, Abhijit Sathaye,
Pulkit Sangani, Andrew Saquing, Silvio Picano, Srirama Pedarla, Giri
Vadlamudi, Tom Barrett, Karan Tewari, Bob Roeder, Huesung Kim, Hangkyu Lee,
Daeho Seo, Pankaj Sharma, Freddy Salazar, Arthur Chan, Jasveen Kaur, Ram
Rajamani, Ashish Gupta, Nazar Haider, and Dilip Bhavsar.
Finally, my deepest gratitude goes to my wife, Dr. Suhyun Park, and my
parents for their sincere support. Without their continuous encouragement, I
would not have been able to achieve this milestone.
Cost Effective Tests for High Speed I/O Subsystems
Publication No.
Ji Hwan Chun, Ph.D.
The University of Texas at Austin, 2011
Supervisor: Jacob A. Abraham
The growing demand for high performance systems in modern computing
technology drives the development of advanced and high speed designs in
I/O structures. Due to their data rate and architecture, however, testing of
high speed serial interfaces becomes more expensive when using conventional
test methods. In order to alleviate the test cost issue, a loopback test
scheme has been widely adopted. To assess the margin of the signal eye in
the loopback configuration, the eye margin is purposely reduced by additional
devices on the loopback path or by using design for testability (DFT) features
such as timing and voltage margining. Although the loopback test scheme
successfully reduces the test cost by removing the dependency on external
test equipment, it has robustness issues such as fault masking and
non-ideality of the margining circuits. The focus of this dissertation is to
propose new methods to resolve the known issues in the loopback test mode.
The fault masking issue in a loopback pair of analog to digital and digital to
analog converters (ADC and DAC) which can be found in pulse amplitude
modulation (PAM) signaling schemes is resolved using a proposed algorithm
which separates the characteristics of the ADC and the DAC from a combined
loopback response. The non-ideality problem of the margining circuit is resolved
using a proposed method which utilizes a random jitter injection technique.
Using the injected random jitter, the jitter distribution is sampled by
undersampling and margining, which provides the nonlinearity information using
the proposed algorithm. Since the proposed method requires a random jitter
source on the load board, an alternative solution is proposed which uses an
intrinsic jitter profile and a sliding window search algorithm to characterize
the nonlinearities. The sliding window search algorithm was implemented in a low
cost high volume manufacturing (HVM) tester to assess the feasibility and
validity of the proposed technique. The proposed methods are compatible with
the existing loopback test scheme and require minimal area and design
overhead; hence they provide cost effective ways to enhance the robustness of the
loopback test scheme.
Table of Contents
Acknowledgments
Abstract
List of Tables
List of Figures
Chapter 1. Introduction
1.1 Motivation
1.2 Contributions of the Dissertation
1.3 Organization of the Dissertation
Chapter 2. Design and Test of High Speed Serial I/Os
2.1 Overview of High Speed Interface Scheme
2.1.1 Timing Alignment Consideration
2.1.1.1 Forwarded Clock
2.1.1.2 Embedded Clock
2.1.2 Data Rate Consideration
2.2 Test of High Speed Interface
2.2.1 BER and Jitter
2.2.1.1 Deterministic Jitter (DJ)
2.2.1.2 Random Jitter (RJ)
2.2.2 Built in Self Test (BIST) of High Speed Interface
2.2.3 Loopback Test
2.2.4 On-chip Timing Margining Implementation
2.3 Limitations of DFT Based Loopback Test and Related Work
2.3.1 Fault Masking Issue
2.3.2 Margining Circuitry Linearity Issue
Chapter 3. Efficient ADC and DAC Loopback Test
3.1 Review of Converter Linearity Errors
3.2 Proposed Technique
3.2.1 Loopback Configuration
3.2.2 ADC Test
3.2.3 DAC Test
3.3 Simulation Results
3.4 Comparison with Prior Work
3.5 Other Considerations
3.6 Summary
Chapter 4. Phase Interpolator Test Using a Random Jitter Injection
4.1 Background
4.1.1 High Speed I/O Design and Phase Interpolator Basics
4.1.2 Impact of Nonlinearity of PI
4.2 Overview of The Proposed Technique
4.2.1 Distribution Vector Creation Using Undersampling
4.2.2 Calculation of Predicted DNL
4.2.3 Random Jitter Injection Considerations
4.3 Experimental Results
4.3.1 Simulation Configuration
4.3.2 Simulation Results
4.4 Comparison with Prior Work
4.5 Summary
Chapter 5. Phase Interpolator Test Using a Sliding Window Search
5.1 Preliminaries
5.1.1 Undersampling Technique Basics
5.1.2 Jitter and BER
5.2 Proposed Technique
5.2.1 Test Procedure
5.2.2 Jitter Aliasing Reduction Algorithm Using Sliding Window Search
5.2.3 Interpolation Technique to Overcome Finite Resolution
5.3 Experimental Results
5.3.1 Simulation Results
5.3.1.1 Size of Window Sweep
5.3.1.2 Number of Samples Sweep
5.3.1.3 Amount of Jitter Sweep
5.3.1.4 Repeatability Analysis
5.3.2 Hardware Validation
5.4 Comparison with RJ Injection Based PI Test Method
5.5 Summary
Chapter 6. Conclusions and Future Research Directions
6.1 Conclusions
6.2 Future Research Directions
Bibliography
Vita
List of Tables
2.1 Q-factor with Respect to BER
2.2 Various Timing Margining Implementations for High Speed I/O Designs [70]
3.1 Nonlinearity Prediction Errors vs. Noise σ (LSB)
3.2 Nonlinearity Prediction Errors vs. Number of Samples (LSB)
3.3 Statistics of Nonlinearity Prediction Errors (LSB)
3.4 Nonlinearity Prediction Errors vs. Converter Resolutions (LSB)
3.5 Comparison among Various BIST Schemes
4.1 Example Code of Phase Interpolator Encoding
4.2 Nonlinearity Prediction Errors vs. RJ σ (LSB)
4.3 Summary of Simulation Condition and Results
4.4 Nonlinearity Prediction Errors vs. PJ Amplitude (LSB)
4.5 Comparison among Various PI Test Schemes
5.1 Summary of Simulation Conditions
5.2 Estimation Error (LSB) vs. Number of Bits (Condition A)
5.3 Estimation Error (LSB) vs. Number of Bits (Condition B)
5.4 Estimation Error (LSB) vs. RJ σ (Condition A)
5.5 Estimation Error (LSB) vs. RJ σ (Condition B)
5.6 Repeatability Analysis Results (LSB) (Condition A)
5.7 Repeatability Analysis Results (LSB) (Condition B)
5.8 Hardware Validation Results (LSB)
5.9 Before & After Voltage Correction Results (LSB)
5.10 Comparison between Two PI Test Methods
List of Figures
1.1 High Speed Interface Bit Rate Trend [8]
2.1 Forwarded Clock Scheme
2.2 Embedded Clock Scheme
2.3 Clock and Data Recovery Block Diagram [82]
2.4 Time Interleaved Transmitter [102]
2.5 10GBASE-T Block Diagram [7]
2.6 BER vs. Sampling Time [87]
2.7 Jitter Decomposition Hierarchy
2.8 Incorrect Extrapolation Example [87]
2.9 High Speed I/O Loopback Test Configuration [65]
2.10 On-chip Timing Margining Concept [70]
2.11 Loopback vs. Actual Pass/Fail Result Analysis [84]
3.1 Test Procedure
3.2 Proposed Loopback Setup
3.3 Loopback Conversion Process
3.4 LFSR Based Random Noise Generator [10]
3.5 Random Noise Generator Based on Thermal Noise Amplification [39]
3.6 Random Noise Generator Using Delta-Sigma Modulation [10]
3.7 Loopback Response Comparison with Added Gaussian Noise
3.8 ADC Test Procedure
3.9 Estimated DAC Output Points from Each ADC Code Transition Voltages
3.10 DNL Prediction Errors vs. Noise σ
3.11 INL Prediction Errors vs. Noise σ
3.12 DNL Prediction Errors vs. Number of Samples
3.13 INL Prediction Errors vs. Number of Samples
3.14 Prediction Errors of ADC Nonlinearities
3.15 Prediction Errors of DAC Nonlinearities
3.16 DNL Prediction Errors vs. Converter Resolutions
3.17 INL Prediction Errors vs. Converter Resolutions
4.1 Forwarded Clock Scheme
4.2 Derived Clock Scheme
4.3 Phase Interpolator Schematic
4.4 Example Bathtub Curve of Receiver
4.5 Proposed Configuration for Forwarded Clock Scheme
4.6 Proposed Configuration for Derived Clock Scheme
4.7 Test Procedure
4.8 Undersampling Technique
4.9 Piecewise Cubic Polynomial Interpolation of Dpos
4.10 Delay Adjustment Circuit Architecture Used for Jitter Injection [53]
4.11 Timing Generator Block Diagram [34]
4.12 RJ Injection Circuitry Block Diagram [34]
4.13 Injected DNL vs. Predicted DNL
4.14 Injected Random Jitter vs. Prediction Error
4.15 Number of Bits in Alternating Data Sequence vs. Prediction Error
4.16 Monte Carlo Simulation of the Proposed Technique
4.17 Injected Periodic Jitter vs. Prediction Error
5.1 Undersampling Technique Concept
5.2 Test Procedure
5.3 Circuit Configuration Concept
5.4 Estimation Error vs. Size of Window (Condition A)
5.5 Estimation Error vs. Size of Window (Condition B)
5.6 Estimation Error vs. Number of Bits (Condition A)
5.7 Estimation Error vs. Number of Bits (Condition B)
5.8 Estimation Error vs. RJ σ (Condition A)
5.9 Estimation Error vs. RJ σ (Condition B)
5.10 Estimated DNL vs. Injected DNL (Condition A)
5.11 Estimated DNL vs. Injected DNL (Condition B)
5.12 Hardware Validation Configuration
5.13 Tester Pattern and Test Program Synchronization
5.14 PI Step Location Plot for 32 Positions
Chapter 1
Introduction
1.1 Motivation
The growing demand for high performance computing systems has
driven the bandwidth increase of chip-to-chip data communications.
Traditional interconnect schemes which use parallel data transfer with low
speed inputs/outputs (I/Os) are limited by technical difficulties such as data
timing alignment and high pin count issues. In order to achieve the goal of
multi-gigabit transfer rates, the traditional I/O structures are being replaced
by industrial I/O designs such as PCI Express™ [5], XAUI™ [3], Serial ATA™
(SATA™) [11], QuickPath Interconnect™ (QPI™) [59], and HyperTransport™ [2].
These designs have adopted a serial interconnect scheme to resolve the timing
skew and high pin count issues of conventional parallel data transfer schemes.
While the serial interface bit rate is increasing rapidly, testing of
high speed serial interfaces has become a significant challenge due to their
speed and architecture. Figure 1.1 illustrates the speed trend projection of
the serial interface reported by the International Technology Roadmap for
Semiconductors (ITRS) [8]. As of 2010, it is not uncommon to find interfaces
with a 10 Gbps bit rate. The trend indicates that high speed serial interfaces
would reach 100 Gbps in the next decade or so. Conventional test methods use
automated test equipment (ATE) which has direct connectivity to the interface
of the device under test (DUT). Because the interface speed increases over a
short period of time, the ATE needs to be upgraded to test the interface at
speed. The investment required to upgrade the ATE hardware is high, which is
a major hindrance to using external equipment to test the serial interfaces.
Figure 1.1: High Speed Interface Bit Rate Trend [8]
Another challenge in high speed interface testing stems from the trend
in modern VLSI design methodology. Traditional mixed signal components
were designed with sufficient design margin to guardband against unexpected
yield loss due to process and design variations. Nowadays, however, the
design of modern processors is moving toward a system on chip (SOC)
development methodology to meet time-to-market requirements, where
individual functional components such as serial interface circuits are delivered
as intellectual property (IP) blocks. Integration issues such as coupling noise
between digital and mixed signal circuits degrade the performance of the
serial interface blocks, which adversely impacts the design margin during the
integration stage. Ever increasing data rate requirements also force designs
to operate with less design margin. The shrinking margin places a greater
burden on testing, since the testing cost is driven not only by the data rate
of the I/O but also by the required accuracy of the test hardware. The cost of
improving the equipment's edge placement accuracy and of designing
sophisticated signal traces on the load board to ensure signal integrity are
typical examples of the precision test equipment required to screen subtle
defects.
I/O loopback test [20, 60, 65, 67, 70, 81, 95] has been gaining popularity
as a cost effective alternative for high speed serial interface testing. In this
technique, the transmitter (TX) output is connected to the receiver (RX)
input, and the TX transmits a data stream to the RX, which requires no direct
connection between the interface and the external equipment. In order to
determine whether the actual signal eye meets or exceeds the eye mask
specification, either a passive device on the loopback path or design for
testability (DFT) features such as timing and voltage margining are used to
reduce the size of the signal eye margin. Despite its popularity, loopback test
has limitations such as fault masking and non-ideality of the margining
circuitry, which are described in detail in Chapter 2.
The focus of the dissertation is to develop cost effective yet accurate
test techniques for high speed serial interfaces. The limitations of loopback
tests are studied and innovative test methods to overcome the limitations are
proposed to achieve test completeness.
1.2 Contributions of the Dissertation
In this dissertation, novel test techniques are proposed to address issues
and limitations of the current loopback technique. The contributions of the
dissertation are summarized as follows.
• Development of I/O test methodologies that provide additional coverage
while not disrupting existing test methods.
Fault masking and non-ideality of margining circuits are studied and
three test methodologies are proposed to resolve the issues. The goal of
the dissertation is to develop a methodology that can resolve the known
loopback testing issues while keeping the existing loopback configuration
intact.
A pulse amplitude modulation (PAM) signaling scheme is used in many
serial interface architectures to increase the transfer rate by creating
symbols with multiple voltage levels. Analog to digital and digital
to analog converters (ADC and DAC) are used to implement a PAM
signaling scheme as a receiver and a transmitter, respectively; however,
the loopback test of the data converters suffers from fault masking.
The proposed method in Chapter 3 resolves the fault masking issues in
the loopback configuration of the DAC and ADC pair.
Timing margining in the current loopback test has a limitation in that
the margining steps are expected to be uniformly spaced. In real silicon,
however, the phase interpolator circuit which provides the timing margining
capability is not linear and is susceptible to process, voltage, and
temperature (PVT) variations. Nonlinearity of the phase interpolator can
result in a false fail or a false pass of the test, which translates to yield
loss or test escapes, respectively. Two techniques to measure the
nonlinearities of phase interpolators are proposed. Both techniques configure
the transmitter and the receiver in the loopback test mode. The first
technique utilizes random jitter injected on the loopback path to provide the
reference distribution from which the nonlinearities are extracted. While the
method provides accurate estimates of the nonlinearities, it has a limitation
in that it needs a random jitter source on the load board. The second method
does not require the random jitter source on the load board in order to
extract nonlinearity information. Instead, it calculates the nonlinearity
based on an intrinsic jitter distribution and a sliding window search
algorithm.
• Development of cost effective I/O test methodologies.
As previously mentioned, due to the concern over increasing test cost,
the cost effectiveness of a test method is an important factor when adopting
the technique. In terms of external tester resource usage, our methods
do not require external hardware to contact the high speed
I/O interfaces, since all the proposed test techniques operate in loopback
test mode configurations. Although additional hardware is needed for
the random jitter injection technique, the random jitter source is cheaper
than precision ATE hardware that operates at speed.
From the silicon design perspective, the proposed methods may require a
few wires or some multiplexing logic to enable the proposed test mode, but
they do not require major circuit modification, since the assumption is that
the loopback test mode configuration is already implemented in the silicon.
Since the proposed methods provide additional coverage over the current
loopback test configuration, they can be used in wafer level tests, where the
signal integrity of the wafer probe is worse than that of package level
tests and hence less coverage is guaranteed when testing high speed I/Os.
Early identification of defective parts in wafer level test leads to overall
test cost savings, since packaged parts are not built from defective
dies.
• Development of HVM test pattern and program that incorporate the
proposed algorithm.
The sliding window algorithm was designed to fit into a low cost HVM
tester environment where the response capture memory is limited. The
proposed nonlinearity test algorithm was implemented on the same low cost
HVM tester configuration that can be used for production tests. A pattern
and test program synchronization architecture which aligns clock timing
between various pattern generator modules was proposed and implemented in
the ATE. The method was demonstrated to provide an accurate estimation of
nonlinearity without additional capital investment in the test hardware.
System board to tester correlation was performed to validate the accuracy of
the proposed methodology. The results suggest that the proposed algorithm
provides an accurate estimation of the nonlinearity of the phase
interpolator circuitry.
1.3 Organization of the Dissertation
The rest of the dissertation is organized as follows. Chapter 2 presents
an overview of high speed serial link architectures. Signaling and clocking
schemes of the serial interface are explained. Test requirements and related
work are also discussed along with details of the limitations of previous
techniques. Based on these limitations, subsequent chapters present the
proposed test methodologies to overcome them.
Chapter 3 describes a novel technique for testing the linearity of on-chip
high speed data converters in the loopback configuration. With a loopback
setup and additional noise in the middle of the loopback path, differential
nonlinearities (DNLs) and integral nonlinearities (INLs) of analog-to-digital
converters (ADCs) and digital-to-analog converters (DACs) are extracted by
the proposed algorithm. The proposed method exploits the fact that the
loopback output code distribution is distorted by nonlinearities of the ADC
when Gaussian noise is present. From this fact, we can fully characterize the
ADC independently of the DAC characteristics. Then, from the combined
loopback response and the extracted characteristics of the ADC, the DAC test
is performed and nonlinearities of the DAC are calculated to construct full
ADC and DAC linearity profiles. Experimental results show that the proposed
algorithm can predict the INL and DNL of the data converters accurately.
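For reference, the DNL and INL figures mentioned above can be computed from a converter's code transition voltages using the standard end-point definitions. The sketch below only illustrates those definitions; it is not the extraction algorithm proposed in Chapter 3, and the function name `dnl_inl` and the sample transition levels are hypothetical.

```python
def dnl_inl(transitions):
    """Return (dnl, inl) in LSB from a list of code transition voltages.

    Uses the end-point definition: 1 LSB is the average code width
    between the first and the last transition.
    """
    if len(transitions) < 3:
        raise ValueError("need at least three transition levels")
    n_codes = len(transitions) - 1
    lsb = (transitions[-1] - transitions[0]) / n_codes
    # DNL[k]: deviation of the width of code k from the ideal 1 LSB.
    dnl = [(transitions[k + 1] - transitions[k]) / lsb - 1.0
           for k in range(n_codes)]
    # INL[k]: running sum of DNL, i.e. the deviation of transition k
    # from the straight line through the two end points.
    inl = [0.0]
    for d in dnl:
        inl.append(inl[-1] + d)
    return dnl, inl


# Hypothetical transition levels in volts (assumed values, not measured data).
levels = [0.0, 0.9, 2.1, 3.0, 4.0]
dnl, inl = dnl_inl(levels)
print([round(d, 3) for d in dnl])
print([round(i, 3) for i in inl])
```

With the end-point definition the INL returns to zero at the last transition, so gain and offset errors are excluded from the linearity figures.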
Chapter 4 explains a novel linearity test technique for the margining
circuitry. Using a random jitter injection technique that can be implemented
on the load board using a separate jitter injection source, the jitter
distribution is collected using two different sampling techniques:
undersampling and phase interpolator sampling. The difference between the
collected distributions is attributed to the nonlinearities of the phase
interpolators, and this fact is used to derive the nonlinearity of the phase
interpolator circuitry. Experimental results show that the proposed method
provides accurate estimations of the linearity of the margining circuitry.
Although the method proposed in Chapter 4 presents a novel test
methodology to characterize the margining circuit's linearity, it requires a
random jitter source on the load board to inject the random jitter used to
estimate the linearity. Chapter 5 presents a high volume manufacturing (HVM)
friendly technique to measure the linearity, which does not require the
random jitter source on the load board. Rather than using the injected random
jitter, an intrinsic jitter distribution is used to construct the profile of
the distribution. A sliding window search algorithm is developed to accurately
estimate the linearity from the collected distribution. The algorithm is
computationally simple enough to run in a low cost HVM tester environment,
where it was implemented to assess the feasibility and validity of the method.
The experimental results indicate that the method can accurately predict the
nonlinearity of the phase interpolators.
Chapter 6 concludes the dissertation and highlights potential future
research areas in high speed I/O tests.
Chapter 2
Design and Test of High Speed Serial I/Os
In this chapter, we review high speed serial interface architecture along
with the challenges for which the architecture provides solutions. An overview
of serial interface test requirements is presented and limitations are discussed.
2.1 Overview of High Speed Interface Scheme
High speed serial interfaces achieve multi-gigabit transfer rates by
resolving several technical obstacles. In this section, we review the timing
alignment and data rate considerations, which are the key aspects that enable
the high data transfer rate of high speed serial interfaces.
2.1.1 Timing Alignment Consideration
As the data transfer rate increases, the unit interval (UI) of a bit is
proportionally reduced. For example, the UI at a 10 Gbps data rate is 100 ps.
In a conventional parallel data transfer scheme, in which the data transfer rate
is increased mainly by increasing the number of pins, aligning the clock signal
with respect to the UI of the data is a significant challenge. All board
traces which correspond to the I/Os of the two communicating chips need to be
designed to match latency within a margin of half a UI when using
a single clock signal to sample the data. This requirement is very difficult
to achieve, since the data rate of the signal is in the multi-Gbps range, and
consequently the UI is in the range of several hundred picoseconds.
Figure 2.1: Forwarded Clock Scheme
In high speed serial interface schemes, the data is serialized to alleviate
the challenge of aligning the timing of multiple data signals to a single
clock signal. The transmitter of the serial interface serializes the data, and the
receiver de-serializes the received bit stream to reconstruct the data stream.
While the serializer-deserializer (SerDes) architecture overcomes the difficulty
of board level latency matching across multiple data signals, timing
alignment between a data signal and a clock is still challenging. SerDes
designs resolve this issue using one of two schemes, which are described below.
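The UI arithmetic above is easy to reproduce. The following sketch computes the unit interval and the half-UI latency matching margin for a few bit rates; the helper names `unit_interval_ps` and `skew_budget_ps` and the chosen rates are illustrative assumptions.

```python
def unit_interval_ps(bit_rate_gbps):
    """Unit interval in picoseconds for a bit rate given in Gbps."""
    if bit_rate_gbps <= 0:
        raise ValueError("bit rate must be positive")
    # 1 / (rate in Gbps) is the UI in nanoseconds; scale to picoseconds.
    return 1000.0 / bit_rate_gbps


def skew_budget_ps(bit_rate_gbps):
    """Half-UI latency matching margin for a single shared sampling clock."""
    return unit_interval_ps(bit_rate_gbps) / 2.0


for rate in (2.5, 5.0, 10.0):
    print(f"{rate:5.1f} Gbps: UI = {unit_interval_ps(rate):6.1f} ps, "
          f"half-UI margin = {skew_budget_ps(rate):5.1f} ps")
```

At 10 Gbps the half-UI margin is only 50 ps, which gives a sense of why matching the latency of many parallel board traces becomes impractical.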
2.1.1.1 Forwarded Clock
In a forwarded clock scheme, the clock signal is generated by the
transmitter and sent along with the data stream via another channel, as
shown in Figure 2.1.
This scheme ensures that the clock frequency at the receiver end is
identical to that of the transmitter, since the transmitter is the source of
the clock. When the data and the clock signals reach the receiver, however,
their alignment may not hold, since the trace lengths of the data channel and
the clock channel may not be identical. In general, the timing skew between
the data and clock signals is mitigated by a phase interpolator circuit. The
phase interpolator takes two signals of different phase and creates a signal
of intermediate phase by mixing the two. Digital control signals set the
percentage of each signal in the mixture, which provides fine delay control
based on the digital code; hence programmable delay steps of several
picoseconds can be achieved. During the power up sequence of the I/O
interface, a pre-defined training sequence is executed to align the timing and
exchange configuration parameters between the transmitter and the receiver.
After this sequence, the phase interpolator is programmed to delay the clock
signal by the amount that the training algorithm has found.
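The phase mixing described above can be illustrated numerically. The sketch below assumes an idealized interpolator that linearly mixes two sinusoidal clocks 90 degrees apart; the function `interpolated_phase_deg` and the 16-step control code are hypothetical and are not taken from the dissertation. Even in this ideal model the resulting phase steps are not uniform, which hints at why the linearity of the interpolator needs to be tested.

```python
import math


def interpolated_phase_deg(code, n_steps=16):
    """Phase (degrees) of (1-w)*cos(wt) + w*sin(wt) for w = code/n_steps.

    The weighted sum equals A*cos(wt - phi) with phi = atan2(w, 1 - w).
    """
    if not 0 <= code <= n_steps:
        raise ValueError("code out of range")
    w = code / n_steps
    return math.degrees(math.atan2(w, 1.0 - w))


n_steps = 16  # assumed number of interpolator steps per 90 degrees
phases = [interpolated_phase_deg(c, n_steps) for c in range(n_steps + 1)]
ideal_step = 90.0 / n_steps
# Step-size error relative to the ideal uniform step, in LSB.
dnl = [(phases[c + 1] - phases[c]) / ideal_step - 1.0 for c in range(n_steps)]
for c, d in enumerate(dnl):
    print(f"code {c:2d} -> {c + 1:2d}: DNL = {d:+.3f} LSB")
```

In this model the steps near the two input phases are smaller than ideal and the steps near the midpoint are larger; real circuits add process, voltage, and temperature effects on top of this.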
The forwarded clock scheme is mainly adopted in industry I/Os such
as QuickPath Interconnect™ (QPI™) [59] and Fully Buffered DIMM™ (FBD™,
FB-DIMM™) [1].
2.1.1.2 Embedded Clock
Another approach to align the data signal timing with the clock signal is
to generate the clock signal from the data bit stream. Figure 2.2 illustrates the
topology of the embedded clock scheme, which is also called a derived clock
scheme. In this architecture, the clock signal is generated at the receiver
end: clock and data recovery (CDR) circuitry recovers the clock from the
signal transitions of the data bit stream. Figure 2.3 shows a block diagram
of a typical CDR circuit. Although there are other variations of CDR designs,
such as phase interpolator based designs, the building blocks of a generic CDR
circuit are very similar to those of a phase locked loop (PLL) design.
Figure 2.2: Embedded Clock Scheme
Figure 2.3: Clock and Data Recovery Block Diagram [82]
Since sufficient data transitions are essential to avoid clock frequency
drift, an encoding scheme is used to maintain the number of transitions.
8 bit to 10 bit (8b/10b) encoding is one popular approach that allows the
clock recovery circuit to construct a proper sampling clock at the receiver
end. The recovered clock is delayed so that it aligns with the center of the
data bit's unit interval (UI), which ensures correct sampling of the data.
The embedded clock scheme is adopted in many industrial I/O architectures
such as PCI Express™ [5], XAUI™ [3], HyperTransport™ [2], etc.
2.1.2 Data Rate Consideration
In order to achieve a high data rate, serial interfaces use either
time interleaving or multi-level signaling [102]. In time interleaving
schemes, the transmitter and the receiver each contain more than one
instance of the transmit and receive circuit. When N transmitters are wired
in parallel and each of them operates at a slightly different time in a
staggered way, together they generate N times the data rate of a single
transmitter. Most high speed serial links use 2-way time interleaving, which
samples on both the rising and falling edges of the clock and effectively
halves the operating speed of the rest of the logic. A time interleaved
transmitter is illustrated in Figure 2.4. In this design, two clock signals
determine when each transmitter is enabled, which interleaves the
transmission. Depending on the design preference, this design can be
modified to use 4 clock signals and only the rising edges of the clocks.
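The staggered operation described above can be sketched as a simple merge of slower bit lanes into one serial stream. This is only a behavioral illustration; the function names and the example bit patterns are hypothetical.

```python
def interleave(lanes):
    """Merge N equal-length lanes into one serial stream.

    Lane k drives the channel during phase k of every N-phase cycle.
    """
    length = len(lanes[0])
    if any(len(lane) != length for lane in lanes):
        raise ValueError("all lanes must carry the same number of bits")
    return [lane[i] for i in range(length) for lane in lanes]

def deinterleave(stream, n):
    """Receiver side: split the serial stream back into N lanes."""
    return [stream[phase::n] for phase in range(n)]

even = [1, 0, 1, 1]  # bits launched on the rising edge
odd = [0, 0, 1, 0]   # bits launched on the falling edge
serial = interleave([even, odd])
print(serial)
```

With two lanes, each lane runs at half the serial bit rate, which is the speed reduction for the rest of the logic mentioned in the text.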
Figure 2.4: Time Interleaved Transmitter [102]
In multi-level signaling, more than two voltage levels are used rather
than the traditional two levels, high and low. Since intermediate voltage
levels are available for signaling, more than 1 bit can be associated with
each voltage level. In communication theory, this scheme is called pulse
amplitude modulation (PAM), where the amplitude of the pulse creates a
unique symbol that corresponds to multiple bits [76]. The transmitter and
the receiver circuits are designed as a digital to analog converter (DAC)
and an analog to digital converter (ADC) respectively to enable the PAM
signaling scheme.
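As an illustration of multiple bits per symbol, the sketch below maps bit pairs to four levels (PAM-4) and slices received samples back to bits. The level values, the Gray-coded bit assignment, and the function names are assumptions chosen for the example; they are not taken from any standard cited here.

```python
# Hypothetical PAM-4 mapping: two bits per symbol, Gray coded so that
# adjacent voltage levels differ in only one bit.
LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
BITS = {level: bits for bits, level in LEVELS.items()}

def pam4_encode(bits):
    """DAC side: map an even-length bit sequence to PAM-4 levels."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def pam4_decode(samples):
    """ADC side: slice each received sample to the nearest nominal level."""
    out = []
    for sample in samples:
        nearest = min(BITS, key=lambda level: abs(level - sample))
        out.extend(BITS[nearest])
    return out

symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
print(symbols)
```

Because the decision thresholds sit between closely spaced levels, any nonlinearity in the converter pair shifts the levels or thresholds and directly erodes the noise margin.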
In industrial designs, 10GBASE-T [7] and other Ethernet standards use
the PAM signaling scheme. Figure 2.5 illustrates a 10 Gbps physical
layer (PHY) block diagram which uses an ADC and DAC pair and digital signal
processing (DSP) modules to achieve multi-level signaling and equalization.
Figure 2.5: 10GBASE-T Block Diagram [7]
2.2 Test of High Speed Interface
2.2.1 BER and Jitter
The quality of the communication interface is measured by bit error
rate (BER) metric. BER is defined as follows.
\[ \mathrm{BER} = \frac{\text{Number of Received Bits in Error}}{\text{Total Number of Transmitted Bits}} \tag{2.1} \]
Modern industry standard I/O specifications require a BER of $10^{-12}$ or
lower to ensure the quality. The conventional method to test BER at various
error rates is to use external equipment such as a bit error rate tester
(BERT), in which a pattern generator, a transmitter, a receiver, and
comparison logic for error detection are implemented. When testing a
10 Gbps serial interface using the conventional method, it takes from
several minutes to a few hours to collect statistically meaningful data at
a BER of $10^{-12}$ [21], which is prohibitive in test time for high volume
manufacturing. Since jitter is one of the major contributors to the BER,
understanding the relationship between jitter and BER is important. When
the timing jitter probability density function (PDF) is given as
$f_{01}(\Delta t)$ for the 0-1 transition and $f_{10}(\Delta t)$ for the
1-0 transition, the overall BER cumulative distribution function (CDF) is
given as [63]:
\[ F(t_s) = P_{01} \int_{t_s}^{\infty} f_{01}(\Delta t)\, d\Delta t + P_{10} \int_{-\infty}^{t_s} f_{10}(\Delta t)\, d\Delta t \tag{2.2} \]
where $P_{01}$ and $P_{10}$ represent the transition densities for 0-1 and
1-0 transitions respectively. An example BER CDF with respect to the
sampling time $t_s$ on the X axis is illustrated in Figure 2.6.
Figure 2.6: BER vs. Sampling Time [87]
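Equation 2.2 can be evaluated in closed form when both edge distributions are Gaussian. The sketch below makes several assumptions that are not stated in the text: the 0-1 jitter PDF is centered on the left eye crossing at time 0, the 1-0 jitter PDF is centered on the right crossing at one UI, both have the same standard deviation, and the transition densities default to 0.25 each. The function names are hypothetical.

```python
import math

def gaussian_tail(x):
    """Upper tail probability of a standard normal, P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_cdf(ts, ui, sigma, p01=0.25, p10=0.25):
    """Equation 2.2 for Gaussian edge jitter (assumptions in the lead-in)."""
    late_left = gaussian_tail(ts / sigma)            # integral of f01 from ts to +inf
    early_right = gaussian_tail((ui - ts) / sigma)   # integral of f10 from -inf to ts
    return p01 * late_left + p10 * early_right

UI = 1.0      # sampling time expressed in unit intervals
SIGMA = 0.05  # hypothetical RMS jitter, in UI
for ts in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(ts, ber_cdf(ts, UI, SIGMA))
```

Sweeping the sampling time across the unit interval produces the familiar bathtub shape: the error probability is highest near the crossings and lowest at the eye center.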
Describing BER as a function of jitter has the advantage that total
jitter (TJ) can be further separated into various jitter components. Total
jitter can be separated into deterministic jitter (DJ) and random jitter
(RJ) components. DJ and RJ are further separated into the following
categories [63]. Figure 2.7 illustrates the decomposition hierarchy for
the various jitter components.
Figure 2.7: Jitter Decomposition Hierarchy
(TJ divides into DJ and RJ; DJ divides into PJ, BUJ, and DDJ; DDJ divides
into DCD and ISI.)
2.2.1.1 Deterministic Jitter (DJ)
Deterministic jitter can be separated into data dependent jitter (DDJ),
periodic jitter (PJ), and bounded uncorrelated jitter (BUJ). DDJ can be
further divided into duty cycle distortion (DCD) and inter symbol
interference (ISI). DCD is a special type of DDJ that appears when the data
pattern is a clock-like pattern, i.e., 1010. DDJ generally occurs due to
the loss of the signal's frequency components when the data bit stream is
transmitted through a lossy channel. The PDF for DDJ is defined as
\[ f_{DDJ}(\Delta t) = \sum_{i=1}^{N} P_i^{DDJ}\, \delta(\Delta t - D_i^{DDJ}) \tag{2.3} \]
where $P_i^{DDJ}$ is the probability of the DDJ value $D_i^{DDJ}$.
PJ, also called sinusoidal jitter (SJ), is jitter that repeats at a certain
period or frequency. The PDF for PJ is defined as
\[ f_{PJ}(\Delta t) = \frac{1}{\pi\sqrt{1 - (\Delta t/A)^2}} \tag{2.4} \]
where $A$ is the amplitude of the PJ and $-A \le \Delta t \le A$. BUJ
generally occurs due to crosstalk and, owing to the nature of its source,
it is bounded. The PDF is a truncated Gaussian which can be defined as
\begin{align}
f_{BUJ}(t) &= \frac{p_{BUJ}}{\sqrt{2\pi}\,\rho_{BUJ}}\, e^{-\frac{t^2}{2\rho_{BUJ}^2}} && \text{for } |t| \le A_{BUJ} \tag{2.5} \\
           &= 0 && \text{for } |t| > A_{BUJ} \tag{2.6}
\end{align}
where $A_{BUJ}$ represents the peak value, $\rho_{BUJ}$ is the sigma value,
and $p_{BUJ}$ is the normalization probability for the BUJ PDF.
2.2.1.2 Random Jitter (RJ)
Random jitter is caused by thermal noise, 1/f flicker noise, shot noise,
or other unbounded jitter source which can be modeled as Gaussian white
noise. Gaussian jitter PDF is defined as
\[ f_{GJ}(\Delta t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\Delta t - \mu)^2}{2\sigma^2}} \tag{2.7} \]
where $\mu$ represents the mean of the Gaussian distribution and $\sigma$
represents its standard deviation.
Due to the RJ component, the total jitter is unbounded, so bit errors
occur with nonzero probability and the BER stays greater than 0. Since
the RJ component is responsible for the unbounded nature of the TJ, by
separating the RJ and the DJ components from the TJ, we can extrapolate the
TJ at a certain BER level. The tailfit algorithm is a popular method to
decompose the jitter components based on the Gaussian tail of the TJ
distribution [63]. Once the DJ and RJ components are determined, the TJ at
a certain BER can be written as
\[ TJ(BER) = DJ + 2\,Q(BER)\,\sigma_{RJ} \tag{2.8} \]
where $\sigma_{RJ}$ is the standard deviation of the RJ. The Q-factor values
are given in Table 2.1.

Table 2.1: Q-factor with Respect to BER
BER       Q(BER)
10^-7     5.199
10^-8     5.612
10^-9     5.998
10^-10    6.361
10^-11    6.706
10^-12    7.035
10^-13    7.349
10^-14    7.651
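Equation 2.8 is straightforward to evaluate in software. The sketch below assumes that Q(BER) is the standard normal quantile whose single-sided tail probability equals the BER; under that assumption the computed values agree with Table 2.1 to within rounding. The function names and the example DJ and RJ values are hypothetical.

```python
from statistics import NormalDist

def q_factor(ber):
    """Q(BER): standard normal quantile with single-sided tail probability BER."""
    return -NormalDist().inv_cdf(ber)

def total_jitter(dj, sigma_rj, ber=1e-12):
    """Equation 2.8: TJ(BER) = DJ + 2 * Q(BER) * sigma_RJ."""
    return dj + 2.0 * q_factor(ber) * sigma_rj

for exponent in range(7, 15):
    print(f"1e-{exponent}", round(q_factor(10.0 ** -exponent), 3))

# Hypothetical example: DJ = 10 ps peak-to-peak, RJ sigma = 1 ps.
print(round(total_jitter(10.0, 1.0), 2))
```

The factor of two in Equation 2.8 accounts for the Gaussian tails on both sides of the eye, so each picosecond of RJ sigma costs roughly 14 ps of total jitter at a BER of 10^-12.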
Modern BERT systems support a timing scan function in which the edge
placement of the clock can be varied to provide multiple data points to
create a BER curve. Some of them also support a built-in function to
extrapolate the obtained BER curve. This can be viewed as a method similar
to jitter decomposition: since random jitter is dominant at lower BER
levels, the extrapolation is more accurate if the BER is measured at lower
BER levels. Correct selection of the extrapolation point is essential, since
extrapolation at an incorrect point could result in over- or underestimation
of the BER at the $10^{-12}$ level. An example of incorrect extrapolation
due to this problem is illustrated in Figure 2.8. In this figure,
extrapolation at a higher BER level, denoted as $\mu'_L$ and $\mu'_R$,
caused an overestimation of RJ as compared to the real RJ, $\mu_L$ and
$\mu_R$.
Figure 2.8: Incorrect Extrapolation Example [87]
2.2.2 Built in Self Test (BIST) of High Speed Interface
In order to resolve the test cost issue, built in self test (BIST)
methods have become an attractive solution. BIST methods enable testing of
devices using on-chip test circuitry. Without relying on costly automated
test equipment (ATE), BIST methods provide effective solutions to test high
speed serial links [22, 70, 81, 90]. The authors of [22, 90] use on-chip
circuitry such as flip-flops and vernier delay lines to characterize the
jitter. Although these methods can provide jitter measurements without
using ATE, they cannot enable at-speed functional testing of the digital
logic in the serial links when used alone. Therefore, a way to enable
at-speed testing of the interface without depending on the ATE is needed.
2.2.3 Loopback Test
Loopback based testing schemes [20, 60, 65, 67, 70, 81, 95], on the other
hand, have been gaining popularity since they provide a way to exercise the
interface without depending on the ATE. In a loopback test configuration,
both the jitter tolerance and the logic in the physical layer of the high
speed links can be tested at speed.
Figure 2.9 illustrates a typical loopback configuration. In the loopback
scheme, the output node of the transmitter is connected to the input node of
the receiver so that transmitted data can be easily compared with received
data on the same device. The transmitter is driving the receiver at speed to
screen any delay defects on the serial links. With the loopback scheme, it
is required to determine whether the actual signal eye meets or exceeds the
eye mask specification. There are various techniques to achieve the margining
capability. One solution is to use external jitter injection filters in the loopback
channel to margin the timings and voltages of the data eye [20]. The other
one is to implement a design for testability (DFT) feature by reusing existing
circuitry in high speed links to enable the margining capability [67, 70]. Since
it is difficult to inject an exact amount of jitter using the external filters, the
reuse of the existing circuitry is a preferred way to implement the margining
capability. Although there has been some success on controlling the injected
jitter amount [65], the loopback with the DFT based eye margining approach
has been widely adopted in many industrial high speed I/O tests, because it
is rather simple and easy to implement in existing high speed I/O schemes.
2.2.4 On-chip Timing Margining Implementation
The timing margining concept is described in Figure 2.10. In general,
the clock signal is placed at the center of the data eye to ensure proper
data latching with low BER. The on-chip margining capability makes it
possible to move the clock placement by a desired amount. Since the clock
and data recovery (CDR) architecture determines how the clock is aligned
with the data, the on-chip margining implementation takes the CDR
architecture as a baseline and adds circuits for controllability. This
approach minimizes the area overhead associated with the timing margining
implementation.
On-chip timing margining capability provides a similar function as the
timing scan in a BERT where the BER is measured at certain locations of the
clock edge to obtain BER curve. By assessing BER at certain timing location,
or simply determining pass/fail at the timing location with pre-determined
guardband, we can achieve low cost HVM test of the high speed I/Os without
requiring expensive external equipment.
Figure 2.10: On-chip Timing Margining Concept [70]
2.3 Limitations of DFT Based Loopback Test and Related Work
Despite its popularity, the DFT based loopback scheme has some draw-
backs. In this section, two major issues of the loopback testing are discussed,
and prior work on each issue is presented.
2.3.1 Fault Masking Issue
Unlike testing with precision equipment, where either the signal source
or the response analyzer is guaranteed to be accurate, neither the
transmitter nor the receiver of a device under test is guaranteed to be
accurate. In other words, the performance of the transmitter and the
receiver may vary, hence the signal generated by an outperforming
transmitter could be received by an underperforming receiver, or vice versa,
which may create a combined response with passing results. This compensation
effect is called fault masking in the loopback configuration, and it could
result in a false pass in a go/no-go production test environment, which
translates into a test escape. Figure 2.11 illustrates simulated loopback
responses used to examine the distribution of the fault masking issue. In
this Monte-Carlo simulation example, 2200 ensembles were generated with
statistically induced errors, and 8% of the distribution indicates either
false fail or false pass cases. 6.5% of the distribution shows fault
masking, which is a significant portion of the distribution.
Figure 2.11: Loopback vs. Actual Pass/Fail Result Analysis [84]
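The compensation effect can be illustrated with a toy Monte-Carlo model. Everything in this sketch is an assumption made for illustration and is unrelated to the simulation in [84]: each device has a transmitter margin and a receiver margin relative to its own specification, the device is truly good only if both margins are non-negative, and the loopback test passes whenever the sum of the two margins is non-negative.

```python
import random

def classify(tx_margin, rx_margin):
    """Compare the loopback verdict with the true per-block verdict."""
    actual_pass = tx_margin >= 0 and rx_margin >= 0
    loopback_pass = tx_margin + rx_margin >= 0  # combined response only
    if loopback_pass and not actual_pass:
        return "masked"  # a strong block hides a weak one: test escape
    return "agree"

def simulate(n, mean=2.0, sigma=1.0, seed=0):
    """Draw n devices with Gaussian margins (hypothetical distribution)."""
    rng = random.Random(seed)
    counts = {"agree": 0, "masked": 0}
    for _ in range(n):
        counts[classify(rng.gauss(mean, sigma), rng.gauss(mean, sigma))] += 1
    return counts

print(simulate(2200))
```

In this simplified model a masked device always contains exactly one out-of-spec block whose deficit is covered by the other block, which is the situation the loopback test alone cannot detect.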
The fault masking issue becomes more challenging when pulse ampli-
tude modulation (PAM) scheme is used for high speed serial links [32, 106]; this
architecture uses high speed analog to digital converters (ADCs) and digital
to analog converters (DACs) for the interfaces to perform multi-level signaling
and equalization. In the PAM architecture, since the linearity of data convert-
ers determines the bit error rate (BER) of the link, linearity characterization
without the fault masking effect is very important when testing data converter
pairs in the loopback mode.
Although many methods have been proposed to test only one type of data
converter, either ADC or DAC [12, 15, 80, 98], they use extra logic to test
the converter. They may therefore be undesirable for high speed I/O designs
in which both types of data converters are present, since the area overhead
doubles. Schemes that test both converters have been proposed as well
[13, 47, 96]. Compared to the methods above, these are optimized in terms
of area for testing both converters. However, they still incur some area
overhead since they also need extra hardware for the BIST implementation.
In terms of test time, they test each converter sequentially, and thus test
time doubles when both converters are tested. Moreover, some of the
techniques [47, 98, 103] exploit certain circuit blocks in the ADC or DAC to
achieve BIST without significant area overhead; however, the availability
of those specific functional blocks may limit the general application of
these methods.
Several previous papers address testing data converters in a loopback
configuration. In [108], delta-sigma data converters in the loopback mode
are used as a case study to separate the ADC and DAC characteristics.
However, the method is limited to delta-sigma type data converters, so it
cannot be easily generalized. Shin et al. [84] propose a loopback
characteristic separation methodology based on the loopback response of the
data converters when the loopback path has an analog filter. With the analog
filter on the loopback path, the response from the DAC is attenuated, and
the attenuated response is then converted to digital code by the ADC. From
the difference of the loopback responses, the characteristics of the ADC
and the DAC are extracted. Park et al. [75] propose a parallel test method
to separate the ADC's and DAC's characteristics using an analog summer and
an RMS detector. The aforementioned approaches focus on dynamic
specifications of the data converters in the loopback configuration, which
may be less important when testing data converters used for multi-level
signaling drivers and receivers.
For static specifications such as nonlinearity, Yu et al. [109] propose a
statistical method based on noise characterization to calculate
nonlinearities of data converters in the loopback test mode. However, this
method cannot separate the characteristics of each converter without
monitoring an internal node, which is difficult in today's system on chip
(SOC) development practices, where I/O designs are delivered as hard IP
(Intellectual Property) blocks. Shin et al. [85] propose using a digital
equalizer to calibrate the DAC prior to the loopback mode test to separate
the data converter characteristics, which depends on the availability of
such an equalizer for the data converters. Because the equalization
procedure is a prerequisite, the two-step test sequence may require longer
test time. Our proposed method to resolve the fault masking issue in a
loopback configuration without the aforementioned dependencies is presented
in Chapter 3.
2.3.2 Margining Circuitry Linearity Issue
In general, when the timing margining capability in loopback test is
implemented by reusing internal circuitry, it relies on phase interpolator
(PI) circuitry. Table 2.2 summarizes various implementations of on-chip
timing margining capability for high speed I/Os. Although one design adopts
oversampling based clock and data recovery (CDR) circuits, which leads to a
TX phase select implementation of timing margining, most of the designs use
PI based CDR, hence the implementation of the timing margining capability
is based on the PI circuits.

Table 2.2: Various Timing Margining Implementations for High Speed I/O Designs [70]
Interface     CDR Type and Methods                Range   Resolution
S-ATA         Over-sampled; TX phase select       2 UI    1/8 UI
PCI Express   PI based; Supplemental or offset    2 UI    1/32 UI
DMI           PI based; Supplemental              2 UI    1/32 UI
FBD           PI based; Offset                    2 UI    1/32 UI
The phase interpolators are used to margin the timing of the data eye
to identify the total jitter for a given set of data patterns and to screen
defective parts if the jitter exceeds the amount allowed in the
specification. As another application of the PI, Casper et al. [68]
implement an on-die oscilloscope to measure the timing aspects of the
signal, where the PI is used to scan the signal boundary with respect to
timing. In both cases, the validity of the measured data depends on the
linearity of the phase interpolator. However, in real manufacturing,
process, voltage, and temperature (PVT) variation significantly affects the
linearity of the PI in each die. The PVT impact becomes more severe in
today's highly advanced process technologies since the variation tends to
increase as device size shrinks; therefore, it is necessary to find a way
to test the linearity of the phase interpolator itself in a cost effective
manner.
Conventional methods for testing the linearity of typical analog mixers
may include direct measurement of phase relationships for various phase
configurations. However, this approach is difficult to apply to PIs in high
speed I/O applications since the resolution of the interpolated phases is
on the order of several picoseconds, and measuring such subtle differences
is a significant challenge. This challenge becomes more pronounced when
using external equipment such as ATE, due to signal integrity and loading
effect issues at high speeds. To relieve this issue, linearity can be
measured at lower speeds instead. However, at-speed measurement is becoming
more important, since linearity measured at speed differs from linearity
measured at lower speeds.
There is some previous work regarding linearity test techniques for the
PI. Provost [77] proposes a PI linearity measurement technique that requires
an additional phase interpolator to determine whether each PI satisfies its
linearity specifications. Shi et al. [83] introduce self test circuitry
which is composed of a phase detector, a phase-difference-to-voltage
converter, an analog-to-digital converter (ADC), and control logic. While
it is possible to measure the linearity of PIs using these techniques, both
require a large amount of on-chip real estate compared to that of the PI,
which raises a yield concern since the probability of defects in the test
logic grows with the size of the logic. Our proposed methods to resolve this
issue are presented in Chapters 4 and 5.
Chapter 3
Efficient ADC and DAC Loopback Test
A pulse amplitude modulation (PAM) signaling scheme is used in many
serial interface architectures to increase the transfer rate by creating
symbols with more voltage levels. Analog to digital and digital to analog
converters (ADC and DAC) are used to implement the PAM signaling scheme;
however, loopback test of the data converters suffers from the fault
masking issue.
In this chapter, we propose a new methodology which provides complete
linearity characterization with a proposed loopback mode setup. It exploits
a loopback scheme of DAC and ADC with added Gaussian noise, which keeps the
implementation simple.
This chapter is organized as follows. Section 3.1 reviews the definitions
of converter errors. Section 3.2 explains the proposed methodology.
Simulation results are presented in Section 3.3 and a comparison with prior
work is presented in Section 3.4. Section 3.5 presents other factors to
consider when applying the proposed method. Section 3.6 summarizes the
chapter.
3.1 Review of Converter Linearity Errors
Among all the DC characteristics of converters, the linearity test
consumes the largest portion of test time since it requires testing all
codes with a large number of samples. Various definitions of differential
nonlinearity (DNL) and integral nonlinearity (INL) have been introduced
[19]; this chapter uses the most common definition of data converter
linearity errors.
For N code converters, the ith endpoint DNLs of the DAC and the ADC are
defined as follows.
\[ DNL_{DAC}(i) = \frac{V(i+1) - V(i)}{V_{LSB_{DAC}}} - 1 \tag{3.1} \]
where
\[ V_{LSB_{DAC}} = \frac{\sum_{i=1}^{N-1} \left( V(i+1) - V(i) \right)}{N} \]
and $V(i)$ is the ith output voltage level.
\[ DNL_{ADC}(i) = \frac{CT(i+1) - CT(i)}{V_{LSB_{ADC}}} - 1 \tag{3.2} \]
where
\[ V_{LSB_{ADC}} = \frac{\sum_{i=1}^{N-1} \left( CT(i+1) - CT(i) \right)}{N} \]
and $CT(i)$ is the ith code transition voltage. Since the DNLs defined here
are endpoint DNLs, integrating them yields endpoint INLs. For N code
converters, the ith endpoint INLs of the DAC and the ADC are defined as
follows.
\[ INL_{EP}(i) = \sum_{k=1}^{i} DNL(k) \]
From the endpoint INLs, we can derive best-fit INLs by
\[ INL_{BF}(i) = INL_{EP}(i) - \frac{\max(INL_{EP}) + \min(INL_{EP})}{2} \tag{3.3} \]
The max() and min() functions yield the maximum and minimum values among
all $INL_{EP}$ values respectively. All DNLs and INLs are in units of LSB.
We selected the definitions of endpoint DNLs and best-fit INLs for the
evaluation of our proposed algorithm.
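The definitions above translate directly into a few lines of code. In this sketch the LSB is taken as the mean of the steps that are actually summed, so that an ideal converter yields zero DNL; this normalization is an assumption about the indexing convention of Equations 3.1 and 3.2. The function names and the example levels are hypothetical.

```python
def endpoint_dnl(levels):
    """Endpoint DNL in LSB from DAC output levels or ADC transition voltages."""
    steps = [b - a for a, b in zip(levels, levels[1:])]
    lsb = sum(steps) / len(steps)  # mean step size (assumed normalization)
    return [step / lsb - 1.0 for step in steps]

def endpoint_inl(dnl):
    """Endpoint INL: running sum of the DNLs."""
    inl, total = [], 0.0
    for d in dnl:
        total += d
        inl.append(total)
    return inl

def best_fit_inl(inl_ep):
    """Equation 3.3: center the endpoint INL between its extreme values."""
    offset = (max(inl_ep) + min(inl_ep)) / 2.0
    return [value - offset for value in inl_ep]

levels = [0.0, 1.0, 2.5, 3.0]  # hypothetical output levels in volts
dnl = endpoint_dnl(levels)
inl_ep = endpoint_inl(dnl)
print(dnl, inl_ep, best_fit_inl(inl_ep))
```

The same functions apply to an ADC when the list holds code transition voltages instead of output levels.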
3.2 Proposed Technique
Figure 3.1 illustrates the test procedure of the proposed method. First,
inherent or deliberately created Gaussian random noise is injected in the
middle of the loopback path. Next, a modified linear histogram test,
explained in the following subsections, is performed. We supply a slowly
increasing finite resolution ramp generated by the DAC to the input of the
ADC. Unlike the traditional linear histogram testing method [66], we record
the number of code occurrences in a matrix H, for each ADC output code
and each DAC voltage level. Compared to earlier data converter test methods,
in which data from the converters are collected sequentially, our method
collects data for both converters simultaneously, which is an advantage in
terms of test time. From the data collected in matrix H, we can calculate
the ADC's DNLs and INLs. Then, we can calculate the DAC's DNLs and INLs
based on both the collected data and the calculated ADC characteristics.
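The data collection step can be sketched as a behavioral simulation. The converter models here are assumptions: the DAC is a list of output voltages, the ADC is a sorted list of code transition voltages, and the noise is ideal zero-mean Gaussian. The function name, parameter names, and example values are hypothetical.

```python
import bisect
import random

def collect_histogram(dac_levels, adc_transitions, sigma, samples_per_code, seed=0):
    """Return H, where H[k][l] counts ADC code l observed for DAC code k."""
    rng = random.Random(seed)
    n_adc_codes = len(adc_transitions) + 1
    hist = [[0] * n_adc_codes for _ in dac_levels]
    for k, v_k in enumerate(dac_levels):       # step through the DAC staircase
        for _ in range(samples_per_code):
            noisy = v_k + rng.gauss(0.0, sigma)  # noise injected in the loopback path
            # ADC model: code = number of transition voltages at or below the input
            code = bisect.bisect_right(adc_transitions, noisy)
            hist[k][code] += 1
    return hist

# Hypothetical 3-level DAC and 3-code ADC with very small noise.
H = collect_histogram([0.5, 1.5, 2.5], [1.0, 2.0], sigma=0.01, samples_per_code=100)
print(H)
```

Each row of the matrix is the code occurrence histogram for one DAC level, so one pass through the staircase gathers the data used for both converters.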
Figure 3.1: Test Procedure
(Flow: inject Gaussian noise in the loopback path; transmit a certain
number of samples per code of the DAC; create the code occurrence histogram
H from the loopback response; calculate ADC nonlinearities; calculate DAC
nonlinearities.)
Figure 3.2: Proposed Loopback Setup
(Digital input, DAC, noise added on the loopback path, ADC, digital output.)
3.2.1 Loopback Configuration
Figure 3.2 shows the proposed loopback setup. In this setup, the DAC
output is connected to the ADC input so that both the input and the output
of the setup are digital and can be handled by CPU or DSP units. This
loopback setup has several advantages. First, it allows us to test the ADC
and the DAC simultaneously, which reduces test time. Second, no additional
hardware is needed to achieve BIST. Third, since the test algorithm can be
run as software, there is more flexibility. Finally, it is an appropriate
approach for routine monitoring during operation. Despite those prominent
advantages, nonlinearity extraction for each converter is difficult due to
fault masking. Recently, however, it has been shown that the linearity of
ADCs can be fully tested with a finite resolution ramp [69]. According to
that paper, given an appropriate amount of noise at the ADC input, an
imperfect ramp, i.e., the staircase output of a DAC, can be used to
calculate accurate code transition voltages of the ADC. Based on this fact,
we add random noise in the middle of the loopback path. The noise can be
either inherent or deliberately added. Thermal noise and noise from other
circuitry are major sources of inherent noise. Dithering is a common
technique that improves an ADC's linearity by adding noise [101].
Figure 3.3: Loopback Conversion Process
(Panels: the DAC, f, converts code C_k to voltage V_k; Gaussian noise is
added to form the Gaussian input signal to the ADC, g; the code occurrence
histogram over ADC codes C'_k(l-2) to C'_k(l+3) is the area divided by
unequally sized bins.)
Although inherent noise from circuits may be enough for specific
applications of the proposed method, precise Gaussian noise may be required
in other applications where test precision is an important factor. In this
case, we can implement Gaussian random noise injection circuitry to meet
the requirement. Many techniques have been proposed to implement Gaussian
noise injection circuits [10, 39]. One technique utilizes pseudo-random
sequence generation capability of LFSR (Linear Feedback Shift Register) [10].
In this technique, LFSR output is connected to a low pass filter through a level
shifter to generate analog random noise based on the pseudo-random sequence
from LFSR as shown in Figure 3.4. The drawback of this method is that the
generated distribution is not strictly Gaussian. Also the mean and the variance
of the distribution are not well controlled by users when using this method.
Another technique is to use thermal noise from resistors and to amplify it to
generate random noise [39]. Figure 3.5 illustrates the topology of the thermal
noise amplification method. This method produces an accurate Gaussian noise
distribution; however it may produce different results due to process variation
of the circuitry. Our method does not require a tightly controlled Gaussian
noise source as shown in the simulation result section. However, the user may
require a parameter control capability on the generated noise for various rea-
sons such as additional debug capability on the signal sources. In such cases,
delta-sigma modulation based Gaussian noise shaping proposed in [10] can be
used to implement such a noise source with a full parameter control capability.
Figure 3.6 illustrates the block diagrams of the noise generator architecture.
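The LFSR based approach can be sketched in software. This is not the circuit of [10]: the 16-bit register, the tap positions, and the moving-average filter that stands in for the analog low pass filter are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def lfsr16(seed=0xACE1):
    """16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (seed must be nonzero)."""
    state = seed
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state & 1

def lfsr_noise(n, window=32, seed=0xACE1):
    """Level-shift LFSR bits to +/-1 and smooth them with a moving average."""
    bits = lfsr16(seed)
    history = [2 * next(bits) - 1 for _ in range(window)]
    out = []
    for _ in range(n):
        out.append(sum(history) / window)   # crude stand-in for a low pass filter
        history.pop(0)
        history.append(2 * next(bits) - 1)  # level shifter: 0/1 -> -1/+1
    return out

noise = lfsr_noise(1000)
print(min(noise), max(noise))
```

The filtered output takes only a finite set of values within a bounded range, which is consistent with the remark above that the generated distribution is not strictly Gaussian.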
Figure 3.4: LFSR Based Random Noise Generator [10]
Figure 3.5: Random Noise Generator Based on Thermal Noise Amplification [39]
Figure 3.6: Random Noise Generator Using Delta-Sigma Modulation [10]
The loopback conversion process is explained in Figure 3.3. Suppose
that we have the kth digital code, $C_k$. This code is converted to an
analog voltage, $V_k$, by the DAC. Since the DAC's characteristics uniquely
determine $V_k$ for $C_k$, if the DAC has nonlinearities, $V_k$ is slightly
different from the output of an ideal DAC. Gaussian noise is added before
the signal enters the ADC. With the assumption that the noise is Gaussian
with zero mean and variance $\sigma^2$, the input signal of the ADC has its
mean at the DAC's output voltage and a variance of $\sigma^2$. The ADC
divides the area under the Gaussian probability density function (PDF) of
the signal at its code transition voltages. If the ADC is completely linear,
all code transition voltages are equally spaced, and thus it produces a
Gaussian shaped code occurrence histogram that has its mean at the center
code, $C'_{kk}$. However, because of nonlinearities, the ADC divides the PDF
unequally, which results in a code occurrence histogram that does not
resemble a Gaussian distribution. Figure 3.7 illustrates the effect of
Gaussian noise injected in the middle of the loopback path. In an ideal DAC
and ADC loopback pair, the Gaussian distributions are equally spaced, which
is attributed to the linearity of the DAC. The linear ADC produces Gaussian
code occurrence since the code transition voltages, which determine the size
of the code bins, are also equally spaced. However, in a non-ideal loopback
pair, the nonlinearity of the DAC may position the means of the Gaussian
PDFs non-uniformly. Also, the nonlinearity of the ADC may produce a
non-trivial code occurrence histogram due to non-uniform code transition
voltage locations.
Suppose that f and g represent code conversion functions of the DAC
and the ADC respectively. A digital code $C_{in}$ is converted to analog by
$f(C_{in})$. Because the noise (E) is added in the middle, the digital
loopback output is represented by $C_{out} = g(E + f(C_{in}))$. Nonlinear
functions can be expanded by Taylor series, hence
\[ C_{out} = \sum_{k=0}^{\infty} \frac{\{f(C_{in})\}^{k}\, g^{(k)}(E)}{k!} \]
Since $f(C_{in})$ is constant and E is converted only by $g^{(k)}$, loopback
output code occurrence histogram is determined by the $g^{(k)}(E)$ term,
which means that it is affected only by the ADC. Thus, the distorted code
occurrence histogram is due only to nonlinearities of the ADC, not those of
the DAC.
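A small numerical check can make this conclusion concrete under idealized assumptions: exact Gaussian noise with known sigma and exact bin probabilities instead of sampled counts. The sketch computes the area of the Gaussian input PDF in each ADC code bin for two different DAC output levels, then recovers normalized code widths from those areas with the inverse Gaussian CDF. The function names and the example transition voltages are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # zero mean, unit variance

def code_probabilities(v_dac, transitions, sigma):
    """Area of the Gaussian input PDF that falls in each ADC code bin."""
    cdf = [STD.cdf((t - v_dac) / sigma) for t in transitions]
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

def normalized_code_widths(probabilities):
    """Recover (T[l+1] - T[l]) / sigma from the bin areas alone."""
    cumulative, acc = [], 0.0
    for p in probabilities[:-1]:
        acc += p
        cumulative.append(acc)
    z = [STD.inv_cdf(a) for a in cumulative]  # transition voltages relative to the mean
    return [b - a for a, b in zip(z, z[1:])]

transitions = [0.9, 2.1, 2.9, 4.2]  # hypothetical non-uniform ADC transitions
for v_dac in (2.4, 2.6):            # two different (possibly nonlinear) DAC levels
    widths = normalized_code_widths(code_probabilities(v_dac, transitions, sigma=1.0))
    print(v_dac, [round(w, 6) for w in widths])
```

The DAC level shifts where the histogram is centered, but it cancels when adjacent normalized transition voltages are subtracted, so the recovered code widths reflect only the ADC.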
3.2.2 ADC Test
Figure 3.8 describes the ADC test procedure. Suppose that H is a
matrix that represents the code occurrence histogram. The dimension of H is
m by n where m is the number of codes of DAC and n is the number of voltage
levels of ADC. Each element in H represents the number of code occurrences
for each ADC code and DAC voltage level. The code occurrence histogram
matrix, H is converted to the cumulative distribution function matrix A by
akl =
∑lj=1 hkj
NS
where akl and hkl are the (k,l) elements of A and H respectively, and NS is the
number of samples. Using Gaussian cumulative distribution function (CDF)
that has zero mean and variance of 1, the ADC’s normalized code transition
voltage with respect to zero mean Gaussian can be calculated without
dependency on DNLs of the DAC.

[Figure 3.8: ADC Test Procedure. The code occurrence histogram is converted to
a cumulative distribution and projected onto the Gaussian CDF, relative to its
mean, to form the code transition voltage information matrix CTr. Then (1) the
differences between the row elements are obtained and (2) the column elements
are averaged, which produces the DNLs of the ADC.]

With zero mean and variance of 1 Gaussian CDF,
$$P(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^{2}}{2}}\, dt,$$
for akl which is neither 0 nor 1, the ADC code transition voltage information
matrix, CTr is derived by
$$ctr_{kl} = P^{-1}(a_{kl}) \qquad (3.4)$$
where ctrkl is the (k,l) element of m by n matrix CTr. The other elements
for which Equation 3.4 is not calculated are filled with zeros. Note that the
transition voltage ratio is preserved even though the calculated value for each
transition voltage is different from the original due to the normalization. Each
row of CTr has the deviation of transition voltages from the mean of Gaussian
input signal of the ADC. Excessive calculation due to the inverse Gaussian
CDF calculation can be reduced by tabulating x and y values of the function
at the cost of memory space. In order to obtain estimated code widths, we
subtract adjacent transition voltages by
$$ecw_{kl} = ctr_{k(l+1)} - ctr_{kl}$$
where ecw_kl is the (k,l) element of estimated code width matrix ECW. The
kth row of ECW has the code widths around the center code C'_kl. As the code
span of the code occurrence histogram becomes larger, we have more code
width information around the center code from a single row. Since center
codes increase one by one as we move row by row, we may have some slightly
different code width values in adjacent rows for an identical code width.

[Figure 3.9: Estimated DAC Output Points from Each ADC Code Transition Voltage.
The kth row element locations of (CTL − CTr) are shown against the code
transition voltage locations of the ADC; those are averaged to yield the kth
DAC's voltage level.]

Code width information is less accurate as it deviates more from C'_kl. In order to
obtain accurate code widths from this ECW , we average the nonzero elements
in each column of ECW by
$$CW(i) = \frac{\sum_{k=1}^{n} ecw_{ki}}{N_{1i}} \qquad (3.5)$$
where N1i is the number of nonzero elements in ith column of ECW . Because
CW (i) = CT (i+1)−CT (i), we can eventually derive DNLs of the ADC from
the Equations 3.2 and 3.5. INLs of the ADC also can be derived by Equation
3.3.
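The ADC steps above (histogram, CDF matrix A, Equation 3.4, adjacent differences, Equation 3.5) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the 6-code ADC, the DAC levels, the noise σ, the cut-off `eps`, and the DNL normalization by the average code width are all hypothetical choices, and cut-off entries are marked with `None` instead of zeros. The histogram is filled with expected counts so that the example is deterministic.

```python
from statistics import NormalDist

STD = NormalDist()  # zero mean, unit variance, as used in Equation 3.4

def transition_matrix(hist, n_samples, eps=1e-6):
    """hist[k][l]: occurrences of ADC code l while DAC code k is applied.
    Returns CTr, with None where a_kl is (nearly) 0 or 1."""
    ctr = []
    for row in hist:
        running, ctr_row = 0.0, []
        for count in row:
            running += count
            a = running / n_samples          # a_kl, element of the CDF matrix A
            ctr_row.append(STD.inv_cdf(a) if eps < a < 1.0 - eps else None)
        ctr.append(ctr_row)
    return ctr

def code_widths(ctr):
    """Column averages of ecw_kl = ctr_k(l+1) - ctr_kl (Equation 3.5)."""
    widths = []
    for l in range(len(ctr[0]) - 1):
        ecw = [r[l + 1] - r[l] for r in ctr
               if r[l] is not None and r[l + 1] is not None]
        widths.append(sum(ecw) / len(ecw) if ecw else None)
    return widths

# Hypothetical 6-code ADC: upper transition voltage of codes 0..4 (arbitrary units).
transitions = [1.0, 2.2, 3.0, 4.1, 5.0]
dac_outputs = [0.5, 1.6, 2.4, 3.6, 4.5, 5.5]   # hypothetical DAC output levels
sigma, n_samples = 1.0, 50_000

# Expected histogram (no sampling randomness): N_S times each bin probability.
hist = []
for v in dac_outputs:
    cdf = [NormalDist(v, sigma).cdf(t) for t in transitions] + [1.0]
    hist.append([n_samples * (hi - lo) for lo, hi in zip([0.0] + cdf[:-1], cdf)])

ctr = transition_matrix(hist, n_samples)
widths = code_widths(ctr)                      # in units of sigma
valid = [w for w in widths if w is not None]   # the last column has no data
mean_width = sum(valid) / len(valid)
dnl = [w / mean_width - 1.0 for w in valid]    # assumed DNL normalization
print([round(w, 3) for w in valid], [round(d, 3) for d in dnl])
```

In this toy case the recovered widths match the spacing of the assumed transition voltages, independent of where the DAC levels sit, which is the property the derivation relies on.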
3.2.3 DAC Test
From the known code widths of the ADC, we can locate the ADC’s
normalized code transition voltages (except for the first and the last code
transition voltage) by
$$CTLV(i) = \sum_{k=0}^{i-1} CW(k)$$
where CTLV (i) is a row vector of ith code transition voltage location and
CW (0) = 0. As mentioned earlier, CTr from the Equation 3.4 has the de-
viations of transition voltages from the mean of Gaussian input signal of the
ADC. Because we already know accurate code transition voltage locations,
we can exploit this information as an estimation of mean location from each
code transition voltage of the ADC. CTLV is expanded to the same dimension
matrix as CTr by
$$CTL = Ones(2^{N}) \times CTLV$$
where Ones(2^N) is a column vector of all 1s whose length is 2^N. After filling
the elements of CTL with zero at the same location as CTr’s zero elements,
each element of (CTL − CTr) indicates an estimated mean location of the
ADC Gaussian input signal. Figure 3.9 describes the estimated DAC voltage
points before average. The mean of the ADC Gaussian input signal is the
output of the DAC before adding the Gaussian noise. Therefore we can derive
normalized voltage level for each DAC output by averaging the row elements
of (CTL − CTr) by
$$V(i) = \frac{\sum_{k=1}^{n} (ctl - ctr)_{ik}}{N_{2i}} \qquad (3.6)$$
where V (i) is the ith voltage level of the DAC and N2i is the number of
nonzero elements in ith row of (CTL − CTr). As mentioned earlier, not the
real voltage values but the ratios of voltage level are preserved, thus we can
derive DNLs of the DAC by Equations 3.1 and 3.6. INLs of the DAC also can
be derived by Equation 3.3.
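The DAC steps above can be sketched in the same spirit. The snippet below is a hedged illustration, not the author's code: `estimate_dac_levels` is a hypothetical helper, the CTr matrix is built directly from assumed transition voltages and DAC levels (entries further than 3σ from the mean are cut off and marked `None`), and the DNL normalization by the average step is an assumption.

```python
def estimate_dac_levels(ctr, widths):
    """ctr[k][l]: normalized transition l seen from DAC code k (None if cut off).
    widths[l]: normalized ADC code width between transitions l and l+1."""
    ctlv = [0.0]                          # CTLV, with the first transition as origin
    for w in widths:
        ctlv.append(ctlv[-1] + w)
    levels = []
    for row in ctr:
        # Each valid element of (CTL - CTr) is one estimate of the Gaussian mean.
        est = [ctlv[l] - c for l, c in enumerate(row) if c is not None]
        levels.append(sum(est) / len(est))    # Equation 3.6
    return levels

# Hypothetical inputs, as they would come out of the ADC step (sigma = 1).
transitions = [1.0, 2.2, 3.0, 4.1, 5.0]
true_levels = [0.5, 1.6, 2.4, 3.6, 4.5, 5.5]
ctr = [[(t - v) if abs(t - v) <= 3.0 else None for t in transitions]
       for v in true_levels]
widths = [1.2, 0.8, 1.1, 0.9]

levels = estimate_dac_levels(ctr, widths)     # relative to the first transition
steps = [b - a for a, b in zip(levels, levels[1:])]
mean_step = sum(steps) / len(steps)
dac_dnl = [s / mean_step - 1.0 for s in steps]    # assumed DNL normalization
print([round(v, 3) for v in levels], [round(d, 3) for d in dac_dnl])
```

The estimated levels are offset from the true ones by the location of the first ADC transition, which is consistent with the remark that only the ratios of the voltage levels are preserved.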
3.3 Simulation Results
Simulation in MATLAB® has been performed to validate our methodology.
First, we modeled an ideal ADC and DAC, and connected them in
loopback mode. Randomly generated nonlinearities were injected into these
models. The DNLs and INLs of the ADC and the DAC were then predicted
by the proposed method. Errors between the originally injected nonlinearities
and the predicted nonlinearities were calculated to measure the accuracy of
our method. In the simulation, we first used an 8 bit, 50 MSPS DAC and
ADC, then applied the same setup to converters of other resolutions. Reference
voltages for both converters were set to 3 V.
Our test methodology requires that some samples fall in adjacent code
bins in order to calculate code widths of the ADC. A sample needs at least
0.5 LSB of deviation to move across a code transition voltage. For a Gaussian
distribution, about 95% of samples fall within a deviation of 2σ. Assuming that
at least 5% of the code occurrences must land in adjacent bins to calculate code
widths, 2σ must exceed 0.5 LSB, so the noise σ is required to be more than
0.25 LSB.
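The threshold argument can be checked numerically. The sketch below assumes that the DAC output sits in the middle of an ideal code bin, so that a sample must deviate by more than 0.5 LSB to land in a neighboring bin; the helper name and the list of σ values are illustrative only.

```python
from statistics import NormalDist

def fraction_beyond_half_lsb(sigma_lsb):
    """Fraction of zero-mean Gaussian noise samples deviating by more than 0.5 LSB."""
    return 2.0 * (1.0 - NormalDist(0.0, sigma_lsb).cdf(0.5))

for sigma in (0.1, 0.25, 0.3, 0.5, 1.0):
    share = 100.0 * fraction_beyond_half_lsb(sigma)
    print(f"sigma = {sigma:4.2f} LSB -> {share:6.3f}% of samples beyond 0.5 LSB")
```

At σ = 0.25 LSB the fraction is close to the 5% figure used in the text, and it grows quickly as σ increases.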
[Figure 3.10: DNL Prediction Errors vs. Noise σ. Panels: DAC DNL Loopback Test
and ADC DNL Loopback Test; x-axis: noise σ (LSB); y-axis: max DNL error (LSB).]
[Figure 3.11: INL Prediction Errors vs. Noise σ. Panels: DAC INL Loopback Test
and ADC INL Loopback Test; x-axis: noise σ (LSB); y-axis: max INL error (LSB).]

Figures 3.10 and 3.11 show the relationship between the noise standard
deviation and the absolute values of the errors. Table 3.1 summarizes the
nonlinearity errors with respect to the noise σ. Consistent with our analysis,
the methodology could not be validated with σ of less than 0.3 LSB. The errors
decreased greatly when the noise σ was increased from 0.3 LSB to 0.6 LSB, and
they were almost flat once the noise standard deviation exceeded 0.7 LSB. As the
noise σ increases, the code span of the code occurrence histogram increases. An
increased code span produces more deviated codes, which may produce more
erroneous estimations. This results in the slight error increase for the DAC
with high noise σ. However, we can avoid this problem by cutting off largely
deviated data, and thus the noise variance is not a very important factor for
our methodology. For our test, we chose Gaussian noise with zero mean and a
standard deviation of 1 LSB. In case the internal noise is insufficient when we
implement real circuits, the noise injection circuits described in the previous
section can be used to provide additional noise with a Gaussian distribution.
As seen in Figures 3.12 and 3.13, errors of less than ±0.25 LSB can be
achieved with 5,000 samples, and more accuracy can be achieved with more
samples. The graphs show absolute values of errors. Table 3.2 shows the
nonlinearity errors with respect to the number of samples. Maximum errors
generally decrease as the number of samples increases. Because we use
Gaussian CDF to calculate code widths, the greater the number of samples,
the better the accuracy. A large number of samples are essential especially for
the regions with small code widths because the probability for samples to fall
in the code widths is low. Insufficient code occurrence for those regions can
produce largely erroneous results. In order to obtain practical accuracy, we
set the number of samples to 50,000 at each ADC input level.

Noise σ   DAC DNL Err.   ADC DNL Err.   DAC INL Err.   ADC INL Err.
0.3       1.996          1.907          2.534          2.453
0.5       0.077          0.088          0.170          0.174
0.7       0.040          0.051          0.094          0.089
0.9       0.041          0.036          0.076          0.074
1.1       0.046          0.036          0.065          0.073
1.3       0.044          0.028          0.081          0.066
1.5       0.045          0.022          0.091          0.076
1.7       0.059          0.025          0.083          0.064
1.9       0.055          0.027          0.077          0.069
2.1       0.067          0.030          0.111          0.072
2.3       0.060          0.028          0.105          0.074
2.5       0.071          0.025          0.106          0.095
2.7       0.072          0.025          0.102          0.074
2.9       0.091          0.020          0.099          0.077
3.1       0.079          0.026          0.114          0.073
Table 3.1: Nonlinearity Prediction Errors vs. Noise σ (LSB)

#Samples  DAC DNL Err.   ADC DNL Err.   DAC INL Err.   ADC INL Err.
5,000     0.102          0.082          0.230          0.220
10,000    0.081          0.055          0.167          0.145
15,000    0.077          0.044          0.107          0.096
20,000    0.060          0.043          0.150          0.133
25,000    0.054          0.045          0.152          0.119
30,000    0.049          0.033          0.112          0.097
35,000    0.049          0.035          0.105          0.102
40,000    0.047          0.038          0.100          0.114
45,000    0.043          0.040          0.083          0.077
50,000    0.038          0.027          0.102          0.090
Table 3.2: Nonlinearity Prediction Errors vs. Number of Samples (LSB)

[Figure 3.12: DNL Prediction Errors vs. Number of Samples. Panels: DAC DNL
Loopback Test and ADC DNL Loopback Test; x-axis: number of samples (×10^4);
y-axis: max DNL error (LSB).]

          ADC DNL Err.   ADC INL Err.   DAC DNL Err.   DAC INL Err.
Min       -0.038         -0.056         -0.032         -0.032
Max       0.026          0.025          0.035          0.060
Mean      -0.002         -0.013         0.003          0.010
STD       0.007          0.013          0.010          0.014
Table 3.3: Statistics of Nonlinearity Prediction Errors (LSB)
[Figure 3.13: INL Prediction Errors vs. Number of Samples. Panels: DAC INL
Loopback Test and ADC INL Loopback Test; x-axis: number of samples (×10^4);
y-axis: max INL error (LSB).]

Figures 3.14 and 3.15 show that maximum prediction errors of less
than ±0.1 LSB are achieved, which validates our methodology. Table 3.3
summarizes statistics of the prediction errors.
Test time can be approximately determined by
$$\frac{(\text{Number of Codes}) \times (\text{Number of Samples})}{\text{Conversion Speed}} = \frac{2^{8} \times 50{,}000}{50 \times 10^{6}} = 0.256\ \text{sec} \qquad (3.7)$$
Since constructing the code transition voltage information matrix can be
performed concurrently with entering input samples and counting the number
of code occurrences, the total test time is 0.256 + δ seconds, which is
considered a reasonable test time. As expected, there is a tradeoff between test time and
accuracy. As the number of samples increases, test time also increases while
the errors decrease.
With the selected noise σ and the number of samples, we simulated
converters of other resolutions. Figures 3.16 and 3.17 show that our scheme has
less than ±0.1 LSB errors for converters of 4 to 18 bits. Even though the errors
are small enough at high resolution, it is more appropriate to use this method
for high speed, medium resolution (6 to 12 bits) converters because of the
test time determined by Equation 3.7. Table 3.4 summarizes the nonlinearity
errors with respect to the converter resolutions.

[Figure 3.14: Prediction Errors of ADC Nonlinearities. Panels: Difference
between Original and Predicted ADC DNLs, Difference between Original and
Predicted ADC INLs; x-axis: code index; y-axis: error (LSB).]
3.4 Comparison with Prior Work
A comparison of our method with the methods in [47, 85] is summarized
in Table 3.5. Note that the previously mentioned methods in [15, 16] are
not included in this comparison as they are only applicable to ADC tests. This
comparison is based on 8 bit data converters with maximum 3 LSB nonlinearity
errors. Note that our method does not have hardware overhead other than the
circuits associated with enabling the loopback test mode, on the assumption that
we have on-chip DSP or CPU units which can process the digital input and output
of the loopback pair. Assuming that the loopback test mode is already available
in the ADC and DAC test pair, the proposed method is the most cost effective as
well as reasonably accurate among the state-of-the-art techniques. Unlike the
method in [47], the proposed method does not depend on the data converter
architecture, which is another advantage of the technique. Although test times
are not reported for the compared methods, the test time of the proposed method
is expected to be shorter than that of [85] since it does not require multiple
steps to calculate the nonlinearities of the data converters.

[Figure 3.15: Prediction Errors of DAC Nonlinearities. Panels: Difference
between Original and Predicted DAC DNLs, Difference between Original and
Predicted DAC INLs; x-axis: code index; y-axis: error (LSB).]
3.5 Other Considerations
Although the proposed method successfully predicts the linearities of
data converters, the method relies on the Gaussian randomness of the noise
distribution. This, in turn, implies that the injected noise distribution,
which may not be perfectly Gaussian in real applications, could cause
additional errors on top of the factors that are analyzed in this chapter. In a
real application, careful characterization of the proposed method may be
required to adjust for skew of the prediction errors due to the imperfect
Gaussian noise source. Since the randomness of the Gaussian noise is more
important in the proposed method than parameters such as variance, users may
implement an additional random noise source, for which various schemes are
reviewed in the previous section.

#Bits   DAC DNL Err.   ADC DNL Err.   DAC INL Err.   ADC INL Err.
4       0.024          0.027          0.027          0.018
5       0.025          0.017          0.029          0.022
6       0.026          0.027          0.042          0.039
7       0.035          0.026          0.050          0.054
8       0.035          0.027          0.078          0.084
9       0.037          0.039          0.099          0.096
10      0.039          0.037          0.080          0.071
11      0.034          0.028          0.099          0.093
12      0.040          0.033          0.080          0.067
13      0.048          0.028          0.085          0.087
14      0.037          0.032          0.095          0.097
15      0.037          0.035          0.100          0.092
16      0.035          0.028          0.081          0.076
17      0.039          0.034          0.093          0.081
18      0.042          0.038          0.087          0.099
Table 3.4: Nonlinearity Prediction Errors vs. Converter Resolutions (LSB)

                        Our Method             Huang et al. [47]          Shin et al. [85]
Max Error               ≤ 0.1 LSB              0.05 LSB                   0.67 LSB
Area Overhead           No significant         LPF, Analog comparator,    Digital equalizer
(Additional Circuits)   additional circuits    Control unit, Counter,
                        required               Memory, Switches
Type                    All types              Delta-Sigma                All types
Test Time               < 0.3 sec              N/A                        N/A
Table 3.5: Comparison among Various BIST Schemes

[Figure 3.16: DNL Prediction Errors vs. Converter Resolutions. Panels: DAC DNL
Loopback Test and ADC DNL Loopback Test; x-axis: number of bits; y-axis: max
DNL error (LSB).]
[Figure 3.17: INL Prediction Errors vs. Converter Resolutions. Panels: DAC INL
Loopback Test and ADC INL Loopback Test; x-axis: number of bits; y-axis: max
INL error (LSB).]
3.6 Summary
In this chapter, we proposed a novel methodology to test the linearity of
an ADC and a DAC. Using a loopback setup with additional Gaussian noise, the
methodology enables us to calculate the nonlinearities of the ADC and the DAC
separately. Accurate DNLs and INLs of each converter are successfully
extracted regardless of fault masking. Other than interconnection and test mode
switches, no additional hardware is necessary, in contrast to other BIST
schemes.
Since the proposed method is a system level approach, it can be applied
to all ADC and DAC topologies. Compared to testing with automated test
equipment (ATE) or other BIST circuitry, this approach is cost effective and
Chapter 4
Phase Interpolator Test Using a Random
Jitter Injection
Non-ideality of margining circuit can lead to an incorrect assessment
of the timing margin when a DFT based loopback test is used to screen the
I/O defects. In this chapter, a new method to measure the linearity of PI code
steps without significant hardware overhead is proposed. Using a random
jitter (RJ) source on the load board, RJ is injected into the data channel
when the data channel is configured as a loopback mode. The amount of the
injected jitter needs to be adjusted such that it can barely close the data eye.
The distribution of the random jitter is captured by two different methods,
which are undersampling and sampling using phase interpolator code sweeping.
Then, the two results are compared and predicted differential nonlinearities
(DNLs) are derived using a series of mathematical calculations.
This chapter is organized as follows. In Section 4.1, we explain various
high speed I/O schemes and basic operation of phase interpolator circuitry.
The motivation of the proposed technique is also described in the section. In
Section 4.2, the proposed technique is explained in detail. Simulation results
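and comparison with state-of-the-art techniques follow in Sections 4.3 and 4.4,
respectively. In Section 4.5, we summarize the chapter.

[Figure 4.1: Forwarded Clock Scheme. Block diagram: TX and RX flip-flops
connected by a data channel; the local clock is forwarded over a separate
channel to a PI at the RX.]
[Figure 4.2: Derived Clock Scheme. Block diagram: TX and RX flip-flops
connected by a data channel; a CDR at the RX provides the sampling clock.]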
4.1 Background
4.1.1 High Speed I/O Design and Phase Interpolator Basics
Typically, there are two architectural variations of I/O clocking schemes
in high speed I/O design [63, 72]. High level block diagrams of both high speed
I/O configurations are illustrated in Figures 4.1 and 4.2. The forwarded clock
scheme shown in Figure 4.1 is used in QuickPath Interconnect™ (QPI™),
Fully Buffered Dual In-line Memory Module™ (FBD™, FB-DIMM™), etc.
In this scheme, the clock signal is forwarded from transmitter to receiver,
which provides inherent tracking between clock and data. A PI is implemented to
eliminate phase skew between clock and data lanes created by board level
channel length differences. Figure 4.2 shows the derived clock scheme, which is
used in PCI Express, Serial ATA, etc. In this scheme, the clock signal is
derived from transitions in the data signal using clock and data recovery (CDR)
circuitry. While the derived clock scheme uses fewer pins since it does not
need a clock channel, the data signal is required to provide a sufficient
number of transitions to guarantee proper clock signal generation at the
receiver side. 8b/10b encoding is a popular scheme to provide the necessary
signal transitions. Although typical CDRs are designed using a structure
similar to a PLL, the PI based CDR scheme is gaining popularity due to its fast
acquisition time as well as lower area overhead and better stability control.

[Figure 4.3: Phase Interpolator Schematic. Inputs V1 and V2, output Vout,
current sources I1 to I4, resistors R1 and R2, transistors T1 to T8.]
Figure 4.3 describes a circuit level diagram of a typical phase interpo-
lator circuit implementation [58]. The output voltage Vout achieved by this
circuitry can be expressed as follows.
$$V_{out} = a_1 V_1 + a_2 V_2 \qquad (4.1)$$
where a_1 and a_2 are weighting factors which can be realized by adjustable
current sources (I1 through I4). If V1 and V2 are identical sinusoidal signals
with a phase difference of 90°, Equation 4.1 can be rewritten using
trigonometric identities as
$$V_{out} = \sin(\omega t)\cos(\alpha) + \cos(\omega t)\sin(\alpha) = \sin(\omega t + \alpha) \qquad (4.2)$$
where
$$V_1 = \sin(\omega t), \quad V_2 = \cos(\omega t) = \sin(\omega t + 90^{\circ}), \quad a_1 = \cos(\alpha), \quad a_2 = \sin(\alpha)$$
Thus, it is shown that the phase of Vout is shifted by α.
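The identity behind Equation 4.2 can be confirmed numerically. The sketch below is illustrative only: it weights ideal in-phase and quadrature sinusoids with cos(α) and sin(α) and compares the sum against a directly shifted sinusoid; the function name, the 30° example angle, and the sampling grid are arbitrary choices.

```python
import math

def interpolate(alpha_deg, wt):
    """Weighted sum a1*V1 + a2*V2 with a1 = cos(alpha), a2 = sin(alpha)."""
    alpha = math.radians(alpha_deg)
    a1, a2 = math.cos(alpha), math.sin(alpha)
    return a1 * math.sin(wt) + a2 * math.cos(wt)

# Compare against sin(wt + alpha) over roughly one period.
alpha_deg = 30.0
grid = [i * 0.01 for i in range(629)]
max_err = max(abs(interpolate(alpha_deg, wt) - math.sin(wt + math.radians(alpha_deg)))
              for wt in grid)
print(f"largest deviation from sin(wt + alpha): {max_err:.2e}")
```

The deviation stays at floating-point rounding level, so any phase error in a real PI comes from non-ideal weights or inputs rather than from the interpolation principle itself.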
In many phase interpolator designs, the adjustable current sources that
determine the coefficients can be digitally controlled so that the possible set
of interpolation can be uniformly distributed. Each digitally encoded code
is mapped to specific current value of the current source, thus it can achieve
a specific degree of interpolation. An example encoding of PI is shown in
Table 4.1.
Index   Code   Phase Degree
1       0000   0°
2       0001   10°
3       0010   20°
4       0011   30°
...     ...    ...
9       1000   80°
Table 4.1: Example Code of Phase Interpolator Encoding

Some phase interpolation designs use multiple phase interpolators to
implement high resolution PI circuitry [49]. In one scheme, instead of using
just one PI with in-phase and quadrature phase inputs to create uniformly
distributed middle phase outputs, a network consisting of a coarse resolution
interpolation circuit and multiple fine resolution PIs is used. Although this
scheme can increase the resolution of the PI by reducing the burden of fine
tuning of adjustable current sources, it increases the size of the circuitry,
thus becoming more likely to be defective or more sensitive to process
variation in the manufacturing process.
4.1.2 Impact of Nonlinearity of PI
One may ask whether several picoseconds of nonlinearity in PIs would
make a significant difference in high speed I/O performance. To illustrate
the point, we use an example bathtub curve that is captured from a high
speed serial link. Figure 4.4 is drawn based on measurement data from [18].
The curve on the left side shows that it takes about one step of the phase
interpolator’s position to change the bit error rate (BER) from 10^-12 to 10^-9.
This can be translated to 1/32 UI or 3.125 psec, thus 3.125 psec of jitter can
degrade the BER from 10^-12 to 10^-9.
Addition of a small amount of total jitter can degrade the BER
significantly due to the sharp slope of the bathtub curve which results from
characterization of most multi gigahertz high speed I/O. If the differential
nonlinearity (DNL) of the PI in the region of the eye boundary is δ and the
measured total jitter using a timing margining technique is J_total, the worst
case real jitter can be given as δ + J_total, which could exceed specification
limits depending on the test conditions, with the risk of defective part escapes.

[Figure 4.4: Example Bathtub Curve of Receiver. x-axis: phase interpolator
position (−20 to 20); y-axis: BER (10^0 down to 10^-12, log scale).]
[Figure 4.5: Proposed Configuration for Forwarded Clock Scheme. Block diagram:
TX and RX data path with random jitter injection, and a separate clock source
feeding the PI.]
[Figure 4.6: Proposed Configuration for Derived Clock Scheme. Block diagram:
TX and RX data path with random jitter injection, and a separate clock source
feeding the PI inside the PI based CDR.]
4.2 Overview of The Proposed Technique
The configurations for our proposed technique to measure nonlinearity
of PI are described in Figures 4.5 and 4.6. We configure the transmitter
(TX) and receiver (RX) such that the data channels are connected together.
This can be either through a loopback configuration on a single device or by
connecting two identical devices by pairing up TX and RX. A separate clock
source is introduced to undersample the data. The quality and the cost of the
clock source are not of concern as high quality and low jitter clock source can
be implemented on-board with low cost [90]. For the forwarded clock scheme,
the clock channel is directly connected to the undersampling clock source as
shown in Figure 4.5. For the derived clock scheme, we use a multiplexer to
reach the PI in the CDR block as shown in Figure 4.6. With this configuration,
we take the following steps to predict the PI linearity.
Figure 4.7 illustrates the steps to test the PI with the proposed tech-
nique. First, the data port sends an alternating data pattern, i.e., the 1010
sequence to the channel. The jitter injection shown in the figure does not
occur at this step. Depending on the clocking scheme of the high speed serial
link, we undersample the data by injecting a clock signal with the period of
T +∆T , where T is a data signal period. When we undersample the signal, we
select ∆T to be the same as the phase resolution or code step of the PI. Since
the PI is connected to the clock signal path, we program the PI code such that
PI does not shift the phase of the signal while undersampling the signal. The
collected undersampled data is then stored as the expected bit sequence from
undersampling. Then we supply a normal clock signal with the period of T
and use the PI to sample the alternating data pattern. We collect the sampled
data for each PI code index and store it as the expected bit sequence from PI
sampling for the code index i. We repeat this procedure until we sweep all
combinations of PI codes and collect the sampled data.
Next, we inject random jitter into the data channel with a fair amount
of variance to barely close the data eye. Since random jitter injection occurs in
data channel, jitter injection capability in the system or load board is required.
Then, we undersample and PI-sample the jitter injected data signal with the
same period and steps.
In order to construct the random jitter distribution vector from the in-
jected random jitter in the data channel, error bit sequences EVus and EVps,i
for undersampling and PI-sampling, respectively, are created by comparing the
expected bit sequence with the jitter injected bit sequence. Since we reduce
other interference factors by comparing bits from the same data source, ran-
dom jitter injection is the only difference, and it is ensured that the distribution
from the error bit sequence will be closer to a random Gaussian distribution.
[Figure 4.7: Test Procedure. Transmit the 1010 pattern through the data lane;
characterize the transmitted pattern with undersampling; characterize the
transmitted pattern with PI sampling; inject RJ in the middle of the loopback
channel; (1) repeat the steps with RJ; (2) calculate differences to extract
the distributions; map the PI sampling distribution to the undersampling
distribution to calculate DNL.]
EVus and EVps,i are re-bucketed based on sampling positions of the data eye,
then these distributions are stored in vector Dpos and D′pos, respectively, after
normalization. The distribution vector Dpos represents a random jitter dis-
tribution sampled with ∆T resolution. If we use an ideal PI to characterize
the jitter, the PI sampled distribution, D′pos will be identical to Dpos under
one condition that the PI’s resolution is also ∆T . Since, in reality, the PI is
not ideal and contains nonlinearities, the distribution collected using PI code
sweeping, D′pos will be different from Dpos. Therefore, we can mathematically
derive DNLs from the difference between Dpos and D′pos. The details of the
mathematical algorithm are explained in the following subsections.
4.2.1 Distribution Vector Creation Using Undersampling
Undersampling techniques have been used to measure jitter of high
speed applications [90]. The idea of the undersampling technique is illustrated
in Figure 4.8. In Figure 4.8 (a), the period of the data signal is given as T and
the period of the sampling clock is given as T + ∆T. Since the sampling clock
period is longer by ∆T, every cycle of the data signal is sampled with an
additional delay of ∆T. After we collect all the sampled data and construct the
eye diagram, the equivalent sampling points of the eye diagram are shown in
Figure 4.8 (b).
[Figure 4.8: Undersampling Technique. (a) Undersampling of Alternating Data
Pattern; (b) Equivalent Sampling of the Pattern.]

By comparing the sampled jittery bit sequence with the expected bit
sequence, we can derive the error vectors EV_us and EV_ps,i where i represents
the PI code index. The normalized distribution vectors can be derived by the
following equations.
For all integers i such that $1 \le i \le \frac{T}{\Delta T}$,
$$D_{pos}(i) = \frac{\sum_{k=0}^{l} EV_{us}\left(i + k\frac{T}{\Delta T}\right)}{\sum_{k=1}^{l} EV_{us}(k)} \qquad (4.3)$$
$$D'_{pos}(i) = \frac{\sum_{k=1}^{l} EV_{ps,i}(k)}{\sum_{i,k=1}^{l} EV_{ps,i}(k)} \qquad (4.4)$$
where k in EV_us(k) and EV_ps,i(k) denotes the kth bit of the vector and
l is the largest number for which the vector index reaches its maximum
value.
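A minimal sketch of how the two distribution vectors could be built is given below. It is not the implementation used in this work: the helper names are hypothetical, indices are 0-based (the equations are 1-based), the first undersampled bit is assumed to sit at equivalent position 0, and the bit sequences are toy data chosen so that errors cluster near the eye edges.

```python
def error_vector(expected_bits, observed_bits):
    """1 where the jittery capture differs from the jitter-free capture."""
    return [e ^ o for e, o in zip(expected_bits, observed_bits)]

def undersampled_distribution(ev_us, positions_per_period):
    """Equation 4.3: fold EV_us modulo T / dT and normalize by the error total."""
    counts = [0] * positions_per_period
    for n, err in enumerate(ev_us):
        counts[n % positions_per_period] += err
    total = sum(counts)              # assumes at least one error was observed
    return [c / total for c in counts]

def pi_distribution(ev_per_code):
    """Equation 4.4: errors per PI code divided by the errors over all codes."""
    counts = [sum(ev) for ev in ev_per_code]
    total = sum(counts)              # assumes at least one error was observed
    return [c / total for c in counts]

# Toy undersampling capture with T / dT = 5 equivalent sampling positions.
expected = [1, 0] * 10
observed = list(expected)
for idx in (0, 3, 4, 5, 9, 10, 14, 15):   # hypothetical jitter-induced bit errors
    observed[idx] ^= 1
d_pos = undersampled_distribution(error_vector(expected, observed), 5)

# Toy PI sweep: one error vector per PI code index.
d_pos_pi = pi_distribution([[1, 1, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0], [0, 1, 1, 0]])
print(d_pos, d_pos_pi)
```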
4.2.2 Calculation of Predicted DNL
Based on the jitter distribution, Dpos, which is derived from the under-
sampling technique, we can calculate DNLs of the PI. We consider Dpos as a
golden reference here since it can be assumed to be the identical distribution
when PI DNLs are all zero. To ease the mapping of distribution and the DNLs,
we use a piecewise cubic polynomial interpolation technique [30]. Piecewise
cubic polynomials of Dpos are given as follows.
For $i = 0, 1, 2, \ldots, n-1$ and $x_i \le x \le x_{i+1}$,
$$S_i(x) = a_i(x - x_i)^3 + b_i(x - x_i)^2 + c_i(x - x_i) + d_i \qquad (4.5)$$
For Equation 4.5, we need to determine 4n conditions to have an analytic
solution for the equation. The conditions are given as follows.
$$\begin{aligned}
S_i(x_i) &= y_i, && \text{for all } i = 0, 1, 2, \ldots, n, \\
S_i(x_{i+1}) &= y_{i+1}, && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
S'_{i-1}(x_i) &= S'_i(x_i), && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
S''_{i-1}(x_i) &= S''_i(x_i), && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
S''_0(x_0) &= S''_{n-1}(x_n) = 0.
\end{aligned} \qquad (4.6)$$
An example of the piecewise cubic polynomial interpolation result is shown
in Figure 4.9. Once we derive the piecewise polynomials from D_pos, we can
create a vector that contains the estimated positions of the sampling points of
the PI. Given D'_pos(i), a piecewise polynomial exists so that
$S_{k-1}(k-1) \le D'_{pos}(i) \le S_k(k)$. Since more than one piecewise
polynomial can satisfy the condition, i.e., the distribution is not monotonic
but more like a U-shape, we need to divide the region so that the search space
for the solution is always monotonic. In our algorithm, we divide the search
space into two regions: $0 \le i \le \frac{T}{2\Delta T}$ and
$\frac{T}{2\Delta T} < i \le \frac{T}{\Delta T}$. Then we use the polynomial in
the appropriate region to find a solution t_i which results in
$S_{k-1}(t_i) = D'_{pos}(i)$ where $k-1 \le t_i \le k$. Here, t_i represents
the predicted position of the sampling point for PI code index i. The predicted
DNL for each PI code index i is determined by the following equation.
$$DNL_i = t_{i+1} - t_i \qquad (4.7)$$

[Figure 4.9: Piecewise Cubic Polynomial Interpolation of D_pos. Series: data,
interpolation curve.]
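The mapping step can be illustrated with a small, self-contained sketch. It assumes knots at unit spacing (positions counted in units of ∆T), uses the natural end conditions from the last line of Equation 4.6, and inverts the spline by bisection on one monotonic half of the U-shape; bisection is a stand-in chosen for brevity, since the text does not state which root-finding method was used. The U-shaped reference and the "true" PI sampling positions are toy values.

```python
def natural_spline(y):
    """Coefficients (a, b, c, d) of Equation 4.5 per segment, knots at x = 0..n."""
    n = len(y) - 1
    m = [0.0] * (n + 1)              # second derivatives; m[0] = m[n] = 0
    if n >= 2:
        rhs = [6.0 * (y[i - 1] - 2.0 * y[i] + y[i + 1]) for i in range(1, n)]
        diag = [4.0] * (n - 1)
        for j in range(1, n - 1):    # forward elimination, off-diagonals are 1
            w = 1.0 / diag[j - 1]
            diag[j] -= w
            rhs[j] -= w * rhs[j - 1]
        m[n - 1] = rhs[-1] / diag[-1]
        for j in range(n - 3, -1, -1):   # back substitution
            m[j + 1] = (rhs[j] - m[j + 2]) / diag[j]
    return [((m[i + 1] - m[i]) / 6.0, m[i] / 2.0,
             y[i + 1] - y[i] - (2.0 * m[i] + m[i + 1]) / 6.0, y[i])
            for i in range(n)]

def evaluate(coeffs, x):
    """Evaluate the piecewise cubic at x (clamped to the first/last segment)."""
    i = min(max(int(x), 0), len(coeffs) - 1)
    a, b, c, d = coeffs[i]
    t = x - i
    return ((a * t + b) * t + c) * t + d

def invert(coeffs, target, lo, hi, iterations=60):
    """Bisection for S(t) = target on [lo, hi], assuming S is monotonic there."""
    f_lo = evaluate(coeffs, lo) - target
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = evaluate(coeffs, mid) - target
        if f_lo * f_mid > 0.0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy U-shaped reference distribution D_pos over 11 positions.
raw = [(i - 5) ** 2 for i in range(11)]
d_pos = [r / sum(raw) for r in raw]
spline = natural_spline(d_pos)

# Hypothetical non-ideal PI sampling positions (left half of the eye, in dT units).
true_positions = [0.5, 1.7, 2.5, 3.6, 4.5]
d_pos_pi = [evaluate(spline, t) for t in true_positions]   # stands in for D'_pos

predicted = [invert(spline, d, 0.0, 5.0) for d in d_pos_pi]
dnl = [b - a for a, b in zip(predicted, predicted[1:])]     # Equation 4.7
print([round(t, 3) for t in predicted], [round(s, 3) for s in dnl])
```

Because the toy D'_pos values are generated from the same spline, the positions are recovered almost exactly; with measured data the accuracy depends on how steep the curve is in the region being inverted.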
4.2.3 Random Jitter Injection Considerations
The proposed technique requires a random jitter injection module on
the load board to obtain desired distribution. Various schemes have been
proposed to enable jitter injection capability [34, 53]. Keezer et al. [53] pro-
pose a delay adjustment circuit which is used to deskew high speed signals.
By performing AC-coupling of a random noise source which represents the
random noise in voltage, the deskew logic is demonstrated to be used as a
random jitter injection circuit. Various random noise sources are discussed in
Chapter 3 which can be used in conjunction with the delay adjustment circuit
to implement the random jitter injection capability. Figure 4.10 depicts the
architecture of the delay adjustment circuit.
Fujibe et al. [34] propose a timing generator based jitter injection ar-
chitecture. The timing generator is used to control the timing of the signal
transitions. With delay control logic and variable vernier delay line which
provide coarse and fine control of the delay, timing data controls the timing of
the signal transitions on cycle by cycle basis. The generator has been designed
with 90nm CMOS technology and supports 6.5 Gbps signals. Figure 4.11
shows basic building blocks of the timing generator. Since we have capability
to control the timings of each bit, we can implement jitter injection capability
by adding the jitter information to the timing data. Figure 4.12 illustrates
the jitter injection capability implementation based on the timing generator
module.
Figure 4.11: Timing Generator Block Diagram [34]
Figure 4.12: RJ Injection Circuitry Block Diagram [34]
[Figure 4.13: Injected DNL vs. Predicted DNL. x-axis: phase interpolator
position (psec); y-axis: DNL (psec); series: injected DNL, predicted DNL.]
4.3 Experimental Results
4.3.1 Simulation Configuration
MATLAB® simulation was performed to validate the proposed technique.
First, we configured a 10 GHz forwarded clock serial link simulation
model where the eye size for each data bit was 100 psec. Phase interpolator
resolution was set to 2 psec to provide enough resolution for 10 GHz serial
links. Then we randomly generated DNLs in a uniform distribution with the
maximum value of 3 LSB and injected each one of them to each code posi-
tion of the PI. The proposed algorithm was implemented and the predicted
DNLs were captured per PI code basis. An example simulation result from a
single execution is shown in Figure 4.13. As shown in the figure, the
injected DNLs and the predicted DNLs track each other very well,
except at the center of the PI sampling code range.
The prediction error in the central region can be explained as follows.
As the sampling position for the PI code converges to the center of the
data eye, fewer histogram samples are available because the probability of
occurrence is lower, which reduces the sharpness of the constructed
piecewise cubic polynomial curves. This increases the prediction error,
since a sharper curve maps a given range of values on the y axis to a
smaller range of values on the x axis than a gradual one does. Although the
predicted DNLs are less accurate in the central region, this does not
undermine the proposed scheme, since the linearity of the phase
interpolator matters most near the signal eye boundary, where BER is
determined by several picoseconds of jitter. Therefore, we configured our
scheme to ignore the prediction error at the center of the PI code range,
and we obtained sufficient prediction accuracy with a fair number of
transmitted bit sequences.
4.3.2 Simulation Results
To determine the optimal conditions for the simulation, we performed
the following experiments. First, we experimentally studied the impact of
the amount of RJ on the prediction accuracy. We increased the root mean
square (RMS) value of the RJ, i.e. its standard deviation (σ), from 1 psec
RMS to 30 psec RMS, and repeated the simulation for each RJ σ value to
analyze the impact. Figure 4.14 shows the simulation results and Table 4.2
summarizes the data. As can be seen, the prediction accuracy improved as
the RJ σ value increased; when it reached 7 psec, the prediction error
reached its lowest level and stayed at this level until the value reached
15 psec. From 15 psec to 30 psec, the prediction error started increasing,
showing more uncertainty in the simulation. This is because the two
Gaussian distributions of random jitter at the left and right sides of the
signal eye are convolved together if excessive random jitter exists. This
effect reduces the sharpness of the slope of the distribution, thus
degrading the prediction performance. Since the error did not show a
dependency on RJ σ in the range of 7 psec to 15 psec, we selected 10 psec
RMS as the simulation set point.
[Figure 4.14 plot: Prediction RMS Error (LSB) versus RJ sigma (psec)]
Figure 4.14: Injected Random Jitter vs. Prediction Error
RJ σ (psec)    DNL RMS Error
 1             3.827
 3             2.372
 5             0.895
 7             0.525
 9             0.313
11             0.343
13             0.391
15             0.630
17             0.429
19             0.881
21             0.667
23             1.122
25             0.700
27             0.779
29             0.910
Table 4.2: Nonlinearity Prediction Errors vs. RJ σ (LSB)
[Figure 4.15 plot: Prediction RMS Error (LSB) versus Number of Bits, log scale from 10^2 to 10^7]
Figure 4.15: Number of Bits in Alternating Data Sequence vs. Prediction Error
Next, we examined the relationship between the number of transmitted
bits and the prediction accuracy. The result in Figure 4.15 shows that the
prediction error decreases as the number of transmitted bits increases.
Once the number of bits reached 500,000, the prediction accuracy
stabilized. We selected 1-million-bit sequences as the simulation set point
to obtain an accurate result with reasonable simulation performance.
[Figure 4.16 plot: Predicted DNL RMS (LSB) versus Injected DNL RMS (LSB)]
Figure 4.16: Monte Carlo Simulation of the Proposed Technique
Using the selected simulation conditions, we performed a Monte Carlo
simulation with randomly generated ensembles to determine the repeatability
of our scheme. 100 iterations of this simulation set were performed.
Figure 4.16 presents the result of the simulation. The simulation
conditions as well as the resultant mean and standard deviation of the
prediction RMS error are summarized in Table 4.3. As shown in the results,
our scheme can predict the DNLs with a mean prediction error of 0.31 LSB,
or 0.62 psec. If we assume that the measurement error distribution is
Gaussian, 99.7% of the measurements will be within an error of
0.31 + 3 × STD, or 0.67 LSB.
Description              Value
Link Speed               10 GHz
Num of Bits              1,000,000
Injected RJ σ            10 psec RMS
Prediction Error Mean    0.31 LSB
Prediction Error STD     0.12
Table 4.3: Summary of Simulation Condition and Results
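The arithmetic behind the quoted figures can be checked with a few lines of Python. This is only a worked example using the values of Table 4.3 and the 2 psec PI resolution from Section 4.3.1; the variable and function names are illustrative and do not come from the original simulation code.

```python
# Worked arithmetic for the error figures quoted in the text (Table 4.3 values).
LSB_PSEC = 2.0         # PI resolution used in the simulation (psec per LSB)
MEAN_ERR_LSB = 0.31    # mean prediction RMS error (LSB)
STD_ERR_LSB = 0.12     # standard deviation of the prediction error (LSB)


def three_sigma_bound(mean, std):
    """Return mean + 3 * std, the bound used in the text under a Gaussian assumption."""
    return mean + 3.0 * std


bound_lsb = three_sigma_bound(MEAN_ERR_LSB, STD_ERR_LSB)
mean_err_psec = MEAN_ERR_LSB * LSB_PSEC
bound_psec = bound_lsb * LSB_PSEC

print(f"mean error     : {MEAN_ERR_LSB:.2f} LSB = {mean_err_psec:.2f} psec")
print(f"mean + 3 x STD : {bound_lsb:.2f} LSB = {bound_psec:.2f} psec")
```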
Another experiment was performed to observe the sensitivity of our
scheme to periodic jitter (PJ). We modeled sinusoidal jitter with a
frequency of 200 MHz and mixed it with RJ. Using the same serial link
simulation configuration, we plotted the DNL prediction error for various
PJ amplitudes while RJ σ was fixed at 10 psec RMS. Figure 4.17 and
Table 4.4 show the result; the prediction error increased slightly as more
PJ was present. This result is expected, since periodic jitter is a part
of deterministic jitter, which can be modeled as dual Dirac delta
functions, and it affects the monotonicity and slope of the distribution
Dpos. This consequently resulted in a slight increase in the misprediction
rate of the proposed algorithm.
[Figure 4.17 plot: Prediction RMS Error (LSB) versus PJ Amplitude (psec)]
Figure 4.17: Injected Periodic Jitter vs. Prediction Error
PJ Amp. (psec)    DNL RMS Error
0                 0.282
0.5               0.309
1                 0.341
1.5               0.430
2                 0.437
2.5               0.509
3                 0.842
3.5               0.755
4                 0.905
4.5               1.067
5                 1.176
Table 4.4: Nonlinearity Prediction Errors vs. PJ Amplitude (LSB)
4.4 Comparison with Prior Work
In this section, we discuss the advantages and disadvantages of various
state-of-the-art test schemes for PIs. Table 4.5 summarizes the comparison
items. While our method does not require significant additional logic to
enable PI test capability, the mutual PI test scheme [77] requires
additional phase interpolator circuitry if it is not already available in
the design. In addition, the maximum prediction error of that method cannot
be lower than 1 LSB, since the resolution of the test cannot be better than
one PI step.
                         Our Method             Provost [77]      Shi et al. [83]
Mean Error               0.31 LSB               N/A               N/A
Max Error                0.67 LSB               ≥ 1 LSB           N/A
Area Overhead            No significant         Additional PI     Phase detector,
(Additional Circuits)    additional circuits    if not avail.     phase-to-voltage converter,
                         required                                 ADC
Table 4.5: Comparison among Various PI Test Schemes
The PI test circuit implementation described in [83] requires additional
circuits such as a phase detector, a phase-to-voltage converter, and an
ADC, which consume a significant amount of silicon area. Although the
accuracy of the technique is not reported in the literature, PVT variation
affects the accuracy of the measurement circuits; hence calibration of the
analog circuitry may be required to obtain the desired accuracy, which may
increase test time.
4.5 Summary
In this chapter, we proposed an efficient test technique for phase
interpolators using random jitter injection. The proposed algorithm is cost
effective in that it requires no hardware overhead in the case of the
forwarded clock scheme. For the derived clock scheme, only one multiplexer
needs to be added in the clock lane to implement the clock injection
capability. The load board or system board only needs to contain a random
jitter injector and a clock generator for undersampling. The proposed
algorithm, based on statistical data collection and a curve fitting scheme,
predicts the DNL of the phase interpolator; simulation results show a mean
prediction error of 0.31 LSB. Since our method does not require significant
circuit changes, it can be easily applied to various high speed I/O
microarchitectures where a PI is used.
This method can be implemented in production test in a cost effective
manner and combined with timing margining tests, such that margining steps,
which are equivalent to PI code steps, are accurately mapped to actual
timing information. This information can then be used to margin the timing
of the eye, thus reducing the need for excessive guardbanding based on
worst case variation scenarios.
Chapter 5
Phase Interpolator Test Using a Sliding Window Search
As discussed in the previous chapter, although the timing margining
test based on loopback mode provides a cost effective test method for high
speed I/Os, it has a limitation: since the margining is required to move in
even steps, non-uniform steps in the PI could result in an incorrect
assessment of the timing margin [28]. This translates to either a false
pass, which results in test escapes and increases defective parts per
million (DPPM), or a false fail, which decreases yield by sacrificing good
devices. A few previous papers address this risk [77, 83]; most of them
require additional circuitry to measure the linearity of the PI. Chun et
al. [28] propose a PI linearity test technique using random jitter
injection. This method provides a PI linearity test capability without
major circuit modification; however, it still requires a random jitter
injection source on the load board to properly characterize the PI
linearity.
In this chapter, a novel PI test technique is presented, which is an
extension of the method described in the previous chapter. Because the
proposed algorithm utilizes intrinsic jitter, the new method does not
require additional random jitter injection. We demonstrate that our method
works both in simulation and in a low cost high volume manufacturing (HVM)
tester environment. This chapter is organized as follows. Section 5.1
reviews the basic theory behind the proposed method. The proposed technique
is described in Section 5.2. Experimental results for both simulation and
hardware validation are discussed in Section 5.3. Section 5.4 presents a
comparison with the RJ injection based test method. Section 5.5 summarizes
the chapter.
5.1 Preliminaries
5.1.1 Undersampling Technique Basics
For serial link testing, selection of the sampling frequency for the mea-
sured signal is challenging since the speed of the link is already high, and
hence it is difficult to further increase the sampling frequency. Due to this
challenge, undersampling is widely used in both ATE based [29] and on-chip
based [90] methods. The concept of the undersampling technique is illustrated
in Figure 5.1.
[Figure 5.1 diagram: panels (a) and (b) show the clock and the signal, annotated with T, dT and 2dT; panel (c) shows the sample sequence, e.g. R 1 1 R R 0 0 R R]
Figure 5.1: Undersampling Technique Concept
For a clock-like signal, i.e. a 1010 signal (the measured signal), let T
be the period of the measured signal. dT is selected so that the new
sampling period T + dT results in a slightly lower sampling frequency than
the original one, as shown in Figure 5.1 (a). Since the measured signal is
periodic, coherent undersampling results in strobe scanning of the measured
signal with an effective resolution of dT, as illustrated in
Figure 5.1 (b). Due to the jitter present on the signal edge,
undersampling the jittery region (i.e. the transition region) of the
measured signal yields random 0’s and 1’s, as shown in Figure 5.1 (c). In
the figure, ‘R’ denotes a binary value which can be either 0 or 1.
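The strobe scanning effect described above can be reproduced with a small Python model. This is a minimal sketch under stated assumptions, not the simulation used in this work: the measured signal is an ideal 50% duty cycle square wave, all jitter is Gaussian and lumped onto the strobe time, and the function name `undersample_clock` and the example values of T and dT are hypothetical.

```python
import random


def undersample_clock(T, dT, n_samples, rj_sigma, seed=0):
    """Coherently undersample a 1010... signal of period T with period T + dT.

    Each strobe lands dT later inside the signal period than the previous
    one. Gaussian jitter (rj_sigma, an assumption of this sketch) is added to
    every strobe time, so samples taken near an edge come out as random bits.
    """
    rng = random.Random(seed)
    bits = []
    for k in range(n_samples):
        t = k * (T + dT) + rng.gauss(0.0, rj_sigma)
        phase = t % T                          # position inside one signal period
        bits.append(1 if phase < T / 2 else 0)  # signal is high in the first half
    return bits


# Hypothetical example: T = 800 psec and dT = 2 psec scan one period in 400 strobes.
clean = undersample_clock(800.0, 2.0, 400, rj_sigma=0.0)
noisy = undersample_clock(800.0, 2.0, 400, rj_sigma=5.0, seed=1)
print("".join(str(b) for b in noisy[190:210]))  # samples around the falling edge
```

Without jitter the captured stream is a clean run of 1s followed by a clean run of 0s; with jitter, the samples around each edge become the random ‘R’ values of Figure 5.1 (c).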
5.1.2 Jitter and BER
The bits sampled from the transition region create a stochastic
distribution which can be described by a probability density function
(PDF), $f_J(t)$. BER can be expressed as a function of $f_J$ as follows.
\[ \mathrm{BER}(t_s) = \int_{t_s}^{\infty} f_J(t)\,dt \tag{5.1} \]
The signal edge can be considered to occur at $t_s$ when
$\mathrm{BER}(t_s) = 0.5$, as long as the jitter aliasing is minimal. For
the undersampled bit sequence, $\mathrm{BER}(t_s)$ can be rewritten as a
discrete function:
\[ \mathrm{BER}(t_n) = \sum_{k=t_n}^{\infty} f_J(k) \tag{5.2} \]
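As an illustration of Equation 5.2, the following Python sketch evaluates the discrete tail sum on a grid of spacing dT and locates the sample whose BER is closest to 0.5. The Gaussian shape of the jitter PDF, the normalization of the sampled PDF, and all names and numeric values are assumptions made for this example only.

```python
import math


def gaussian_pdf(t, mu, sigma):
    """Jitter PDF f_J(t); the Gaussian shape is an assumption of this sketch."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def discrete_ber(mu, sigma, dT, n_bins):
    """BER(t_n) as the sum of f_J over all grid points k >= n."""
    pdf = [gaussian_pdf(k * dT, mu, sigma) for k in range(n_bins)]
    total = sum(pdf)                       # normalize so that BER(t_0) = 1
    ber = [0.0] * n_bins
    tail = 0.0
    for n in range(n_bins - 1, -1, -1):    # accumulate from the last bin backwards
        tail += pdf[n] / total
        ber[n] = tail
    return ber


def edge_index(ber):
    """Grid index whose BER is closest to 0.5, taken as the edge location."""
    return min(range(len(ber)), key=lambda n: abs(ber[n] - 0.5))


# Hypothetical edge at 101 psec with 5 psec RMS jitter, sampled every 2 psec.
ber = discrete_ber(mu=101.0, sigma=5.0, dT=2.0, n_bins=100)
n_edge = edge_index(ber)
print("estimated edge at", n_edge * 2.0, "psec")
```

The estimate can only land on a grid point, which is the finite resolution limitation that Section 5.2.3 addresses.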
5.2 Proposed Technique
5.2.1 Test Procedure
Figure 5.2 describes the test procedure of the proposed method. First,
the inputs and outputs of the PI circuit are reconfigured as conceptually
depicted in Figure 5.3. An internal clock signal, fm, with period T is
supplied to the PI and delayed according to the PI step configuration. The
signal fs, which has the period T + dT, is supplied externally to
undersample the shifted signal fm. The comparator input voltage, Vt, is set
so that the edge transition voltage is picked and sampled at the flip-flop
(FF). The comparator and the FF need to be designed to operate at high
frequencies. Next, the sliding window search algorithm, which we describe
in the next subsection, is used to find the estimated edge location. After
the estimated edge location is obtained, the PI step is advanced by one and
the procedure and algorithm are repeated to construct the estimated edge
location array. From these estimates, the differential nonlinearity (DNL)
and integral nonlinearity (INL) of each PI step are calculated.
[Figure 5.2 flow chart: Reconfigure PI for Test Mode Operation -> Undersample the PI Output -> Apply Sliding Window Search Algorithm to the Undersampled Bitstream -> Store the Estimated PI Code Location -> Step PI by One and Repeat the Procedure until All Data Collection Is Complete -> Calculate DNL/INL]
Figure 5.2: Test Procedure
[Figure 5.3 block diagram labels: PI, Comparator, FF; signals fm, fs, Vt, Out]
Figure 5.3: Circuit Configuration Concept
5.2.2 Jitter Aliasing Reduction Algorithm Using Sliding Window Search
There exist related approaches that measure jitter and delays using
undersampling. Before presenting our method, we discuss how these
approaches differ. Huang and Cheng [46] propose a time interval
measurement technique based on undersampling. Since the method aggregates
the distribution over all transition regions, jitter components are
aliased, which deteriorates the measurements. The authors indicate that the
jitter in the sampling clock needs to be under a certain limit to obtain
the desired accuracy in the time interval measurement.
The jitter aliasing problem between the measured signal and the sampling
clock signal can be alleviated by various techniques. In order to measure
jitter accurately, Sunter et al. [90] propose a median alignment technique
to reduce the low frequency jitter induced by the undersampling clock. In
this technique, rather than aggregating all transition regions, the median
of each transition region is first determined, and a jitter distribution is
then created from the median-aligned aggregation of the distributions. This
technique requires either additional on-chip circuitry to determine the
median value of each transition region, or post-processing of the entire
bit stream to aggregate the distribution with the median alignment, which
may be computationally expensive when implemented in a low cost HVM ATE
environment. Hong et al. [42] propose a method based on a root mean square
calculation of the individual standard deviations (σs) of the transition
regions. This method provides good accuracy for jitter measurement, since
the σs are first calculated for each transition region; however, it is
mainly applicable to random jitter measurement rather than to linearity
measurement.
The aforementioned approaches focus on the individual transition regions
to obtain an accurate jitter distribution by limiting the low frequency
jitter aliasing effect. Our method relies on the fact that adjacent samples
are less susceptible to low frequency jitter. Consequently, if the scope of
the edge location search is limited to a certain window, whose size can be
either larger or smaller than that of an individual transition region, the
resulting distribution suffers less from low frequency jitter. We define a
window for the calculation of the edge location as follows. For an integer
bit position $n$, $b(n)$ is defined as
\[ b(n) = \begin{cases} 1, & \text{for bit 1} \\ -1, & \text{for bit 0} \end{cases} \tag{5.3} \]
A transition density function can be defined as follows.
\[ F_{td}(n) = \sum_{i=-m}^{m} b(n+i) \tag{5.4} \]
where the integer $m$ is an empirical parameter that determines the window
size. If the amount of jitter on the sampling clock is excessive, the
window size can be decreased to avoid excessive aliasing. With the defined
summation window, $n$ is searched so that it satisfies $F_{td}(n) = 0$,
which is equivalent to the point where BER = 0.5. The proposed method is
referred to as the sliding window search algorithm, since the summation
window slides towards the right as $n$ increases when the undersampled bit
stream is written from left to right.
Since both rising and falling edges can yield $F_{td}(n) = 0$, differential
values were checked so that only the solutions corresponding to rising
edges were used for our technique. In fact, either rising or falling edges
can be used, as long as only one type is used for consistency. This
algorithm is simple and easy to implement in a low cost HVM tester
environment where the size of the response capture memory is limited.
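The sliding window search can be sketched in Python as follows. This is an illustrative implementation of Equations 5.3 and 5.4, not the production test program code; the function names are hypothetical. Because the window holds 2m + 1 samples of ±1, the sum is always odd, so this sketch looks for a negative-to-positive sign change of the transition density instead of an exact zero.

```python
def transition_density(bits, m):
    """Return {n: F_td(n)}, where F_td(n) sums b(n + i) for i in [-m, m]."""
    b = [1 if bit else -1 for bit in bits]      # Equation 5.3
    width = 2 * m + 1
    if len(b) < width:
        return {}
    window = sum(b[:width])                     # F_td(m), the first full window
    ftd = {m: window}
    for n in range(m + 1, len(b) - m):
        window += b[n + m] - b[n - m - 1]       # slide the window right by one
        ftd[n] = window
    return ftd


def rising_crossings(ftd):
    """Positions n where F_td is negative at n - 1 and positive at n."""
    return [n for n in sorted(ftd)
            if n - 1 in ftd and ftd[n - 1] < 0 < ftd[n]]


# Clean rising edge at bit position 50, window parameter m = 5.
stream = [0] * 50 + [1] * 50
ftd = transition_density(stream, m=5)
print(rising_crossings(ftd))
```

The running sum updates each window in constant time, so the whole bit stream is processed in one pass with only a few integers of state, which suits a tester with limited capture memory. With a jittery stream several crossings may be reported around one edge.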
5.2.3 Interpolation Technique to Overcome Finite Resolution
There may be cases where no solution that yields $F_{td}(n) = 0$ exists.
In such a case, the closest solution for the edge location is the integer
value $n$ that minimizes $|F_{td}(n)|$. This is because the function is
discrete in terms of the effective resolution dT. By reducing the value of
dT, we can increase the resolution and subsequently make the transition
density function more continuous. However, there is a tradeoff: test time
increases in exchange for the resolution improvement, since the number of
samples increases. Physical hardware limitations might also exist for the
undersampling clock generation, in that the frequencies supported by the
signal generator cannot provide a finer effective resolution. Piecewise
cubic polynomials can be used to fit the distribution to a continuous
function. This interpolation technique helps overcome the finite
resolution problem while keeping the number of samples unchanged. The
piecewise cubic polynomial based on $F_{td}(n)$ is defined as follows. For
$i = 0, 1, 2, \ldots, n-1$ and $t_i \le t \le t_{i+1}$,
\[ F_i(t) = a_i (t - t_i)^3 + b_i (t - t_i)^2 + c_i (t - t_i) + d_i \tag{5.5} \]
For Equation 5.5, 4n conditions can be determined to have an analytic
solution. The conditions are given as follows.
\[ \begin{aligned}
F_i(t_i) &= F_{td}(i), && \text{for all } i = 0, 1, 2, \ldots, n, \\
F_i(t_{i+1}) &= F_{td}(i+1), && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
F'_{i-1}(t_i) &= F'_i(t_i), && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
F''_{i-1}(t_i) &= F''_i(t_i), && \text{for all } i = 1, 2, 3, \ldots, n-1, \\
F''_0(t_0) &= F''_{n-1}(t_n) = 0.
\end{aligned} \tag{5.6} \]
Due to the non-monotonicity of $F_{td}(n)$ as well as $F_i(t)$, there may
exist multiple solutions that satisfy the conditions. Let
$t_1, t_2, \ldots, t_k$ be these solutions. The final signal transition
position $L$ is then derived by averaging the solution values and
multiplying by dT as follows.
\[ L = \frac{\sum_{j=1}^{k} t_j}{k} \times dT \tag{5.7} \]
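The interpolation and averaging steps can be illustrated with a self-contained Python sketch. It fits a natural cubic spline (zero second derivative at both ends, as in the last condition of Equation 5.6) through transition density values at unit-spaced bit positions, finds the upward zero crossings by bisection, and averages them as in Equation 5.7. The unit knot spacing, the use of bisection, the restriction to rising crossings, and all function names are assumptions of this sketch rather than details taken from the original implementation.

```python
def natural_spline_second_derivs(y):
    """Second derivatives M_i of the natural cubic spline through (i, y[i])."""
    n = len(y) - 1
    M = [0.0] * (n + 1)                    # natural ends: M_0 = M_n = 0
    if n < 2:
        return M
    cp = [0.0] * n                         # modified super-diagonal (Thomas algorithm)
    dp = [0.0] * n                         # modified right-hand side
    for i in range(1, n):                  # rows: M[i-1] + 4 M[i] + M[i+1] = rhs
        rhs = 6.0 * (y[i + 1] - 2.0 * y[i] + y[i - 1])
        denom = 4.0 - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (rhs - dp[i - 1]) / denom
    for i in range(n - 1, 0, -1):          # back substitution
        M[i] = dp[i] - cp[i] * M[i + 1]
    return M


def rising_zero_crossings(y):
    """Fractional positions where the spline through y crosses zero going upward."""
    M = natural_spline_second_derivs(y)
    roots = []
    for i in range(len(y) - 1):
        if not (y[i] < 0.0 < y[i + 1]):
            continue
        a = (M[i + 1] - M[i]) / 6.0
        b = M[i] / 2.0
        c = (y[i + 1] - y[i]) - (2.0 * M[i] + M[i + 1]) / 6.0
        d = y[i]
        lo, hi = 0.0, 1.0                  # spline is negative at lo, positive at hi
        for _ in range(50):                # bisection on the local cubic
            mid = 0.5 * (lo + hi)
            if ((a * mid + b) * mid + c) * mid + d < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(i + 0.5 * (lo + hi))
    return roots


def edge_location(y, dT):
    """Average of the crossing positions, scaled by dT (Equation 5.7)."""
    roots = rising_zero_crossings(y)
    if not roots:
        return None
    return sum(roots) / len(roots) * dT


# Linear test data crossing zero halfway between positions 1 and 2.
print(edge_location([-3.0, -1.0, 1.0, 3.0], dT=2.0))
```

For data that is already linear the spline reduces to a straight line, so the crossing falls exactly between the two samples that bracket zero, which is a position the integer search alone cannot report.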
Once $L$ is determined for one PI step position, the PI step is advanced by
one and the whole process is repeated. Let $L_i$ be the estimated edge
location for PI step $i$; the DNL and the INL for step $i$ can then be
derived as follows.
\[ \mathrm{DNL}_i = L_{i+1} - L_i \tag{5.8} \]
\[ \mathrm{INL}_i = \sum_{k=1}^{i} \mathrm{DNL}_k \tag{5.9} \]
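Equations 5.8 and 5.9 translate directly into a few lines of Python. The sketch below follows the equations literally by default. The optional `ideal_step` argument, which subtracts the ideal step and normalizes to LSB, is an assumption added for illustration and is not stated in the text; the function name and example values are hypothetical.

```python
def dnl_inl(edges, ideal_step=None):
    """Compute DNL (Eq. 5.8) and INL (Eq. 5.9) from estimated edge locations.

    edges      : list of estimated edge locations L_i, one per PI step
    ideal_step : optional ideal step size; when given, each step is reported
                 as (step - ideal_step) / ideal_step (assumed normalization)
    """
    dnl = [edges[i + 1] - edges[i] for i in range(len(edges) - 1)]
    if ideal_step is not None:
        dnl = [(step - ideal_step) / ideal_step for step in dnl]
    inl = []
    total = 0.0
    for value in dnl:                  # running sum of the DNL values
        total += value
        inl.append(total)
    return dnl, inl


# Hypothetical edge locations in psec for four consecutive PI steps.
raw_dnl, raw_inl = dnl_inl([0.0, 2.0, 5.0, 6.0])
norm_dnl, norm_inl = dnl_inl([0.0, 2.0, 5.0, 6.0], ideal_step=2.0)
print(raw_dnl, raw_inl)
print(norm_dnl, norm_inl)
```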
5.3 Experimental Results
Our technique was validated both in simulation and through hardware
measurements. For convenience of analysis, only the DNL estimation was
simulated, since the INL measurement error can be calculated as the
accumulation of DNL measurement errors. Both INL and DNL were measured in
the hardware validation.
5.3.1 Simulation Results
Numerical simulation using MATLAB® was performed to validate the proposed
algorithm. The parameters used in the models are listed in Table 5.1. In
order to assess the validity of the method at various operating
frequencies, two sets of simulation conditions were used. Random DNL
values were generated and injected into the PI model, DNL estimates were
calculated, and the estimation error was reported. For comparison purposes,
we developed models for two cases: 1) the undersampling method combined
with our proposed sliding window search algorithm, and 2) the plain
undersampling method, where nonlinearity is calculated by aggregating the
entire distributions from all transition regions [46]. Three parameters
were varied to understand the capability and limitations of our technique:
the window size m, the number of sampled bits, and the amount of random
jitter (RJ) in conjunction with periodic jitter (PJ). With the optimal
value of each parameter, another simulation was performed with 100
iterations to understand repeatability.
Description              Condition A    Condition B
Link Speed               2 Gbps         10 Gbps
PI step size             5 psec         2 psec
Undersampling dT         2 psec         2 psec
Injected RJ σ            5 psec RMS     5 psec RMS
Injected PJ Amplitude    2 psec         2 psec
PJ Frequency             200 MHz        200 MHz
Table 5.1: Summary of Simulation Conditions
5.3.1.1 Size of Window Sweep
Random DNL values were injected in the simulation and the corresponding
estimated DNL values based on the proposed algorithm were calculated. The
size of the window, m, was varied from 1 to 1,000 on a log scale while the
number of sampled bits was set to 10,000. The mean estimation errors of 10
iterations per data point are plotted with respect to m in Figures 5.4 and
5.5. For Condition A, the estimation error was at its minimum when m was
between about 10 and 100 for the given simulation conditions. Based on this
data, m = 30 was selected for the rest of the simulations. For Condition
B, the region that provided the lowest prediction error was between about
10 and 40, which was narrower than in the case of Condition A. This result
is expected, since higher frequency I/O operation decreases the signal eye
width, which makes the signal more susceptible to aliased jitter. m = 20
was selected to obtain the minimum prediction error for Condition B.
[Figure 5.4 plot: Prediction Error (LSB) versus Window Size (m), log scale from 10^0 to 10^3]
Figure 5.4: Estimation Error vs. Size of Window (Condition A)
[Figure 5.5 plot: Prediction Error (LSB) versus Window Size (m), log scale from 10^0 to 10^3]
Figure 5.5: Estimation Error vs. Size of Window (Condition B)
5.3.1.2 Number of Samples Sweep
In this simulation, the number of transmitted bits was varied from 100 to
10,000 to determine the dependency on the number of samples. For each
iteration of the simulation, a DNL value was randomly generated, and an
estimated DNL was then calculated using each algorithm. 20 iterations were
run for each number of samples to obtain stable results. Figures 5.6 and
5.7 present the estimation error for each number of bits, and Tables 5.2
and 5.3 summarize the data. In general, both algorithms showed that the
accuracy improved greatly once about 1,000 or more bits were sampled. For
Condition B, the prediction errors of the plain undersampling method were
greater than those for Condition A, while the prediction errors of the
sliding window search based method increased only slightly compared to
Condition A. The proposed method based on the sliding window search
algorithm yielded better accuracy than the plain undersampling method
under both conditions.
[Figure 5.6 plot: Prediction Error (LSB) versus Number of Bits; series: Undersampling, Undersampling w/ Sliding Window Search]
Figure 5.6: Estimation Error vs. Number of Bits (Condition A)
[Figure 5.7 plot: Prediction Error (LSB) versus Number of Bits; series: Undersampling, Undersampling w/ Sliding Window Search]
Figure 5.7: Estimation Error vs. Number of Bits (Condition B)
#Bits     Undersampling    w/ Sliding Window Search
100       12.75            9.5
200       19.65            0.9
300       18.9             0.5
400       12.1             1.25
500       16.65            0.6
600       17.45            0.55
700       17.85            0.6
800       17.6             0.45
900       15.1             1.1
1,000     0.85             0.8
2,000     0.95             0.45
3,000     0.9              0.35
4,000     1.25             0.2
5,000     0.7              0.3
6,000     1.1              0.3
7,000     0.85             0.2
8,000     1.45             0.1
9,000     1.05             0.3
10,000    0.8              0.3
Table 5.2: Estimation Error (LSB) vs. Number of Bits (Condition A)
#Bits     Undersampling    w/ Sliding Window Search
100       8.375            4.375
200       2.25             4.5
300       3.5              1.125
400       2.375            1.75
500       2.25             1.125
600       2.875            1.0
700       2.375            0.875
800       3.75             0.5
900       1.875            0.75
1,000     2.375            0.75
2,000     2.125            0.75
3,000     2.5              0.625
4,000     2.125            0.25
5,000     1.5              0.5
6,000     2.875            0.5
7,000     1.375            0.5
8,000     1.5              0.375
9,000     1.5              0.5
10,000    1.75             0.5
Table 5.3: Estimation Error (LSB) vs. Number of Bits (Condition B)
[Figure 5.8 plot: Prediction Error (LSB) versus RJ RMS (psec); series: Undersampling, Undersampling w/ Sliding Window Search]
Figure 5.8: Estimation Error vs. RJ σ (Condition A)
5.3.1.3 Amount of Jitter Sweep
In this simulation for Condition A, we set the amplitude of PJ to 10 psec
across the simulation data points and varied RJ from 1 psec to 20 psec,
which covered both the PJ dominant and RJ dominant cases. 10,000 bits were
sampled and 20 iterations were run to obtain reliable data points.
Figure 5.8 and Table 5.4 present the results. Although the estimation
error tended to increase slightly as RJ increased, the proposed algorithm
again showed better accuracy than the undersampling-only case.
RJ RMS (psec)    Undersampling    w/ Sliding Window Search
1                0.4              0.1
2                0.65             0.1
3                0.65             0.1
4                0.65             0.2
5                1.25             0.2
6                0.7              0.2
7                0.9              0.15
8                1.25             0.3
9                0.7              0.1
10               1.3              0.3
11               1.05             0.3
12               1.3              0.3
13               1.3              0.2
14               1.25             0.55
15               1.2              0.5
16               1.4              0.4
17               1.35             0.25
18               1.85             0.3
19               1.3              0.45
20               1.45             0.25
Table 5.4: Estimation Error (LSB) vs. RJ σ (Condition A)
For Condition B, we scaled the amplitude of PJ to 3 psec to inject a
reasonable amount of jitter with respect to the link speed. We then varied
RJ from 1 psec to 20 psec. The same number of bits were sampled and 20
iterations were run to obtain reliable data points. The results are
presented in Figure 5.9 and Table 5.5. In addition to the general trend of
the sliding window search yielding better accuracy, the prediction errors
of the undersampling method were larger than those for Condition A.
[Figure 5.9 plot: Prediction Error (LSB) versus RJ RMS (psec); series: Undersampling, Undersampling w/ Sliding Window Search]
Figure 5.9: Estimation Error vs. RJ σ (Condition B)
RJ RMS (psec)    Undersampling    w/ Sliding Window Search
1                1.0              1.0
2                1.0              1.125
3                1.125            1.0
4                1.625            0.5
5                2.375            0.5
6                2.625            0.5
7                1.0              0.5
8                2.5              0.5
9                3.5              0.375
10               2.25             0.875
11               1.75             0.75
12               4.25             0.5
13               2.25             0.75
14               3.875            0.5
15               3.625            0.75
16               2.125            1.125
17               1.875            1.0
18               3.75             1.625
19               3.375            2.5
20               3.25             2.125
Table 5.5: Estimation Error (LSB) vs. RJ σ (Condition B)
5.3.1.4 Repeatability Analysis
With the given simulation conditions, we ran the simulation 100 times to
observe the repeatability of our method. Randomly generated DNL values
were injected and the corresponding DNLs estimated by each algorithm were
plotted. 10,000 bits were sampled to estimate the DNLs. Figures 5.10 and
5.11 present the comparison between the plain undersampling method and our
sliding window search based method for the 2 Gbps and 10 Gbps links,
respectively. More outliers were observed with the plain undersampling
method than with the proposed method, since our method reduces the aliasing
impact of the injected jitter. The mean and standard deviation (STD) of the
DNL estimation errors are summarized in Tables 5.6 and 5.7. The results
suggest that the sliding window search algorithm is more repeatable than
the plain undersampling method. Note that both the mean error and the STD
of the errors increased slightly for the 10 Gbps link compared to the
2 Gbps link.
[Figure 5.10 plot: Predicted DNL (LSB) versus Injected DNL (LSB); series: Undersampling, Undersampling w/ Sliding Window Search, linear]
Figure 5.10: Estimated DNL vs. Injected DNL (Condition A)
                   Undersampling    w/ Sliding Window Search
DNL Mean Error     4.90             0.226
DNL Error STD      14.79            0.20
Table 5.6: Repeatability Analysis Results (LSB) (Condition A)
[Figure 5.11 plot: Predicted DNL (LSB) versus Injected DNL (LSB); series: Undersampling, Undersampling with Sliding Window Search, linear]
Figure 5.11: Estimated DNL vs. Injected DNL (Condition B)
                   Undersampling    w/ Sliding Window Search
DNL Mean Error     3.918            0.528
DNL Error STD      4.132            0.353
Table 5.7: Repeatability Analysis Results (LSB) (Condition B)
5.3.2 Hardware Validation
Hardware measurement using the proposed method was performed on silicon
with a 1.25 Gbps serial link circuit. The serial link was implemented with
a PI based clock and data recovery (CDR) circuit that had 32 programmable
PI steps. A design for testability (DFT) feature was implemented to enable
the proposed technique.
The general HVM test configuration for the serial link was based on the
loopback test scheme, in which the TX output is connected to the RX input
to enable timing margining based loopback. During the PI test mode,
multiplexers and test mode signals reconfigured the PI based CDR circuitry
into the undersampling setup that is conceptually depicted in Figure 5.3.
[Figure 5.12 block diagram labels: ATE Platform, DM, AWG, HCDPS, LCDPS, DUT, GPIOs, SerDes TX, PI, SerDes RX, Sampler, Sync Gen]
Figure 5.12: Hardware Validation Configuration
[Figure 5.13 diagram: pattern listings for PG#1 (including SNDC 1, label A, JECH A, EXIT) and PG#2 (label A, JECH A, EXIT), and a test program with executePattern() and a Callback block that calls restartPattern(); steps are annotated (a) through (e)]
Figure 5.13: Tester Pattern and Test Program Synchronization
Test content was developed using the proposed technique in a low cost
production tester environment. The ATE configuration included a standard
digital module that supports up to a 667 Mbps input data rate and a 500
Mbps output sampling rate for general production test purposes. An
arbitrary waveform generator (AWG) was installed to enable the proposed
technique by driving a clock signal with the period T + dT into the DUT.
Figure 5.12 illustrates the hardware validation environment. The sampler
circuit was implemented so that it operates at high speed and latches 1 if
the input voltage is above the threshold voltage, which is taken as the
signal transition voltage. The digital module, denoted as ‘DM’, was
connected to general purpose I/Os (GPIOs) as well as to the output of the
sampler. The AWG drove the undersampling clock for the sampler. When
creating the undersampling clock waveform, harmonics up to the 5th were
used to generate a low jitter square waveform. The clock domains of the two
modules were synchronized by a sync generation module.
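A square wave limited to the 5th harmonic can be illustrated with the Fourier series of an ideal square wave. The sketch below is a textbook construction and only an assumption about how such a waveform might be synthesized; the actual harmonic amplitudes and phases programmed into the AWG are not given in the text, and the function name is hypothetical.

```python
import math


def square_wave_harmonics(period, n_points, max_harmonic=5):
    """One period of a band-limited square wave built from odd harmonics.

    Uses the Fourier series of an ideal square wave, (4 / pi) * sum of
    sin(2 * pi * h * t / period) / h over odd h up to max_harmonic. The
    1 / h weighting is the textbook choice, assumed here for illustration.
    """
    samples = []
    for k in range(n_points):
        t = k * period / n_points
        value = 0.0
        for h in range(1, max_harmonic + 1, 2):   # 1st, 3rd and 5th harmonics
            value += math.sin(2.0 * math.pi * h * t / period) / h
        samples.append(4.0 / math.pi * value)
    return samples


# Hypothetical example: 100 points over a normalized period of 1.0.
wave = square_wave_harmonics(period=1.0, n_points=100)
print(round(wave[25], 4), round(wave[75], 4))
```

Keeping only the low odd harmonics gives edges with a finite, well defined slope at the zero crossings, which is the part of the waveform that matters for a sampling clock.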
In order to develop tester patterns that fit within the pattern memory
limits and properly synchronize the digital module with the AWG, the
pattern and test program handshaking architecture illustrated in
Figure 5.13 was proposed. Each module had its own pattern generator,
denoted as PG#1 or PG#2. The actual sequence is enumerated from (a) through
(e). The test program, which was implemented in C++, had initialization
code that was executed at the beginning of the test (a). The ‘execute
pattern’ function in the test program invoked patterns in both domains and
ran pre-conditioning patterns to enable the test mode on the DUT (b). A
callback function (c) was introduced to handle tasks such as tester
conditioning and initial result comparison that needed to be executed after
the pre-conditioning patterns. While the callback function was being
executed, the pattern from PG#2 ran in an infinite loop to wait for the
callback function to complete. The callback function contained a ‘pattern
restart’ function (d) that synchronized the two domains again to collect
the undersampled bit stream. The undersampled bit stream from the DUT was
stored in vector memory; after the patterns were executed, control returned
to the test program, which post-processed the data to extract the PI step
edge location (e). The program repeated the sequence to construct the full
set of PI step edge locations and extracted DNLs and INLs. Note that for
the hardware validation, the resolution interpolation technique was not
implemented, in order to keep the test program simple.
At nominal device supply voltage and room temperature, we performed 12
measurements on one DUT and plotted the PI step edge locations. Figure 5.14
shows the graph normalized to the 32 locations. As shown in the figure,
depending on the initial starting point of the undersampling, the estimated
PI step location starts from a different location, and the value wraps
around when it reaches the maximum value. This does not impact the
stability of our algorithm, since DNL values are calculated from the
differences between adjacent locations. For the cases where the edge
location values wrapped around, proper handling was implemented to yield
correct DNL and, subsequently, INL estimations.
Figure 5.14: PI Step Location Plot for 32 Positions
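One way to handle the wrap-around is to take the difference between adjacent edge locations modulo the full range. The Python sketch below assumes that the edge locations are normalized so that the PI steps together span one full range (`full_scale`), and that every true step is positive and smaller than that range. The function name and the end-to-end normalization are illustrative assumptions, not the dissertation's implementation.

```python
from typing import List, Tuple


def dnl_inl_from_edges(edges: List[float], full_scale: float) -> Tuple[List[float], List[float]]:
    """Compute DNL and INL (in LSB) from estimated PI step edge locations.

    edges[k] is the estimated edge location for PI step k, in units where
    the whole wrap-around range equals full_scale (assumption).
    """
    n = len(edges)
    ideal_step = full_scale / n
    dnl = []
    for k in range(n - 1):
        # The modulo undoes a wrap between step k and step k+1.
        step = (edges[k + 1] - edges[k]) % full_scale
        dnl.append(step / ideal_step - 1.0)
    # INL is the running sum of the DNL values.
    inl = []
    acc = 0.0
    for d in dnl:
        acc += d
        inl.append(acc)
    return dnl, inl
```

Because only differences are used, the arbitrary starting point of the undersampling drops out, which matches the observation in the text that the starting location does not affect the result.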
After collecting the 12 measurements, the measured values were averaged to
calculate the final DNLs and INLs for the device. The same sequence was
repeated on 4 DUTs. The initial data collection on the 4 DUTs suggested
that the synchronization delay induced between the ATE’s dual pattern
generation operation and the AWG instrument clock degraded the accuracy of
the linearity test technique. By characterizing the offset delta induced by
this delay and compensating the measured data by applying the offset, we
obtained repeatable results, which are summarized in normalized form in
Table 5.8. The hardware measurement results were within a reasonable error
margin of the 2 Gbps simulation results, considering the slight differences
in simulation conditions and speed.

DUT ID           A      B      C      D
DNL Max Error    0.21   0.57   0.35   0.47
INL Max Error    2.90   1.16   0.89   0.66
DNL Error STD    0.16   0.24   0.20   0.17
INL Error STD    0.94   0.51   0.61   0.48
Table 5.8: Hardware Validation Results (LSB)
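The averaging, offset compensation, and error statistics described above can be sketched in Python as follows. The helper names are hypothetical. The sketch assumes a single scalar offset and a population standard deviation, neither of which is specified in the text.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def average_runs(runs: List[List[float]]) -> List[float]:
    """Average repeated measurements step by step (one inner list per run)."""
    return [mean(column) for column in zip(*runs)]


def compensate(values: List[float], offset: float) -> List[float]:
    """Subtract a characterized offset (assumed scalar) from every value."""
    return [v - offset for v in values]


def error_stats(measured: List[float], reference: List[float]) -> Tuple[float, float]:
    """Return (maximum absolute error, standard deviation of the error)."""
    errors = [m - r for m, r in zip(measured, reference)]
    return max(abs(e) for e in errors), pstdev(errors)
```

For example, 12 per-run DNL lists would be passed to `average_runs`, the result to `compensate`, and the compensated values compared against reference values with `error_stats` to obtain figures of the kind reported in Table 5.8.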
Correlation activities were performed between system and tester result
outliers. A few DUTs randomly failed when the proposed method was used on
an HVM tester, although the same parts showed good linearity on a system
board. The failure was observed as an abrupt jump in the calculated PI step
location values for certain PI steps. Although efforts to minimize the
configuration differences between the system board and the tester, such as
matching the board termination resistance, did not yield positive results,
experiments showed that increasing the common mode voltage of the
undersampling clock signal improved the correlation results. Further
investigation identified that instrument noise from the tester pin
electronics as well as the AWG degraded the signal integrity of the
undersampling clock, and the resultant clock signal was marginal for
certain DUTs in a random manner. Increasing the common mode voltage
improved the signal to noise ratio (SNR) of the undersampling clock signal
and eliminated the phase divergence issue. Table 5.9 summarizes the
difference between the outlier results and the corrected results after
increasing the common mode voltage.

             Before   After
DNL Error    18.97    0.09
INL Error    53.48    0.20
Table 5.9: Before & After Voltage Correction Results (LSB)
5.4 Comparison with RJ Injection Based PI Test Method
Table 5.10 summarizes the comparison between the RJ injection based PI test
method presented in the previous chapter and the sliding window search
based PI test method. To present a fair comparison, simulation results for
the 10 Gbps high speed I/O configuration are compared. The 3σ based maximum
prediction error is 0.67 LSB for the RJ injection based method and 1.58 LSB
for the sliding window search based method. In general, the maximum
prediction error is the most important factor to consider when adopting a
method, since the mean prediction error can be compensated by
characterizing the measurement and then applying an offset in a real HVM
test environment. With this consideration, the RJ injection based method is
more desirable in terms of test accuracy. However, the sliding window
search based method does not require an RJ injection source on the load
board, which provides a simpler and more cost effective test solution.
Users should consider the tradeoff between accuracy and cost to determine
the optimum PI test solution for the application.

                        RJ Injection           Sliding Window Search
Mean DNL Error          0.31 LSB               0.53 LSB
Error STD               0.12 LSB               0.35 LSB
Max Error               0.67 LSB               1.58 LSB
Tested Freq.            10 Gbps                10 Gbps
Area Overhead           Not significant        Not significant
Additional Equipment    RJ injection source    RJ injection source
                        required               not required
Table 5.10: Comparison between Two PI Test Methods
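The 3σ based maximum errors in Table 5.10 are numerically consistent with adding three standard deviations to the mean error. The short Python check below is written under that reading; the function name is hypothetical, and the formula is inferred from the tabulated values rather than quoted from the text.

```python
def three_sigma_max_error(mean_error: float, error_std: float) -> float:
    """Maximum prediction error estimated as mean + 3 * standard deviation."""
    return mean_error + 3.0 * error_std


# Values taken from Table 5.10 (all in LSB).
rj_injection = three_sigma_max_error(0.31, 0.12)
sliding_window = three_sigma_max_error(0.53, 0.35)
print(round(rj_injection, 2), round(sliding_window, 2))
```

Under this reading, the gap between the two methods comes mainly from the larger error spread of the sliding window search method, not from its mean error, which could be calibrated out.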
5.5 Summary
In this chapter, a cost effective phase interpolator test technique is
proposed. We demonstrate that the proposed method yields accurate
measurement results in both simulation and an HVM ATE environment. The
results based on hardware validation show good correlation with actual
system PI characteristics as well as good repeatability. The proposed
nonlinearity estimation algorithm is efficient and concise in that it can
be implemented in a conventional HVM tester.
Chapter 6
Conclusions and Future Research Directions
6.1 Conclusions
In this dissertation, novel test methodologies are proposed that provide
additional coverage for loopback test. The known limitations of loopback
test, such as fault masking and the non-ideality of margining circuitry,
are discussed, and three solutions are proposed to resolve them.
The fault masking issue is addressed by the methodology proposed in Chapter
3 to test the linearity of embedded ADCs and DACs in high speed serial
I/Os. It uses a loopback setup with additional Gaussian noise on the
loopback path, which distorts the code probability distribution of the ADC.
The distortion is post-processed using the proposed algorithm to calculate
the nonlinearity of the ADC independent of the DAC. Then, using the
combined loopback response and the ADC’s individual response, the
nonlinearity of the DAC is extracted. This procedure provides accurate DNLs
and INLs for each converter regardless of the fault masking effect. Other
than the interconnection and test mode switches, which may already be
available due to the existing loopback configuration, no additional
circuits are necessary, in contrast to other BIST schemes. Since the
proposed method is an algorithm based approach, it does not depend on the
specific architecture of the data converters in high speed I/Os. Compared
to testing with automated test equipment (ATE) or other BIST circuitry,
this approach is cost effective and easy to implement.
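As a rough illustration of why a Gaussian stimulus exposes ADC nonlinearity through the code probability distribution, the Python sketch below estimates transition levels from a code histogram by inverting the normal CDF. It is a generic histogram estimate that assumes the ADC input is purely Gaussian with known mean and standard deviation. It is not the algorithm of Chapter 3, and all function names are hypothetical.

```python
import bisect
import random
from statistics import NormalDist
from typing import List


def transition_levels(counts: List[int], mu: float, sigma: float) -> List[float]:
    """Estimate ADC transition levels from a code histogram.

    counts[k] is the number of hits on code k for an input assumed to be
    N(mu, sigma). Level k separates code k from code k+1.
    """
    total = sum(counts)
    dist = NormalDist(mu, sigma)
    levels = []
    cumulative = 0
    for c in counts[:-1]:
        cumulative += c
        if not 0 < cumulative < total:
            raise ValueError("every code boundary needs hits on both sides")
        # The fraction of samples below a level equals the CDF at that level.
        levels.append(dist.inv_cdf(cumulative / total))
    return levels


def dnl_from_levels(levels: List[float]) -> List[float]:
    """End-point DNL in LSB for the inner codes (width versus average width)."""
    widths = [b - a for a, b in zip(levels, levels[1:])]
    lsb = (levels[-1] - levels[0]) / len(widths)
    return [w / lsb - 1.0 for w in widths]


def simulate_histogram(true_levels: List[float], n: int, seed: int = 1) -> List[int]:
    """Build a code histogram for a toy ADC driven by N(0, 1) samples."""
    rng = random.Random(seed)
    counts = [0] * (len(true_levels) + 1)
    for _ in range(n):
        counts[bisect.bisect_right(true_levels, rng.gauss(0.0, 1.0))] += 1
    return counts
```

For example, a toy 3-bit ADC with one shifted transition level (`[-1.5, -1.0, -0.5, 0.1, 0.5, 1.0, 1.5]`) yields one wide and one narrow code, which appear as a positive and a negative DNL value after `dnl_from_levels(transition_levels(counts, 0.0, 1.0))`.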
The non-ideality of the margining circuit is addressed by the test
technique proposed in Chapter 4 for phase interpolators using random jitter
injection. Using a random jitter source that generates jitter with a
Gaussian distribution, the bit probability distribution generated by the
receiver is used as a reference. When generating the reference
distribution, an undersampling technique is used to construct the ideal
distribution that would be obtained if the margining circuitry had no
non-ideality. Then, the phase interpolator is used to create another
distribution to be compared with the reference. The difference between the
two distributions provides the information needed to extract the
nonlinearities of the phase interpolator. The proposed algorithm is cost
effective in that it does not require hardware overhead in the case of a
forwarded clock scheme. For a derived clock scheme, only one multiplexer
needs to be added in the clock lane to implement the clock injection
capability. The load board or system board only needs to contain a random
jitter injector and a clock generator for undersampling. The proposed
algorithm, based on statistical data collection and curve fitting,
accurately predicts the DNL of the phase interpolator. Experimental results
show that our method accurately predicted the nonlinearities of the PI.
Since our method does not require significant circuit changes, it can be
easily applied to various high speed I/O microarchitectures where phase
interpolators are used.
Although the proposed technique to extract the non-ideality of phase
interpolators accurately estimates DNLs, it requires a random jitter source
on the load board, which may not always be available in a given HVM test
environment. The PI linearity test technique described in Chapter 5 is
proposed to remove the requirement for the random jitter source by using
the proposed sliding window search algorithm. Rather than using a random
jitter source on the load board, it uses the intrinsic jitter profile
collected from undersampling. Instead of aggregating the entire bit stream
to create the jitter distribution, it calculates the edge transition
location from each individual transition region to avoid the low frequency
jitter aliasing effect. The sliding window search algorithm is proposed to
simplify the calculation while maintaining the accuracy of the individual
calculation of edge transition regions. The new algorithm accurately
estimates linearity, and the method was implemented in a low cost HVM ATE
environment to assess its feasibility. The results based on hardware
validation show good correlation with actual system PI characteristics as
well as good repeatability. The proposed nonlinearity estimation algorithm
is efficient and concise in that it can be implemented in a conventional
HVM tester.
6.2 Future Research Directions
Overall, this dissertation addresses key limitations in loopback test by
proposing loopback compatible test methods to provide coverage on areas of
concern. Beyond the conclusions discussed above, high speed interface test
challenges will continue to grow with ever increasing speed and
performance. Some potential areas for future research are highlighted
below.
• The proposed sliding window search algorithm was validated on a handful
of DUTs; further validation will be performed to evaluate the method in a
volume manufacturing environment. We will also evaluate the method on next
generation serial link schemes operating above 10 Gbps, which would
provide a new set of challenges.
• As the speed of SerDes increases, the size of the I/O design increases
due to fine calibration requirements, which are mostly met by digital
logic. Although there is prior work on automated test pattern generation
(ATPG) for various mixed signal blocks [9], methods to develop test
patterns automatically for high speed mixed signal IP blocks are not well
defined, partly due to the lack of a widely adopted analog fault model.
Development of widely adopted analog fault models and novel techniques that
can generate effective test patterns automatically will augment current
test solutions for high speed I/Os.
• Adaptive equalization is being implemented in modern high speed serial
interfaces to overcome channel bandwidth limitations. There are previous
papers [26, 45, 64] on adaptive equalizer test. However, with ever
increasing data rates on high speed serial interfaces, the equalization
circuitry will become more complex and sophisticated, and hence its test
will take longer. Continued research investment in adaptive equalization
test would provide solutions to overcome many roadblocks in testing next
generation high speed I/O circuits.
• Since an increasing number of building blocks require digitally assisted
calibration, pre-silicon mixed signal verification is now an essential
verification step in the modern I/O IP design flow. Applying mixed signal
verification techniques to automate fault grading and to correlate the
pre-silicon test results with post-silicon validation results would provide
added benefit by identifying effective test patterns and expediting the
turnaround time of debug.
Bibliography
[1] DDR2 SDRAM Fully Buffered DIMM™ (FB-DIMM™) design specification.
http://www.jedec.org/.
[2] HyperTransport™ I/O link specification revision 1. http://www.hypertransport.org.
[3] Introduction to XAUI™. http://www.10gea.org/.
[4] OC800™ user manual. Apria Technology.
[5] PCI Express™ base specification. http://www.pcisig.com.
[6] T2000™ 250 MHz digital module product description. Advantest
Corporation.
[7] IEEE 802 10GBASE-T tutorial. http://ieee802.org/, 2003.
[8] Test and test equipment. The International Technology Roadmap For
Semiconductors, 2009.
[9] M. Abbas, K.-T. Cheng, Y. Furukawa, S. Komatsu, and K. Asada.
An automatic test generation framework for digitally-assisted adaptive
equalizers in high-speed serial links. In Design, Automation Test in
Europe Conference Exhibition (DATE), 2010, pages 1755 –1760, Mar.
2010.
[10] S. Aouini and G.W. Roberts. A predictable robust fully programmable
analog gaussian noise source for mixed-signal/digital ATE. In Test Con-
ference, 2006. ITC ’06. IEEE International, pages 1–10, Oct. 2006.
[11] M. Aoyama, K. Ogasawara, M. Sugawara, T. Ishibashi, S. Shimoyama,
K. Yamaguchi, T. Yanagita, and T. Noma. 3 Gbps, 5000 ppm spread
spectrum SerDes PHY with frequency tracking phase interpolator for
serial ATA. In VLSI Circuits, 2003. Digest of Technical Papers. 2003
Symposium on, pages 107–110, Jun. 2003.
[12] K. Arabi and B. Kaminska. Efficient and accurate testing of analog-to-
digital converters using oscillation-test method. In Proc. of European
Design and Test Conference, pages 348–352, 1997.
[13] K. Arabi, B. Kaminska, and J. Rzeszut. A new built-in self-test ap-
proach for digital-to-analog and analog-to-digital converters. In Proc.
of International Conference on Computer Aided Design, pages 491–494,
1994.
[14] K. Arabi, B. Kaminska, and J. Rzeszut. BIST for D/A and A/D con-
verters. In IEEE Design and Test of Computers, volume 13, pages
40–49, 1996.
[15] F. Azais, S. Bernard, Y. Bertrand, and M. Renovell. Towards an ADC
BIST scheme using the histogram test technique. In Proc. of IEEE
European Test Workshop, pages 53–58, 2000.
[16] F. Azais, S. Bernard, Y. Bertrand, and M. Renovell. Implementation
of a linear histogram BIST for ADCs. In Proc. of the 2001 Design,
Automation and Test in Europe Conference, pages 590–595, 2001.
[17] M. Benyahia, J.B. Moulard, F. Badets, A. Mestassi, T. Finateu, L. Vogt,
and F. Boissieres. A digitally controlled 5 GHz analog phase interpolator
with 10 GHz LC PLL. In Design and Technology of Integrated Systems
in Nanoscale Era, 2007. DTIS. International Conference on, pages 130–
135, Sep. 2007.
[18] J.F. Bulzacchelli, M. Meghelli, S.V. Rylov, W. Rhee, A.V. Rylyakov,
H.A. Ainspan, B.D. Parker, M.P. Beakes, Aichin Chung, T.J. Beukema,
P.K. Pepeljugoski, L. Shan, Y.H. Kwark, S. Gowda, and D.J. Friedman.
A 10-Gb/s 5-Tap DFE/4-Tap FFE transceiver in 90-nm CMOS tech-
nology. Solid-State Circuits, IEEE Journal of, 41(12):2885–2900, Dec.
2006.
[19] M. Burns and G. W. Roberts. An Introduction to Mixed-Signal IC Test
and Measurement. Oxford University Press, New York, NY, 2001.
[20] Y. Cai, B. Laquai, and K. Luehman. Jitter testing for gigabit serial
communication transceivers. IEEE Design and Test of Computers,
19(1):66–74, 2002.
[21] Y. Cai, S.A. Werner, G.J. Zhang, M.J. Olsen, and R.D. Brink. Jitter
testing for multi-gigabit backplane serdes - techniques to decompose
and combine various types of jitter. In Proceedings of the 2002 IEEE
International Test Conference, ITC ’02, pages 700–709, Washington, DC,
USA, 2002. IEEE Computer Society.
[22] A.H. Chan and G.W. Roberts. A jitter characterization system using a
component-invariant vernier delay line. IEEE Trans. Very Large Scale
Integr. Syst., 12(1):79–95, 2004.
[23] H.-M. Chang, C.-H. Chen, K.-Y. Lin, and K.-T. Cheng. Calibration and
testing time reduction techniques for a digitally-calibrated pipelined adc.
In VLSI Test Symposium, 2009. VTS ’09. 27th IEEE, pages 291 –296,
May 2009.
[24] H.-M. Chang, M.-S. Lin, and K.-T. Cheng. Digitally-assisted analog/rf
testing for mixed-signal socs. In Asian Test Symposium, 2008. ATS
’08. 17th, pages 43 –48, Nov. 2008.
[25] A. Chatterjee and N. Nagi. Design for testability and built-in self-test
of mixed-signal circuits: a tutorial. In VLSI Design, 1997. Proceedings.,
Tenth International Conference on, pages 388–392, Jan. 1997.
[26] K.-T. Cheng and H.-M. Chang. Test strategies for adaptive equalizers.
In Custom Integrated Circuits Conference, 2009. CICC ’09. IEEE, pages
597 –604, Sep. 2009.
[27] J.H. Chun, H. Yu, and J.A. Abraham. An efficient linearity test for
on-chip high speed ADC and DAC using loop-back. In ACM Great Lakes
Symposium on VLSI, pages 328–331, 2004.
[28] J.H. Chun, J.W. Lee, and J.A. Abraham. A novel characterization tech-
nique for high speed I/O mixed signal circuit components using random
jitter injection. In Proceedings of the 2010 Asia and South Pacific De-
sign Automation Conference, ASPDAC ’10, pages 312–317, Piscataway,
NJ, USA, 2010. IEEE Press.
[29] W. Dalal and D.A. Rosenthal. Measuring jitter of high speed data
channels using undersampling techniques. In Proceedings of the 1998
IEEE International Test Conference, ITC ’98, pages 814–818, Washing-
ton, DC, USA, 1998. IEEE Computer Society.
[30] P. Dierckx. Curve and surface fitting with splines. Oxford University
Press, Inc., New York, NY, USA, 1993.
[31] P. Dudek, S. Szczepanski, and J.V. Hatfield. A high-resolution cmos
time-to-digital converter utilizing a vernier delay line. Solid-State Cir-
cuits, IEEE Journal of, 35(2):240 –247, Feb. 2000.
[32] D.J. Foley and M.P. Flynn. A low-power 8-pam serial transceiver in 0.5-
um digital cmos. IEEE Journal of Solid-State Circuits, 37(3):310–316,
2002.
[33] W.A. Fritzsche and A.E. Haque. Low cost testing of multi-GBit device
pins with ATE assisted loopback instrument. In Test Conference, 2008.
ITC 2008. IEEE International, pages 1–8, Oct. 2008.
[34] T. Fujibe, M. Suda, K. Yamamoto, Y. Nagata, K. Fujita, D. Watanabe,
and T. Okayasu. Dynamic arbitrary jitter injection method for ¡ 6.5Gb/s
SerDes testing. In Test Conference, 2009. ITC 2009. International,
pages 1–10, Nov. 2009.
[35] S. Goyal and A. Chatterjee. Linearity testing of A/D converters using
selective code measurement. J. Electron. Test., 24:567–576, Dec. 2008.
[36] S. Goyal and M. Purtell. Alternate test methodology for high speed
A/D converter testing on low cost tester. In Test Symposium, 2005.
Proceedings. 14th Asian, pages 14 – 17, Dec. 2005.
[37] A. Haider, S. Bhattacharya, G. Srinivasan, and A. Chatterjee. A system-
level alternate test approach for specification test of RF transceivers in
loopback mode. In VLSI Design, 2005. 18th International Conference
on, pages 289 – 294, Jan. 2005.
[38] H. Higashi, S. Masaki, M. Kibune, S. Matsubara, T. Chiba, Y. Doi,
H. Yamaguchi, H. Takauchi, H. Ishida, K. Gotoh, and H. Tamura. A
5-6.4-Gb/s 12-channel transceiver with pre-emphasis and equalization.
Solid-State Circuits, IEEE Journal of, 40(4):978 – 985, Apr. 2005.
[39] W.T. Holman, J.A. Connelly, and A.B. Dowlatabadi. An integrated
analog/digital random noise source. Circuits and Systems I: Funda-
mental Theory and Applications, IEEE Transactions on, 44(6):521–528,
Jun. 1997.
[40] D. Hong and K.-T. Cheng. Bit error rate estimation for improving jitter
testing of high-speed serial links. In Test Conference, 2006. ITC ’06.
IEEE International, pages 1 –10, Oct. 2006.
[41] D. Hong and K.-T. Cheng. Bit-error rate estimation for bang-bang
clock and data recovery circuit in high-speed serial links. In VLSI Test
Symposium, 2008. VTS 2008. 26th IEEE, pages 17 –22, May 2008.
[42] D. Hong, C. Dryden, and G. Saksena. An efficient random jitter mea-
surement technique using fast comparator sampling. In Proceedings
of the 23rd IEEE Symposium on VLSI Test, VTS ’05, pages 123–130,
Washington, DC, USA, 2005. IEEE Computer Society.
[43] D. Hong, C.-K. Ong, and K.-T. Cheng. BER estimation for serial
links based on jitter spectrum and clock recovery characteristics. In
Proceedings of the International Test Conference on International Test
Conference, ITC ’04, pages 1138–1147, Washington, DC, USA, 2004.
IEEE Computer Society.
[44] D. Hong, C.-K. Ong, and K.-T. Cheng. Bit-error-rate estimation for
high-speed serial links. Circuits and Systems I: Regular Papers, IEEE
Transactions on, 53(12):2616 –2627, Dec. 2006.
[45] D. Hong, S. Saberi, K.-T. Cheng, and C.P. Yue. A two-tone test method
for continuous-time adaptive equalizers. In Design, Automation Test in
Europe Conference Exhibition, 2007. DATE ’07, pages 1 –6, Apr. 2007.
[46] J.-L. Huang and K.-T. Cheng. An on-chip short-time interval measure-
ment technique for testing high-speed communication links. In Proceed-
ings of the 19th IEEE VLSI Test Symposium, VTS ’01, pages 380–385,
Washington, DC, USA, 2001. IEEE Computer Society.
[47] J.-L. Huang, C.-K. Ong, and K.-T. Cheng. A BIST scheme for on-chip
ADC and DAC testing. In Proc. of the 2000 Design, Automation and
Test in Europe Conference and Exhibition, pages 216–220, 2000.
[48] P. Iyer, S. Jain, B. Casper, and J. Howard. Testing high-speed io
links using on-die circuitry. In VLSI Design, 2006. Held jointly with
5th International Conference on Embedded Systems and Design., 19th
International Conference on, pages 4–10, Jan. 2006.
[49] Y. Jiang and A. Piovaccari. A compact phase interpolator for 3.125G
SerDes application. In Mixed-Signal Design, 2003. Southwest Sympo-
sium on, pages 249–252, Feb. 2003.
[50] L. Jin. Linearity test time reduction for analog-to-digital converters
using the kalman filter with experimental parameter estimation. In
Test Conference, 2008. ITC 2008. IEEE International, pages 1 –8, Oct.
2008.
[51] L. Jin, K. Parthasarathy, T. Kuyel, D. Chen, and R.L. Geiger. Ac-
curate testing of analog-to-digital converters using low linearity signals
with stimulus error identification and removal. Instrumentation and
Measurement, IEEE Transactions on, 54(3):1188 – 1199, Jun. 2005.
[52] D. A. Johns and K. Martin. Analog Integrated Circuit Design. John
Wiley and Sons, Inc., 1997.
[53] D.C. Keezer, D. Minier, and P. Ducharme. Variable delay of multi-
gigahertz digital signals for deskew and jitter-injection test applications.
In Design, Automation and Test in Europe, 2008. DATE ’08, pages 1486
–1491, Mar. 2008.
[54] B. Kim and J.A. Abraham. Efficient loopback test for aperture jitter
in embedded mixed-signal circuits. Circuits and Systems I: Regular
Papers, IEEE Transactions on, 58(8):1773 –1784, Aug. 2011.
[55] B. Kim, Z. Fu, and J.A. Abraham. Transformer-coupled loopback test
for differential mixed-signal specifications. In VLSI Test Symposium,
2007. 25th IEEE, pages 291 –296, May 2007.
[56] B. Kim, H. Shin, J.H. Chun, and J.A. Abraham. Predicting mixed-
signal dynamic performance using optimised signature-based alternate
test. Computers Digital Techniques, IET, 1(3):159 –169, May 2007.
[57] S. Kim and M. Soma. An all-digital built-in self-test for high-speed
phase-locked loops. Circuits and Systems II: Analog and Digital Signal
Processing, IEEE Transactions on, 48(2):141 –150, Feb. 2001.
[58] R. Kreienkamp, U. Langmann, C. Zimmermann, T. Aoyama, and H. Sied-
hoff. A 10-Gb/s CMOS clock and data recovery circuit with an analog
phase interpolator. Solid-State Circuits, IEEE Journal of, 40(3):736–
743, Mar. 2005.
[59] N. Kurd, J. Douglas, P. Mosalikanti, and R. Kumar. Next generation
Intel® micro-architecture (Nehalem) clocking architecture. In VLSI
Circuits, 2008 IEEE Symposium on, pages 62–63, Jun. 2008.
[60] B. Laquai and Y. Cai. Testing gigabit multilane serdes interfaces with
passive jitter injection filters. In Test Conference, 2001. Proceedings.
International, pages 297 –304, 2001.
[61] J.W. Lee, J.H. Chun, and J.A. Abraham. A random jitter RMS estima-
tion technique for BIST applications. In Asian Test Symposium, 2009.
ATS ’09., pages 9 –14, Nov. 2009.
[62] J.W. Lee, J.H. Chun, and J.A. Abraham. A delay measurement method
using a shrinking clock signal. In ACM Great Lakes Symposium on
VLSI, pages 139–142, 2010.
[63] M. Li. Jitter, noise, and signal integrity at high-speed. Prentice Hall
Press, Upper Saddle River, NJ, USA, 2007.
[64] M. Lin and K.-T. Cheng. Testable design for adaptive linear equalizer
in high-speed serial links. In Test Conference, 2006. ITC ’06. IEEE
International, pages 1 –10, Oct. 2006.
[65] M. Lin, K.-T. Cheng, J. Hsu, M.C. Sun, J. Chen, and S. Lu.
Production-oriented interface testing for PCI-Express™ by enhanced
loop-back technique. In Test Conference, 2005. Proceedings. ITC 2005. IEEE
International, pages 11–20, Nov. 2005.
[66] M. Mahoney. DSP-Based Testing of Analog and Mixed-Signal Circuits.
IEEE Computer Society Press, Washington, D.C., 1987.
[67] T.M. Mak, M. Tripp, and A. Meixner. Testing Gbps interfaces without
a gigahertz tester. IEEE Design and Test of Computers, 21(4):278–286,
2004.
[68] A. Martin, B. Casper, J. Kennedy, J. Jaussi, and R. Mooney. 8 Gb/s dif-
ferential simultaneous bidirectional link with 4mV 9ps waveform capture
diagnostic capability. In Solid-State Circuits Conference, 2003. Digest
of Technical Papers. ISSCC. 2003 IEEE International, pages 78–479
vol.1, 2003.
[69] S. Max. Ramp testing of ADC transition levels using finite resolution
ramps. In Proc. of the 2001 IEEE International Test Conference, pages
495–501, 2001.
[70] A. Meixner, A. Kakizawa, B. Provost, and S. Bedwani. External loop-
back testing experiences with high speed serial interfaces. In Test Con-
ference, 2008. ITC 2008. IEEE International, pages 1–10, Oct. 2008.
[71] M. J. Ohletz. Hybrid built in self test(HBIST) for mixed analogue/digital
ICs. In Proc. of the 1993 IEEE International Test Conference, pages
307–316, 1993.
[72] V. G. Oklobdzija and R. K. Krishnamurthy. High-Performance Energy-
Efficient Microprocessor Design (Series on Integrated Circuits and Sys-
tems). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[73] S. Ozev and A. Orailoglu. An integrated tool for analog test gener-
ation and fault simulation. In Proc. of the 2002 IEEE International
Symposium on Quality of Electronic Design, pages 267–272, 2002.
[74] J. Park, H. Shin, and J.A. Abraham. Pseudorandom test for nonlinear
circuits based on a simplified volterra series model. In ISQED, pages
495–500, 2007.
[75] J. Park, H. Shin, and J.A. Abraham. Parallel loopback test of mixed-
signal circuits. In VLSI Test Symposium, 2008. VTS 2008. 26th IEEE,
pages 309 –316, May 2008.
[76] J.G. Proakis. Digital Communications. McGraw Hill, 4th edition, 2001.
[77] B. Provost. Testability techniques for phase interpolators. In US Patent
Application no. 20090122849, 2009.
[78] A. Raghunathan, J.H. Chun, J.A. Abraham, and A. Chatterjee. Quasi-
oscillation based test for improved prediction of analog performance pa-
rameters. In Test Conference, 2004. Proceedings. ITC 2004. Interna-
tional, pages 252 – 261, Oct. 2004.
[79] A. Raghunathan, H. Shin, and J.A. Abraham. Prediction of analog per-
formance parameters using oscillation based test. In VLSI Test Sympo-
sium, 2004. Proceedings. 22nd IEEE, pages 377 – 382, Apr. 2004.
[80] M. Renovell, F. Azais, S. Bernard, and Y. Bertrand. Hardware resource
minimization for histogram-based ADC BIST. In Proc. of the 18th IEEE
VLSI Test Symposium, pages 247–252, 2000.
[81] I. Robertson, G. Hetherington, T. Leslie, I. Parulkar, and R. Lesnikoski.
Testing high-speed, large scale implementation of SerDes I/Os on chips
used in throughput computing systems. Test Conference, International,
0:8–17, 2005.
[82] J. Savoj and B. Razavi. A 10-Gb/s CMOS clock and data recovery
circuit with a half-rate linear phase detector. Solid-State Circuits, IEEE
Journal of, 36(5):761 –768, May 2001.
[83] X. Shi and F. Assaderaghi. Phase linearity test circuit. In US Patent
Application no. 20070252735, 2007.
[84] H. Shin, B. Kim, and J.A. Abraham. Spectral prediction for specification-
based loopback test of embedded mixed-signal circuits. In VLSI Test
Symposium, 2006. Proceedings. 24th IEEE, pages 1–6, May 2006.
[85] H. Shin, J. Park, and J.A. Abraham. A statistical digital equalizer for
loopback-based linearity test of data converters. In Test Symposium,
2006. ATS ’06. 15th Asian, pages 245 –250, Nov. 2006.
[86] M. Soma, W. Haileselassie, J. Yan, and R. Raina. A wavelet-based
timing parameter extraction method. In Test Conference, 2002. Pro-
ceedings. International, pages 120 – 128, 2002.
[87] R. Stephens. Jitter analysis: The dual-dirac model, RJ/DJ, and Q-
scale. Agilent Technologies, 2004.
[88] S. Sunter, C. McDonald, and G. Danialy. Contactless digital testing
of IC pin leakage currents. In Test Conference, 2001. Proceedings.
International, pages 204 –210, 2001.
[89] S. Sunter and N. Nagi. Test metrics for analog parametric faults. In
VLSI Test Symposium, 1999. Proceedings. 17th IEEE, pages 226 –234,
1999.
[90] S. Sunter and A. Roy. On-chip digital jitter measurement, from mega-
hertz to gigahertz. Design and Test of Computers, IEEE, 21(4):314–321,
Jul.-Aug. 2004.
[91] S. Sunter and A. Roy. A mixed-signal test bus and analog BIST with
’unlimited’ time and voltage resolution. In European Test Symposium
(ETS), 2011 16th IEEE, pages 81 –86, May 2011.
[92] S. Sunter, A. Roy, and J.-F. Cote. An automated, complete, structural
test solution for SerDes. In Test Conference, 2004. Proceedings. ITC
2004. International, pages 95 – 104, Oct. 2004.
[93] S. K. Sunter and N. Nagi. A simplified polynomial-fitting algorithm for
DAC and ADC BIST. In Proc. of International Test Conference, pages
389–395, 1997.
[94] S.K. Sunter. Testing high frequency ADCs and DACs with a low fre-
quency analog bus. In Test Conference, 2003. Proceedings. ITC 2003.
International, volume 1, pages 228 – 235, Oct. 2003.
[95] M. Suzuki, R. Shimizu, N. Naka, and K. Nakamura. High-speed inter-
face testing. Asian Test Symposium, 0:461, 2001.
[96] E. Teraoka et al. A built-in self-test for ADC and DAC in a
single-chip speech CODEC. In Proc. of the IEEE International Test
Conference, pages 791–796, 1993.
[97] Y. Tomita, M. Kibune, J. Ogawa, W.W. Walker, H. Tamura, and T. Kuroda.
A 10-Gb/s receiver with series equalizer and on-chip ISI monitor in
0.11-μm CMOS. Solid-State Circuits, IEEE Journal of, 40(4):986–993,
Apr. 2005.
[98] M.F. Toner and G.W. Roberts. A BIST scheme for an SNR test of a
sigma-delta ADC. In Proc. of the 1993 IEEE International Test Con-
ference, pages 805–814, 1993.
[99] M. Tripp, T.M. Mak, and A. Meixner. Elimination of traditional func-
tional testing of interface timings at Intel. In Test Conference, 2004.
Proceedings. ITC 2004. International, pages 1448 – 1454, Oct. 2004.
[100] P.N. Variyam and A. Chatterjee. Specification-driven test generation
for analog circuits. Computer-Aided Design of Integrated Circuits and
Systems, IEEE Transactions on, 19(10):1189 –1201, Oct. 2000.
[101] M. F. Wagdy and M. Goff. Linearizing average transfer characteristics
of ideal ADC’s via analog and digital dither. IEEE Trans. on Instru-
mentation and Measurement, 43(2):146–150, 1994.
[102] N. Weste and D. Harris. CMOS VLSI Design: A Circuits and Systems
Perspective. Addison-Wesley Publishing Company, USA, 4th edition,
2010.
[103] C. L. Wey. Built-in self-test design of current-mode algorithmic analog-
to-digital converters. IEEE Trans. on Instrumentation and Measure-
ment, 46(3):667–671, 1997.
[104] T. Xia, H. Zheng, J. Li, and A. Ginawi. Self-refereed on-chip jitter mea-
surement circuit using vernier oscillators. In VLSI, 2005. Proceedings.
IEEE Computer Society Annual Symposium on, pages 218 – 223, May
2005.
[105] T.J. Yamaguchi, M. Soma, M. Ishida, T. Watanabe, and T. Ohmi. Ex-
traction of instantaneous and rms sinusoidal jitter using an analytic sig-
nal method. Circuits and Systems II: Analog and Digital Signal Pro-
cessing, IEEE Transactions on, 50(6):288 – 298, Jun. 2003.
[106] C.-K.K. Yang, V. Stojanovic, S. Modjtahedi, M.A. Horowitz, and W.F.
Ellersick. A serial-link transceiver based on 8-g samples/s A/D and
D/A converters in 0.25-um cmos. Solid-State Circuits, IEEE Journal
of, 36(11):1684–1692, Nov. 2001.
[107] J. Yang, J. Kim, S. Byun, C. Conroy, and B. Kim. A quad-channel
3.125Gb/s/ch serial-link transceiver with mixed-mode adaptive equalizer
in 0.18 um CMOS. In Solid-State Circuits Conference, 2004. Digest of
Technical Papers. ISSCC. 2004 IEEE International, pages 176 – 520
Vol.1, Feb. 2004.
[108] H. Yu, J.A. Abraham, S. Hwang, and J. Roh. Efficient loop-back testing
of on-chip ADCs and DACs. In Proc. of the Asia and South Pacific
Design Automation Conference, pages 651–656, 2003.
[109] H. Yu, S. Hwang, and J.A. Abraham. DSP-based statistical self test of
on-chip converters. In Proc. of the 21st IEEE VLSI Test Symposium,
pages 83–88, 2003.
[110] H. Yu, H. Shin, J.H. Chun, and J.A. Abraham. Performance charac-
terization of mixed-signal circuits using a ternary signal representation.
In Test Conference, 2004. Proceedings. ITC 2004. International, pages
1389 – 1397, Oct. 2004.
Vita
Ji Hwan Chun was born in Seoul, South Korea in 1977. He received the
Bachelor of Science degree in Electronic Engineering from Yonsei
University, Seoul, South Korea in 2001. After joining the University of
Texas at Austin for his graduate study, he received the Master of Science
in Engineering degree in Electrical and Computer Engineering from the
University of Texas at Austin in 2003. While continuing his Ph.D. study
part-time, he joined Intel Corporation in 2004, where he has worked on
design, pre-silicon verification, and post-silicon tests for Intel®
Pentium® 4, Atom™, Xeon®, and HPC class microprocessors. He is currently a
Senior Component Design Engineer specializing in Mixed Signal Verification
and DFX (Design for Testability, Debug, Verification, and Manufacturing)
for next generation high performance CPUs. He received the University of
Texas Microelectronics and Computer Development (MCD) Fellowship in
2001-03.
Permanent address: 900 Pepper Tree Ln Apt 618
Santa Clara, California 95051
This dissertation was typeset with LaTeX† by the author.
†LaTeX is a document preparation system developed by Leslie Lamport as a
special version of Donald Knuth’s TeX Program.