CALIBRATION ADC AND ALGORITHM FOR ADAPTIVE PREDISTORTION OF HIGH-SPEED DACS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Alireza Dastgheib March 2013
CALIBRATION ADC AND ALGORITHM FOR ADAPTIVE
PREDISTORTION OF HIGH-SPEED DACS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF ELECTRICAL
ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Alireza Dastgheib
March 2013
http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/bk656yt4469
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Boris Murmann, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Ada Poon
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Bruce Wooley
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
In this thesis, the design and implementation of circuits and signal processing algorithms
required for digital predistortion of a digital-to-analog converter (DAC) with
an open-loop driver are presented. On the circuit design side, the implementation of
a high precision, high acquisition-bandwidth calibration analog-to-digital converter
(ADC) for sampling the DAC output is discussed. The ADC has a cyclic core and
a variant of the switched-RC sampling network suitable for high frequency opera-
tion. It is implemented in 90-nm CMOS and achieves an SFDR of higher than 72
dB for input frequencies under 500 MHz. On the signal processing side, a calibration
algorithm is presented that uses the input/output data of the DAC to identify its
nonlinearity and cancel it through digital predistortion. The algorithm represents a
novel technique for the linearization of Hammerstein systems with low computational
cost and is suitable for hardware realization. Overall, this calibration architecture
improves the SFDR of the DAC by close to 30 dB to achieve a final linearity of 53
dB for input frequencies up to 400 MHz and peak-to-peak differential output swing
of 800 mV.
Acknowledgements
Much like any journey in life, my PhD was not all about the destination—the thesis
in this case, but rather mostly about the process that took me here. I would like
to thank all the people who accompanied me in this journey and made me a better
person.
Firstly, I thank my advisor, Prof. Boris Murmann, who in addition to being a
research all-star, was a great instructor and a supportive mentor. I also thank Prof.
Bruce Wooley and Prof. Ada Poon, for being on my orals committee as well as my
dissertation reading committee. I must also thank Prof. Norbert Pelc for chairing
my orals committee. I am deeply thankful to Ann Guerra for her extraordinary help
with administrative issues and lots of other random errands. I thank my research
partner, Clay Daigle, for the many things I learned from him.
Being in Silicon Valley, I had the opportunity to get help from several industry
members. I would like to thank Ken Poulton (Agilent Technologies), Keith Ring
(Intersil) and Sim Narasimha for their help and brainstorming during various phases
of the research. I thank Phil Golden (Intersil) and Evan Chang (Hamamatsu) for
their sincere help with debugging my chips. Henrique Miranda went out of his way
in spite of a busy schedule, and helped me fix a problem with my evaluation board.
I appreciate the help and valuable feedback from my group mates as well as other
students in the Allen building throughout this research. I thank the current members
of the group: Alex, Bill, Doug, Jon, Mahmoud, Man-Chia, Martin, Noam, Ross,
Ryan, Siddharth, Vaibhav; as well as the alumni: Donghyun, Drew, Echere, Jason,
Justin, Manar, Parastoo, Pedram, Wei. I also thank Mohammad, Maryam, Roozbeh
and Rostam for their friendship and help in and out of work.
I was fortunate to have the wonderful company of many friends during my years
at Stanford. Their help and support was invaluable all along the way. Among them
are some of the most amazing and intelligent people I have ever known in my life. I
am deeply grateful to every one of them. Finally, I would like to thank my family for
their love and support. I would not have made it without them.
capacitances due to large switches and large resistors degrade tracking linearity.
CHAPTER 4. PROTOTYPE DESIGN 48
Figure 4.4: Modified RC sampling based on Wheatstone bridge operation. [Panels: track and hold configurations with the impulse responses htrack(t) and hhold(t).]
feedthrough in the hold mode. In order to reduce the feedthrough, two [35] or three
[36] switched-RC branches can be cascaded and both the switches and the resistors
have to be made larger. This is illustrated in Figure 4.3(a). Shown in (b) are the
capacitances along the signal path in the track mode. We can see that large switches
or large resistors add nonlinear junction capacitances along the sampling path, which
decrease the amplitude of the sampled voltage and degrade the linearity at high
frequencies. Therefore, this type of sampling is only appropriate for low-frequency
operation.
An alternative implementation is to arrange the switches so that they form a
Wheatstone bridge in the hold mode, as shown in Figure 4.4. In this case, the input
suppression is achieved through resistance matching between switches and resistors
instead of resistive division. The matching cannot be made perfect so there would
still be some feedthrough. Again, the feedthrough can be made smaller by cascading
two stages; however, in this case there is no need for large resistors or switches and
it is possible to use minimum size devices instead. This will reduce the nonlinear
capacitance along the path and improve tracking linearity. Figure 4.4 also shows the
Figure 4.5: Comparison of three switch configurations. (a) Switch in series with a resistor, (b) RC sampler, (c) Wheatstone sampler.
resulting time-domain response of the sampling interface. During track phase, the
sampler impulse response is a decaying exponential corresponding to the low-pass
filter due to R and Cs. In the hold mode, the signal is shut off. If the time constant
RC is much larger than the sampling window, then the sampler can approximate a
boxcar response, which, according to the discussion in Chapter 2, is the desired filter
before the ADC.
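As a rough numerical illustration of the boxcar approximation, the following minimal sketch evaluates how much a first-order track-mode impulse response, assumed here to be h(t) = exp(-t/RC)/RC, decays across the sampling window. The function name and the window and time-constant values are hypothetical and are not taken from the prototype.

```python
import math


def window_droop(t_window, tau):
    """Relative drop of h(t) = exp(-t/tau)/tau between the start and
    the end of a sampling window of length t_window (both in seconds)."""
    return 1.0 - math.exp(-t_window / tau)


# Hypothetical numbers: a 1 ns window and three candidate time constants.
for tau in (1e-9, 10e-9, 100e-9):
    droop = window_droop(1e-9, tau)
    print(f"tau = {tau:.0e} s -> droop across the window = {droop:.4f}")
```

A droop close to zero means the weighting of the input is nearly flat across the window, which is the sense in which the sampler approaches a boxcar response.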
Figure 4.5 recaps our discussion with a side-by-side comparison of three basic
switch arrangements. In (a), regular switches are paired with series resistors, which
are included for a fair comparison with the other two configurations. The series re-
sistor improves linearity by decreasing the contribution of nonlinear switch resistance
as well as lowering the input signal level. It also decreases the loading of the ADC
on the DAC driver which is important in this application. To improve the linearity
of the switch, it can be bootstrapped to achieve a signal independent resistance [37].
In the current application, the input signal Vin is the voltage across off-chip loads
and is centered around 2 V. This complicates the design of the bootstrapped switch because it
requires a high-voltage charge pump. Also, to avoid breakdown, the switch itself
has to be implemented with high-voltage devices, which make worse switches than the
core transistors.
Figure 4.5(b) shows an RC-sampler. The signal feedthrough is proportional to
r/R, where r is the resistance of the shorting switches and R is the value of the pas-
sive resistor. As mentioned before, in order to decrease this ratio, the resistor and/or
Figure 4.6: Signal feedthrough in an RC and a Wheatstone sampler as well as switch parasitic capacitance are shown as a function of the switch width. The length of transistors in the switch is set to 300 nm, the total signal path resistance R = 10 kΩ, and Cs = 500 fF. The resistance r in the Wheatstone sampler is chosen to minimize the signal feedthrough. It can be seen that the signal feedthrough in the Wheatstone sampler is about 3 orders of magnitude smaller than that of the RC sampler. [Plot axes: W [µm] (horizontal), feedthrough [mV] on a logarithmic scale, switch parasitic capacitance [pF]. Curves: RC sampler feedthrough, Wheatstone sampler feedthrough, switch capacitance.]
the switches must be made larger, and this will increase the nonlinear parasitic ca-
pacitance along the signal path. Figure 4.5(c) illustrates the proposed Wheatstone
sampler. The total resistance along the signal path during the track phase is again
equal to R. This resistance is broken into two parts: a first resistor equal to R − r,
which isolates the DAC and the ADC, and a second resistor equal to r, which approximately
matches the switch resistances r1 and r2 and results in minimum feedthrough.
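The feedthrough can be found to be
\[ \frac{V_{ft}}{V_{in}} = \frac{r_1}{R + r_1} - \frac{R}{R + r_2} \tag{4.1} \]
Assuming r1 = R(1 − ε1) and r2 = R(1 + ε2), and neglecting the second-order term ε1ε2, we have
\[ \frac{V_{ft}}{V_{in}} \approx \frac{\varepsilon_2 - \varepsilon_1}{4 + 2(\varepsilon_2 - \varepsilon_1)} \tag{4.2} \]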
The switch resistances r1 and r2 are voltage dependent, so the feedthrough cannot
be made exactly zero. However, as we can see from (4.2), the feedthrough depends
on the difference of the errors in r1 and r2. Therefore, it can be made small by
making the switch resistances have a symmetric profile around the input common
mode voltage. Furthermore, unlike the RC sampling configuration, there is no need
for a large resistor or large switches. The switches must be designed to match the
resistor and this results in much smaller switches than in the RC sampler.
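The relation between (4.1) and (4.2) can be checked numerically. The sketch below is only an illustration: the function names and the mismatch values are hypothetical, and R is normalized to 1 because it cancels out of the ratio. Expression (4.2) omits the product ε1ε2, so the two results agree only when both errors are small.

```python
def feedthrough_exact(eps1, eps2):
    """Eq. (4.1) with r1 = R(1 - eps1), r2 = R(1 + eps2) and R normalized to 1."""
    r1 = 1.0 - eps1
    r2 = 1.0 + eps2
    return r1 / (1.0 + r1) - 1.0 / (1.0 + r2)


def feedthrough_approx(eps1, eps2):
    """Eq. (4.2): depends only on the difference of the two errors."""
    diff = eps2 - eps1
    return diff / (4.0 + 2.0 * diff)


# Hypothetical mismatch values, chosen only for illustration.
for eps1, eps2 in ((0.0, 0.0), (0.01, 0.02), (0.05, 0.08)):
    exact = feedthrough_exact(eps1, eps2)
    approx = feedthrough_approx(eps1, eps2)
    print(f"eps1={eps1}, eps2={eps2}: exact={exact:.6f}, approx={approx:.6f}")
```

With perfectly matched resistances both expressions return zero, and the gap between them grows as the individual errors grow.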
Figure 4.6 compares the signal feedthrough in the hold phase for an RC sampler
and a Wheatstone sampler. The sampler circuits are implemented as shown in Figures
4.5(b) and (c). The total track phase resistance in each case is set to R = 10 kΩ,
the sampling capacitance is Cs = 500 fF, and the input signal is 1 Vppd. The switches
are implemented with an anti-parallel pair of I/O PMOS devices with a length of
300 nm. The signal feedthrough is plotted as a function of the width of the PMOS
transistors, W. For equal values of tracking resistances and identical switches, the
RC sampler and the Wheatstone sampler achieve comparable linearity performance.
However, as seen in the figure, the feedthrough in a Wheatstone sampler is approximately
three orders of magnitude smaller than that of an RC sampler. For a
fixed value of R, the feedthrough in an RC sampler can be made smaller by increasing
W, which in turn adds to the parasitic capacitance along the path. The signal
Figure 4.7: RC switch implementation. Two Wheatstone samplers are cascaded to further reduce the signal feedthrough (a), all of the switches are implemented using anti-parallel thick-oxide PMOS devices (b).
feedthrough drops as 1/W whereas the parasitic capacitance grows proportional to
W . Therefore, considering the numbers in the figure, in order to make the feedthrough
of the RC sampler as small as the feedthrough of the Wheatstone sampler, we have
to use very wide transistors with significant parasitic capacitances.
Another requirement of the sampling network is that it must shield the ADC core
from the large input voltage levels. In this design, the input common mode voltage
is stored on the sampling capacitors. As a result, it can change from 1.5 V to 2
V without affecting the sampling linearity. A schematic of the Wheatstone sampler
circuit implemented in the ADC is shown in Figure 4.7. It consists of a cascade of
two Wheatstone samplers to further reduce the signal feedthrough. The switches in
the sampling network are designed with thick-oxide I/O devices with an aspect ratio
of 2 µm/240 nm. The isolating resistor in the sampling network equals 15 kΩ, which
is much larger than the DAC output resistors in order to reduce loading. Also, r =
1 kΩ and Cs = 400 fF. All resistors are implemented using high-resistance poly (HR
poly), which provides tight control of the resistance variation.
Two replica sampling networks are used in parallel with the main sampling net-
work in order to regulate the ADC sampling current. During every DAC pulse, either
one of the replicas or the main sampling network is connected to the DAC output,
therefore any glitch due to sampling is pushed out of the DAC bandwidth to its
Figure 4.8: Basic two-phase operation of the ADC. (a) End of previous phase 2. (b) Phase 1: addition. (c) Phase 2: multiplication. (d) End of phase 2.
sampling frequency. Figure 5.2 in Chapter 5 illustrates a schematic of the sampling
circuit with the replica networks.
4.1.2 ADC Core
The ADC core is a variant of the cyclic architecture similar to the design in [38]. All
of the core operations are achieved through different configurations of one OTA and
six pairs of capacitors Cs, Cf , and C1 to C4. The conversion starts with sampling
the input signal on the sampling capacitor Cs (track phase in Figure 4.4). Then the
voltage is amplified and transferred to a capacitor C1 (hold phase in Figure 4.4).
This amplification compensates for the signal attenuation through the switched-RC
network and reduces the input referred noise of successive cycles. Next, the ADC
goes back and forth between two phases until it resolves the required number of bits.
In the first phase, as shown in Figure 4.8(b), the voltage on C1 is added or subtracted
from the reference voltage and transferred onto capacitors C3 and C4. In the second
phase (Figure 4.8(c)), the voltage is multiplied by two by combining the charges on
Figure 4.9: Charge injection from the bottom-plate switch.
C3 and C4. The resulting voltage is transferred back onto C1 and makes the new
residue voltage. During phase 1 the OTA drives C3 and C4 and during phase 2 it
drives C1 and the comparator. So even though it is possible to combine the reference
addition and multiplication by two in one phase, separating those helps balance the
OTA load.
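The two-phase loop can be sketched behaviorally. The model below is an idealized assumption, not the circuit itself: it resolves one decision per cycle with a comparator threshold at zero, shifts the residue by Vref/2 in phase 1, and multiplies by exactly two in phase 2. The function names, the normalized reference, and the number of cycles are hypothetical.

```python
def cyclic_convert(v_in, v_ref=1.0, n_cycles=12):
    """Idealized cyclic conversion; returns the decisions and the final residue."""
    residue = v_in
    decisions = []
    for _ in range(n_cycles):
        d = 1 if residue >= 0 else -1      # comparator decision
        summed = residue - d * v_ref / 2   # phase 1: add or subtract the reference
        residue = 2 * summed               # phase 2: multiply by two
        decisions.append(d)
    return decisions, residue


def reconstruct(decisions, v_ref=1.0):
    """Digital estimate of the input for the ideal radix-2 loop above."""
    return v_ref * sum(d * 2.0 ** -(i + 1) for i, d in enumerate(decisions))


decisions, final_residue = cyclic_convert(0.3)
estimate = reconstruct(decisions)
print(decisions)
print(f"estimate = {estimate:.6f}, final residue = {final_residue:.6f}")
```

In this ideal model the residue stays within ±Vref for inputs within ±Vref, so the reconstruction error after N cycles is bounded by Vref·2^(-N).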
Since the noise requirement for the ADC is relaxed, we can use small capacitors
in the design to lower the power. However, small capacitors are more susceptible to
charge injection errors. As a result, certain design aspects become more important
to mitigate these errors and maintain high linearity. We will next describe the two
most important of these design aspects.
Switch Implementations
Apart from the sampling switch that has direct impact on the linearity of the sampled
voltage, other switches also need to be carefully designed to maintain linear processing
of the signal. In this circuit, the bottom plate switches are implemented with anti-
parallel NMOS transistors and the top plate switches are transmission gates. Several
considerations must be taken into account in sizing these switches. First, their resis-
tances must be small enough so that their associated time constants do not slow the
settling performance of the circuit. Also, the switch resistances must be examined to
avoid phase margin degradation. Another important caveat concerns the
charge injection from the bottom plate switches, illustrated in Figure 4.9. It is known
that the channel charge from the bottom plate switches is signal independent and
so bottom plate sampling is immune to nonlinear charge injection. However, when
a bottom plate switch opens, the splitting of the channel charge between its source
and drain is a function of the resistance seen from these terminals. Therefore, the
charge that will be injected onto the bottom plate of the sampling capacitor depends
on the resistance of the top plate switch and consequently on the signal level. When
the circuit processes a large differential voltage, the different signal levels in the two
halves of the circuit can cause non-equal resistances in top-plate switches and result
in mismatched charge injection from bottom-plate switches. In order to reduce the
effect of this error charge, the top-plate switches must be designed to have a balanced
resistance profile around the common-mode voltage. A residual error still remains
due to imperfect symmetry and mismatch between switches in the two halves of the
circuit, which was verified to be negligible through simulations.
ADC Phase Transitions
A cyclic ADC operates through the rearrangement of a few circuit components to
accomplish the operations of an extended pipeline converter. As a result, the con-
nectivities and the switching patterns are more complicated in a cyclic ADC than a
pipeline. This can be seen from Figure 4.10 which illustrates a single-ended diagram
of the MDAC along with the clock signals. It can be seen that the ADC requires
many more clock phases than, e.g., a pipeline which operates using a pair of non-
overlapping clock signals. It is important, then, to carefully arrange all clock edges
in order to preserve the sampled charge while transitioning between different phases.
Figure 4.11 illustrates an example of incorrect phase transitions. Figures (a) to
(d) show the charge distributions during phase 1 and phase 2 operation and Figures
(e) and (f) show stages of transition between phase 1 and 2. The end of phase 1
operation is shown in (c). When transitioning to phase 2, the first step is to open the
bottom-plate sampling switch (as is the case in pipeline ADCs), which in this case
is the switch to the left of C4. This state is shown in (e). There are still eleven more
Figure 4.10: More detailed diagram of the ADC MDAC. (a) Finite state machine representing phases of operation. (b) ADC clocking signals. (c) Single-ended MDAC.
Figure 4.11: Phase 1 operation duplicates the residue from C1 in (a) onto C3 and C4 in (b). In phase 2, C3 and C4 charges add up (c) and construct new residue on C1 (d). In the transition between phases 1 and 2, the bottom-plate switch is opened first (e). If the top-plate switches are not opened right after that, switching glitches can get transferred to the bottom-plate and cause charge leakage (f). [Panel titles: (a) Beginning of phase 1. (b) End of phase 1. (c) Beginning of new phase 2. (d) End of phase 2. (e), (f) Transition between phase 1 and 2.]
switches that must change state until we get into phase 2. We must open the switches
that connect C3 and C4 to the output of the amplifier, close the switches on the two
sides of C2, disconnect C1 from VCM and connect it to the output, and also switch
the summing node of the amplifier. Interestingly, out of the many possible switching
sequences, only a few are safe. For example, Figure 4.11(f) shows a failing scenario in
which the switch to the left of C2 is closed next. This could result in a negative
voltage excursion on the left plate of C2 that gets coupled to the node between C3
and C4. A negative impulse on that node could partially turn on the bottom-plate
switch and lead to charge leakage.
4.1.3 Foreground Calibration
The output of the cyclic ADC is related to the output bits through the following
relationship:
\[ V_{out} = V_{ref} \sum_i \left( g^{-1} d_i + \varepsilon + V_{os} \right) \tag{4.3} \]
where g is the stage radix, ε is the unresolved residue and Vos is the offset of the
ADC, and they all contain random errors due to circuit mismatches and non-idealities.
However, it is only the stage radix that governs the linearity of the converter and it
can be found through a foreground calibration [38].
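To illustrate why the radix matters for linearity, the sketch below runs an idealized one-decision-per-cycle loop whose interstage gain deviates from two and reconstructs the input once with the nominal radix and once with the true one. This is a behavioral assumption for illustration only: it is not the calibration procedure of [38], its reconstruction formula is derived from this toy loop and is not necessarily the exact form of (4.3), and the gain value 1.95 and all function names are hypothetical.

```python
def cyclic_convert(v_in, gain, v_ref=1.0, n_cycles=14):
    """Toy cyclic loop: residue <- gain * (residue - d * v_ref / 2)."""
    residue = v_in
    decisions = []
    for _ in range(n_cycles):
        d = 1 if residue >= 0 else -1
        residue = gain * (residue - d * v_ref / 2)
        decisions.append(d)
    return decisions


def reconstruct(decisions, radix, v_ref=1.0):
    """Weight decision i by radix**(-i), matching the toy loop above."""
    return 0.5 * v_ref * sum(d * radix ** -i for i, d in enumerate(decisions))


v_in, g_true = 0.3, 1.95  # hypothetical input and actual interstage gain
bits = cyclic_convert(v_in, g_true)
err_nominal = abs(reconstruct(bits, 2.0) - v_in)       # assumes an ideal radix of 2
err_calibrated = abs(reconstruct(bits, g_true) - v_in)  # uses the true radix
print(f"error with nominal radix:    {err_nominal:.2e}")
print(f"error with calibrated radix: {err_calibrated:.2e}")
```

Reconstructing with the true radix leaves only the unresolved residue, whereas the nominal radix leaves an input-dependent error that would appear as nonlinearity.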
4.1.4 OTA Structure
The OTA is implemented as a two-stage Miller compensated amplifier with nested
gain boosting, as illustrated in Figure 4.12(a). The first stage of the top-level OTA as
well as all the nested OTAs are implemented as folded cascode amplifiers. A schematic
of a nested OTA is shown in Figure 4.12(b).
The boosted output impedance will have a pole-zero doublet at the unity gain
frequency of each auxiliary amplifier. To avoid slow settling, these doublets must be
pushed beyond the unity gain frequency of the main amplifier, hence making “fast
doublets” [39]. The feedback factor of the auxiliary amplifiers is near one; while that
of the main amplifier is usually a fraction of unity. So the above condition does not
Figure 4.12: Top-level and booster amplifier schematics. (a) Top-level two-stage gain-boosted OTA. (b) Booster amplifier. [Both schematics are labeled with VDD = 1 V.]
require additional stages to be faster than the main OTA. The transistors in auxiliary
amplifiers have smaller widths and non-minimal channel lengths and are biased at low
currents, thus providing high gains. The main stage on the other hand will have fast
transistors. The separation of gain and speed requirements allows the use of minimal
channel length transistors in the input.
The nested OTAs use active common-mode feedback sensing to preserve their
gains, whereas simple resistive averaging was enough to sense the output common-
mode at the second stage of the top-level OTA. The overall OTA achieves a gain of
more than 95 dB and a unity gain frequency of 130 MHz.
4.2 Noise Analysis
In this section, we review the noise analysis of the ADC. First we consider the noise
contribution of the amplifiers. Due to the complexity of arithmetic involved, there are
multiple recipes for the derivation of noise, each with a different degree of accuracy.
We briefly review these methods to present a unified view. Next, we revisit the
sampling noise, keeping an eye on the differences between the proposed sampling and
the regular sampling processes. We finally combine the results to get the total noise
of the converter.
4.2.1 Review of Amplifier Noise Analysis
The total output noise of an amplifier in a sampled system is obtained by integrating
its output noise power across all frequencies from zero to infinity. The output noise
power is found by multiplying each noise source in the circuit by the magnitude
square of its transfer function to the output of the amplifier and then adding up all
the results.
\[ N_{out,tot} = \sum_i \int_0^{\infty} S_i(f)\, |H_i(f)|^2 \, df \tag{4.4} \]
where Si is the one-sided noise power spectral density (PSD) of the ith component
and Hi(f) is its corresponding transfer function. Here, we will only consider thermal
noise of transistors and resistors because these are the dominant sources of electronic
Figure 4.13: Small-signal model of the OTA with varying details. (a) Simplified model of the OTA with all noise sources along the signal path. (b) Simplified model of the OTA with dominant noise sources.
Figure 4.14: Typical pole-zero plots of amplifiers. (a) Miller-compensated amplifier. (b) Cascode-compensated amplifier.
noise in the circuit. The channel noise of transistors can be modeled as a current
source between the drain and source terminals and its one-sided PSD in active region
is approximately given by:
\[ S_{MOS}(f) = 4 k_B T \gamma g_m \tag{4.5} \]
where kB is the Boltzmann constant, $k_B = 1.38 \times 10^{-23}$ J/K, T is the absolute
temperature in kelvin, γ is a coefficient equal to 2/3 for long channel devices
[40], and gm is the transconductance of the transistor. The thermal noise of a resistor
R can be modeled by a voltage source with the following PSD:
\[ S_R(f) = 4 k_B T R \tag{4.6} \]
Derivation of noise transfer functions (NTF) for each noise source in a circuit and
then carrying out the noise integrals can be performed accurately even with symbolic
expressions. However, including every single device in calculations can lead to very
“high-entropy” results that do not provide good design intuition. Depending
on the desired accuracy (for example, whether it is a back-of-the-envelope design or
part of an automated circuit optimization program), we can make assumptions to
simplify the calculations.
A first order approximation of this integral can be obtained by approximating the
filter response with a box and simply multiplying the noise PSD by an equivalent
noise bandwidth [41].
\[ N_{out,tot} = \sum_i \int_0^{\infty} v_{n,out}^2 \, df = v_{n0}^2 \cdot B_n \tag{4.7} \]
where $v_{n0}^2$ is the total low frequency output referred noise of the circuit and Bn is the
equivalent noise bandwidth, chosen such that the above equation holds. The main
contributors to the low frequency noise are the non-cascode transistors of the first
stage of the amplifier. The contribution of the second stage can be ignored because its
devices do not see the gain of the first stage. Also, cascode transistors have a negligible low
frequency gain and so their noise contribution is negligible.
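As an illustration of the equivalent noise bandwidth in (4.7), the following sketch numerically integrates the squared magnitude of a single-pole response and compares the result with the well-known value of π/2 times the pole frequency. The single-pole shape, the pole frequency (expressed in Hz), and the function name are assumptions made for this example.

```python
import math


def noise_bandwidth_one_pole(f_pole, n_points=10000):
    """Integrate |H(f)|^2 = 1 / (1 + (f/f_pole)^2) over f in [0, inf).

    The substitution f = f_pole * x / (1 - x) maps the infinite range onto
    x in [0, 1), which is then integrated with the midpoint rule.
    """
    dx = 1.0 / n_points
    total = 0.0
    for k in range(n_points):
        x = (k + 0.5) * dx
        f = f_pole * x / (1.0 - x)
        mag_sq = 1.0 / (1.0 + (f / f_pole) ** 2)
        df_dx = f_pole / (1.0 - x) ** 2
        total += mag_sq * df_dx * dx
    return total


f_pole = 100e6  # hypothetical pole frequency in Hz
b_n = noise_bandwidth_one_pole(f_pole)
print(f"Bn / f_pole = {b_n / f_pole:.6f}, pi/2 = {math.pi / 2:.6f}")
```

Multiplying the low-frequency noise PSD by this bandwidth gives the same total noise as the full integral for a one-pole system.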
In order to determine the noise bandwidth, we consider the typical pole-zero plots
of amplifier closed loop responses as shown in Figure 4.14. The first picture shows the
case where all poles are real as in a Miller-compensated amplifier and the second one
shows an amplifier with a real and a pair of complex conjugate poles. Although these
pole-zero plots look crowded, first-order approximations can be worked out to obtain
the noise bandwidth [42] [43]. For example, in the first case a first-order approximation
would be to ignore all the poles except the dominant one and approximate the
amplifier with a one-pole system, in which case we have Bn = π/2 · p1.
However, the single pole model is quite often a very crude model of the amplifier.
Usually amplifiers are designed for a phase margin of 60° to 70° (as opposed to 90°)
for best settling response. This means that the higher order poles already begin to take
effect around the unity gain bandwidth of the amplifier. Therefore, a better estimate is
obtained if the effect of these higher order poles is also considered in the noise bandwidth
calculation. A small signal model of the amplifier considering the effect of the second
pole is shown in Figure 4.13(b). It is seen that the noise generating elements are now
the input transistors of the first and second stage and the nulling resistor, and they
each see a different transfer function to the output. In fact, due to the dominant pole
at node v1, the noise of M1 is more severely filtered compared to M2 and the resistor,
and so the previous argument about ignoring the noise of second stage components
based on DC gain analysis is inaccurate. The noise integrals are more involved in
this case but the math can nevertheless be worked out. Assuming R = 1/gm2 and
neglecting C1 compared to Cc and CL, we can obtain the following results [44].
\[ N_{OTA,single\text{-}ended} = N_1 + N_2 + N_R = \frac{1}{\beta} \frac{k_B T}{C_c} \gamma + \frac{k_B T}{C_{Ltot}} (\gamma + 1) \tag{4.8} \]
Comparing Figures 4.13(a) and 4.13(b), we can see that the input cascode device and
the gain boosting device are still out of our noise equations. The noise of these devices
can only appear at the output at high frequencies close to the cascode pole related
to node vx. However, while designing the amplifier, we made sure to push out the
cascode pole far away from the first two dominant poles in order to preserve phase
margin. Also, both the cascode as well as the booster device are located in the first
stage where they are influenced by the dominant poles of the loop gain. Therefore, at
frequencies where the noise of these devices starts to leak out, the signal transfer from
v1 to the output has vanished and the noise will be directed to ground. So it is a
reasonable assumption to ignore the noise of these two devices. Next, we include the
noise of active load devices M11, M12, M13 in the first stage and M21 in the second
stage and double the noise power to account for the other half of the circuit (see
Figure 4.12(a)). Thus we get
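\[ N_{OTA} = \frac{2}{\beta} \frac{k_B T \gamma}{C_c} \left( 1 + \frac{g_{m11} + g_{m12}}{g_{m1}} \right) + \frac{2 k_B T}{C_{Ltot}} \left( \gamma \left( 1 + \frac{g_{m21}}{g_{m2}} \right) + 1 \right) \tag{4.9} \]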
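Expression (4.9) is straightforward to evaluate numerically. The sketch below is a minimal example with hypothetical values for the feedback factor, capacitances, and transconductances; none of these numbers are taken from the prototype, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def ota_noise_power(beta, c_c, c_ltot, gm1, gm2, gm11, gm12, gm21,
                    gamma=2.0 / 3.0, temp=300.0):
    """Evaluate (4.9) and return the total OTA noise power in V^2."""
    first_stage = ((2.0 / beta) * K_B * temp * gamma / c_c
                   * (1.0 + (gm11 + gm12) / gm1))
    second_stage = ((2.0 * K_B * temp / c_ltot)
                    * (gamma * (1.0 + gm21 / gm2) + 1.0))
    return first_stage + second_stage


# Hypothetical design values, chosen only to exercise the formula.
noise = ota_noise_power(beta=0.5, c_c=0.5e-12, c_ltot=1e-12,
                        gm1=1e-3, gm2=4e-3,
                        gm11=0.5e-3, gm12=0.5e-3, gm21=2e-3)
print(f"rms noise = {math.sqrt(noise) * 1e6:.1f} uV")
```

Setting the load transconductances to zero reduces the result to twice the single-ended value of (4.8), which is a convenient sanity check.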
4.2.2 Notes on Sampling Noise
In this section, we consider the noise that is collected in the sampling process. The
present sampling method exhibits non-stationary noise and incomplete settling, which
makes it different from the common sampling that involves stationary noise, and so it
is worth revisiting the result.
Figure 4.15(a) shows a transient waveform of sampled signal with stationary noise.
The signal exhibits noisy perturbations during track phases and it is frozen in time
during hold phases. In the beginning of every track phase the signal starts off from
the value that it was previously held. If we eliminate the hold phases and stitch the
track phase waveforms together we will get a continuous signal whose statistics do
not change over time, and therefore is stationary. The result is no different from the
output of an RC branch that operates in continuous time and has no switches. The
output noise power in this case can be found by integrating the filtered output noise
PSD, as shown in Figure 4.16(a).
Since the input PSD is directly proportional to R and the circuit bandwidth is
inversely proportional to it, the resistance value drops out and we get the familiar
Figure 4.15: T/H transient noise. [Panels (a) to (d): noise waveforms vn1 and vn2 versus time t.]
Figure 4.16: Derivation of kT/C result. (a) Frequency domain analysis of a continuous RC branch. (b) Time domain analysis of switched RC branch. [Axes: PSD [V²/Hz] versus frequency and Pn [V²] versus time. Labels: R1, R2, T1, T2. Curves: noise from track phase, noise from reset phase.]
kT/C result. This also does not depend on the length of the tracking phase and
whether the settling is incomplete or not. If the track phase is short compared to the
time constant of the branch, successive noise samples will be correlated. However, as
we take more and more samples they will be spread across longer time frames and
their statistics will approach those of the continuous time process.
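The cancellation of R in the stationary case can be written out explicitly. The following short derivation is a sketch that assumes a single-pole RC branch with capacitance C, driven by the resistor noise PSD of (4.6):

```latex
\[
\overline{v_n^2}
  = \int_0^{\infty} \frac{4 k_B T R}{1 + (2\pi f R C)^2}\, df
  = 4 k_B T R \cdot \frac{1}{2\pi R C} \cdot \frac{\pi}{2}
  = \frac{k_B T}{C}
\]
```

The factor R in the PSD is cancelled by the 1/R in the bandwidth of the branch, leaving a result that depends only on the temperature and the capacitance.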
The waveforms of a non-stationary noise process are shown in Figure 4.15(c),
which also applies to the Wheatstone sampler used in this work. In this case, the
periods with large signal perturbations denote the track phases and the periods with
quieter variations are hold phases when the sampling capacitor is being reset. The
branch resistance during the track phase is much larger than the resistance of the
resetting switches so the noise variance is larger in track phase compared to hold
phase. From the noise eye diagram in Figure 4.15(d) we see that the noise is not
stationary, which is in contrast with the eye diagram in Figure 4.15(b) that does not
show any time dependence. In this case, the usual frequency domain analysis does
not apply. Instead, the noise power can be found through time-domain analysis of
the noise process's autocorrelation function [45][46]:
$$R_{yy}(t_1, t_2) = h(t_1) * R_{xx}(t_1, t_2) * h(t_2) \tag{4.10}$$
where $R_{xx}$ and $R_{yy}$ are the autocorrelations of the input and output noise, and $h(t)$
is the impulse response of the RC filter. The variance of the output noise can
be found from its autocorrelation as follows:
$$\sigma_y^2(t) = R_{yy}(t, t) \tag{4.11}$$
Since the input noise is white, its autocorrelation is a Dirac delta function:
$$R_{xx}(t_1, t_2) = \frac{S_{x0}}{2}\,\delta(t_1 - t_2) \tag{4.12}$$
where $S_{x0} = 4k_BTR$ is the one-sided white noise PSD of the resistor. Combining
these equations, we can carry out the convolution and find the output noise power as
follows:
$$\sigma_y^2(t) = \frac{S_{x0}}{2}\int_0^{T_s} h^2(\tau)\,d\tau = 2k_BTR \cdot \frac{1}{2\tau}\left(1 - e^{-2T_s/\tau}\right) = \frac{k_BT}{C_s}\left(1 - e^{-2T_s/\tau}\right) \tag{4.13}$$
Therefore, we see that the sampled noise power rises along a saturating exponential
toward its final value of $k_BT/C_s$. Another noise source that must be included in this
result is the noise that is left on the capacitor from the reset phase. Since the time
constant of the branch is much shorter in this phase, the noise variance reaches the
steady-state value of $k_BT/C_s$. During the following track phase, this voltage decays
exponentially and its noise power at the end of the track phase will be
$$v_{n,\mathrm{reset}}^2 = \frac{k_BT}{C_s}\,e^{-2T_s/\tau} \tag{4.14}$$
Combining the contributions of the two noise sources, we get the total sampled
noise as follows:
$$v_n^2 = \frac{k_BT}{C_s} \tag{4.15}$$
Once again, we arrive at the pervasive expression of $k_BT/C_s$. The noise being
integrated increases during the sampling window while the noise power stored on the
capacitor decreases, both exponentially and with the same time constant. So the total
noise stays the same, as shown in Figure 4.16(b). Interestingly, the way the branch
resistance falls out of the equation in this time domain analysis is reminiscent of the
situation where the noise power of an RC branch stays independent of the resistor
value in the frequency domain analysis.
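The cancellation between (4.13) and (4.14) can be checked numerically. The sketch below is only an illustration: the resistance, capacitance, and temperature values are hypothetical and are not taken from the prototype. It integrates $h^2$ with a simple trapezoidal rule for the track-phase term, adds the decayed reset-phase term, and compares the sum with $k_BT/C_s$.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant in J/K

def track_noise(r, c, ts, temp=300.0, n=200001):
    """Track-phase noise power of (4.13), integrating h^2 numerically."""
    tau = r * c
    t = np.linspace(0.0, ts, n)
    h2 = (np.exp(-t / tau) / tau) ** 2          # squared RC impulse response
    dt = t[1] - t[0]
    integral = dt * (h2.sum() - 0.5 * (h2[0] + h2[-1]))  # trapezoidal rule
    return 2.0 * K_B * temp * r * integral      # (S_x0 / 2) * integral

def reset_noise(r, c, ts, temp=300.0):
    """Reset-phase noise left at the end of the track phase, as in (4.14)."""
    return K_B * temp / c * np.exp(-2.0 * ts / (r * c))

# Hypothetical branch: 1 kOhm, 100 fF, track window of half a time constant.
R_BRANCH, C_S = 1.0e3, 100e-15
T_TRACK = 0.5 * R_BRANCH * C_S
total = track_noise(R_BRANCH, C_S, T_TRACK) + reset_noise(R_BRANCH, C_S, T_TRACK)
ktc = K_B * 300.0 / C_S
```

Changing `T_TRACK` or `R_BRANCH` shifts power between the two terms, but under these assumptions `total` stays at `ktc`.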
4.2.3 Track-and-hold Noise Analysis
The noise contribution at the output of a track-and-hold stage consists of the noise
that is generated during track mode and gets transferred to the output, as well as
the hold mode noise. We already calculated the noise contribution of the sampling
capacitor. But the track mode noise also includes contributions from the other
capacitances that are connected to the amplifier virtual ground node, such as the feedback
capacitor $C_f$ and the parasitic capacitance of that node $C_{par}$. It can be shown that
the total output-referred noise of the track mode is [47]
$$N_{track} = 2k_BT\,\frac{C_s + C_f + C_{par}}{C_f^2} \tag{4.16}$$
The noise added to output during hold-mode operation primarily consists of the
noise of the amplifier as well as the thermal noise of switches:
$$N_{hold} = N_{OTA} + N_{switches} \tag{4.17}$$
The amplifier noise follows from (4.9). The switches' noise contribution is found
by transferring their noise PSDs to the output and integrating the PSDs across all
frequencies. Noise voltages corresponding to switches that are in series with the
sampling capacitor see a gain of $(C_s/C_f)^2$, while others that are in the feedback path
or are connected to the load see unity gain. We assume a second-order filter response
with a resonant frequency $\omega_0$ and quality factor $Q$. Therefore, the noise contribution
of the switches can be obtained from
$$N_{switches} = 4k_BT\left(\sum R_s\left(\frac{C_s}{C_f}\right)^2 + \sum R_f + \sum R_L\right)\frac{\omega_0 Q}{4} \tag{4.18}$$
where the summations are carried out over all switch resistances in series with the
sampling, feedback, and load capacitors in the complete differential circuit. The
values of ω0 and Q are related to the loop gain unity gain frequency ωc and phase
margin PM as follows:
$$\omega_0 = \omega_c\sqrt[4]{1 + \tan^2(PM)}, \qquad Q = \frac{\sqrt[4]{1 + \tan^2(PM)}}{\tan(PM)} \tag{4.19}$$
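As a rough illustration of how (4.18) and (4.19) fit together, the following sketch converts a loop crossover frequency and phase margin into $\omega_0$ and $Q$ and then evaluates the switch noise. The function names and all component values are assumptions chosen for the example, not values from the prototype.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def second_order_params(omega_c, pm_deg):
    """Return (omega_0, Q) from the crossover frequency and phase margin, per (4.19)."""
    t = math.tan(math.radians(pm_deg))
    root4 = (1.0 + t * t) ** 0.25
    return omega_c * root4, root4 / t

def switch_noise(r_series, r_feedback, r_load, c_s, c_f, omega_0, q, temp=300.0):
    """Output noise power of the switches in V^2, per (4.18)."""
    r_eff = sum(r_series) * (c_s / c_f) ** 2 + sum(r_feedback) + sum(r_load)
    return 4.0 * K_B * temp * r_eff * omega_0 * q / 4.0

# Hypothetical numbers: 1 GHz crossover, 60 degree phase margin, gain Cs/Cf = 2.
w0, q = second_order_params(2.0 * math.pi * 1.0e9, 60.0)
n_sw = switch_noise([200.0, 200.0], [100.0, 100.0], [50.0, 50.0],
                    c_s=100e-15, c_f=50e-15, omega_0=w0, q=q)
```

For a 60 degree phase margin the fourth root evaluates to the square root of 2, so `q` comes out near 0.82; lower phase margins raise both `q` and the integrated switch noise.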
4.2.4 ADC Noise Analysis
In order to calculate the total converter noise, we follow the signal path to see what
noise sources are added to the signal. First, the input signal is sampled through the
RC sampling network and accrues a noise given by (4.16). Then, in the hold phase, the
sampled voltage is transferred from Cs onto C1, meanwhile being amplified by Cs/Cf
and getting a noise of NSH . Next, the two-phase ADC operation starts and repeats
until the desired number of bits are resolved. We denote the noise contributions in
phase 1 and 2 by N1 and N2, respectively. Each time the signal goes through the
phase 2 operation, it is amplified by a factor of 2. The total input referred noise of
the cyclic ADC is therefore
$$N_{ADC} = N_{track} + N_{SH} + \left(N_1 + \frac{N_2}{4}\right)\left(1 + \frac{1}{4} + \frac{1}{16} + \cdots\right)\left(\frac{C_s}{C_f}\right)^2 \tag{4.20}$$
where the noise components $N_{SH}$, $N_1$ and $N_2$ follow from the hold mode noise equation
(4.17), with the only difference being the capacitive network the amplifier is embedded
in.
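The series in (4.20) converges quickly because each pass through phase 2 reduces the weight of later noise contributions by a factor of 4. A small sketch of the series factor for a finite number of cycles is given below; the helper name and the cycle counts are illustrative assumptions.

```python
def cyclic_series_factor(n_cycles):
    """Partial sum 1 + 1/4 + 1/16 + ... over n_cycles conversion cycles."""
    return sum(0.25 ** k for k in range(n_cycles))

# The infinite series sums to 4/3; a handful of cycles already gets close.
factor_3 = cyclic_series_factor(3)     # 1 + 1/4 + 1/16
factor_14 = cyclic_series_factor(14)
limit = 4.0 / 3.0
```

Under this view, the first two or three cycles account for nearly all of the noise contributed by the cyclic loop.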
4.3 Summary
We covered some of the important and unique aspects of the design of the calibration
ADC. Input sampling is accomplished by the Wheatstone sampling technique
presented in this chapter. Due to the small sizes of capacitors, care was taken to
minimize charge injection errors from switches considering second order effects that
are usually insignificant in regular designs. Appropriately sequencing clock edges
between various phases of the ADC is another important aspect in this design that was
pointed out. Finally, the noise analysis of the complete converter was reviewed.
Chapter 5
Experimental Results
A prototype chip was taped out as a proof of concept for the proposed architecture.
The chip was fabricated in UMC’s 90-nm, 9 metal, 1 poly, CMOS process. The
die micrograph is shown in Figure 5.1(a). The die has a total area of 2 mm × 2
mm and was packaged in an 88-pin QFN package. The active area of the DAC is
0.36 mm2 and that of the ADC is only 0.04 mm2. The chip contains three slightly
different DAC-ADC pairs for testing purposes. The layout of the ADC is shown in
Figure 5.1(b).
5.1 Test Setup
The chip I/O interface is shown in Figure 5.2. The interface is passive and bidirectional
and was used both for capturing the data out of the DAC and for testing the
ADC. Both single-ended and differential I/O options are included on the test
board. For the purpose of testing the ADC, the input signal is generated from an
HP8644B low-jitter, high-performance RF signal generator. It is then filtered through
a K&L tunable bandpass filter to block the spurious tones before connecting to the
test board. A cascade of two ETC-1-1-13 baluns is used on the board for single-ended
to differential conversion with small amplitude imbalance. The 50 Ω resistors
are the off-chip loads of the DAC. The inductors in parallel with the loads are used to
set the input common-mode voltage when testing the ADC. The combination of the
2 pF capacitor and 22 Ω resistors provides an additional stage of filtering right at the
border of the chip. The value of the capacitor is adjusted depending on the frequency
of the test input to the ADC. This capacitor is removed when the DAC is under test
with a broadband signal. As shown in Figure 5.2, the bias of the cascode transistors
in the DAC output driver can be set off-chip. During regular DAC operation, they
are biased at 2 V. However, when testing the ADC they are turned off so that the
(nonlinear) output impedance of the DAC driver does not load the ADC. Two replicas
of the ADC sampling network are placed in parallel with the ADC. The clocking
of these replicas is programmed such that at any time one branch is configured in
sample mode and two branches are in hold mode. Therefore, the loading on the DAC
driver is regularized across time and a similar sampling current is drawn from the
DAC driver during every output pulse.

Figure 5.1: Die photo of the prototype chip and the ADC layout. (a) Chip micrograph.
(b) ADC layout. The chip also includes two other DAC-ADC pairs (not outlined in the
photo) for debugging options. [Labeled blocks: DAC and ADC in (a); OTA, switches,
digital control, capacitors, and comp in (b).]

Figure 5.2: Chip I/O interface. The Wheatstone samplers are simplified in this
schematic. [Schematic labels: DAC output driver, dummy samplers, chip boundary, 22 Ω,
2 pF, 0.1 µF, 50 Ω, 100 µH, ETC-1-1-13, Vcmi, Vcascode, 2 V.]
5.2 Measurement Results
5.2.1 ADC Measurements
Due to capacitor mismatches, the accurate closed-loop gain of the residue amplifier
in the ADC is not known a priori and must be obtained through measurements. So
before using the ADC to calibrate the DAC, we must perform a foreground calibration
on the ADC and find the residue gain. The right gain is the one that maximizes the
ADC linearity, and can be found by testing either the static or dynamic linearity of
the ADC. In this work, we chose to measure the dynamic linearity by applying a
sinusoid to the ADC and finding the residue gain that maximizes the output SFDR.
This gain is used in all other measurements afterwards.
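The gain search described above can be sketched with a toy model. The converter below is a hypothetical 1-bit-per-cycle cyclic ADC with a residue gain of 1.93, not the two-phase architecture of the prototype, and the record length, tone bin, and search grid are likewise assumptions. The digital output is rebuilt for a range of candidate gains, and the candidate that gives the highest SFDR on a coherently sampled sinusoid is kept.

```python
import numpy as np

def cyclic_adc_bits(v, gain, n_cycles):
    """Toy 1-bit-per-cycle cyclic conversion of inputs v in [-1, 1]."""
    v = np.array(v, dtype=float)
    bits = np.empty((n_cycles, v.size))
    for k in range(n_cycles):
        d = np.where(v >= 0.0, 1.0, -1.0)   # comparator decision
        bits[k] = d
        v = gain * v - d                    # residue passed to the next cycle
    return bits

def reconstruct(bits, gain_est):
    """Digital output computed with an assumed residue gain."""
    k = np.arange(1, bits.shape[0] + 1)[:, None]
    return (bits / gain_est ** k).sum(axis=0)

def sfdr_db(x, sig_bin):
    """SFDR of a coherently sampled record (no window needed)."""
    spec = np.abs(np.fft.rfft(x))
    sig = spec[sig_bin]
    spec[0] = 0.0          # ignore DC
    spec[sig_bin] = 0.0    # ignore the fundamental
    return 20.0 * np.log10(sig / spec.max())

def find_residue_gain(bits, sig_bin, candidates):
    """Return the candidate gain that maximizes the output SFDR."""
    scores = [sfdr_db(reconstruct(bits, g), sig_bin) for g in candidates]
    return candidates[int(np.argmax(scores))]

N, SIG_BIN, TRUE_GAIN = 4096, 127, 1.93      # hypothetical test conditions
x = 0.9 * np.sin(2.0 * np.pi * SIG_BIN * np.arange(N) / N)
bits = cyclic_adc_bits(x, TRUE_GAIN, n_cycles=16)
grid = np.linspace(1.90, 1.96, 61)           # 0.001 steps around the nominal gain
best_gain = find_residue_gain(bits, SIG_BIN, grid)
```

In this toy setting a wrong gain estimate produces a code-dependent reconstruction error that shows up as harmonics, which is why the SFDR peaks at the true gain.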
ADC Static Nonlinearity
Figure 5.3 shows the measured static nonlinearity of the ADC. The input is a
low-frequency signal at about 50 MHz and the integral and differential nonlinearities (INL
and DNL) are found through a histogram test. The ADC achieves a peak DNL of
+0.61/−0.52 LSB and a peak INL of +2.20/−2.17 LSB.
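A common way to run a histogram test with a sinusoidal input is the code-density method, sketched below. This is a generic illustration rather than the exact procedure used for the measurement: the function name, the 8-bit ideal quantizer, and the record length are assumptions. The cumulative histogram is mapped through the arcsine distribution of a sine wave to estimate the code transition levels, from which DNL and INL follow.

```python
import numpy as np

def sine_histogram_dnl_inl(codes, n_bits):
    """Estimate DNL and INL (in LSB) from output codes of a sine-wave input."""
    n_codes = 2 ** n_bits
    hist = np.bincount(codes, minlength=n_codes).astype(float)
    cum = np.cumsum(hist)
    total = cum[-1]
    # Transition levels normalized to the sine amplitude (arcsine distribution).
    trans = -np.cos(np.pi * cum[:-1] / total)
    widths = np.diff(trans)                 # widths of codes 1 .. n_codes-2
    dnl = widths / widths.mean() - 1.0
    inl = np.cumsum(dnl)
    return dnl, inl

# Hypothetical check with an ideal 8-bit quantizer and a slightly overdriven sine.
N_BITS, N_SAMPLES = 8, 2 ** 18
phase = 2.0 * np.pi * (np.arange(N_SAMPLES) + 0.5) / N_SAMPLES
v = 1.05 * np.sin(phase)                    # full scale is [-1, 1)
codes = np.clip(np.floor((v + 1.0) / 2.0 * 2 ** N_BITS), 0, 2 ** N_BITS - 1).astype(int)
dnl, inl = sine_histogram_dnl_inl(codes, N_BITS)
```

The first and last codes are excluded because they also collect the clipped samples. For the ideal quantizer used here both `dnl` and `inl` should stay close to zero.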
Figure 5.3: ADC measured static nonlinearity. [Two panels: DNL (LSB) and INL (LSB)
versus code, for codes 0 to 16000. Peak DNL = 0.61 / -0.52; peak INL = 2.20 / -2.17.]
ADC Dynamic Nonlinearity
Figures 5.4 and 5.5(a) show single-tone tests of the ADC dynamic nonlinearity. The input
frequency was swept from 50 MHz to above 500 MHz. The ADC sampling frequency
is about 1 MHz, so the ADC undersamples the input signal. The SFDR rolls
off at higher frequencies mainly due to filtering of the sampling interface. This is
confirmed by the expected droop of the fundamental tone.
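Because the sample rate is far below the input frequency, each tone appears in the output spectrum at its aliased location. The helper below is a small sketch of that folding; the sample rate of 1.003 MHz is a hypothetical value chosen for illustration, since the text only gives the rate as about 1 MHz.

```python
def alias_frequency(f_in, f_s):
    """Frequency in [0, f_s/2] at which a tone at f_in appears when sampled at f_s."""
    f = f_in % f_s
    return min(f, f_s - f)

# Hypothetical example: a 491 MHz tone sampled at 1.003 MHz.
f_alias = alias_frequency(491e6, 1.003e6)
```

With these assumed numbers the tone folds down to roughly 470 kHz, and its harmonics fold to other locations within the first Nyquist zone.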
The bandwidth of the sampling interface is larger than the bandwidth of an ideal
boxcar sampler. This is because the actual sampling interface is only an approximation
of an ideal boxcar sampler. It exhibits non-zero rise and fall times as well as a
drooping amplitude due to the finite RC time constant of the sampling branch. This
results in an effective sampling window smaller than the ideal value of Ts = 1.25 ns and
hence a larger bandwidth.
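For reference, an ideal boxcar window of length Ts has a sinc-shaped magnitude response. The sketch below finds its -3 dB frequency by bisection and shows that a shorter window gives a larger bandwidth. The 1.0 ns window is a hypothetical effective value used only for comparison.

```python
import math

def boxcar_gain(f, t_window):
    """Magnitude response |sin(pi f T) / (pi f T)| of an ideal boxcar window."""
    x = math.pi * f * t_window
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)

def boxcar_bandwidth_3db(t_window):
    """-3 dB frequency of the boxcar response, found by bisection on the first lobe."""
    lo, hi = 0.0, 1.0 / t_window      # the gain falls from 1 to 0 on this interval
    target = 1.0 / math.sqrt(2.0)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if boxcar_gain(mid, t_window) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

f3db_ideal = boxcar_bandwidth_3db(1.25e-9)   # ideal window from the text
f3db_short = boxcar_bandwidth_3db(1.0e-9)    # hypothetical shorter effective window
```

The ideal 1.25 ns window gives a -3 dB frequency of roughly 0.443/Ts, or about 354 MHz, and the bandwidth scales inversely with the window length.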
A shortcoming of the single-tone tests is that the observed linearity might be
optimistic as the harmonics of high frequency inputs can get blocked by the ADC
itself. Therefore, a two-tone test must be done to cross-check the high-frequency
performance of the ADC. Figure 5.5(b) shows the output spectrum of a two-tone
test. The amplitude of the input signal is kept the same as in the single-tone tests
at 1 Vppd. It can be seen that the performance is in agreement with the results from
single-tone tests.

Figure 5.4: Measured SFDR/SNDR and magnitude of the fundamental tone of the
ADC output versus input frequency. [Two panels: SFDR and SNDR (dB) versus frequency
(MHz), and fundamental (dB) versus frequency (MHz).]
5.2.2 DAC Measurements
Figure 5.6 shows the main test setup during background calibration of the DAC. The
high-speed inputs for the DAC are generated using ADI's Data Pattern Generator 2
(DPG2). The calibration ADC digitizes the DAC output with a downsampling ratio of
about 800. The ADC data is captured using an NI board. The calibration algorithm
runs in software and updates the LUTs in the computer. A new set of predistorted
codes is generated accordingly and sent to the DAC through the DPG2. This cycle
repeats continuously until the LUTs converge to their steady-state values.
Figure 5.5: Measured frequency responses of the ADC. The input amplitude is 1
Vppd. The ADC sample rate is about 1 MHz. (a) Single-tone input with a frequency of
491 MHz. (b) Two-tone input with frequencies of 487 and 491 MHz. [Axes: DFT magnitude
(dBFS) versus frequency (f/fs); the 3rd harmonic is marked in (a).]

Figure 5.6: Test setup for DAC calibration. [Blocks: HP8644B signal generator, ADI DPG2
pattern generator, prototype chip (DAC and ADC) on test board, NI PXI-6120, NI PXI-6562,
PC, and HP 8560E spectrum analyzer; data rates of 800 MS/s and 1 MS/s are marked.]
During calibration, the DAC input data is a random vector in order to provide
linearly independent equations. When the calibration converges, the DAC is tested
with predistorted sinusoid tones and the performance is evaluated with a spectrum
analyzer.
Single-tone output spectra of the DAC before and after calibration are shown
in Figure 5.7. Figure 5.7(a) shows the outputs due to a low-frequency input at
29 MHz. It can be seen that the linearity before calibration is about 30 dB and
limited by the third-order harmonic. The plot also shows tones at 800/3±29 MHz which
are due to the input tones mixing with the clock of the DAC cores at 800/3 MHz.
After calibration, the harmonics of the input as well as the mixing terms are pushed
down and the SFDR improves to about 60 dB. Figure 5.7(b) is a test with an
input frequency close to the Nyquist frequency. Similarly in this picture, the linearity
improves by about 30 dB and the DAC achieves a final SFDR of 53 dB.
After calibration, the DAC was also tested using two-tone inputs to make sure that
the single-tone readings are in fact valid and no significant high-frequency harmonic
is being filtered and missed. One of these tests is shown in Figure 5.8. The input signal
contains two tones, at frequencies of 370 and 376 MHz, both very close to the DAC
Nyquist frequency of 400 MHz. The resulting spectrum plot shows an SFDR of about
68 dB.
Figure 5.9 shows a plot of DAC SFDR before and after calibration versus frequency.
The DAC achieves low-frequency SFDRs as high as 58.5 dB and an overall
linearity of better than 53 dB for input frequencies up to 400 MHz and a peak-to-peak
differential output swing of 800 mV. The SFDR roll-off is gradual across frequency,
which confirms that the DAC nonlinearity is frequency independent for the most part.
One factor that can explain this roll-off is filtering prior to the
DAC output driver. Specifically, because of non-zero switch resistances there is a
finite-bandwidth filter at the output of the SC cores [48]. This filter sits between the
predistorter and the DAC nonlinearity, so the nonlinearity can only be partially
cancelled. As shown in Appendix B, even if the predistorter has the exact inverse of
the DAC driver characteristic, this filter will limit the achievable linearity to
$$HD_3 = \frac{3}{4}A^2\,\frac{c_3}{c_1}\left(\frac{f_{sig}}{f_{T/H}}\right)^2 \tag{5.1}$$
where $A$ is the signal amplitude, $c_3/c_1$ is the magnitude of the third-order
nonlinearity relative to the linear term, $f_{sig}$ is the signal frequency, and $f_{T/H}$ is the
bandwidth of the track-and-hold. Therefore, at high frequencies the simple model of static
nonlinearity becomes invalid and the SFDR starts to degrade.

Figure 5.7: Measured output spectra of the DAC before and after calibration. (a) Input
signal is 800 mVppd at 29 MHz. (b) Input signal is 800 mVppd at 373 MHz. [Axes: spectrum
(dBm) versus frequency (MHz).]

Figure 5.8: Measured output spectrum of the DAC before and after calibration. Input
signal is 800 mVppd signal containing two tones at 370 and 376 MHz. [Axes: spectrum
(dBm) versus frequency (MHz).]
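The frequency dependence in (5.1) can be explored with a few lines of code. The sketch below simply evaluates the formula; the amplitude, the ratio c3/c1, and the track-and-hold bandwidth are hypothetical values, and the result is expressed in dB under the assumption that HD3 is an amplitude ratio.

```python
import math

def hd3_limit(amplitude, c3_over_c1, f_sig, f_th):
    """Residual third-harmonic distortion ratio predicted by (5.1)."""
    return 0.75 * amplitude ** 2 * c3_over_c1 * (f_sig / f_th) ** 2

def to_db(ratio):
    """Amplitude ratio expressed in dB."""
    return 20.0 * math.log10(ratio)

# Hypothetical values: A = 0.4, c3/c1 = 0.5, track-and-hold bandwidth of 5 GHz.
hd3_100 = to_db(hd3_limit(0.4, 0.5, 100e6, 5e9))
hd3_200 = to_db(hd3_limit(0.4, 0.5, 200e6, 5e9))
hd3_400 = to_db(hd3_limit(0.4, 0.5, 400e6, 5e9))
```

Because the limit scales with the square of the signal frequency, each doubling of the frequency raises the residual HD3 by about 12 dB in this model.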
5.3 Performance Summary
Figure 5.10 compares the performance achieved by the calibrated DAC discussed in
this thesis with some other published works. It can be seen that the design landscape
is dominated by current-steering DACs. The DAC discussed in this thesis is one of
the few that are based on an SC architecture, and achieves the best performance
in this category. The performance is still comparable to the best current-steering
DACs, which demonstrates the potential of this new architecture.

Figure 5.9: Measured SFDR of the DAC before and after calibration versus input
frequency. [Axes: SFDR (dB) versus signal frequency (MHz); traces: before calibration
and after calibration.]

Figure 5.10: SFDR versus frequency of some published DACs in comparison with a
data point from the prototype chip. [Axes: SFDR (dB) versus frequency (MHz); legend:
current steering CMOS, switched cap CMOS, this work; data points from [49], [50], [51],
[52], and [53].]
Chapter 6
Conclusions and Future Work
6.1 Conclusions
Digital calibration techniques are increasingly used to push the performance of analog
integrated circuits (ICs). These techniques are especially suitable for mixed-signal
systems where one end of the signal chain is digital and therefore already accessible
to digital signal processing (DSP) functions. Examples include digital calibration of
data converters to correct the nonlinearity of op amps or digital predistortion of
power amplifiers where the aggregate nonlinearity of the transmitter chain can be
compensated in the baseband. Among data converter circuits, much focus has been
on ADCs, especially pipeline ADCs, where the inherent redundancy in the architecture
and the availability of the output in digital form facilitate the implementation of
calibration algorithms. DAC design, on the other hand, comes in a much more limited
variety and offers very few calibration ideas.
Complementing the new DAC architecture proposed in [48], this work investigates
some important ramifications of calibrating a DAC. Important considerations
and trade-offs in the design of a sense ADC as well as an adaptive background
calibration algorithm were discussed. The complete system was implemented and tested
as a proof of concept. The results are comparable with those achieved from more
traditional designs. We hope that this will open up doors to new ideas and architectures
in the future.
6.2 Future Work
This work was primarily a step towards identifying and understanding the various
trade-offs involved in the design. Some of this understanding was gained along the
way, and in light of that, some of our design choices can be further improved. Also,
this work was meant to be a proof of concept. Therefore, there are additional steps
that must be taken to obtain an industry-standard product. These two considerations
are explained further below.
Optimization of the Design
Being the first of its kind, this design can be further tuned and optimized. This
is primarily due to the exploratory nature of the design approach, and to the fact that
the final calibration algorithm was developed in the lab after the chips had been
received from the foundry. Therefore, there was a disconnect between the design and
optimization of the calibration algorithm and the design of the circuit blocks. Specifically,
the adaptive calibration algorithm requires on the order of $10^6$ measurements before it
converges, and that translates into a calibration time on the order of seconds, which
might be too long for some applications. There are two ways to shorten the calibration
time. One is to modify the design of the calibration ADC and increase its sampling
frequency or improve its SNR. The second option is to modify the DAC design and
operate the output drivers at a lower efficiency. This will squeeze the operating zone
of the output differential pair and result in a more linear response, thereby decreasing
the number of nonlinearity coefficients.
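The link between the number of measurements and the calibration time is simple arithmetic, sketched below. The helper name is illustrative, the ADC rate of 1 MS/s is the approximate figure quoted for the test setup, and the 4 MS/s case is hypothetical. The estimate covers ADC acquisition only and ignores data transfer and software overhead.

```python
def acquisition_time_s(n_measurements, adc_rate_hz):
    """Time needed for the calibration ADC to collect n_measurements samples."""
    return n_measurements / adc_rate_hz

t_nominal = acquisition_time_s(1e6, 1e6)   # about 10^6 samples at about 1 MS/s
t_faster = acquisition_time_s(1e6, 4e6)    # hypothetical 4 MS/s calibration ADC
```

Under these assumptions the acquisition alone takes about one second, and it shrinks in proportion to any increase in the ADC sampling rate.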
Hardware Implementation of the Calibration Algorithm
The proposed calibration algorithm was implemented in software. This gave us the
flexibility to experiment with different algorithms. However, it also introduced major
complications in testing. The transportation of data between the chip, the computer
and the DPG was the main bottleneck in real-time calibration. Furthermore, the
lack of a handshaking signal from the DPG caused syncing problems during real-time
operation. The DPG was only able to play an array of input codes indefinitely, and
the array could not be updated while the DPG was playing. Therefore, we had to
stop the DPG, update its RAM and start it over again. There were three DAC
cores in the chip and three corresponding LUTs in the software, and at every update
the DPG would start from a random LUT, out of sync with the cores. We were able to
solve this problem by interrupting the DPG operation repeatedly until chance would
put everything in sync. However, this resulted in a more complex and longer calibration.
In a real application, the LUTs must be implemented in hardware and these issues
will be avoided altogether.
Appendix A
Calculation of Least Squares
Residual Error
Let us assume that $\mathbf{y}(n)$ denotes the vector of measured outputs up until time $n$, and
the sampled output at time $n$ is $y_n$; therefore, the following relationship captures the
growth of the vector $\mathbf{y}(n)$ based on its previous value:
$$\mathbf{y}(n) = \begin{bmatrix} \mathbf{y}(n-1) \\ y_n \end{bmatrix} \in \mathbb{R}^{n} \tag{A.1}$$
Similarly, $D(n)$ denotes the data matrix at time $n$. It is constructed by appending a
new row of input data $\mathbf{d}_n^T$ to the end of $D(n-1)$:
$$D(n) = \begin{bmatrix} D(n-1) \\ \mathbf{d}_n^T \end{bmatrix} \in \mathbb{R}^{n \times m} \tag{A.2}$$
As explained earlier and illustrated in Figure 3.6, the least-squares problem in (3.17)
can be solved by applying a series of Givens rotations to the data matrix and output
vector. At time $n-1$, we have
$$G(n-1)\Lambda(n-1)\begin{bmatrix} D(n-1) & \mathbf{y}(n-1) \end{bmatrix} = \begin{bmatrix} R(n-1) & \mathbf{p}_1(n-1) \\ 0 & \mathbf{p}_2(n-1) \end{bmatrix} \tag{A.3}$$
where $G(n-1)$ is the aggregate product of all the Givens rotations that triangularize
the data matrix $D(n-1)$, $R(n-1)$ is an m-by-m upper triangular matrix, and
$\mathbf{p}_1(n-1)$ and $\mathbf{p}_2(n-1)$ are m-by-1 and (n−m−1)-by-1 vectors, respectively. The
matrix $\Lambda(n-1)$ is a diagonal forgetting matrix that emphasizes the more recent
entries of the LS equation.
$$\Lambda(n-1) = \begin{bmatrix} \lambda^{n-1} & 0 & \cdots & 0 \\ 0 & \lambda^{n-2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{A.4}$$
where $0 < \lambda < 1$ is the forgetting factor.
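Now, using equations (A.1) and (A.2), we write an equation similar to equation
(A.3), but corresponding to time $n$.
$$G(n)\begin{bmatrix} \lambda R(n-1) & \lambda\mathbf{p}_1(n-1) \\ 0 & \lambda\mathbf{p}_2(n-1) \\ \mathbf{d}_n^T & y_n \end{bmatrix} = \begin{bmatrix} R(n) & \mathbf{p}_1(n) \\ 0 & \mathbf{p}_2(n) \end{bmatrix} \tag{A.5}$$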
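The error vector at time $n$ is defined as:
$$\mathbf{e}(n) = \mathbf{y}(n) - D(n)\mathbf{h}(n) \tag{A.6}$$
A similar recursion can be written for the error vector as follows:
$$\mathbf{e}(n) = \begin{bmatrix} \mathbf{e}(n-1) \\ e_n \end{bmatrix} \tag{A.7}$$
where
$$e_n = y_n - \mathbf{d}_n^T\mathbf{h}(n) \tag{A.8}$$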
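Now, we apply the Givens rotation matrix to the weighted error vector at time $n-1$.
From (A.3) and (A.6), this gives
$$G(n-1)\Lambda(n-1)\mathbf{e}(n-1) = \begin{bmatrix} \mathbf{p}_1(n-1) \\ \mathbf{p}_2(n-1) \end{bmatrix} - \begin{bmatrix} R(n-1) \\ 0 \end{bmatrix}\mathbf{h}(n-1) \tag{A.9}$$
The same relationship at time $n$ will be
$$G(n)\begin{bmatrix} \lambda\mathbf{e}(n-1) \\ e_n \end{bmatrix} = \begin{bmatrix} \mathbf{p}_1(n) \\ \mathbf{p}_2(n) \end{bmatrix} - \begin{bmatrix} R(n) \\ 0 \end{bmatrix}\mathbf{h}(n) = \begin{bmatrix} 0 \\ \mathbf{p}_2(n) \end{bmatrix} \tag{A.10}$$
where the last equality holds because at time $n$ we have $\mathbf{p}_1(n) = R(n)\mathbf{h}(n)$.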
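Now let us examine the Givens rotation matrix $G(n)$. This matrix is the product
of $m$ Givens rotation matrices that zero out the last row of the data matrix, $\mathbf{d}_n^T$:
$$G(n) = \begin{bmatrix} \cos(\phi_1) & & & \sin(\phi_1) \\ & 1 & & \\ & & \ddots & \\ -\sin(\phi_1) & & & \cos(\phi_1) \end{bmatrix} \begin{bmatrix} 1 & & & \\ & \cos(\phi_2) & & \sin(\phi_2) \\ & & \ddots & \\ & -\sin(\phi_2) & & \cos(\phi_2) \end{bmatrix} \cdots \begin{bmatrix} 1 & & & & \\ & \ddots & & & \\ & & \cos(\phi_m) & & \sin(\phi_m) \\ & & & \ddots & \\ & & -\sin(\phi_m) & & \cos(\phi_m) \end{bmatrix} \tag{A.11}$$
Examining the product of these matrices, we can see that the Givens rotation matrix
has the following structure:

[Structure diagram of G(n), with the rows partitioned into blocks of m and n−m: the
entries that differ from the identity are confined to the first m rows and columns and to
the last row and column; the remaining diagonal entries are 1 and all other entries are 0.]

Multiplying both sides of (A.10) by $G^T(n)$, we have
$$\begin{bmatrix} \lambda\mathbf{e}(n-1) \\ e_n \end{bmatrix} = G^T(n)\begin{bmatrix} 0 \\ \mathbf{p}_2(n) \end{bmatrix} \tag{A.12}$$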
Now let us focus on the row-column multiplication in (A.12) that gives $e_n$. The vector
on the right side of (A.12) consists of $m$ zeros on top. So multiplying that by the
last row of $G^T(n)$ filters out all but the corner element of $G(n)$. Again, if we inspect
(A.11) we can see that the corner element is the product of all the $\cos\phi$'s. Therefore,
we have
$$e_n = G_{nn}\,p_n = \prod_{i=1}^{m}\cos(\phi_i)\,p_n \tag{A.13}$$
where $p_n$ is the last element of $\mathbf{p}_2(n)$.
Therefore, we obtain the LS residual error without explicit derivation of the
solution. Algorithm 3 summarizes the steps to get the error $e_n$. The inputs to the
algorithm are the latest sample $y_n$ and the corresponding data row $\mathbf{d}_n^T$. Also available
to the algorithm are the upper triangular matrix R(n) and the vector p1(n).
Algorithm 3 RLS error calculation
Require: sampled output y_n and the corresponding row of data matrix d_n^T
append d_n^T to the bottom of the upper-triangular matrix R ∈ R^(m×m)
append y_n to the bottom of the vector p ∈ R^(m×1)
set Π = 1
for i = 1 to m do
    // calculate the rotation angle to zero out R_(m+1,i)
    r_x = λ R_(i,i)
    r_y = R_(m+1,i)
    sin φ = r_y / sqrt(r_x^2 + r_y^2)
    cos φ = r_x / sqrt(r_x^2 + r_y^2)
    for j = i to m do
        // rotate elements of R
        [R_(i,j); R_(m+1,j)] = [cos φ, sin φ; −sin φ, cos φ] [λ R_(i,j); R_(m+1,j)]
    end for
    // rotate elements of p
    [p_i; p_(m+1)] = [cos φ, sin φ; −sin φ, cos φ] [λ p_i; p_(m+1)]
    // build up the product of cos φ's
    Π = Π · cos φ
end for
return e = Π · p_(m+1)
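A minimal Python sketch of the update in Algorithm 3 is shown below. The function name, the random test data, and the forgetting factor of 0.98 are assumptions made for illustration. The recursive error is compared against a direct weighted least-squares solve over the same data, which should produce the same a posteriori residual for the latest sample.

```python
import numpy as np

def qr_rls_update(R, p, d_row, y_n, lam):
    """One Givens-rotation update; returns the new R, new p, and the residual e_n."""
    m = R.shape[0]
    R = lam * R                       # apply the forgetting factor to the old rows
    p = lam * p
    d = np.array(d_row, dtype=float)  # appended row, rotated to zero
    y = float(y_n)
    prod_cos = 1.0
    for i in range(m):
        rx, ry = R[i, i], d[i]
        norm = np.hypot(rx, ry)
        if norm == 0.0:
            continue                  # nothing to rotate, so cos(phi) = 1
        c, s = rx / norm, ry / norm
        row_r, row_d = R[i, i:].copy(), d[i:].copy()
        R[i, i:] = c * row_r + s * row_d
        d[i:] = -s * row_r + c * row_d
        p[i], y = c * p[i] + s * y, -s * p[i] + c * y
        prod_cos *= c                 # running product of the cosines
    return R, p, prod_cos * y

# Hypothetical data: 25 random measurements, 3 unknown coefficients.
rng = np.random.default_rng(0)
M, N_ROWS, LAM = 3, 25, 0.98
D = rng.standard_normal((N_ROWS, M))
y_vec = rng.standard_normal(N_ROWS)

R_mat, p_vec = np.zeros((M, M)), np.zeros(M)
for k in range(N_ROWS):
    R_mat, p_vec, e_recursive = qr_rls_update(R_mat, p_vec, D[k], y_vec[k], LAM)

# Direct check: weighted least squares with weights lam^(n-k) on row k.
w = LAM ** np.arange(N_ROWS - 1, -1, -1)
h = np.linalg.lstsq(D * w[:, None], y_vec * w, rcond=None)[0]
e_direct = y_vec[-1] - D[-1] @ h
```

The forgetting factor is applied once to `R` and `p` at the start of each update rather than inside the loop, which is equivalent to scaling by λ element by element as the algorithm is written.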
Appendix B
Predistortion Error Due to Model
Inaccuracy
When modeling the DAC, we assumed that it comprises a static nonlinearity followed
by a filter at the output, thereby arriving at a Hammerstein model. Of course,
there is always some modeling inaccuracy at this level of abstraction. Specifically,
there is some filtering before the DAC drivers, due to the finite bandwidth of the SC core
track-and-hold, and therefore the assumption of a memoryless nonlinearity is not quite
true. The system can be more accurately modeled using power series with memory.
In this section, we use the more general Volterra filter model to analyze the signal
chain.
A model that includes the filtering effect prior to the DAC nonlinearity is shown in
Figure B.1. In this picture, the sigmoid nonlinearity denoted by the function f(·)
corresponds to the DAC driver nonlinearity. The filter h(t) models the filtering effect
that was absent in our earlier modeling. The function g(·) denotes the predistorter
curve and it is assumed that it has the inverse characteristic of f. The output filter
of the Hammerstein model is not shown in this picture. If g and f were cascaded
back-to-back, their nonlinearities would cancel out. However, in the presence of h,
the predistortion is not perfect and there would be some residual nonlinearity in the
signal path.

Figure B.1: Alternative model of the DAC including a filter before the driver. [Signal
chain from x to y: g(·), h(t), f(·).]
Assuming a simple third-order nonlinearity for the DAC output driver, we have
$$f(x) = x - \epsilon x^3 \tag{B.1}$$
If we ignore terms with higher powers of $\epsilon$, the predistorter for this nonlinearity will
be
$$g(x) = x + \epsilon x^3 \tag{B.2}$$
We assume that the filter $h$ has a first-order response with time constant $T$. Thus,
$$h(t) = \frac{1}{T}\,e^{-t/T} \tag{B.3}$$
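Now, ignoring powers of $\epsilon$ larger than one, we can obtain the output of the DAC as
follows:
$$\begin{aligned} y &= f\left(h * g(x)\right) \\ &= f\left(\int h(\tau)\left(x(t-\tau) + \epsilon x^3(t-\tau)\right)d\tau\right) \\ &= \int h(\tau)\left(x(t-\tau) + \epsilon x^3(t-\tau)\right)d\tau - \epsilon\left(\int h(\tau)\left(x(t-\tau) + \epsilon x^3(t-\tau)\right)d\tau\right)^3 \\ &\approx \int h(\tau)x(t-\tau)\,d\tau + \epsilon\left(\int h(\tau)x^3(t-\tau)\,d\tau - \left(\int h(\tau)x(t-\tau)\,d\tau\right)^3\right) \end{aligned} \tag{B.4}$$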
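The chain in (B.4) can also be simulated directly. The sketch below passes a sinusoid through the predistorter of (B.2), a first-order filter, and the nonlinearity of (B.1), and then measures the third harmonic relative to the fundamental. The filter is applied in the frequency domain to a periodic record. The parameter values are hypothetical, and the comparison with (5.1) assumes that $c_3/c_1 = \epsilon$ and that $f_{sig}/f_{T/H}$ equals $2\pi f_{sig}T$.

```python
import numpy as np

def residual_hd3(eps, omega_t, amp=1.0, n=1024, cycles=8):
    """Third harmonic over fundamental at the output of f(h * g(x)).

    omega_t is 2*pi*f_sig*T, the signal frequency normalized to the filter corner.
    """
    theta = 2.0 * np.pi * cycles * np.arange(n) / n
    x = amp * np.sin(theta)
    pre = x + eps * x ** 3                         # predistorter g, as in (B.2)
    bins = np.arange(n // 2 + 1)
    h_freq = 1.0 / (1.0 + 1j * omega_t * bins / cycles)   # first-order filter
    filt = np.fft.irfft(np.fft.rfft(pre) * h_freq, n)
    y = filt - eps * filt ** 3                     # driver nonlinearity f, as in (B.1)
    spec = np.abs(np.fft.rfft(y))
    return spec[3 * cycles] / spec[cycles]

EPS, OMEGA_T = 1e-4, 0.1                           # hypothetical values
hd3_filtered = residual_hd3(EPS, OMEGA_T)
hd3_unfiltered = residual_hd3(EPS, 0.0)            # g and f back-to-back
hd3_formula = 0.75 * EPS * OMEGA_T ** 2            # (5.1) with A = 1
```

With no filter, only a residual of order ε² remains. With the filter in place, the third harmonic is much larger and, for small values of `OMEGA_T`, close to the value predicted by (5.1) under the stated assumptions.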
The last expression in (B.4) can be cast into the following format:
$$\begin{aligned} y = {} & \int h(\tau)x(t-\tau)\,d\tau \\ & + \epsilon\left(\int h(\tau)x^3(t-\tau)\,d\tau - \iiint h(\tau_1)h(\tau_2)h(\tau_3)\,x(t-\tau_1)x(t-\tau_2)x(t-\tau_3)\,d\tau_1 d\tau_2 d\tau_3\right) \end{aligned} \tag{B.5}$$
We recognize that the output is in the form of a third order Volterra filter as shown