DIRECT DIGITAL FREQUENCY SYNTHESIZER by SHASHIKANT SHRIMALI, B.E. A THESIS IN ELECTRICAL ENGINEERING Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE IN ELECTRICAL ENGINEERING Approved Jon Bredeson Chairperson of the Committee Micheal Parten Accepted John Borrelli Dean of the Graduate School May 2007
DIRECT DIGITAL FREQUENCY SYNTHESIZER
by
SHASHIKANT SHRIMALI, B.E.
A THESIS
IN
ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in
Partial Fulfillment of the Requirements for
the Degree of
MASTER OF SCIENCE
IN
ELECTRICAL ENGINEERING
Approved
Jon Bredeson Chairperson of the Committee
Micheal Parten
Accepted
John Borrelli Dean of the Graduate School
May 2007
Copyright 2007, Shashikant Shrimali
Texas Tech University, Shashikant Shrimali, May 2007
ACKNOWLEDGEMENTS
I wish to express my sincere gratitude to Prof. Dr. Jon Gustav Bredeson for
providing the opportunity to carry out this study, and for guidance and support. I am
deeply indebted to Prof. Dr. Micheal Eugene Parten for his valuable suggestions and
active cooperation, and for having been a part of every single stage of my thesis, from
inception to completion. I am very grateful to fellow graduate students for their constant
support, timely suggestions and inspiration. I gratefully acknowledge the cooperation I
received from other faculty members of this department. I will be failing in my duty if I
do not mention the administrative staff of this department for their timely help.
My friends outside the field of engineering have kept me interested in matters
other than just electrical. I have spent very relaxing moments with my friends taking part
in different activities and hopefully will continue to do so. I would like to thank all whose
direct and indirect support helped me complete my thesis in time. I would also like to
recognize the encouragement and unconditional love of my friends in India.
Finally, there are those whose spiritual support is even more important. I thank
my parents, who taught me the value of hard work by their own example. I would also
thank my sister, who has constantly encouraged me to study as much as possible.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF FIGURES
CHAPTER
I. INTRODUCTION TO DDFS
1.1 DDFS - A Brief History
1.2 DDFS - An Overview
1.3 Aim of Thesis
1.4 Thesis Organization
II. OPERATION OF DDFS
2.1 Architecture of Sine Output DDFS
generators, cellular base stations and wireless local loop base stations.
1.2 DDFS – An Overview
Direct digital frequency synthesis (DDFS) is a method of producing an analog
waveform, usually a sine wave, by generating a time-varying signal in digital form
and then performing a digital-to-analog conversion. Because the operations within a DDFS
device are primarily digital, it can offer fast switching between output
frequencies, fine frequency resolution, and operation over a broad spectrum of
frequencies.
The digital frequency synthesis approach employs a stable source frequency, i.e. a
reference clock, to define the times at which digital sinusoidal sample values are produced.
These samples are converted from digital to analog format and smoothed by a
reconstruction filter to produce analog frequency signals. A DDFS typically consists of a
phase accumulator (PA) and a sine lookup table (LUT). The input to the phase
accumulator is a frequency control word, which determines the periodicity of the phase
accumulator. The PA is incremented by the frequency control word (or tuning word) at each
clock cycle, and the output of the PA is fed to the LUT. The output of the LUT is then
converted to an analog signal using a digital-to-analog converter.
The size of the LUT depends on the length of the n-bit PA. If n is large, the
LUT becomes too large, which is not desirable: it slows down the DDFS
and results in higher power consumption. To reduce the size of the LUT, a technique called
phase truncation (PT) is employed. Because part of the phase generated by
the PA is truncated, this technique gives rise to spurs in the output spectrum. To reduce
these spurs, dither is added to the system. Since the DDFS
is a clocked digital system, clock jitter also introduces noise in the output spectrum. Jitter is an
abrupt and unwanted variation of one or more signal characteristics, such as the interval
between successive pulses, the amplitude of successive cycles, or the frequency or phase
of successive cycles.
With advances in design and process technology, today’s DDFS devices are very
compact and draw little power. The ability to accurately produce and control waveforms
of various frequencies and profiles has become a key requirement common to a number
of industries. Whether providing agile sources of low-phase-noise variable-frequencies
with good spurious performance for communications, or simply generating a frequency
stimulus in industrial or biomedical test equipment applications, convenience,
compactness, and low cost are important design considerations. Many possibilities for
frequency generation are open to a designer, ranging from phase-locked-loop (PLL)-
based techniques for very high-frequency synthesis, to dynamic programming of digital-
to-analog converter (DAC) outputs to generate arbitrary waveforms at lower frequencies.
The DDFS technique is rapidly gaining acceptance for solving frequency- (or waveform)
generation requirements in both communications and industrial applications because
single-chip IC devices can generate programmable analog output waveforms simply and
with high resolution and accuracy.
1.3 Aim of Thesis
A DDFS generates analog waves digitally; hence, the output spectrum has a certain
amount of distortion, and the output wave is not spectrally pure. The output
spectrum of a DDFS consists of the fundamental frequency along with its image frequencies
and noise. Various factors affect the purity of the output spectrum of the DDFS, among
them DAC nonlinearity, spurs due to phase truncation, amplitude quantization, and
clock jitter. The size of the LUT depends on the length of the n-bit PA. If n is large,
the LUT becomes too large, which is not desirable: it slows down the
DDFS and results in higher power consumption.
This thesis examines the performance of a direct digital frequency synthesizer and
shows the effect of non-ideal characteristics of the building blocks of the DDFS on the output
spectrum. Different models of the DDFS are created, implemented and examined. This
gives the designer a better understanding of the DDFS. These models help to
examine different aspects of the DDFS and to determine what would work best for an
application.
In the digital part, phase truncation introduces errors; when some dither is
added, the spurious free dynamic range (SFDR) is improved. The analog part of the DDFS
includes a D/A converter and a low pass filter. Random jitter is added to the input
reference clock for system-level verification and simulation.
The spurious performance of a DDFS is partly caused by quantization operations in
its digital part. These errors are deterministic and periodic in the time domain; therefore,
they appear as undesired components (spurs) in the frequency domain. Hence, it is
natural to analyze their effects with the DFT (Discrete Fourier Transform). The amplitude
quantization (AQ) is always present and causes harmonically related spurs, while
phase truncation (PT) produces spurs around the output frequency by phase modulation.
1.4 Thesis Organization
This thesis first provides some background on DDFS design,
followed by the design and simulation results. The focus of this thesis has been outlined
in CHAPTER I. The operation of a DDFS and the role of each building block are
explained in detail in CHAPTER II. The output spectrum of a DDFS is explained in
CHAPTER III with the help of an example. The implementation of a DDFS, with simulation
results, is explained in CHAPTER IV. Finally, CHAPTER V gives conclusions and
recommendations for future work on DDFS.
CHAPTER II
OPERATION OF DDFS
2.1 Architecture of Sine Output DDFS
The basic block diagram of a direct digital frequency synthesizer is shown in
Figure 2.1 [2].
Figure 2.1: DDFS function blocks and signal flow diagrams [2]
As shown in Figure 2.1, the main components of a DDFS are a phase
accumulator, a phase-to-amplitude converter (a sine look-up table), a digital-to-analog
converter, and a filter. A DDFS produces a sine wave at a given frequency. The frequency
depends on three variables: the reference-clock frequency $f_{clk}$, the binary number $M$
programmed into the phase register (the frequency control word), and the length $n$ of the
accumulator in bits. The binary number in the phase register provides the main input to the
phase accumulator.
If a sine look-up table is used, the phase accumulator computes a phase (angle)
address for the look-up table, which outputs the digital value of amplitude—
corresponding to the sine of that phase angle—to the DAC. The DAC, in turn, converts
that number to a corresponding value of analog voltage or current. To generate a fixed-
frequency sine wave, a constant value (the phase increment—that is determined by the
binary number M ) is added to the phase accumulator with each clock cycle. If the phase
increment is large, the phase accumulator will step quickly through the sine look-up table
and thus generate a high frequency sine wave. If the phase increment is small, the phase
accumulator will take many more steps, accordingly generating a slower waveform [4].
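The accumulate-and-look-up behavior described above can be sketched in a few lines of Python. This is a minimal illustration rather than the thesis implementation: the function names `sine_lut` and `run_accumulator`, the 8-bit accumulator width, and the two M values are assumptions chosen to keep the example small.

```python
import math

def sine_lut(size):
    """One full sine cycle sampled at `size` equally spaced phase points."""
    return [math.sin(2 * math.pi * k / size) for k in range(size)]

def run_accumulator(m, n_bits, num_clocks):
    """Step an n-bit phase accumulator by m on every clock.

    Returns the phase value seen at each clock and the number of overflows.
    """
    modulus = 1 << n_bits
    acc = 0
    phases = []
    overflows = 0
    for _ in range(num_clocks):
        phases.append(acc)
        total = acc + m
        if total >= modulus:
            overflows += 1          # one overflow = one completed output cycle
        acc = total % modulus       # the accumulator wraps modulo 2**n_bits
    return phases, overflows

N_BITS = 8                          # assumed small width, for illustration only
lut = sine_lut(1 << N_BITS)

phases_slow, cycles_slow = run_accumulator(m=1, n_bits=N_BITS, num_clocks=1024)
phases_fast, cycles_fast = run_accumulator(m=4, n_bits=N_BITS, num_clocks=1024)

# Here the accumulator output is used directly as the LUT address.
samples_slow = [lut[p] for p in phases_slow]
samples_fast = [lut[p] for p in phases_fast]

print(cycles_slow, cycles_fast)     # completed sine cycles in 1024 clocks
```

With these assumed values, the larger phase increment completes four times as many cycles in the same number of clocks, which is the "high frequency versus slow waveform" behavior described in the text.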
The heart of the system is the phase accumulator, whose contents are updated once
each clock cycle. Each time the PA is updated, the digital number M stored in the
phase register is added to the number in the phase accumulator register. Suppose the number in
the phase register is 00...01 and the initial contents of the phase accumulator are 00...00.
The phase accumulator is then incremented by 00...01 on each clock cycle. If the accumulator is
32 bits wide, $2^{32}$ clock cycles (over 4 billion) are required before the phase accumulator
returns to 00...00, and the cycle repeats. The output of the phase accumulator serves as
the address to a sine (or cosine) lookup table/ROM/phase-to-amplitude converter. Each
address in the LUT corresponds to a phase point on the sine wave from 0° to 360°. The
LUT contains the corresponding digital amplitude information for one complete cycle of
a sine wave. The LUT, therefore, maps the phase information from the phase accumulator
into a digital amplitude word, which in turn drives the DAC. For n = 32 and M = 1, the
phase accumulator steps through each of the $2^{32}$ possible outputs before it overflows. The
corresponding output sine-wave frequency is equal to the clock frequency divided by $2^{32}$.
If M = 2, then the phase accumulator register "rolls over" twice as fast, and the output
frequency is doubled. For an n-bit phase accumulator (n generally ranges from 24 to 32 in
most DDFS systems), there are $2^n$ possible phase points. The digital word in the phase
register, M, represents the amount by which the phase accumulator is incremented each clock cycle.
If $f_{clk}$ is the clock frequency, then the frequency of the output sine wave is equal
to:
$f_{out} = \frac{M \cdot f_{clk}}{2^n}$ (1)
The above equation is known as the DDFS "tuning equation." The frequency resolution of
the system equals $f_{clk}/2^n$. In a practical DDFS system, not all the bits out of the phase
accumulator are passed on to the LUT; they are truncated, leaving only the first 13 to
15 MSBs. This reduces the size of the LUT and does not affect the frequency resolution.
The phase truncation only adds a small but acceptable amount of phase noise to the final
output. The resolution of the DAC is typically 2 to 4 bits less than the width of the lookup
table. Even a perfect N-bit DAC adds quantization noise to the output [17].
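As a numerical illustration of the tuning equation, the sketch below evaluates Equation 1 and the frequency resolution, and inverts the equation to find the nearest tuning word for a requested frequency. The 100 MHz clock, the 10 MHz target, and the helper names are assumptions made for this example; they do not come from the thesis.

```python
def output_frequency(m, f_clk, n_bits):
    """Equation 1: f_out = M * f_clk / 2**n."""
    return m * f_clk / 2**n_bits

def frequency_resolution(f_clk, n_bits):
    """Smallest frequency step, obtained with M = 1."""
    return f_clk / 2**n_bits

def tuning_word(f_target, f_clk, n_bits):
    """Nearest integer M for a requested output frequency."""
    return round(f_target * 2**n_bits / f_clk)

F_CLK = 100e6    # assumed 100 MHz reference clock
N_BITS = 32      # accumulator width used in the text's example

m = tuning_word(10e6, F_CLK, N_BITS)
print(m)                                       # tuning word for about 10 MHz
print(frequency_resolution(F_CLK, N_BITS))     # step size in Hz
print(output_frequency(m, F_CLK, N_BITS))      # frequency actually produced
```

Because M must be an integer, the frequency actually produced differs from the requested one by at most half the frequency resolution.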
2.2 Frequency Tuning Equation
A sine wave is generally expressed as $a(t) = \sin(\omega t)$, which is non-linear and not
easy to generate except through constructing it from pieces. However, the angular
information is linear because the phase angle rotates through a fixed angle for each unit
of time. Thus, the angular rate depends on the frequency of the signal, described
as $\omega = 2\pi f$, where $\omega$ is the angular frequency. As shown in Figure 2.2, the phase
increases linearly from 0 to $2\pi$ over one complete cycle of the sine wave.
Figure 2.2: Sine magnitude and phase representation [19]
Knowing that the phase of a sine wave is linear and that it depends on a reference clock
period, with clock frequency $f_{clk}$, the phase rotation ($\Delta p$) for that period can be
determined by
$\Delta p = \omega \cdot \Delta t$ (2)
where $\Delta p$ = change in phase of the sine wave, $\omega$ = angular frequency of the wave, and
$\Delta t$ = small change in time. Solving for $\omega$ in Equation 2 gives
$\omega = \frac{\Delta p}{\Delta t} = 2\pi f$ (3)
The overflowing accumulator (phase accumulator, PA), clocked with $f_{clk}$, generates the
phase value sequence, where $\Delta t$ is the minimum amount of change,
$f_{clk} = \frac{1}{\Delta t}$ (4)
Solving for $f$ from Equation 3 and substituting the reference clock frequency for the
reference period in Equation 4 specifies the frequency of the output signal:
$f_{out} = \frac{\Delta p \cdot f_{clk}}{2\pi}$ (5)
Finally, for an n-bit accumulator the output signal will have the frequency specified by
$f_{out} = \frac{\Delta p \cdot f_{clk}}{2^n}$ (6)
where $\Delta p$, now expressed as an integer number of accumulator steps rather than as an
angle, is the phase increment word (also called the frequency control word or
frequency tuning word), $f_{clk}$ is the clock frequency, and n is the length of the accumulator.
This phase value is generated using the modulo $2^n$ overflowing property of an n-bit
PA. The rate of the overflow is the output frequency given by Equation 6, or
$f_{out} = \frac{\Delta p \cdot f_{clk}}{2^n}$ (7)
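The step from Equation 5 to Equation 6 can be made explicit. The short derivation below is a sketch that introduces one symbol not used in the original, $\Delta\phi$, for the phase increment in radians; $\Delta p$ is the integer increment of the n-bit accumulator, whose full range $2^n$ corresponds to $2\pi$ radians.

```latex
\Delta\phi = \frac{2\pi}{2^{n}}\,\Delta p
\quad\Longrightarrow\quad
f_{out} = \frac{\Delta\phi \cdot f_{clk}}{2\pi}
        = \frac{2\pi\,\Delta p}{2^{n}} \cdot \frac{f_{clk}}{2\pi}
        = \frac{\Delta p \cdot f_{clk}}{2^{n}}
```

The factor $2\pi$ cancels, so only the integer increment and the accumulator length appear in the final tuning equation.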
$\Delta p$ is an integer; therefore, the frequency resolution is found by setting $\Delta p = 1$:
$\Delta f = \frac{f_{clk}}{2^n}$ (8)
A DDFS works on a point (memory location)-skipping technique (and a constant
interpolation of the stored signal) and runs at constant update (clock)-rate or reference
clock. As the DDFS output frequency is increased, the number of samples per waveform
cycle decreases.
In the point (memory location)-skipping technique, N data points cover one
complete cycle of the (sine) waveform. A block of N samples is stored in the memory
LUT (look-up table). The address pointer of the LUT is a "D step size, modulo N (mod
N, overflowing)" phase accumulator. D, a positive integer, is the FCW: frequency control
word [3].
• D = 1 gives an "exact copy" of the stored waveform (the pointer steps sequentially
through each address, i.e. the pointer accesses each consecutive entry in the table
in the same fashion as the PPC: point-per-clock synthesis or the sample playback
synthesis).
• When D > 1, the pointer will "skip" some addresses, resulting in a higher frequency
value:
$f_{out} = \frac{D \cdot f_{clk}}{N}$, with D prime to N and $D < N/2$ (9)
As the output frequency $f_{out}$ is increased, the number of samples per (sinusoid) cycle
decreases.
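The point-skipping pointer can be illustrated with a very small table. The sketch below is an assumption-laden example (a 16-entry table and the helper name `lut_addresses` are not from the thesis); it shows the address sequence for D = 1, for a step that is prime to N, and for a step that shares a factor with N.

```python
def lut_addresses(step, table_size, num_clocks):
    """Addresses visited by a 'step D, modulo N' pointer starting at 0."""
    return [(k * step) % table_size for k in range(num_clocks)]

N = 16                                  # assumed small table, for illustration

copy_seq = lut_addresses(1, N, N)       # D = 1: every entry, in order
skip_seq = lut_addresses(3, N, N)       # D = 3: prime to N
short_seq = lut_addresses(4, N, N)      # D = 4: shares a factor with N

print(skip_seq[:6])                     # [0, 3, 6, 9, 12, 15]
print(len(set(skip_seq)), len(set(short_seq)))   # 16 4
print(N / 3)                            # average samples per output cycle for D = 3
```

When D is prime to N, every table entry is eventually used, although not within a single output cycle; when D shares a factor with N, only a subset of the table is ever addressed.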
2.3 Building Blocks of DDFS
A DDFS is a mixed signal device i.e. it has both analog and digital blocks. These blocks
are the Phase Register, Phase Accumulator, Phase-to-Amplitude Converter (ROM/LUT),
Digital-to-Analog Converter, and Reconstruction Filter. The functionality of each of
these blocks is discussed in the following section.
2.3.1 Phase Accumulator
Continuous-time sinusoidal signals have a repetitive angular phase range of 0 to
360 degrees. The digital implementation is no different. The counter’s carry function
allows the phase accumulator to act as a phase wheel in the DDFS implementation, as
shown in Figure 2.3 [4]. To understand this basic function, consider the sine-wave
oscillation as a vector rotating around a phase circle. Each designated point on the phase
wheel corresponds to the equivalent point on a cycle of a sine wave. As the vector rotates
around the wheel, visualize that the sine of the angle generates a corresponding output
sine wave. One revolution of the vector around the phase wheel, at a constant speed,
results in one complete cycle of the output sine wave. The phase accumulator provides
the equally spaced angular values accompanying the vector’s linear rotation around the
phase wheel. The contents of the phase accumulator correspond to the points on the cycle
of the output sine wave [4].
Figure 2.3: Digital phase wheel [4]
The PA is a modulo-M counter that increments its stored number each time it
receives a clock pulse. The magnitude of the increment is determined by the binary-coded
input word ( M ). This word forms the phase step size between reference-clock updates; it
effectively sets how many points to skip around the phase wheel. The larger the jump
size, the faster the phase accumulator overflows and completes the equivalent of a sine-
wave cycle. The number of discrete phase points contained in the wheel is determined by
the resolution of the PA (n-bits), which determines the tuning resolution of the DDFS.
For example, for an n = 28-bit phase accumulator, an M value of 0000...0001
would cause the phase accumulator to overflow after $2^{28}$ reference-clock cycles
(increments). If the value of M is changed to 0111...1111, the phase accumulator will
overflow after only about 2 reference-clock cycles (the minimum required by Nyquist). This
relationship can be seen in the basic tuning equation for the DDFS architecture:
$f_{out} = \frac{M \cdot f_{clk}}{2^n}$ (10)
where:
$f_{out}$ = output frequency of the DDFS,
$M$ = frequency control word,
$f_{clk}$ = internal reference clock frequency (system clock),
$n$ = length of the phase accumulator, in bits.
Any change to the value of M results in an immediate and phase-continuous change
in the output frequency. In a DDFS, no loop settling time is incurred as in the case of a
PLL. As the output frequency is increased, the number of samples per cycle decreases.
Since sampling theory dictates that at least two samples per cycle are required to
reconstruct the output waveform, the maximum fundamental output frequency of a DDFS
is $f_{clk}/2$. However, for practical applications, the output frequency is limited to somewhat
less than that, improving the quality of the reconstructed waveform and permitting
filtering on the output. When generating a constant frequency, the output of the PA
increases linearly, so the analog waveform it generates is inherently a ramp.
2.3.2 Phase-to-Amplitude Converter (ROM/ LUT)
In this thesis, the DDFS’s ROM is a sine look-up table; it converts the digital phase
input from the accumulator to an output amplitude. The accumulator output represents the
phase of the wave and also serves as the address of a word in the LUT, which holds the
amplitude corresponding to that phase. This amplitude word from the ROM LUT drives the DAC to
provide an analog output. The LUT is also called a digital Phase-to-Amplitude Converter (PAC),
a polar-to-rectangular transformation (projection of the real or imaginary component in
time), or a (sine) waveform mapping device, i.e. a memory. All techniques of calculating data
are modeled here as a simple lookup table operation. The lookup memory contains one
cycle of the waveform to be generated. The size of the LUT is $2^n$ words. The LUT translates
truncated phase information, which is in digital form, into quantized numerical waveform
samples.
DDFS systems can be implemented with or without a ROM. Using a
ROM LUT as a phase-to-amplitude converter has some advantages over ROMless
architectures. The simplicity of the ROM circuit makes the ROM LUT easy to
implement. The advantages of ROMless architectures can be seen when higher bit
accuracy is desired. For higher bit accuracy, the ROM becomes very large, consumes
more power and becomes slow compared to a ROMless architecture. The ROM LUT
stores the amplitude values, while ROMless architectures compute
them. Inherently, a ROM LUT provides better SFDR than any ROMless
architecture for the same bit width [2]. In an ideal case with no phase and amplitude
quantization, the output sequence of the look-up table is given by
$\sin\left(\frac{2\pi P(i)}{2^n}\right)$ (11)
where $P(i)$ is the (n-bit) phase register value (at the ith clock period). The numerical
period $P_e$ of the PA output sequence (in clock cycles) is
$P_e = \frac{2^n}{\mathrm{GCD}(\Delta p, 2^n)}$ (12)
where $\mathrm{GCD}(\Delta p, 2^n)$ represents the greatest common divisor of $\Delta p$ and $2^n$.
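Equation 12 can be checked against a direct simulation of the accumulator. The sketch below assumes an 8-bit accumulator and the helper names `numerical_period` and `measured_period`, none of which appear in the thesis; it counts the clocks needed for the accumulator to return to zero and compares that with the closed-form value.

```python
import math

def numerical_period(delta_p, n_bits):
    """Equation 12: Pe = 2**n / GCD(delta_p, 2**n)."""
    return 2**n_bits // math.gcd(delta_p, 2**n_bits)

def measured_period(delta_p, n_bits):
    """Count clocks until an accumulator starting at 0 returns to 0."""
    modulus = 2**n_bits
    acc = delta_p % modulus
    clocks = 1
    while acc != 0:
        acc = (acc + delta_p) % modulus
        clocks += 1
    return clocks

for delta_p in (1, 6, 64):
    # Both columns should agree for every increment.
    print(delta_p, numerical_period(delta_p, 8), measured_period(delta_p, 8))
```

Odd increments give the full period of $2^n$ clocks, while increments that share factors of two with $2^n$ shorten the period accordingly.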
2.3.3 Digital-to-Analog Converter and Filter
The phase accumulator computes a phase (angle) address for the look-up table,
which outputs the digital value of amplitude—corresponding to the sine of that phase
angle—to the DAC. The DAC, in turn, converts that number to a corresponding value of
analog voltage or current. The DAC and the rest of the system run at the same reference
clock for synchronization. The DAC adds quantization error to the output sine
wave. Ideally, a $\sin(x)/x$ response is used to filter the output of the DAC [2]. It removes the extra
frequency components added to the sine wave and hence produces a smooth sine wave.
CHAPTER III
OUTPUT SPECTRUM OF DDFS
3.1 Sampled Output of DDFS [4]
An understanding of sampling theory is necessary when analyzing the sampled
output of a DDFS-based signal synthesis solution. The spectrum of a sampled output is
illustrated in Figure 3.1. In this example, the sampling clock $f_{clk}$ is 300 MHz and the
fundamental output frequency $f_{out}$ is 80 MHz [4].
Figure 3.1: Spectral analysis of sampled output [4]
The Nyquist Theorem dictates that a minimum of two samples per cycle is
required to reconstruct the desired output waveform. Images are created in the sampled
output spectrum at $f_{clk} \pm f_{out}$ and around higher multiples of the clock frequency.
The 1st image response occurs in this example at $f_{clk} - f_{out}$,
or 220 MHz. The 2nd, 3rd, 4th, and 5th images appear at 380 MHz, 520 MHz, 680 MHz, and
820 MHz (respectively). Figure 3.1 shows that nulls appear at multiples of the sampling
frequency. In the case of the $f_{out}$ frequency exceeding half the $f_{clk}$ frequency, the 1st image
response will appear within the Nyquist bandwidth (DC to $\frac{1}{2} f_{clk}$) as an aliased image.
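The image frequencies quoted for this example follow from placing an image on either side of each multiple of the clock. A minimal sketch, assuming a helper named `image_frequencies` (not from the thesis) and working in MHz:

```python
def image_frequencies(f_clk, f_out, count):
    """First `count` image frequencies, at k*f_clk - f_out and k*f_clk + f_out."""
    images = []
    k = 1
    while len(images) < count:
        images.append(k * f_clk - f_out)   # lower image around k * f_clk
        images.append(k * f_clk + f_out)   # upper image around k * f_clk
        k += 1
    return images[:count]

# Values from the example in the text, in MHz.
print(image_frequencies(300, 80, 5))       # [220, 380, 520, 680, 820]
```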
In typical DDFS applications, a lowpass filter is utilized to suppress the effects of
the image responses in the output spectrum. In order to keep the cutoff requirements on
the lowpass filter reasonable, it is an accepted rule to limit the $f_{out}$ bandwidth to
approximately 40% of the $f_{clk}$ frequency [4]. This facilitates using an economical lowpass
filter implementation on the output.
Figure 3.1 indicates that the amplitude of $f_{out}$ and the image responses follows a
$\sin(x)/x$ roll-off response. This is due to the quantized nature of the sampled output. The
amplitude of the fundamental and any given image response can be calculated using the
$\sin(x)/x$ formula. Per the roll-off response function, the amplitude of the fundamental
output will decrease as its tuned frequency increases. The amplitude roll-off
due to $\sin(x)/x$ in a DDFS system is –3.92 dB over its DC to Nyquist bandwidth. A
DDFS architecture can include inverse SINC filtering. This can pre-compensate for the
$\sin(x)/x$ roll-off and maintain flat output amplitude (±0.1 dB) from the D/A converter over
a bandwidth of up to 45% of the clock rate or 80% of Nyquist rate. It is important to
generate a frequency plan in DDFS applications and analyze the spectral considerations
of the image response and the $\sin(x)/x$ amplitude response at the desired $f_{out}$ and
$f_{clk}$ frequencies.
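The –3.92 dB figure can be reproduced numerically. The sketch below assumes the usual zero-order-hold form of the roll-off, with $x = \pi f / f_{clk}$; that substitution and the helper name `sinc_rolloff_db` are assumptions of this example rather than definitions taken from the thesis.

```python
import math

def sinc_rolloff_db(f, f_clk):
    """Amplitude of sin(x)/x in dB, with x = pi * f / f_clk (assumed form)."""
    if f == 0:
        return 0.0                      # limit of sin(x)/x as x -> 0 is 1
    x = math.pi * f / f_clk
    return 20 * math.log10(abs(math.sin(x) / x))

F_CLK = 300e6                           # clock frequency from the example

print(round(sinc_rolloff_db(F_CLK / 2, F_CLK), 2))   # roll-off at Nyquist
print(sinc_rolloff_db(80e6, F_CLK))                  # fundamental at 80 MHz
print(sinc_rolloff_db(220e6, F_CLK))                 # first image at 220 MHz
```

Under this assumption the value at Nyquist rounds to –3.92 dB, and the first image is attenuated more than the fundamental, consistent with the envelope described for Figure 3.1.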
The other anomalies in the output spectrum, such as integral and differential
linearity errors of the D/A converter, glitch energy associated with the D/A converter,
and clock feed-through noise, do not follow the $\sin(x)/x$ roll-off response. These
anomalies appear as harmonics and spurious energy in the output spectrum and generally
are much lower in amplitude than the image responses. The general noise floor of a
DDFS is determined by the cumulative combination of substrate noise, thermal noise
effects, ground coupling, and a variety of other sources of low-level signal corruption.
The noise floor, spur performance, and jitter performance of a DDFS are greatly influenced
by circuit board layout, the quality of the power supply, and the quality of the input
reference clock.
3.2 Spectral Purity Considerations
The fidelity of a signal formed by recalling samples of a sinusoid from a LUT is
affected by both the phase and amplitude quantization of the process. The length and
width of the look-up table affect the signal's phase angle resolution and the signal's
amplitude resolution respectively. These resolution limits are equivalent to time base
jitter and to amplitude quantization of the signal and add spectral modulation lines and a
white broadband noise floor to the signal's spectrum. In conjunction with the system
clock frequency, PA width determines the frequency resolution of the DDFS. The PA
must have a sufficient field width to span the desired frequency resolution. For most
practical applications, a large number of bits are allocated to the phase accumulator in
order to satisfy the system frequency resolution requirements.
3.3 Spurious Free Dynamic Range
In many DDFS applications, the spectral purity of the DAC output is of primary
concern. Unfortunately, a number of interacting factors complicate the measurement,
prediction, and analysis of this performance. Even an ideal N-bit DAC produces
harmonics in a DDFS system. The amplitude of these harmonics is highly dependent
upon the ratio of the output frequency to the clock frequency. The assumption that the
quantization noise appears as white noise and is spread uniformly over the Nyquist
bandwidth is simply not true in a DDFS system. For instance, if the DAC output
frequency is set to an exact submultiple of the clock frequency, then the quantization
noise is concentrated at multiples of the output frequency, i.e., it is highly signal
dependent. If the output frequency is slightly offset, however, the quantization noise
becomes more random, thereby giving an improvement in the effective SFDR. To obtain
the best SFDR, the clock and output frequencies must be carefully chosen.
In ADC-based systems, adding a small amount of random noise to the input tends
to randomize the quantization errors and reduce this effect. The same thing can be done
in a DDFS system. A pseudo-random digital noise generator output can be added to the
DDFS sine amplitude word before being loaded into the DAC. The amplitude of the
digital noise is set to about $\frac{1}{2}$ LSB. This accomplishes the randomization process at the
expense of a slight increase in the overall output noise floor. In most DDFS applications,
however, there is enough flexibility in selecting the various frequency ratios so that
dithering is not required [5].
3.4 DDFS with Phase Truncation and Spurious Performance
Phase truncation is an important aspect of DDS architectures. Consider a DDS
with a 32-bit phase accumulator. To directly convert 32 bits of phase to corresponding
amplitude would require $2^{32}$ entries in a lookup table. If each entry were stored with 8-bit
accuracy, then 4 gigabytes of lookup table memory would be required. Clearly, it would
be impractical to implement such a design. The solution is to use a fraction of the most
significant bits of the accumulator output to provide phase information. For example, in a
32-bit DDS design, only the uppermost 12 bits might be used for phase information. The
lower 20 bits would be ignored (truncated) in this case. To understand the implications of
truncating the phase accumulator output, consider the concept of the “digital phase wheel”.
Consider a simple DDS architecture that uses an 8-bit accumulator of which only the
upper 5 bits are used for resolving phase. The phase wheel for this example is shown in
Figure 3.2 below [6].
Figure 3.2: Phase truncation error and the phase wheel [6]
With an 8-bit accumulator, the phase resolution associated with the accumulator is
1/256th of a full circle, or 1.41° ($360/2^8$). In Figure 3.2, the accumulator phase resolution
is identified by the outer circle of tic marks. If only the most significant 5 bits of the
accumulator are used to convey phase information, then the resolution becomes 1/32nd of
a full circle, or 11.25° ($360/2^5$).
These are identified by the inner circle of tic marks. If a tuning word value of 6 is
used, then the accumulator counts by increments of 6. The first four phase angles
corresponding to 6-count steps of the accumulator are depicted in Figure 3.2. The first
phase step (6 counts on the outer circle) falls short of the first inner tic mark. Thus, a
discrepancy arises between the phase of the accumulator (the outer circle) and the phase
as determined by 5-bit resolution (the inner circle). This discrepancy results in a phase
error of 8.46° (6 x 1.41°), as depicted by arc E1 in the Figure 3.2. On the second phase
step of the accumulator (6 more counts on the outer circle) the phase of the accumulator
resides between the 1st and 2nd tic marks on the inner circle. Again, there is a
discrepancy between the phase of the accumulator and the phase as determined by 5 bits
of resolution. The result is an error of 5.64° (4 x 1.41°) as depicted by arc E2 in the
Figure 3.2.
Similarly, at the 3rd phase step of the accumulator an error of 2.82° (2 x 1.41°)
results. On the 4th phase step, however, the accumulator phase and the 5-bit resolution
phase coincide resulting in no phase error. This pattern continues as the accumulator
increments by 6 counts on the outer circle each time.
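The phase-wheel example can be reproduced with integer arithmetic. The sketch below assumes a helper named `truncation_errors` (not from the thesis) and uses the same 8-bit accumulator, 5-bit phase word, and tuning word of 6 as the text.

```python
def truncation_errors(tuning_word, acc_bits, phase_bits, steps):
    """Phase error in degrees after each of the first `steps` accumulator updates."""
    dropped = acc_bits - phase_bits             # number of discarded LSBs
    deg_per_count = 360.0 / 2**acc_bits         # accumulator phase resolution
    modulus = 2**acc_bits
    acc = 0
    errors = []
    for _ in range(steps):
        acc = (acc + tuning_word) % modulus
        truncated = (acc >> dropped) << dropped  # phase as seen with phase_bits
        errors.append((acc - truncated) * deg_per_count)
    return errors

print(truncation_errors(6, 8, 5, 4))    # [8.4375, 5.625, 2.8125, 0.0]
```

The values differ slightly from the 8.46°, 5.64°, and 2.82° quoted in the text only because the text multiplies by the rounded step of 1.41° instead of the exact 1.40625°. The error sequence repeats every four steps, which is the periodicity that produces phase truncation spurs.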
The phase error introduced by truncating the accumulator will result in errors in
amplitude during the phase-to-amplitude conversion process inherent in the DDS. It turns
out that these errors are periodic. They are periodic because, regardless of the tuning
word chosen, after a sufficient number of revolutions of the phase wheel, the accumulator
phase and truncated phase will coincide. Since these amplitude errors are periodic in the
time domain, they appear as spurs in the frequency domain and are known as phase
truncation spurs.
It turns out that the magnitude and distribution of phase truncation spurs is
dependent on three factors [6]:
1. Phase Accumulator size
2. Phase word size; i.e., the number of bits of phase after truncation
3. Frequency control word
3.5 DDFS with Dither and its effect on SFDR
In the phase dithering model, the phase values generated by the PA contain a
certain amount of noise. This is accomplished by adding a small random number to each
phase value generated at the output of the PA. A dither generator is used to produce a
random number with each update of the accumulator. Typically, the MSB of the random
number is positioned one bit less than the LSB of the word that is fed to the LUT. In the
phase truncation DDFS, the PT (Phase Truncation) introduces a phase error in the phase
slope by discarding the least significant part. The phase error due to the discarded
fractional part of the address count is periodic, which results in undesired spurs. The
spurs associated with this correlated error sequence are impressed on the final output
waveform and appear in the synthesizer output spectrum.
These spurs can be suppressed by breaking up the regularity of the address error
with an additive randomizing signal. This random signal, called dither, is a noise
sequence with variance approximately equal to the least significant integer bit of the PA.
The dithered DDFS provides a higher spurious free dynamic range (SFDR) than
a phase truncation design. The additional logic resources required to
implement the dither sequence generator are not significant. Typically, a 3 or 4-bit
random number is sufficient.
CHAPTER IV
SIMULATION AND RESULTS OF DDFS
This chapter discusses the implementation of different models of the DDFS and
their simulation results. High-level modeling and simulation of a DDFS was carried out
using MATLAB-Simulink to understand the overall functionality and the flow from input
to output. The DDS has also been modeled using Verilog and simulated with the
ModelSim simulator. This Verilog code is synthesizable on an FPGA. A DDFS has also
been designed using VerilogA and simulated with Cadence. In the Cadence simulation,
the reference clock used is a jittered clock, to understand the effect of clock jitter on the
output spectrum. Simulations help to increase the understanding of a design under
non-ideal operating conditions, and they serve as a prototype before the design is actually
built in hardware. Here, the simulations are used to understand the effect of the non-ideal
characteristics of the building blocks of a DDFS on its output spectrum. The operation of the
DDFS has been explained in previous chapters.
4.1 High-level simulation of DDFS with MATLAB-Simulink
The MATLAB-Simulink implementation of a direct digital synthesizer consists of a Phase
Register (PR), a Phase Accumulator (PA) and a Look up Table (LUT). This model has been
developed to study the functionality of the DDS. Figure 4.1 shows the system-level
model of the DDFS. The PR contains the frequency tuning word. The unit delay block in
Figure 4.1, along with an adder and a feedback loop, represents the PA. The unit delay
block acts as a register. The LUT is implemented using the built-in sine LUT block of
MATLAB-Simulink. The input of the LUT is scaled using the gain block.
Figure 4.1: MATLAB-Simulink model of DDS
With every clock pulse the contents of the PR are added to those of the PA. The PA
generates the phase values of the output sine wave, and the output of the PA serves as the
address of the LUT. The LUT contains sampled values of one cycle of the sine wave, so
between two successive overflows of the PA the LUT outputs the samples of one cycle of
the sine waveform. The overflow rate of the PA
depends on the bit-size of the PA (number of bits) and the frequency tuning word. The
output frequency of the DDFS is directly proportional to the frequency tuning word:
the larger the frequency tuning word, the faster the PA overflows and the higher the
output frequency. This
is shown with the help of simulations. These simulations help to understand the signal
flow through the DDFS, the overflow of the PA and the relation of the output frequency
to the overflow rate of the PA, which is set by the frequency tuning word
stored in the phase register.
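The relation between tuning word and overflow rate can be checked with a few lines of
Python. This is only a behavioural sketch with hypothetical names, not the Simulink model
of Figure 4.1; it assumes an 8-bit accumulator that starts at zero.

```python
def count_overflows(tuning_word, acc_bits, clocks):
    """Clock an accumulator and count how many times it wraps around."""
    modulus = 1 << acc_bits
    phase = 0
    overflows = 0
    for _ in range(clocks):
        total = phase + tuning_word
        if total >= modulus:               # carry out of the accumulator
            overflows += 1
        phase = total % modulus
    return overflows

for m in (51, 102):
    print(m, count_overflows(m, acc_bits=8, clocks=256))
```

Over 256 clocks the accumulator wraps 51 times for a tuning word of 51 and 102 times for a
tuning word of 102, so doubling the tuning word doubles the number of output cycles in the
same number of clocks.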
To generate an output frequency of 10 MHz with a reference clock frequency of
50 MHz, a frequency tuning word (M) of 51 is stored in the Phase Register. The value
of the frequency tuning word (M) is calculated using the frequency tuning equation. The
Phase Accumulator is 8 bits wide. This control word M is added to the previous value of
the PA with each clock pulse.
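As a worked example, the tuning word can be computed as follows. The sketch assumes the
usual form of the frequency tuning equation, f_out = M * f_clk / 2^N for an N-bit
accumulator, which is taken here to be the one given in the earlier chapters; the function
names are hypothetical.

```python
def tuning_word(f_out, f_clk, acc_bits):
    """Integer tuning word; the fractional part cannot be stored and is dropped."""
    return int(f_out * 2**acc_bits / f_clk)

def actual_frequency(m, f_clk, acc_bits):
    """Output frequency actually produced by an integer tuning word."""
    return m * f_clk / 2**acc_bits

m = tuning_word(10e6, 50e6, 8)
print(m, actual_frequency(m, 50e6, 8))
```

Because only the integer part of the tuning word is kept, the frequency actually generated
is slightly below the requested one whenever the exact value has a fractional part.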
In Figure 4.2, it can be seen that for a frequency tuning word of 51, for the
first 3000 clocks, the PA overflows slightly more than two times. Therefore, a sine
wave of lower frequency is produced, which is shown in Figure 4.3.
Figure 4.2: PA output for M = 51
Figure 4.3: LUT output for f_out = 10 MHz
If the output frequency to be generated is increased to 20 MHz with the same
reference clock frequency of 50 MHz, a new frequency tuning word (M) of 102 is
stored in the Phase Register. The Phase Accumulator is 8 bits wide. This control word M is
added to the previous value of the PA with each clock.
Figure 4.4: PA output for M = 102
Figure 4.5: LUT output for f_out = 20 MHz
In Figure 4.4 above, it can be seen that for a frequency tuning word of 102, for
the first 3000 clocks, the PA overflows slightly more than four times. Therefore, a sine
wave of higher frequency is produced, which is shown in Figure 4.5.
For smaller values of M, the PA overflows more slowly than for larger values, as can
be seen by comparing Figure 4.2 with Figure 4.4. Hence, the output wave in Figure 4.3
has a lower frequency than the output wave in Figure 4.5.
4.2 RTL level simulation of DDFS using ModelSim
The DDFS has been modeled on a behavioral level with the help of Verilog. First,
a simple DDS is simulated, and then more modules are added to examine their effects
on the output spectrum. These models help in understanding the effect of each block on
the output.
4.2.1 Simple DDFS Simulation using ModelSim
Figure 4.6 shows the RTL-level schematic of the DDFS. This model has been
implemented using Verilog and simulated in ModelSim. In this model, different modules
are created in Verilog and then put together; therefore, it can be easily modified
according to one’s design requirements. The design consists of four basic modules:
Top level, Register, Adder and Look up Table. These modules are discussed in detail
in this chapter.
Figure 4.6: RTL schematic of DDFS
This model of the DDFS uses an 8-bit PA. The PA consists of a register and adder
with a feedback network. The Adder module is an 8-bit adder that has two 8-bit inputs
and one 9-bit output.
Figure 4.7 Verilog code for 8-bit Register
Figure 4.7 shows the Verilog code for the Register module. This module takes
a clock, a reset and 8-bit data “d” as inputs and has an 8-bit output “q”. At each clock, the
module checks the reset: if it is high, the output q is zero; otherwise, q is equal to the
input d. This code is synthesizable.
Figure 4.8 Simulation result of Register
The PA is modeled by interconnecting the Register and Adder modules, as shown in
the RTL schematic of the DDFS in Figure 4.6. As mentioned earlier, with each
clock pulse the contents of the phase register are added to those of the PA. This is shown in
Figure 4.9. Here the content of the phase register, “d1”, is 40. When the reset is high,
the output of the phase register is 0, and when the reset goes low the output “q1” is 40.
Initially sum1 is equal to 0. After the reset goes low, the contents of the
PA are incremented by 40 with each clock pulse. This goes on until the accumulator
overflows, and then the cycle starts again.
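The register and adder behaviour described above can be mimicked in software. The Python
sketch below is a rough behavioural stand-in, not the Verilog of Figure 4.7: it assumes a
synchronous reset, an 8-bit register and an adder with a 9-bit result, and the exact clock
on which each value appears may differ from the waveform in Figure 4.9. Class and function
names are hypothetical.

```python
class Register:
    """8-bit register with synchronous reset: q follows d on each clock."""
    def __init__(self):
        self.q = 0

    def clock(self, d, reset):
        self.q = 0 if reset else d & 0xFF
        return self.q

def adder(a, b):
    """8-bit adder with a 9-bit result (bit 8 is the carry)."""
    return (a & 0xFF) + (b & 0xFF)

def run_accumulator(tuning_word, clocks, reset_clocks=1):
    phase_reg = Register()   # holds the tuning word (output "q1" in the text)
    acc_reg = Register()     # holds the accumulated phase
    trace = []
    for n in range(clocks):
        reset = n < reset_clocks
        total = adder(phase_reg.q, acc_reg.q)   # 9-bit sum of the current outputs
        acc_reg.clock(total, reset)             # feedback path, carry bit dropped
        phase_reg.clock(tuning_word, reset)
        trace.append(acc_reg.q)
    return trace

print(run_accumulator(40, 10))
```

With a phase register content of 40, the accumulator steps through 40, 80, 120 and so on up
to 240, then wraps to 24 when the 9-bit sum 280 loses its carry bit.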
Figure 4.9 Simulation result of the PA
The next module used in this model is the Look up Table. This module takes a clock,
a reset and an 8-bit address as inputs and has one 8-bit output. The 8-bit address input is
the output of the PA. The LUT contains the sampled amplitude values of one cycle
of the sine wave. The LUT has 2^8 entries, i.e. 256 values, and each entry is 8 bits long.
Therefore, the size of the LUT is 256*8, i.e. 2048 bits.
The contents of the LUT i.e. sampled values of the amplitude of the sine wave are
generated using MATLAB.
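The thesis generates the table contents in MATLAB; the script itself is not shown. As an
illustration only, an equivalent table could be produced as follows in Python. The unsigned
offset-binary encoding (0 to 255 with mid-scale near 128) is an assumption, since the text
does not state how the 8-bit amplitudes are coded, and the function name is hypothetical.

```python
import math

def make_sine_lut(addr_bits=8, data_bits=8):
    """One cycle of a sine wave as unsigned integers (assumed offset-binary coding)."""
    entries = 1 << addr_bits
    full_scale = (1 << data_bits) - 1            # 255 for 8-bit data
    lut = []
    for k in range(entries):
        s = math.sin(2 * math.pi * k / entries)  # between -1.0 and 1.0
        lut.append(round((s + 1.0) / 2.0 * full_scale))
    return lut

lut = make_sine_lut()
size_bits = len(lut) * 8                         # 256 entries of 8 bits each
print(len(lut), size_bits, min(lut), max(lut))
```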
Figure 4.10 Verilog code for the LUT
Figure 4.10 shows the Verilog code of the LUT. This code is synthesizable. When
the reset is high, the LUT is loaded with the sampled values of the amplitude of the sine
wave and the output is 0. When the reset goes low, the value stored at the address generated
by the PA is presented at the output of
the LUT, as shown in Figure 4.11 below.
Figure 4.11 Simulation result of the LUT
Figure 4.12 Top-level with the interconnection of modules
This simulation is performed with a reference clock of 50 MHz because the
XILINX SPARTAN FPGA can run at a maximum clock frequency of 50 MHz. Figure 4.6
shows the RTL schematic of the DDFS. In this simulation an output frequency of 3.1 MHz
is generated with an 8-bit PA and a reference clock of 50 MHz. The frequency tuning word
is calculated to be 15.87, but only the integer part, 15 (binary 00001111), can be
programmed into the Phase Register, so the fractional part is lost. This
introduces some error in the phase calculation, and the output frequency generated is a
little less than the desired frequency. Figure 4.14 shows the PSD plot for this simulation.
The output frequency generated is 2.93 MHz and the SFDR is calculated to be 61.35 dB.
Figure 4.13: ModelSim simulation, f_out = 3.1 MHz
Figure 4.14: PSD plot, f_out = 3.1 MHz
If the same simulation is performed to generate an output frequency of 5 MHz,
keeping the reference clock and PA the same, then the tuning word is calculated to be
25.6, of which the integer part 25 (binary 00011001) is used.
Figure 4.15: ModelSim simulation, f_out = 5 MHz
Figure 4.16: PSD plot, f_out = 5 MHz
The output frequency obtained is 4.887 MHz, which is a little less than the desired
frequency due to the loss of the fractional part of the frequency tuning word. An SFDR of
55.04 dB is found from the PSD plot in Figure 4.16.
4.2.2 DDFS with Phase Truncation and Dither
In the previous simulation, the PA was 8 bits long. Therefore, the LUT had 2^8
entries, i.e. 256 values, and each entry was 8 bits long, so the size of the LUT
was 256*8, i.e. 2048 bits. If the size of the PA is 12 bits, then the LUT will have 2^12
entries. If each entry is 8 bits long, then the size of the LUT will be 2^12 * 8 (i.e. 32768
bits). This shows that, as the length of the PA increases, the size of the LUT increases
exponentially. A larger LUT
reduces the speed of the DDFS, but the more bits there are in the PA, the finer the
frequency resolution: each additional bit in the PA halves the smallest frequency step.
Therefore, there exists a trade-off between size, frequency resolution and
speed. The solution to this problem is phase truncation (PT). With PT, the length of the
PA can be increased without increasing the size of the LUT, but at the expense of the
spectral purity of the output spectrum. In phase truncation, only the most significant bits
(MSBs) of the phase are used to address the LUT. Figure 4.17 shows the RTL schematic
of a DDFS with a Phase Truncation block.
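The trade-off can be put into numbers with a short Python sketch. It assumes the standard
relation that the smallest frequency step of an N-bit accumulator is f_clk / 2^N and uses
the 50 MHz reference clock of the earlier simulations; the function names are illustrative.

```python
def lut_size_bits(addr_bits, data_bits=8):
    """Storage needed for a LUT with 2^addr_bits entries of data_bits each."""
    return (1 << addr_bits) * data_bits

def frequency_resolution(f_clk, acc_bits):
    """Smallest frequency step of an acc_bits-wide phase accumulator."""
    return f_clk / (1 << acc_bits)

for acc_bits in (8, 12):
    # Without truncation the LUT needs as many address bits as the PA has bits
    print(acc_bits, lut_size_bits(acc_bits), frequency_resolution(50e6, acc_bits))

# With phase truncation to 8 address bits, the LUT size no longer depends on the PA length
print(lut_size_bits(8))
```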
Figure 4.17: DDFS with phase truncation
Phase truncation and its effect on the output spectrum of the DDFS have been
examined through this simulation. A 12-bit PA generates a 12-bit phase value, so without
truncation a LUT with 2^12 entries, i.e. 4096 samples, would be required. In this
simulation, an additional
module for PT has been implemented. This module takes the 12-bit phase value
generated by the PA and outputs the 8 MSBs, discarding the 4 LSBs. These 8 most significant
bits of the phase are used to address the LUT. This can be seen in Figure 4.19, where “q2”
represents the output of the PA and “pt” is the truncated phase value.
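In software terms, the truncation amounts to a right shift of the phase word. The Python
sketch below is an illustration rather than the Verilog module of Figure 4.18; the names
q2 and pt follow the signal names in the text, while the function name and the tuning word
of 253 used for the demonstration are choices made here.

```python
def phase_truncate(phase, acc_bits=12, addr_bits=8):
    """Keep the addr_bits most significant bits of an acc_bits-wide phase word."""
    return (phase % (1 << acc_bits)) >> (acc_bits - addr_bits)

q2 = 0
for _ in range(5):
    q2 = (q2 + 253) % 4096          # 12-bit PA with an assumed tuning word of 253
    pt = phase_truncate(q2)
    print(f"q2={q2:012b} pt={pt:08b}")
```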
Figure 4.18: Verilog Code for Phase Truncation Module
Figure 4.19: Simulation result of the PT
To generate an output frequency of 3.1 MHz with a 12-bit PA and a reference
clock of 50 MHz, the frequency tuning word is calculated to be 253.95, of which the integer
part 253 (binary 000011111101) is used.
Figure 4.20: ModelSim simulation with PT
[Plot: The PSD of the output. x-axis: Hz; y-axis: dB (-90 to -10); data cursor at X: 3.128e+006, Y: -5.982]
Figure 4.21: PSD plot with PT
The SFDR from the PSD plot in Figure 4.21 is found to be 19.42 dB.
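For readers who want to experiment with SFDR estimates, the following Python sketch shows
one simple way to measure it: take the FFT of one full period of the output and compare the
largest bin with the next largest. This is not the procedure used to produce the PSD plots
in this chapter. It assumes an ideal floating-point sine instead of an 8-bit LUT and a
record of exactly 4096 samples, so its numbers are not comparable with the figures. All
function names are hypothetical.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def dds_samples(tuning_word, acc_bits, addr_bits, count):
    """DDS output with an ideal sine; the phase is truncated to addr_bits."""
    shift = acc_bits - addr_bits
    phase, out = 0, []
    for _ in range(count):
        address = phase >> shift                 # phase truncation
        out.append(math.sin(2 * math.pi * address / (1 << addr_bits)))
        phase = (phase + tuning_word) % (1 << acc_bits)
    return out

def sfdr_db(samples):
    """Ratio of the strongest bin to the next strongest (DC excluded), in dB."""
    spectrum = [abs(v) for v in fft(samples)[1:len(samples) // 2]]
    peak_index = max(range(len(spectrum)), key=spectrum.__getitem__)
    spur = max(v for i, v in enumerate(spectrum) if i != peak_index)
    return 20 * math.log10(spectrum[peak_index] / max(spur, 1e-300))

sfdr_truncated = sfdr_db(dds_samples(253, 12, 8, 4096))   # 12-bit PA, 8-bit address
sfdr_full = sfdr_db(dds_samples(253, 12, 12, 4096))       # no truncation
print(f"truncated: {sfdr_truncated:.1f} dB, untruncated: {sfdr_full:.1f} dB")
```

In this idealised setting the untruncated output is limited only by floating-point rounding,
so any drop in SFDR for the truncated case is due to phase truncation alone.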
As indicated in previous chapters, phase truncation introduces amplitude errors,
and these errors are reflected as spurs in the output spectrum. To suppress these spurs a
random bit sequence, called dither, is added to the PA output before truncation. This
gives a better SFDR than phase truncation alone.
Figure 4.22: DDFS with Phase Truncation and Dither Generator
The dither generator implemented in this simulation generates a 3-bit random
number. This random number is added to the output of the PA. Because the fractional part of
the tuning word is discarded, the phase generated by the PA contains an error, and this error
is periodic. Adding a random number to this phase value breaks the periodicity.
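The effect of adding dither before truncation can be sketched in Python as follows. This is
not the random number generator of Figure 4.23: Python's pseudo-random generator stands in
for it, and the dither is assumed to be aligned with the least significant bits of the
12-bit phase word. Function and variable names are illustrative.

```python
import random

def dithered_address(phase, dither, acc_bits=12, addr_bits=8):
    """Add dither to the phase, wrap to acc_bits, then keep the addr_bits MSBs."""
    total = (phase + dither) % (1 << acc_bits)
    return total >> (acc_bits - addr_bits)

rng = random.Random(1)               # fixed seed so the run is repeatable
phase = 0
for _ in range(8):
    phase = (phase + 253) % 4096     # 12-bit PA, assumed tuning word of 253
    x = rng.randrange(8)             # 3-bit random number, 0..7
    print(phase >> 4, dithered_address(phase, x))
```

The dithered address is either equal to the plain truncated address or one step ahead of
it, depending on the random number, which is what breaks up the regular error pattern.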
Figure 4.23: Verilog code for the random number generator
Figure 4.24: Simulation result of the PA, PT and Dither
Figure 4.24 above shows the simulation result of the PA, PT and Dither
modules connected together. The output of the PA is shown by signal “q2”, the
truncated phase by “pt” and the dither by “x”.
[Plot: The PSD of the output. x-axis: Hz; y-axis: dB (-90 to -10); data cursor at X: 3.107e+006, Y: -5.538]
Figure 4.25: PSD Plot with PT and Dither
From the PSD plot in Figure 4.25, the SFDR is found to be 29.10 dB. As mentioned
earlier, PT introduces spurs in the output spectrum, and the addition of dither helps in
suppressing these spurs to some extent.
These Verilog models are user friendly and synthesizable on an FPGA. One can
easily modify them according to the requirements of the application and look
into different aspects of the DDFS.
4.3 Cadence simulation of DDFS with Clock Jitter
In this model, the effect of clock jitter on the output spectrum is analyzed. In the
real world, inputs are not ideal and are always associated with some kind of noise. In the
time domain, noise on clocks or input data is known as jitter. The DDFS has been
modeled in VerilogA in the same way as it was modeled in Section 4.2. The VerilogA
code can be seen in the Appendix. This section demonstrates the use of behavioral
modeling to generate these non-ideal signals.
Jitter can be defined as the deviation of the significant instants of a signal from
their ideal location in time. To put it more simply, jitter is how early or late a signal
transition is with reference to when it should transition. In a digital signal, the significant
instants are the transition (crossover) points.
Figure 4.26: Random number and Clock Jitter Generator
The DDFS requires a reference clock input. In order to perform system-level
verification or simulation, a certain amount of jitter is introduced. This jitter typically has
a Gaussian distribution. In this design, jitter is produced by generating a random number
using the random distribution function of VerilogA and then combining this random number
with the ideal clock. This causes the clock transitions to occur at randomly shifted times.
The VerilogA function used for this purpose is
$rdist_normal (seed, mean, standard_deviation);
This function takes three parameters: a seed value, a mean and a standard deviation. The
mean parameter causes the average value returned by the
function to approach the value specified. The standard deviation parameter
determines the shape of the density
function: larger values of the standard deviation spread the returned values over a
wider range. With a mean of 0 and a standard deviation of 1, $rdist_normal generates a
standard Gaussian distribution.
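The idea of displacing clock edges by a Gaussian random number can be illustrated outside
VerilogA. The Python sketch below uses the standard library's random.Random.gauss in place
of the VerilogA distribution function, and the 100 ps RMS jitter is an arbitrary value
chosen for the example; neither comes from the thesis, and the function name is
hypothetical.

```python
import random

def jittered_edges(f_clk, n_edges, sigma, seed=0):
    """Rising-edge times of a clock whose edges are displaced by Gaussian jitter."""
    rng = random.Random(seed)
    period = 1.0 / f_clk
    return [k * period + rng.gauss(0.0, sigma) for k in range(n_edges)]

ideal = jittered_edges(100e6, 5, sigma=0.0)
noisy = jittered_edges(100e6, 5, sigma=100e-12)   # 100 ps RMS, an assumed value
for t0, t1 in zip(ideal, noisy):
    print(f"ideal {t0 * 1e9:7.3f} ns   jittered {t1 * 1e9:7.3f} ns")
```

A standard deviation of zero reproduces the ideal clock, and a larger standard deviation
spreads the edges further from their ideal positions.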
Figure 4.27: VerilogA modules of Random Number and Clock Generator
Figure 4.28 shows the random number generated using the random function of VerilogA.
Figure 4.28: Output of Random number generator
This random number is used to vary the delay time of the clock. When this
random number is combined with the ideal clock, it randomly changes the delay time of
the clock pulse, which introduces jitter in the clock. Figure 4.29 shows the ideal clock.
The jittered clock can be seen in Figure 4.30 below. As shown, the delay time of the
jittered clock, represented in green, differs from that of the ideal clock, represented in
black.
Figure 4.29: Ideal Clock, f_clk = 100 MHz
Figure 4.30: Ideal Clock, Jittered Clock and Random Number
Figure 4.31: Jitter
The amount of jitter induced in the ideal clock is shown in Figure 4.31. The
amount of jitter can be varied by making very small changes in the clock module.
Figure 4.32: DDFS with ideal clock
Figure 4.33: DDFS with jittered clock
Figure 4.32 and Figure 4.33 show the DDFS with the ideal clock and the non-ideal clock,
respectively. Figure 4.34 shows the output of the PA when clocked with the ideal clock.
Figure 4.35 shows the output of the PA when clocked with the jittered clock.
Figure 4.34: Phase Accumulator with Ideal Clock
Figure 4.35: Phase Accumulator with jittered Clock
As shown in Figure 4.34 and Figure 4.35, the PA overflows uniformly with the
ideal clock but non-uniformly with the jittered clock. This non-uniform overflow of the PA
in Figure 4.35 is due to the jitter present in the clock: with the jittered clock,
the PA is updated at non-uniform intervals. When this PA output is fed to the input of
the LUT, the LUT outputs its samples at non-uniformly spaced times, and this gives rise to
spurs in the output spectrum of the DDFS.
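The size of the amplitude error caused by sampling at displaced instants can be estimated
with a small Python experiment. This is a simplified illustration, not the Cadence
simulation: it compares an ideal sine evaluated at the ideal and at the jittered sample
times, with Gaussian jitter drawn from Python's random module. The function name and the
jitter values are assumptions made for the example.

```python
import math
import random

def jitter_rms_error(f_out, f_clk, sigma, n_samples=1000, seed=0):
    """RMS difference between a sine sampled at ideal and at jittered instants."""
    rng = random.Random(seed)
    acc = 0.0
    for k in range(n_samples):
        t_ideal = k / f_clk
        t_real = t_ideal + rng.gauss(0.0, sigma)      # displaced sampling instant
        err = (math.sin(2 * math.pi * f_out * t_real)
               - math.sin(2 * math.pi * f_out * t_ideal))
        acc += err * err
    return math.sqrt(acc / n_samples)

for sigma in (0.0, 1e-12, 100e-12):                   # assumed RMS jitter values
    print(sigma, jitter_rms_error(7.5e6, 100e6, sigma))
```

With no jitter the error is zero, and it grows as the standard deviation of the jitter
increases.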
To generate an output frequency of 7.5 MHz with a reference clock frequency
of 100 MHz, a frequency tuning word (M) of 19 is stored in the Phase Register. In this
design, the Phase Accumulator is 8 bits wide. This control word M is added to the
previous value of the PA with each clock update.
Figure 4.36: Output spectrum of DDFS with ideal clock, f_out = 7.5 MHz