Chap 6


Chapter 6. Quantization

Converting an analog signal to a digital signal requires three steps. First, the signal is passed through a lowpass filter to prevent aliasing. Second, the signal is sampled by a sample-and-hold circuit. Finally, the samples are quantized by an analog-to-digital converter (ADC) so that they can be represented in digital form, as shown in Figure 6.1.

[Figure 6.1. Typical analog-to-digital conversion process: x(t) (analog signal) → anti-aliasing filter (LPF) → sample-and-hold circuit → x(n) (discrete-time signal) → A/D converter → $\hat{x}(n)$ (digital signal)]

Many quantization techniques are available. This chapter describes linear quantization, nonlinear quantization, delta modulation, and sigma-delta modulation. It also explains efficient methods such as adaptive quantization and differential quantization, and describes Adaptive Differential Pulse Code Modulation (ADPCM).

6.1 Linear Quantization

In linear (or uniform) quantization, the quantization step size is fixed: the same step size is used regardless of the instantaneous signal amplitude. Linear quantization with $2^m$ quantization levels is shown in Figure 6.2, where ∆ is the quantization step size and m is the number of bits in a quantization word.

[Figure 6.2. A constant quantization step size ∆ is used for linear quantization (m is the number of bits used for quantization). The sample value in volts is mapped to the levels ±∆/2 (softest), ±3∆/2, ..., ±$(2^m-3)$∆/2, ±$(2^m-1)$∆/2; the positive and negative peak values are ±$2^m$∆/2 (loudest).]


For example, if the peak-to-peak value of a signal is 4 [V] and m = 3, then the signal may be quantized according to the rule shown in Figure 6.3.

Sample value before quantization [V]   Quantization level [V]   Digital value
 1.5 to  2.0                            1.75                     111
 1.0 to  1.5                            1.25                     110
 0.5 to  1.0                            0.75                     101
 0   to  0.5                            0.25                     100
-0.5 to  0                             -0.25                     011
-1.0 to -0.5                           -0.75                     010
-1.5 to -1.0                           -1.25                     001
-2.0 to -1.5                           -1.75                     000

Figure 6.3. Three-bit linear quantization example. (Codes are assigned in offset-binary order, from 000 for the most negative level to 111 for the most positive level.)

In this case ∆ = 0.5 [V]. A value between 0 and 0.5 is approximated (or quantized) by the quantization level ∆/2 (0.25) and encoded as 100. A value between 0.5 and 1.0 is quantized to the level 3∆/2 (0.75) and encoded as 101. Likewise, a value between −2 and −1.5 is quantized to the level −7∆/2 (−1.75) and encoded as 000, and so on. The process of sampling, quantization, and encoding is referred to as pulse code modulation (PCM). Keeping the step size fixed is especially important for high-fidelity digital audio. One notable example is the compact disc (CD) format, which uses 16-bit linear quantization, so that the range between the negative peak and the positive peak is divided into $2^{16}$ (65,536) uniform quantization levels.

6.2 Quantization Noise and SQNR

Suppose that the signal to be quantized has a peak-to-peak value of 2V [V] and that the number of bits in a quantization word is m. With $2^m$ quantization levels, the quantization step size is given by

$$\Delta = \frac{2V}{2^m}. \tag{6.1}$$
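The uniform quantizer of Section 6.1, with the step size of equation (6.1), can be sketched in C as follows. This is a minimal illustration rather than a reference implementation: the function names `quantize_index` and `reconstruct_level` are hypothetical, and inputs outside the range −V to V are assumed to be clipped to the outermost levels.

```c
#include <assert.h>

/* Hypothetical helper: returns the index k (0 .. 2^m - 1) of the quantization
   interval that contains x, for a quantizer spanning -V .. +V with m bits.
   Assumption: out-of-range inputs are clipped to the outermost intervals. */
int quantize_index(double x, double V, int m)
{
    assert(V > 0.0 && m >= 1 && m <= 16);
    int levels = 1 << m;                   /* 2^m quantization levels */
    double delta = 2.0 * V / levels;       /* step size, equation (6.1) */
    double t = (x + V) / delta;            /* position in steps, measured from -V */

    if (t < 0.0)
        return 0;                          /* below the negative peak */
    if (t >= (double)levels)
        return levels - 1;                 /* at or above the positive peak */
    return (int)t;                         /* truncation equals floor because t >= 0 */
}

/* Hypothetical helper: returns the quantization level (interval midpoint)
   that represents index k. */
double reconstruct_level(int k, double V, int m)
{
    assert(V > 0.0 && m >= 1 && m <= 16);
    double delta = 2.0 * V / (1 << m);
    return -V + (k + 0.5) * delta;
}
```

With V = 2 and m = 3 (the example of Figure 6.3), an input of 0.3 V falls in interval 4, which is 100 in binary, and is reconstructed as 0.25 V.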

As shown in Figure 6.1, the quantizer input is x(n) and the output is $\hat{x}(n)$. The output can be expressed by the following equation:

$$\hat{x}(n) = x(n) + e(n) \tag{6.2}$$

where e(n) is termed the quantization noise or quantization error. The noise sequence e(n) is assumed to be uncorrelated with the sequence x(n). After quantization, the actual values of the noise are lost, but its statistics are known: the noise lies between −∆/2 and ∆/2 and is uniformly distributed. The probability density function of the noise e(n) is shown in Figure 6.4.

[Figure 6.4. Probability density function $p_e(e)$ of the quantization noise e(n): constant between e = −∆/2 and e = ∆/2]

The mean of the noise is

$$\langle e(n) \rangle = m_e = \int_{-\infty}^{\infty} e\,p_e(e)\,de = \int_{-\Delta/2}^{\Delta/2} e\,\frac{1}{\Delta}\,de = 0. \tag{6.3}$$

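The noise power (which is the same as the variance because the mean is zero in this case) is given by

$$\langle e^2(n) \rangle = \sigma_e^2 = \int_{-\infty}^{\infty} e^2\,p_e(e)\,de = \int_{-\Delta/2}^{\Delta/2} e^2\,\frac{1}{\Delta}\,de = \frac{\Delta^2}{12} = \frac{(2V)^2}{12 \cdot 2^{2m}}. \tag{6.4}$$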
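The result ∆²/12 of equation (6.4) can be checked numerically. The sketch below sweeps the input over a fine uniform grid, which stands in for a uniformly distributed signal, and accumulates the mean and power of the quantization error. The function name and the grid-based sweep are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: sweeps x over a fine uniform grid covering -V .. +V,
   quantizes each point to the midpoint of its interval, and returns the mean
   and the power of the error e = xq - x through the output pointers. */
void quantization_noise_stats(double V, int m, int points_per_step,
                              double *mean, double *power)
{
    assert(V > 0.0 && m >= 1 && m <= 12 && points_per_step > 0);
    int levels = 1 << m;
    double delta = 2.0 * V / levels;              /* equation (6.1) */
    long total = (long)levels * points_per_step;
    double sum = 0.0, sum_sq = 0.0;

    for (long i = 0; i < total; i++) {
        /* grid points sit strictly inside the quantization intervals */
        double x = -V + (i + 0.5) * (2.0 * V / total);
        int k = (int)((x + V) / delta);           /* interval index */
        double xq = -V + (k + 0.5) * delta;       /* interval midpoint */
        double e = xq - x;                        /* quantization error */
        sum += e;
        sum_sq += e * e;
    }
    *mean = sum / total;
    *power = sum_sq / total;
}
```

For V = 2 and m = 3 (so ∆ = 0.5), the measured mean is essentially zero and the measured power is close to ∆²/12 ≈ 0.0208, in agreement with equations (6.3) and (6.4).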

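The signal to quantization noise ratio (SQNR) in dB is given by

$$\begin{aligned}
\mathrm{SQNR} &= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) = 10\log\left(\frac{2^{2m}\,\sigma_x^2}{(2V)^2/12}\right) \\
&= 10\log\left(2^{2m}\right) + 10\log\left(\frac{\sigma_x^2}{V^2/3}\right) \\
&= 6m + 10\log\left(\frac{\sigma_x^2}{V^2/3}\right)\ \text{[dB]}
\end{aligned} \tag{6.5}$$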

where $\sigma_x^2$ is the signal power (or the variance, assuming that the mean is zero). The SQNR in dB increases linearly with the number of bits m in the ADC: each extra bit of resolution improves the SQNR by about 6 dB (more precisely, 20 log 2 ≈ 6.02 dB). Let us assume that the signal is uniformly distributed between −V and V. The signal power is given by

$$\sigma_x^2 = \frac{(2V)^2}{12} = \frac{V^2}{3}. \tag{6.6}$$

Thus, the SQNR of a signal that is uniformly distributed between the negative and positive peak values becomes

$$\mathrm{SQNR}_{\text{uniform}} = 6m\ \text{[dB]}. \tag{6.7}$$

The dynamic range of the ADC is a measure of the range of input amplitudes for which the ADC produces a positive SQNR. When the signal is uniformly distributed between the negative and the positive peak values, the dynamic range is defined as the ratio of the loudest amplitude to the softest one. The dynamic range in decibels is given by

$$\mathrm{DR}_{\text{uniform}} = 20\log_{10}\left(\frac{2^m\Delta/2}{\Delta/2}\right) = 20\log_{10} 2^m = 6m\ \text{[dB]}. \tag{6.8}$$

Note that the SQNR and the dynamic range are identical in this case. Each extra bit increases the dynamic range by 6 dB. For the CD format, the dynamic range is about 16 × 6 = 96 dB. In comparison, the dynamic range of a typical cassette player is about 60 dB and the dynamic range of the audio part of a typical hi-fi VHS VCR is about 80 dB. For sinusoidal inputs, the dynamic range of the ADC is defined as the ratio of the signal power of a full-scale sinusoid to the signal power of a small sinusoidal input that results in an SQNR of 0 dB. The signal power of a full-scale sinusoid is $V^2/2$. For an SQNR of 0 dB, the smallest sinusoidal signal power equals the noise power, $\Delta^2/12$. Thus, the dynamic range for a sinusoidal input is the same as the SQNR when the signal is a sinusoid with amplitude V:

$$\mathrm{DR}_{\text{sinusoid}} = \mathrm{SQNR}_{\text{full-scale sinusoid}} = 6m + 10\log\left(\frac{V^2/2}{V^2/3}\right) = 6m + 1.76\ \text{[dB]}. \tag{6.9}$$

The dynamic range is therefore the peak SQNR of the ADC for a sinusoidal input.

6.3 Nonlinear Quantization

When people talk over the telephone, they seldom yell into the mouthpiece all the time. Thus, speech sample values are mostly concentrated in the soft or medium range of amplitude. For efficient quantization of a speech signal, a nonuniform quantization step size is often used: a smaller step size for softer sounds and a bigger step size for louder sounds. A small difference may be noticeable when the sound is soft, but the same difference may not be noticeable when the sound is loud. Two methods, the µ-law and the A-law, are widely used for nonlinear quantization in telephone systems. The µ-law is used in North America and the A-law is used in Europe. In both cases, the sample is first quantized linearly and then compressed according to a nonlinear compression rule. The µ-law compresses a 13- to 14-bit linearly quantized speech sample to an 8-bit word, whereas the A-law compresses a 12-bit linearly quantized speech sample to an 8-bit word. At the receiver, the compressed words are expanded to reconstruct the original speech. A special chip that performs both the compression and the expansion is called a compander. Figure 6.5 shows the 8-bit representation used in A-law nonlinear quantization. The first bit is the sign bit, the next three bits are the segment identifier, which can vary between 0 and 7, and the last four bits represent a number between 0 and 15.

[Figure 6.5. 8-bit word for A-law nonlinear quantization: sign bit (s) | segment identifier (3 bits) | A B C D]

Table 6.1 summarizes the compression rule and the expansion rule for A-law quantization.

Segment   12-bit original    8-bit compressed   12-bit expanded
0         s000 0000 ABCD     s000 ABCD          s000 0000 ABCD
1         s000 0001 ABCD     s001 ABCD          s000 0001 ABCD
2         s000 001A BCDx     s010 ABCD          s000 001A BCD1
3         s000 01AB CDxx     s011 ABCD          s000 01AB CD10
4         s000 1ABC Dxxx     s100 ABCD          s000 1ABC D100
5         s001 ABCD xxxx     s101 ABCD          s001 ABCD 1000
6         s01A BCDx xxxx     s110 ABCD          s01A BCD1 0000
7         s1AB CDxx xxxx     s111 ABCD          s1AB CD10 0000

TABLE 6.1. A-law compression and expansion rule (x denotes a don't-care bit, which can be either 0 or 1)

As the segment identifier increases, the sample amplitude increases, that is, the sound gets louder. For louder signals, the effective quantization step size increases. For example, a sample with segment identifier 2 has a quantization step size twice as large as that of a sample with segment identifier 1 or 0. The µ-law is very similar to the A-law but is more complicated. The dynamic range of A-law PCM is 72 dB (12 × 6 dB, as for the 12-bit linear samples it encodes), even though it uses only eight bits per sample. Eight bits in linear quantization would give a dynamic range of only 48 dB.
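The rules of Table 6.1 translate directly into bit operations. The following C sketch is one possible reading of the table, not the text of a telephony standard: the function names are hypothetical, and the 12-bit sample is assumed to be stored in the low 12 bits of an unsigned integer with the sign bit s in bit 11.

```c
#include <assert.h>

/* Compress a 12-bit sign-magnitude sample (bit 11 = sign s, bits 10..0 =
   magnitude) into the 8-bit word s | segment (3 bits) | ABCD of Table 6.1. */
unsigned alaw_compress(unsigned sample12)
{
    assert(sample12 <= 0xFFFu);
    unsigned sign = (sample12 >> 11) & 1u;
    unsigned mag = sample12 & 0x7FFu;
    unsigned seg, abcd;

    if (mag < 16u) {                        /* segment 0: s000 0000 ABCD */
        seg = 0u;
        abcd = mag;
    } else {                                /* segments 1..7: leading 1, then ABCD */
        seg = 1u;
        while ((mag >> (seg + 4u)) != 0u)
            seg++;                          /* leading 1 sits at bit seg + 3 */
        abcd = (mag >> (seg - 1u)) & 0xFu;  /* the x (don't-care) bits are dropped */
    }
    return (sign << 7) | (seg << 4) | abcd;
}

/* Expand an 8-bit word back to a 12-bit sign-magnitude sample. */
unsigned alaw_expand(unsigned code8)
{
    assert(code8 <= 0xFFu);
    unsigned sign = (code8 >> 7) & 1u;
    unsigned seg = (code8 >> 4) & 7u;
    unsigned abcd = code8 & 0xFu;
    unsigned mag;

    if (seg == 0u)
        mag = abcd;
    else if (seg == 1u)
        mag = 16u | abcd;
    else                                    /* restore leading 1, append 1 0...0 */
        mag = ((16u | abcd) << (seg - 1u)) | (1u << (seg - 2u));
    return (sign << 11) | mag;
}
```

For example, a negative sample of magnitude 1000 (0xBE8) lies in segment 6 and compresses to 0xEF; expansion returns magnitude 1008 (0xBF0), the midpoint of the 32-wide interval that contains 1000.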


6.4 Delta Modulation (DM)

Delta modulation is obtained from a staircase approximation $x_q(t)$ of the continuous-time signal x(t), as shown in Figure 6.6. ∆ is the step size of the staircase approximation and T is the sampling interval.

[Figure 6.6. Staircase approximation $x_q(t)$ of the continuous-time signal x(t) for delta modulation, with sampling instants T, 2T, ..., 15T on the time axis and amplitude steps of ∆]

As in Figure 6.6, the initial value of $x_q(t)$ is zero at t = 0. This value is held for T seconds and compared with the actual signal value at t = T. Because x(T) is greater than $x_q(0)$, $x_q(t)$ is increased by ∆ at t = T to follow x(t) as closely as possible. The same thing happens at t = 2T and t = 3T. At t = 4T, $x_q(3T)$ is compared with x(4T). Because x(4T) is smaller than $x_q(3T)$, $x_q(t)$ is decreased by ∆ at t = 4T. This process continues. For the staircase approximation to follow the original signal closely, the sampling interval T must be kept small, so the sampling rate is typically much higher than twice the highest frequency of x(t). This is called oversampling. The staircase approximation itself is not sent or stored. Instead, the information about the increment or decrement at each sampling instant is transmitted or stored. Only one bit is required to quantize this information, as in Figure 6.7.

[Figure 6.7. Delta modulation and demodulation. (a) Modulator: the difference $x(t) - x_q(t-T)$ is formed at a summing node and passed to a one-bit quantizer, which produces the binary DM signal; a decoder (±∆) and an accumulator in the feedback path generate $x_q(t-T)$. (b) Demodulator: the DM signal passes through a decoder (±∆), an accumulator that produces $x_q(t)$, and a lowpass filter that produces the reconstructed analog signal $\hat{x}(t)$.]

If an increment is denoted by 1 and a decrement by 0, then the example in Figure 6.6 produces the one-bit quantizer output (or DM output) shown in Table 6.2. At the receiver, the staircase approximation is reconstructed from the binary signal, and the staircase signal is then smoothed by the lowpass filter.

Samp. inst.   T    2T   3T   4T   5T   6T   7T   8T   9T   10T  11T  12T  13T  14T
Change        +∆   +∆   +∆   -∆   -∆   -∆   -∆   -∆   +∆   +∆   +∆   +∆   -∆   -∆
DM output     1    1    1    0    0    0    0    0    1    1    1    1    0    0

TABLE 6.2. Output binary sequence of the delta modulator

6.5 Oversampling

There are two main reasons why oversampling is used in A/D conversion. First, with oversampling the analog anti-aliasing filter does not need a sharp cutoff characteristic; a simple passive RC filter can be used instead of an op-amp based active filter. Second, the quantization noise is reduced because it is spread over a wider band. Figure 6.8 shows the simplified Fourier spectrum of a sampled signal whose sampling frequency is slightly greater than twice the highest frequency of the analog signal. It also shows the required lowpass filter response for anti-aliasing.

[Figure 6.8. Fourier spectrum of the sampled signal (sampling frequency: F1) and the required analog lowpass filter response; frequency axis f [Hz] from −5F1 to 5F1]


Now suppose that the same analog signal is sampled at the rate F2, which is four times F1. The Fourier spectrum of the sampled signal is shown in Figure 6.9. The required lowpass filter response for anti-aliasing now has a much wider transition band, so a simple passive lowpass filter may be good enough.

[Figure 6.9. Fourier spectrum of the sampled signal (sampling frequency: F2 = 4F1) and the required analog lowpass filter response; frequency axis f [Hz] from −4F1 (−F2) to 4F1 (F2)]

After sampling, each sample is quantized by an ADC. The oversampled digital signal must be downsampled later, since we do not want to keep more samples than necessary. Before downsampling, the oversampled signal must be passed through a digital lowpass filter for anti-aliasing. This digital filter is usually an FIR filter to ensure linear phase. The combined operation of lowpass filtering and downsampling is known as decimation.

Quantizing each sample with an ADC introduces quantization noise. Let the quantization noise power be $\sigma_e^2$. The noise power is the same regardless of the sampling rate, as long as the ADC resolution is fixed. However, its frequency distribution depends on the sampling rate. Because the quantization noise is assumed to be white, the noise power is uniformly distributed between $-F_s/2$ and $F_s/2$, where $F_s$ is the sampling frequency. Thus, the power spectral density is given by

$$P_e(f) = \sigma_e^2 / F_s\ \text{[W/Hz]}. \tag{6.10}$$

Figure 6.10 shows the power spectral density (PSD) of the quantization noise for two different sampling frequencies.

[Figure 6.10. Quantization noise power spectral density $P_e(f)$ for two PCM conversions: height $\sigma_e^2/F_1$ between $-F_1/2$ and $F_1/2$ for sampling frequency F1, and height $\sigma_e^2/F_2$ between $-F_2/2$ and $F_2/2$ for sampling frequency F2 = 4F1. The area of each rectangle is $\sigma_e^2$.]

If the digital lowpass filter used for decimation has the cutoff frequency F1/2, then the decimated signal has one fourth of the quantization noise power of the original oversampled signal, because the noise power outside the passband is eliminated. In general, the in-band noise power (after lowpass filtering) is given by

$$\sigma_e^2\,\frac{F_1}{F_2} = \frac{\sigma_e^2}{2^r} \tag{6.11}$$

where $2^r = F_2/F_1$ is termed the oversampling ratio. Thus, the effective SQNR is

$$\mathrm{SQNR} = 10\log\left(\frac{\sigma_x^2}{\sigma_e^2\,F_1/F_2}\right) = 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) + 3r\ \text{[dB]}. \tag{6.12}$$

For every doubling of the oversampling ratio, i.e., for every unit increment in r, the SQNR improves by 3 dB, or equivalently the resolution improves by half a bit.

6.6 Sigma-Delta Modulation (One-bit Quantization)

The delta modulator described in Section 6.4 is reproduced below. Note that the signals are now in the discrete-time domain for convenience.

[Figure 6.11. Delta modulator and demodulator in the discrete-time domain. (a) Modulator: the difference $x(n) - x_q(n-1)$ is passed to a one-bit ADC, which produces the binary DM signal; a DAC (±∆) and an accumulator in the feedback path generate $x_q(n-1)$. (b) Demodulator: the DM signal passes through a DAC (±∆), an accumulator that produces $x_q(n)$, and a lowpass filter that produces the reconstructed signal $\hat{x}(n)$.]
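The discrete-time delta modulator and demodulator of Figure 6.11 can be sketched in C as follows. The function names are hypothetical, the staircase is assumed to start at zero, a tie between x(n) and $x_q(n-1)$ is arbitrarily encoded as a decrement, and the final lowpass filter of the demodulator is omitted.

```c
#include <assert.h>

/* Modulator: bits[i] is 1 for an increment (+delta) and 0 for a decrement. */
void dm_encode(const double *x, int n, double delta, unsigned char *bits)
{
    assert(n >= 0 && delta > 0.0);
    double xq = 0.0;                       /* staircase value xq(n-1), initially zero */
    for (int i = 0; i < n; i++) {
        bits[i] = (x[i] > xq) ? 1 : 0;     /* one-bit quantizer on x(n) - xq(n-1) */
        xq += bits[i] ? delta : -delta;    /* accumulator in the feedback path */
    }
}

/* Demodulator without the lowpass filter: rebuilds the staircase xq(n). */
void dm_decode(const unsigned char *bits, int n, double delta, double *xq)
{
    assert(n >= 0 && delta > 0.0);
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        acc += bits[i] ? delta : -delta;   /* DAC (+/- delta) followed by accumulator */
        xq[i] = acc;
    }
}
```

With a step size of 0.25, the input samples 0.3, 0.6, 0.9, 0.5, 0.1 produce the bits 1, 1, 1, 0, 0, and the decoder rebuilds the staircase 0.25, 0.5, 0.75, 0.5, 0.25.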


Because the accumulator in the demodulator is a linear system, it can be moved in front of the modulator without changing the overall performance.

[Figure 6.12. Modified delta modulator and demodulator in the discrete-time domain. (a) Modulator: the input x(n) passes through an accumulator before the summing node and the one-bit ADC; the feedback path contains a DAC and an accumulator. (b) Demodulator: a DAC followed by a lowpass filter produces the reconstructed signal $\hat{x}(n)$.]

Because the two accumulators in Figure 6.12 are linear, they can be combined into a single accumulator placed just before the one-bit ADC, as shown in Figure 6.13. The resulting modulator is referred to as the sigma-delta (Σ-∆) or delta-sigma modulator.

[Figure 6.13. Sigma-delta modulator A/D system: the difference x(n) − y(n−1) is accumulated and quantized by a one-bit ADC to give the binary output y(n); a DAC feeds y(n−1) back to the summing node.]
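The loop of Figure 6.13 can be simulated in a few lines of C. This is a sketch under stated assumptions: the function name is hypothetical, the one-bit output is represented as +1 or −1, the input is assumed to lie between −1 and +1, and both the accumulator and the fed-back output start at zero.

```c
#include <assert.h>

/* Hypothetical helper: first-order sigma-delta modulator.
   x[] holds the input samples (assumed within -1 .. +1);
   y[] receives the one-bit output as +1 or -1. */
void sigma_delta_first_order(const double *x, int n, int *y)
{
    assert(n >= 0);
    double acc = 0.0;                    /* accumulator state */
    int y_prev = 0;                      /* y(n-1), assumed zero before the first sample */
    for (int i = 0; i < n; i++) {
        acc += x[i] - y_prev;            /* accumulate x(n) - y(n-1) */
        y[i] = (acc >= 0.0) ? 1 : -1;    /* one-bit ADC */
        y_prev = y[i];                   /* DAC feeds the output back */
    }
}
```

For a constant input of 0.5, the output is a stream of +1 and −1 values whose average approaches 0.5. A lowpass filter applied to the bit stream recovers this average.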

Let $x_i(n)$ be the input and $x_o(n)$ be the output of the accumulator. The input-output relation of the accumulator in the time domain is given by

$$x_o(n) = x_o(n-1) + x_i(n). \tag{6.13}$$

In the z-transform domain, the relation becomes

$$X_o(z) = z^{-1}X_o(z) + X_i(z). \tag{6.14}$$

The transfer function of the accumulator is

$$H(z) = \frac{X_o(z)}{X_i(z)} = \frac{1}{1 - z^{-1}}. \tag{6.15}$$

An equivalent block diagram of the Σ-∆ modulator is obtained by replacing the accumulator with its transfer function and the ADC with the additive quantization noise model, as shown in Fig. 6.15.

[Fig. 6.15. Equivalent sigma-delta modulator A/D system: the difference between x(n) and the delayed output y(n−1) passes through $1/(1 - z^{-1})$; the quantization noise e(n) is added to give y(n), which is fed back through a delay $z^{-1}$.]

One can show that

$$Y(z) = \frac{1}{1 - z^{-1}}\left[X(z) - z^{-1}Y(z)\right] + E(z) \tag{6.16}$$

where X(z), Y(z), and E(z) are the z-transforms of x(n), y(n), and e(n), respectively. By simplifying the above equation, the following is obtained:

$$Y(z) = X(z) + (1 - z^{-1})E(z). \tag{6.17}$$

Note that the resulting noise is given by

$$N(z) = (1 - z^{-1})E(z). \tag{6.18}$$

The noise transfer function, $H_n(z)$, is therefore

$$H_n(z) = 1 - z^{-1}.$$

The frequency response is given by

$$H_n(\theta) = 1 - e^{-j\theta} = 2j\,e^{-j\theta/2}\,\frac{e^{j\theta/2} - e^{-j\theta/2}}{2j} = 2j\,e^{-j\theta/2}\sin\left(\frac{\theta}{2}\right).$$

The magnitude response is

$$|H_n(\theta)| = 2\left|\sin\left(\frac{\theta}{2}\right)\right|. \tag{6.19}$$

Because the sampling frequency is F2, the magnitude response can be given in terms of the analog frequency f as (remember that θ = ω/F2 = 2πf/F2)

$$|H_n(f)| = 2\left|\sin\left(\frac{\pi f}{F_2}\right)\right|. \tag{6.20}$$

Note:

Let H(f) be the transfer function of a linear system and $P_i(f)$ be the power spectral density of the input of the system. The power spectral density of the output of the system is given by

$$P_o(f) = |H(f)|^2\,P_i(f).$$

Thus, the power spectral density of the noise is given by

$$P_n(f) = |H_n(f)|^2\,\frac{\sigma_e^2}{F_2} = \frac{2\sigma_e^2}{F_2}\left[1 - \cos\left(\frac{2\pi f}{F_2}\right)\right]. \tag{6.21}$$

The power spectral density of the noise is shown in Fig. 6.16. Note that the gain at DC is zero and that large attenuation is achieved at low frequencies. The noise is amplified at higher frequencies, but the high frequencies are removed by a digital lowpass filter. The in-band noise power is obtained by calculating the shaded area and is given by

$$\sigma_n^2 = \int_{-0.5F_1}^{0.5F_1} P_n(f)\,df. \tag{6.22}$$

[Figure 6.16. Power spectral density of the noise, plotted for f between $-0.5F_2$ and $0.5F_2$ with a vertical axis from 0 to $4\sigma_e^2/F_2$. The shaded area between $-0.5F_1$ and $0.5F_1$ is the in-band noise power.]

The in-band noise power at the output of a Σ-∆ modulator is approximately given by

$$\sigma_n^2 = \sigma_e^2\,\frac{\pi^2}{3}\left(\frac{F_1}{F_2}\right)^3 \quad \text{for } F_1 \ll F_2. \tag{6.23}$$

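The SQNR in dB is

$$\begin{aligned}
\mathrm{SQNR} &= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^2}{3}\right) + 30\log\left(\frac{F_2}{F_1}\right) \\
&= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^2}{3}\right) + 9r\ \text{[dB]}.
\end{aligned} \tag{6.24}$$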

For every doubling of the oversampling ratio, i.e., for every unit increment in r, the SQNR improves by 9 dB, or equivalently the resolution improves by 1.5 bits. To increase the resolution further, a second-order sigma-delta modulator such as the one shown in Fig. 6.17 can be used.

[Figure 6.17. Second-order sigma-delta modulator: block diagram with input x(n), summing nodes, $z^{-1}$ delay elements, additive quantization noise e(n), output y(n), and a DAC in the feedback path]

In this case, the SQNR becomes

$$\mathrm{SQNR} = 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^4}{5}\right) + 15r\ \text{[dB]}. \tag{6.25}$$

For every doubling of the oversampling ratio, the SQNR improves by 15 dB. To achieve an SQNR of about 96 dB, as in 16-bit linear quantization, the oversampling ratio can be chosen as 64 (= $2^6$), for which the 15r term alone contributes 90 dB. At this rate, one-bit quantization is roughly as good as 16-bit linear quantization. This technique can be applied to the CD player: the sequence stored on a CD is upsampled by 64 and interpolated, and the interpolated sequence is the input to a sigma-delta modulator such as the one shown in Fig. 6.17. The output y(n) is then converted to an analog signal. This is called a one-bit DAC. The Super Audio Compact Disc (SACD) format uses the similar 1-bit Direct Stream Digital technique.

6.7 Adaptive Quantization

Speech signals are nonstationary: the standard deviation of a typical speech signal varies with time. The quantization step size can therefore be adapted to the signal's dynamics. The adaptation can be done at every sample or every few samples, or at longer intervals, e.g., 10-20 ms. There are two main kinds of adaptation: feedforward adaptation and feedback adaptation. In feedforward adaptation, the step size is computed from the signal entering the encoder, as shown in Figure 6.18.

[Figure 6.18. Feedforward adaptive quantizer: x(n) feeds both the quantizer and an adaptation block that computes ∆(n); the code c(n) and the step size ∆(n) are sent over the transmission channel to the decoder, which produces $\hat{x}(n)$.]

Here x(n) is the signal into the encoder, $\hat{x}(n)$ is the reconstructed signal at the decoder, and c(n) is the coded binary sequence. The step size at time n is given by

$$\Delta(n) = K\sigma(n) \tag{6.26}$$

where σ(n) is an estimate of the standard deviation of the signal and K is a chosen constant. The variance of the signal can be computed from the segment of the signal to be quantized:

$$\sigma^2(n) = \frac{1}{M}\sum_{m=0}^{M-1} x^2(n+m). \tag{6.27}$$

Note that equation (6.27) uses M future samples. This means that there will be a delay of M and the variance is updated every M-th sample. To update the variance for every sample and to use past samples instead of future samples, an alternative method can be used:

$$\hat{\sigma}^2(n) = \sum_{m=0}^{M-1} \alpha^m\,x^2(n-m) \tag{6.28}$$

where 0 < α < 1. Equation (6.28) can be computed recursively. By taking the z-transform of equation (6.28), one will have

$$Z\{\hat{\sigma}^2(n)\} = \sum_{m=0}^{M-1} (\alpha z^{-1})^m\,Z\{x^2(n)\} = \frac{1 - (\alpha z^{-1})^M}{1 - \alpha z^{-1}}\,Z\{x^2(n)\} \tag{6.29}$$

where Z{⋅} is the z-transform operation. Approximating the numerator by one (for large enough M), equation (6.29) becomes

$$(1 - \alpha z^{-1})\,Z\{\hat{\sigma}^2(n)\} = Z\{x^2(n)\}. \tag{6.30}$$

The inverse z-transform of equation (6.30) is given by

$$\hat{\sigma}^2(n) = \alpha\,\hat{\sigma}^2(n-1) + x^2(n). \tag{6.31}$$

The smaller α is, the faster the quantizer can track changes in the signal. A typical value of α is 0.9. ∆(n) is usually restricted to a range $\Delta_{\min} < \Delta(n) < \Delta_{\max}$, where the ratio $\Delta_{\max}/\Delta_{\min}$ is usually 100. One major drawback of feedforward adaptation is that information about ∆(n) must be transmitted along with c(n), which increases the bit rate. In feedback adaptation, on the other hand, the step size is computed from the outgoing signal, as shown in Fig. 6.19.

[Figure 6.19. Feedback adaptive quantizer: an adaptation block at the encoder and an identical one at the decoder both compute ∆(n) from the code sequence c(n), so only c(n) is sent over the transmission channel.]
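Returning briefly to the feedforward estimate, the relation between the windowed sum of equation (6.28) and the recursion of equation (6.31) can be illustrated with a short C sketch. The function names are hypothetical, and samples before the start of the sequence are assumed to be zero. When the window length M covers all past samples, the two computations give the same value; a shorter window drops the oldest terms.

```c
#include <assert.h>

/* Direct evaluation of equation (6.28) at time n: weighted sum of the last M
   squared samples. Samples before index 0 are assumed to be zero. */
double variance_direct(const double *x, int n, int M, double alpha)
{
    assert(n >= 0 && M > 0 && alpha > 0.0 && alpha < 1.0);
    double sum = 0.0;
    double weight = 1.0;                      /* alpha^m, starting with m = 0 */
    for (int m = 0; m < M && m <= n; m++) {
        sum += weight * x[n - m] * x[n - m];
        weight *= alpha;
    }
    return sum;
}

/* One step of the recursion in equation (6.31). */
double variance_update(double previous, double sample, double alpha)
{
    return alpha * previous + sample * sample;
}
```

For the samples 1, 2, 3 and α = 0.5, three recursive updates starting from zero give 11.25, which equals the direct sum 9 + 0.5(4) + 0.25(1) with M = 3.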


Feedback step-size adaptation has the advantage that no additional information needs to be transmitted besides the quantized signal. Typically the step size is adapted according to the rule

$$\Delta(n) = P \times \Delta(n-1). \tag{6.32}$$

The value of the multiplier P depends only on |c(n−1)|, the magnitude of the codeword at the previous time instant. As an example, when the quantizer is a 3-bit (8-level) quantizer as in Section 6.1, P can follow Table 6.3.

|c(n−1)|   00     01    10    11
P          0.85   1     1     1.5

TABLE 6.3. Multiplier P of the 3-bit quantization step size

The rationale behind this multiplication is that a small |c(n−1)| indicates a soft signal, so P < 1 is used to reduce the step size and achieve finer quantization. Conversely, a large |c(n−1)| indicates a loud signal that needs a bigger quantization step size, so P > 1.

6.8 Differential Quantization

In a speech signal, especially in voiced (vowel) sounds, the change from one sample to the next is relatively smooth. In other words, there is considerable correlation between adjacent samples. As a result, the difference between adjacent samples is expected to have a smaller variance and dynamic range than the speech samples themselves. This motivates quantizing the difference $d(n) = x(n) - \tilde{x}(n)$ instead of the speech sample x(n), where $\tilde{x}(n)$ is an estimate or prediction of x(n), as in Figure 6.20.

[Figure 6.20. Differential quantization (transmitter and receiver). Transmitter: the prediction $\tilde{x}(n)$ is subtracted from x(n) to give d(n), which is quantized to $\hat{d}(n)$ and encoded as c(n); $\hat{d}(n)$ is added to $\tilde{x}(n)$ to form $\hat{x}(n)$, which drives the predictor. Receiver: c(n) is decoded to $\hat{d}(n)$ and added to the output $\tilde{x}(n)$ of an identical predictor to give $\hat{x}(n)$.]

A typical prediction rule is given by

$$\tilde{x}(n) = \sum_{m=1}^{P} \alpha_m\,\hat{x}(n-m). \tag{6.33}$$
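A first-order instance of the structure in Figure 6.20 (equation (6.33) with P = 1) can be sketched in C as follows. The function names, the single prediction coefficient `a`, the uniform rounding quantizer with an unlimited number of levels, and the zero initial predictor state are all assumptions made for illustration. Because the transmitter predicts from its own decoded samples, the receiver reproduces exactly the same $\hat{x}(n)$.

```c
#include <assert.h>

/* Round d/delta to the nearest integer without calling the math library. */
static int nearest_step(double d, double delta)
{
    double t = d / delta;
    return (int)(t >= 0.0 ? t + 0.5 : t - 0.5);
}

/* Transmitter: code[i] is the integer index of the quantized difference;
   xhat[i] is the locally decoded sample used by the predictor. */
void dpcm_encode(const double *x, int n, double a, double delta,
                 int *code, double *xhat)
{
    assert(n >= 0 && delta > 0.0);
    double prev = 0.0;                        /* xhat(n-1), assumed zero at start */
    for (int i = 0; i < n; i++) {
        double pred = a * prev;               /* equation (6.33) with P = 1 */
        double d = x[i] - pred;               /* difference signal d(n) */
        code[i] = nearest_step(d, delta);     /* quantizer and encoder */
        xhat[i] = pred + code[i] * delta;     /* xhat(n) = prediction + dhat(n) */
        prev = xhat[i];
    }
}

/* Receiver: rebuilds xhat(n) from the codes with an identical predictor. */
void dpcm_decode(const int *code, int n, double a, double delta, double *xhat)
{
    assert(n >= 0 && delta > 0.0);
    double prev = 0.0;
    for (int i = 0; i < n; i++) {
        xhat[i] = a * prev + code[i] * delta;
        prev = xhat[i];
    }
}
```

With a = 0.5 and ∆ = 0.25, the samples 0.9, 1.4, 1.3, 0.45 are coded as 4, 4, 2, −1, and every reconstructed sample lies within ∆/2 of the corresponding input.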

The prediction of the incoming sample is based on the P past decoded samples. The prediction coefficients $\alpha_m$ are chosen to minimize the average squared prediction error. By writing the equations around the adders of Figure 6.20, it can be shown that the quantization error $x(n) - \hat{x}(n)$ of the speech signal is equal to the quantization error $d(n) - \hat{d}(n)$ of the difference signal. The difference signal has a smaller variance, and so does the corresponding quantization error. This approach therefore decreases the quantization error and increases the SNR.

6.9 Adaptive Differential Quantization

So far, differential quantization has used a fixed predictor and a fixed quantizer. However, the characteristics of a speech signal change with time. Adaptation can be applied to both the quantizer and the predictor, which results in adaptive differential quantization. A system that incorporates adaptive differential quantization is called Adaptive Differential Pulse Code Modulation (ADPCM). The CCITT G.721 ADPCM standard employs feedback adaptation of step size and prediction. Its bit rate is 32 kbps, which is half that of the CCITT A-law or µ-law PCM standard. A detailed description of the ADPCM standard can be found in CCITT Recommendation G.721, "32 kbps Adaptive Differential Pulse Code Modulation," October 1985.

6.10 Data Compression

Lossless data compression algorithms preserve all the information in the data so that it can be reconstructed without error. The compression ratio of lossless compression is only a modest 2:1 to 8:1, depending on the redundancy of the information source and the capabilities of the compression technique. Lossless algorithms are mandatory for transmitting or storing data such as computer programs, documents, medical images, and numerical information, where a single bad bit could lead to disaster. Lossy compression techniques do not offer perfect reproduction, but they can compress data to as little as 1 percent of its uncoded length. The recovered information only approximates the source material, but that is enough in many applications, for example images and sounds destined for human eyes and ears.

Run-length encoding is effective whenever a particular character is repeated many times in succession. Instead of repeating the character, run-length encoding replaces the run with an escape character followed by 2 bytes: the byte for the character to be duplicated, and a byte specifying how many times to repeat it. Using run-length encoding, the 35-byte sequence

abcde000000000000000000000000000000

reduces to

abcde<Esc> 0 30

which is only 8 bytes long.

6.11 Shannon's Information Theory

How much compression can we reasonably expect to get? In the late 1940s, Claude E. Shannon discovered that the extent to which a message can be compressed and then accurately restored is limited by its entropy. Entropy is a measure of the average information content of a message. The information of a message, expressed in bits, is the base-2 logarithm of the inverse of the message's probability. For example, suppose that at any given time one of four possible letters is transmitted: A, B, C, or D. The probability of transmitting A is 1/2 (P(A) = 0.5), and the probabilities of transmitting B, C, and D are 1/4, 1/8, and 1/8, respectively. In this case, the information of A is

$$I(A) = \log_2\frac{1}{P(A)} = \log_2 2 = 1\ \text{bit}.$$

The information of the other letters is

$$I(B) = \log_2\frac{1}{P(B)} = \log_2 4 = 2\ \text{bits}, \qquad I(C) = \log_2\frac{1}{P(C)} = \log_2 8 = 3\ \text{bits} = I(D).$$

Note that the more probable the letter, the lower its information. The entropy (average information) is computed as

$$\text{Entropy} = P(A)I(A) + P(B)I(B) + P(C)I(C) + P(D)I(D) = 0.5 + 0.5 + 0.375 + 0.375 = 1.75\ \text{bits}.$$

In this case, the simplest way to encode the four letters is as follows:

A - 00; B - 01; C - 10; D - 11.

Note that two bits are required for each letter. Another way to encode the message is as follows:

A - 0; B - 10; C - 110; D - 111.

With this kind of encoding, no codeword is a prefix of another, so any encoded binary sequence can be uniquely decoded. For example, the binary sequence

0011010001111010111

can be parsed as

0/0/110/10/0/0/111/10/10/111

which is decoded as

AACBAADBBD.

It may look as if more bits are needed to encode the message. However, in this particular example, the average number of bits per letter, (0.5)(1) + (0.25)(2) + (0.125)(3) + (0.125)(3) = 1.75, is the same as the entropy. Hence, this kind of encoding is called entropy coding. One systematic way to encode a message so that the average number of bits approaches the entropy is Huffman coding, which was developed by David Huffman as part of a class assignment at MIT in 1951. In general, the average code length of a Huffman code is greater than or equal to the entropy.

6.12 Huffman Coding

Suppose the symbols and their corresponding probabilities are as follows:

A - 0.4; B - 0.2; C - 0.15; D - 0.1; E - 0.1; F - 0.05.

The Huffman code is derived from a binary tree that is built according to the symbol probabilities.

[Figure: Huffman tree. E (0.1) and F (0.05) merge into a node of weight 0.15, which merges with D (0.1) into a node of weight 0.25; B (0.2) and C (0.15) merge into a node of weight 0.35; the 0.35 and 0.25 nodes merge into a node of weight 0.6, which merges with A (0.4) into the root (1.0). The two branches of each node are labeled 0 and 1.]

The resulting codes are as follows:

A: 0; B: 100; C: 101; D: 110; E: 1110; F: 1111.
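The tree construction can be sketched in C by repeatedly merging the two least probable nodes and then reading each codeword length off as the depth of its leaf. This is a simple quadratic-time illustration with hypothetical function names, not an optimized implementation. Ties may be broken differently from the tree above, so individual lengths can differ (here D receives 4 bits and E receives 3), but the average code length is the same.

```c
#include <assert.h>

#define MAX_SYMBOLS 32

/* Hypothetical helper: fills len[i] with the Huffman codeword length of
   symbol i by repeatedly merging the two least probable nodes. */
void huffman_lengths(const double *p, int n, int *len)
{
    assert(n >= 1 && n <= MAX_SYMBOLS);
    double weight[2 * MAX_SYMBOLS];
    int parent[2 * MAX_SYMBOLS];
    int active[2 * MAX_SYMBOLS];
    int total = n;

    for (int i = 0; i < n; i++) {
        weight[i] = p[i];
        parent[i] = -1;
        active[i] = 1;
    }
    for (int merge = 0; merge < n - 1; merge++) {
        int a = -1, b = -1;               /* a: smallest, b: second smallest */
        for (int i = 0; i < total; i++) {
            if (!active[i])
                continue;
            if (a < 0 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b < 0 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[total] = weight[a] + weight[b];   /* new internal node */
        parent[total] = -1;
        active[total] = 1;
        parent[a] = parent[b] = total;
        active[a] = active[b] = 0;
        total++;
    }
    for (int i = 0; i < n; i++) {         /* codeword length = depth of the leaf */
        len[i] = 0;
        for (int j = i; parent[j] >= 0; j = parent[j])
            len[i]++;
    }
}

/* Average codeword length: sum of probability times length. */
double average_code_length(const double *p, const int *len, int n)
{
    double avg = 0.0;
    for (int i = 0; i < n; i++)
        avg += p[i] * len[i];
    return avg;
}
```

For the six symbols above, the sketch assigns a 1-bit codeword to A and yields an average code length of 2.35 bits.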


Example

(a) Find the average length of the codeword.

$$\text{Average code length} = (0.4)(1) + (0.2)(3) + (0.15)(3) + (0.1)(3) + (0.1)(4) + (0.05)(4) = 2.35\ \text{bits}$$

(b) Find the entropy.

$$\begin{aligned}
\text{Entropy} &= (0.4)(-\log_2 0.4) + (0.2)(-\log_2 0.2) + (0.15)(-\log_2 0.15) + (0.1)(-\log_2 0.1) + (0.1)(-\log_2 0.1) + (0.05)(-\log_2 0.05) \\
&= (0.4)(1.322) + (0.2)(2.322) + (0.15)(2.737) + (0.1)(3.322) + (0.1)(3.322) + (0.05)(4.322) \approx 2.284\ \text{bits}
\end{aligned}$$

Note

The WAVE file is the Windows standard file format for recording and playing a quantized signal using Sound Blaster compatible cards. The header, which is 44 bytes long, contains information about stereo or mono recording, 16-bit or 8-bit quantization, A-law or µ-law compression, sampling rate, and so on. The following program converts an 8-bit binary WAVE file to a decimal data file.

    #include <stdio.h>

    int main(void)
    {
        int n, c;
        double speech;
        FILE *in, *out;

        in = fopen("input.wav", "rb");
        out = fopen("output.dat", "w");
        if (in == NULL || out == NULL)
            return 1;                      /* Could not open a file */

        for (n = 0; n < 44; n++)
            getc(in);                      /* To get rid of 44-byte long header */

        while ((c = getc(in)) != EOF) {    /* Stop at end of file, not one byte after it */
            speech = c - 128.;             /* To make mean value of speech zero */
            fprintf(out, "%f\n", speech);
        }

        fclose(in);
        fclose(out);
        return 0;
    }

Computer Assignment 4 Compute and plot a magnitude spectrum of the first 1024 points of the wave file, C:\WINDOWS\ringin.wav, or any waveform you recorded. Use FFT function.


C program to compute the magnitude spectrum of speech signals

The following C program computes the magnitude spectrum of a speech signal.

    #include <stdio.h>
    #include <math.h>
    #include "fft.cpp"
    #define N 1024

    void fft(double xr[], double xi[], int npt, int inv);

    int main(void)
    {
        int n;
        double mag, xr[N], xi[N];
        FILE *in, *out;

        in = fopen("ah.wav", "rb");
        out = fopen("magnit.dat", "w");
        if (in == NULL || out == NULL)
            return 1;                          // Could not open a file

        for (n = 0; n < 44; n++)
            getc(in);                          // Get rid of the first 44-byte long header
        for (n = 0; n < N; n++) {
            xr[n] = getc(in) - 127.5;          // To make the mean value of speech zero
            xi[n] = 0;                         // To make the imaginary part zero
        }
        fft(xr, xi, N, 0);
        for (n = 0; n < N; n++) {
            mag = xr[n]*xr[n] + xi[n]*xi[n];   // Squared magnitude
            mag = 10.*log10(mag);              // Magnitude in dB
            fprintf(out, "%f\n", mag);
        }
        fclose(in);
        fclose(out);
        return 0;
    }

[Figure: (a) Spectrum of “ah” sound; (b) Spectrum of “oh” sound. In both panels the horizontal axis runs from 0 to 1200 and the vertical axis from 0 to 100.]

PROBLEMS

6.1 The probability density function of the noise, e(n), is given below.

[Figure: probability density function $p_e(e)$ versus e, with the points a and b marked on the e axis]

(a) Find the mean of the noise.
(b) Find the variance of the noise.

6.2 The signal to be quantized has a uniform distribution and is sampled at 8 kHz. 8-bit linear quantization is used for each sample.

(a) Find the signal to quantization noise ratio.
(b) Find the bit rate of the digital signal.

6.3 An analog signal is to be quantized and transmitted over a digital system with a dynamic range of at least 58 dB. The analog signal has an absolute bandwidth of 1,000 Hz and an amplitude range of −5 to 5 V.

(a) Determine the minimum sampling rate needed.
(b) Determine the number of bits needed for quantization.
(c) Determine the quantization step size.
(d) Determine the minimum bit rate required in the digital system to transmit the signal.

6.4 Assume that the input to a delta modulator is $0.4t^3 - t$ [V]. The step size of the DM is 0.1 [V] and the sampler operates at 10 samples/sec. Over the time interval from 0 to 2 sec, sketch the input waveform and the staircase approximation. Also find the delta modulator output.

6.5 A binary sequence is 1110101000100111. Sketch the resulting analog waveform that appears at the delta demodulator output.

6.6 Consider the first-order Σ-∆ modulator shown in Fig. 6.15.

(a) Show that the in-band quantization noise power is given by

$$\frac{2\sigma_e^2}{\pi}\left[\frac{\pi F_1}{F_2} - \sin\left(\frac{\pi F_1}{F_2}\right)\right]$$

(b) Using the Taylor series expansion of the sine function and assuming that F1 << F2, show that the in-band noise power simplifies to

$$\sigma_e^2\,\frac{\pi^2}{3}\left(\frac{F_1}{F_2}\right)^3$$

Taylor series: $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$

6.7 Suppose symbols and their corresponding probabilities are given below.

A - 0.4; B - 0.4; C - 0.2

(a) Find the entropy of the symbols.
(b) Find the average length of the codeword.

6.8 Find the Huffman code for the following symbols.

A - 0.4; B - 0.2; C - 0.2; D - 0.1; E - 0.1
