Page 1: DPCM


ELE4607 Advanced Digital Communications

Module 4: Differential Coding

Page 2: DPCM

Differential Coding

• In encoding an analog signal, we need to quantize it (A/D or Analog-to-Digital conversion).
• Quantization itself is complex – covered in a separate module.
• The output of the quantizer is a series of numbers every T seconds.
• If those numbers have a smaller dynamic range, it follows that fewer bits are required to represent the source, hence a bit-rate saving.

Page 3: DPCM

Differential Coding

Hence the basic idea of differential coding:

• The encoder generates a prediction of the next sample.
• The decoder generates the same prediction.
• The error, defined as (actual − predicted), is transmitted.
• If the prediction is good, fewer bits are required to transmit the error signal than the raw samples themselves (a numerical sketch follows below).
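To make the bit-rate saving concrete, here is a minimal MATLAB sketch (illustrative only, not one of the module's scripts) comparing the dynamic range of a slowly varying signal with that of its prediction error when each sample is predicted by the previous one:

n = 0:999;
x = 100*sin(2*pi*n/200);        % slowly varying test signal (assumed)
e = x(2:end) - x(1:end-1);      % error when predicting each sample by the last
fprintf('Range of x: %.1f\n', max(x) - min(x));   % approx. 200
fprintf('Range of e: %.1f\n', max(e) - min(e));   % approx. 6

The error occupies a much smaller dynamic range than the signal itself, so fewer quantizer levels, and hence fewer bits, are needed.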

Page 4: DPCM

Differential Coding

• Direct quantization is sometimes called PCM, for Pulse Code Modulation.
• Differential quantization is thus called Differential PCM, or DPCM.
• DPCM is used directly in many systems, or in conjunction with other algorithms, for example transform image coding (separate module).

Page 5: DPCM

Differential Coding Theory

A model for a signal is

s(n) = \hat{s}(n) + e(n)    (1)

where

s(n) is the signal sample at instant n
\hat{s}(n) is an estimation (or approximation) of the signal
e(n) is an error term

• \hat{s}(n) is the “deterministic component”.
• e(n) is the “stochastic” or random (non-predictable) component, described in statistical terms.

Page 6: DPCM

Differential Coding Theory

Signal model:

s(n) = \hat{s}(n) + e(n)    (2)

or equivalently,

e(n) = s(n) - \hat{s}(n)    (3)

where

s(n) is the signal sample at instant n
\hat{s}(n) is an estimation (or approximation) of the signal
e(n) is an error term

Page 7: DPCM

Differential Coding Theory

The prediction is formed by a weighted linear sum:

\hat{s}(n) = a_1 s(n-1) + a_2 s(n-2) + \cdots + a_P s(n-P)
           = \sum_{k=1}^{P} a_k \, s(n-k)    (4)

where

a_k is the kth prediction coefficient
P is the predictor order
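As a minimal numerical illustration of equation (4) (the coefficients and sample values below are made up), the prediction is simply a dot product of the coefficient vector with the P most recent samples:

a = [1.5; -0.7];                 % assumed prediction coefficients a_1, a_2 (P = 2)
s = [0.2 0.5 0.9 1.1 1.0];       % assumed signal history, most recent sample last
P = length(a);
s_hat = a' * s(end:-1:end-P+1)'; % \hat{s}(n) = sum_k a_k s(n-k)
fprintf('Predicted next sample: %.2f\n', s_hat);  % 1.5*1.0 - 0.7*1.1 = 0.73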

Page 8: DPCM

• Both encoder and decoder run this prediction synchronously.
• Note that it is effectively a form of discrete-time filter.
• In speech coding, the encoder is said to be the “analysis” filter, and the decoder is the “synthesis” filter.

Problem: the error e(n) is quantized (not known exactly at the decoder), so s(n) is not known precisely. This complicates the prediction process (see the following equations).

Page 9: DPCM

Differential Prediction

Combining the above equations gives:

e(n) = s(n) - \overbrace{\sum_{k=1}^{P} a_k \, s(n-k)}^{\hat{s}(n)}    (5)

\tilde{s}(n) is introduced in both the encoder and decoder, and the prediction is based on that (see diagrams):

e(n) = s(n) - \overbrace{\sum_{k=1}^{P} a_k \, \hat{s}(n-k)}^{\tilde{s}(n)}    (6)

Page 10: DPCM

Encoder & Decoder

Encoder: the prediction \tilde{s}(n) is subtracted from the input s(n) to form the error e(n); e(n) passes through the quantizer Q(·) to give \hat{e}(n); adding \hat{e}(n) back to \tilde{s}(n) gives the reconstruction \hat{s}(n), which drives the predictor 1 − A(z) to produce \tilde{s}(n).

Decoder: the received \hat{e}(n) is added to \tilde{s}(n) to give \hat{s}(n); \hat{s}(n) drives the same predictor 1 − A(z) to regenerate \tilde{s}(n).

(A code sketch of this loop follows below.)
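The following is a minimal MATLAB sketch of the encoder/decoder loop above, assuming a first-order predictor and a simple uniform quantizer; it is not one of the module's scripts, and all numerical values are assumptions for illustration:

a1 = 0.9;                          % assumed predictor coefficient
step = 0.1;                        % assumed quantizer step size
N = 200;
s = sin(2*pi*(0:N-1)/50);          % test input signal

e_hat = zeros(1, N);               % quantized error (what is transmitted)
s_hat_dec = zeros(1, N);           % decoder reconstruction
prev_enc = 0; prev_dec = 0;        % \hat{s}(n-1) at encoder and decoder

for n = 1:N
    s_tilde = a1 * prev_enc;               % encoder prediction \tilde{s}(n)
    e = s(n) - s_tilde;                    % prediction error e(n)
    e_hat(n) = step * round(e / step);     % quantizer Q(.)
    prev_enc = s_tilde + e_hat(n);         % encoder reconstruction \hat{s}(n)

    s_tilde_dec = a1 * prev_dec;           % decoder runs the same prediction
    s_hat_dec(n) = s_tilde_dec + e_hat(n); % decoder reconstruction \hat{s}(n)
    prev_dec = s_hat_dec(n);
end

fprintf('Max reconstruction error: %g\n', max(abs(s - s_hat_dec)));

Because both prediction loops operate on the quantized reconstruction \hat{s}(n) rather than on s(n), the encoder and decoder stay exactly in step, and the reconstruction error is bounded by half the quantizer step.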

Page 11: DPCM

Filter Interpretation

Taking the previous prediction equations and converting to the Z-domain:

A(z) = 1 - \sum_{k=1}^{P} a_k z^{-k}    (7)
     = 1 - P_s(z)    (8)

where A(z) is normally referred to as the “analysis filter” – it creates the error (or prediction residual) from the speech signal:

A(z) = \frac{E(z)}{S(z)}    (9)

Page 12: DPCM

The short-term predictor P_s(z) forms the linear prediction of future samples based on past samples. The subscript ‘s’ is used to distinguish the short-term predictor from the long-term predictor (see the later module on speech coding). P_s(z) is defined as

P_s(z) = \sum_{k=1}^{P} a_k z^{-k} = 1 - A(z)    (10)

Note that some texts denote the prediction filter by A(z) rather than 1 − A(z).

Page 13: DPCM

Simple Predictors

• A simple predictor which gives good prediction for images, and reasonably good prediction for speech, is a first-order predictor with P = 1 and a unity prediction coefficient a_1 = 1.
• Optimal prediction requires minimization of a “cost function”. The result will be shown in the next section to depend on the autocorrelation of the signal.
• Another way (not shown) is in terms of vectors (signal vector, coefficient vector) and uses the “principle of orthogonality”; it ends up deriving the same result.

Page 14: DPCM

First-Order Optimal Predictor

Note that the speech coding literature usually uses a for the predictor coefficients, but the adaptive filtering literature usually uses h. It doesn’t matter – they are just coefficients.

Consider a simple first-order predictor:

\hat{x}(n) = h_1 x(n-1)    (11)

The prediction error is

e(n) = x(n) - \hat{x}(n)    (12)
     = x(n) - h_1 x(n-1)    (13)

Page 15: DPCM

The instantaneous square error is

e^2(n) = (x(n) - h_1 x(n-1))^2    (14)

Taken over a sufficiently large number of samples, the average square error is

\overline{e^2} = \frac{1}{N} \sum_n e^2(n)
              = \frac{1}{N} \sum_n (x(n) - h_1 x(n-1))^2
              = \frac{1}{N} \sum_n \left( x^2(n) - 2 h_1 x(n) x(n-1) + h_1^2 x^2(n-1) \right)

Page 16: DPCM

To minimize the average square error with respect to the predictor parameter h_1, take derivatives:

\frac{d\,\overline{e^2}}{d\,h_1} = \frac{1}{N} \sum_n \left( 0 - 2 x(n) x(n-1) + 2 h_1 x^2(n-1) \right)

Setting

\frac{d\,\overline{e^2}}{d\,h_1} = 0

gives the equation

\frac{1}{N} \sum_n x(n)\,x(n-1) = h_1^* \, \frac{1}{N} \sum_n x^2(n-1)

Page 17: DPCM

Hence the optimal predictor h_1^* is

h_1^* = \frac{\frac{1}{N} \sum_n x(n)\,x(n-1)}{\frac{1}{N} \sum_n x^2(n-1)}    (15)

where the summation has to be taken over a “sufficiently large” number of samples to form the prediction. However, real-world signals are only “quasi-stationary”, so we cannot have too large a block, otherwise the signal changes too much within it.

Page 18: DPCM

Taking the summation over a large number of samples, we may use autocorrelations defined as

R(0) \approx \frac{1}{N} \sum_n x^2(n-1)    (16)

R(1) \approx \frac{1}{N} \sum_n x(n)\,x(n-1)    (17)

Hence

h_1^* = \frac{R(1)}{R(0)}    (18)
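A minimal MATLAB sketch of equation (18); the test signal below is an assumption (a first-order autoregressive signal with known coefficient 0.9) so the estimate can be checked:

N = 2000;
x = filter(1, [1 -0.9], randn(1, N));    % AR(1) test signal, true coefficient 0.9

R0 = mean(x(1:end-1).^2);                % estimate of R(0)
R1 = mean(x(2:end) .* x(1:end-1));       % estimate of R(1)
h1 = R1 / R0;                            % optimal first-order predictor (18)
fprintf('Estimated h1* = %.4f (true value 0.9)\n', h1);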

Page 19: DPCM

Second-Order Optimal Predictor

For a second-order predictor,

\hat{x}(n) = h_1 x(n-1) + h_2 x(n-2)
∴ e(n) = x(n) - \hat{x}(n)    (19)
       = x(n) - (h_1 x(n-1) + h_2 x(n-2))
∴ e^2(n) = \left[ x(n) - (h_1 x(n-1) + h_2 x(n-2)) \right]^2

Over many samples, the average square error is

\overline{e^2} = \frac{1}{N} \sum_n e^2(n)
              = \frac{1}{N} \sum_n \left[ x(n) - (h_1 x(n-1) + h_2 x(n-2)) \right]^2

Page 20: DPCM

To minimize the average square error with respect to the predictor parameters h_1 and h_2, take derivatives:

\frac{\partial\,\overline{e^2}}{\partial h_1} = \frac{1}{N} \sum_n 2 \left[ x(n) - (h_1 x(n-1) + h_2 x(n-2)) \right] \times \left[ -x(n-1) \right]    (20)

and set

\frac{\partial\,\overline{e^2}}{\partial h_1} = 0    (21)

Page 21: DPCM

Second-Order Optimal Predictor

→ optimal predictor h_1^*:

\frac{1}{N} \sum_n 2 \left[ x(n) - (h_1^* x(n-1) + h_2^* x(n-2)) \right] \times \left[ -x(n-1) \right] = 0

\frac{1}{N} \sum_n x(n)\,x(n-1) = h_1^* \frac{1}{N} \sum_n x(n-1)\,x(n-1) + h_2^* \frac{1}{N} \sum_n x(n-1)\,x(n-2)

Page 22: DPCM

Second-Order Optimal Predictor

Using autocorrelations as before,

R(1) = h_1^* R(0) + h_2^* R(1)    (22)

Similarly, optimizing with respect to h_2 yields

R(2) = h_1^* R(1) + h_2^* R(0)    (23)

Page 23: DPCM

Second-Order Optimal Predictor

In matrix form these are more compactly expressed as

\begin{bmatrix} R(1) \\ R(2) \end{bmatrix} = \begin{bmatrix} R(0) & R(1) \\ R(1) & R(0) \end{bmatrix} \begin{bmatrix} h_1^* \\ h_2^* \end{bmatrix}    (24)

or,

r = R h^*    (25)
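Solving the normal equations numerically is a one-liner in MATLAB; in this minimal sketch the autocorrelation values are hypothetical:

R0 = 1.00; R1 = 0.85; R2 = 0.60;  % hypothetical autocorrelation values
R = [R0 R1; R1 R0];               % autocorrelation matrix
r = [R1; R2];                     % autocorrelation vector
h_opt = R \ r;                    % solve r = R h* for the optimal predictor
disp(h_opt')                      % [h1* h2*]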

Page 24: DPCM

Prediction Filters

Taking z-transforms of the predictor equation,

E(z) = X(z) - \hat{X}(z)    (26)
     = X(z) - (h_1 X(z) z^{-1} + h_2 X(z) z^{-2})
     = X(z) \left( 1 - h_1 z^{-1} - h_2 z^{-2} \right)

Page 25: DPCM

Prediction Filters

Thus

\frac{X(z)}{E(z)} = \frac{1}{1 - (h_1 z^{-1} + h_2 z^{-2})}    (27)

• The analysis filter is FIR (all-zero).
• The prediction filter is all-pole.
• Care must be taken to ensure the synthesis filter at the receiver does not become unstable.
• Factorize and check the roots are inside the unit circle (see the sketch below).
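A minimal MATLAB sketch of the analysis/synthesis pair and the stability check (the predictor coefficients are assumed for illustration):

h = [1.7119 -0.8100];        % assumed predictor coefficients h1, h2
A = [1 -h];                  % analysis filter A(z) = 1 - h1 z^{-1} - h2 z^{-2}
x = randn(1, 1000);          % placeholder input signal
e = filter(A, 1, x);         % analysis: FIR (all-zero) prediction residual
x_rec = filter(1, A, e);     % synthesis: all-pole reconstruction of x
fprintf('Max reconstruction error: %g\n', max(abs(x - x_rec)));
fprintf('Pole magnitudes: %.3f %.3f\n', abs(roots(A)));  % both 0.9 < 1, stable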

Page 26: DPCM

MATLAB Example

[Figure: “Second-order Linear Prediction” – actual and predicted sample values plotted against sample number (0–400).]

Page 27: DPCM

The actual predictor of

h_{\text{act}} = \begin{bmatrix} 1.7119 \\ -0.8100 \end{bmatrix}    (28)

was used in lpeg.m, with an input of white Gaussian noise of unit variance. Over a block of 2000 samples, the normal equation method as above gave

h_{\text{normal}} = \begin{bmatrix} 1.7188 \\ -0.8238 \end{bmatrix}    (29)
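lpeg.m itself is not reproduced here, but a minimal sketch of the same experiment, under the assumption that the test signal is generated by the all-pole synthesis filter driven by unit-variance white Gaussian noise, might look like:

h_act = [1.7119; -0.8100];                  % the "actual" predictor (28)
N = 2000;
x = filter(1, [1 -h_act'], randn(1, N));    % synthesis filter driven by WGN

R0 = mean(x(3:end) .* x(3:end));            % autocorrelation estimates
R1 = mean(x(3:end) .* x(2:end-1));
R2 = mean(x(3:end) .* x(1:end-2));
h_normal = [R0 R1; R1 R0] \ [R1; R2];       % normal-equation solution (24)-(25)
disp(h_normal')                             % close to [1.7119 -0.8100]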

Page 28: DPCM

Coding the Predictor Parameters

• The optimal predictor size will depend on the application.
• Predictor parameters will change over time, as the source is being encoded (for example, words being spoken, syllables being pronounced, areas of a video screen).

Page 29: DPCM

Coding the Predictor Parameters

• Parameters may be estimated from past blocks (backwards estimation) – this does not require transmission of the parameters, but the optimal predictors are slightly out of date.
• Parameters may be calculated by the transmitter and explicitly sent to the receiver. This requires extra bits (a “side channel”).
• Speech encoders typically sample at f_s = 8 kHz, calculate over a block of 2–20 ms, and use 10th-order prediction.
• Image encoders generally use a smaller-order predictor, or a two-dimensional predictor.

Page 30: DPCM

Coding the Parameters

• Instead of block-by-block parameter updates and transmission, it is possible to update the predictor on a sample-by-sample basis.
• Generally this is termed “adaptive linear prediction”.
• The algorithm for block estimation is the Normal Equation (or Yule-Walker) method.
• For sample-by-sample estimation it is the LMS, or Least Mean Square, algorithm.

Page 31: DPCM

Adaptive Prediction

The predictor is defined by

e(n) = x(n) - \hat{x}(n)    (30)
     = x(n) - \sum_{k=1}^{P} h_k \, x(n-k)    (31)

• For a first-order predictor, the squared error e^2(n) gives rise to a quadratic curve in two dimensions (e^2 vs h_1).
• For a second-order predictor, it gives rise to a “bowl-shaped” surface (e^2 vs h_1, h_2).

Page 32: DPCM

Adaptive Prediction

In matrix form this is

e(n) = x(n) - h^T x(n-1)    (32)

Strictly, h is now a function of the time index n, i.e. h(n).

Page 33: DPCM

h = \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_P \end{bmatrix}    (33)

x(n-1) = \begin{bmatrix} x(n-1) \\ x(n-2) \\ \vdots \\ x(n-P) \end{bmatrix}    (34)

Page 34: DPCM

The estimate of the gradient in the h_1 direction is

\hat{\nabla}_{h_1} e^2(n) = \frac{\partial}{\partial h_1} \left( x(n) - h^T x(n-1) \right)^2    (35)
  = \frac{\partial}{\partial h_1} \left( x(n) - (h_1 x(n-1) + h_2 x(n-2) + \cdots) \right)^2
  = 2 \left( x(n) - (h_1 x(n-1) + h_2 x(n-2) + \cdots) \right) \times \frac{\partial}{\partial h_1} \left( x(n) - (h_1 x(n-1) + h_2 x(n-2) + \cdots) \right)
  = 2\, e(n) \, (-x(n-1))
  = -2\, e(n) \, x(n-1)    (36)

Page 35: DPCM

Adaptive Prediction

Similarly, the estimate of the gradient in the h_2 direction is

\hat{\nabla}_{h_2} e^2(n) = -2\, e(n)\, x(n-2)    (37)

Page 36: DPCM

At each new sample, update the predictor h by a quantity proportional to the negative gradient of e^2(n) (because we want to seek the minimum error). So,

h(n+1) = h(n) - \mu \hat{\nabla} e^2(n)    (38)

where \mu is the adaptation rate parameter. Using the partial derivatives just found,

h(n+1) = h(n) + 2 \mu\, e(n)\, x(n-1)    (39)

Page 37: DPCM

Expanded, this is

\begin{bmatrix} h_1(n+1) \\ h_2(n+1) \\ \vdots \\ h_P(n+1) \end{bmatrix} = \begin{bmatrix} h_1(n) \\ h_2(n) \\ \vdots \\ h_P(n) \end{bmatrix} + 2 \mu\, e(n) \begin{bmatrix} x(n-1) \\ x(n-2) \\ \vdots \\ x(n-P) \end{bmatrix}    (40)

This equation is evaluated at each sample to update the predictor parameters, which in turn are used to predict the next sample (a code sketch of the loop follows below).
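A minimal MATLAB sketch of the LMS update (40) for a second-order predictor; the test signal and the value of µ are assumptions for illustration (compare the MATLAB example a few pages on):

N = 2000; mu = 0.001;                            % assumed adaptation rate
x = filter(1, [1 -1.7119 0.8100], randn(1, N));  % test signal, true h = [1.7119; -0.8100]

h = zeros(2, 1);                  % predictor coefficients h(n), initially zero
for n = 3:N
    xv = [x(n-1); x(n-2)];        % past-sample vector x(n-1)
    e = x(n) - h' * xv;           % prediction error e(n)
    h = h + 2 * mu * e * xv;      % LMS update, equation (40)
end
disp(h')                          % converging towards [1.7119 -0.8100]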

Page 38: DPCM

Adaptive Prediction Stepping

[Figure: the quadratic error curve e^2 versus h_1, with successive estimates h_1(n), h_1(n+1), … stepping down towards the optimum h_1^*.]

µ is the adaptation rate parameter (set empirically; a larger value gives faster adaptation but the possibility of instability). Typically µ = 0.001.

Page 39: DPCM

MATLAB Example

See MATLAB script alpeg.m.

With µ = 0.001, the actual predictor of

h_{\text{act}} = \begin{bmatrix} 1.7119 \\ -0.8100 \end{bmatrix}    (41)

was used with an input of white Gaussian noise of unit variance. Over a block of 2000 samples, the LMS algorithm gave

h_{\text{LMS}} = \begin{bmatrix} 1.6247 \\ -0.7968 \end{bmatrix}    (42)

Page 40: DPCM

[Figure: “Adaptive Predictor – h Coefficients with µ = 0.001” – the h coefficient values plotted against sample number (0–2000).]

Page 41: DPCM

Module Summary – Important Points

1. Explain the role of prediction in coding.
2. Mathematically derive and implement a block-based predictor.
3. Mathematically derive and implement an adaptive predictor.