Transform Coding


Heejune AHN, Embedded Communications Laboratory

Seoul National Univ. of Technology, Fall 2013

Last updated 2013. 9. 30

Heejune AHN: Image and Video Compression

Agenda

• Transform Coding Concept

• Transform Theory Review

• DCT (Discrete Cosine Transform)

• DCT in Video Coding

• DCT Implementation & Fast Algorithms

• Appendix: KL Transform


1. Transform Coding

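X1 = lum(2n), X2 = lum(2n+1): neighboring pixels, X1 ~ U(0, 255), X2 ~ U(0, 255)

Quantizing X1 and X2 separately encodes almost the same data twice, because X1 and X2 are strongly cross-correlated

Y1, Y2: 45 degree rotation of (X1, X2), up to scaling (an exact rotation uses 1/√2 instead of 1/2)

Y1 = (X1 + X2) / 2

• Average or DC value

Y2 = (X2 - X1) / 2

• Difference or AC value

Y1 ~ F(0, 255), Y2 ~ F(-127.5, 127.5) (the unscaled difference X2 - X1 spans -255 to 255)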

[Figure: joint distribution p of (x1, x2) on axes X1, X2 from 0 to 255, with the rotated axes Y1 (0 to 255) and Y2 (-255 to 255)]


Which ones are easier to encode (quantize)?

[Figure: densities f(X1), f(X2) on 0 to 255, compared with f(Y1) on 0 to 255 and f(Y2) on -255 to 255]
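The question above can be explored numerically. The following is a minimal sketch under a toy pixel model that is an assumption, not something from the slides: X1 is uniform on 0..255 and its neighbor X2 differs from it by at most `max_step` grey levels. The names `make_neighbor_pairs` and `to_average_difference` are made up for illustration.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def make_neighbor_pairs(count, max_step=8, seed=0):
    """Assumed toy model: X1 uniform on 0..255, X2 within max_step of X1."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        x1 = rng.randint(0, 255)
        x2 = min(255, max(0, x1 + rng.randint(-max_step, max_step)))
        pairs.append((x1, x2))
    return pairs


def to_average_difference(pairs):
    """Y1 = (X1 + X2) / 2 (DC value), Y2 = (X2 - X1) / 2 (AC value)."""
    return [((x1 + x2) / 2, (x2 - x1) / 2) for x1, x2 in pairs]


pairs = make_neighbor_pairs(10000)
rotated = to_average_difference(pairs)
var_x1 = variance([x1 for x1, _ in pairs])
var_x2 = variance([x2 for _, x2 in pairs])
var_y1 = variance([y1 for y1, _ in rotated])
var_y2 = variance([y2 for _, y2 in rotated])
print(f"var(X1) = {var_x1:.1f}, var(X2) = {var_x2:.1f}")
print(f"var(Y1) = {var_y1:.1f}, var(Y2) = {var_y2:.1f}")
```

Under this model X1 and X2 both have large variance, while almost all of the variance ends up in Y1 and Y2 stays in a narrow range, so Y2 can be quantized with far fewer levels.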


Origins of Transform Coding Benefits

Signal Theory

• Makes the representation easier to manipulate

• Energy concentration

Image and HVS (Human Visual System) Properties

• HVS is more sensitive to low frequencies

• Use a denser (finer) quantizer for low frequencies

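Bit allocation over the $N^2$ coefficients of a block (total budget $B$ bits, coefficient variances $\sigma_{k,l}^2$):

$$b_{k,l} = \frac{B}{N^2} + \frac{1}{2}\log_2\!\left(\sigma_{k,l}^2 \Big/ \Big(\prod_{k,l}\sigma_{k,l}^2\Big)^{1/N^2}\right)$$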

[Photo: Vilfredo Pareto, economist, 1848-1923]


2. Transform Theory Review

Definition of Transform

• N to M mapping, [Y1, Y2, . . ., YM] = F [X1, X2, . . ., XN]

Linear Transform (cf. Non-Linear Transform)

• if [Y11, Y12] = F [X11, X12] and [Y21, Y22] = F [X21, X22], then

• [Y11 + Y21, Y12 + Y22] = F [X11 + X21, X12 + X22]

Matrix representation of Linear Transform

Forward: $y = T x$

Inverse: $x = T^{-1} y$

• y: N transform coefficients, arranged as a vector

• T: transform matrix of size NxN

• x: input signal block of size N, arranged as a vector


Basis Vectors

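Orthogonal

• $V_l \cdot V_m = 0$ for $l \ne m$, for basis vectors $V_1, V_2, \ldots, V_N$

• The basis vectors are mutually perpendicular: no vector has a component along another.

Orthonormal

• Orthogonal, and $\|V_l\| = 1$ for every basis vector $V_1, V_2, \ldots, V_N$

Parseval's Theorem

• Signal power/energy is conserved between the signal domain and the transform domain

• The basis vectors $v_1, v_2, v_3, \ldots, v_N$ make up the transform matrix $T$

$$T^{-1} = T^T \;\Rightarrow\; x = T^{-1} y = T^T y$$

$$\|y\|^2 = y^T y = x^T T^T T x = \|x\|^2$$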


Example of Orthonormal transform

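$$T(45^\circ\ \text{rotation}) = \begin{bmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$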
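As a quick numerical check of the orthonormal example, the sketch below builds the 45 degree rotation with rows (1, 1)/√2 and (-1, 1)/√2 and verifies that its transpose is its inverse and that energy is preserved. The input vector `x` is an arbitrary pixel pair chosen for illustration.

```python
import math

c = 1 / math.sqrt(2)
T = [[c, c],
     [-c, c]]  # rows are the two orthonormal basis vectors


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def energy(v):
    return sum(a * a for a in v)


x = [200.0, 190.0]                 # arbitrary neighboring pixel pair
y = matvec(T, x)                   # forward transform  y = T x
x_back = matvec(transpose(T), y)   # inverse transform  x = T^T y
identity = matmul(T, transpose(T))

print("y =", y)
print("energy(x) =", energy(x), "energy(y) =", energy(y))
```

Most of the energy lands in the first coefficient (the sum direction), while the total energy is unchanged, which is what Parseval's theorem states.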


2D Transform

Data

• 2D pixel value matrix, 2D transform coefficient matrix

• 2D matrix => 1D vector

Forward transform: $y = T x$

Inverse transform: $x = T^{-1} y$

• y: NxN transform coefficients, arranged as a vector

• T: transform matrix of size N²xN²

• x: input signal block of size NxN, arranged as a vector

$$F(k,l) = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} f(n,m)\, t_{k,l}(n,m)$$

where $t_{k,l}(n,m)$ is the transform kernel for coefficient $(k,l)$.


3. Transforms

Various transforms in image compression

• DFT (Discrete Fourier Transform)

• DCT (Discrete Cosine Transform)

• DST (Discrete Sine Transform)

• Hadamard Transform

• Discrete Wavelet Transform

• and more (Haar, etc.)


Hadamard transform

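Core matrix (1-dimensional)

$$H_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Order N (recursive construction)

$$H_n = H_1 \otimes H_{n-1} = \frac{1}{\sqrt{2}}\begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}$$

where $\otimes$ is the Kronecker product, $n = \log_2 N$, $N = 2^n$, $n = 1, 2, \ldots$

For $n = 2$:

$$H_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} H_1 & H_1 \\ H_1 & -H_1 \end{bmatrix} = \frac{1}{\sqrt{4}}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

2-dimensional transform

$$H^{-1} = H^{*} = H^{t}$$

$$Y = H_N X H_N^{t} = H_N X H_N$$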
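The recursive Kronecker construction can be sketched in a few lines. This is an illustrative implementation, not code from the lecture; the helper names `kron`, `hadamard`, and `hadamard_2d` are made up here, and `hadamard(n)` returns the N x N matrix with N = 2^n.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def hadamard(n):
    """Orthonormal Hadamard matrix H_n of size 2^n x 2^n, n >= 1."""
    s = 1 / math.sqrt(2)
    h1 = [[s, s], [s, -s]]
    h = h1
    for _ in range(n - 1):
        h = kron(h1, h)  # H_n = H_1 (x) H_{n-1}
    return h


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def hadamard_2d(block, h):
    """2-D transform Y = H X H (H is symmetric, so H^t = H)."""
    return matmul(matmul(h, block), h)


h2 = hadamard(2)
flat_block = [[10.0] * 4 for _ in range(4)]  # constant 4x4 block
coefs = hadamard_2d(flat_block, h2)
print([[round(2 * v) for v in row] for row in h2])
print(round(coefs[0][0], 6))
```

For a constant block, all the energy collects in the single coefficient `coefs[0][0]`, which illustrates the energy concentration that motivates transform coding.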


DCT Transform

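1D Forward DCT (pixel domain to frequency domain)

$$F(k) = C(k)\sum_{n=0}^{N-1} f(n)\cos\frac{(2n+1)k\pi}{2N}, \quad k = 0, 1, \ldots, N-1$$

$$C(0) = \sqrt{\frac{1}{N}}, \qquad C(k) = \sqrt{\frac{2}{N}} \ \text{ for } k \ne 0$$

1D Inverse DCT (frequency domain to pixel domain)

$$f(n) = \sum_{k=0}^{N-1} C(k)\,F(k)\cos\frac{(2n+1)k\pi}{2N}, \quad 0 \le n \le N-1$$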
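The forward and inverse 1D DCT can be written directly from their defining sums. The sketch below is a plain O(N²) implementation for illustration only; the function names are made up, and the sample row is the first row of the original 8x8 block used in the image coding example of these slides.

```python
import math


def dct_scale(k, n_points):
    """C(k): sqrt(1/N) for k = 0, sqrt(2/N) otherwise."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n_points)


def dct_1d(f):
    """Forward DCT: pixel domain to frequency domain."""
    n_points = len(f)
    return [dct_scale(k, n_points)
            * sum(f[n] * math.cos((2 * n + 1) * k * math.pi / (2 * n_points))
                  for n in range(n_points))
            for k in range(n_points)]


def idct_1d(coefs):
    """Inverse DCT: frequency domain to pixel domain."""
    n_points = len(coefs)
    return [sum(dct_scale(k, n_points) * coefs[k]
                * math.cos((2 * n + 1) * k * math.pi / (2 * n_points))
                for k in range(n_points))
            for n in range(n_points)]


samples = [198, 202, 194, 179, 180, 184, 196, 168]
coefs = dct_1d(samples)
restored = idct_1d(coefs)
print([round(c, 1) for c in coefs])
print([round(v) for v in restored])
```

The first coefficient equals the sum of the samples divided by √N, and the round trip returns the original samples up to floating-point error.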


2D DCT

[Figure: 2D DCT basis functions]

Coef. Distribution

• DC ~ Uniform dist., AC ~ Laplacian dist.


Properties

• Orthonormal transform

• Separable transform

• Real-valued coefficients

DCT performance closely resembles that of the KLT for image input

• Image input model (first-order Markov chain)

• x(n+1) = rho * x(n) + e(n)

DCT complexity

• 2D DCT = 1D DCT in the vertical direction * 1D DCT in the horizontal direction

• Not used in 3D (because of delay and memory size)

• DCT size (4x4, 8x8, 16x16, 32x32 …)

• Larger: better performance, but blocking artifacts (?) and higher HW complexity


Coding Performance of DCT

Comparison of 1-D basis functions for block size N=8

• Karhunen-Loève transform [1948/1960]

• Haar transform [1910]

• Walsh-Hadamard transform [1923]

• Slant transform [Enomoto, Shibata, 1971]

• Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]

Energy concentration

• Performance measured for typical natural images, block size 1x32

• KLT is optimum

• DCT performs only slightly worse than KLT


Complexity Performance of DCT

Separation of 2D DCT

• Cascading 1-D DCTs

• Reduces the complexity (multiplications) from O(N^4) to O(N^3)

8x8 DCT

• Direct: each of the 64 coefficients needs 64 multiplications (64 x 64 = 4096)

• Separable: 2 passes x 64 coefficients x 8 multiplications (2 x 64 x 8 = 1024)

• Can you derive this?

[Figure: NxN block of pixels x => column-wise N-transform => Ax => row-wise N-transform => NxN block of transform coefficients AxA^T]
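The separability claim can be checked by computing the 2D DCT both ways. The sketch below is illustrative only: the function names and the 4x4 test block are arbitrary choices, and `multiplication_counts` simply evaluates N^4 for the direct double sum and 2·N^3 for the row-column method.

```python
import math


def dct_1d(f):
    """Orthonormal 1-D DCT from the defining sum."""
    n_points = len(f)
    out = []
    for k in range(n_points):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_points)
        out.append(scale * sum(
            f[n] * math.cos((2 * n + 1) * k * math.pi / (2 * n_points))
            for n in range(n_points)))
    return out


def dct_2d_separable(block):
    """Row pass followed by column pass."""
    rows_done = [dct_1d(row) for row in block]
    cols_done = [dct_1d(list(col)) for col in zip(*rows_done)]
    return [list(row) for row in zip(*cols_done)]  # transpose back


def dct_2d_direct(block):
    """Direct 2-D sum: N*N products for each of the N*N coefficients."""
    size = len(block)

    def basis(k, n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / size)
        return scale * math.cos((2 * n + 1) * k * math.pi / (2 * size))

    return [[sum(block[n][m] * basis(k, n) * basis(l, m)
                 for n in range(size) for m in range(size))
             for l in range(size)] for k in range(size)]


def multiplication_counts(size):
    """(direct, separable) multiplication counts for a size x size block."""
    return size ** 4, 2 * size ** 3


block = [[52, 55, 61, 66],
         [70, 61, 64, 73],
         [63, 59, 55, 90],
         [67, 61, 68, 104]]  # arbitrary test data
direct = dct_2d_direct(block)
separable = dct_2d_separable(block)
print(multiplication_counts(8))
```

Both routes give the same coefficients; only the amount of arithmetic differs.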


4. Transform in Image Coding

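Transform coding procedure

• Transform T(x): usually invertible

• Quantization: not invertible, introduces distortion

• Combination of encoder and decoder: lossless

Encoder: image $x$ => transform $y = T(x)$ => samples $y$ => quantizer $q = Q(y)$ => indices $q$ => encoder $c = C(q)$ => bit-stream $c$

Decoder: bit-stream $c$ => decoder $\hat{q} = C^{-1}(c)$ => indices $\hat{q}$ => dequantizer $\hat{y} = Q^{-1}(\hat{q})$ => reconstructed samples $\hat{y}$ => inverse transform $\hat{x} = T^{-1}(\hat{y})$ => reconstructed image $\hat{x}$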


DCT in Image Coding

Original 8x8 block

198 202 194 179 180 184 196 168
187 196 192 181 182 185 189 174
188 185 193 179 188 188 187 170
184 188 182 187 183 186 195 174
194 193 189 187 180 183 181 185
193 195 193 192 170 189 187 181
181 185 183 180 175 184 185 176
195 185 177 178 170 179 195 175

DCT => Transformed 8x8 block

1480 26.0 9.5 8.9 -26.4 15.1 -8.1 0.3
11.0 8.3 -8.2 3.8 -8.4 -6.0 -2.8 10.6
-5.5 4.5 9.0 5.3 -8.0 4.0 -5.1 4.9
10.7 9.8 4.9 -8.3 -2.1 -1.9 2.8 -8.1
1.6 1.4 8.2 4.3 3.4 4.1 -7.9 1.0
-4.5 -5.0 -6.4 4.1 -4.4 1.8 -3.2 2.1
5.9 5.8 2.4 2.8 -2.0 5.9 3.2 1.1
-3.0 2.5 -1.0 0.7 4.1 -6.1 6.0 5.7

Q => Quantized 8x8 block

185 3 1 1 -3 2 -1 0
1 1 -1 0 -1 0 0 1
0 0 1 0 -1 0 0 0
1 1 0 -1 0 0 0 -1
0 0 1 0 0 0 -1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Zig-zag scan => Run-level coding

Mean of Block: 185

(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1)
(1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1)
(1,-1) (14,1) (9,-1) (0,-1) EOB

Transmission => Inverse zig-zag scan => Scaling and inverse DCT

Reconstructed 8x8 block

192 201 195 184 177 184 193 174
189 191 195 182 182 187 190 171
188 185 190 181 185 187 189 171
189 188 185 183 183 182 190 175
191 192 186 189 179 182 188 178
190 191 189 190 177 186 184 179
189 188 185 184 175 186 187 179
189 188 178 176 173 183 193 180
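The zig-zag scan and run-level coding steps of this pipeline can be sketched as follows. This is an illustrative sketch, assuming the common zig-zag order that starts by moving right from the DC position; the exact scan and code tables of a real codec are defined by its standard. The function names and the small 3x3 test block are made up for the example, and the DC value is returned separately, as in the "Mean of Block" line.

```python
def zigzag_order(n):
    """Zig-zag visiting order for an n x n block as (row, col) pairs."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd anti-diagonals run from top-right to bottom-left,
        # even ones from bottom-left to top-right.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_level_encode(block):
    """Return (dc, [(run_of_zeros, level), ...]); trailing zeros become EOB."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    dc, pairs, run = scan[0], [], 0
    for value in scan[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return dc, pairs


def run_level_decode(dc, pairs, n):
    """Inverse of run_level_encode: rebuild the n x n block."""
    scan = [dc]
    for run, level in pairs:
        scan.extend([0] * run + [level])
    scan.extend([0] * (n * n - len(scan)))  # zeros implied by EOB
    block = [[0] * n for _ in range(n)]
    for (r, c), value in zip(zigzag_order(n), scan):
        block[r][c] = value
    return block


small_block = [[9, 2, 0],
               [1, 0, 0],
               [0, -1, 0]]
dc, pairs = run_level_encode(small_block)
print(dc, pairs, "EOB")
```

Because the non-zero coefficients cluster near the DC corner, the scan produces short runs first and one long run of zeros at the end, which the EOB symbol replaces.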


DCT in Image Coding

Uniform deadzone quantizer

• Transform coefficients that fall below a threshold are discarded.

Entropy coding

• Positions of non-zero transform coefficients are transmitted in addition to their amplitude values.

• Efficient encoding of the positions of non-zero transform coefficients: zig-zag scan + run-level coding

[Figure: deadzone quantizer characteristic, quantizer input vs. quantizer output]
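A uniform deadzone quantizer can be sketched as a uniform quantizer whose rounding offset is smaller than 0.5, which widens the zero bin. The step size 8 and the offset 0.15 below are assumptions chosen to agree with the numbers in the 8x8 example of these slides; the slides themselves do not state the quantizer parameters.

```python
import math


def deadzone_quantize(coef, step, rounding_offset=0.15):
    """Map a coefficient to an integer index.

    With rounding_offset < 0.5 the zero bin is wider than the others,
    so small coefficients are discarded.
    """
    level = math.floor(abs(coef) / step + rounding_offset)
    return int(math.copysign(level, coef)) if level else 0


def dequantize(index, step):
    """Reconstruct the coefficient value from its index (scaling)."""
    return index * step


first_row = [1480, 26.0, 9.5, 8.9, -26.4, 15.1, -8.1, 0.3]
indices = [deadzone_quantize(c, step=8) for c in first_row]
print(indices)
print([dequantize(q, step=8) for q in indices])
```

With these assumed parameters, any coefficient with magnitude below 0.85 x 8 = 6.8 maps to zero, which is why most of the high-frequency coefficients in the example vanish.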


DCT Examples

Note that only a few coefficients have sizable values.

[Figure panels, each plotted over block indices 0 to 6 on both axes with values from -30 to 30: image block; DCT coefficients of block; quantized DCT coefficients of block; block reconstructed from quantized coefficients]


DCT coding with increasingly coarse quantization, block size 8x8

quantizer stepsize for AC coefficients: 25




5. Implementation

Implementation issues

• HW or SW

• Computational cost, speed, implementation size

• Performance, cost, implementation complexity

SW implementation decision factors

• Computational cost of multiplication

• Whether fixed-point or floating-point operations are used (esp. multiplication)

• Special coprocessors and instruction sets (e.g. MMX)


Fast DCT Algorithm

Original DCT/IDCT

Computation load (8-point 1-D transform)

• About 64 additions (exactly 8 x 7 = 56) + 64 multiplications

• 8 (7) additions + 8 multiplications per coefficient (from eqn.)

Scaling

• input range [0, 255] => output range within [-2048, 2047] for an 8x8 block (the DC term can reach 8 x 255 = 2040)

Fast DCT

• Similar to the fast DFT (FFT)

• Shares the same computations between nodes

• O(NxN) => O(N log2 N)

• N: width (number of coefficients)

• log2 N: steps of the algorithm

• Several versions: Chen, Lee, Arai, etc.


Chen’s FDCT

See Code at http://www.cmlab.csie.ntu.edu.tw/~chenhsiu/tech/fastdct.cpp


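How does the fast algorithm work? It exploits the symmetry of the cosine function.

For N = 8 (so that $C(k) = 1/2$ for $k \ne 0$), write out $F(2)$:

$$2F(2) = f_0\cos\tfrac{\pi}{8} + f_1\cos\tfrac{3\pi}{8} + f_2\cos\tfrac{5\pi}{8} + f_3\cos\tfrac{7\pi}{8} + f_4\cos\tfrac{9\pi}{8} + f_5\cos\tfrac{11\pi}{8} + f_6\cos\tfrac{13\pi}{8} + f_7\cos\tfrac{15\pi}{8}$$

STEP 1: group the terms that share the same cosine magnitude

$$2F(2) = (f_0 + f_7 - f_3 - f_4)\cos\tfrac{\pi}{8} + (f_1 + f_6 - f_2 - f_5)\cos\tfrac{3\pi}{8}$$

$$2F(6) = (f_0 + f_7 - f_3 - f_4)\cos\tfrac{3\pi}{8} - (f_1 + f_6 - f_2 - f_5)\cos\tfrac{\pi}{8}$$

STEP 2: share the intermediate sums between coefficients

$$D_1 = f_0 + f_7 - f_3 - f_4, \qquad D_2 = f_1 + f_6 - f_2 - f_5$$

$$2F(2) = D_1\cos\tfrac{\pi}{8} + D_2\cos\tfrac{3\pi}{8}, \qquad 2F(6) = D_1\cos\tfrac{3\pi}{8} - D_2\cos\tfrac{\pi}{8}$$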
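The symmetry argument can be verified numerically. The sketch below compares the defining sums for 2F(2) and 2F(6) (8 multiplications each) with the shared form built from D1 and D2 (4 multiplications in total). It only covers this one butterfly stage and is not an implementation of Chen's full algorithm; the function names are made up and the input is the first row of the original 8x8 block from the image coding example.

```python
import math


def direct_2f(f, k):
    """2*F(k) for N = 8 from the defining sum: 8 multiplications."""
    return sum(f[n] * math.cos((2 * n + 1) * k * math.pi / 16)
               for n in range(8))


def shared_2f2_2f6(f):
    """2*F(2) and 2*F(6) from the shared sums D1, D2: 4 multiplications."""
    d1 = f[0] + f[7] - f[3] - f[4]
    d2 = f[1] + f[6] - f[2] - f[5]
    c1 = math.cos(math.pi / 8)
    c3 = math.cos(3 * math.pi / 8)
    return d1 * c1 + d2 * c3, d1 * c3 - d2 * c1


f = [198, 202, 194, 179, 180, 184, 196, 168]
fast_f2, fast_f6 = shared_2f2_2f6(f)
print(direct_2f(f, 2), fast_f2)
print(direct_2f(f, 6), fast_f6)
```

The two routes agree to floating-point precision; the fast version trades multiplications for a few extra additions, which is the same idea the full fast algorithms apply at every stage.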


HW Implementation

2D DCT using a 1D DCT function block

[Block diagram: input sample, MUX, 1-D DCT, 8x8 RAM (row order input, column order output), output coef]


Distributed Arithmetic DCT

• Multiplier-less architecture

• Lookup, shift, and accumulators only

[Block diagram: 4 bits from u input => LUT (ROM) => add or subtract => accumulator with shift (2^-1) => output coef Fx]


IDCT Mismatch

DCT x IDCT = I ?

• DCT is defined in “floating point” and “direct form.”

• Integer implementation induces ‘error’ after the inverse DCT.

• Different fast DCT implementations have different ‘error’s.

DCT mismatch in MC-DCT

• Different reference images at encoder and decoder

• Very small error, but it accumulates.

[Block diagram: E_org => DCT => Q => VLC => VLD => IQ => IDCT_D => rec_D at the decoder; inside the encoder, Q => IQ => IDCT_E => rec_E. rec_E and rec_D should be equal, but mismatch!]


IDCT Mismatch control

• Minimum accuracy of the DCT algorithm is defined in the SPEC.

H.261/H.263, MPEG-1/2

• Restrict the sum of coefficient values

• Oddification rule for the sum of all DCT coefficients

• Adjust the LSB of F[63], the last coefficient, so that the sum becomes odd

• Decoder checks and corrects the values

H.264

• (Modified) integer DCT is used

adding random error cancellation
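The oddification rule on the coefficient sum can be sketched as below. This follows the sum-based rule as described in the bullets (in the style of MPEG-2 mismatch control); the exact procedure, including saturation and where it sits in the decoding chain, is defined by each standard, so treat this as an illustration. The function name `oddify_sum` is made up.

```python
def oddify_sum(coefs):
    """Return a copy of the 64 dequantized coefficients whose sum is odd.

    If the sum is even, the LSB of the last coefficient F[63] is toggled:
    an odd F[63] is decreased by 1, an even F[63] is increased by 1.
    """
    if len(coefs) != 64:
        raise ValueError("expected 64 coefficients")
    out = list(coefs)
    if sum(out) % 2 == 0:
        out[63] = out[63] - 1 if out[63] % 2 else out[63] + 1
    return out


all_zero = [0] * 64
already_odd = [185] + [0] * 63
even_with_odd_last = [1] + [0] * 62 + [3]

print(oddify_sum(all_zero)[63])            # last coefficient toggled to 1
print(oddify_sum(already_odd)[63])         # left unchanged
print(oddify_sum(even_with_odd_last)[63])  # 3 becomes 2
```

Since encoder and decoder apply the same rule, the change is part of the defined reconstruction; its purpose is to avoid inputs for which different IDCT implementations round in different directions.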

Appendix

KL Transform, The Optimal Transform


Optimal Transform

Optimality

• (No) redundancy in input signal => (no) redundant quantization

Result

• No cross-correlation between different components (coefs)

K-L (Karhunen-Loève) transform

Assumption

• Input covariance is given

$$R_{X,X} = E[X X^{*t}]$$

Problem definition

• Find a transform (Y = T X) such that $R_{Y,Y} = T R_{X,X} T^{*t}$ is a diagonal matrix (i.e., completely uncorrelated Y)

$$R_{Y,Y} = E[Y Y^{*t}] = E[T X (T X)^{*t}] = T R_{X,X} T^{*t} = \mathrm{diag}\{\lambda_k\}_{k=0}^{N-1}$$


Optimal Transform

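Solution

• Build T with the eigenvectors of $R_{X,X}$ as basis vectors

• Then, by the definition of eigenvectors and eigenvalues (of $R_{X,X}$)

$$R_{X,X}\,\phi_k = \lambda_k \phi_k, \quad k = 0, 1, \ldots, N-1$$

$$R_{X,X}\,[\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]\,\mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$$

• So, with

$$T_{optimal} = [\phi_0 \cdots \phi_{N-1}]^{*t}$$

$$R_{Y,Y} = [\phi_0 \cdots \phi_{N-1}]^{*t}\, R_{X,X}\, [\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]^{*t} [\phi_0 \cdots \phi_{N-1}]\,\mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\} = I\,\mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$$

Issue in KLT

• $R_{X,X}$ varies from image to image: a new T must be calculated and transmitted to the decoder

• Not separable (vertical, horizontal)

• But good for benchmarking the performance of other transforms.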
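For the two-pixel case, the KLT can be written down by hand, which ties the appendix back to the 45 degree rotation. The sketch below assumes a covariance matrix [[1, rho], [rho, 1]] with rho = 0.95 (an illustrative value for strongly correlated neighbors, not a figure from the slides). Its eigenvectors are (1, 1)/√2 and (-1, 1)/√2 with eigenvalues 1 + rho and 1 - rho.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


rho = 0.95  # assumed correlation between neighboring pixels
r_xx = [[1.0, rho],
        [rho, 1.0]]

c = 1 / math.sqrt(2)
t_optimal = [[c, c],    # eigenvector for eigenvalue 1 + rho
             [-c, c]]   # eigenvector for eigenvalue 1 - rho

# R_yy = T R_xx T^t should be diagonal, holding the eigenvalues.
r_yy = matmul(matmul(t_optimal, r_xx), transpose(t_optimal))
for row in r_yy:
    print([round(v, 6) for v in row])
```

The off-diagonal terms vanish, so Y1 and Y2 are uncorrelated, and nearly all of the variance (1 + rho out of 2) sits in the first coefficient.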
