Fast Algorithm for DCT(2)

Advanced Digital Signal Processing

Term paper ─ Tutorial

Fast Algorithms for Discrete Cosine and Sine Transforms

Student ID: R99943006

Name: 劉姿伶

Instructor: Prof. 丁建均


Contents

I. Abstract

II. Introduction of DCT/DST

III. Fast Algorithm for 1-D DCT/DST

1. The explicit forms of orthonormal DCT/DST

2. 1-D DCT-I/DST-I

A. The split-radix DCT-I algorithm

B. The split-radix DST-I algorithm

3. 1-D DCT-II/DST-II

A. The split-radix DCT-II algorithm

B. DCT-II computation via Walsh Transform

IV. Fast Algorithm for 2-D DCT

1. Introduction of 2-D DCT

2. Fast Algorithm for 2-D DCT-II

V. Conclusion

VI. References


I. Abstract

The discrete cosine transform (DCT) and the discrete sine transform (DST) are widely used in many areas. Since the DCT became the standard transform for image and video compression and processing, fast algorithms for the DCT/DST have become increasingly important. In this tutorial, we first analyze the properties of the DCT and DST. We then introduce split-radix fast algorithms for the 1-D DCT-I, DST-I and DCT-II/DST-II, and show another fast algorithm that computes the DCT-II via the Walsh transform. Finally, we extend the concept of the 1-D fast algorithms to the 2-D case.

II. Introduction of DCT and DST

A DCT/DST expresses a sequence of data as a sum of cosine/sine functions oscillating at different frequencies. DCTs are important in many areas, such as lossy compression of audio and images and spectral methods for the numerical solution of partial differential equations. Moreover, when the input is an even or odd function, the DCT-I or DST-I, respectively, can replace the discrete Fourier transform (DFT) in spectral analysis to reduce the computational complexity [2]. There are eight types each of DCT and DST, of which four types of each are in common use.

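The four common types of DCT are written in matrix form and denoted by $C^{I}_{N+1}$, $C^{II}_{N}$, $C^{III}_{N}$ and $C^{IV}_{N}$. They are defined as

$$[C^{I}_{N+1}]_{kn} = \sqrt{\tfrac{2}{N}}\,\epsilon_k \epsilon_n \cos\frac{\pi k n}{N}, \quad k, n = 0, 1, \dots, N,$$

$$[C^{II}_{N}]_{kn} = \sqrt{\tfrac{2}{N}}\,\epsilon_k \cos\frac{\pi (2n+1) k}{2N}, \quad k, n = 0, 1, \dots, N-1,$$

$$[C^{III}_{N}]_{kn} = \sqrt{\tfrac{2}{N}}\,\epsilon_n \cos\frac{\pi (2k+1) n}{2N}, \quad k, n = 0, 1, \dots, N-1,$$

$$[C^{IV}_{N}]_{kn} = \sqrt{\tfrac{2}{N}} \cos\frac{\pi (2k+1)(2n+1)}{4N}, \quad k, n = 0, 1, \dots, N-1,$$

where $\epsilon_p = 1/\sqrt{2}$ for $p = 0$ or $p = N$, and $\epsilon_p = 1$ otherwise.

The corresponding four types of DST, denoted in matrix form by $S^{I}_{N-1}$, $S^{II}_{N}$, $S^{III}_{N}$ and $S^{IV}_{N}$, are respectively defined as

$$[S^{I}_{N-1}]_{kn} = \sqrt{\tfrac{2}{N}} \sin\frac{\pi (k+1)(n+1)}{N}, \quad k, n = 0, 1, \dots, N-2,$$

$$[S^{II}_{N}]_{kn} = \sqrt{\tfrac{2}{N}}\,\tilde{\epsilon}_k \sin\frac{\pi (2n+1)(k+1)}{2N}, \quad k, n = 0, 1, \dots, N-1,$$

$$[S^{III}_{N}]_{kn} = \sqrt{\tfrac{2}{N}}\,\tilde{\epsilon}_n \sin\frac{\pi (2k+1)(n+1)}{2N}, \quad k, n = 0, 1, \dots, N-1,$$

$$[S^{IV}_{N}]_{kn} = \sqrt{\tfrac{2}{N}} \sin\frac{\pi (2k+1)(2n+1)}{4N}, \quad k, n = 0, 1, \dots, N-1,$$

where $\tilde{\epsilon}_q = 1/\sqrt{2}$ for $q = N-1$, and $\tilde{\epsilon}_q = 1$ otherwise.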

Some important properties of these four DCTs and four DSTs are as follows:

They are real-valued and orthonormal.

They are intrinsically related to the generalized DFT [3].

The inverse of each DCT/DST matrix is simply its transpose.
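
The orthonormality property can be checked numerically. The following is a minimal sketch, assuming the standard orthonormal DCT-I and DCT-II definitions of [1] (end-point weights of 1/sqrt(2)); the function names `dct1_matrix`, `dct2_matrix` and `orthonormality_error` are hypothetical helpers introduced only for this illustration.

```python
import math

def dct2_matrix(N):
    """Orthonormal N x N DCT-II matrix; row k is the frequency index, column n the time index."""
    rows = []
    for k in range(N):
        eps = 1 / math.sqrt(2) if k == 0 else 1.0
        rows.append([math.sqrt(2 / N) * eps * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                     for n in range(N)])
    return rows

def dct1_matrix(N):
    """Orthonormal (N+1) x (N+1) DCT-I matrix with weights 1/sqrt(2) at indices 0 and N."""
    def eps(p):
        return 1 / math.sqrt(2) if p in (0, N) else 1.0
    return [[math.sqrt(2 / N) * eps(k) * eps(n) * math.cos(math.pi * k * n / N)
             for n in range(N + 1)] for k in range(N + 1)]

def orthonormality_error(M):
    """Largest absolute entry of M M^T - I; close to zero means inverse == transpose."""
    size = len(M)
    return max(abs(sum(M[i][t] * M[j][t] for t in range(size)) - (1.0 if i == j else 0.0))
               for i in range(size) for j in range(size))

for name, M in (("DCT-I", dct1_matrix(8)), ("DCT-II", dct2_matrix(8))):
    print(name, "max |M M^T - I| =", orthonormality_error(M))
```

Because the error is at the level of floating-point rounding, the inverse transform can be applied by multiplying with the transposed matrix.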

Since covering all of the DCTs and DSTs mentioned above would be too broad, we focus only on the type-I and type-II DCT/DST, which were discussed in class. The type-I DCT/DST can replace the DFT to reduce the computational complexity, and the type-II DCT is mainly used in image compression because of its excellent energy compaction.
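
The replacement of the DFT by the DCT-I for even input can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses an un-normalized DCT-I in which the two end samples are counted once and the interior samples twice (a convention chosen here so that it matches the DFT exactly), and the helper names `dft` and `dct1_even_weighting` are hypothetical.

```python
import cmath
import math

def dft(y):
    """Direct DFT, used only as a reference."""
    M = len(y)
    return [sum(y[n] * cmath.exp(-2j * math.pi * n * k / M) for n in range(M))
            for k in range(M)]

def dct1_even_weighting(x):
    """X[k] = x[0] + (-1)^k x[N] + 2 * sum_{n=1}^{N-1} x[n] cos(pi n k / N), k = 0..N."""
    N = len(x) - 1
    return [x[0] + (-1) ** k * x[N]
            + 2 * sum(x[n] * math.cos(math.pi * n * k / N) for n in range(1, N))
            for k in range(N + 1)]

x = [1.0, 2.0, -0.5, 3.0, 0.25]      # N + 1 = 5 real samples
y = x + x[-2:0:-1]                   # even extension of length 2N = 8
Y = dft(y)                           # complex 2N-point DFT of the even sequence
X = dct1_even_weighting(x)           # real (N+1)-point DCT-I of the half sequence

print(max(abs(Y[k] - X[k]) for k in range(len(x))))
```

The DFT of the even extension is real, and its first N+1 values coincide with the DCT-I of the half-length sequence, so only real arithmetic on about half of the data is needed.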

Moreover, there are relations between a DCT and its corresponding DST that can be used to reduce the implementation complexity. Specifically, the matrix $S^{II}_{N}$ is related to the matrix $C^{II}_{N}$ by [4, 5]

$$S^{II}_{N} = J_N C^{II}_{N} D_N,$$

where $J_N$ is the cross-identity (reflection) matrix and $D_N = \mathrm{diag}\{(-1)^k\}$, $k = 0, 1, \dots, N-1$. Therefore, an efficient implementation of the DST-II can be obtained from that of the DCT-II by proper order reversal and sign changes. Consequently, in the following chapter, only the DCT-I, DST-I and DCT-II are discussed.
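
The relation between the DST-II and the DCT-II can be verified with a few lines of code. This is a minimal sketch, assuming the standard orthonormal DCT-II and DST-II definitions of [1]; `dct2_matrix`, `dst2_matrix` and `dst2_from_dct2` are hypothetical names used only here.

```python
import math

def dct2_matrix(N):
    """Orthonormal DCT-II: sqrt(2/N) * eps_k * cos(pi (2n+1) k / (2N)), eps_0 = 1/sqrt(2)."""
    return [[math.sqrt(2 / N) * (1 / math.sqrt(2) if k == 0 else 1.0)
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
             for n in range(N)] for k in range(N)]

def dst2_matrix(N):
    """Orthonormal DST-II: sqrt(2/N) * eps_k * sin(pi (2n+1)(k+1) / (2N)), eps_{N-1} = 1/sqrt(2)."""
    return [[math.sqrt(2 / N) * (1 / math.sqrt(2) if k == N - 1 else 1.0)
             * math.sin(math.pi * (2 * n + 1) * (k + 1) / (2 * N))
             for n in range(N)] for k in range(N)]

def dst2_from_dct2(N):
    """Apply J (reverse the row order) and D = diag((-1)^n) (flip odd columns) to the DCT-II."""
    C = dct2_matrix(N)
    return [[C[N - 1 - k][n] * (-1) ** n for n in range(N)] for k in range(N)]

N = 8
S, S_from_C = dst2_matrix(N), dst2_from_dct2(N)
print(max(abs(S[k][n] - S_from_C[k][n]) for k in range(N) for n in range(N)))
```

In terms of signals, this means a DST-II can be computed by negating the odd-indexed input samples, running any fast DCT-II, and reading the output in reverse order.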

III. Fast Algorithm for 1-D DCT/DST

1. The explicit forms of orthonormal DCT/DST

Before introducing the fast algorithms for the DCT/DST, we first derive the explicit forms of the DCT/DST matrices for N = 2, 4 and 8. The explicit forms reveal the symmetry and anti-symmetry properties of the DCT/DST matrices, which the fast algorithms exploit. Moreover, the higher-order matrices (i.e. those with N > 8) can be generated from the lower-order ones (i.e. those with N = 2, 4 and 8) by recursive sparse matrix factorization.

For N = 2, 4 and 8, the explicit forms of the DCT-I matrix $C^{I}_{N+1}$ defined in Chapter II are as follows:

The elements of the DST-I matrix $S^{I}_{N-1}$ are defined in Chapter II. For N = 2, the matrix is trivial, $S^{I}_{1} = (1)$, and for N = 4 and 8 the explicit forms are as follows:

Finally, for N = 2, 4 and 8, the explicit forms of the DCT-II matrix $C^{II}_{N}$ defined in Chapter II are as follows:

By investigating the rows of the matrices $C^{I}_{N+1}$, $S^{I}_{N-1}$ and $C^{II}_{N}$, their symmetry properties can be found. The location of the symmetry center is determined by the number of rows (or, equivalently, the number of columns, because the matrices are square). Here we call the number of rows of a matrix its order. When the order is even, the symmetry center lies midway between the two center elements of a row; when the order is odd, the symmetry center is the middle element of the row. Denote the $n$-th element of the $k$-th row of a matrix of order $M$ by $a_{kn}$, $k, n = 0, 1, \dots, M-1$. Then the row is symmetric if $a_{kn} = a_{k,M-1-n}$ and anti-symmetric if $a_{kn} = -a_{k,M-1-n}$. We can see that even-indexed rows are symmetric and odd-indexed rows are anti-symmetric, which can be expressed as follows:

$$a_{k,M-1-n} = (-1)^{k} a_{kn}, \quad k, n = 0, 1, \dots, M-1.$$

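According to Refs. [6] and [7], after proper bit-reversal reordering, a matrix of even order $N$ whose rows have this symmetry property can be factorized into sparse matrices in the following two forms:

$$\begin{bmatrix} A_{N/2} & \bar{A}_{N/2} \\ B_{N/2} & -\bar{B}_{N/2} \end{bmatrix} = \begin{bmatrix} A_{N/2} & 0 \\ 0 & B_{N/2} \end{bmatrix} \begin{bmatrix} I_{N/2} & J_{N/2} \\ I_{N/2} & -J_{N/2} \end{bmatrix} = \begin{bmatrix} A_{N/2} & 0 \\ 0 & \bar{B}_{N/2} \end{bmatrix} \begin{bmatrix} I_{N/2} & J_{N/2} \\ J_{N/2} & -I_{N/2} \end{bmatrix},$$

where $A_{N/2}$ and $B_{N/2}$ are matrices of order $N/2$ associated with the symmetric and anti-symmetric rows, respectively, and $\bar{A}_{N/2} = A_{N/2} J_{N/2}$ and $\bar{B}_{N/2} = B_{N/2} J_{N/2}$ are the results of multiplying $A_{N/2}$ and $B_{N/2}$ by the reflection matrix $J_{N/2}$.

In addition to even-order matrices, sparse matrix factorization can also be applied to odd-order matrices, except for the center rows and columns. The symmetry and anti-symmetry of the rows of the DCT and DST matrices are the key properties used in factorizing them.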
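
The row symmetry that drives this factorization can be checked numerically for N = 2, 4 and 8. The sketch below assumes the standard orthonormal definitions of [1] with rows and columns indexed from 0; the function names are hypothetical and serve only this illustration.

```python
import math

def dct1_matrix(N):
    """(N+1) x (N+1) orthonormal DCT-I matrix."""
    def eps(p):
        return 1 / math.sqrt(2) if p in (0, N) else 1.0
    return [[math.sqrt(2 / N) * eps(k) * eps(n) * math.cos(math.pi * k * n / N)
             for n in range(N + 1)] for k in range(N + 1)]

def dst1_matrix(N):
    """(N-1) x (N-1) orthonormal DST-I matrix."""
    return [[math.sqrt(2 / N) * math.sin(math.pi * (k + 1) * (n + 1) / N)
             for n in range(N - 1)] for k in range(N - 1)]

def dct2_matrix(N):
    """N x N orthonormal DCT-II matrix."""
    return [[math.sqrt(2 / N) * (1 / math.sqrt(2) if k == 0 else 1.0)
             * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
             for n in range(N)] for k in range(N)]

def rows_alternate_symmetry(M, tol=1e-12):
    """True if a[k][order-1-n] == (-1)^k * a[k][n] for every row k and column n."""
    order = len(M)
    return all(abs(M[k][order - 1 - n] - (-1) ** k * M[k][n]) < tol
               for k in range(order) for n in range(order))

for N in (2, 4, 8):
    print(N, rows_alternate_symmetry(dct1_matrix(N)),
          rows_alternate_symmetry(dst1_matrix(N)),
          rows_alternate_symmetry(dct2_matrix(N)))
```

For the odd-order matrices, the anti-symmetric rows necessarily have a zero at the center column, which is why the center row and column are handled separately in the factorization.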

2. 1-D DCT-I/DST-I

As mentioned before, the DCTs/DSTs are intrinsically related to the generalized DFT. The fast algorithm for the DFT, usually called the fast Fourier transform (FFT), has been studied for decades, so its concepts can also be exploited in fast algorithms for the DCT/DST. In the following sections, one FFT variant, the split-radix FFT, is extended to the DCT-I, DST-I and DCT-II. We therefore introduce its concept first.

The split-radix FFT algorithm is a variant of the Cooley-Tukey FFT algorithm [8]. Instead of a single radix, it uses a blend of radices 2 and 4: it recursively decomposes an N-point DFT into one N/2-point DFT and two N/4-point DFTs. By using different radices in the decomposition, the split-radix FFT requires fewer arithmetic operations than radix-$2^m$ FFTs, where m is a positive integer.
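
The decomposition into one N/2-point and two N/4-point DFTs can be written as a short recursive routine. This is a minimal sketch of the textbook split-radix recursion for power-of-two lengths, not an optimized implementation; the names `split_radix_fft` and `naive_dft` are hypothetical, and the twiddle factors are recomputed on every call for clarity.

```python
import cmath
import math

def split_radix_fft(x):
    """Recursive split-radix FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]
    U = split_radix_fft(x[0::2])     # N/2-point DFT of the even-indexed samples
    Z = split_radix_fft(x[1::4])     # N/4-point DFT of the samples x[4n+1]
    Zp = split_radix_fft(x[3::4])    # N/4-point DFT of the samples x[4n+3]
    q = N // 4
    X = [0j] * N
    for k in range(q):
        a = cmath.exp(-2j * math.pi * k / N) * Z[k]        # w^k   * Z[k]
        b = cmath.exp(-2j * math.pi * 3 * k / N) * Zp[k]   # w^3k  * Z'[k]
        X[k] = U[k] + (a + b)
        X[k + 2 * q] = U[k] - (a + b)
        X[k + q] = U[k + q] - 1j * (a - b)
        X[k + 3 * q] = U[k + q] + 1j * (a - b)
    return X

def naive_dft(x):
    """Direct O(N^2) DFT used as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
            for k in range(N)]

signal = [complex(math.sin(n), math.cos(3 * n)) for n in range(16)]
print(max(abs(a - b) for a, b in zip(split_radix_fft(signal), naive_dft(signal))))
```

The recursive form returns the output in natural order; an in-place flow-graph implementation of the same decomposition typically leaves the output in bit-reversed order instead.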

A. The split-radix DCT-I algorithm

The idea of the split-radix FFT algorithm can be extended to the DCT-I [9]. The formulae of the split-radix fast DCT-I algorithm, without normalization factors, can be derived as follows:

Therefore, the first stage decomposes the (N+1)-point DCT-I into one (N/2+1)-point DCT-I, one (N/4+1)-point DCT-I and one (N/4-1)-point DST-I. The decomposition works recursively, and the final output sequence is in bit-reversed order. The generalized signal flow graph with regular structure is shown in Fig. 3.1 for N = 2, 4 and 8.
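
The (N/2+1)-point DCT-I in this first stage produces the even-indexed outputs. The sketch below checks only that part of the decomposition, under the assumption that the un-normalized DCT-I is X[k] = sum over n = 0..N of x[n] cos(pi n k / N) with no end-point weights; the names `dct1` and `dct1_even_outputs` are hypothetical, and the odd-indexed outputs (the smaller DCT-I and DST-I) are not reproduced here.

```python
import math

def dct1(x):
    """Un-normalized (N+1)-point DCT-I without end-point weights (assumed convention)."""
    N = len(x) - 1
    return [sum(x[n] * math.cos(math.pi * n * k / N) for n in range(N + 1))
            for k in range(N + 1)]

def dct1_even_outputs(x):
    """X[0], X[2], ..., X[N] from one (N/2+1)-point DCT-I of the folded input."""
    N = len(x) - 1
    h = N // 2
    # Fold the input about its center: pair x[n] with x[N-n]; the center sample stays alone.
    u = [x[n] + x[N - n] for n in range(h)] + [x[h]]
    return dct1(u)

x = [2.0, -1.0, 0.5, 3.0, 1.5, -2.0, 4.0, 0.0, 1.0]   # N + 1 = 9 samples, N = 8
full = dct1(x)
print(max(abs(a - b) for a, b in zip(full[0::2], dct1_even_outputs(x))))
```

The folding costs only N/2 additions, after which the problem size is roughly halved, which is the same mechanism the split-radix FFT uses for its N/2-point sub-transform.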

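Based on the split-radix algorithm, the matrix $C^{I}_{N+1}$ can be decomposed recursively as follows:

In this factorization, the permutation matrix reorders the output from bit-reversed to natural order.

Fig. 3.1. Generalized signal flow graph of the split-radix DCT-I algorithm for N = 2, 4 and 8.

The un-normalized DST-I and DCT-I matrices that appear in the factorization are, respectively, given by

where the permutation factors are bit-reversal permutation matrices and the remaining factor is a rotation matrix as follows: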

B. The split-radix DST-I algorithm

The idea of the split-radix FFT algorithm used in the last section can be extended to the DST-I as well. The formulae of the split-radix fast DST-I algorithm, without normalization factors, can be expressed as follows:

Thus, the first stage decomposes the (N-1)-point DST-I into one (N/2-1)-point DST-I, one (N/4+1)-point DCT-I and one (N/4-1)-point DST-I. The decomposition works recursively, and the final output sequence is in bit-reversed order. Since the DST-I matrix for N = 2 is trivial, the generalized signal flow graph with regular structure is shown only for N = 4 and 8 in Fig. 3.2.
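
The smaller DST-I in this first stage again corresponds to the even-indexed outputs. The following sketch checks that step, taking the un-normalized DST-I to act on the N-1 samples x[1], ..., x[N-1] as S[k] = sum over n = 1..N-1 of x[n] sin(pi n k / N); this convention and the names `dst1` and `dst1_even_outputs` are assumptions made for the illustration.

```python
import math

def dst1(x):
    """Un-normalized (N-1)-point DST-I; x[i] holds sample n = i + 1, output i holds k = i + 1."""
    N = len(x) + 1
    return [sum(x[n - 1] * math.sin(math.pi * n * k / N) for n in range(1, N))
            for k in range(1, N)]

def dst1_even_outputs(x):
    """S[2], S[4], ..., S[N-2] from one (N/2-1)-point DST-I of the folded input."""
    N = len(x) + 1
    h = N // 2
    # Anti-symmetric fold: x[n] - x[N-n]; the center sample x[N/2] drops out because sin(pi k) = 0.
    v = [x[n - 1] - x[N - n - 1] for n in range(1, h)]
    return dst1(v)

x = [1.0, -2.0, 0.5, 4.0, -1.5, 3.0, 2.0]     # N - 1 = 7 samples, N = 8
full = dst1(x)                                 # S[1..7]
print(max(abs(a - b) for a, b in zip(full[1::2], dst1_even_outputs(x))))
```

Compared with the DCT-I case, the fold uses differences instead of sums, and the sub-transform has N/2-1 points because the center sample does not contribute to the even-indexed outputs.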

Fig. 3.2. Generalized signal flow graph of the split-radix DST-I algorithm for N = 4 and 8.

Based on the split-radix algorithm, the matrix $S^{I}_{N-1}$ can be decomposed recursively as follows:

In this factorization, the permutation matrix reorders the output from bit-reversed to natural order. The un-normalized DST-I and DCT-I matrices that appear in it are, respectively, given by

where the permutation factors are bit-reversal permutation matrices and the remaining factor is a rotation matrix as follows:

3. 1-D DCT-II/DST-II

A. The split-radix DCT-II algorithm

In addition to the 1-D DCT-I and 1-D DST-I, the idea of the split-radix FFT algorithm can be extended to the DCT-II as well. The formulae of the split-radix fast DCT-II algorithm, without normalization factors, can be expressed as follows:

Fig. 3.3. Generalized signal flow graph of the split-radix DCT-II algorithm for N = 2, 4 and 8.

Hence, the first stage decomposes the N-point DCT-II into one N/2-point DCT-II, one N/4-point DCT-II and one N/4-point DST-II. The decomposition works recursively, and the final output sequence is in bit-reversed order. The generalized signal flow graph with regular structure is shown in Fig. 3.3 for N = 2, 4 and 8. Moreover, the matrix $C^{II}_{N}$ can be decomposed recursively as follows:

where the permutation matrix in the factorization is a bit-reversal permutation matrix that reorders the output from bit-reversed to natural order. The remaining matrix factor is given as follows:

where the rotation matrix is as follows:
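
The first stage of the DCT-II decomposition described in this subsection can be checked at the signal level. The sketch below is deliberately incomplete with respect to the split-radix algorithm: it computes the even-indexed outputs with an N/2-point DCT-II, as in the text, but leaves the odd-indexed outputs as a single N/2-point DCT-IV instead of splitting them further into the two N/4-point transforms. The un-normalized conventions and the names `dct2`, `dct4` and `dct2_first_stage` are assumptions of this illustration.

```python
import math

def dct2(x):
    """Un-normalized DCT-II: X[k] = sum_n x[n] cos(pi (2n+1) k / (2N))."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dct4(x):
    """Un-normalized DCT-IV: X[k] = sum_n x[n] cos(pi (2n+1)(2k+1) / (4N))."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (2 * n + 1) * (2 * k + 1) / (4 * N)) for n in range(N))
            for k in range(N)]

def dct2_first_stage(x):
    """One butterfly stage: even outputs via DCT-II, odd outputs via DCT-IV, both of size N/2."""
    N = len(x)
    h = N // 2
    s = [x[n] + x[N - 1 - n] for n in range(h)]   # symmetric part feeds the even outputs
    d = [x[n] - x[N - 1 - n] for n in range(h)]   # anti-symmetric part feeds the odd outputs
    X = [0.0] * N
    X[0::2] = dct2(s)
    X[1::2] = dct4(d)
    return X

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
print(max(abs(a - b) for a, b in zip(dct2(x), dct2_first_stage(x))))
```

Here the output is written back in natural order for easy comparison; a flow-graph implementation would instead keep the even and odd halves contiguous, which leads to the bit-reversed output order mentioned in the text.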

B. DCT-II computation via Walsh Transform

Besides the split-radix algorithm, the Walsh transform (WT), another kind of generalized DFT, can also be exploited for the fast computation of the DCT. The idea behind the fast DCT-II algorithm via the WT is that the DCT-II matrix, with its symmetric/anti-symmetric rows, can be realized through a simpler matrix that has the same symmetry structure. Since the WT matrix has the same symmetry/anti-symmetry properties as the DCT-II matrix and its elements are only $\pm 1$, the WT matrix is an ideal intermediate for computing the DCT-II.

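Denote the DCT-II and the WT of an input vector $x$ in matrix-vector notation, without the normalization factors, as

$$X_c = C^{II}_{N}\, x, \qquad X_w = W_N\, x.$$

If we rearrange the rows of $C^{II}_{N}$ and $W_N$, and denote the rearranged matrices and output vectors by a tilde, the formulae can be expressed as

$$\tilde{X}_c = \tilde{C}^{II}_{N}\, x, \qquad \tilde{X}_w = \tilde{W}_N\, x.$$

Since $\tilde{W}_N$ is an orthonormal matrix, $x = \tilde{W}_N^{T} \tilde{X}_w$, and substituting this relation into the last equation gives

$$\tilde{X}_c = \tilde{C}^{II}_{N} \tilde{W}_N^{T} \tilde{X}_w = T_N \tilde{X}_w,$$

where $T_N$ is the conversion matrix that maps a Walsh-domain vector to the DCT-II domain. Therefore, this fast algorithm has two steps: first, compute the WT of the input vector; second, convert from the Walsh domain to the DCT domain via $T_N$. The signal flow graph for the DCT-II computation via the WT for N = 8 is shown in Fig. 3.4(a) and (b).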

Fig. 3.4(a). Step 1: Compute the WT of the input vector.

Fig. 3.4(b). Step 2: Convert from the Walsh domain to the DCT domain via the conversion matrix.
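
The two-step computation can be sketched as follows. This is an illustration under stated assumptions rather than the flow graph of Fig. 3.4: the Walsh matrix is taken in Sylvester (Hadamard) ordering with entries +1/-1, so the factor 1/N is applied explicitly; the DCT-II is un-normalized; and the conversion matrix is formed by a plain matrix product instead of a fast sparse structure. All function names are hypothetical.

```python
import math

def hadamard(N):
    """Sylvester-ordered Walsh-Hadamard matrix with +1/-1 entries (N a power of two)."""
    H = [[1]]
    while len(H) < N:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def dct2_matrix(N):
    """Un-normalized DCT-II matrix."""
    return [[math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
            for k in range(N)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def conversion_matrix(N):
    """T = C H^T / N, so that C x = T (H x) because H^T H = N I."""
    C, H = dct2_matrix(N), hadamard(N)
    return [[sum(C[k][n] * H[m][n] for n in range(N)) / N for m in range(N)]
            for k in range(N)]

N = 8
x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
T = conversion_matrix(N)
X_walsh = matvec(hadamard(N), x)      # step 1: Walsh transform (additions and subtractions only)
X_dct = matvec(T, X_walsh)            # step 2: convert from the Walsh domain to the DCT domain
direct = matvec(dct2_matrix(N), x)

print(max(abs(a - b) for a, b in zip(X_dct, direct)))
print(sum(1 for row in T for t in row if abs(t) > 1e-12), "nonzero entries out of", N * N)
```

Because a symmetric DCT-II row is orthogonal to every anti-symmetric Walsh row and vice versa, at least half of the entries of the conversion matrix vanish in any ordering; with a suitable row ordering the matrix becomes block diagonal, which is what makes the second step cheap.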

IV. Fast Algorithm for 2-D DCT

1. Introduction of 2-D DCT

The most common and significant use of the 2-D DCT is the application of the DCT-II (typically on 8 × 8 blocks) in the compression of digital images and video. Therefore, among the eight types of DCT and their corresponding DSTs, the 2-D DCT-II is the one on which research efforts have concentrated. This tutorial is no exception: we focus only on fast algorithms for the 2-D DCT-II.

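The 2-D DCT-II of an $N \times N$ input data matrix $\{x_{mn}\}$, $m, n = 0, 1, \dots, N-1$, is defined as

$$X_{kl} = \frac{2}{N}\,\epsilon_k \epsilon_l \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x_{mn} \cos\frac{(2m+1)k\pi}{2N} \cos\frac{(2n+1)l\pi}{2N}, \quad k, l = 0, 1, \dots, N-1,$$

and the inverse 2-D DCT-II as

$$x_{mn} = \frac{2}{N} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \epsilon_k \epsilon_l X_{kl} \cos\frac{(2m+1)k\pi}{2N} \cos\frac{(2n+1)l\pi}{2N}, \quad m, n = 0, 1, \dots, N-1,$$

where $\epsilon_p = 1/\sqrt{2}$ for $p = 0$ and $\epsilon_p = 1$ otherwise.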

2. Fast algorithm for 2-D DCT-II

Generally, existing fast 2-D DCT-II algorithms follow one of two approaches: indirect or direct. In the indirect approach, the 2-D DCT is computed via another 2-D discrete orthogonal transform of the same size, such as the 2-D DFT. The direct approach comprises two methods that compute the 2-D DCT directly. The first is the row-column method, which exploits the separability of the 2-D DCT kernel by applying any fast 1-D DCT algorithm to the rows and then to the columns of the input matrix. Hence, for an $N \times N$ input data matrix, where $N = 2^m$, the row-column method requires $2N$ 1-D DCTs, i.e. $N^2 \log_2 N$ multiplications in total when each $N$-point 1-D DCT uses $(N/2)\log_2 N$ multiplications. The second is a 2-D vector method [10] that uses a 2-D decomposition process. Although it is computationally more efficient than the row-column method, in this tutorial we introduce only the row-column method, because it builds on the 1-D fast DCT algorithms from the previous sections and is easier to understand.
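
The separability that the row-column method relies on can be demonstrated directly. The sketch below assumes the orthonormal 2-D DCT-II (scale factor 2/N with weights 1/sqrt(2) at index 0) and uses a plain O(N^2) 1-D DCT-II as a stand-in for any fast 1-D algorithm; the function names are hypothetical.

```python
import math

def dct2_1d(v):
    """Orthonormal 1-D DCT-II (direct form; a fast algorithm could be substituted here)."""
    N = len(v)
    return [math.sqrt(2 / N) * (1 / math.sqrt(2) if k == 0 else 1.0)
            * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
            for k in range(N)]

def dct2_2d_row_column(x):
    """Row-column method: 1-D DCT-II on every row, then on every column of the result."""
    rows = [dct2_1d(r) for r in x]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]   # zip(*rows) iterates over the columns
    return [list(r) for r in zip(*cols)]            # transpose back to [k1][k2] layout

def dct2_2d_direct(x):
    """Direct evaluation of the 2-D definition, used only as a reference."""
    N = len(x)
    def eps(p):
        return 1 / math.sqrt(2) if p == 0 else 1.0
    return [[(2 / N) * eps(k1) * eps(k2)
             * sum(x[n1][n2]
                   * math.cos(math.pi * (2 * n1 + 1) * k1 / (2 * N))
                   * math.cos(math.pi * (2 * n2 + 1) * k2 / (2 * N))
                   for n1 in range(N) for n2 in range(N))
             for k2 in range(N)] for k1 in range(N)]

block = [[float((3 * i + 5 * j) % 7) for j in range(4)] for i in range(4)]
A, B = dct2_2d_row_column(block), dct2_2d_direct(block)
print(max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4)))
```

The row-column version performs 2N one-dimensional transforms of length N, whereas the direct evaluation needs a double sum for each of the N^2 outputs.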

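In the row-column method [11], the transform kernel of the 2-D DCT-II, with input indices $m, n$ and output indices $k, l$, is rewritten by the product-to-sum identity in the form

$$\cos\frac{(2m+1)k\pi}{2N} \cos\frac{(2n+1)l\pi}{2N} = \frac{1}{2}\left\{ \cos\frac{[(2m+1)k + (2n+1)l]\pi}{2N} + \cos\frac{[(2m+1)k - (2n+1)l]\pi}{2N} \right\}.$$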

Thus, the number of multiplications is reduced, and only those needed to compute the 1-D DCT-IIs remain. The 2-D DCT-II therefore requires data reordering and mapping before the 1-D DCT-IIs, followed by a regular butterfly structure of additions after them. The corresponding signal flow graph for the 2-D DCT-II computation is shown in Fig. 4.1.

Fig. 4.1. Signal flow graph for the 2-D DCT-II computation. Any fast 1-D DCT-II algorithm can be used.

V. Conclusion

In this tutorial, we first analyzed the symmetry and anti-symmetry properties of the DCT and DST. We then introduced split-radix fast algorithms for the 1-D DCT-I, DST-I and DCT-II/DST-II, and showed another fast algorithm that computes the DCT-II via the Walsh transform. Finally, we extended the concept of the 1-D fast algorithms to the 2-D case by using the separability property.

VI. References

[1] V. Britanak, P. C. Yip and K. R. Rao, Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations, Academic Press, 2007.

[2] Prof. 丁建均, “Advanced Digital Signal Processing,” course lecture notes.

[3] S. A. Martucci, “Symmetric convolution and the discrete sine and cosine transforms,” IEEE Transactions on Signal Processing, Vol. 42, No. 5, May 1994, pp. 1038–1051.

[4] Z. Wang, “A fast algorithm for the discrete sine transform implemented by the fast cosine transform,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-30, October 1982, pp. 814–815.

[5] Z. Wang, “Fast algorithms for the discrete W transform and for the discrete Fourier transform,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-32, August 1984, pp. 803–816.

[6] S. Venkataraman, V. R. Kanchan, K. R. Rao and M. Mohanty, “Discrete transforms via the Walsh–Hadamard transform,” Signal Processing, Vol. 14, No. 4, June 1988, pp. 371–382.

[7] G. Plonka and M. Tasche, “Fast and numerically stable algorithms for discrete cosine transforms,” Linear Algebra and its Applications, Vol. 394, No. 1, January 2005, pp. 309–345.

[8] “Split-radix FFT algorithm,” Wikipedia, http://en.wikipedia.org/wiki/Split-radix_FFT_algorithm

[9] C. W. Sun and P. Yip, “Split-radix algorithms for DCT and DST,” Proceedings of the Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, November 1989, pp. 508–512.

[10] S. C. Chan and K. L. Ho, “A new two-dimensional fast cosine transform algorithm,” IEEE Transactions on Signal Processing, Vol. 39, No. 2, February 1991, pp. 481–485.

[11] N. I. Cho and S. U. Lee, “Fast algorithm and implementation of 2-D discrete cosine transform,” IEEE Transactions on Circuits and Systems, Vol. 38, No. 3, March 1991, pp. 297–305.
