Advanced Digital Signal Processing
Term paper ─ Tutorial
Fast Algorithms for Discrete Cosine
and Sine Transforms
學號:R99943006
姓名:劉姿伶老師:丁建均 教授
1
Contents
I. Abstract 3
II. Introduction of DCT/DST 3
III. Fast Algorithm for 1-D DCT/DST1. The explicit forms of orthonormal DCT/DST 5
2. 1-D DCT-I/DST-I 8
A. The split-radix DCT-I algorithm 8
B. The split-radix DST-I algorithm 11
3. 1-D DCT-II/DST-II
A. The split-radix DCT-II algorithm 13
B. DCT-II computation via Walsh Transform 15
IV. Fast Algorithm for 2-D DCT1. Introduction of 2-D DCT 17
2. Fast Algorithm for 2-D DCT-II 18
V. Conclusion 19
VI. Reference 19
2
I. Abstract
Discrete Cosine Transform (DCT) and Discrete Sine Transform (DST) are
widely utilized in different area. Since the DCT became the standard of image
and video compression and processing, the fast algorithm for DCT/DST has been
more and more essential nowadays. In this tutorial, we first analyze the properties
of DCT and DST. After that, we not only introduce split-radix fast algorithm for
1-D DCT-I, DST-I, DCT-II/DST-II, but also show another fast transform for DCT-
II via Walsh transform. In addition to the 1-D algorithm, we also extend the
concept of 1-D fast algorithm to 2-D algorithm.
II. Introduction of DCT and DST
A DCT/DST uses sum of cosine/sine functions oscillating at different
frequencies to express a sequence of data. DCTs are important in many areas like
lossy compressions of audio and images and spectral methods for the numerical
solution of partial differential equations. Besides, when input is even or function,
either DCT-I or DST-I is capable of replacing Discrete Fourier Transform (DFT)
in spectral analysis to reduce computation complexity [2]. There are eight types
of DCTs and DSTs respectively, and four types of DCT/DST are more common.
Four common types of DCT are shown in matrix form and denoted by , ,
, . And they are defined as
3
Their corresponding four types of DST in matrix form denoted by , , ,
, are respectively defined as
There are some important properties of the four-type DCTs and four-type
DSTs and are shown as follows:
They are real-valued and orthonormal.
They are intrinsically related to generalized DFT. [3]
The inverse of DCT/DST matrix is simply the transpose matrix of
original matrix.
Since the range is large if we illustrate all of DCT and DST mentioned
above, we only focus on type-I and type-II DCT/DST which have been discussed
in our class. The application of type-I DCT/DST is the replacement of DFT to
reduce computation complexity. And the type-II of DCT is mainly used in image
compression because of its excellent capability in energy compaction.
Moreover, there are relations between DCT and corresponding DST, which
can be used to reduce the implementation complexity. Specifically, the matrix
4
is related to matrix by [4, 5]
where is the cross-identity (reflection) matrix and
. Therefore, the efficient implementation of DST-II can be
obtained from the one of DCT-II by proper order reversal and sign changes.
Consequently, in the following chapter, only DCT-I, DST-I and DCT-II will be
discussed.
III.Fast Algorithm for 1-D DCT/DST
1、 The explicit forms of orthonormal DCT/DST
Before we introduce the fast algorithm for DCT/DST, we first derive the
explicit forms of DCT/DST matrices for N=2, 4 and 8. By deriving the explicit
form, the properties of symmetry and anti-symmetry in DCT/DST are observed,
which can be used in the fast algorithm. Moreover, the higher-order matrices (ie.
the matrices with N>8) can be generated from lower-order ones (ie. the matrices
with N = 2, 4 and 8) by recursive sparse matrix factorization.
For N = 2, 4 and 8, the explicit forms of DCT-I matrix defined in
Chapter II are as follow:
5
The elements of the DST-I matrix are defined in Chapter II. For N = 2,
the matrix , and for N = 4 and 8, the explicit form are as following:
Finally, For N = 2, 4 and 8, the explicit forms of DCT-II matrix defined
in Chapter II are as follow:
6
By investigating the rows of the matrices , and , the properties
of symmetry can be found. We observe that the location of the symmetry center is
determined by the number of the row (or the number of column because the two
are the same). Here, we call the number of the row of a matrix as the order of a
matrix. When the order of the matrix is even, the symmetry center is in the
midpoint between two center elements of the row. And when the order of the
matrix is odd, the symmetry center is in the midpoint of the row. We denote the
element of the row in the matrix by . Then, the row is symmetric if
, and anti-symmetric if We
can see that even-indexed rows are symmetric and odd-indexed rows are anti-
symmetric, which can be expressed as follow:
By Ref. [6][7], after proper bit-reversal, matrices with even order and
symmetric property may be factorized into sparse matrices as two following
forms:
Where and are matrices of order with symmetry and anti-symmetry.
And and are respectively the results of the multiplication of and by
reflection matrix
In addition to even order matrices, it is possible for odd order matrices to use
sparse matrix factorization as well, except for the center rows and columns. The
symmetry and anti-symmetry in the rows of the DCT and DST matrices are
significant characteristics in factorizing the matrices.
2、 1-D DCT-I/DST-I
As we have mentioned before, DCTs/DSTs are intrinsically related to
generalized DFT. The fast algorithm for DFT, usually called fast Fourier
7
transform (FFT) algorithm, has been investigated and studied for decades. Hence,
the concepts of FFT can also be exploited in Fast algorithm of DCT/DST. In the
following sections, one of the FFT called split-radix FFT is going to be extended
to DCT-I, DST-I and DCT-II. Therefore, we introduce its concepts first.
Spit-radix FFT algorithm is a variant of the Cooley-Tukey FFT algorithm.[8]
Instead of one radix, a blend of radices 2 and 4 is used in spit-radix FFT. It
recursively decomposes a N-point DFT into one N/2-point DFT and two N/4-
point DFTs. With the use of different radix in the decomposition, split-radix FFT
achieves lower computation than radix-2m FFT, for m is positive integer.
A. The split-radix DCT-I algorithm The idea of split-radix FFT algorithm can be extended to the DCT-I [9]. The
formulae of the split-radix fast DCT-I algorithm without normalized factors can
be derived as follows:
Therefore, the first stage decomposes the (N+1)-point DCT-I into one
(N/2+1)-point DCT-I, one (N/4+1)-point DCT-I and (N/4-1)-point DST-I. The
decomposition works recursively and the final output sequence is in bit-reversed
order. The generalized signal flow graph with regular structure is shown in
Fig.3.1 for N = 2, 4 and 8 with .
Based on the split-radix algorithm, the matrix can be decomposed
recursively and is shown as follows:
8
is a permutation matrix for reordering from bit-reversal to natural order.
Fig. 3.1.
And and are un-normalized DST-I and DCT-I matrices respectively
shown as
9
where and are bit-reversal permutation matrices. And is a rotation
matrix as follow:
B. The split-radix DST-I algorithm The idea of split-radix FFT algorithm used in last Section can be extended to
the DST-I as well. And the formulae of split-radix fast DST-I algorithm without
normalization factors can be expressed as follows:
So, we can see that the first stage decomposes the (N+1)-point DST-I into
one (N/2+1)-point DST-I, one (N/4+1)-point DCT-I and (N/4-1)-point DST-I. The
decomposition works recursively and the final output sequence is in bit-reversed
order. Since the DST-I matrix for N = 2 is trivial, the generalized signal flow
graph with regular structure is only shown for N = 4 and 8 with in Fig.
3.2.
10
Fig. 3.2
Based on the split-radix algorithm, the matrix can be decomposed
recursively and is shown as follows:
is a permutation matrix for reordering from bit-reversal to natural order.
And and are un-normalized DST-I and DCT-I matrices respectively shown
as
11
where and are bit-reversal permutation matrices. And is a rotation
matrix as follow:
3、 1-D DCT-II/DST-II
A. The split-radix DCT-II algorithm In addition to 1-D DCT-I and 1-D DST-I, the idea of split-radix FFT
algorithm used can be extended to the DST-II as well. And the formulae of split-
radix fast DCT-II algorithm without normalization factors can be expressed as
follows:
12
Fig. 3.3
Hence, the first stage decomposes the (N+1)-point DCT-II into one N/2-
point DCT-II, one N/4-point DCT-II and N/4-point DST-II. The decomposition
works recursively and the final output sequence is in bit-reversed order. The
generalized signal flow graph with regular structure is shown in Fig. 3.3 for N =
2, 4 and 8. Moreover, the matrix can be decomposed recursively and is
shown as follows:
where is bit-reversal matrices reordering from bit-reversal to natural order . And
matrix is given as follow:
where is a rotation matrix as follow:
13
B. DCT-II computation via Walsh Transform Besides split-radix algorithm, Walsh transform (WT), another kind of
generalized DFT, can also be exploited to fast algorithm of DCT. The concept of
fast algorithm of DCT-II via WT is that the DCT-II matrix with symmetry/anti-
symmetry can be realized via other simpler matrix with symmetry/anti-symmetry.
Since the properties of symmetry/anti-symmetry in WT matrix are the same with
the DCT-II matrix and the element in WT matrix consists only , WT matrix is
an ideal matrix to replace the DCT-II matrix.
Denote the DCT-II and WT respectively in matrix-vector notation without
the normalized factors as
If we rearrange the rows of and , the formulae can be expressed as
Since is an orthonormal matrix, we can get the following results by substitute
the relation into the last equation:
where is the conversion matrix transferring Walsh domain vector to the
DCT-II domain. Therefore, there are two steps in this fast algorithm: first,
compute the WT of the input vector; second, convert from Walsh domain to the
DCT domain via . The signal flow graph for the DCT-II computation via WT
for N = 8 is shown in Fig.3.4(a) and (b).
14
Fig.3.4(a). Step1: Compute the WT of the input vector.
Fig.3.4(b). Step2: Convert from Walsh domain to the DCT domain via .
IV. Fast Algorithm for 2-D DCT
1、 Introduction of 2-D DCT
The most common and significant use of 2-D DCT algorithm is the
application of DCT-II (typically ) in compression of digital
image and video. Therefore, among eight types of DCT and their corresponding
DST, 2-D DCT-II is the only one that research efforts have been concentrated on.
And there is no exception for this tutorial that we only focus on the topic of 2-D
DCT-II fast algorithm.
The 2-D DCT-II for an input data matrix , ,
is defined as follow
and the inverse 2-D DCT-II as follow
15
2、 Fast algorithm for 2-D DCT-II
Generally, there are two kinds of approach of existing fast 2-D DCT-II
algorithm: indirect and direct. In the indirect approach, the 2-D DCT is computed
via other 2-D discrete orthogonal transforms like 2-D DFT of the same size. In
the direct approach, there are two methods based on the direct 2-D DCT
computation. The first one is called row-column method. It utilizes the
separability property of 2-D DCT kernel by applying any fast 1-D DCT algorithm
to the rows and columns of the input matrix sequentially. Hence, for an
input data matrix, where , the row-column method requires totally
multiplication. The second method is a 2-D vector method [10]
utilizing 2-D decomposition process. Although its computation efficiency
outperforms the row-column method, in this tutorial, we only introduce the row-
column one because it combines the 1-D DCT fast algorithm we have already
learned in previous sections and will be easier to understand.
In the row-column method [11], the transform kernel of 2-D DCT-II is
derived in the form as
.
Thus, the multiplications required are reduced and only the ones for the
computation of 1-D DCT-II remain. Therefore, for DCT-II, data
reordering and mapping before 1-D DCT-II is needed. Moreover, a
regular butterfly structure for additions after 1-D DCT-II is also included. The
corresponding signal flow graph for DCT-II computation is shown in
Fig.4.1.
16
Fig. 4.1. Signal flow graph for DCT-II computation.
Any 1-D DCT-II fast algorithm can be used.
V. Conclusion
In this tutorial, we first analyze the symmetric and anti-symmetric properties
of DCT and DST. After that, split-radix fast algorithm for 1-D DCT-I, DST-I,
DCT-II/DST-II is introduced. Moreover, another fast transform for DCT-II via
Walsh transform is also shown. In addition to the 1-D algorithm, we also extend
the concept of 1-D fast algorithm to 2-D algorithm utilizing the separablility
property.
VI. References
[1] V. Britanak, P. C. Yip, K. R. Rao, “Discrete Cosine and Sine Transforms:
General Properties, Fast Algorithms and Integer Approximations, ” Academic
Press, 2007
[2] 丁建均教授, “高等數位訊號處理” 課程講義[3] S. A. Martucci, “Symmetric convolution and the discrete sine and cosine
transforms,” IEEE Transactions on Signal Processing, Vol. 42, No. 5, May
1994, pp. 1038–1051.
[4] Z. Wang, “A fast algorithm for the discrete sine transform implemented by the
fast cosine transform,” IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-30, October 1982, pp. 814–815.
[5] Z.Wang, “Fast algorithms for the discreteWtransform and for the discrete
Fourier transform,” IEEE Transactions on Acoustics, Speech, and Signal
Processing, Vol. ASSP-32, August 1984, pp. 803–816.
17
[6] S. Venkataraman, V. R. Kanchan, K. R. Rao and M. Mohanty,
“Discrete transforms via the Walsh–Hadamard transform”,
Signal Processing, Vol. 14, No. 4, June 1988, pp. 371–382.
[7] G. Plonka and M. Tasche, “Fast and numerically stable
algorithms for discrete cosine transforms”, Linear Algebra and
its Applications, Vol. 394, No. 1, January 2005, pp. 309–345.
[8] Split-radix FFT algorithm , Wikipedia, http://en.wikipedia.org/wiki/Split-
radix_FFT_algorithm
[9] C. W. Sun and P. Yip, “Split-radix algorithms for DCT and DST”, Proceedings
of the Asilomar Conference on Signals, Systems and Computers, Pacific Grove,
CA, November 1989, pp. 508–512.
[10] S. C. Chan and K. L. Ho, “A new two-dimensional fast cosine transform
algorithm,” IEEE Transactions on Signal Processing, Vol. 39, No. 2, February
1991, pp. 481–485.
[11] N. I. Cho and S. U. Lee, “Fast algorithm and implementation of 2-D discrete
cosine transform,” IEEE Transactions on Circuits and Systems, Vol. 38, No. 3,
March 1991, pp. 297–305.
18