Research Article
A Fast DCT Algorithm for Watermarking in Digital Signal Processor

S. E. Tsai (1) and S. M. Yang (2)

(1) Department of Computer Science and Information Engineering, Chang Jung Christian University, Tainan City 701, Taiwan
(2) Department of Aeronautics and Astronautics, National Cheng Kung University, Tainan City 701, Taiwan

Correspondence should be addressed to S. M. Yang; smyang@mail.ncku.edu.tw

Hindawi, Mathematical Problems in Engineering, Volume 2017, Article ID 7401845, 7 pages, https://doi.org/10.1155/2017/7401845

Received 3 November 2016; Accepted 23 January 2017; Published 9 February 2017

Academic Editor: Xinkai Chen

Copyright © 2017 S. E. Tsai and S. M. Yang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Discrete cosine transform (DCT) has been an international standard in Joint Photographic Experts Group (JPEG) format to reduce the blocking effect in digital image compression. This paper proposes a fast discrete cosine transform (FDCT) algorithm that utilizes the energy compactness and matrix sparseness properties in the frequency domain to achieve higher computation performance. For a JPEG image of 8 × 8 block size in the spatial domain, the algorithm decomposes the two-dimensional (2D) DCT into one pair of one-dimensional (1D) DCTs with transform computation in only 24 multiplications. The 2D spatial data is a linear combination of the base images obtained by the outer product of the column and row vectors of cosine functions, so that the inverse DCT is as efficient. Implementation of the FDCT algorithm shows that embedding a watermark image of 32 × 32 block pixel size in a 256 × 256 digital image can be completed in only 0.24 seconds, and the extraction of the watermark by inverse transform is within 0.21 seconds. The proposed FDCT algorithm is shown to be more efficient in computation than many previous works.

1. Introduction

Discrete cosine transform (DCT) has been widely used to convert a dynamic signal into frequency components so as to reduce digital image storage size, expedite data transmission, and remove redundant information. DCT is closely related to the discrete Fourier transform, with the advantage of concentrating the energy of the transformed signal in the low-frequency range, while the human eye is less sensitive to the remaining high-frequency components in image processing [1]. The joint ISO committee therefore adopted DCT in the Joint Photographic Experts Group (JPEG) international standard with 8 × 8 block size to reduce the blocking effect in image compression. A basic JPEG image encoding is composed of three procedures: image transform, quantization, and encoding.

DCT maps original data into the frequency domain by cosine waveforms, and conversely the inverse discrete cosine transform (IDCT) transfers frequency domain data into the spatial domain. Numerous coding methods based on DCT have been presented for digital image processing; however, the associated memory size, bandwidth, and safety issues are of significant concern to real-time applications. Sun and Yang [2] proposed an image compression method based on a Laplacian transparent composite model to achieve high coding efficiency. Jridi et al. [3] presented image compression hardware to reduce computational complexity. Others have proposed to optimize image computation by digital signal processor (DSP). Kumbhare and Gokhale [4] developed a low complexity architecture for computing an algebraic integer based 8-point DCT in digital image processing. Jridi et al. [5] designed a low complexity DCT engine for digital video and image processing. Subband decomposition algorithms based on DCT have also been used in transmitting low-resolution image data to rebuild images of better quality [6–8], but they required high complexity and thus time-consuming computation. Strassen's matrix multiplication algorithm was proposed to reduce complex matrix multiplication in DCT [9]. Khan et al. [10] increased the coordination between the pixel size and subword size to maximize resource utilization for multimedia applications, but
the work required heavy computation. This paper proposes a fast DCT (FDCT) algorithm with a significantly reduced number of multiplications to achieve higher computation efficiency in digital image processing. It is also shown to be suitable for hardware implementation in DSP for digital watermarking applications.

Figure 1: The encoder and decoder model of the JPEG compression standard using discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT). [Encoder: image (R, G, B) → 8 × 8 blocks → fast DCT → quantizer (quantization table) → zigzag → bit-stream. Decoder: bit-stream → de-zigzag → dequantizer (quantization table) → IDCT → 8 × 8 blocks → decoded image (R, G, B).]

2. DCT in JPEG

The basic JPEG image encoding method is composed of three procedures: image transform, quantization, and encoding. Figure 1 shows the encoder and decoder model, where an original image is first divided into blocks of 8 × 8 pixels in the RGB model, with each block represented by frequency coefficients 0 to 63, as shown in Figure 2. The low-frequency coefficients occupy the shaded region toward the upper-left corner of the block. During image processing, DCT maps the spatial domain data into the frequency domain by cosine waveforms, and the inverse discrete cosine transform does the converse [11]. The spatial domain indicates the "magnitude" of a color image, while the frequency domain shows the magnitude change from one pixel to the next. In DCT, the original host signal is first divided into nonoverlapping 2D blocks of size 8 × 8. Each block is then processed independently and transformed into DC and AC coefficients in the frequency domain, representing the average color of the block and the color change across the block, respectively.
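The block division described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_into_blocks` is hypothetical, one color channel is assumed to be a list of rows, and images whose sides are not multiples of 8 are simply rejected (the paper does not say how such images are padded).

```python
def split_into_blocks(channel, size=8):
    """Split one color channel (a list of rows) into nonoverlapping size x size blocks.

    Blocks are returned in raster order: left to right, then top to bottom.
    """
    rows, cols = len(channel), len(channel[0])
    if rows % size or cols % size:
        # Assumption: no padding policy is given in the text, so reject such inputs.
        raise ValueError("image dimensions must be multiples of the block size")
    blocks = []
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            blocks.append([row[c:c + size] for row in channel[r:r + size]])
    return blocks


# Usage: a 256 x 256 channel, as in the watermarking experiment, yields 32 * 32 blocks.
blocks = split_into_blocks([[0] * 256 for _ in range(256)])
```

Each block can then be transformed independently, which is what makes the per-block multiplication count the relevant cost measure.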

After DCT, a quantizer with a quantization table is used to provide a higher compression ratio in transmission by approximating a continuous set of values in the image data with a finite (preferably small) set of values. This is done by dividing each component in the frequency domain by a constant and then rounding to the nearest integer. The input to a quantizer is the original data, and the output is one of a finite set of discrete values. A good quantizer represents the original signal with minimum loss or distortion. A highly useful feature of the JPEG process is that varying levels of image compression and quality are obtained by selecting a specific quantization matrix, which acts as a weighting function matched to human psychovisual sensitivity. Quantization involves dividing each coefficient by an integer value between 1 and 255.
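The divide-and-round step can be illustrated with a short sketch. The table `Q_TABLE` below is hypothetical (it is not the JPEG reference table and not taken from the paper); it only follows the stated rule that every divisor is an integer between 1 and 255, with coarser steps at higher frequencies.

```python
import math

# Hypothetical quantization table: step 8 at the DC position, growing with frequency.
Q_TABLE = [[8 + 4 * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, table=Q_TABLE):
    """Divide each frequency coefficient by its table entry and round to nearest."""
    return [[int(math.floor(coeffs[u][v] / table[u][v] + 0.5)) for v in range(8)]
            for u in range(8)]


def dequantize(levels, table=Q_TABLE):
    """Decoder side: multiply the integer levels back by the table entries."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


# Usage: a block with a large DC term and two small AC terms.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 415.0, -30.2, 3.0
levels = quantize(coeffs)
restored = dequantize(levels)
```

The reconstruction error of each coefficient is at most half of its table entry, which is why larger entries give stronger compression at lower quality.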

After quantization, the DC coefficient, which contains a significant fraction of the total image energy, becomes a measure of the average value of the original 64 pixels, and the 63 AC components are treated in an entropy coding process in the order of increasing frequency. Because neighboring 8 × 8 blocks are usually strongly correlated, the quantized DC coefficient is encoded as the difference from the DC term of the previous block. The higher frequency coefficients are more likely to be 0 or negligible after quantization, thereby improving the compression of run-length encoding.
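The "order of increasing frequency" is the zigzag scan whose index layout appears in Figure 2. A minimal sketch that generates this layout is given below; the function names are illustrative, and the traversal rule (anti-diagonals, alternating direction) is the usual zigzag convention rather than code from the paper.

```python
def zigzag_indices(n=8):
    """Return an n x n table whose entry (r, c) is the zigzag scan position."""
    order = [[0] * n for _ in range(n)]
    k = 0
    for d in range(2 * n - 1):                  # d = r + c indexes an anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)               # even diagonals run bottom-left to top-right
        for r in rows:
            order[r][d - r] = k
            k += 1
    return order


def zigzag_scan(block):
    """Flatten a square block into a list ordered by zigzag position."""
    n = len(block)
    order = zigzag_indices(n)
    flat = [0] * (n * n)
    for r in range(n):
        for c in range(n):
            flat[order[r][c]] = block[r][c]
    return flat
```

After quantization, the trailing part of `zigzag_scan(levels)` is typically a long run of zeros, which is what run-length encoding exploits.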

3. Fast Discrete Cosine Transform (FDCT)

3.1. 1D FDCT. In the proposed fast discrete cosine transform (FDCT) algorithm, an original signal is first divided into nonoverlapping 2D blocks of size 8 × 8 in three color components, as shown in Figure 1, in JPEG format. Each block is then processed independently by the transform. Consider 8-point spatial domain image data $\mathbf{s} = [s(0)\ s(1)\ \cdots\ s(7)]^T$ being transformed into the frequency domain $\mathbf{f} = [f(0)\ f(1)\ \cdots\ f(7)]^T$, where

$$
f(k) = \frac{1}{2} C(k) \sum_{i=0}^{7} s(i) \cos\left(\frac{(2i+1)k\pi}{16}\right) \tag{1}
$$

with $C(k) = 1/\sqrt{2}$ if $k = 0$ and $C(k) = 1$ otherwise. $f(0)$ and $f(4)$ can be evaluated without multiplication, using only addition and subtraction (up to the common scale factor $1/(2\sqrt{2})$). There are only six elements left in $\mathbf{f}$ to be evaluated.

     0   1   5   6  14  15  27  28
     2   4   7  13  16  26  29  42
     3   8  12  17  25  30  41  43
     9  11  18  24  31  40  44  53
    10  19  23  32  39  45  52  54
    20  22  33  38  46  51  55  60
    21  34  37  47  50  56  59  61
    35  36  48  49  57  58  62  63

Figure 2: The distribution of coefficients in DCT: (a) low frequency position (heavy color), (b) low-middle frequency position (light color), (c) high-middle frequency position (light color with dots), and (d) high frequency position (white).

Further minimization of the number of multiplications can be achieved by regrouping the coefficients using the symmetry property to yield

$$
\begin{bmatrix} f(2) \\ f(6) \\ f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}
=
\begin{bmatrix}
b & d & 0 & 0 & 0 & 0 \\
d & -b & 0 & 0 & 0 & 0 \\
0 & 0 & e & f & -a & c \\
0 & 0 & f & a & c & e \\
0 & 0 & -a & c & -e & -f \\
0 & 0 & c & e & -f & -a
\end{bmatrix}
\begin{bmatrix}
s(0) - s(3) - s(4) + s(7) \\
s(1) - s(2) - s(5) + s(6) \\
s(0) - s(7) \\
s(6) - s(1) \\
s(3) - s(4) \\
s(2) - s(5)
\end{bmatrix}
\tag{2}
$$

where $a = 0.488$, $b = 0.463$, $c = 0.416$, $d = 0.192$, $e = 0.098$, and $f = 0.278$. The number of multiplications is now reduced to 20. To eliminate one more multiplication, $f(2)$ and $f(6)$ can be evaluated by

$$
\begin{bmatrix} f(2) \\ f(6) \end{bmatrix}
= \mathbf{A}
\begin{bmatrix}
(b - d)\,(s(0) - s(3) - s(4) + s(7)) \\
d\,(s(0) + s(1) - s(2) - s(3) - s(4) - s(5) + s(6) + s(7)) \\
-(b + d)\,(s(1) - s(2) - s(5) + s(6))
\end{bmatrix}
\tag{3}
$$

where $\mathbf{A} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$, so that $f(1)$, $f(3)$, $f(5)$, and $f(7)$ become

$$
\begin{bmatrix} f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}
= \mathbf{A}^{*}
\begin{bmatrix}
\mathbf{A}
\begin{bmatrix}
(e + a - f + c)\,(s(0) - s(7)) \\
(f - c)\,(s(0) - s(7) + s(6) - s(1)) \\
(a - e - f + c)\,(s(6) - s(1))
\end{bmatrix}
\\[6pt]
\mathbf{A}
\begin{bmatrix}
(-a - c)\,(s(0) - s(7) + s(3) - s(4)) \\
c\,(s(0) - s(7) + s(3) - s(4) + s(6) - s(1) + s(2) - s(5)) \\
(e - c)\,(s(6) - s(1) + s(2) - s(5))
\end{bmatrix}
\\[6pt]
\mathbf{A}
\begin{bmatrix}
(e - a - f - c)\,(s(3) - s(4)) \\
(f + c)\,(s(3) - s(4) + s(2) - s(5)) \\
(a + e - f - c)\,(s(2) - s(5))
\end{bmatrix}
\end{bmatrix}
\tag{4}
$$

where $\mathbf{A}^{*} = \begin{bmatrix} \mathbf{I} & \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} & -\mathbf{I} \end{bmatrix}$, with $\mathbf{I}$ being the 2 × 2 identity matrix. By now, the DCT needs only 12 multiplications. Similarly, the inverse transform is

$$
s'(i) = \frac{1}{2} \sum_{k=0}^{7} C(k) f(k) \cos\left(\frac{(2i+1)k\pi}{16}\right) \tag{5}
$$

where $C(k) = 1/\sqrt{2}$ if $k = 0$ and $C(k) = 1$ otherwise. Careful observation reveals that it is straightforward to derive the inverse transform IDCT from DCT by

$$
\begin{bmatrix} s'(0) \\ s'(1) \\ s'(2) \\ s'(3) \\ s'(4) \\ s'(5) \\ s'(6) \\ s'(7) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & b & d & e & f & -a & c \\
1 & -1 & d & -b & -f & -a & -c & -e \\
1 & -1 & -d & b & c & e & -f & -a \\
1 & 1 & -b & -d & -a & c & -e & -f \\
1 & 1 & -b & -d & a & -c & e & f \\
1 & -1 & -d & b & -c & -e & f & a \\
1 & -1 & d & -b & f & a & c & e \\
1 & 1 & b & d & -e & -f & a & -c
\end{bmatrix}
\begin{bmatrix} f(0) \\ f(4) \\ f(2) \\ f(6) \\ f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}
\tag{6}
$$

By shifting and swapping the corresponding rows, (6) can be decomposed as

$$
\begin{bmatrix} s'(0) \\ s'(6) \\ s'(3) \\ s'(2) \\ s'(7) \\ s'(1) \\ s'(4) \\ s'(5) \end{bmatrix}
=
\begin{bmatrix}
\begin{bmatrix} \mathbf{B} + \mathbf{C} \\ \mathbf{B} - \mathbf{C} \end{bmatrix} + \mathbf{D}
\\[6pt]
\begin{bmatrix} \mathbf{B} + \mathbf{C} \\ \mathbf{B} - \mathbf{C} \end{bmatrix} - \mathbf{D}
\end{bmatrix}
\tag{7}
$$

The matrix is made up of three building blocks with symmetric coefficient matrices:

$$
\begin{aligned}
\mathbf{B} &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} f(0) \\ f(4) \end{bmatrix}, \\[4pt]
\mathbf{C} &= \begin{bmatrix} b & d \\ d & -b \end{bmatrix}
\begin{bmatrix} f(2) \\ f(6) \end{bmatrix}, \\[4pt]
\mathbf{D} &= \begin{bmatrix}
e & f & -a & c \\
f & a & c & e \\
-a & c & -e & -f \\
c & e & -f & -a
\end{bmatrix}
\begin{bmatrix} f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}.
\end{aligned}
\tag{8}
$$

Both FDCT and fast IDCT have the same coefficients and matrix blocks. The algorithm is thus efficient in hardware implementation.
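The factorization in (2) to (4) and (7) to (8) can be checked numerically with the sketch below. It is an illustration under stated assumptions, not the authors' C implementation: the constants a to f are taken as the exact half-cosines implied by (1) (the text quotes them rounded to three decimals, and the exact value of a is about 0.490), the helper names `sym2`, `odd_block`, `fdct8`, and `fidct8` are hypothetical, and f(0) and f(4) are returned as unscaled sums on the assumption that their common factor 1/(2√2) is absorbed elsewhere, for example into the quantization step.

```python
import math

# Assumption: a..f are the half-cosines implied by equation (1).
a = 0.5 * math.cos(1 * math.pi / 16)
b = 0.5 * math.cos(2 * math.pi / 16)
c = 0.5 * math.cos(3 * math.pi / 16)
d = 0.5 * math.cos(6 * math.pi / 16)
e = 0.5 * math.cos(7 * math.pi / 16)
f = 0.5 * math.cos(5 * math.pi / 16)


def dct8_direct(s):
    """Equation (1) evaluated literally, used here only as a reference."""
    return [0.5 * (1 / math.sqrt(2) if k == 0 else 1.0)
            * sum(s[i] * math.cos((2 * i + 1) * k * math.pi / 16) for i in range(8))
            for k in range(8)]


def sym2(alpha, beta, gamma, x, y):
    """[[alpha, beta], [beta, gamma]] times [x, y] in 3 multiplications.

    The bracketed constant sums would be precomputed in a real implementation.
    """
    m1 = (alpha - beta) * x
    m2 = beta * (x + y)
    m3 = (gamma - beta) * y
    return m1 + m2, m2 + m3


def odd_block(x0, x1, x2, x3):
    """Symmetric 4 x 4 block of (2) and (8), in 9 multiplications as in (4)."""
    g1 = sym2(e + a, f - c, a - e, x0, x1)
    g2 = sym2(-a, c, e, x0 + x2, x1 + x3)
    g3 = sym2(e - a, f + c, a + e, x2, x3)
    return g1[0] + g2[0], g1[1] + g2[1], g2[0] - g3[0], g2[1] - g3[1]


def fdct8(s):
    """Forward transform with 12 multiplications; entries 0 and 4 stay unscaled."""
    u = s[0] - s[3] - s[4] + s[7]
    v = s[1] - s[2] - s[5] + s[6]
    f2, f6 = sym2(b, d, -b, u, v)                          # equation (3)
    f7, f5, neg_f1, f3 = odd_block(s[0] - s[7], s[6] - s[1],
                                   s[3] - s[4], s[2] - s[5])
    sum0 = sum(s)                                          # additions only
    sum4 = s[0] - s[1] - s[2] + s[3] + s[4] - s[5] - s[6] + s[7]
    return [sum0, -neg_f1, f2, f3, sum4, f5, f6, f7]


def fidct8(F):
    """Inverse via (7) and (8), reusing the same two coefficient blocks."""
    g0, g4 = F[0] / 8, F[4] / 8                            # unscaled sums to scaled terms
    B = (g0 + g4, g0 - g4)
    C = sym2(b, d, -b, F[2], F[6])
    D = odd_block(F[7], F[5], -F[1], F[3])
    top = (B[0] + C[0], B[1] + C[1], B[0] - C[0], B[1] - C[1])
    plus = [top[k] + D[k] for k in range(4)]               # s'(0), s'(6), s'(3), s'(2)
    minus = [top[k] - D[k] for k in range(4)]              # s'(7), s'(1), s'(4), s'(5)
    return [plus[0], minus[1], plus[3], plus[2],
            minus[2], minus[3], plus[1], minus[0]]
```

With unscaled f(0) and f(4), the only extra work in the inverse is a division by 8, which is a 3-bit shift in fixed-point arithmetic. Multiplying the two unscaled entries of `fdct8` by 1/(2√2) reproduces the values given by (1).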

3.2. 2D FDCT. A 2D DCT is implemented by separating it into a pair of 1D DCTs, as illustrated in Figure 3. Consider a 2D spatial data sequence $s(i, j)$, $0 \le i, j \le 7$, in the 8 × 8 matrix $\mathbf{S}$; the corresponding 2D DCT sequence $f(u, v)$, $0 \le u, v \le 7$, in the frequency-domain matrix $\mathbf{F}$ is defined as

$$
f(u, v) = \frac{1}{4} C(u) C(v) \sum_{i=0}^{7} \sum_{j=0}^{7} s(i, j) \cos\left(\frac{(2i+1)u\pi}{16}\right) \cos\left(\frac{(2j+1)v\pi}{16}\right) \tag{9}
$$

The inverse transformation, represented by $\mathbf{S}' = s'(i, j)$, is

$$
s'(i, j) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) f(u, v) \cos\left(\frac{(2i+1)u\pi}{16}\right) \cos\left(\frac{(2j+1)v\pi}{16}\right) \tag{10}
$$

where $C(u), C(v) = 1/\sqrt{2}$ if $u, v = 0$ and $C(u), C(v) = 1$ otherwise. Define a matrix $\mathbf{M} = m(u, v)$, where $m(u, v)$ represents the matrix element in the $u$th row and $v$th column:

$$
m(u, v) =
\begin{cases}
\dfrac{1}{2\sqrt{2}}, & u = 0,\ 0 \le v \le 7, \\[8pt]
\dfrac{1}{2} \cos\left(\dfrac{(2v+1)u\pi}{16}\right), & 1 \le u \le 7,\ 0 \le v \le 7.
\end{cases}
\tag{11}
$$

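The 2D DCT data matrix $\mathbf{F}$ and its inverse matrix $\mathbf{S}'$ can then be written as

$$
\mathbf{F} = \mathbf{M} \mathbf{S} \mathbf{M}^T \tag{12a}
$$

$$
\mathbf{S}' = \mathbf{M}^T \mathbf{F} \mathbf{M} \tag{12b}
$$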

Because the base vectors of DCT are orthogonal, the inverse transform IDCT can therefore be easily obtained, as shown in (12b). DCT transforms a highly correlated image into a few transform coefficients. Conventional image coding techniques use the quantization process to achieve a higher compression ratio. Therefore, $\mathbf{F}$ includes only a few nonzero elements in the low-frequency range, which makes it possible to design an efficient IDCT algorithm by fully utilizing the computation efficiency in (3) and (4) and the energy compactness property of $\mathbf{F}$.

Figure 3: The FDCT algorithm of 2D DCT calculated by one pair of 1D DCTs. [Diagram: a block of 8 × 8 pixels of the picture in the spatial domain → DCT transform by rows → result after DCT by rows → DCT transform by columns → final result in the frequency domain.]

The 2D spatial data matrix $\mathbf{S}'$ in (12b) can be considered as a linear combination of the base images, that is, of the outer products of the column and row vectors in $\mathbf{M}$. This interpretation makes it easier for (12a) and (12b) to exploit the sparseness of the 2D DCT matrix $\mathbf{F}$ when calculating the spatial data matrix $\mathbf{S}'$. The proposed FDCT algorithm achieves higher computation performance than many previous algorithms. The row (or column) data can be processed by using the 1D DCT (or IDCT) first, with the results stored in transposition memory. By exploiting the redundancy in the coefficients of DCT, the algorithm reduces the complexity of the 2D DCT of an 8 × 8 block to only 24 multiplications.
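The separable form (12a) and (12b) and the base-image interpretation can be sketched as follows. This is a plain reference implementation for checking the formulas, assuming ordinary matrix products rather than the fast 1D kernel; the helper names (`matmul`, `dct2`, `idct2`, `idct2_sparse`) are illustrative and not from the paper.

```python
import math

N = 8

# Transform matrix M of equation (11): row u, column v.
M = [[1 / (2 * math.sqrt(2)) if u == 0
      else 0.5 * math.cos((2 * v + 1) * u * math.pi / 16)
      for v in range(N)] for u in range(N)]


def matmul(A, B):
    """Plain N x N matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dct2(S):
    """Equation (12a): F = M S M^T, a 1D transform along each axis in turn."""
    return matmul(matmul(M, S), transpose(M))


def idct2(F):
    """Equation (12b): S' = M^T F M."""
    return matmul(matmul(transpose(M), F), M)


def idct2_sparse(F, eps=0.0):
    """Sum of base images, skipping coefficients with |f(u, v)| <= eps."""
    S = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            if abs(F[u][v]) <= eps:
                continue
            for i in range(N):
                for j in range(N):
                    # Base image (u, v) is the outer product of rows u and v of M.
                    S[i][j] += F[u][v] * M[u][i] * M[v][j]
    return S
```

When only a few low-frequency entries of F survive quantization, `idct2_sparse` touches only those base images, which is the sparseness argument made above. Under (9), the entry F[0][0] equals the block sum divided by 8, that is, 8 times the block mean.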

4. Performance Evaluation in Digital Image Watermarking

Discrete cosine transform (DCT) maps original digital data into the frequency domain by cosine waveforms, and conversely the inverse discrete cosine transform (IDCT) transfers the frequency domain data back into the spatial domain. The associated memory size, bandwidth, and safety issues in the transformation algorithms are of significant concern. Direct computation of a 2D DCT (a block of $N \times N$ pixels) requires $N^4$ multiplications, while direct realization with row-column separation requires $2N^3$ multiplications, that is, $2 \times 8^3 = 1024$ for an 8 × 8 DCT. Although many algorithms have been proposed to optimize multiplications, they either required a huge amount of computation [6–8, 10] or suffered from low efficiency [4, 5, 9]. Table 1 lists the number of multiplications required in previous works compared with the proposed FDCT algorithm. The latter is much more efficient in computation, as described in (2) to (12a) and (12b), where only 12 × 2 = 24 multiplications are needed. By comparison, Jridi et al. [5] needed 256 multiplications, as they simply calculated the standard 8 × 8 DCT. Similarly, Ko et al. [7] required 31 multiplications, while Sung et al. [8], based on subband decomposition, needed 25 multiplications. Manoria and Dixit [9] used Strassen's matrix multiplication, but their 2D DCT (8 × 8) needed 28 multiplications. Khan et al. [10] took 40 multiplications for the 2D DCT operation on an 8 × 8 image block. In summary, both the FDCT algorithm and its inverse transform are more efficient than many previous works in reducing the complexity of computation. Implementation in a digital signal processor (DSP) for real-time digital watermarking becomes feasible.

Table 1: The number of multiplications for 2D DCT by the algorithms of previous works and by the proposed FDCT algorithm.

    Algorithm                                           Number of multiplications for 2D DCT (8 × 8 block)
    Direct computation                                  8^4 = 4096
    Direct computation (with line-column separation)    2 × 8^3 = 1024
    Jridi et al. [5]                                    256
    Ko et al. [7]                                       31
    Sung et al. [8]                                     25
    Manoria and Dixit [9]                               28
    Khan et al. [10]                                    40
    FDCT algorithm (this work)                          24

Figure 4: Software environment in implementing the FDCT algorithm on DSP (TMS320C6701). [Tool chain: C/C++ source files → C/C++ compiler → assembler source → assembler → COFF object files → linker → executable COFF file → TMS320C6701; supporting tools: linear assembly and assembly optimizer, archiver and library of object files, library-build utility and run-time-support library, hex conversion utility and EPROM programmer, cross-reference lister, debugging tools.]
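As a quick check of the two direct-computation counts quoted above, the sketch below evaluates $N^4$ and $2N^3$ for $N = 8$. The function names are illustrative only, and the value 24 is simply the count claimed for the FDCT algorithm in the text.

```python
def direct_2d_multiplications(n):
    """Every one of the n*n outputs sums n*n products: n^4 multiplications."""
    return n ** 4


def row_column_multiplications(n):
    """2n one-dimensional n-point transforms of n*n multiplications each: 2n^3."""
    return 2 * n * n * n


# Usage for the 8 x 8 JPEG block size.
direct = direct_2d_multiplications(8)
separable = row_column_multiplications(8)
claimed_fdct = 24  # count stated in the text for the proposed algorithm
```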

Embedding a watermark in a digital image for copyright protection and marketing applications has been proposed over the past decade. The conventional technique is to embed a secret bit string into an image in the spatial, frequency, or wavelet domain. The FDCT algorithm is implemented in a digital signal processor (TMS320C6701) to validate its efficiency in digital watermarking applications. The signal processor, in both fixed-point and floating-point, is supported by a set of software development tools with a C/C++ compiler, an assembly optimizer, a linker, and assorted utilities, as shown in Figure 4. The proposed FDCT algorithm is written in C, and the C/C++ compiler is able to perform optimization at different levels (n = 0, 1, 2, 3) that trade clock cycles against code size. The lowest level (n = 0) performs loop rotation, allocates variables to registers, and simplifies expressions and statements; the first level (n = 1) adds constant propagation and removal of unused assignments; the second level (n = 2) adds software pipelining, removal of unused global assignments, loop unrolling, and incremented pointers; and the highest level (n = 3) simplifies functions whose return values are never used, removes all functions that are never called, inlines calls to small functions, and reorders function declarations so that the attributes of called functions are known when the caller is optimized. Levels n = 0 and 1 can efficiently reduce the code size, while n = 2 and 3 enhance the execution speed with larger code size. The computation time and code size at the different optimization levels are listed in Table 2. With increasing code size, the clock cycles and processing time decrease, so the best optimization is n = 3.

Table 2: Processing time of the FDCT algorithm in DSP.

    Optimization level (n)    Clocks (cycle)    Time (ms)    Code size (KB)
    None                      22141816          133          21
    0                         15304558          92           19
    1                         11207925          67           22
    2                         5715198           35           36
    3                         5653914           34           35
    Matlab                    NA                610          NA
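The clock frequency of the DSP is not stated in the text, but it can be inferred from Table 2 by dividing cycles by time. The sketch below does this arithmetic; the dictionary `TABLE_2` holds the cycle counts and times as read from Table 2, and the function names are illustrative.

```python
# (clock cycles, time in ms) for each compiler optimization level, from Table 2.
TABLE_2 = {
    "none": (22141816, 133),
    "0": (15304558, 92),
    "1": (11207925, 67),
    "2": (5715198, 35),
    "3": (5653914, 34),
}


def implied_clock_mhz(cycles, time_ms):
    """Clock rate in MHz implied by a cycle count and an elapsed time."""
    return cycles / (time_ms * 1e-3) / 1e6


def speedup(baseline_cycles, optimized_cycles):
    """How many times fewer cycles the optimized build needs."""
    return baseline_cycles / optimized_cycles


clock_estimates = {level: implied_clock_mhz(*row) for level, row in TABLE_2.items()}
gain_n3 = speedup(TABLE_2["none"][0], TABLE_2["3"][0])
```

Every row implies a clock in the range of roughly 163 to 167 MHz, so the listed times are mutually consistent, and the n = 3 build needs a little under four times fewer cycles than the unoptimized one.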

For a digital watermark (32 × 32 block size) embedded in an original image (256 × 256), calculation by the FDCT algorithm in the frequency domain and then inverse transform of the encrypted data back to the spatial domain image uses 5653914 clocks (n = 3), corresponding to 34 ms in 35 KB code size. Implementation shows that it takes only 0.24 seconds to have the watermark embedded in the original image. Extraction of the watermark by the inverse transform (IDCT) is within 0.21 seconds. Real-time implementation of the FDCT algorithm in DSP for image processing is shown to be very efficient.

5. Conclusions

(1) A fast discrete cosine transform (FDCT) algorithm that utilizes the energy compactness and matrix sparseness properties in the frequency domain for higher computation performance is developed. For a JPEG image of 8 × 8 block size in the spatial domain, the algorithm first decomposes the 2D DCT into one pair of 1D DCTs, and the calculation can be completed in only 24 multiplications. The 2D spatial data is a linear combination of base images obtained by the outer product of the column and row vectors of cosine functions, such that the inverse DCT is as efficient. The algorithm is shown to achieve high performance compared to many previous works.

(2) The algorithm optimizes a 2D DCT by exploiting the redundancy of the frequency coefficients so as to facilitate the implementation in a digital signal processor (DSP). For a spatial domain data matrix S, the 2D DCT data matrix F includes only a few nonzero elements in the low-frequency range, which makes it possible to design an efficient IDCT algorithm. Owing to the energy compactness property of F, its inverse matrix S' can be written as a linear combination of the cosine functions, and both FDCT and its inverse transform are shown to have the same coefficients and matrix blocks for efficient hardware implementation.

(3) An example of digital image watermarking is applied to demonstrate the efficiency of the FDCT algorithm. Hardware implementation of watermarking in DSP shows that it takes only 0.24 seconds to embed a 32 × 32 block size digital watermark into a digital image of block size 256 × 256. Implementation also shows that extraction of the watermark can be completed within 0.21 seconds. The FDCT algorithm in DSP is shown to be efficient and effective in real-time implementation of digital image watermarking.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," Institute of Electrical and Electronics Engineers Transactions on Computers, vol. 23, pp. 90–93, 1974.

[2] C. Sun and E.-H. Yang, "An efficient DCT-based image compression system based on Laplacian transparent composite model," IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 886–900, 2015.

[3] M. Jridi, A. Alfalou, and P. K. Meher, "Optimized architecture using a novel subexpression elimination on Loeffler algorithm for DCT-based image compression," VLSI Design, vol. 2012, Article ID 209208, 12 pages, 2012.

[4] P. R. Kumbhare and U. M. Gokhale, "Design and implementation of 2D-DCT by using Arai algorithm for image compression," Journal of The International Association of Advanced Technology and Science, vol. 16, no. 5, article 5, 2015.

[5] M. Jridi, Y. Ouerhani, and A. Alfalou, "Low complexity DCT engine for image and video compression," Real-Time Image and Video Processing, vol. 8656, pp. 1–9, 2013.

[6] M. Marimuthu, R. Muthaiah, and P. Swaminathan, "Sub-band based DCT for image compression," Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 24, pp. 5387–5390, 2012.

[7] L.-T. Ko, J.-E. Chen, H.-C. Hsin, Y.-S. Shieh, and T.-Y. Sung, "A unified algorithm for subband-based discrete cosine transform," Mathematical Problems in Engineering, vol. 2012, Article ID 912194, 31 pages, 2012.

[8] T.-Y. Sung, Y.-S. Shieh, and H.-C. Hsin, "An efficient VLSI linear array for DCT/IDCT using subband decomposition algorithm," Mathematical Problems in Engineering, vol. 2010, Article ID 185398, 21 pages, 2010.

[9] M. Manoria and P. Dixit, "An efficient DCT compression technique using Strassen's matrix multiplication algorithm," International Journal of Computer Applications, vol. 60, no. 9, pp. 45–50, 2012.

[10] S. Khan, E. Casseau, and D. Menard, "High performance discrete cosine transform operator using multimedia oriented subword parallelism," Advances in Computer Engineering, vol. 2015, Article ID 405856, 10 pages, 2015.

[11] P. Agrawal and S. K. Sharma, "Review paper on image compression using DCT, KLT and DWT," International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, no. 9, pp. 928–931, 2014.

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: A Fast DCT Algorithm for Watermarking in Digital Signal Processordownloads.hindawi.com/journals/mpe/2017/7401845.pdf · 2019-07-30 · A Fast DCT Algorithm for Watermarking in Digital

2 Mathematical Problems in Engineering

Image

R G B

R G B

Fast DCT Quantizer

Quantizationtable

Zigzag Bit-stream

DecodedimageIDCTDequantizer

Quantizationtable

De-zigzagBit-stream

8 times 8

8 times 8

Figure 1 The encoder and decoder model of JPEG compression standard by using discrete cosine transform (DCT) and inverse discretecosine transform (IDCT)

the work required heavy computation. This paper proposes a fast DCT (FDCT) algorithm with a significantly reduced number of multiplications to achieve higher computation efficiency in digital image processing. It is also shown to be suitable for hardware implementation in DSP for digital watermarking applications.

2. DCT in JPEG

The basic JPEG image encoding method is composed of three procedures: image transform, quantization, and encoding. Figure 1 shows the encoder and decoder model, where an original image is first divided into blocks of 8 × 8 pixels in the RGB model, with each block represented by frequency coefficients 0 to 63 as shown in Figure 2. The low frequency coefficients are in the heavily colored region at the upper left of the block. During image processing, DCT maps the spatial domain data into the frequency domain by cosine waveforms, and the inverse discrete cosine transform does the converse [11]. The spatial domain indicates the "magnitude" of a color image, while the frequency domain shows the magnitude change from one pixel to the next. In DCT, the original host signal is first divided into nonoverlapping 2D blocks of size 8 × 8. Each block is then processed independently and transformed into DC and AC coefficients in the frequency domain, representing the average color of the block and the color change across the block, respectively.
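As a brief worked check of the statement that the DC coefficient represents the block average, one can set $u = v = 0$ in the 2D DCT definition given later in (9), where $C(0) = 1/\sqrt{2}$ and both cosine factors equal 1. The symbol $\bar{s}$ for the mean pixel value of the block is introduced here only for illustration:

$$
f(0,0) = \frac{1}{4}\cdot\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}\sum_{i=0}^{7}\sum_{j=0}^{7} s(i,j)
       = \frac{1}{8}\sum_{i=0}^{7}\sum_{j=0}^{7} s(i,j)
       = 8\,\bar{s},
\qquad
\bar{s} = \frac{1}{64}\sum_{i=0}^{7}\sum_{j=0}^{7} s(i,j).
$$

Under this normalization the DC coefficient is therefore eight times the average of the 64 pixels in the block.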

After DCT, a quantizer with a quantization table is used to provide a higher compression ratio in transmission by approximating a continuous set of values in the image data with a finite (preferably small) set of values. This is done by dividing each component in the frequency domain by a constant and then rounding to the nearest integer. The input to a quantizer is the original data, and the output is one of a finite set of discrete values. A good quantizer represents the original signal with minimum loss or distortion. A highly useful feature of the JPEG process is that varying levels of image compression and quality are obtained by selecting a specific quantization matrix, which acts as a weighting function matched to human psychovisual capability. Quantization involves dividing each coefficient by an integer value between 1 and 255.
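The divide-and-round step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the 2 × 2 sample values are hypothetical, and a real JPEG codec applies the same operation to all 64 coefficients with a full 8 × 8 table.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer.

    Note: Python's round() resolves exact .5 ties to the even integer; the
    sample values below avoid ties, so this detail does not affect them.
    """
    return [[int(round(cf / q)) for cf, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Multiply each quantized level back by its table entry (lossy inverse)."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Hypothetical 2 x 2 corner of a coefficient block and of a quantization table.
coeffs = [[100.0, -37.0], [7.0, 3.0]]
table = [[16, 11], [12, 14]]

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)    # [[6, -3], [1, 0]]
print(restored)  # [[96, -33], [12, 0]]
```

The difference between `coeffs` and `restored` is the quantization loss; larger table entries give coarser levels and more zeros, which is what makes the later entropy coding effective.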

After quantization, the DC coefficient, which contains a significant fraction of the total image energy, becomes a measure of the average value of the original 64 pixels, and the 63 AC components are treated in an entropy coding process in the order of increasing frequency. Because adjacent 8 × 8 blocks are usually strongly correlated, the quantized DC coefficient is encoded as the difference from the DC term of the previous block. The higher frequency coefficients are more likely to be 0 or negligible after quantization, thereby improving the compression of run-length encoding.
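The ordering and coding steps in this paragraph can be illustrated with a short sketch. It generates the zigzag scan order (which reproduces the index layout of Figure 2), forms DC differences between consecutive blocks, and run-length codes the zeros in an AC sequence. The function names and the `"EOB"` end-of-block marker are simplifications chosen for this example; the actual JPEG entropy coder adds Huffman or arithmetic coding on top.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Anti-diagonals are visited in order of i + j. Odd diagonals run with the
    # row index increasing, even diagonals with the row index decreasing.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def dc_differences(dc_values):
    """Encode each block's DC term as the difference from the previous block."""
    prev, out = 0, []
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out


def run_length(ac_values):
    """Code AC values as (zero_run, value) pairs, ending with an EOB marker."""
    out, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    out.append("EOB")  # trailing zeros are absorbed by the end-of-block marker
    return out


# Rebuild the index grid of Figure 2 from the scan order.
grid = [[0] * 8 for _ in range(8)]
for k, (i, j) in enumerate(zigzag_order()):
    grid[i][j] = k

print(grid[0])                               # [0, 1, 5, 6, 14, 15, 27, 28]
print(dc_differences([12, 15, 14]))          # [12, 3, -1]
print(run_length([5, 0, 0, -3, 0, 0, 0]))    # [(0, 5), (2, -3), 'EOB']
```

Because the scan visits low frequencies first, the zeros produced by quantization cluster at the end of the sequence and collapse into the single end-of-block marker.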

3. Fast Discrete Cosine Transform (FDCT)

Figure 2: The distribution of coefficients in DCT: (a) low frequency position (heavy color), (b) low-middle-frequency position (light color), (c) high-middle-frequency position (light color with dots), and (d) high frequency position (white). The coefficient indices are laid out in the 8 × 8 block as follows:

     0   1   5   6  14  15  27  28
     2   4   7  13  16  26  29  42
     3   8  12  17  25  30  41  43
     9  11  18  24  31  40  44  53
    10  19  23  32  39  45  52  54
    20  22  33  38  46  51  55  60
    21  34  37  47  50  56  59  61
    35  36  48  49  57  58  62  63

3.1. 1D FDCT. In the proposed fast discrete cosine transform (FDCT) algorithm, an original signal is first divided into nonoverlapping 2D blocks of size 8 × 8 in three color components, as shown in Figure 1 for the JPEG format. Each block is then processed independently by the transform. Consider 8-point spatial domain image data $\mathbf{s} = [s(0)\ s(1)\ \cdots\ s(7)]^{T}$ being transformed into the frequency domain vector $\mathbf{f} = [f(0)\ f(1)\ \cdots\ f(7)]^{T}$, where

$$
f(k) = \frac{1}{2}\,C(k)\sum_{i=0}^{7} s(i)\cos\left(\frac{(2i+1)k\pi}{16}\right)
\tag{1}
$$

with $C(k) = 1/\sqrt{2}$ if $k = 0$ and $C(k) = 1$ otherwise. $f(0)$ and $f(4)$ can be evaluated without multiplication, only by addition and subtraction. There are only six elements left in $\mathbf{f}$ to be evaluated. Further minimization of the number of multiplications can be achieved by regrouping the coefficients using the symmetric property to yield

$$
\begin{bmatrix} f(2)\\ f(6)\\ f(7)\\ f(5)\\ -f(1)\\ f(3) \end{bmatrix}
=
\begin{bmatrix}
b & d & 0 & 0 & 0 & 0\\
d & -b & 0 & 0 & 0 & 0\\
0 & 0 & e & f & -a & c\\
0 & 0 & f & a & c & e\\
0 & 0 & -a & c & -e & -f\\
0 & 0 & c & e & -f & -a
\end{bmatrix}
\begin{bmatrix}
s(0)-s(3)-s(4)+s(7)\\
s(1)-s(2)-s(5)+s(6)\\
s(0)-s(7)\\
s(6)-s(1)\\
s(3)-s(4)\\
s(2)-s(5)
\end{bmatrix}
\tag{2}
$$

where a = 0.488, b = 0.463, c = 0.416, d = 0.192, e = 0.098, and f = 0.278, approximating the half-cosine values $\tfrac{1}{2}\cos(\pi/16)$, $\tfrac{1}{2}\cos(2\pi/16)$, $\tfrac{1}{2}\cos(3\pi/16)$, $\tfrac{1}{2}\cos(6\pi/16)$, $\tfrac{1}{2}\cos(7\pi/16)$, and $\tfrac{1}{2}\cos(5\pi/16)$, respectively. The number of multiplications is now reduced to 20. To eliminate one more multiplication, $f(2)$ and $f(6)$ can be evaluated by

$$
\begin{bmatrix} f(2)\\ f(6) \end{bmatrix}
= \mathbf{A}
\begin{bmatrix}
(b-d)\,\bigl(s(0)-s(3)-s(4)+s(7)\bigr)\\
d\,\bigl(s(0)+s(1)-s(2)-s(3)-s(4)-s(5)+s(6)+s(7)\bigr)\\
-(b+d)\,\bigl(s(1)-s(2)-s(5)+s(6)\bigr)
\end{bmatrix}
\tag{3}
$$

where $\mathbf{A} = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & 1 \end{bmatrix}$, so that $f(1)$, $f(3)$, $f(5)$, and $f(7)$ become

$$
\begin{bmatrix} f(7)\\ f(5)\\ -f(1)\\ f(3) \end{bmatrix}
= \mathbf{A}^{*}
\begin{bmatrix}
\mathbf{A}\begin{bmatrix}
(e+a-f+c)\,\bigl(s(0)-s(7)\bigr)\\
(f-c)\,\bigl(s(0)-s(7)+s(6)-s(1)\bigr)\\
(a-e-f+c)\,\bigl(s(6)-s(1)\bigr)
\end{bmatrix}\\[8pt]
\mathbf{A}\begin{bmatrix}
(-a-c)\,\bigl(s(0)-s(7)+s(3)-s(4)\bigr)\\
c\,\bigl(s(0)-s(7)+s(3)-s(4)+s(6)-s(1)+s(2)-s(5)\bigr)\\
(e-c)\,\bigl(s(6)-s(1)+s(2)-s(5)\bigr)
\end{bmatrix}\\[8pt]
\mathbf{A}\begin{bmatrix}
(e-a-f-c)\,\bigl(s(3)-s(4)\bigr)\\
(f+c)\,\bigl(s(3)-s(4)+s(2)-s(5)\bigr)\\
(a+e-f-c)\,\bigl(s(2)-s(5)\bigr)
\end{bmatrix}
\end{bmatrix}
\tag{4}
$$

where $\mathbf{A}^{*} = \begin{bmatrix} \mathbf{I} & \mathbf{I} & \mathbf{0}\\ \mathbf{0} & \mathbf{I} & -\mathbf{I} \end{bmatrix}$ with $\mathbf{I}$ being the 2 × 2 identity matrix. By now, the DCT needs only 12 multiplications. Similarly, the inverse transform is

$$
s'(i) = \frac{1}{2}\sum_{k=0}^{7} C(k)\,f(k)\cos\left(\frac{(2i+1)k\pi}{16}\right)
\tag{5}
$$

where $C(k) = 1/\sqrt{2}$ if $k = 0$ and $C(k) = 1$ otherwise. Careful observation reveals that it is straightforward to derive the inverse transform IDCT from the DCT by

$$
\begin{bmatrix} s'(0)\\ s'(1)\\ s'(2)\\ s'(3)\\ s'(4)\\ s'(5)\\ s'(6)\\ s'(7) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & b & d & e & f & -a & c\\
1 & -1 & d & -b & -f & -a & -c & -e\\
1 & -1 & -d & b & c & e & -f & -a\\
1 & 1 & -b & -d & -a & c & -e & -f\\
1 & 1 & -b & -d & a & -c & e & f\\
1 & -1 & -d & b & -c & -e & f & a\\
1 & -1 & d & -b & f & a & c & e\\
1 & 1 & b & d & -e & -f & a & -c
\end{bmatrix}
\begin{bmatrix} f(0)\\ f(4)\\ f(2)\\ f(6)\\ f(7)\\ f(5)\\ -f(1)\\ f(3) \end{bmatrix}
\tag{6}
$$

By shifting and swapping the corresponding rows, (6) can be decomposed as

$$
\begin{bmatrix} s'(0)\\ s'(6)\\ s'(3)\\ s'(2)\\ s'(7)\\ s'(1)\\ s'(4)\\ s'(5) \end{bmatrix}
=
\begin{bmatrix}
\begin{bmatrix} \mathbf{B}+\mathbf{C}\\ \mathbf{B}-\mathbf{C} \end{bmatrix} + \mathbf{D}\\[8pt]
\begin{bmatrix} \mathbf{B}+\mathbf{C}\\ \mathbf{B}-\mathbf{C} \end{bmatrix} - \mathbf{D}
\end{bmatrix}
\tag{7}
$$

The matrix is made up of three building blocks with symmetric coefficient matrices:

$$
\mathbf{B} = \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}
\begin{bmatrix} f(0)\\ f(4) \end{bmatrix},
\qquad
\mathbf{C} = \begin{bmatrix} b & d\\ d & -b \end{bmatrix}
\begin{bmatrix} f(2)\\ f(6) \end{bmatrix},
\qquad
\mathbf{D} = \begin{bmatrix}
e & f & -a & c\\
f & a & c & e\\
-a & c & -e & -f\\
c & e & -f & -a
\end{bmatrix}
\begin{bmatrix} f(7)\\ f(5)\\ -f(1)\\ f(3) \end{bmatrix}
\tag{8}
$$

Both the FDCT and the fast IDCT have the same coefficients and matrix blocks. The algorithm is thus efficient in hardware implementation.
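The factorizations in (3), (4), (7), and (8) can be checked numerically with the sketch below. It is an illustrative reading of the equations, not the authors' DSP code. The function names are hypothetical, the constants a to f are computed from their cosine definitions rather than taken from the rounded printed values, and the scale factor `G` = 1/(2√2) on the f(0) and f(4) terms is written out explicitly as an assumption so that the output matches (1) numerically (the matrices in the text print these entries as 1). The same two helper blocks serve both directions, which is the sense in which the forward and inverse transforms share coefficients.

```python
import math

# Half-cosine constants, named as in the text: (1/2) cos(k * pi / 16).
a, b, c = (0.5 * math.cos(k * math.pi / 16) for k in (1, 2, 3))
d, e, f = (0.5 * math.cos(k * math.pi / 16) for k in (6, 7, 5))
G = 0.5 / math.sqrt(2.0)  # assumption: explicit scale of the f(0), f(4) terms


def even_block(x, y):
    """Apply [[b, d], [d, -b]] to (x, y) with 3 products, as in (3)."""
    m1, m2, m3 = (b - d) * x, d * (x + y), -(b + d) * y
    return m1 + m2, m2 + m3                      # matrix A


def odd_block(p, q, r, t):
    """Apply the 4 x 4 block of (2) to (p, q, r, t) with 9 products, as in (4).

    The bracketed sums of constants would be precomputed in practice.
    """
    u = ((e + a - f + c) * p, (f - c) * (p + q), (a - e - f + c) * q)
    v = ((-a - c) * (p + r), c * (p + r + q + t), (e - c) * (q + t))
    w = ((e - a - f - c) * r, (f + c) * (r + t), (a + e - f - c) * t)
    u, v, w = [(m[0] + m[1], m[1] + m[2]) for m in (u, v, w)]   # matrix A
    return u[0] + v[0], u[1] + v[1], v[0] - w[0], v[1] - w[1]   # matrix A*


def fdct8(s):
    """Forward 8-point DCT: 3 + 9 products plus the two scalings by G."""
    x = s[0] - s[3] - s[4] + s[7]
    y = s[1] - s[2] - s[5] + s[6]
    f2, f6 = even_block(x, y)
    f7, f5, neg_f1, f3 = odd_block(s[0] - s[7], s[6] - s[1],
                                   s[3] - s[4], s[2] - s[5])
    f0 = G * sum(s)
    f4 = G * (s[0] - s[1] - s[2] + s[3] + s[4] - s[5] - s[6] + s[7])
    return [f0, -neg_f1, f2, f3, f4, f5, f6, f7]


def ifdct8(F):
    """Inverse 8-point DCT assembled from the blocks B, C, D of (7) and (8)."""
    B = (G * (F[0] + F[4]), G * (F[0] - F[4]))
    C = even_block(F[2], F[6])
    D = odd_block(F[7], F[5], -F[1], F[3])
    even = (B[0] + C[0], B[1] + C[1], B[0] - C[0], B[1] - C[1])
    out = [0.0] * 8
    # Output ordering of (7): s'(0), s'(6), s'(3), s'(2) use +D,
    # and s'(7), s'(1), s'(4), s'(5) use -D.
    for i_plus, i_minus, ev, od in zip((0, 6, 3, 2), (7, 1, 4, 5), even, D):
        out[i_plus] = ev + od
        out[i_minus] = ev - od
    return out


def dct8_direct(s):
    """Reference: direct evaluation of (1)."""
    return [0.5 * (1 / math.sqrt(2) if k == 0 else 1.0)
            * sum(s[i] * math.cos((2 * i + 1) * k * math.pi / 16)
                  for i in range(8))
            for k in range(8)]


s = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # arbitrary sample row
F = fdct8(s)
err_fwd = max(abs(x - y) for x, y in zip(F, dct8_direct(s)))
err_inv = max(abs(x - y) for x, y in zip(ifdct8(F), s))
print(err_fwd < 1e-9, err_inv < 1e-9)
```

If the scaling by `G` is folded into a later stage such as the quantization table (an assumption, not something the text states), the remaining arithmetic uses the 12 products counted in the text.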

3.2. 2D FDCT. A 2D DCT is implemented by separating it into a pair of 1D DCTs, as illustrated in Figure 3. Consider a 2D spatial data sequence $s(i, j)$, $0 \le i, j \le 7$, in a matrix $\mathbf{S}$ of size 8 × 8. The corresponding 2D DCT sequence $f(u, v)$, $0 \le u, v \le 7$, in the frequency domain matrix $\mathbf{F}$ is defined as

$$
f(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{i=0}^{7}\sum_{j=0}^{7} s(i,j)
\cos\left(\frac{(2i+1)u\pi}{16}\right)\cos\left(\frac{(2j+1)v\pi}{16}\right)
\tag{9}
$$

The inverse transformation, represented by $\mathbf{S}' = [s'(i,j)]$, is

$$
s'(i,j) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,f(u,v)
\cos\left(\frac{(2i+1)u\pi}{16}\right)\cos\left(\frac{(2j+1)v\pi}{16}\right)
\tag{10}
$$

where $C(u), C(v) = 1/\sqrt{2}$ if $u, v = 0$ and $C(u), C(v) = 1$ otherwise. Define a matrix $\mathbf{M} = [m(u,v)]$, where $m(u,v)$ is the matrix element in the $u$th row and $v$th column:

$$
m(u,v) =
\begin{cases}
\dfrac{1}{2\sqrt{2}}, & u = 0,\ 0 \le v \le 7,\\[8pt]
\dfrac{1}{2}\cos\dfrac{(2v+1)u\pi}{16}, & 1 \le u \le 7,\ 0 \le v \le 7.
\end{cases}
\tag{11}
$$

A 2D DCT data matrix $\mathbf{F}$ and its inverse matrix $\mathbf{S}'$ can then be written as

$$
\mathbf{F} = \mathbf{M}\,\mathbf{S}\,\mathbf{M}^{T}
\tag{12a}
$$

$$
\mathbf{S}' = \mathbf{M}^{T}\,\mathbf{F}\,\mathbf{M}
\tag{12b}
$$
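The matrix form in (11), (12a), and (12b) can be exercised directly. The sketch below builds M, checks that its rows are orthonormal, and confirms that (12b) undoes (12a). It uses plain Python lists so that it runs without third-party packages; the helper names and the sample block are chosen for this example only.

```python
import math


def dct_matrix():
    """Matrix M of (11): element m(u, v) in row u, column v."""
    return [[1 / (2 * math.sqrt(2)) if u == 0
             else 0.5 * math.cos((2 * v + 1) * u * math.pi / 16)
             for v in range(8)] for u in range(8)]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


M = dct_matrix()
MT = transpose(M)


def dct2(S):
    """Forward 2D DCT, F = M S M^T, as in (12a)."""
    return matmul(matmul(M, S), MT)


def idct2(F):
    """Inverse 2D DCT, S' = M^T F M, as in (12b)."""
    return matmul(matmul(MT, F), M)


# Orthogonality check: M M^T should be the identity matrix.
MMT = matmul(M, MT)
identity_err = max(abs(MMT[i][j] - (1.0 if i == j else 0.0))
                   for i in range(8) for j in range(8))

# Round trip on an arbitrary sample block.
S = [[float((3 * i + 5 * j) % 17) for j in range(8)] for i in range(8)]
S_back = idct2(dct2(S))
round_trip_err = max(abs(S[i][j] - S_back[i][j])
                     for i in range(8) for j in range(8))

# A flat block of value 8 has all its energy in the DC term.
flat_dc = dct2([[8.0] * 8 for _ in range(8)])[0][0]
print(identity_err < 1e-12, round_trip_err < 1e-9, round(flat_dc, 6))
```

Computing `matmul(M, S)` first and then multiplying by `MT` is the row-column separation: one pass of 1D transforms along one axis followed by a pass along the other.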

Because the base vectors of DCT are orthogonal, the inverse transform IDCT can be easily obtained as shown in (12b). DCT transforms a highly correlated image into a few transform coefficients. Conventional image coding techniques use the quantization process to achieve a higher compression ratio. Therefore, F includes only a few nonzero elements in the low frequency range, which makes it possible to design an efficient IDCT algorithm by fully utilizing the computation efficiency in (3) and (4) and the energy compactness property of F.

Figure 3: The FDCT algorithm of 2D DCT calculated by one pair of 1D DCTs. An 8 × 8 block of the picture in the spatial domain is transformed by rows (first DCT), and the result is then transformed by columns (second DCT) to give the final result in the frequency domain.

The 2D spatial data matrix S′ in (12b) can be considered a linear combination of the base images, each being the outer product of the column and row vectors in M. This interpretation makes it easier for (12a) and (12b) to exploit the sparseness of the 2D DCT matrix F and to calculate the spatial data matrix S′. The proposed FDCT algorithm achieves higher computation performance than many previous algorithms. The row (or column) data can be processed by using the 1D DCT (or IDCT) first, with the results stored in transposition memory. By exploiting the redundancy in the coefficients of DCT, the algorithm reduces the complexity of the 2D DCT of an 8 × 8 block to only 24 multiplications.
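The base-image interpretation above can be made concrete with a small sketch. It reconstructs S′ by accumulating one outer product per nonzero coefficient of F and compares the result with the full evaluation of (12b). The function names and the sample coefficients are hypothetical, and the code is meant only to show why a sparse F shortens the inverse transform, not to reproduce the paper's operation counts.

```python
import math

# Matrix M of (11), built inline so that the example is self-contained.
M = [[1 / (2 * math.sqrt(2)) if u == 0
      else 0.5 * math.cos((2 * v + 1) * u * math.pi / 16)
      for v in range(8)] for u in range(8)]


def idct2_full(F):
    """S' = M^T F M evaluated entry by entry over all 64 coefficients."""
    return [[sum(M[u][i] * F[u][v] * M[v][j]
                 for u in range(8) for v in range(8))
             for j in range(8)] for i in range(8)]


def idct2_sparse(F):
    """Sum f(u, v) times the base image outer(M[u], M[v]) for nonzero f only."""
    S = [[0.0] * 8 for _ in range(8)]
    used = 0
    for u in range(8):
        for v in range(8):
            if F[u][v] != 0:
                used += 1
                for i in range(8):
                    for j in range(8):
                        S[i][j] += F[u][v] * M[u][i] * M[v][j]
    return S, used


# Hypothetical quantized block: only four low frequency coefficients survive.
F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0], F[2][1] = 240.0, -18.0, 12.0, 5.0

S_sparse, used = idct2_sparse(F)
S_full = idct2_full(F)
max_diff = max(abs(S_sparse[i][j] - S_full[i][j])
               for i in range(8) for j in range(8))
print(used, max_diff < 1e-9)
```

Only the base images whose coefficients are nonzero contribute, so the work of the sparse version scales with the number of surviving coefficients rather than with the full block size.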

4. Performance Evaluation in Digital Image Watermarking

Discrete cosine transform (DCT) can map original digital data into the frequency domain by cosine waveforms, and conversely the inverse discrete cosine transform (IDCT) transfers the frequency domain data into the spatial domain. The associated memory size, bandwidth, and safety issues in the transformation algorithms are of significant concern. Direct computation of a 2D DCT (block of N × N pixels) requires $N^4$ multiplications, while direct realization (with the row-column separation) of an 8 × 8 DCT requires $2 \times 8^3 = 1024$ multiplications. Although many algorithms have been proposed to optimize multiplications, they either required a huge amount of computation [6–8, 10] or suffered from low efficiency [4, 5, 9]. Table 1 lists the number of multiplications required in previous works compared with the proposed FDCT algorithm. The latter is much more efficient in computation, as described in (2) to (12a) and (12b), where only 12 × 2 = 24 multiplications are needed. By comparison, Jridi et al. [5] needed 256 multiplications as they simply calculated the 8 × 8 standard DCT. Similarly, Ko et al. [7] required 31 multiplications, while Sung et al. [8], based on subband decomposition, needed 25 multiplications. Manoria and Dixit [9] used Strassen's matrix multiplication, but their 2D DCT (8 × 8) needed 28 multiplications. Khan et al. [10] took 40 multiplications for the 2D DCT operation on an image of 8 × 8 block. In summary, both the FDCT algorithm and its inverse transform are more efficient than many previous works in reducing the complexity of computation. Implementation in a digital signal processor (DSP) for real-time digital watermarking becomes feasible.

Table 1: The number of multiplications for 2D DCT by the algorithms of previous works and by the proposed FDCT algorithm.

    Algorithm                                           Number of multiplications for 2D DCT (8 × 8 block)
    Direct computation                                  8^4 = 4096
    Direct computation (with line-column separation)    2 × 8^3 = 1024
    Jridi et al. [5]                                    256
    Ko et al. [7]                                       31
    Sung et al. [8]                                     25
    Manoria and Dixit [9]                               28
    Khan et al. [10]                                    40
    FDCT algorithm (this work)                          24

Figure 4: Software environment in implementing the FDCT algorithm on DSP (TMS320C6701). The tool flow comprises C/C++ source files, the C/C++ compiler, linear assembly, the assembly optimizer, assembler source, the assembler, COFF object files, the archiver, the library-build utility, the run-time-support library, the linker, the executable COFF file, the hex conversion utility, the cross-reference lister, debugging tools, and the EPROM programmer.
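The two direct-computation counts quoted in this section follow from a short argument, sketched here for an N × N block. In the direct form of (9), each of the $N^2$ output coefficients is a sum over $N^2$ input samples, with one multiplication counted per term. With row-column separation, the block is processed by $N$ row transforms followed by $N$ column transforms, each a direct 1D DCT with $N^2$ multiplications:

$$
\underbrace{N^{2}}_{\text{outputs}}\times\underbrace{N^{2}}_{\text{terms per output}} = N^{4},
\qquad
\underbrace{2N}_{\text{1D transforms}}\times\underbrace{N^{2}}_{\text{multiplications each}} = 2N^{3}.
$$

For $N = 8$ these give $8^4 = 4096$ and $2 \times 8^3 = 1024$, matching the first two rows of Table 1.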

The process of embedding a watermark in a digital image for copyright protection and marketing applications has been proposed over the past decade. The conventional technique is to embed a secret bit string into an image in the spatial, frequency, or wavelet domain. The FDCT algorithm is implemented in a digital signal processor (TMS320C6701) to validate its efficiency in digital watermarking applications. The signal processor, in both fixed-point and floating-point, is supported by a set of software development tools with a C/C++ compiler, an assembly optimizer, a linker, and assorted utilities, as shown in Figure 4. The proposed FDCT algorithm is written in C language, and the C/C++ compiler is able to perform optimization at different levels (n = 0, 1, 2, 3) of clock cycles and code size. The lowest level (n = 0) performs loop rotation, allocates variables to registers, and simplifies expressions and statements; the first level (n = 1) adds constant propagation and removal of unused assignments; the second level (n = 2) adds software pipelining, removal of unused global assignments, loop unrolling, and incremented pointers; and the highest level (n = 3) adds optimization by simplifying functions with return values never used, removing all functions never called, inlining calls to small functions, and reordering function declarations so that the attributes of called functions are known when the caller is optimized. The n = 0 and 1 levels can efficiently reduce the code size, while n = 2 and 3 enhance the execution speed with larger code size. The computation time and code size at different levels of optimization are listed in Table 2. With increasing code size, the clock cycles and processing time decrease, so the best optimization is n = 3.

Table 2: Processing time of the FDCT algorithm in DSP.

    Optimization level (n)    Clocks (cycles)    Time (ms)    Code size (KB)
    None                      22141816           133          21
    0                         15304558           92           19
    1                         11207925           67           22
    2                         5715198            35           36
    3                         5653914            34           35
    Matlab                    N/A                610          N/A
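The figures in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below derives the clock rate implied by each row (cycles divided by time) and two speedup ratios. The clock rate is not stated in the text, so the value computed here is an inference from the table, and the variable names are chosen for this example.

```python
# (optimization level, clock cycles, time in ms) as listed in Table 2.
rows = [("none", 22141816, 133), ("0", 15304558, 92), ("1", 11207925, 67),
        ("2", 5715198, 35), ("3", 5653914, 34)]
matlab_ms = 610

# Implied clock rate in MHz for each row: cycles / seconds / 1e6.
implied_mhz = {level: cycles / (ms * 1e-3) / 1e6 for level, cycles, ms in rows}

# Speedup of n = 3 over the unoptimized build, measured in clock cycles.
speedup_cycles = rows[0][1] / rows[-1][1]

# Speedup of the n = 3 DSP build over the Matlab timing, measured in time.
speedup_vs_matlab = matlab_ms / rows[-1][2]

for level, mhz in implied_mhz.items():
    print(f"level {level}: about {mhz:.0f} MHz implied")
print(f"cycles, none vs n=3: {speedup_cycles:.2f}x")
print(f"time, Matlab vs n=3: {speedup_vs_matlab:.1f}x")
```

All rows imply a clock in the range of roughly 163 to 167 MHz, so the cycle counts and times in the table are mutually consistent to within rounding of the millisecond values.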

For a digital watermark (32 × 32 block size) embedded in an original image (256 × 256), calculation by the FDCT algorithm in the frequency domain and then inverse transform of the encrypted data back to the spatial domain image uses 5653914 clocks (n = 3), corresponding to 34 ms with 35 KB code size. Implementation shows that it takes only 0.24 seconds to have the watermark embedded in the original image. Extraction of the watermark by the inverse transform IDCT is within 0.21 seconds. Real-time implementation of the FDCT algorithm in DSP for image processing is shown to be very efficient.
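To make the embed-then-invert flow concrete, the sketch below hides one watermark bit in one 8 × 8 block. The text does not specify the embedding rule, so everything here is a hypothetical illustration: the coefficient position `POS`, the strength `STRENGTH`, and the sign-based rule are assumptions, as is the observation that a 256 × 256 image contains 32 × 32 blocks of 8 × 8 pixels, which would allow one watermark pixel per block. Pixel rounding and clipping are ignored.

```python
import math

# Matrix M of (11), built inline so that the example is self-contained.
M = [[1 / (2 * math.sqrt(2)) if u == 0
      else 0.5 * math.cos((2 * v + 1) * u * math.pi / 16)
      for v in range(8)] for u in range(8)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(8)) for j in range(8)]
            for i in range(8)]


MT = [list(col) for col in zip(*M)]


def dct2(S):
    return matmul(matmul(M, S), MT)      # F = M S M^T, as in (12a)


def idct2(F):
    return matmul(matmul(MT, F), M)      # S' = M^T F M, as in (12b)


POS = (3, 2)       # hypothetical mid-frequency coefficient position
STRENGTH = 12.0    # hypothetical embedding strength


def embed_bit(block, bit):
    """Force the sign of one coefficient to carry the bit, then invert."""
    F = dct2(block)
    F[POS[0]][POS[1]] = STRENGTH if bit else -STRENGTH
    return idct2(F)


def extract_bit(block):
    """Read the bit back from the sign of the same coefficient."""
    return 1 if dct2(block)[POS[0]][POS[1]] > 0 else 0


blocks_per_side = 256 // 8   # 32 blocks per side, 32 x 32 blocks in total
block = [[float(100 + ((i * j) % 7)) for j in range(8)] for i in range(8)]
recovered = [extract_bit(embed_bit(block, bit)) for bit in (0, 1)]
print(blocks_per_side, recovered)  # 32 [0, 1]
```

Each embedded bit costs one forward and one inverse block transform, which is why the speed of the DCT and IDCT dominates the embedding and extraction times reported above.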

5. Conclusions

(1) A fast discrete cosine transform (FDCT) algorithm that utilizes the energy compactness and matrix sparseness properties in the frequency domain for higher computation performance is developed. For a JPEG image of 8 × 8 block size in the spatial domain, the algorithm first decomposes the 2D DCT into one pair of 1D DCTs, and the calculation can be completed in only 24 multiplications. The 2D spatial data is a linear combination of base images obtained by the outer product of the column and row vectors of cosine functions, such that the inverse DCT is as efficient. The algorithm is shown to achieve high performance compared to many previous works.

(2) The algorithm optimizes a 2D DCT by exploiting the redundancy of the frequency coefficients so as to facilitate the implementation in a digital signal processor (DSP). For a spatial domain data matrix S, the 2D DCT data matrix F includes only a few nonzero elements in the low frequency range, which makes it possible to design an efficient IDCT algorithm. With the energy compactness property, F and its inverse matrix S′ can be written as linear combinations of the cosine functions, such that both the FDCT and its inverse transform are shown to have the same coefficients and matrix blocks for efficient hardware implementation.

(3) An example of digital image watermarking is applied to demonstrate the efficiency of the FDCT algorithm. Hardware implementation of watermarking in DSP shows that it takes only 0.24 seconds to embed a digital watermark of 32 × 32 block size into a digital image of size 256 × 256. Implementation also shows that extraction of the watermark can be completed within 0.21 seconds. The FDCT algorithm in DSP is shown to be efficient and effective in real-time implementation of digital image watermarking.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” Institute of Electrical and Electronics Engineers Transactions on Computers, vol. 23, pp. 90–93, 1974.

[2] C. Sun and E.-H. Yang, “An efficient DCT-based image compression system based on Laplacian transparent composite model,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 886–900, 2015.

[3] M. Jridi, A. Alfalou, and P. K. Meher, “Optimized architecture using a novel subexpression elimination on Loeffler algorithm for DCT-based image compression,” VLSI Design, vol. 2012, Article ID 209208, 12 pages, 2012.

[4] P. R. Kumbhare and U. M. Gokhale, “Design and implementation of 2D-DCT by using Arai algorithm for image compression,” Journal of The International Association of Advanced Technology and Science, vol. 16, no. 5, article 5, 2015.

[5] M. Jridi, Y. Ouerhani, and A. Alfalou, “Low complexity DCT engine for image and video compression,” Real-Time Image and Video Processing, vol. 8656, pp. 1–9, 2013.

[6] M. Marimuthu, R. Muthaiah, and P. Swaminathan, “Sub-band based DCT for image compression,” Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 24, pp. 5387–5390, 2012.

[7] L.-T. Ko, J.-E. Chen, H.-C. Hsin, Y.-S. Shieh, and T.-Y. Sung, “A unified algorithm for subband-based discrete cosine transform,” Mathematical Problems in Engineering, vol. 2012, Article ID 912194, 31 pages, 2012.

[8] T.-Y. Sung, Y.-S. Shieh, and H.-C. Hsin, “An efficient VLSI linear array for DCT/IDCT using subband decomposition algorithm,” Mathematical Problems in Engineering, vol. 2010, Article ID 185398, 21 pages, 2010.

[9] M. Manoria and P. Dixit, “An efficient DCT compression technique using Strassen's matrix multiplication algorithm,” International Journal of Computer Applications, vol. 60, no. 9, pp. 45–50, 2012.

[10] S. Khan, E. Casseau, and D. Menard, “High performance discrete cosine transform operator using multimedia oriented subword parallelism,” Advances in Computer Engineering, vol. 2015, Article ID 405856, 10 pages, 2015.

[11] P. Agrawal and S. K. Sharma, “Review paper on image compression using DCT, KLT and DWT,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, no. 9, pp. 928–931, 2014.


Mathematical Problems in Engineering 3

0 1 5 6 14 15 27 28

2 4 7 13 16 26 29 42

3 8 12 17 25 30 41 43

9 11 18 24 31 40 44 53

10 19 23 32 39 45 52 54

20 22 33 38 46 51 55 60

21 34 37 47 50 56 59 61

35 36 48 49 57 58 62 63

Figure 2 The distribution of coefficients in DCT (a) low frequency position (heavy color) (b) low-middle-frequency position (light color)(c) high-middle frequency position (light color with dots) and (d) high frequency position (white)

multiplications can be achieved by regrouping the coefficientsby using the symmetric property to yield

[[[[[[[[[[

119891 (2)119891 (6)119891 (7)119891 (5)119891 (1)119891 (3)

]]]]]]]]]]

=[[[[[[[[[[

119887 119889 0 0 0 0119889 minus119887 0 0 0 00 0 119890 119891 minus119886 1198880 0 119891 119886 119888 1198900 0 minus119886 119888 minus119890 minus1198910 0 119888 119890 minus119891 minus119886

]]]]]]]]]]

[[[[[[[[[[

119904 (0) minus 119904 (3) minus 119904 (4) + 119904 (6)119904 (1) minus 119904 (2) minus 119904 (5) + 119904 (6)119904 (0) minus 119904 (7)119904 (6) minus 119904 (1)119904 (3) minus 119904 (4)119904 (2) minus 119904 (5)

]]]]]]]]]]

(2)

where 119886 = 0488 119887 = 0463 119888 = 0416 119889 = 0192 119890 = 0098and 119891 = 0278 The number of multiplications is now reducedto 20 To eliminate one more multiplication 119891(2) and 119891(6)can be evaluated by

[119891 (2)119891 (6)]

= A[[[[

(119887 minus 119889) (119904 (0) minus 119904 (3) minus 119904 (4) + 119904 (7))119889 (119904 (0) + 119904 (1) minus 119904 (2) minus 119904 (3) minus 119904 (4) minus 119904 (5) + 119904 (6) + 119904 (7))

minus (119887 minus 119889) (119904 (1) minus (2) minus 119904 (5) + 119904 (6))]]]](3)

where A = [ 1 1 00 1 1 ] so that 119891(1) 119891(3) 119891(5) and 119891(7) become

[[[[[[[[

119891 (7)119891 (5)minus119891 (1)119891 (3)

]]]]]]]]= Alowast

[[[[[[[[[[[[[[[[[[[[[[[[[

A[[[[

(119890 + 119886 minus 119891 + 119888) (119904 (0) minus 119904 (7))(119891 minus 119888) (119904 (0) minus 119904 (7) + 119904 (6) minus 119904 (1))

(119886 minus 119890 minus 119891 + 119888) (119904 (6) minus 119904 (1))]]]]

A[[[[

(minus119886 minus 119888) (119904 (0) minus 119904 (7) + 119904 (3) minus 119904 (4))119888 (119904 (0) minus 119904 (7) + 119904 (3) minus 119904 (4) + 119904 (6) minus 119904 (1) + 119904 (2) minus 119904 (5))

(119890 minus 119888) (119904 (6) minus 119904 (1) + 119904 (2) minus 119904 (5))]]]]

A[[[[

(119890 minus 119886 minus 119891 minus 119888) (119904 (3) minus 119904 (4))(119891 + 119888) (119904 (3) minus 119904 (4) + 119904 (2) minus 119904 (5))

(119886 + 119890 minus 119891 minus 119888) (119904 (2) minus 119904 (5))]]]]

]]]]]]]]]]]]]]]]]]]]]]]]]

(4)

4 Mathematical Problems in Engineering

whereAlowast = [ I I 00 I minusI ]with I being the 2 times 2 identity matrix By

nowDCTonly needs 12multiplications Similarly the inversetransform is

1199041015840 (119894) = 127sum119896=0

119862 (119896) 119891 (119896) cos((2119894 + 1) 11989612058716 ) (5)

where119862(119896) = 1radic2 if 119896 = 0 and119862(119896) = 1 for others Carefulobservation reveals that it is straightforward to derive inversetransform IDCT from DCT by

[[[[[[[[[[[[[[[[[[[[[

1199041015840 (0)1199041015840 (1)1199041015840 (2)1199041015840 (3)1199041015840 (4)1199041015840 (5)1199041015840 (6)1199041015840 (7)

]]]]]]]]]]]]]]]]]]]]]

=

[[[[[[[[[[[[[[[[[[

1 1 119887 119889 119890 119891 minus119886 1198881 minus1 119889 minus119887 minus119891 minus119886 minus119888 minus1198901 minus1 minus119889 119887 119888 119890 minus119891 minus1198861 1 minus119887 minus119889 minus119886 119888 minus119890 minus1198911 1 minus119887 minus119889 119886 minus119888 119890 1198911 minus1 minus119889 119887 minus119888 minus119890 119891 1198861 minus1 119889 minus119887 119891 119886 119888 1198901 1 119887 119889 minus119890 minus119891 119886 minus119888

]]]]]]]]]]]]]]]]]]

[[[[[[[[[[[[[[[[[[

119891 (0)119891 (4)119891 (2)119891 (6)119891 (7)119891 (5)minus119891 (1)119891 (3)

]]]]]]]]]]]]]]]]]]

(6)

By shifting and swapping the corresponding rows (6) can bedecomposed as

[[[[[[[[[[[[[[[[[[[[[

1199041015840 (0)1199041015840 (6)1199041015840 (3)1199041015840 (2)1199041015840 (7)1199041015840 (1)1199041015840 (4)1199041015840 (5)

]]]]]]]]]]]]]]]]]]]]]

= [[[[[[

[B + CB minus C

] +D

[B + CB minus C

] minusD

]]]]]] (7)

Thematrix is made up in three building blocks with symmet-ric coefficient matrices

B = [1 11 minus1] [

119891 (0)119891 (4)]

C = [119887 119889119889 minus119887][

119891 (2)119891 (6)]

D = [[[[[[

119890 119891 minus119886 119888119886 119888 119890

minus119890 minus119891symm minus119886

]]]]]]

[[[[[[

119891 (7)119891 (5)minus119891 (1)119891 (3)

]]]]]]

(8)

Both FDCT and fast IDCT have the same coefficients andmatrix blocks It is thus efficient in hardware implementation

32 2D FDCT Implementation of a 2DDCT is by separatinginto a pair 1D DCT as illustrated in Figure 3 Consider a 2Dspatial data sequence s(i j) 0 le 119894 119895 le 7 in matrix S of 8 times 8and the corresponding 2D DCT sequence 119891(119906 V) 0 le 119906 V le7 in frequency domain of matrix F is defined as

119891 (119906 V) = 14119862 (119906) 119862 (V)7sum119894=0

7sum119895=0

119904 (119894 119895) cos((2119894 + 1) 11990612058716 )

sdot cos((2119895 + 1) V12058716 ) (9)

The inverse transformation represented by S1015840 S1015840 = 1199041015840(119894 119895)1199041015840 (119894 119895) = 14

7sum119906=0

7sumV=0119862 (119906) 119862 (V) 119891 (119906 V)

sdot cos((2119894 + 1) 11990612058716 ) cos((2119895 + 1) V12058716 ) (10)

where 119862(119906) 119862(V) = 1radic2 if 119906 V = 0 and C(u) 119862(V) = 1for others By defining a matrixM = 119898(119906 V) where119898(119906 V)represents thematrix element in the uth row and vth column

119898(119906 V)

=

12radic2 119906 = 0 cup 0 le V le 712 cos (2V + 1) 11990612058716 1 le 119906 le 7 cup 0 le V le 7

(11)

A 2D DCT data matrix F and its inverse matrix S1015840 can bewritten as

F = MSM119879 (12a)

S1015840 = M119879FM (12b)

Because the base vectors of DCT are orthogonal the inversetransform IDCT can therefore be easily obtained as shown in

Mathematical Problems in Engineering 5

Columns

Rows

Row block of pixels

Result after DCT by rowsFinal result after DCT by columns

Frequency domain

DCT transform

DCT transform

A block 8 times 8 of picture in spatial domain

Figure 3 The FDCT algorithm of 2D DCT calculated by one pair of 1D DCTs

(12b) DCT transforms high correlated image into a few trans-form coefficients Conventional image coding techniques usethe quantization process to achieve higher compression ratioTherefore F includes only a few nonzero elements in the lowfrequency range which makes it possible to design efficientIDCT algorithm by fully utilizing the computation efficiencyin (3) and (4) and the energy compactness property of F

The 2D spatial datamatrix S1015840 in (12b) can be considered aslinear combination of the base images or the outer product ofthe column and row vectors inM This interpretation makesit easier for (12a) and (12b) tomanipulate the sparseness of the2D DCT matrix F and to calculate the spatial data matrix S1015840The proposed FDCT algorithm achieves high computationperformance over many other previous algorithms The row(or column) data can be processed by using 1D DCT (orIDCT) first with the results stored in transposition memoryBy exploiting the redundancy in the coefficients of DCT thealgorithm reduces the complexity of 2DDCTof an 8times 8 blockto only 24 multiplications

4 Performance Evaluation inDigital Image Watermarking

Discrete cosine transform (DCT) maps original digital data into the frequency domain by cosine waveforms, and conversely the inverse discrete cosine transform (IDCT) transfers the frequency domain data back into the spatial domain. The associated memory size, bandwidth, and safety issues in the transformation algorithms are of significant concern. Direct computation of a 2D DCT on a block of $N \times N$ pixels requires $N^4$ multiplications, while direct realization with row-column separation of an $8 \times 8$ DCT requires $2 \times 8^3 = 1024$ multiplications. Although many algorithms have been proposed to reduce the number of multiplications, they either required a huge amount of computation [6–8, 10] or suffered from low efficiency [4, 5, 9]. Table 1 lists the number of multiplications required in previous works compared with the proposed FDCT algorithm. The latter is much more efficient in computation, as described in (2) to (12a) and (12b), where only $12 \times 2 = 24$ multiplications are needed. By comparison, Jridi et al. [5] needed 256 multiplications as they simply calculated the standard $8 \times 8$ DCT. Similarly, Ko et al. [7] required 31 multiplications, while Sung et al. [8], based on subband decomposition, needed 25 multiplications. Manoria and Dixit [9] used Strassen's matrix multiplication, but their 2D DCT ($8 \times 8$) needed 28 multiplications. Khan et al. [10] took 40 multiplications for the 2D DCT operation on an $8 \times 8$ image block. In summary, both the FDCT algorithm and its inverse transform are more efficient than many previous works in reducing the complexity of computation. Implementation in a digital signal processor (DSP) for real-time digital watermarking thus becomes feasible.

Table 1: The number of multiplications for 2D DCT by the algorithms of previous works and by the proposed FDCT algorithm.

Algorithm                                           Number of multiplications for 2D DCT (8 × 8 block)
Direct computation                                  8^4 = 4096
Direct computation (with line-column separation)    2 × 8^3 = 1024
Jridi et al. [5]                                    256
Ko et al. [7]                                       31
Sung et al. [8]                                     25
Manoria and Dixit [9]                               28
Khan et al. [10]                                    40
FDCT algorithm (this work)                          24

Figure 4: Software environment in implementing the FDCT algorithm on DSP (TMS320C6701). Blocks shown: C/C++ source files, C/C++ compiler, assembler source, linear assembly, assembly optimizer, assembly-optimized file, assembler, COFF object files, archiver, library of object files, library-build utility, run-time-support library, linker, executable COFF file, hex conversion utility, cross-reference lister, debugging tools, EPROM programmer, and the TMS320C6701 target.
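The operation counts quoted above can be reproduced with a few lines of arithmetic. The sketch below simply evaluates $N^4$ and $2N^3$ for $N = 8$ and compares each Table 1 entry with the row-column baseline; the dictionary and function names are illustrative only.

```python
def direct_2d(n):
    """Multiplications for direct 2D DCT of an n x n block."""
    return n ** 4

def row_column(n):
    """Multiplications for direct row-column separated 2D DCT."""
    return 2 * n ** 3

# Multiplication counts copied from Table 1 (8 x 8 block).
table1 = {
    "Jridi et al. [5]": 256,
    "Ko et al. [7]": 31,
    "Sung et al. [8]": 25,
    "Manoria and Dixit [9]": 28,
    "Khan et al. [10]": 40,
    "FDCT (this work)": 24,
}

n = 8
baseline = row_column(n)
print("direct:", direct_2d(n), "row-column:", baseline)
for name, mults in table1.items():
    # ratio of the row-column baseline to the listed count
    print(f"{name:24s} {mults:4d}  {baseline / mults:6.1f}x fewer")
```

The ratios are relative to the 1024-multiplication row-column baseline, not to each other.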

Embedding a watermark in a digital image for copyright protection and marketing applications has been studied over the past decade. The conventional technique is to embed a secret bit string into an image in the spatial, frequency, or wavelet domain. The FDCT algorithm is implemented in a digital signal processor (TMS320C6701) to validate its efficiency in digital watermarking applications. The signal processor, in both fixed-point and floating-point, is supported by a set of software development tools with a C/C++ compiler, an assembly optimizer, a linker, and assorted utilities as shown in Figure 4. The proposed FDCT algorithm is written in C, and the C/C++ compiler can perform optimization at four levels ($n = 0, 1, 2, 3$) that trade off clock cycles against code size. The lowest level ($n = 0$) performs loop rotation, allocates variables to registers, and simplifies expressions and statements; the first level ($n = 1$) performs constant propagation and removes unused assignments; the second level ($n = 2$) performs software pipelining, removes unused global assignments, unrolls loops, and uses incremented pointers; and the highest level ($n = 3$) simplifies functions whose return values are never used, removes all functions that are never called, inlines calls to small functions, and reorders function declarations so that the attributes of called functions are known when the caller is optimized. Levels $n = 0$ and 1 can efficiently reduce the code size, while levels $n = 2$ and 3 enhance the execution speed at the cost of larger code size. The computation time and code size at the different levels of optimization are listed in Table 2. With increasing code size, the clock cycles and processing time decrease, so the best optimization is obtained with $n = 3$.

Table 2: Processing time of the FDCT algorithm in DSP.

Optimization level (n)    Clocks (cycles)    Time (ms)    Code size (KB)
None                      22141816           133          21
0                         15304558           92           19
1                         11207925           67           22
2                         5715198            35           36
3                         5653914            34           35
Matlab                    N/A                610          N/A
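A few derived figures make Table 2 easier to read. The sketch below divides cycles by time to obtain the clock rate implied by each row, and computes the cycle reduction relative to the unoptimized build. The clock rate is an inference from the table, not a value stated in the text, and the variable names are illustrative.

```python
# (clock cycles, time in ms, code size in KB), copied from Table 2
table2 = {
    "none": (22141816, 133, 21),
    "0": (15304558, 92, 19),
    "1": (11207925, 67, 22),
    "2": (5715198, 35, 36),
    "3": (5653914, 34, 35),
}
MATLAB_MS = 610  # Matlab processing time from Table 2

def implied_clock_mhz(cycles, time_ms):
    """Clock rate implied by a row: cycles per second, in MHz."""
    return cycles / (time_ms * 1e-3) / 1e6

base_cycles = table2["none"][0]
for level, (cycles, ms, kb) in table2.items():
    print(f"n={level:4s} {implied_clock_mhz(cycles, ms):6.1f} MHz  "
          f"{base_cycles / cycles:4.2f}x fewer cycles  {kb} KB")

speedup_o3 = base_cycles / table2["3"][0]      # unoptimized vs n = 3
matlab_ratio = MATLAB_MS / table2["3"][1]      # Matlab time vs n = 3 time
print(f"n=3 speedup: {speedup_o3:.2f}x, Matlab/DSP time ratio: {matlab_ratio:.1f}x")
```

Every row implies roughly the same clock rate, which suggests the time column was obtained from the cycle counts at a fixed clock.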

For a digital watermark ($32 \times 32$ block size) embedded in an original image ($256 \times 256$), calculation by the FDCT algorithm in the frequency domain and then inverse transform of the encrypted data back to the spatial domain image uses 5653914 clocks ($n = 3$), corresponding to 34 ms with 35 KB code size. Implementation shows that it takes only 0.24 seconds to have the watermark embedded in the original image. Extraction of the watermark by the inverse transform IDCT is within 0.21 seconds. Real-time implementation of the FDCT algorithm in DSP for image processing is shown to be very efficient.
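This section does not spell out the embedding rule, so the following is only a minimal sketch under stated assumptions. A $256 \times 256$ image contains $32 \times 32 = 1024$ blocks of $8 \times 8$ pixels, which matches the number of watermark pixels, so the sketch assumes one watermark bit per block. The additive rule, the coefficient position `POS`, the strength `ALPHA`, and non-blind extraction against the original block are all hypothetical choices, and rounding or clipping of pixel values is ignored.

```python
import math

N = 8
ALPHA = 4.0   # hypothetical embedding strength
POS = (3, 4)  # hypothetical mid-frequency coefficient (u, v)

def dct_matrix():
    """M from Eq. (11)."""
    rows = []
    for u in range(N):
        cu = 1 / math.sqrt(2) if u == 0 else 1.0
        rows.append([0.5 * cu * math.cos((2 * v + 1) * u * math.pi / 16)
                     for v in range(N)])
    return rows

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(A):
    return [list(col) for col in zip(*A)]

M = dct_matrix()
MT = transpose(M)

def dct2(S):
    return matmul(matmul(M, S), MT)   # Eq. (12a): F = M S M^T

def idct2(F):
    return matmul(matmul(MT, F), M)   # Eq. (12b): S' = M^T F M

def embed_bit(block, bit):
    """Shift one coefficient up (bit 1) or down (bit 0), then invert."""
    F = dct2(block)
    u, v = POS
    F[u][v] += ALPHA if bit else -ALPHA
    return idct2(F)

def extract_bit(marked, original):
    """Non-blind extraction: compare the coefficient with the original."""
    u, v = POS
    return 1 if dct2(marked)[u][v] > dct2(original)[u][v] else 0

blocks_per_image = (256 // N) ** 2
watermark_bits = 32 * 32
block = [[float(100 + (7 * i + 3 * j) % 11) for j in range(N)]
         for i in range(N)]
recovered = [extract_bit(embed_bit(block, bit), block) for bit in (0, 1)]
print(blocks_per_image, watermark_bits, recovered)
```

Because every entry of a base image has magnitude at most 1/4, no pixel moves by more than `ALPHA / 4` under these assumed values.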

5. Conclusions

(1) A fast discrete cosine transform (FDCT) algorithm that utilizes the energy compactness and matrix sparseness properties in the frequency domain for higher computation performance is developed. For a JPEG image of $8 \times 8$ block size in the spatial domain, the algorithm first decomposes the 2D DCT into one pair of 1D DCTs, and the calculation can be completed in only 24 multiplications. The 2D spatial data is a linear combination of base images obtained by the outer product of the column and row vectors of cosine functions, such that the inverse DCT is as efficient. The algorithm is shown to achieve high performance compared with many previous works.

(2) The algorithm optimizes a 2D DCT by exploiting the redundancy of the frequency coefficients so as to facilitate the implementation in a digital signal processor (DSP). For a spatial domain data matrix $\mathbf{S}$, the 2D DCT data matrix $\mathbf{F}$ includes only a few nonzero elements in the low frequency range, which makes it possible to design an efficient IDCT algorithm. With the energy compactness property, $\mathbf{F}$ and its inverse matrix $\mathbf{S}'$ can be written as linear combinations of the cosine functions, such that both FDCT and its inverse transform have the same coefficients and matrix blocks for efficient hardware implementation.

(3) An example of digital image watermarking is applied to demonstrate the efficiency of the FDCT algorithm. Hardware implementation of watermarking in DSP shows that it takes only 0.24 seconds to embed a digital watermark of $32 \times 32$ block size into a digital image of size $256 \times 256$. Implementation also shows that extraction of the watermark can be completed within 0.21 seconds. The FDCT algorithm in DSP is shown to be efficient and effective in real-time implementation of digital image watermarking.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Transactions on Computers, vol. 23, pp. 90–93, 1974.

[2] C. Sun and E.-H. Yang, “An efficient DCT-based image compression system based on Laplacian transparent composite model,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 886–900, 2015.

[3] M. Jridi, A. Alfalou, and P. K. Meher, “Optimized architecture using a novel subexpression elimination on Loeffler algorithm for DCT-based image compression,” VLSI Design, vol. 2012, Article ID 209208, 12 pages, 2012.

[4] P. R. Kumbhare and U. M. Gokhale, “Design and implementation of 2D-DCT by using Arai algorithm for image compression,” Journal of The International Association of Advanced Technology and Science, vol. 16, no. 5, article 5, 2015.

[5] M. Jridi, Y. Ouerhani, and A. Alfalou, “Low complexity DCT engine for image and video compression,” Real-Time Image and Video Processing, vol. 8656, pp. 1–9, 2013.

[6] M. Marimuthu, R. Muthaiah, and P. Swaminathan, “Sub-band based DCT for image compression,” Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 24, pp. 5387–5390, 2012.

[7] L.-T. Ko, J.-E. Chen, H.-C. Hsin, Y.-S. Shieh, and T.-Y. Sung, “A unified algorithm for subband-based discrete cosine transform,” Mathematical Problems in Engineering, vol. 2012, Article ID 912194, 31 pages, 2012.

[8] T.-Y. Sung, Y.-S. Shieh, and H.-C. Hsin, “An efficient VLSI linear array for DCT/IDCT using subband decomposition algorithm,” Mathematical Problems in Engineering, vol. 2010, Article ID 185398, 21 pages, 2010.

[9] M. Manoria and P. Dixit, “An efficient DCT compression technique using Strassen's matrix multiplication algorithm,” International Journal of Computer Applications, vol. 60, no. 9, pp. 45–50, 2012.

[10] S. Khan, E. Casseau, and D. Menard, “High performance discrete cosine transform operator using multimedia oriented subword parallelism,” Advances in Computer Engineering, vol. 2015, Article ID 405856, 10 pages, 2015.

[11] P. Agrawal and S. K. Sharma, “Review paper on image compression using DCT, KLT and DWT,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, no. 9, pp. 928–931, 2014.


where $\mathbf{A}^{*} = \begin{bmatrix} \mathbf{I} & \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} & -\mathbf{I} \end{bmatrix}$ with $\mathbf{I}$ being the $2 \times 2$ identity matrix. By now the DCT needs only 12 multiplications. Similarly, the inverse transform is

$$s'(i) = \frac{1}{2}\sum_{k=0}^{7} C(k)\, f(k) \cos\left(\frac{(2i+1)k\pi}{16}\right), \tag{5}$$

where $C(k) = 1/\sqrt{2}$ if $k = 0$ and $C(k) = 1$ for others. Careful observation reveals that it is straightforward to derive the inverse transform IDCT from DCT by

$$\begin{bmatrix} s'(0) \\ s'(1) \\ s'(2) \\ s'(3) \\ s'(4) \\ s'(5) \\ s'(6) \\ s'(7) \end{bmatrix} = \begin{bmatrix} 1 & 1 & b & d & e & f & -a & c \\ 1 & -1 & d & -b & -f & -a & -c & -e \\ 1 & -1 & -d & b & c & e & -f & -a \\ 1 & 1 & -b & -d & -a & c & -e & -f \\ 1 & 1 & -b & -d & a & -c & e & f \\ 1 & -1 & -d & b & -c & -e & f & a \\ 1 & -1 & d & -b & f & a & c & e \\ 1 & 1 & b & d & -e & -f & a & -c \end{bmatrix} \begin{bmatrix} f(0) \\ f(4) \\ f(2) \\ f(6) \\ f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}. \tag{6}$$
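Equation (6) can be checked numerically against (5). The constants $a$ to $f$ are defined earlier in the paper and are not repeated in this passage, so the sketch below assumes $a = \sqrt{2}\cos(\pi/16)$, $b = \sqrt{2}\cos(2\pi/16)$, $c = \sqrt{2}\cos(3\pi/16)$, $d = \sqrt{2}\cos(6\pi/16)$, $e = \sqrt{2}\cos(7\pi/16)$, $f = \sqrt{2}\cos(5\pi/16)$, together with an overall factor $1/(2\sqrt{2})$ on the right-hand side of (6). These values are assumptions chosen so that the two forms agree, not values quoted from the text.

```python
import math

def k16(k):
    """Assumed constant: sqrt(2) * cos(k * pi / 16)."""
    return math.sqrt(2) * math.cos(k * math.pi / 16)

# Assumed values of the constants used in Eq. (6).
a, b, c, d, e, f = k16(1), k16(2), k16(3), k16(6), k16(7), k16(5)

# Coefficient matrix of Eq. (6); its columns multiply
# [f(0), f(4), f(2), f(6), f(7), f(5), -f(1), f(3)].
A6 = [
    [1,  1,  b,  d,  e,  f, -a,  c],
    [1, -1,  d, -b, -f, -a, -c, -e],
    [1, -1, -d,  b,  c,  e, -f, -a],
    [1,  1, -b, -d, -a,  c, -e, -f],
    [1,  1, -b, -d,  a, -c,  e,  f],
    [1, -1, -d,  b, -c, -e,  f,  a],
    [1, -1,  d, -b,  f,  a,  c,  e],
    [1,  1,  b,  d, -e, -f,  a, -c],
]

def idct_eq5(F):
    """1D inverse transform evaluated directly from Eq. (5)."""
    out = []
    for i in range(8):
        total = 0.0
        for k in range(8):
            ck = 1 / math.sqrt(2) if k == 0 else 1.0
            total += ck * F[k] * math.cos((2 * i + 1) * k * math.pi / 16)
        out.append(0.5 * total)
    return out

def idct_eq6(F):
    """Same transform through the matrix of Eq. (6), with the assumed scale."""
    x = [F[0], F[4], F[2], F[6], F[7], F[5], -F[1], F[3]]
    scale = 1 / (2 * math.sqrt(2))  # assumed overall factor
    return [scale * sum(A6[i][j] * x[j] for j in range(8)) for i in range(8)]

F = [50.0, -12.0, 7.5, 3.0, -1.0, 0.5, 2.0, -4.0]  # arbitrary test coefficients
max_err = max(abs(p - q) for p, q in zip(idct_eq5(F), idct_eq6(F)))
print("max |Eq.(5) - Eq.(6)| =", max_err)
```

Under these assumptions the matrix form and the summation form agree to rounding error for any input vector.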

By shifting and swapping the corresponding rows, (6) can be decomposed as

$$\begin{bmatrix} s'(0) \\ s'(6) \\ s'(3) \\ s'(2) \\ s'(7) \\ s'(1) \\ s'(4) \\ s'(5) \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} \mathbf{B} + \mathbf{C} \\ \mathbf{B} - \mathbf{C} \end{bmatrix} + \mathbf{D} \\[6pt] \begin{bmatrix} \mathbf{B} + \mathbf{C} \\ \mathbf{B} - \mathbf{C} \end{bmatrix} - \mathbf{D} \end{bmatrix}. \tag{7}$$

The matrix is made up of three building blocks with symmetric coefficient matrices:

$$\begin{aligned} \mathbf{B} &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} f(0) \\ f(4) \end{bmatrix}, \\ \mathbf{C} &= \begin{bmatrix} b & d \\ d & -b \end{bmatrix} \begin{bmatrix} f(2) \\ f(6) \end{bmatrix}, \\ \mathbf{D} &= \begin{bmatrix} e & f & -a & c \\ & a & c & e \\ & & -e & -f \\ \text{symm.} & & & -a \end{bmatrix} \begin{bmatrix} f(7) \\ f(5) \\ -f(1) \\ f(3) \end{bmatrix}. \end{aligned} \tag{8}$$

Both FDCT and fast IDCT have the same coefficients and matrix blocks. It is thus efficient in hardware implementation.
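The block structure of (7) and (8) is an algebraic identity that holds for any values of the constants, which the following sketch checks with random numbers. It evaluates the $8 \times 8$ product of (6) and the three building blocks $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$, then places the block outputs in the row order of (7). The function names are hypothetical, and the symmetric matrix in $\mathbf{D}$ is written out in full.

```python
import random

def eq6_matrix(a, b, c, d, e, f):
    """Coefficient matrix of Eq. (6)."""
    return [
        [1,  1,  b,  d,  e,  f, -a,  c],
        [1, -1,  d, -b, -f, -a, -c, -e],
        [1, -1, -d,  b,  c,  e, -f, -a],
        [1,  1, -b, -d, -a,  c, -e, -f],
        [1,  1, -b, -d,  a, -c,  e,  f],
        [1, -1, -d,  b, -c, -e,  f,  a],
        [1, -1,  d, -b,  f,  a,  c,  e],
        [1,  1,  b,  d, -e, -f,  a, -c],
    ]

def idct_eq6(F, consts):
    """Full 8 x 8 product of Eq. (6)."""
    x = [F[0], F[4], F[2], F[6], F[7], F[5], -F[1], F[3]]
    return [sum(row[j] * x[j] for j in range(8)) for row in eq6_matrix(*consts)]

def idct_blocks(F, consts):
    """Same result through the building blocks of Eqs. (7) and (8)."""
    a, b, c, d, e, f = consts
    Bv = [F[0] + F[4], F[0] - F[4]]                    # block B: no multiplications
    Cv = [b * F[2] + d * F[6], d * F[2] - b * F[6]]    # block C
    Dm = [[e, f, -a, c], [f, a, c, e], [-a, c, -e, -f], [c, e, -f, -a]]
    y = [F[7], F[5], -F[1], F[3]]
    Dv = [sum(Dm[r][j] * y[j] for j in range(4)) for r in range(4)]  # block D
    even = [Bv[0] + Cv[0], Bv[1] + Cv[1], Bv[0] - Cv[0], Bv[1] - Cv[1]]
    top = [even[r] + Dv[r] for r in range(4)]      # s'(0), s'(6), s'(3), s'(2)
    bottom = [even[r] - Dv[r] for r in range(4)]   # s'(7), s'(1), s'(4), s'(5)
    s = [0.0] * 8
    for index, value in zip([0, 6, 3, 2, 7, 1, 4, 5], top + bottom):
        s[index] = value
    return s

random.seed(0)
consts = [random.uniform(-1, 1) for _ in range(6)]   # arbitrary a..f
F = [random.uniform(-10, 10) for _ in range(8)]      # arbitrary coefficients
max_err = max(abs(p - q)
              for p, q in zip(idct_eq6(F, consts), idct_blocks(F, consts)))
print("max |Eq.(6) - block form| =", max_err)
```

As written, this block form uses 20 multiplications (4 in $\mathbf{C}$ and 16 in $\mathbf{D}$). The count of 12 quoted in the text relies on the further factorization in (3) and (4), which this sketch does not reproduce.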

3.2. 2D FDCT. Implementation of a 2D DCT is carried out by separating it into a pair of 1D DCTs, as illustrated in Figure 3. Consider a 2D spatial data sequence $s(i, j)$, $0 \le i, j \le 7$, in the $8 \times 8$ matrix $\mathbf{S}$. The corresponding 2D DCT sequence $f(u, v)$, $0 \le u, v \le 7$, in the frequency domain matrix $\mathbf{F}$ is defined as

$$f(u, v) = \frac{1}{4} C(u)\, C(v) \sum_{i=0}^{7} \sum_{j=0}^{7} s(i, j) \cos\left(\frac{(2i+1)u\pi}{16}\right) \cos\left(\frac{(2j+1)v\pi}{16}\right). \tag{9}$$

The inverse transformation, represented by $\mathbf{S}' = \{s'(i, j)\}$, is

$$s'(i, j) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, f(u, v) \cos\left(\frac{(2i+1)u\pi}{16}\right) \cos\left(\frac{(2j+1)v\pi}{16}\right), \tag{10}$$

where $C(u), C(v) = 1/\sqrt{2}$ if $u, v = 0$ and $C(u), C(v) = 1$ for others. Define a matrix $\mathbf{M} = \{m(u, v)\}$, where $m(u, v)$ represents the matrix element in the $u$th row and $v$th column:

$$m(u, v) = \begin{cases} \dfrac{1}{2\sqrt{2}}, & u = 0,\ 0 \le v \le 7, \\[8pt] \dfrac{1}{2} \cos\dfrac{(2v+1)u\pi}{16}, & 1 \le u \le 7,\ 0 \le v \le 7. \end{cases} \tag{11}$$

The 2D DCT data matrix $\mathbf{F}$ and its inverse matrix $\mathbf{S}'$ can then be written as

$$\mathbf{F} = \mathbf{M} \mathbf{S} \mathbf{M}^{T}, \tag{12a}$$

$$\mathbf{S}' = \mathbf{M}^{T} \mathbf{F} \mathbf{M}. \tag{12b}$$

Because the base vectors of DCT are orthogonal, the inverse transform IDCT can be easily obtained as shown in (12b). DCT transforms a highly correlated image into a few transform coefficients. Conventional image coding techniques use the quantization process to achieve a higher compression ratio. Therefore, $\mathbf{F}$ includes only a few nonzero elements in the low frequency range, which makes it possible to design an efficient IDCT algorithm by fully utilizing the computation efficiency in (3) and (4) and the energy compactness property of $\mathbf{F}$.

Figure 3: The FDCT algorithm of 2D DCT calculated by one pair of 1D DCTs. An $8 \times 8$ block of the picture in the spatial domain is transformed by DCT along rows, and the row results are then transformed by DCT along columns to give the final result in the frequency domain.
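The matrix forms (12a) and (12b) can be exercised with a short sketch. It builds $\mathbf{M}$ from (11), checks that $\mathbf{M}\mathbf{M}^{T}$ is the identity, compares one coefficient of $\mathbf{M}\mathbf{S}\mathbf{M}^{T}$ with the double sum in (9), and confirms that (12b) recovers the original block. The test block and helper names are arbitrary illustrations, and the plain triple-loop products are not the fast algorithm of the paper.

```python
import math

N = 8

def dct_matrix():
    """M from Eq. (11): m(u, v) = (1/2) C(u) cos((2v + 1) u pi / 16)."""
    rows = []
    for u in range(N):
        cu = 1 / math.sqrt(2) if u == 0 else 1.0
        rows.append([0.5 * cu * math.cos((2 * v + 1) * u * math.pi / 16)
                     for v in range(N)])
    return rows

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(S, M):
    """Eq. (12a): F = M S M^T."""
    return matmul(matmul(M, S), transpose(M))

def idct2(F, M):
    """Eq. (12b): S' = M^T F M."""
    return matmul(matmul(transpose(M), F), M)

def dct2_direct(S, u, v):
    """Single coefficient f(u, v) from the double sum in Eq. (9)."""
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    total = 0.0
    for i in range(N):
        for j in range(N):
            total += (S[i][j]
                      * math.cos((2 * i + 1) * u * math.pi / 16)
                      * math.cos((2 * j + 1) * v * math.pi / 16))
    return 0.25 * cu * cv * total

M = dct_matrix()
S = [[float((3 * i + 5 * j) % 17) for j in range(N)] for i in range(N)]  # arbitrary block
F = dct2(S, M)
S_rec = idct2(F, M)

identity = matmul(M, transpose(M))
orth_err = max(abs(identity[i][j] - (1.0 if i == j else 0.0))
               for i in range(N) for j in range(N))
round_trip_err = max(abs(S[i][j] - S_rec[i][j])
                     for i in range(N) for j in range(N))
coeff_err = abs(F[2][3] - dct2_direct(S, 2, 3))
print(orth_err, round_trip_err, coeff_err)
```

The orthogonality check is the reason the inverse needs no separate coefficient table: the transpose of $\mathbf{M}$ serves as its inverse.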

The 2D spatial datamatrix S1015840 in (12b) can be considered aslinear combination of the base images or the outer product ofthe column and row vectors inM This interpretation makesit easier for (12a) and (12b) tomanipulate the sparseness of the2D DCT matrix F and to calculate the spatial data matrix S1015840The proposed FDCT algorithm achieves high computationperformance over many other previous algorithms The row(or column) data can be processed by using 1D DCT (orIDCT) first with the results stored in transposition memoryBy exploiting the redundancy in the coefficients of DCT thealgorithm reduces the complexity of 2DDCTof an 8times 8 blockto only 24 multiplications

4 Performance Evaluation inDigital Image Watermarking

Discrete cosine transform (DCT) can map an originaldigital data into frequency domain by cosine waveformand conversely inverse discrete cosine transform (IDCT)transfers the frequency domain data into spatial domainThe associated memory size bandwidth and safety issuesin the transformation algorithms are of significant concernDirect computation of 2D DCT (N times N pixel size of ablock) requires 1198734 multiplications while direct realization(with the row-column separation) of 8 times 8 DCT 2 times 83 =1024 multiplications Although many algorithms have been

Table 1 The number of multiplications for 2D DCT by thealgorithms of previous works and by the proposed FDCT algorithm

AlgorithmNumber of multiplications

for2D DCT (8 times 8 block)

Direct computation 84 = 4096Direct computation(with line-column separation) 2 times 83 = 1024

Jridi et al [5] 256Ko et al [7] 31Sung et al [8] 25Manoria and Dixit [9] 28Khan et al [10] 40FDCT algorithm (this work) 24

proposed to optimize multiplications they either requiredhuge amount of computation [6ndash8 10] or suffered fromlow efficiency [4 5 9] Table 1 lists the number of mul-tiplications required in preview works compared with theproposed FDCT algorithm The latter is much more efficientin computation as described in (2) to (12a) and (12b) whereonly 12 times 2 = 24 multiplications are needed By comparisonJridi et al [5] needed 256 multiplications as they simplycalculated 8 times 8 standard DCT Similarly Ko et al [7]required 31 multiplications while Sung et al [8] based onsubband decomposition needed 25 multiplications Manoriaand Dixit [9] used Stassenrsquos matrix multiplication but their2D DCT (8 times 8) needed 28 multiplications Khan et al[10] took 40 multiplications for 2D DCT operation on animage of 8 times 8 block In summary both the FDCT algorithmand its inverse transform are more efficient than many

6 Mathematical Problems in Engineering

ExecutableCOFF file

TMS320C6701EPROMprogrammer

Hex conversionutility

Cross-referencelister

Debugging tools

Linker

Image files

Library

Archiver

Library ofobject files

Archiver

Run-time-supportlibrary

CC++Source files

Assemblersource

COFFObject files

CC++Compiler

Assembler

Linearassembly

Assembly-optimized file

Assemblyoptimizer

Library-buildutility

Figure 4 Software environment in implementing the FDCT algorithm on DSP (TMS320C6701)

previous works in reducing the complexity of computationImplementation in digital signal processor (DSP) for real-time digital watermarking becomes feasible

The process of embedding watermark in digital imagefor copyright protection and marketing applications havebeen proposed over the past decade Conventional techniqueis to embed a secret bit string in spatial frequency orwavelet domain into an image The FDCT algorithm isimplemented in a digital signal processor (TMS320C6701)to validate its efficiency in digital watermarking applicationsThe signal processor in both fixed-point and floating-pointis supported by a set of software development tools withCC++ compiler assembly optimizer a linker and assortedutilities as shown in Figure 4The proposed FDCT algorithmis written in C language and the CC++ compiler is ableto perform optimization (119899 = 0 1 2 3) in different levelof clock cycles and code size The lowest level (119899 = 0)optimization provides the operations of performing looprotation allocating variables to registers and simplifyingexpressions and statements the first level (119899 = 1) on constantpropagation and unused assignments the second level (119899 =2) on software pipelining unused global assignments loopunrolling and incremented pointer and the highest level(119899 = 3) on optimization by simplifying functions with return

Table 2 Processing time of the FDCT algorithm in DSP

Optimizationlevel (119899) Clocks

(cycle) Time (ms) Code size(KB)

None 22141816 133 210 15304558 92 191 11207925 67 222 5715198 35 363 5653914 34 35Matlab NA 610 NA

values never used removing all functions never called inlinecalls to small functions and reorder function declarations sothat the attributes of called functions are known when thecaller is optimized 119899 = 0 and 1 levels can efficiently reducethe code size while 119899 = 2 and 3 enhance the execution speedwith larger code size The computation time and code sizein different levels of optimization are listed in Table 2 Withincreasing code size the clock cycles and processing timedecrease so the best optimization is by 119899 = 3

For a digital watermark (32 times 32 block size) embeddedin an original image (256 times 256) calculation by the FDCT

Mathematical Problems in Engineering 7

algorithm in frequency domain and then inverse transformof the encrypted data back to spatial domain image uses5653914 clocks (119899 = 3) corresponding to 34ms in 35KBcode size Implementation shows that it takes only 024seconds to have the watermark embedded in the originalimage Extraction of watermark by inverse transform IDCT iswithin 021 seconds Real-time implementation of the FDCTalgorithm inDSP for image processing is shownvery efficient

5 Conclusions

(1) A fast discrete cosine transform (FDCT) algorithmthat utilizes the energy compactness and matrixsparseness properties in frequency domain for highercomputation performance is developed For a JPEGimage of 8 times 8 block size in spatial domain thealgorithm first decomposes the 2DDCT into one pairof 1D DCTs and the calculation can be completedin only 24 multiplications The 2D spatial data is alinear combination of base image obtained by theouter product of the column and row vectors of cosinefunctions such that the inverse DCT is as efficientThe algorithm is shown to achieve high performancecompared to many other previous works

(2) The algorithm optimizes a 2D DCT by exploitingthe redundancy of the frequency coefficients so as tofacilitate the implementation in digital signal proces-sor (DSP) For a spatial domain data matrix S the2D DCT data matrix F includes only a few nonzeroelements in the low frequency range which makesit possible to design efficient IDCT algorithm Theenergy compactness property of F and its inversematrix S1015840 can be written as linear combinations of thecosine functions such that both FDCT and its inversetransform are shown to have the same coefficients andmatrix blocks for efficient hardware implementation

(3) An example of digital image watermarking is appliedto demonstrate the efficiency of the FDCT algorithmHardware implementation of watermarking in DSPshows that it takes only 024 seconds to embed a 32times 32 block size digital watermark into a digital imageof block size 256 times 256 Implementation also showsthat extraction of watermark can be completed within021 seconds The FDCT algorithm in DSP is shownefficient and effective in real-time implementation ofdigital image watermarking

Competing Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] N Ahmed T Natarajan and K R Rao ldquoDiscrete cosine trans-formrdquo Institute of Electrical and Electronics Engineers Transac-tions on Computers vol 23 pp 90ndash93 1974

[2] C Sun and E-H Yang ldquoAn efficient DCT-based image com-pression system based on Laplacian transparent compositemodelrdquo IEEE Transactions on Image Processing vol 24 no 3pp 886ndash900 2015

[3] M Jridi A Alfalou and P K Meher ldquoOptimized architectureusing a novel subexpression elimination on Loeffler algorithmfor DCT-based image compressionrdquo VLSI Design vol 2012Article ID 209208 12 pages 2012

[4] P R Kumbhare and U M Gokhale ldquoDesign and implemen-tation of 2D-DCT by using Arai algorithm for image com-pressionrdquo Journal of The International Association of AdvancedTechnology and Science vol 16 no 5 article 5 2015

[5] M Jridi Y Ouerhani and A Alfalou ldquoLow complexity DCTengine for image and video compressionrdquo Real-Time Image andVideo Processing vol 8656 pp 1ndash9 2013

[6] M Marimuthu R Muthaiah and P Swaminathan ldquoSub-bandbased DCT for image compressionrdquo Research Journal of AppliedSciences Engineering and Technology vol 4 no 24 pp 5387ndash5390 2012

[7] L-T Ko J-E Chen H-C Hsin Y-S Shieh and T-Y SungldquoA unified algorithm for subband-based discrete cosine trans-formrdquoMathematical Problems in Engineering vol 2012 ArticleID 912194 31 pages 2012

[8] T-Y Sung Y-S Shieh andH-C Hsin ldquoAn efficient VLSI lineararray for DCTIDCT using subband decomposition algorithmrdquoMathematical Problems in Engineering vol 2010 Article ID185398 21 pages 2010

[9] M Manoria and P Dixit ldquoAn efficient DCT compressiontechnique using Strassenrsquos matrix multiplication algorithmrdquoInternational Journal of Computer Applications vol 60 no 9pp 45ndash50 2012

[10] S Khan E Casseau and D Menard ldquoHigh performance dis-crete cosine transform operator using multimedia orientedsubword parallelismrdquo Advances in Computer Engineering vol2015 Article ID 405856 10 pages 2015

[11] P Agrawal and S K Sharma ldquoReview paper on image com-pression using DCT KLT and DWTrdquo International Journal ofAdvanced Research in Computer Science and Software Engineer-ing vol 4 no 9 pp 928ndash931 2014

Submit your manuscripts athttpswwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: A Fast DCT Algorithm for Watermarking in Digital Signal Processordownloads.hindawi.com/journals/mpe/2017/7401845.pdf · 2019-07-30 · A Fast DCT Algorithm for Watermarking in Digital

Mathematical Problems in Engineering 5

Columns

Rows

Row block of pixels

Result after DCT by rowsFinal result after DCT by columns

Frequency domain

DCT transform

DCT transform

A block 8 times 8 of picture in spatial domain

Figure 3 The FDCT algorithm of 2D DCT calculated by one pair of 1D DCTs

(12b) DCT transforms high correlated image into a few trans-form coefficients Conventional image coding techniques usethe quantization process to achieve higher compression ratioTherefore F includes only a few nonzero elements in the lowfrequency range which makes it possible to design efficientIDCT algorithm by fully utilizing the computation efficiencyin (3) and (4) and the energy compactness property of F

The 2D spatial datamatrix S1015840 in (12b) can be considered aslinear combination of the base images or the outer product ofthe column and row vectors inM This interpretation makesit easier for (12a) and (12b) tomanipulate the sparseness of the2D DCT matrix F and to calculate the spatial data matrix S1015840The proposed FDCT algorithm achieves high computationperformance over many other previous algorithms The row(or column) data can be processed by using 1D DCT (orIDCT) first with the results stored in transposition memoryBy exploiting the redundancy in the coefficients of DCT thealgorithm reduces the complexity of 2DDCTof an 8times 8 blockto only 24 multiplications

4 Performance Evaluation inDigital Image Watermarking

Discrete cosine transform (DCT) can map an originaldigital data into frequency domain by cosine waveformand conversely inverse discrete cosine transform (IDCT)transfers the frequency domain data into spatial domainThe associated memory size bandwidth and safety issuesin the transformation algorithms are of significant concernDirect computation of 2D DCT (N times N pixel size of ablock) requires 1198734 multiplications while direct realization(with the row-column separation) of 8 times 8 DCT 2 times 83 =1024 multiplications Although many algorithms have been

Table 1 The number of multiplications for 2D DCT by thealgorithms of previous works and by the proposed FDCT algorithm

AlgorithmNumber of multiplications

for2D DCT (8 times 8 block)

Direct computation 84 = 4096Direct computation(with line-column separation) 2 times 83 = 1024

Jridi et al [5] 256Ko et al [7] 31Sung et al [8] 25Manoria and Dixit [9] 28Khan et al [10] 40FDCT algorithm (this work) 24

proposed to optimize multiplications they either requiredhuge amount of computation [6ndash8 10] or suffered fromlow efficiency [4 5 9] Table 1 lists the number of mul-tiplications required in preview works compared with theproposed FDCT algorithm The latter is much more efficientin computation as described in (2) to (12a) and (12b) whereonly 12 times 2 = 24 multiplications are needed By comparisonJridi et al [5] needed 256 multiplications as they simplycalculated 8 times 8 standard DCT Similarly Ko et al [7]required 31 multiplications while Sung et al [8] based onsubband decomposition needed 25 multiplications Manoriaand Dixit [9] used Stassenrsquos matrix multiplication but their2D DCT (8 times 8) needed 28 multiplications Khan et al[10] took 40 multiplications for 2D DCT operation on animage of 8 times 8 block In summary both the FDCT algorithmand its inverse transform are more efficient than many

6 Mathematical Problems in Engineering

ExecutableCOFF file

TMS320C6701EPROMprogrammer

Hex conversionutility

Cross-referencelister

Debugging tools

Linker

Image files

Library

Archiver

Library ofobject files

Archiver

Run-time-supportlibrary

CC++Source files

Assemblersource

COFFObject files

CC++Compiler

Assembler

Linearassembly

Assembly-optimized file

Assemblyoptimizer

Library-buildutility

Figure 4 Software environment in implementing the FDCT algorithm on DSP (TMS320C6701)

previous works in reducing the complexity of computationImplementation in digital signal processor (DSP) for real-time digital watermarking becomes feasible

The process of embedding watermark in digital imagefor copyright protection and marketing applications havebeen proposed over the past decade Conventional techniqueis to embed a secret bit string in spatial frequency orwavelet domain into an image The FDCT algorithm isimplemented in a digital signal processor (TMS320C6701)to validate its efficiency in digital watermarking applicationsThe signal processor in both fixed-point and floating-pointis supported by a set of software development tools withCC++ compiler assembly optimizer a linker and assortedutilities as shown in Figure 4The proposed FDCT algorithmis written in C language and the CC++ compiler is ableto perform optimization (119899 = 0 1 2 3) in different levelof clock cycles and code size The lowest level (119899 = 0)optimization provides the operations of performing looprotation allocating variables to registers and simplifyingexpressions and statements the first level (119899 = 1) on constantpropagation and unused assignments the second level (119899 =2) on software pipelining unused global assignments loopunrolling and incremented pointer and the highest level(119899 = 3) on optimization by simplifying functions with return

Table 2 Processing time of the FDCT algorithm in DSP

Optimizationlevel (119899) Clocks

(cycle) Time (ms) Code size(KB)

None 22141816 133 210 15304558 92 191 11207925 67 222 5715198 35 363 5653914 34 35Matlab NA 610 NA

values never used removing all functions never called inlinecalls to small functions and reorder function declarations sothat the attributes of called functions are known when thecaller is optimized 119899 = 0 and 1 levels can efficiently reducethe code size while 119899 = 2 and 3 enhance the execution speedwith larger code size The computation time and code sizein different levels of optimization are listed in Table 2 Withincreasing code size the clock cycles and processing timedecrease so the best optimization is by 119899 = 3

For a digital watermark (32 times 32 block size) embeddedin an original image (256 times 256) calculation by the FDCT

Mathematical Problems in Engineering 7

algorithm in frequency domain and then inverse transformof the encrypted data back to spatial domain image uses5653914 clocks (119899 = 3) corresponding to 34ms in 35KBcode size Implementation shows that it takes only 024seconds to have the watermark embedded in the originalimage Extraction of watermark by inverse transform IDCT iswithin 021 seconds Real-time implementation of the FDCTalgorithm inDSP for image processing is shownvery efficient

5. Conclusions

(1) A fast discrete cosine transform (FDCT) algorithm that utilizes the energy compactness and matrix sparseness properties in the frequency domain for higher computation performance is developed. For a JPEG image of 8 × 8 block size in the spatial domain, the algorithm first decomposes the 2D DCT into one pair of 1D DCTs, and the calculation can be completed in only 24 multiplications. The 2D spatial data is a linear combination of the base images obtained by the outer product of the column and row vectors of cosine functions, such that the inverse DCT is as efficient. The algorithm is shown to achieve high performance compared to many previous works.

(2) The algorithm optimizes a 2D DCT by exploiting the redundancy of the frequency coefficients so as to facilitate the implementation in a digital signal processor (DSP). For a spatial domain data matrix S, the 2D DCT data matrix F includes only a few nonzero elements in the low frequency range, which makes it possible to design an efficient IDCT algorithm. Owing to the energy compactness property, F and its inverse-transformed matrix S′ can be written as linear combinations of the cosine functions, so that the FDCT and its inverse transform have the same coefficients and matrix blocks, which allows efficient hardware implementation.

(3) An example of digital image watermarking is applied to demonstrate the efficiency of the FDCT algorithm. Hardware implementation of watermarking in DSP shows that it takes only 0.24 seconds to embed a digital watermark of 32 × 32 block size into a digital image of size 256 × 256. Implementation also shows that extraction of the watermark can be completed within 0.21 seconds. The FDCT algorithm in DSP is shown to be efficient and effective in real-time implementation of digital image watermarking.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” Institute of Electrical and Electronics Engineers Transactions on Computers, vol. 23, pp. 90–93, 1974.

[2] C. Sun and E.-H. Yang, “An efficient DCT-based image compression system based on Laplacian transparent composite model,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 886–900, 2015.

[3] M. Jridi, A. Alfalou, and P. K. Meher, “Optimized architecture using a novel subexpression elimination on Loeffler algorithm for DCT-based image compression,” VLSI Design, vol. 2012, Article ID 209208, 12 pages, 2012.

[4] P. R. Kumbhare and U. M. Gokhale, “Design and implementation of 2D-DCT by using Arai algorithm for image compression,” Journal of The International Association of Advanced Technology and Science, vol. 16, no. 5, article 5, 2015.

[5] M. Jridi, Y. Ouerhani, and A. Alfalou, “Low complexity DCT engine for image and video compression,” Real-Time Image and Video Processing, vol. 8656, pp. 1–9, 2013.

[6] M. Marimuthu, R. Muthaiah, and P. Swaminathan, “Sub-band based DCT for image compression,” Research Journal of Applied Sciences, Engineering and Technology, vol. 4, no. 24, pp. 5387–5390, 2012.

[7] L.-T. Ko, J.-E. Chen, H.-C. Hsin, Y.-S. Shieh, and T.-Y. Sung, “A unified algorithm for subband-based discrete cosine transform,” Mathematical Problems in Engineering, vol. 2012, Article ID 912194, 31 pages, 2012.

[8] T.-Y. Sung, Y.-S. Shieh, and H.-C. Hsin, “An efficient VLSI linear array for DCT/IDCT using subband decomposition algorithm,” Mathematical Problems in Engineering, vol. 2010, Article ID 185398, 21 pages, 2010.

[9] M. Manoria and P. Dixit, “An efficient DCT compression technique using Strassen’s matrix multiplication algorithm,” International Journal of Computer Applications, vol. 60, no. 9, pp. 45–50, 2012.

[10] S. Khan, E. Casseau, and D. Menard, “High performance discrete cosine transform operator using multimedia oriented subword parallelism,” Advances in Computer Engineering, vol. 2015, Article ID 405856, 10 pages, 2015.

[11] P. Agrawal and S. K. Sharma, “Review paper on image compression using DCT, KLT and DWT,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 4, no. 9, pp. 928–931, 2014.


Figure 4: Software environment in implementing the FDCT algorithm on DSP (TMS320C6701). [Block diagram of the tool flow; labeled blocks: C/C++ source files, C/C++ compiler, assembler source, linear assembly, assembly optimizer, assembly-optimized file, assembler, COFF object files, image files, archiver, library, library of object files, library-build utility, run-time-support library, linker, executable COFF file, hex conversion utility, EPROM programmer, cross-reference lister, debugging tools, TMS320C6701.]

previous works in reducing the complexity of computation. Implementation in a digital signal processor (DSP) for real-time digital watermarking becomes feasible.


