Histogram estimation-scheme-based steganalysis defeating the steganography using pixel-value differencing and modulus function

Jeong-Chun Joo, Kyung-Su Kim, Heung-Kyu Lee
Korea Advanced Institute of Science and Technology, Department of Computer Science, Guseong-dong, Yuseong-gu, Daejeon, Korea 305-701
E-mail: [email protected]

Hae-Yeoun Lee
Kumoh National Institute of Technology, School of Computer and Software Engineering, Yangho-dong, Gumi, Gyeongbuk, Korea 730-701

Optical Engineering 49(7), 077001 (July 2010)
0091-3286/2010/$25.00 © 2010 SPIE

Abstract. We propose a steganalysis that defeats the steganographic method using pixel-value differencing and a modulus function, a recent method with high security and capacity for secret communication. The presented steganalysis is designed to reveal the existence of the message and uses three steganalytic measures whose values increase remarkably in stego images. Hence, stego images are statistically separated from cover images. Detection of the hidden message is possible by modeling the changes generated by the embedding process and comparing the values of the steganalytic measures. To increase the performance of the steganalytic measures, a novel histogram estimation scheme is used to estimate the histogram value of the cover image and the embedding ratio. A support vector machine classifier is adopted to discriminate between cover and stego images. The experimental results verify that the proposed steganalysis can detect stego images with 97.1% accuracy, even when the embedding ratio is just 10% of the maximum hiding capacity. Also, the length of the hidden message can be successfully estimated without the cover image. © 2010 Society of Photo-Optical Instrumentation Engineers. DOI: 10.1117/1.3463021

Subject terms: steganalysis; steganography; pixel-value differencing; modulus function.

Paper 100025R received Jan. 11, 2010; revised manuscript received Apr. 26, 2010; accepted for publication May 20, 2010; published online Jul. 23, 2010.

1 Introduction

For secure communication, many steganographic methods have been researched to embed secret data and hide the existence of messages in various media formats, such as image, sudoku, short message service, and jigsaw puzzle.1–4 On the other hand, many steganalysis algorithms have been studied to detect the existence of the hidden message in stego images.5–11 Once novel steganographic methods are proposed, they can be used illegally to leak confidential information. As a result, steganalysis methods that counter them are necessary to ensure the continuing security of confidential information, and their importance is increasingly emphasized.

In steganography, invisibility for security and capacity for communication are important factors. Pixel-value differencing (PVD) steganography was proposed by Wu and Tsai to avoid well-known steganalysis methods, such as the RS analysis, and to adapt to the human visual system.12 Because human eyes are more sensitive to changes in the flat areas of an image, more message bits were embedded into the pixel pairs having large difference values, such as edge areas. Although Wu and Tsai's method was secure under RS analysis, it produced step effects in the PVD histogram and was defeated.13 The modified PVD steganography was proposed by Zhang and Wang to overcome the step effects.13 It randomly changed the lower and upper bounds of the subrange. However, it made the PVD histogram too smooth and was defeated through one-more-time embedding and a neural network.14 To increase the capacity, Wu et al.15 presented a hybrid method, where least significant bit (LSB) replacement and PVD embedding were implemented in the flat and edge areas, respectively. Sabeti et al.16 defeated this method by using a χ² test on the PVD histogram. Many researchers have studied embedding algorithms based on PVD in order to increase the hiding capacity or improve the image quality.17,18 Therefore, the demand for accurate steganalysis of PVD steganographic methods is increasing.

To provide both high embedding capacity and good image quality in comparison to the other PVD steganographic methods, a PVD steganographic method using a modulus function was presented by Wang et al.19 It adjusts the remainder of two consecutive pixels to match the message value. We refer to Wang et al.'s method as the modulus PVD steganography. Because there are no step effects, Zhang and Wang's steganalysis13 is not effective. Also, Sabeti et al.'s attack,14 which is designed for Zhang and Wang's steganography, has low detection accuracy and cannot estimate the length of the hidden message.

Steganalysis schemes can be classified into two categories: specific and universal. Specific steganalysis is not applicable to other embedding algorithms or new steganographic methods. Most steganalysis schemes fall into
the specific category.5,6,14,16 Because specific steganalysis attacks the targeted embedding algorithm with very high detection accuracy, it remains a challenging and hence interesting topic. Universal steganalysis detects the existence of hidden messages independently of the steganographic method.7–9 In general, classifiers such as the Fisher linear discriminant, support vector machine (SVM), and neural network are used to distinguish between cover and stego images. The classifiers are trained using various features changed in the embedding process. Because the detection accuracy of the steganalysis is related to the trained characteristic database, the selection of statistical features is more important.10,14

In this paper, we present a specific steganalysis to defeat the modulus PVD steganography. By analyzing the embedding process, we provide three blind steganalytic measures (SMs), which are useful for the steganalysis: (1) the fluctuations around the border of the subrange, (2) the asymmetry of the stego PVD histogram, and (3) the abnormal increase of the histogram value. The SVM classifier is applied for the classification of the cover and stego images. To accurately estimate the original histogram values, two novel histogram estimation schemes (HES), which work without the cover image, are explained: a curve-fitting method and a histogram reverse-tracing method. Because the histogram values of the cover image are obtained from the suspicious image, the existence of the hidden message can be detected and the embedding ratio can be estimated. The performance of the presented steganalysis is verified through detailed and exhaustive experimentation.

This paper is organized as follows. Section 2 reviews the modulus PVD steganography. The steganalysis that defeats the modulus PVD steganography is presented in Section 3. Experimental results are shown in Section 4, and Section 5 concludes.

2 Review of Modulus PVD Steganography

To explain the modulus PVD steganography, let the cover image be $F$, which is partitioned into sub-blocks $F_i$. Each $F_i$ is composed of two consecutive pixels, $P_{(i,x)}$ and $P_{(i,y)}$. Message embedding is performed by modifying the remainder of the two consecutive pixels instead of the difference between the two pixels. A range table $R$ is composed of $n$ contiguous subranges $R_k$ ($0 \le k < n$) and provides information about the hiding capacity of each $F_i$. The width $w_k$ of each subrange is a power of 2. Table 1 shows an example of the range table $R$, which is composed of six subranges.

The embedding algorithm of the modulus PVD steganography is as follows:

Table 1 Example of the range table R. l_k and u_k are the lower and upper bound of the subrange, respectively.

Index (k)            0        1         2          3          4           5
[l_k, u_k]           [0, 7]   [8, 15]   [16, 31]   [32, 63]   [64, 127]   [128, 255]
Hiding bits (t_i)    3        3         4          5          6           7

Step 1. Obtain five parameters by using Wu and Tsai's scheme:12 (1) the difference value $d_i$ of two consecutive pixels ($d_i = P_{(i,y)} - P_{(i,x)}$), (2) the subrange $R_k$ such that $R_k = [l_k, u_k]$ and $l_k \le |d_i| \le u_k$, (3) the width $w_k$ of $R_k$ ($w_k = u_k - l_k + 1$), (4) the number of hiding bits $t_i$, and (5) the decimal value $t_i'$ of the $t_i$ message bits for each sub-block $F_i$.

Step 2. Compute the remainder value $F_{\mathrm{rem}(i)}$ by dividing $(P_{(i,x)} + P_{(i,y)})$ by $2^{t_i}$.

Step 3. Embed $t_i$ bits of secret data into $F_i$ by altering $P_{(i,x)}$ and $P_{(i,y)}$ such that $F_{\mathrm{rem}(i)} = t_i'$, using the optimal approach that achieves the minimal distortion. Afterward, if $P'_{(i,x)}$ or $P'_{(i,y)}$ under- or overflows the boundary value (0 or 255), execute step 4. Otherwise, the concealment of the secret data is completed by replacing $(P_{(i,x)}, P_{(i,y)})$ with $(P'_{(i,x)}, P'_{(i,y)})$ in the cover image.

Step 4. Revise $P'_{(i,x)}$ and $P'_{(i,y)}$ by adding $2^{t_i}/2$, subtracting $2^{t_i}/2$, or shifting the overflowing values. After step 4, $(P'_{(i,x)}, P'_{(i,y)})$ is corrected so that neither value is below 0 or over 255. Finally, put $(P'_{(i,x)}, P'_{(i,y)})$ in place of $(P_{(i,x)}, P_{(i,y)})$ in the cover image. The embedding algorithm is now finished.

The extraction algorithm is very simple. The receiver finds the difference of two consecutive pixels ($d_i' = P'_{(i,y)} - P'_{(i,x)}$), and derives the width $w_k'$ of the subrange $R_k$ using $w_k' = u_k' - l_k' + 1$, and the length $t_i$ of the message bits. After computing the remainder of the sub-block $F_i'$ by $F'_{\mathrm{rem}(i)} = (P'_{(i,x)} + P'_{(i,y)}) \bmod 2^{t_i}$, the extraction algorithm is completed by transforming the remainder $F'_{\mathrm{rem}(i)}$ into a binary string of length $t_i$.
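The extraction procedure can be sketched in a few lines of Python. This is a minimal illustration that assumes the range table of Table 1; the names `RANGE_TABLE`, `hiding_bits`, and `extract_pair` are hypothetical and are not taken from the original method description.

```python
# Range table from Table 1: (lower bound, upper bound, hiding bits).
RANGE_TABLE = [(0, 7, 3), (8, 15, 3), (16, 31, 4),
               (32, 63, 5), (64, 127, 6), (128, 255, 7)]


def hiding_bits(px, py):
    """Return t_i for a pixel pair by locating |d_i| in the range table."""
    d = abs(py - px)
    for lower, upper, bits in RANGE_TABLE:
        if lower <= d <= upper:
            return bits
    raise ValueError("pixel difference outside 0..255")


def extract_pair(px, py):
    """Recover the hidden bits of one stego pair as a binary string."""
    t = hiding_bits(px, py)
    remainder = (px + py) % (1 << t)      # F'_rem = (P'_x + P'_y) mod 2^t
    return format(remainder, "0{}b".format(t))


print(extract_pair(6, 5))   # 011
```

For the stego pair (6,5), the difference falls into the subrange [0, 7], so three bits are read from the remainder 11 mod 8 = 3.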

The example for hiding the secret data is shown in Table 2. Because the method uses the remainder for data embedding and changes the two pixel values in turn, it modifies the pixel values minimally and achieves good image quality.
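The entries of Table 2 can be reproduced with a short Python sketch of steps 2 and 3. The rule used here for choosing the direction of the change and for splitting it between the two pixels is an assumption inferred from the entries of Table 2, not a rule stated in the text. Step 4 (boundary correction) and the subrange-preserving adjustment are omitted, and `embed_pair` is a hypothetical name.

```python
def embed_pair(px, py, secret, t):
    """Embed the decimal value `secret` (t bits) into the pair (px, py)."""
    modulus = 1 << t
    remainder = (px + py) % modulus           # step 2: F_rem
    m = abs(remainder - secret)
    if m <= modulus // 2:
        # The direct move is the shorter one (ties go this way).
        step = -m if remainder > secret else m
    else:
        # Wrapping around the modulus is shorter.
        step = (modulus - m) if remainder > secret else -(modulus - m)
    large = (abs(step) + 1) // 2              # bigger share of the change
    small = abs(step) // 2                    # smaller share of the change
    if step >= 0:
        # Raising the sum: the smaller pixel takes the bigger share.
        return (px + small, py + large) if px >= py else (px + large, py + small)
    # Lowering the sum: the larger pixel takes the bigger share.
    return (px - large, py - small) if px >= py else (px - small, py - large)


print([embed_pair(8, 7, s, 3) for s in range(8)])
# [(8, 8), (9, 8), (9, 9), (6, 5), (6, 6), (7, 6), (7, 7), (8, 7)]
```

Every output pair has a pixel sum congruent to the secret value modulo 8, and the difference of the pair changes by at most one.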

3 Steganalysis Defeating Modulus PVD Steganography

To defeat the modulus PVD steganography, its weaknesses are analyzed in Section 3.1. In order to estimate the original histogram values, Section 3.2 explains two histogram estimation schemes that work without cover images. Three blind SMs revealing the existence of the hidden message are presented in Section 3.3. Section 3.4 shows how the embedding ratio is estimated.

Figure 1 shows the overall process of our steganalysis algorithm. The proposed method extracts three feature values to detect the existence of the hidden message embedded by the modulus PVD steganography.19 For our steganalysis technique, the histogram value of the cover image is estimated from the suspicious image. The estimated histogram value $h_e(0)$ is used for the detection of the hidden message and the estimation of the embedding ratio. Before classification, the SVM classifier is trained on the three SMs over a large number of tests. Finally, whether a hidden message exists is determined and the embedding ratio is estimated.

For the modeling and explanation, let $h$ and $h'$ be the PVD histograms of the cover and the stego images, respectively. In addition, the estimated histogram is denoted as $h_e$. The embedding ratio $\alpha$ is the proportion of the actually embedded message bits to the maximum embeddable bits. When the message is embedded into all pixel pairs, the embedding ratio $\alpha$ is equal to 1 (100%). $(a, b)$ means that $a$ is a histogram bin and $b$ is its number of occurrences.

3.1 Weaknesses of Modulus PVD Steganography

Even though the modulus PVD steganography provides a high embedding capacity, good image quality, and security against RS analysis, weaknesses revealing the existence of the hidden message still exist. Figure 2 depicts PVD histograms with extreme artifacts that were generated by the modulus PVD steganography. Because the fluctuations appear around the border of the subrange, the width of the range table can be estimated.

Table 2 Changes of the pixel values for embedding the secret data by the modulus PVD steganography.

                      Secret data (decimal value)
(p(i,x), p(i,y))    0       1        2        3        4         5         6         7
(8,7)               (8,8)   (9,8)    (9,9)    (6,5)    (6,6)     (7,6)     (7,7)     (8,7)
(8,8)               (8,8)   (8,9)    (9,9)    (9,10)   (10,10)   (6,7)     (7,7)     (7,8)
(8,9)               (8,8)   (8,9)    (9,9)    (9,10)   (10,10)   (10,11)   (7,7)     (7,8)
(8,10)              (7,9)   (8,9)    (8,10)   (9,10)   (9,11)    (10,11)   (10,12)   (7,8)
(8,11)              (7,9)   (7,10)   (8,10)   (8,11)   (9,11)    (9,12)    (10,12)   (10,13)

Modulus PVD steganography changes the two pixel values in turn to embed the secret data, as shown in Table 2. Therefore, the difference of the two stego pixels is either unchanged or changed by at most one after the embedding process is finished. For example, if the cover pixel pair is (8,7) and its difference is −1, the difference of the stego pixel pair is −1 or 0. When the cover pixel pair is (8,8), the difference is changed to 0 or 1. In general, the absolute value of the difference of two pixels is unchanged or decreased by 1 after the embedding process. However, because there is no absolute value smaller than 0, the change of $h(0)$ in the PVD histogram of the stego image violates the general modification rule. The abnormal change of $h(0)$ also affects the modification of $h(1)$. Consequently, the PVD histogram $h'$ of the stego image embedded by the modulus PVD steganography can be modeled as follows:

$$
h'(d) =
\begin{cases}
\left(1 - \dfrac{\alpha}{2}\right) h(d) + \dfrac{\alpha}{2}\, h(d-1), & \text{if } d < 0; \\[2mm]
\dfrac{\alpha}{2}\, h(d-1) + \left(1 - \dfrac{\alpha}{2}\right) h(d) + \dfrac{\alpha}{2}\, h(d+1), & \text{if } d = 0 \text{ or } 1; \\[2mm]
\left(1 - \dfrac{\alpha}{2}\right) h(d) + \dfrac{\alpha}{2}\, h(d+1), & \text{if } d > 1,
\end{cases}
\tag{1}
$$

where $\alpha$ is the embedding ratio and $d$ ($-255 \le d \le 255$, $d \in \mathbb{Z}$) is the difference of two pixels. $h'(0)$ and $h'(1)$ are obtained from three histogram bins of the cover, as shown in Eq. (1), while the other histogram values are obtained from two histogram bins. This abnormally increases these histogram values and causes the weaknesses of the modulus PVD steganography, such as the asymmetry of the PVD histogram and the abnormal increase of the histogram value.
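As an illustration of Eq. (1), the following Python sketch applies the model to a synthetic cover histogram. It assumes that the histogram is stored as a dictionary from difference to count, and it ignores the subrange-border exceptions discussed next. The function name `stego_histogram` and the toy cover histogram are hypothetical.

```python
def stego_histogram(h, alpha):
    """Apply the model of Eq. (1) to a cover PVD histogram.

    h     -- dict mapping difference d to its count in the cover image
    alpha -- embedding ratio in [0, 1]
    """
    def get(d):
        return h.get(d, 0.0)              # bins outside the histogram are empty

    keep, move = 1.0 - alpha / 2.0, alpha / 2.0
    stego = {}
    for d in range(min(h), max(h) + 1):
        if d < 0:
            stego[d] = keep * get(d) + move * get(d - 1)
        elif d <= 1:                      # d = 0 or 1: three contributing bins
            stego[d] = move * get(d - 1) + keep * get(d) + move * get(d + 1)
        else:
            stego[d] = keep * get(d) + move * get(d + 1)
    return stego


# Symmetric toy cover histogram peaking at d = 0.
cover = {d: 1000 - 100 * abs(d) for d in range(-5, 6)}
stego = stego_histogram(cover, alpha=1.0)
print(stego[-1], stego[0], stego[1])      # 850.0 1400.0 1350.0
```

Although the toy cover histogram is symmetric, the modeled stego histogram has a much larger count at d = 1 than at d = −1, which is the asymmetry described above. The total count is preserved.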

Fig. 1 Block diagram of the proposed steganalysis.

Fig. 2 PVD histogram of "Baboon" and "Boat" images after 100% embedding. They show the weaknesses that were generated by the embedding process.

Another weakness of the modulus PVD steganography is caused by the exceptional cases that occur around the border of the subrange in the PVD histogram. In the embedding process of the modulus PVD steganography, the difference of two pixels determines the length of the hidden message bits. Therefore, keeping the difference in the same subrange is crucial for correct communication. If the difference falls into another subrange after embedding, the pixel values must be adjusted. $h(l_k)$ is changed to $h'(l_k)$ or $h'(l_k+1)$ instead of $h'(l_k-1)$ to keep the difference in the same subrange. If $h'(u_{k-1})$ is in a different subrange from $h(l_k)$, then it cannot be obtained from $h(l_k)$. This means that $h'(u_{k-1})$ decreases and $h'(l_k+1)$ increases after the message is embedded. $h'$ is abnormally changed to make the new difference belong to the same subrange as the cover. To analyze the abnormal changes statistically, $h'$ around the lower and upper bound of the subrange can be calculated by

$$
\begin{cases}
h'(u_k) = \left(1 - \dfrac{\alpha}{2}\right) h(u_k); \\[2mm]
h'(l_k + 1) = \dfrac{\alpha}{2}\, h(l_k) + \left(1 - \dfrac{\alpha}{2}\right) h(l_k + 1) + \dfrac{\alpha}{2}\, h(l_k + 2),
\end{cases}
\tag{2}
$$

where $l_k$ and $u_k$ are the lower and upper bound of the subrange $R_k$. For natural images, $h(u_{k-1})$ is greater than $h(l_k+1)$ because the PVD histogram follows the Gaussian distribution.20 However, as a result of Eq. (2), $h'(u_{k-1})$ is less than $h'(l_k+1)$. This is how the embedding process makes fluctuations around the border of the subrange.
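A small numerical sketch shows how Eq. (2) reverses the order of the two bins around a border. It assumes the border between the subranges [0, 7] and [8, 15] and a made-up, smoothly decreasing cover histogram; `border_bins` is a hypothetical name.

```python
def border_bins(h, upper_prev, lower, alpha):
    """Evaluate Eq. (2) at the upper bound u_{k-1} and at l_k + 1."""
    keep, move = 1.0 - alpha / 2.0, alpha / 2.0
    stego_upper = keep * h[upper_prev]
    stego_lower_plus_1 = (move * h[lower] + keep * h[lower + 1]
                          + move * h[lower + 2])
    return stego_upper, stego_lower_plus_1


# Made-up cover counts that decrease smoothly with the difference.
cover = {7: 1000, 8: 900, 9: 800, 10: 700}
print(border_bins(cover, upper_prev=7, lower=8, alpha=1.0))   # (500.0, 1200.0)
```

In the cover, the bin at 7 is larger than the bin at 9. After full embedding, the modeled bin at 7 (500) is smaller than the bin at 9 (1200).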

3.2 HES

As shown in Fig. 1, $h_e(0)$ must be estimated before extracting the third steganalytic measure (SM3) and estimating the embedding ratio. There are two methods for accurately estimating $h_e(0)$. One is the curve-fitting method, which uses well-known distribution models for the normal PVD histogram. The other is the histogram reverse-tracing method, which uses the changes of the PVD histogram.

3.2.1 Curve-fitting method

The first method, curve fitting, is related to the PVD histogram curve around 0. In general, the PVD histogram looks like it follows the Gaussian distribution;20 however, a closer look reveals that its curve can be fitted to one of three well-known distribution models: the Gaussian, Laplacian, or linear model. $h_e(0)$ is estimated from the fitted distribution model in Eq. (3). Because most pixel differences are concentrated around 0, the shape of the four bins adjacent to 0 is effective for estimating $h(0)$ from the suspicious image without the cover. Therefore, the curve-fitting estimation scheme finds the most suitable model by using the values of these four bins instead of all bins. After message embedding, the shape of the PVD histogram in the positive area (+1 to +4) is severely changed because the $h'(1)$ value increases. However, the PVD histogram in the negative area (−4 to −1) keeps a shape relatively similar to that of the cover. As a result, the histogram values from −4 to −1 are considered to find the most suitable curve model, as shown in Section 4.1.

1. Gaussian Model Estimation. If the shape of the four histogram values is convex, $h_e(0)$ is estimated by the Gaussian model. The variables $a_1$ and $\sigma_1$ in the Gaussian distribution form are calculated by using two bins. One is the histogram value at −1, $(-1, h'(-1))$, and the other is the average of the histogram at the center of mass in the range from −7 to −1, $\left(\operatorname{int}\!\left(\dfrac{\sum_{i=-7}^{-1} i \cdot h(i)}{\sum_{i=-7}^{-1} h(i)}\right),\ \dfrac{1}{7}\sum_{i=-7}^{-1} h(i)\right)$.

2. Laplacian Model Estimation. If the shape of the four histogram values is concave, $h_e(0)$ is estimated by the Laplacian model. The variables $a_2$ and $\lambda_2$ in the Laplacian distribution form are calculated by using two bins, $(-1, h'(-1))$ and $(-2, h'(-2))$.

3. Linear Model Estimation. If the differences between the histogram values are almost the same ($h'(d) - h'(d-1) \approx h'(d-1) - h'(d-2)$), the histogram values increase by the same step. In this case, $h_e(0)$ is estimated by the linear equation.

The models above can be summarized as follows:

$$
h_e(0) =
\begin{cases}
a_1 \cdot \dfrac{1}{\sqrt{2\pi}\,\sigma_1} \cdot \exp\!\left(-\dfrac{0^2}{2\sigma_1^2}\right), & \text{if } h' \in \text{Gaussian model}; \\[2mm]
a_2 \cdot \exp\left(-\lambda_2 \cdot |0|\right), & \text{if } h' \in \text{Laplacian model}; \\[2mm]
h'(-1) + \left[h'(-1) - h'(-2)\right], & \text{if } h' \in \text{linear model}.
\end{cases}
\tag{3}
$$
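The three estimators of Eq. (3) can be sketched in Python as follows. The sketch assumes that the model has already been selected and that each curve is made to pass exactly through its two bins. It returns the peak value directly rather than the individual parameters, and the function names are hypothetical.

```python
import math


def gaussian_peak(bin_a, bin_b):
    """Value at d = 0 of a zero-centered Gaussian curve through two (d, count) bins."""
    (d1, v1), (d2, v2) = bin_a, bin_b
    # From v = A * exp(-d^2 / (2 * sigma^2)) evaluated at both bins.
    sigma_sq = (d2 * d2 - d1 * d1) / (2.0 * math.log(v1 / v2))
    return v1 * math.exp(d1 * d1 / (2.0 * sigma_sq))


def laplacian_peak(bin_a, bin_b):
    """Value at d = 0 of a2 * exp(-lambda2 * |d|) through two (d, count) bins."""
    (d1, v1), (d2, v2) = bin_a, bin_b
    lam = math.log(v1 / v2) / (abs(d2) - abs(d1))
    return v1 * math.exp(lam * abs(d1))


def linear_peak(h_minus_1, h_minus_2):
    """Linear extrapolation h'(-1) + [h'(-1) - h'(-2)]."""
    return h_minus_1 + (h_minus_1 - h_minus_2)


# Bins reported in Section 4.1 for the "Woman" and "Peppers" images.
print(round(laplacian_peak((-1, 14375), (-2, 9544))))
print(round(gaussian_peak((-1, 8408), (-3, 5079))))
```

For the Laplacian model with bins at −1 and −2, the estimate reduces to $h'(-1)^2 / h'(-2)$, so no explicit logarithm is needed in that special case.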

3.2.2 Histogram reverse-tracing method

The histogram reverse-tracing method uses the changes of the PVD histogram. The histogram values of the cover image can be estimated through the embedding ratio $\alpha$ or one of the original histogram values. Because most histogram values are changed after message embedding, the original histogram values cannot be estimated without the cover. However, through extensive experiments, we have found that $h(-7)$ is relatively unchanged compared to the others. Because $h(-7)$ is between $h(-6)$ and $h(-8)$, which are decreased slightly after embedding, $h_e(-7)$ can be roughly calculated by Eq. (4). The estimated results were acceptable, and hence the histogram reverse-tracing method uses $h_e(-7)$. We verified through massive experiments that the estimated histogram value $h_e(-7)$ was very close to the original histogram value $h(-7)$, as shown in Fig. 5. $h_e(-7)$ is obtained by

$$
h_e(-7) = \frac{h'(-6) + h'(-8)}{2}. \tag{4}
$$

Using the estimated $h_e(-7)$, the initial embedding ratio $\alpha_1$ is calculated as follows:

$$
\alpha_1 = 2\left(1 - \frac{h'(-7)}{h_e(-7)}\right). \tag{5}
$$

The histogram values from −6 to −1 are estimated by solving Eq. (1). For example, $h_e(-6)$ is estimated by $[h'(-6) - (\alpha_1/2)\,h_e(-7)] / [1 - (\alpha_1/2)]$, and then $h_e(-5)$ is computed from $h_e(-6)$ and $\alpha_1$. In this way, $h_e(-1)$ is calculated last. Because $h(1)$ is similar to $h(-1)$ and $h(-1)$ is similar to $h'(-1)$, the variables ($\alpha$, $h(1)$, $h(-1)$, and $h(0)$) of Eq. (1) are replaced by $\alpha_1$, $h'(-1)$, $h'(-1)$, and $h_e(0)$, respectively. Finally, $h_e(0)$ is computed by

$$
h_e(0) = \frac{h'(0) - \alpha_1 \cdot h'(-1)}{1 - (\alpha_1/2)}. \tag{6}
$$

The estimated $h_e(0)$ is then used to extract SM3 and estimate the embedding ratio. The closer $h_e(0)$ is to the original histogram value, the more accurate the classification and estimation results become.
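The reverse-tracing procedure of Eqs. (4) to (6) can be sketched in Python. The sketch assumes that the stego PVD histogram is a dictionary containing at least the bins −8 to 0 and that the initial ratio stays below 2, so that the divisor is nonzero. The name `reverse_trace` is hypothetical.

```python
def reverse_trace(stego):
    """Estimate alpha_1 and cover bins h_e(-7)..h_e(0) from a stego histogram.

    stego -- dict mapping difference d to h'(d), with bins -8..0 present
    """
    est = {-7: (stego[-6] + stego[-8]) / 2.0}             # Eq. (4)
    alpha1 = 2.0 * (1.0 - stego[-7] / est[-7])            # Eq. (5)
    keep, move = 1.0 - alpha1 / 2.0, alpha1 / 2.0
    for d in range(-6, 0):
        # Eq. (1) for d < 0, solved for the cover value h(d).
        est[d] = (stego[d] - move * est[d - 1]) / keep
    est[0] = (stego[0] - alpha1 * stego[-1]) / keep       # Eq. (6)
    return alpha1, est
```

The returned dictionary holds the traced values for the bins −7 to 0. Only the entry at 0 is needed for SM3 and for the embedding-ratio estimate.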

3.3 SMs

Three blind SMs are presented to defeat the modulus PVD steganography: (1) SM1: the fluctuations around the border of the subrange, (2) SM2: the asymmetry of the stego PVD histogram, and (3) SM3: the abnormal increase of the histogram value.

To detect the modulus PVD steganography, the first SM (SM1) checks the fluctuations by comparing the histogram values around the border. SM1 is designed using

$$
\mathrm{SM1} = \frac{h'(l_k + 1) - h'(u_{k-1})}{h'(l_k + 1)} \times 100. \tag{7}
$$

Because the width of each subrange is taken to be a power of 2, |8| is the common border in the various range tables. Moreover, because the occurrences of the pixel difference $h(d)$ decrease with increasing $|d|$ in a macroscopically smooth fashion,13 the two bins $(-7, h'(-7))$ and $(-9, h'(-9))$ are used to ensure the maximum difference possible. To normalize the values of SM1 and maximize the effects of the message embedding, $h'(-7)$ is used as the divisor. Although $h(-7)$ is greater than $h(-9)$ in the cover image, $h'(-7)$ is less than $h'(-9)$ in the stego image. Therefore, the existence of the hidden message is revealed when SM1 of the suspicious image is a positive value.

The second SM (SM2) checks the asymmetry of $h'$ and is designed as follows:

$$
\mathrm{SM2} = \frac{h'(|d|) - h'(-|d|)}{h'(-|d|)} \times 100, \tag{8}
$$

where $d$ is the difference of the pixel pair. As shown in Eq. (1), $h'(1)$ increases when embedding the messages. $h'(1)$ is greater than $h(1)$, while $h'(-1)$ is similar to $h(-1)$ ($h'(1) > h(1)$ and $h'(-1) \approx h(-1)$). In other words, $h'(1)$ is not similar to $h'(-1)$, while $h(1)$ is similar to $h(-1)$. Therefore, $d$ of SM2 is set to 1 to get the maximum asymmetric value. Because of the exceptional changes of $h'(1)$, SM2 of the stego image is greater than 0, while SM2 of the cover image is close to 0.

Our third SM (SM3) is designed to examine the abnormal increase of the histogram value. SM3 is obtained as follows:

$$
\mathrm{SM3} = \frac{h'(d) - h_e(d)}{h_e(d)} \times 100. \tag{9}
$$

To detect the abnormal increase of the PVD histogram, an accurate estimation of the original histogram value is absolutely crucial. Because $h_e(0)$ closely estimates the original histogram value of the cover image, as described in Section 3.2, $d$ is set to zero. As a result, the SM3 value of the stego image is much higher than 0, while that of the cover image is close to 0.
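The three measures can be computed together, as in the Python sketch below. It assumes the border |8| (bins −7 and −9), d = 1 for SM2, and d = 0 for SM3. Eq. (7) as printed normalizes by the bin at $l_k + 1$, whereas the accompanying text names $h'(-7)$ as the divisor, so the divisor bin is exposed as a parameter. The names `steganalytic_measures` and `sm1_divisor_bin` are hypothetical.

```python
def steganalytic_measures(stego, est_h0, sm1_divisor_bin=-9):
    """Return (SM1, SM2, SM3) for a suspicious image.

    stego           -- dict mapping difference d to h'(d)
    est_h0          -- estimated cover value h_e(0)
    sm1_divisor_bin -- bin used to normalize SM1 (-9 follows Eq. (7) as
                       printed; -7 follows the wording of the text)
    """
    sm1 = (stego[-9] - stego[-7]) / stego[sm1_divisor_bin] * 100.0   # Eq. (7)
    sm2 = (stego[1] - stego[-1]) / stego[-1] * 100.0                 # Eq. (8), d = 1
    sm3 = (stego[0] - est_h0) / est_h0 * 100.0                       # Eq. (9), d = 0
    return sm1, sm2, sm3
```

The sign of SM1 does not depend on which of the two bins is used as the divisor, so the choice affects only the scale of the feature passed to the classifier.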

3.4 Estimation of the Embedding Ratio

Equation (1) shows how the PVD histogram values are changed after embedding and helps to estimate the embedding ratio. $h(1)$ and $h(2)$ of the cover image are similar to $h(-1)$ and $h(-2)$, respectively, because the PVD histogram follows the Gaussian distribution.20 Therefore, $h(-1)$ and $h(-2)$ are replaced by $h(1)$ and $h(2)$, respectively. Through $[h'(1) - h'(-1)]$, the embedding ratio $\alpha$ is estimated by using

$$
\alpha = \frac{2\,[h'(1) - h'(-1)]}{h(0)}. \tag{10}
$$

$h'(1)$ and $h'(-1)$ are obtained from the PVD histogram of the suspicious image. Then, $h(0)$ is replaced by the $h_e(0)$ estimated in Section 3.2. The accuracy of this estimation is verified in the experiments.
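Equation (10) can be traced back to Eq. (1) in three lines. The derivation below is a sketch that assumes the symmetry $h(1) \approx h(-1)$ and $h(2) \approx h(-2)$ stated above and ignores the subrange-border effects.

```latex
\begin{align*}
h'(1)  &= \tfrac{\alpha}{2}\,h(0) + \left(1-\tfrac{\alpha}{2}\right)h(1) + \tfrac{\alpha}{2}\,h(2), \\
h'(-1) &= \left(1-\tfrac{\alpha}{2}\right)h(-1) + \tfrac{\alpha}{2}\,h(-2), \\
h'(1) - h'(-1) &\approx \tfrac{\alpha}{2}\,h(0)
\quad\Longrightarrow\quad
\alpha \approx \frac{2\,[h'(1) - h'(-1)]}{h(0)}.
\end{align*}
```

The terms in $h(\pm 1)$ and $h(\pm 2)$ cancel under the symmetry assumption, leaving only the contribution that the bin at 0 makes to the bin at 1.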

4 Experimental Results

To evaluate the performance of our SMs, 12 gray-scale images of 512×512 were tested, as depicted in Fig. 3. For the extensive experiments, 2048 color images from the USDA NRCS Photo Gallery21 and KODAK22 were converted to gray-scale images in the BMP image file format. After the modulus PVD steganography was applied to the images, the proposed steganalysis was evaluated to see whether it could defeat the modulus PVD steganography.

4.1 Accuracy of HESAs explained in Section 3.2, he�0� is estimated by thecurve-fitting method and the histogram reverse-tracingmethod. First, the curve-fitting method estimates he�0�through the fitted distribution models: Gaussian, Laplacian,or linear models. For example, because the PVD histogramof the “Peppers” image is convex, as shown in Figs. 4�a�and 4�b�, the Gaussian model was adopted to estimate h�0�.To get the a1 and �1 of Eq. �3�, two bins ��−1,8408� and�−3,5079�� were obtained. The PVD histogram of the“Woman” image is concave, as shown in Figs. 4�c� and4�d�; thus the Laplacian model was adopted. To get the a2and �2 of Eq. �3�, two bins ��−1,14375� and �−2,9544��were used. The PVD histogram of the “Lena” image fol-lowed the Linear model, as shown in Figs. 4�e� and 4�f�.

Second, the histogram reverse-tracing method reflectsthe histogram values of the cover. As mentioned before,he�−7� has to be estimated accurately. Figure 5 shows thedifference ratio between h�−7� and he�−7�. Themeanstandard deviations of the error rates are 4.92.1�100%� and 1.02.0 �50%�. The estimation method in Eq.�4� can successfully estimate he�−7� within a 10% differ-ence when compared to the cover.

Some natural images include large flat areas, such as the sky or ocean. In these cases, the PVD histogram of the cover image has an abnormally high peak at h(0). Figure 6(a) shows a test image with a black background where the pixel values are almost the same. The wide flat area makes h(0) much higher than the adjacent histogram values, as shown in Fig. 6(b). Because the adjacent histogram values from h(−4) to h(−1) are much smaller than h(0), the curve-fitting method fails to estimate the original h(0). However, the histogram reverse-tracing method can successfully estimate h(0) because it is based on the changes of the histogram values.

Fig. 3 Twelve test images in the BMP format to verify the performance of the proposed SMs.


For 2048 images, the estimation error of h(0) is depicted in Fig. 7. Although the first curve-fitting method produced some noticeable errors in exceptional cases, the second histogram reverse-tracing method was noticeably more accurate. Whereas the mean ± standard deviations of the curve-fitting method are 11.3 ± 23.4, those of the histogram reverse-tracing method are −5.4 ± 4.0. Hence, we adopted the histogram reverse-tracing method for the estimation of the original value in order to reduce the error rate for all images, including the exceptional cases.
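The error rate in the caption of Fig. 7, (h(0) − h_e(0)) / min(h(0), h_e(0)), and its mean and standard deviation over a set of images can be sketched as below. The function names are hypothetical. Treating the rate as a signed percentage is an assumption that matches the negative mean reported for the reverse-tracing method, and the use of the population standard deviation is likewise assumed.

```python
from statistics import mean, pstdev


def error_rate(h_true, h_est):
    """Signed error (h - h_e) / min(h, h_e), expressed as a percentage."""
    return 100.0 * (h_true - h_est) / min(h_true, h_est)


def summarize(pairs):
    """Return (mean, standard deviation) of the error rates of (h, h_e) pairs."""
    rates = [error_rate(h, h_e) for h, h_e in pairs]
    return mean(rates), pstdev(rates)
```

Dividing by the smaller of the two values makes an overestimate and an underestimate of the same absolute size produce rates of equal magnitude and opposite sign.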

4.2 Efficiency of SMs
SMs are designed to find the artifacts made by a hidden message. The embedding process changes the PVD histogram and increases the values of the proposed measures. Figure 8 depicts three SMs of the cover and stego images for 12 test images in Fig. 3. As expected in Section 3.3, the values of the cover images were under or around 0, but those of the stego images were much higher than 0. The SM1 values of the stego images were above 0 due to the fluctuations of the PVD histogram. Because there were small fluctuations in the PVD histogram of the cover images, this measure did not show good separation results at a low embedding ratio of ≤25%. However, when the embedding ratio was ≥50%, the SM1 values of the cover and stego images were clearly separated. SM2 values of the stego images were remarkably higher than 0, while those of the cover images were very close to 0. For SM3, h_e(0) was estimated through the histogram reverse-tracing method to consider the exceptional cases. Because h_e(0) was similar to h(0) of the cover image, the SM3 values of the cover images were nearly 0. However, the embedding process of the modulus PVD steganography increased h'(0) and made SM3 values much greater than 10%. Consequently, the three SMs were increased by the embedding process and were very effective for detecting the hidden message.

Fig. 4 Estimation of distribution models by the curve-fitting method: (a, b) Gaussian, (c, d) Laplacian, and (e, f) Linear.

4.3 Classification Performance of the Proposed Steganalysis
To determine whether a suspicious image contains a hidden message or not, a classifier is employed. The two-class SVM codes of Matlab from LIBSVM23 can evaluate the presented SMs: SM1, SM2, and SM3. For our tests, all images were embedded at 10, 25, 50, 75, and 100% of the maximum embedding capacity. To test the various natural images, h_e(0) was estimated by using the histogram reverse-tracing method, which was more accurate and successfully minimized the errors.
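How the labelled sample set for one classification trial might be assembled is sketched below, assuming 2048 images split into 1000 training and 1048 testing images, each with one cover version and one stego version per embedding ratio. The function name, the `seed` parameter, and the choice to split by image identity (so that all versions of an image fall in the same subset) are assumptions made for illustration; the feature extraction and the SVM itself are not shown.

```python
import random

EMBEDDING_RATIOS = (10, 25, 50, 75, 100)  # percent of maximum capacity


def split_dataset(num_images=2048, num_train=1000, seed=0):
    """Return (train, test) lists of (image_id, ratio, label) samples.

    ratio 0 marks a cover image (label 0); stego samples get label 1.
    Splitting by image id keeps every version of an image in one subset.
    """
    ids = list(range(num_images))
    random.Random(seed).shuffle(ids)

    def expand(image_ids):
        samples = [(i, 0, 0) for i in image_ids]
        samples += [(i, r, 1) for i in image_ids for r in EMBEDDING_RATIOS]
        return samples

    return expand(ids[:num_train]), expand(ids[num_train:])
```

Calling `split_dataset(seed=k)` with a different `k` for each trial would give one random, nonoverlapping split per experiment.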

All the images were divided into nonoverlapping training and testing subsets. The SVM classifier was trained by the values of the three steganalytic measures of 1000 cover and 5000 stego images (1000 stego images per embedding ratio). The testing subset consisted of 1048 cover and 5240 stego images. In Table 3, the detection accuracy values of the 100 experiments are summarized. As shown in Table 3, the proposed steganalysis can successfully detect the existence of the hidden message even though the embedding ratio is very low (10%).

Table 3 Classification performance on the 1048 cover and 5240 stego images (100 trials). The proposed steganalysis can detect the existence of the hidden message with high accuracy.

              Cover     Stego 100%  Stego 75%  Stego 50%  Stego 25%  Stego 10%
Error (EA)    32/1048   0/1048      0/1048     0/1048     0/1048     31/1048
Correct (%)   97.0      100.0       100.0      100.0      100.0      97.1

Fig. 5 Difference ratio between h_e(−7) and h(−7), computed as (h(−7) − h_e(−7)) / min(h(−7), h_e(−7)). h(−7) of the cover image can be estimated within a 10% difference between the original image and the stego image.

Fig. 6 Exceptional case of the PVD histogram having the wide flat area, such as dark background. No distribution model can fit it.

Fig. 7 Comparison of the error rate for the estimated h_e(0) after using the curve-fitting and histogram reverse-tracing methods. The error rate was computed by (h(0) − h_e(0)) / min(h(0), h_e(0)).

Fig. 8 Values of the steganalytic measures for 12 test images. Because the values of the cover images are separated from those of the stego images, the proposed measures are effective for detecting the existence of the hidden message.

4.4 Accuracy of Estimating the Embedding Ratio
Figure 9 depicts the estimation results of each embedding ratio. The embedding ratio is estimated in the way described in Section 3.4. We generated four stego images for each image by embedding 25, 50, 75, and 100% of the maximum hiding capacity. The proposed method estimates the embedding ratio successfully and can separate the cover images from the stego images embedding just 25% of the maximum hiding capacity. Moreover, because the PVD-based method determines the hiding capacity through the range table and the proposed method can estimate the range table using the fluctuation periods, the length of the hidden message bits can be computed from the embedding ratio.
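The last step, going from an estimated embedding ratio to a message length, can be sketched as follows. This is an illustration only: the range table below is an assumed example, not the table recovered by the proposed method, and the rule that a pixel pair carries log2 of its subrange width in bits is the usual PVD convention adopted here as an assumption. The function names are hypothetical.

```python
import math

# ASSUMPTION: an illustrative range table of (lower, upper) bounds on |d|;
# the table actually used by the embedder is not given in this section.
RANGE_TABLE = ((0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255))


def bits_for_difference(diff, table=RANGE_TABLE):
    """Bits carried by one pixel pair: log2 of the width of its subrange."""
    magnitude = abs(diff)
    for lower, upper in table:
        if lower <= magnitude <= upper:
            return int(math.log2(upper - lower + 1))
    raise ValueError(f"difference {diff} is outside the range table")


def estimated_message_bits(diffs, embedding_ratio, table=RANGE_TABLE):
    """Estimated payload = embedding ratio x maximum hiding capacity."""
    capacity = sum(bits_for_difference(d, table) for d in diffs)
    return embedding_ratio * capacity
```

For instance, four pixel-pair differences of 0, 9, −40, and 130 give a maximum capacity of 3 + 3 + 5 + 7 bits under this assumed table, so an estimated ratio of 0.5 corresponds to about half of that.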

5 Conclusion
This paper proposed a specific steganalysis to defeat modulus PVD steganography. The changes of the PVD histogram were modeled by analyzing the embedding process. Fluctuations around the borders of the subranges arise because the embedding keeps the subrange index unchanged to ensure correct communication. An asymmetric property and an abnormal increase were caused when the general modification rule for minimum distortion was broken. Utilizing these weaknesses, the new SMs were designed to successfully detect the existence of the hidden message. Moreover, because some histogram values of the stego image were similar to the cover and computed by the histogram modification model, the embedding ratio of the suspicious image was estimated accurately and reliably.

The histogram values of the cover image were estimated from the suspicious image through the novel HESs: the curve-fitting method and the histogram reverse-tracing method. The first curve-fitting method fitted the PVD histogram as one of three well-known distribution models. The second histogram reverse-tracing method used the changes of the PVD histogram. Because the proposed estimation schemes are performed without the cover, they can be usefully applied to various other fields and have potential for further development.

The proposed specific steganalysis can be used as the basis of a universal steganalysis for the PVD steganographic methods and of advanced steganalysis, such as the restoration of the original messages. In future work, the steganalytic features and the blind histogram estimation method for the cover image will be researched to detect general steganographic methods.

Fig. 9 Estimated embedding ratio of the maximal message length for 2048 images.

Acknowledgments
This work was partially supported by Defense Acquisition Program Administration and Agency for Defense Development under Contract No. UD060048AD.

References

1. Z. Chen and W. Liu, “Improved LSB matching steganography with histogram characters reserved,” Proc. SPIE 6837, 68370X (2007).

2. W. Hong, T.-S. Chen, and C.-W. Shiu, “Steganography using sudoku revisited,” in IITA ’08: Proc. of 2008 2nd Int. Symp. on Intelligent Information Technology Application, IEEE Computer Society, Washington, DC, pp. 935–939 (2008).

3. M. Shirali-Shahreza and M. Hassan Shirali-Shahreza, “Text steganography in SMS,” in ICCIT 07: Proc. of 2007 Int. Conf. on Convergence Information Technology, IEEE Computer Society, Washington, DC, pp. 2260–2265 (2007).

4. E.-J. Farn and C.-C. Chen, “Jigsaw puzzle images for steganography,” Opt. Eng. 48, 077006 (2009).

5. J. Fridrich, M. Goljan, and R. Du, “Detecting LSB steganography in color and gray-scale images,” IEEE Multimedia 8, 22–28 (2001).

6. A. D. Ker, “Steganalysis of LSB matching in grayscale images,” Signal Process. Lett. 12, 441–444 (June 2005).

7. J. J. Harmsen and W. A. Pearlman, “Steganalysis of additive-noise modelable information hiding,” Proc. SPIE 5020, 131–142 (2003).

8. I. Avcibas, M. Kharrazi, N. Memon, and B. Sankur, “Image steganalysis with binary similarity measures,” EURASIP J. Appl. Signal Process. 2005, 2749–2757 (2005).

9. D.-C. Lou, C.-L. Lin, and C.-L. Liu, “Universal steganalysis scheme using support vector machines,” Opt. Eng. 46, 117002 (2007).

10. Q. Liu, A. H. Sung, Z. Chen, and J. Xu, “Feature mining and pattern classification for steganalysis of LSB matching steganography in gray-scale images,” Pattern Recogn. 41, 56–66 (2008).

11. Q. Liu, A. H. Sung, B. Ribeiro, M. Wei, Z. Chen, and J. Xu, “Image complexity and feature mining for steganalysis of least significant bit matching steganography,” Inf. Sci. 178, 21–36 (2008).

12. D.-C. Wu and W.-H. Tsai, “A steganographic method for images by pixel-value differencing,” Pattern Recogn. Lett. 24, 1613–1626 (2003).

13. X. Zhang and S. Wang, “Vulnerability of pixel-value differencing steganography to histogram analysis and modification for enhanced security,” Pattern Recogn. Lett. 25, 331–339 (2004).

14. V. Sabeti, S. Samavi, M. Mahdavi, and S. Shirani, “Steganalysis and payload estimation of embedding in pixel differences using neural networks,” Pattern Recogn. 43, 405–415 (2010).

15. H.-C. Wu, N.-I. Wu, C.-S. Tsai, and M.-S. Hwang, “Image steganographic scheme based on pixel-value differencing and LSB replacement methods,” in IEE Proc. on Vision, Image and Signal Processing, Vol. 152, 611–615 (Oct. 2005).

16. V. Sabeti, S. Samavi, M. Mahdavi, and S. Shirani, “Steganalysis of pixel-value differencing steganographic method,” in IEEE Pacific Rim Conf. on Communications, Computers and Signal Processing, PacRim 2007, pp. 292–295 (2007).

17. J.-C. Liu and M.-H. Shih, “Generalizations of pixel-value differencing steganography for data hiding in images,” Fund. Inform. 83, 319–335 (2008).

18. K.-C. Chang, C.-P. Chang, P. S. Huang, and T.-M. Tu, “A novel image steganographic method using tri-way pixel-value differencing,” J. Multimedia 3, 37–44 (2008).

19. C.-M. Wang, N.-I. Wu, C.-S. Tsai, and M.-S. Hwang, “A high quality steganographic method with pixel-value differencing and modulus function,” J. Syst. Softw. 81, 150–158 (2008).

20. T. Zhang and X. Ping, “A new approach to reliable detection of LSB steganography in natural images,” Signal Process. 83, 2085–2093 (2003).

21. NRCS Photo Gallery Home, http://www.nrcs.usda.gov (May 15, 2008).

22. KODAK Image Set, http://r0k.us/graphics/kodak (May 15, 2008).

23. C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm (November 3, 2009).

Jeong-Chun Joo received his BS in computer science from Korea Military Academy, Republic of Korea, in 1996, and MS in computer science from Korea National Defense University, Republic of Korea, in 2002. He is currently working toward his PhD at the Multimedia Computing Laboratory, Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea. His major interests are steganography, steganalysis, and information security.

Heung-Kyu Lee received his BS in electronic engineering from Seoul National University, Seoul, Republic of Korea, in 1978, and MS and PhD in computer science from KAIST, Daejeon, Republic of Korea, in 1981 and 1984, respectively. Since 1986, he has been a professor in the Department of Computer Science, KAIST. His major interests are digital watermarking, digital fingerprinting, and digital rights management.


Kyung-Su Kim received his BS in computer engineering from Inha University, Incheon, Republic of Korea, in 2005, and his MS and PhD, both in computer science from KAIST, Daejeon, Republic of Korea, in 2007 and 2010, respectively. He is now with the Network Security Research Team, KT Network R&D Lab., Daejeon, Republic of Korea. His research interests include image/video watermarking and fingerprinting, error concealment methods, information security, multimedia signal processing, multimedia communications, and network security.

Hae-Yeoun Lee received his MS and PhD in computer science from KAIST, Daejeon, Republic of Korea, in 1997 and 2006, respectively. From 2006 to 2007, he was a postdoctoral researcher at Weill Medical College, Cornell University, Ithaca, New York. He is now with Kumoh National Institute of Technology, Republic of Korea. His major interests are digital watermarking, image processing, remote sensing, and digital rights management.
