A Markov Process Based Approach
to Effective Attacking JPEG Steganography
Yun Q. Shi, Chunhua Chen, Wen Chen
New Jersey Institute of Technology
Newark, NJ USA 07102 {shi,cc86}@njit.edu
Abstract. In this paper, a new steganalysis scheme is presented to effectively
detect advanced JPEG steganography. For this purpose, we first choose to work on
JPEG 2-D arrays formed from the magnitudes of JPEG quantized block DCT coefficients.
Difference JPEG 2-D arrays along the horizontal, vertical, and diagonal directions
are then used to enhance changes caused by JPEG steganography. A Markov process is
applied to model these difference JPEG 2-D arrays so as to utilize second-order
statistics for steganalysis. In addition to the use of difference JPEG 2-D arrays,
a thresholding technique is developed to greatly reduce the dimensionality of the
transition probability matrices, i.e., the dimensionality of the feature vectors,
thus making the computational complexity of the proposed scheme manageable.
Experimental results are presented to demonstrate that the proposed scheme
outperforms existing steganalyzers in attacking OutGuess, F5, and MB1.
1 Introduction
The Internet has been an important communication channel since the 1990s, through
which emails, speech, images, and videos are easily transmitted and shared. With
image steganography, covert communication through the Internet can also be
conducted.
Steganography is the art and science of “invisible” communication, which aims to
conceal the very existence of hidden messages. Images have many attributes that
make them suitable for steganography. Images can carry a large message. For
instance, some steganographic methods can achieve a steganographic proportion that
exceeds 13% of the image file size [1]. Because of the non-stationarity of images,
image steganography is hard to attack. Moreover, since digital images are
exchanged so frequently nowadays, image steganography is a promising covert channel.
Recently, research in the field of JPEG (Joint Photographic Experts Group)
steganography has become active because JPEG images are so widely used. Many
steganographic techniques operating on JPEG images have been published and become
publicly available. Most of the techniques in this category modify the LSB (least
significant bit) of the block discrete cosine transform (BDCT) coefficients, which
are the outcomes of the block-wise two-dimensional (2-D) DCT followed by
quantization using the JPEG quantization table.
In this paper we consider three recently published, advanced steganographic
methods: OutGuess [2], F5 [1], and model-based steganography (MB) [3].
OutGuess constructs a universal steganographic framework, which embeds hidden
data using the redundancy of a cover image. For JPEG images, OutGuess preserves
the statistics of the BDCT coefficient histogram. Two measures are taken to reduce
the change to the cover image introduced by data embedding. Before embedding,
OutGuess identifies the redundant BDCT coefficients that have the least effect on
the cover image; these will be modified if necessary during data embedding. It also
adjusts the untouched coefficients during the embedding procedure to preserve the
original histogram of the BDCT coefficients after embedding.
F5 was developed from Jsteg, F3, and F4. JPEG is the only image format that F5
works with. F5 takes two main actions to increase security against steganalysis
attacks: straddling and matrix coding. Straddling scatters the message as uniformly
as possible over the cover image to equalize the change density. With matrix coding,
F5 improves the embedding efficiency, which is defined as the number of bits
embedded per change of a BDCT coefficient. Generally speaking, the smaller the
embedded message is, the larger the embedding efficiency of F5 is.
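The relation between message size and embedding efficiency can be illustrated with
a small calculation. The sketch below is not taken from this paper; it assumes the
standard analysis of (1, n, k) matrix coding with n = 2^k - 1, in which k message
bits are embedded into n usable coefficients by changing at most one of them, so
the expected change density is 1/(n + 1). The function name `matrix_coding_stats`
is hypothetical.

```python
def matrix_coding_stats(k: int) -> tuple[int, float, float]:
    """Return (n, embedding rate, embedding efficiency) for (1, n, k) matrix coding.

    Assumption (not stated in the text): n = 2**k - 1 and the expected
    change density is 1 / (n + 1).
    """
    n = 2 ** k - 1                      # coefficients used per code word
    rate = k / n                        # message bits per usable coefficient
    change_density = 1 / (n + 1)        # expected changes per usable coefficient
    efficiency = rate / change_density  # bits embedded per coefficient change
    return n, rate, efficiency


if __name__ == "__main__":
    for k in range(1, 6):
        n, rate, eff = matrix_coding_stats(k)
        print(f"k={k}  n={n:2d}  rate={rate:.3f}  efficiency={eff:.3f}")
```

Under these assumptions, a larger k lowers the embedding rate (a shorter message
fits) while raising the number of bits embedded per change, which matches the
qualitative statement above.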
In general, the hidden data may be uncorrelated with the cover image, a property
that many steganalysis algorithms exploit to attack data hiding algorithms. MB
embedding tries to make the embedded data correlated with the cover image. This is
realized by splitting the cover image into two parts, modeling the parameter of the
distribution of the second part given the first part, encoding the second part using
the model and the to-be-embedded message, and then combining the two parts to form
the stego image. In the embedding method MB1 [3], which operates on JPEG images, a
Cauchy distribution is used to model the JPEG BDCT mode histogram. The embedding
procedure keeps the lower-precision version of the BDCT mode histogram unchanged.
To attack steganography, a number of steganalysis schemes have been proposed. They
fall into two categories: specific steganalysis and universal steganalysis [4].
Specific steganalysis concentrates on detecting one particular steganographic tool
and, if well designed, performs well against that tool. Universal steganalysis, by
contrast, tries to detect any steganographic tool, whether known or unknown in
advance.
Farid proposed a universal steganalyzer based on an image’s high-order statistics
[5]. Quadrature mirror filters are used to decompose the image into wavelet
subbands, and the high-order statistics are then calculated for each high-frequency
subband. A second set of statistics is calculated for the errors in an optimal
linear predictor of the coefficient magnitude. Both sets of statistical moments are
used as features for steganalysis. This scheme generally achieves a better
detection rate than random guessing against steganographic methods in general.
In [6], Shi et al. presented a universal steganalysis system. The statistical
moments of the characteristic functions of the image, its prediction-error image,
and their discrete wavelet transform (DWT) subbands are selected as features. All
of the low-low wavelet subbands are also used in their system. This steganalyzer
generally performs better than [5].
In [7], Fridrich proposed a set of distinguishing features from the BDCT domain and
the spatial domain, aimed at detecting information embedded in JPEG images. The
statistics of the original image are estimated by decompressing the JPEG image,
cropping four rows and four columns on the boundary, and then recompressing the
cropped image to JPEG format using the original quantization table. The author
claimed that the resulting image has statistical properties very similar to those
of the cover image. Features for steganalysis are generated from the statistics of
the JPEG image and its estimated version. Designed specifically for detecting JPEG
steganography, this scheme performs better than [5, 6] in attacking JPEG
steganography [1, 2, 3].
Recently, a specific steganalysis scheme was proposed for detecting data hidden
with the spread spectrum method, in which inter-pixel dependencies are used and a
Markov chain model is adopted [8]. The empirical transition matrix of a given test
image is formed. This matrix has a dimensionality of 256×256 for a grayscale image
with a bit depth of 8; that is, it has 65,536 elements. Obviously, these elements
cannot all be used directly as features. The authors select several of the largest
probabilities along the main diagonal together with their neighbors, and some
randomly selected probabilities along the main diagonal, as features. As a result,
some information loss is inevitable owing to the random fashion of feature
selection. Furthermore, this method applies the Markov chain only along the
horizontal direction, which cannot reflect the 2-D nature of a digital image.
In this paper, a new steganalysis scheme is presented to effectively detect
advanced JPEG steganography. First, we choose to work on JPEG 2-D arrays to
formulate features for steganalysis. Difference JPEG 2-D arrays along the
horizontal, vertical, and diagonal directions are then used to enhance changes
caused by JPEG steganography. A Markov process is applied to model these difference
JPEG 2-D arrays so as to utilize second-order statistics for steganalysis. In
addition to the use of difference JPEG 2-D arrays, a thresholding technique is
developed to greatly reduce the dimensionality of the transition probability
matrices, i.e., the dimensionality of the feature vectors, thus making the
computational complexity of the proposed scheme manageable. Experimental results
are presented to demonstrate that the proposed scheme outperforms the state of the
art in attacking OutGuess, F5, and MB1.
The rest of this paper is organized as follows. The feature construction procedure
is described in Section 2. In Section 3, the support vector machine, the classifier
used in our investigation, is introduced. Experimental results are given in Section
4, followed by some discussion in Section 5. Finally, conclusions are drawn in
Section 6.
2 Feature Construction
In this paper, steganalysis is considered as a two-class pattern recognition task.
That is, a given test image needs to be classified as either a stego image (with
hidden data) or a non-stego image (without hidden data). Feature construction is
therefore a key step in steganalysis.
As mentioned in Section 1, modern steganographic methods such as OutGuess and MB
have made great efforts to keep the changes to the BDCT coefficients caused by data
hiding as small as possible. In particular, they attempt to keep the changes to the
histogram of JPEG coefficients as small as possible. Under these circumstances, we
propose to use second-order statistics as features for steganalysis to detect these
JPEG steganographic methods.
In this section, we first define the JPEG 2-D array and then introduce the
difference JPEG 2-D arrays along different directions. We then propose to model the
difference JPEG 2-D array using a Markov random process. According to the theory of
random processes, the transition probability matrix can be used to characterize a
Markov process. Our proposed features are derived from the transition probability
matrix. To achieve an appropriate balance between steganalysis capability and
computational complexity, we use the so-called one-step transition probability
matrix in this work. To further reduce the computational cost by reducing the
dimensionality of the feature vectors, we resort to a thresholding technique.
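The effect of thresholding on dimensionality can be sketched as follows. This is a
minimal illustration rather than the scheme as specified in the paper: it assumes
that values outside [-T, T] are clipped to -T or T, that one step means moving to
the right-hand neighbour along the second array axis, and that T = 4 is merely an
illustrative default. The function name `thresholded_transition_matrix` is
hypothetical.

```python
import numpy as np


def thresholded_transition_matrix(diff: np.ndarray, T: int = 4) -> np.ndarray:
    """One-step transition probabilities of a difference array clipped to [-T, T].

    Returns a (2T+1) x (2T+1) matrix P with P[i, j] estimating the probability
    that the next element equals j - T given that the current one equals i - T.
    """
    # Assumption: out-of-range values are clipped, then shifted to 0..2T.
    d = np.clip(np.asarray(diff, dtype=np.int64), -T, T) + T
    size = 2 * T + 1
    cur = d[:, :-1].ravel()   # current element
    nxt = d[:, 1:].ravel()    # its right-hand neighbour (one step away)
    counts = np.bincount(cur * size + nxt, minlength=size * size)
    counts = counts.reshape(size, size).astype(np.float64)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for values that never occur stay all-zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```

Under these assumptions each matrix has (2T+1)^2 elements, for example 81 when
T = 4, so the number of candidate features no longer depends on the range of the
coefficient values.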
2.1 JPEG 2-D Array
Generating features directly from the 8×8 block discrete cosine transform (BDCT)
domain to attack steganographic algorithms operating on JPEG images is natural and
reasonable. For this purpose, it is necessary to first study the properties of JPEG
BDCT coefficients.
Fig. 1. A sketch of JPEG coefficient 2-D array (axes labeled Su and Sv)
For a given image, consider a 2-D array consisting of all of the 8×8 block DCT
coefficients that have been quantized with a JPEG quantization table and have not
been zig-zag scanned, run-length coded, or Huffman coded. That is, this 2-D array
has the same size as the given image, with each 8×8 block filled with the
corresponding JPEG quantized 8×8 block DCT coefficients. Furthermore, we take the
absolute value of each DCT coefficient, resulting in a 2-D array as shown in Figure
1. We call the resulting 2-D array the JPEG 2-D array in this paper. The features
proposed in this scheme are formed from the JPEG 2-D array.
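The construction just described can be sketched in a few lines. The sketch assumes
that the signed, quantized coefficients have already been read from the JPEG file
by some decoder and are available as an array of shape (block rows, block columns,
8, 8); that input layout and the function name `jpeg_2d_array` are assumptions made
for illustration.

```python
import numpy as np


def jpeg_2d_array(coeff_blocks: np.ndarray) -> np.ndarray:
    """Tile quantized 8x8 DCT blocks into one 2-D array and take magnitudes.

    coeff_blocks: signed integers, shape (block_rows, block_cols, 8, 8)
                  (hypothetical layout, as delivered by some JPEG decoder).
    """
    blocks = np.asarray(coeff_blocks)
    if blocks.ndim != 4 or blocks.shape[2:] != (8, 8):
        raise ValueError("expected shape (block_rows, block_cols, 8, 8)")
    block_rows, block_cols = blocks.shape[:2]
    # Put each block back at its spatial position: result[8*by + i, 8*bx + j].
    tiled = blocks.transpose(0, 2, 1, 3).reshape(block_rows * 8, block_cols * 8)
    return np.abs(tiled)
```

The coefficients are used as stored in the file, before any zig-zag reordering, so
that neighbouring array elements correspond to neighbouring frequencies within a
block or to neighbouring blocks.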
The reason for taking absolute values is discussed below. Note that the JPEG
quantized BDCT coefficients can be positive, negative, or zero. It is known that
the BDCT coefficients have been decorrelated effectively. However, since the BDCT
coefficients in general do not follow a Gaussian distribution, decorrelation does
not imply independence, and these coefficients are not statistically independent of
each other. It is also well known that the power of an 8×8 block of DCT
coefficients is highly concentrated in the DC (direct current) and low-frequency AC
(alternating current) coefficients. JPEG quantization, after which the majority of
high-frequency BDCT AC coefficients may become zero, further enhances this disparity
in power distribution among the quantized BDCT coefficients. The general trend in
the power distribution of the BDCT coefficients in each block is non-increasing
along the zig-zag scan order of the DCT coefficients in the block, if we ignore
some small fluctuations. This is consistent with the fact that zig-zag scanning
makes run-length coding efficient [9]. Combining these observations, we can state
that the magnitudes of the non-zero BDCT coefficients are correlated with each
other along the zig-zag scan order. Hence, correlation exists among the absolute
values of the BDCT coefficients along the horizontal, vertical, and diagonal
directions. This observation is further supported by Figure 3 below: the differences
between the absolute values of two immediately neighboring BDCT coefficients
(horizontal neighbors in Figure 3) are highly concentrated around 0 and have a
Laplacian-like distribution. The same is true along the vertical and diagonal
directions.
In addition, the steganographic methods operating on JPEG images neither touch the
DCT DC coefficients nor change the sign of the DCT AC coefficients during data
embedding [2, 3] (note that a DCT coefficient with a non-zero magnitude changing to
zero is not a sign change). Further discussion in this regard is given in Section
5.1, which shows that taking the absolute value generally results in higher
detection rates and lower computational complexity.
2.2 Difference JPEG 2-D Array
According to [6], the disturbance caused by data embedding manifests itself more
clearly in the prediction-error image than in the original test image. Hence, it is
expected that the disturbance caused by steganographic methods in JPEG images can
be amplified by observing the difference between an element and one of its
neighbors in the JPEG 2-D array. For this purpose, we consider the following four
difference JPEG 2-D arrays.
Denote the JPEG 2-D array generated from a given test image by $F(u, v)$
($u \in [1, S_u]$, $v \in [1, S_v]$), where $S_u$ is the size of the JPEG 2-D array
in the horizontal direction and $S_v$ its size in the vertical direction. Then, as
shown in Figure 2, the difference arrays are generated by the following formulae:

$$F_h(u, v) = F(u, v) - F(u+1, v), \tag{1}$$
$$F_v(u, v) = F(u, v) - F(u, v+1), \tag{2}$$
$$F_d(u, v) = F(u, v) - F(u+1, v+1), \tag{3}$$
$$F_{md}(u, v) = F(u+1, v) - F(u, v+1), \tag{4}$$

where $u \in [1, S_u - 1]$, $v \in [1, S_v - 1]$, and $F_h(u, v)$, $F_v(u, v)$,
$F_d(u, v)$, $F_{md}(u, v)$ denote the difference arrays in the horizontal,
vertical, main diagonal, and minor diagonal directions, respectively.
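Formulae (1) to (4) map directly onto array slicing. The sketch below assumes that
the JPEG 2-D array is stored as a NumPy array indexed as F[row, column], so that
the horizontal index u corresponds to the column and the vertical index v to the
row; the function name `difference_arrays` is hypothetical.

```python
import numpy as np


def difference_arrays(F: np.ndarray) -> dict[str, np.ndarray]:
    """Return the four difference JPEG 2-D arrays of formulae (1)-(4).

    Assumed storage convention: F[v - 1, u - 1] holds F(u, v), so u + 1 is the
    next column and v + 1 is the next row. All outputs have shape
    (S_v - 1, S_u - 1).
    """
    F = np.asarray(F, dtype=np.int64)   # signed type so differences can be negative
    base = F[:-1, :-1]                  # F(u, v)
    right = F[:-1, 1:]                  # F(u + 1, v)
    below = F[1:, :-1]                  # F(u, v + 1)
    below_right = F[1:, 1:]             # F(u + 1, v + 1)
    return {
        "h": base - right,              # (1) horizontal
        "v": base - below,              # (2) vertical
        "d": base - below_right,        # (3) main diagonal
        "md": right - below,            # (4) minor diagonal
    }
```

Casting to a signed integer type matters here: the JPEG 2-D array holds magnitudes,
which may be stored as unsigned values, while the differences can be negative.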
[Figure 2 panels: (a) JPEG 2-D array and horizontal difference array; (b) JPEG 2-D
array and vertical difference array; (c) JPEG 2-D array and main diagonal difference
array; (d) JPEG 2-D array and minor diagonal difference array]
Fig. 2. The generation of four difference JPEG 2-D arrays. Parts (a), (b), (c), and
(d) correspond to horizontal, vertical, main diagonal, and minor diagonal difference
JPEG 2-D arrays, respectively
It is observed that the distribution of the elements of the above-described
difference arrays is Laplacian-like: most of the difference values are close to
zero. In the experimental work reported in this paper, an image set consisting of
7560 JPEG images with quality factors ranging from 70 to 90 is used. The arithmetic
average of the histograms of the horizontal difference JPEG 2-D arrays generated
from this JPEG image set, and the histogram of the horizontal difference JPEG 2-D
array generated from a randomly selected image from the set, are shown in Figure 3
(a) and (b), respectively. It is observed that most elements in the horizontal
difference JPEG 2-D arrays fall into the interval [-T, T] as long as T is large
enough. The mean and standard deviation, over the image set, of the percentage of
elements of the horizontal difference JPEG 2-D arrays falling into [-T, T] for
T = 1, 2, 3, 4, 5, 6, 7 are shown in Table 1. Both Figure 3 and Table 1 support the
claim that the elements of the horizontal difference JPEG 2-D arrays have a
Laplacian-like distribution. The same is true for the difference JPEG 2-D arrays
along the other three directions.
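The per-image quantity summarized in Table 1 can be computed as sketched below.
This is a minimal illustration for a single difference array; the function name
`percent_within` is hypothetical, and the averaging over the image set is left to
the caller.

```python
import numpy as np


def percent_within(diff: np.ndarray, thresholds=range(1, 8)) -> dict[int, float]:
    """Percentage of elements of a difference array that fall within [-T, T]."""
    magnitudes = np.abs(np.asarray(diff, dtype=np.int64))
    total = magnitudes.size
    return {
        T: 100.0 * np.count_nonzero(magnitudes <= T) / total
        for T in thresholds
    }
```

Collecting these dictionaries for every image in a set and taking the mean and
standard deviation per T would give figures of the kind reported in Table 1.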
Table 1. Mean and standard deviation of percentage numbers of elements of horizontal
difference JPEG 2-D arrays falling within [-T, T] for T = 1, 2, 3, 4, 5, 6, 7