A Project Report On “Review of tools & techniques for steganalysis” SUBMITTED TO ASIAN SCHOOL OF CYBER LAW, PUNE, In partial fulfillment for the award of the degree of CYBER FORENSICS ANALYST Submitted by SACHIN LAWANDE Under the guidance of Prof. ROHAS NAGPAL ASIAN SCHOOL OF CYBER LAW Department of Cyber Forensics 2015-2016 ASCL
Acknowledgement
Every ASCL student looks toward the CFA project as an opportunity to apply the skills nurtured over the year through hard work and dedication. The milestone of completing this project would have been unattainable without the help of a few people who need to be acknowledged.
We owe this moment of satisfaction, with a deep sense of gratitude, to our internal guide Prof. Rohas Nagpal, who guided us at every stage and whose technical support and helpful attitude gave us high moral support.
We also take this opportunity to thank all our colleagues who backed our interest by giving useful suggestions and all possible help. Last but not least, we are thankful to our friends, colleagues, and all the people directly or indirectly concerned with this project.
Sachin Lawande
CFA (ASCL)
Pune
ABSTRACT
Steganography deals with confidentiality and covert communication, and the techniques for countering it in the context of computer forensics have somewhat fallen behind. This report discusses how steganography hides data and the different methods and techniques used to investigate that data. It covers the recovery of encoded data, the tools used for both steganography and steganalysis, and the techniques that reveal hidden data. These methods will help forensic analysts recover hidden data. We need to stay one step ahead of cyber criminals.
Rapidly growing computer and networking technology, coupled with a marked expansion in communications and information exchange capability within government, public and private corporations, and our own homes, has made our world smaller. As a society, we are substantially more invested in information technologies than ever before. Use of the Internet and multimedia for communication has become commonplace and an integral part of both business and social activity. This has changed how citizens across the world operate.
The rapid evolution of the Internet and technology has also been somewhat of a “double-edged sword.” Not only has it delivered a medium for exchanging vast amounts of information and knowledge for the benefit of mankind, it has also provided a new medium for conducting activities harmful to mankind. No longer restricted to the bounds of physical space, criminals and terrorists have discovered a digital world where they can take advantage of the vast expanse of cyber space to conceal their activities from the prying eyes of law enforcement and the intelligence community. In the pre-Internet era, criminals often operated under the cloak of darkness. Now they operate at all times under the “cloak of cyber space” with little concern for being detected, arrested, prosecuted, and convicted, because by and large much criminal activity goes unreported. Even when it is reported, law enforcement is already so overwhelmed with CP investigations that it does not have the time or resources to investigate other cyber crimes. This fact is not lost on those who would use the Internet for criminal or otherwise malicious purposes.
ASCL 2015 [CFA] Page 4
Review of tools & techniques for steganalysis
To make matters worse, criminals are adapting to evolving law enforcement technologies in the field of cyber forensics by finding new ways to hide their criminal and illegal activities. Law enforcement forensic experts are starting to discover data hiding applications on seized media that have been used to avoid detection by popular cyber forensic tools by hiding a digital file inside another digital multimedia file. This method is called digital steganography.
The growing opportunities of modern communications require special means of security, especially on computer networks. Network data security is becoming more important as the amount of data exchanged on the Internet increases. Therefore, confidentiality and data integrity are needed to protect against unauthorized access and use. This has resulted in an explosive growth of the field of information hiding.
In watermarking applications, the message contains information such as an owner ID and a digital time stamp, and is usually applied for copyright protection.
In fingerprinting, the owner of the data set embeds a serial number that uniquely identifies the user of the data set. This adds to the copyright information and makes it possible to trace any unauthorized use of the data set back to the user.
1.2 Basic Concepts related to Project
1.2.1 What is Steganography?
Steganography is the art of hiding private or sensitive information within something that appears to be nothing out of the ordinary. Steganography is often confused with cryptography because both methods are used to protect important information. The difference between the two is that steganography hides information so that it appears that no information is hidden at all. A person who views the digital media in which the information is hidden will have no idea that any hidden information exists, and therefore will not attempt to decode it.
What steganography essentially does is exploit human perception: human senses are not trained to look for files that have data hidden inside them, although software is available that can do this, which is called steganalysis. The most common use of steganography is to hide a file inside another digital media file.
1.2.2 What is Steganalysis?
Steganalysis is the art of detecting messages hidden using steganography; it is analogous to cryptanalysis applied to cryptography.
The goal of steganalysis is to identify suspected packages, determine whether or not they have a payload encoded into them, and, if possible, recover that payload.
Unlike cryptanalysis, where it is obvious that captured data contains a message (though that message is encrypted), steganalysis generally starts with a pile of suspect data files, but little information about which of the files, if any, contain a payload. The steganalyst is usually something of a forensic statistician, and must start by reducing this set of data files (which is often quite large; in many cases, it may be the entire set of files on a computer) to the subset most likely to have been altered.
Detecting Steganography:
The art of detecting Steganography is referred to as Steganalysis.
To put it simply, steganalysis involves identifying the use of steganography inside a file. In this narrow sense, steganalysis does not deal with decoding the hidden data inside a file, just discovering it.
There are many methods and techniques that can be used to detect steganography, such as viewing the file and comparing it to another copy of the file found on the Internet (for a picture file). There are usually multiple copies of an image on the Internet, so you may want to find several of them and try to match the suspect file against them. For example, if you download a JPEG and your suspect file is also a JPEG, and the two files look almost the same apart from the fact that one is larger than the other, it is most probable that your suspect file has hidden data inside it.
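The comparison described above can be roughly sketched in Python. This is only an illustrative sketch: the function name `compare_with_reference` and the sample byte strings are assumptions made for this example and do not come from any tool discussed in this report.

```python
import hashlib


def compare_with_reference(suspect: bytes, reference: bytes) -> dict:
    """Compare a suspect file's bytes with a known clean copy of the same file."""
    # Length of the leading run of bytes that both files share.
    shared = 0
    for a, b in zip(suspect, reference):
        if a != b:
            break
        shared += 1
    return {
        "identical": hashlib.sha256(suspect).digest() == hashlib.sha256(reference).digest(),
        "size_difference": len(suspect) - len(reference),
        "shared_prefix": shared,
    }


# Made-up sample data: a tiny stand-in for an image and a copy with extra bytes appended.
reference = b"\xff\xd8" + b"image" + b"\xff\xd9"
suspect = reference + b"hidden"
result = compare_with_reference(suspect, reference)
print(result)
```

A suspect file that is larger than the reference copy but shares its entire beginning hints that something was appended. It is an indicator that justifies a closer look, not proof of hidden data.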
1.2.3 Short-time Fourier transform (STFT)?
The short-time Fourier transform (STFT), or short-term Fourier transform, is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. In practice, the procedure for computing STFTs is to divide a longer time signal into shorter segments of equal length and then compute the Fourier transform separately on each shorter segment. This reveals the Fourier spectrum on each shorter segment. One then usually plots the changing spectra as a function of time.
Fig 1.1 An STFT being used to analyze an audio signal across time
Application-
fundamental frequency estimation from spectral peaks
cross-synthesis
spectral envelope extraction by cepstral smoothing
spectral envelope extraction by linear prediction
sinusoidal modeling of audio signals
sines+noise modeling
sines+noise+transients modeling
chirplet modeling
time-scale modification
frequency scaling
FFT filter banks
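The STFT procedure described in this section, dividing the signal into equal segments and transforming each one separately, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses a plain (slow) discrete Fourier transform and a rectangular window with no overlap, whereas practical implementations normally use an FFT and a tapered window. The names `dft` and `stft` and the test signal are chosen for this example only.

```python
import cmath
import math


def dft(segment):
    """Plain discrete Fourier transform of one segment (O(n^2), for illustration)."""
    n = len(segment)
    return [
        sum(segment[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def stft(signal, segment_len, hop):
    """Return one magnitude spectrum per segment of the signal."""
    frames = []
    for start in range(0, len(signal) - segment_len + 1, hop):
        segment = signal[start:start + segment_len]
        frames.append([abs(c) for c in dft(segment)])
    return frames


# Test signal: 2 cycles per 16 samples for the first half, then 4 cycles per 16 samples.
signal = [math.sin(2 * math.pi * 2 * t / 16) for t in range(32)]
signal += [math.sin(2 * math.pi * 4 * t / 16) for t in range(32)]

frames = stft(signal, segment_len=16, hop=16)
# Strongest frequency bin in each segment (only the first half of the spectrum is unique).
peaks = [max(range(8), key=lambda k: frame[k]) for frame in frames]
print(peaks)
```

The list of peak bins changes from one segment to the next as the frequency of the signal changes, which is the time-varying spectrum that a spectrogram plots.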
1.2.4 Linear Regression?
Linear regression is the most basic and commonly used predictive analysis. Regression estimates are used to describe data and to explain the relationship between one dependent variable and one or more independent variables.
At the center of the regression analysis is the task of fitting a single line through a scatter plot. The simplest form, with one dependent and one independent variable, is defined by the formula y = c + b*x, where y = estimated dependent variable, c = constant, b = regression coefficient, and x = independent variable.
Fig 1.2.4 simple linear regression, which has one independent variable
Application –
(1) Causal analysis
(2) Forecasting an effect
(3) Trend forecasting
Unlike correlation analysis, which focuses on the strength of the relationship between two or more variables, regression analysis assumes a dependence or causal relationship between one or more independent variables and one dependent variable.
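The line y = c + b*x from the formula in this section can be fitted by ordinary least squares. The sketch below is a minimal illustration; the function name `fit_line` and the sample points are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Least-squares estimates of c (constant) and b (coefficient) in y = c + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    c = mean_y - b * mean_x
    return c, b


# Sample points that lie exactly on y = 1 + 2*x.
c, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(c, b)
```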
A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.
Application-
SVMs can be used to solve various real world problems:
SVMs are helpful in text and hypertext categorization as their application can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings.
Classification of images can also be performed using SVMs. Experimental results show that SVMs achieve significantly higher search accuracy than traditional query refinement schemes after just three to four rounds of relevance feedback
SVMs are also useful in medical science to classify proteins with up to 90% of the compounds classified correctly.
Hand-written characters can be recognized using SVM.
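The idea of a separating hyperplane learned from labeled training data can be illustrated with a very small linear SVM. This is only a sketch under simplifying assumptions: it trains a linear soft-margin classifier by subgradient descent on the hinge loss, without kernels, and it is not the solver used by any tool mentioned in this report. The function names, learning rate, regularisation value, and sample points are all chosen for this example.

```python
def train_linear_svm(points, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit w and b of the hyperplane w.x + b = 0 by subgradient descent on the hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularisation step: shrink the weights slightly.
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:
                # The sample is misclassified or inside the margin: move the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Made-up, linearly separable training data with labels +1 and -1.
points = [(2, 2), (3, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print(predict(w, b, (4, 4)), predict(w, b, (-4, -4)))
```

In steganalysis the input vectors would be statistical features extracted from cover and stego files instead of these toy points.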
Chapter 2
REPORT PERCEPTION AND STUDY
2.1 Problem Definition
Steganography, “covered writing,” is a means of hidden communication that covers a variety of techniques used to embed data within a cover medium in such a manner that the very existence of the encoded information is concealed.
Thousands of steganography applications are readily available on the Internet, most of them as freeware or shareware, for use by hackers and criminals. Computer security, law enforcement, and intelligence experts need the ability both to detect the use of digital steganography applications to hide information and then to extract the hidden information. Accordingly, there is much current interest in steganalysis, or the discovery and extraction of information hidden with digital steganography applications.
2.2 Objectives
This report focuses on steganography in graphical or image files. It describes some technical features of steganography used by various tools that are specific to certain types of image files. Following that, steganalysis techniques used to detect the existence of hidden data are discussed from the forensic analyst's point of view. Finally, the limitations of steganalysis are presented along with an evaluation of some steganalysis tools.
Chapter 3
TECHNIQUES FOR STEGANALYSIS
3.1 Signature Steganalysis
Steganography methods hide secret data by manipulating images and other digital media in ways that remain invisible to the human eye. But hiding data within any digital media using steganography requires modifying the media's properties, which may cause some form of degradation or unusual features and patterns. These patterns and features may act as signatures that broadcast the presence of an encoded message, and signature-based attacks use them to detect the existence of hidden messages. It is reported that Jpegx, a data-insertion steganography tool, inserts the secret data after the end-of-file marker of a JPEG file and adds a fixed signature of the program before the secret data. The signature is the following hex code: 5B 3B 31 53 00. The presence of this signature implies that the image contains secret data embedded using Jpegx.
Example
Fig.3.1.1 Image with Hidden Data
Img.3.1.2 Hex editor view of Image Signature denoted by the red box in the diagram below
This particular steganography application also encodes additional data used for decoding the hidden information. This data includes signature bytes that the steganography application uses to determine that the hidden information was encoded by itself (indicated by the red box in the diagram), and a hash value representation of the user-defined password (indicated by the green box).
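A signature check of the kind described in this section can be sketched in a few lines of Python. The signature bytes 5B 3B 31 53 00 are the ones quoted above; everything else (the function name `detect_jpegx`, the returned fields, and the sample data) is an assumption made for this illustration. The sketch also assumes that the signature directly follows the JPEG end-of-image marker FF D9, which real files may not always satisfy.

```python
JPEG_EOI = b"\xff\xd9"                              # JPEG end-of-image marker
JPEGX_SIGNATURE = bytes.fromhex("5B 3B 31 53 00")   # signature quoted in this section


def detect_jpegx(data: bytes):
    """Return details about the signature if it is present, otherwise None."""
    sig_offset = data.find(JPEGX_SIGNATURE)
    if sig_offset == -1:
        return None
    # Look for an end-of-image marker that ends exactly where the signature starts.
    eoi_offset = data.rfind(JPEG_EOI, 0, sig_offset)
    return {
        "signature_offset": sig_offset,
        "follows_eoi": eoi_offset != -1 and eoi_offset + len(JPEG_EOI) == sig_offset,
    }


# Made-up sample data standing in for a clean JPEG and a stego JPEG.
clean = b"\xff\xd8" + b"\x00" * 10 + JPEG_EOI
stego = clean + JPEGX_SIGNATURE + b"secret"
print(detect_jpegx(clean), detect_jpegx(stego))
```

The reported offset is the position where an examiner would start looking in a hex editor for the appended data.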
3.2 BMP – The Least Significant Bit Technique
A commonly used steganography technique that can be applied to BMP graphic files is the Least Significant Bit (LSB) method. As its name suggests, the LSB method changes the least significant bit in the data bytes of the image to encode the hidden data. These bit changes do not cause a noticeable loss of quality in the image, particularly for 24-bit BMP files. Sometimes, a steganography tool may use the two least significant bits of each byte to encode the hidden data.
Detection
To illustrate the LSB steganographic method, consider the given BMP image: house.bmp. The LSB steganography method encodes messages in the LSB of every byte in an image. By doing so, the value of each pixel is altered slightly, but not enough to make major visual changes to the image, even when compared to the original. Comparing the original carrier file in a hex editor with the same file after the LSB method has been applied shows a modification in some byte values. Notice in the hex editor comparison below (Img.3.2.2 and Img.3.2.3) that the highlighted byte values differ by one.
Example
Img.3.2.1 Data.bmp
Hex editor comparison
Img.3.2.2 Data.jpg (without steganography)
Img.3.2.3 Data.jpg (with steganography)
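The hex editor comparison can also be automated. The sketch below lists the offsets at which two byte sequences differ only in their least significant bit; the function name `lsb_only_differences` and the sample bytes are assumptions made for this illustration.

```python
def lsb_only_differences(original: bytes, suspect: bytes):
    """Offsets where the two byte sequences differ only in the least significant bit."""
    return [
        offset
        for offset, (a, b) in enumerate(zip(original, suspect))
        if a ^ b == 1  # XOR of 1 means every bit except the lowest one is identical
    ]


# Made-up sample bytes: offsets 1 and 2 have a flipped least significant bit.
original = bytes([0x10, 0x11, 0x12, 0x13])
suspect = bytes([0x10, 0x10, 0x13, 0x13])
print(lsb_only_differences(original, suspect))
```

A long run of such offsets in the pixel data of an otherwise identical file is consistent with LSB embedding.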
Extracting hidden information that has been embedded using the LSB method involves determining the number of bits used for encoding. After the encoding bits are extracted, they must be reassembled to recreate the hidden information. Some steganography applications employ various randomization techniques when distributing the encoded bits, which must be reversed during reassembly. For straightforward sequential embedding, simply reassemble each group of eight extracted bits into one byte of the hidden data.
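For the straightforward sequential case, embedding and extraction can be sketched as follows. This is a minimal illustration under stated assumptions: one bit per carrier byte, most significant message bit first, no randomization, and a carrier that contains pixel data only (a real BMP file header would have to be skipped). The function names and sample data are chosen for this example.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Write the message bits, most significant bit first, into the LSBs of the cover."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return bytes(stego)


def extract_lsb(stego: bytes, message_len: int) -> bytes:
    """Reassemble message_len bytes from the LSBs of the first 8 * message_len bytes."""
    message = bytearray()
    for index in range(message_len):
        value = 0
        for carrier_byte in stego[index * 8:(index + 1) * 8]:
            value = (value << 1) | (carrier_byte & 1)
        message.append(value)
    return bytes(message)


cover = bytes(range(100, 164))      # made-up stand-in for 64 bytes of pixel data
stego = embed_lsb(cover, b"Hi")
print(extract_lsb(stego, 2))
```

Every carrier byte changes by at most one, which is why the change is hard to see in the image but visible in a byte-level comparison.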
3.3 Specific Statistical steganalysis
Steganography embeds secret messages in images, and this causes modifications in the statistics of an image. Statistical steganalysis, as the name implies, examines the underlying statistics of an image to detect the secret embedded data. Statistical steganalysis is considered more powerful than signature steganalysis because mathematical methods are more sensitive than visual inspection. Specific statistical steganalysis can be categorized by data hiding technique, i.e. in the spatial domain and in the transform domain.
i) Spatial domain steganalysis
a) Chi-square Attack
The first statistical steganalysis method was proposed by Westfeld and Pfitzmann. This method is specific to LSB embedding and is based on powerful first-order statistical analysis rather than visual inspection. The technique identifies Pairs of Values (POVs), which consist of pixel values, quantized DCT coefficients, or palette indices that get mapped to one another on LSB flipping. After message embedding, the total number of occurrences of the two members of a given POV remains the same, while their individual frequencies tend to become equal. This pairwise dependency is exploited to design a statistical Chi-square test to detect hidden messages. The reported results show that this method reliably detects sequentially embedded messages. Later, the method was generalized to detect randomly scattered messages.
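The Pairs of Values idea can be illustrated with a small Chi-square computation over 8-bit sample values. This is a simplified sketch, not the authors' implementation: it only computes the test statistic and the degrees of freedom, and leaves out the conversion to a probability and the usual rule of merging sparsely populated categories. The function name `chi_square_pov` and the sample data are assumptions made for this example.

```python
def chi_square_pov(values):
    """Chi-square statistic comparing each even value's count with its POV average."""
    histogram = [0] * 256
    for value in values:
        histogram[value] += 1
    statistic = 0.0
    categories = 0
    for k in range(128):
        even, odd = histogram[2 * k], histogram[2 * k + 1]
        expected = (even + odd) / 2  # the pair total is unchanged by LSB flipping
        if expected > 0:
            statistic += (even - expected) ** 2 / expected
            categories += 1
    return statistic, max(categories - 1, 0)


# Made-up data: "cover" uses only even values, "stego" has perfectly equalized pairs.
cover_like = [2 * k for k in range(128)] * 10
stego_like = list(range(256)) * 5
print(chi_square_pov(cover_like), chi_square_pov(stego_like))
```

A statistic that is small relative to the degrees of freedom means the two members of each pair occur almost equally often, which is what LSB embedding of a random-looking message tends to produce.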
b) RQP (Raw Quick Pair)
Another specific steganalysis technique for detecting LSB embedding in 24-bit color images, the Raw Quick Pair (RQP) method, was proposed by Fridrich. The method is based on analyzing close pairs of colors created by LSB embedding. It has been shown that the ratio of close colors to the total number of unique colors increases significantly when a message of a selected length is embedded in a cover image rather than in a stego image. It is this difference that makes it possible to discriminate between cover images and stego images for the case of LSB steganography. The method works reliably as long as the number of unique colors in the cover image is less than 30% of the number of pixels. As reported, the method has a higher detection rate than the technique given by Westfeld and Pfitzmann, but it cannot be applied to grayscale images.
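The quantity at the heart of RQP can be sketched as follows. Two assumptions are made here that the text above does not spell out: two colors count as a close pair when each of their R, G, and B components differs by at most one, and the ratio is taken against the number of all possible pairs of unique colors. The full method additionally embeds a test message and compares the ratio before and after, which this sketch does not do. The function name and sample pixels are chosen for this example.

```python
from itertools import product


def close_pair_ratio(pixels):
    """Ratio of close color pairs to all pairs of unique colors in a list of RGB tuples."""
    unique = set(pixels)
    count = len(unique)
    if count < 2:
        return 0.0
    # All 26 neighbours whose components each differ by -1, 0, or +1 (excluding itself).
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    close = 0
    for r, g, b in unique:
        for dr, dg, db in offsets:
            if (r + dr, g + dg, b + db) in unique:
                close += 1
    close //= 2  # every close pair was counted once from each side
    return close / (count * (count - 1) / 2)


# Made-up pixels: two nearly identical colors and one distant color.
pixels = [(10, 10, 10), (10, 10, 11), (50, 50, 50)]
print(close_pair_ratio(pixels))
```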
c) RS Steganalysis
A more sophisticated method, RS steganalysis (Regular and Singular groups), was proposed by Fridrich et al. for detecting LSB embedding in color and grayscale images. This method utilizes sensitive dual statistics derived from spatial correlations in images. The image is divided into disjoint groups of fixed shape. Within each group, noise is measured by the mean absolute value of the differences between adjacent pixels. Each group is classified as “regular” or “singular” depending on whether the pixel noise within the group is increased or decreased after flipping the LSBs of a fixed set of pixels within the group using a “mask”. The classification is repeated for a dual type of flipping. Theoretical analysis and experimentation show that the proportions of regular and singular groups form curves that are quadratic in the amount of information embedded by the LSB method. RS steganalysis is more reliable than the Chi-square method.
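The group classification step of RS steganalysis can be sketched as follows. This is a simplified illustration under stated assumptions: pixels are taken as a flat list of grayscale values, groups are four consecutive pixels, the mask is (0, 1, 1, 0), LSB flipping swaps 0 with 1, 2 with 3, and so on, and the dual flipping swaps -1 with 0, 1 with 2, and so on. The estimation of the message length from the quadratic curves is not included. All names are chosen for this example.

```python
def _noise(group):
    """Mean absolute difference between adjacent pixels in a group."""
    return sum(abs(a - b) for a, b in zip(group, group[1:])) / (len(group) - 1)


def _flip(value):
    """LSB flipping: 0<->1, 2<->3, ..."""
    return value ^ 1


def _dual_flip(value):
    """Dual (shifted) flipping: -1<->0, 1<->2, ... (may leave the 0..255 range)."""
    return ((value + 1) ^ 1) - 1


def rs_counts(pixels, mask=(0, 1, 1, 0)):
    """Count regular (R) and singular (S) groups for the mask M and its dual -M."""
    counts = {"R_M": 0, "S_M": 0, "R_-M": 0, "S_-M": 0}
    size = len(mask)
    for start in range(0, len(pixels) - size + 1, size):
        group = pixels[start:start + size]
        before = _noise(group)
        for label, flip in (("M", _flip), ("-M", _dual_flip)):
            flipped = [flip(v) if m else v for v, m in zip(group, mask)]
            after = _noise(flipped)
            if after > before:
                counts["R_" + label] += 1   # noise increased: regular group
            elif after < before:
                counts["S_" + label] += 1   # noise decreased: singular group
    return counts


# Made-up pixel row: one flat group and one group whose middle pixels have flipped LSBs.
print(rs_counts([10, 10, 10, 10, 10, 11, 11, 10]))
```

In the original description of the method, a clean image is expected to give similar regular counts for M and -M, and LSB embedding pulls the counts for M toward each other while those for -M drift apart.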
d) B-Spline fitting
Shunquan Tan and Bin Li note that no targeted steganalysis had been reported against EALSBMR. They proposed that B-Spline fitting can be applied to the histogram to eliminate the pulse distortion caused by the settling phase of EALSBMR. This method can accurately estimate the threshold used in the secret data embedding procedure and separate stego images with unit block size from those with block sizes greater than 1.
ii) Transform domain steganalysis
a) Chi-square statistics
Zhang and Ping have proposed an attack on sequential JSteg and random JSteg for JPEG images. The method is based on a statistical model of DCT coefficients. It is observed that the quantized DCT coefficients of JPEG images are distributed symmetrically around zero in clean images. These distributions are changed by message embedding, whether sequential or random. Chi-square statistics of the stego image are calculated, and an inequality is used to judge the presence of a hidden message. The embedding ratio is also estimated. The technique is simple and very effective.
b) Histogram Analysis Attack
The histogram analysis attack works on JPEG sequential and pseudo-random embedding stegosystems, such as JSteg and OutGuess 0.1. It can effectively estimate the length of the embedded message, and it is based on the loss of histogram symmetry after embedding. X. Yu et al. proposed a powerful steganalysis method specific to JSteg steganography in the JPEG file format. In this technique, the cover image histogram of DCT coefficients is estimated from the stego image histogram. This estimate is more accurate than Fridrich's cropping method.
c) Calibration Technique
Fridrich proposes a feature-based steganalytic method that is combined with the concept of calibration for JPEG images. First- and second-order features are analyzed in both the DCT and spatial domains, such as the global DCT coefficient histogram, dual histograms, blockiness, and the co-occurrence matrix. In order to estimate the cover image, we take into account how JPEG works. JPEG images have a block structure of 8x8 blocks and are formed by quantized DCT coefficients, which tend to be robust to small distortions such as compression and embedding. Thus, by decompressing the image and recompressing it with a shifted block structure, we can estimate the cover image. This is done by applying the following calibration steps to the stego image.
Decompress the stego image using its quantization table.
Crop the decompressed stego image by 4 pixels, either column-wise or row-wise or at the
edges.
Compress the cropped image using the same quantization table.
3.4 Text Based steganalysis
The use of text media as a cover channel for secret communication has drawn growing attention, which in turn creates growing concern about text steganalysis. At present, it is harder to find secret messages in texts than in other types of multimedia files, such as image, video, and audio. In general, text steganalysis exploits the fact that embedding information usually changes some statistical properties of stego texts; it is therefore vital to perceive the modifications in stego texts. Previous work on text steganalysis can be roughly classified into three categories: format-based, invisible character-based, and linguistic. Unlike the former two categories, linguistic steganalysis attempts to detect secret messages in natural language texts. In linguistic steganography, lexical, syntactic, or semantic properties of texts are manipulated to conceal information while their meanings are preserved as much as possible. Due to the diversity of syntax and the polysemy of semantics in natural language, it is hard to observe the changes in stego texts. So far, many linguistic steganalysis methods have been proposed. In these methods, special features are designed to expose semantic or syntactic changes in stego texts. For example, Z.L. Chen et al. designed the N-window mutual information matrix as the detection feature to detect semantic steganography algorithms. Furthermore, they used word entropy and the change of word location as semantic features, which improved the detection rates of their methods. Similarly, C.M. Taskiran et al. used a probabilistic context-free grammar to design special features in order to attack syntactic steganography algorithms. In the work mentioned above, the designed features strongly affect the final performance, and they can only reveal local properties of texts. Accordingly, when the size of a text is large enough, the differences between natural texts (NTs) and stego texts (STs) are evident, and the detection performance of the mentioned methods is acceptable. When the texts become small, however, the detection rates decrease dramatically and are not satisfactory for applications. In addition, some steganographic tools have been improved in their semantic and syntactic features for better camouflage. Therefore, linguistic steganalysis still needs further research to resolve these problems. Some further work on text steganalysis is discussed below.
A. Linguistic Steganalysis Based on Meta Features and Immune Mechanism
Linguistic steganalysis depends on effective detection features because of the diversity of syntax and the polysemy of semantics in natural language processing. The work presents a novel linguistic steganalysis approach based on meta features and the immune clone mechanism. First, meta features are used to represent texts. Then the immune clone mechanism is exploited to select appropriate features so as to constitute effective detectors. The approach employs meta features as detection features, which is an opposite view from the past literature. Moreover, the immune training process consists of two phases, which can identify two kinds of stego texts respectively. The constituted detectors have the capability of blind steganalysis to a certain extent. Experiments show that the proposed approach achieves better performance than typical existing methods, especially in detecting short texts. When text sizes are limited to 3 kB, detection accuracies exceed 95%.
B. Research on Steganalysis for Text Steganography Based on Font Format
In the research area of text steganography, algorithms based on font format have the advantages of large capacity, good imperceptibility, and a wide range of application. However, little work on steganalysis for such algorithms has been reported in the literature. Based on the fact that the statistical features of font format are changed by font-format-based steganographic algorithms, the authors present a novel Support Vector Machine-based steganalysis algorithm to detect whether hidden information exists. This algorithm can not only effectively detect the existence of hidden information, but also estimate the hidden information length according to variations in font attribute values. As shown by experimental results, the detection accuracy of the algorithm reaches as high as 99.3 percent when the hidden data length is at least 16 bits.
3.5 Audio steganalysis Algorithms
Audio steganalysis is very difficult because of the existence of advanced audio steganography schemes, and because the very nature of audio signals as high-capacity data streams necessitates scientifically challenging statistical analysis.
A. Phase and Echo Steganalysis
Zeng et al. proposed steganalysis algorithms to detect phase coding steganography based on the analysis of phase discontinuities and to detect echo steganography based on the statistical moments of peak frequency. The phase steganalysis algorithm exploits the fact that phase coding corrupts the extrinsic continuities of the unwrapped phase in each audio segment, causing changes in the phase difference. A statistical analysis of the phase difference in each audio segment can be used to monitor the change and train classifiers to differentiate an embedded audio signal from a clean audio signal. The echo steganalysis algorithm statistically analyzes the peak frequency using short window extraction and then calculates the eighth high-order central moments of the peak frequency as feature vectors that are fed to a support vector machine, which is used as a classifier to distinguish between audio signals with and without hidden data.
B. Universal Steganalysis based on Recorded Speech
Johnson et al. proposed a generic universal steganalysis algorithm that bases its analysis on the statistical regularities of recorded speech. Their statistical model decomposes an audio signal (i.e., recorded speech) using basis functions localized in both the time and frequency domains in the form of the Short-Time Fourier Transform (STFT). The spectrograms collected from this decomposition are analyzed using non-linear support vector machines to differentiate between cover and stego audio signals. This approach is likely to work only for high bit-rate audio steganography and will not be effective for detecting low bit-rate embeddings.
C. Use of Statistical Distance Measures for Audio Steganalysis
H. Ozer et al. investigated the distribution of various statistical distance measures on cover audio signals and stego audio signals vis-à-vis their versions without noise and observed them to be statistically different. The authors employed audio quality metrics to capture the anomalies introduced into the signal by the embedded data. They designed an audio steganalyzer that relied on the choice of audio quality measures, which were tested depending on their perceptual or non-perceptual nature. The selection of the proper features and quality measures was conducted using the
(i) ANOVA test, to determine whether there are any statistically significant differences between available conditions, and the
(ii) SFS (Sequential Floating Search) algorithm, which considers the inter-correlation between the test features in ensemble.
Subsequently, two classifiers, one based on linear regression and the other based on support vector machines, were used and simultaneously evaluated for their capability to detect stego messages embedded in the audio signals. The features selected using the SFS test and evaluated using the support vector machines produced the best outcome. The perceptual-domain measures measured
StegSpy is a program under continuous development. The latest version allows identification of a “steganized” file. StegSpy will detect steganography and the program used to hide the message. The latest version also identifies the location of the hidden content. StegSpy currently identifies the following programs:
Hiderman
Masker
Invisible Secrets
JPEGx
StegSpy is a software tool designed to detect the presence of data that has been hidden using steganography. Steganography is a method used to embed hidden data within another file. The file containing the data, or carrier file, serves as an innocuous medium used to covertly transport the underlying data, or payload. When combined, these two form the stegoed file. The process of detecting steganography is called steganalysis. StegSpy conducts steganalysis by locating specific hexadecimal byte patterns within the raw data of suspected stegoed files to determine whether those files contain hidden data.
Example
StegSpy’s main interface consists of an Information window and a Run button, as depicted in Table.4.3.2.
Table.4.3.2: User Interface
Table.4.3.3: Steganalysis Result
After StegSpy inspects a file, the results of the analysis appear in the Information window. If StegSpy identifies the inspected file as a stegoed file, the program used to embed the hidden data is identified in the Information window along with the offset location of the detected signature within the stegoed file. If StegSpy recognizes the examined file as a clean file, the path of the file appears in the Information window along with the message “Sorry, no Steg found.” Figures 4.3.2 and 4.3.3 depict the results of a positive analysis and a negative analysis respectively.
4.4 VSL application
Many applications dedicated to steganography are simply command line tools, which limits their
usability. Also, most of them implement only one technique, commonly some variation of LSB
embedding. The same is true of steganalysis applications.
VSL, on the other hand, provides an easy to use yet powerful framework for applying many methods
at the same time. Since VSL is a graphical block diagramming tool, it allows complex processing
that can be performed in both batch and parallel form (see screenshots). It can also be operated
even by relatively inexperienced users, as it provides a clear graphical user interface
with drag-and-drop support.
Besides its GUI, the application delivers several ready-to-use steganographic and steganalysis
techniques. Data can be hidden with the basic Least Significant Bit (LSB) method, with the more
advanced Karhunen-Loeve Transform (KLT) method, or with the F5 algorithm, which uses the DCT
transformation in JPEG files. For steganalysis, two advanced techniques can be used. The first
is RS-Analysis, an efficient steganalysis method for LSB embedding. The second is the Binary
Similarity Measures (BSM) method with a Support Vector Machine (SVM) classifier, a blind
(universal) steganalysis technique intended to detect steganography regardless of the embedding method.
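To make the basic LSB method concrete, the following Python sketch hides a message in the least significant bits of a sequence of carrier bytes and reads it back. It is a simplified illustration and not VSL's code: the function names are hypothetical, the carrier is treated as a flat byte sequence rather than a decoded image, and the message length is assumed to be known to the receiver.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Hide message in the least significant bit of each carrier byte."""
    # Expand the message into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read back length bytes from the least significant bits of stego."""
    out = bytearray()
    for i in range(length):
        value = 0
        for carrier_byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier_byte & 1)
        out.append(value)
    return bytes(out)
```

Each carrier byte changes by at most one, which is why LSB embedding is hard to see in an image. It does, however, alter the statistics of the lowest bit plane, and that is the property that methods such as RS-Analysis look for.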
VSL also includes many other modules: several distortion techniques, which can be used to test
the robustness of a steganographic technique, and built-in modules that help with research,
reports, file handling, image analysis, etc.
Free and open source software
The application is licensed under the GNU GPLv3 license, which is included in the distribution
package. Anyone is free to use, adapt and share this software free of charge, as long as the
license is not violated.
Platform independent
Virtual Steganographic Laboratory is written in Java, so it is cross-platform software and can be
run on any operating system that has Java installed (version 1.5 or later is required).
Example
Fig 4.4.1 VSL Tool Snapshot
Fig 4.4.2 VSL Tool Snapshot
Fig 4.4.3 VSL Tool Snapshot
4.5 Ben-4D Steganalysis
Ben-4D is designed for quick and accurate identification of stego-carrier files within a batch of
files. It applies a generalisation of the basic principles of Benford’s Law distribution to the suspect file.
Figure 4.5 Hit rate comparison between ‘Ben-4D’ and other tools.
Ben-4D is a tool to be used when images are recovered through file carving and the metadata of the pictures in question may be missing or unreliable for whatever reason, e.g. when examining partially recovered images for steganography.
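The basic form of Benford's Law says that the leading digit d (1 to 9) of many naturally occurring numbers appears with probability log10(1 + 1/d). The Python sketch below compares the first-digit distribution of a list of integers with that expectation. It illustrates only the basic law. The generalised form used by Ben-4D and the values it extracts from a file are not described in this report, so the function names and the chi-square style deviation score are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Iterable, List


def benford_expected() -> List[float]:
    """Expected probability of leading digits 1..9 under basic Benford's Law."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(n: int) -> int:
    """Return the leading decimal digit of a non-zero integer."""
    n = abs(n)
    while n >= 10:
        n //= 10
    return n


def first_digit_distribution(values: Iterable[int]) -> List[float]:
    """Observed relative frequency of leading digits 1..9 (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]


def benford_deviation(values: Iterable[int]) -> float:
    """Chi-square style distance between observed and expected distributions."""
    observed = first_digit_distribution(values)
    expected = benford_expected()
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

A larger deviation means the data follows the law less closely. Any threshold used to flag a file as suspicious would have to be calibrated on known clean and known stego files.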
Figure 4.5.2 Ben-4D’s hit rates and false positives (hidden data: 1 KB)
Before testing ‘Ben-4D’, the stego-carrier files were separated into three different groups of five hundred original files each. For each group, JPHSWin, Camouflage and Invisible Secrets were used to embed the smallest possible file, which was an ASCII text file (1 KB).
Illustration
Example 1-
James was arrested for violating the child pornography laws of the USA. He was using steganography to hide child pornography in greeting card images, and he sent those images to his clients using Yahoo Mail.
Image-
Encoded image containing hidden child pornography content
How can steganalysis be used to decode the data?
Solution-
Step 1 - Use the Stegdetect tool to check whether data is hidden inside the image.
- Open the tool
- Open Christmas.png
- Analyze
Figure 5.1 Stegdetect analysis of hidden data
Figure 5.1 shows that the file is a stego-image, which means that data is hidden inside the image. Now we can move on to the next step.
Step 2 - Use the Digital Invisible Ink Toolkit (DIIT) to retrieve the encoded data.
- Run the Digital Invisible Ink Toolkit
- Select the Decode tab
- Open Get Image and select the encoded image Hidden.png
Figure 5.2- DIIT Open Encoded Image
Step 3 – Try different steganography algorithms to decode the data.
- Try all algorithms one by one
- Click on OK
Figure 5.3 Select and try steganography algorithm one by one
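The trial-and-error procedure in Step 3 can also be automated. The Python sketch below tries a set of decoder functions in turn and stops at the first one whose output looks like a real file or readable text. It does not use DIIT: the decoders are supplied by the caller, and the names `try_decoders`, `looks_plausible` and `MAGIC` are hypothetical. Only the file signatures (JPEG, PNG, ZIP, PDF) are standard values.

```python
from typing import Callable, Dict, Optional, Tuple

# Well-known leading bytes ("magic numbers") of a few common file types.
MAGIC: Dict[bytes, str] = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"PK\x03\x04": "zip",
    b"%PDF": "pdf",
}


def looks_plausible(payload: bytes) -> Optional[str]:
    """Guess whether decoded bytes form a known file type or printable text."""
    for magic, kind in MAGIC.items():
        if payload.startswith(magic):
            return kind
    if payload:
        try:
            text = payload.decode("ascii")
        except UnicodeDecodeError:
            return None
        if all(ch.isprintable() or ch in "\r\n\t" for ch in text):
            return "text"
    return None


def try_decoders(
    stego: bytes, decoders: Dict[str, Callable[[bytes], bytes]]
) -> Optional[Tuple[str, str, bytes]]:
    """Try each decoder in order; return (decoder name, type guess, payload)."""
    for name, decode in decoders.items():
        try:
            payload = decode(stego)
        except Exception:
            # A decoder that fails on this input simply does not match.
            continue
        kind = looks_plausible(payload)
        if kind is not None:
            return name, kind, payload
    return None
```

The plausibility check is a heuristic. An encrypted payload would look like random bytes and would not be recognised, so a negative result does not prove that nothing is hidden.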
Step 4 – Final step: save the hidden data
- Select a save location on your computer
- Enter a file name and save
Figure 5.4 Save Hidden data file
Decoded child pornography content-
Example 2 -
Al Qaeda terrorists used the internet in public places and sent messages via public e-mail services. Their secret communications about their activities were often concealed using steganography: they hid their messages inside ordinary images.
Figure 5.5 Image with hidden communication data
Solution-
Step 1- Analyze the image for steganography using the DIIT tool.
- Open DIIT
- Select the Analysis tab
- Click on Steganalysis
- Go
Figure 5.6 Analysis of whether data is hidden in the image
Step 2 – Start decoding the image
Figure 5.7 Select the algorithm for decoding
Figure 5.8 Save Hidden File
Figure 5.8 Decoded Message file
CONCLUSION
From the information presented in this report, it would be hard to come to a firm
conclusion concerning the state of steganalysis tools. Since this is not an extensive study based
on large data sets, any such conclusion would be debatable. However, it can
be said that steganalysis is not as straightforward or as convenient as steganography. This
translates into a great advantage for those who hide secrets using steganography, and a huge
difficulty for forensic analysts, who face the challenge of detecting and recovering the hidden
messages without destroying them. Furthermore, it is also apparent that steganalysis fails when a
tool is applied to a steganographic technique that it was not designed to detect. It has also been
observed that false positives are likely when generic techniques are used to detect properties
such as the randomness of the LSBs. Perhaps with more data and research, these tools can be made
more reliable and accurate. As steganography tools are easily available in many varieties to
anyone who intends to keep or communicate secrets, and with emerging signs of their use in
various arenas, forensic analysts face new challenges in their investigations. Criminals will
exploit every chance available to ensure the success of their plans. This could involve mass
circulation of terror plans over the Internet, or even more covert means of transmitting and
storing banned content on portable storage devices.
I. GLOSSARY
Term Definition
Least significant bit (LSB)
In computing, the least significant bit (LSB) is the bit position in a binary integer giving the units value, that is, determining whether the number is even or odd.
Most significant bit (MSB)
In computing, the most significant bit (MSB, also called the high-order bit) is the bit position in a binary number having the greatest value. The MSB is sometimes referred to as the left-most bit due to the convention in positional notation of writing more significant digits further to the left.
Semantic Business Vocabulary and Rules (SBVR)
The SBVR defines the vocabulary and rules for documenting the semantics of business vocabularies, business facts, and business rules, as well as an XMI schema for the interchange of business vocabularies and business rules among organizations and between software tools.
Class Model
A class diagram is a type of static structural diagram that describes the structure of a system by showing the real-world entities in the business, their attributes, operations (or methods), and the relationships among the classes.
Use Case Model
A use case model or diagram represents the interaction between the end user (actor) and the system under consideration.
Software Requirement Specification (SRS)
A software requirements specification (SRS) is a complete description of the behavior of a system to be developed, which includes all the requirements necessary for system development.
XML Metadata Interchange (XMI)
The XML Metadata Interchange (XMI) is a standard for exchanging metadata information via Extensible Markup Language (XML). The most common use of XMI is as an interchange format for UML models, although it can also be used for serialization of models of other languages (metamodels).
MD5
The MD5 message-digest algorithm is a widely used cryptographic hash function producing a 128-bit (16-byte) hash value, typically expressed in text format as a 32-digit hexadecimal number. MD5 has been utilized in a wide variety of cryptographic applications, and is also commonly used to verify data integrity.
II. ABBREVIATIONS
Acronym Definition
LSB Least significant bit
MSB Most significant bit
RQP Raw Quick Pair
SBVR Semantic Business Vocabulary and Rules
SRS Software Requirement Specification
TC Test Cases
UML Unified Modeling Language
XMI XML Metadata Interchange
MD5 Message-Digest algorithm 5
III. LIST OF FIGURES
FIGURE NO. DESCRIPTION OF THE FIGURE PAGE NO.
1.1 Software Analysis Process 02
3.1 Image with Hidden Data 06
3.2 Hex Editor view 07
3.3 Data.bmp 08
3.4 Data.bmp without steganography 09
3.5 Data.bmp with steganography 10
4.1 Stego Image Snapshot 20
4.2 DIIT Snapshot 21
4.3 Importing the Text File as Input to System 22
4.4 VSL Tool Snapshot 1 23
4.5 VSL Tool Snapshot 2 24
IV. LIST OF TABLES
TABLE NO. DESCRIPTION OF TABLE PAGE NO.
4.1 MD5 HASH TABLE 21
4.2 Keywords and Phrases for Logical Formulations 24