AN OPTIMAL DATA HIDING SCHEME WITH TREE-BASED PARITY CHECK
ABSTRACT
Steganography is defined as the science of hiding or
embedding data in a transmission medium. The word steganography is
derived from two Greek words that together mean covered writing.
Steganalysis is the science of attacking steganography. Steganography
studies schemes for hiding secrets in the communication between
the sender and the receiver such that no one else can detect
the existence of the secrets. A steganographic method consists of
an embedding algorithm and an extraction algorithm. The embedding
algorithm describes how to hide a message into the cover object and
the extraction algorithm illustrates how to extract the message
from the stego object. A commonly used strategy for steganography
is to embed the message by lightly distorting the cover object into
the target stego object. If the distortion is sufficiently small,
the stego object will be indistinguishable from the noisy cover
object. Therefore, reducing distortion is a crucial issue for
steganographic methods. We propose an efficient embedding scheme
that uses the least number of changes over the tree-based parity
check model. By using the majority parity check (MPC) algorithm
instead of the TBPC method, we obtain less distortion in the stego
image. Keeping the height of the tree constant, we can embed more
data than with the TBPC method. By introducing the majority vote
strategy, we effectively construct the stego object with the least
distortion under the tree structure model. We also show that our
method yields a binary linear stego-code. In comparison with the
TBPC method, our method significantly reduces the number of
modifications on average.
TABLE OF CONTENTS
CHAPTER NO.    TITLE
ABSTRACT
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
LIST OF TABLES
1. CHAPTER 1: INTRODUCTION
1.1 GENERAL
1.1.1 THE IMAGE PROCESSING SYSTEM
1.1.2 IMAGE PROCESSING FUNDAMENTAL
1.2 OBJECTIVE
1.3 EXISTING SYSTEM
1.3.1 EXISTING SYSTEM DISADVANTAGES
1.3.2 LITERATURE SURVEY
1.4 PROPOSED SYSTEM
1.4.1 PROPOSED SYSTEM ADVANTAGES
2. CHAPTER 2: AN OPTIMAL DATA HIDING SCHEME WITH
TREE-BASED PARITY CHECK
2.1 GENERAL
2.1.1 STEGANOGRAPHY
2.1.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES
2.1.3 WATERMARKING
2.2 PRINCIPLE OF DIGITAL WATERMARKS
2.3 STRUCTURE OF A DIGITAL WATERMARK
2.4 THE IMPORTANCE OF DIGITAL WATERMARKS
2.5 THE PURPOSES OF DIGITAL WATERMARKS
2.6 DIGITAL WATERMARK TYPES AND TERMS
2.7 EFFECTIVE DIGITAL WATERMARKS
2.8 PROBLEM DEFINITION
2.9 METHODOLOGIES
2.9.1 MODULES NAME
2.3.2 MODULES DESCRIPTION
2.3.3 GIVEN INPUT AND EXPECTED OUTPUT
2.10 TECHNIQUE OR ALGORITHM
3. CHAPTER 3: REQUIREMENTS
3.1 GENERAL
3.2 HARDWARE REQUIREMENTS
3.3 SOFTWARE REQUIREMENTS
4. CHAPTER 4: SOFTWARE SPECIFICATION
5.1 GENERAL
5.2 FEATURES OF MATLAB
5.2.1 INTERFACING WITH OTHER LANGUAGES
5.2.2 ANALYZING AND ACCESSING DATA
5.2.3 PERFORMING NUMERIC COMPUTATION
5. CHAPTER 5: IMPLEMENTATION
6.1 GENERAL
6.2 IMPLEMENTATION CODING
6. CHAPTER 6: SNAPSHOTS
7.1 SNAPSHOTS
7. CHAPTER 7: APPLICATION AND FUTURE ENHANCEMENT
9.1 GENERAL
9.2 APPLICATIONS
9.3 FUTURE ENHANCEMENTS
8. CHAPTER 8:
10.1 CONCLUSION
10.2 REFERENCES
LIST OF FIGURES
FIGURE NO.    NAME OF THE FIGURE
1.1 A BLOCK DIAGRAM FOR IMAGE PROCESSING SYSTEM
1.2 BLOCK DIAGRAM OF FUNDAMENTAL SEQUENCE INVOLVED IN AN IMAGE
PROCESSING SYSTEM
1.3 IMAGE PROCESSING TECHNIQUES
1.4 GENERAL METHOD FOR FEATURE BASED WATERMARKING
2.1 BASIC BLOCK DIAGRAM OF STEGANOGRAPHY
2.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES
2.9 TREE FORMATION
2.9.1 MASTER TREE FORMATION
2.9.2 OPTIMIZED TREE
LIST OF ABBREVIATIONS
NMR - NUCLEAR MAGNETIC RESONANCE
TBPC - TREE BASED PARITY CHECKING
BER - BIT ERROR RATE
D/A - DIGITAL-TO-ANALOG
A/D - ANALOG-TO-DIGITAL
PSNR - PEAK SIGNAL-TO-NOISE RATIO
MPC - MAJORITY PARITY CHECKING
CHAPTER 1
INTRODUCTION
1.1 GENERAL
The term digital image processing refers to the processing of a
two-dimensional picture by a digital computer. In a broader context,
it implies digital processing of any two-dimensional data. A digital
image is an array of real or complex numbers represented by a
finite number of bits. An image given in the form of a
transparency, slide, photograph or X-ray is first digitized and
stored as a matrix of binary digits in computer memory. This
digitized image can then be processed and/or displayed on a
high-resolution television monitor. For display, the image is
stored in a rapid-access buffer memory, which refreshes the monitor
at a rate of 25 frames per second to produce a visually continuous
display.
1.1.1 THE IMAGE PROCESSING SYSTEM
DIGITIZER
A digitizer converts an image into a numerical
representation suitable for input into a digital computer. Some
common digitizers are:
Microdensitometer
Flying spot scanner
Image dissector
Vidicon camera
Photosensitive solid-state arrays
IMAGE PROCESSOR
An image processor performs the functions of image acquisition,
storage, preprocessing, segmentation, representation, recognition
and interpretation, and finally displays or records the resulting
image. The following block diagram gives the fundamental sequence
involved in an image processing system.
As detailed in the diagram, the first step in the process is
image acquisition by an imaging sensor in conjunction with a
digitizer to digitize the image. The next step is the preprocessing
step, where the image is improved before being fed as an input to
the other processes. Preprocessing typically deals with enhancing,
removing noise, isolating regions, etc. Segmentation partitions an image
into its constituent parts or objects. The output of segmentation
is usually raw pixel data, which consists of either the boundary of
the region or the pixels in the region themselves. Representation
is the process of transforming the raw pixel data into a form
useful for subsequent processing by the computer. Description deals
with extracting features that are basic in differentiating one
class of objects from another. Recognition assigns a label to an
object based on the information provided by its descriptors.
Interpretation involves assigning meaning to an ensemble of
recognized objects. The knowledge about a problem domain is
incorporated into the knowledge base. The knowledge base guides the
operation of each processing module and also controls the
interaction between the modules. Not all modules need to be
present for a specific function. The composition of the
image processing system depends on its application. The frame rate
of the image processor is normally around 25 frames per second.
DIGITAL COMPUTER
Mathematical processing of the digitized image, such as
convolution, averaging, addition, subtraction, etc., is done by the
computer.
MASS STORAGE
The secondary storage devices normally used are floppy disks,
CD-ROMs, etc.
HARD COPY DEVICE
The hard copy device is used to produce a permanent copy of the
image and for the storage of the software involved.
OPERATOR CONSOLE
The operator console consists of equipment and arrangements for
verification of intermediate results and for alterations in the
software as and when required. The operator is also capable of
checking for any resulting errors and for the entry of requisite
data.
1.1.2 IMAGE PROCESSING FUNDAMENTAL
Digital image processing refers to the processing of an image in
digital form. Modern cameras may capture the image directly in
digital form, but generally images originate in optical form. They
are captured by video cameras and digitized. The digitization
process includes sampling and quantization. These images are then
processed by at least one of the five fundamental processes, not
necessarily all of them.
IMAGE PROCESSING TECHNIQUES
This section describes various image processing techniques.
FIG 1.3: IMAGE PROCESSING TECHNIQUES
IMAGE ENHANCEMENT
Image enhancement operations improve the qualities of an image,
for example by improving its contrast and brightness
characteristics, reducing its noise content, or sharpening its
details. Enhancement reveals the same information in a more
understandable image. It does not add any information to it.
IMAGE RESTORATION
Image restoration, like enhancement, improves the qualities of an
image, but all the operations are based on known or measured
degradations of the original image. Image restoration is used to
restore images with problems such as geometric distortion,
improper focus, repetitive noise, and camera motion. It is used to
correct images for known degradations.
IMAGE ANALYSIS
Image analysis operations produce numerical or graphical
information based on characteristics of the original image. They
break the image into objects and then classify them. They depend on
the image statistics. Common operations are extraction and
description of scene and image features, automated measurements,
and object classification. Image analysis is mainly used in machine
vision applications.
IMAGE COMPRESSION
Image compression and decompression reduce the data content
necessary to describe the image. Most images contain a lot of
redundant information, and compression removes this redundancy.
Because compression reduces the size, the image can be stored or
transported efficiently. The compressed image is decompressed when
displayed. Lossless compression preserves the exact data of the
original image, whereas lossy compression does not reproduce the
original image exactly but provides much higher compression.
IMAGE SYNTHESIS
Image synthesis operations create images from other images or
non-image data. Image synthesis operations generally create images
that are either physically impossible or impractical to acquire.
APPLICATIONS OF DIGITAL IMAGE PROCESSING
Digital image processing has a broad spectrum of applications,
such as remote sensing via satellites and other spacecraft, image
transmission and storage for business applications, medical
processing, radar, sonar and acoustic image processing, robotics
and automated inspection of industrial parts.
MEDICAL APPLICATIONS
In medical applications, one is concerned with processing of
chest X-rays, cineangiograms, projection images of transaxial
tomography and other medical images that occur in radiology,
nuclear magnetic resonance (NMR) and ultrasonic scanning. These
images may be used for patient screening and monitoring or for
detection of tumors or other disease in patients.
SATELLITE IMAGING
Images acquired by satellites are useful in tracking of earth
resources; geographical mapping; prediction of agricultural crops,
urban growth and weather; flood and fire control; and many other
environmental applications. Space image applications include
recognition and analysis of objects contained in images obtained
from deep space-probe missions.
COMMUNICATION
Image transmission and storage applications occur in broadcast
television, teleconferencing, transmission of facsimile images
for office automation, communication over computer networks,
closed-circuit television based security monitoring systems and
military communications.
RADAR IMAGING SYSTEMS
Radar and sonar images are used for detection and recognition of
various types of targets or in guidance and maneuvering of aircraft
or missile systems.
DOCUMENT PROCESSING
It is used in scanning and transmission for converting paper
documents to a digital image form, compressing the image, and
storing it on magnetic tape. It is also used in document reading
for automatically detecting and recognizing printed characters.
DEFENSE/INTELLIGENCE
It is used in reconnaissance photo-interpretation for automatic
interpretation of earth satellite imagery to look for sensitive
targets or military threats, and in target acquisition and guidance
for recognizing and tracking targets in real-time smart-bomb and
missile-guidance systems.
1.2 OBJECTIVE
Distortion between the cover object and the stego object is an
important issue for steganography, and reducing it is the main goal
of our project. The tree-based parity check method is very efficient
for hiding a message in image data due to its simplicity. More
distortion means lower image quality, that is, a lower PSNR value
of the image, and makes it easier for an attacker to detect that a
secret message is present in the image.
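The link between distortion and PSNR mentioned above can be made concrete with a minimal sketch. It assumes 8-bit grayscale pixels held in plain Python lists; the function name `psnr` and the sample pixel values are illustrative and are not taken from the project code.

```python
import math

def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(cover) != len(stego):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error between cover and stego pixels.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)

# Illustrative pixel values (assumed, not from the project).
cover = [120, 121, 119, 200, 201, 50, 51, 52]
one_change = [cover[0] ^ 1] + cover[1:]                # one LSB flipped
four_changes = [p ^ 1 for p in cover[:4]] + cover[4:]  # four LSBs flipped

print(psnr(cover, one_change))
print(psnr(cover, four_changes))
```

Fewer modified pixels give a smaller mean squared error and therefore a higher PSNR, which is why the number of changes made during embedding matters.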
Based on this approach, we propose a majority vote strategy that
results in the least distortion for finding a stego object. The lower
embedding efficiency of our method is better than that of previous
works when the hidden message length is relatively large.
1.3 EXISTING SYSTEM
Matrix embedding, also called syndrome coding or coset encoding,
uses linear codes. It embeds and extracts a message by using the
parity check matrix of a linear code. TBPC (Tree Based Parity
Checking) is the existing method. It does not achieve a good PSNR
value for the stego image.
1.3.1 DISADVANTAGES OF EXISTING SYSTEM
For matrix embedding, finding the stego object with the least
distortion is difficult in general.
Embedding efficiency is low.
Time complexity is high.
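As an illustration of the matrix embedding idea described in 1.3, the sketch below uses the binary [7,4] Hamming code. That choice is an assumption made for the example, since the text does not name a specific linear code. Column j of the parity check matrix is the binary form of j, so the syndrome is simply the XOR of the positions of the 1 bits. The function names are hypothetical.

```python
def syndrome(bits):
    """Syndrome H*x over GF(2) for 7 bits, where column j of H is j in binary."""
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s  # an integer 0..7 standing for the 3 syndrome bits

def hamming_embed(cover_bits, message):
    """Hide a 3-bit message (0..7) in 7 cover bits by flipping at most one bit."""
    stego = list(cover_bits)
    diff = syndrome(stego) ^ message
    if diff:
        # Flipping bit number `diff` changes the syndrome by exactly `diff`.
        stego[diff - 1] ^= 1
    return stego

def hamming_extract(stego_bits):
    """The receiver recovers the message as the syndrome of the stego bits."""
    return syndrome(stego_bits)

cover_bits = [1, 0, 0, 1, 1, 0, 1]
stego_bits = hamming_embed(cover_bits, 5)
print(stego_bits, hamming_extract(stego_bits))
```

Three message bits are carried by seven cover bits with at most one modification, which is the kind of trade-off between payload and distortion that embedding efficiency measures.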
LITERATURE SURVEY:
1. J. Fridrich, "Asymptotic behavior of the ZZW embedding
construction," IEEE Trans. Inf. Forensics Security, vol. 4, no. 1,
pp. 151-154, Mar. 2009.
We analyze asymptotic behavior of the embedding construction for
steganography proposed by Zhang, Zhang, and Wang (ZZW) at the 10th
Information Hiding workshop by deriving a closed form expression
for the limit between embedding efficiency of the ZZW construction
and the theoretical upper bound as a function of relative payload.
This result confirms the experimental observation made in the
original publication.
2. R. Y. M. Li, O. C. Au, K. K. Lai, C. K. Yuk, and S.-Y. Lam,
"Data hiding with tree based parity check," in Proc. IEEE Int.
Conf. Multimedia and Expo (ICME 07), 2007, pp. 635-638.
In this paper, we propose a novel algorithm, namely tree based
parity check (TBPC), that can be applied to most of the existing
data hiding algorithms to achieve improvement in visual quality. In
the data hiding process, distortion is created when the original
image is modified. Most existing data hiding algorithms try to
minimize the visual artifacts introduced by the modifications. The
proposed algorithm tries to reduce the probability of modifying the
original host image. Theoretical analysis and experimental results
are given in this paper. Both measures suggest that an improvement
in visual quality is achieved in the watermarked image.
3. W. Zhang and S. Li, "A coding problem in steganography,"
Designs, Codes Cryptogr., vol. 46, no. 1, pp. 68-81, 2008.
To study how to design a steganographic algorithm more efficiently,
a new coding problem, steganographic codes (abbreviated
stego-codes), is presented in this paper. The stego-codes are
defined over the field with q (q ≥ 2) elements. A method of
constructing linear stego-codes is proposed by using the direct sum
of vector subspaces. The problem of linear stego-codes is converted
to an algebraic problem by introducing the concept of the t-th
dimension of a vector space. Some bounds on the length of
stego-codes are obtained, from which the maximum length embeddable
(MLE) code arises. It is shown that there is a corresponding
relation between MLE codes and perfect error-correcting codes.
Furthermore, the classification of all MLE codes and a lower bound
on the number of binary MLE codes are obtained based on the
corresponding results on perfect codes. Finally, hiding redundancy
is defined to value the performance of stego-codes.
4. W. Zhang, X. Zhang, and S. Wang, "Maximizing steganographic
embedding efficiency by combining hamming codes and wet paper
codes," in Proc. Int. Workshop Inf. Hiding (IH 08), 2008, vol. LNCS
5284, pp.
For good security and large payload in steganography, it is desired
to embed as many messages as possible per change of the
cover-object, i.e., to have high embedding efficiency.
Steganographic codes derived from covering codes can improve
embedding efficiency. In this paper, we propose a new method to
construct stego-codes, showing that not just one but a family of
stego-codes can be generated from one covering code by combining
Hamming codes and wet paper codes. This method can enormously
expand the set of embedding schemes as applied in steganography.
Performances of stego-code families of structured codes and random
codes are analyzed. By using the stego-code families of LDGM codes,
we obtain a family of near optimal embedding schemes for binary
steganography and ±1 steganography, respectively, which can approach
the upper bound of embedding efficiency for various chosen
embedding rates.
5. M. Khatirinejad and P. Lisonek, "Linear codes for high payload
steganography," Discrete Applied Math., vol. 157, no. 5, pp.
971-981, 2009.
Steganography is concerned with communicating hidden messages in
such a way that no one apart from the sender and the intended
recipient can detect the very existence of the message. We study
the syndrome coding method (sometimes also called matrix embedding
method), which uses a linear code as an ingredient. Among all codes
of a fixed block length and fixed dimension (and thus of a fixed
information rate), an optimal code is one that makes it most
difficult for an eavesdropper to detect the presence of the hidden
message. We show that the average distance to code is the
appropriate concept that replaces the covering radius for this
particular application. We completely classify the optimal codes in
the cases when the linear code used in the syndrome coding method
is a 1- or 2-dimensional code over GF(2). In the steganography
application this translates to cases when the code carries a high
payload (has a high information rate).
1.4 PROPOSED METHOD
We propose that the toggle criteria of a node in the TBPC method
can be relaxed by the strategy of majority vote. Our strategy
inherits the efficiency of the TBPC method and produces a stego
object with the least distortion under the tree-based parity check
model.
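The majority vote idea can be sketched as follows. This is an illustrative reading, not the project's implementation, and it rests on assumptions not stated in this section: the cover bits fill a complete N-ary tree stored in level order, each message bit is the parity of one root-to-leaf path, a node is toggled only when strictly more than half of its children need toggling, and ties are left alone. All function names are hypothetical.

```python
def path_parities(bits, n_ary, height):
    """Parity of the root-to-leaf path for every leaf of a complete N-ary tree.

    The tree is stored in level order: the children of node i are
    n_ary*i + 1 ... n_ary*i + n_ary.
    """
    first_leaf = len(bits) - n_ary ** height
    parities = []
    for leaf in range(first_leaf, len(bits)):
        parity, node = bits[leaf], leaf
        while node > 0:
            node = (node - 1) // n_ary
            parity ^= bits[node]
        parities.append(parity)
    return parities

def embed_majority(cover, message, n_ary, height):
    """Embed one message bit per leaf, merging toggles by majority vote."""
    leaves = n_ary ** height
    nodes = sum(n_ary ** d for d in range(height + 1))
    if len(cover) != nodes or len(message) != leaves:
        raise ValueError("cover or message length does not match the tree")
    first_leaf = nodes - leaves
    toggle = [0] * nodes
    # A leaf needs toggling when its path parity differs from the message bit.
    for i, (p, m) in enumerate(zip(path_parities(cover, n_ary, height), message)):
        toggle[first_leaf + i] = p ^ m
    # Bottom-up pass: toggling a node together with all of its children
    # leaves every path parity unchanged, so do it whenever it saves flips.
    for node in range(first_leaf - 1, -1, -1):
        children = range(n_ary * node + 1, n_ary * node + n_ary + 1)
        if 2 * sum(toggle[c] for c in children) > n_ary:
            toggle[node] ^= 1
            for c in children:
                toggle[c] ^= 1
    return [c ^ t for c, t in zip(cover, toggle)]

def extract(stego, n_ary, height):
    """The receiver reads the message as the path parities of the stego tree."""
    return path_parities(stego, n_ary, height)

# Ternary tree of height 2: 13 cover bits carry 9 message bits.
cover = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
message = [0, 1, 1, 0, 1, 0, 0, 1, 1]
stego = embed_majority(cover, message, 3, 2)
print(sum(c != s for c, s in zip(cover, stego)), extract(stego, 3, 2) == message)
```

Under these assumptions, toggling a parent only when all of its children need toggling corresponds to the stricter criterion that the majority vote relaxes; for a binary tree the two rules coincide. Each merge step never increases the number of flipped bits, so the result needs no more changes than flipping the leaves directly.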
1.4.1 ADVANTAGES OF PROPOSED SYSTEM
In this method we effectively construct the stego object with the
least distortion under the tree structure model.
This method significantly reduces the number of modifications on
average.
CHAPTER 2
PROJECT DESCRIPTION
2.1 GENERAL
In this project, the TBPC method can be formulated as a matrix
embedding method, but it is more efficient than those based on
linear codes. Due to its simplicity, the TBPC method provides very
efficient embedding and extraction algorithms. There is a
systematic method to generate codes with an arbitrarily small
relative payload from any code with a large relative payload. Since
our method works naturally with large relative payloads, this
implies that our method applies to small relative payloads as well.
2.1.1 STEGANOGRAPHY
Steganography is the art and science of hiding messages.
Steganography and cryptology are similar in that both are used to
protect important information. The difference between the two is
that steganography involves hiding information so that it appears
that no information is hidden at all. If a person views the digital
object in which the information is hidden, he or she will have no
idea that there is any hidden information and therefore will not
attempt to decrypt it. This is the main objective behind
steganography. The word steganography comes from the Greek words
Steganos (Covered) and Graptos (Writing). These days the word
usually refers to information or a file that has been concealed
inside a digital picture, video or audio file. Technically,
steganography exploits the limits of human perception: human senses
are not trained to look for files that have information hidden
inside them, although there are programs available that perform
what is called steganalysis (detecting the use of steganography).
The most common use of steganography is to hide a file inside
another file. When information or a file is hidden inside a carrier
file, the data is usually encrypted with a password.
The basic model of steganography consists of a carrier, a message
and a password. The carrier is also known as the cover-object, in
which the message is embedded and which serves to hide the presence
of the message. The message is the data that the sender wishes to
keep confidential. It can be plain text, cipher text, another
image, or anything that can be embedded in a bit stream, such as a
copyright mark, a covert communication, or a serial number. The
password is known as the stego-key, which ensures that only a
recipient who knows the corresponding decoding key will be able to
extract the message from a cover-object. The cover-object with the
secretly embedded message is then called the stego-object.
Recovering the message from a stego-object requires a corresponding
decoding key if a stego-key was used during the encoding process.
Depending on the application, the original cover image may or may
not be required to extract the message.
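One common way a stego-key can control extraction is by selecting the embedding positions. The sketch below is a minimal illustration under assumptions of my own: the key seeds Python's `random` generator (which is not cryptographically secure and is used here only for clarity), the message bits replace pixel least significant bits, and the function names are hypothetical.

```python
import random

def key_positions(stego_key, cover_length, message_length):
    """Key-dependent list of distinct embedding positions (illustrative helper)."""
    if message_length > cover_length:
        raise ValueError("message does not fit into the cover")
    # The same key always yields the same positions on sender and receiver.
    return random.Random(stego_key).sample(range(cover_length), message_length)

def embed_with_key(cover, message_bits, stego_key):
    """Write message bits into the LSBs of the key-selected pixels."""
    stego = list(cover)
    positions = key_positions(stego_key, len(cover), len(message_bits))
    for pos, bit in zip(positions, message_bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego

def extract_with_key(stego, message_length, stego_key):
    """Read the LSBs back in the same key-dependent order."""
    positions = key_positions(stego_key, len(stego), message_length)
    return [stego[pos] & 1 for pos in positions]

cover = [154, 37, 200, 91, 18, 255, 0, 73, 66, 129, 240, 15, 88, 190, 3, 47]
message_bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_with_key(cover, message_bits, "shared secret")
print(extract_with_key(stego, len(message_bits), "shared secret"))
```

In this sketch the receiver needs only the stego-object, the message length and the key; the original cover is not required.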
The following are suitable carriers for the cover-object:
Network protocols such as TCP, IP and UDP.
Audio in digital audio formats such as wav, midi, avi, mpeg, mpi
and voc.
Files and disks, where files can be hidden and appended by using
the slack space.
Text, for example using null characters in a manner similar to
Morse code, including html and java.
Image files such as bmp, gif and jpg, which can be either color or
gray-scale.
In general, the information hiding process extracts redundant bits
from the cover-object. The process consists of two steps:
Identification of redundant bits in a cover-object. Redundant
bits are those bits that can be modified without corrupting the
quality or destroying the integrity of the cover-object.
Selection of the subset of the redundant bits to be replaced with
data from a secret message. The stego-object is created by
replacing the selected redundant bits with message bits.
Data-hiding techniques should be capable of embedding data in a
host signal with the following restrictions and features:
1. The host signal should be nonobjectionally degraded and the
embedded data should be minimally perceptible. (The goal is for the
data to remain hidden. As any magician will tell you, it is
possible for something to be hidden while it remains in plain
sight; you merely keep the person from looking at it. We will use
the words hidden, inaudible, imperceivable, and invisible to mean
that an observer does not notice the presence of the data, even if
they are perceptible.)
2. The embedded data should be directly encoded into the media,
rather than into a header or wrapper, so that the data remain
intact across varying data file formats.
3. The embedded data should be immune to modifications ranging
from intentional and intelligent attempts at removal to anticipated
manipulations, e.g., channel noise, filtering, resampling,
cropping, encoding, lossy compressing, printing and scanning,
digital-to-analog (D/A) conversion, and analog-to-digital (A/D)
conversion, etc.
4. Asymmetrical coding of the embedded data is desirable, since
the purpose of data hiding is to keep the data in the host signal,
but not necessarily to make the data difficult to access.
5. Error correction coding should be used to ensure data
integrity. It is inevitable that there will be some degradation to
the embedded data when the host signal is modified.
6. The embedded data should be self-clocking or arbitrarily
re-entrant. This ensures that the embedded data can be recovered
when only fragments of the host signal are available, e.g., if a
sound bite is extracted from an interview, data embedded in the
audio segment can be recovered. This feature also facilitates
automatic decoding of the hidden data, since there is no need to
refer to the original host signal.
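Requirement 5 in the list above can be illustrated with the simplest error correcting code, a repetition code with majority decoding. The code choice is an assumption made for illustration; the text does not prescribe a particular code, and the function names are hypothetical.

```python
def encode_repetition(bits, copies=3):
    """Repeat every data bit `copies` times before embedding."""
    return [bit for bit in bits for _ in range(copies)]

def decode_repetition(received, copies=3):
    """Recover each data bit by majority vote over its group of copies."""
    decoded = []
    for start in range(0, len(received), copies):
        group = received[start:start + copies]
        decoded.append(1 if 2 * sum(group) > len(group) else 0)
    return decoded

data = [1, 0, 1]
codeword = encode_repetition(data)

# Simulate degradation of the host signal: one bit error in each group.
damaged = list(codeword)
for position in (0, 4, 8):
    damaged[position] ^= 1

print(decode_repetition(damaged) == data)
```

With three copies per bit, any single error within a group is corrected, at the cost of tripling the number of bits that must be embedded.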
[Block diagram: the cover signal, the embedded data and the stego
key produce the stego signal; after transmission, the stego signal
and the stego key yield the embedded data.]
FIG 2.1: BASIC BLOCK DIAGRAM OF STEGANOGRAPHY.
2.1.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES
Over the past few years, numerous steganography techniques that
embed hidden messages in multimedia objects have been proposed.
There have been many techniques for hiding information or messages
in images in such a manner that the alterations made to the image
are perceptually indiscernible. Common approaches include:
Least significant bit (LSB) insertion.
Masking and filtering.
Transform techniques.
Least significant bit (LSB) insertion is a simple approach to
embedding information in an image file. The simplest steganographic
techniques embed the bits of the message directly into the least
significant bit plane of the cover-image in a deterministic
sequence. Modulating the least significant bit does not result in a
human-perceptible difference because the amplitude of the change is
small.
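A minimal sketch of sequential LSB insertion as described above, assuming 8-bit pixel values in a plain list and message bits written in order from the first pixel. The function names and sample values are illustrative.

```python
def embed_lsb(pixels, message_bits):
    """Replace the least significant bit of the first pixels with message bits."""
    if len(message_bits) > len(pixels):
        raise ValueError("message does not fit into the cover image")
    stego = list(pixels)
    for index, bit in enumerate(message_bits):
        # Clear the LSB, then set it to the message bit.
        stego[index] = (stego[index] & ~1) | bit
    return stego

def extract_lsb(stego, message_length):
    """Read the message back from the least significant bits."""
    return [pixel & 1 for pixel in stego[:message_length]]

pixels = [154, 37, 200, 91, 18, 255, 0, 73]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego_pixels = embed_lsb(pixels, bits)

print(stego_pixels)
print(extract_lsb(stego_pixels, len(bits)))
```

Each pixel changes by at most one gray level, which is why the modification is hard to perceive, although the deterministic order makes this simple scheme easy to analyze.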
Masking and filtering techniques, usually restricted to 24-bit
and gray-scale images, hide information by marking an image in a
manner similar to paper watermarks. These techniques analyze the
image and embed the information in significant areas, so that the
hidden message is more integral to the cover image than if it were
just hidden in the noise level.
Transform techniques embed the message by modulating
coefficients in a transform domain, such as the Discrete Cosine
Transform (DCT) used in JPEG compression, Discrete Fourier
Transform, or Wavelet Transform. These methods hide messages in
significant areas of the cover-image, which makes them more robust
to attack. Transformations can be applied over the entire image, to
blocks throughout the image, or in other variants.
There are several approaches in classifying Steganographic
systems. One could categorize them according to the type of covers
used for secret communication or according to the cover
modifications applied in the embedding process. The second approach
will be followed in this section, and the steganographic methods
are grouped in six categories, although in some cases an exact
classification is not possible. Figure 2.2 presents the
steganography classification.
FIG 2.2: CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES.
The main goal of steganography is to communicate securely in a
completely undetectable manner and to avoid drawing suspicion to
the transmission of hidden data. It is not to keep others from
knowing the hidden information, but it is to keep others from
thinking that the information even exists. If a steganography
method causes someone to suspect the carrier medium, then the
method has failed.
Until recently, information hiding techniques received much
less attention from the research community and from industry than
cryptography. This situation is, however, changing rapidly and the
first academic conference on this topic was organized in 1996.
There has been a rapid growth of interest in steganography for two
main reasons:
The publishing and broadcasting industries have become
interested in techniques for hiding encrypted copyright marks and
serial numbers in digital films, audio recordings, books and
multimedia products.
Moves by various governments to restrict the availability of
encryption services have motivated people to study methods by which
private messages can be embedded in seemingly innocuous cover
messages.
STEGANOGRAPHY VS CRYPTOGRAPHY
Basically, the purpose of both cryptography and steganography is
to provide secret communication. However, steganography is not the
same as cryptography. Cryptography hides the contents of a secret
message from malicious people, whereas steganography conceals even
the existence of the message. Steganography must not be confused
with cryptography, where the message is transformed so as to make
its meaning obscure to malicious people who intercept it.
Therefore, the definition of breaking the system is different. In
cryptography, the system is broken when the attacker can read the
secret message. Breaking a steganographic system requires the
attacker to detect that steganography has been used and to be able
to read the embedded message.
In cryptography, the structure of a message is scrambled to make
it meaningless and unintelligible unless the decryption key is
available. It makes no attempt to disguise or hide the encoded
message. Basically, cryptography offers the ability of transmitting
information between persons in a way that prevents a third party
from reading it. Cryptography can also provide authentication for
verifying the identity of someone or something.
In contrast, steganography does not alter the structure of the
secret message, but hides it inside a cover-image so it cannot be
seen. A message in cipher text, for instance, might arouse
suspicion on the part of the recipient, while an invisible message
created with steganographic methods will not. In other words,
steganography prevents an unintended recipient from suspecting that
the data exists. In addition, the security of a classical
steganography system relies on the secrecy of the data encoding
system. Once the encoding system is known, the steganography system
is defeated.
It is possible to combine the techniques by encrypting the message
using cryptography and then hiding the encrypted message using
steganography. The resulting stego-image can be transmitted without
revealing that secret information is being exchanged. Furthermore,
even if an attacker were to defeat the steganographic technique and
detect the message from the stego-object, he would still require
the cryptographic decoding key to decipher the encrypted message.
Table 1 shows that both technologies have advantages and
disadvantages.
STEGANOGRAPHY
Unknown message passing.
Little known technology.
Technology still being developed for certain formats.
Once detected, message is known.
Many carrier formats.
CRYPTOGRAPHY
Known message passing.
Common technology.
Most algorithms known to government departments.
Strong algorithms are currently resistant to brute force attack.
Large expensive computing power required for cracking.
Technology increase reduces strength.
STEGANOGRAPHY APPLICATIONS
There are many applications for digital steganography of images,
including copyright protection, feature tagging, and secret
communication. A copyright notice or watermark can be embedded
inside an image to identify it as intellectual property. If someone
attempts to use this image without permission, ownership can be
proved by extracting the watermark.
In feature tagging, captions, annotations, time stamps, and
other descriptive elements can be embedded inside an image. Copying
the stego-image also copies the embedded features, and only
parties who possess the decoding stego-key will be able to extract
and view the features. Secret communication, on the other hand,
uses steganography so that the covert communication is not
advertised. It can therefore avoid scrutiny of the sender, message
and recipient. This is effective only if the hidden communication
is not detected by other people.
WATERMARKING
Digital watermarking, an extension of steganography, is a
promising solution for content copyright protection in the global
network. It imposes extra robustness on embedded information.
Digital watermarking is the science of embedding copyright
information in the original files. The information embedded is
called a watermark. Digital watermarks do not leave a noticeable
mark on the content and do not affect its appreciation. They are
imperceptible and detected only by the proper authorities. Digital
watermarks are difficult to remove without noticeably degrading the
content and are a covert means in situations where cryptography
fails to provide robustness. The content is watermarked by
converting copyright information into random digital noise,
perceptible only to the creator, using a special algorithm.
Watermarks are resistant to filtering and stay with the content as
long as the original has not been purposely damaged.
HISTORY ABOUT WATERMARKING
The distribution of works of art, including pictures, music,
video and textual documents, has become easier. With the widespread
and increasing use of the Internet, digital forms of these media
(still images, audio, video, text) are easily accessible. This is
clearly advantageous, in that it is easier to market and sell one's
works of art. However, this same property threatens copyright
protection. Digital documents are easy to copy and distribute,
allowing for pirating. There are a number of methods for protecting
ownership. One of these is known as digital watermarking. Digital
watermarking is the process of inserting a digital signal or
pattern (indicative of the owner of the content) into digital
content. The signal, known as a watermark, can be used later to
identify the owner of the work, to authenticate the content, and to
trace illegal copies of the work. Watermarks of varying degrees of
obtrusiveness are added to presentation media as a guarantee of
authenticity, quality, ownership, and source. To be effective in
its purpose, a watermark should adhere to a few requirements. In
particular, it should be robust, and transparent. Robustness
requires that it be able to survive any alterations or distortions
that the watermarked content may undergo, including intentional
attacks to remove the watermark, and common signal processing
alterations used to make the data more efficient to store and
transmit. This is so that afterwards, the owner can still be
identified. Transparency requires a watermark to be imperceptible
so that it does not affect the quality of the content and so that
detection, and therefore removal, by pirates is less likely. The
medium of focus in this paper is the still image. There are a
variety of image watermarking techniques, falling into two main
categories, depending on in which domain the watermark is
constructed: the spatial domain (producing spatial watermarks) and
the frequency domain (producing spectral watermarks). The
effectiveness of a watermark is improved when the technique
exploits known properties of the human visual system. These are
known as perceptually based watermarking techniques. Within this
category, the class of image-adaptive watermarks proves most
effective. In conclusion, image watermarking techniques that take
advantage of properties of the human visual system, and the
characteristics of the image create the most robust and transparent
watermarks. Digital watermarking is a technology for embedding
various types of information in digital content. In general,
information for protecting copyrights and proving the validity of
data is embedded as a watermark. A digital watermark is a digital
signal or pattern inserted into digital content. The digital
content could be a still image, an audio clip, a video clip, a text
document, or some form of digital data that the creator or owner
would like to protect. The main purpose of the watermark is to
identify who the owner of the digital data is, but it can also
identify the intended recipient. Why do we need to embed such
information in digital content using digital watermark technology?
The Internet boom is one of the reasons. It has become easy to
connect to the Internet from home computers and obtain or provide
various information using the World Wide Web. All the information
handled on the Internet is provided as digital content. Such
digital content can be easily copied in a way that makes the new
file indistinguishable from the original. Then the content can be
reproduced in large quantities.
For example, if paper bank notes or stock certificates could be
easily copied and used, trust in their authenticity would be greatly
reduced, resulting in a big loss. To prevent this, currencies
and stock certificates contain watermarks. These watermarks are one
of the methods for preventing counterfeiting and illegal use.
Digital watermarks apply a similar method to digital content.
Watermarked content can prove its origin, thereby protecting
copyright. A watermark also discourages piracy by silently and
psychologically deterring criminals from making illegal copies.
2.2 PRINCIPLE OF DIGITAL WATERMARKS
A watermark on a bank note has a different transparency than the
rest of the note when a light is shined on it. However, this method
is useless in the digital world. Currently there are various
techniques for embedding digital watermarks. Basically, they all
digitally write desired information directly onto images or audio
data in such a manner that the images or audio data are not
damaged. Embedding a watermark should not result in a significant
increase or reduction in the original data. Digital watermarks are
added to images or audio data in such a way that they are invisible
or inaudible and unidentifiable by the human eye or ear. Furthermore,
they can be embedded in content with a variety of file formats.
Digital watermarking is the content protection method for the
multimedia era.
Materials suitable for watermarking
Digital watermarking is applicable to any type of digital content,
including still images, animation, and audio data. It is easy to
embed watermarks in material that has a comparatively high
level of redundancy ("wasted" data), such as color still images,
animation, and audio data; however, it is difficult to embed
watermarks in material with a low redundancy level, such as
black-and-white still images. To solve this problem, we developed a
technique for embedding digital watermarks in black-and-white still
images and a software application that can effectively embed and
detect digital watermarks.
2.3 STRUCTURE OF A DIGITAL WATERMARK
The material that contains
a digital watermark is called a carrier. A digital watermark is not
provided as a separate file or a link. It is information that is
directly embedded in the carrier file. Therefore, the digital
watermark cannot be identified simply by viewing the carrier image
that contains it. Special software is needed to embed and detect such
digital watermarks. Kowa's SteganoSign is one of these software
packages. Both images and audio data can carry watermarks. A
digital watermark can be detected as shown in the following
illustration.
2.4 THE IMPORTANCE OF DIGITAL WATERMARKS
The Internet has
provided worldwide publishing opportunities to creators of various
works, including writers, photographers, musicians and artists.
However, these same opportunities provide ease of access to these
works, which has resulted in pirating. It is easy to duplicate
audio and visual files, and it is therefore probable that duplication
on the Internet occurs without the rightful owner's permission. An
example of an area where copyright protection needs to be enforced
is the on-line music industry.
Digital watermarking is being recognized as a way of improving
this situation. RIAA reports that "record labels see watermarking
as a crucial piece of the copy protection system, whether their
music is released over the Internet or on DVD-Audio". They are of
the opinion that any encryption system can be broken, sooner or
later, and that digital watermarking is needed to indicate who the
culprit is. Another scenario in which the enforcement of copyright
is needed is newsgathering. When digital cameras are used to
photograph an event, the images must be watermarked as they are
captured. This is so that later, the image's origin and content can be
verified. This suggests that there are many applications that could
require image watermarking, including Internet imaging, digital
libraries, digital cameras, medical imaging, image and video
databases, surveillance imaging, video-on-demand systems, and
satellite-delivered video.
2.5 THE PURPOSES OF DIGITAL WATERMARKS
Watermarks are a way of dealing with the problems mentioned
above by providing a number of services:
1. They aim to mark digital data permanently and unalterably, so
that the source as well as the intended recipient of the digital
work is known. Copyright owners can incorporate identifying
information into their work. That is, watermarks are used in the
protection of ownership. The presence of a watermark in a work
suspected of having been copied can prove that it has been
copied.
2. By indicating the owner of the work, they demonstrate the
quality and assure the authenticity of the work.
3. With a tracking
service, owners are able to find illegal copies of their work on
the Internet. In addition, because each purchaser of the data has a
unique watermark embedded in his/her copy, any unauthorized copies
that s/he has distributed can be traced back to him/her.
4. Watermarks can be used to identify any changes that have been
made to the watermarked data.
5. Some more recent techniques are able to correct the
alteration as well.
2.6 DIGITAL WATERMARK TYPES AND TERMS
Watermarks can be visible
or invisible:
a. Visible watermarks are designed to be easily perceived by a
viewer (or listener). They clearly identify the owner of the
digital data, but should not detract from the content of the
data.
b. Invisible watermarks are designed to be imperceptible under
normal viewing (or listening) conditions; more of the current
research focuses on this type of watermark than the visible type.
Both of these types of watermarks are useful in deterring theft,
but they achieve this in different ways. Visible watermarks give an
immediate indication of who the owner of the digital work is, and
data watermarked with visible watermarks are not of as much
usefulness to a potential pirate (because the watermark is
visible). Invisible watermarks, on the other hand, increase the
likelihood of prosecution after the theft has occurred. These
watermarks should therefore not be detectable to thieves, otherwise
they would try to remove them; however, they should be easily
detectable by the owners.
A further classification of watermarks is into fragile,
semi-fragile or robust:
a. A fragile watermark is embedded in digital data for the
purpose of detecting any changes that have been made to the content
of the data. It achieves this because it is distorted, or
"broken", easily. Fragile watermarks are applicable in image
authentication systems.
b. Semi-fragile watermarks detect any changes above a
user-specified threshold.
c. Robust watermarks are designed to survive "moderate to severe
signal processing attacks".
Watermarks for images can further be classified into spatial or
spectral watermarks, depending on how they are constructed:
a. Spatial watermarks are created in the spatial domain of the
image, and are embedded directly into the pixels of the image.
These usually produce images of high quality, but are not robust to
the common image alterations.
b. Spectral (or transform-based)
watermarks are incorporated into the image's transform
coefficients. The inverse-transformed coefficients form the
watermarked data.
Perceptual watermarks are invisible watermarks
constructed from techniques that use models of the human visual
system to adapt the strength of the watermark to the image content.
The most effective of these watermarks are known as image-adaptive
watermarks.
Finally, blind watermarking techniques are techniques
that are able to detect the watermark in a watermarked digital item
without use of the original digital item.
2.7 EFFECTIVE DIGITAL WATERMARKS
Features of a Good Watermark
The following are features of a good
watermark:
1. It should be difficult or impossible to remove a digital
watermark without noticeably degrading the watermarked content.
This is to ensure that the copyright information cannot be
removed.
2. The watermark should be robust. This means that it should
remain in the content after various types of manipulations, both
intentional (known as attacks on the watermark) and unintentional
(alterations that the digital data item would undergo regardless of
whether it contains a watermark or not). These are described below.
If the watermark is a fragile watermark, however, it should not
remain in the digital data after attacks on it, but should be able
to survive certain other alterations (as in the case of images,
where it should be able to survive the common image alteration of
cropping).
3. The watermark should be perceptually invisible, or
transparent. That is, it should be imperceptible (if it is of the
invisible type). Embedding the watermark signal in the digital data
produces alterations, and these should not degrade the perceived
quality of the data. Larger alterations are more robust, and are
easier to detect with certainty, but result in greater degradation
of the data.
4. It should be easy for the owner or a proper
authority to readily detect the watermark. "Such decodability
without requiring the original, unwatermarked image would be
necessary for efficient recovery of property and subsequent
prosecution".
Further properties that enhance the effectiveness of
a watermarking technique, but which are not requirements, are:
5. Hybrid watermarking refers to the embedding of a number of
different watermarks in the same digital carrier signal. Hybrid
watermarking allows intellectual property rights (IPR) protection,
data authentication and data item tracing all in one go.
6. Watermark key: it is beneficial to have a key associated with each
watermark that can be used in the production, embedding, and
detection of the watermark. It should be a private key, because
then if the algorithms to produce, embed and detect the watermark
are publicly known, without the key, it is difficult to know what
the watermark signal is. The key indicates the owner of the data.
It is of interest to identify the properties of a digital data item
(the carrier signal) that assist in watermarking:
1. It should have
a high level of redundancy. This is so that it can carry a more
robust watermark without the watermark being noticed. (A more
robust watermark usually requires a larger number of alterations to
the carrier signal).
2. It must tolerate at least small,
well-defined modifications without changing its semantics.
2.8 PROBLEM DEFINITION
In the existing system, the TBPC method cannot achieve a good
PSNR for the stego image. Reducing distortion between the cover
object and the stego object is an important issue for
steganography. The tree-based parity check method is very efficient
for hiding a message in an image, but its distortion, embedding
capacity and time complexity leave room for improvement.
2.9 METHODOLOGIES
2.9.1 MODULE NAMES
1. Location finding method and TBPC.
2. Majority vote strategy.
3. Average Modifications per Hidden Bit.
4. Time Complexity of MPC.
5. Comparison for Large Payloads.
MODULE 1: TREE AND GRAPH
The tree data structure can be generalized to represent directed
graphs by removing the constraints that a node may have at most one
parent, and that no cycles are allowed. Edges are still abstractly
considered as pairs of nodes, however, the terms parent and child are
usually replaced by different terminology (for example,
source and target). Different implementation strategies exist, for
example adjacency lists.
In graph theory, a tree is a connected acyclic graph; unless stated
otherwise, trees and graphs are undirected. There is no one-to-one
correspondence between such trees and trees as data structure. We
can take an arbitrary undirected tree, arbitrarily pick one of
its vertices as the root, make all its edges directed by making them
point away from the root node - producing an arborescence - and
assign an order to all the nodes. The result corresponds to a tree
data structure. Picking a different root or different ordering
produces a different one.
A tree structure is a way of representing
the hierarchical nature of a structure in a graphical form. It is named
a "tree structure" because the classic representation resembles
a tree, even though the chart is generally upside down compared to
an actual tree, with the "root" at the top and the "leaves" at the
bottom. A tree structure is conceptual, and appears in several
forms. For a discussion of tree structures in specific fields,
see Tree (data structure) for computer science, insofar as it relates
to graph theory.
Nomenclature and properties
Every finite tree
structure has a member that has no superior. This member is called
the "root" or root node. It can be thought of as the starting node.
The converse is not true: infinite tree structures may or may not
have a root node. The lines connecting elements are called
"branches", the elements themselves are called "nodes". Nodes
without children are called leaf nodes, "end-nodes", or "leaves".
The names of relationships between nodes are modeled after family
relations. The gender-neutral names "parent" and "child" have
largely displaced the older "father" and "son" terminology,
although the term "uncle" is still used for other nodes at the same
level as the parent.
A node's "parent" is a node one step higher in the hierarchy
(i.e. closer to the root node) and lying on the same branch.
"Sibling" ("brother" or "sister") nodes share the same parent
node.
A node's "uncles" are siblings of that node's parent.
A node that is connected to all lower-level nodes is called an
"ancestor".
In the example, "encyclopedia" is the parent of "science" and
"culture", its children. "Art" and "craft" are siblings, and
children of "culture", which is their parent and thus one of their
ancestors. Also, "encyclopedia", being the root of the tree, is the
ancestor of "science", "culture", "art" and "craft". Finally,
"science", "art" and "craft", being leaves, are ancestors of no
other node. In a tree structure there is one and only one path from
any point to any other point. Tree structures are used extensively
in computer science.
1. Determine the embeddable sites in the image, i.e. the high
frequency regions of the image.
2. Construct the master tree from the LSBs of the determined sites,
from top to bottom and left to right.
3. Find the information held by each leaf node by travelling from
the leaf node to the root of the master tree.
Fig 2.9 : TREE FORMATION.
Tree Formation
In most data hiding algorithms, after finding the embeddable
sites of the image, the value of these locations can be classified
as either '0' or '1'. They are compared with the logo in the
immediately next step. If the value is the same as the
to-be-embedded bit, no operation is needed. Otherwise, some
distortion creating processes are carried out to toggle the
value.
In TBPC, an N-ary complete tree, called the Master Tree, is filled
with the values of these embeddable locations. Every node of an N-ary
complete tree except leaf nodes has N child nodes. In the proposed
algorithm, one leaf node is needed to hold one information bit. To
embed an L bits logo, L leaves are required in the Master Tree.
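The filling of the Master Tree can be sketched as follows. This is a minimal illustration, not the report's implementation: it assumes the embeddable locations have already been found and are given as a list of integer sample values (`cover`), and it stores the tree in level order in a flat list, so that the children of node k sit at positions N*k+1 to N*k+N. The function name `build_master_tree` and this array layout are assumptions made for the example.

```python
def build_master_tree(cover, n, num_leaves):
    """Fill a complete n-ary tree, level by level, with the LSBs of `cover`.

    The tree is returned as a flat list in level order (root first).
    `num_leaves` is the number of message bits L and must be a power of n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    leaves = 1
    while leaves < num_leaves:
        leaves *= n
    if leaves != num_leaves:
        raise ValueError("num_leaves must be a power of n")
    # A complete n-ary tree with L leaves has (n*L - 1) / (n - 1) nodes.
    total_nodes = (n * num_leaves - 1) // (n - 1)
    if len(cover) < total_nodes:
        raise ValueError("not enough embeddable locations for this tree")
    # Keep only the least significant bit of each cover element.
    return [value & 1 for value in cover[:total_nodes]]
```

For a binary tree that hides 4 bits, 7 embeddable locations are consumed: one for the root, two for the middle level and four for the leaves.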
Parity Calculation
A parity bit is a bit that is added to ensure that the number of
bits with the value one in a set of bits is even or odd. Parity bits
are used as the simplest form of error detecting code.
There are two variants of parity bits: the even parity bit and the odd
parity bit. When using even parity, the parity bit is set to 1 if
the number of ones in a given set of bits (not including the parity
bit) is odd, making the number of ones in the entire set of bits
(including the parity bit) even. If the number of on-bits is
already even, it is set to a 0. When using odd parity, the parity
bit is set to 1 if the number of ones in a given set of bits (not
including the parity bit) is even, keeping the number of ones in
the entire set of bits (including the parity bit) odd. And when the
number of set bits is already odd, the odd parity bit is set to 0.
In other words, an even parity bit will be set to "1" if the number
of 1's + 1 is even, and an odd parity bit will be set to "1" if the
number of 1's +1 is odd.
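The two variants can be written out directly. A minimal sketch, with function names chosen for this example only:

```python
def even_parity_bit(bits):
    """Parity bit that makes the total number of 1s (data + parity) even."""
    return sum(bits) % 2


def odd_parity_bit(bits):
    """Parity bit that makes the total number of 1s (data + parity) odd."""
    return 1 - sum(bits) % 2
```

For the data bits 1, 0, 1, 1 (three 1s) the even parity bit is 1 and the odd parity bit is 0.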
Even parity is a special case of a cyclic redundancy check (CRC),
where the 1-bit CRC is generated by the polynomial x+1. If the parity
bit is present but not used, it may be referred to as mark
parity (when the parity bit is always 1) or space parity (the bit is
always 0). To find out the information held by a leaf node, we
travel from the leaf node to the root of the Master Tree. If the
occurrence of 1 is an odd number, the information bit of the leaf
node is said to be 1. Otherwise, the information bit is said to be
0.
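The leaf-to-root walk can be sketched as below. The sketch assumes the Master Tree is stored as a flat list in level order (root at index 0, parent of node k at index (k-1)//N), which is a layout chosen for this example and not stated in the text.

```python
def leaf_information_bits(tree, n, num_leaves):
    """Return the information bit of every leaf, from left to right.

    A leaf carries 1 if the number of 1s on its path to the root
    (leaf and root included) is odd, and 0 otherwise.
    """
    first_leaf = len(tree) - num_leaves
    bits = []
    for leaf in range(first_leaf, len(tree)):
        ones = tree[leaf]
        k = leaf
        while k > 0:              # climb until the root (index 0) is reached
            k = (k - 1) // n      # parent index in level-order layout
            ones += tree[k]
        bits.append(ones % 2)
    return bits
```

Applied to the stego tree, the same walk is the extraction step: the bits it returns are the hidden message.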
Fig 2.9.1: MASTER TREE FORMATION
MODULE 2
We construct the toggle tree with the minimum number of 1s level
by level in the bottom-up order using the following algorithm.
Before embedding and extraction, a location finding method
determines a sequence of locations that point to elements in the
cover object. The embedding algorithm modifies the elements in
these locations to hide the message and the extraction algorithm
can recover the message by inspecting the same sequence of
locations. The TBPC method is a least significant bit (LSB)
steganographic method. Only the LSBs of the elements pointed to by the
determined locations are used for embedding and extraction. The
TBPC method constructs a complete N-ary tree, called the master
tree, to represent the LSBs of the cover object. Then it fills the
nodes of the master tree with the LSBs of the cover object level by
level, from top to bottom and left to right. Every node of the tree
corresponds to an LSB in the cover object. Denote the number of
leaves of the master tree by L. The TBPC embedding algorithm
derives an L-bit binary string, called the master string, by
performing parity check on the master tree from the root to the
leaves. The embedding algorithm hides the message by modifying the
bit values of some nodes in the master tree. Assume that the length
of the message is also L. Performing the bitwise exclusive-or (XOR)
operation between the message and the master string, we obtain a
toggle string (e.g., see Fig. 1). Then, the embedding algorithm
constructs a new complete N-ary tree, called the toggle tree in the
bottom-up order and fills the leaves with the bit values of the
toggle string and the other nodes with 0. Then, level by level,
from the bottom to the root, each nonleaf node together with its
child nodes are flipped if all its child nodes have bits 1 (e.g.,
see Fig. 2). The embedding algorithm obtains the stego tree by
performing XOR between the master tree and the toggle tree (e.g.,
see Fig. 3). The TBPC extraction algorithm is simple. We can
extract the message by performing parity check on each root-leaf
path of the stego tree from left to right.
Algorithm MPC:
Input: a toggle string of length L;
1. Index the nodes of the initial toggle tree;
2. Set the leaves of the toggle tree from left to right and bit
   by bit with the toggle string and the other nodes 0;
3. for i = 1 to h
     for each internal node on level i do
       if the majority of its unmarked child nodes holds 1
         then flip the bit values of this node and its child nodes;
       else if the numbers of 0 and 1 in its unmarked child nodes are
         the same
         then mark this internal node;
4. if N is even then
     for i = h-1 down to 1
       for each marked internal node holding 1 on level i do
         flip the bit values of this node and its child nodes;
Index all
nodes of a complete N-ary tree with L leaves from top to bottom and
left to right. Set the L-bit toggle string bit by bit into the L
leaves from left to right and the other nodes 0. Assume that the
level of the tree is h. Traverse all nonleaf nodes from level 1 to
h. A nonleaf node and its child nodes form a simple complete
subtree. For each simple complete subtree, if the majority of the
child nodes hold 1, then flip the bit values of all nodes in this
subtree. Since the construction is bottom-up, the bit values of the
child nodes in every simple complete subtree are set after step 3.
Note that marking a node at step 4 applies only for N being even.
When N is even, after step 3, there may exist a two level simple
complete subtree with N/2 1s in the child nodes and 1 in its root.
In this case, flipping the bit values in this simple complete
subtree results in one fewer node holding 1 and keeps the result of
related root-leaf path parity check unchanged. Step 4 takes care of
this when the condition applies, and it is done level by level from
top to bottom. Also note that for the root of the whole toggle
tree, the bit value is always 0 when half of its child nodes hold
1. Thus, after step 4, the bit values of the child nodes in each
simple complete subtree are determined. The number of 1s in the
toggle tree is the number of modifications. When constructing the
toggle tree, the original TBPC method flips a simple complete
subtree only if all of child nodes have 1. We prove that the
majority vote strategy actually obtains toggle trees with the least
number of 1s. We call a toggle tree with the least number of 1s
corresponding to a toggle string an optimal toggle tree. We say
that a toggle tree is in majority form if, for each internal node, at
least half of its child nodes have bit value 0 and the internal
node holds 0 when exactly half of its child nodes hold 1. The
output of the algorithm is a toggle tree in majority form. The
majority vote guarantees that at least half of the child nodes of an
internal node hold 0. Note that every optimal toggle tree can be
transformed into majority form. This is obvious when N is even. When
N is odd, we can check each 2-level simple complete subtree level
by level in the top-down order and flip the bit values of the root
node and its N child nodes if exactly (N+1)/2 of the child nodes
hold 1. Note that, when this situation applies, the root node must
hold 0 before flipping, otherwise the toggle tree is not optimal.
This rearrangement does not introduce an extra 1 and the result of
each root-leaf path parity check is not affected.
Fig: TOGGLE TREE (STEP 1, STEP 2)
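The two toggle tree constructions can be compared in code. The following is a sketch under stated assumptions rather than the authors' implementation: the tree is a flat list in level order (root at index 0, children of node k at N*k+1 to N*k+N), h is the number of levels below the root so that L = N^h, and all function names are hypothetical. `tbpc_toggle_tree` applies the original all-ones rule, `mpc_toggle_tree` follows steps 1 to 4 of Algorithm MPC, and `path_parities` recomputes the root-leaf parities so that the result can be checked against the toggle string.

```python
def children(k, n):
    """Indices of the n children of node k in level-order layout."""
    return range(n * k + 1, n * k + n + 1)


def nodes_at_depth(d, n):
    """Indices of all nodes at depth d (the root has depth 0)."""
    start = (n**d - 1) // (n - 1)
    return range(start, start + n**d)


def flip_subtree(tree, k, n):
    """Flip node k and its children; root-leaf path parities are unchanged."""
    tree[k] ^= 1
    for c in children(k, n):
        tree[c] ^= 1


def initial_toggle_tree(toggle, n, h):
    """Leaves hold the toggle string, every other node holds 0."""
    if n < 2 or len(toggle) != n**h:
        raise ValueError("need n >= 2 and exactly n**h toggle bits")
    return [0] * ((n**h - 1) // (n - 1)) + list(toggle)


def tbpc_toggle_tree(toggle, n, h):
    """Original TBPC rule: flip only when all child nodes hold 1."""
    tree = initial_toggle_tree(toggle, n, h)
    for d in range(h - 1, -1, -1):            # bottom-up over internal nodes
        for k in nodes_at_depth(d, n):
            if all(tree[c] for c in children(k, n)):
                flip_subtree(tree, k, n)
    return tree


def mpc_toggle_tree(toggle, n, h):
    """Majority vote rule with marking of ties (steps 3 and 4)."""
    tree = initial_toggle_tree(toggle, n, h)
    marked = [False] * len(tree)
    # Step 3: level 1 (depth h-1) up to level h (the root, depth 0).
    for d in range(h - 1, -1, -1):
        for k in nodes_at_depth(d, n):
            votes = [tree[c] for c in children(k, n) if not marked[c]]
            ones = sum(votes)
            if 2 * ones > len(votes):         # majority of unmarked children hold 1
                flip_subtree(tree, k, n)
            elif 2 * ones == len(votes):      # tie: postpone the decision
                marked[k] = True
    # Step 4: level h-1 down to level 1, only needed when n is even.
    if n % 2 == 0:
        for d in range(1, h):
            for k in nodes_at_depth(d, n):
                if marked[k] and tree[k] == 1:
                    flip_subtree(tree, k, n)
    return tree


def path_parities(tree, n, h):
    """Parity of every root-leaf path, leaves taken from left to right."""
    first_leaf = (n**h - 1) // (n - 1)
    result = []
    for leaf in range(first_leaf, len(tree)):
        parity, k = tree[leaf], leaf
        while k > 0:
            k = (k - 1) // n
            parity ^= tree[k]
        result.append(parity)
    return result
```

With N = 4, h = 1 and the toggle string 1110, the all-ones rule leaves three 1s in the toggle tree, while the majority vote flips the subtree and leaves two (the root and the last leaf). In both cases `path_parities` returns the original toggle string, so XOR-ing the toggle tree into the master tree embeds the same message.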
MODULE 3
We construct a method that achieves the expected
embedding modifications per hidden bit of 0.5. In other words, if
we try to embed an L-bit message into the cover object, 0.5L
modifications will occur on average.
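Averages of this kind can be checked by exhaustive enumeration on small trees. The sketch below assumes that every L-bit toggle string is equally likely. `tbpc_ones` counts the 1s left by the all-ones flipping rule. `min_ones` does not run the majority vote procedure itself: it computes the least possible number of 1s by a small dynamic program over the two cases "parent flipped" and "parent not flipped", which is the count the text states the majority vote strategy attains. Both function names, and `average_per_bit`, are hypothetical.

```python
from itertools import product


def tbpc_ones(toggle, n):
    """Number of 1s in the toggle tree built with the all-ones flipping rule."""
    level, ones = list(toggle), 0
    while len(level) > 1:
        parents = []
        for j in range(0, len(level), n):
            group = level[j:j + n]
            if all(group):
                parents.append(1)     # flip: children become 0, parent becomes 1
            else:
                parents.append(0)
                ones += sum(group)    # these children keep their 1s
        level = parents
    return ones + level[0]


def min_ones(toggle, n):
    """Least number of 1s over all toggle trees with the same path parities."""
    # Each entry is (cost if the parent does not flip, cost if it flips).
    level = [(b, 1 - b) for b in toggle]
    while len(level) > 1:
        parents = []
        for j in range(0, len(level), n):
            group = level[j:j + n]
            keep = sum(c[0] for c in group)   # this node does not flip
            flip = sum(c[1] for c in group)   # this node flips
            parents.append((min(keep, flip + 1), min(keep + 1, flip)))
        level = parents
    return level[0][0]                        # the root has no parent flip


def average_per_bit(count_fn, n, h):
    """Average modifications per hidden bit over all n**h-bit toggle strings."""
    length = n**h
    total = sum(count_fn(t, n) for t in product((0, 1), repeat=length))
    return total / (length * 2**length)
```

Calling `average_per_bit(min_ones, n, h)` and `average_per_bit(tbpc_ones, n, h)` for the same small N and h gives a direct comparison of the two rules; the enumeration grows as 2^L, so it is only practical for short toggle strings.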
We use pToggle
to denote the expected embedding modifications per hidden bit,
that is, the average number of embedding modifications for an L-bit
message divided by L. The MPC method performs majority vote on every
simple complete subtree to construct the toggle tree in the bottom-up
order. Therefore, we are going to calculate the expected reduced
number of 1s for every simple complete subtree and sum up the expected
reduced number of 1s for all simple complete subtrees.
For
convenience, we use i-level tree to denote a complete N-ary tree of
i levels. An i-level tree consists of one root and N (i-1)-level
trees. An i-level simple complete subtree is a two-level tree
containing a node v at level i and all its child nodes. For an
h-level toggle tree, the level of the root is h and the level of a
leaf is 0. Let p_i be the probability that the root of an i-level
simple complete subtree holds 1 after performing majority vote. For
the leaf nodes, p_0 is 1/2 because the leaf nodes are uniformly filled
with 0 or 1. For every i-level simple complete subtree, p_i is the
same by symmetry. Since the toggle tree is an N-ary complete tree
constructed by the majority vote strategy, p_i can be expressed as
follows:
Let be the reduced number of 1s after flipping the bit values of
a simple complete subtree that holds t 1s. Therefore, The expected
reduced number of 1s for an i-level simple complete subtree is as
follows:
For an L-bit toggle string, the expected number of 1s in the
toggle string is 0.5L. In the first step for the toggle tree
construction, we fill each leaf with one bit of the toggle string.
Before majority vote, the number of 1s in the toggle tree is 0.5L.
After majority vote, the number of 1s in the toggle tree is 0.5L less
the total expected reduction. Since
the number of modifications is the number of 1s in the toggle tree,
we finally have the following equation: The expected reduced number
of 1s for an i-level simple complete subtree is as follows:
If N=2K+1 is an odd integer, (3) can be further simplified
as
Since
The pToggle of the TBPC method is
Where is the number of leaves and is the number of possible 01
configurations in leaves for an i-level tree.
MODULE 4
In the embedding algorithm
of the MPC method, an L-bit master string is constructed from
the master tree by performing parity check on the L simple root-leaf
paths. The number of parity check operations for each simple
root-leaf path is the number of edges in this path. Since we
perform parity check once for every edge, the total number of
parity check operations is the number of edges in the master tree.
Since the number of nodes in the master tree is
(NL - 1)/(N - 1),
the time complexity to obtain a master string is O(L). The time
complexity to obtain the toggle string is O(L), since the toggle string
is derived by performing bitwise exclusive-or between the L-bit
message and the L-bit master string. Thus, the total time
complexity of the embedding algorithm is O(L). For the extraction
algorithm, we perform parity check on L simple root-leaf paths in
the stego tree. Thus, the complexity of the extraction algorithm is
also O(L).
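The "one parity check per edge" argument corresponds to a single top-down pass over the tree. A minimal sketch, assuming the master tree is stored as a flat list in level order (parent of node k at index (k-1)//N) so that every parent is visited before its children; the function names are chosen for this example.

```python
def master_string(tree, n, num_leaves):
    """Root-leaf path parities computed with one XOR per edge."""
    parity = list(tree)
    for k in range(1, len(parity)):
        # Parity of the path root..k = parity of the path root..parent XOR own bit.
        parity[k] ^= parity[(k - 1) // n]
    return parity[len(parity) - num_leaves:]


def toggle_string(message, master):
    """Bitwise XOR of the L-bit message and the L-bit master string."""
    if len(message) != len(master):
        raise ValueError("message and master string must have the same length")
    return [m ^ s for m, s in zip(message, master)]
```

The loop runs once per non-root node, that is, once per edge, so the work grows linearly with the number of nodes and therefore with L. Running `master_string` on the stego tree gives the extracted message with the same cost.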
MODULE 5
Embedding messages in a steganographic system can be carried out
with or without the use of a key. To improve
steganographic robustness, a key can be used as a verification option.
It can affect the distribution of bits of a message
within a container, as well as the procedure of
forming a sequence of embedded bits of a message.
The first level of protection is determined only by the choice
of embedding algorithm. This may be the least significant bits
modification algorithm, or algorithms for modifying the frequency
or spatial-temporal characteristics of the container. The first
level of protection is present in any steganographic channel.
The steganographic system in this case can be represented as shown
in the figure The First Protection Level Scheme. The following
notations are used:
c - container file;
F - steganographic channel space (frequency or/and amplitude
container part that is available for steganographic modification
and message signal transmission);
SC - steganographic system;
m - message to be embedded;
E - embedding method;
- modified container file.
The second protection level of the steganographic system, as
well as all levels of protection of the higher orders, is
characterized by the use of Key (password) via steganographic
modification. An example of a simple key scheme, which provides a
second level of protection, is to write the unmodified or modified
password in the top or bottom of the message; or the distribution
of the password sign on the entire length of the steganographic
channel. Such key schemes do not affect the distribution of
messages through the container and do not use a message
preprocessing according to the defined key (see the figure The Second
Protection Level Scheme). This kind of steganographic system is
used in such tasks as, for instance, adding a digital signature for
proof of copyright. Data embedding performance is not changed in
comparison with the fastest approach of the first protection level
usage.
The payload is the data to be covertly communicated. The carrier is
the signal, stream, or data file into which the payload is hidden;
it differs from the "channel" (typically used to refer to the
type of input, such as "a JPEG image"). The resulting signal,
stream, or data file which has the payload encoded into it is
sometimes referred to as the package, stego file, or covert message.
The percentage of bytes, samples, or other signal elements which
are modified to encode the payload is referred to as the encoding
density and is typically expressed as a number between 0 and 1.
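The encoding density can be measured by comparing the carrier with the package element by element. A minimal sketch, assuming both are available as equal-length sequences of bytes or samples; the function name is hypothetical.

```python
def encoding_density(carrier, package):
    """Fraction of elements that differ between carrier and package (0 to 1)."""
    if len(carrier) != len(package):
        raise ValueError("carrier and package must have the same length")
    if len(carrier) == 0:
        return 0.0
    changed = sum(1 for a, b in zip(carrier, package) if a != b)
    return changed / len(carrier)
```

A carrier of four bytes in which two bytes were changed by the embedding has an encoding density of 0.5.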
2.9.2 GIVEN INPUT AND EXPECTED OUTPUT
MODULE 1
INPUT: Input Image/original image.
OUTPUT: embeddable sites and forming tree(TBPC).
MODULE 2
INPUT: tree based parity checking.
OUTPUT: majority parity checking.
MODULE 3
INPUT: majority parity checking.
OUTPUT: average modification per hidden bit.
MODULE 4
INPUT: majority parity checking and tree based parity
checking.
OUTPUT: average modification per hidden bits.
MODULE 5
INPUT: majority parity checking and tree based parity checking.
OUTPUT: comparison of payloads.
2.10 TECHNIQUE OR ALGORITHM
In information theory and coding theory with applications in computer
science and telecommunication, error detection and correction or error
control are techniques that enable reliable delivery of digital
data over unreliable communication channels. Many communication
channels are subject to channel noise, and thus errors may be
introduced during transmission from the source to a receiver. Error
detection techniques allow detecting such errors, while error
correction enables reconstruction of the original data.
Error detection is the detection of errors caused by noise or other
impairments during transmission from the transmitter to the
receiver.
Error correction is the detection of errors and reconstruction of
the original, error-free data.
The general idea for achieving error detection and correction is
to add some redundancy (i.e., some extra data) to a message, which
receivers can use to check consistency of the delivered message,
and to recover data determined to be erroneous. Error-detection and
correction schemes can be either systematic or non-systematic. In a
systematic scheme, the transmitter sends the original data, and
attaches a fixed number of check bits (or parity data), which are
derived from the data bits by some deterministic algorithm. If only
error detection is required, a receiver can simply apply the same
algorithm to the received data bits and compare its output with the
received check bits; if the values do not match, an error has
occurred at some point during the transmission. In a system that
uses a non-systematic code, the original message is transformed
into an encoded message that has at least as many bits as the
original message.
Good error control performance requires the scheme to be
selected based on the characteristics of the communication channel.
Common channel models include memory-less models where errors occur
randomly and with a certain probability, and dynamic models where
errors occur primarily in bursts. Consequently, error-detecting and
correcting codes can be generally divided into
random-error-detecting/correcting and burst-error-detecting/correcting codes.
Some codes can also be suitable for a mixture of random errors and
burst errors.
If the channel capacity cannot be determined, or is highly
varying, an error-detection scheme may be combined with a system
for retransmissions of erroneous data. This is known as automatic
repeat request (ARQ), and is most notably used in the Internet. An
alternate approach for error control is hybrid automatic repeat
request (HARQ), which is a combination of ARQ and error-correction
coding.
ERROR DETECTION SCHEMES
Error detection is most commonly
realized using a suitable hash function (or checksum algorithm). A hash
function adds a fixed-length tag to a message, which enables
receivers to verify the delivered message by recomputing the tag
and comparing it with the one provided.
There exists a vast variety of different hash function designs.
However, some are of particularly widespread use because of either
their simplicity or their suitability for detecting certain kinds
of errors (e.g., the cyclic redundancy check's performance in
detecting burst errors).
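The tag-and-verify idea described above can be sketched in Python using the CRC-32 function from the standard `zlib` module. The 4-byte big-endian tag layout and the helper names `attach_tag` and `verify` are assumptions made for this illustration, not part of any particular protocol.

```python
import zlib


def attach_tag(message: bytes) -> bytes:
    """Append a 4-byte CRC-32 tag to the message."""
    tag = zlib.crc32(message) & 0xFFFFFFFF
    return message + tag.to_bytes(4, "big")


def verify(frame: bytes) -> bool:
    """Recompute the tag over the payload and compare it with the received tag."""
    message, received = frame[:-4], frame[-4:]
    return (zlib.crc32(message) & 0xFFFFFFFF).to_bytes(4, "big") == received


frame = attach_tag(b"hidden payload")
# Simulate a short burst error: flip three adjacent bits in the first byte.
corrupted = bytes([frame[0] ^ 0b00011100]) + frame[1:]
```

`verify(frame)` succeeds, while `verify(corrupted)` fails, because a 32-bit CRC detects every burst error no longer than 32 bits.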
Random-error-correcting codes based on minimum distance coding can
provide a suitable alternative to hash functions when a strict
guarantee on the minimum number of errors to be detected is
desired. Repetition codes are special cases of
error-correcting codes: although rather inefficient, they find
applications for both error correction and detection due to their
simplicity.
Parity bits
A parity bit is a bit that is added to a group of
source bits to ensure that the number of set bits (i.e., bits with
value 1) in the outcome is even or odd. It is a very simple scheme
that can be used to detect single or any other odd number (i.e.,
three, five, etc.) of errors in the output. An even number of
flipped bits will make the parity bit appear correct even though
the data is erroneous.
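The behaviour described above (an odd number of flipped bits is detected, an even number is not) can be checked with a small Python sketch. It assumes even parity and represents bits as a list of 0/1 integers; the function names are hypothetical.

```python
def even_parity_bit(bits):
    """Return the bit that makes the total number of 1s even."""
    return sum(bits) % 2


def check_even_parity(word):
    """True if the word (data bits plus parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0


data = [1, 0, 1, 1, 0, 0, 1]
word = data + [even_parity_bit(data)]   # four 1s in data, so the parity bit is 0

one_flip = word.copy()
one_flip[2] ^= 1                        # single-bit error: detected

two_flips = one_flip.copy()
two_flips[5] ^= 1                       # second error: parity looks correct again
```

The check fails for `one_flip` but passes for `two_flips`, even though `two_flips` differs from the transmitted word in two positions.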
Extensions and variations on the parity bit mechanism
are horizontal redundancy checks, vertical redundancy checks, and
"double," "dual," or "diagonal" parity (used in RAID-DP).
ADVANTAGES OF PARITY CHECKING
Because of its simplicity, parity
is used in many hardware applications where an operation can be
repeated in case of difficulty, or where simply detecting the error
is helpful. For example, the SCSI and PCI buses use parity to detect
transmission errors, and many microprocessor instruction caches include
parity protection. Because the I-cache data is just a copy of main
memory, it can be disregarded and re-fetched if it is found to be
corrupted.
In serial data transmission, a common format is 7 data bits, an
even parity bit, and one or two stop bits. This format neatly
accommodates all the 7-bit ASCII characters in a convenient 8-bit
byte. Other formats are possible; 8 bits of data plus a parity bit
can convey all 8-bit byte values.
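The 7-data-bit plus even-parity format can be sketched as follows. This is only an illustration of how a 7-bit ASCII code and its parity bit fit into one byte; placing the parity bit in bit 7 is an assumption of this sketch, start and stop bits are not modelled, and the names `frame_7e` and `unframe_7e` are hypothetical.

```python
def frame_7e(ch: str) -> int:
    """Pack a 7-bit ASCII character and an even parity bit into one byte.

    Assumption for this sketch: the parity bit is stored in bit 7.
    """
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2
    return (parity << 7) | code


def unframe_7e(byte: int) -> str:
    """Check parity and return the character; raise on a parity error."""
    if bin(byte & 0xFF).count("1") % 2 != 0:
        raise ValueError("parity error")
    return chr(byte & 0x7F)


byte_a = frame_7e("A")   # 'A' = 1000001 has two 1s, so the parity bit is 0
byte_c = frame_7e("C")   # 'C' = 1000011 has three 1s, so the parity bit is 1
```

In real hardware this work is normally done by the interface device rather than by application code.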
In serial communication contexts, parity is usually generated
and checked by interface hardware (e.g., a UART) and, on reception,
the result made available to the CPU (and so to, for instance,
the operating system) via a status bit in a hardware register in the
interface hardware. Recovery from the error is usually done by
retransmitting the data, the details of which are usually handled
by software (e.g., the operating system I/O routines).
LEAST SIGNIFICANT BIT
Figure: The binary representation of decimal 149,
with the lsb highlighted. The msb in an 8-bit binary number
represents a value of 128 decimal. The lsb represents a value of 1.
In computing, the least significant bit (lsb) is the bit position in
a binary integer giving the units value, that is, determining whether
the number is even or odd. The lsb is sometimes referred to as
the right-most bit, due to the convention in positional notation of
writing less significant digits further to the right. It is
analogous to the least significant digit of a decimal integer, which is
the digit in the ones (right-most) position.
It is common to assign each bit a position number, ranging from
zero to N-1, where N is the number of bits in the binary
representation used. Normally, this is simply the exponent for the
corresponding bit weight in base-2 (such as in 2^31..2^0). Although a
few CPU manufacturers assign bit numbers the opposite way (which is
not the same as different endianness), the term lsb (of course)
remains unambiguous as an alias for the unit bit. By extension, the
least significant bits (plural) are the bits of the number closest
to, and including, the lsb.
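The bit numbering described above can be tried out with a short Python sketch. The helper `bit` is a hypothetical name; position 0 is taken to be the lsb, matching the exponent convention.

```python
def bit(value: int, position: int) -> int:
    """Return the bit at the given position (0 = least significant bit)."""
    return (value >> position) & 1


n = 149                                          # binary 10010101
lsb = bit(n, 0)                                  # units bit: 149 is odd
msb = bit(n, 7)                                  # weight 2**7 = 128 in an 8-bit number
weights = [2**p for p in range(8) if bit(n, p)]  # weights of the set bits
differing = 3 ^ 4                                # XOR marks the bits that differ between 3 and 4
```

For 149 the set bits have weights 1, 4, 16 and 128, which sum back to 149. The XOR of 3 and 4 is binary 111, so adding 1 to 3 changes the three lowest bits.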
The least significant bits have the useful property of changing
rapidly if the number changes even slightly. For example, if 1
(binary 00000001) is added to 3 (binary 00000011), the result will
be 4 (binary 00000100) and three of the least significant bits will
change (011 to 100). By contrast, the three most significant
bits stay unchanged (000 to 000). Least significant bits are
frequently employed in pseudorandom number generators, hash
functions and checksums.
Implementing steganography
Secrets can be hidden inside all sorts of cover information:
text, images, audio, video and more. Most steganographic utilities
nowadays hide information inside images, as this is relatively
easy to implement. However, there are tools available to store
secrets inside almost any type of cover source. It is also possible,
for example, to hide information inside texts, sounds and video
films. The most important property of a cover source is the
amount of data that can be stored inside it without changing the
noticeable properties of the cover. When an image is distorted or a
piece of music sounds different from the original, the cover source
will be suspicious and may be checked more thoroughly.
Hiding a message inside a text
Since everyone can read, encoding text in neutral sentences is
doubtfully effective. But taking the first letter of each word of
the previous sentence, you will see that it is possible and not
very difficult. Hiding information in plain text can be done in
many different ways. The first-letter algorithm used here is not
very secure, as knowledge of the system that is used automatically
gives you the secret. This is a disadvantage that many techniques
of hiding secrets inside plain text have in common. Many techniques
involve modifying the layout of a text, rules like using
every n-th character, or altering the amount of whitespace
after lines or between words. The last technique was successfully
used in practice, and even after a text had been printed and copied
on paper ten times, the secret message could still be
retrieved.
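The first-letter algorithm mentioned above can be sketched in a few lines of Python. The function name `first_letters` is hypothetical, and splitting words on runs of ASCII letters is an assumption of this sketch.

```python
import re


def first_letters(text: str) -> str:
    """Concatenate the first letter of every word, upper-cased."""
    return "".join(word[0].upper() for word in re.findall(r"[A-Za-z]+", text))


sentence = ("Since everyone can read, encoding text in neutral "
            "sentences is doubtfully effective.")
secret = first_letters(sentence)
```

Applied to the opening sentence of this subsection, the function returns `SECRETINSIDE`. It also shows why the scheme is weak: anyone who knows the rule recovers the message immediately.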
Another possible way of storing a secret inside a text is to use
a publicly available cover source, such as a book or a newspaper, and
a code which consists, for example, of a combination of a page
number, a line number and a character number. This way, no
information stored inside the cover source will lead to the hidden
message. Discovering it relies solely on gaining knowledge of the
secret key.
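A minimal Python sketch of such a page/line/character code follows. The tiny `pages` structure stands in for a public book and is purely hypothetical, as are the function names; indices are zero-based in this sketch, and every message character is assumed to occur somewhere in the cover source.

```python
def encode(message, pages):
    """Map each character to the first (page, line, column) where it occurs."""
    index = {}
    for p, page in enumerate(pages):
        for l, line in enumerate(page):
            for c, ch in enumerate(line):
                index.setdefault(ch, (p, l, c))
    return [index[ch] for ch in message]   # KeyError if a character never occurs


def decode(code, pages):
    """Look up each (page, line, column) triple in the same cover source."""
    return "".join(pages[p][l][c] for p, l, c in code)


# One "page" with two lines, standing in for a publicly available book.
pages = [["the quick brown fox", "jumps over the lazy dog"]]
code = encode("hide", pages)
recovered = decode(code, pages)
```

The cover source itself is never modified; only the list of position triples (the secret key) needs to reach the receiver.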
Images
Hiding information inside images is a popular technique
nowadays. An image with a secret message inside can easily be
spread over the world wide web or in newsgroups. The use of
steganography in newsgroups has been researched by German
steganographic expert Niels Provos, who created a scanning cluster
which detects the presence of hidden messages inside images that
were posted on the net. However, after checking one million images,
no hidden messages were found, so the practical use of
steganography still seems to be limited. To hide a message inside
an image without changing its visible properties, the cover source
can be altered in noisy areas with many color variations, so less
attention will be drawn to the modifications. The most common
methods to make these alterations involve the usage of the
least-significant bit or LSB, masking, filtering and
transformations on the cover image. These techniques can be used
with varying degrees of success on different types of image
files.
Least-significant bit modifications
The most widely used technique to hide data is the use of the
LSB. Although there are several disadvantages to this approach, its
relative ease of implementation makes it a popular method. To
hide a secret message inside an image, a proper cover image is
needed. Because this method uses bits of each pixel in the image,
it is necessary to use a lossless compression format; otherwise
the hidden information will get lost in the transformations of a
lossy compression algorithm.
When using a 24-bit color image, one bit of each of the red, green
and blue color components can be used, so a total of 3 bits can be
stored in each pixel. Thus, an 800 x 600 pixel image can contain a
total of 1,440,000 bits (180,000 bytes) of secret data.
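The capacity figure above follows directly from the image dimensions. A short Python sketch of the arithmetic is given below; the function name and its default parameters (three channels, one bit per channel) are assumptions matching the 24-bit case in the text.

```python
def lsb_capacity_bits(width, height, channels=3, bits_per_channel=1):
    """Number of payload bits available when using the LSBs of every channel."""
    return width * height * channels * bits_per_channel


capacity_bits = lsb_capacity_bits(800, 600)   # 800 * 600 pixels * 3 bits per pixel
capacity_bytes = capacity_bits // 8
```

For an 800 x 600 image this gives 1,440,000 bits, or 180,000 bytes.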
For example, the following grid can be considered as 3 pixels of
a 24-bit color image, using 9 bytes of memory:
(00100111 11101001 11001000)
(00100111 11001000 11101001)
(11001000 00100111 11101001)
When a character whose 8-bit value equals 10000001 is
inserted, one bit per byte into the least significant bits of the
first eight bytes, the following grid results:
(00100111 11101000 11001000)
(00100110 11001000 11101000)
(11001000 00100111 11101001)
In this case, only three bits needed to be changed to insert the
character successfully. On average, only about half of the least
significant bits used will need to be modified to hide a secret
message at the maximal cover size, since the other half already hold
the required value. The resulting changes that are made to the
least significant bits are too small to be recognized by the human
eye, so the message is effectively hidden. While using a 24-bit
image gives a relatively large amount of space to hide messages, it
is also possible to use an 8-bit image as a cover source. Because of
the smaller space and different properties, 8-bit images require a
more careful approach. Where 24-bit images use three bytes to
represent a pixel, an 8-bit image uses only one. Changing the LSB
of that byte can result in a visible change of color, as another
color in the available palette will be displayed. Therefore, the
cover image needs to be selected more carefully and preferably be
in grayscale, as the human eye will not detect the difference
between different gray values as easily as with different colors.
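The three-pixel grid example above can be reproduced with a minimal Python sketch of sequential LSB replacement. The function names are hypothetical, and the sketch writes one payload bit per cover byte in order, which is only one of several possible embedding orders.

```python
def embed_lsb(cover, bits):
    """Write one payload bit into the LSB of each cover byte, in order."""
    if len(bits) > len(cover):
        raise ValueError("payload does not fit in cover")
    stego = list(cover)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | b
    return stego


def extract_lsb(stego, n_bits):
    """Read the payload back from the LSBs of the first n_bits bytes."""
    return [byte & 1 for byte in stego[:n_bits]]


cover = [0b00100111, 0b11101001, 0b11001000,
         0b00100111, 0b11001000, 0b11101001,
         0b11001000, 0b00100111, 0b11101001]
payload = [1, 0, 0, 0, 0, 0, 0, 1]        # the 8-bit value 10000001 from the grid example
stego = embed_lsb(cover, payload)
changed = sum(c != s for c, s in zip(cover, stego))
```

With this cover and payload, `stego` matches the second grid and `changed` is 3: the other five payload bits already agreed with the existing least significant bits.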
The main disadvantage of LSB alteration is
that it requires a fairly large cover image to create a usable
amount of hiding space. Even nowadays, uncompressed images of 800 x
600 pixels are not often used on the Internet, so using these might
raise suspicion. Another disadvantage arises when an
image concealing a secret is compressed using a lossy compression
algorithm. The
hidden message will not survive this operation and is lost after
the transformation.
Masking and filtering
Masking and filtering techniques, usually restricted to 24-bit
or grayscale images, take a different approach to hiding a message.
These methods are effectively similar to paper watermarks, creating
markings in an image. This can be achieved, for example, by modifying
the luminance of parts of the image. While masking does change the
visible properties of an image, it can be done in such a way that
the human eye will not notice the anomalies. Since masking uses
visible aspects of the image, it is more robust than LSB
modification with respect to compression, cropping and different
kinds of image processing. The information is not hidden at the
noise level but is inside the visible part of the image, which
makes it more suitable than LSB modifications in case a lossy
compression algorithm like JPEG is being used.
Detecting steganography
As more and more techniques of hiding information are developed
and improved, the methods of detecting the use of steganography
also advance. Most steganographic techniques involve changing
properties of the cover source and there are several ways of
detecting these changes.
Text
While information can be hidden inside texts in such a way that
the presence of the message can only be detected with knowledge of
the secret key, for example when using the earlier mentioned method
using a publicly available book and a combination of character
positions to hide the message, most of the techniques involve
alterations to the cover source. These modifications can be
detected by looking for patterns in texts or disturbances thereof,
odd use of language and unusual amounts of whitespace.
Images
Although images can be scanned for suspicious properties in a
very basic way, detecting hidden messages usually requires a more
technical approach. Changes in size, file format, last modified
timestamp and in the color palette might point out the existence of
a hidden message, but this will not always be the case. A widely
used technique for image scanning involves statistical analysis.
Most steganographic algorithms that work on images assume that the
least-significant bit is more or less random. This is, however, an
incorrect assumption. While the LSB might not seem to be of much
importance, applying a filter which only shows the
least-significant bits will still produce a recognizable image.
Since this is the case, it can be concluded that the LSBs are not
random at all, but actually contain information about the whole
image.
When inserting a hidden message into an image, this property
changes. Especially with encrypted data, which has a very high
entropy, the LSB of the cover image will no longer contain
information about the original, but because of the modifications
they will now be more or less random. With a statistical analysis
on the LSB, the difference between random values and real image
values can easily be detected. Using this technique, it is also
possible to detect messages hidden inside JPEG files with the DCT
method, since this also involves LSB modifications, even though
these take place in the frequency domain.
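One simple way to make this kind of statistical analysis concrete is a chi-square test over pairs of values (2k, 2k+1), sketched below in Python. This is a simplified stand-in chosen for illustration, not necessarily the analysis the text has in mind; the sample data are synthetic, and the function name `pov_chi_square` is hypothetical.

```python
from collections import Counter


def pov_chi_square(samples):
    """Chi-square statistic over pairs of values (2k, 2k+1).

    For each pair, the expected count of the even value under the
    "LSBs are random" hypothesis is the mean of the two counts.
    A small statistic means the LSBs look random (consistent with embedding);
    a large statistic means the pair counts are unbalanced.
    """
    counts = Counter(samples)
    stat = 0.0
    for k in range(0, 256, 2):
        even, odd = counts.get(k, 0), counts.get(k + 1, 0)
        expected = (even + odd) / 2
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat


# Synthetic cover samples with clearly unbalanced pairs.
cover = [100] * 60 + [101] * 4 + [30] * 50 + [31] * 6
# Simulate full-capacity embedding by overwriting every LSB with alternating bits.
stego = [(v & ~1) | (i % 2) for i, v in enumerate(cover)]

cover_stat = pov_chi_square(cover)
stego_stat = pov_chi_square(stego)
```

In this toy data the cover yields a large statistic while the fully embedded samples yield zero, because overwriting the LSBs equalizes the counts within each pair. Real detectors work on actual image histograms and convert the statistic into a probability.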
Audio and video
The statistical analysis method can also be used against audio
files, since the LSB modification technique can be applied to sounds
as well. Apart from this, there are several other things that can be
detected. High, inaudible frequencies can be scanned for
information, and odd distortions or patterns in the sounds might
point out the existence of a secret message. Also, differences in
pitch, echo or background noise may raise suspicion. As with
implementing steganography using video files as cover sources, the
methods of detecting hidden information in video are a combination of
the techniques used for images and audio files. However, a different
steganographic technique can be used that is especially effective
in video films. The use of special code signs or
gestures is very difficult to detect with a computer system. This
method was used in the Vietnam War so that prisoners of war could
communicate messages secretly through the video films the enemy
soldiers made to send to the home front.
Defeating steganograms
While steganograms may not always be successfully detected,
there are different ways of removing hidden messages from possible
cover sources. Knowledge or certainty of the existence of a hidden
message is not needed, since messages can even be destroyed without
this. Although there will never be a 100 percent guarantee of
success, the number of possible ways of sending hidden messages can
easily be reduced using any combination of steganographic defeating
techniques.
Text
The best way of removing hidden messages from a plain text might
be rewriting and reformulating the contents. Rewriting it using
different words and sentence constructions
will almost certainly remove all ways of reproducing a hidden
message, since it takes care of almost every possible way data
can be stored inside a plain text. The character position scheme
will no longer work because the words have been changed, and the
same holds for variations in white spacing, since the
text will have a new layout. The only method that is not
covered by this technique is the use of a publicly available
cover source. Since this source cannot easily be altered, there is
no effective way of stopping this method, except for intercepting
the secret key.
Images
Compressing an image using lossy compression will remove
messages that are hidden using the LSB modification technique. This
will also happen when the image is resized, the color palette is
modified or the colors themselves are modified. Conversion to a
different image format, which often uses a different type of
compression, will also help in removing hidden messages. Altering
the luminance, for example, will remove watermarks in
the visible part of an image.
Audio and video
Most of the techniques that can be used on images, can also be
applied on audio files. Compressing an audio file with lossy
compression will result in loss of the hidden message as it will
change the whole structure of a file. Also, several lossy
compression schemes use the limits of the human ear to their
advantage by removing all frequencies that cannot be heard. This
will also remove any frequencies that are used by a steganographic
system which hides information in that part of the spectrum.
Another possible way of removing steganograms is lowering the bit
rate of the audio file. In that case, there will be less available
space to store hidden data and therefore, at least parts of it will
get lost. For video, once again, the same methods as for
images and audio files can be applied to remove hidden information.
To defeat the use of signals or gestures, however, human insight is
still necessary, as computer systems are not yet capable of
detecting this with a reasonable rate of success.
CHAPTER 3
REQUIREMENTS ENGINEERING
3.1 GENERAL
The Harris-Laplacian detector is used to find feature points
(interest points). By using only feature points, the watermark can
be embedded only in particular high-frequency regions.
Hiding only in selected regions is more secure than
hiding the watermark in all high-frequency regions. A genetic algorithm
is used to check the fitness of the feature regions. This is
one of the more advanced methods compared to existing methods.
3.2 HARDWARE REQUIREMENTS
The hardware requirements may serve as the basis for a contract
for the implementation of the system and should therefore be a
complete and consistent specification of the whole system. They are
used by software engineers as the starting point for the system
design. It shows what the system does and how it should be
implemented.
PROCESSOR : PENTIUM IV 2.6 GHz, Intel Core 2 Duo
RAM : 512 MB DD RAM
MONITOR : 15 COLOR
HARD DISK : 40 GB
CD DRIVE : LG 52X
KEYBOARD : STANDARD 102 KEYS
MOUSE : 3 BUTTONS
3.3 SOFTWARE REQUIREMENTS
MATLAB 7.9 Version
MATLAB
MATLAB is a high-performance language for technical
computing. It integrates computation, visualization, and
programming in an easy-to-use environment where problems and
solutions are expressed in familiar mathematical notation.
Typical uses include:
Math and computation.
Algorithm development.
Modeling, simulation, and prototyping.
Data analysis, exploration, and visualization.
Scientific and engineering graphics.
Application development, including Graphical User Interface
building.
MATLAB is an interactive system whose basic data element is an
array that does not r