
AN OPTIMAL DATA HIDING SCHEME WITH TREE-BASED PARITY CHECK

ABSTRACT

Steganography is the science of hiding or embedding data in a transmission medium. The word steganography derives from two Greek words that together mean "covered writing". Steganalysis is the science of attacking steganography. Steganography studies schemes for hiding secrets in the communication between a sender and a receiver such that no one else can detect the existence of the secrets. A steganographic method consists of an embedding algorithm and an extraction algorithm. The embedding algorithm describes how to hide a message in the cover object, and the extraction algorithm describes how to extract the message from the stego object. A commonly used strategy for steganography is to embed the message by lightly distorting the cover object into the target stego object. If the distortion is sufficiently small, the stego object will be indistinguishable from the noisy cover object. Therefore, reducing distortion is a crucial issue for steganographic methods.

We propose an efficient embedding scheme that uses the least number of changes over the tree-based parity check model. By using the majority parity check (MPC) algorithm instead of the TBPC method, we obtain less distortion in the stego image. Keeping the height of the tree constant, we can embed more data than with the TBPC method. By introducing the majority vote strategy, we effectively construct the stego object with least distortion under the tree structure model. We also show that our method yields a binary linear stego-code. In comparison with the TBPC method, our method significantly reduces the number of modifications on average.

TABLE OF CONTENTS

CHAPTER NO.    TITLE

ABSTRACT

LIST OF FIGURES

LIST OF SYMBOLS

LIST OF ABBREVIATIONS

LIST OF TABLES

1. CHAPTER 1: INTRODUCTION

1.1 GENERAL

1.1.1 THE IMAGE PROCESSING SYSTEM

1.1.2 IMAGE PROCESSING FUNDAMENTAL

1.2 OBJECTIVE

1.3 EXISTING SYSTEM

1.3.1 EXISTING SYSTEM DISADVANTAGES

1.3.2 LITERATURE SURVEY

1.4 PROPOSED SYSTEM

1.4.1 PROPOSED SYSTEM ADVANTAGES

2. CHAPTER 2: AN OPTIMAL DATA HIDING SCHEME WITH TREE-BASED PARITY CHECK

2.1 GENERAL

2.1.1 STEGANOGRAPHY

2.1.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES

2.1.3 WATERMARKING

2.2 PRINCIPLE OF DIGITAL WATERMARKS

2.3 STRUCTURE OF A DIGITAL WATERMARK

2.4 THE IMPORTANCE OF DIGITAL WATERMARKS

2.5 THE PURPOSES OF DIGITAL WATERMARKS

2.6 DIGITAL WATERMARK TYPES AND TERMS

2.7 EFFECTIVE DIGITAL WATERMARKS

2.8 PROBLEM DEFINITION

2.9 METHODOLOGIES

2.9.1 MODULES NAME

2.3.2 MODULES DESCRIPTION

2.3.3 GIVEN INPUT AND EXPECTED OUTPUT

2.10 TECHNIQUE OR ALGORITHM

3. CHAPTER 3: REQUIREMENTS

3.1 GENERAL

3.2 HARDWARE REQUIREMENTS

3.3 SOFTWARE REQUIREMENTS

4. CHAPTER 4: SOFTWARE SPECIFICATION

5.1 GENERAL

5.2 FEATURES OF MATLAB

5.2.1 INTERFACING WITH OTHER LANGUAGES

5.2.2 ANALYZING AND ACCESSING DATA

5.2.3 PERFORMING NUMERIC COMPUTATION

5. CHAPTER 5: IMPLEMENTATION

6.1 GENERAL

6.2 IMPLEMENTATION CODING

6. CHAPTER 6: SNAPSHOTS

7.1 SNAPSHOTS

7. CHAPTER 7: APPLICATION AND FUTURE ENHANCEMENT

9.1 GENERAL

9.2 APPLICATIONS

9.3 FUTURE ENHANCEMENTS

8. CHAPTER 8:

10.1 CONCLUSION

10.2 REFERENCES

LIST OF FIGURES

FIGURE NO.    NAME OF THE FIGURE

1.1 A BLOCK DIAGRAM FOR IMAGE PROCESSING SYSTEM

1.2 BLOCK DIAGRAM OF FUNDAMENTAL SEQUENCE INVOLVED IN AN IMAGE PROCESSING SYSTEM

1.3 IMAGE PROCESSING TECHNIQUES

1.4 GENERAL METHOD FOR FEATURE BASED WATERMARKING

2.1 BASIC BLOCK DIAGRAM OF STEGANOGRAPHY

2.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES

2.9 TREE FORMATION

2.9.1 MASTER TREE FORMATION

2.9.2 OPTIMIZED TREE

LIST OF ABBREVIATIONS

NMR - NUCLEAR MAGNETIC RESONANCE

TBPC - TREE BASED PARITY CHECKING

BER - BIT ERROR RATE

D/A - DIGITAL-TO-ANALOG

A/D - ANALOG-TO-DIGITAL

PSNR - PEAK SIGNAL TO NOISE RATIO

MPC - MAJORITY PARITY CHECKING

CHAPTER 1

INTRODUCTION

1.1 GENERAL

The term digital image processing refers to the processing of a two-dimensional picture by a digital computer. In a broader context, it implies digital processing of any two-dimensional data. A digital image is an array of real or complex numbers represented by a finite number of bits. An image given in the form of a transparency, slide, photograph or X-ray is first digitized and stored as a matrix of binary digits in computer memory. This digitized image can then be processed and/or displayed on a high-resolution television monitor. For display, the image is stored in a rapid-access buffer memory, which refreshes the monitor at a rate of 25 frames per second to produce a visually continuous display.

1.1.1 THE IMAGE PROCESSING SYSTEM

DIGITIZER

A digitizer converts an image into a numerical representation suitable for input into a digital computer. Some common digitizers are:

Microdensitometer

Flying spot scanner

Image dissector

Vidicon camera

Photosensitive solid-state arrays

IMAGE PROCESSOR

An image processor performs the functions of image acquisition, storage, preprocessing, segmentation, representation, recognition and interpretation, and finally displays or records the resulting image. The following block diagram gives the fundamental sequence involved in an image processing system.

As detailed in the diagram, the first step in the process is image acquisition by an imaging sensor in conjunction with a digitizer to digitize the image. The next step is preprocessing, where the image is improved before being fed as an input to the other processes. Preprocessing typically deals with enhancing, removing noise, isolating regions, etc. Segmentation partitions an image into its constituent parts or objects. The output of segmentation is usually raw pixel data, which consists of either the boundary of the region or the pixels in the region themselves. Representation is the process of transforming the raw pixel data into a form useful for subsequent processing by the computer. Description deals with extracting features that are basic in differentiating one class of objects from another. Recognition assigns a label to an object based on the information provided by its descriptors. Interpretation involves assigning meaning to an ensemble of recognized objects. The knowledge about a problem domain is incorporated into the knowledge base. The knowledge base guides the operation of each processing module and also controls the interaction between the modules. Not all modules need be present for a specific function. The composition of the image processing system depends on its application. The frame rate of the image processor is normally around 25 frames per second.

DIGITAL COMPUTER

Mathematical processing of the digitized image, such as convolution, averaging, addition and subtraction, is done by the computer.

MASS STORAGE

The secondary storage devices normally used are floppy disks, CD ROMs, etc.

HARD COPY DEVICE

The hard copy device is used to produce a permanent copy of the image and for the storage of the software involved.

OPERATOR CONSOLE

The operator console consists of equipment and arrangements for verification of intermediate results and for alterations in the software as and when required. The operator is also capable of checking for any resulting errors and for the entry of requisite data.

1.1.2 IMAGE PROCESSING FUNDAMENTAL

Digital image processing refers to the processing of an image in digital form. Modern cameras may capture the image directly in digital form, but generally images originate in optical form. They are captured by video cameras and digitized. The digitization process includes sampling and quantization. These images are then processed by at least one of the five fundamental processes, not necessarily all of them.

IMAGE PROCESSING TECHNIQUES

This section gives various image processing techniques.

FIG1.3: IMAGE PROCESSING TECHNIQUES

IMAGE ENHANCEMENT

Image enhancement operations improve the qualities of an image, for example by improving its contrast and brightness characteristics, reducing its noise content, or sharpening its details. Enhancement reveals the same information in a more understandable image. It does not add any information to it.

IMAGE RESTORATION

Image restoration, like enhancement, improves the qualities of an image, but all the operations are mainly based on known or measured degradations of the original image. Image restoration is used to restore images with problems such as geometric distortion, improper focus, repetitive noise, and camera motion. It is used to correct images for known degradations.

IMAGE ANALYSIS

Image analysis operations produce numerical or graphical information based on characteristics of the original image. They break the image into objects and then classify them. They depend on the image statistics. Common operations are extraction and description of scene and image features, automated measurements, and object classification. Image analysis operations are mainly used in machine vision applications.

IMAGE COMPRESSION

Image compression and decompression reduce the data content necessary to describe the image. Most images contain a lot of redundant information, and compression removes these redundancies. Because compression reduces the size, the image can be stored or transported efficiently. The compressed image is decompressed when displayed. Lossless compression preserves the exact data in the original image, whereas lossy compression does not reproduce the original image exactly but provides excellent compression.

IMAGE SYNTHESIS

Image synthesis operations create images from other images or non-image data. Image synthesis operations generally create images that are either physically impossible or impractical to acquire.

APPLICATIONS OF DIGITAL IMAGE PROCESSING

Digital image processing has a broad spectrum of applications, such as remote sensing via satellites and other spacecraft, image transmission and storage for business applications, medical processing, radar, sonar and acoustic image processing, robotics and automated inspection of industrial parts.

MEDICAL APPLICATIONS

In medical applications, one is concerned with processing of chest X-rays, cineangiograms, projection images of transaxial tomography and other medical images that occur in radiology, nuclear magnetic resonance (NMR) and ultrasonic scanning. These images may be used for patient screening and monitoring or for detection of tumors or other disease in patients.

SATELLITE IMAGING

Images acquired by satellites are useful in tracking of earth resources; geographical mapping; prediction of agricultural crops, urban growth and weather; flood and fire control; and many other environmental applications. Space image applications include recognition and analysis of objects contained in images obtained from deep space-probe missions.

COMMUNICATION

Image transmission and storage applications occur in broadcast television, teleconferencing, transmission of facsimile images for office automation, communication over computer networks, closed-circuit television based security monitoring systems and military communications.

RADAR IMAGING SYSTEMS

Radar and sonar images are used for detection and recognition of various types of targets or in guidance and maneuvering of aircraft or missile systems.

DOCUMENT PROCESSING

It is used in scanning and transmission for converting paper documents to a digital image form, compressing the image, and storing it on magnetic tape. It is also used in document reading for automatically detecting and recognizing printed characters.

DEFENSE/INTELLIGENCE

It is used in reconnaissance photo-interpretation for automatic interpretation of earth satellite imagery to look for sensitive targets or military threats, and in target acquisition and guidance for recognizing and tracking targets in real-time smart-bomb and missile-guidance systems.

1.2 OBJECTIVE

The main goal of our project is to reduce the distortion between the cover object and the stego object, which is an important issue for steganography. The tree-based parity check method is very efficient for hiding a message in image data due to its simplicity. More distortion means that the quality of the image is reduced, that is, the PSNR value of the image is lower, and a hacker can more easily identify that the image contains a secret message.
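The link between distortion and PSNR mentioned above can be made concrete with a short calculation. The following is a minimal sketch, assuming 8-bit grayscale pixels given as flat lists and the standard definition PSNR = 10 log10(peak^2 / MSE); the function name `psnr` and the sample values are illustrative and do not come from the project code.

```python
import math
from typing import Sequence


def psnr(cover: Sequence[int], stego: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error between the cover and the stego pixels.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(peak ** 2 / mse)


cover = [100, 101, 102, 103]
one_change = [100, 100, 102, 103]   # one pixel changed by 1
two_changes = [101, 100, 102, 103]  # two pixels changed by 1
print(psnr(cover, one_change) > psnr(cover, two_changes))
```

Fewer modified pixels give a smaller mean squared error and therefore a higher PSNR, which is why reducing the number of changes is the quantity of interest.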

Building on the TBPC approach, we propose a majority vote strategy that results in least distortion for finding a stego object. The lower embedding efficiency of our method is better than that of previous works when the hidden message length is relatively large.

1.3 EXISTING SYSTEM

Matrix embedding, which is also called syndrome coding or coset encoding, uses linear codes. It embeds and extracts a message by using the parity check matrix of a linear code. The existing method is TBPC (Tree Based Parity Checking). This method does not achieve a good PSNR value for the stego image.

1.3.1 DISADVANTAGES OF EXISTING SYSTEM

For matrix embedding, finding the stego object with least distortion is difficult in general. In this method the efficiency is low.

Embedding efficiency is low.

Time complexity is high.
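The matrix embedding idea from Section 1.3, embedding and extracting through the parity check matrix of a linear code, can be illustrated on a small case. The sketch below assumes the binary [7,4] Hamming code, chosen here only for illustration (the report does not name a specific code). The columns of its parity check matrix are the binary representations of 1 to 7, so the syndrome is the XOR of the positions of all 1-bits. The function names are hypothetical.

```python
from typing import List


def syndrome(bits: List[int]) -> int:
    """Syndrome H*x for the [7,4] Hamming code, returned as an integer 0..7.

    Column j of H (1-based) is the binary representation of j, so the
    syndrome equals the XOR of the positions holding a 1.
    """
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s


def embed(cover: List[int], message: int) -> List[int]:
    """Hide a 3-bit message (0..7) in 7 cover bits, changing at most one bit."""
    if len(cover) != 7 or not 0 <= message <= 7:
        raise ValueError("need 7 cover bits and a message in 0..7")
    stego = list(cover)
    difference = syndrome(cover) ^ message
    if difference:
        # Flipping the bit at position `difference` changes the syndrome by it.
        stego[difference - 1] ^= 1
    return stego


def extract(stego: List[int]) -> int:
    """The receiver only computes the syndrome of the stego bits."""
    return syndrome(stego)


cover = [1, 0, 1, 1, 0, 0, 1]
stego = embed(cover, 5)
print(extract(stego), sum(c != s for c, s in zip(cover, stego)))
```

In this small case the least-distortion stego object can be found directly from the syndrome. For general linear codes this search is the difficult step noted in Section 1.3.1.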

1.3.2 LITERATURE SURVEY

1. J. Fridrich, "Asymptotic behavior of the ZZW embedding construction," IEEE Trans. Inf. Forensics Security, vol. 4, no. 1, pp. 151-154, Mar. 2009.

We analyze asymptotic behavior of the embedding construction for steganography proposed by Zhang, Zhang, and Wang (ZZW) at the 10th Information Hiding workshop by deriving a closed form expression for the limit between embedding efficiency of the ZZW construction and the theoretical upper bound as a function of relative payload. This result confirms the experimental observation made in the original publication.

2. R. Y. M. Li, O. C. Au, K. K. Lai, C. K. Yuk, and S.-Y. Lam, "Data hiding with tree based parity check," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME '07), 2007, pp. 635-638.

In this paper, we propose a novel algorithm, namely tree based parity check (TBPC), that can be applied to most of the existing data hiding algorithms to achieve improvement in visual quality. In the data hiding process, distortion is created when the original image is modified. Most existing data hiding algorithms try to minimize the visual artifacts introduced by the modifications. The proposed algorithm tries to reduce the probability of modifying the original host image. Theoretical analysis and experimental results are given in this paper. Both measures suggest that an improvement in visual quality is achieved in the watermarked image.

3. W. Zhang and S. Li, "A coding problem in steganography," Designs, Codes Cryptogr., vol. 46, no. 1, pp. 68-81, 2008.

To study how to design a steganographic algorithm more efficiently, a new coding problem, steganographic codes (abbreviated stego-codes), is presented in this paper. The stego-codes are defined over the field with q (q ≥ 2) elements. A method of constructing linear stego-codes is proposed by using the direct sum of vector subspaces. The problem of linear stego-codes is converted to an algebraic problem by introducing the concept of the t-th dimension of a vector space. Some bounds on the length of stego-codes are obtained, from which the maximum length embeddable (MLE) code arises. It is shown that there is a corresponding relation between MLE codes and perfect error-correcting codes. Furthermore, the classification of all MLE codes and a lower bound on the number of binary MLE codes are obtained based on the corresponding results on perfect codes. Finally, hiding redundancy is defined to value the performance of stego-codes.

4. W. Zhang, X. Zhang, and S. Wang, "Maximizing steganographic embedding efficiency by combining Hamming codes and wet paper codes," in Proc. Int. Workshop Inf. Hiding (IH '08), 2008, vol. LNCS 5284, pp.

For good security and large payload in steganography, it is desired to embed as many messages as possible per change of the cover-object, i.e., to have high embedding efficiency. Steganographic codes derived from covering codes can improve embedding efficiency. In this paper, we propose a new method to construct stego-codes, showing that not just one but a family of stego-codes can be generated from one covering code by combining Hamming codes and wet paper codes. This method can enormously expand the set of embedding schemes as applied in steganography. Performances of stego-code families of structured codes and random codes are analyzed. By using the stego-code families of LDGM codes, we obtain a family of near optimal embedding schemes for binary steganography and ±1 steganography, respectively, which can approach the upper bound of embedding efficiency for various chosen embedding rates.

5. M. Khatirinejad and P. Lisonek, "Linear codes for high payload steganography," Discrete Applied Math., vol. 157, no. 5, pp. 971-981, 2009.

Steganography is concerned with communicating hidden messages in such a way that no one apart from the sender and the intended recipient can detect the very existence of the message. We study the syndrome coding method (sometimes also called matrix embedding method), which uses a linear code as an ingredient. Among all codes of a fixed block length and fixed dimension (and thus of a fixed information rate), an optimal code is one that makes it most difficult for an eavesdropper to detect the presence of the hidden message. We show that the average distance to code is the appropriate concept that replaces the covering radius for this particular application. We completely classify the optimal codes in the cases when the linear code used in the syndrome coding method is a 1- or 2-dimensional code over GF(2). In the steganography application this translates to cases when the code carries a high payload (has a high information rate).

1.4 PROPOSED METHOD

We propose that the toggle criterion of a node in the TBPC method be relaxed by a majority vote strategy. Our strategy inherits the efficiency of the TBPC method and produces a stego object with least distortion under the tree based parity check model.
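The majority vote idea can be sketched in a few lines. This is an illustrative reconstruction, not the project's implementation. It assumes that the cover bits sit on the nodes of a complete N-ary tree stored in level order, that each message bit is the parity of the bits on the path from one leaf to the root, and that a node is toggled instead of its children only when strictly more than half of the children need toggling (ties are left alone). All function names are hypothetical.

```python
from typing import List


def first_leaf(n_ary: int, height: int) -> int:
    """Index of the first leaf in a level-order complete tree (assumed layout)."""
    return sum(n_ary ** depth for depth in range(height))


def extract(bits: List[int], n_ary: int, height: int) -> List[int]:
    """One message bit per leaf: the parity of the bits on its path to the root."""
    message = []
    for leaf in range(first_leaf(n_ary, height), len(bits)):
        parity, node = bits[leaf], leaf
        while node != 0:
            node = (node - 1) // n_ary  # parent in level-order numbering
            parity ^= bits[node]
        message.append(parity)
    return message


def embed(cover: List[int], message: List[int], n_ary: int, height: int) -> List[int]:
    """Return stego bits whose leaf parities equal `message`."""
    start = first_leaf(n_ary, height)
    if len(cover) != first_leaf(n_ary, height + 1) or len(message) != len(cover) - start:
        raise ValueError("cover or message size does not match the tree")
    # A leaf needs toggling when its current parity differs from the message bit.
    toggle = [0] * len(cover)
    for k, (have, want) in enumerate(zip(extract(cover, n_ary, height), message)):
        toggle[start + k] = have ^ want
    # Bottom-up majority vote: toggling a parent flips the parity of every leaf
    # below it, so it can replace toggling a majority of its children.
    for node in range(start - 1, -1, -1):
        children = range(n_ary * node + 1, n_ary * node + n_ary + 1)
        if 2 * sum(toggle[c] for c in children) > n_ary:
            for c in children:
                toggle[c] ^= 1
            toggle[node] = 1
    return [bit ^ t for bit, t in zip(cover, toggle)]


cover = [1, 0, 1, 1, 0, 0, 1]  # binary tree of height 2: 7 nodes, 4 leaves
stego = embed(cover, [1, 0, 1, 0], n_ary=2, height=2)
print(stego, extract(stego, 2, 2))
```

In this example all four leaf parities differ from the message, and the vote collapses the four leaf toggles into a single change at the root. Every vote step leaves the parity of each root-to-leaf path unchanged and never increases the number of toggled nodes, so the result is never worse than flipping the mismatching leaves directly.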

1.4.1 ADVANTAGES OF PROPOSED SYSTEM

In this method we effectively construct the stego object with least distortion under the tree structure model.

This method significantly reduces the number of modifications on average.

CHAPTER 2

PROJECT DESCRIPTION

2.1 GENERAL

In this project, the TBPC method can be formulated as a matrix embedding method, but it is more efficient than those based on linear codes. Due to its simplicity, the TBPC method provides very efficient embedding and extraction algorithms. A systematic method exists to generate codes with an arbitrarily small relative payload from any code with a large relative payload. Since our method works naturally with large relative payloads, this implies that our method applies to small relative payloads as well.

2.1.1 STEGANOGRAPHY

Steganography is the art and science of hiding messages. Steganography and cryptology are similar in that both are used to protect important information. The difference between the two is that steganography involves hiding information so that it appears that no information is hidden at all. If a person views the digital object that the information is hidden inside, he or she will have no idea that there is any hidden information, and therefore will not attempt to decrypt it. This is the main objective behind steganography. Steganography comes from the Greek words Steganos (Covered) and Graptos (Writing). These days the word steganography usually refers to information or a file that has been concealed inside a digital picture, video or audio file. What steganography technically does is make use of the limits of human awareness: human senses are not trained to look for files that have information hidden inside them, although there are programs available that can perform what is called steganalysis (detecting the use of steganography). The most common use of steganography is to hide a file inside another file. When information or a file is hidden inside a carrier file, the data is usually encrypted with a password.

The basic model of steganography consists of a carrier, a message and a password. The carrier, also known as the cover-object, is the object in which the message is embedded and which serves to hide the presence of the message. The message is the data that the sender wishes to keep confidential. It can be plain text, cipher text, another image, or anything that can be embedded in a bit stream, such as a copyright mark, a covert communication, or a serial number. The password, known as the stego-key, ensures that only a recipient who knows the corresponding decoding key will be able to extract the message from a cover-object. The cover-object with the secretly embedded message is then called the stego-object. Recovering the message from a stego-object requires the cover-object itself and a corresponding decoding key if a stego-key was used during the encoding process. Depending on the application, the original image may or may not be required to extract the message.

There are several suitable carriers that can serve as the cover-object:

Network protocols such as TCP, IP and UDP

Audio in digital audio formats such as wav, midi, avi, mpeg, mpi and voc

Files and disks, where files can be hidden and appended by using the slack space

Text such as null characters, much like Morse code, including html and java

Image files such as bmp, gif and jpg, which can be both color and gray-scale

In general, the information hiding process extracts redundant bits from the cover-object. The process consists of two steps.

Identification of redundant bits in a cover-object. Redundant bits are those bits that can be modified without corrupting the quality or destroying the integrity of the cover-object.

The embedding process then selects the subset of the redundant bits to be replaced with data from a secret message. The stego-object is created by replacing the selected redundant bits with message bits.

Data-hiding techniques should be capable of embedding data in a host signal with the following restrictions and features:

1. The host signal should be nonobjectionably degraded and the embedded data should be minimally perceptible. (The goal is for the data to remain hidden. As any magician will tell you, it is possible for something to be hidden while it remains in plain sight; you merely keep the person from looking at it. We will use the words hidden, inaudible, imperceivable, and invisible to mean that an observer does not notice the presence of the data, even if they are perceptible.)

2. The embedded data should be directly encoded into the media, rather than into a header or wrapper, so that the data remain intact across varying data file formats.

3. The embedded data should be immune to modifications ranging from intentional and intelligent attempts at removal to anticipated manipulations, e.g., channel noise, filtering, resampling, cropping, encoding, lossy compressing, printing and scanning, digital-to-analog (D/A) conversion, and analog-to-digital (A/D) conversion.

4. Asymmetrical coding of the embedded data is desirable, since the purpose of data hiding is to keep the data in the host signal, but not necessarily to make the data difficult to access.

5. Error correction coding should be used to ensure data integrity. It is inevitable that there will be some degradation to the embedded data when the host signal is modified.

6. The embedded data should be self-clocking or arbitrarily re-entrant. This ensures that the embedded data can be recovered when only fragments of the host signal are available, e.g., if a sound bite is extracted from an interview, data embedded in the audio segment can be recovered. This feature also facilitates automatic decoding of the hidden data, since there is no need to refer to the original host signal.

[Figure: block diagram with the labels cover signal, embedded data, stego key, stego signal, transmission, stego signal, stego key, embedded data.]

FIG 2.1: BASIC BLOCK DIAGRAM OF STEGANOGRAPHY.

2.1.2 CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES

Over the past few years, numerous steganography techniques that embed hidden messages in multimedia objects have been proposed. There have been many techniques for hiding information or messages in images in such a manner that the alterations made to the image are perceptually indiscernible. Common approaches include:

Least significant bit insertion (LSB).

Masking and filtering.

Transform techniques.

Least significant bit (LSB) insertion is a simple approach to embedding information in an image file. The simplest steganographic techniques embed the bits of the message directly into the least significant bit plane of the cover-image in a deterministic sequence. Modulating the least significant bit does not result in a human-perceptible difference because the amplitude of the change is small.
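The LSB insertion described above can be sketched as follows. This is a minimal illustration, assuming 8-bit pixel values in a flat list and a message already given as a list of bits, embedded in the first pixels in order (the "deterministic sequence"). The function names are illustrative.

```python
from typing import List


def embed_lsb(pixels: List[int], bits: List[int]) -> List[int]:
    """Replace the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract_lsb(pixels: List[int], count: int) -> List[int]:
    """Read back the first `count` least significant bits."""
    return [p & 1 for p in pixels[:count]]


cover = [200, 13, 77, 254]
stego = embed_lsb(cover, [1, 0, 0, 1])
print(stego)                  # [201, 12, 76, 255]
print(extract_lsb(stego, 4))  # [1, 0, 0, 1]
```

Each pixel value changes by at most 1, which is why the modification is not perceptible to a human viewer.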

Masking and filtering techniques, usually restricted to 24-bit and gray-scale images, hide information by marking an image in a manner similar to paper watermarks. The techniques perform analysis of the image and embed the information in significant areas, so that the hidden message is more integral to the cover image than if it were just hidden in the noise level.

Transform techniques embed the message by modulating coefficients in a transform domain, such as the Discrete Cosine Transform (DCT) used in JPEG compression, the Discrete Fourier Transform, or the Wavelet Transform. These methods hide messages in significant areas of the cover-image, which makes them more robust to attack. Transformations can be applied over the entire image, to blocks throughout the image, or in other variants.

There are several approaches to classifying steganographic systems. One could categorize them according to the type of covers used for secret communication or according to the cover modifications applied in the embedding process. The second approach will be followed in this section, and the steganographic methods are grouped in six categories, although in some cases an exact classification is not possible. Figure 2.2 presents the steganography classification.

FIG 2.2: CLASSIFICATION OF STEGANOGRAPHY TECHNIQUES.

The main goal of steganography is to communicate securely in a completely undetectable manner and to avoid drawing suspicion to the transmission of hidden data. The aim is not to keep others from knowing the hidden information, but to keep others from thinking that the information even exists. If a steganography method causes someone to suspect the carrier medium, then the method has failed.

Until recently, information hiding techniques received very much less attention from the research community and from industry than cryptography. This situation is, however, changing rapidly and the first academic conference on this topic was organized in 1996. There has been a rapid growth of interest in steganography for two main reasons:

The publishing and broadcasting industries have become interested in techniques for hiding encrypted copyright marks and serial numbers in digital films, audio recordings, books and multimedia products.

Moves by various governments to restrict the availability of encryption services have motivated people to study methods by which private messages can be embedded in seemingly innocuous cover messages.

STEGANOGRAPHY VS CRYPTOGRAPHY

Basically, the purpose of both cryptography and steganography is to provide secret communication. However, steganography is not the same as cryptography. Cryptography hides the contents of a secret message from malicious people, whereas steganography conceals even the existence of the message. Steganography must not be confused with cryptography, where we transform the message so as to make its meaning obscure to a malicious person who intercepts it. Therefore, the definition of breaking the system is different. In cryptography, the system is broken when the attacker can read the secret message. Breaking a steganographic system requires the attacker to detect that steganography has been used and to be able to read the embedded message.

In cryptography, the structure of a message is scrambled to make it meaningless and unintelligible unless the decryption key is available. It makes no attempt to disguise or hide the encoded message. Basically, cryptography offers the ability to transmit information between persons in a way that prevents a third party from reading it. Cryptography can also provide authentication for verifying the identity of someone or something.

In contrast, steganography does not alter the structure of the secret message, but hides it inside a cover-image so that it cannot be seen. A message in cipher text, for instance, might arouse suspicion on the part of the recipient, while an invisible message created with steganographic methods will not. In other words, steganography prevents an unintended recipient from suspecting that the data exists. In addition, the security of a classical steganography system relies on the secrecy of the data encoding system. Once the encoding system is known, the steganography system is defeated.

It is possible to combine the techniques by encrypting the message using cryptography and then hiding the encrypted message using steganography. The resulting stego-image can be transmitted without revealing that secret information is being exchanged. Furthermore, even if an attacker were to defeat the steganographic technique and detect the message from the stego-object, he would still require the cryptographic decoding key to decipher the encrypted message. Table 1 shows that both technologies have their own advantages and disadvantages.

STEGANOGRAPHY

Unknown message passing.

Little known technology.

Technology still being developed for certain formats.

Once detected, message is known.

Many carrier formats.

CRYPTOGRAPHY

Known message passing.

Common technology.

Most algorithms known to government departments.

Strong algorithms are currently resistant to brute force attack.

Large expensive computing power required for cracking.

Technology increase reduces strength.
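The combination of the two techniques discussed before the table, encrypting a message first and then hiding the ciphertext, can be sketched as below. The sketch assumes LSB embedding as the hiding step and uses a SHA-256 counter keystream as a stand-in cipher purely for illustration. It is not a vetted encryption scheme, and all function names are hypothetical.

```python
import hashlib
from typing import List


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; applying it twice restores the data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def to_bits(data: bytes) -> List[int]:
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def from_bits(bits: List[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def hide(pixels: List[int], message: bytes, key: bytes) -> List[int]:
    """Encrypt first, then write the ciphertext bits into the pixel LSBs."""
    bits = to_bits(xor_cipher(message, key))
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    return [(p & ~1) | b for p, b in zip(pixels, bits)] + list(pixels[len(bits):])


def reveal(pixels: List[int], n_bytes: int, key: bytes) -> bytes:
    """Read the LSBs back and decrypt them with the same key."""
    bits = [p & 1 for p in pixels[:8 * n_bytes]]
    return xor_cipher(from_bits(bits), key)


cover = list(range(64))
stego = hide(cover, b"secret", b"stego-key")
print(reveal(stego, 6, b"stego-key"))
```

An attacker who detects and reads the embedded bits still obtains only ciphertext unless the key is known.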

STEGANOGRAPHY APPLICATIONS

There are many applications for digital steganography of images, including copyright protection, feature tagging, and secret communication. A copyright notice or watermark can be embedded inside an image to identify it as intellectual property. If someone attempts to use this image without permission, we can prove ownership by extracting the watermark.

In feature tagging, captions, annotations, time stamps, and other descriptive elements can be embedded inside an image. Copying the stego-image also copies all of the embedded features, and only parties who possess the decoding stego-key will be able to extract and view the features. Secret communication, on the other hand, uses steganography so that a covert communication is not advertised. It can therefore avoid scrutiny of the sender, message and recipient. This is effective only if the hidden communication is not detected by other people.

WATERMARKING

Digital watermarking, an extension of steganography, is a promising solution for content copyright protection in the global network. It imposes extra robustness on embedded information. Digital watermarking is the science of embedding copyright information in the original files. The embedded information is called a watermark. Digital watermarking does not leave a noticeable mark on the content and does not affect its appreciation. Watermarks are imperceptible and detected only by proper authorities. Digital watermarks are difficult to remove without noticeably degrading the content and are a covert means in situations where cryptography fails to provide robustness. The content is watermarked by converting copyright information into random digital noise using a special algorithm that is perceptible only to the creator. Watermarks are resistant to filtering and stay with the content as long as the original has not been purposely damaged.

HISTORY ABOUT WATERMARKING

The distribution of works of art, including pictures, music, video and textual documents, has become easier. With the widespread and increasing use of the Internet, digital forms of these media (still images, audio, video, text) are easily accessible. This is clearly advantageous, in that it is easier to market and sell one's works of art. However, this same property threatens copyright protection. Digital documents are easy to copy and distribute, allowing for pirating. There are a number of methods for protecting ownership. One of these is known as digital watermarking. Digital watermarking is the process of inserting a digital signal or pattern (indicative of the owner of the content) into digital content. The signal, known as a watermark, can be used later to identify the owner of the work, to authenticate the content, and to trace illegal copies of the work. Watermarks of varying degrees of obtrusiveness are added to presentation media as a guarantee of authenticity, quality, ownership, and source. To be effective in its purpose, a watermark should adhere to a few requirements. In particular, it should be robust, and transparent. Robustness requires that it be able to survive any alterations or distortions that the watermarked content may undergo, including intentional attacks to remove the watermark, and common signal processing alterations used to make the data more efficient to store and transmit. This is so that afterwards, the owner can still be identified. Transparency requires a watermark to be imperceptible so that it does not affect the quality of the content, and makes detection, and therefore removal, by pirates less possible. The media of focus in this paper is the still image. There are a variety of image watermarking techniques, falling into 2 main categories, depending on in which domain the watermark is constructed: the spatial domain (producing spatial watermarks) and the frequency domain (producing spectral watermarks). 
The effectiveness of a watermark is improved when the technique exploits known properties of the human visual system. These are known as perceptually based watermarking techniques. Within this category, the class of image-adaptive watermarks proves most effective. In conclusion, image watermarking techniques that take advantage of properties of the human visual system, and the characteristics of the image create the most robust and transparent watermarks. Digital watermarking is a technology for embedding various types of information in digital content. In general, information for protecting copyrights and proving the validity of data is embedded as a watermark. A digital watermark is a digital signal or pattern inserted into digital content. The digital content could be a still image, an audio clip, a video clip, a text document, or some form of digital data that the creator or owner would like to protect. The main purpose of the watermark is to identify who the owner of the digital data is, but it can also identify the intended recipient. Why do we need to embed such information in digital content using digital watermark technology? The Internet boom is one of the reasons. It has become easy to connect to the Internet from home computers and obtain or provide various information using the World Wide Web. All the information handled on the Internet is provided as digital content. Such digital content can be easily copied in a way that makes the new file indistinguishable from the original. Then the content can be reproduced in large quantities.

For example, if paper bank notes or stock certificates could be easily copied and used, trust in their authenticity would greatly be reduced, resulting in a big loss. To prevent this, currencies and stock certificates contain watermarks. These watermarks are one of the methods for preventing counterfeit and illegal use.

Digital watermarks apply a similar method to digital content. Watermarked content can prove its origin, thereby protecting copyright. A watermark also discourages piracy by silently and psychologically deterring criminals from making illegal copies.

2.2 PRINCIPLE OF DIGITAL WATERMARKS

A watermark on a bank note has a different transparency than the rest of the note when a light is shined on it. However, this method is useless in the digital world. Currently there are various techniques for embedding digital watermarks. Basically, they all digitally write the desired information directly onto images or audio data in such a manner that the images or audio data are not damaged. Embedding a watermark should not result in a significant increase or reduction in the size of the original data. Digital watermarks are added to images or audio data in such a way that they are invisible or inaudible, and unidentifiable by the human eye or ear. Furthermore, they can be embedded in content with a variety of file formats. Digital watermarking is the content protection method for the multimedia era.

Materials suitable for watermarking

Digital watermarking is applicable to any type of digital content, including still images, animation, and audio data. It is easy to embed watermarks in material that has a comparatively high level of redundancy ("wasted" data), such as color still images, animation, and audio data; however, it is difficult to embed watermarks in material with a low redundancy level, such as black-and-white still images. To solve this problem, we developed a technique for embedding digital watermarks in black-and-white still images and a software application that can effectively embed and detect digital watermarks.

2.3 STRUCTURE OF A DIGITAL WATERMARK

The material that contains a digital watermark is called a carrier. A digital watermark is not provided as a separate file or a link. It is information that is directly embedded in the carrier file. Therefore, the digital watermark cannot be identified simply by viewing the carrier image that contains it. Special software is needed to embed and detect such digital watermarks. Kowa's SteganoSign is one of these software packages. Both images and audio data can carry watermarks. A digital watermark can be detected as shown in the following illustration.

2.4 THE IMPORTANCE OF DIGITAL WATERMARKS

The Internet has provided worldwide publishing opportunities to creators of various works, including writers, photographers, musicians and artists. However, these same opportunities provide ease of access to these works, which has resulted in pirating. It is easy to duplicate audio and visual files, and it is therefore probable that duplication on the Internet occurs without the rightful owner's permission. An example of an area where copyright protection needs to be enforced is the on-line music industry.

Digital watermarking is being recognized as a way of improving this situation. RIAA reports that "record labels see watermarking as a crucial piece of the copy protection system, whether their music is released over the Internet or on DVD-Audio". They are of the opinion that any encryption system can be broken, sooner or later, and that digital watermarking is needed to indicate who the culprit is.

Another scenario in which the enforcement of copyright is needed is newsgathering. When digital cameras are used to capture an event, the images must be watermarked as they are captured, so that the image's origin and content can be verified later. This suggests that there are many applications that could require image watermarking, including Internet imaging, digital libraries, digital cameras, medical imaging, image and video databases, surveillance imaging, video-on-demand systems, and satellite-delivered video.

2.5 THE PURPOSES OF DIGITAL WATERMARKS

Watermarks are a way of dealing with the problems mentioned above by providing a number of services:

1. They aim to mark digital data permanently and unalterably, so that the source as well as the intended recipient of the digital work is known. Copyright owners can incorporate identifying information into their work. That is, watermarks are used in the protection of ownership. The presence of a watermark in a work suspected of having been copied can prove that it has been copied.

2. By indicating the owner of the work, they demonstrate the quality and assure the authenticity of the work.

3. With a tracking service, owners are able to find illegal copies of their work on the Internet. In addition, because each purchaser of the data has a unique watermark embedded in his/her copy, any unauthorized copies that s/he has distributed can be traced back to him/her.

4. Watermarks can be used to identify any changes that have been made to the watermarked data.

5. Some more recent techniques are able to correct the alteration as well.

2.6 DIGITAL WATERMARK TYPES AND TERMS

Watermarks can be visible or invisible:

a. Visible watermarks are designed to be easily perceived by a viewer (or listener). They clearly identify the owner of the digital data, but should not detract from the content of the data.

b. Invisible watermarks are designed to be imperceptible under normal viewing (or listening) conditions; more of the current research focuses on this type of watermark than on the visible type.

Both of these types of watermarks are useful in deterring theft, but they achieve this in different ways. Visible watermarks give an immediate indication of who the owner of the digital work is, and data watermarked with visible watermarks are of less use to a potential pirate (because the watermark is visible). Invisible watermarks, on the other hand, increase the likelihood of prosecution after the theft has occurred. These watermarks should therefore not be detectable by thieves, who would otherwise try to remove them; however, they should be easily detectable by the owners.

A further classification of watermarks is into fragile, semi-fragile or robust:

a. A fragile watermark is embedded in digital data for the purpose of detecting any changes that have been made to the content of the data. Fragile watermarks achieve this because they are distorted, or "broken", easily. They are applicable in image authentication systems.

b. Semi-fragile watermarks detect any changes above a user-specified threshold.

c. Robust watermarks are designed to survive "moderate to severe signal processing attacks".

Watermarks for images can further be classified into spatial or spectrum watermarks, depending on how they are constructed:

a. Spatial watermarks are created in the spatial domain of the image, and are embedded directly into the pixels of the image. These usually produce images of high quality, but are not robust to the common image alterations.

b. Spectral (or transform-based) watermarks are incorporated into the image's transform coefficients. The inverse-transformed coefficients form the watermarked data.

Perceptual watermarks are invisible watermarks constructed from techniques that use models of the human visual system to adapt the strength of the watermark to the image content. The most effective of these watermarks are known as image-adaptive watermarks.

Finally, blind watermarking techniques are techniques that are able to detect the watermark in a watermarked digital item without use of the original digital item.

2.7 EFFECTIVE DIGITAL WATERMARKS

Features of a Good Watermark

The following are features of a good watermark:

1. It should be difficult or impossible to remove a digital watermark without noticeably degrading the watermarked content. This is to ensure that the copyright information cannot be removed.

2. The watermark should be robust. This means that it should remain in the content after various types of manipulations, both intentional (known as attacks on the watermark) and unintentional (alterations that the digital data item would undergo regardless of whether it contains a watermark or not). These are described below. If the watermark is a fragile watermark, however, it should not remain in the digital data after attacks on it, but should be able to survive certain other alterations (as in the case of images, where it should be able to survive the common image alteration of cropping).

3. The watermark should be perceptually invisible, or transparent. That is, it should be imperceptible (if it is of the invisible type). Embedding the watermark signal in the digital data produces alterations, and these should not degrade the perceived quality of the data. Larger alterations are more robust, and are easier to detect with certainty, but result in greater degradation of the data.

4. It should be easy for the owner or a proper authority to readily detect the watermark. "Such decodability without requiring the original, unwatermarked image would be necessary for efficient recovery of property and subsequent prosecution".

Further properties that enhance the effectiveness of a watermarking technique, but which are not requirements, are:

5. Hybrid watermarking refers to the embedding of a number of different watermarks in the same digital carrier signal. Hybrid watermarking allows intellectual property rights (IPR) protection, data authentication and data item tracing all in one go.

6. Watermark key: it is beneficial to have a key associated with each watermark that can be used in the production, embedding, and detection of the watermark. It should be a private key, because then, even if the algorithms to produce, embed and detect the watermark are publicly known, it is difficult to know what the watermark signal is without the key. The key indicates the owner of the data.

It is of interest to identify the properties of a digital data item (the carrier signal) that assist in watermarking:

1. It should have a high level of redundancy. This is so that it can carry a more robust watermark without the watermark being noticed. (A more robust watermark usually requires a larger number of alterations to the carrier signal.)

2. It must tolerate at least small, well-defined modifications without changing its semantics.

2.8 PROBLEM DEFINITION

In the existing system, the TBPC method does not achieve a good PSNR for the stego image. Reducing distortion between the cover object and the stego object is an important issue for steganography. The tree-based parity check method is very efficient for hiding a message in an image, but it leaves room for improvement in distortion, embedding capacity and time complexity.

2.9 METHODOLOGIES

2.9.1 MODULE NAMES

1. Location finding method and TBPC.

2. Majority vote strategy.

3. Average modifications per hidden bit.

4. Time complexity of MPC.

5. Comparison for large payloads.

MODULE 1: TREE AND GRAPH

The tree data structure can be generalized to represent directed graphs by removing the constraints that a node may have at most one parent, and that no cycles are allowed. Edges are still abstractly considered as pairs of nodes; however, the terms parent and child are usually replaced by different terminology (for example, source and target). Different implementation strategies exist, for example adjacency lists.

In graph theory, a tree is a connected acyclic graph; unless stated otherwise, trees and graphs are undirected. There is no one-to-one correspondence between such trees and trees as data structures. We can take an arbitrary undirected tree, arbitrarily pick one of its vertices as the root, make all its edges directed by making them point away from the root node (producing an arborescence), and assign an order to all the nodes. The result corresponds to a tree data structure. Picking a different root or a different ordering produces a different one.

A tree structure is a way of representing the hierarchical nature of a structure in a graphical form. It is named a "tree structure" because the classic representation resembles a tree, even though the chart is generally upside down compared to an actual tree, with the "root" at the top and the "leaves" at the bottom. A tree structure is conceptual, and appears in several forms.

Nomenclature and properties

Every finite tree structure has a member that has no superior. This member is called the "root" or root node. It can be thought of as the starting node. The converse is not true: infinite tree structures may or may not have a root node. The lines connecting elements are called "branches", and the elements themselves are called "nodes". Nodes without children are called leaf nodes, "end-nodes", or "leaves". The names of relationships between nodes are modeled after family relations. The gender-neutral names "parent" and "child" have largely displaced the older "father" and "son" terminology, although the term "uncle" is still used for other nodes at the same level as the parent.

A node's "parent" is a node one step higher in the hierarchy (i.e. closer to the root node) and lying on the same branch. "Sibling" ("brother" or "sister") nodes share the same parent node.

A node's "uncles" are siblings of that node's parent.

A node that is connected to all lower-level nodes is called an "ancestor".

In the example, "encyclopedia" is the parent of "science" and "culture", its children. "Art" and "craft" are siblings, and children of "culture", which is their parent and thus one of their ancestors. Also, "encyclopedia", being the root of the tree, is the ancestor of "science", "culture", "art" and "craft". Finally, "science", "art" and "craft", being leaves, are ancestors of no other node. In a tree structure there is one and only one path from any point to any other point. Tree structures are used extensively in computer science.

In this module, the tree is used as follows. Determine the embeddable sites in the image, i.e. the high frequency regions of the image. Construct the master tree from the LSBs at the determined sites, from top to bottom and left to right. To find out the information held by a leaf node, travel from the leaf node to the root of the master tree.
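The family-style relations described above can be illustrated with a small sketch. It assumes the encyclopedia example is stored as a child-to-parent map; the names `PARENT`, `children`, `siblings`, `uncles`, `ancestors` and `leaves` are hypothetical helpers introduced only for this illustration.

```python
# Hypothetical child -> parent map for the encyclopedia example in the text.
PARENT = {
    "science": "encyclopedia",
    "culture": "encyclopedia",
    "art": "culture",
    "craft": "culture",
}


def children(node):
    """Nodes one step lower in the hierarchy, on the same branch."""
    return [child for child, parent in PARENT.items() if parent == node]


def siblings(node):
    """Other nodes that share this node's parent."""
    parent = PARENT.get(node)
    return [n for n in children(parent) if n != node] if parent else []


def uncles(node):
    """Siblings of this node's parent."""
    parent = PARENT.get(node)
    return siblings(parent) if parent else []


def ancestors(node):
    """All nodes on the unique path from this node up to the root."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def leaves():
    """Nodes without children."""
    nodes = set(PARENT) | set(PARENT.values())
    return sorted(n for n in nodes if not children(n))
```

Because every node has exactly one parent, walking upward from any node always ends at the root, which is the property the master tree relies on when a leaf's path is followed back to the root.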

Fig 2.9: TREE FORMATION

Tree Formation

In most data hiding algorithms, after finding the embeddable sites of the image, the value of these locations can be classified as either '0' or '1'. They are compared with the logo in the immediately next step. If the value is the same as the to-be-embedded bit, no operation is needed. Otherwise, some distortion creating processes are carried out to toggle the value.

In TBPC, an N-ary complete tree, called the Master Tree, is filled with the values of these embeddable locations. Every node of an N-ary complete tree except the leaf nodes has N child nodes. In the proposed algorithm, one leaf node is needed to hold one information bit. To embed an L-bit logo, L leaves are required in the Master Tree.

Parity Calculation

A parity bit is a bit that is added to ensure that the number of bits with the value one in a set of bits is even or odd. Parity bits are used as the simplest form of error detecting code.

There are two variants of parity bits: the even parity bit and the odd parity bit. When using even parity, the parity bit is set to 1 if the number of ones in a given set of bits (not including the parity bit) is odd, making the number of ones in the entire set of bits (including the parity bit) even. If the number of on-bits is already even, it is set to 0. When using odd parity, the parity bit is set to 1 if the number of ones in a given set of bits (not including the parity bit) is even, keeping the number of ones in the entire set of bits (including the parity bit) odd. When the number of set bits is already odd, the odd parity bit is set to 0. In other words, an even parity bit will be set to "1" if the number of 1's + 1 is even, and an odd parity bit will be set to "1" if the number of 1's + 1 is odd.

Even parity is a special case of a cyclic redundancy check (CRC), where the 1-bit CRC is generated by the polynomial x+1. If the parity bit is present but not used, it may be referred to as mark parity (when the parity bit is always 1) or space parity (the bit is always 0).

To find out the information held by a leaf node, we travel from the leaf node to the root of the Master Tree. If the number of 1s encountered is odd, the information bit of the leaf node is 1. Otherwise, the information bit is 0.
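The leaf-to-root parity rule above can be sketched as follows. This is a minimal illustration, assuming the Master Tree is stored as a flat list filled level by level (top to bottom, left to right), so that the parent of position k is (k - 1) // n; the function name `master_string` and the sample bits are hypothetical.

```python
def master_string(lsbs, n, height):
    """Return the parity of every root-to-leaf path, leaves left to right.

    lsbs   -- node bits of a complete n-ary tree, stored level by level
    n      -- number of children per internal node (assumed n >= 2)
    height -- number of edges from the root to a leaf
    """
    total = (n ** (height + 1) - 1) // (n - 1)
    if len(lsbs) != total:
        raise ValueError("tree of this shape needs %d bits" % total)

    # parity[k] is the XOR of all bits on the path from the root to node k.
    parity = list(lsbs)
    for k in range(1, total):
        parity[k] ^= parity[(k - 1) // n]

    # The last n**height positions are the leaves.
    return parity[total - n ** height:]


# Binary tree of height 2: root, two internal nodes, four leaves.
example_bits = [1, 0, 1, 1, 0, 0, 1]
example_master = master_string(example_bits, n=2, height=2)
```

For the sample tree, the leftmost leaf sees the bits 1, 0, 1 on its path (two 1s, so information bit 0), while the rightmost leaf sees 1, 1, 1 (three 1s, so information bit 1).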

Fig 2.9.1: MASTER TREE FORMATION

MODULE 2

Before embedding and extraction, a location finding method determines a sequence of locations that point to elements in the cover object. The embedding algorithm modifies the elements in these locations to hide the message, and the extraction algorithm can recover the message by inspecting the same sequence of locations. The TBPC method is a least significant bit (LSB) steganographic method. Only the LSBs of the elements pointed to by the determined locations are used for embedding and extraction.

The TBPC method constructs a complete N-ary tree, called the master tree, to represent the LSBs of the cover object. It fills the nodes of the master tree with the LSBs of the cover object level by level, from top to bottom and left to right. Every node of the tree corresponds to an LSB in the cover object. Denote the number of leaves of the master tree by L. The TBPC embedding algorithm derives an L-bit binary string, called the master string, by performing parity check on the master tree from the root to the leaves.

The embedding algorithm hides the message by modifying the bit values of some nodes in the master tree. Assume that the length of the message is also L. Performing the bitwise exclusive-or (XOR) operation between the message and the master string, we obtain a toggle string (e.g., see Fig. 1). Then, the embedding algorithm constructs a new complete N-ary tree, called the toggle tree, in the bottom-up order and fills the leaves with the bit values of the toggle string and the other nodes with 0. Then, level by level, from the bottom to the root, each nonleaf node together with its child nodes is flipped if all its child nodes have bits 1 (e.g., see Fig. 2). The embedding algorithm obtains the stego tree by performing XOR between the master tree and the toggle tree (e.g., see Fig. 3).

The TBPC extraction algorithm is simple. We can extract the message by performing parity check on each root-leaf path of the stego tree from left to right.

In the MPC method, we construct the toggle tree with the minimum number of 1s level by level in the bottom-up order using the following algorithm.

Algorithm MPC:

Input: a toggle string of length L;

1. Index the nodes of the initial toggle tree;

2. Set the leaves of the toggle tree from left to right and bit by bit with the toggle string and the other nodes 0;

3. for i = 1 to h
      for each internal node on level i do
         if the majority of its unmarked child nodes hold 1
            then flip the bit values of this node and its child nodes;
         else if the numbers of 0 and 1 in its unmarked child nodes are the same
            then mark this internal node;

4. if N is even then
      for i = h-1 down to 1
         for each marked internal node holding 1 on level i do
            flip the bit values of this node and its child nodes;

Index all nodes of a complete N-ary tree with L leaves from top to bottom and left to right. Set the L-bit toggle string bit by bit into the L leaves from left to right and the other nodes to 0. Assume that the level of the tree is h. Traverse all nonleaf nodes from level 1 to h. A nonleaf node and its child nodes form a simple complete subtree. For each simple complete subtree, if the majority of the child nodes hold 1, then flip the bit values of all nodes in this subtree. Since the construction is bottom-up, the bit values of the child nodes in every simple complete subtree are set after step 3.

Note that the marking of nodes, which is used at step 4, applies only when N is even. When N is even, after step 3, there may exist a two-level simple complete subtree with N/2 1s in the child nodes and 1 in its root. In this case, flipping the bit values in this simple complete subtree results in one fewer node holding 1 and keeps the result of the related root-leaf path parity check unchanged. Step 4 takes care of this when the condition applies, and it is done level by level from top to bottom. Also note that for the root of the whole toggle tree, the bit value is always 0 when half of its child nodes hold 1. Thus, after step 4, the bit values of the child nodes in each simple complete subtree are determined. The number of 1s in the toggle tree is the number of modifications.

When constructing the toggle tree, the original TBPC method flips a simple complete subtree only if all of its child nodes have 1. We prove that the majority vote strategy actually obtains toggle trees with the least number of 1s. We call a toggle tree with the least number of 1s corresponding to a toggle string an optimal toggle tree. We say that a toggle tree is in majority form if, for each internal node, at least half of its child nodes have bit value 0 and the internal node holds 0 when exactly half of its child nodes hold 1.
The output of the algorithm is a toggle tree in majority form. The majority vote guarantees that at least half of the child nodes of an internal node hold 0. Note that every optimal toggle tree can be transformed into majority form. It is obvious when N is even. When N is odd, we can check each two-level simple complete subtree level by level in the top-down order and flip the bit values of the root node and its N child nodes if exactly (N+1)/2 of the child nodes hold 1. Note that, when this situation applies, the root node must hold 0 before flipping; otherwise the toggle tree is not optimal. This rearrangement does not introduce an extra 1, and the result of each root-leaf path parity check is not affected.

Fig: TOGGLE TREE (STEP 1, STEP 2)
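The embedding and extraction steps described in this module can be sketched end to end. This is a simplified illustration, not the full Algorithm MPC: it assumes N is odd, so ties never occur and the marking used in steps 3 and 4 is not needed. The tree is assumed to be stored as a flat list, level by level, and the names `path_parities`, `build_toggle_tree` and `embed` are hypothetical.

```python
def path_parities(tree, n, height):
    """XOR of the bits on each root-to-leaf path, leaves left to right."""
    acc = list(tree)
    for k in range(1, len(tree)):
        acc[k] ^= acc[(k - 1) // n]  # parent of node k is (k - 1) // n
    return acc[len(tree) - n ** height:]


def build_toggle_tree(toggle, n, height, rule="mpc"):
    """Bottom-up construction; 'tbpc' flips on all 1s, 'mpc' on a majority."""
    if rule == "mpc" and n % 2 == 0:
        raise ValueError("this sketch handles the majority rule for odd n only")
    if len(toggle) != n ** height:
        raise ValueError("toggle string must have one bit per leaf")
    internal = (n ** height - 1) // (n - 1)  # number of nonleaf nodes
    tree = [0] * internal + list(toggle)
    # Reverse index order visits every child before its parent.
    for node in range(internal - 1, -1, -1):
        kids = range(n * node + 1, n * node + n + 1)
        ones = sum(tree[k] for k in kids)
        flip = (2 * ones > n) if rule == "mpc" else (ones == n)
        if flip:
            for k in (node, *kids):
                tree[k] ^= 1
    return tree


def embed(master_tree, message, n, height, rule="mpc"):
    """Return the stego tree and the number of modified LSBs."""
    master = path_parities(master_tree, n, height)
    toggle = [m ^ s for m, s in zip(message, master)]
    toggle_tree = build_toggle_tree(toggle, n, height, rule)
    stego = [a ^ b for a, b in zip(master_tree, toggle_tree)]
    return stego, sum(toggle_tree)


# Ternary tree of height 2 (13 nodes, 9 leaves); the bits are made up.
cover = [1] + [0] * 12
secret = [0, 0, 0, 0, 0, 1, 1, 0, 0]
stego_mpc, changes_mpc = embed(cover, secret, 3, 2, "mpc")
stego_tbpc, changes_tbpc = embed(cover, secret, 3, 2, "tbpc")
```

Flipping a node together with all of its children changes exactly two bits on every root-leaf path through that node, which is why the path parities, and therefore the extracted message, are unaffected. In this sample the all-1s rule changes five LSBs while the majority rule changes three.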

MODULE 3

We construct a method that achieves the expected embedding modifications per hidden bit of 0.5. In other words, if we try to embed an L-bit message into the cover object, 0.5L modifications will occur on average.
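The 0.5 figure above, and the effect of flipping subtrees on it, can be checked on small trees by exhaustive enumeration. The sketch below is an illustration under assumptions: it averages the number of 1s in the toggle tree over all toggle strings, handles the majority rule for odd N only (no ties), and uses the hypothetical names `ones_in_toggle_tree` and `p_toggle`.

```python
from itertools import product


def ones_in_toggle_tree(toggle, n, height, majority):
    """Number of 1s left in the toggle tree after bottom-up flipping."""
    internal = (n ** height - 1) // (n - 1)
    tree = [0] * internal + list(toggle)
    for node in range(internal - 1, -1, -1):  # children before parents
        kids = range(n * node + 1, n * node + n + 1)
        ones = sum(tree[k] for k in kids)
        flip = (2 * ones > n) if majority else (ones == n)
        if flip:
            for k in (node, *kids):
                tree[k] ^= 1
    return sum(tree)


def p_toggle(n, height, majority):
    """Average modifications per hidden bit over all 2**L toggle strings."""
    if majority and n % 2 == 0:
        raise ValueError("this sketch handles the majority rule for odd n only")
    leaves = n ** height
    total = sum(
        ones_in_toggle_tree(bits, n, height, majority)
        for bits in product((0, 1), repeat=leaves)
    )
    return total / (2 ** leaves * leaves)


mpc_small = p_toggle(3, 2, majority=True)    # 9 leaves, majority vote
tbpc_small = p_toggle(3, 2, majority=False)  # 9 leaves, flip on all 1s
```

For a ternary tree with nine leaves this enumeration gives 7/18 (about 0.389) for the majority vote and about 0.416 for the all-1s rule, both below the 0.5 obtained when every toggle bit is applied directly. The enumeration grows as 2 to the power L, so it is only practical for very small trees.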

We use pToggle to denote the expected embedding modifications per hidden bit, that is, the average number of embedding modifications for an L-bit message divided by L. The MPC method performs majority vote on every simple complete subtree to construct the toggle tree in the bottom-up order. Therefore, we are going to calculate the expected reduced number of 1s for every simple complete subtree and sum up the expected reduced number of 1s for all simple complete subtrees.

For convenience, we use i-level tree to denote a complete N-ary tree of i levels. An i-level tree consists of one root and N (i-1)-level trees. An i-level simple complete subtree is a two-level tree containing a node v at level i and all its child nodes. For an h-level toggle tree, the level of the root is h and the level of a leaf is 0. Let $p_i$ be the probability that the root of an i-level simple complete subtree holds 1 after performing majority vote. For the leaf nodes, $p_0$ is 1/2 because the leaf nodes are uniformly filled with 0 or 1. For every i-level simple complete subtree, $p_i$ is the same by symmetry. Since the toggle tree is an N-ary complete tree constructed by the majority vote strategy, $p_i$ can be expressed as follows:

Let be the reduced number of 1s after flipping the bit values of a simple complete subtree that holds t 1s. Therefore, The expected reduced number of 1s for an i-level simple complete subtree is as follows:

For an L-bit toggle string, the expected number of 1s in the toggle string is 0.5L. In the first step for the toggle tree construction, we fill each leaf with one bit of the toggle string. Before majority vote, the number of 1s in the toggle tree is 0.5L. After majority vote, the number of 1s in the toggle tree is . Since the number of modifications is the number of 1s in the toggle tree, we finally have the following equation: The expected reduced number of 1s for an -level simple complete subtree is as follows:

If N=2K+1 is an odd integer, (3) can be further simplified as

Since

The pToggle of the TBPC method is

where $L$ is the number of leaves and $2^{N^i}$ is the number of possible 0/1 configurations in the leaves of an i-level tree.

MODULE 4

For embedding with the MPC method, the construction of an L-bit master string from a master tree is to perform parity check on L simple root-leaf paths. The number of parity check operations for each simple root-leaf path is the number of edges in this path. Since we perform parity check once for every edge, the total number of parity check operations is the number of edges in the master tree. Since the number of nodes in the master tree is

$$\sum_{i=0}^{h} N^{i} = \frac{NL - 1}{N - 1}, \qquad L = N^{h},$$

the time complexity to obtain a master string is $O(L)$. The time complexity to obtain the toggle string is $O(L)$, since the toggle string is derived by performing bitwise exclusive-or between the L-bit message and the L-bit master string. Thus, the total time complexity of the embedding algorithm is $O(L)$. For the extraction algorithm, we perform parity check on L simple root-leaf paths in the stego tree. Thus, the complexity of the extraction algorithm is also $O(L)$.
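The counting argument above can be made concrete with a short sketch that performs one XOR per edge and reports how many were needed. It assumes the same flat, level-by-level storage of the tree as a list; the function name `master_string_with_cost` is hypothetical.

```python
def master_string_with_cost(lsbs, n, height):
    """Return (master string, number of XOR operations performed)."""
    acc = list(lsbs)
    xor_count = 0
    for k in range(1, len(lsbs)):
        acc[k] ^= acc[(k - 1) // n]  # one parity operation per tree edge
        xor_count += 1
    leaves = n ** height
    return acc[len(lsbs) - leaves:], xor_count


n, height = 3, 2
leaf_count = n ** height                       # L = N^h
node_count = (n * leaf_count - 1) // (n - 1)   # (NL - 1) / (N - 1)
_, xor_ops = master_string_with_cost([0] * node_count, n, height)
```

Since every node except the root contributes exactly one edge, the number of operations is the node count minus one, which stays below 2L for any N of at least 2. This is consistent with the linear bound stated in the text.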

MODULE 5

Embedding messages in a steganographic system can be carried out with or without the use of a key. To improve steganographic robustness, a key can be used as a verification option. It can affect the distribution of the bits of a message within a container, as well as the procedure of forming the sequence of embedded bits of a message.

The first level of protection is determined only by the choice of embedding algorithm. This may be the least significant bits modification algorithm, or algorithms for modifying the frequency or spatial-temporal characteristics of the container. The first level of protection is present in any steganographic channel. The steganographic system in this case can be represented as shown in the figure The First Protection Level Scheme. The following notation is used:

c - the container file;

F - the steganographic channel space (the frequency or/and amplitude part of the container that is available for steganographic modification and message signal transmission);

SC - the steganographic system;

m - the message to be embedded;

E - the embedding method.

The output of the scheme is the modified container file.

The second protection level of the steganographic system, as well as all protection levels of higher orders, is characterized by the use of a key (password) in the steganographic modification. An example of a simple key scheme that provides a second level of protection is to write the unmodified or modified password at the top or bottom of the message, or to distribute the password signs over the entire length of the steganographic channel. Such key schemes do not affect the distribution of messages through the container and do not preprocess the message according to the defined key (see the figure The Second Protection Level Scheme). This kind of steganographic system is used in tasks such as adding a digital signature for proof of copyright. Data embedding performance is unchanged in comparison with the fastest approach at the first protection level.

The payload is the data to be covertly communicated. The carrier is the signal, stream, or data file into which the payload is hidden; this differs from the "channel" (typically used to refer to the type of input, such as "a JPEG image"). The resulting signal, stream, or data file which has the payload encoded into it is sometimes referred to as the package, stego file, or covert message. The percentage of bytes, samples, or other signal elements which are modified to encode the payload is referred to as the encoding density and is typically expressed as a number between 0 and 1.

2.9.2 GIVEN INPUT AND EXPECTED OUTPUT

MODULE 1

INPUT: Input Image/original image.

OUTPUT: embeddable sites and forming tree(TBPC).

MODULE 2

INPUT: tree based parity checking.

OUTPUT: majority parity checking.

MODULE 3

INPUT: majority parity checking.

OUTPUT: average modification per hidden bit.

MODULE 4

INPUT: majority parity checking and tree based parity checking.

OUTPUT: average modification per hidden bit.

MODULE 5

INPUT: majority parity checking and tree based parity checking.

OUTPUT: comparison of payloads.

2.10 TECHNIQUE OR ALGORITHM

In information theory and coding theory, with applications in computer science and telecommunication, error detection and correction or error control are techniques that enable reliable delivery of digital data over unreliable communication channels. Many communication channels are subject to channel noise, and thus errors may be introduced during transmission from the source to a receiver. Error detection techniques allow detecting such errors, while error correction enables reconstruction of the original data.

Error detection is the detection of errors caused by noise or other impairments during transmission from the transmitter to the receiver.

Error correction is the detection of errors and reconstruction of the original, error-free data.

The general idea for achieving error detection and correction is to add someredundancy(i.e., some extra data) to a message, which receivers can use to check consistency of the delivered message, and to recover data determined to be erroneous. Error-detection and correction schemes can be eithersystematicor non-systematic: In a systematic scheme, the transmitter sends the original data, and attaches a fixed number ofcheck bits(orparity data), which are derived from the data bits by somedeterministic algorithm. If only error detection is required, a receiver can simply apply the same algorithm to the received data bits and compare its output with the received check bits; if the values do not match, an error has occurred at some point during the transmission. In a system that uses a non-systematic code, the original message is transformed into an encoded message that has at least as many bits as the original message.

Good error control performance requires the scheme to be selected based on the characteristics of the communication channel. Commonchannel modelsinclude memory-less models where errors occur randomly and with a certain probability, and dynamic models where errors occur primarily inbursts. Consequently, error-detecting and correcting codes can be generally distinguished betweenrandom-error-detecting/correctingandburst-error-detecting/correcting. Some codes can also be suitable for a mixture of random errors and burst errors.

If the channel capacity cannot be determined, or is highly varying, an error-detection scheme may be combined with a system for retransmissions of erroneous data. This is known as automatic repeat request (ARQ), and is most notably used in the Internet. An alternate approach for error control is hybrid automatic repeat request (HARQ), which is a combination of ARQ and error-correction coding.
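The retransmission idea can be sketched as a simple stop-and-wait loop. This is only an illustration under stated assumptions, not a real protocol: the names `send_with_arq` and `make_noisy_channel` are hypothetical, CRC-32 is chosen here as the error-detecting check, and the check value itself is assumed to reach the receiver intact.

```python
import zlib

def send_with_arq(payload, channel, max_tries=5):
    """Resend payload until the receiver's check passes (hypothetical helper)."""
    tag = zlib.crc32(payload)               # error-detecting check value
    for attempt in range(1, max_tries + 1):
        received = channel(payload)         # the channel may corrupt the data
        if zlib.crc32(received) == tag:     # receiver accepts the frame
            return received, attempt
        # otherwise the receiver rejects the frame and the sender retransmits
    raise RuntimeError("gave up after %d attempts" % max_tries)

def make_noisy_channel(bad_attempts):
    """Toy channel that flips one bit on its first `bad_attempts` uses."""
    state = {"used": 0}
    def channel(data):
        state["used"] += 1
        if state["used"] <= bad_attempts:
            return bytes([data[0] ^ 0x01]) + data[1:]
        return data
    return channel

data, tries = send_with_arq(b"hello", make_noisy_channel(2))
print(data, tries)   # b'hello' 3
```

The first two transmissions are corrupted and rejected, and the third one is accepted.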

ERROR DETECTION SCHEMES

Error detection is most commonly realized using a suitable hash function (or checksum algorithm). A hash function adds a fixed-length tag to a message, which enables receivers to verify the delivered message by recomputing the tag and comparing it with the one provided.

There exists a vast variety of different hash function designs. However, some are of particularly widespread use because of either their simplicity or their suitability for detecting certain kinds of errors (e.g., the cyclic redundancy check's performance in detecting burst errors).
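As a small illustration of the tag idea, the sketch below appends a CRC-32 value to a message and recomputes it on reception. The helper names `attach_tag` and `verify` and the 4-byte big-endian layout are assumptions made for this example only.

```python
import zlib

def attach_tag(message):
    """Append a 4-byte CRC-32 tag to the message."""
    return message + zlib.crc32(message).to_bytes(4, "big")

def verify(frame):
    """Recompute the tag over the data part and compare it with the received tag."""
    message, tag = frame[:-4], frame[-4:]
    return zlib.crc32(message).to_bytes(4, "big") == tag

frame = attach_tag(b"stego")
corrupted = bytes([frame[0] ^ 0x04]) + frame[1:]   # flip one bit in transit
print(verify(frame), verify(corrupted))            # True False
```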

Random-error-correcting codes based on minimum distance coding can provide a suitable alternative to hash functions when a strict guarantee on the minimum number of errors to be detected is desired. Repetition codes are special cases of error-correcting codes: although rather inefficient, they find applications for both error correction and detection due to their simplicity.
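A repetition code can be sketched in a few lines: every bit is sent n times and the receiver takes a majority vote per block. The function names and the choice n = 3 are assumptions for illustration.

```python
def rep_encode(bits, n=3):
    """Repeat every bit n times."""
    return [b for b in bits for _ in range(n)]

def rep_decode(coded, n=3):
    """Majority vote over each block of n received bits."""
    return [1 if 2 * sum(coded[i:i + n]) > n else 0
            for i in range(0, len(coded), n)]

coded = rep_encode([1, 0, 1])   # [1, 1, 1, 0, 0, 0, 1, 1, 1]
coded[1] ^= 1                   # one error in the first block
coded[5] ^= 1                   # one error in the second block
print(rep_decode(coded))        # [1, 0, 1]
```

With n = 3, one flipped bit per block is corrected. Two flips in the same block would be decoded wrongly.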

Parity bits

A parity bit is a bit that is added to a group of source bits to ensure that the number of set bits (i.e., bits with value 1) in the outcome is even or odd. It is a very simple scheme that can be used to detect single or any other odd number (i.e., three, five, etc.) of errors in the output. An even number of flipped bits will make the parity bit appear correct even though the data is erroneous.
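The behaviour described above can be checked with a minimal even-parity sketch (the function names are illustrative assumptions): one flipped bit is detected, while two flipped bits cancel out and go unnoticed.

```python
def even_parity_bit(bits):
    """Parity bit that makes the total number of 1s even."""
    return sum(bits) % 2

def check_even_parity(word):
    """True if the word (data bits plus parity bit) has an even number of 1s."""
    return sum(word) % 2 == 0

data = [1, 0, 1, 1, 0, 0, 1]
word = data + [even_parity_bit(data)]

one_error = list(word)
one_error[2] ^= 1                  # single flipped bit
two_errors = list(one_error)
two_errors[4] ^= 1                 # second flipped bit

print(check_even_parity(word))        # True
print(check_even_parity(one_error))   # False: error detected
print(check_even_parity(two_errors))  # True: error missed
```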

Extensions and variations on the parity bit mechanism are horizontal redundancy checks, vertical redundancy checks, and "double," "dual," or "diagonal" parity (used in RAID-DP).

ADVANTAGES OF PARITY CHECKING

Because of its simplicity, parity is used in many hardware applications where an operation can be repeated in case of difficulty, or where simply detecting the error is helpful. For example, the SCSI and PCI buses use parity to detect transmission errors, and many microprocessor instruction caches include parity protection. Because the I-cache data is just a copy of main memory, it can be disregarded and re-fetched if it is found to be corrupted.

In serial data transmission, a common format is 7 data bits, an even parity bit, and one or two stop bits. This format neatly accommodates all the 7-bit ASCII characters in a convenient 8-bit byte. Other formats are possible; 8 bits of data plus a parity bit can convey all 8-bit byte values.
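The 7-data-bit format can be sketched as follows. Storing the parity bit in bit 7 of the byte is an assumption made for this illustration, and `frame_7e` and `unframe_7e` are hypothetical helper names. Start and stop bits are left out.

```python
def frame_7e(ch):
    """Pack a 7-bit ASCII character and an even parity bit into one byte."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2      # 1 if the data has an odd number of 1s
    return (parity << 7) | code

def unframe_7e(byte):
    """Check even parity over all 8 bits and return the character."""
    if bin(byte).count("1") % 2 != 0:
        raise ValueError("parity error")
    return chr(byte & 0x7F)

print(format(frame_7e("A"), "08b"))   # 01000001 (two 1s, parity bit 0)
print(format(frame_7e("C"), "08b"))   # 11000011 (three 1s, parity bit 1)
print(unframe_7e(frame_7e("C")))      # C
```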

In serial communication contexts, parity is usually generated and checked by interface hardware (e.g., a UART) and, on reception, the result is made available to the CPU (and so to, for instance, the operating system) via a status bit in a hardware register in the interface hardware. Recovery from the error is usually done by retransmitting the data, the details of which are usually handled by software (e.g., the operating system I/O routines).

LEAST SIGNIFICANT BIT

Figure: The binary representation of decimal 149, with the lsb highlighted. The msb in an 8-bit binary number represents a value of 128 decimal. The lsb represents a value of 1.

In computing, the least significant bit (lsb) is the bit position in a binary integer giving the units value, that is, determining whether the number is even or odd. The lsb is sometimes referred to as the right-most bit, due to the convention in positional notation of writing less significant digits further to the right. It is analogous to the least significant digit of a decimal integer, which is the digit in the ones (right-most) position.

It is common to assign each bit a position number, ranging from zero to N-1, where N is the number of bits in the binary representation used. Normally, this is simply the exponent for the corresponding bit weight in base-2 (such as in 2^31..2^0). Although a few CPU manufacturers assign bit numbers the opposite way (which is not the same as different endianness), the term lsb (of course) remains unambiguous as an alias for the unit bit. By extension, the least significant bits (plural) are the bits of the number closest to, and including, the lsb.
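The bit numbering described above can be tried on the value 149 from the figure. The helper name `bit` is an assumption for illustration, and position 0 denotes the lsb.

```python
def bit(n, position):
    """Return the bit of n at the given position (0 = least significant)."""
    return (n >> position) & 1

n = 149
print(format(n, "08b"))   # 10010101
print(bit(n, 0))          # 1, so 149 is odd
print(bit(n, 7))          # 1, the msb with weight 128
```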

The least significant bits have the useful property of changing rapidly if the number changes even slightly. For example, if 1 (binary 00000001) is added to 3 (binary 00000011), the result will be 4 (binary 00000100) and three of the least significant bits will change (011 to 100). By contrast, the three most significant bits stay unchanged (000 to 000). Least significant bits are frequently employed in pseudorandom number generators, hash functions and checksums.

Implementing steganography

Secrets can be hidden inside all sorts of cover information: text, images, audio, video and more. Most steganographic utilities nowadays hide information inside images, as this is relatively easy to implement. However, there are tools available to store secrets inside almost any type of cover source. It is also possible to hide information inside texts, sounds and video films, for example. The most important property of a cover source is the amount of data that can be stored inside it without changing the noticeable properties of the cover. When an image is distorted or a piece of music sounds different from the original, the cover source will be suspicious and may be checked more thoroughly.

Hiding a message inside a text

Since everyone can read, encoding text in neutral sentences is doubtfully effective. But taking the first letter of each word of the previous sentence, you will see that it is possible and not very difficult. Hiding information in plain text can be done in many different ways. The first-letter algorithm used here is not very secure, as knowledge of the system that is used automatically gives you the secret. This is a disadvantage that many techniques of hiding secrets inside plain text have in common. Many techniques involve modifying the layout of a text, using rules such as taking every n-th character, or altering the amount of whitespace after lines or between words. The last technique was successfully used in practice: even after a text had been printed and copied on paper ten times, the secret message could still be retrieved.
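The first-letter scheme is easy to reproduce. The sketch below (the function name `first_letters` is an assumption) extracts the hidden message from the opening sentence of the paragraph above.

```python
import re

def first_letters(sentence):
    """Concatenate the first letter of every word in the sentence."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return "".join(w[0] for w in words)

cover = ("Since everyone can read, encoding text in neutral sentences "
         "is doubtfully effective.")
print(first_letters(cover))   # Secretinside
```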

Another possible way of storing a secret inside a text is to use a publicly available cover source, such as a book or a newspaper, together with a code that consists, for example, of a combination of a page number, a line number and a character number. This way, no information stored inside the cover source will lead to the hidden message. Discovering it relies solely on gaining knowledge of the secret key.
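A page/line/character code of this kind can be sketched as a simple lookup. The two-page `book`, its layout as nested lists, the 1-based numbering and the `decode` helper are all assumptions made up for this illustration.

```python
def decode(book, key):
    """book: list of pages, each page a list of lines (hypothetical layout).
    key: list of (page, line, char) triples, all counted from 1."""
    return "".join(book[p - 1][l - 1][c - 1] for p, l, c in key)

# Stand-in for a publicly available text
book = [
    ["the quick brown fox", "jumps over the lazy dog"],
    ["pack my box with", "five dozen liquor jugs"],
]

# The secret key: positions of the message letters in the public text
key = [(1, 1, 2), (2, 2, 2), (2, 2, 6), (2, 2, 4)]
print(decode(book, key))   # hide
```

The cover text itself is never modified. Only the key carries the message.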

Images

Hiding information inside images is a popular technique nowadays. An image with a secret message inside can easily be spread over the world wide web or in newsgroups. The use of steganography in newsgroups has been researched by German steganographic expert Niels Provos, who created a scanning cluster which detects the presence of hidden messages inside images that were posted on the net. However, after checking one million images, no hidden messages were found, so the practical use of steganography still seems to be limited. To hide a message inside an image without changing its visible properties, the cover source can be altered in noisy areas with many color variations, so less attention will be drawn to the modifications. The most common methods to make these alterations involve the usage of the least-significant bit or LSB, masking, filtering and transformations on the cover image. These techniques can be used with varying degrees of success on different types of image files.

Least-significant bit modifications

The most widely used technique to hide data is the usage of the LSB. Although there are several disadvantages to this approach, the relative ease of implementing it makes it a popular method. To hide a secret message inside an image, a proper cover image is needed. Because this method uses bits of each pixel in the image, it is necessary to use a lossless compression format; otherwise the hidden information will get lost in the transformations of a lossy compression algorithm.

When using a 24-bit color image, one bit of each of the red, green and blue color components can be used, so a total of 3 bits can be stored in each pixel. Thus, an 800 x 600 pixel image can contain a total of 1,440,000 bits (180,000 bytes) of secret data.
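The capacity figure follows directly from the image dimensions. A short check, with a hypothetical helper name and default parameters chosen to match the 24-bit case:

```python
def lsb_capacity_bits(width, height, channels=3, bits_per_channel=1):
    """Number of message bits that fit when using the given bits per channel."""
    return width * height * channels * bits_per_channel

bits = lsb_capacity_bits(800, 600)
print(bits, bits // 8)   # 1440000 180000
```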

For example, the following grid can be considered as 3 pixels of a 24-bit color image, using 9 bytes of memory:

(00100111 11101001 11001000)

(00100111 11001000 11101001)

(11001000 00100111 11101001)

When a character whose binary value equals 10000001 is inserted, the following grid results:

(00100111 11101000 11001000)

(00100110 11001000 11101000)

(11001000 00100111 11101001)

In this case, only three bits needed to be changed to insert the character successfully. On average, only half of the least significant bits used for embedding will need to be modified to hide a secret message using the maximal cover size. The resulting changes that are made to the least significant bits are too small to be recognized by the human eye, so the message is effectively hidden. While using a 24-bit image gives a relatively large amount of space to hide messages, it is also possible to use an 8-bit image as a cover source. Because of the smaller space and different properties, 8-bit images require a more careful approach. Where 24-bit images use three bytes to represent a pixel, an 8-bit image uses only one. Changing the LSB of that byte will result in a visible change of color, as another color in the available palette will be displayed. Therefore, the cover image needs to be selected more carefully and should preferably be in grayscale, as the human eye will not detect the difference between different gray values as easily as between different colors.
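The grid example above can be reproduced with a minimal sketch of LSB embedding and extraction. The function names are assumptions, and the message bits are written in order into the first eight of the nine cover bytes.

```python
def embed_lsb(cover, bits):
    """Write one message bit into the LSB of each cover byte, in order."""
    if len(bits) > len(cover):
        raise ValueError("message longer than cover")
    stego = list(cover)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | b   # clear the LSB, then set it to b
    return stego

def extract_lsb(stego, n_bits):
    """Read the message back from the LSBs of the first n_bits bytes."""
    return [byte & 1 for byte in stego[:n_bits]]

cover = [0b00100111, 0b11101001, 0b11001000,
         0b00100111, 0b11001000, 0b11101001,
         0b11001000, 0b00100111, 0b11101001]
message = [1, 0, 0, 0, 0, 0, 0, 1]          # the pattern 10000001

stego = embed_lsb(cover, message)
changed = sum(c != s for c, s in zip(cover, stego))
print([format(s, "08b") for s in stego])
print(changed)                              # 3
print(extract_lsb(stego, 8) == message)     # True
```

Only the second, fourth and sixth bytes change, because the other cover bytes already carry the required LSB.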

The main disadvantage of LSB alteration is that it requires a fairly large cover image to create a usable amount of hiding space. Even nowadays, uncompressed images of 800 x 600 pixels are not often used on the Internet, so using these might raise suspicion. Another disadvantage arises when an image concealing a secret is compressed using a lossy compression algorithm. The hidden message will not survive this operation and is lost after the transformation.

Masking and filtering

Masking and filtering techniques, usually restricted to 24-bit or grayscale images, take a different approach to hiding a message. These methods are effectively similar to paper watermarks, creating markings in an image. This can be achieved, for example, by modifying the luminance of parts of the image. While masking does change the visible properties of an image, it can be done in such a way that the human eye will not notice the anomalies. Since masking uses visible aspects of the image, it is more robust than LSB modification with respect to compression, cropping and different kinds of image processing. The information is not hidden at the noise level but is inside the visible part of the image, which makes it more suitable than LSB modifications in case a lossy compression algorithm like JPEG is being used.

Detecting steganography

As more and more techniques of hiding information are developed and improved, the methods of detecting the use of steganography also advance. Most steganographic techniques involve changing properties of the cover source and there are several ways of detecting these changes.

Text

While information can be hidden inside texts in such a way that the presence of the message can only be detected with knowledge of the secret key (for example, when using the earlier mentioned method of a publicly available book and a combination of character positions), most of the techniques involve alterations to the cover source. These modifications can be detected by looking for patterns in texts or disturbances thereof, odd use of language and unusual amounts of whitespace.

Images

Although images can be scanned for suspicious properties in a very basic way, detecting hidden messages usually requires a more technical approach. Changes in size, file format, last modified timestamp and in the color palette might point out the existence of a hidden message, but this will not always be the case. A widely used technique for image scanning involves statistical analysis. Most steganographic algorithms that work on images assume that the least-significant bit is more or less random. This is, however, an incorrect assumption. While the LSB might not seem to be of much importance, applying a filter which only shows the least-significant bits will still produce a recognizable image. Since this is the case, it can be concluded that the LSBs are not random at all, but actually contain information about the whole image.

When inserting a hidden message into an image, this property changes. Especially with encrypted data, which has a very high entropy, the LSB of the cover image will no longer contain information about the original, but because of the modifications they will now be more or less random. With a statistical analysis on the LSB, the difference between random values and real image values can easily be detected. Using this technique, it is also possible to detect messages hidden inside JPEG files with the DCT method, since this also involves LSB modifications, even though these take place in the frequency domain.
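The text does not name a specific statistical test. One well-known choice, used here as an assumption, is the chi-square statistic over pairs of values: LSB embedding of random-looking data tends to equalize the counts of each value pair (2k, 2k+1), so a statistic close to zero is suspicious. The sketch assumes 8-bit samples and only computes the statistic, not a probability.

```python
from collections import Counter

def pov_chi_square(values):
    """Chi-square statistic over the value pairs (2k, 2k+1) of 8-bit samples."""
    hist = Counter(values)
    stat = 0.0
    for even in range(0, 256, 2):
        pair_total = hist[even] + hist[even + 1]
        if pair_total == 0:
            continue                      # skip pairs that never occur
        expected = pair_total / 2         # count expected if LSBs were random
        stat += (hist[even] - expected) ** 2 / expected
    return stat

# Toy data: uneven pair counts (unmodified-looking) vs. equalized pair counts
natural = [10] * 90 + [11] * 10 + [20] * 30 + [21] * 70
equalized = [10] * 50 + [11] * 50 + [20] * 50 + [21] * 50
print(pov_chi_square(natural))     # 40.0
print(pov_chi_square(equalized))   # 0.0
```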

Audio and video

The statistical analysis method can be used against audio files as well, since the LSB modification technique can also be applied to sounds. In addition, there are several other things that can be detected. High, inaudible frequencies can be scanned for information, and odd distortions or patterns in the sounds might point out the existence of a secret message. Also, differences in pitch, echo or background noise may raise suspicion. Just as steganography with video files as cover sources combines image and audio techniques, the methods of detecting hidden information in video are a combination of the techniques used for images and audio files. However, a different steganographic technique can be used that is especially effective in video films. The usage of special code signs or gestures is very difficult to detect with a computer system. This method was used in the Vietnam war so prisoners of war could communicate messages secretly through the video films the enemy soldiers made to send to the home front.

Defeating steganograms

While steganograms may not always be successfully detected, there are different ways of removing hidden messages from possible cover sources. Knowledge or certainty of the existence of a hidden message is not needed, since messages can even be destroyed without this. Although there will never be a 100 percent guarantee of success, the number of possible ways of sending hidden messages can easily be reduced using any combination of steganographic defeating techniques.

Text

The best way of removing hidden messages from a plain text might be rewriting and reformulating the contents. Rewriting it using different words and sentence constructions will most certainly remove all ways of reproducing a hidden message, since it will take care of almost every possible way data can be stored inside a plain text. The character position scheme will no longer work because the words have been changed, and the same is valid for the differentiations in white spacing, since the text will have a new layout. The only method that will not be covered by this technique is the usage of a publicly available cover source. Since this source cannot easily be altered, there is no effective way of stopping this method, except for intercepting the secret key.

Images

Compressing an image using lossy compression will remove messages that are hidden using the LSB modification technique. This will also happen when the image is resized, the color palette is modified or the colors themselves are modified. Conversion to a different image format, which often uses a different type of compression, will also help in removing hidden messages. Altering the luminance, for example, will remove watermarks in the visible part of an image.
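As a simplified stand-in for such modifications (an assumption for illustration, not an actual compression or resizing step), the sketch below clears the least significant bit of every sample. Each value changes by at most 1, yet an LSB-embedded message is wiped out.

```python
def scrub_lsb(samples):
    """Clear the least significant bit of every sample."""
    return [s & 0xFE for s in samples]

stego = [0b00100111, 0b11101000, 0b11001000, 0b00100110]
scrubbed = scrub_lsb(stego)

print([s & 1 for s in stego])      # [1, 0, 0, 0]
print([s & 1 for s in scrubbed])   # [0, 0, 0, 0]
print(max(abs(a - b) for a, b in zip(stego, scrubbed)))   # 1
```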

Audio and video

Most of the techniques that can be used on images can also be applied to audio files. Compressing an audio file with lossy compression will result in loss of the hidden message, as it will change the whole structure of the file. Also, several lossy compression schemes use the limits of the human ear to their advantage by removing all frequencies that cannot be heard. This will also remove any frequencies that are used by a steganographic system which hides information in that part of the spectrum. Another possible way of removing steganograms is lowering the bit rate of the audio file. In that case, there will be less available space to store hidden data and therefore at least parts of it will get lost. For video, once again, the same methods as for images and audio files can be applied to remove hidden information. To defeat the use of signals or gestures, however, human insight is still necessary, as computer systems are not yet capable of detecting this with a reasonable rate of success.

CHAPTER 3

REQUIREMENTS ENGINEERING

3.1 GENERAL

The Harris-Laplacian detector is used to find feature points (interest points). By using only feature points, the watermark can be embedded only in particular high-frequency regions. Hiding only in selected regions is more secure than hiding the watermark in all high-frequency regions. A genetic algorithm is used to check the fitness of the feature regions. This is one of the more advanced methods compared to existing methods.

3.2 HARDWARE REQUIREMENTS

The hardware requirements may serve as the basis for a contract for the implementation of the system and should therefore be a complete and consistent specification of the whole system. They are used by software engineers as the starting point for the system design. It shows what the system does and how it should be implemented.

PROCESSOR : PENTIUM IV 2.6 GHz, Intel Core 2 Duo

RAM : 512 MB DDR RAM

MONITOR : 15" COLOR

HARD DISK : 40 GB

CD DRIVE : LG 52X

KEYBOARD : STANDARD 102 KEYS

MOUSE : 3 BUTTONS

3.3 SOFTWARE REQUIREMENTS

MATLAB 7.9 Version

MATLAB

MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.

Typical uses include:

Math and computation.

Algorithm development.

Modeling, simulation, and prototyping.

Data analysis, exploration, and visualization.

Scientific and engineering graphics.

Application development, including Graphical User Interface building.

MATLAB is an interactive system whose basic data element is an array that does not r
