Page 1: Steganography

DELHI TECHNOLOGICAL UNIVERSITY

SHAHBAD DAULTAPUR, BAWANA RD., DELHI-42

DEPARTMENT OF COMPUTER ENGINEERING

SELF-STUDY SEMINAR REPORT ON

STEGANOGRAPHY

PRESENTED BY

CERTIFICATE

I, Abhishek Singh, hereby solemnly affirm that the Self-Study Seminar Report entitled “Steganography”, being submitted by me in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Engineering to the Delhi Technological University, is a record of bona fide work carried out by me under the guidance of Ms. Indu Singh and Mr. Vinod Kumar.

The work reported in this report, in full or in part, has not been submitted to any university or institute for the award of any other degree or diploma.

Place: DTU, Bawana Road, Delhi-11092

Date:

ACKNOWLEDGEMENT

I would like to express my greatest gratitude to the people who have helped and supported me throughout my Self-Study Seminar Report.

I am grateful to my guides, Ms. Indu Singh and Mr. Vinod Kumar, for their continuous support for the project, from initial advice and contacts in the early stages of conceptual inception through ongoing advice and encouragement to this day.

I wish to thank them for their undivided support and interest, which inspired and encouraged me to go the right way.

Last but not least, I want to thank my friends, who appreciated my work and motivated me, and finally God, who made all things possible.

ABSTRACT

Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and the intended recipient even realizes there is a hidden message.

This report is primarily a reference resource on steganography. The material here is designed to allow a student to learn and understand what steganography is and how it works. I discuss a number of practical steganographic methods for achieving covert communication, but I also focus on how such secret communication can be detected, as well as how steganography can be used for purposes other than hiding data. The report also compares steganography with the better-known approach to securing data, cryptography. If you want to really understand what steganography is and what makes it work, you have come to the right place. If all you want is simple instructions on how to hide information in media files, this probably isn't the guide for you. It is an in-depth study of steganography.

IMPORTANT KEYWORDS

Steganography

Cryptography

Cover Media

Plain Text

Digital Watermarking

Steganalysis

Stego-Key

Stego-Media

TABLE OF CONTENTS

1. INTRODUCTION

WHAT EXACTLY IS STEGANOGRAPHY?

WHY IS STEGANOGRAPHY IMPORTANT?

2. LITERATURE SURVEY

STEGANOGRAPHY IN TEXT FILES

STEGANOGRAPHY IN AUDIO FILES

STEGANOGRAPHY IN IMAGES

COVERT CHANNELS

STEGANOGRAPHY OVER NETWORKS

3. DISCUSSION

STEGANALYSIS

APPLICATIONS

STEGANOGRAPHY VS. CRYPTOGRAPHY

FUTURE SCOPE OF STEGANOGRAPHY

4. CONCLUSION

5. REFERENCES

CHAPTER 1

INTRODUCTION

WHAT EXACTLY IS STEGANOGRAPHY?

The word steganography is of Greek origin and means "covered, or hidden writing".

Steganography is the art and science of communicating in a way which hides the existence of the communication. It achieves information hiding without altering or recasting the information itself.

Steganography has been widely used from ancient times to the present day: slaves' shaven heads and patterns sewn into quilts in earlier times, crossword puzzles and invisible ink in modern times, all used to relay messages which were not meant for others to read.

However, today, in the 21st century, steganography is most often associated with data hidden within other data in an electronic file, including the concealment of information within computer files such as a document file, image file, program or even a protocol.

WHY IS STEGANOGRAPHY IMPORTANT?

The primary goal of steganography is to hide messages inside other harmless messages in a way that does not allow any enemy to even detect that a second, secret message is present. Steganography thus makes possible the secret communication that individuals, organizations and governments require today.

Another purpose steganography serves is to protect the intellectual property rights of copyright owners over their digital media. Information about the owner, recipient and access level can be embedded within the media through steganographic methods. As long as the “watermark” hidden in the media is present, no one else can claim rights over someone else's copyrighted media. Steganography also makes it difficult to remove the “watermark” without destroying the media itself.

CHAPTER 2

LITERATURE SURVEY

The purpose of this literature review is to state and explain how steganography actually helps in strengthening the security of sensitive information. It looks at how the technology and encoding systems have developed to embed data in different types of files such that the embedded data is not perceptible, not only at first look but also when the file undergoes scrutiny. It also covers media that were not intended for communication but are being used to transmit stego-media, and explains how networking today helps steganography.

Steganography involves hiding data in an overt message in such a way that it is difficult for an adversary to detect and difficult for an adversary to remove. Based on this goal, the basic steganographic model is shown below:

Carrier media (cover media) is the original, unaltered message into which the message meant to be hidden is embedded.

The encoding process (steganographic algorithm) is the step in which the sender hides a message by embedding it into the cover media, usually using a key, to obtain the stego-media.

The payload (the secret message) is taken along with the carrier media and processed by the steganographic algorithm, which embeds the payload into the carrier.

The stego-key is the unique key of the encoding algorithm. It decides how the carrier media is altered in order to accommodate the payload without revealing the payload to others.

The stego-media is the final media obtained from the algorithm. It holds the secret message and is ready to be transmitted without being detected as holding any covert information.

The recovering process is the step in which the receiver, using only the key, extracts the hidden message from the stego-media.

The security requirement is that a third person watching such a communication should not be able to find out whether, or when, the sender has been active, in the sense of actually having embedded a message in the carrier media. In other words, stego-media should be indistinguishable from carrier media.
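
The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `embed` and `extract`, the use of least significant bits, and the idea of letting the stego-key seed a pseudo-random choice of byte positions are all choices made for this sketch, not something the report prescribes. Python's `random` module is also not cryptographically secure, so a real system would need a stronger keyed generator.

```python
import random

def _positions(key, cover_len, n_bits):
    """Derive the byte positions to alter from the stego-key (assumed scheme)."""
    rng = random.Random(key)              # same key -> same positions on both sides
    return rng.sample(range(cover_len), n_bits)

def embed(cover, payload, key):
    """Encoding process: carrier media + payload + stego-key -> stego-media."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("payload too large for this carrier")
    stego = bytearray(cover)
    for pos, bit in zip(_positions(key, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(stego)

def extract(stego, n_bytes, key):
    """Recovering process: stego-media + stego-key -> payload."""
    bits = [stego[pos] & 1 for pos in _positions(key, len(stego), n_bytes * 8)]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

cover = bytes(range(256)) * 4             # stand-in for real carrier data
stego = embed(cover, b"hi", key="k3y")
print(extract(stego, 2, key="k3y"))       # b'hi'
```

Without the key, an observer does not know which bytes carry the payload, which is the role the stego-key plays in the model.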

Depending upon the type of carrier media, a suitable technique can be applied to achieve the maximum encoding density, i.e. the degree to which the carrier media is altered to encode the payload.

Steganography can be classified into image, text and audio steganography, depending on the cover media used to embed the secret data.

STEGANOGRAPHY IN TEXT FILES

Text steganography can involve anything from changing the formatting of an existing text, to changing words within a text, to generating random character sequences or using context-free grammars to generate readable texts. Text steganography is believed to be the trickiest kind because text lacks the redundant information present in an image, audio or video file. However, in text documents we can hide information by introducing changes in the structure of the document without making a notable change in the visible output. The payload can be embedded in a text through three established techniques:

1. Line-Shift Coding

In this method, the secret message is hidden by vertically shifting the positions of text lines in the document by a small amount.

The user can choose an arbitrary number between 0 and 127, which is converted into an 8-bit sequence. According to this bit sequence, every second line is shifted one pixel down if the bit is 1, whereas a 0 keeps the line in its normal position.
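
As a rough sketch of the scheme just described, the following Python models a page as a list of baseline y-positions. The nominal line spacing of 20 pixels and the function names are assumptions made for illustration; a real implementation would operate on a rendered page image or a page-description file.

```python
LINE_HEIGHT = 20  # assumed nominal baseline spacing in pixels

def encode_line_shift(number, line_height=LINE_HEIGHT):
    """Return baseline y-positions of 16 lines carrying an 8-bit number."""
    if not 0 <= number <= 127:
        raise ValueError("number must be between 0 and 127")
    bits = format(number, "08b")                 # e.g. 77 -> '01001101'
    baselines = []
    for i in range(2 * len(bits)):
        y = i * line_height
        if i % 2 == 1 and bits[i // 2] == "1":   # only every second line may move
            y += 1                                # one pixel down encodes a 1
        baselines.append(y)
    return baselines

def decode_line_shift(baselines, line_height=LINE_HEIGHT):
    """Read the bits back by comparing each carrier line with the fixed line above it."""
    bits = ""
    for i in range(1, len(baselines), 2):
        bits += "1" if baselines[i] - baselines[i - 1] > line_height else "0"
    return int(bits, 2)

print(decode_line_shift(encode_line_shift(77)))  # 77
```

The unshifted lines in between act as references, so the decoder does not need the original document.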

2. Word-Shift Coding

In this method, the secret message is hidden by shifting words horizontally, i.e. left or right to represent bit 0 or 1 respectively. Word shifts can only be detected using a correlation method. The scheme is based on the fact that most documents use variable word spacing to distribute white space when justifying a document. If the spaces between adjacent words did not already differ, word-shift coding would not make sense at all.
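
A simplified sketch of word-shift coding is given below. To keep it short it assumes a single line whose words start out evenly spaced (a uniform `gap`), shifts every second word by one unit, and decodes by comparing the gaps on either side of each shifted word. These simplifications and all names are assumptions of the sketch; in justified text the spacing varies and detection is harder, as noted above.

```python
def encode_word_shift(widths, bits, gap=6, shift=1):
    """Return word x-positions; every second word moves left (0) or right (1)."""
    carriers = list(range(1, len(widths) - 1, 2))   # words with a fixed neighbour on each side
    if len(bits) > len(carriers):
        raise ValueError("not enough words on the line")
    xs, x = [], 0
    for w in widths:
        xs.append(x)
        x += w + gap
    for idx, bit in zip(carriers, bits):
        xs[idx] += shift if bit else -shift
    return xs

def decode_word_shift(xs, widths, n_bits):
    """Recover bits by comparing the space before and after each carrier word."""
    bits = []
    for idx in list(range(1, len(widths) - 1, 2))[:n_bits]:
        left_gap = xs[idx] - (xs[idx - 1] + widths[idx - 1])
        right_gap = xs[idx + 1] - (xs[idx] + widths[idx])
        bits.append(1 if left_gap > right_gap else 0)
    return bits

widths = [30, 12, 25, 40, 18, 22, 35, 15, 28]   # hypothetical word widths in pixels
xs = encode_word_shift(widths, [1, 0, 1, 1])
print(decode_word_shift(xs, widths, 4))          # [1, 0, 1, 1]
```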

3. Feature Coding

In feature coding, the secret message is hidden by altering one or more features of the text. A parser examines a document and picks out all the features that it can use to hide the information.

For example, the dot on the letters i and j can be displaced, the length of the stroke in the letters f and t can be changed, or the height of letters such as b, d and h can be extended or shortened.

STEGANOGRAPHY IN AUDIO FILES

Today most audio files are available in digital form. Digital audio is a discrete signal rather than a continuous one as found in analog audio. A discrete signal is created by sampling a continuous analog signal at a specified rate. Digital audio is stored in a computer as a sequence of 0's and 1's. With the right tools, it is possible to change the individual bits that make up a digital audio file. Such precise control allows changes to be made to the binary sequence that are not discernible to the human ear. Many techniques exist for hiding information in audio in such a manner that the alterations made to the audio file are perceptually indiscernible. Some of the established and prominent methods of audio steganography are:

1. Least Significant Bit Coding

A very popular methodology is the LSB (Least Significant Bit) algorithm, which replaces the least significant bit in some bytes of the cover file to hide a sequence of bytes containing the hidden data. It is usually an effective technique in cases where the LSB substitution doesn't cause significant quality degradation. Note that there is a fifty percent chance that the bit being replaced is the same as its replacement; in other words, half the time the bit doesn't change, which helps to minimize quality degradation.

The illustration shows how the message 'HEY' is encoded in a 16-bit CD-quality sample using the LSB method. Here the secret information is ‘HEY’ and the cover file is an audio file. First, the secret information ‘HEY’ and the audio file are converted into bit streams. The least significant bit of each audio sample is then replaced, in order, by a bit of the secret information ‘HEY’. The resulting file, after embedding the secret information, is called the stego-file.
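
The 'HEY' example can be reproduced with a short Python sketch. The cover here is a synthetic 440 Hz tone represented as a list of signed 16-bit sample values; that signal, and the function names, are assumptions made for illustration rather than the data used in the report's illustration.

```python
import math

def to_bits(message):
    """Most significant bit first, eight bits per byte."""
    return [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]

def embed_lsb(samples, message):
    """Replace the least significant bit of successive samples with message bits."""
    bits = to_bits(message)
    if len(bits) > len(samples):
        raise ValueError("cover audio too short")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit     # also correct for negative samples
    return stego

def extract_lsb(samples, n_bytes):
    out = bytearray()
    for i in range(0, n_bytes * 8, 8):
        byte = 0
        for s in samples[i:i + 8]:
            byte = (byte << 1) | (s & 1)
        out.append(byte)
    return bytes(out)

# 32 samples of a 440 Hz tone at 44.1 kHz, scaled to the 16-bit range
cover = [int(12000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(32)]
stego = embed_lsb(cover, b"HEY")
print(extract_lsb(stego, 3))                 # b'HEY'
```

Each sample changes by at most one unit out of a range of 65,536 values, which is why the alteration is hard to hear.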

2. Echo Hiding

The echo hiding technique embeds secret information in a sound file by introducing an echo into the discrete signal. Echo hiding has the advantages of providing a high encoding density and superior robustness when compared to other methods. Only one bit of secret information can be encoded if only one echo is produced from the original signal. Hence, before the encoding process begins, the original signal is broken down into blocks. Once the encoding process is done, the blocks are concatenated back together to create the final signal [5, 20]. Echo hiding is shown in the illustration.
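
A minimal sketch of the block-wise idea follows. It assumes two echo delays, one standing for bit 0 and one for bit 1, a fixed echo amplitude, and a noise-like cover signal; the decoder simply compares the autocorrelation of each block at the two delays. All of these values and names are assumptions for the sketch. Practical decoders for real audio typically rely on cepstrum analysis, which is not shown here.

```python
import random

D0, D1 = 50, 100   # assumed echo delays in samples for bit 0 and bit 1
ALPHA = 0.5        # assumed echo amplitude
BLOCK = 2048       # assumed block length in samples

def add_echo(block, delay, alpha=ALPHA):
    """Add a single delayed, attenuated copy of the block to itself."""
    return [s + (alpha * block[n - delay] if n >= delay else 0.0)
            for n, s in enumerate(block)]

def encode_echo(signal, bits):
    """Split the signal into blocks, echo each block, and concatenate again."""
    if len(bits) * BLOCK > len(signal):
        raise ValueError("signal too short for this many bits")
    out = []
    for i, bit in enumerate(bits):
        block = signal[i * BLOCK:(i + 1) * BLOCK]
        out.extend(add_echo(block, D1 if bit else D0))
    out.extend(signal[len(bits) * BLOCK:])    # untouched remainder
    return out

def _autocorr(block, lag):
    return sum(block[n] * block[n - lag] for n in range(lag, len(block)))

def decode_echo(signal, n_bits):
    """Pick, per block, whichever delay shows the stronger self-similarity."""
    bits = []
    for i in range(n_bits):
        block = signal[i * BLOCK:(i + 1) * BLOCK]
        bits.append(1 if _autocorr(block, D1) > _autocorr(block, D0) else 0)
    return bits

rng = random.Random(0)
cover = [rng.uniform(-1.0, 1.0) for _ in range(8 * BLOCK)]   # noise-like stand-in
message = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode_echo(encode_echo(cover, message), len(message)) == message)
```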

3. Phase Coding

The phase coding technique works by replacing the phase of an initial audio segment with a reference phase that represents the secret information. The phases of the remaining segments are adjusted in order to preserve the relative phase between segments.
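
The two steps can be sketched with a small pure-Python discrete Fourier transform. The sketch assumes a reference phase of +π/2 for bit 0 and −π/2 for bit 1, a segment length of 64 samples and a noise-like cover; these conventions and the function names are assumptions for illustration only.

```python
import cmath
import math
import random

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def encode_phase(signal, bits, seg_len=64):
    if len(signal) < seg_len or len(bits) >= seg_len // 2:
        raise ValueError("signal too short or too many bits")
    segs = [signal[i:i + seg_len] for i in range(0, len(signal) - seg_len + 1, seg_len)]
    spectra = [dft(s) for s in segs]
    mags = [[abs(c) for c in sp] for sp in spectra]
    phases = [[cmath.phase(c) for c in sp] for sp in spectra]
    # phase differences between consecutive segments, to be preserved
    deltas = [[phases[i][k] - phases[i - 1][k] for k in range(seg_len)]
              for i in range(1, len(segs))]
    cur = list(phases[0])
    for k, bit in enumerate(bits, start=1):
        cur[k] = -math.pi / 2 if bit else math.pi / 2   # reference phase
        cur[seg_len - k] = -cur[k]                      # keep the output real
    out = []
    for i in range(len(segs)):
        if i > 0:
            cur = [cur[k] + deltas[i - 1][k] for k in range(seg_len)]
        out.extend(idft_real([cmath.rect(mags[i][k], cur[k]) for k in range(seg_len)]))
    out.extend(signal[len(segs) * seg_len:])
    return out

def decode_phase(stego, n_bits, seg_len=64):
    """Read the sign of the phase in the first segment's low-frequency bins."""
    spectrum = dft(stego[:seg_len])
    return [1 if cmath.phase(spectrum[k]) < 0 else 0 for k in range(1, n_bits + 1)]

rng = random.Random(1)
cover = [rng.uniform(-1.0, 1.0) for _ in range(256)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
print(decode_phase(encode_phase(cover, message), len(message)) == message)
```

Only the first segment carries the data; the later segments are re-synthesised with the original magnitudes and the original segment-to-segment phase differences.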

STEGANOGRAPHY IN IMAGES

To a computer, an image is an array of numbers that represent light intensities at various points, or pixels. 8-bit color images are generally preferred for hiding information. In 8-bit color images, each pixel is represented as a single byte. Each pixel merely points to a color index table, or palette, with 256 possible colors. The pixel's value, then, is between 0 and 255.

The challenge of using steganography in computer images is to hide as much data as possible with the least noticeable difference in the image. Many steganography experts recommend using images featuring 256 shades of gray as the palette. Gray-scale images are preferred because the shades change very gradually between palette entries, which increases the image's ability to hide information. Images with large areas of solid color should be avoided as carrier files, since even a minute change in the carrier would be visually noticeable.

Information can be hidden in images in many different ways. Some of the prominent encoding techniques of image steganography are:

1. Least Significant Bit Encoding a.k.a. Insertion Technique

One of the most common techniques used in steganography today is called least

significant bit (LSB) insertion. This method is exactly what it sounds like; the least

significant bits of the cover-image are altered so that they form the embedded

information. The following example shows how the letter A can be hidden in the

first eight bytes of three pixels in a 24-bit image:

Pixels: (00100111 11101001 11001000)

(00100111 11001000 11101001)

(11001000 00100111 11101001)

The Code for letter ‘A’ which is to be hidden: 10000001

After the Insertion technique:

Resultant Pixels: (00100111 11101000 11001000)

(00100110 11001000 11101000)

(11001000 00100111 11101001)

Only three bits, the least significant bits of the second, fourth and sixth bytes, were actually altered. On average, LSB insertion requires that only half of the least significant bits used be changed. Since the 8-bit letter A requires only eight bytes to hide it in, the ninth byte of the three pixels can be used to begin hiding the next character of the hidden message.

Two or more least significant bits per byte can be used to embed more payload, increasing the encoding density; however, the cover image then degrades sharply, which makes it easier to detect that it holds hidden information.
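
The worked example can be checked with a few lines of Python. The sketch uses the three pixels and the 8-bit pattern 10000001 exactly as given in the example (note that the standard ASCII code for 'A' is 01000001, so the pattern is taken from the example as-is); the function name `insert_lsb` is an assumption of the sketch.

```python
pixels = [
    (0b00100111, 0b11101001, 0b11001000),
    (0b00100111, 0b11001000, 0b11101001),
    (0b11001000, 0b00100111, 0b11101001),
]
code = "10000001"   # bit pattern used for the letter in the example

def insert_lsb(pixels, bits):
    """Overwrite the least significant bit of successive color bytes."""
    flat = [channel for px in pixels for channel in px]
    if len(bits) > len(flat):
        raise ValueError("not enough bytes in the cover pixels")
    for i, b in enumerate(bits):
        flat[i] = (flat[i] & 0b11111110) | int(b)
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

result = insert_lsb(pixels, code)
for px in result:
    print(" ".join(format(channel, "08b") for channel in px))
# 00100111 11101000 11001000
# 00100110 11001000 11101000
# 11001000 00100111 11101001

changed = sum(a != b for p, q in zip(pixels, result) for a, b in zip(p, q))
print(changed)      # 3
```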

2. Masking and Filtering

Masking and filtering techniques hide information by marking an image in a

manner similar to paper watermarks. Because watermarking techniques are more

integrated into the image, they may be applied without fear of image destruction

from “lossy” compression. By covering, or masking a faint but perceptible signal

with another to make the first non-perceptible, we exploit the fact that the human

visual system cannot detect slight changes in certain temporal domains of the

image.

COVERT CHANNELS

Covert channels are communication paths that were neither designed nor intended to transfer information at all, but are used that way, using entities that were not intended for such use.

The most prominent example of a covert channel is the TCP/IP headers.

The IP header contains the key information that is needed for packets of data to be routed properly. When using stego, you want to find fields that you can overwrite or change without affecting the host communication. One field in the IP header that you can change without any effect is the IP identification number. Usually the ID is incremented by 1 for each packet that is sent out, but any number can be used and the protocol will function properly. This ability to make a change without damaging functionality makes this field an ideal candidate for hiding data.

In TCP, we can use the sequence and acknowledgement number fields to hide data. During the initial handshake, the values of the sequence and acknowledgement numbers are picked at random by the sender and receiver. Therefore, the first packet that is sent can contain data hidden in those fields, because the initial values have no other purpose. Once communication has been established, these fields can no longer be used to hide data.
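
The IP identification idea can be illustrated without sending anything on a network. The sketch below only builds 20-byte IPv4 headers in memory with the standard `struct` module and places two payload bytes in each 16-bit identification field. The addresses are reserved documentation addresses, and the helper names are hypothetical; actually transmitting such packets would need raw sockets and suitable privileges, which is outside this sketch.

```python
import socket
import struct

def ipv4_checksum(header):
    """Standard one's-complement sum over 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(ident, src="192.0.2.1", dst="192.0.2.2"):
    """Build a minimal IPv4 header whose identification field is `ident`."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20,            # version/IHL, TOS, total length (header only)
        ident, 0,               # identification, flags/fragment offset
        64, socket.IPPROTO_TCP, # TTL, protocol
        0,                      # checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]

def hide_in_ip_id(payload):
    """Carry two payload bytes per packet in the identification field."""
    if len(payload) % 2:
        payload += b"\x00"      # pad to a whole number of 16-bit fields
    return [build_header(int.from_bytes(payload[i:i + 2], "big"))
            for i in range(0, len(payload), 2)]

def recover_from_ip_id(headers):
    return b"".join(h[4:6] for h in headers)   # bytes 4-5 hold the ID

headers = hide_in_ip_id(b"stego!")
print(recover_from_ip_id(headers))             # b'stego!'
```

Each header still carries a valid checksum, so on the wire it would be indistinguishable in form from a header with an ordinary identification value.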

STEGANOGRAPHY OVER NETWORKS

On a computer network you can use steganographic techniques to hide files in traffic, transmit viruses, or hide the trail of somebody moving around online. The three methods of implementing steganography over a network are:

1. Hiding in an Attachment

Hiding data in an attachment is the most basic form of using a network to transmit stego from one person to another. The three most popular ways to do this are by email, by file transfer such as FTP, or by posting a file on a website.

2. Hiding in a Transmission

3. Hiding in Overt Protocol

Essentially, this is what is called data camouflaging, where you make data look like something else. With this technique you take data, put it in normal network traffic, and modify the data in such a way that it looks like the overt protocol.

For example, most networks carry large amounts of HTTP or Web traffic. You

could send data over port 80, and it would look like web traffic. The problem is

that, if someone examined the payload, it would not look like normal web traffic,

which usually contains HTML. What if you added symbols such as < >, < / > to the

data? Because these are the types of characters that HTML contains, the traffic

would look like Web traffic and probably would slip by the casual observer.
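
A small sketch of this kind of camouflage is shown below. It base64-encodes the data and wraps it in HTML paragraph tags inside something shaped like an HTTP response. The choice of base64, the `<p>` tags and the function names are assumptions made for the sketch; it only builds and parses bytes in memory and does not open any network connection.

```python
import base64
import re

def camouflage(data):
    """Dress arbitrary bytes up as an HTML page inside an HTTP-style response."""
    encoded = base64.b64encode(data).decode("ascii")
    chunks = [encoded[i:i + 40] for i in range(0, len(encoded), 40)]
    body = "\n".join(f"<p>{chunk}</p>" for chunk in chunks)
    html = f"<html><body>\n{body}\n</body></html>"
    return (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(html)}\r\n\r\n{html}"
    ).encode("ascii")

def uncover(response):
    """Strip the HTML wrapping and decode the original bytes."""
    html = response.split(b"\r\n\r\n", 1)[1].decode("ascii")
    encoded = "".join(re.findall(r"<p>(.*?)</p>", html))
    return base64.b64decode(encoded)

secret = b"quarterly figures \x00\xff" * 5
response = camouflage(secret)
print(uncover(response) == secret)   # True
```

As the text notes, this is only likely to pass a casual glance; anyone reading the page would see paragraphs of meaningless characters.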

CHAPTER 3

DISCUSSION

This section states and explains where exactly steganography stands at present: why, despite the presence of many other methods of secure communication such as cryptography, steganography is now often preferred over the rest. It also explains that if there is a way to hide information in any media, then there is also a way to catch the stego-media carrying the covert information while it is being transmitted like any other carrier media.

STEGANALYSIS

Since steganography provides a medium for covert communication, it can be abused by anti-social elements. They can use any media to send their messages through the internet without being detected by anyone. Hence it is necessary to analyze any suspected media for a hidden message. This is where steganalysis comes in.

Steganalysis is the study of detecting messages hidden using steganography. The goal of steganalysis is to identify suspected packages, determine whether or not they have a payload encoded into them, and, if possible, recover that payload. Steganalysis generally starts with a pile of suspect data files, but little information about which of the files, if any, contain a payload.

Basic Technique: A set of unmodified files of the same type, and ideally from the same source as the set being inspected, is analyzed for various statistics. Some of these are as simple as spectrum analysis; others look for inconsistencies in the way the data has been compressed. One case where detection of suspect files is straightforward is when the original, unmodified carrier is available for comparison. Comparing the package against the original file will yield the differences caused by encoding the payload, and thus the payload can be extracted.
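
The comparison case can be sketched as follows. The carrier bytes, the two-byte payload and the helper names are invented for the illustration, and the sketch assumes the analyst already suspects LSB embedding and knows the payload length.

```python
def compare_with_original(original, suspect):
    """Return (index, xor) for every byte that differs between the two files."""
    if len(original) != len(suspect):
        raise ValueError("files differ in length")
    return [(i, a ^ b) for i, (a, b) in enumerate(zip(original, suspect)) if a != b]

def read_lsbs(data, n_bytes):
    """Reassemble bytes from the least significant bits of the data."""
    out = bytearray()
    for i in range(0, n_bytes * 8, 8):
        byte = 0
        for b in data[i:i + 8]:
            byte = (byte << 1) | (b & 1)
        out.append(byte)
    return bytes(out)

# Build a hypothetical original and a suspect copy carrying b"OK" in its LSBs.
original = bytes(100 + (i * 7) % 50 for i in range(64))
suspect = bytearray(original)
secret_bits = [(byte >> i) & 1 for byte in b"OK" for i in range(7, -1, -1)]
for i, bit in enumerate(secret_bits):
    suspect[i] = (suspect[i] & 0xFE) | bit
suspect = bytes(suspect)

diffs = compare_with_original(original, suspect)
only_lsb_changed = all(x == 1 for _, x in diffs)   # every difference is in bit 0
print(only_lsb_changed, read_lsbs(suspect, 2))     # True b'OK'
```

Differences confined to the lowest bit of each byte are a strong hint that LSB embedding was used, which tells the analyst where to read the payload from.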

Another Technique: Most programs which are employed to detect stego-media use the concept of randomness.

With most files that do not contain random data, you would get a histogram that has peaks and valleys, where certain characters appear often and others appear only infrequently. Figure 4.1 is a histogram for unencrypted data.

On the other hand, with encrypted data, because it is random, you get a much flatter histogram, as you can see in Figure 4.2. Compared to Figure 4.1, Figure 4.2 looks extremely flat: every character appears with nearly equal frequency.

Hence the degree of randomness depicted by the histogram can be used to select media for inspection even if an unmodified file from the same source is not available for comparison.
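
One simple way to put a number on how flat a byte histogram is uses Shannon entropy, measured in bits per byte. This particular measure is an assumption of the sketch, not necessarily what any given detection program uses, and the pseudo-random bytes below merely stand in for encrypted data.

```python
import math
import random
from collections import Counter

def byte_histogram(data):
    """Count how often each byte value occurs."""
    return Counter(data)

def entropy_bits_per_byte(data):
    """Shannon entropy: close to 8 for a flat histogram, lower for peaky ones."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in byte_histogram(data).values())

text = ("steganography hides the existence of a message inside "
        "an ordinary looking carrier " * 50).encode("ascii")
rng = random.Random(1)
random_like = bytes(rng.randrange(256) for _ in range(4096))  # stand-in for ciphertext

print(round(entropy_bits_per_byte(text), 2))
print(round(entropy_bits_per_byte(random_like), 2))
```

The plain text uses only a small set of characters with uneven frequencies, so its entropy is well below 8 bits per byte, while the random-looking data comes out close to 8.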

APPLICATIONS

Data hiding has been the major application of steganography since its inception. However, by the 21st century, steganography has found many more applications which are not meant for any kind of secret communication. Some of the prominent applications of steganography are explained below:

1. Digital Watermarking

A digital watermark is a kind of marker covertly embedded in a noise-tolerant signal such as audio or image data. A popular application of watermarking techniques is to provide proof of ownership of digital data by embedding copyright statements into video or image products; the embedded information therefore need not have any relation to the carrier media. Like traditional watermarks, digital watermarks can be made perceptible under certain conditions, i.e. after applying some algorithm, and imperceptible at all other times, provided that the digital watermark does not distort the carrier signal.

Both steganography and digital watermarking employ steganographic techniques to embed data covertly in noisy signals. Steganography aims for imperceptibility to human senses, whereas digital watermarking treats robustness as its top priority, even if the watermark becomes perceptible.

Digital watermarking may be used for a wide range of applications, such as:

Copyright protection

Source tracking (different recipients get differently watermarked content)

Broadcast monitoring (television news often contains watermarked video from international agencies)

Authentication of the media

2. Tamper Proofing of Files

When a file containing sensitive information is being transferred or transmitted, there is a risk of it being altered by some person or organization, which could have serious consequences. The principle of steganography can help here. Data, just like a watermark, can be embedded into the file through steganographic methods. If the file is changed by an unauthorized person, the embedded data will also change. Hence the embedded data can be checked to see whether the file has been tampered with. The scheme relies on it not being possible to remove the embedded data without destroying the file itself. The presence of the embedded data in the file therefore acts as a means of making the file tamper proof.
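
One way to realise this idea is a fragile watermark: a digest of the file's content is hidden in the file's own least significant bits, and any later change breaks the match. The sketch below is an assumption-laden illustration, not the report's method. It uses SHA-256 from the standard `hashlib` module, hides the 256 digest bits in the first 256 bytes, and ignores all least significant bits when hashing so that the embedding does not disturb the digest. Because the hash is unkeyed, an attacker who knows the scheme could recompute it; a keyed construction would be needed against such an attacker.

```python
import hashlib

def _digest_bits(data):
    """SHA-256 over the data with every least significant bit cleared."""
    cleared = bytes(b & 0xFE for b in data)
    digest = hashlib.sha256(cleared).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(7, -1, -1)]

def seal(data):
    """Hide the digest in the LSBs of the first 256 bytes."""
    if len(data) < 256:
        raise ValueError("need at least 256 bytes to carry the digest")
    sealed = bytearray(data)
    for i, bit in enumerate(_digest_bits(data)):
        sealed[i] = (sealed[i] & 0xFE) | bit
    return bytes(sealed)

def is_intact(sealed):
    """Recompute the digest and compare it with the hidden copy."""
    return all((sealed[i] & 1) == bit for i, bit in enumerate(_digest_bits(sealed)))

data = bytes((i * 31) % 256 for i in range(1024))   # stand-in for file content
sealed = seal(data)
tampered = bytearray(sealed)
tampered[300] ^= 0x10                               # alter one byte
print(is_intact(sealed), is_intact(bytes(tampered)))  # True False
```

A limitation of this particular sketch is that changes confined to the least significant bits beyond the first 256 bytes go unnoticed.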

3. Steganography in Modern Printers

Steganography is used by some modern printers, including HP and Xerox brand

color laser printers. Tiny yellow dots are added to each page. The dots are barely

visible and contain encoded printer serial numbers, as well as date and time stamps.

4. Steganography in DNA

Even biological data, stored on DNA, may be a candidate for hiding messages, as

biotech companies seek to prevent unauthorized use of their genetically engineered

material. The technology is already in place for this: three New York researchers

successfully hid a secret message in a DNA sequence and sent it across the country.

STEGANOGRAPHY VS CRYPTOGRAPHY

Cryptography and steganography are well-known and widely used techniques that manipulate information in order to encipher it or to hide its existence, respectively. Steganography is the art and science of communicating in a way which hides the existence of the communication: cryptography scrambles a message so it cannot be understood, while steganography hides the message so it cannot be seen. Even though both methods provide security, there are some significant differences between the two.

However, it is steganography which comes out on top, providing a better sense of security than cryptography.

The advantage of steganography over cryptography alone is that messages do not attract attention to themselves, to messengers, or to recipients. Whereas the goal of cryptography is to make data unreadable by a third party, the goal of steganography is to hide the data from a third party.

Steganography

Only the sender and receiver know of the existence of the message.

Prevents discovery of the very existence of the communication.

Does not alter the structure and content of the message.

Cryptography

The existence of the message is known to all.

Prevents an unauthorized party from discovering the contents of the communication.

Alters the structure of the message.

But still, steganography and cryptography can be used together to ensure more

robust and foolproof security of the covert message.
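
Combining the two can be sketched as "encrypt first, then embed". The cipher below is a deliberately simple toy (a keystream derived from SHA-256 and XORed with the message) chosen only so the example runs with the standard library; it is an assumption of this sketch and should not be mistaken for a vetted encryption scheme. The LSB helpers and all names are likewise illustrative.

```python
import hashlib

def toy_cipher(data, key):
    """XOR with a SHA-256 based keystream; applying it twice restores the data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(d ^ p for d, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def embed_lsb(cover, payload):
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("payload too large for this cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)

def extract_lsb(stego, n_bytes):
    out = bytearray()
    for i in range(0, n_bytes * 8, 8):
        byte = 0
        for b in stego[i:i + 8]:
            byte = (byte << 1) | (b & 1)
        out.append(byte)
    return bytes(out)

cover = bytes((i * 37) % 256 for i in range(512))   # stand-in carrier
key = b"shared secret"
stego = embed_lsb(cover, toy_cipher(b"meet at noon", key))
print(toy_cipher(extract_lsb(stego, 12), key))      # b'meet at noon'
```

Even if the hidden bits are found and extracted, they are still ciphertext, so the message remains protected by the strength of the encryption.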

FUTURE SCOPE OF STEGANOGRAPHY

One thing that you can certainly say about the future of Steganography: Change and

the introduction of new approaches will continue to occur on a frequent basis. The first

area where we can appreciate Stego making strides is in the technology used to produce

and break it.

Improved Resistance to Analysis

As stego-media gets more sophisticated, its resistance to being analyzed, or even recognized, will improve. In the current state of stego technology, if you suspect stego-media is being used, it is relatively easy to detect it. Once you detect it, you can probably retrieve the contents, which would then be protected only by the strength of the encryption applied to them, if any. In the future, efforts will be made to make stego undetectable and irretrievable except by those for whom it is intended. The ability to manipulate data, print out a hard copy, rescan it, and still be able to retrieve the hidden data would be an intriguing scenario.

Higher Encoding Density

The ability to hide huge amounts of data with stego is another logical area for

improvement. Currently stego can use only a certain amount of data bits in a host

file without degrading the file to the point where it’s obvious that stego is being

used. As stego is used in crimes such as corporate espionage, there will be more

demand to hide larger amounts of data. Large-scale stego, where you can perform

compression on huge amounts of data on the fly and store it in small files, is one

possible future.

CONCLUSION

Steganography is a really interesting subject, and lies outside the mainstream cryptography and system administration that most of us deal with day after day. “You never know if a message is hidden”: this is the dilemma that empowers steganography.

The main emphasis of steganography is on providing covert communication in an overt environment, but this does not mean it is limited to data hiding. Steganography is now also finding ground in copyright protection, privacy protection, surveillance, and checking genuineness. I believe that steganography will continue to grow in importance as a protection mechanism.

Despite the advantages it has over cryptography, steganography is not intended to replace cryptography but rather to supplement it. If a message is encrypted and then hidden with a steganographic method, it gains an additional layer of protection and the chance of the hidden message being detected is reduced. The opposite order is also possible: the secret message is first embedded into the carrier file, and the carrier file is then encrypted.

Hence steganography and cryptography together can unlock a new world of hiding data and reach new heights.

However, with the advancement of data hiding and encryption, it is to be expected that steganography will be abused for greedy and illicit purposes. With continuous advancements in technology, it is also expected that more efficient and advanced steganalysis techniques will emerge in the near future and help law enforcement to better detect illicit materials transmitted through the Internet.

REFERENCES

Site: http://en.wikipedia.org/wiki/Steganography

Site: http://www.sarc-wv.com/news.aspx

Site: http://www.strangehorizons.com/2001/20011008/steganography.shtml

Site: http://www.jjtc.com/stegdoc

Eric Cole, “Hiding in Plain Sight”, Wiley Publishing Inc., 2003

S. Katzenbeisser, F. Petitcolas, “Information Hiding: Techniques for Steganography and Digital Watermarking”