A Paper Presentation on Steganography

www.1000projects.com

www.fullinterview.com

www.chetanasprojects.com

A PAPER PRESENTATION ON Steganography

Abstract

Steganography is the dark cousin of cryptography, the use of codes. While cryptography provides privacy, steganography is intended to provide secrecy. Privacy is what you need when you use your credit card on the Internet -- you don't want your number revealed to the public. For this, you use cryptography, and send a coded pile of gibberish that only the web site can decipher. Though your code may be unbreakable, any hacker can look and see you've sent a message. For true secrecy, you don't want anyone to know you're sending a message at all. Steganography is the art and science of writing hidden messages in such a way that no-one apart from the sender and intended recipient even realizes there is a hidden message, a form of security through obscurity. By contrast, cryptography obscures the meaning of a message, but it does not conceal the fact that there is a message. Today, the term steganography includes the concealment of digital information within computer files. For example, the sender might start with an ordinary-looking image file, then adjust the color of every 100th pixel to correspond to a letter in the www.1000projects.com






alphabet—a change so subtle that someone who isn't actively looking for it is unlikely to notice it. The advantage of steganography over cryptography alone is that messages do not attract attention to themselves, to messengers, or to recipients. An unhidden coded message, no matter how unbreakable it is, will arouse suspicion and may in itself be incriminating, as in countries where encryption is illegal. Often, steganography and cryptography are used together to ensure security of the covert message.

Definition:

The word steganography is of Greek origin and means "covered, or hidden writing". It is the science of hiding information. Whereas the goal of cryptography is to make data unreadable by a third party, the goal of steganography is to hide the data from a third party. There are a large number of steganographic methods that most of us are familiar with (especially if you watch a lot of spy movies!), ranging from invisible ink and microdots to secreting a hidden message in the second letter of each word of a large body of text and spread spectrum radio communication. With computers and networks, there are many other ways of hiding information, such as:

Covert channels (e.g., Loki and some distributed denial-of-service tools use the Internet Control Message Protocol, or ICMP, as the communications channel between the "bad guy" and a compromised system)

Hidden text within Web pages

Hiding files in "plain sight" (e.g., what better place to "hide" a file than with an important sounding name in the c:\winnt\system32 directory?)

Null ciphers (e.g., using the first letter of each word to form a hidden message in an otherwise innocuous text)

Steganography today, however, is significantly more sophisticated than the examples above suggest, allowing a user to hide large amounts of information within image and audio files. These forms of steganography often are used in conjunction with cryptography so that the information is doubly protected; first it is encrypted and then hidden so that an adversary has to first find the information (an often difficult task in and of itself) and then decrypt it.

History of Steganography:

Steganography has been widely used in historical times, especially before cryptographic systems were developed. Examples of historical usage include:







Hidden messages in wax tablets: in ancient Greece, people wrote messages on the wood, and then covered it with wax so that it looked like an ordinary, unused tablet.

Hidden messages on messenger's body: also in ancient Greece. Herodotus tells the story of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and exposed by shaving his head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks:

1. It is impossible to send a message as quickly as the slave can travel, because it takes months to grow hair.

2. A slave can only be used once for this purpose. (This is why slaves were used: they were considered expendable.)

A more subtle method, nearly as old, is to use invisible ink. Described as early as the first century AD, invisible inks were commonly used for serious communications until WWII. The simplest are organic compounds, such as lemon juice, milk, or urine, all of which turn dark when held over a flame. In 1641, Bishop John Wilkins suggested onion juice, alum, ammonia salts, and for glow-in-the dark writing the "distilled Juice of Glowworms." Modern invisible inks fluoresce under ultraviolet light and are used as anti-counterfeit devices. For example, "VOID" is printed on checks and other official documents in an ink that appears under the strong ultraviolet light used for photocopies.

During the American Revolution, both sides made extensive use of chemical inks that required special developers to detect, though the British had discovered the American formula by 1777. Throughout World War II, the two sides raced to create new secret inks and to find developers for the ink of the enemy. In the end, though, the volume of communications rendered invisible ink impractical.

With the advent of photography, microfilm was created as a way to store a large amount of information in a very small space. In both world wars, the Germans used "microdots" to hide information, a technique which J. Edgar Hoover called "the enemy's masterpiece of espionage." A secret message was photographed, reduced to the size of a printed period, and then pasted into an innocuous cover message, magazine, or newspaper. The Americans caught on only when tipped by a double agent: "Watch out for the dots -- lots and lots of little dots."

Modern updates to these ideas use computers to make the hidden message even less noticeable. For example, laser printers can adjust spacing of lines and characters by less than 1/300th of an inch. To hide a zero, leave a standard space, and to hide a one leave 1/300th of an inch more than usual. Varying the spacing over an entire document can hide a short binary message that is undetectable by the human eye. Even better, this sort of trick stands up well to repeated photocopying.




http://en.wikipedia.org/wiki/Greece

http://en.wikipedia.org/wiki/Plan

http://en.wikipedia.org/wiki/Invasion

http://en.wikipedia.org/wiki/Persian_Empire

http://en.wikipedia.org/wiki/Shave

http://en.wikipedia.org/wiki/Slavery

http://en.wikipedia.org/wiki/Tattoo

http://en.wikipedia.org/wiki/Herodotus

http://en.wikipedia.org/wiki/Wax




All of these approaches to steganography have one thing in common -- they hide the secret message in the physical object which is sent. The cover message is merely a distraction, and could be anything. Of the innumerable variations on this theme, none will work for electronic communications because only the pure information of the cover message is transmitted. Nevertheless, there is plenty of room to hide secret information in a not-so-secret message. It just takes ingenuity.

The monk Johannes Trithemius, considered one of the founders of modern cryptography, had ingenuity in spades. His three volume work Steganographia, written around 1500, describes an extensive system for concealing secret messages within innocuous texts. On its surface, the book seems to be a magical text, and the initial reaction in the 16th century was so strong that Steganographia was only circulated privately until publication in 1606. But less than five years ago, Jim Reeds of AT&T Labs deciphered mysterious codes in the third volume, showing that Trithemius' work is more a treatise on cryptology than demonology. Reeds' fascinating account of the code breaking process is quite readable.

One of Trithemius' schemes was to conceal messages in long invocations of the names of angels, with the secret message appearing as a pattern of letters within the words. For example, as every other letter in every other word:

Padiel aporsy mesarpon omeuas peludyn malpreaxo which reveals "prymus apex."

Another clever invention in Steganographia was the "Ave Maria" cipher. The book contains a series of tables, each of which has a list of words, one per letter. To code a message, the message letters are replaced by the corresponding words. If the tables are used in order, one table per letter, then the coded message will appear to be an innocent prayer.

The modern version of Trithemius' scheme is undoubtedly SpamMimic. This simple system hides a short text message in a letter that looks exactly like spam, which is as ubiquitous on the Internet today as innocent prayers were in the 16th century. SpamMimic uses a "grammar" to make the messages. For example, a simple sentence in English is constructed with a subject, verb, and object, in that order. Given lists of 26 subjects, 26 verbs, and 26 objects, we could construct a three word sentence that encodes a three letter message. If you carefully prescribe a set of rules, you can make a grammar that describes spam.

Unfortunately, for serious users, every scheme we've seen is unacceptable. All are well known, and once a technique is suspected the hidden messages are easy to discover. Worse, a ten page document whose line spacing spells out a secret message is completely incriminating, even if the message is in an unbreakable code. A good steganographic technique should provide secrecy even if everyone knows it's being used.







Steganographic Methods:

The following formula provides a very generic description of the pieces of the steganographic process:

cover_medium + hidden_data + stego_key = stego_medium

In this context, the cover_medium is the file in which we will hide the hidden_data, which may also be encrypted using the stego_key. The resultant file is the stego_medium (which will, of course. be the same type of file as the cover_medium). The cover_medium (and, thus, the stego_medium) are typically image or audio files. Here we will focus on image files and will, therefore, refer to the cover_image and stego_image.

Before discussing how information is hidden in an image file, it is worth a fast review of how images are stored in the first place. An image file is merely a binary file containing a binary representation of the color or light intensity of each picture element (pixel) comprising the image.

Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of up to 256 colors forming a palette for this image, each color denoted by an 8-bit value. A 24-bit color scheme, as the term suggests, uses 24 bits per pixel and provides a much better set of colors. In this case, each pixel is represented by three bytes, each byte representing the intensity of the three primary colors red, green, and blue (RGB), respectively. The Hypertext Markup Language (HTML) format for indicating colors in a Web page often uses a 24-bit format employing six hexadecimal digits, each pair representing the amount of red, blue, and green, respectively. The color orange, for example, would be displayed with red set to 100% (decimal 255, hex FF), green set to 50% (decimal 127, hex 7F), and no blue (0), so we would use "#FF7F00" in the HTML code.

The size of an image file, then, is directly related to the number of pixels and the granularity of the color definition. A typical 640x480 pix image using a palette of 256 colors would require a file about 307 KB in size (640 • 480 bytes), whereas a 1024x768 pix high-resolution 24-bit color image would result in a 2.36 MB file (1024 • 768 • 3 bytes).

To avoid sending files of this enormous size, a number of compression schemes have been developed over time, notably Bitmap (BMP), Graphic Interchange Format (GIF), and Joint Photographic Experts Group (JPEG) file types. Not all are equally suited to steganography, however.







GIF and 8-bit BMP files employ what is known as lossless compression, a scheme that allows the software to exactly reconstruct the original image. JPEG, on the other hand, uses lossy compression, which means that the expanded image is very nearly the same as the original but not an exact duplicate. While both methods allow computers to save storage space, lossless compression is much better suited to applications where the integrity of the original information must be maintained, such as steganography. While JPEG can be used for stego applications, it is more common to embed data in GIF or BMP files.

The simplest approach to hiding data within an image file is called least significant bit (LSB) insertion. In this method, we can take the binary representation of the hidden_data and overwrite the LSB of each byte within the cover_image. If we are using 24-bit color, the amount of change will be minimal and indiscernible to the human eye. As an example, suppose that we have three adjacent pixels (nine bytes) with the following RGB encoding:

10010101 00001101 1100100110010110 00001111 1100101010011111 00010000 11001011

Now suppose we want to "hide" the following 9 bits of data (the hidden data is usually compressed prior to being hidden): 101101101. If we overlay these 9 bits over the LSB of the 9 bytes above, we get the following (where bits in bold have been changed):

10010101 00001100 1100100110010111 00001110 1100101110011111 00010000 11001011

Note that we have successfully hidden 9 bits but at a cost of only changing 4, or roughly 50%, of the LSBs.

This description is meant only as a high-level overview. Similar methods can be applied to 8-bit color but the changes, as the reader might imagine, are more dramatic. Gray-scale images, too, are very useful for steganographic purposes. One potential problem with any of these methods is that they can be found by an adversary who is looking. In addition, there are other methods besides LSB insertion with which to insert hidden information.

Without going into any detail, it is worth mentioning Steganalysis, the art of detecting and breaking steganography. One form of this analysis is to examine the color palette of a graphical image. In most images, there will be a unique binary encoding of each individual color. If the image contains hidden data, however, many colors in the palette will have duplicate binary







encodings since, for all practical purposes, we can't count the LSB. If the analysis of the color palette of a given file yields many duplicates, we might safely conclude that the file has hidden information.

But what files would you analyze? Suppose I decide to post a hidden message by hiding it in an image file that I post at an auction site on the Internet. The item I am auctioning is real so a lot of people may access the site and download the file; only a few people know that the image has special information that only they can read. And we haven't even discussed hidden data inside audio files! Indeed, the quantity of potential cover files makes steganalysis a Herculean task.

The key innovation in recent years was to choose an innocent looking cover that contains plenty of random information, called white noise. You can hear white noise as a nearly silent hiss of a blank tape playing. The secret message replaces the white noise, and if done properly it will appear to be as random as the noise was. The most popular methods use digitized photographs, so let's explore these techniques in some depth. Digitized photographs and video also harbor plenty of white noise. A digitized photograph is stored as an array of colored dots, called pixels. Each pixel typically has three numbers associated with it, one each for red, green, and blue intensities, and these values often range from 0-255. Each number is stored as eight bits (zeros and ones), with a one worth 128 in the most significant bit (on the left), then 64, 32, 16, 8, 4, 2, and a one in the least significant bit (on the right) worth just 1.

A difference of one or two in the intensities is imperceptible, and, in fact, a digitized picture can still look good if the least significant four bits of intensity are altered -- a change of up to 16 in the color's value. This gives plenty of space to hide a secret message. Text is usually stored with 8 bits per letter, so we could hide 1.5 letters in each pixel of the cover photo. A 640x480 pixel image, the size of a small computer monitor, can hold over 400,000 characters. That's a whole novel hidden in one modest photo!

Hiding a secret photo in a cover picture is even easier. Line them up, pixel by pixel. Take the important four bits of each color value for each pixel in the secret photo (the left ones). Replace the unimportant four bits in the cover photo (the right ones). The cover photo won't change







much, you won't lose much of the secret photo, but to an untrained eye you're sending a completely innocuous picture.

Unfortunately, anyone who cares to find your hidden image probably has a trained eye. The intensity values in the original cover image were white noise, i.e. random. The new values are strongly patterned, because they represent significant information of the secret image. This is the sort of change which is easily detectable by statistics. So the final trick to good steganography is make the message look random before hiding it.

One solution is simply to encode the message before hiding it. Using a good code, the coded message will appear just as random as the picture data it is replacing. Another approach is to spread the hidden information randomly over the photo. "Pseudo-random number" generators take a starting value, called a seed, and produce a string of numbers which appear random. For example, pick a number between 0 and 16 for a seed. Multiply your seed by 3, add 1, and take the remainder after division by 17. Repeat, repeat, repeat. Unless you picked 8, you'll find yourself somewhere in the sequence 1, 4, 13, 6, 2, 7, 5, 16, 15, 12, 3, 10, 14, 9, 11, 0, 1, 4, . . . which appears somewhat random. To spread a hidden message randomly over a cover picture, use the pseudo-random sequence of numbers as the pixel order. Descrambling the photo requires knowing the seed that started the pseudo-random number generator.

Steganography Examples:







FIGURE 1. The cover_image (5th wave.gif), hidden_data file (virusdetectioninfo.txt), and stego_key.

The following examples come from Andy Brown's S-Tools for Windows. S-Tools allow users to hide information into BMP, GIF, or WAV files. The basic scheme of the program is straight-www.1000projects.com






forward; you drag an image or audio file into the S-Tools active window to act as the cover_medium, drag the hidden_data file onto the cover_medium, and then provide a stego_key for encryption. The result is the stego_medium. All of this is shown in Figure 1:

1. I highlighted the GIF image file 5th wave.gif and dragged it to the S-Tools active window. Note that S-Tools reports that up to 138,547 bytes can be hidden in this image file.

2. I next highlighted a 14 KB text file called virusdetectioninfo.txt and dragged it onto the image file in S-Tools.

3. A dialog box pops up telling me that I am hiding 6,019 bytes of data and asks for a passphrase with which to encrypt the hidden text; the default secret key crypto scheme used by S-Tools is the International Data Encryption Algorithm (IDEA).













FIGURE 3. Extracting hidden information from the image file.







4. Once the image file has been received, the user merely drags the file to S-Tools and right-clicks over the image, specifying the Reveal option. A dialog box will pop up requesting the passphrase. Figure 3 shows the information about the hidden archive file, and allows the user to open the file.

Applications:

With these new techniques, a hidden message is indistinguishable from white noise. Even if the message is suspected, there is no proof of its existence. To actually prove there was a message, and not just randomness, the code needs to be cracked or the random number seed guessed. This feature of modern steganography is called "plausible deniability."

All of this sounds fairly nefarious, and in fact the obvious uses of steganography are for things like espionage. But there are a number of peaceful applications. The simplest and oldest are used in map making, where cartographers sometimes add a tiny fictional street to their maps, allowing them to prosecute copycats. A similar trick is to add fictional names to mailing lists as a check against unauthorized resellers.

Most of the newer applications use steganography like a watermark, to protect a copyright on information. Photo collections, sold on CD, often have hidden messages in the photos which allow detection of unauthorized use. The same technique applied to DVDs is even more effective, since the industry builds DVD recorders to detect and disallow copying of protected DVDs.

Even biological data, stored on DNA, may be a candidate for hidden messages, as biotech companies seek to prevent unauthorized use of their genetically engineered material. The technology is already in place for this: three New York researchers successfully hid a secret message in a DNA sequence and sent it across the country. Sound like science fiction? A secret message in DNA provided Star Trek's explanation for the dubious fact that all aliens seem to be humans in prosthetic makeup!

Maybe, as in Star Trek, there really is a message hidden somewhere for humans to find. In the real world, the place to look for such a message is space, and humans have been looking for quite some time. Marconi, the inventor of radio, speculated that strange signals heard by his company might be signals from another planet. To his credit, he was hearing these signals years before his competitors, but today they are known to be caused by lightning strikes.







In 1924, Mars passed relatively close to Earth, and the U.S. Army and Navy actually ordered their stations to quiet transmissions and listen for signals. They found nothing. In 1960, Dr. Frank Drake and a cadre of radio technicians used their 85 foot radio telescope for one of the first extensive studies of signals from space. They listened to Tau Ceti and Epsilon Erdani for 150 hours, and found nothing.

Today, the search for messages from space is underway on an unbelievable scale. The SETI@home project, based in Berkeley, has convinced millions of people to use their home computers in the search for signals. Their simple marketing trick was to package the calculations in a nifty screensaver, and now SETI@home is the largest computation in history. They've been looking for more than two years, with a telescope a thousand feet wide, but still they have found nothing.

Conclusion:

Steganography is a fascinating and effective method of hiding data that has been used throughout history. Methods that can be employed to uncover such devious tactics, but the first step are awareness that such methods even exist. There are many good reasons as well to use this type of data hiding, including watermarking or a more secure central storage method for such things as passwords, or key processes. Regardless, the technology is easy to use and difficult to detect. The more that you know about its features and functionality, the more ahead you will be in the game.




A Paper Presentation on Steganography

Documents

A Paper Presentation on Steganography