AN IMAGE STEGANOGRAPHY IMPLEMENTATION FOR JPEG-COMPRESSED IMAGES

Chapter 1

Introduction

1.1 Introduction:

Steganography is the art and science of writing hidden messages in such a way that no

one, apart from the sender and intended recipient, suspects the existence of the message, a form

of security through obscurity. The word steganography is of Greek origin and means "concealed

writing". The first recorded use of the term was in 1499 by Johannes Trithemius in his

Steganographia, a treatise on cryptography and steganography disguised as a book on magic.

Generally, messages will appear to be something else: images, articles, shopping lists, or some

other covertext and, classically, the hidden message may be in invisible ink between the visible

lines of a private letter.

The advantage of steganography, over cryptography alone, is that messages do not attract

attention to themselves. Plainly visible encrypted messages—no matter how unbreakable—will

arouse suspicion, and may in themselves be incriminating in countries where encryption is

illegal.[1] Therefore, whereas cryptography protects the contents of a message, steganography can

be said to protect both messages and communicating parties.

Steganography includes the concealment of information within computer files. In digital

steganography, electronic communications may include steganographic coding inside of a

transport layer, such as a document file, image file, program or protocol. Media files are ideal for

steganographic transmission because of their large size. As a simple example, a sender might

start with an innocuous image file and adjust the color of every 100th pixel to correspond to a

letter in the alphabet, a change so subtle that someone not specifically looking for it is unlikely to

notice it.

College Name Page 1

1.2 Ancient steganography

The first recorded uses of steganography can be traced back to 440 BC when Herodotus

mentions two examples of steganography in The Histories of Herodotus.[2] Demaratus sent a

warning about a forthcoming attack to Greece by writing it directly on the wooden backing of a

wax tablet before applying its beeswax surface. Wax tablets were in common use then as

reusable writing surfaces, sometimes used for shorthand. Another ancient example is that of

Histiaeus, who shaved the head of his most trusted slave and tattooed a message on it. After his

hair had grown the message was hidden. The purpose was to instigate a revolt against the

Persians.

1.3 Steganographic techniques

Three broad categories of steganographic technique are available:

Physical steganography.

Digital steganography.

Printed steganography.

1.3.1 Physical steganography

Steganography has been widely used, both historically and in the present day. Possible permutations are endless and known examples include:

Steganart example. Within this picture, the letter positions of a hidden message are

represented by increasing numbers (1 to 20), and a letter value is given by its intersection

position in the grid. For instance, the first letter of the hidden message is at the intersection of 1

and 4. So, after a few tries, the first letter of the message seems to be the 14th letter of the

alphabet; the last one (number 20) is the 5th letter of the alphabet.

Hidden messages within wax tablets: in ancient Greece, people wrote messages on the

wood, then covered it with wax upon which an innocent covering message was written.

Hidden messages on messenger's body: also in ancient Greece. Herodotus tells the story

of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and

exposed by shaving his head again. The message allegedly carried a warning to Greece

about Persian invasion plans. This method has obvious drawbacks such as delayed

transmission while waiting for the slave's hair to grow, and its one-off use since

additional messages require additional slaves. In WWII, the French Resistance sent

some messages written on the backs of couriers using invisible ink.

Hidden messages on paper written in secret inks, under other messages or on the blank

parts of other messages.

Messages written in Morse code on knitting yarn and then knitted into a piece of clothing

worn by a courier.

Messages written on the back of postage stamps.

During and after World War II, espionage agents used photographically produced

microdots to send information back and forth. Microdots were typically minute, about the

size of a typewritten period or smaller. WWII microdots needed to be

embedded in the paper and covered with an adhesive (such as collodion). This was

reflective and thus detectable by viewing against glancing light. Alternative techniques

included inserting microdots into slits cut into the edge of post cards.

During World War II, a spy for the Japanese in New York City, Velvalee Dickinson, sent

information to accommodation addresses in neutral South America. She was a dealer in

dolls, and her letters discussed how many of this or that doll to ship. The stegotext was

the doll orders; the concealed 'plaintext' was itself encoded and gave information about

ship movements, etc. Her case became somewhat famous and she became known as the

Doll Woman.

Cold War counter-propaganda. During 1968, crew members of the USS Pueblo (AGER-

2) intelligence ship held as prisoners by North Korea, communicated in sign language

during staged photo opportunities, informing the United States they were not defectors

but rather were being held captive by the North Koreans. In other photos presented to

the US, crew members gave "the finger" to the unsuspecting North Koreans, in an

attempt to discredit photos that showed them smiling and comfortable.[3]

1.3.2 Digital steganography

Modern steganography entered the world in 1985 with the advent of the personal

computer applied to classical steganography problems. [4] Development following that was slow,

but has since taken off, going by the number of 'stego' programs available: Over 725 digital

steganography applications have been identified by the Steganography Analysis and Research

Center. [5] Digital steganography techniques include:

Image of a tree. By removing all but the last 2 bits of each color component, an almost

completely black image results. Making the resulting image 85 times brighter results in the

image below.

Image of a cat extracted from above image.

Concealing messages within the lowest bits of noisy images or sound files.

Concealing data within encrypted data. The data to be concealed is first encrypted before

being used to overwrite part of a much larger block of encrypted data.

Chaffing and winnowing.

Mimic functions convert one file to have the statistical profile of another. This can thwart

statistical methods that help brute-force attacks identify the right solution in a ciphertext-

only attack.

Concealed messages in tampered executable files, exploiting redundancy in the i386

instruction set.

Pictures embedded in video material (optionally played at slower or faster speed).

Injecting imperceptible delays to packets sent over the network from the keyboard.

Delays in keypresses in some applications (telnet or remote desktop software) can mean a

delay in packets, and the delays in the packets can be used to encode data.

Content-Aware Steganography hides information in the semantics a human user assigns

to a datagram. These systems offer security against a non-human adversary/warden.

Blog-steganography. Messages are fractionalized and the (encrypted) pieces are added as

comments of orphaned web-logs (or pin boards on social network platforms). In this case

the selection of blogs is the symmetric key that sender and recipient are using; the carrier

of the hidden message is the whole blogosphere.
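
The tree-and-cat figure in this section relies on the two lowest bits of each colour component. The following is a minimal sketch of that idea, assuming 8-bit colour values held in plain Python lists rather than a real image file. The function names `embed_2bits` and `extract_2bits` and the sample values are illustrative only and are not taken from any library.

```python
def embed_2bits(cover: int, secret: int) -> int:
    """Store the top 2 bits of an 8-bit secret value in the low 2 bits of cover."""
    return (cover & 0b11111100) | (secret >> 6)


def extract_2bits(stego: int) -> int:
    """Keep only the low 2 bits, then scale 0..3 back up to 0..255."""
    # 3 * 85 = 255, which is why the caption speaks of "85 times brighter".
    return (stego & 0b00000011) * 85


cover_pixels = [200, 13, 77, 255]    # hypothetical colour components of the cover image
secret_pixels = [255, 0, 128, 64]    # hypothetical colour components of the hidden image

stego_pixels = [embed_2bits(c, s) for c, s in zip(cover_pixels, secret_pixels)]
recovered = [extract_2bits(p) for p in stego_pixels]

print(stego_pixels)  # each value differs from the cover by at most 3
print(recovered)     # a coarse, 4-level version of the hidden image
```

Because only the two lowest bits change, each colour component moves by at most 3 out of 255, which is why the cover image looks unchanged to a casual viewer.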

1.3.3 Printed steganography

Digital steganography output may be in the form of printed documents. A message, the

plaintext, may be first encrypted by traditional means, producing a ciphertext. Then, an

innocuous cover text is modified in some way so as to contain the ciphertext, resulting in the

stegotext. For example, the letter size, spacing, typeface, or other characteristics of a covertext

can be manipulated to carry the hidden message. Only a recipient who knows the technique used

can recover the message and then decrypt it. Francis Bacon developed Bacon's cipher as such a

technique.
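
As a rough illustration of how a typographic feature can carry a message, the sketch below applies the idea behind Bacon's cipher. It makes several assumptions: each secret letter becomes a 5-bit group (the modern 26-letter variant), upper and lower case stand in for the two typefaces, the secret contains only the letters A to Z, and no encryption step is applied first. The function names `hide` and `reveal` are hypothetical.

```python
def hide(secret: str, cover: str) -> str:
    """Carry each secret letter as 5 bits: upper case = 1, lower case = 0."""
    bits = "".join(format(ord(ch) - ord("A"), "05b") for ch in secret.upper())
    out = []
    i = 0
    for ch in cover.lower():
        if ch.isalpha() and i < len(bits):
            out.append(ch.upper() if bits[i] == "1" else ch)
            i += 1
        else:
            out.append(ch)
    if i < len(bits):
        raise ValueError("cover text has too few letters")
    return "".join(out)


def reveal(stego: str, n_letters: int) -> str:
    """Read the case of each cover letter back as bits, 5 bits per secret letter."""
    bits = "".join("1" if ch.isupper() else "0" for ch in stego if ch.isalpha())
    return "".join(
        chr(int(bits[k:k + 5], 2) + ord("A")) for k in range(0, 5 * n_letters, 5)
    )


stego = hide("HI", "The quick brown fox")
print(stego)
print(reveal(stego, 2))
```

The recipient must know both the technique and the message length, which matches the point made above that only someone who knows the method can recover the hidden text.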

1.4 Organisation of thesis:

Chapter 2 will define steganography, provide a brief history, and explain various methods

of steganography. Chapter 3 will review several software applications that provide

steganographic services and mention the approaches taken. Chapter 4 will conclude with a brief

discussion of the implications of steganographic technology. Chapter 5 will list the resources

used in researching this topic and additional readings for those interested in more in-depth

understanding of steganography.

Chapter 2

Literature Survey

This chapter gives a brief introduction to cryptography and steganography, provides a brief

history, and explains various methods of steganography.

2.1 Cryptography:

Does increased security provide comfort to paranoid people? Or does security provide

some very basic protections that we are naive to believe that we don't need? During this time

when the Internet provides essential communication between tens of millions of people and is

being increasingly used as a tool for commerce, security becomes a tremendously important

issue to deal with.

There are many aspects to security and many applications, ranging from secure

commerce and payments to private communications and protecting passwords. One essential

aspect for secure communications is that of cryptography, which is the focus of this chapter. But

it is important to note that while cryptography is necessary for secure communications, it is not

by itself sufficient. The reader is advised, then, that the topics covered in this chapter only

describe the first of many steps necessary for better security in any number of situations.

This paper has two major purposes. The first is to define some of the terms and concepts

behind basic cryptographic methods, and to offer a way to compare the myriad cryptographic

schemes in use today. The second is to provide some real examples of cryptography in use today.

I would like to say at the outset that this paper is very focused on terms, concepts, and

schemes in current use and is not a treatise of the whole field. No mention is made here about

pre-computerized crypto schemes, the difference between a substitution and transposition cipher,

cryptanalysis, or other history. Interested readers should check out some of the books in the

bibliography below for this detailed — and interesting! — background information.

2.1.1 THE PURPOSE OF CRYPTOGRAPHY

Cryptography is the science of writing in secret code and is an ancient art; the first

documented use of cryptography in writing dates back to circa 1900 B.C. when an Egyptian

scribe used non-standard hieroglyphs in an inscription. Some experts argue that cryptography

appeared spontaneously sometime after writing was invented, with applications ranging from

diplomatic missives to war-time battle plans. It is no surprise, then, that new forms of

cryptography came soon after the widespread development of computer communications. In data

and telecommunications, cryptography is necessary when communicating over any untrusted

medium, which includes just about any network, particularly the Internet.

Within the context of any application-to-application communication, there are some

specific security requirements, including:

Authentication: The process of proving one's identity. (The primary forms of

host-to-host authentication on the Internet today are name-based or address-based,

both of which are notoriously weak.)

Privacy/confidentiality: Ensuring that no one can read the message except the

intended receiver.

Integrity: Assuring the receiver that the received message has not been altered in

any way from the original.

Non-repudiation: A mechanism to prove that the sender really sent this message.

Cryptography, then, not only protects data from theft or alteration, but can also be used

for user authentication. There are, in general, three types of cryptographic schemes typically used

to accomplish these goals: secret key (or symmetric) cryptography, public-key (or asymmetric)

cryptography, and hash functions, each of which is described below. In all cases, the initial

unencrypted data is referred to as plaintext. It is encrypted into ciphertext, which will in turn

(usually) be decrypted into usable plaintext.

In many of the descriptions below, two communicating parties will be referred to as Alice

and Bob; this is the common nomenclature in the crypto field and literature to make it easier to

identify the communicating parties. If there is a third or fourth party to the communication, they

will be referred to as Carol and Dave. Mallory is a malicious party, Eve is an eavesdropper, and

Trent is a trusted third party.

2.1.2 TYPES OF CRYPTOGRAPHIC ALGORITHMS

There are several ways of classifying cryptographic algorithms. For purposes of this

paper, they will be categorized based on the number of keys that are employed for encryption

and decryption, and further defined by their application and use. The three types of algorithms

that will be discussed are (Figure 1):

Secret Key Cryptography (SKC): Uses a single key for both encryption and

decryption

Public Key Cryptography (PKC): Uses one key for encryption and another for

decryption

Hash Functions: Uses a mathematical transformation to irreversibly "encrypt"

information
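
To make the third category concrete, the short sketch below uses Python's standard `hashlib` module. SHA-256 is chosen only as a familiar example of a hash function; the text above does not name a specific one, and the sample messages are made up.

```python
import hashlib

message = b"attack at dawn"

# A hash maps input of any length to a fixed-length digest.
digest = hashlib.sha256(message).hexdigest()

# Changing a single word yields a different digest, and no key exists
# that would turn a digest back into the original message.
other = hashlib.sha256(b"attack at dusk").hexdigest()

print(len(digest))      # number of hex characters in the digest
print(digest != other)
```

Unlike the two key-based categories, nothing here can be "decrypted"; the digest is only useful for comparison.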

2.1.2.1 Secret Key Cryptography

With secret key cryptography, a single key is used for both encryption and decryption. As

shown in Figure 1A, the sender uses the key (or some set of rules) to encrypt the plaintext and

sends the ciphertext to the receiver. The receiver applies the same key (or ruleset) to decrypt the

message and recover the plaintext. Because a single key is used for both functions, secret key

cryptography is also called symmetric encryption.

With this form of cryptography, it is obvious that the key must be known to both the

sender and the receiver; that, in fact, is the secret. The biggest difficulty with this approach, of

course, is the distribution of the key.

Secret key cryptography schemes are generally categorized as being either stream ciphers

or block ciphers. Stream ciphers operate on a single bit (byte or computer word) at a time and

implement some form of feedback mechanism so that the key is constantly changing. A block

cipher is so-called because the scheme encrypts one block of data at a time using the same key

on each block. In general, the same plaintext block will always encrypt to the same ciphertext

when using the same key in a block cipher whereas the same plaintext will encrypt to different

ciphertext in a stream cipher.

Stream ciphers come in several flavors but two are worth mentioning here. Self-

synchronizing stream ciphers calculate each bit in the keystream as a function of the previous n

bits in the keystream. It is termed "self-synchronizing" because the decryption process can stay

synchronized with the encryption process merely by knowing how far into the n-bit keystream it

is. One problem is error propagation; a garbled bit in transmission will result in n garbled bits at

the receiving side. Synchronous stream ciphers generate the keystream in a fashion independent

of the message stream but by using the same keystream generation function at sender and

receiver. While synchronous stream ciphers do not propagate transmission errors, they are, by their nature,

periodic so that the keystream will eventually repeat.
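
A minimal sketch of a synchronous stream cipher follows. The keystream generator used here (SHA-256 applied to the key and a counter) is an assumption made purely for illustration and is not a vetted design. It shows only that the keystream depends on the key and the position, never on the message, and that a damaged ciphertext bit does not spread.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks, independent of the message."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = b"shared secret"
plaintext = b"meet me at the usual place"
ciphertext = xor_stream(key, plaintext)
assert xor_stream(key, ciphertext) == plaintext

# Flip one bit in transit: only the corresponding plaintext byte is damaged.
damaged = bytearray(ciphertext)
damaged[3] ^= 0x01
garbled = xor_stream(key, bytes(damaged))
changed = [i for i, (a, b) in enumerate(zip(plaintext, garbled)) if a != b]
print(changed)
```

Sender and receiver run the same `keystream` function with the same key, which is the "same keystream generation function at sender and receiver" described above.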

Block ciphers can operate in one of several modes; the following four are the most important:

Electronic Codebook (ECB) mode is the simplest, most obvious application: the

secret key is used to encrypt the plaintext block to form a ciphertext block. Two

identical plaintext blocks, then, will always generate the same ciphertext block.

Although this is the most common mode of block ciphers, it is susceptible to a

variety of brute-force attacks.

Cipher Block Chaining (CBC) mode adds a feedback mechanism to the

encryption scheme. In CBC, the plaintext is exclusively-ORed (XORed) with the

previous ciphertext block prior to encryption. In this mode, two identical blocks

of plaintext will, in practice, not encrypt to the same ciphertext.

Cipher Feedback (CFB) mode is a block cipher implementation as a self-

synchronizing stream cipher. CFB mode allows data to be encrypted in units

smaller than the block size, which might be useful in some applications such as

encrypting interactive terminal input. If we were using 1-byte CFB mode, for

example, each incoming character is placed into a shift register the same size as

the block, encrypted, and the block transmitted. At the receiving side, the

ciphertext is decrypted and the extra bits in the block (i.e., everything above and

beyond the one byte) are discarded.

Output Feedback (OFB) mode is a block cipher implementation conceptually

similar to a synchronous stream cipher. OFB prevents the same plaintext block

from generating the same ciphertext block by using an internal feedback

mechanism that is independent of both the plaintext and ciphertext bitstreams.

A nice overview of these different modes can be found at progressive-coding.com.
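
The difference between ECB and CBC can be seen in a few lines. The sketch below assumes a toy 8-byte block cipher built from a 4-round Feistel network with SHA-256 as the round function. It is a stand-in for a real cipher such as DES and must not be used for actual security. The key, the fixed initialization vector `iv`, and all function names are hypothetical, and the plaintext length is assumed to be a multiple of the block size (no padding).

```python
import hashlib

BLOCK = 8  # toy block size in bytes


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """4-round Feistel network; the round function is a truncated SHA-256."""
    left, right = block[:4], block[4:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:4]
        left, right = right, xor_bytes(left, f)
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> list:
    """Each block is encrypted independently with the same key."""
    return [toy_encrypt_block(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> list:
    """Each block is XORed with the previous ciphertext block before encryption."""
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor_bytes(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return out


key = b"demo key"
iv = b"\x01" * BLOCK                 # fixed here for repeatability only
plaintext = b"SAMEBLOK" * 2          # two identical plaintext blocks

ecb = ecb_encrypt(key, plaintext)
cbc = cbc_encrypt(key, iv, plaintext)
print(ecb[0] == ecb[1])  # identical blocks leak through ECB
print(cbc[0] == cbc[1])  # chaining hides the repetition
```

In ECB the repeated plaintext block is visible as a repeated ciphertext block; in CBC the feedback from the previous ciphertext block masks it.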

Secret key cryptography algorithms that are in use today include:

Data Encryption Standard (DES): The most common SKC scheme used today, DES was designed by IBM in the 1970s and adopted by the National Bureau of Standards (NBS) [now the National Institute of Standards and Technology (NIST)] in 1977 for commercial and unclassified government applications. DES is a block cipher employing a 56-bit key that operates on 64-bit blocks. DES has a complex set of rules and transformations that were designed specifically to yield fast hardware implementations and slow software implementations, although this latter point is becoming less significant today since computer processors are several orders of magnitude faster than they were twenty years ago. IBM also proposed a 112-bit key for DES, which was rejected at the time by the government; the use of 112-bit keys was considered in the 1990s; however, conversion was never seriously considered.

DES is defined in American National Standard X3.92 and three Federal Information Processing Standards (FIPS):

o FIPS 46-3: DES

o FIPS 74: Guidelines for Implementing and Using the NBS Data Encryption Standard

o FIPS 81: DES Modes of Operation

Information about vulnerabilities of DES can be obtained from the Electronic Frontier Foundation.

Two important variants that strengthen DES are:

o Triple-DES (3DES): A variant of DES that employs up to three 56-bit keys and makes three encryption/decryption passes over the block; 3DES is also described in FIPS 46-3 and is the recommended replacement to DES.

o DESX: A variant devised by Ron Rivest. By combining 64 additional key bits with the plaintext prior to encryption, DESX effectively increases the key length to 120 bits.

More detail about DES, 3DES, and DESX can be found below in Section 5.4.

Advanced Encryption Standard (AES): In 1997, NIST initiated a very public, 4-

1/2 year process to develop a new secure cryptosystem for U.S. government

applications. The result, the Advanced Encryption Standard, became the official

successor to DES in December 2001. AES uses an SKC scheme called Rijndael, a

block cipher designed by Belgian cryptographers Joan Daemen and Vincent

Rijmen. The algorithm can use a variable block length and key length; the latest

specification allowed any combination of key lengths of 128, 192, or 256 bits

and blocks of length 128, 192, or 256 bits. NIST initially selected Rijndael in

October 2000 and formal adoption as the AES standard came in December 2001.

FIPS PUB 197 describes a 128-bit block cipher employing a 128-, 192-, or 256-

bit key. The AES process and Rijndael algorithm are described in more detail

below in Section 5.9.

CAST-128/256: CAST-128, described in Request for Comments (RFC) 2144, is a

DES-like substitution-permutation crypto algorithm, employing a 128-bit key

operating on a 64-bit block. CAST-256 (RFC 2612) is an extension of CAST-128,

using a 128-bit block size and a variable length (128, 160, 192, 224, or 256 bit)

key. CAST is named for its developers, Carlisle Adams and Stafford Tavares and

is available internationally. CAST-256 was one of the Round 1 algorithms in the

AES process.

International Data Encryption Algorithm (IDEA): Secret-key cryptosystem

written by Xuejia Lai and James Massey, in 1992 and patented by Ascom; a 64-

bit SKC block cipher using a 128-bit key. Also available internationally.

Rivest Ciphers (aka Ron's Code): Named for Ron Rivest, a series of SKC

algorithms.

o RC1: Designed on paper but never implemented.

o RC2: A 64-bit block cipher using variable-sized keys designed to replace

DES. Its code has not been made public, although many companies have

licensed RC2 for use in their products. Described in RFC 2268.

o RC3: Found to be breakable during development.

o RC4: A stream cipher using variable-sized keys; it is widely used in

commercial cryptography products, although it can only be exported using

keys that are 40 bits or less in length.

o RC5: A block-cipher supporting a variety of block sizes, key sizes, and

number of encryption passes over the data. Described in RFC 2040.


o RC6: An improvement over RC5, RC6 was one of the AES Round 2

algorithms.

Blowfish: A symmetric 64-bit block cipher invented by Bruce Schneier; optimized

for 32-bit processors with large data caches, it is significantly faster than DES on

a Pentium/PowerPC-class machine. Key lengths can vary from 32 to 448 bits in

length. Blowfish, available freely and intended as a substitute for DES or IDEA,

is in use in over 80 products.

Twofish: A 128-bit block cipher using 128-, 192-, or 256-bit keys. Designed to be

highly secure and highly flexible, well-suited for large microprocessors, 8-bit

smart card microprocessors, and dedicated hardware. It was designed by a team led by

Bruce Schneier and was one of the Round 2 algorithms in the AES process.

Camellia: A secret-key, block-cipher crypto algorithm developed jointly by

Nippon Telegraph and Telephone (NTT) Corp. and Mitsubishi Electric

Corporation (MEC) in 2000. Camellia has some characteristics in common with

AES: a 128-bit block size, support for 128-, 192-, and 256-bit key lengths, and

suitability for both software and hardware implementations on common 32-bit

processors as well as 8-bit processors (e.g., smart cards, cryptographic hardware,

and embedded systems). Also described in RFC 3713. Camellia's application in

IPsec is described in RFC 4312 and application in OpenPGP in RFC 5581.

MISTY1: Developed at Mitsubishi Electric Corp., a block cipher using a 128-bit

key and 64-bit blocks, and a variable number of rounds. Designed for hardware

and software implementations, and is resistant to differential and linear

cryptanalysis. Described in RFC 2994.

Secure and Fast Encryption Routine (SAFER): Secret-key crypto scheme

designed for implementation in software. Versions have been defined for 40-, 64-,

and 128-bit keys.


KASUMI: A block cipher using a 128-bit key that is part of the Third-Generation

Partnership Project (3GPP), formerly known as the Universal Mobile

Telecommunications System (UMTS). KASUMI is the intended confidentiality

and integrity algorithm for both message content and signaling data for emerging

mobile communications systems.

SEED: A block cipher using 128-bit blocks and 128-bit keys. Developed by the

Korea Information Security Agency (KISA) and adopted as a national standard

encryption algorithm in South Korea. Also described in RFC 4269.

Skipjack: SKC scheme proposed for Capstone. Although the details of the

algorithm were originally classified (they were declassified in 1998), Skipjack was a block cipher using an 80-bit

key and 32 iteration cycles per 64-bit block.

2.1.2.2. Public-Key Cryptography

Public-key cryptography has been said to be the most significant new development in

cryptography in the last 300-400 years. Modern PKC was first described publicly by Stanford

University professor Martin Hellman and graduate student Whitfield Diffie in 1976. Their paper

described a two-key crypto system in which two parties could engage in a secure communication

over a non-secure communications channel without having to share a secret key.

PKC depends upon the existence of so-called one-way functions, or mathematical

functions that are easy to compute whereas their inverse function is relatively difficult to

compute. Let me give you two simple examples:

1. Multiplication vs. factorization: Suppose I tell you that I have two numbers, 9 and

16, and that I want to calculate the product; it should take almost no time to

calculate the product, 144. Suppose instead that I tell you that I have a number,


144, and I need you to tell me which pair of integers I multiplied together to obtain

that number. You will eventually come up with the solution but whereas

calculating the product took milliseconds, factoring will take longer because you

first need to find the 8 pairs of integer factors and then determine which one is the

correct pair.

2. Exponentiation vs. logarithms: Suppose I tell you that I want to take the number 3

to the 6th power; again, it is easy to calculate 3^6 = 729. But if I tell you that I have

the number 729 and want you to tell me the two integers that I used, x and y so

that log_x 729 = y, it will take you longer to find all possible solutions and select

the pair that I used.
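
The two examples can be checked by brute force. The sketch below enumerates the candidate answers for each inverse problem; the function names `factor_pairs` and `power_pairs` are illustrative. With numbers this small the search is instant, and the point is only that the forward direction is a single operation while the inverse requires a search.

```python
import math


def factor_pairs(n: int) -> list:
    """All pairs (a, b) of positive integers with a <= b and a * b == n."""
    return [(a, n // a) for a in range(1, math.isqrt(n) + 1) if n % a == 0]


def power_pairs(n: int) -> list:
    """All integer pairs (x, y) with x >= 2, y >= 1 and x ** y == n."""
    pairs = []
    for x in range(2, n + 1):
        power, y = 1, 0
        while power < n:
            power *= x
            y += 1
        if power == n:
            pairs.append((x, y))
    return pairs


print(9 * 16)              # forward direction: one multiplication
print(factor_pairs(144))   # inverse direction: 8 candidate pairs, including (9, 16)

print(3 ** 6)              # forward direction: one exponentiation
print(power_pairs(729))    # inverse direction: every (x, y) with x ** y == 729
```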

While the examples above are trivial, they do represent two of the functional pairs that

are used with PKC; namely, the ease of multiplication and exponentiation versus the relative

difficulty of factoring and calculating logarithms, respectively. The mathematical "trick" in PKC

is to find a trap door in the one-way function so that the inverse calculation becomes easy given

knowledge of some item of information.

Generic PKC employs two keys that are mathematically related although knowledge of

one key does not allow someone to easily determine the other key. One key is used to encrypt the

plaintext and the other key is used to decrypt the ciphertext. The important point here is that it

does not matter which key is applied first, but that both keys are required for the process to

work (Figure 1B). Because a pair of keys is required, this approach is also called asymmetric

cryptography.

In PKC, one of the keys is designated the public key and may be advertised as widely as

the owner wants. The other key is designated the private key and is never revealed to another

party. It is straightforward to send messages under this scheme. Suppose Alice wants to send

Bob a message. Alice encrypts some information using Bob's public key; Bob decrypts the

ciphertext using his private key. This method could be also used to prove who sent a message;

Alice, for example, could encrypt some plaintext with her private key; when Bob decrypts using



Alice's public key, he knows that Alice sent the message and Alice cannot deny having sent the

message (non-repudiation).
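
The two uses of a key pair described above can be sketched with textbook RSA and deliberately tiny primes. This is an illustration under stated assumptions, not a usable implementation: the primes 61 and 53, the exponent 17, and the integer message 65 are arbitrary demonstration values, no padding is applied, and a single key pair is reused for both directions to keep the sketch short (in the text, encryption uses Bob's pair and proof of origin uses Alice's).

```python
# Key pair built from two tiny primes; real keys use primes of 100 or more digits.
p, q = 61, 53
n = p * q                  # modulus, part of both the public and the private key
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

message = 65               # a message encoded as an integer smaller than n

# Confidentiality: encrypt with the public key (n, e); only the holder of d can decrypt.
ciphertext = pow(message, e, n)
decrypted = pow(ciphertext, d, n)

# Proof of origin: apply the private key; anyone can check with the public key.
signature = pow(message, d, n)
verified = pow(signature, e, n) == message

print(decrypted == message, verified)
```

Either key undoes the other, which is the sense in which it does not matter which key is applied first.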

Public-key cryptography algorithms that are in use today for key exchange or digital signatures

include:

RSA: The first, and still most common, PKC implementation, named for the three

MIT mathematicians who developed it — Ronald Rivest, Adi Shamir, and

Leonard Adleman. RSA today is used in hundreds of software products and can

be used for key exchange, digital signatures, or encryption of small blocks of data.

RSA uses a variable size encryption block and a variable size key. The key-pair is

derived from a very large number, n, that is the product of two prime numbers

chosen according to special rules; these primes may be 100 or more digits in

length each, yielding an n with roughly twice as many digits as the prime factors.

The public key information includes n and a derivative of one of the factors of n;

an attacker cannot determine the prime factors of n (and, therefore, the private

key) from this information alone and that is what makes the RSA algorithm so

secure. (Some descriptions of PKC erroneously state that RSA's safety is due to

the difficulty in factoring large prime numbers. In fact, large prime numbers, like

small prime numbers, only have two factors!) The ability for computers to factor

large numbers, and therefore attack schemes such as RSA, is rapidly improving

and systems today can find the prime factors of numbers with more than 200

digits. Nevertheless, if a large number is created from two prime factors that are

roughly the same size, there is no known factorization algorithm that will solve

the problem in a reasonable amount of time; a 2005 test to factor a 200-digit

number took 1.5 years and over 50 years of compute time (see the Wikipedia

article on integer factorization.) Regardless, one presumed protection of RSA is

that users can easily increase the key size to always stay ahead of the computer

processing curve. As an aside, the patent for RSA expired in September 2000


which does not appear to have affected RSA's popularity one way or the other. A

detailed example of RSA is presented below in Section 5.3.

Diffie-Hellman : Diffie and Hellman published their algorithm in 1976, before the

RSA paper appeared. D-H is used for secret-key key exchange only,

and not for authentication or digital signatures. More detail about Diffie-Hellman

can be found below in Section 5.2.

Digital Signature Algorithm (DSA) : The algorithm specified in NIST's Digital

Signature Standard (DSS), provides digital signature capability for the

authentication of messages.

ElGamal : Designed by Taher Elgamal, a PKC system similar to Diffie-Hellman

and used for key exchange.

Elliptic Curve Cryptography (ECC): A PKC algorithm based upon elliptic curves.

ECC can offer levels of security comparable to RSA and other PKC methods

while using smaller keys. It was designed for devices with limited compute power and/or

memory, such as smartcards and PDAs. More detail about ECC can be found

below in Section 5.8. Other references include "The Importance of ECC" Web

page and the "Online Elliptic Curve Cryptography Tutorial", both from Certicom.

Public-Key Cryptography Standards (PKCS) : A set of interoperable standards

and guidelines for public-key cryptography, designed by RSA Data Security Inc.

o PKCS #1 : RSA Cryptography Standard (Also RFC 3447)

o PKCS #2: Incorporated into PKCS #1.

o PKCS #3 : Diffie-Hellman Key-Agreement Standard

o PKCS #4: Incorporated into PKCS #1.






o PKCS #5 : Password-Based Cryptography Standard (PKCS #5 V2.0 is also

RFC 2898)

o PKCS #6 : Extended-Certificate Syntax Standard (being phased out in

favor of X.509v3)

o PKCS #7 : Cryptographic Message Syntax Standard (Also RFC 2315)

o PKCS #8 : Private-Key Information Syntax Standard (Also RFC 5208)

o PKCS #9 : Selected Attribute Types (Also RFC 2985)

o PKCS #10 : Certification Request Syntax Standard (Also RFC 2986)

o PKCS #11 : Cryptographic Token Interface Standard

o PKCS #12 : Personal Information Exchange Syntax Standard

o PKCS #13 : Elliptic Curve Cryptography Standard

o PKCS #14: Pseudorandom Number Generation Standard is no longer

available

o PKCS #15 : Cryptographic Token Information Format Standard

Cramer-Shoup : A public-key cryptosystem proposed by R. Cramer and V. Shoup

of IBM in 1998.

Key Exchange Algorithm (KEA) : A variation on Diffie-Hellman; proposed as the

key exchange method for Capstone.

LUC : A public-key cryptosystem designed by P.J. Smith and based on Lucas

sequences. Can be used for encryption and signatures, using integer factoring.

A digression: Who invented PKC? I tried to be careful in the first paragraph of this

section to state that Diffie and Hellman "first described publicly" a PKC scheme. Although I


have categorized PKC as a two-key system, that has been merely for convenience; the real

criterion for a PKC scheme is that it allows two parties to exchange a secret even though the

communication with the shared secret might be overheard. There seems to be no question that

Diffie and Hellman were first to publish; their method is described in the classic paper, "New

Directions in Cryptography," published in the November 1976 issue of IEEE Transactions on

Information Theory. As shown below, Diffie-Hellman uses the idea that finding logarithms is

relatively harder than exponentiation. And, indeed, it is the precursor to modern PKC which does

employ two keys. Rivest, Shamir, and Adleman described an implementation that extended this

idea in their paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems,"

published in the February 1978 issue of the Communications of the ACM (CACM). Their

method, of course, is based upon the relative ease of finding the product of two large prime

numbers compared to finding the prime factors of a large number.
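The Diffie-Hellman idea, that exponentiation is easy while finding the logarithm is hard, can be sketched as follows. The modulus, generator, and secret values are toy assumptions picked for this example, and discrete_log is a hypothetical helper showing the brute-force work left to an eavesdropper.

```python
# Public parameters (toy values; real systems use very large primes).
p = 23   # prime modulus
g = 5    # generator

alice_secret = 6
bob_secret = 15

# Each party publishes g raised to its secret, modulo p.
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# Each party combines its own secret with the other's public value.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

print(alice_public, bob_public, alice_shared, bob_shared)  # 8 19 2 2


def discrete_log(target, base, modulus):
    """Brute-force search for an exponent; only feasible for tiny moduli."""
    for exponent in range(modulus):
        if pow(base, exponent, modulus) == target:
            return exponent
    return None


# An eavesdropper who sees only the public values must search for a secret.
print(discrete_log(alice_public, g, p))  # 6
```

With a 23-element group the search is instant; the scheme depends on the modulus being large enough that no such search is practical.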

Some sources, though, credit Ralph Merkle with first describing a system that allows two

parties to share a secret although it was not a two-key system, per se. A Merkle Puzzle works

where Alice creates a large number of encrypted keys, sends them all to Bob so that Bob chooses

one at random and then lets Alice know which he has selected. An eavesdropper will see all of

the keys but can't learn which key Bob has selected (because he has encrypted the response with

the chosen key). In this case, Eve's effort to break in is the square of the effort of Bob to choose a

key. While this difference may be small it is often sufficient. Merkle apparently took a computer

science course at UC Berkeley in 1974 and described his method, but had difficulty making

people understand it; frustrated, he dropped the course. Meanwhile, he submitted the paper

"Secure Communication Over Insecure Channels" which was published in the CACM in April

1978; Rivest et al.'s paper even makes reference to it. Merkle's method certainly wasn't published

first, but did he have the idea first?

An interesting question, maybe, but who really knows? For some time, it was a quiet

secret that a team at the UK's Government Communications Headquarters (GCHQ) had first

developed PKC in the early 1970s. Because of the nature of the work, GCHQ kept the original

memos classified. In 1997, however, the GCHQ changed their posture when they realized that



there was nothing to gain by continued silence. Documents show that a GCHQ mathematician

named James Ellis started research into the key distribution problem in 1969 and that by 1975,

Ellis, Clifford Cocks, and Malcolm Williamson had worked out all of the fundamental details of

PKC, yet couldn't talk about their work. (They were, of course, barred from challenging the RSA

patent!) After more than 20 years, Ellis, Cocks, and Williamson have begun to get their due

credit.

2.1.2.4. Hash Functions

Hash functions, also called message digests and one-way encryption, are algorithms that,

in some sense, use no key (Figure 1C). Instead, a fixed-length hash value is computed based

upon the plaintext that makes it impossible for either the contents or length of the plaintext to be

recovered. Hash algorithms are typically used to provide a digital fingerprint of a file's contents,

often used to ensure that the file has not been altered by an intruder or virus. Hash functions are

also commonly employed by many operating systems to encrypt passwords. Hash functions,

then, provide a measure of the integrity of a file.
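As a minimal sketch of the "digital fingerprint" idea, the following uses Python's standard hashlib module with SHA-256. The helper name fingerprint and the sample strings are assumptions for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"This is a test. This is only a test."
altered = b"This is a test. This is only a test!"

# A one-character change produces a completely different fingerprint.
print(fingerprint(original))
print(fingerprint(altered))

# The digest length is fixed regardless of the input length.
print(len(fingerprint(b"")), len(fingerprint(original * 1000)))  # 64 64
```

Comparing a stored fingerprint with a freshly computed one is how a file can be checked for alteration without keeping a second copy of the file.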

Hash algorithms that are in common use today include:

Message Digest (MD) algorithms: A series of byte-oriented algorithms that

produce a 128-bit hash value from an arbitrary-length message.

o MD2 (RFC 1319): Designed for systems with limited memory, such as

smart cards.

o MD4 (RFC 1320): Developed by Rivest, similar to MD2 but designed

specifically for fast processing in software.

o MD5 (RFC 1321): Also developed by Rivest after potential weaknesses

were reported in MD4; this scheme is similar to MD4 but is slower

because more manipulation is made to the original data. MD5 has been

implemented in a large number of products although several weaknesses






in the algorithm were demonstrated by German cryptographer Hans

Dobbertin in 1996.

Secure Hash Algorithm (SHA): Algorithm for NIST's Secure Hash Standard

(SHS). SHA-1 produces a 160-bit hash value and was originally published as

FIPS 180-1 and RFC 3174. FIPS 180-2 describes five algorithms in the SHS:

SHA-1 plus SHA-224, SHA-256, SHA-384, and SHA-512 which can produce

hash values that are 224, 256, 384, or 512 bits in length, respectively. SHA-224,

-256, -384, and -512 are also described in RFC 4634.

RIPEMD : A series of message digests that initially came from the RIPE (RACE

Integrity Primitives Evaluation) project. RIPEMD-160 was designed by Hans

Dobbertin, Antoon Bosselaers, and Bart Preneel, and optimized for 32-bit

processors to replace the then-current 128-bit hash functions. Other versions

include RIPEMD-256, RIPEMD-320, and RIPEMD-128.

HAVAL (HAsh of VAriable Length) : Designed by Y. Zheng, J. Pieprzyk and J.

Seberry, a hash algorithm with many levels of security. HAVAL can create hash

values that are 128, 160, 192, 224, or 256 bits in length.

Whirlpool : A relatively new hash function, designed by V. Rijmen and P.S.L.M.

Barreto. Whirlpool operates on messages less than 2^256 bits in length, and

produces a message digest of 512 bits. The design of this hash function is very

different from that of MD5 and SHA-1, so the attacks on those hashes

(see below) do not carry over to it directly.

Tiger : Designed by Ross Anderson and Eli Biham, Tiger is designed to be secure,

run efficiently on 64-bit processors, and easily replace MD4, MD5, SHA and

SHA-1 in other applications. Tiger/192 produces a 192-bit output and is

compatible with 64-bit architectures; Tiger/128 and Tiger/160 produce the first

128 and 160 bits, respectively, to provide compatibility with the other hash

functions mentioned above.


Hash functions are sometimes misunderstood and some sources claim that no two files

can have the same hash value. This is, in fact, not correct. Consider a hash function that provides

a 128-bit hash value. There are, obviously, 2^128 possible hash values. But there are a lot more

than 2^128 possible files. Therefore, there have to be multiple files (in fact, there have to be an

infinite number of files!) that can have the same 128-bit hash value.
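The counting argument can be made concrete by shrinking the hash. The sketch below keeps only the first 16 bits of SHA-256, so there are just 65,536 possible values and a collision must appear within at most 65,537 distinct inputs. The names tiny_hash and find_collision are assumptions for this example; a full 128-bit or larger hash has collisions for the same reason, they are simply far harder to find.

```python
import hashlib


def tiny_hash(data: bytes) -> bytes:
    """Keep only the first 2 bytes (16 bits) of SHA-256: a deliberately weak toy hash."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """Hash the strings "0", "1", "2", ... until two of them share a toy hash value."""
    seen = {}
    counter = 0
    while True:
        data = str(counter).encode()
        digest = tiny_hash(data)
        if digest in seen:
            return seen[digest], data
        seen[digest] = data
        counter += 1


first, second = find_collision()
print(first, second, tiny_hash(first).hex())
```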

The difficulty is finding two files with the same hash! What is, indeed, very hard to do is

to try to create a file that has a given hash value so as to force a hash value collision — which is

the reason that hash functions are used extensively for information security and computer

forensics applications. Alas, researchers in 2004 found that practical collision attacks could be

launched on MD5, SHA-1, and other hash algorithms. Readers interested in this problem should

read the following:

Burr, W. (2006, March/April). Cryptographic hash standards: Where do we go

from here? IEEE Security & Privacy, 4(2), 88-91.

Gutmann, P., Naccache, D., & Palmer, C.C. (2005, May/June). When hashes

collide. IEEE Security & Privacy, 3(3), 68-71.

Klima, V. (March 2005) "Finding MD5 Collisions - a Toy For a Notebook."

Thompson, E. (2005, February). MD5 collisions and the impact on computer

forensics. Digital Investigation, 2(1), 36-40.

Wang, X., Feng, D., Lai, X., & Yu, H. (August 2004). "Collisions for Hash

Functions MD4, MD5, HAVAL-128 and RIPEMD."

Wang, X., Yin, Y.L., & Yu, H. (February 2005). "Collision Search Attacks on

SHA1."

Readers are also referred to the Eindhoven University of Technology HashClash Project

Web site. An excellent overview of the situation with hash collisions (circa 2005) can be found


in RFC 4270 (by P. Hoffman and B. Schneier, November 2005). And for additional information

on hash functions, see David Hopwood's MessageDigest Algorithms page.

At this time, there is no obvious successor to MD5 and SHA-1 that could be put into use

quickly; there are so many products using these hash functions that it could take many years to

flush out all use of 128- and 160-bit hashes. That said, NIST announced in 2007 their

Cryptographic Hash Algorithm Competition to find the next-generation secure hashing method.

Dubbed SHA-3, this new scheme will augment FIPS 180-2. A list of submissions can be found at

The SHA-3 Zoo. The SHA-3 standard may not be available until 2011 or 2012.

Certain extensions of hash functions are used for a variety of information security and

digital forensics applications, such as:

Hash libraries are sets of hash values corresponding to known files. A hash

library of known good files, for example, might be a set of files known to be a

part of an operating system, while a hash library of known bad files might be of a

set of known child pornographic images.

Rolling hashes refer to a set of hash values that are computed based upon a fixed-

length "sliding window" through the input. As an example, a hash value might be

computed on bytes 1-10 of a file, then on bytes 2-11, 3-12, 4-13, etc.

Fuzzy hashes are an area of intense research; they are hash values designed so that

two similar inputs produce similar hashes. Fuzzy hashes are used to detect documents,

images, or other files that are close to each other with respect to content. See

"Fuzzy Hashing" by Jesse Kornblum for a good treatment of this

topic.
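The rolling hash item in the list above can be sketched with a simple polynomial hash that is updated in constant time as the window slides by one byte. The base, modulus, and function names here are assumptions for illustration and are not taken from any particular forensic tool.

```python
BASE = 257
MOD = (1 << 61) - 1


def direct_hash(window: bytes) -> int:
    """Polynomial hash of one window, computed from scratch."""
    h = 0
    for byte in window:
        h = (h * BASE + byte) % MOD
    return h


def rolling_hashes(data: bytes, window: int) -> list:
    """Hash every fixed-length window of data, reusing the previous hash each step."""
    if window <= 0 or len(data) < window:
        return []
    high = pow(BASE, window - 1, MOD)   # weight of the byte that leaves the window
    h = direct_hash(data[:window])
    hashes = [h]
    for i in range(window, len(data)):
        # Drop the oldest byte, shift the rest, and append the newest byte.
        h = ((h - data[i - window] * high) * BASE + data[i]) % MOD
        hashes.append(h)
    return hashes


data = b"steganography"
print(len(rolling_hashes(data, 10)))  # 4
```

With a 10-byte window over a 13-byte input there are four windows: bytes 1-10, 2-11, 3-12, and 4-13, matching the example in the text.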


2.2 Steganography

The word steganography literally means covered writing as derived from Greek. It

includes a vast array of methods of secret communications that conceal the very existence of the

message. Among these methods are invisible inks, microdots, character arrangement (other than

the cryptographic methods of permutation and substitution), digital signatures, covert channels

and spread-spectrum communications.

Steganography is the art of concealing the existence of information within seemingly

innocuous carriers. Steganography can be viewed as akin to cryptography. Both have been used

throughout recorded history as means to protect information. At times these two technologies

seem to converge while the objectives of the two differ. Cryptographic techniques "scramble"

messages so if intercepted, the messages cannot be understood. Steganography, in essence,

"camouflages" a message to hide its existence and make it seem "invisible" thus concealing the

fact that a message is being sent altogether. An encrypted message may draw suspicion while an

invisible message will not.

Over the past couple of years, steganography has been the source of a lot of discussion,

particularly as it was suspected that terrorists connected with the September 11 attacks might

have used it for covert communications. While no such connection has been proven, the concern

points out the effectiveness of steganography as a means of obscuring data. Indeed, along with

encryption, steganography is one of the fundamental ways by which data can be kept



confidential. This article will offer a brief introductory discussion of steganography: what it is,

how it can be used, and the true implications it can have on information security.

David Kahn places steganography and cryptography in a table that sets each type of

security against the counter methods used to defeat it. Here security is defined as methods of "protecting" information

where intelligence is defined as methods of "retrieving" information.

Signal Security | Signal Intelligence

Communication Security | Communication Intelligence

    Steganography (invisible inks, open codes, messages in hollow heels) and Transmission Security (spurt radio and spread spectrum systems) | Interception and direction-finding

    Cryptography (codes and ciphers) | Cryptanalysis

    Traffic Security (call-sign changes, dummy messages, radio silence) | Traffic Analysis (direction-finding, message-flow studies, radio fingerprinting)

Electronic Security | Electronic Intelligence

    Emission Security (shifting of radar frequencies, spread spectrum) | Electronic Reconnaissance (eavesdropping on radar emissions)

    Counter-Countermeasures "looking through" (jammed radar) | Countermeasures (jamming radar and false radar echoes)

Table 1: Kahn's Security Table



Steganography has its place in security. It is not intended to replace cryptography but

supplement it. Hiding a message with steganography methods reduces the chance of a message

being detected. However, if that message is also encrypted, if discovered, it must also be cracked

(yet another layer of protection).

While we are discussing it in terms of computer security, steganography is really nothing

new, as it has been around since the times of ancient Rome. For example, in ancient Rome and

Greece, text was traditionally written on wax that was poured on top of stone tablets. If the

sender of the information wanted to obscure the message - for purposes of military intelligence,

for instance - they would use steganography: the wax would be scraped off and the message

would be inscribed or written directly on the tablet, wax would then be poured on top of the

message, thereby obscuring not just its meaning but its very existence[1].

According to Dictionary.com, steganography (also known as "steg" or "stego") is "the art

of writing in cipher, or in characters, which are not intelligible except to persons who have the

key; cryptography" [2]. In computer terms, steganography has evolved into the practice of hiding

a message within a larger one in such a way that others cannot discern the presence or contents

of the hidden message[3]. In contemporary terms, steganography has evolved into a digital

strategy of hiding a file in some form of multimedia, such as an image, an audio file (like a .wav

or mp3) or even a video file.
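One of the simplest ways to hide data in a media file is to overwrite the least significant bit (LSB) of each carrier byte. The sketch below is a generic illustration of that idea on a raw byte sequence standing in for uncompressed pixel values; it is not the JPEG-domain method of this project or of any tool named in this chapter, and the function names embed_lsb and extract_lsb are assumptions.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide message bits in the least significant bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits of the carrier."""
    message = bytearray()
    for i in range(length):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (stego[i * 8 + bit_index] & 1)
        message.append(byte)
    return bytes(message)


cover = bytes(range(256))                 # stand-in for 256 pixel values
stego = embed_lsb(cover, b"test.")
print(extract_lsb(stego, 5))              # b'test.'
print(max(abs(a - b) for a, b in zip(cover, stego)))  # 1
```

Each carrier byte changes by at most one, which is why the alteration is hard to see, and also why each hidden byte needs eight carrier bytes.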

2.2.1 History and Steganography

Throughout history, a multitude of methods and variations have been used to hide

information. David Kahn's The Codebreakers provides an excellent accounting of this history

[Kahn67]. Bruce Norman recounts numerous tales of cryptography and steganography during

times of war in Secret Warfare: The Battle of Codes and Ciphers.

One of the first documents describing steganography is from the Histories of Herodotus. In

ancient Greece, text was written on wax covered tablets. In one story Demaratus wanted to notify

Sparta that Xerxes intended to invade Greece. To avoid capture, he scraped the wax off of the


tablets and wrote a message on the underlying wood. He then covered the tablets with wax again.

The tablets appeared to be blank and unused so they passed inspection by sentries without

question.

Another ingenious method was to shave the head of a messenger and tattoo a message or

image on the messenger's head. After allowing his hair to grow, the message would be undetected

until the head was shaved again.

Another common form of invisible writing is through the use of invisible inks. Such inks

were used with much success as recently as WWII. An innocent letter may contain a very

different message written between the lines [Zim48]. Early in WWII steganographic technology

consisted almost exclusively of invisible inks [Kahn67]. Common sources for invisible inks are

milk, vinegar, fruit juices and urine. All of these darken when heated.

As technology improved and these invisible inks became easier to detect,

more sophisticated inks were developed which react to various chemicals. Some messages

had to be "developed" much as photographs are developed with a number of chemicals in

processing labs.

Null ciphers (unencrypted messages) were also used. The real message is "camouflaged"

in an innocent sounding message. Due to the "sound" of many open coded messages, the suspect

communications were detected by mail filters. However "innocent" messages were allowed to

flow through. An example of a message containing such a null cipher from [JDJ01] is:

Fishing freshwater bends and saltwater

coasts rewards anyone feeling stressed.

Resourceful anglers usually find masterful

leapers fun and admit swordfish rank

overwhelming anyday.

By taking the third letter in each word, the following message emerges [Zevon]:



Send Lawyers, Guns, and Money.

The following message was actually sent by a German Spy in WWII [Kahn67]:

Apparently neutral's protest is thoroughly discounted

and ignored. Isman hard hit. Blockade issue affects

pretext for embargo on by-products, ejecting suets and

vegetable oils.

Taking the second letter in each word the following message emerges:

Pershing sails from NY June 1.
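Decoding a null cipher of this kind is mechanical once the letter position is known. The following sketch takes the n-th letter of every word; the function name nth_letters is an assumption for this example, and punctuation inside a word is ignored when counting letters.

```python
def nth_letters(text: str, n: int) -> str:
    """Return the n-th letter (1-based) of every word that has at least n letters."""
    letters = []
    for word in text.split():
        cleaned = [ch for ch in word if ch.isalpha()]
        if len(cleaned) >= n:
            letters.append(cleaned[n - 1].lower())
    return "".join(letters)


cover = (
    "Fishing freshwater bends and saltwater coasts rewards anyone feeling stressed. "
    "Resourceful anglers usually find masterful leapers fun and admit swordfish rank "
    "overwhelming anyday."
)
print(nth_letters(cover, 3))  # sendlawyersgunsandmoney
```

The recovered string has no spaces or punctuation, so the recipient still has to split it into words by eye.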

As message detection improved, new technologies were developed which could pass

more information and be even less conspicuous. The Germans developed microdot technology

which FBI Director J. Edgar Hoover referred to as "the enemy's masterpiece of espionage."

Microdots are photographs the size of a printed period having the clarity of standard-sized

typewritten pages. The first microdots were discovered masquerading as a period on a typed

envelope carried by a German agent in 1941. The message was not hidden, nor encrypted. It was

just so small as to not draw attention to itself (for a while). Besides being so small, microdots

permitted the transmission of large amounts of data including drawings and photographs

[Kahn67].

With many methods being discovered and intercepted, the Office of Censorship took

extreme actions such as banning flower deliveries which contained delivery dates, crossword

puzzles and even report cards as they can all contain secret messages. Censors even went as far

as rewording letters and replacing stamps on envelopes.

With every discovery of a message hidden using an existing application, a new

steganographic application is being devised. There are even new twists to old methods. Drawings



have often been used to conceal or reveal information. It is simple to encode a message by

varying lines, colors or other elements in pictures. Computers take such a method to new

dimensions as we will see later.

Even the layout of a document can provide information about that document. Brassil et al

authored a series of publications dealing with document identification and marking by

modulating the position of lines and words [Brassil-Infocom94, Brassil-Infocom94,

Brassil-CISS95]. Similar techniques can also be used to provide some other "covert" information just as

0 and 1 are informational bits for a computer. As in one of their examples, word-shifting can be

used to help identify an original document [Brassil-CISS95]. Though not applied as discussed in

the series by Brassil et al, a similar method can be applied to display an entirely different

message. Take the following sentence (S0):

We explore new steganographic and cryptographic

algorithms and techniques throughout the world to

produce wide variety and security in the electronic web

called the Internet.

and apply some word shifting algorithm (this is sentence S1).





By overlapping S0 and S1, the following sentence is the result:







This is achieved by expanding the space before explore, the, wide, and web by one point and

condensing the space after explore, world, wide and web by one point in sentence S1.

Independently, the sentences containing the shifted words appear harmless, but combining this

with the original sentence produces a different message: explore the world wide web.

2.3 Steganography – Uses:

Like many security tools, steganography can be used for a variety of reasons, some good,

some not so good. Legitimate purposes can include things like watermarking images for reasons

such as copyright protection. Digital watermarks (also known as fingerprinting, significant

especially in copyrighting material) are similar to steganography in that they are overlaid in files,

which appear to be part of the original file and are thus not easily detectable by the average

person. Steganography can also be used as a way to make a substitute for a one-way hash value

(where you take a variable length input and create a static length output string to verify that no

changes have been made to the original variable length input). Further, steganography can be

used to tag notes to online images (like post-it notes attached to paper files). Finally,

steganography can be used to maintain the confidentiality of valuable information, to protect the

data from possible sabotage, theft, or unauthorized viewing.

Unfortunately, steganography can also be used for illegitimate reasons. For instance, if

someone was trying to steal data, they could conceal it in another file or files and send it out in

an innocent looking email or file transfer. Furthermore, a person with a hobby of saving

pornography, or worse, to their hard drive, may choose to hide the evidence through the use of

steganography. And, as was pointed out in the concern for terroristic purposes, it can be used as a

means of covert communication. Of course, this can be both a legitimate and an illegitimate

application.

2.4 Steganography Tools

There are a vast number of tools that are available for steganography. An important

distinction that should be made among the tools available today is the difference between tools


that do steganography, and tools that do steganalysis, which is the method of detecting

steganography and destroying the original message. Steganalysis focuses on this aspect, as

opposed to simply discovering and decrypting the message, because this can be difficult to do

unless the encryption keys are known.

A comprehensive discussion of steganography tools is beyond the scope of this article.

However, there are many good places to find steganography tools on the Net. One good place to

start your search for stego tools is on Neil Johnson's Steganography and Digital Watermarking

Web site. The site includes an extensive list of steganography tools. Another comprehensive

tools site is located at the StegoArchive.com.

For steganalysis tools, a good site to start with is Neil Johnson's Steganalysis site. Niels

Provos's site, is also a great reference site, but is currently being relocated, so keep checking

back on its progress.

The plethora of tools available also tends to span the spectrum of operating systems.

Windows, DOS, Linux, Mac, Unix: you name it, and you can probably find it.

2.4.1 Working of Steganography Tools:

To show how easy steganography is, I started out by downloading one of the more

popular freeware tools out now: F5, then moved to a tool called SecurEngine, which hides text

files within larger text files, and lastly a tool that hides files in MP3s called MP3Stego. I also

tested one commercial steganography product, Steganos Suite.

F5 was developed by Andreas Westfeld, and runs as a DOS client. A couple of GUIs

were later developed: one named "Frontend", developed by Christian Wohne and the other,

named "Stegano", by Thomas Biel. I tried F5, beta version 12. I found it very easy to encode a

message into a JPEG file, even if the buttons in the GUI are written in German! Users can simply

do this by following the buttons, inputting the JPEG file path, then the location of the data that is

being hidden (in my case, I used a simple text file created in Notepad), at which point the


program prompts the user for a pass phrase. As you can see by the before and after pictures

below, it is very hard to tell them apart, embedded message or not.

Granted, the file that I embedded here was very small (it included one line of text: "This is a

test. This is only a test."), so not that many pixels had to be replaced to hide my message. But

what if I tried to hide a larger file? F5 only hides text files. I tried to hide a larger word document

and although it did hide the file, when I tried to decrypt it, it came out as garbage. However,

larger text files seemed to hide in the picture just as well as my small, one-line message.

SecurEngine doesn't seem to be as foolproof as the tools that hide text within pictures. When I

hid my small text file in a bigger text file, I found an odd character at the bottom of the encoded

file ("ÿ"). This character was not in the original file. SecurEngine gives users the option of just

hiding the file or hiding the file as well as encrypting it. The test message was

encrypted and decrypted without issue. SecurEngine also has a feature that helps to "wipe" files

(to delete them more securely).

MP3Stego, a tool that hides data in MP3 files, worked very well. The process works

like this: you encode a file, a text file for example, with a .WAV file, in order for it to be

compressed into MP3 format. One problem that I ran into was that in order to hide data of any

size, I had to find a file that was proportional in size. So, for instance, my small text message

from the previous exercise was too big to hide in a .WAV file (the one that I originally tried was

121KB, and the text file was around 36 bytes). In order to ultimately hide a file that was 5 bytes

(only bearing the message "test."), I found a .WAV file that was 627 KB. The ultimate MP3 file

size was 57KB.

Steganos Suite is a commercial software package of numerous stego tools all rolled into

one. In addition to a nifty Internet trace destructor function and a computer file shredder, it has a

function called the File Manager. This allows users to encrypt and hide files on their hard drive.

The user selects a file or folder to hide, and then selects a "carrier" file, which is defined as a

graphic or sound file. If you prefer, it will also create one for you, provided you have a scanner or

microphone available. If you don't have a file handy and don't want to create one, the File

Manager will search your hard drive for an appropriate carrier. This tool looks for a wider

variety of file types than the majority of the freeware tools that I perused (such as .DLL and .DIB

files), so if you intend to do quite a bit of file hiding, you might want to invest in a commercial

package.

2.5 Steganography and Security

As mentioned previously, steganography is an effective means of hiding data, thereby

protecting the data from unauthorized or unwanted viewing. But stego is simply one of many

ways to protect the confidentiality of data. It is probably best used in conjunction with another

data-hiding method. When used in combination, these methods can all be a part of a layered

security approach. Some good complementary methods include:

Encryption - Encryption is the process of passing data or plaintext through a series of

mathematical operations that generate an alternate form of the original data known as

ciphertext. The encrypted data can only be read by parties who have been given the

necessary key to decrypt the ciphertext back into its original plaintext form. Encryption

doesn't hide data, but it does make it hard to read!

Hidden directories (Windows) - Windows offers this feature, which allows users to hide

files. Using this feature is as easy as changing the properties of a directory to "hidden",

and hoping that no one displays all types of files in their explorer.

Hiding directories (Unix) - On Unix systems, data can be hidden in existing directories that

already hold many files, such as the /dev directory, or in a new directory whose name starts with

three dots (...) rather than the normal single or double dot.

Covert channels - Some tools can be used to transmit valuable data in seemingly normal

network traffic. One such tool is Loki. Loki is a tool that hides data in ICMP traffic (like

ping).

2.6 Protecting Against Malicious Steganography




Unfortunately, all of the methods mentioned above can also be used to hide illicit, unauthorized

or unwanted activity. What can you do to prevent or detect issues with stego? There is no easy

answer. If someone has decided to hide their data, they will probably be able to do so fairly

easily. The only way to detect steganography is to be actively looking for it in specific files, or to

get very lucky. Sometimes an actively enforced security policy can provide the answer: this

would require the implementation of company-wide acceptable use policies that restrict the

installation of unauthorized programs on company computers.

Using the tools that you already have to detect movement and behavior of traffic on your

network may also be helpful. Network intrusion detection systems can help administrators to

gain an understanding of normal traffic in and around your network and can thus assist in

detecting any type of anomaly, especially a change in behavior such as increased

movement of large images around your network. If the administrator is aware of this sort of

anomalous activity, it may warrant further investigation. Host-based intrusion detection systems

deployed on computers may also help to identify anomalous storage of image and/or video files.

A research paper by Stefan Hetzel cites two methods of attacking steganography, which

really are also methods of detecting it. They are the visual attack (actually seeing the differences

in the files that are encoded) and the statistical attack: "The idea of the statistical attack is to

compare the frequency distribution of the colors of a potential stego file with the theoretically

expected frequency distribution for a stego file." It might not be the quickest method of

protection, but if you suspect this type of activity, it might be the most effective. For JPEG files

specifically, a tool called Stegdetect, which looks for signs of steganography in JPEG files, can

be employed. Stegbreak, a companion tool to Stegdetect, works to decrypt possible messages

encoded in a suspected steganographic file, should that be the path you wish to take once the

stego has been detected.
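The statistical attack quoted above can be illustrated with a short Python sketch. This is not Hetzel's or Stegdetect's implementation; it assumes 8-bit sample values and plain LSB replacement, and the function names and the sample data are hypothetical. The idea is that LSB replacement with random-looking data tends to equalize the counts of each value pair (2k, 2k+1), so a chi-square statistic over those pairs drops sharply after embedding.

```python
from collections import Counter
import random


def pov_chi_square(samples):
    """Chi-square statistic over pairs of values (2k, 2k+1).

    A small statistic means the two counts in each pair are nearly equal,
    which is what LSB replacement with random-looking data tends to produce.
    """
    counts = Counter(samples)
    statistic = 0.0
    for even in range(0, 256, 2):
        n_even = counts.get(even, 0)
        n_odd = counts.get(even + 1, 0)
        expected = (n_even + n_odd) / 2  # expected count if LSBs were random
        if expected > 0:
            statistic += (n_even - expected) ** 2 / expected
    return statistic


def embed_random_lsbs(samples, rng):
    """Simulate LSB replacement by overwriting every LSB with a random bit."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


rng = random.Random(0)  # fixed seed so the demonstration is repeatable
# Hypothetical cover data: even values are three times as common as odd ones.
cover = [rng.choice([10, 10, 10, 11, 40, 40, 40, 41]) for _ in range(4000)]
stego = embed_random_lsbs(cover, rng)

cover_stat = pov_chi_square(cover)
stego_stat = pov_chi_square(stego)
print("cover:", cover_stat, "stego:", stego_stat)
```

A real detector would turn the statistic into a probability and would examine the file in growing segments; the sketch only shows why the pair counts give the embedding away.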




Chapter 3

PC Software that Provides Steganographic Services

3.1 Background

Steganographic software is new and very effective. Such software enables information to

be hidden in graphic, sound and apparently "blank" media. Charles Kurak and John McHugh

discuss the implications of downgrading an image (security downgrading) when it may contain

some other information [Kurak92]. Though not explicitly stated the author(s) of StegoDos

mention embedding viruses in images [StegoDos].

In the computer, an image is an array of numbers that represent light intensities at various

points (pixels1) in the image. A common image size is 640 by 480 and 256 colors (or 8 bits per

pixel). Such an image could contain about 300 kilobits of data.

There are usually two types of files used when embedding data into an image. The

innocent-looking image which will hold the hidden information is a "container." A "message" is

the information to be hidden. A message may be plain-text, ciphertext, other images or anything

that can be embedded in the least significant bits (LSB) of an image.

For example:

Suppose we have a 24-bit image 1024 x 768 (this is a common resolution for satellite images,

electronic astral photographs and other high resolution graphics). This may produce a file over 2

megabytes in size (1024x768x24/8 = 2,359,296 bytes). All color variations are derived from

three primary colors, Red, Green and Blue. Each primary color is represented by 1 byte (8 bits).

24-bit images use 3 bytes per pixel. If information is stored in the least significant bit (LSB) of

each byte, 3 bits can be stored in each pixel. The "container" image will look identical to the

human eye, even if viewing the picture side by side with the original. Unfortunately, 24-bit

images are uncommon (with exception of the formats mentioned earlier) and quite large. They

would draw attention to themselves when being transmitted across a network. Compression

would be beneficial if not necessary to transmit such a file. But file compression may interfere

with the storage of information.
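The arithmetic in this example can be checked with a minimal Python sketch of LSB embedding. It assumes the container is available as a flat sequence of raw, uncompressed color bytes (three per pixel); the function names are hypothetical and no particular tool's file handling is reproduced. With one bit per color byte, a 1024 x 768 24-bit image offers 1024 x 768 x 3 = 2,359,296 bits, or 294,912 bytes, of hiding space.

```python
def capacity_bytes(width, height, channels=3):
    """Bytes that fit when one bit is stored in the LSB of every color byte."""
    return width * height * channels // 8


def embed_lsb(pixels, message):
    """Return a copy of `pixels` with `message` written into the LSBs."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message is too large for this container")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(pixels, length):
    """Read `length` bytes back out of the LSBs, most significant bit first."""
    out = bytearray()
    for i in range(length):
        value = 0
        for byte in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


container = bytes(range(256)) * 3  # stand-in for raw RGB data
secret = b"This is only a test."
stego = embed_lsb(container, secret)
print(capacity_bytes(1024, 768), extract_lsb(stego, len(secret)))
```

No color byte changes by more than one intensity level, which is why the altered container is so hard to tell apart from the original by eye.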

1 A pixel is an instance of color, a point in a picture.

Kurak and McHugh identify two kinds of compression, lossless and lossy [Kurak92]. Both

methods save storage space but may present different results when the information is

uncompressed.

Lossless compression is preferred when there is a requirement that the original

information remain intact (as with steganographic images). The original message can be

reconstructed exactly. This type of compression is typical in GIF2 and BMP3 images.

Lossy compression, while also saving space, may not maintain the integrity of the

original image. This method is typical in JPG4 images and yields very good compression.

To illustrate the advantage of lossy compression, Renoir's Le Moulin de la Galette was

retrieved as a 175,808 byte JPG image 1073 x 790 pixels with 16 million possible colors. The






colors were maintained when converting it to a 24-bit BMP file but the file size became

2,649,019 bytes! Converting again to a GIF file, the colors were reduced to 256 colors (8-bit)

and the new file is 775,252 bytes. The 256 color image is a very good approximation of Renoir's

painting.

2 Graphic Interchange Format developed by Compuserve to be a

device-independent method of storing images.

3 Windows and OS/2 bitmap picture file.

4 Joint Photographic Experts Group (JPG/JPEG) is a device-independent

method for storing images which supports 24-bit images.

Most available steganographic software neither supports nor recommends using JPG

files (an exception is noted later in the paper). The next best alternative to 24-bit images is to use

256 color (or gray-scale) images. These are the most common images found on the Internet in

the form of GIF files. Each pixel is represented as a byte (8-bits). Many authors of the

steganography software and articles stress the use of gray-scale images (those with 256 shades of

gray or better) [Arachelian, Aura95, Kurak92, Maroney]. What matters is not whether the

image is gray-scale, but the degree to which the colors change between bit

values.

Gray-scale images are very good because the shades gradually change from byte to byte.

The following is a palette containing 256 shades of gray.



3.2 Evaluation Method

A similar image with 16 shades of gray (four-bit color) may look very close to one with

256 shades of gray but the palette has fewer variations with which to work. The subtleties permit

data to be stored without the human eye catching the changes. Many argue that gray-scale

images render the "best" results for steganography. However, using gray-scale or color is not as

important as the subtleties in color variation. Consider the following two 256 color palettes.




Figure 3 illustrates subtle changes in color variations. It is difficult to differentiate

between many of the colors in this palette. Is this palette in Figure 2 "good" for steganography?

Well, it depends. Subtle color changes can be seen in Figure 2, but other color variances seem to

be rather drastic. However, one must consider the image in addition to the palette. Obviously, an

image with large areas of solid colors is a poor choice as variances created from the embedded

message will be noticeable in the solid areas (a palette as in Figure 3 would offset this). Figure 2

is the palette from a 256 color version of Renoir's Le Moulin de la Galette. Based on embedding

this image with text and graphic messages, it is a very good container for holding data.

Various steganographic software packages were explored. The evaluation process was to

determine limitations and flexibility of the software readily available to the public.

Message and container files were selected before testing. This proved to be a problem

with some packages due to limitations of the software. The images selected had to be altered to

fit into the constraints of the software and other containers were used. In all, a total of 25 files

were used as containers (many more than I have room to discuss).



The files used for evaluation included two "message" files and two "container" files. The

"message" files are those to be hidden in the innocent looking "container" files.

Message 1 contains the following plain-text and will be referred to as M1:

Steganography is the art and science of communicating in a way which hides the existence

of the communication. In contrast to cryptography, where the "enemy" is allowed to detect,

intercept and modify messages without being able to violate certain security premises guaranteed

by a cryptosystem, the goal of steganography is to hide messages inside other "harmless"

messages in a way that does not allow any "enemy" to even detect that there is a second secret

message present.

5 The satellite photograph is of a major Soviet strategic bomber base near Dolon,

Kazakhstan taken August 20, 1966. An Executive Order, signed by President Clinton on 23

February 1995, has authorized the declassification of satellite photographs collected by the U.S.



intelligence community during the 1960's. This and other photographs are available on the

Internet via the U.S. Geological Survey - National Mapping Information - EROS Data Center.

The Container Files

Figure 5: Renoir's Le Moulin de la Galette - Container C1

Figure 6: Droeshout engraving of William Shakespeare - Container C27




6 Le Moulin de la Galette by Pierre-Auguste Renoir is available via the WebMuseum, Paris

and accessible.

7 A JPG version of Droeshout engraving of William Shakespeare is available.

The image of Shakespeare is too small to contain M2, but M1 could be embedded

without any degradation of the image. For the most part, all the software tested could handle the

518 byte plaintext message; however, only two could handle the image labeled M2. Of the two,

only one software package could reliably handle 24-bit images and other formats:

S-Tools by Andy Brown.

Next, an attempt was made to embed messages M1 and M2 using each software package.

If the software could not handle processing these containers (C1 and C2), other containers were

tried. All the software could embed M1 into some container. These files were reviewed before

and after applying steganographic methods.

3.3 Software Evaluation

The following software packages were reviewed with respect to steganographic

manipulation of images: Hide and Seek v4.1, StegoDos v0.90a, White Noise Storm, and S-Tools

for Windows v3.00. Nearly all the authors encourage encrypting messages before embedding



them in images as an added layer of protection and reviewing the images after embedding data.

Even with the most reliable software tested, there may be some unexpected results.

3.3.1 Hide and Seek v 4.1

Hide and Seek versions 4.1 and 5.0 by Colin Maroney have similar limitations with

minimum image sizes (320 x 480). In version 4.1 if the image is smaller than the minimum, then

the stego-image is padded with black space. If the cover image is larger, the stego-image is

cropped to fit. In version 5.0 the same is true with minimum image sizes. If any image exceeds

1024 x 768, an error message is returned. The Hide and Seek 1.0 for Windows 95 version seems

to have these issues resolved and is a much improved steganography tool. Version 4.1 is

evaluated here to illustrate limitations of some steganography tools.

Hide and Seek 4.1 is free software which contains a series of DOS programs that embed

data in GIF files and comes with the source code.

Hide and Seek uses the Least Significant Bit of each pixel to encode characters, 8 pixels

per character and spreads the data throughout the GIF in a somewhat random fashion. The larger

the message the more likely the resulting image will be degraded. Since the data is dispersed

"randomly" and the message file header is encrypted, there is no telling what is in an embedded

file.

Unfortunately the hidden file can be no longer than 19,000 bytes because the maximum

display used is 320 x 480 pixels. Each character takes 8 pixels to hide ((320 x 480)/8 = 19,200).
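The capacity limit and the "somewhat random" dispersal can be sketched in Python as follows. This is an illustration under stated assumptions, not Maroney's algorithm: the pixel positions are chosen with Python's `random` module seeded by a key, the image is modeled as a flat list of 8-bit palette indices, and the names `hide`, `seek`, and `scattered_positions` are hypothetical.

```python
import random

WIDTH, HEIGHT = 320, 480
MAX_CHARS = (WIDTH * HEIGHT) // 8  # 19,200 characters at 8 pixels each


def scattered_positions(n_pixels, n_bits, key):
    """Pick n_bits distinct pixel indices; the same key gives the same order."""
    return random.Random(key).sample(range(n_pixels), n_bits)


def hide(pixels, text, key):
    """Store each character in the LSBs of 8 pixels scattered over the image."""
    data = text.encode("ascii")
    if len(data) > len(pixels) // 8:
        raise ValueError("message exceeds the capacity of the image")
    bits = [(b >> s) & 1 for b in data for s in range(7, -1, -1)]
    stego = list(pixels)
    for pos, bit in zip(scattered_positions(len(pixels), len(bits), key), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def seek(pixels, n_chars, key):
    """Recover n_chars characters using the same key-derived positions."""
    positions = scattered_positions(len(pixels), n_chars * 8, key)
    chars = bytearray()
    for i in range(0, len(positions), 8):
        value = 0
        for pos in positions[i:i + 8]:
            value = (value << 1) | (pixels[pos] & 1)
        chars.append(value)
    return chars.decode("ascii")


cover = [128] * (WIDTH * HEIGHT)  # stand-in for a 320 x 480 image
stego = hide(cover, "test.", "pass phrase")
recovered = seek(stego, 5, "pass phrase")
print(MAX_CHARS, recovered)
```

The sketch requires the receiver to know the message length; a real tool would store that in a header, which Hide and Seek is described as encrypting.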

C2 (Shakespeare) was used to embed M1. The original image of Shakespeare is 222 x

282 pixels and 256 shades of gray. The resulting image was forced to 320 x 480 pixels. Instead

of "stretching" the image to fit, large black areas were added to the image making it 320 x 480.

The image on the left is the original C2 and the image on the right is embedded with M1.



3.3.2 StegoDos

StegoDos is also known as Black Wolf's Picture Encoder version 0.90a. This is Public

Domain software written by Black Wolf (anonymous). This is a series of DOS programs that

require far too much effort for the results. It will only work with 320x200 images with 256

colors. To encode a message, one must:

1. Run GETSCR. This starts a TSR which will perform a screen capture when

PRINTSCREEN is pressed.

2. View the image with a third-party image viewing software (not included with StegoDos)

and press PRINTSCREEN to save the image in MESSAGE.SCR.

3. Save your message to be embedded in the image as MESSAGE.DAT.

4. Run ENCODE. This will merge MESSAGE.DAT with MESSAGE.SCR.

5. Use a third party screen capturing program (not included with StegoDos) to capture the

new image from the screen.



6. Run PUTSCR and capture the image displayed on the screen.

Decoding the message is not as involved but still requires a third party program to view the

image. To decode a message, one must:

1. Run GETSCR. This starts a TSR which will perform a screen capture when

PRINTSCREEN is pressed.

2. View the image containing a message with a third-party image viewing software (not

included with StegoDos) and press PRINTSCREEN to save the image in

MESSAGE.SCR.

3. Run DECODE. This will extract the stored message from MESSAGE.SCR.

Due to the size restrictions, M2 and C1 could not be used. C2 (Shakespeare) and a number of

other containers were tested (both color and gray-scale) with M1. Every one of them was

obviously distorted. There was little distortion within the C2 image, but it was cropped and fitted

into a 320 x 200 pixel image. The image on the left is the original C2 file. The image on the right

contains the M1 message:



This application uses the Least Significant Bit method with less success than the others. It

also appends an EOF (end of file) character to the end of the message. Even with the EOF

character, the message retrieved from the altered image most likely contained garbage at the

end. The following is the original message (M1) and a portion of the message extracted from the

image created with StegoDos:

Steganography is the art and science of communicating in a way which hides the

existence of the communication. In contrast to cryptography, where the "enemy" is allowed to

detect, intercept and modify messages without being able to violate certain security premises

guaranteed by a cryptosystem, the goal of steganography is to hide messages inside other

"harmless" messages in a way that does not allow any "enemy" to even detect that there is a

second secret message present.

The original file is 518 bytes. The extracted file is around 8 kilobytes:

Steganography is the art and science of communicating in a way which hides the existence of

the communication. In contrast to cryptography, where the "enemy" is allowed to detect,

intercept and modify messages without being able to violate certain security premises guaranteed

by a cryptosystem, the goal of steganography is to hide messages inside other "harmless"



messages in a way that does not allow any "enemy" to even detect that there is a second secret

message present.

3.3.3 White Noise Storm

White Noise Storm by Ray (Arsen) Arachelian is a very versatile steganography

application for DOS. Embedding M1 in the containers C1 and C2 was rather trivial and no

degradation could be detected. White Noise Storm was the first software tested that could embed

M2 into C1 - notice the "noise" interfering with the image integrity.

The image on the left is the original C2. The image on the right contains message M1:





Arachelian encourages encrypting the message before embedding it into an image. White

Noise Storm (WNS) also includes an encryption routine to "randomize" the bits within an image.

His use of encryption with steganography is well integrated, but is beyond the scope of this

paper.

WNS was designed based on the idea of spread spectrum technology and frequency

hopping. "Instead of having X channels of communication which are changed with a fixed

formula and passkey. Eight channels are spread within a number of 8-bits*W byte channels. W

represents a random sized window of W bytes. Each of these eight channels represents one single

bit, so each window holds one byte of information and a lot of unused bits. These channels rotate

among themselves, for instance bit 1 might be swapped with bit 7, or all the bits may rotate

positions at once. These bits change location within the window on the byte level. The rules for

this swapping are dictated not only by the passphrase but also by the previous window's random

data (similar to DES block encryption)" [Arachelian, RE: Steganography].

WNS also uses the Least Significant Bit (LSB) application of steganography and applies

this method to PCX8 files. The software extracts the LSBs from the container image and stores

them in a file. The message is encrypted and applied to these bits to create a "new" set of LSBs.

These are then "injected" into the container image to create a new image. The documentation that

accompanies White Noise Storm is well organized and explains some of the theory behind the

implementation of encryption and steganography.
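The extract, encrypt, and inject sequence described above can be sketched in Python. The sketch is only an outline of that workflow: it does not reproduce the WNS window and channel-rotation scheme, and the keystream (SHA-256 in counter mode, XORed onto the message bits) is a stand-in chosen for brevity. All function names are hypothetical.

```python
import hashlib


def extract_lsbs(container):
    """Pull the least significant bit out of every container byte."""
    return [byte & 1 for byte in container]


def inject_lsbs(container, lsbs):
    """Write a full set of LSBs back into the container bytes."""
    return bytes((byte & 0xFE) | bit for byte, bit in zip(container, lsbs))


def keystream_bits(passphrase, n_bits):
    """Toy keystream from SHA-256 in counter mode (not the WNS cipher)."""
    bits, counter = [], 0
    while len(bits) < n_bits:
        block = hashlib.sha256(passphrase.encode() + counter.to_bytes(4, "big")).digest()
        bits.extend((byte >> s) & 1 for byte in block for s in range(7, -1, -1))
        counter += 1
    return bits[:n_bits]


def to_bits(data):
    return [(byte >> s) & 1 for byte in data for s in range(7, -1, -1)]


def from_bits(bits):
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def hide(container, message, passphrase):
    lsbs = extract_lsbs(container)              # step 1: extract the LSBs
    msg_bits = to_bits(message)
    if len(msg_bits) > len(lsbs):
        raise ValueError("container too small")
    key = keystream_bits(passphrase, len(msg_bits))
    for i, (m, k) in enumerate(zip(msg_bits, key)):
        lsbs[i] = m ^ k                         # step 2: apply encrypted message
    return inject_lsbs(container, lsbs)         # step 3: inject the new LSBs


def reveal(container, n_bytes, passphrase):
    lsbs = extract_lsbs(container)[:n_bytes * 8]
    key = keystream_bits(passphrase, len(lsbs))
    return from_bits([b ^ k for b, k in zip(lsbs, key)])


container = bytes(range(256)) * 2  # stand-in for raw image bytes
message = b"M1 fits here"
stego = hide(container, message, "storm")
print(reveal(stego, len(message), "storm"))
```

Because the bits written into the container are encrypted first, the LSB plane of the result looks like noise rather than like readable text.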

The main disadvantage of applying the WNS encryption method to steganography is the

loss of many bits that can be used to hold information. Relatively large files must be used to hold

the same amount of information other methods provide.




3.3.4 S-Tools

Steganography Tools (S-Tools) for Windows v3.00 by Andy Brown is the most versatile

steganography tool of the applications tested. It includes several programs that process GIF and

BMP images (ST-BMP.EXE), audio WAV files (ST-WAV.EXE) and will even hide information

in the "unused" areas on floppy diskettes (ST-FDD.EXE). In addition to supporting 24-bit

images, S-Tools also includes a barrage of encryption routines (Idea, MPJ2, DES, 3DES and

NSEA) with many options.

S-Tools applies the LSB methods discussed before to both images and audio files. Due to

the lack of resources, only images were tested. Brown developed a very nice interface with

prompts and well developed on-line documentation. The only apparent limitations were the

resources available. At times, large 24-bit images would bring Windows to a halt. A

very useful feature is a status line that displays the largest message size that can be stored in an

open container file. This saves the time of attempting to store a message that is too large for a

container. After the message is hidden, the "new" image is displayed and the user can toggle

between the new and original images. At times the new image looked grossly distorted, but

after saving, the new image looked nearly identical to the original. This may be due to memory

limitations. On occasion a saved image was actually corrupted and could not be read. A saved

image should always be reviewed before sending it out.

S-Tools provided the most impressive results. Unlike the obvious distortions in "A

Cautionary Note on Image Downgrading" [Kurak92], S-Tools maintained remarkable image

integrity. The following figure illustrates the text message M1 embedded in container C2.





The following is the original C1 (top) and C1 embedded with M2 (airfield):



The following is derived from S-Tools BMP - How it is done by Andy Brown:

"S-Tools works by 'spreading' the bit-pattern of the message file to be hidden across the

least-significant bits of the color levels in the image. S-Tools tries to reduce the number of image

colors in a manner that preserves as much of the image detail as possible. It is difficult to tell the

difference between a 256 color image and one reduced to 32."

"S-Tools adds some extra information on to the front of the message file before hiding.

32 bits of time-dependent random garbage is added first. This step means that two identical

hidden files that are encrypted in CBC or PCBC mode will never encipher to the same

ciphertext. The 32 bit length of the hidden file is then included. This is required for S-Tools to

be able to extract the hidden file. Encryption will conceal this value."

"To further conceal the presence of a file, S-Tools picks its bits from the image based on

the output of a random number generator. This is designed to defeat an attacker who might apply

a statistical randomness test to the lower bits of the image to determine whether encrypted data is

hidden there (well-encrypted data shows up as pure white noise). The random number generator

used by S-Tools is based on the output of the MD5 message digest algorithm, and is not easily (if

at all) defeatable" [S-Tools Documentation by Andy Brown].
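The framing and bit-selection steps quoted above can be sketched in Python. This is not the S-Tools source: the byte order of the length field, the way MD5 output is turned into positions, and the function names are all assumptions made for illustration, and the encryption step is omitted.

```python
import hashlib
import os
import struct


def frame_payload(payload):
    """Prefix 32 random bits and a 32-bit length to the hidden file."""
    # Big-endian length is an assumption; the quoted text does not specify it.
    return os.urandom(4) + struct.pack(">I", len(payload)) + payload


def unframe_payload(framed):
    """Skip the random prefix, read the length, and return the hidden file."""
    (length,) = struct.unpack(">I", framed[4:8])
    return framed[8:8 + length]


def md5_positions(passphrase, n_slots, n_needed):
    """Pick n_needed distinct slot indices from MD5 output in counter mode."""
    if n_needed > n_slots:
        raise ValueError("not enough slots in the container")
    chosen, seen, counter = [], set(), 0
    while len(chosen) < n_needed:
        digest = hashlib.md5(passphrase + counter.to_bytes(8, "big")).digest()
        counter += 1
        candidate = int.from_bytes(digest[:4], "big") % n_slots
        if candidate not in seen:  # never reuse a slot
            seen.add(candidate)
            chosen.append(candidate)
    return chosen


framed = frame_payload(b"hidden")
positions = md5_positions(b"secret", 1000, len(framed) * 8)
print(len(framed), positions[:5])
```

Each position would receive one bit of the framed (and, in the real tool, encrypted) message, so the altered bits are spread over the image instead of sitting in one contiguous run.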



3.4. Software not tested but worth noting

The following software packages were reviewed but not tested: Jpeg-Jsteg v4 and Stealth v1.1.

3.4.1 Jpeg-Jsteg v4

Cryptography and steganography rely on retrieving a message in its original form without

losing any information. Such is the idea behind lossless compression. Since JPG images use

lossy encoding to compress their data, it is generally thought that steganography would be

infeasible with such images. "This version of the Independent JPEG Group's JPEG Software has

been modified for 1-bit steganography in JFIF output files" [Independent JPEG Group]. The

Jpeg-Jsteg software comes with source code and instructions for compiling the code on various

platforms.

According to the Independent JPEG Group (IJG), the JFIF format is composed of lossy

and non-lossy stages. Information can be inserted between these stages without corrupting the

image.
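A minimal Python sketch of hiding data between the lossy and non-lossy stages is given below. It assumes the quantized DCT coefficients are already available as a list of integers, which is the point where the lossy stage has finished and the lossless entropy coding has not yet begun. Skipping coefficients equal to 0 or 1 follows the commonly described Jsteg behavior, but the code is an illustration and not the Jpeg-Jsteg source; the function names are hypothetical.

```python
def usable(coefficient):
    """Coefficients equal to 0 or 1 are left alone and carry no data."""
    return coefficient not in (0, 1)


def embed_in_coefficients(coefficients, bits):
    """Write bits into the LSBs of usable quantized coefficients."""
    out = list(coefficients)
    pending = iter(bits)
    bit = next(pending, None)
    for i, c in enumerate(out):
        if bit is None:
            break
        if usable(c):
            out[i] = (c & ~1) | bit  # works for negative values as well
            bit = next(pending, None)
    if bit is not None:
        raise ValueError("not enough usable coefficients")
    return out


def extract_from_coefficients(coefficients, n_bits):
    """Read the first n_bits LSBs from the usable coefficients."""
    return [c & 1 for c in coefficients if usable(c)][:n_bits]


# Hypothetical quantized coefficients from one or more 8 x 8 blocks.
quantized = [12, 0, -3, 1, 5, 0, 0, 2, -1, 0, 7, 1, 0, -6]
bits = [1, 0, 1, 1, 0, 0, 1]
stego = embed_in_coefficients(quantized, bits)
print(stego, extract_from_coefficients(stego, len(bits)))
```

Because the modified values are entropy coded without further loss, the hidden bits survive in the saved file, which is what makes steganography in JPEG output feasible despite the lossy compression.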

As discussed earlier with Renoir's Le Moulin de la Galette, compression is a great

advantage JPG images have over other formats. JPEG images are becoming more abundant on

the Internet because large images with unlimited colors can be stored in relatively small files (a

1073 x 790 pixel image with 16 million colors can be stored in a 170 Kilobyte file. The same

image is over 2 Megabytes if converted to a BMP).

3.4.2 Stealth v1.1

Stealth by Henry Hastur in and of itself is not a steganographic program or method. It is

usually found with steganographic software on the Internet and is used to complement the

steganographic methods. Stealth is a filter that strips off the PGP header that is on a PGP

encrypted file. This leaves only the encrypted data. Why is this important? Applying

steganography to an encrypted message is more secure than a "plain text" message. However,

many encryption applications add header information to the encrypted message. This header



information identifies the method used to encrypt the data. For example, if a cracker has

identified hidden data in an image and has successfully extracted the encrypted message, a

header for the encryption method would point the cracker in the right direction for additional

cryptanalysis. But, if the header is removed, the cracker cannot determine the method for

encryption. Some steganography software (White Noise Storm and S-Tools) provide this step in

security, but others do not.

3.5 Code:

function varargout = steg(varargin)

% STEG M-file for steg.fig

% STEG, by itself, creates a new STEG or raises the existing

% singleton*.

%

% H = STEG returns the handle to a new STEG or the handle to

% the existing singleton*.

%

% STEG('CALLBACK',hObject,eventData,handles,...) calls the local

% function named CALLBACK in STEG.M with the given input arguments.

%

% STEG('Property','Value',...) creates a new STEG or raises the

% existing singleton*. Starting from the left, property value pairs are

% applied to the GUI before steg_OpeningFunction gets called. An

% unrecognized property name or invalid value makes property application

% stop. All inputs are passed to steg_OpeningFcn via varargin.

%



% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one

% instance to run (singleton)".

%

% See also: GUIDE, GUIDATA, GUIHANDLES

% Copyright 2002-2003 The MathWorks, Inc.

% Edit the above text to modify the response to help steg

% Last Modified by GUIDE v2.5 27-Jun-2007 02:17:56

% Begin initialization code - DO NOT EDIT

gui_Singleton = 1;

gui_State = struct('gui_Name', mfilename, ...

'gui_Singleton', gui_Singleton, ...

'gui_OpeningFcn', @steg_OpeningFcn, ...

'gui_OutputFcn', @steg_OutputFcn, ...

'gui_LayoutFcn', [] , ...

'gui_Callback', []);

if nargin && ischar(varargin{1})

gui_State.gui_Callback = str2func(varargin{1});

end

if nargout

[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});

else

gui_mainfcn(gui_State, varargin{:});

end



% End initialization code - DO NOT EDIT

% --- Executes just before steg is made visible.

function steg_OpeningFcn(hObject, eventdata, handles, varargin)

% This function has no output args, see OutputFcn.

% hObject handle to figure

% eventdata reserved - to be defined in a future version of MATLAB

% handles structure with handles and user data (see GUIDATA)

% varargin command line arguments to steg (see VARARGIN)

% Choose default command line output for steg

handles.output = hObject;

% Update handles structure

guidata(hObject, handles);

% UIWAIT makes steg wait for user response (see UIRESUME)

% uiwait(handles.figure1);

% --- Outputs from this function are returned to the command line.

function varargout = steg_OutputFcn(hObject, eventdata, handles)

% varargout cell array for returning output args (see VARARGOUT);

% hObject handle to figure



% Get default command line output from handles structure

varargout{1} = handles.output;

% --- Executes on button press in mat.

function mat_Callback(hObject, eventdata, handles)



% hObject handle to mat (see GCBO)



mat

% --- Executes on button press in itu.

function itu_Callback(hObject, eventdata, handles)

% hObject handle to itu (see GCBO)



itu

% --- Executes on button press in dec.

function dec_Callback(hObject, eventdata, handles)

% hObject handle to dec (see GCBO)



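global var

% Ask the user for the stego image to decode

[filename, pathname] = uigetfile( ...

{'*.jpg','JPG File (*.jpg)'; ...

'*.*','Any Image file (*.*)'}, ...

'Pick an image file');

% uigetfile returns 0 when the dialog is cancelled

if isequal(filename, 0)

return

end

% Build the full path and hand it to dec (defined elsewhere)

var = fullfile(pathname, filename);

dec(var)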



Chapter 4

Conclusion & Future Scope

Steganography has its place in security. It is not intended to replace cryptography but

supplement it. Hiding a message with steganography methods reduces the chance of a message

being detected. If that message is also encrypted, then even once discovered it must still be cracked

(yet another layer of protection).

There are an infinite number of steganography applications. This paper explores a tiny

fraction of the art of steganography. It goes well beyond simply embedding text in an image.

Steganography does not only pertain to digital images but also to other media (files such as

voice, other text and binaries; other media such as communication channels, the list can go on

and on). Consider the following example:

A person has a cassette tape of Pink Floyd's "The Wall." The plans of a Top Secret

project (e.g., device, aircraft, covert operation) are embedded, using some steganographic

method, on that tape. Since the alterations of the "expected contents" cannot be detected,

(especially by human ears and probably not easily so by digital means) these plans can cross

borders and trade hands undetected. How do you detect which recording has the message?

This is a trivial (and incomplete) example, but it goes far beyond simple image encoding

in an image with homogeneous regions. Part of secrecy is selecting the proper mechanisms.

Consider encoding using a Mandelbrot image [Hastur].

In and of itself, steganography is not a good solution to secrecy, but neither is simple

substitution and short block permutation for encryption. But if these methods are combined, you

have much stronger encryption routines (methods).



For example (again oversimplified): if a message is encrypted using substitution

(substituting one alphabet with another), then permuted (the text is shuffled) and put through a

substitution again, the resulting ciphertext is more secure than with only substitution or

only permutation. Now, if the ciphertext is embedded in an image, video, voice recording, etc., it is even

more secure. If an encrypted message is intercepted, the interceptor knows the text is an

encrypted message. With steganography, the interceptor may not know the object contains a

message.
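The layering described above can be sketched as a toy MATLAB script. This is only an illustration under stated assumptions: the two substitution alphabets (key1, key2) and the position shuffle (perm) are hypothetical values made up for the sketch, the message is restricted to the capital letters A to Z, and nothing here is a secure cipher.

```matlab
% Toy substitution -> permutation -> substitution cipher (illustration only).
% key1, key2 and perm are hypothetical values chosen for this sketch.
key1 = 'QWERTYUIOPASDFGHJKLZXCVBNM';   % first substitution alphabet
key2 = 'ZYXWVUTSRQPONMLKJIHGFEDCBA';   % second substitution alphabet
perm = [3 1 4 2 7 5 8 6 11 9 12 10];   % shuffle of the 12 character positions

% Map 'A'..'Z' to key(1)..key(26), and back again.
substitute   = @(txt, key) key(double(txt) - 64);
unsubstitute = @(txt, key) char(arrayfun(@(c) find(key == c), txt) + 64);

msg    = 'ATTACKATDAWN';
step1  = substitute(msg, key1);     % first substitution
step2  = step1(perm);               % permutation
cipher = substitute(step2, key2);   % second substitution

% Decryption undoes the three steps in reverse order.
invPerm = zeros(size(perm));
invPerm(perm) = 1:numel(perm);
plain = unsubstitute(cipher, key2);
plain = plain(invPerm);
plain = unsubstitute(plain, key1);

disp(cipher)
disp(plain)
```

The ciphertext produced this way could then be handed to a steganographic embedding step, so that an interceptor would first have to notice the message before attempting to decrypt it.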



References:

1. [Aura95] Tuomas Aura, "Invisible Communication," EET 1995.

2. [Brassil-Infocom95] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Document Marking and Identification using Both Line and Word Shifting," Infocom95.

3. [Brassil-Infocom94] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Electronic Marking and Identification Techniques to Discourage Document Copying."

4. [Brassil-CISS95] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Hiding Information in Document Images," CISS95.

5. [JDJ01] Neil F. Johnson, Zoran Duric, Sushil Jajodia, Information Hiding: Steganography and Watermarking - Attacks and Countermeasures, Kluwer Academic Publishers, Norwell, MA, New York, The Hague, London, 2000.

6. [Kahn67] David Kahn, The Codebreakers, The Macmillan Company. New York, NY 1967.

7. [Kurak92] C. Kurak, J. McHugh, "A Cautionary Note On Image Downgrading," IEEE Eighth Annual Computer Security Applications Conference, 1992. pp. 153-159.

8. [Norman73] Bruce Norman, Secret Warfare, Acropolis Books Ltd. Washington, DC 1973.

9. [Zevon] Warren Zevon, Lawyers, Guns, and Money. Music track released in the albums Excitable Boy, 1978; Stand in the Fire, 1981; A Quiet Normal Life, 1986; Learning to Flinch, 1993.

10. [Zim48] Herbert S. Zim, Codes and Secret Writing, William Morrow and Company. New York, NY, 1948.



Appendix - A

Steganography: Hiding Data Within Data

Cryptography — the science of writing in secret codes — addresses all of the elements necessary

for secure communication over an insecure channel, namely privacy, confidentiality, key

exchange, authentication, and non-repudiation. But cryptography does not always provide

safe communication.

Consider an environment where the very use of encrypted messages causes suspicion. If a

nefarious government or Internet service provider (ISP) is looking for encrypted

messages, they can easily find them. Consider the following text file; what else is it likely

to be if not encrypted?

qANQR1DBwU4D/TlT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy

ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AA5+lqsG/ELBvRa

c9XefaYpbbAZ6z6LkOQ+eE0XASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv

z/9Ak4/OLnLiJRk05/2UNE5Z0a+3lcvITMmfGajvRhkXqocavPOKiin3hv7+Vx88

uLLem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9muDGBY0MkjiZIRI7azWnoU9

3KCnmpR60VO4rDRAS5uGl9fioSvze+q8XqxubaNsgdKkoD+tB/4u4c4tznLfw1L2

YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd49BTOFFTWWavAfE

gLYcPrcn4s3EriUgvL3OzPR4P1chNu6sa3ZJkTBbriDoA3VpnqG3hxqfNyOlqAka

mJJuQ53Ob9ThaFH8YcE/VqUFdw+bQtrAJ6NpjIxi/x0FfOInhC/bBw7pDLXBFNaX

HdlLQRPQdrmnWskKznOSarxq4GjpRTQo4hpCRJJ5aU7tZO9HPTZXFG6iRIT0wa47

AR5nvkEKoIAjW5HaDKiJriuWLdtN4OXecWvxFsjR32ebz76U8aLpAK87GZEyTzBx

dV+lH0hwyT/y1cZQ/E5USePP4oKWF4uqquPee1OPeFMBo4CvuGyhZXD/18Ft/53Y

WIebvdiCqsOoabK3jEfdGExce63zDI0=

=MpRf



The message above is a sentence in English that is encrypted using Pretty Good Privacy (PGP),

probably the most commonly used e-mail encryption software today. Besides being

nonsensical to a casual reader, the other indication that this is encrypted is that the

characters comprising the message appear more-or-less at random and do not adhere to

the relative frequency counts that one would expect in a non-encrypted message.

Encrypted data sticks out like a sore thumb.

Steganography is the science of hiding information. Whereas the goal of cryptography is to make

data unreadable by a third party, the goal of steganography is to hide the data from a third

party. In this article, I will discuss what steganography is, what purposes it serves, and

will provide an example using available software.

STEGANOGRAPHY

There are a large number of steganographic methods that most of us are familiar with (especially

if you watch a lot of spy movies!), ranging from invisible ink and microdots to secreting

a hidden message in the second letter of each word of a large body of text and spread

spectrum radio communication. With computers and networks, there are many other

ways of hiding information, such as:

Covert channels (e.g., Loki and some distributed denial-of-service tools use the Internet

Control Message Protocol, or ICMP, as the communications channel between the "bad

guy" and a compromised system)

Hidden text within Web pages

Hiding files in "plain sight" (e.g., what better place to "hide" a file than with an important

sounding name in the c:\winnt\system32 directory?)

Null ciphers (e.g., using the first letter of each word to form a hidden message in an

otherwise innocuous text)
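As an illustration of the null cipher item in the list above, the following minimal MATLAB sketch reads the first letter of each word. The cover sentence is a hypothetical example constructed for this sketch, and words are assumed to be separated by single spaces.

```matlab
% Null cipher: the hidden message is the first letter of each word.
% coverText is a made-up sentence used only for illustration.
coverText = 'Many eager emus tread along trails near our orchard nightly';

words  = strsplit(coverText, ' ');           % split the text into words
hidden = upper(cellfun(@(w) w(1), words));   % take the first letter of each

disp(hidden)   % MEETATNOON
```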



Steganography today, however, is significantly more sophisticated than the examples above

suggest, allowing a user to hide large amounts of information within image and audio

files. These forms of steganography often are used in conjunction with cryptography so

that the information is doubly protected; first it is encrypted and then hidden so that an

adversary has to first find the information (an often difficult task in and of itself) and then

decrypt it.

There are a number of uses for steganography besides the mere novelty. One of the most widely

used applications is for so-called digital watermarking. A watermark, historically, is the

replication of an image, logo, or text on paper stock so that the source of the document

can be at least partially authenticated. A digital watermark can accomplish the same

function; a graphic artist, for example, might post sample images on her Web site

complete with an embedded signature so that she can later prove her ownership in case

others attempt to portray her work as their own.

Stego can also be used to allow communication within an underground community. There are

several reports, for example, of persecuted religious minorities using steganography to

embed messages for the group within images that are posted to known Web sites.

STEGANOGRAPHIC METHODS

The following formula provides a very generic description of the pieces of the steganographic

process:

cover_medium + hidden_data + stego_key = stego_medium

In this context, the cover_medium is the file in which we will hide the hidden_data, which may

also be encrypted using the stego_key. The resultant file is the stego_medium (which will,

of course, be the same type of file as the cover_medium). The cover_medium (and, thus,

the stego_medium) are typically image or audio files. In this article, I will focus on image

files and will, therefore, refer to the cover_image and stego_image.



Before discussing how information is hidden in an image file, it is worth briefly reviewing how

images are stored in the first place. An image file is merely a binary file containing a

binary representation of the color or light intensity of each picture element (pixel)

comprising the image.

Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of

up to 256 colors forming a palette for this image, each color denoted by an 8-bit value. A

24-bit color scheme, as the term suggests, uses 24 bits per pixel and provides a much

better set of colors. In this case, each pixel is represented by three bytes, each byte

representing the intensity of the three primary colors red, green, and blue (RGB),

respectively. The Hypertext Markup Language (HTML) format for indicating colors in a

Web page often uses a 24-bit format employing six hexadecimal digits, each pair

representing the amount of red, green, and blue, respectively. The color orange, for

example, would be displayed with red set to 100% (decimal 255, hex FF), green set to

50% (decimal 127, hex 7F), and no blue (0), so we would use "#FF7F00" in the HTML

code.
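The orange example can be reproduced with a short MATLAB sketch that formats the three byte values as an HTML color code and parses them back. The variable names are illustrative only.

```matlab
% Build the HTML color code for orange from its red, green, blue bytes.
rgb = [255 127 0];                        % red 100%, green about 50%, no blue
hexCode = sprintf('#%02X%02X%02X', rgb);  % two hexadecimal digits per channel

% Parse the code back into the three channel values.
back = hex2dec({hexCode(2:3), hexCode(4:5), hexCode(6:7)}).';

disp(hexCode)   % #FF7F00
disp(back)
```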

The size of an image file, then, is directly related to the number of pixels and the granularity of

the color definition. A typical 640x480 pixel image using a palette of 256 colors would

require a file about 307 KB in size (640 × 480 bytes), whereas a 1024x768 pixel

high-resolution 24-bit color image would result in a 2.36 MB file (1024 × 768 × 3 bytes).
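The arithmetic behind these two figures can be checked with a few lines of MATLAB. This sketch counts raw pixel data only; it assumes no file header, no stored palette table, and no compression, and the helper name rawBytes is made up for the example.

```matlab
% Uncompressed pixel data size in bytes: width * height * bits per pixel / 8.
rawBytes = @(width, height, bitsPerPixel) width * height * bitsPerPixel / 8;

paletteImage   = rawBytes(640, 480, 8);     % 8-bit palette image
trueColorImage = rawBytes(1024, 768, 24);   % 24-bit RGB image

fprintf('%d bytes (about %.0f KB)\n', paletteImage, paletteImage / 1000);
fprintf('%d bytes (about %.2f MB)\n', trueColorImage, trueColorImage / 1e6);
```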

To avoid sending files of this enormous size, a number of image file formats and compression

schemes have been developed over time, notably the Bitmap (BMP), Graphic Interchange Format (GIF), and

Joint Photographic Experts Group (JPEG) file types. Not all are equally suited to

steganography, however.

GIF and 8-bit BMP files employ what is known as lossless compression, a scheme that allows

the software to exactly reconstruct the original image. JPEG, on the other hand, uses

lossy compression, which means that the expanded image is very nearly the same as the



original but not an exact duplicate. While both methods allow computers to save storage

space, lossless compression is much better suited to applications where the integrity of

the original information must be maintained, such as steganography. While JPEG can be

used for stego applications, it is more common to embed data in GIF or BMP files.

The simplest approach to hiding data within an image file is called least significant bit (LSB)

insertion. In this method, we can take the binary representation of the hidden_data and

overwrite the LSB of each byte within the cover_image. If we are using 24-bit color, the

amount of change will be minimal and indiscernible to the human eye. As an example,

suppose that we have three adjacent pixels (nine bytes) with the following RGB

encoding:

10010101 00001101 11001001

10010110 00001111 11001010

10011111 00010000 11001011

Now suppose we want to "hide" the following 9 bits of data (the hidden data is usually

compressed prior to being hidden): 101101101. If we overlay these 9 bits over the LSB of

the 9 bytes above, we get the following (the LSBs of the second, fourth, fifth, and sixth bytes have changed):

10010101 00001100 11001001

10010111 00001110 11001011

10011111 00010000 11001011

Note that we have successfully hidden 9 bits but at a cost of only changing 4, or roughly 50%, of

the LSBs.
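The worked example above can be reproduced with a minimal MATLAB sketch. It uses the same nine cover bytes and the same nine hidden bits; the variable names are illustrative, and a real tool would also handle image input and output, compression, and encryption, none of which is shown here.

```matlab
% LSB insertion on the nine bytes (three RGB pixels) from the example.
cover = uint8(bin2dec(['10010101'; '00001101'; '11001001'; ...
                       '10010110'; '00001111'; '11001010'; ...
                       '10011111'; '00010000'; '11001011']));
bits  = uint8([1 0 1 1 0 1 1 0 1]).';    % hidden data, one bit per cover byte

% Clear the least significant bit of every byte, then write the data bit.
stego = bitor(bitand(cover, uint8(254)), bits);

% The receiver recovers the data by reading the LSB of every byte.
recovered = bitand(stego, uint8(1));
nChanged  = nnz(stego ~= cover);          % number of bytes actually altered

disp(dec2bin(stego, 8))
fprintf('%d of %d bytes changed\n', nChanged, numel(cover));
```

Because a data bit matches the existing LSB about half of the time, only some of the cover bytes are altered, which is why the example changes 4 of the 9 bytes.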

This description is meant only as a high-level overview. Similar methods can be applied to 8-bit

color but the changes, as the reader might imagine, are more dramatic. Gray-scale

images, too, are very useful for steganographic purposes. One potential problem with any



of these methods is that they can be found by an adversary who is looking. In addition,

there are other methods besides LSB insertion with which to insert hidden information.

Without going into any detail, it is worth mentioning steganalysis, the art of detecting and

breaking steganography. One form of this analysis is to examine the color palette of a

graphical image. In most images, there will be a unique binary encoding of each

individual color. If the image contains hidden data, however, many colors in the palette

will have duplicate binary encodings since, for all practical purposes, we can't count the

LSB. If the analysis of the color palette of a given file yields many duplicates, we might

safely conclude that the file has hidden information.
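The palette check described above can be sketched in MATLAB as follows. This is a simplified heuristic written for illustration, not the method of any particular steganalysis tool: the four-entry palette is hypothetical, and a real palette would come from the image file being examined.

```matlab
% Count palette colors that become identical once the LSB of each
% channel is ignored. The palette below is made up for this sketch.
palette = uint8([ 10  20  30;
                  11  20  30;
                  10  21  31;
                 200 100  50]);

masked          = bitand(palette, uint8(254));     % drop every LSB
nDistinct       = size(unique(masked, 'rows'), 1); % colors left after masking
nNearDuplicates = size(palette, 1) - nDistinct;

fprintf('%d near-duplicate palette entries\n', nNearDuplicates);
```

A large number of near-duplicates is only a hint that data may be hidden; how many duplicates count as suspicious depends on the image and is left open here.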

But what files would you analyze? Suppose I decide to post a hidden message by hiding it in an

image file that I post at an auction site on the Internet. The item I am auctioning is real so

a lot of people may access the site and download the file; only a few people know that the

image has special information that only they can read. And we haven't even discussed

hidden data inside audio files! Indeed, the quantity of potential cover files makes

steganalysis a Herculean task.

A STEGANOGRAPHY EXAMPLE

There are a number of software packages that perform steganography on just about any software

platform; readers are referred to Neil Johnson's list of steganography tools at

http://www.jjtc.com/Steganography/toolmatrix.htm. Some of the better known packages

for Windows NT and Windows 2000 systems include:

Hide4PGP (http://www.heinz-repp.onlinehome.de/Hide4PGP.htm)

MP3Stego (http://www.cl.cam.ac.uk/~fapp2/steganography/mp3stego/)

Stash (http://www.smalleranimals.com/stash.htm)

Steganos (http://www.steganos.com/english/steganos/download.htm)




S-Tools (available from http://www.webattack.com/download/dlstools.shtml)

FIGURE 1. The cover_image (5th wave.gif), hidden_data file (virusdetectioninfo.txt), and stego_key.

The following examples come from Andy Brown's S-Tools for Windows. S-Tools allows users

to hide information in BMP, GIF, or WAV files. The basic scheme of the program is

straightforward; you drag an image or audio file into the S-Tools active window to act as

the cover_medium, drag the hidden_data file onto the cover_medium, and then provide a

stego_key for encryption. The result is the stego_medium. All of this is shown in Figure

1:

1. I highlighted the GIF image file 5th wave.gif and dragged it to the S-Tools active window.

Note that S-Tools reports that up to 138,547 bytes can be hidden in this image file.

2. I next highlighted a 14 KB text file called virusdetectioninfo.txt and dragged it onto the

image file in S-Tools.

3. A dialog box pops up telling me that I am hiding 6,019 bytes of data and asks for a passphrase with

which to encrypt the hidden text; the default secret key crypto scheme used by S-Tools is the

International Data Encryption Algorithm (IDEA).





FIGURE 3. Extracting hidden information from the image file.



4. Once the image file has been received, the user merely drags the file to S-Tools and right-clicks over the image, specifying the Reveal option. A dialog box will pop up requesting the passphrase. Figure 3 shows the information about the hidden archive file, and allows the user to open the file.



Appendix – B

MATLAB

MATLAB is a numerical computing environment and fourth generation programming

language. Developed by The MathWorks, MATLAB allows matrix manipulation, plotting of

functions and data, implementation of algorithms, creation of user interfaces, and interfacing

with programs in other languages. Although it is numeric only, an optional toolbox uses the

MuPAD symbolic engine, allowing access to computer algebra capabilities. An additional

package, Simulink, adds graphical multidomain simulation and Model-Based Design for

dynamic and embedded systems.

In 2004, MathWorks claimed that MATLAB was used by more than one million people

across industry and the academic world.

MATLAB (meaning "matrix laboratory") was invented in the late 1970s by Cleve

Moler, then chairman of the computer science department at the University of New Mexico. He

designed it to give his students access to LINPACK and EISPACK without having to learn

Fortran. It soon spread to other universities and found a strong audience within the applied

mathematics community. Jack Little, an engineer, was exposed to it during a visit Moler made to

Stanford University in 1983. Recognizing its commercial potential, he joined with Moler and

Steve Bangert. They rewrote MATLAB in C and founded The MathWorks in 1984 to continue

its development. These rewritten libraries were known as JACKPAC. In 2000,

MATLAB was rewritten to use a newer set of libraries for matrix manipulation, LAPACK.

MATLAB was first adopted by control design engineers, Little's specialty, but quickly spread to

many other domains. It is now also used in education, in particular the teaching of linear algebra

and numerical analysis, and is popular amongst scientists involved with image processing.




Variables

Variables are defined with the assignment operator, =. MATLAB is dynamically typed,

meaning that variables can be assigned without declaring their type, except if they are to be

treated as symbolic objects, and that their type can change. Values can come from constants,

from computation involving values of other variables, or from the output of a function. For

example:

>> x = 17

x =

17

>> x = 'hat'

x =

hat

>> x = [3*4, pi/2]

x =

12.0000 1.5708

>> y = 3*sin(x)

y =

-1.6097 3.0000

Vectors/Matrices

MATLAB is a "Matrix Laboratory", and as such it provides many convenient ways for

creating vectors, matrices, and multi-dimensional arrays. In the MATLAB vernacular, a vector

refers to a one dimensional (1×N or N×1) matrix, commonly referred to as an array in other

programming languages. A matrix generally refers to a 2-dimensional array, i.e. an m×n array

where m and n are greater than or equal to 1. Arrays with more than two dimensions are referred

to as multidimensional arrays.

MATLAB provides a simple way to define simple arrays using the syntax:

init:increment:terminator. For instance:

>> array = 1:2:9

array =

1 3 5 7 9




defines a variable named array (or assigns a new value to an existing variable with the name

array) which is an array consisting of the values 1, 3, 5, 7, and 9. That is, the array starts at 1 (the

init value), increments with each step from the previous value by 2 (the increment value), and

stops once it reaches (or to avoid exceeding) 9 (the terminator value).

>> array = 1:3:9

array =

1 4 7

The increment value can actually be left out of this syntax (along with one of the colons), to use a

default value of 1.

>> ari = 1:5

ari =

1 2 3 4 5

assigns to the variable named ari an array with the values 1, 2, 3, 4, and 5, since the default value

of 1 is used as the incrementer.

Indexing is one-based, which is the usual convention for matrices in mathematics. This is atypical

for programming languages, whose arrays more often start with zero.

Matrices can be defined by separating the elements of a row with blank space or comma and

using a semicolon to terminate each row. The list of elements should be surrounded by square

brackets: []. Parentheses: () are used to access elements and subarrays (they are also used to

denote a function argument list).

>> A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1]

A =

16 3 2 13

5 10 11 8

9 6 7 12

4 15 14 1

>> A(2,3)

ans =

11

Sets of indices can be specified by expressions such as "2:4", which evaluates to [2, 3, 4]. For

example, a submatrix taken from rows 2 through 4 and columns 3 through 4 can be written as:

>> A(2:4,3:4)

ans =

11 8

7 12

14 1

A square identity matrix of size n can be generated using the function eye, and matrices of any

size with zeros or ones can be generated with the functions zeros and ones, respectively.

>> eye(3)

ans =

1 0 0

0 1 0

0 0 1

>> zeros(2,3)

ans =

0 0 0

0 0 0

>> ones(2,3)

ans =

1 1 1

1 1 1

Most MATLAB functions can accept matrices and will apply themselves to each element. For

example, mod(2*J,n) will multiply every element in "J" by 2, and then reduce each element




modulo "n". MATLAB does include standard "for" and "while" loops, but using MATLAB's

vectorized notation often produces code that is easier to read and faster to execute. This code,

excerpted from the function magic.m, creates a magic square M for odd values of n (MATLAB

function meshgrid is used here to generate square matrices I and J containing 1:n).

[J,I] = meshgrid(1:n);

A = mod(I+J-(n+3)/2,n);

B = mod(I+2*J-2,n);

M = n*A + B + 1;
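To see the excerpt run, the following sketch fixes n = 3 (an arbitrary odd value chosen for illustration) and checks that every row, column, and diagonal of M adds up to the same total. The names target and isMagic are introduced here and are not part of magic.m.

```matlab
% Vectorized magic square for odd n, using the excerpt above with n = 3.
n = 3;
[J, I] = meshgrid(1:n);            % J holds column indices, I holds row indices
A = mod(I + J - (n + 3)/2, n);
B = mod(I + 2*J - 2, n);
M = n*A + B + 1;

% Every row, column and diagonal of a magic square sums to n*(n^2+1)/2.
target  = n*(n^2 + 1)/2;
isMagic = all(sum(M, 1) == target) && all(sum(M, 2) == target) && ...
          sum(diag(M)) == target && sum(diag(flipud(M))) == target;

disp(M)
disp(isMagic)
```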

Semicolon

Unlike many other languages, where the semicolon is used to terminate commands, in MATLAB

the semicolon serves to suppress the output of the line that it concludes.

Graphics

Function plot can be used to produce a graph from two vectors x and y. The code:

x = 0:pi/100:2*pi;

y = sin(x);

plot(x,y)

produces the following figure of the sine function:




Three-dimensional graphics can be produced using the functions surf, plot3 or mesh.

[X,Y] = meshgrid(-10:0.25:10,-10:0.25:10);

f = sinc(sqrt((X/pi).^2+(Y/pi).^2));

mesh(X,Y,f);

axis([-10 10 -10 10 -0.3 1])

xlabel('{\bfx}')

ylabel('{\bfy}')

zlabel('{\bfsinc} ({\bfR})')

hidden off

This code produces a wireframe 3D plot of the two-dimensional unnormalized sinc function.

[X,Y] = meshgrid(-10:0.25:10,-10:0.25:10);

f = sinc(sqrt((X/pi).^2+(Y/pi).^2));

surf(X,Y,f);

axis([-10 10 -10 10 -0.3 1])

xlabel('{\bfx}')

ylabel('{\bfy}')

zlabel('{\bfsinc} ({\bfR})')

This code produces a surface 3D plot of the two-dimensional unnormalized sinc function.

Object-Oriented Programming

MATLAB's support for object-oriented programming includes classes, inheritance, virtual dispatch, packages, pass-by-value semantics, and pass-by-reference semantics.

classdef hello

    methods

        function doit(this)

            disp('hello')

        end

    end

end

When put into a file named hello.m, this can be executed with the following commands:

>> x = hello;

>> x.doit;

hello




Limitations

For a long time there was criticism that because MATLAB is a proprietary product of

The MathWorks, users are subject to vendor lock-in. Recently an additional tool called the

MATLAB Builder under the Application Deployment tools section has been provided to deploy

MATLAB functions as library files which can be used with .NET or Java application building

environment. But the drawback is that the computer where the application has to be deployed

needs MCR (MATLAB Component Runtime) for the MATLAB files to function normally.

MCR can be distributed freely with library files generated by the MATLAB compiler.

MATLAB, like Fortran, Visual Basic and Ada, uses parentheses, e.g. y = f(x), for both

indexing into an array and calling a function. Although this syntax can facilitate a switch

between a procedure and a lookup table, both of which correspond to mathematical functions, a

careful reading of the code may be required to establish the intent.
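The ambiguity can be seen in a small sketch where the same expression f(2) is first an array lookup and then a function call. The name f and the values used are arbitrary choices for this illustration.

```matlab
% The expression f(2) reads the same whether f is a lookup table or a function.
f = [10 20 30];    % f is an array: f(2) indexes the second element
a = f(2);

f = @(x) x.^2;     % f is now a function handle: f(2) calls it with x = 2
b = f(2);

fprintf('lookup: %d, call: %d\n', a, b);
```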

Many functions have a different behavior with matrix and vector arguments. Since

vectors are matrices of one row or one column, this can give unexpected results. For instance,

function sum(A) where A is a matrix gives a row vector containing the sum of each column of A,

and sum(v) where v is a column or row vector gives the sum of its elements; hence the

programmer must be careful if the matrix argument of sum can degenerate into a single-row

array. While sum and many similar functions accept an optional argument to specify a direction,

others, like plot, do not, and require additional checks. There are other cases where MATLAB's

interpretation of code may not be consistently what the user intended (e.g. how spaces

are handled inside brackets as separators where it makes sense but not where it doesn't, or

backslash escape sequences which are interpreted by some functions like fprintf but not directly

by the language parser because it wouldn't be convenient for Windows directories). What might

be considered as a convenience for commands typed interactively where the user can check that

MATLAB does what the user wants may be less supportive of the need to construct reusable

code.
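The pitfall with sum described above can be shown in a few lines. The matrix A is an arbitrary example; the point is that a matrix that happens to have a single row is summed across its elements unless the direction is given explicitly.

```matlab
% sum treats a single-row input differently from a multi-row matrix.
A = [1 2 3; 4 5 6];

colSums = sum(A);      % column sums of a 2-by-3 matrix: [5 7 9]
v       = A(1, :);     % a "matrix" that has degenerated into one row
total   = sum(v);      % sums the elements of the row: 6
safe    = sum(v, 1);   % explicit direction keeps column sums: [1 2 3]

disp(colSums)
disp(total)
disp(safe)
```

Passing the dimension argument whenever the number of rows can vary is one way to keep the behavior predictable.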





Array indexing is one-based, which is the common convention for matrices in

mathematics, but does not accommodate any indexing convention of sequences that have zero or

negative indices. For instance, in MATLAB the DFT (or FFT) is defined with the DC component

at index 1 instead of index 0, which is not consistent with the standard definition of the DFT in

most of the literature. This one-based indexing convention is hard coded into MATLAB, making it

difficult for a user to define their own zero-based or negative indexed arrays to concisely model

an idea having non-positive indices.
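The index shift for the DFT can be seen in a short sketch: the frequency bin usually written as k in the literature is stored at position k + 1 of the vector returned by fft. The input signal is an arbitrary example.

```matlab
% The DFT bin k (k = 0 .. N-1) is stored at index k + 1 in MATLAB.
x = [1 2 3 4];
X = fft(x);
N = numel(x);

k  = 0:N-1;            % zero-based bin numbers as used in the literature
dc = X(k(1) + 1);      % DC component (k = 0) lives at X(1)

% For k = 0 the DFT reduces to the plain sum of the samples.
fprintf('DC component: %g, sum of samples: %g\n', real(dc), sum(x));
```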

Code written for a specific release of MATLAB often does not run with earlier releases

as it may use some of the newer features. To give just one example: save('filename','x') saves the

variable x in a file. The variable can be loaded with load('filename') in the same MATLAB

release. However, if saved with MATLAB version 7 or later, it cannot be loaded with MATLAB

version 6 or earlier. As a workaround, in MATLAB version 7 save('filename','x','-v6')

generates a file that can be read with version 6. However, executing

save('filename','x','-v6') in version 6 causes an error message.

Interactions with other languages

MATLAB can call functions and subroutines written in the C programming language or

Fortran. A wrapper function is created allowing MATLAB data types to be passed and returned.

The dynamically loadable object files created by compiling such functions are termed "MEX-

files" (for MATLAB executable).

Libraries written in Java, ActiveX or .NET can be directly called from MATLAB and

many MATLAB libraries (for example XML or SQL support) are implemented as wrappers

around Java or ActiveX libraries. Calling MATLAB from Java is more complicated, but can be

done with a MATLAB extension, which is sold separately by MathWorks.

Through the MATLAB Toolbox for Maple, MATLAB commands can be called from within the

Maple Computer Algebra System, and vice versa.




Alternatives

MATLAB has a number of competitors.

There are free open source alternatives to MATLAB, in particular GNU Octave, FreeMat, and

Scilab which are intended to be mostly compatible with the MATLAB language (but not the

MATLAB desktop environment). Among other languages that treat arrays as basic entities (array

programming languages) are APL and its successor J, Fortran 95 and 2003, as well as the

statistical language S (the main implementations of S are S-PLUS and the popular open source

language R).

There are also several libraries to add similar functionality to existing languages, such as Perl

Data Language for Perl and SciPy together with NumPy and Matplotlib for Python.

