
International Journal of Computer Applications (0975 – 8887)

Volume 70– No.18, May 2013

29

Hiding Data in Text using ASCII Mapping Technology (AMT)

Souvik Bhattacharyya
Department of CSE, University Institute of Technology, The University of Burdwan, West Bengal, India

Pabak Indu
Department of CSE, University Institute of Technology, The University of Burdwan, West Bengal, India

Gautam Sanyal
Department of CSE, National Institute of Technology, Durgapur, West Bengal, India

ABSTRACT

In today's digitized world the human race is experiencing a great revolution in Internet technologies; as a result, nearly every piece of information needs to be transmitted through a communication network. This exposes the information to different levels of attack. Such attacks, and the misuse of or unauthorized access to this information, are among the greatest concerns of today's world. Steganography, which means "covered or hidden writing" and originates from the Greek language, is an ancient art of protecting information. A considerable amount of work has been carried out by different researchers on steganography. In this paper the authors propose a novel text steganography method based on ASCII Mapping Technology (AMT) for the English language, producing stego text that is visibly indistinguishable from the original cover text. An extra level of security is achieved through a derived quantum gate. The solution is independent of the nature of the data to be hidden, produces a stego text with minimum degradation, and is applicable to other Indian languages as well. The quality of the stego text is analyzed in terms of the number of bits used for mapping. The efficiency of the proposed method is illustrated by exhaustive experimental results and comparisons.

General Terms

Steganography, Cover Text, Stego Text, Quantum gates, C-NOT gate, SWAP gate, NOT gate.

Keywords

AMT (ASCII Mapping Technology), CS gate, CSN gate, POS file, Jaro-Winkler Distance.

1. INTRODUCTION

In the current world the main concern in transmitting information over a communication network is the protection of data from unauthorized users. To deal with this problem, several information hiding mechanisms exist, such as steganography, cryptography and digital watermarking. The first two techniques provide protection of the data, whereas the third provides authentication of the data using some tag or label on objects such as text, audio, video or images [1]. As the goal of steganography is to hide the presence of a message and to create a covert channel, it can be seen as the complement of cryptography, whose goal is to hide the content of a message. The message is hidden in another medium such that the transmitted data will be meaningful and innocuous looking to everyone. In summary, cryptography attempts to conceal the content of the secret message, whereas steganography conceals its very existence [2, 3]. Steganalysis is the art of detecting any hidden message on the communication channel. If the existence of the hidden message is revealed, the goal of steganography is defeated. A famous illustration of modern-day steganography is Simmons' Prisoners' Problem [4]. Based on this model, if both the sender and receiver share some common secret information, the corresponding protocol is known as secret key steganography, whereas pure steganography means that no prior information is shared by sender and receiver. If the public key of the receiver is known to the sender, the steganographic protocol is called public key steganography [3], [5], [6]. For a more thorough treatment of steganography methodology the reader is referred to [7-8]. Figure 1 below shows the framework of modern-day steganography.

Figure 1: Framework of modern-day Steganography

A message is embedded in a Cover-Object with the help of an embedding algorithm and a key shared by both sender and receiver, resulting in a Stego-Object that appears identical to the Cover-Object. The Stego-Object is transmitted through a communication channel, and the extraction algorithm extracts the secret message from it with the help of the key.

Although all digital file formats can be used for steganography, image and audio files are more suitable because of their high degree of redundancy [21]. Figure 2 below shows the different categories of file formats that can be used for steganography techniques.


Figure 2: Types of Steganography

Among them, image steganography is the most popular. In this technique the secret message is embedded into an image as noise, whose detection is nearly impossible for the human eye [9, 10, 11]. Audio steganography embeds the message into a cover audio file as noise at a frequency beyond the human hearing range [12-13]. A video is essentially a sequence of moving images with some associated audio, so video steganography can be achieved by the same methods. Linguistic steganography, or text steganography, is a major and perhaps the most difficult kind in this category. Text steganography is a method of using written natural language to conceal a secret message, as defined by Chapman et al. [14]. Some steganographic models with high security features have been presented in [15-18].

In steganography two aspects are usually addressed. First, the cover media and stego media should appear identical under all possible statistical attacks. Second, the embedding process should not degrade the media fidelity; that is, the difference between the stego media and the cover media should be imperceptible to the human perceptual system.

This paper is organized as follows. Section 2 discusses related work on text steganography. Section 3 describes the proposed text steganography method. Section 4 presents the mathematical formulation of the processes. The solution methodology is provided in Section 5. Section 6 describes the algorithms. Section 7 contains the analysis of the results. Section 8 compares the proposed method with other existing methods, and Section 9 draws the conclusion.

2. REVIEW OF RELATED WORKS IN TEXT STEGANOGRAPHY

Depending upon the method used, the text steganography can

be classified into three categories as shown in Figure 3.

Figure 3 : Classification of Text Steganography

2.1 Format-Based

Format-based methods change the formatting of the cover text to hide data. They do not alter any sentence or word, so the 'value' of the cover text is unharmed. One format-based text steganography method is the open space method, in which extra white spaces are added to the text to hide information: a single space represents "0" while two consecutive spaces represent "1". Format-based hiding can also be achieved by line and word shifting [19] or by manipulating the white space between words and paragraphs [20]. In the line shifting method, the vertical alignments of some lines of the text are shifted to create a unique hidden shape in which the message is embedded [21].
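The open space method described above can be illustrated with a minimal Python sketch. It assumes that each gap between consecutive words carries one bit (one space for "0", two spaces for "1") and that unused gaps keep a single space; the function names embed_open_space and extract_open_space are hypothetical and not taken from the cited works.

```python
def embed_open_space(cover_words, bits):
    """Join words so that one space encodes '0' and two spaces encode '1'."""
    gaps = len(cover_words) - 1
    if len(bits) > gaps:
        raise ValueError("cover text has too few word gaps for the message")
    pieces = [cover_words[0]]
    for k, word in enumerate(cover_words[1:]):
        if k < len(bits):
            sep = "  " if bits[k] == "1" else " "
        else:
            sep = " "  # gaps beyond the message keep a single space (assumption)
        pieces.append(sep + word)
    return "".join(pieces)


def extract_open_space(stego, n_bits):
    """Read the first n_bits word gaps back as bits."""
    bits = []
    i = 0
    while i < len(stego) and len(bits) < n_bits:
        if stego[i] == " ":
            run = 0
            while i < len(stego) and stego[i] == " ":
                run += 1
                i += 1
            bits.append("1" if run == 2 else "0")
        else:
            i += 1
    return "".join(bits)
```

For example, embedding the bits "101" into the words of "the quick brown fox jumps" doubles the first and third gaps, and the receiver needs to know the message length to stop reading at the right gap.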

2.2 Random and Statistical Generation

Random and statistical generation methods generate the cover text automatically according to the statistical properties of a language. These methods use example grammars to produce cover text in a certain natural language. A probabilistic context-free grammar (PCFG) is a commonly used language model in which each transformation rule of a context-free grammar has a probability associated with it [22]. The quality of the generated stego message depends directly on the quality of the grammars used. Another approach of this type is to generate words that have the same statistical properties, such as word length and letter frequency, as the words in the original message. The words generated in this way often have no lexical value.

2.3 Linguistic Method

The linguistic method [23] considers the linguistic properties of the text in order to modify it, and uses the linguistic structure of the message to hide information. The syntactic method is a linguistic steganography method in which punctuation marks such as the comma (,) and full stop (.) are placed at suitable positions in the document to embed data.

2.4 Changing Alphabet Letter Pattern (CALP)

CALP is a method for text steganography presented in [24, 25]. In this method, considering the structure of the English alphabet, each one or two bits of the binary sequence of the secret message are mapped through small structural modifications of some of the letters of the cover text. The approach relies on structural and feature changes of the cover carrier that are not visibly distinguishable from the original to human beings, and it may be adapted to other Indian languages as well. The solution is independent of the nature of the data to be hidden and produces a stego text with minimum degradation.

2.5 Other Methods

Apart from the methods mentioned above, other methods have been proposed for text steganography, such as feature coding, text steganography by specific characters in words, abbreviations, etc. [26], or by changing word spellings [27]. Further methods include text steganography by inter-word spacing and inter-paragraph spacing [20], text steganography using letter points and extensions [28], and text steganography by the Word Mapping Method (WMM) [29].

3. PROPOSED METHOD OF TEXT STEGANOGRAPHY USING ASCII MAPPING TECHNOLOGY (AMT)

In this paper a new method of text steganography for the English language is introduced. In conventional text steganography methods, visible changes occur between the cover and stego text, but in the AMT approach no visual change appears between the cover and stego text. In this method both the cover text and the secret message are generated by the user. The stego text is formed by mapping the binary sequence of the secret message, through the ASCII mapping technique, onto some letters of the cover text. Figures 4 and 5 below show the mapping sequences for embedding 0s and 1s, respectively. The ASCII mapping technique has been implemented using some unused symbols in the ASCII chart: the method duplicates some of the letters, as shown in Figures 4 and 5, to unused ASCII codes for embedding (mapping) 0s and 1s respectively. Before the mapping operation, the derived quantum logic is used to select the valid embedding (mapping) positions, which adds an extra level of security.

Figure 4: Mapping sequence for embedding ‘0’

Figure 5: Mapping sequence for embedding ‘1’

3.1 Derivation of the Quantum Logic

The authors have derived the quantum logic by combining the C-NOT gate, the SWAP gate and the quantum NOT gate. Figure 6 below illustrates the steps of the derivation.

Figure 6: Derivation steps of the Quantum Logic

As shown in Figure 6, an OR operation is carried out between the C-NOT gate and the SWAP gate to generate the first derived gate, which is in turn logically ORed with the NOT gate to give the desired quantum logic, used to add an extra level of security.

3.2 Extra Level of Security through Quantum Logic

As discussed in the previous section, the quantum logic has been incorporated to provide an extra level of security. The derived quantum logic is placed over the cover text in a grid format; the 1s it generates signify valid embedding positions, whereas the 0s make the embedding positions invalid. Figure 7 shows a sample cover text in which the underlined letters are the embedding positions. After applying the derived quantum logic (shown in Figure 8) to that cover text, some embedding positions are discarded, as marked in Figure 9. The encrypted POS file contains the positional values of the 0s in the secret message; the user can decrypt it and compare it with the secret message extracted from the stego text for authentication. The POS file is encrypted using the integer wavelet transform (discussed in Section 4.2).

Figure 7: A Sample Cover Text

Figure 8: Derived quantum logic

Figure 9: After applying derived quantum logic, circled letters are discarded as embedding positions.
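The masking idea of this section can be sketched in Python under several loudly stated assumptions, since Figures 6 to 9 are not reproduced here. The sketch assumes that "OR" means an element-wise OR of the gate matrices, it stops at the first derived gate (C-NOT OR SWAP) because the way the NOT gate is combined is not recoverable from the text, and it assumes the grid is flattened row by row and repeated along the cover text. The candidate letters are those named in Section 6.1; the names FIRST_DERIVED, MASK and valid_positions are hypothetical.

```python
# Matrix forms of the two-qubit gates (rows and columns ordered 00, 01, 10, 11).
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]


def elementwise_or(a, b):
    """Element-wise OR of two equally sized 0/1 matrices (assumed meaning of 'OR')."""
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# First derived gate only; the further OR with the NOT gate is not reproduced.
FIRST_DERIVED = elementwise_or(CNOT, SWAP)

# Row-wise flattening of the grid into a repeating mask (assumption).
MASK = [bit for row in FIRST_DERIVED for bit in row]

CANDIDATES = set("amnijk")  # letters used for embedding in Section 6.1


def valid_positions(cover, mask=MASK):
    """Indices of candidate letters that fall on a 1 of the repeated mask."""
    return [j for j, ch in enumerate(cover)
            if ch in CANDIDATES and mask[j % len(mask)] == 1]
```

With this placeholder mask, a candidate letter survives only when its index lands on a 1, so a receiver who does not know the grid cannot tell which marked letters carry message bits.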

4. MATHEMATICAL FORMULATION OF THE PROCESSES

This section presents the essential mathematical formulation of the algorithms.

4.1 Quantum Gates

In the quantum circuit model of computation in quantum computing [30], [31], [32], a quantum gate (or quantum logic gate) is a basic quantum circuit that operates on a small number of qubits (quantum bits). Quantum gates are the building blocks of quantum circuits, just as classical logic gates are for conventional digital circuits. Unlike many classical logic gates, quantum logic gates are reversible; classical computing can nevertheless be performed using only reversible gates. Quantum gates are represented as matrices: a gate that acts on k qubits is represented by a $2^k \times 2^k$ unitary matrix, and the number of qubits at the input and output of the gate is equal. Various types of quantum gates exist, including the Hadamard gate, Pauli-X gate, Pauli-Y gate, Pauli-Z gate, phase shift gates, SWAP gate, controlled gates, Toffoli gate and Fredkin gate. Here controlled gates are used to represent the qubits and control the operations. In this paper the authors have used three quantum gates: the NOT gate, the C-NOT gate and the SWAP gate. Figures 10, 11 and 12 show the truth tables of these gates.

Figure 10: NOT gate

Figure 11: C-NOT gate

Figure 12: SWAP gate

A gate can also be written in the form of a matrix or as a graphic. The matrix form lists the rows of the truth table in terms of the basis states |0> and |1>. The matrix is filled with 1s and 0s such that each row and each column contains exactly one 1, which is to be interpreted as a one-to-one mapping of the input to the output. Controlled gates act on two or more qubits, where one or more qubits act as a control for some operation. For example, the controlled NOT gate (C-NOT) acts on two qubits and performs the NOT operation on the second qubit only when the first qubit is |1>, and otherwise leaves it unchanged. The SWAP gate swaps two qubits.
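The truth tables of the three gates can be reproduced from their matrix forms with a short Python sketch. It treats each gate as a permutation matrix acting on classical basis states only (no superpositions), with rows and columns ordered 00, 01, 10, 11; the helper name apply_gate is hypothetical.

```python
NOT = [[0, 1],
       [1, 0]]
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]
SWAP = [[1, 0, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]


def apply_gate(gate, bits):
    """Map a tuple of basis bits through a permutation-matrix gate."""
    n = len(bits)
    column = int("".join(str(b) for b in bits), 2)  # input state as column index
    # The single 1 in that column marks the row of the output state.
    row = next(r for r in range(2 ** n) if gate[r][column] == 1)
    return tuple(int(c) for c in format(row, "0{}b".format(n)))


def is_permutation(gate):
    """Check that every row and every column contains exactly one 1."""
    size = len(gate)
    rows_ok = all(sum(row) == 1 for row in gate)
    cols_ok = all(sum(gate[r][c] for r in range(size)) == 1 for c in range(size))
    return rows_ok and cols_ok
```

Calling apply_gate(CNOT, (1, 0)) flips the second bit because the control bit is 1, while apply_gate(CNOT, (0, 1)) leaves the pair unchanged.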

4.2 Integer Wavelet Transform

The authors have used the integer wavelet transform [33-35] as the encryption function for the POS file, which ensures that the message is sent from an authorized end. The integer wavelet transform through the lifting scheme is an algorithm for calculating wavelet transforms efficiently. It is also a generic method for creating so-called second-generation wavelets, which are much more flexible and can be used to define a wavelet basis on an interval, on an irregular grid, or even on a sphere. The wavelet lifting scheme decomposes the wavelet transform into a set of stages. An advantage of the lifting scheme is that it does not require temporary storage in the calculation steps and needs fewer computation steps. The lifting procedure consists of three phases: (i) the split phase, (ii) the predict phase and (iii) the update phase. Figure 13 shows the lifting scheme forward wavelet transformation.

Figure 13: Lifting Scheme Forward Wavelet Transformation

Splitting: Split the signal x into even samples and odd samples:

\[ x_{even}: \; s_i \leftarrow x_{2i} \]

\[ x_{odd}: \; d_i \leftarrow x_{2i+1} \]

Prediction: Predict the odd samples using linear interpolation:

\[ d_i \leftarrow d_i - \left\{ \frac{s_i + s_{i+1}}{2} \right\} \]

Update: Update the even samples to preserve the mean value of the samples:

\[ s_i \leftarrow s_i + \left\{ \frac{d_{i-1} + d_i}{4} \right\} \]

The output from the s channel provides a low-pass filtered version of the input, whereas the output from the d channel provides a high-pass filtered version of the input. The inverse transform is obtained by reversing the order and the sign of the operations performed in the forward transform. Figure 14 demonstrates the lifting scheme inverse wavelet transformation.

Figure 14: Lifting Scheme Inverse Wavelet Transformation
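The split, predict and update phases and their inversion can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the braces in the lifting steps denote rounding down to an integer, that the update adds the detail term (the mean-preserving form), that the signal has even length, and that missing neighbours at the borders are replaced by the nearest available sample. The names lifting_forward and lifting_inverse are hypothetical.

```python
def lifting_forward(x):
    """One level of an integer lifting transform with linear prediction."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch expects an even-length signal")
    s = x[0::2]  # split: even samples
    d = x[1::2]  # split: odd samples
    n = len(d)
    # predict: subtract the rounded-down mean of the two neighbouring even samples
    d = [d[i] - (s[i] + s[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # update: add a quarter of the neighbouring details to each even sample
    s = [s[i] + (d[max(i - 1, 0)] + d[i]) // 4 for i in range(n)]
    return s, d


def lifting_inverse(s, d):
    """Undo lifting_forward by reversing the order and sign of the steps."""
    n = len(d)
    s = [s[i] - (d[max(i - 1, 0)] + d[i]) // 4 for i in range(n)]
    d = [d[i] + (s[i] + s[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2] = s  # merge even samples back
    x[1::2] = d  # merge odd samples back
    return x
```

Because every step adds or subtracts a quantity computed from values the inverse also has available, the reconstruction is exact for integer input, which is the property an integer wavelet transform needs when it is used as a reversible encoding.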

4.3 Lifting Scheme Haar Transform

In the lifting scheme version of the Haar transform, the prediction step predicts that the odd element will be equal to the even element. The difference between the predicted value (the even element) and the actual value of the odd element replaces the odd element. For forward transform iteration j and element i, the new odd element $odd_{j+1,i}$ would be

\[ odd_{j+1,i} = odd_{j,i} - even_{j,i} \]

In the lifting scheme version of the Haar transform, the update step replaces an even element with the average of the even/odd pair (e.g., the even element $s_i$ and its odd successor, $s_{i+1}$):

\[ even_{j+1,i} = \frac{even_{j,i} + odd_{j,i}}{2} \]

The original value of the $odd_{j,i}$ element has been replaced by the difference between this element and its even predecessor. Simple algebra lets us recover the original value:

\[ odd_{j,i} = even_{j,i} + odd_{j+1,i} \]

Substituting this into the average,

\[ even_{j+1,i} = \frac{even_{j,i} + even_{j,i} + odd_{j+1,i}}{2} \]

\[ even_{j+1,i} = even_{j,i} + \frac{odd_{j+1,i}}{2} \]
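The predict and update steps of the lifting Haar transform can be written as a few lines of Python. The sketch assumes integer input and uses floor division for the halving so that all values stay integers and the transform remains exactly invertible; that rounding choice, and the names haar_forward and haar_inverse, are assumptions for illustration.

```python
def haar_forward(x):
    """One level of the lifting Haar transform on an even-length integer list."""
    even = x[0::2]
    odd = x[1::2]
    # predict: the odd element is replaced by its difference from the even element
    odd = [o - e for e, o in zip(even, odd)]
    # update: even + (new odd) / 2 equals the pair average, rounded down
    even = [e + o // 2 for e, o in zip(even, odd)]
    return even, odd


def haar_inverse(even, odd):
    """Recover the original samples by undoing update, then predict."""
    even = [e - o // 2 for e, o in zip(even, odd)]
    odd = [o + e for e, o in zip(even, odd)]
    x = [0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x
```

For the input [6, 8, 3, 2] the forward step yields the averages [7, 2] (the second rounded down from 2.5) and the differences [2, -1], and the inverse returns the original four samples.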

5. SOLUTION METHODOLOGY

The proposed method consists of two windows, one for the sender side and one for the receiver side. The sender side takes the secret message and the cover text as input and generates the secret key, the stego text and an encrypted POS file as output. The user should be able to generate the secret message and the cover text. The receiver side takes the stego text and the encrypted POS file as input and retrieves the secret message. The user is expected to be familiar with the process of information hiding and to have knowledge of steganography systems. Figures 15 and 16 show the GUI for the sender side and the receiver side of the proposed text steganography system, respectively.

6. ALGORITHMS

This section describes the algorithms for the embedding and extraction processes. The secret message is converted into bits from the ASCII values of its characters, and the secret bits are embedded into the cover text by the embedding algorithm to generate the stego text.

6.1 Algorithm for Embedding

Let COVER be the cover text, STEGO the string that holds the stego text, MSG the binary string of the secret message and N the number of elements in MSG. Initially COVER and STEGO are the same. Four counters i, p, j and r are initialized to 1. POS is an array that contains the positions of the "0" bits of MSG, and ENCRYPT is the function that encrypts a value using the integer wavelet transform. QG contains the values of the derived quantum logic.

Step 1: Generate an appropriate COVER containing the letters "a", "m", "n" and "i", "j", "k". Let l be the size of COVER. Copy the contents of COVER into STEGO.
Step 2: For j = 1 to l
Step 3:     If QG(i) == 1 then
                go to Step 4 and i := i+1
            Else
                j := j+1, i := i+1
Step 4:     If (COVER(j) is "a", "m" or "n" and MSG(r) == "1")
                then STEGO(j) := mapped form of COVER(j) for bit "1" and r := r+1
            Else if (COVER(j) is "i", "j" or "k" and MSG(r) == "0")
                then STEGO(j) := mapped form of COVER(j) for bit "0" and r := r+1
Step 5:     End of If statement
Step 6:     End of If statement
Step 7: End of For loop
Step 8: For i = 1 to N
Step 9:     If (MSG(i) == "0")
                then POS(p) := ENCRYPT(i) and p := p+1
Step 10:    End of If statement
Step 11: End of For loop
Step 12: End
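The embedding steps above can be sketched in Python. Several parts are assumptions and are marked as such: the real substitute characters are defined in Figures 4 and 5, which are not reproduced here, so the sketch uses placeholder code points obtained by adding 128 to each letter's ASCII code (these placeholders are not visually identical to the originals); the mask stands in for QG and is passed in by the caller; and ENCRYPT is omitted, so POS holds plain 1-based positions. The names SUBSTITUTE and embed are hypothetical.

```python
ONE_LETTERS = "amn"   # letters that can carry a "1" bit (Section 6.1)
ZERO_LETTERS = "ijk"  # letters that can carry a "0" bit (Section 6.1)

# Hypothetical stand-in for the mapping tables of Figures 4 and 5.
SUBSTITUTE = {c: chr(ord(c) + 128) for c in ONE_LETTERS + ZERO_LETTERS}


def embed(cover, msg_bits, mask):
    """Return (stego text, positions of the 0 bits) for a bit string."""
    stego = list(cover)
    r = 0  # index of the next message bit to embed
    for j, ch in enumerate(cover):
        if r == len(msg_bits):
            break
        if mask[j % len(mask)] != 1:
            continue  # position rejected by the (placeholder) quantum logic
        bit = msg_bits[r]
        if (bit == "1" and ch in ONE_LETTERS) or (bit == "0" and ch in ZERO_LETTERS):
            stego[j] = SUBSTITUTE[ch]
            r += 1
    if r < len(msg_bits):
        raise ValueError("cover text too short for the message")
    # POS file content before encryption: 1-based positions of the 0 bits.
    pos = [i + 1 for i, b in enumerate(msg_bits) if b == "0"]
    return "".join(stego), pos
```

A cover letter is changed only when it belongs to the group that matches the current bit, so the capacity depends on how often suitable letters appear in the right order in the cover text.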

6.2 Algorithm for Extraction

Let STEGO be the stego text, MSG the binary string of the secret message and N the number of elements in STEGO. Let i and r be counters, with r initialized to 1. POS is an array that contains the positions of the "0" bits of MSG, and DECRYPT is the function that reverses the encrypt action on a value. LENGTH(MSG) gives the length of the secret message. SP is the number of elements in POS.

Step 1: For i = 1 to N
Step 2:     If (STEGO(i) is the mapped form of "a", "m" or "n")
                then MSG(r) := 1 and r := r+1
            Else if (STEGO(i) is the mapped form of "i", "j" or "k")
                then MSG(r) := 0 and r := r+1
Step 3:     End of If statement
Step 4: End of For loop
Step 5: For i = 1 to SP
Step 6:     If (MSG(DECRYPT(POS(i))) == 0)
                then message is authentic.
Step 7:     End of If statement
Step 8: End of For loop
Step 9: End
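A matching Python sketch of the extraction and authentication steps is given below, under the same assumptions as before: the substitute characters are placeholders (ASCII code plus 128) because the actual mapping tables are in Figures 4 and 5, and DECRYPT is omitted, so the POS values are taken to be already decrypted 1-based positions. The names BIT_OF, extract and is_authentic are hypothetical.

```python
# Hypothetical stand-in for the mapping tables of Figures 4 and 5.
SUBSTITUTE = {c: chr(ord(c) + 128) for c in "amnijk"}

# Reverse lookup: which bit each substitute character represents.
BIT_OF = {SUBSTITUTE[c]: "1" for c in "amn"}
BIT_OF.update({SUBSTITUTE[c]: "0" for c in "ijk"})


def extract(stego):
    """Collect one bit for every substitute character found in the stego text."""
    return "".join(BIT_OF[ch] for ch in stego if ch in BIT_OF)


def is_authentic(msg_bits, pos):
    """Check that every decrypted POS entry points at a 0 bit of the message."""
    return all(1 <= p <= len(msg_bits) and msg_bits[p - 1] == "0" for p in pos)
```

Unmodified letters are ignored during extraction, so only the characters replaced by the sender contribute bits, and the POS check fails as soon as one listed position does not hold a 0.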

7. ANALYSIS OF THE RESULT

Three main aspects should be taken into account when discussing the results of the proposed method of text steganography: security, capacity and robustness. The authors simulated the proposed system; the results are shown in Figures 17, 18, 19 and 20. The method offers better embedding capacity and security, and it also supports authentication. It generates the stego text with zero degradation, so the text does not reveal the existence of any hidden data and remains secure against eavesdroppers. The embedding capacity increases as the size of the cover text increases, and the encrypted POS file takes care of authentication.

7.1 Similarity Measure of the Cover Text and Stego Text through Shannon Entropy Measure

Shannon entropy is one of the most important metrics in information theory. Entropy measures the uncertainty associated with a random variable, i.e. the expected value of the information in the message [36]. Shannon entropy allows the average minimum number of bits needed to encode a string of symbols to be estimated, based on the alphabet size and the frequency of the symbols. The Shannon entropy is calculated using the formula:

\[ H(X) = \sum_{i=1}^{n} p(x_i) \, I(x_i) = \sum_{i=1}^{n} p(x_i) \log_b \frac{1}{p(x_i)} = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i) \]

Shannon entropy gives the minimal number of bits per symbol needed to encode the information in binary form. Other measures can be derived from it; one of the simplest is metric entropy, which is the Shannon entropy divided by the string length. Metric entropy helps to assess the randomness of the message. It can take values from 0 to 1, where 1 means an equally distributed random string.
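Both measures can be computed directly from character frequencies, as in the following Python sketch. It assumes that symbol probabilities are estimated from the relative frequency of each character in the text and that the logarithm base is 2; the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text, base=2):
    """Entropy in bits per symbol, from relative character frequencies."""
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def metric_entropy(text, base=2):
    """Shannon entropy divided by the string length."""
    if not text:
        return 0.0
    return shannon_entropy(text, base) / len(text)
```

Comparing shannon_entropy(cover) with shannon_entropy(stego) then gives one number per text; values that are close to each other indicate that the embedding has changed the symbol statistics very little.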

7.2 Similarity Measure of the Cover Text and Stego Text through Correlation

The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient [37], or "Pearson's correlation". It is obtained by dividing the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton. The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship, -1 in the case of a perfect negative (decreasing) linear relationship (anti-correlation), and some value between -1 and 1 in all other cases, indicating the degree of linear dependence between the variables.

As the coefficient approaches zero, there is less of a relationship (closer to uncorrelated). The closer the coefficient is to either -1 or 1, the stronger the correlation between the variables. If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true because the correlation coefficient detects only linear dependencies between two variables. If there is a series of n measurements of X and Y written as $x_i$ and $y_i$, where i = 1, 2, ..., n, then the sample correlation coefficient can be used to estimate the Pearson correlation r between X and Y. The sample correlation coefficient is written as

\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1) \, s_x \, s_y} \]

where $\bar{x}$ and $\bar{y}$ are the sample means of X and Y, and $s_x$ and $s_y$ are the sample standard deviations of X and Y.
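The sample correlation coefficient can be computed as in the Python sketch below. How the cover and stego texts are turned into numeric series is not stated in the text, so the helper text_correlation assumes, purely for illustration, that each character is represented by its code point and that both texts have the same length; both function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # sample standard deviations (divisor n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


def text_correlation(cover, stego):
    """Correlation of two texts via their character codes (assumed encoding)."""
    return pearson([ord(c) for c in cover], [ord(c) for c in stego])
```

Identical texts give a coefficient of 1, and the value drops below 1 as more characters of the stego text differ in code from the cover text.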

Figure 15: GUI for Sender Side


Figure 16: GUI for Receiver Side

Figure 17: Cover text

Figure 18: Secret Message

Figure 19: Unicode Form of The Secret Message

Figure 20: Stego Text

7.3 Similarity Measure of the Cover Text and Stego Text through Jaro-Winkler Distance

To compare the similarity between the cover text and the stego text, the Jaro-Winkler distance has been computed. The Jaro-Winkler distance [38] is a measure of similarity between two strings. It is a variant of the Jaro distance metric [39], [40] and is mainly used in the area of record linkage (duplicate detection). The higher the Jaro-Winkler distance for two strings, the more similar the strings are. The score is normalized such that 0 equates to no similarity and 1 is an exact match. The Jaro distance metric states that, given two strings $s_1$ and $s_2$, their distance $d_j$ is

\[ d_j = \frac{1}{3} \left( \frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m} \right), \]

where m is the number of matching characters and t is the number of transpositions. Two characters from $s_1$ and $s_2$ respectively are considered matching only if they are not farther apart than

\[ \left\lfloor \frac{\max(|s_1|, |s_2|)}{2} \right\rfloor - 1. \]

Each character of $s_1$ is compared with all its matching characters in $s_2$. The number of matching characters that appear in a different sequence order, divided by two, defines the number of transpositions. Table 1 below shows the various similarity coefficients for different sizes of cover text embedded with different secret message sizes.

Table 1: Similarity coefficient between different cover text and stego text
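The Jaro computation described in Section 7.3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `jaro` and `jaro_winkler` are hypothetical, matches are assigned greedily from left to right (an assumption, since the paper does not specify a matching order), and the Winkler prefix adjustment with scaling factor p = 0.1 and a prefix capped at 4 characters is the commonly used form of the metric, which the paper itself does not state.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1 means an exact match."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Matching window: floor(max(|s1|, |s2|) / 2) - 1, never negative.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0
    # Assumption: greedy left-to-right assignment of matches.
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # Walk both strings over their matched characters; positions where
    # the matched characters differ are out of order. Half of them is t.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro score boosted for a shared prefix (assumed standard Winkler form)."""
    dj = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return dj + prefix * p * (1 - dj)


if __name__ == "__main__":
    print(round(jaro("MARTHA", "MARHTA"), 4))          # 0.9444
    print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

For the pair MARTHA and MARHTA, all six characters match (m = 6) and the swapped T and H give one transposition (t = 1), so the Jaro score is (1 + 1 + 5/6)/3. In the setting of this paper, the two arguments would be the cover text and the stego text, and a score close to 1 indicates that the embedding changed the text very little.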

8. COMPARISON WITH OTHER METHODS

This section compares the proposed method with several existing methods: text steganography by changing word spelling [27], inter-word and inter-paragraph spacing [20], text steganography using letter points and extensions [28], and text steganography using CALP [24, 25]. The comparative study in Table 2 shows that the proposed AMT text steganography method performs better than these existing methods in terms of embedding capacity and robustness. The method also supports authentication of the secret message, and it is universal in the sense that it is applicable to other languages. In addition, a technique for measuring the similarity between the cover text and the stego text exists for this method.

9. CONCLUSION

In this paper a novel English text steganography method known as AMT has been presented. The stego text is generated by mapping the binary sequence of the secret message using the ASCII mapping technique. The quantum logic technique used for finding valid embedding positions adds an additional level of security. The method also addresses the authentication problem through the POS file. From Table 1 it has been observed that the AMT method generates the stego text with minimum or zero degradation, as the Shannon entropy, correlation coefficient and Jaro score values are very high. This property enables the method to evade steganalysis. The proposed steganography technique through ASCII mapping is a new approach for English text steganography, and the methodology can be extended to other Indian languages as well.

Table 2: Comparison of the proposed method with others

10. REFERENCES

[1] S. P. Mohanty, "Digital Watermarking: A Tutorial Review", 1999.

[2] Ross J. Anderson and Fabien A. P. Petitcolas, "On the limits of steganography", IEEE Journal on Selected Areas in Communications (J-SAC), Special Issue on Copyright & Privacy Protection, vol. 16, no. 4, pp. 474-481, May 1998.

[3] T. Morkel, J. H. P. Eloff and M. S. Olivier, "An Overview of Image Steganography", in Proceedings of the Fifth Annual Information Security South Africa Conference, 2005.

[4] Gustavus J. Simmons, "The Prisoners' Problem and the Subliminal Channel", in Proceedings of CRYPTO '83, pp. 51-67, Plenum Press, 1984.

[5] R. J. Anderson, "Stretching the Limits of Steganography", in Information Hiding, Springer Lecture Notes in Computer Science, vol. 1174, pp. 39-48, 1996.

[6] Scott Craver, "On Public-key Steganography in the Presence of an Active Warden", in Proceedings of the 2nd International Workshop on Information Hiding, Portland, Oregon, USA, April 1998, pp. 355-368.

[7] W. Bender, D. Gruhl, N. Morimoto and A. Lu, "Techniques for data hiding", IBM Systems Journal, vol. 35, issues 3&4, pp. 313-336, 1996.

[8] N. F. Johnson and S. Jajodia, "Steganography: seeing the unseen", IEEE Computer, pp. 26-34, Feb. 1998.

[9] L. M. Marvel, C. G. Boncelet, Jr. and C. T. Retter, "Spread spectrum image steganography", IEEE Trans. on Image Processing, 8(8), pp. 1075-1083, 1999.

[10] R. Chandramouli and Nasir Memon, "Analysis of LSB Based Image Steganography Techniques", Proc. IEEE ICIP, 2001.

[11] Kevin Curran and Karen Bailey, "An Evaluation of Image Based Steganography Methods", International Journal of Digital Evidence, Fall 2003.

[12] K. Gopalan, "Audio steganography using bit modification", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, 6-10 April 2003, pp. 421-424.

[13] K. Gopalan, "Audio steganography using bit modification", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, 6-10 April 2003, pp. 421-424.

[14] M. Chapman, G. Davida and M. Rennhard, "A Practical and Effective Approach to Large-Scale Automated Linguistic Steganography", Proceedings of the Information Security Conference, October 2001, pp. 156-165.

[15] Souvik Bhattacharyya and Gautam Sanyal, "Study of Secure Steganography model", in Proceedings of the International Conference on Advanced Computing & Communication Technologies (ICACCT-2008), Panipat, India, Nov. 2008.

[16] Souvik Bhattacharyya and Gautam Sanyal, "An Image based Steganography model for promoting Global Cyber Security", in Proceedings of the International Conference on Systemics, Cybernetics and Informatics (ICSCI-2009), Hyderabad, India, Jan. 2009.

[17] Souvik Bhattacharyya and Gautam Sanyal, "Implementation and Design of an Image based Steganographic model", in Proceedings of the IEEE International Advance Computing Conference (IACC-2009).

[18] Souvik Bhattacharyya, Avinash Prasad Kshitij and Gautam Sanyal, "Novel Approach to Develop a Secure Image based Steganographic Model using Integer Wavelet Transform", in Proceedings of the International Conference on Recent Trends in Information, Telecommunication and Computing (ITC 2010). (Indexed by IEEE Computer Society.)

[19] Y. Kim, K. Moon and I. Oh, "A Text Watermarking Algorithm based on Word Classification and Inter-word Space Statistics", Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR'03), 2003, pp. 775-779.

[20] L. Y. Por and B. Delina, "Information Hiding: A New Approach in Text Steganography", 7th WSEAS International Conference on Applied Computer & Applied Computational Science, April 2008, pp. 689-695.

[21] A. M. Alattar and O. M. Alattar, "Watermarking electronic text documents containing justified paragraphs and irregular line spacing", Proceedings of SPIE, Volume 5306, Security, Steganography, and Watermarking of Multimedia Contents VI, June 2004, pp. 685-695.

[22] P. Wayner, "Strong Theoretical Steganography", Cryptologia, XIX(3), July 1995, pp. 285-299.

[23] M. Niimi, S. Minewaki, H. Noda and E. Kawaguchi, "A Framework of Text-based Steganography Using SD-Form Semantics Model", Pacific Rim Workshop on Digital Steganography 2003, Kyushu Institute of Technology, Kitakyushu, Japan, July 3-4, 2003.

[24] Souvik Bhattacharyya, Pabak Indu, Sanjana Dutta, Ayan Biswas and Gautam Sanyal, "Hiding Data in Text Through Changing in Alphabet Letter Patterns (CALP)", Journal of Global Research in Computer Science (JGRCS), vol. 2, no. 3, March 2011.

[25] Souvik Bhattacharyya, Pabak Indu, Sanjana Dutta, Ayan Biswas and Gautam Sanyal, "Text Steganography using CALP with High Embedding Capacity", Journal of Global Research in Computer Science (JGRCS), vol. 2, no. 5, May 2011.

[26] M. H. Shirali-Shahreza and M. Shirali-Shahreza, "Text Steganography in Chat", Proceedings of the Third IEEE/IFIP International Conference in Central Asia on Internet the Next Generation of Mobile, Wireless and Optical Communications Networks (ICI 2007), Tashkent, Uzbekistan, September 26-28, 2007.

[27] Mohammad Shirali-Shahreza, "Text steganography by changing words spelling", in ICACT, 2008.

[28] A. A. Gutub and M. M. Fattani, "Text steganography by using letter points and extensions", World Academy of Science, Engineering and Technology 27, pp. 13-27, 2007.

[29] Souvik Bhattacharyya, I. Banerjee and G. Sanyal, "A novel approach of secure text based steganography model using Word Mapping Method (WMM)", International Journal of Computer and Information Engineering, 4:96-103, 2010.

[30] D. Deutsch, "Quantum theory, the Church-Turing principle and the universal quantum computer", Proc. Roy. Soc. Lond. A, 400, pp. 97-117, 1985.

[31] D. Deutsch, "Quantum computational networks", Proc. Roy. Soc. Lond. A, 425, pp. 73-90, 1989.

[32] R. P. Feynman, "Quantum mechanical computers", Found. Phys., 16, p. 507, 1986.

[33] Geert Uytterhoeven, Dirk Roose and Adhemar Bultheel, "Integer wavelet transforms using the lifting scheme", in CSCC Proceedings, 1999.

[34] W. Sweldens, "The lifting scheme: A construction of second generation wavelets", SIAM J. Math. Anal., 29:511-546, 1997.

[35] R. Calderbank, I. Daubechies, W. Sweldens and B. L. Yeo, "Wavelet transforms that map integers to integers", Appl. Comput. Harmon. Anal., 5:332-369, 1998.

[36] C. E. Shannon, "A Mathematical Theory of Communication", The Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, October 1948.

[37] S. Dowdy and S. Wearden, Statistics for Research, Wiley, ISBN 0471086029, p. 230, 1983.

[38] W. E. Winkler, "The state of record linkage and current research problems", Statistics of Income Division, Internal Revenue Service Publication R99/04, 1999.

[39] M. A. Jaro, "Advances in record linking methodology as applied to the 1985 census of Tampa, Florida", Journal of the American Statistical Association, 84:414-420, 1989.

[40] M. A. Jaro, "Probabilistic linkage of large public health data file", Statistics in Medicine, 14(5-7), pp. 491-498, 1995.