CHAPTER 5 REVERSIBLE DATA HIDING FOR EMBEDDING CAPACITY ENHANCEMENTshodhganga.inflibnet.ac.in/bitstream/10603/10113/11/11_chapter 5.pdf · REVERSIBLE DATA HIDING FOR EMBEDDING CAPACITY


CHAPTER 5

REVERSIBLE DATA HIDING FOR EMBEDDING CAPACITY ENHANCEMENT

5.1 INTRODUCTION

Having dealt with the first two criteria for data embedding in medical images, namely robustness against attacks and imperceptibility to the human visual system, we turn to the third and final criterion: optimizing the data hiding capacity. This criterion is especially important because it bears directly on medical data management. Embedding data in digital images combines the advantages of data security with efficient memory utilization. However, the embedding procedure distorts the image, and this distortion may render the modified medical image unusable for further diagnosis. Interleaving patient information into the cover medical image, in addition to casting watermarks, may introduce further distortion, which is not acceptable for medical images. Hence, establishing an optimal balance between embedding capacity and image quality is of utmost importance. Reversible data hiding techniques address this problem.

The chapter is organized as follows:

Section 5.2 explains the need for reversible data hiding and its influence on increasing the embedding capacity.

Section 5.3 points out the key parameters to be considered while designing a reversible watermarking scheme for capacity enhancement.

Section 5.4 outlines the issues in capacity enhancement through the reversible embedding and extraction process.

Section 5.5 describes the types of reversible watermarking methods and the motivation for adopting this approach in this work.

Section 5.6 presents the concept of difference expansion of pixels and how it is exploited across the smooth and non-smooth regions of the image to accommodate more data.

Section 5.7 briefly describes the histogram shifting technique and the histogram-based selection of locations for embedding.

Section 5.8 summarizes the chapter.

5.2 NEED FOR REVERSIBLE DATA EMBEDDING

Reversibility is the ability to retrieve the exact original input data after the extraction process. Reversible data hiding embeds an additional message into cover media for which distortion is unacceptable, such as military or medical images, in such a way that the original cover content can be perfectly restored once the hidden message has been extracted.

Reversibility can be used to attach crucial information to the media without changing their original contents. Lossless embedding increases the size of the original image, while lossy embedding cannot be applied in the medical field. Reversible data embedding, also called lossless data embedding, has recently attracted considerable attention.


5.3 KEY PARAMETERS OF REVERSIBLE DATA EMBEDDING

Reversible watermarking is feasible because the original media usually has strong spatial or temporal redundancy. Reversibility is guaranteed if enough free space can be found or created to embed the watermark within the host signal while leaving the characteristics of the host intact. This is possible in an appropriate transform domain using suitable techniques. Several key parameters must be taken into account when trying to increase the embedding capacity at the cost of image quality. A few important ones are outlined below.

5.3.1 Fidelity

Fidelity is a measure of how closely the extracted image resembles the original image. It is an important criterion in medical image processing, since even the slightest visual difference cannot be tolerated. A reversible data embedding scheme should strike a balance between the quantity of data embedded and the image quality. Fidelity concerns the human visual perception of an image: the fidelity between two images is said to be high if the human visual system cannot detect any visible change in the modified image.

5.3.2 Computational Cost

Computational cost is important in real-time applications because it determines the speed at which the data embedding system can operate. Since our application deals with medical data management, computational cost governs the time needed to store patient information and to retrieve it when required.


5.3.3 Efficiency

Efficiency in reversible data embedding depends on the amount of patient information that can be cast into the cover image, measured in bits per pixel (bpp), without affecting the visual quality of the cover image, measured in terms of the peak signal-to-noise ratio (PSNR).
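As an illustration of the two quantities named above, the following minimal Python sketch computes the embedding rate in bpp and the PSNR between a cover image and its modified version. The function names and the representation of an image as a list of rows are assumptions made for this example only; the standard PSNR definition with peak value 2^n - 1 is assumed.

```python
import math

def embedding_rate_bpp(payload_bits, width, height):
    """Embedding rate in bits per pixel for a width x height cover image."""
    return payload_bits / (width * height)

def psnr(original, modified, bit_depth=8):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    peak = (1 << bit_depth) - 1
    squared_error = 0
    pixel_count = 0
    for row_o, row_m in zip(original, modified):
        for p_o, p_m in zip(row_o, row_m):
            squared_error += (p_o - p_m) ** 2
            pixel_count += 1
    if squared_error == 0:
        # Identical images, as expected after reversible extraction.
        return math.inf
    mse = squared_error / pixel_count
    return 10 * math.log10(peak ** 2 / mse)
```

A higher bpp at the same PSNR indicates a more efficient scheme in the sense used here.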

5.3.4 Security

Security is another key factor. It concerns the encryption and decryption schemes employed in the embedding and extraction procedures, their simplicity of implementation and their strength against potential attackers. The strength of the key and the number of encryption rounds are among the main factors that influence the security of the system. Security is essential because patient records in the hospital storage system must not be tampered with or destroyed on any account.

5.3.5 Payload

Data payload refers to the number of bits a watermarking system embeds within a unit of time or within a unit of cover signal. A data embedding scheme that embeds N bits into the cover signal is referred to as an N-bit embedder. The required data payload may differ greatly between applications. Copy protection or copy control applications may require only a few bits of information, while broadcast monitoring may require rates three times larger; in forensic applications, the embedded information should be complete enough to prevent any modification of the content. In medical data management there can be no compromise on the payload, as it may contain the entire patient information, the doctor's diagnosis and the subsequent treatment information, none of which can be selectively ignored.


5.4 ISSUES IN EMBEDDING CAPACITY ENHANCEMENT

Medical images are a very important part of a patient's records, which are stored in hospital databases and may be exchanged between hospitals and health centers. Both patient information and medical images need to be properly organized so as to avoid mishandling and loss of data. In recent years, many hospitals have established electronic medical information systems, so that much of the related medical information has been digitized. Usually, doctors first record treatments by hand and then enter the treatment data into the computer for later reference. All the information is stored in the Hospital Information System (HIS). A disadvantage of the HIS is that medical staff can look up the related reports only at specific locations, which makes data searching inconvenient.

One characteristic that distinguishes medical images from other types of images is the presence of large smooth regions. Taking advantage of this characteristic, the proposed method divides the image into two regions, a smooth region and a non-smooth region, for efficient data hiding in medical images. Many data hiding techniques are used for interleaving patient information with medical images. These techniques can also be used for authentication and tamper detection, to judge the integrity and fidelity of an image. To achieve this, additional data must be concealed in the image besides the patient's data, so the hiding capacity must be high enough to accommodate the payload. On the other hand, reversibility is one of the most important requirements for medical images, which must be kept intact to avoid any misdiagnosis.

Medical images and medical data need extra care. Since medical information requires considerable bandwidth for transmission, the bandwidth can be reduced by embedding the data (patient information) in the medical images. Data hiding in medical images should be reversible in order to preserve high visual quality. Medical data also need high security and authentication; hiding the data inside the medical images makes secure reception possible. One of the outstanding reversible data hiding schemes is difference expansion.

5.5 TYPES OF REVERSIBLE DATA HIDING SCHEMES

A number of reversible data hiding schemes exist, and research in this field remains active. Some of the prominent ones are the regular-singular (R-S) scheme, the difference expansion (DE) scheme, the integer wavelet transform (IWT) based scheme and the patchwork-based schemes. DE is one of the outstanding reversible data hiding schemes. Its main advantage over the other reversible schemes is its high embedding capacity. DE modifies the difference between a pair of pixel values while keeping their average unchanged. The technique divides the image into pairs of pixels and embeds one bit of information into each pair. It has received considerable attention because of its high efficiency and simplicity. Since difference expansion is a spatial domain data hiding approach, its computational complexity is much lower than that of the existing transform domain approaches.

To avoid the drawbacks of the conventional difference expansion method, two types of DE techniques can be used. For the smooth region, a high embedding capacity scheme is applied, while the original DE method is applied to the non-smooth region. The high embedding capacity scheme for smooth blocks is a histogram shifting approach, which avoids the use of a location map. The location map is the data stream used to indicate the positions of the hidden data. For the non-smooth region, the DWT method is used.


Improvements to the original DE proposed by researchers have a twofold aim: first, to make the embedding capacity as high as possible, and second, to make the visible distortion as low as possible. To achieve high embedding capacity, the reviewed schemes adopted three different approaches:

(i) Simplifying the location map in order to increase its compressibility,

(ii) Embedding the payload without a location map, and

(iii) Expanding differences more than once, which allows more data to be embedded.

Meanwhile, the visual quality may be enhanced by:

a. Using a predefined threshold T, and

b. Selecting smooth areas to embed data.

5.6 CONCEPT OF DIFFERENCE EXPANSION

The main objective of the proposed work is to improve data hiding in medical images using the difference expansion method. To increase both embedding capacity and visual quality, a high embedding capacity scheme based on histogram shifting is applied to the smooth region, and a distortionless frequency domain transform based data hiding method is used for the non-smooth region.

The DE embedding technique involves pairing the pixels of the host image I and transforming them into a low-pass image L containing the integer averages l and a high-pass image H containing the pixel differences h. If a and b are the intensity values of a pixel pair, then l and h are defined as

$l = \left\lfloor \frac{a + b}{2} \right\rfloor$ (5.1)

$h = a - b$ (5.2)

This transformation is invertible, so the gray levels a and b can be computed from l and h:

$a = l + \left\lfloor \frac{h + 1}{2} \right\rfloor$ (5.3)

$b = l - \left\lfloor \frac{h}{2} \right\rfloor$ (5.4)
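The forward and inverse transforms in (5.1) to (5.4) can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration. Python's `//` operator performs floor division, which matches the floor brackets in the equations for negative differences as well.

```python
def forward_transform(a, b):
    """Map a pixel pair (a, b) to its integer average l and difference h."""
    l = (a + b) // 2  # (5.1), floor of the mean
    h = a - b         # (5.2)
    return l, h

def inverse_transform(l, h):
    """Recover the pixel pair (a, b) from the integer average and difference."""
    a = l + (h + 1) // 2  # (5.3)
    b = l - h // 2        # (5.4)
    return a, b
```

For example, the pair (3, 8) gives l = 5 and h = -5, and the inverse transform returns (3, 8) again.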

An information bit i is embedded by appending it to the difference h as a new LSB. The watermarked difference is

$h_w = 2h + i$ (5.5)

The resulting pixel gray levels are calculated from the watermarked difference $h_w$ and the integer average l using (5.3) and (5.4). For an image with n-bit pixel representation, the gray levels satisfy $a, b \in [0, 2^n - 1]$ if and only if h and l satisfy the following condition:

$|h| \in R_d(l) = [0, \min(2(2^n - 1 - l),\ 2l + 1)]$ (5.6)

where $R_d(l)$ is called the invertible region. Combining (5.5) and (5.6) gives the condition for a difference h to undergo DE:

$|2h + i| \in R_d(l)$ for $i = 0, 1$ (5.7)
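A minimal Python sketch of embedding one bit into a pixel pair and recovering it is given here. It assumes that the invertibility test in (5.6) and (5.7) is applied to the magnitude of the difference, as in Tian's formulation of DE; the function names are hypothetical and 8-bit pixels are the default.

```python
def invertible_bound(l, n_bits=8):
    """Upper end of the invertible region R_d(l) for n-bit pixels."""
    return min(2 * ((1 << n_bits) - 1 - l), 2 * l + 1)

def is_expandable(l, h, n_bits=8):
    """Expandability: |2h + i| must lie in R_d(l) for both bit values."""
    bound = invertible_bound(l, n_bits)
    return all(abs(2 * h + i) <= bound for i in (0, 1))

def embed_bit(a, b, bit, n_bits=8):
    """Expansion-embed one bit; return the new pair, or None if not expandable."""
    l, h = (a + b) // 2, a - b
    if not is_expandable(l, h, n_bits):
        return None
    hw = 2 * h + bit                     # (5.5)
    return l + (hw + 1) // 2, l - hw // 2

def extract_bit(aw, bw):
    """Return (bit, a, b): the embedded bit and the restored original pair."""
    l, hw = (aw + bw) // 2, aw - bw
    bit = hw % 2   # the appended LSB
    h = hw // 2    # original difference (floor division handles negatives)
    return bit, l + (h + 1) // 2, l - h // 2
```

With the pair (206, 201) and bit 1, the sketch yields (209, 198); extraction returns the bit 1 and the original pair. The integer average is unchanged by the embedding, which is what allows the decoder to invert it.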

This condition is called the expandability condition for DE. A difference that satisfies the expandability condition, given a corresponding integer average, is called an expandable difference. Apart from the DE embedding technique, Tian's algorithm also uses an embedding technique called LSB replacement, in which the LSB of the difference is replaced with an information bit. This is a lossy embedding technique, since the true LSB is overwritten in the embedding process. However, in Tian's scheme, the true LSBs of the differences that are embedded by LSB replacement are saved and embedded with the payload to ensure lossless reconstruction. The LSB of a difference can be flipped without affecting its ability to invert back to the pixel domain if and only if

$\left| 2\left\lfloor \frac{h}{2} \right\rfloor + i \right| \in R_d(l)$ for $i = 0, 1$ (5.8)

This is called the changeability condition. A difference satisfying the changeability condition, given a corresponding integer average, is called a changeable difference. An expandable difference is also a changeable difference. A changeable location remains changeable even after its LSB is replaced, whereas an expandable location may no longer be expandable after DE, although it remains changeable.
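The changeability test and LSB replacement can be sketched as follows. This is an illustration only: it assumes the condition is evaluated on the magnitude of 2⌊h/2⌋ + i, as in Tian's formulation, and all function names are hypothetical.

```python
def invertible_bound(l, n_bits=8):
    """Upper end of the invertible region R_d(l) for n-bit pixels."""
    return min(2 * ((1 << n_bits) - 1 - l), 2 * l + 1)

def is_expandable(l, h, n_bits=8):
    """|2h + i| must lie in R_d(l) for i = 0 and i = 1."""
    bound = invertible_bound(l, n_bits)
    return all(abs(2 * h + i) <= bound for i in (0, 1))

def is_changeable(l, h, n_bits=8):
    """|2*floor(h/2) + i| must lie in R_d(l) for i = 0 and i = 1."""
    bound = invertible_bound(l, n_bits)
    return all(abs(2 * (h // 2) + i) <= bound for i in (0, 1))

def replace_lsb(h, bit):
    """Return (new difference, saved true LSB) after LSB replacement."""
    return 2 * (h // 2) + bit, h % 2
```

The saved LSB returned by `replace_lsb` is what must travel with the payload so that the original difference can be restored.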

Let D be the common domain of the high-pass and low-pass images, H and L, respectively. Each element of D is associated with a difference and an integer average. Expandable locations and changeable locations are subsets of D. The subset of D with changeable differences is denoted by C and is called the set of changeable locations. An important subset of C, containing the locations with expandable differences, is denoted by E and is called the set of expandable locations.

Using a selection criterion that depends on the size of the payload, E is partitioned into E' and the set difference E \ E'. The differences at E' are expansion embedded, while the differences at C \ E' are modified by LSB replacement. To ensure reconstruction, the original LSBs are saved and embedded along with the payload, and a binary location map indicating the selected locations E' is created and compressed losslessly. A bit stream is formed by concatenating the compressed location map, the saved LSBs and the payload, and this bit stream is then embedded into the high-pass image H. The locations of H are traversed in a predefined order, and the bits are embedded into the changeable locations C. The watermarked image is obtained from the modified high-pass image and the low-pass image.
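To make the sets concrete, the sketch below sorts a list of pixel pairs into expandable locations (E), locations that are changeable but not expandable (C \ E), and non-changeable locations. The function name, the list-of-pairs input and the use of magnitude tests are assumptions made for this illustration; selection of E' and construction of the location map are not shown.

```python
def classify_pairs(pairs, n_bits=8):
    """Return index lists (expandable, changeable_only, non_changeable)."""
    expandable, changeable_only, non_changeable = [], [], []
    for idx, (a, b) in enumerate(pairs):
        l, h = (a + b) // 2, a - b
        bound = min(2 * ((1 << n_bits) - 1 - l), 2 * l + 1)
        if all(abs(2 * h + i) <= bound for i in (0, 1)):
            expandable.append(idx)        # member of E (and therefore of C)
        elif all(abs(2 * (h // 2) + i) <= bound for i in (0, 1)):
            changeable_only.append(idx)   # member of C but not of E
        else:
            non_changeable.append(idx)    # outside C, left untouched
    return expandable, changeable_only, non_changeable
```

For the pairs (206, 201), (255, 0), (0, 1) and (255, 255), the first is expandable, the second is changeable only, and the last two are non-changeable because any change to their difference would push a pixel outside [0, 255].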


The original difference expansion (DE) technique thus involves pairing the pixels of the host image and transforming them into a low-pass image containing the integer averages and a high-pass image containing the pixel differences. During embedding, differences are classified into three groups: expandable, changeable and non-changeable. Data bits are embedded only into expandable and changeable differences. A location map is formed to distinguish between the three groups. The map is then compressed, concatenated with the payload and embedded into the image. To avoid the drawbacks of the pairwise DE method, a histogram shifting based difference expansion method is used for part of the image.

5.7 HISTOGRAM SHIFTING

DE of the differences in the selected locations expands the histogram of the inner region, and the modified differences occupy the range $[-2\Delta - 2,\ 2\Delta + 1]$. Comparing this range with the range of the differences that constitute the outer regions, they overlap in the range $[-2\Delta - 2,\ -\Delta - 2] \cup [\Delta + 1,\ 2\Delta + 1]$. An appropriate histogram shift of the outer regions cancels all overlap between the two regions. To achieve this, the negative differences and the nonnegative differences of the outer regions should be shifted left and right, respectively, by at least $\Delta + 1$:

$h_s = \begin{cases} h + \Delta + 1, & \text{if } h > \Delta \\ h - \Delta - 1, & \text{if } h < -\Delta - 1 \end{cases}$ (5.9)

A histogram shift can be easily reversed if $\Delta$ is known:

$h = h_s - \Delta - 1$, if $h_s > 2\Delta + 1$ (5.10)

$h = h_s + \Delta + 1$, if $h_s < -2\Delta - 2$ (5.11)
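The interplay of expansion and shifting can be sketched in Python as below. The sketch works on a single difference value, assumes every difference passed in is expandable (so no overflow check is made), and undoes the shift in (5.9) by adding or subtracting Δ + 1 as appropriate. The function names are hypothetical.

```python
def embed_difference(h, bit, delta):
    """Return (modified difference, True if the bit was embedded)."""
    if -delta - 1 <= h <= delta:
        return 2 * h + bit, True      # inner region: expansion embedding
    if h > delta:
        return h + delta + 1, False   # outer region: shift right
    return h - delta - 1, False       # outer region: shift left

def recover_difference(hm, delta):
    """Return (original difference, extracted bit or None)."""
    if -2 * delta - 2 <= hm <= 2 * delta + 1:
        return hm // 2, hm % 2        # expanded value: split off the LSB
    if hm > 2 * delta + 1:
        return hm - delta - 1, None   # undo the right shift
    return hm + delta + 1, None       # undo the left shift
```

Because the expanded and shifted ranges no longer overlap, the decoder can tell from the value alone whether a difference carries a bit, which is why no location map of the selected locations is needed.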


The discussion on histogram shifting has been restricted to expandable differences lying in the outer regions (i.e., differences outside the range $[-\Delta - 1,\ \Delta]$). Histogram shifting changes these differences by less than DE would. Therefore, it is not necessary to check whether a histogram shift might cause overflow or underflow. Incorporating histogram shifting along with DE also eliminates the need for a location map of the selected expandable locations, since they can be identified at the decoder from the histogram of the differences. Consequently, the amount of auxiliary information embedded is significantly reduced. In addition, histogram shifting is computationally much less intensive than the compression/decompression engine required for a location map.

The smooth regions of the medical image undergo the histogram shifting based difference expansion method. The pixels of each smooth block are divided into pairs, and the average and difference values of each pair are calculated. The data to be embedded are then taken, and the difference values are classified as changeable or expandable. Embedding is done in the LSB position of the expandable difference values. A location map is used to identify the embedded locations. The location map is carried in the LSB positions of those expandable and changeable difference values that are not used for data embedding. To avoid overflow and underflow caused by expansion embedding, histogram shifting is performed. The whole process is governed by a threshold value $\Delta$ (here, $\Delta = 10$). After the data have been embedded in the LSB positions of the histogram-shifted expandable locations, the embedded image is reconstructed.

In the histogram shifting based DE method, once the difference and average values of the pixel pairs have been found, the data bits are embedded in the LSB positions. The embedded data stream contains the original LSBs and the authenticated payload.


5.7.1 Histogram based selection of locations

In the smooth region, the pixels within each 4 × 4 block are equal. Data are embedded in the difference values obtained from pairs of pixels. For embedding, the difference values are separated into expandable and non-expandable values, and data are preferably embedded in the expandable difference values. If the expandable difference values are insufficient, data can also be embedded in the changeable but non-expandable difference values.

To identify and separate the expandable difference values, a histogram of the difference values is plotted. Bins whose differences have smaller magnitude are given preference in the selection process, because the smaller the magnitude of the expandable difference, the smaller the resulting distortion. The locations for embedding are selected by defining non-overlapping regions in the histogram of expandable locations. Upon expansion, the bins corresponding to the selected locations overlap with the bins of the other locations. To compensate for this, a histogram shifting technique is introduced that eliminates all overlap between the bins of the expanded locations and the other bins.

In the difference histogram, difference values with small magnitude are usually observed to occur more frequently. Therefore, the selection of locations for expansion embedding involves setting an appropriate threshold $\Delta \ge 0$ such that $\Delta + 1$ negative and $\Delta + 1$ nonnegative bins of the histogram are selected, giving $2\Delta + 2$ bins. The selected bins have differences in the range $[-\Delta - 1,\ \Delta]$. This selection divides the histogram into non-overlapping inner and outer regions, as shown in Figure 5.1 ($\Delta = 10$).
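One possible way to pick the threshold, sketched here under the assumption that the smallest Δ whose inner region can hold the payload is preferred (the chapter itself fixes Δ = 10), is to accumulate histogram bins outward from zero. The function name and the one-bit-per-difference capacity model are assumptions for illustration.

```python
from collections import Counter

def smallest_threshold(differences, payload_bits):
    """Smallest delta whose inner region [-delta-1, delta] holds the payload.

    Each inner-region difference is assumed to carry one bit. Returns None
    if even the full histogram is too small.
    """
    histogram = Counter(differences)
    if not histogram:
        return None
    max_magnitude = max(abs(h) for h in histogram)
    for delta in range(max_magnitude + 1):
        capacity = sum(count for h, count in histogram.items()
                       if -delta - 1 <= h <= delta)
        if capacity >= payload_bits:
            return delta
    return None
```

Choosing the smallest sufficient Δ keeps the expanded differences small in magnitude, in line with the observation that smaller differences cause less distortion.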


Fig 5.1 Histogram of expandable difference values for a Lena image

5.8 SUMMARY

This chapter introduced the concept of reversible watermarking, or reversible data hiding, for the purpose of increasing the embedding capacity in medical images. It described the need for increasing the embedding capacity and the care that must be taken in doing so, followed by the use of a difference expansion based concept. This concept is used along with histogram shifting, in a hybrid combination with a frequency domain transform, to address all three optimization criteria, namely robustness, imperceptibility and embedding capacity, as discussed in detail in the succeeding chapter.
