

Chapter 7

Content Based Image Retrieval (CBIR)

In chapter 1, an introduction to “Content Based Image Retrieval (CBIR)”

was given. Some applications of CBIR and related problems and issues were also

discussed. Several tools and techniques are being used in the development of CBIR

systems to enhance the capabilities for better image retrieval at a higher level of

semantics.

Most of the image retrieval techniques presented in the literature are not robust: their retrieval performance degrades when the input (query) images are altered.

Wavelets have found considerable application in image retrieval. In recent years, wavelets and related transforms such as curvelets have been widely used in the development and implementation of efficient image retrieval systems [Zho2009].

In this chapter, the role and effectiveness of wavelets in the development of

robust image retrieval schemes, which can perform better against several input

image alterations, is studied. Efforts have been made to utilize wavelets in

extracting “color and texture based features of the images” and their performance

is presented in comparison to some well-known standard color and texture

feature extraction schemes.

7.1 Wavelets in CBIR

Wavelets offer advantages of multi-resolution representation and space-

frequency localization. “They have capability to capture image details in different

directions viz., horizontal, vertical and diagonal”. Curvelets are excellent in

capturing image details along the curvatures at various resolution levels. Wavelets

have also proved to be good for texture based image retrieval.

“A wavelet-based salient point extraction algorithm for CBIR is proposed in

[Tia2001]. In this scheme, the color and texture information in the locations given


by these points is extracted which provides significant improvement in the

retrieval results”.

“A CBIR system named SIMPLIcity (Semantics sensitive Integrated

Matching for Picture LIbraries) is proposed in [Wan2001]. It uses semantics

classification methods, a wavelet-based approach for feature extraction, and

integrated region matching based upon image segmentation. The system classifies

images into semantic categories, such as textured-non textured, graph-

photograph. Potentially, the categorization enhances retrieval by permitting

semantically-adaptive searching methods and narrowing down the searching

range in a database. The proposed system also shows robustness to image

alterations”.

“A wavelet based texture feature extraction method for CBIR applications is

presented in [Vad2004]. This method uses standard deviations of high frequency

components of Daubechies wavelet transform as texture features. A total of nine

features are obtained from three levels of decomposition”.

“A simple wavelet based CBIR system is presented in [Chi2006]. In this

scheme, wavelet coefficients of each image are stored. In the image retrieval

phase, the system compares the most significant wavelet coefficients of the Y, U

and V components of the query image with those of the images in the database,

coupled with the weight factors assigned by users, and finds out the matches

based on the features of interest to users”.

“An algorithm for texture feature extraction using wavelet decomposed

coefficients of an image and its complement is presented in [Hir2006]. Four

different approaches (Multispectral approach in RGB, HSV, YCbCr space and Gray

scale texture feature) to color texture analysis are tested on the classification of

images from the VisTex database”.

“A novel approach for rotation invariant texture image retrieval is

presented in [Kok2006]. This method uses set of dual-tree rotated complex

wavelet filter (DT-RCWF) and DT complex wavelet transform (DT-CWT) jointly in

12 different directions. The DT-RCWFs are non-separable and oriented, which

improves characterization of oriented textures”.


“Representation of local properties in an image is an important issue in

CBIR. The work in [Muw2007], proposes a salient region detector based on

wavelet transform. The detector can extract visually meaningful regions of an

image and reflect local characteristics”.

“A CBIR method for a diagnosis aid in medical fields is proposed in [Lam2007]. To extract texture features, Gaussian curves are fitted to the distribution of wavelet coefficients at different levels of decomposition. Only a few parameters defining the fitted curves are stored as image signatures in the feature database. The wavelet function is adapted using the lifting scheme, and retrieval efficiency is reported for different databases, including a diabetic retinopathy, a mammography and a face database”.

“A wavelet based retrieval scheme for trade mark images without

complicated image segmentation is presented in [Muw2008]. The proposed

trademark image retrieval scheme comprises two stages: first, edge detection

based on wavelet transform is performed on the trademark image; second, novel

wavelet-based shape features are introduced to reflect the edges’ characteristics”.

“A CBIR method based on an efficient combination of multi-resolution color

and texture features is proposed in [Chu2008]. In this scheme, color auto-

correlograms of hue and saturation component images are used as color features

and BDIP (block difference of inverse probabilities) and BVLC (block variation of

local correlation coefficients) moments of the value component images are used

for texture features”.

“A CBIR method which uses multi-wavelet transform is presented in

[Xiw2008]. Multi-wavelets are used to improve the shape and texture features

extraction and decrease the operands. Theoretical analysis and the numerical and

experimental results show that the multi-wavelet transform based image retrieval

can get good results for image retrieval”.

“The phase often holds crucial information about image structures and

features. However, only the real part or the magnitude of the transform

coefficients is typically used for image processing applications. A method for the

feature extraction of images called Phase-based LBP is presented in [Ngu2010].


Proposed method is based on the combination of phase of complex wavelet

coefficients and the Local Binary Pattern operator (LBP)”.

“A scheme of color image retrieval, based on wavelet transform and G-

Regions of Interest (GROI) is presented in [Xug2010]. The images are first

represented in HSV color space. Areas of interest are then extracted by using K-means

clustering in the wavelet domain. The energies of the wavelet coefficients in

these areas of interest are used as a texture feature. For the color feature, mean and

variance are used. The barycentric coordinates are also utilized as position

features”.

“An image retrieval method using the Daubechies wavelet and stage

treatment is proposed in [Wan2010]. This algorithm decomposes the color

information of the image using Daubechies wavelet and constructs the eigenvector

using both the low frequency component and the high frequency component.

Three features: variance, invariant moment and angle of the eigenvector are used

to compare the similarity between the retrieval image and the images in the

database”.

“A new image indexing and retrieval algorithm by integrating color and

texture features is proposed in [Red2012]. Histograms are constructed from HSV

space and used as color feature. For texture feature, the color image is converted

into grayscale and divided into eight binary bit-planes. The Binary Wavelet

Transform (BWT) is applied on each bit-plane and the local binary pattern (LBP)

features are extracted from the resultant BWT sub-bands”.

“The use of pyramidal and tree structured wavelet features using 8-tap

Daubechies coefficients is proposed in [Kok2013] for texture analysis along with

extensive experimental evaluation. Comparison with various features indicates

that the combination of energy and standard deviation of wavelet features provide

good pattern retrieval accuracy for tree structured wavelet decomposition while

standard deviation alone gives better result in pyramidal wavelet decomposition”.

“Various other techniques have been used to improve retrieval rates along

with wavelet decomposition. These techniques mainly focus on use of relevance

feedback [Bul2011], use of energy distribution pattern, statistical properties,

various color models, texture analysis [Tam1978] etc. to extract details at various


levels of resolution. Along with simple wavelets decomposition, use of some

advanced versions of wavelets such as contourlets [Rao2007], curvelets

[Sum2008], ridgelets [God2010] etc. is also reported in various methods. A

comparative literature survey of use of wavelet based techniques in CBIR is given

in table 7.1”.

From table 7.1 it can be observed that researchers have utilized the strengths of wavelets in various ways. The majority of the work focuses on texture and color based feature extraction. In almost all the schemes, energy, standard deviation or other higher order moments of wavelet coefficients are used for texture feature extraction. Detailed analysis of the properties of wavelets and effective utilization of their strengths has not been addressed adequately.

The majority of the works do not report the robustness of the proposed CBIR scheme against image alterations and noise; the few exceptions include [Wan2001, Kok2006, Xiw2008]. The work of [Wan2001] shows the robustness of the CBIR scheme against various image processing alterations applied to the query images, such as cropping and intensity variations. The works of [Kok2006, Xiw2008] consider only rotation invariance. Wavelets can capture the edge details of images even in the presence of noise, which can help in providing robustness against noise. The multi-resolution feature of wavelets can provide robustness against cropping and scaling. Therefore, there is motivation to develop a robust CBIR system which performs well even when the input image is altered.

In this chapter, an attempt has been made to develop a robust CBIR

scheme, “which utilizes the strength of multi resolution capability of wavelets and

strength of edge histogram descriptor”. This scheme extracts both color and

texture features with help of wavelet coefficients.


7.2 Proposed CBIR Scheme Based on Multi-Resolution Wavelet Transform and Edge Histogram

Wavelet transform analysis is often used for extracting features of an image in a few directions. The high frequency detail coefficients represent image details in the horizontal, vertical and diagonal directions, while the approximation coefficients represent the overall image content. The strength of wavelet based image representation is its multi-resolution nature, which emulates the way the human visual system observes coarser and finer details of an image. “The orientations of various edges in an image can be captured by finding the edge histogram with the help of the Edge Histogram Descriptor (EHD) of the MPEG-7 standard [Won2002]. Various CBIR techniques use EHD for image retrieval based on shape and texture features”. In the proposed CBIR method, the multi-resolution capability of wavelets and the edge capturing capability of EHD are combined to achieve higher retrieval rates.
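As a rough illustration of the sub-bands mentioned above, the following minimal sketch performs one level of a 2-D wavelet decomposition in plain Python. It uses the Haar filter only because it is the simplest wavelet to write out by hand (the texture scheme of this chapter uses Db10), and the function names are hypothetical rather than taken from the thesis implementation.

```python
import math


def haar_step_1d(values):
    """One Haar level on an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def _filter_columns(matrix):
    """Apply haar_step_1d down every column; return (low, high) matrices."""
    low_cols, high_cols = [], []
    for column in zip(*matrix):
        a, d = haar_step_1d(list(column))
        low_cols.append(a)
        high_cols.append(d)
    # Transpose back so that rows are rows again.
    low = [list(row) for row in zip(*low_cols)]
    high = [list(row) for row in zip(*high_cols)]
    return low, high


def haar_step_2d(image):
    """One 2-D Haar level on a matrix with even height and width.

    Returns (approximation, horizontal, vertical, diagonal) sub-bands,
    each half the size of the input in both directions.
    """
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_step_1d(row)
        low_rows.append(a)
        high_rows.append(d)
    # Low-pass along rows, then low/high along columns.
    approximation, horizontal = _filter_columns(low_rows)
    # High-pass along rows, then low/high along columns.
    vertical, diagonal = _filter_columns(high_rows)
    return approximation, horizontal, vertical, diagonal
```

Applying `haar_step_2d` again to the approximation sub-band gives the next, coarser resolution level, which is how a multi-level decomposition is built up.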

Color is another important and primary feature used by CBIR systems. Various color features defined by the MPEG-7 standard are used in practice, such as “Color Layout Descriptor (CLD), Scalable Color Descriptor (SCD), Color Structure Descriptor (CSD), Dominant Color Descriptor (DCD)” etc. Other color descriptors such as the Color Correlogram [Jin1997] are also popular. These color descriptors extract color features in the RGB, HSV, YCbCr and HMMD color spaces.

“The HSV color space is a popular choice for manipulating colors. It was developed to provide an intuitive representation of color and to approximate the way in which humans perceive and manipulate color. The HSV color space is used by the Scalable Color Descriptor (SCD), one of the widely used MPEG-7 descriptors for color feature extraction in CBIR systems. In the proposed CBIR scheme, a modified SCD is implemented, which is simple but effective as it utilizes full 256-bin color histograms without any quantization”.

This part of the thesis is presented and published in: C. Patvardhan, A. K. Verma and C. V. Lakshmi, ‘A Robust Content Based Image Retrieval Based On Multi-Resolution Wavelet Features and Edge Histogram’, 2nd IEEE International Conference on Image Information Processing (ICIIP), pp. 447-452, Jaypee University of Information & Technology, Shimla, Dec. 9 - 11, 2013.


7.2.1 Proposed Methodology

In the proposed scheme, the effectiveness of wavelet transform is utilized

in both color and texture based feature extraction. Edge histogram is also utilized

along with wavelet transform for texture based feature extraction. These feature

extraction schemes are described next.

7.2.1.1 Color Feature Extraction Scheme

To extract color features of images, a modified version of the Scalable Color Descriptor (SCD) of the MPEG-7 standard is utilized. This version is easier to implement as the quantization steps of SCD are omitted. “The Haar wavelet transform is used to reduce the dimension of feature vector”. The overall scheme is shown in figure 7.1.

Like SCD, the proposed scheme utilizes the HSV color space. In SCD, the histograms are quantized and represented in only 16, 4 and 4 bins for H, S and V respectively. In the proposed scheme, full 256-bin histograms are computed for each of the H, S and V color channels, collecting more color detail at this level. At this stage, the total length of the color feature vector is 3 × 256 = 768. The “Haar transform” is used to reduce this number.

In the HSV color model, ‘H’ is Hue, which represents the distinct color and is therefore more important than ‘S’ and ‘V’. ‘S’ is Saturation, which represents the amount of color, and ‘V’ is Value, which represents the brightness of the color. Therefore, “the images can be identified on the basis of Hue values”. The histogram of Hue is thus decomposed only up to level 2, which retains more coefficients, while the histograms of S and V are decomposed up to level 3.

In Haar wavelet decomposition, the approximation coefficients are

considered because they preserve the fundamental shape of histogram curve at

several levels of decomposition. High frequency detailed coefficients are ignored

as they only contribute fast color changes of noisy nature. The steps are depicted

in figure 7.1 and are explained as follows.

i) Convert RGB input image (𝐼) into HSV color space.

ii) Compute full 256 bins histograms of each color channel H, S and V.


iii) Perform 2-level Haar wavelet decomposition for H-histogram (𝐻ℎ)

and 3-level Haar wavelet decomposition for S-Histogram (𝐻𝑠) and V-

Histogram (𝐻𝑣).

iv) Select only approximation coefficients of these wavelet

decompositions and concatenate them to form final color feature

vector (𝑓𝑣𝑐).

The size of each histogram is 256. The 2-level wavelet decomposition of 𝐻ℎ

and 3-level decomposition of 𝐻𝑠 and 𝐻𝑣 result in approximation coefficient vectors

of lengths 64, 32 and 32 respectively. Therefore, the total length of color feature

vector becomes 128.

In the proposed method, the Haar wavelet is used because the objective here is only to reduce the size of the feature vector. Also, the Haar filter is the shortest wavelet filter, leading to a very fast method of color feature extraction.
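The four color feature steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the image is assumed to be an iterable of (r, g, b) tuples in the range 0..255, the histograms are raw counts (the text does not say whether they are normalized), the standard library `colorsys` module stands in for the RGB to HSV conversion, and all function names are hypothetical.

```python
import colorsys
import math


def channel_histograms(rgb_pixels, bins=256):
    """Return [H, S, V] histograms with `bins` bins each (raw counts)."""
    hists = [[0] * bins for _ in range(3)]
    for r, g, b in rgb_pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Values lie in [0, 1]; clamp so that 1.0 falls in the last bin.
            index = min(int(value * bins), bins - 1)
            hists[channel][index] += 1
    return hists


def haar_approximation(signal, levels):
    """Keep only the Haar approximation coefficients after `levels` steps."""
    out = list(signal)
    s = math.sqrt(2.0)
    for _ in range(levels):
        out = [(out[i] + out[i + 1]) / s for i in range(0, len(out), 2)]
    return out


def color_feature_vector(rgb_pixels):
    """Concatenate level-2 H, level-3 S and level-3 V approximations."""
    h_hist, s_hist, v_hist = channel_histograms(rgb_pixels)
    return (
        haar_approximation(h_hist, 2)    # 256 / 4 = 64 values
        + haar_approximation(s_hist, 3)  # 256 / 8 = 32 values
        + haar_approximation(v_hist, 3)  # 256 / 8 = 32 values
    )
```

Each decomposition level halves the histogram length, so the result has 64 + 32 + 32 = 128 values, matching the feature length stated above.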

Fig. 7.1: Wavelet based color feature extraction scheme


7.2.1.2 Texture Feature Extraction Scheme

Wavelets are good at capturing details of an image in “horizontal, vertical and diagonal directions” even in the presence of noise, where EHD features alone do not perform well. Wavelets can also capture these details at various resolution levels, similar to the human visual system, which EHD alone cannot do. Therefore, an attempt is made to combine their capabilities to achieve better retrieval results. In the proposed scheme, major image details are captured by the wavelet transform of the image at three resolution levels. Dominant edges are then captured by EHD to form the feature vector. The overall scheme is shown in figure 7.2.

Fig. 7.2: Wavelet and EHD based texture feature extraction scheme

The edge details of the wavelet coefficients are captured by calculating the edge histogram, as shown in figure 7.2. The edge histogram is calculated by the method already discussed in section 3.3.1.1 of chapter 3. However, 5 bins are added as global features of the image. These 5 additional bins are obtained by averaging the 16 sets of 5 bins that represent local image properties. Therefore, the length of the edge histogram vector becomes 85 instead of 80. As can be seen from figure 7.2, the edge histograms are calculated for the wavelet coefficients at all three levels. Therefore, the total length of the texture feature vector becomes 3 × 85 = 255. The steps are as follows.

i) Decompose image (𝐼) up to level 3 and collect all the wavelet

coefficients at each resolution level.

ii) Arrange wavelet coefficients in a matrix 𝐶𝑚 for each level of

decomposition as shown in figure 7.2.

iii) Find Edge Histogram of matrix 𝐶𝑚 for each level of decomposition. The

edge histogram generates 85 values of texture features. Therefore, for

all the three levels, 255 values are generated.

iv) Concatenate all the edge histogram values to form a single texture

feature vector (𝑓𝑣𝑡) and store it in the feature database.
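The bookkeeping behind the 85 and 255 values can be sketched as below. The sketch assumes the 16 local 5-bin histograms for each decomposition level have already been computed by the edge histogram method of chapter 3, which is not reproduced here. Placing the 5 global bins after the 80 local bins is an assumption, and the function names are hypothetical.

```python
def extend_edge_histogram(local_bins):
    """Turn 16 local 5-bin histograms into one 85-value vector.

    The 5 global bins are the averages of the corresponding local bins
    over all 16 sub-images; they are appended after the 80 local values.
    """
    if len(local_bins) != 16 or any(len(block) != 5 for block in local_bins):
        raise ValueError("expected 16 sub-image histograms of 5 bins each")
    global_bins = [
        sum(block[k] for block in local_bins) / 16.0 for k in range(5)
    ]
    flat_local = [value for block in local_bins for value in block]
    return flat_local + global_bins


def texture_feature_vector(per_level_local_bins):
    """Concatenate the 85-value histograms of all decomposition levels."""
    vector = []
    for local_bins in per_level_local_bins:
        vector.extend(extend_edge_histogram(local_bins))
    return vector
```

With three decomposition levels the returned vector has 3 × 85 = 255 values.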

In the proposed method, the Db10 wavelet is used for texture feature extraction. This is a higher order smooth wavelet, which is capable of capturing fast changing (high frequency) details of images. Chapter 6 has also shown that Db10 performs better than other wavelets for capturing edge details of characters. Therefore, Db10 is utilized in the proposed CBIR scheme, as image details are captured mainly in the form of edges.

7.2.2 Experimental Results and Analysis

The proposed scheme is tried on images of various categories from two popular databases: Wang’s image database [Wan2013] and the Microsoft Research Cambridge Object Recognition Image Database [Mic2013]. “These are used for training and testing the performance of the proposed scheme. Wang’s image database is used in the implementation of the SIMPLIcity software package [Wan2001] and is also used by the majority of researchers to show the performance of their proposed CBIR schemes”.

“The Wang’s image database consists of 1000 images having 10 categories

of 100 images each. These 10 categories and their sample images are shown in

table 7.2”.

Table 7.2: Wang’s Image Database (categories; sample images not reproduced)

Africans, Beaches, Buildings, Buses, Dinosaurs, Elephants, Roses, Horses, Mountains, Food

Table 7.3: Microsoft Research Image Database (categories; sample images not reproduced)

Aero planes, Cow, Sheep, Bi-Cycles, Cars, Chimneys, Clouds, Doors, Cutlery, Landscapes, Trees, Windows


Each image in Wang’s database is an RGB color image of size 288 × 384 or 384 × 288. Microsoft’s image database consists of 4129 images in a total of 18 categories. Unlike Wang’s database, the images are not divided equally among the categories: some categories have many images while others have very few. Therefore, 12 categories are selected for testing. These categories along with sample images are shown in table 7.3. Each image in Microsoft’s database is an RGB color image of size 640 × 480 or 480 × 640.

Before testing the proposed scheme, it is trained with images of Wang’s and

Microsoft’s image databases. The training procedure is shown in figure 7.3.

Fig. 7.3: Training procedure for proposed CBIR scheme

In the training procedure, “both color and texture features are calculated for each image in the database and these feature vectors are stored in database”. Similarly, for testing, “both the color and texture feature vectors are calculated for the query image and compared with feature vectors already stored in database”. The first ′𝑛′ nearest matches are then displayed as retrieval results. The testing procedure is shown in figure 7.4. “The similarity check finds distance between feature vector of query image and feature vectors stored in the database”. In the proposed scheme, this distance is the sum of absolute differences between the feature vectors, calculated using equation 6.1 of chapter 6; the nearest match is the database image with the smallest distance.
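The similarity check can be sketched as a nearest-neighbour ranking. Equation 6.1 is not reproduced in this chapter, so the sketch simply uses the plain sum of absolute differences named in the text. The feature database is assumed to be a dictionary from image id to feature vector, and the function names are hypothetical.

```python
def sum_abs_diff(u, v):
    """Sum of absolute differences between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def nearest_matches(query_vector, feature_db, n):
    """Return the ids of the n database images closest to the query.

    feature_db maps an image id to its stored feature vector; a smaller
    distance means a closer match.
    """
    ranked = sorted(
        feature_db,
        key=lambda image_id: sum_abs_diff(query_vector, feature_db[image_id]),
    )
    return ranked[:n]
```

For example, `nearest_matches([1, 1], {"a": [0, 0], "b": [1, 1], "c": [5, 5]}, 2)` ranks `"b"` (distance 0) ahead of `"a"` (distance 2) and drops `"c"` (distance 8).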


Fig. 7.4: Testing procedure for proposed CBIR scheme

To test the performance of the proposed CBIR scheme, the following 4 test

procedures are followed.

Procedure 1 (P1): Only color features are used.

Procedure 2 (P2): Only texture features are used.

Procedure 3 (P3): A two-stage classification process which utilizes both the color and texture features. The color feature of the query image is used for the primary classification; the texture feature is then used to select the final results from the results of the primary classification.

Procedure 4 (P4): Also a two-stage classification process in which both features are used. Here, the texture feature of the query image is used for the primary classification; the color feature is then used to select the final results from the results of the primary classification.
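The two-stage procedures P3 and P4 can be sketched as a rank-then-re-rank operation. The text does not state how many results of the primary classification are passed to the second stage, so `shortlist_size` below is a hypothetical parameter. The dictionary layout and function names are likewise assumptions made for illustration.

```python
def sum_abs_diff(u, v):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def two_stage_retrieve(query, database, first, second, n, shortlist_size):
    """Rank by the `first` feature, then re-rank the shortlist by `second`.

    query and every database entry are dicts holding "color" and "texture"
    feature vectors. shortlist_size (an assumed parameter) is the number of
    primary results handed to the second stage.
    """
    def rank(image_ids, feature):
        return sorted(
            image_ids,
            key=lambda i: sum_abs_diff(query[feature], database[i][feature]),
        )

    shortlist = rank(database, first)[:shortlist_size]
    return rank(shortlist, second)[:n]


def retrieve_p3(query, database, n, shortlist_size):
    """P3: color first, then texture."""
    return two_stage_retrieve(query, database, "color", "texture", n, shortlist_size)


def retrieve_p4(query, database, n, shortlist_size):
    """P4: texture first, then color."""
    return two_stage_retrieve(query, database, "texture", "color", n, shortlist_size)
```

In this sketch the first stage decides which images are eligible at all and the second stage only reorders them, so the feature used first acts as the main search criterion.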

“The results of relevant image retrieval are expressed in terms of Precision and Recall, which are adopted by the majority of researchers. Precision and Recall are defined as follows.

$$\text{Precision} = \frac{N_o}{N_r} \tag{7.1}$$

$$\text{Recall} = \frac{N_o}{N_d} \tag{7.2}$$

$N_o$: Number of relevant images retrieved.

$N_r$: Total number of images requested.

$N_d$: Total number of relevant images in database.

Both Precision and Recall are expressed in percentage terms. Larger values of both precision and recall represent good performance of the CBIR system”.
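Equations 7.1 and 7.2 can be computed directly from a list of retrieved image ids and the set of relevant ids. The following is a minimal sketch; the function name and the id-based representation are assumptions made for illustration.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) in percent.

    retrieved_ids: the N_r images returned for the query.
    relevant_ids:  the N_d images in the database that belong to the
                   query's category.
    """
    relevant = set(relevant_ids)
    if not retrieved_ids or not relevant:
        raise ValueError("both id collections must be non-empty")
    n_o = sum(1 for image_id in retrieved_ids if image_id in relevant)
    precision = 100.0 * n_o / len(retrieved_ids)
    recall = 100.0 * n_o / len(relevant)
    return precision, recall
```

For instance, if 46 of 50 requested images belong to a category of 100 images, the precision is 92% and the recall is 46%.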

To test the retrieval performance, 10 test images from Wang’s database and 12 test images from Microsoft’s database are randomly selected, one from each category of both databases. These test images are used as query images and are shown in figure 7.5(a) and (b).

Africans (3.jpg) Beaches (158.jpg) Buildings (244.jpg) Buses (314.jpg) Dinosaurs (445.jpg)

Elephants (565.jpg) Roses (619.jpg) Horses (744.jpg) Mountains (801.jpg) Food (977.jpg)

Fig. 7.5 (a): Test images from each category from Wang’s database

Airplanes (28.jpg) Cows (206.jpg) Sheep (267.jpg) Bi-Cycles (723.jpg) Cars (1000.jpg) Chimneys (1620.jpg)

Clouds (1846.jpg) Doors (2265.jpg) Cutlery (2571.jpg) Landscapes (2852.jpg) Trees (3377.jpg) Windows (3635.jpg)

Fig. 7.5 (b): Test images from each category from Microsoft’s database

Testing is performed using all four procedures. For procedure 1 (P1), the results for the test images of figure 7.5(a) are shown in table 7.4 for the first 50 nearest matches.


Table 7.4: Comparison of Precision for Color Features only (Procedure: P1/ Wang’s Images)

| Image Category | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|
| Africans | 3.jpg | 100.00 | 100.00 | 100.00 | 95.00 | 92.00 |
| Beaches | 158.jpg | 100.00 | 100.00 | 93.33 | 95.00 | 88.00 |
| Buildings | 244.jpg | 80.00 | 85.00 | 73.33 | 67.50 | 70.00 |
| Buses | 314.jpg | 100.00 | 95.00 | 93.33 | 95.00 | 94.00 |
| Dinosaurs | 445.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Elephants | 565.jpg | 100.00 | 75.00 | 73.33 | 72.50 | 66.00 |
| Roses | 619.jpg | 100.00 | 100.00 | 93.33 | 90.00 | 86.00 |
| Horses | 744.jpg | 100.00 | 100.00 | 100.00 | 97.50 | 96.00 |
| Mountains | 801.jpg | 40.00 | 30.00 | 33.33 | 42.50 | 36.00 |
| Food | 977.jpg | 70.00 | 60.00 | 63.33 | 67.50 | 70.00 |
| Average | | 89.00 | 84.50 | 82.33 | 82.25 | 79.80 |

Similarly the retrieval results for only texture features are shown in table 7.5.

Table 7.5: Comparison of Precision for Texture Features only (Procedure: P2/ Wang’s Images)

| Image Category | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|
| Africans | 3.jpg | 60.00 | 45.00 | 40.00 | 30.00 | 24.00 |
| Beaches | 158.jpg | 80.00 | 65.00 | 56.67 | 55.00 | 54.00 |
| Buildings | 244.jpg | 60.00 | 50.00 | 40.00 | 42.50 | 40.00 |
| Buses | 314.jpg | 100.00 | 100.00 | 96.67 | 95.00 | 92.00 |
| Dinosaurs | 445.jpg | 100.00 | 100.00 | 96.67 | 97.50 | 98.00 |
| Elephants | 565.jpg | 60.00 | 70.00 | 50.00 | 37.50 | 32.00 |
| Roses | 619.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Horses | 744.jpg | 100.00 | 90.00 | 86.67 | 87.50 | 82.00 |
| Mountains | 801.jpg | 50.00 | 40.00 | 26.67 | 27.50 | 24.00 |
| Food | 977.jpg | 70.00 | 50.00 | 43.33 | 40.00 | 38.00 |
| Average | | 78.00 | 71.00 | 63.67 | 61.25 | 58.40 |

It can be observed from the results of tables 7.4 and 7.5 that, on average, color features alone perform better than texture features alone. For a few categories such as Buses, Dinosaurs and Roses, where the object is very clear, texture based features perform comparably to color based features, and for Roses they perform better. The problem becomes more complicated when the level of semantics increases, as in the case of images in the Africans, Elephants, Mountains and Food categories. Therefore, both color and texture features need to be combined. This combination is done in testing procedure 3 (P3) and procedure 4 (P4). The results of testing the same Wang’s query images under procedures P3 and P4 are shown in tables 7.6 and 7.7.

Table 7.6: Comparison of Precision for Color & Texture Features (Procedure: P3/ Wang’s Images)

| Image Category | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|
| Africans | 3.jpg | 100.00 | 100.00 | 96.67 | 95.00 | 92.00 |
| Beaches | 158.jpg | 100.00 | 100.00 | 100.00 | 92.50 | 90.00 |
| Buildings | 244.jpg | 100.00 | 95.00 | 76.67 | 75.00 | 70.00 |
| Buses | 314.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 94.00 |
| Dinosaurs | 445.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Elephants | 565.jpg | 100.00 | 80.00 | 73.33 | 70.00 | 66.00 |
| Roses | 619.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 86.00 |
| Horses | 744.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 96.00 |
| Mountains | 801.jpg | 90.00 | 75.00 | 56.67 | 42.50 | 36.00 |
| Food | 977.jpg | 100.00 | 95.00 | 83.33 | 77.50 | 70.00 |
| Average | | 99.00 | 94.50 | 88.67 | 85.25 | 80.00 |

Table 7.7: Comparison of Precision for Texture & Color Features (Procedure: P4/ Wang’s Images)

| Image Category | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|
| Africans | 3.jpg | 80.00 | 55.00 | 43.33 | 32.50 | 26.00 |
| Beaches | 158.jpg | 100.00 | 85.00 | 73.33 | 70.00 | 58.00 |
| Buildings | 244.jpg | 90.00 | 70.00 | 60.00 | 55.00 | 48.00 |
| Buses | 314.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 90.00 |
| Dinosaurs | 445.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Elephants | 565.jpg | 90.00 | 55.00 | 40.00 | 37.50 | 34.00 |
| Roses | 619.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Horses | 744.jpg | 100.00 | 100.00 | 100.00 | 97.50 | 80.00 |
| Mountains | 801.jpg | 70.00 | 60.00 | 40.00 | 30.00 | 24.00 |
| Food | 977.jpg | 90.00 | 75.00 | 50.00 | 37.50 | 30.00 |
| Average | | 92.00 | 80.00 | 70.67 | 66.00 | 59.00 |

From the results of tables 7.6 and 7.7, it is clear that, on average, the results of procedure 3 (P3) are better than those of procedure 4 (P4). In practice, the choice between P3 and P4 is governed by the user’s demand: if color is the main search criterion for the user, P3 is selected; if texture is the main search criterion, P4 is the choice.

The total number of images (𝑁𝑑) in each category of Wang’s database is 100. Therefore, by equation 7.2, the recall rate can be found by considering the first 100 images in the retrieval results. The recall rates for each category are shown in table 7.8 for P3 and P4.
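When the number of requested images equals the category size, equations 7.1 and 7.2 share the same denominator. A short worked form is given below, reading the Africans entry for P3 in table 7.8 as 73 relevant images among the first 100 retrieved (an inference from the reported percentage, not a figure stated in the text):

```latex
N_r = N_d = 100 \;\Rightarrow\;
\text{Recall} = \frac{N_o}{N_d} = \frac{N_o}{N_r} = \text{Precision},
\qquad \text{e.g. } N_o = 73 \;\Rightarrow\; \text{Recall} = \frac{73}{100} = 73\%.
```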

Table 7.8: Recall rates for Wang’s test images for procedures P3 and P4

| Image Category | No. of Images | Image No. | % Recall (𝑁𝑜/𝑁𝑑), Color First (P3) | % Recall (𝑁𝑜/𝑁𝑑), Texture First (P4) |
|---|---|---|---|---|
| Africans | 100 | 3.jpg | 73.00 | 21.00 |
| Beaches | 100 | 158.jpg | 65.00 | 43.00 |
| Buildings | 100 | 244.jpg | 47.00 | 34.00 |
| Buses | 100 | 314.jpg | 78.00 | 70.00 |
| Dinosaurs | 100 | 445.jpg | 98.00 | 74.00 |
| Elephants | 100 | 565.jpg | 42.00 | 26.00 |
| Roses | 100 | 619.jpg | 60.00 | 80.00 |
| Horses | 100 | 744.jpg | 80.00 | 60.00 |
| Mountains | 100 | 801.jpg | 34.00 | 18.00 |
| Food | 100 | 977.jpg | 58.00 | 30.00 |
| Average | | | 63.50 | 45.60 |

From the results of table 7.8, it can be seen that the retrieval results of procedure P3 are better for the images of Wang’s database, except for the Roses category.

Some visual results of image retrieval for each procedure (P1, P2, P3 and P4) are shown in figures 7.6 to 7.9. In each case 15 images are requested for display, and the first image, at the top left corner, is the query image.

It can be easily seen from the retrieval results of figure 7.6 to 7.9 that

proposed wavelet based color and texture features are strong representations of

image details and the combination of both the features further enhances the

retrieval capabilities.

The performance of the proposed wavelet based CBIR scheme is also tested

for another popular image database which is obtained from Microsoft Research.

Again all the four test procedures are performed and the results of retrieval are

shown in tables 7.9 to 7.12 for the test images of figure 7.5 (b).


Fig. 7.6: “Retrieval result for query image (Wang’s 3.jpg) for procedure P1”




Table 7.9: Comparison of Precision for only Color Feature (Procedure: P1/ MS Images)

| Image Category | Category Range | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|---|
| Aero planes | 1-58 | 28.jpg | 100.00 | 65.00 | 50.00 | 40.00 | 38.00 |
| Cow | 59-240 | 206.jpg | 60.00 | 65.00 | 66.67 | 67.50 | 60.00 |
| Sheep | 241-430 | 267.jpg | 100.00 | 95.00 | 86.67 | 82.50 | 74.00 |
| Bi-Cycles | 499-770 | 723.jpg | 80.00 | 70.00 | 63.33 | 60.00 | 52.00 |
| Cars | 989-1483 | 1000.jpg | 90.00 | 80.00 | 73.33 | 77.50 | 74.00 |
| Chimneys | 1484-1749 | 1620.jpg | 100.00 | 100.00 | 93.33 | 92.50 | 88.00 |
| Clouds | 1750-2178 | 1846.jpg | 100.00 | 95.00 | 96.67 | 97.50 | 98.00 |
| Doors | 2179-2344 | 2265.jpg | 60.00 | 40.00 | 46.67 | 47.50 | 40.00 |
| Cutlery | 2511-2685 | 2571.jpg | 100.00 | 100.00 | 100.00 | 95.00 | 90.00 |
| Landscapes | 2804-3019 | 2852.jpg | 40.00 | 45.00 | 40.00 | 40.00 | 44.00 |
| Trees | 3257-3473 | 3377.jpg | 90.00 | 90.00 | 86.67 | 85.00 | 82.00 |
| Windows | 3474-4125 | 3635.jpg | 70.00 | 65.00 | 53.33 | 50.00 | 48.00 |
| Average | | | 75.00 | 75.83 | 71.39 | 69.58 | 65.67 |

Table 7.10: Comparison of Precision for only Texture Feature (Procedure: P2/ MS Images)

| Image Category | Category Range | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|---|
| Aero planes | 1-58 | 28.jpg | 100.00 | 85.00 | 86.67 | 75.00 | 68.00 |
| Cow | 59-240 | 206.jpg | 70.00 | 65.00 | 53.33 | 55.00 | 52.00 |
| Sheep | 241-430 | 267.jpg | 80.00 | 75.00 | 76.67 | 75.00 | 66.00 |
| Bi-Cycles | 499-770 | 723.jpg | 50.00 | 45.00 | 33.33 | 32.50 | 32.00 |
| Cars | 989-1483 | 1000.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Chimneys | 1484-1749 | 1620.jpg | 100.00 | 95.00 | 90.00 | 77.50 | 72.00 |
| Clouds | 1750-2178 | 1846.jpg | 100.00 | 100.00 | 96.67 | 97.50 | 98.00 |
| Doors | 2179-2344 | 2265.jpg | 90.00 | 65.00 | 56.67 | 47.50 | 44.00 |
| Cutlery | 2511-2685 | 2571.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 92.00 |
| Landscapes | 2804-3019 | 2852.jpg | 100.00 | 60.00 | 56.67 | 45.00 | 40.00 |
| Trees | 3257-3473 | 3377.jpg | 100.00 | 100.00 | 100.00 | 97.50 | 96.00 |
| Windows | 3474-4125 | 3635.jpg | 100.00 | 95.00 | 86.67 | 87.50 | 86.00 |
| Average | | | 90.83 | 82.08 | 78.06 | 74.17 | 70.50 |

Table 7.11: Comparison of Precision for Color & Texture Features (Procedure: P3/ MS Images)

| Image Category | Category Range | Image No. | Precision (%) at 𝑁𝑟 = 10 | 𝑁𝑟 = 20 | 𝑁𝑟 = 30 | 𝑁𝑟 = 40 | 𝑁𝑟 = 50 |
|---|---|---|---|---|---|---|---|
| Aero planes | 1-58 | 28.jpg | 100.00 | 90.00 | 73.33 | 55.00 | 44.00 |
| Cow | 59-240 | 206.jpg | 100.00 | 100.00 | 93.33 | 90.00 | 82.00 |
| Sheep | 241-430 | 267.jpg | 100.00 | 100.00 | 83.33 | 77.50 | 74.00 |
| Bi-Cycles | 499-770 | 723.jpg | 100.00 | 100.00 | 93.33 | 92.50 | 84.00 |
| Cars | 989-1483 | 1000.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Chimneys | 1484-1749 | 1620.jpg | 100.00 | 100.00 | 100.00 | 87.50 | 86.00 |
| Clouds | 1750-2178 | 1846.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Doors | 2179-2344 | 2265.jpg | 100.00 | 95.00 | 90.00 | 80.00 | 72.00 |
| Cutlery | 2511-2685 | 2571.jpg | 100.00 | 100.00 | 100.00 | 92.50 | 86.00 |
| Landscapes | 2804-3019 | 2852.jpg | 100.00 | 90.00 | 93.33 | 95.00 | 92.00 |
| Trees | 3257-3473 | 3377.jpg | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Windows | 3474-4125 | 3635.jpg | 100.00 | 100.00 | 93.33 | 90.00 | 84.00 |
| Average | | | 100.00 | 97.92 | 93.33 | 88.33 | 83.67 |

Table 7.12: Comparison of Precision for Texture & Color Features (Procedure: P4/ MS Images)

Image Category  Category Range  Image No.  Precision (%)
                                           Nr = 10  Nr = 20  Nr = 30  Nr = 40  Nr = 50
Aero planes     1-58            28.jpg     100.00   90.00    86.67    82.50    72.00
Cow             59-240          206.jpg    100.00   100.00   86.67    70.00    66.00
Sheep           241-430         267.jpg    100.00   95.00    83.33    80.00    76.00
Bi-Cycles       499-770         723.jpg    100.00   90.00    73.33    65.00    54.00
Cars            989-1483        1000.jpg   100.00   100.00   100.00   100.00   100.00
Chimneys        1484-1749       1620.jpg   100.00   100.00   100.00   90.00    80.00
Clouds          1750-2178       1846.jpg   100.00   100.00   100.00   100.00   100.00
Doors           2179-2344       2265.jpg   100.00   75.00    53.33    47.50    44.00
Cutlery         2511-2685       2571.jpg   100.00   100.00   100.00   95.00    94.00
Landscapes      2804-3019       2852.jpg   100.00   70.00    53.33    42.50    36.00
Trees           3257-3473       3377.jpg   100.00   100.00   100.00   100.00   100.00
Windows         3474-4125       3635.jpg   100.00   100.00   90.00    87.50    86.00
Average                                    100.00   92.50    85.28    80.88    75.33


From the results of tables 7.9 to 7.12, it is clear that procedure P3 performs better than the others. For images with almost fixed details, such as Cars, Chimneys, Clouds and Cutlery, the retrieval results are excellent, while for test images with a higher level of semantics the results are slightly lower but satisfactory. The recall rates for MS test images are shown in table 7.13.

Table 7.13: Recall rates for MS test images for procedures P3 and P4

Image Category  No. of Images  Image No.  % Recall (Nr/Nd)
                                          Color First (P3)  Texture First (P4)
Aero planes     58             28.jpg     34.48             60.35
Cow             182            206.jpg    42.86             42.31
Sheep           190            267.jpg    53.16             31.58
Bi-Cycles       272            723.jpg    33.10             25.00
Cars            495            1000.jpg   43.43             82.02
Chimneys        266            1620.jpg   36.10             38.35
Clouds          429            1846.jpg   56.64             61.54
Doors           166            2265.jpg   19.88             25.30
Cutlery         175            2571.jpg   49.15             49.72
Landscapes      216            2852.jpg   24.54             26.00
Trees           217            3377.jpg   54.38             61.30
Windows         652            3635.jpg   23.93             46.78
Average                                   39.31             45.85

The results of table 7.13 show that the recall rates of procedure P4 are better than those of procedure P3, whereas the precision results of tables 7.11 and 7.12 show that P3 is better than P4. This happens because retrieval rates decrease rapidly when more than 50 images are requested. Some visual results of image retrieval for each procedure are shown in figures 7.10 to 7.13. In each figure, 15 images are requested for display, and the first image at the top left corner is the query image.
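The relation between the precision and recall figures above can be illustrated with a small sketch. It assumes the usual definitions (precision is the share of retrieved images that are relevant, recall is the share of the category's images that are retrieved) and assumes that relevance is decided by whether a retrieved image number falls inside the category range of the query. The function names and the ranked list are hypothetical and are not taken from the implementation used for the tables.

```python
def in_category(image_no, category_range):
    """True if an image number lies inside the inclusive category range."""
    low, high = category_range
    return low <= image_no <= high


def precision_at(ranked, category_range, n_requested):
    """Percentage of the first n_requested results that are relevant."""
    top = ranked[:n_requested]
    hits = sum(in_category(i, category_range) for i in top)
    return 100.0 * hits / n_requested


def recall(ranked, category_range):
    """Percentage of the category's images that appear in the result list."""
    low, high = category_range
    category_size = high - low + 1
    hits = sum(in_category(i, category_range) for i in set(ranked))
    return 100.0 * hits / category_size


# Hypothetical ranked result for a query from the Doors category (2179-2344).
doors = (2179, 2344)
ranked = [2265, 2270, 1500, 2300, 2200, 3000, 2181, 2190, 2210, 40]
print(precision_at(ranked, doors, 10))   # 7 of the 10 results are relevant
print(round(recall(ranked, doors), 2))   # 7 of the 166 Doors images are found
```

Under these definitions, lengthening the result list can only keep or raise recall, while precision drops whenever the extra results are irrelevant, so a procedure may lead on one measure and trail on the other.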

It can be observed from figures 7.10 to 7.13 that the proposed algorithm also performs well on the Microsoft Research image database. The results are encouraging and similar to those for Wang’s image database. This shows the capability and strength of the proposed wavelet based CBIR scheme, as the retrieval results are neither biased nor database dependent. Again, the choice between procedures P3 and P4 depends on the user’s requirement, i.e., whether color or texture is the main search criterion.


Fig. 7.10: “Retrieval result for query image (MS 2571.jpg) for procedure P1”



The robustness of the proposed CBIR scheme is also investigated. A good image retrieval scheme must perform well even in noisy conditions. The input query image may exist in several versions modified by image processing operations such as rotation, cropping, intensity adjustment and noise addition. A good CBIR scheme should be able to retrieve the relevant images from the database for these modified versions (within a specific limit) of the query image. Such robustness tests are also given in [Wan2001]. A set of test results is shown in table 7.14 for Wang’s test images and in table 7.15 for Microsoft’s images. Due to space limitations, the first 10 retrieved images are arranged in two rows.
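To make the kind of query modification concrete, the following sketch generates a few attacked versions of a small grayscale image stored as nested lists with intensities in [0, 1]. It is only an illustration: the helper names are hypothetical, and the exact meaning of each attack (for example, whether "30% Bright" is multiplicative, or how the 50% crop is positioned) is an assumption, since the tool used to produce the test images is not stated here.

```python
import random


def horizontal_flip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def vertical_flip(img):
    """Reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def brighten(img, fraction):
    """Scale intensities by (1 + fraction); a negative fraction darkens."""
    return [[min(1.0, max(0.0, p * (1.0 + fraction))) for p in row] for row in img]


def gaussian_noise(img, mean=0.0, variance=0.08, seed=0):
    """Add Gaussian noise with the given mean and variance, clipped to [0, 1]."""
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [[min(1.0, max(0.0, p + rng.gauss(mean, sigma))) for p in row] for row in img]


def blur_3x3(img):
    """3x3 averaging; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def center_crop(img, keep=0.5):
    """Keep the central `keep` fraction of rows and columns (assumed crop rule)."""
    h, w = len(img), len(img[0])
    ch, cw = max(1, int(h * keep)), max(1, int(w * keep))
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in img[top:top + ch]]


# A tiny 4x4 gradient stands in for a query image.
query = [[(x + 4 * y) / 15.0 for x in range(4)] for y in range(4)]
attacked = {
    "Horizontal Flip": horizontal_flip(query),
    "Vertical Flip": vertical_flip(query),
    "30% Bright": brighten(query, 0.30),
    "30% Dark": brighten(query, -0.30),
    "Blur by 3x3 Averaging": blur_3x3(query),
    "50% Crop": center_crop(query, 0.5),
    "Gaussian Noise": gaussian_noise(query),
}
```

Each entry of `attacked` would then be submitted as a query in place of the original image, and the first 10 matches inspected.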

Table 7.14: Robustness test for the proposed scheme for Wang’s test images for procedure P3

Columns: Type of Attack, Query Image, Retrieval Results (first 10 matches, arranged in two rows: 1 to 5 above 6 to 10)

Attacks tested:
Scale Down by 50%
Scale Up by 130%
30% Bright
30% Dark
30% more Saturate
30% less Saturate
Horizontal Flip
Vertical Flip
Pixelize at 5 pixels
Blur by 3x3 Averaging
50% more Sharpness
50% Crop
Gaussian Noise (μ = 0, σ² = 0.08)
Random Crop

Table 7.15: Robustness test for the proposed scheme for Microsoft’s test images for procedure P3

Columns: Type of Attack, Query Image, Retrieval Results (first 10 matches, arranged in two rows: 1 to 5 above 6 to 10)

Attacks tested:
Scale Down by 50%
Scale Up by 130%
30% Bright
30% Dark
30% more Saturate
30% less Saturate
Horizontal Flip
Vertical Flip
Pixelize at 5 pixels
Blur by 3x3 Averaging
50% more Sharpness
50% Crop
Gaussian Noise (μ = 0, σ² = 0.08)
Random Crop

The results of tables 7.14 and 7.15 clearly show the robustness of the proposed image retrieval scheme. Under the various types of image processing attacks, all of the first 10 retrieved images are relevant, except in a few cases such as scale up, sharpness and Gaussian noise for Wang’s test images. Similarly, for Microsoft’s test images only vertical flip, Gaussian noise and random crop gave 1 or 2 mismatches.

The strength of the proposed multi resolution approach can be seen clearly in the case of the scaling attack, as the proposed scheme tries to match the query image at three resolution levels. Therefore, both coarser and finer details can be captured.
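The idea of looking at an image at three resolution levels can be sketched with a plain Haar decomposition. This is a minimal illustration under stated assumptions: it uses the averaging form of the Haar filters, requires even image sides at every level, and uses sub-band labels whose naming convention varies between texts. It does not reproduce the wavelet filters or the edge histogram features of the proposed scheme.

```python
def haar_step(img):
    """One 2-D Haar level: returns (LL, LH, HL, HH) sub-bands at half size."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            ll[y][x] = (a + b + c + d) / 4.0   # coarse approximation
            lh[y][x] = (a + b - c - d) / 4.0   # top/bottom difference (horizontal edges)
            hl[y][x] = (a - b + c - d) / 4.0   # left/right difference (vertical edges)
            hh[y][x] = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh


def decompose(img, levels=3):
    """Return the detail sub-bands of each level plus the final approximation."""
    pyramid, approx = [], img
    for _ in range(levels):
        approx, lh, hl, hh = haar_step(approx)
        pyramid.append({"LH": lh, "HL": hl, "HH": hh})
    return pyramid, approx


# An 8x8 image whose left half is dark and right half is bright (one vertical edge).
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
pyramid, approx = decompose(image, levels=3)
for level, bands in enumerate(pyramid, start=1):
    print(level, len(bands["HL"]), "x", len(bands["HL"][0]))
```

In this toy image the vertical edge falls on block boundaries at the two finer levels and only shows up in the HL band of the third level, which hints at why features taken from several levels can still match when the query has been rescaled.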

To show the advantage of using wavelets along with the edge histogram, the proposed multi resolution texture feature approach is compared with only “Mpeg-7 EHD (standard algorithm with 80 bins)”, and the wavelet based color feature is compared with the “Scalable Color Descriptor (SCD) and Color Layout Descriptor (CLD)” of Mpeg-7. The recall results for Wang’s database are shown in tables 7.16 and 7.17 respectively.

Table 7.16: Recall rates comparison for procedure P2 and only Mpeg-7 EHD

Image Category  No. of Images  Image No.  % Recall (Nr/Nd)
                                          Only EHD  Texture Only (P2)
Africans        100            3.jpg      33.00     21.00
Beaches         100            158.jpg    14.00     43.00
Buildings       100            244.jpg    29.00     34.00
Buses           100            314.jpg    45.00     70.00
Dinosaurs       100            445.jpg    85.00     74.00
Elephants       100            565.jpg    24.00     26.00
Roses           100            619.jpg    56.00     80.00
Horses          100            744.jpg    52.00     60.00
Mountains       100            801.jpg    20.00     18.00
Food            100            977.jpg    09.00     30.00
Average                                   36.70     45.60

Table 7.17: Recall rates comparison for procedure P1, only Mpeg-7 SCD and CLD

Image Category  No. of Images  Image No.  % Recall (Nr/Nd)
                                          Only SCD  Only CLD  Color Only (P1)
Africans        100            3.jpg      40.00     44.00     73.00
Beaches         100            158.jpg    25.00     36.00     65.00
Buildings       100            244.jpg    32.00     28.00     47.00
Buses           100            314.jpg    44.00     47.00     78.00
Dinosaurs       100            445.jpg    56.00     99.00     98.00
Elephants       100            565.jpg    13.00     43.00     42.00
Roses           100            619.jpg    45.00     60.00     60.00
Horses          100            744.jpg    13.00     82.00     80.00
Mountains       100            801.jpg    05.00     44.00     34.00
Food            100            977.jpg    17.00     38.00     58.00
Average                                   29.00     52.10     63.50

From the results of tables 7.16 and 7.17, it is clear that the proposed scheme performs better on average than the standard Mpeg-7 descriptors for texture as well as for color individually.

The proposed P3 procedure is also compared with the “Fuzzy Color and Texture Histogram (FCTH) scheme proposed in [Cha2008] and Color and Edge Directivity Descriptor (CEDD) proposed in [Cha2008a]”. The comparison is given in table 7.18 in terms of recall.

Table 7.18: Recall rates comparison for proposed P3, FCTH [Cha2008] and CEDD [Cha2008a]

Image Category  No. of Images  Image No.  % Recall (Nr/Nd)
                                          FCTH [Cha2008]  CEDD [Cha2008a]  Proposed P3
Africans        100            3.jpg      52.00           48.00            73.00
Beaches         100            158.jpg    50.00           43.00            65.00
Buildings       100            244.jpg    50.00           68.00            47.00
Buses           100            314.jpg    71.00           75.00            78.00
Dinosaurs       100            445.jpg    89.00           93.00            98.00
Elephants       100            565.jpg    44.00           45.00            42.00
Roses           100            619.jpg    72.00           65.00            60.00
Horses          100            744.jpg    88.00           85.00            80.00
Mountains       100            801.jpg    36.00           40.00            34.00
Food            100            977.jpg    50.00           35.00            58.00
Average                                   60.20           59.70            63.50

The results of table 7.18 show that the average recall of the proposed P3 scheme (63.50%) is better than that of the “Fuzzy Color and Texture Histogram based technique (FCTH) [Cha2008] and Color and Edge Directivity Descriptor (CEDD) [Cha2008a]” (60.20% and 59.70% respectively).
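The averages of table 7.18 can be rechecked with a few lines of arithmetic. The sketch below copies the recall values from the table in row order; the per-category tally of which scheme scores highest is an added illustration, not part of the original comparison.

```python
# Recall values (%) copied from table 7.18, in row order.
categories = ["Africans", "Beaches", "Buildings", "Buses", "Dinosaurs",
              "Elephants", "Roses", "Horses", "Mountains", "Food"]
recall = {
    "FCTH": [52, 50, 50, 71, 89, 44, 72, 88, 36, 50],
    "CEDD": [48, 43, 68, 75, 93, 45, 65, 85, 40, 35],
    "P3":   [73, 65, 47, 78, 98, 42, 60, 80, 34, 58],
}

# Mean recall per scheme over the ten categories.
averages = {name: sum(vals) / len(vals) for name, vals in recall.items()}

# For each category, count which scheme has the highest recall.
wins = {name: 0 for name in recall}
for i, _ in enumerate(categories):
    best = max(recall, key=lambda name: recall[name][i])
    wins[best] += 1

print(averages)  # {'FCTH': 60.2, 'CEDD': 59.7, 'P3': 63.5}
print(wins)      # {'FCTH': 2, 'CEDD': 3, 'P3': 5}
```

The tally shows that the advantage of P3 is an average one: it has the highest recall in five of the ten categories, while FCTH or CEDD leads in the remaining five.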

7.3 Conclusions

In this section, “an efficient image retrieval scheme based on multi

resolution wavelet transform is presented. The proposed scheme extracts both

color as well as texture features of the query image for relevant image retrieval.

The color feature extraction algorithm is wavelet based and inspired by standard

Mpeg-7 Scalable Color Descriptor (SCD). It uses HSV color space and Haar wavelet

transform for size reduction. The results of comparison show that the proposed

color feature extraction scheme performs much better than SCD. The proposed

color feature also outperforms the Color Layout Descriptor (CLD) of Mpeg-7”.
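The combination of an HSV histogram with a Haar transform for size reduction can be sketched as follows. This is a rough illustration only: the 16 x 4 x 4 quantization follows the commonly described 256-bin layout of the Mpeg-7 SCD and is an assumption here, the function names are hypothetical, and only the low-pass (pairwise sum) half of the Haar transform is kept, which may differ from the proposed color feature.

```python
import colorsys


def hsv_histogram(pixels, h_bins=16, s_bins=4, v_bins=4):
    """Normalised HSV histogram of RGB pixels given as (r, g, b) in [0, 1]."""
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist)
    return [c / total for c in hist]


def haar_reduce(values, target_len):
    """Halve the vector repeatedly, keeping only the pairwise sums (low-pass part)."""
    out = list(values)
    while len(out) > target_len:
        out = [out[i] + out[i + 1] for i in range(0, len(out), 2)]
    return out


pixels = [(1.0, 0.0, 0.0)] * 3 + [(0.0, 0.0, 1.0)]   # three red pixels, one blue
hist = hsv_histogram(pixels)       # 256 bins
feature = haar_reduce(hist, 32)    # reduced to 32 coefficients
print(len(hist), len(feature))
```

Each halving merges neighbouring bins, so the shorter vector is a coarser color description that is cheaper to store and compare.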

To extract texture features, the multi resolution advantage of wavelets is utilized, which emulates the human way of viewing objects in coarser and finer details. Wavelets with edge histograms perform much better than EHD applied directly to the images. Particularly in cases of noise or distortion, wavelets are able to capture the image details. To observe the robustness of the proposed scheme, input query images deformed by a variety of noise and geometrical attacks were given to it, and in almost every case it performed well, as can be seen in the results section.

The combination of the proposed color feature extraction algorithm and texture feature extraction algorithm performs better than the individual features and outperforms some well-known existing CBIR algorithms.
