Chapter 7
Content Based Image Retrieval (CBIR)
In chapter 1, an introduction to “Content Based Image Retrieval (CBIR)”
was given. Some applications of CBIR and related problems and issues were also
discussed. Several tools and techniques are being used in the development of CBIR
systems to enhance the capabilities for better image retrieval at a higher level of
semantics.
Most of the image retrieval techniques presented in the literature are not
robust enough to maintain good retrieval performance when the input images
are altered.
Wavelets have found considerable application in image retrieval. Wavelets
and other members of the wavelet family, such as curvelets, have been widely
used in recent years in the development and implementation of efficient image
retrieval systems [Zho2009].
In this chapter, the role and effectiveness of wavelets in the development of
robust image retrieval schemes, which can perform better against several input
image alterations, are studied. Efforts have been made to utilize wavelets in
extracting “color and texture based features of the images” and their performance
is presented in comparison to some well-known standard color and texture
feature extraction schemes.
7.1 Wavelets in CBIR
Wavelets offer the advantages of multi-resolution representation and
space-frequency localization. “They have the capability to capture image details
in different directions, viz. horizontal, vertical and diagonal”. Curvelets are
excellent at capturing image details along curves at various resolution levels.
Wavelets have also proved to be good for texture based image retrieval.
“A wavelet-based salient point extraction algorithm for CBIR is proposed in
[Tia2001]. In this scheme, the color and texture information in the locations given
by these points is extracted which provides significant improvement in the
retrieval results”.
“A CBIR system named SIMPLIcity (Semantics sensitive Integrated
Matching for Picture LIbraries) is proposed in [Wan2001]. It uses semantics
classification methods, a wavelet-based approach for feature extraction, and
integrated region matching based upon image segmentation. The system classifies
images into semantic categories, such as textured-non textured, graph-
photograph. Potentially, the categorization enhances retrieval by permitting
semantically-adaptive searching methods and narrowing down the searching
range in a database. The proposed system also shows robustness to image
alterations”.
“A wavelet based texture feature extraction method for CBIR applications is
presented in [Vad2004]. This method uses standard deviations of high frequency
components of Daubechies wavelet transform as texture features. A total of nine
features are obtained from three levels of decomposition”.
“A simple wavelet based CBIR system is presented in [Chi2006]. In this
scheme, wavelet coefficients of each image are stored. In the image retrieval
phase, the system compares the most significant wavelet coefficients of the Y, U
and V components of the query image with those of the images in the database,
coupled with the weight factors assigned by users, and finds out the matches
based on the features of interest to users”.
“An algorithm for texture feature extraction using wavelet decomposed
coefficients of an image and its complement is presented in [Hir2006]. Four
different approaches (Multispectral approach in RGB, HSV, YCbCr space and Gray
scale texture feature) to color texture analysis are tested on the classification of
images from the VisTex database”.
“A novel approach for rotation invariant texture image retrieval is
presented in [Kok2006]. This method uses a set of dual-tree rotated complex
wavelet filters (DT-RCWF) and the DT complex wavelet transform (DT-CWT) jointly
in 12 different directions. The DT-RCWFs are non-separable and oriented, which
improves characterization of oriented textures”.
“Representation of local properties in an image is an important issue in
CBIR. The work in [Muw2007] proposes a salient region detector based on
wavelet transform. The detector can extract visually meaningful regions of an
image and reflect local characteristics”.
“A CBIR method for a diagnosis aid in medical fields is proposed in
[Lam2007]. To extract texture features, Gaussian curves are fitted to the
distribution of wavelet coefficients at different levels of decomposition. Only a
few parameters defining the fitted curves are stored as image signatures in the
feature database. The wavelet function is adapted by the lifting scheme and
retrieval efficiency is given for different databases including a diabetic
retinopathy database, a mammography database and a face database”.
“A wavelet based retrieval scheme for trademark images without
complicated image segmentation is presented in [Muw2008]. The proposed
trademark image retrieval scheme comprises two stages: first, edge detection
based on wavelet transform is performed on the trademark image; second, novel
wavelet-based shape features are introduced to reflect the edges’ characteristics”.
“A CBIR method based on an efficient combination of multi-resolution color
and texture features is proposed in [Chu2008]. In this scheme, color auto-
correlograms of hue and saturation component images are used as color features
and BDIP (block difference of inverse probabilities) and BVLC (block variation of
local correlation coefficients) moments of the value component images are used
for texture features”.
“A CBIR method which uses multi-wavelet transform is presented in
[Xiw2008]. Multi-wavelets are used to improve the shape and texture features
extraction and decrease the operands. Theoretical analysis and the numerical and
experimental results show that the multi-wavelet transform based image retrieval
can get good results for image retrieval”.
“The phase often holds crucial information about image structures and
features. However, only the real part or the magnitude of the transform
coefficients is typically used for image processing applications. A method for the
feature extraction of images called Phase-based LBP is presented in [Ngu2010].
The proposed method is based on the combination of the phase of complex wavelet
coefficients and the Local Binary Pattern operator (LBP)”.
“A scheme of color image retrieval based on wavelet transform and
G-Regions of Interest (GROI) is presented in [Xug2010]. The images are first
represented in HSV color space. Areas of interest are then extracted by using
K-means clustering in the wavelet domain. The energies of the wavelet coefficients
in these areas of interest are used as texture features. For the color feature,
mean and variance are used. The barycentric coordinates are also utilized as
position features”.
“An image retrieval method using the Daubechies wavelet and stage
treatment is proposed in [Wan2010]. This algorithm decomposes the color
information of the image using Daubechies wavelet and constructs the eigenvector
using both the low frequency component and the high frequency component.
Three features: variance, invariant moment and angle of the eigenvector are used
to compare the similarity between the retrieval image and the images in the
database”.
“A new image indexing and retrieval algorithm by integrating color and
texture features is proposed in [Red2012]. Histograms are constructed from HSV
space and used as color feature. For texture feature, the color image is converted
into grayscale and divided into eight binary bit-planes. The Binary Wavelet
Transform (BWT) is applied on each bit-plane and the local binary pattern (LBP)
features are extracted from the resultant BWT sub-bands”.
“The use of pyramidal and tree structured wavelet features using 8-tap
Daubechies coefficients is proposed in [Kok2013] for texture analysis along with
extensive experimental evaluation. Comparison with various features indicates
that the combination of energy and standard deviation of wavelet features provides
good pattern retrieval accuracy for tree structured wavelet decomposition, while
standard deviation alone gives better results in pyramidal wavelet decomposition”.
“Various other techniques have been used to improve retrieval rates along
with wavelet decomposition. These techniques mainly focus on use of relevance
feedback [Bul2011], use of energy distribution pattern, statistical properties,
various color models, texture analysis [Tam1978] etc. to extract details at various
levels of resolution. Along with simple wavelets decomposition, use of some
advanced versions of wavelets such as contourlets [Rao2007], curvelets
[Sum2008], ridgelets [God2010] etc. is also reported in various methods. A
comparative literature survey of use of wavelet based techniques in CBIR is given
in table 7.1”.
From table 7.1 it can be observed that researchers have utilized the
strengths of wavelets in various ways. The majority of the work is focused on
texture and color based feature extraction. In almost all the schemes, energy,
standard deviation or other higher order moments of wavelet coefficients are used
for texture feature extraction. Detailed analysis of the properties of wavelets and
effective utilization of their strengths have not been addressed adequately.
Most of the works do not report the robustness of the proposed CBIR
scheme against image alterations and noise, except for a few such as [Wan2001,
Kok2006, Xiw2008]. The work in [Wan2001] shows the robustness of
the CBIR scheme against various image processing alterations applied to the query
images, such as cropping and intensity variations. The works of [Kok2006,
Xiw2008] only consider rotation invariance. Wavelets have the capability of
capturing edge details of images, even in the presence of noise. This can help in
providing robustness against noise. The multi-resolution feature of wavelets can
provide robustness against cropping and scaling. Therefore, there is motivation to
develop a robust CBIR system which performs well even when there are alterations
in the input image.
In this chapter, an attempt has been made to develop a robust CBIR
scheme, “which utilizes the strength of multi resolution capability of wavelets and
strength of edge histogram descriptor”. This scheme extracts both color and
texture features with the help of wavelet coefficients.
7.2 Proposed CBIR scheme based on Multi-Resolution Wavelet Transform and Edge Histogram¹¹
Wavelet transform analysis is often used for extracting features of an image
in a few directions. The high frequency detail wavelet coefficients are able to
represent image details in horizontal, vertical and diagonal directions, while the
approximation coefficients represent the overall image content. The strength of
wavelet based image representation is the multi-resolution representation of an
image, which emulates the way the human visual system observes coarser and finer
details of an image. “The orientations of various edges in an image can be
captured by finding edge histogram with help of Edge Histogram Descriptor (EHD)
of MPEG-7 standard [Won2002]. Various CBIR techniques use EHD for image
retrieval based on shape and texture features”. In the proposed CBIR method, the
multi-resolution capability of wavelets and the edge capturing capability of
EHD are combined to achieve higher retrieval rates.
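
The directional sub-bands mentioned above can be illustrated with a small sketch. It assumes a plain orthonormal Haar filter applied to a list-of-lists image with even dimensions; the function names `haar2d_level` and `haar2d` are hypothetical and are used only for illustration. The Haar filter is chosen here for brevity and is not necessarily the wavelet used for texture features in this chapter.

```python
def haar2d_level(img):
    """One level of a 2-D Haar transform (illustrative sketch).

    img is a list of equal-length rows with even height and width.
    Returns four half-size sub-bands:
    (approximation, horizontal detail, vertical detail, diagonal detail).
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 2.0)  # scaled local average
            h_row.append((p + q - s - t) / 2.0)  # change between rows (horizontal edges)
            v_row.append((p - q + s - t) / 2.0)  # change between columns (vertical edges)
            d_row.append((p - q - s + t) / 2.0)  # diagonal change
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def haar2d(img, levels):
    """Multi-resolution decomposition: re-apply the transform to the approximation."""
    details = []
    current = img
    for _ in range(levels):
        current, h, v, d = haar2d_level(current)
        details.append((h, v, d))  # one (H, V, D) triple per resolution level
    return current, details
```

For an image of vertical stripes, only the vertical detail sub-band is non-zero, which is the directional selectivity the paragraph refers to.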
Color is another very important and primary feature used by CBIR systems.
Various color features defined by the MPEG-7 standard are used in practice, such as
“Color Layout Descriptor (CLD), Scalable Color Descriptor (SCD), Color Structure
Descriptor (CSD), Dominant Color Descriptor (DCD)” etc. Other color descriptors
such as Color Correlogram [Jin1997] are also popular. These color descriptors
extract color features in RGB, HSV, YCbCr and HMMD color spaces.
“The HSV color space is a popular choice for manipulating colors. The HSV
color space is developed to provide an intuitive representation of color and to
approximate the way in which humans perceive and manipulate color. This HSV
color space is used by Scalable Color Descriptor (SCD). The SCD is one of the
widely used MPEG-7 descriptors for color feature extraction in CBIR systems. In the
proposed CBIR scheme, a modified SCD is implemented, which is simple but
effective as it utilizes full 256-bin color histograms without any quantization”.
¹¹ This part of the thesis is presented and published in: C. Patvardhan, A. K. Verma and C. V. Lakshmi,
‘A Robust Content Based Image Retrieval Based On Multi-Resolution Wavelet Features and Edge Histogram’,
2nd IEEE International Conference on Image Information Processing (ICIIP), pp. 447-452,
Jaypee University of Information & Technology, Shimla, Dec. 9-11, 2013.
7.2.1 Proposed Methodology
In the proposed scheme, the effectiveness of wavelet transform is utilized
in both color and texture based feature extraction. Edge histogram is also utilized
along with wavelet transform for texture based feature extraction. These feature
extraction schemes are described next.
7.2.1.1 Color Feature Extraction Scheme
To extract color features of images, a modified version of the Scalable Color
Descriptor (SCD) of the MPEG-7 standard is utilized. This version is easier to
implement as the quantization steps of SCD are omitted here. “The Haar wavelet
transform is used to reduce the dimension of feature vector”. The overall scheme
is shown in figure 7.1.
Like SCD, the proposed scheme also utilizes the HSV color space. In SCD, the
histograms are quantized and represented in only 16, 4 and 4 bins for H, S and V
respectively. In the proposed scheme, however, full 256-bin histograms are
computed for each of the H, S and V color channels, collecting more color details
at this level. At this stage, the total length of the color feature vector is
3 × 256 = 768. The “Haar transform” is used to reduce this number.
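
A minimal sketch of the 256-bin histogram stage is given below. It assumes 8-bit RGB pixels supplied as tuples and uses the `colorsys` module of the Python standard library for the color space conversion. The linear mapping of each HSV channel from [0, 1] onto 256 bins and the function name `hsv_histograms` are assumptions made for illustration; the exact binning used in the thesis is not specified in this section.

```python
import colorsys


def hsv_histograms(pixels, bins=256):
    """Return (H, S, V) histograms for an iterable of 8-bit (r, g, b) tuples."""
    hist_h = [0] * bins
    hist_s = [0] * bins
    hist_v = [0] * bins
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Assumption: linear mapping onto bins 0..bins-1; a value of exactly
        # 1.0 is clamped into the last bin.
        hist_h[min(int(h * bins), bins - 1)] += 1
        hist_s[min(int(s * bins), bins - 1)] += 1
        hist_v[min(int(v * bins), bins - 1)] += 1
    return hist_h, hist_s, hist_v
```

Concatenating the three histograms gives the 768-element vector mentioned above, before any wavelet based reduction.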
In the HSV color model, ‘H’ is Hue, which represents the distinct color and
is therefore more important than ‘S’ and ‘V’. ‘S’ is Saturation, which represents
the amount of color, and ‘V’ is Value, which represents the brightness of the
color. Therefore, “the images can be identified on the basis of Hue
values”. For this reason, the histogram of Hue is decomposed only up to level 2,
which retains more coefficients, while the histograms of S and V are decomposed
up to level 3.
In Haar wavelet decomposition, only the approximation coefficients are
considered because they preserve the fundamental shape of the histogram curve at
several levels of decomposition. The high frequency detail coefficients are ignored
as they only represent fast color changes of a noisy nature. The steps are depicted
in figure 7.1 and are explained as follows.
i) Convert RGB input image (𝐼) into HSV color space.
ii) Compute full 256 bins histograms of each color channel H, S and V.
iii) Perform 2-level Haar wavelet decomposition for H-histogram (𝐻ℎ)
and 3-level Haar wavelet decomposition for S-Histogram (𝐻𝑠) and V-
Histogram (𝐻𝑣).
iv) Select only the approximation coefficients of these wavelet
decompositions and concatenate them to form the final color feature
vector (𝑓𝑣𝑐).
The size of each histogram is 256. The 2-level wavelet decomposition of 𝐻ℎ
and 3-level decomposition of 𝐻𝑠 and 𝐻𝑣 result in approximation coefficient vectors
of lengths 64, 32 and 32 respectively. Therefore, the total length of the color
feature vector becomes 64 + 32 + 32 = 128.
In the proposed method, the Haar wavelet is used because the objective here is
only to reduce the size of the feature vector. Also, the Haar filter is the smallest
in size, leading to a very fast method of color feature extraction.
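
The reduction from 768 to 128 values can be sketched as follows. This is an illustrative sketch rather than the thesis implementation: it assumes the orthonormal Haar scaling (pairwise sum divided by √2) and histogram lengths that are powers of two, and the function names `haar_approx` and `color_feature_vector` are hypothetical.

```python
import math


def haar_approx(signal, levels):
    """Keep only the Haar approximation coefficients after `levels` decompositions.

    Each level halves the length: adjacent pairs are summed and scaled by
    1/sqrt(2); the detail (difference) coefficients are discarded.
    """
    out = list(signal)
    for _ in range(levels):
        out = [(out[i] + out[i + 1]) / math.sqrt(2.0) for i in range(0, len(out), 2)]
    return out


def color_feature_vector(hist_h, hist_s, hist_v):
    """Concatenate level-2 (H) and level-3 (S, V) approximations: 64 + 32 + 32 values."""
    return haar_approx(hist_h, 2) + haar_approx(hist_s, 3) + haar_approx(hist_v, 3)
```

With three 256-bin histograms as input, the result has 256/4 + 256/8 + 256/8 = 128 elements, matching the length stated above.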
Fig. 7.1: Wavelet based color feature extraction scheme
7.2.1.2 Texture Feature Extraction Scheme
Wavelets are good at capturing details of an image in “horizontal, vertical
and diagonal directions” even in the presence of noise, where EHD features alone
do not show good performance. Wavelets can also capture these details at various
resolution levels, similar to the human visual system, which EHD alone cannot do.
Therefore, an attempt is made to combine their capabilities to achieve better
retrieval results. In the proposed scheme, major image details are captured by the
wavelet transform of the image at three resolution levels. Dominant edges are
then captured by EHD to form the feature vector. The overall scheme is shown in
figure 7.2.
Fig. 7.2: Wavelet and EHD based texture feature extraction scheme
The edge details of the wavelet coefficients are captured by calculating the edge
histogram as shown in figure 7.2. The edge histogram is calculated by the method
already discussed in section 3.3.1.1 of chapter 3. However, 5 bins are added as
global features of the image. These 5 additional bins are obtained by averaging all
the 16 sets of 5 bins representing local image properties. Therefore, the length of
the edge histogram vector becomes 16 × 5 + 5 = 85 instead of 80. As can be seen
from figure 7.2, the edge histograms are calculated for the wavelet coefficients at
all three levels. Therefore, the total length of the texture feature vector becomes
3 × 85 = 255. The steps are as follows.
i) Decompose image (𝐼) up to level 3 and collect all the wavelet
coefficients at each resolution level.
ii) Arrange wavelet coefficients in a matrix 𝐶𝑚 for each level of
decomposition as shown in figure 7.2.
iii) Find Edge Histogram of matrix 𝐶𝑚 for each level of decomposition. The
edge histogram generates 85 values of texture features. Therefore, for
all the three levels, 255 values are generated.
iv) Concatenate all the edge histogram values to form a single texture
feature vector (𝑓𝑣𝑡) and store it in the feature database.
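
The 85-bin edge histogram used in step iii can be sketched as below. The method of section 3.3.1.1 is not reproduced in this chapter, so the sketch rests on several assumptions: the 2 × 2 edge filter coefficients commonly given for the MPEG-7 EHD are used, every non-overlapping 2 × 2 neighbourhood of the matrix is treated as one image block, the edge threshold of 11 is an assumed value, and the counts of each sub-image are normalized by its number of blocks. The name `edge_histogram_85` is hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
# Filters over a 2x2 block (top-left, top-right, bottom-left, bottom-right), in the
# order: vertical, horizontal, 45 degree, 135 degree, non-directional.
EDGE_FILTERS = [
    (1.0, -1.0, 1.0, -1.0),
    (1.0, 1.0, -1.0, -1.0),
    (SQRT2, 0.0, 0.0, -SQRT2),
    (0.0, SQRT2, -SQRT2, 0.0),
    (2.0, -2.0, -2.0, 2.0),
]


def edge_histogram_85(matrix, threshold=11.0):
    """Return 80 local bins (16 sub-images x 5 edge types) plus 5 global bins."""
    sub_h, sub_w = len(matrix) // 4, len(matrix[0]) // 4
    local = []
    for gr in range(4):          # 4 x 4 grid of sub-images
        for gc in range(4):
            counts = [0] * 5
            blocks = 0
            for r in range(gr * sub_h, (gr + 1) * sub_h - 1, 2):
                for c in range(gc * sub_w, (gc + 1) * sub_w - 1, 2):
                    block = (matrix[r][c], matrix[r][c + 1],
                             matrix[r + 1][c], matrix[r + 1][c + 1])
                    strengths = [abs(sum(f * x for f, x in zip(filt, block)))
                                 for filt in EDGE_FILTERS]
                    best = max(range(5), key=strengths.__getitem__)
                    if strengths[best] > threshold:  # weak blocks count as "no edge"
                        counts[best] += 1
                    blocks += 1
            local.append([cnt / blocks if blocks else 0.0 for cnt in counts])
    # Global bins: average of the 16 local 5-bin sets, as described in the text.
    global_bins = [sum(bins[k] for bins in local) / 16.0 for k in range(5)]
    return [value for bins in local for value in bins] + global_bins
```

Applying this function to the coefficient matrix of each of the three decomposition levels and concatenating the results gives a vector of 3 × 85 = 255 values.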
In the proposed method, the Db10 wavelet is used for texture feature extraction.
This is a higher order smooth wavelet, which is capable of capturing fast changing
(high frequency) details of images. Also, it has already been shown in chapter 6
that Db10 performs better than other wavelets for capturing edge details of
characters. Therefore, Db10 is utilized for the proposed CBIR scheme, as image
details are mainly captured in the form of edges.
7.2.2 Experimental Results and Analysis
The proposed scheme is tested on images of various categories drawn from two
popular databases: Wang’s image database [Wan2013] and the Microsoft Research
Cambridge Object Recognition Image Database [Mic2013]. “These are used for
training and testing the performance of the proposed scheme. Wang’s image
database is used in the implementation of the SIMPLIcity software package [Wan2001]
and is also used by the majority of researchers to show the performance of their
proposed CBIR schemes”.
“Wang’s image database consists of 1000 images in 10 categories
of 100 images each. These 10 categories and their sample images are shown in
table 7.2”.
Table 7.2: Wang’s Image Database
Category Sample images
Africans
Beaches
Buildings
Buses
Dinosaurs
Elephants
Roses
Horses
Mountains
Food
Table 7.3: Microsoft Research Image Database
Category Sample images
Aeroplanes
Cow
Sheep
Bicycles
Cars
Chimneys
Clouds
Doors
Cutlery
Landscapes
Trees
Windows
Each image in Wang’s database is an RGB color image of size 288 × 384
or 384 × 288. Microsoft’s image database consists of 4129 images in a total of 18
categories. Unlike Wang’s database, the images are not divided equally among the
categories: some categories have many images while others have very
few. Therefore, for testing purposes, a total of 12 categories are selected. These
categories along with sample images are shown in table 7.3. Each image in
Microsoft’s database is an RGB color image of size 640 × 480 or 480 × 640.
Before testing the proposed scheme, it is trained with images of Wang’s and
Microsoft’s image databases. The training procedure is shown in figure 7.3.
Fig. 7.3: Training procedure for proposed CBIR scheme
In the training procedure, “both color and texture features are calculated
for each image in the database and these feature vectors are stored in database”.
Similarly for testing, “both the color and texture feature vectors are calculated for
the query image and compared with feature vectors already stored in database”.
Then the first ′𝑛′ nearest matches are displayed as retrieval results. The testing
procedure is shown in figure 7.4. “The similarity check finds distance between
feature vector of query image and feature vectors stored in the database”. In the
proposed scheme, the nearest match is found using the sum of absolute differences
between feature vectors. This is calculated using equation 6.1 of chapter 6.
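
The similarity check can be sketched as follows. Equation 6.1 is not reproduced in this chapter, so the sketch assumes that the sum of absolute differences is the plain L1 distance between two feature vectors of equal length. The feature database is modelled as a dictionary from image identifier to feature vector, and the names `sad` and `retrieve` are hypothetical.

```python
def sad(u, v):
    """Sum of absolute differences (L1 distance) between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def retrieve(query_fv, database, n):
    """Return the ids of the n database entries nearest to the query vector.

    database maps image id -> stored feature vector.
    """
    ranked = sorted(database, key=lambda image_id: sad(query_fv, database[image_id]))
    return ranked[:n]
```

A smaller distance means a closer match, so the first identifiers in the returned list are the images that would be displayed first.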
Fig. 7.4: Testing procedure for proposed CBIR scheme
To test the performance of the proposed CBIR scheme, the following four test
procedures are followed.
Procedure 1 (P1): In this procedure, only color features are used.
Procedure 2 (P2): In this procedure, only texture features are used.
Procedure 3 (P3): This is a two-stage classification process which utilizes both the
color and texture features. Here, the color feature of the query image is used for
primary classification. The texture feature is then used to find the final results
from among the results of the primary classification.
Procedure 4 (P4): This is also a two-stage classification process where both the
features are used. Here, the texture feature of the query image is first used for
primary classification. The color feature is then used to find the set of final
results from among the results of the primary classification.
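
The two-stage procedures P3 and P4 can be sketched as a shortlist followed by a re-ranking. The text does not state how many results the primary stage keeps, so the `shortlist_size` parameter is an assumption, as are the function names and the layout of the database (a dictionary from image identifier to a dictionary holding "color" and "texture" vectors). Distances are again taken as sums of absolute differences.

```python
def sad(u, v):
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def two_stage_retrieve(query, database, first, second, shortlist_size, n):
    """Shortlist by the `first` feature, then re-rank the shortlist by `second`.

    query: {"color": [...], "texture": [...]}
    database: image id -> {"color": [...], "texture": [...]}
    P3 corresponds to first="color", second="texture"; P4 swaps the two.
    """
    by_first = sorted(database, key=lambda i: sad(query[first], database[i][first]))
    shortlist = by_first[:shortlist_size]  # results of the primary classification
    by_second = sorted(shortlist, key=lambda i: sad(query[second], database[i][second]))
    return by_second[:n]
```

Because the second stage only sees the shortlist, an image rejected by the primary feature cannot reappear in the final results, which is why P3 and P4 can return different images for the same query.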
“The results of relevant image retrieval are expressed in terms of Precision and
Recall, which are adopted by the majority of researchers. Precision and Recall are
defined as follows.
$$\text{Precision} = \frac{N_o}{N_r} \qquad (7.1)$$
$$\text{Recall} = \frac{N_o}{N_d} \qquad (7.2)$$
$N_o$: Number of relevant images retrieved.
$N_r$: Number of total images requested.
$N_d$: Total number of relevant images in database.
Both Precision and Recall are expressed in percentage terms. Larger values
of both precision and recall represent good performance of the CBIR system”.
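
Equations 7.1 and 7.2 can be evaluated with a short sketch. It assumes that the system returns exactly as many images as were requested, so that the length of the result list equals the number of images requested, and that the set of relevant images is known. The function name `precision_recall` is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) in percent.

    retrieved: list of image ids returned for a query (its length is N_r).
    relevant: set of all image ids in the database relevant to the query (N_d).
    """
    n_o = sum(1 for image_id in retrieved if image_id in relevant)  # N_o
    precision = 100.0 * n_o / len(retrieved)
    recall = 100.0 * n_o / len(relevant)
    return precision, recall
```

For example, if 10 images are requested from a category of 100 images and 8 of the returned images belong to that category, the precision is 80% and the recall is 8%.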
To test the retrieval performance, 10 test images from Wang’s database and
12 test images from Microsoft’s database are randomly selected, one from each
category of each database. These test images are used as query images and are
shown in figure 7.5(a) and (b).