Lecture 6: Multimedia Information Retrieval Dr. Jian Zhang NICTA & CSE UNSW COMP9314 Advanced Database S1 2007 [email protected]

Apr 17, 2018


Lecture 6: Multimedia Information Retrieval

Dr. Jian Zhang

NICTA & CSE UNSW

COMP9314 Advanced Database

S1 2007

[email protected]


COMP9314 Advanced Database Systems – Lecture 6 – Slide 2 – J Zhang

Reference Papers and Resources

Papers:

Colour spaces-perceptual, historical and applicational background: An overview of colour spaces used in image processing.

Colour indexing: using Histogram Intersection for object identification and Histogram Back-projection for object location.

Comparing Images Using Color Coherence Vectors: The original paper for CCV.

Using Perceptually Weighted Histograms for Colour-based Image Retrieval: The original paper for PWH.

The QBIC Project-Querying Images By Content Using Color, Texture, and Shape: The original paper for IBM QBIC project.

Useful resources

MPEG-7 homepage: http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm

IBM QBIC system homepage: http://wwwqbic.almaden.ibm.com/

UIUC CBIR system homepage: http://www.ifp.uiuc.edu/~qitian/MARS.html


6.1 Image Retrieval based on Texture

Texture

Introduction to texture feature

The concept of texture is intuitively obvious but has no precise definition

Texture can be described by its tone and structure

Tone – based on pixel intensity properties

Structure – describes spatial relationships of primitives


6.1 Image Retrieval based on Texture

Texture

MPEG-7 standard

The homogeneous texture descriptor (HTD). Two components of the HTD are computed during the extraction procedure:

Mean energy
Energy deviation

The 2-D frequency plane is partitioned into 30 frequency channels.

The syntax of the HTD is

$$HTD = [f_{DC}, f_{SD}, e_1, e_2, \ldots, e_{30}, d_1, d_2, \ldots, d_{30}]$$

where $f_{DC}$ and $f_{SD}$ are the mean and standard deviation of the image, respectively, and $e_i$ and $d_i$ are the nonlinearly scaled and quantized mean energy and energy deviation of the $i$-th channel.


6.1 Image Retrieval based on Texture

Texture

The frequency plane partitioning is uniform along the angular direction but not uniform along the radial direction.


6.1 Image Retrieval based on Texture

Texture

Each channel is modeled using a Gabor function.

Let a channel be indexed by (s,r), where s is the radial index and r is the angular index. The (s,r)-th channel in the frequency domain is

$$G_{s,r}(\omega,\theta) = \exp\left[\frac{-(\omega-\omega_s)^2}{2\sigma_s^2}\right] \cdot \exp\left[\frac{-(\theta-\theta_r)^2}{2\tau_r^2}\right]$$

where $\sigma_s$ and $\tau_r$ are the standard deviations of the Gaussian in the radial direction and the angular direction, respectively.
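The channel response is a product of two Gaussians, one in the radial frequency and one in the angle. A minimal Python sketch of evaluating it is given here; the function name, the parameter names `omega_s` and `theta_r` (taken to be the centre of the channel) and the numeric values are illustrative assumptions, and angular wrap-around at 360 degrees is ignored.

```python
import math

def gabor_channel(omega, theta, omega_s, theta_r, sigma_s, tau_r):
    """Evaluate G_{s,r}(omega, theta) for one channel in the polar frequency domain.

    omega_s, theta_r: assumed centre of the channel (radial frequency, angle).
    sigma_s, tau_r:   standard deviations in the radial and angular directions.
    """
    radial = math.exp(-((omega - omega_s) ** 2) / (2.0 * sigma_s ** 2))
    angular = math.exp(-((theta - theta_r) ** 2) / (2.0 * tau_r ** 2))
    return radial * angular

# Hypothetical channel parameters, chosen only for illustration.
peak = gabor_channel(0.5, 30.0, omega_s=0.5, theta_r=30.0, sigma_s=0.1, tau_r=15.0)
off_centre = gabor_channel(0.6, 30.0, omega_s=0.5, theta_r=30.0, sigma_s=0.1, tau_r=15.0)
```

At the channel centre both exponents are zero, so the response is 1; one radial standard deviation away it drops to exp(-1/2).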


6.2 Image Retrieval based on Texture

Texture

The energy of each channel is defined as the log-scaled sum of the squares of the Gabor-filtered Fourier transform coefficients of an image:

$$e_i = \log_{10}[1 + p_i]$$

where

$$p_i = \sum_{\omega=0^+}^{1} \sum_{\theta=(0^\circ)^+}^{360^\circ} \left[ G_{s,r}(\omega,\theta) \cdot \omega \cdot P(\omega,\theta) \right]^2$$


6.2 Image Retrieval based on Texture

Texture

$P(\omega,\theta)$ is the Fourier transform of the image represented in the polar frequency domain:

$$P(\omega,\theta) = F(\omega\cos\theta, \omega\sin\theta)$$

where $F(u,v)$ is the Fourier transform in the Cartesian coordinate system:

$$F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux/M + vy/N)}$$

The energy deviation of each feature channel is defined as the log-scaled standard deviation of the squares of the Gabor-filtered Fourier transform coefficients of an image:

$$d_i = \log_{10}[1 + q_i]$$

where

$$q_i = \sqrt{ \sum_{\omega=0^+}^{1} \sum_{\theta=(0^\circ)^+}^{360^\circ} \left\{ \left[ G_{s,r}(\omega,\theta) \cdot \omega \cdot P(\omega,\theta) \right]^2 - p_i \right\}^2 }$$

The HTD consists of the mean and standard deviation of the image intensity, together with the energy $e_i$ and the energy deviation $d_i$ of each feature channel.


6.2 Image Retrieval based on Texture

Texture

Texture [4] can also be defined as a function of the spatial variation in pixel intensities.

One example is to use statistical properties of the spatial distribution of the grey-levels of an image. Two types of statistical properties can be used: (1) first-order statistics and (2) second-order statistics.

First-order statistical measures depend only on the individual pixel grey-levels.

Define $L$ as the number of distinct grey levels.
Define $z$ as the random variable denoting the grey-level.
Define $p(z_i)$ as the probability of grey level $z_i$ occurring in the image.


6.2 Image Retrieval based on Texture

Texture

First-order statistical measures depend only on the individual pixel grey-levels. With $L$ the number of distinct grey levels, $z$ the random variable denoting the grey-level and $p(z_i)$ the probability of grey level $z_i$ occurring in the image, the measures are:

Overall mean:

$$m = \sum_{i=0}^{L-1} z_i\, p(z_i)$$

Overall standard deviation:

$$\sigma = \sqrt{ \sum_{i=0}^{L-1} (z_i - m)^2\, p(z_i) }$$

Skewness:

$$\mu_3(z) = \sum_{i=0}^{L-1} (z_i - m)^3\, p(z_i)$$

R-Inverse variance:

$$R = 1 - \frac{1}{1 + \sigma^2(z)}$$

Overall uniformity:

$$U = \sum_{i=0}^{L-1} p^2(z_i)$$

Overall entropy:

$$e = -\sum_{i=0}^{L-1} p(z_i) \log_{10} p(z_i)$$
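The first-order measures can be computed directly from the grey-level histogram. The sketch below is one possible reading of the six measures, assuming that the standard deviation is the square root of the second central moment, that the entropy uses a base-10 logarithm, and that grey levels that never occur contribute nothing to the entropy; the function name and dictionary keys are illustrative.

```python
import math
from collections import Counter

def first_order_stats(pixels):
    """First-order texture statistics of a flat list of grey-levels."""
    n = len(pixels)
    # p[z] is the probability of grey level z occurring in the image.
    p = {z: count / n for z, count in Counter(pixels).items()}
    mean = sum(z * pz for z, pz in p.items())
    variance = sum((z - mean) ** 2 * pz for z, pz in p.items())
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "skewness": sum((z - mean) ** 3 * pz for z, pz in p.items()),
        "r_inverse_variance": 1.0 - 1.0 / (1.0 + variance),
        "uniformity": sum(pz ** 2 for pz in p.values()),
        # Only grey levels that actually occur are in p, so log10 is defined.
        "entropy": -sum(pz * math.log10(pz) for pz in p.values()),
    }

stats = first_order_stats([0, 0, 1, 1])
```

For an image with two equally likely grey levels 0 and 1, the mean is 0.5, the uniformity is 0.5 and the entropy is log10(2).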


6.2 Image Retrieval based on Texture

The second-order statistics take into account the relationship between a pixel and its neighbours.

The Grey-level Co-occurrence Matrix (GLCM) is used to calculate the second-order statistics. Consider the following 4x4 pixel image with 3 distinct grey-levels:

$$\begin{bmatrix} 2 & 2 & 0 & 0 \\ 2 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$$

The position operator d = (dx, dy) = (1,0) means that co-occurrences are counted between each pixel and the pixel to its left.


6.2 Image Retrieval based on Texture

The 3x3 co-occurrence matrix is defined as follows. From the table, the element [0,0] of the GLCM is 4. That is the number of pixels with grey-level 0 that have a pixel with grey-level 0 to their left.
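The counting rule for the position operator (1,0) can be sketched in a few lines of Python. The image below reads the 16 digits of the example row by row, which is an assumption about the slide's layout; the function name and the convention that entry [a][b] counts pixels of level a whose left neighbour has level b are also illustrative choices.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count grey-level co-occurrences for the position operator (dx, dy).

    Entry [a][b] is the number of pixels with grey-level a whose neighbour
    dx columns to the left (and dy rows up) has grey-level b.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(rows):
        for x in range(cols):
            nx, ny = x - dx, y - dy
            # Pixels whose neighbour falls outside the image are skipped.
            if 0 <= nx < cols and 0 <= ny < rows:
                counts[image[y][x]][image[ny][nx]] += 1
    return counts

image = [
    [2, 2, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
matrix = glcm(image, levels=3)
```

Each of the four rows contains exactly one horizontally adjacent pair of 0s, which is where the value 4 in element [0,0] comes from. The twelve counted pairs (three per row) are all that a 4x4 image offers for this operator.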


6.2 Image Retrieval based on Texture

A symmetric GLCM can be computed by adding the GLCM to its transpose, that is, to the GLCM for the opposite position operator such as (-1,0).

The GLCM is then normalized by dividing each element by the total count in the matrix, which gives the co-occurrence probabilities.

Computing the GLCM over the full 256 grey-levels is very expensive, and it does not give a good statistical approximation because many cells have zero values.

16 linearly scaled grey-levels are commonly used in CBIR applications. The position operators in a CBIR system can be (1,0), (0,1), (1,1) and (-1,0).
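A possible sketch of the three steps described above (linear quantization to 16 levels, symmetrization and normalization) follows. The helper names are hypothetical, and the count matrix used as input is a hand-worked left-neighbour count for a small 3-level image, used only as sample data.

```python
def quantize(pixel, levels=16, depth=256):
    """Linearly scale a grey value in [0, depth) to one of `levels` bins."""
    return pixel * levels // depth

def symmetric_normalized_glcm(counts):
    """Add the GLCM to its transpose, then divide by the total count."""
    n = len(counts)
    sym = [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in sym)
    return [[value / total for value in row] for row in sym]

# Sample counts for a 3-level image with position operator (1,0).
counts = [[4, 0, 2], [2, 2, 0], [0, 0, 2]]
probs = symmetric_normalized_glcm(counts)
```

After normalization the entries sum to 1 and can be read as co-occurrence probabilities, which is the form the second-order statistics expect.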


6.2 Image Retrieval based on Texture

Based on the GLCM, the second-order statistics are then computed as follows, where $c_{ij}$ is the element [i,j] of the GLCM:

Angular Second Moment (Energy) $A$ measures the homogeneity of the image:

$$A = \sum_i \sum_j c_{ij}^2$$

Entropy $\delta$ has the same meaning as the first-order entropy, but is computed from the GLCM instead:

$$\delta = -\sum_i \sum_j c_{ij} \log_2 c_{ij}$$

Inverse Difference Moment (Homogeneity) $I$ is another measure of homogeneity, sometimes called local homogeneity:

$$I = \sum_i \sum_j \frac{c_{ij}}{1 + (i-j)^2}$$


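6.2 Image Retrieval based on Texture

Contrast (Inertia) $C$ measures how inhomogeneous the image is:

$$C = \sum_i \sum_j (i-j)^2 c_{ij}$$

Correlation $cor$ measures the linear dependency between the pairs of pixels:

$$cor = \frac{\sum_i \sum_j (i-\mu_x)(j-\mu_y)\, c_{ij}}{\sigma_x \sigma_y}$$

where

$$\mu_x = \sum_i i \Big[\sum_j c_{ij}\Big], \qquad \mu_y = \sum_j j \Big[\sum_i c_{ij}\Big]$$

$$\sigma_x = \sqrt{\sum_i (i-\mu_x)^2 \Big[\sum_j c_{ij}\Big]}, \qquad \sigma_y = \sqrt{\sum_j (j-\mu_y)^2 \Big[\sum_i c_{ij}\Big]}$$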
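The five GLCM measures (angular second moment, entropy, inverse difference moment, contrast and correlation) can be sketched together. This assumes the input is a normalized GLCM of co-occurrence probabilities, that zero cells are skipped in the entropy, and that the correlation is normalized by standard deviations (square roots of the marginal variances); the function name and dictionary keys are illustrative.

```python
import math

def glcm_features(c):
    """Second-order statistics of a normalized GLCM given as a list of rows."""
    n = len(c)
    cells = [(i, j, c[i][j]) for i in range(n) for j in range(n)]
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, _, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for _, j, v in cells))
    return {
        "asm": sum(v * v for _, _, v in cells),
        # Cells with probability 0 are skipped, since log2(0) is undefined.
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Undefined (division by zero) when only one grey level occurs.
        "correlation": sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells)
        / (sd_x * sd_y),
    }

# A GLCM with all mass on the diagonal: neighbours always share a grey level.
features = glcm_features([[0.5, 0.0], [0.0, 0.5]])
```

For this diagonal matrix the contrast is 0 and both the inverse difference moment and the correlation are 1, which matches the intuition that the texture is locally homogeneous.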


6.2 Image Retrieval based on Texture

Local Edge Histograms

The edge histogram descriptor (EHD) defined in MPEG-7 represents the local edge distribution in the image.

Specifically, the image is first divided into sub-images.

The local edge distribution of each sub-image can be represented by a histogram.

To generate the histogram, edges in the sub-images are categorized into five types: vertical, horizontal, 45-degree diagonal, 135-degree diagonal and non-directional edges. The histogram is then computed for each sub-image.

Since there are 16 sub-images, a total of 5x16 = 80 histogram bins are required.


6.2 Image Retrieval based on Texture

Local Edge Histograms

[Figure: An example of dividing an image into sub-images and 8x8 image blocks. A 384x256 image is divided into a 4x4 grid of sub-images, indexed (0,0) to (3,3), each of 96x64 pixels. Each sub-image is divided into 8x8 image blocks, and each image block into four 4x4 sub-blocks labelled a0, a1 (top) and a2, a3 (bottom).]


6.2 Image Retrieval based on Texture

Local Edge Histograms

EHD extraction:

Each sub-image is first converted to grey scale. The EHD calculation is based on image blocks, for example of 8x8 pixels.

A 384x256 image is divided into 16 sub-images, and each sub-image is further divided into 8x8 image blocks. Each image block is split into four sub-blocks whose average intensities are denoted a0, a1, a2 and a3.

The edge direction of a block is determined by calculating the edge magnitudes.


6.2 Image Retrieval based on Texture

EHD extraction

The direction with the largest edge magnitude is chosen as the edge direction of the block, provided that this magnitude is larger than the threshold.

If the magnitude is smaller than the threshold, the block is regarded as containing no edge; it is discarded and not counted when computing the histograms.

[Figure: The direction of the edge: m0 (horizontal), m45 (45°), m90 (vertical) and 135°.]


6.2 Image Retrieval based on Texture

EHD extraction

The edge magnitudes can be calculated by digital filtering as follows:

$$m_{90} = |a_0 - a_1 + a_2 - a_3|$$

$$m_{0} = |a_0 + a_1 - a_2 - a_3|$$

$$m_{45} = |\sqrt{2}\,a_0 - \sqrt{2}\,a_3|$$

$$m_{135} = |\sqrt{2}\,a_1 - \sqrt{2}\,a_2|$$

$$m_{non\text{-}directional} = |2a_0 - 2a_1 - 2a_2 + 2a_3|$$

After calculating the edge magnitudes for each image block, the 5 histogram bins for the sub-image are calculated.
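The block classification and per-sub-image histogram can be sketched as follows. This is a minimal illustration, assuming the magnitudes are absolute values, the diagonal filters use sqrt(2) coefficients as in the usual MPEG-7 formulation, and a0, a1, a2, a3 are the mean intensities of the top-left, top-right, bottom-left and bottom-right sub-blocks. The function names and the threshold value are hypothetical.

```python
import math

EDGE_TYPES = ("vertical", "horizontal", "diagonal_45", "diagonal_135",
              "non_directional")

def classify_block(a0, a1, a2, a3, threshold):
    """Return the edge type of one image block, or None if it has no edge."""
    r2 = math.sqrt(2.0)
    magnitudes = {
        "vertical": abs(a0 - a1 + a2 - a3),          # m90
        "horizontal": abs(a0 + a1 - a2 - a3),        # m0
        "diagonal_45": abs(r2 * a0 - r2 * a3),       # m45
        "diagonal_135": abs(r2 * a1 - r2 * a2),      # m135
        "non_directional": abs(2 * a0 - 2 * a1 - 2 * a2 + 2 * a3),
    }
    best = max(magnitudes, key=magnitudes.get)
    # Blocks whose strongest response does not exceed the threshold are dropped.
    return best if magnitudes[best] > threshold else None

def edge_histogram(blocks, threshold):
    """Five-bin edge histogram of one sub-image from its (a0, a1, a2, a3) blocks."""
    hist = dict.fromkeys(EDGE_TYPES, 0)
    for a0, a1, a2, a3 in blocks:
        edge = classify_block(a0, a1, a2, a3, threshold)
        if edge is not None:
            hist[edge] += 1
    return hist

hist = edge_histogram([(100, 0, 100, 0), (50, 50, 50, 50), (100, 100, 0, 0)], 10)
```

In the usage line, the first block is bright on the left and dark on the right (a vertical edge), the second is flat and is discarded, and the third is bright on top and dark below (a horizontal edge). Repeating this for all 16 sub-images yields the 80 bins.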


6.3 Image Indexing and Retrieval based on Shape

Shape

Basic concept of shape

The shape of an object or region refers to its profile and physical structure.

Shape is a low-level feature: the shape of objects within the images.

For retrieval based on shape, an image must be segmented into individual objects.

Due to the difficulty of robust and accurate image segmentation, the use of shape features for image retrieval has been limited to special applications where objects or regions are readily available.


6.3 Image Indexing and Retrieval based on Shape

Shape

Basic concept of shape

A good shape representation and similarity measurement for recognition and retrieval purposes should have the following two important properties:

Each shape should have a unique representation, invariant to translation, rotation and scale.

Similar shapes should have similar representations, so that retrieval can be based on the distance between shape representations.


6.3 Image Indexing and Retrieval based on Shape

Shape Representation

Boundary-based methods
Chain codes, line-segment fitting, Fourier descriptors, …

Region-based methods
Moments, orientation, …

Geometry-based methods
Perimeter measurement, area attribute, …

Structure-based methods
Medial axis transform (MAT): skeleton and thinning algorithm


6.3 Image Indexing and Retrieval based on Shape

Boundary-based methods -- Chain Code

Chain codes are used to represent a boundary by a connected sequence of straight-line segments of specified length and direction.

Typically, this representation is based on 4- or 8-connectivity of the segments. The direction of each segment is coded using a numbering scheme.

[Figure: Direction numbers for the 4-directional chain code and for the 8-directional chain code.]


6.3 Image Indexing and Retrieval based on Shape

Boundary-based methods -- Chain Code
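A short sketch of 8-directional chain coding is given here. The numbering used (0 for a step in +x, then counter-clockwise in 45-degree steps with y pointing up) is a common convention and an assumption, since the slide's direction figure is not reproduced in the text; the function and table names are illustrative.

```python
# Assumed direction numbers: 0 = +x, increasing counter-clockwise, y pointing up.
DIRECTIONS_8 = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def chain_code(points):
    """Encode a boundary, given as successive 8-connected points, as direction numbers."""
    code = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Raises KeyError if two successive points are not 8-neighbours.
        code.append(DIRECTIONS_8[(x1 - x0, y1 - y0)])
    return code

# A unit square traversed counter-clockwise, returning to the start point.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
code = chain_code(square)
```

The square yields the code 0, 2, 4, 6: one step right, up, left and down. A 4-directional code would use only the four axis-aligned entries of the table.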


6.3 Image Indexing and Retrieval based on Shape

Boundary-based methods -- Fourier Descriptors (FDs)

A shape is first represented by a feature function called a shape signature. A discrete Fourier transform is applied to the signature to obtain the FD of the shape in the frequency domain:

$$F_u = \frac{1}{N} \sum_{i=0}^{N-1} f(i) \exp\left[\frac{-j2\pi ui}{N}\right]$$

for u = 0 to N-1, where N is the number of samples of f(i).

Three commonly used signatures:
curvature-based
radius-based
boundary-coordinate-based


6.3 Image Indexing and Retrieval based on Shape

Boundary-based methods -- Fourier Descriptors (FDs)

The radius-based signature consists of a number of ordered distances from the shape centroid to boundary points (called radii). The radii are defined as

$$r_i = \sqrt{(x_c - x_i)^2 + (y_c - y_i)^2}$$

where $(x_c, y_c)$ are the coordinates of the centroid and $(x_i, y_i)$, for i = 0 to 63, are the coordinates of the 64 sample points along the shape boundary. The number of pixels between each two neighbouring sample points is the same.

A feature vector that is invariant to start point (p), rotation (r) and scale (s) should be calculated:

$$x = \left[\frac{|F_1|}{|F_0|}, \ldots, \frac{|F_{63}|}{|F_0|}\right]$$
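The radius-based descriptor can be sketched end to end: radii from the centroid, a discrete Fourier transform of the radii, and normalization of the coefficient magnitudes by the magnitude of the zero-frequency term. The sketch assumes that the centroid can be approximated by the mean of the boundary samples and that magnitudes are used in the ratios; the function names are illustrative.

```python
import cmath
import math

def radius_signature(boundary):
    """Distances from the centroid (here: mean of the samples) to each boundary point."""
    n = len(boundary)
    xc = sum(x for x, _ in boundary) / n
    yc = sum(y for _, y in boundary) / n
    return [math.hypot(xc - x, yc - y) for x, y in boundary]

def fourier_descriptor(signature):
    """Normalized DFT magnitudes [|F_1|/|F_0|, ..., |F_{N-1}|/|F_0|]."""
    n = len(signature)
    coeffs = [
        sum(f * cmath.exp(-2j * math.pi * u * i / n)
            for i, f in enumerate(signature)) / n
        for u in range(n)
    ]
    dc = abs(coeffs[0])  # mean radius; dividing by it removes the scale
    return [abs(c) / dc for c in coeffs[1:]]

# 64 samples on a circle of radius 2: every radius is equal.
circle = [(2 * math.cos(2 * math.pi * k / 64), 2 * math.sin(2 * math.pi * k / 64))
          for k in range(64)]
fd_circle = fourier_descriptor(radius_signature(circle))
```

Taking magnitudes discards the phase, which is what changes when the start point moves or the shape rotates, and the division by the zero-frequency magnitude cancels a uniform scaling of the radii. For the circle, all 63 components are numerically zero.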


6.3 Image Indexing and Retrieval based on Shape

Boundary-based methods -- Fourier Descriptors (FDs)

The distance between shapes is calculated as the Euclidean distance between their feature vectors.

The purpose of using FDs is to convert the sensitive radius lengths into the frequency domain, where the data is more robust to small changes and noise.

The FDs capture the general features and form of the shape rather than each individual detail.


6.3 Image Indexing and Retrieval based on Shape

Region-based shape representation and similarity measure

Shape similarity measurements based on shape representations do not, in general, conform to human perception.

The following similarity measurements do not match well with human similarity judgment:

Algebraic spline curve distance
Cumulative turning angle
Sign of curvature
Hausdorff distance


6.3 Image Indexing and Retrieval based on Shape

Region-based shape representation and similarity measure

Basic idea of region-based shape representation

As shown in the figure below, a grid is placed over the shape. A 1 is assigned to each cell in which at least 15% of the pixels are covered by the shape, and a 0 to each of the other cells. The more grid cells, the more accurate the shape representation.

A binary sequence is created by scanning from left to right and top to bottom: 11100000, 11111000, 01111110, 01111111.

[Figure: Generation of a binary sequence for a shape.]
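The grid-based encoding can be sketched for a shape given as a binary pixel mask. The function name, the mask representation (1 for shape pixels, 0 for background) and the small example are illustrative assumptions; the 15% coverage rule is the one stated above.

```python
def grid_sequence(mask, cell, min_cover=0.15):
    """Scan a binary mask in cell x cell blocks, left to right and top to bottom.

    Returns one bit string per grid row; a cell is 1 when at least
    min_cover of its pixels belong to the shape.
    """
    rows, cols = len(mask), len(mask[0])
    sequence = []
    for top in range(0, rows, cell):
        bits = ""
        for left in range(0, cols, cell):
            ys = range(top, min(top + cell, rows))
            xs = range(left, min(left + cell, cols))
            covered = sum(mask[y][x] for y in ys for x in xs)
            bits += "1" if covered / (len(ys) * len(xs)) >= min_cover else "0"
        sequence.append(bits)
    return sequence

# 4x4 mask with 2x2 cells: the lower-left cell is only 25% covered.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
sequence = grid_sequence(mask, cell=2)
```

The lower-left cell has one shape pixel out of four, which is above the 15% limit, so it is still coded as 1. A smaller cell size gives more bits and a finer description of the boundary.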


6.3 Image Indexing and Retrieval based on Shape

Rotation normalization

Rotate the shape so that its major axis is parallel to the x-axis. There are two possible orientations.

Only one of the binary sequences is saved; the two orientations are accounted for at retrieval time by representing the query shape using two binary sequences.

[Figure: Two possible orientations with the major axis along the x direction.]


6.3 Image Indexing and Retrieval based on Shape

Scale normalization

All shapes are scaled so that their major axes have the same fixed length.

Unique shape representation: shape index

After rotation and scale normalization and selection of a grid cell size, a unique binary sequence is obtained for each shape, based on its unique major axis.

This binary sequence is used as an index of the shape.

Once the cell size is decided, the number of grid cells in the x direction is fixed (e.g. 8). The number of cells in the y direction depends on the eccentricity of the shape and can range from 1 to 8.


6.3 Image Indexing and Retrieval based on Shape

Similarity measure between two shapes based on their indexes

Based on the shape eccentricities, there are three cases for similarity measurement

Same basic rectangle for the two normalized shapes: compare the indexes bitwise and take the number of positions at which they differ as the distance. For example:

A and B have the same eccentricity of 4.

A = 11111111 11100000 and B = 11111111 11111100; the distance between A and B is 3.

If two normalized shapes have very different basic rectangles, we can assume that the two shapes are quite different (i.e. they differ on the minor axis).


6.3 Image Indexing and Retrieval based on Shape

If two normalized shapes have slightly different basic rectangles, perceptual similarity is still possible.

Add 0s at the end of the index of the shape with the shorter minor axis, to extend it to the same length as the index of the other shape.

Example: A = (2, 11111111 11110000) and B = (3, 11111111 11111000 11100000). The binary number of shape A is extended to the same length as that of B, giving A = (3, 11111111 11110000 00000000). The distance between A and B is 4.
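The bitwise comparison with zero padding can be sketched in a few lines. The function name is illustrative, and the indexes are passed as bit strings with optional spaces between rows; the case of very different basic rectangles, where the shapes are simply declared different, is left out because no cut-off is stated.

```python
def shape_distance(index_a, index_b):
    """Number of differing bits between two shape indexes.

    The shorter index is padded with 0s at the end so both have equal length.
    """
    a = index_a.replace(" ", "")
    b = index_b.replace(" ", "")
    length = max(len(a), len(b))
    a = a.ljust(length, "0")
    b = b.ljust(length, "0")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))

# Same basic rectangle (two rows of 8 cells each).
same_rectangle = shape_distance("11111111 11100000", "11111111 11111100")

# Slightly different rectangles: A has 2 rows, B has 3 rows.
padded = shape_distance("11111111 11110000", "11111111 11111000 11100000")
```

The first call reproduces the distance of 3 from the equal-rectangle example, and the second reproduces the distance of 4 after A is extended with eight 0s.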


6.4 Data Structure for Efficient Multimedia Similarity Search

Introduction

Retrieval is based on the similarity between the query vector and the feature vectors.

If the feature dimensionality is high and the number of stored objects is huge, a linear search over all feature vectors is too slow.

Techniques and data structures are required to reorganize the feature vectors and to develop fast search methods that locate the relevant features quickly.

The main idea is to divide the high-dimensional feature vector space into many sub-spaces and to focus the search on one or a few of them.


6.4 Data Structure for Efficient Multimedia Similarity Search

Three common queries:

Point query: the user's query is represented as a vector.
Objects whose feature vectors exactly match the query are retrieved.

Range query: the user's query is represented as a feature vector and a distance range.
The distance metric can be, for example, L1 or L2 (Euclidean distance).

k nearest neighbours query: the user's query is specified by a vector and an integer k.
The k objects whose distances from the query are the smallest are retrieved.
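The three query types can be illustrated with a plain linear scan, which is the baseline that index structures aim to improve on. The database layout (a dictionary from object name to feature vector), the function names and the sample vectors are all assumptions made for this sketch.

```python
import heapq

def minkowski(a, b, p=2):
    """L_p distance between two equal-length vectors (p=1: L1, p=2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def point_query(db, query):
    """Objects whose feature vector exactly matches the query vector."""
    return [name for name, vec in db.items() if vec == query]

def range_query(db, query, radius, p=2):
    """Objects within the given distance of the query vector."""
    return [name for name, vec in db.items() if minkowski(vec, query, p) <= radius]

def knn_query(db, query, k, p=2):
    """The k objects closest to the query vector, nearest first."""
    return heapq.nsmallest(k, db, key=lambda name: minkowski(db[name], query, p))

db = {"a": (0, 0), "b": (3, 4), "c": (1, 0)}
exact = point_query(db, (0, 0))
nearby = range_query(db, (0, 0), radius=1.0)
nearest = knn_query(db, (0, 0), k=2)
```

Every call touches all stored vectors, so the cost grows linearly with the number of objects and with the feature dimensionality.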


6.4 Data Structure for Efficient Multimedia Similarity Search -- Filtering Process

Query methods based on color histograms

Use histograms with very few bins to select potential retrieval candidates.
Then use the full histograms to calculate the distance.
As a special case, calculate the average RGB value

$$x = (R_{avg}, G_{avg}, B_{avg})^T$$

where

$$A_{avg} = \frac{1}{P} \sum_{p=1}^{P} A(p), \qquad A \in \{R, G, B\}$$

with the sum taken over the $P$ pixels of the image.

Given the average color vectors $x$ and $y$ of two images, the Euclidean distance is

$$d_{avg}(x, y) = \sqrt{\sum_{i=1}^{3} (x_i - y_i)^2}$$
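The average-color filter step can be sketched as follows. It assumes images are given as lists of (R, G, B) tuples and that the distance is the square root of the summed squared differences; the function names, the `keep` parameter and the sample averages are hypothetical. The second stage, re-ranking the surviving candidates with full histograms, is not shown.

```python
import math

def average_color(pixels):
    """Average (R, G, B) vector of an image given as a list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(px[ch] for px in pixels) / n for ch in range(3))

def d_avg(x, y):
    """Euclidean distance between two average color vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def filter_candidates(query_avg, averages, keep):
    """Names of the `keep` images whose average color is closest to the query."""
    ranked = sorted(averages, key=lambda name: d_avg(query_avg, averages[name]))
    return ranked[:keep]

# Precomputed average colors of three stored images (made-up values).
averages = {"red": (255, 0, 0), "dark_red": (200, 0, 0), "blue": (0, 0, 255)}
candidates = filter_candidates((250, 0, 0), averages, keep=2)
```

Comparing 3-component vectors is far cheaper than comparing full histograms, so the expensive distance only needs to be evaluated on the few candidates that pass the filter.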


6.5 Data Structure for Efficient Multimedia Similarity Search – B+ Tree

To achieve an efficient query process:

The weakness of traditional similarity calculation on feature vectors is that the search space is scanned sequentially.

A B+ tree is a hierarchical structure with a number of nodes that store the feature vectors.

[Figure: B+ tree whose leaf pointers lead to record 10, record 20 and record 60.]


6.5 Data Structure for Efficient Multimedia Similarity Search – B+ Tree

Multidimensional B+ Tree

Each feature vector has two dimensions. The entire feature space is a large rectangle identified by its lower-left and top-right corners.

Each key value is replaced with a rectangular region.

The pointers of the leaf nodes point to lists of feature vectors within the corresponding rectangular regions.

[Figure: Multidimensional B+ tree. The root holds region D1,2; the second level holds D1,0 and D2,1; the leaf level holds regions D0,0, D0,1, D1,0, D1,1, D1,2, D2,0, D2,1 and D3,0, whose pointers lead to the lists L0,0, L0,1, L1,0, L1,1, L1,2, L2,0, L2,1 and L3,0.]


6.6 Similarity Comparison

Given two feature vectors I and J, the distance is defined as D(I,J) = f(I,J).

Typical similarity metrics:

Lp (Minkowski distance)
χ² metric
KL (Kullback-Leibler Divergence)
JD (Jeffrey Divergence)
QF (Quadratic Form)
EMD (Earth Mover's Distance)
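Three of the listed metrics are sketched below for histogram-like feature vectors. The formulas are commonly used definitions and are an assumption here, since the slide only names the metrics: the Jeffrey divergence is taken as the symmetric form that compares each vector with the bin-wise mean of the two. The function names are illustrative.

```python
import math

def minkowski(i, j, p):
    """L_p (Minkowski) distance; p=1 is the L1 and p=2 the Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(i, j)) ** (1.0 / p)

def kl_divergence(i, j):
    """Kullback-Leibler divergence of I from J (not symmetric).

    Assumes both are normalized and j has no zero bin where i is non-zero.
    """
    return sum(a * math.log(a / b) for a, b in zip(i, j) if a > 0)

def jeffrey_divergence(i, j):
    """Symmetric divergence measured against the bin-wise mean of I and J."""
    total = 0.0
    for a, b in zip(i, j):
        m = (a + b) / 2.0
        if a > 0:
            total += a * math.log(a / m)
        if b > 0:
            total += b * math.log(b / m)
    return total

h1 = (0.5, 0.5)
h2 = (0.25, 0.75)
l2 = minkowski((0, 0), (3, 4), 2)
```

Unlike the KL divergence, the Jeffrey divergence gives the same value when the two arguments are swapped, and it stays finite when one vector has empty bins.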