Texture and Shape for Image Retrieval – Multimedia Analysis and Indexing
Winston H. Hsu
National Taiwan University, Taipei
October 23, 2007
Office: R512, CSIE Building
Communication and Multimedia Lab (通訊與多媒體實驗室)
http://www.csie.ntu.edu.tw/~winston
MMAI, Fall 07 - Winston Hsu, NTU
Outline
  Texture
    Statistical features
    Spectral features
  Edge
  Shape
Reminder
Homework #2
  Due: TA@501 (noon, Tuesday, November 13)
  Rule – “deliver quality work on time with integrity!!”
Midterm
  A small recap of what we mentioned (major literature)
  High-level concepts mentioned in the course
  Open book (no computer), but no print-outs are required
Mailing list
  http://cmlmail.csie.ntu.edu.tw/mailman/listinfo/mmai
Fusion of Multimodal Features
How to weigh the significance of each feature?
  Cross-validation approach
  User-selected
  Automatic weighting by relevance feedback
[Figure: retrieval results by different features; axes: ranking, score]
Fusion approaches such as:
  Sum (Borda fuse)
  WtSum (weighted Borda fuse)
  Max (round-robin)
* From Kieran Mc Donald
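A minimal sketch of the three fusion rules listed above, assuming every feature returns a ranked list over the same items. The function names, the Borda point scheme (best item gets n-1 points, last gets 0), the reading of Max as "best score over all features", and the tie-breaking by item name are illustrative assumptions, not taken from the slides.

```python
def borda_points(ranking):
    """Map each item to Borda points: first place gets n-1, last place gets 0."""
    n = len(ranking)
    return {item: n - 1 - pos for pos, item in enumerate(ranking)}


def fuse(rankings, weights=None, mode="sum"):
    """Fuse ranked lists (one per feature, same items) into a single ranking."""
    if weights is None:
        weights = [1.0] * len(rankings)
    points = [borda_points(r) for r in rankings]
    items = sorted(points[0])
    if mode == "sum":
        # Sum / WtSum: (weighted) sum of the Borda points over all features.
        score = {i: sum(w * p[i] for w, p in zip(weights, points)) for i in items}
    elif mode == "max":
        # Max: keep the best score any single feature gives the item.
        score = {i: max(p[i] for p in points) for i in items}
    else:
        raise ValueError("unknown mode: " + mode)
    # Highest score first; ties are broken by item name to stay deterministic.
    return sorted(items, key=lambda i: (-score[i], i))


# Hypothetical rankings of four images by three features.
color = ["a", "b", "c", "d"]
texture = ["b", "c", "a", "d"]
shape = ["b", "a", "d", "c"]

print(fuse([color, texture, shape]))                     # ['b', 'a', 'c', 'd']
print(fuse([color, texture, shape], weights=[4, 1, 1]))  # ['a', 'b', 'c', 'd']
print(fuse([color, texture, shape], mode="max"))         # ['a', 'b', 'c', 'd']
```

Raising the weight of the color feature moves its top item to the front, which is the effect the weighting question on this slide is about.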
Texture
What is texture?
  Has structure or repetitive patterns, e.g., a checkerboard
  Has statistical patterns, e.g., grass, sand, rock
Why texture?
  Applications to satellite images, medical images
  Describes contents of real-world images, e.g., clouds, fabrics, surfaces, wood, stone
Data set
  e.g., Brodatz: famous texture photographs for image-texture analysis
  Man-made textures & natural objects
Mosaic of Brodatz Texture
Types of Computational Texture Features
Structural – describing arrangement of texture elements
Statistical – characterizing texture in terms of statistical features
  Co-occurrence matrix
  Tamura (coarseness, directionality, contrast)
  Multiresolution simultaneous autoregressive model (MRSAR)
  Edge histogram
Spectral – based on analysis in the spatial-frequency domain
  Fourier domain energy distribution
  Gabor
  Pyramid-structured wavelet transform (PWT)
  Tree-structured wavelet transform (TWT)
  Laws filter
Co-occurrence Matrix
Co-occurrence matrix Cd
  Specified with a displacement vector d = (row, column)
  Entry Cd(i, j) indicates how many times a pixel with gray level i is separated from a pixel of gray level j by the displacement vector d
  Usually use the normalized version of Cd
  Sometimes use the symmetric version of Cd
  d = (1, 1): physical meaning?
Co-occurrence Matrix (cont.) Examples
* From Prof. Leow Wee Kheng, NUS
Co-occurrence Matrix (cont.)
Consider the following example (black = 1, white = 0)
  For d = (1,1), the only non-zero entries are at (0,0) and (1,1): captures diagonal structure
  For d = (0,1), the only non-zero entries are at (0,1) and (1,0): captures horizontal structure
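The example image is not preserved in this transcript. As a rough sketch, assume it is a checkerboard, which is consistent with the entries reported above. The function names and the 4 x 4 size are illustrative choices.

```python
def cooccurrence(img, d, levels):
    """Count pairs (i, j): gray level i at (r, c) and gray level j at (r, c) + d."""
    dr, dc = d  # displacement as (row, column)
    rows, cols = len(img), len(img[0])
    C = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                C[img[r][c]][img[r2][c2]] += 1
    return C


def normalize(C):
    """Normalized version: entries sum to 1 and can be read as probabilities."""
    total = sum(sum(row) for row in C)
    return [[v / total for v in row] for row in C]


# Assumed example: 4 x 4 checkerboard, black = 1, white = 0.
board = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(cooccurrence(board, (1, 1), 2))  # [[5, 0], [0, 4]]
print(cooccurrence(board, (0, 1), 2))  # [[0, 6], [6, 0]]
```

With d = (1,1) each pixel is compared with its diagonal neighbor, which always has the same color on a checkerboard, so only (0,0) and (1,1) are filled. With d = (0,1) the horizontal neighbor always has the opposite color.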
Co-occurrence Matrix (cont.)
Measures on the following features
  What does it mean that entropy takes its largest value when all the Nd(i,j) are equal?
An almost-obsolete feature
  Not effective for classification and retrieval
  Expensive to compute
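The entropy question can be checked numerically. This sketch assumes the usual definition, entropy = -sum of Nd(i,j) log2 Nd(i,j) over the normalized matrix; the formula itself is not shown in the text, and the two matrices are made-up examples.

```python
import math


def entropy(N):
    """Entropy (in bits) of a normalized co-occurrence matrix; 0 log 0 is taken as 0."""
    return -sum(p * math.log2(p) for row in N for p in row if p > 0)


uniform = [[0.25, 0.25], [0.25, 0.25]]   # every gray-level pair equally likely
diagonal = [[0.5, 0.0], [0.0, 0.5]]      # only equal-level pairs occur

print(entropy(uniform))   # 2.0
print(entropy(diagonal))  # 1.0
```

Equal entries give the maximum, log2 of the number of entries: no gray-level pairing is preferred at displacement d, so the matrix shows no structure in that direction.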
Tamura – Selected Textural Properties
fine / coarse
high contrast / low contrast
roughness / smooth
directional / non-directional
line-like / blob-like
regular / irregular
Psychophysical experiments – high correlation between some groups of properties
  Coarseness
  Contrast
  Roughness
Vector Space Concept
Orthonormal bases (d-dim. vectors)
Any vector in a vector space can be expanded by the set of orthonormal signals
Response for basis k,
Transform to the new bases
(1D/2D) Fourier bases are sets of orthonormal signals
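A small numerical sketch of the idea above, assuming the response for basis k is the inner product of the vector with basis element k, and using the 1D discrete Fourier basis as the orthonormal set. The names and the sample vector are illustrative.

```python
import cmath
import math


def fourier_basis(d):
    """d orthonormal vectors b_k[n] = exp(i 2 pi k n / d) / sqrt(d)."""
    return [[cmath.exp(2j * math.pi * k * n / d) / math.sqrt(d) for n in range(d)]
            for k in range(d)]


def inner(x, b):
    """Inner product <x, b> = sum_n x[n] * conj(b[n])."""
    return sum(xn * bn.conjugate() for xn, bn in zip(x, b))


x = [1.0, 2.0, 0.0, -1.0]
basis = fourier_basis(len(x))

# Response for basis k: project x onto each basis vector.
responses = [inner(x, b) for b in basis]

# Expanding x in the basis: x[n] = sum_k response_k * b_k[n].
reconstructed = [sum(c * b[n] for c, b in zip(responses, basis)) for n in range(len(x))]
```

Because the basis is orthonormal, the responses alone are enough to rebuild x, which is what "transform to the new bases" relies on.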
The Fourier Transform
$$F(g(x,y))(u,v) = \iint_{\mathbb{R}^2} g(x,y)\, e^{-i 2\pi (ux + vy)}\, dx\, dy$$
Represent function on a new basis
  Think of functions as vectors, with many components
  We now apply a linear transformation to transform the basis
    dot product with each basis element
In the expression, u and v select the basis element, so a function of x and y becomes a function of u and v
Basis elements have the form $e^{-i 2\pi (ux + vy)}$
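As a rough discrete illustration of "u and v select the basis element", the sketch below computes a direct 2D DFT of a small sinusoidal pattern and lists where the transform is non-zero. The grid size, the pattern, and the function name are assumptions made for the example; the direct sum is only practical for tiny images.

```python
import cmath
import math


def dft2(g):
    """Direct 2D DFT: F[u][v] = sum_{x,y} g[x][y] exp(-i 2 pi (u x / rows + v y / cols))."""
    rows, cols = len(g), len(g[0])
    return [[sum(g[x][y] * cmath.exp(-2j * math.pi * (u * x / rows + v * y / cols))
                 for x in range(rows) for y in range(cols))
             for v in range(cols)]
            for u in range(rows)]


n = 8
# Sinusoid with 1 cycle along x and 2 cycles along y over the n x n grid.
g = [[math.cos(2 * math.pi * (1 * x + 2 * y) / n) for y in range(n)] for x in range(n)]
F = dft2(g)

# A real cosine is the sum of two complex exponentials, so two coefficients survive.
peaks = sorted((u, v) for u in range(n) for v in range(n) if abs(F[u][v]) > 1e-6)
print(peaks)  # [(1, 2), (7, 6)]
```

The second peak (7, 6) is the negative frequency (-1, -2) wrapped around the grid, the mirror image of (1, 2).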
Visual Sinus Pattern*
*The following 5 slides are from Jaap van de Loosdrecht, Noordelijke Hogeschool Leeuwarden
Visual Sinus Pattern w/ Low Frequency
Sinus Pattern Rotated 45 Deg.
2D Sinus Pattern
Difference in spatial vs. frequency domain
  1D sinc function of different scales
  2D rectangle
Interpreting the Power Spectrum
Explain structures in power spectrum
[Figure: power spectrum with labels DC, high frequency, low frequency; numbered structures 1, 2, 3; bright, dark]
Phase and Magnitude
Fourier transform of a real function is complex
  difficult to plot, visualize
  instead, we can think of the phase and magnitude of the transform
Phase is the phase of the complex transform
Magnitude is the magnitude of the complex transform
Curious fact
  all natural images have roughly similar magnitude transforms
  hence, phase seems to matter, but magnitude largely doesn’t
  Same for audio?
Demonstration
  Take two pictures, swap the phase transforms, compute the inverse: what does the result look like?
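The demonstration can be sketched on tiny synthetic "pictures". The stripe images and function names below are placeholders, and the direct DFT is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft2(g, inverse=False):
    """Direct 2D DFT, or its inverse (positive exponent, scaled by 1 / (rows * cols))."""
    rows, cols = len(g), len(g[0])
    sign = 1 if inverse else -1
    scale = 1.0 / (rows * cols) if inverse else 1.0
    return [[scale * sum(g[x][y] * cmath.exp(sign * 2j * math.pi * (u * x / rows + v * y / cols))
                         for x in range(rows) for y in range(cols))
             for v in range(cols)]
            for u in range(rows)]


def swap_phase(a, b):
    """Reconstruct an image from the magnitude transform of a and the phase transform of b."""
    A, B = dft2(a), dft2(b)
    mixed = [[abs(A[u][v]) * cmath.exp(1j * cmath.phase(B[u][v]))
              for v in range(len(A[0]))]
             for u in range(len(A))]
    # Keep the real part; any imaginary residue comes from rounding or zero coefficients.
    return [[z.real for z in row] for row in dft2(mixed, inverse=True)]


vertical = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]                  # stand-in for picture 1
horizontal = [[float(r % 2)] * 4 for r in range(4)]                  # stand-in for picture 2

hybrid = swap_phase(vertical, horizontal)   # magnitude of picture 1, phase of picture 2
same = swap_phase(vertical, vertical)       # sanity check: reproduces picture 1
```

On real photographs the reconstruction tends to resemble the picture that supplied the phase, which is the point the slides that follow illustrate with the zebra and cheetah images.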
This is the magnitude transform of the zebra pic
This is the phase transform of the zebra pic
This is the magnitude transform of the cheetah pic
This is the phase transform of the cheetah pic
Reconstruction with zebra phase, cheetah magnitude
Reconstruction with cheetah phase, zebra magnitude
Natural Images and Their FT
What happens to the FT patterns when the texture scale and orientation are changed?
Frequency Domain Features
Fourier domain energy distribution
  Angular features (directionality)
    where,
  Radial features (coarseness)
    where,
  Uniform division may not be the best!!
[Figure: FT]
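The formulas for the angular and radial features are not preserved in this text. One common reading, assumed here, sums the power |F(u,v)|^2 over angular wedges (directionality) and over radial rings (coarseness). The function names, the uniform bin layout, and the toy spectrum are illustrative.

```python
import math


def signed(k, n):
    """Map DFT index k in 0..n-1 to a signed frequency in -n/2..n/2."""
    return k if k <= n // 2 else k - n


def radial_angular_energy(power, n_rings, n_wedges):
    """Sum power[u][v] = |F(u,v)|^2 into uniform radial rings and angular wedges."""
    n = len(power)
    r_max = math.hypot(n // 2, n // 2)
    rings = [0.0] * n_rings
    wedges = [0.0] * n_wedges
    for u in range(n):
        for v in range(n):
            fu, fv = signed(u, n), signed(v, n)
            if fu == 0 and fv == 0:
                continue  # skip the DC term
            r = math.hypot(fu, fv)
            theta = math.atan2(fv, fu) % math.pi  # orientation in [0, pi)
            ring = min(int(r / r_max * n_rings), n_rings - 1)
            wedge = min(int(theta / math.pi * n_wedges), n_wedges - 1)
            rings[ring] += power[u][v]
            wedges[wedge] += power[u][v]
    return rings, wedges


# Toy power spectrum of a single oriented sinusoid on an 8 x 8 grid.
n = 8
power = [[0.0] * n for _ in range(n)]
power[1][2] = power[7][6] = 1024.0

rings, wedges = radial_angular_energy(power, n_rings=4, n_wedges=4)
print(rings)   # [0.0, 2048.0, 0.0, 0.0]
print(wedges)  # [0.0, 2048.0, 0.0, 0.0]
```

The bins here are uniform only for simplicity; as the slide notes, a uniform division may not be the best choice.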
Gabor Texture
Fourier coefficients depend on the entire image (global), so we lose spatial information
Objective: local spatial frequency analysis
Gabor kernels: look like a Fourier basis multiplied by a Gaussian
  The product of a symmetric (even) Gaussian with an oriented sinusoid
  Gabor filters come in pairs: symmetric and anti-symmetric (odd)
  Each pair recovers symmetric and anti-symmetric components in a particular direction
  (kx, ky): the spatial frequency to which the filter responds strongly
  σ: the scale of the filter. When σ = infinity, similar to FT
We need to apply a number of Gabor filters at different scales, orientations, and spatial frequencies
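A minimal sketch of one Gabor pair, assuming the form exp(-(x^2 + y^2) / (2 σ^2)) times cos(kx x + ky y) for the symmetric kernel and the same envelope times sin(kx x + ky y) for the anti-symmetric one. The function name, the unnormalized envelope, and the parameter values are illustrative assumptions.

```python
import math


def gabor_pair(kx, ky, sigma, half):
    """Return (even, odd) kernels on a (2*half+1) x (2*half+1) grid, indexed [y+half][x+half]."""
    even, odd = [], []
    for y in range(-half, half + 1):
        even_row, odd_row = [], []
        for x in range(-half, half + 1):
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))  # Gaussian window
            phase = kx * x + ky * y                                     # oriented sinusoid
            even_row.append(envelope * math.cos(phase))
            odd_row.append(envelope * math.sin(phase))
        even.append(even_row)
        odd.append(odd_row)
    return even, odd


half = 4
even, odd = gabor_pair(kx=0.8, ky=0.3, sigma=2.0, half=half)
```

A larger σ widens the Gaussian window, so the kernel approaches a plain Fourier basis element, matching the remark about σ = infinity.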
Example – Gabor Kernel
[Figure: Gabor kernel; zebra image; magnitude of the filtered image]
Zebra stripes at different scales and orientations, convolved with the Gabor kernel
  The response falls off when the stripes are larger or smaller
  The response is large when the spatial frequency of the bars roughly matches that windowed by the Gaussian in the Gabor kernel
  Local spatial frequency analysis
Gabor Texture (cont.)
Image I(x,y) convolved with Gabor filters hmn (M x N in total)
Using first and 2nd moments for each scale and orientation
Arranging the mean energy in a 2D form
  structured: localized pattern
  oriented (or directional): column pattern
  granular: row pattern
  random: random pattern
[Figure: mean energy arranged by orientation and scale; frequency domain]
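A sketch of how the moments could be collected into a feature vector, reading "first and 2nd moments" as the mean and standard deviation of the response magnitude for each filter. The filtering step is omitted; the response maps below are made-up numbers, and the function names are illustrative.

```python
import math


def moments(response):
    """Mean and standard deviation of the magnitude of one filtered image."""
    values = [abs(v) for row in response for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std


def gabor_feature(responses):
    """responses[m][n] is the image filtered at scale m and orientation n."""
    feature = []
    for per_scale in responses:
        for response in per_scale:
            feature.extend(moments(response))
    return feature


# Hypothetical responses for 1 scale x 2 orientations on a 2 x 2 image.
responses = [[
    [[1.0, -1.0], [1.0, -1.0]],
    [[0.0, 2.0], [0.0, 2.0]],
]]

print(gabor_feature(responses))  # [1.0, 0.0, 1.0, 1.0]
```

With M scales and N orientations the vector has 2 x M x N entries; keeping only the means and arranging them by scale and orientation gives the 2D form described above.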
Laws Texture Energy Features
Non-Fourier type bases
Match better to intuitive texture features
The filter algorithm
  Filter the input image using texture filters
  Compute texture energy by summing the absolute value of filtered results in local neighborhoods around each pixel
  Combine features to achieve rotational invariance
Laws’ Texture Masks (1)
Basic 1D masks can be extended to create 2D masks
  L5 (Level) = [ 1 4 6 4 1 ] (Gaussian): gives a center-weighted local average
  E5 (Edge) = [ -1 -2 0 2 1 ] (gradient): responds to row or column step edges
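A small sketch of how the 1D masks extend to 2D and feed the energy step, assuming the 2D masks are outer products of two 1D masks. For brevity the whole filtered patch is treated as one local neighborhood; the test image and helper names are illustrative.

```python
L5 = [1, 4, 6, 4, 1]      # Level
E5 = [-1, -2, 0, 2, 1]    # Edge


def outer(col, row):
    """2D mask as the outer product of a column mask and a row mask."""
    return [[c * r for r in row] for c in col]


def filter_valid(img, mask):
    """Correlate img with mask, keeping only positions where the mask fits entirely."""
    m = len(mask)
    rows, cols = len(img) - m + 1, len(img[0]) - m + 1
    return [[sum(mask[i][j] * img[r + i][c + j] for i in range(m) for j in range(m))
             for c in range(cols)]
            for r in range(rows)]


def texture_energy(filtered):
    """Sum of absolute filter responses over the neighborhood (here: the whole patch)."""
    return sum(abs(v) for row in filtered for v in row)


# 6 x 6 test image with a vertical step edge between columns 2 and 3.
img = [[0, 0, 0, 10, 10, 10] for _ in range(6)]

L5E5 = outer(L5, E5)   # smooths down the columns, differentiates along the rows
E5L5 = outer(E5, L5)   # differentiates down the columns, smooths along the rows

print(texture_energy(filter_valid(img, L5E5)))  # 1920
print(texture_energy(filter_valid(img, E5L5)))  # 0
```

Only L5E5 responds to the vertical edge. Combining the energies of L5E5 and E5L5, for example by averaging them, is one way to reduce the dependence on orientation mentioned in the filter algorithm.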
Texture Comparisons (cont.)
Retrieval performance of texture features in terms of the number of top matches considered, using the Brodatz album [Ma’98]
[Figure: recall vs. # of top matches considered; curves: MRSAR (M), Gabor, TWT, PWT, MRSAR, Tamura (improved), coarseness histogram, directionality, edge histogram, Tamura]
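The kind of curve in this comparison can be computed as recall against the number of top matches considered. The sketch below uses a hypothetical query result and relevance set; it does not reproduce any numbers from [Ma’98].

```python
def recall_at(ranked, relevant, n):
    """Fraction of the relevant items that appear among the top n retrieved items."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranked result for one query and its ground-truth relevant set.
ranked = ["t1", "x", "t2", "y", "t3"]
relevant = {"t1", "t2", "t3"}

curve = [recall_at(ranked, relevant, n) for n in range(1, len(ranked) + 1)]
```

Averaging such curves over all queries gives one line per texture feature.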
Texture Comparisons (cont.)
Images of rock samples in applications related to oil exploitation [Li’00]
  Gabor descriptors outperform the others
Learned Similarity
Distance metrics DO matter
  All based on Gabor features
  Euclidean vs. learned (supervised) distance metric
  The latter was maintained with a texture thesaurus [Ma’96]
[Figure: retrieval results with Euclidean distance vs. learned (supervised) distance]
Shape
Region-based descriptor
Contour-based shape descriptor
2D/3D shape descriptor
Some relevant ones are included in MPEG-7
Not easy to derive automatically
[Bober’01]
Region-based vs. Contour-based Descriptor
Columns indicate contour similarity
  Outline of contours
Rows indicate region similarity
  Distribution of pixels
Region-based Descriptor
Expresses pixel distribution within a 2D object region
Employs a complex 2D Angular Radial Transformation (ART)
  35 fields, each of 4 bits
Rotational and scale invariance
Robust to some non-rigid transformations
L1 metric on transformed coefficients
Advantages
  Describing complex shapes with disconnected regions
  Robust to segmentation noise
  Small size
  Fast extraction and matching
Contour-based Descriptor
It’s based on the Curvature (曲率) Scale-Space (CSS) representation
Found to be superior to
  Zernike moments
  ART
  Fourier-based
  Turning angles
  Wavelets
Rotational and scale invariance
Robust to some non-rigid transformations
For example
  Applicable to (a)
  Discriminating differences in (b)
  Finding similarities in (c)-(e)
[Figure: example shape sets (a)-(e)]
Problems in Shape-based Indexing
Many existing approaches assume
  Segmentation is given
  Human operator circles the object of interest
  Lack of clutter and shadows
  Objects are rigid
  Planar (2-D) shape models
  Models are known in advance
Summary
Texture features
  Statistical
  Spectral
  Texture computation is time-consuming; compressed-domain features?
Shape features
Multimodal fusion is quite helpful
Next week
  Efficient indexing on high-dimensional data
  Feature reduction