Image Retrieval with Geometry-Preserving Visual Phrases

Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen, Cornell University

Mar 23, 2016


Mercia Mathias

Transcript
Page 1: Image Retrieval with  Geometry-Preserving Visual Phrases

Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen, Cornell University

Image Retrieval with Geometry-Preserving Visual Phrases

Page 2: Image Retrieval with  Geometry-Preserving Visual Phrases

Similar Image Retrieval

Ranked relevant images

Image Database

Page 3: Image Retrieval with  Geometry-Preserving Visual Phrases

Bag-of-Visual-Word (BoW)

Images are represented as the histogram of words

Similarity of two images: cosine similarity of histograms

Histogram length: dictionary size
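
The BoW similarity on this slide can be sketched as follows. This is a minimal illustration, assuming each image is already given as a list of integer visual-word IDs; the function names `bow_histogram` and `cosine_similarity` are hypothetical, and the sketch leaves out any word weighting (such as tf-idf) that a real system might apply.

```python
import math
from collections import Counter


def bow_histogram(words):
    """Count how often each visual-word ID occurs in one image."""
    return Counter(words)


def cosine_similarity(hist_a, hist_b):
    """Cosine similarity of two sparse word histograms."""
    # Only words present in both histograms contribute to the dot product.
    dot = sum(count * hist_b.get(word, 0) for word, count in hist_a.items())
    norm_a = math.sqrt(sum(c * c for c in hist_a.values()))
    norm_b = math.sqrt(sum(c * c for c in hist_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an image without words matches nothing
    return dot / (norm_a * norm_b)
```

Storing the histogram sparsely (only the words that occur) keeps the cost independent of the dictionary size, even though the conceptual vector has one entry per dictionary word.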

Page 4: Image Retrieval with  Geometry-Preserving Visual Phrases

Geometry-Preserving Visual Phrases

Length-k phrase: k words in a certain spatial layout

Bag of Phrases: [Figure: histogram over length-2 phrases.]

Page 5: Image Retrieval with  Geometry-Preserving Visual Phrases

Phrases vs. Words

[Figure: two panels, Irrelevant and Relevant, each showing matches for Word, Length-2, and Length-3.]

Page 6: Image Retrieval with  Geometry-Preserving Visual Phrases

Previous Works

Page 7: Image Retrieval with  Geometry-Preserving Visual Phrases

Geometry Verification

Searching Step with BoW

Post-processing (Geometry Verification)

Only on top ranked images

Encode Spatial Info

Page 8: Image Retrieval with  Geometry-Preserving Visual Phrases

Modeling relationship between words

Co-occurrences in the entire image [L. Torresani et al., CVPR 2009]

No spatial information

Phrases in local neighborhoods [J. Yuan et al., CVPR 2007] [Z. Wu et al., CVPR 2010] [C. L. Zitnick, Tech. Report 2007]

No long-range interactions, weak geometry

Select a subset of phrases [J. Yuan et al, CVPR07]

Discard a large portion of phrases

[Figure: histogram over length-2 phrases.]

Dimension: exponential in the number of words in a phrase

Previous works: reduce the number of phrases

Our work: All phrases, Linear computation time

Page 9: Image Retrieval with  Geometry-Preserving Visual Phrases

Approach

Page 10: Image Retrieval with  Geometry-Preserving Visual Phrases

Overview

BoW → BoP

1. Similarity Measure [Zhang and Chen, 09]

2. Large Scale Retrieval (This Paper): Inverted Files and Min-hash, for both BoW and BoP

Page 11: Image Retrieval with  Geometry-Preserving Visual Phrases

Co-occurring Phrases

[Figure: two images containing the words A, B, C, D, E, F in similar spatial layouts.]

[Zhang and Chen, 09]

Only consider the translation difference

Page 12: Image Retrieval with  Geometry-Preserving Visual Phrases

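Co-occurring Phrase Algorithm

Offset space: $\Delta x = x - x'$, $\Delta y = y - y'$

[Figure: matching words in the two images are plotted in the offset space; words with the same offset fall into the same cell, e.g. {A, B, C}, {D, F}, {E, F}.]

[Zhang and Chen, 09]

Number of co-occurring length-2 phrases: $1 + \binom{3}{2} + 1 = 5$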

Page 13: Image Retrieval with  Geometry-Preserving Visual Phrases

Relation with the feature vector

[Figure: length-k phrase feature vectors of images x and y.]

Inner product of the feature vectors = number of co-occurring length-k phrases

[Complexity expression in |X|, |Y| and k]

M: number of corresponding pairs; in practice, linear in the number of local features

$O(M)$: same as BoW!
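
The offset-space counting described on the last two slides can be sketched as below. This is a minimal sketch under stated assumptions: features are `(word_id, x, y)` tuples, only translation is handled, and offsets are quantized into square cells of side `cell` (a hypothetical parameter; the slides do not give the bin size). The function name is also hypothetical.

```python
from collections import Counter
from math import comb, floor


def count_cooccurring_phrases(features_a, features_b, k=2, cell=1.0):
    """Count co-occurring length-k phrases between two images.

    features_a, features_b: lists of (word_id, x, y).
    Every pair of features with the same word votes for the cell of its
    offset (x - x', y - y'). A cell holding n votes contributes C(n, k)
    phrases, because any k of those words keep the same layout.
    """
    # Index the second image by word so that matching pairs are found directly.
    by_word = {}
    for word, x, y in features_b:
        by_word.setdefault(word, []).append((x, y))

    offset_space = Counter()
    for word, x, y in features_a:
        for xb, yb in by_word.get(word, ()):
            key = (floor((x - xb) / cell), floor((y - yb) / cell))
            offset_space[key] += 1

    return sum(comb(n, k) for n in offset_space.values())
```

The loop touches each corresponding pair once, so the work is proportional to M, the number of corresponding pairs. With cells holding 3, 2, 2 and 1 votes, as in the slide example, the length-2 count is 3 + 1 + 1 + 0 = 5.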

Page 14: Image Retrieval with  Geometry-Preserving Visual Phrases

Inverted Index with BoW

Avoid comparing with every image

Score table: one score per image ID (I1, I2, …, In); each matching entry in the inverted index adds +1 to that image's score

Inverted Index
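
A BoW inverted index with a score table might look like the sketch below. It assumes the database is a dictionary from image ID to a list of word IDs, and it only counts shared words; the function names are hypothetical, and weighting or normalization of the scores is omitted.

```python
from collections import defaultdict


def build_inverted_index(database):
    """Map each word ID to the list of image IDs that contain it."""
    index = defaultdict(list)
    for image_id, words in database.items():
        for word in set(words):  # one posting per (word, image)
            index[word].append(image_id)
    return index


def bow_scores(index, query_words):
    """Score table: +1 for every query word an image shares."""
    scores = defaultdict(int)
    for word in set(query_words):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return dict(scores)
```

Only images that share at least one word with the query ever appear in the score table, which is what avoids comparing the query with every image.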

Page 15: Image Retrieval with  Geometry-Preserving Visual Phrases

Inverted Index with Word Location

[Figure: inverted-index entries store the word's location in each image, e.g. for I1.]

Assuming the same word occurs only once in an image, memory usage is the same as BoW

Page 16: Image Retrieval with  Geometry-Preserving Visual Phrases

Score Table

BoW: one score per image ID (I1, I2, …, In)

BoP: compute the offset space for each image (I1, I2, …, In), then compute the number of co-occurring phrases

Page 17: Image Retrieval with  Geometry-Preserving Visual Phrases

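Inverted Files with Phrases

[Figure: for a query word wi, each inverted-index entry (I1, I10, …; I8, …; I5, …) adds +1 to a cell in the offset space of its image; cells are indexed by offsets such as (0,0), (1,0), (0,1), (0,-1), (1,-1), (-1,-1), (-1,0).]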

Page 18: Image Retrieval with  Geometry-Preserving Visual Phrases

Final Score

[Figure: the vote counts in the offset space of each image (I1, I2, …, In) are combined into the final similarity scores in the score table.]
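
Putting the last few slides together, retrieval with phrases over an inverted index could be sketched as follows. The assumptions are: each posting stores `(image_id, x, y)`, each word occurs at most once per image, offsets are quantized with a hypothetical cell size `cell`, and the final score is the raw number of co-occurring length-k phrases without normalization. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import comb, floor


def build_location_index(database):
    """database: image_id -> list of (word_id, x, y)."""
    index = defaultdict(list)
    for image_id, features in database.items():
        for word, x, y in features:
            index[word].append((image_id, x, y))
    return index


def bop_scores(index, query_features, k=2, cell=1.0):
    """Vote into one offset space per database image, then score each image."""
    offset_spaces = defaultdict(Counter)  # image_id -> offset cell -> votes
    for word, x, y in query_features:
        for image_id, xd, yd in index.get(word, ()):
            key = (floor((x - xd) / cell), floor((y - yd) / cell))
            offset_spaces[image_id][key] += 1
    # A cell with n votes holds C(n, k) co-occurring length-k phrases.
    return {
        image_id: sum(comb(n, k) for n in cells.values())
        for image_id, cells in offset_spaces.items()
    }
```

Compared with the BoW score table, the only extra state is the per-image offset space, which is filled by the same pass over the postings.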

Page 19: Image Retrieval with  Geometry-Preserving Visual Phrases

Overview

BoW → BoP

Inverted Files and Min-hash, for both BoW and BoP

Less storage and time complexity

Page 20: Image Retrieval with  Geometry-Preserving Visual Phrases

Min-hash with BoW

Probability of min-hash collision (same word) = image similarity

[Figure: images I and I′ hashed with a min-hash function mf_i.]
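
The collision property on this slide can be illustrated with a small min-hash sketch. For standard min-hash, the similarity being estimated is the set overlap (intersection over union) of the two images' word sets. The sketch assumes integer word IDs and non-empty word sets, and it uses simple linear hash functions modulo a large prime as an approximation to random permutations; the function names and the constant `PRIME` are hypothetical choices, not taken from the slides.

```python
import random

PRIME = (1 << 61) - 1  # a large prime used as the hash modulus


def make_minhash_functions(num_functions, seed=0):
    """Draw the (a, b) parameters of num_functions linear hash functions."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_functions)]


def minhash_signature(words, functions):
    """For each hash function, keep the word with the smallest hash value."""
    words = set(words)
    return [min(words, key=lambda w: (a * w + b) % PRIME)
            for a, b in functions]


def estimated_similarity(sig_a, sig_b):
    """Fraction of min-hash functions on which the two images collide."""
    collisions = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return collisions / len(sig_a)
```

Each image is reduced to a fixed-length signature, so storage and comparison cost depend on the number of min-hash functions rather than on the number of words in the image.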

Page 21: Image Retrieval with  Geometry-Preserving Visual Phrases

Min-hash with Phrases

Probability of k min-hash collisions with consistent geometry (details are in the paper)

[Figure: images I and I′ with min-hash functions mf_i and mf_j; collisions are placed in the offset space, $\Delta x = x - x'$, $\Delta y = y - y'$.]

Page 22: Image Retrieval with  Geometry-Preserving Visual Phrases

Other Invariances

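[Figure and equations: points p1, p2, p3 in Image I and Image I′; the offsets in x and y are computed using the feature scales s and s′, and the offset space gains an additional dimension, $\log(\hat{s})$.]

Add a dimension to the offset space

Increases the memory usage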

[Zhang and Chen, 10]

Page 23: Image Retrieval with  Geometry-Preserving Visual Phrases

Variant Matching

Local histogram matching

Page 24: Image Retrieval with  Geometry-Preserving Visual Phrases

Evaluation

1. BoW + Inverted Index vs. BoP + Inverted Index

2. BoW + Min-hash vs. BoP + Min-hash

Post-processing methods: complementary to our work

Page 25: Image Retrieval with  Geometry-Preserving Visual Phrases

Experiments - Inverted Index

Oxford 5K dataset (55 queries)

1M Flickr distractors

Philbin, J. et al. 07

Page 26: Image Retrieval with  Geometry-Preserving Visual Phrases

Example Precision-recall curve

Higher precision at lower recall

[Figure: example precision-recall curves for BoW and BoP; axes: recall and precision.]

Page 27: Image Retrieval with  Geometry-Preserving Visual Phrases

Comparison

Mean average precision (mAP): mean of the AP over the 55 queries

[Figure: mAP versus vocabulary size (K) for BoW, BoP, BoW+RANSAC, and BoP+RANSAC.]

BoP outperforms BoW (similar computation)

BoP outperforms BoW+RANSAC (which is 10 times slower, on the top 150 images)

Larger improvement at smaller vocabulary sizes
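
The mAP metric used in this comparison can be sketched as below. This is the common non-interpolated form of average precision, shown only as an illustration; the official evaluation protocol of a benchmark may differ in details (for example, in how ignored images are handled), and the function names are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of the precision at each relevant hit."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Relevant images that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(results):
    """results: list of (ranked_ids, relevant_ids), one entry per query."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)
```

Because precision is accumulated at each relevant hit, a method that ranks relevant images earlier gets a higher AP even when the set of retrieved images is the same.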

Page 28: Image Retrieval with  Geometry-Preserving Visual Phrases

+ Flickr 1M Dataset

Computational Complexity

Method       Memory   Quantization runtime   Search runtime
BoW          8.1G     0.89s                  0.137s
BoP          8.5G                            0.215s
BoW+RANSAC   -        0.89s                  4.137s

RANSAC: 4s on top 300 images

[Figure: mAP versus number of images for BoW and BoP.]

Page 29: Image Retrieval with  Geometry-Preserving Visual Phrases

Experiment - min-hash

University of Kentucky dataset

Minhash with BoW: [O. Chum et al., BMVC08]

[Figure: BoW versus BoP for 200, 500, and 800 min-hash functions; y-axis range 2.80 to 3.30.]

Page 30: Image Retrieval with  Geometry-Preserving Visual Phrases

Conclusion

Encode more spatial information into the BoW

Can be applied to all images in the database at the searching step

Same computational complexity as BoW

Better Retrieval Precision than BoW+RANSAC