Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification
Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
Andrea Frome, EECS, UC Berkeley; Yoram Singer, Google, Inc; Fei Sha, EECS, UC Berkeley; Jitendra Malik, EECS, UC Berkeley
Feb 20, 2016

Transcript
Page 1: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Learning Globally-Consistent Local Distance Functions for Shape-Based

Image Retrieval and Classification

Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Andrea Frome, EECS, UC Berkeley
Yoram Singer, Google, Inc
Fei Sha, EECS, UC Berkeley
Jitendra Malik, EECS, UC Berkeley

Page 2: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction
• Training step
• Testing step
• Experiment & Result
• Conclusion

Page 3: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction
• Training step
• Testing step
• Experiment & Result
• Conclusion

Page 4: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

What do we do?

• Goal
– Classify an image into the most appropriate category
• Machine learning
• Two steps
– Training step
– Testing step

Page 5: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction

• Training step
• Testing step
• Experiment & Result
• Conclusion

Page 6: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: training

Generate features for each image in the dataset, e.g. SIFT or geometric blur

Compute the distances dji, dki

Feed the distances to the SVM for training to obtain W

Page 7: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: training

Generate features for each image in the dataset, e.g. SIFT or geometric blur

Compute the distances dji, dki

Feed the distances to the SVM for training to obtain W

Page 8: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Choosing features

• Dataset: Caltech101
• Patch-based features
– SIFT
  • Old school
– Geometric blur
  • A notion of blurring
  • A measure of similarity between image patches
  • An extension of Gaussian blur

Page 9: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Geometric blur

Page 10: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: training

Generate features for each image in the dataset, e.g. SIFT or geometric blur

Compute the distances dji, dki

Feed the distances to the SVM for training to obtain W

Page 11: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Triplet

• dji is the distance from image j to image i
• It is not symmetric, i.e. dji ≠ dij
• dki > dji

[Figure: a triplet of images with the distances dji and dki marked]

Page 12: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

How to compute distance

• L2 norm

[Figure: image j has m features (numbered 1, 2, 3, ...); each feature contributes one element (dji,1, ...) to the distance vector dji between image j and image i]
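The slide only names the ingredients (L2 norm, m features, a distance vector dji). A minimal sketch of how such a vector could be built, assuming each feature of image j is matched to its nearest feature in image i (that matching rule is an assumption, as are the function names and the toy 2-D features):

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_vector(features_j, features_i):
    """One element per feature of image j: the L2 distance to the
    closest feature of image i (nearest-match rule is an assumption)."""
    return [min(l2(f, g) for g in features_i) for f in features_j]

# Toy 2-D "patch features"; real SIFT or geometric blur descriptors are much longer.
image_j = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
image_i = [[0.0, 1.0], [2.0, 0.0]]

d_ji = distance_vector(image_j, image_i)  # m = 3 elements
d_ij = distance_vector(image_i, image_j)  # m = 2 elements
print(d_ji, d_ij)
```

Because the vector has one element per feature of the first image, dji and dij generally differ, even in length.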

Page 13: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Example

• Given 101 categories with 15 images per category, there are 101*15 training images

[Figure: image j vs the training data; the features of image j yield one distance vector for each of the 101*15 training images]

Page 14: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: training

Generate features for each image in the dataset, e.g. SIFT or geometric blur

Compute the distances dji, dki

Feed the distances to the SVM for training to obtain W

Page 15: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Machine learning: SVM

• Support Vector Machine
• Function: classification (prediction)
• Supervised learning
• Training data are n-dimensional vectors

Page 16: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Example

• Survey of men
– Annual income
– Free time
• Has a girlfriend?

Page 17: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Ex: Training data

Page 18: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

[Figure: each training example is a vector in a space with axes "free" (free time) and "income"]

Page 19: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
Page 20: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Mathematical expression(1/2)

Page 21: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Mathematical expression(2/2)

Page 22: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Support vector

[Figure: the learned model in the free time / income plane, with the support vectors marked]

Page 23: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

But the world is not so ideal.

Page 24: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Real world data

Page 25: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Hyper-dimension

Page 26: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Error cut

Page 27: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

SVM standard mathematical expression

Trade-off
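The formula on this slide did not survive extraction. For reference, the standard soft-margin SVM is usually written as below; the symbols (training vectors x_i, labels y_i in {-1, +1}, slack variables ξ_i) are the textbook ones and are not taken from the slide:

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{s.t.}\quad y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n
```

The constant C sets the trade-off: a large C penalizes training errors heavily, a small C favors a wider margin.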

Page 28: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

In this paper

• Goal: obtain the weight vector W

[Figure: W stacks one weight vector wj per training image (101*15 images); wj holds one weight per feature of image j (wj,1, ...)]

Page 29: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Visualization of the weights

Page 30: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

How to choose Triplets?

• Reference image
– Good friend: an image in the same class
– Bad friend: an image in a different class
• Ex: 101 categories, 15 images per category
– 14 good friends and 15*100 (1500) bad friends per reference image
– 15*101 (1515) reference images
– About 31.8 million triplets in total
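The triplet count follows from multiplying the three numbers in the example; a short check of the arithmetic (variable names are illustrative):

```python
categories = 101
per_category = 15

reference_images = categories * per_category    # 1515
good_friends = per_category - 1                 # 14 same-class images per reference
bad_friends = per_category * (categories - 1)   # 1500 other-class images per reference

# One triplet per (reference, good friend, bad friend) combination.
total_triplets = reference_images * good_friends * bad_friends
print(total_triplets)  # 31815000, i.e. about 31.8 million
```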

Page 31: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Mathematical expression(1/2)

• Idealistic:
• Scaling:
• Different:

[Figure labels: the length of weight i; 0, 0; triplet]

Page 32: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Mathematical expression(2/2)

• Empirical loss:

• Vector machine:
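The formulas for these two slides were lost in extraction. The following is only a sketch of a typical large-margin formulation over triplets, assuming the image-to-image distance is the weighted sum D_ji = ⟨w_j, d_ji⟩ and that the weights are constrained to be nonnegative; it should not be read as the slides' exact equations:

```latex
% Ideal ordering for a triplet (i = reference, j = good friend, k = bad friend)
D_{ki} > D_{ji}, \qquad D_{ji} = \langle w_j, d_{ji} \rangle

% Fixing the scale with a unit margin
\langle w_k, d_{ki} \rangle - \langle w_j, d_{ji} \rangle \ge 1

% Empirical (hinge) loss over all triplets
\sum_{ijk} \bigl[\, 1 - \langle w_k, d_{ki} \rangle + \langle w_j, d_{ji} \rangle \,\bigr]_{+}

% Soft-margin problem over the stacked weight vector W
\min_{W,\,\xi}\ \frac{1}{2}\lVert W \rVert^{2} + C \sum_{ijk} \xi_{ijk}
\quad \text{s.t.}\quad
\langle w_k, d_{ki} \rangle - \langle w_j, d_{ji} \rangle \ge 1 - \xi_{ijk},\quad
\xi_{ijk} \ge 0,\quad W \ge 0
```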

Page 33: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Dual problem

Page 34: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Dual variable

• Iterate the dual variables:

Page 35: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Early stopping

• Satisfy the KKT (Karush-Kuhn-Tucker) conditions
– In mathematics, conditions that a solution of a nonlinear program must satisfy to be optimal
• Threshold
– Stop when the dual variable update falls below a given value
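The update rule itself is missing from the slides. As a rough illustration of iterating dual variables with a threshold-based stop, here is a simplified sketch. It assumes a hinge loss with unit margin over triplet vectors x, weights clipped to be nonnegative, and dual variables kept in [0, C]. The function name, the toy data and the exact step are assumptions, not the authors' algorithm.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_triplet_weights(triplets, C=10.0, tol=1e-6, max_epochs=1000):
    """Each triplet is one vector x; the goal is dot(w, x) >= 1 with w >= 0."""
    dim = len(triplets[0])
    alpha = [0.0] * len(triplets)  # one dual variable per triplet
    s = [0.0] * dim                # running sum of alpha_t * x_t
    w = [0.0] * dim                # w = max(0, s), elementwise
    for _ in range(max_epochs):
        largest_change = 0.0
        for t, x in enumerate(triplets):
            # Move alpha_t toward satisfying the margin, then clip to [0, C].
            step = (1.0 - dot(w, x)) / dot(x, x)
            new_alpha = min(C, max(0.0, alpha[t] + step))
            delta = new_alpha - alpha[t]
            if delta != 0.0:
                alpha[t] = new_alpha
                s = [si + delta * xi for si, xi in zip(s, x)]
                w = [max(0.0, si) for si in s]
            largest_change = max(largest_change, abs(delta))
        # Early stopping: the largest dual update fell below the threshold.
        if largest_change < tol:
            break
    return w, alpha

# Toy data: two triplet vectors over two weights (values are made up).
triplets = [[1.0, -0.2], [0.8, -0.1]]
w, alpha = train_triplet_weights(triplets)
print(w, alpha)
```

On this toy input the first constraint ends up inactive, so its dual variable returns to zero, while the second constraint determines the first weight.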

Page 36: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction
• Training step

• Testing step
• Experiment & Result
• Conclusion

Page 37: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: testing

Query an image i

Calculate Dxi, where x ranges over all training images (excluding the query itself)

Output the most appropriate category

Page 38: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: testing

Query an image i

Calculate Dxi, where x ranges over all training images (excluding the query itself)

Output the most appropriate category

Page 39: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Query image?

• Goal: classify the query image into the appropriate class

• The remaining images in the dataset (those not used for training) serve as query images

Page 40: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: testing

Query an image i

Calculate Dxi, where x ranges over all training images (excluding the query itself)

Output the most appropriate category

Page 41: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Distance function(1/2)

• Query image i

[Figure: image i vs all training data; one distance vector dxi (elements dxi,1, ...) for each of the 101*15 training images x]

Page 42: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Distance function(2/2)

[Figure: image i vs all the training data; one distance Dji for each of the 101*15 training images j]
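A minimal sketch of this step, assuming the scalar distance is the weighted sum Dji = ⟨wj, dji⟩ of the learned weights and the distance vector (an assumption, since the formula is not in the transcript); the image ids and numbers are made up:

```python
def weighted_distance(w_j, d_ji):
    """Scalar distance D_ji as the inner product of weights and distance vector."""
    return sum(w * d for w, d in zip(w_j, d_ji))

def rank_training_images(weights, distance_vectors):
    """Return training image ids sorted from closest to farthest from the query."""
    scores = {j: weighted_distance(weights[j], distance_vectors[j]) for j in weights}
    return sorted(scores, key=scores.get)

# Hypothetical learned weights and distance vectors to one query image i.
weights = {"cat_1": [0.5, 2.0], "dog_1": [1.0, 1.0]}
distance_vectors = {"cat_1": [0.2, 0.1], "dog_1": [0.4, 0.3]}

ranking = rank_training_images(weights, distance_vectors)
print(ranking)
```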

Page 43: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Flow chart: testing

Query an image i

Calculate Dxi, where x ranges over all training images (excluding the query itself)

Output the most appropriate category

Page 44: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

How to choose the best image?

• Modified 3-NN classifier
• If no two images agree on the class within the top 10
– Take the class of the top-ranked image of the 10
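The slide does not spell out how the neighborhood grows from 3 to 10. One plausible reading, sketched here as an assumption: start with the 3 nearest training images, widen the window one image at a time until two labels agree, and fall back to the top-ranked image if none agree within the top 10.

```python
from collections import Counter

def modified_3nn(ranked_labels, start=3, limit=10):
    """ranked_labels: class labels of the training images, closest first."""
    for n in range(start, min(limit, len(ranked_labels)) + 1):
        label, count = Counter(ranked_labels[:n]).most_common(1)[0]
        if count >= 2:
            return label
    # No two images agree within the top `limit`: take the top-ranked image's class.
    return ranked_labels[0]

print(modified_3nn(["cat", "dog", "dog", "car"]))
print(modified_3nn(["cat", "dog", "car", "cat"]))
print(modified_3nn(list("abcdefghij")))
```

The three calls cover agreement within the top 3, agreement only after widening to 4, and the fallback when all 10 labels differ.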

Page 45: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction
• Training step
• Testing step

• Experiment & Result
• Conclusion

Page 46: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Experiment & Result

• Caltech 101
• Features
– Geometric blur (shape feature)
– HSV histograms (color feature)
• 5, 10, 15, 20 training images per category

Page 47: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
Page 48: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Confusion matrix for 15 training images per category

Page 49: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Outline

• Introduction
• Training step
• Testing step
• Experiment & Result

• Conclusion

Page 50: Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on

Conclusion

• Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification