
10/21/10 Object Recognition and Augmented Reality Computational Photography Derek Hoiem, University of Illinois Dali, Swans Reflecting Elephants.

Jan 15, 2016

Molly McBride
Page 1

10/21/10

Object Recognition and Augmented Reality

Computational Photography

Derek Hoiem, University of Illinois

Dali, Swans Reflecting Elephants

Page 2

Last class: Image Stitching

1. Detect keypoints

2. Match keypoints

3. Use RANSAC to estimate homography

4. Project onto a surface and blend

Page 3

Augmented reality

• Adding fake objects/textures to real images
  – Project by Karen Liu

• Interact with object in scene
  – Responsive characters in AR

• Overlay information on a display
  – Tagging reality
  – Layar
  – Google Goggles
  – T2 video (13:23)

Page 4

Adding fake objects to real video

Approach

1. Recognize and/or track points that give you a coordinate frame

2. Apply homography (flat texture) or perspective projection (3D model) to put object into scene

Main challenge: dealing with lighting, shadows, occlusion
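
Step 2 of the approach above, for the flat-texture case, can be sketched as follows. This is a minimal illustration, not the course's implementation: the function name `apply_homography` and the matrix `H` are hypothetical, and in practice H would be estimated from tracked points rather than written by hand.

```python
def apply_homography(H, points):
    """Map 2D points through a 3x3 homography H given as nested lists."""
    mapped = []
    for x, y in points:
        # Lift (x, y) to homogeneous coordinates [x, y, 1] and multiply by H.
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        # Divide by w to return to image coordinates.
        mapped.append((u / w, v / w))
    return mapped


# Hypothetical homography (assumption): scale by 2, then translate by (10, 5).
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 5.0],
     [0.0, 0.0, 1.0]]

# Corners of a unit-square texture, mapped to where they land in the scene.
texture_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
scene_corners = apply_homography(H, texture_corners)
```

Mapping the four corners gives the quadrilateral in the scene that the texture should be warped into.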

Page 5

Information overlay

Approach

1. Recognize object that you’ve seen before

2. Retrieve info and overlay

Main challenge: how to match reliably and efficiently?

Page 6

Today

How to quickly find images in a large database that match a given image region?

Page 7

Let’s start with interest points

Query Database

Compute interest points (or keypoints) for every image in the database and the query

Page 8

Simple idea

See how many keypoints are close to keypoints in each other image

Lots of Matches

Few or No Matches

But this will be really, really slow!
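
To make the cost of the simple idea concrete, here is a rough sketch of brute-force matching. The function names, the distance threshold, and the 2-D toy descriptors are all assumptions for illustration; real SIFT descriptors are 128-dimensional.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def count_close_keypoints(query_desc, image_desc, threshold):
    """Count query descriptors whose nearest descriptor in the image is within threshold."""
    if not image_desc:
        return 0
    count = 0
    for q in query_desc:
        # Every query descriptor is compared against every image descriptor.
        best = min(squared_distance(q, d) for d in image_desc)
        if best <= threshold ** 2:
            count += 1
    return count


def rank_database(query_desc, database, threshold):
    """Score every database image and sort by number of close keypoints."""
    scores = {name: count_close_keypoints(query_desc, desc, threshold)
              for name, desc in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (assumption): tiny 2-D descriptors stand in for real ones.
database = {
    "match": [(0.0, 0.1), (1.0, 1.0), (5.0, 5.0)],
    "other": [(9.0, 9.0), (8.0, 7.0)],
}
query = [(0.0, 0.0), (1.0, 1.1), (5.0, 4.9)]
ranking = rank_database(query, database, threshold=0.5)
```

The work grows with (query keypoints) × (keypoints per image) × (number of images), which is why this does not scale to a large database.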

Page 9

Key idea 1: “Visual Words”

• Cluster the keypoint descriptors

Page 11

Key idea 1: “Visual Words”

K-means algorithm

Illustration: http://en.wikipedia.org/wiki/K-means_clustering

1. Randomly select K centers

2. Assign each point to nearest center

3. Compute new center (mean) for each cluster

Back to 2
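
The loop above can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: the names `kmeans` and `nearest_center`, the fixed random seed, the iteration cap, and the 2-D toy points are illustrative choices and are not taken from the slides.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))


def kmeans(points, k, max_iterations=20, seed=0):
    rng = random.Random(seed)
    # 1. Randomly select K centers (here: K of the data points).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # 2. Assign each point to its nearest center.
        new_labels = [nearest_center(p, centers) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # 3. Compute the new center (mean) of each cluster, then go back to 2.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, labels


# Toy data (assumption): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
```

Which cluster receives label 0 depends on the random initialization, so only the grouping is meaningful, not the label values.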

Page 12

Key idea 1: “Visual Words”

• Cluster the keypoint descriptors

• Assign each descriptor to a cluster number
  – What does this buy us?
  – Each descriptor was 128 dimensional floating point, now is 1 integer (easy to match!)
  – Is there a catch?
    • Need a lot of clusters (e.g., 1 million) if we want points in the same cluster to be very similar
    • Points that really are similar might end up in different clusters

Page 14

Key idea 1: “Visual Words”

• Cluster the keypoint descriptors

• Assign each descriptor to a cluster number

• Represent an image region with a count of these “visual words”

• An image is a good match if it has a lot of the same visual words as the query region
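
The last three bullets can be illustrated with a small sketch. Everything named here is an assumption for illustration: the cluster centers are made up (in practice they come from clustering), the descriptors are 2-D toys, and "a lot of the same visual words" is scored by histogram intersection, which is one simple choice among several.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptors, centers):
    """Replace each descriptor by the index of its nearest cluster center (its visual word)."""
    return [min(range(len(centers)), key=lambda k: squared_distance(d, centers[k]))
            for d in descriptors]


def word_histogram(word_ids):
    """Count how often each visual word occurs in a region or image."""
    return Counter(word_ids)


def shared_word_count(hist_a, hist_b):
    """Histogram intersection: sum over words of the smaller of the two counts."""
    return sum((hist_a & hist_b).values())


# Hypothetical vocabulary of three visual words (assumption).
centers = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

query_words = quantize([(0.2, 0.1), (9.5, 10.2), (0.1, 0.3)], centers)
query_hist = word_histogram(query_words)

# Two database images, already quantized to word ids.
score_a = shared_word_count(query_hist, word_histogram([0, 0, 1, 2]))
score_b = shared_word_count(query_hist, word_histogram([2, 2, 1]))
```

Once descriptors are integers, comparing two images reduces to comparing two small count tables instead of comparing 128-dimensional vectors pairwise.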

Page 15

Naïve matching is still too slow

• Imagine matching 1,000,000 images, each with 1,000 keypoints

Page 16

Key Idea 2: Inverted file index

• Like a book index: keep a list of all the words (keypoints) and all the pages (images) that contain them.

• Rank database images based on tf-idf measure.

tf-idf: Term Frequency – Inverse Document Frequency

$$ t_i = \frac{n_{id}}{n_d} \log \frac{N}{n_i} $$

n_{id}: # times word i appears in document d

n_d: # words in document d

N: # documents

n_i: # documents that contain word i
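
A rough sketch of the book-index idea with tf-idf ranking follows. The function names and toy word lists are assumptions, and the score used here (a plain dot product of tf-idf weights over shared words) is chosen for brevity; the papers cited in this lecture use normalized variants.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(database):
    """Map each visual word to the images that contain it, with per-image counts."""
    index = defaultdict(dict)
    for image_id, words in database.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index


def tfidf_scores(query_words, index, database):
    """Rank only the images that share at least one word with the query."""
    n_docs = len(database)
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        postings = index.get(word)
        if not postings:
            continue  # word never seen in the database
        # Inverse document frequency: log(# documents / # documents containing the word).
        idf = math.log(n_docs / len(postings))
        q_weight = (q_count / len(query_words)) * idf
        for image_id, count in postings.items():
            # Term frequency: # times word appears / # words in the document.
            d_weight = (count / len(database[image_id])) * idf
            scores[image_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database (assumption): each image is a list of visual word ids.
database = {
    "img_a": [1, 1, 2, 3],
    "img_b": [3, 4, 4, 5],
    "img_c": [3, 6, 7, 7],
}
index = build_inverted_index(database)
ranking = tfidf_scores([1, 2, 3], index, database)
```

Word 3 occurs in every image, so its idf is log(1) = 0 and it contributes nothing; only the rarer words 1 and 2 separate img_a from the rest. Images that share no words with the query are never visited at all, which is where the speedup comes from.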

Page 17

Fast visual search

“Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.

“Video Google”, Sivic and Zisserman, ICCV 2003

Page 18

Slide Credit: Nister

110,000,000 Images in 5.8 Seconds

Page 22

Recognition with K-tree

Following slides by David Nister (CVPR 2006)

Page 46

Performance

Page 47

More words is better

[Figure: performance plots; labels: “Improves Retrieval”, “Improves Speed”, “Branch factor”]

Page 48

Can we be more accurate?

So far, we treat each image as containing a “bag of words”, with no spatial information

[Figure: sets of visual-word labels (a, f, e, z, h) for the compared images]

Which matches better?

Page 49

Can we be more accurate?

So far, we treat each image as containing a “bag of words”, with no spatial information

Real objects have consistent geometry

Page 50

Final key idea: geometric verification

• Goal: Given a set of possible keypoint matches, figure out which ones are geometrically consistent

How can we do this?

Page 51

Final key idea: geometric verification

RANSAC for affine transform

[Figure: two sets of visual-word labels (a, f, z, e) related by an affine transform]

Repeat N times:

  Randomly choose 3 matching pairs

  Estimate transformation

  Predict remaining points and count “inliers”
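
The RANSAC loop above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the inlier tolerance, the iteration count, the fixed seed, and the toy matches are all hypothetical, and the 3×3 systems are solved with Cramer's rule only to keep the example dependency-free.

```python
import math
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve A x = b by Cramer's rule; return None if A is (near-)singular."""
    d = det3(A)
    if abs(d) < 1e-12:
        return None
    solution = []
    for col in range(3):
        replaced = [row[:] for row in A]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pairs):
    """Fit u = a*x + b*y + c, v = d*x + e*y + f exactly to 3 matching pairs."""
    A = [[x, y, 1.0] for (x, y), _ in pairs]
    row_u = solve3(A, [u for _, (u, v) in pairs])
    row_v = solve3(A, [v for _, (u, v) in pairs])
    if row_u is None or row_v is None:
        return None  # the three source points are collinear
    return row_u, row_v


def apply_affine(T, point):
    (a, b, c), (d, e, f) = T
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def ransac_affine(matches, n_iterations=200, tolerance=0.5, seed=0):
    rng = random.Random(seed)
    best_T, best_inliers = None, []
    for _ in range(n_iterations):              # Repeat N times
        sample = rng.sample(matches, 3)        # randomly choose 3 matching pairs
        T = fit_affine(sample)                 # estimate transformation
        if T is None:
            continue
        # Predict every point and count matches that land close to their partner.
        inliers = [(p, q) for p, q in matches
                   if math.hypot(apply_affine(T, p)[0] - q[0],
                                 apply_affine(T, p)[1] - q[1]) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_T, best_inliers = T, inliers
    return best_T, best_inliers


# Toy matches (assumption): six follow u = 2x + 1, v = 3y - 1; the last two are wrong.
matches = [((0.0, 0.0), (1.0, -1.0)), ((1.0, 0.0), (3.0, -1.0)),
           ((0.0, 1.0), (1.0, 2.0)), ((2.0, 3.0), (5.0, 8.0)),
           ((4.0, 1.0), (9.0, 2.0)), ((3.0, 3.0), (7.0, 8.0)),
           ((5.0, 5.0), (-30.0, 40.0)), ((1.0, 2.0), (40.0, -7.0))]
T, inliers = ransac_affine(matches)
```

Three pairs are the minimum needed because an affine transform has six unknowns and each pair supplies two equations. The size of the final inlier set can then serve as the geometric-verification score for a candidate image.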

Page 52

Application: Large-Scale Retrieval

K. Grauman, B. Leibe

[Philbin CVPR’07]

Query Results on 5K (demo available for 100K)

Page 53

Application: Image Auto-Annotation

K. Grauman, B. Leibe

Left: Wikipedia image

Right: closest match from Flickr

[Quack CIVR’08]

Moulin Rouge

Tour Montparnasse Colosseum

Viktualienmarkt Maypole

Old Town Square (Prague)

Page 54

Example Applications

B. Leibe

Mobile tourist guide

• Self-localization

• Object/building recognition

• Photo/video augmentation

Aachen Cathedral

[Quack, Leibe, Van Gool, CIVR’08]

Page 55

Video Google System

1. Collect all words within query region

2. Inverted file index to find relevant frames

3. Compare word counts

4. Spatial verification

Sivic & Zisserman, ICCV 2003

• Demo online at: http://www.robots.ox.ac.uk/~vgg/research/vgoogle/index.html

K. Grauman, B. Leibe

Query region

Retrieved frames

Page 56

Summary: Uses of Interest Points

• Interest points can be detected reliably in different images at the same 3D location
  – DOG interest points are localized in x, y, scale

• SIFT is robust to rotation and small deformation

• Interest points provide correspondence
  – For image stitching
  – For defining coordinate frames for object insertion
  – For object recognition and retrieval

Page 57

Announcements

• Project 4 is due Monday

• I’ll be out of town next Tues
  – Kevin Karsch will talk about his cool work

• I’ll still be here for office hours Mon (but leaving soon after for DC)