Computer Vision Set: Object Recognition. Slides by C.F. Olson
Jan 18, 2016


Dina Montgomery
Page 1: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

1Computer Vision Set: Object Recognition

Slides by C.F. Olson

Object Recognition

Page 2: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Object Recognition

• Object recognition is the process of determining whether an object appears in an image.

• Sometimes divided into two subproblems:
– Identification: which objects are in the image?
– Location: given that the object is present, where is it?

• Many methods solve both at the same time.

Page 3: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Appearance-Based Matching

• Appearance-based techniques use example images (templates or exemplars) of the objects to perform recognition (as opposed to extracted features).

• I include edge images in this definition, although this is considered feature-based by some.

Page 4: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Recognition by Finding Patterns

• If we know exactly what something looks like then it is easy to find.
– Stereo matching

• Objects look different under varying conditions:
– Changes in lighting or color
– Changes in viewing direction
– Changes in size / shape

• A single exemplar is unlikely to succeed reliably!
– However, it is impossible to represent all appearances of an object.

Computer Vision - A Modern Approach Set: Recognition as Template Matching

Slides by D.A. Forsyth, C.F. Olson

Page 5: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Example

Frontal faces are fairly easy to find (and sometimes classify).

However, changes to lighting and background cause problems.

Page 6: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Edge Matching

• Changes in lighting and color usually don’t have much effect on image edges.

Page 7: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Edge Matching

• Strategy:
– Detect edges in template and image
– Compare edge images to find the template
– Must consider range of possible template positions

Page 8: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Edge Matching Measures

• What measure should we use to compare edge images?

• Can count number of overlapping edges.
– Not robust to changes in shape

• Better: count number of template edge pixels within some distance of an edge in the search image.

• Best:
– Determine probability distribution of distance to nearest edge in search image (if template at correct position)
– Estimate likelihood of each template position generating image
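
The "better" measure above can be sketched as follows. This is a minimal illustration, assuming edge images are given as lists of (x, y) edge pixel locations; the function name `count_near_edges` and the sample data are made up for the example, and a practical version would use a distance transform instead of the brute-force nearest-edge test.

```python
from math import hypot

def count_near_edges(template_pts, image_pts, offset, max_dist):
    """Count template edge pixels that land within max_dist of some image edge pixel
    when the template is translated by offset (brute force, for illustration)."""
    dx, dy = offset
    count = 0
    for tx, ty in template_pts:
        px, py = tx + dx, ty + dy
        if any(hypot(px - ix, py - iy) <= max_dist for ix, iy in image_pts):
            count += 1
    return count

# Hypothetical data: the template appears at offset (5, 3), but one of its
# edge pixels has moved by one pixel, and the image has one clutter pixel.
template = [(0, 0), (1, 0), (2, 0), (2, 1)]
image = [(5, 3), (6, 3), (7, 2), (7, 4), (0, 9)]

exact_overlap = count_near_edges(template, image, (5, 3), 0)  # plain overlap count
tolerant = count_near_edges(template, image, (5, 3), 1)       # within one pixel
print(exact_overlap, tolerant)
```

With a tolerance of zero this reduces to counting overlapping edges, which misses the displaced pixel; allowing a small distance recovers it.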

Page 9: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Hausdorff Distance

• The Hausdorff distance has also been used to compare edge images.
– This is a distance between two point sets.

• If A and B are sets of points (for example, edge pixel locations):

$h(A,B) = \max_{a \in A} \min_{b \in B} \| a - b \|$

$H(A,B) = \max(h(A,B), h(B,A))$

• The Hausdorff distance is not robust to outliers.
– A single bad point can lead to a large distance.

• Common variation: take the median (or some other percentile):

$h(A,B) = \operatorname{med}_{a \in A} \min_{b \in B} \| a - b \|$
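
A small sketch of these definitions, assuming point sets are lists of (x, y) tuples. The `fraction` parameter is an illustrative way to select the percentile: 1.0 gives the max (the standard directed distance), and 0.5 gives the lower median.

```python
from math import ceil, dist  # math.dist requires Python 3.8+

def directed_hausdorff(A, B, fraction=1.0):
    """h(A, B): rank-based statistic of the nearest-neighbor distances from A to B.
    fraction=1.0 is the max; fraction=0.5 picks the (lower) median."""
    nearest = sorted(min(dist(a, b) for b in B) for a in A)
    k = max(1, ceil(fraction * len(nearest)))
    return nearest[k - 1]

def hausdorff(A, B, fraction=1.0):
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(A, B, fraction),
               directed_hausdorff(B, A, fraction))

# Hypothetical point sets: A equals B except for one outlier at (10, 0).
A = [(0, 0), (1, 0), (2, 0), (10, 0)]
B = [(0, 0), (1, 0), (2, 0)]

full = hausdorff(A, B)             # dominated by the single outlier
partial = hausdorff(A, B, 0.5)     # median variant ignores it
print(full, partial)
```

The single outlier sets the full distance, while the median variant is unaffected by it.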

Page 10: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Searching for Matches

• Can we search for matches more efficiently than looking at every possible translation?

• What if the object in the image is a different size or rotated or skewed?

• Premise: Given a set of possible positions we can find a bound on the best possible match in the set without looking at all of the positions.

• Assume that the best score is small, that is, lower scores are better (like the Hausdorff distance).
– Can change this easily.

Page 11: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Searching for Matches

• Divide-and-conquer strategy:
– consider all positions as a set (a cell in the space of positions)
– determine lower bound on score at best position in cell
– if bound is too large, prune cell
– if bound is not too large, divide cell into subcells and try each subcell recursively
– process stops when cell is “small enough”

• Unlike multi-resolution search, this technique is guaranteed to find all matches that meet the criterion (assuming that the lower bound is accurate).
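
The strategy can be sketched for integer translations only. The assumptions here are illustrative: the score is the directed Hausdorff distance from the translated template to the image points, a cell is an inclusive rectangle (x0, x1, y0, y1) of translations, and a cell is "small enough" when it holds a single translation. The lower bound uses the fact that this score changes by at most the distance the template moves.

```python
from math import dist, hypot

def score(template, image, t):
    """Directed Hausdorff distance from the template translated by t to the image points."""
    return max(min(dist((ax + t[0], ay + t[1]), b) for b in image)
               for ax, ay in template)

def search(template, image, cell, tau, matches, stats):
    """Collect every translation in cell whose score is <= tau."""
    x0, x1, y0, y1 = cell
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    stats["evaluated"] += 1
    center_score = score(template, image, (cx, cy))
    # Farthest translation from the center is at one of the cell corners.
    radius = max(hypot(x - cx, y - cy) for x in (x0, x1) for y in (y0, y1))
    if center_score - radius > tau:
        return                      # lower bound too large: prune the whole cell
    if x0 == x1 and y0 == y1:
        matches.append((cx, cy))    # single position, and its score is <= tau
        return
    xs = [(x0, cx), (cx + 1, x1)] if x0 < x1 else [(x0, x1)]
    ys = [(y0, cy), (cy + 1, y1)] if y0 < y1 else [(y0, y1)]
    for xa, xb in xs:
        for ya, yb in ys:
            search(template, image, (xa, xb, ya, yb), tau, matches, stats)

# Hypothetical data: the template appears at translation (20, 11), plus two clutter points.
template = [(0, 0), (3, 0), (0, 2)]
image = [(20, 11), (23, 11), (20, 13), (2, 30), (40, 5)]
matches, stats = [], {"evaluated": 0}
search(template, image, (0, 63, 0, 63), 0.5, matches, stats)
print(matches, stats["evaluated"], "cells scored out of", 64 * 64, "translations")
```

Because the bound never exceeds the true best score in a cell, the correct translation is never pruned, yet far fewer positions are scored than in an exhaustive scan.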

Page 12: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Divide-and-Conquer

At each level, cells are pruned when possible and divided into smaller cells when not possible.

Page 13: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Finding the Bound

• Given a cell of possible template positions, how can we find a lower bound on the best score?
– Look at the score for the template position represented by the center of the cell.
– Subtract the maximum change from the “center” position for any other position in the cell (occurs at cell corners).

• This strategy can also be applied when the score is based on a count of the number of pixels that match well.
– Must count the maximum number of pixels that could match well at a position in the cell.
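
The two bounds can be written out under simplifying assumptions that are not stated on the slide: a translation-only cell C with center t_c, a distance-like score s (smaller is better) that changes by at most the distance the template moves, and a count n_d(t) of template pixels within distance d of an image edge at position t.

```latex
% r is the largest displacement from the cell center (reached at a cell corner).
\[
  r = \max_{t \in C} \lVert t - t_c \rVert
\]
% Distance-like score: no position in the cell can beat the center score by more than r.
\[
  \min_{t \in C} s(t) \;\ge\; s(t_c) - r
\]
% Count-based score: a pixel within d of an edge at some t in C
% is within d + r of that edge at the center position.
\[
  \max_{t \in C} n_d(t) \;\le\; n_{d + r}(t_c)
\]
```

For rotations, scale, or shear the same idea applies, but r must bound how far any template pixel can move within the cell.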

Page 14: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Complex Transformations

• What if the space of possible template positions is more complex?
– Rotations
– Scale
– Shear

• Basic methodology is exactly the same!

• Complexity arises from determining bounds on distance.

Page 15: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Greyscale Matching

• Although edges are (mostly) robust to illumination changes, they throw away a lot of information.

• Can we apply similar techniques to greyscale matching?

• Yes. Must compute pixel distance as a function of both pixel position and pixel intensity.

• Can be applied to color also.

Page 16: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Matching Gradients

• One way to be robust to illumination changes without throwing away as much information is to compare image gradients.

• Matching is performed like matching greyscale images.

• Simple alternative: use (normalized) correlation.
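
A minimal sketch of normalized correlation, assuming the template and the image window are flattened to equal-length lists of intensities. This is the zero-mean variant; the sample values are made up.

```python
from math import sqrt

def normalized_correlation(p, q):
    """Zero-mean normalized correlation of two equal-length intensity lists, in [-1, 1]."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    dp = [v - mean_p for v in p]
    dq = [v - mean_q for v in q]
    denom = sqrt(sum(v * v for v in dp) * sum(v * v for v in dq))
    if denom == 0:
        return 0.0  # a constant patch carries no pattern to correlate
    return sum(a * b for a, b in zip(dp, dq)) / denom

patch = [10, 20, 30, 40, 80, 20]
brighter = [2 * v + 15 for v in patch]   # gain and offset change (illumination)
inverted = [100 - v for v in patch]      # contrast reversal

print(normalized_correlation(patch, brighter))
print(normalized_correlation(patch, inverted))
```

Subtracting the mean and dividing by the norms makes the measure unaffected by a gain and offset change in intensity, which is why it tolerates simple illumination changes.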

Page 17: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Different Viewpoints

• What if we don’t know the viewpoint?
– Non-frontal faces
– Objects in arbitrary orientation

• A partial solution: linear transformations model small changes in viewpoint.

• A better solution: use templates that model all possible view directions.
– Computationally expensive

Page 18: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Large Modelbases

• If we have many potential templates that we are looking for, can we search efficiently?

• One approach is based on eigenvectors of the templates.
– eigenfaces

http://www-white.media.mit.edu/vismod/demos/facerec/basic.html

Page 19: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Feature-Based Recognition

• Feature-based methods extract features of some sort from the objects to be recognized and the images to be searched:
– Surface patches
– Corners
– Linear edges

• A search is used to find feasible matches between object features and image features.

• The primary constraint is that a single position of the object must account for all of the feasible matches.

Page 20: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Interpretation Trees

• One method for finding sets of feasible matches is to search a tree.

• Each node in the tree represents a set of matches.
– Root node represents empty set.
– Each other node is the union of the matches in the parent node and one additional match.
– Wildcard is used for features with no match.

• Nodes are “pruned” when the set of matches is infeasible.
– A pruned node has no children (all would have infeasible matches).

• Historically significant and still used, but less commonly.
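
A compact sketch of an interpretation tree search. The feasibility test used here is an assumption chosen for illustration: features are 2D points, and a set of matches is feasible when all pairwise distances agree within a tolerance, as for a rigid motion. Each tree level assigns one model feature either an image feature or the wildcard.

```python
from math import dist

def consistent(model, image, pairs, new_pair, tol):
    """Check a new (model index, image index) match against the matches so far."""
    mi, ii = new_pair
    for mj, ij in pairs:
        if ij == ii:
            return False  # image feature already used
        if abs(dist(model[mi], model[mj]) - dist(image[ii], image[ij])) > tol:
            return False  # pairwise distance not preserved: infeasible
    return True

def interpret(model, image, tol=0.1):
    """Depth-first search of the tree; returns the largest feasible set of matches."""
    best = []

    def expand(level, pairs):
        nonlocal best
        if level == len(model):
            if len(pairs) > len(best):
                best = list(pairs)
            return
        for ii in range(len(image)):
            if consistent(model, image, pairs, (level, ii), tol):
                expand(level + 1, pairs + [(level, ii)])
            # otherwise this child is pruned and its subtree is never visited
        expand(level + 1, pairs)  # wildcard: model feature has no match

    expand(0, [])
    return best

# Hypothetical data: model points 0-2 appear rotated by 90 degrees and shifted
# by (10, 10); model point 3 is occluded and image point 0 is clutter.
model = [(0, 0), (4, 0), (0, 3), (6, 6)]
image = [(20, 1), (10, 14), (7, 10), (10, 10)]
best = interpret(model, image)
print(best)
```

The result pairs each matched model index with an image index; the occluded model point ends up on the wildcard branch.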

Page 21: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Recognition by Hypothesize and Test

• General idea
– Hypothesize object identity and pose
– Compare hypothesized appearance to image

• Issues
– Where do the hypotheses come from?
– How do we compare to image (verification)?

• Simplest approach
– Construct a correspondence for small sets of object features to every correctly sized subset of image points

• These are the hypotheses
– Expensive search, which is also redundant.

• Can be improved using randomization.

Page 22: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

What are the features?

• They have to project to similar features in the image:
– Points
– Lines
– Conics
– Other fitted curves
– Regions (particularly the center of a region, etc.)

Page 23: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Pose Consistency

• Correspondences between image features and model features are not independent.

• A small number of correspondences yields the object position - the others must be consistent with this.

• Strategy:
– Generate hypotheses using small numbers of correspondences (e.g. triples of points for 3D recognition)
– Project other model features into image and verify additional correspondences

• Use the smallest number of correspondences necessary to achieve discrete object poses.

Page 24: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Figure from “Object recognition using alignment,” D.P. Huttenlocher and S. Ullman, Proc. Int. Conf. Computer Vision, 1986, copyright IEEE, 1986

Example (2D)

Page 25: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Example (3D)

Page 26: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Randomization

• Improved efficiency can be gained using RANSAC (Random Sample Consensus)

• Examine small sets of image features until likelihood of missing object becomes small.

• For each set of image features must consider all possible matching sets of model features

• $(1 - w^c)^k = z$
– w is the fraction of image points that are “good” (w ~ m/n)
– c is the number of correspondences necessary
– k is the number of trials
– z is the probability of every trial using one (or more) incorrect correspondences
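
Solving the relation above for k gives k = log z / log(1 - w^c). A small sketch, where the function name and the sample values (w = 0.5, c = 3, z = 0.01) are illustrative only:

```python
from math import ceil, log

def ransac_trials(w, c, z):
    """Smallest k with (1 - w**c)**k <= z.
    w: fraction of good image points, c: correspondences per sample,
    z: acceptable probability that every trial is contaminated."""
    good = w ** c                  # chance that one random sample is all good
    if not 0 < good <= 1 or not 0 < z < 1:
        raise ValueError("need 0 < w <= 1 and 0 < z < 1")
    if good == 1:
        return 1                   # every sample is good
    return ceil(log(z) / log(1 - good))

k = ransac_trials(w=0.5, c=3, z=0.01)
print(k)
```

The number of trials grows quickly as w shrinks or c grows, which is one reason to use the smallest number of correspondences that fixes the pose.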

Page 27: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Pose Clustering

• Each object leads to many correct sets of correspondences, each of which has (roughly) the same pose.
– Vote on pose, in an accumulator array.
– This is (essentially) a Hough transform.

• Note that this method uses sets of correspondences, rather than individual correspondences.
– Implementation is easier, since each set yields a small number of possible object poses.
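
A sketch of pose voting under assumptions chosen for illustration: 2D point features, a rigid pose (rotation plus translation) computed from each pair of correspondences, and a dictionary of quantized pose bins standing in for the accumulator array. The bin sizes and the distance tolerance are arbitrary.

```python
from collections import Counter
from itertools import combinations, permutations
from math import atan2, cos, dist, pi, sin

def pose_from_pair(m1, m2, i1, i2):
    """2D rigid pose (theta, tx, ty) mapping model points m1, m2 onto image points i1, i2."""
    theta = atan2(i2[1] - i1[1], i2[0] - i1[0]) - atan2(m2[1] - m1[1], m2[0] - m1[0])
    c, s = cos(theta), sin(theta)
    tx = i1[0] - (c * m1[0] - s * m1[1])
    ty = i1[1] - (s * m1[0] + c * m1[1])
    return theta, tx, ty

def cluster_poses(model, image, angle_bin=pi / 18, trans_bin=1.0, tol=0.1):
    """Vote for a quantized pose with every pair of correspondences of matching length."""
    votes = Counter()
    for m1, m2 in combinations(model, 2):
        for i1, i2 in permutations(image, 2):
            if abs(dist(m1, m2) - dist(i1, i2)) > tol:
                continue  # a rigid motion cannot map this model pair to this image pair
            theta, tx, ty = pose_from_pair(m1, m2, i1, i2)
            theta = (theta + pi) % (2 * pi) - pi  # wrap angle to [-pi, pi)
            key = (round(theta / angle_bin), round(tx / trans_bin), round(ty / trans_bin))
            votes[key] += 1
    return votes

# Hypothetical data: the model rotated by 90 degrees and shifted by (10, 10), plus clutter.
model = [(0, 0), (4, 0), (0, 3), (6, 6)]
image = [(10, 10), (10, 14), (7, 10), (4, 16), (20, 1)]
votes = cluster_poses(model, image)
best_pose, best_count = votes.most_common(1)[0]
print(best_pose, best_count)  # bins are (angle / 10 degrees, tx, ty)
```

Each correct pair of correspondences votes for the same bin, while incorrect pairs scatter their votes, so the true pose shows up as a peak.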

Page 28: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Figure from “The evolution and testing of a model-based object recognition system”, J.L. Mundy and A. Heller, Proc. Int. Conf. Computer Vision, 1990 copyright 1990 IEEE

Page 29: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Figure from “The evolution and testing of a model-based object recognition system”, J.L. Mundy and A. Heller, Proc. Int. Conf. Computer Vision, 1990 copyright 1990 IEEE

Page 30: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Example

[Figure panels: Detected craters; Estimated pose. Green: matched craters. Yellow: unmatched craters.]

Page 31: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Alignment vs. Voting

Not all correct sets of matches will lead to a good pose!

Page 32: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Grouping

If we can determine groups of points that are likely to come from the same object, we can reduce the number of hypotheses that need to be examined.

Page 33: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Invariance

• There are geometric properties that are invariant to camera transformations

• One case: a planar object with a linear transformation

• Assume we have three basis points $P_i$ on the object, then any other point on the object can be written as:

$P_k = P_1 + \alpha_k (P_2 - P_1) + \beta_k (P_3 - P_1)$

• Image points are obtained by multiplying by a linear transformation, so:

$q_k = A P_k = A \left( P_1 + \alpha_k (P_2 - P_1) + \beta_k (P_3 - P_1) \right) = q_1 + \alpha_k (q_2 - q_1) + \beta_k (q_3 - q_1)$

Page 34: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Invariance

• This means that, if we know the basis points in the image, we can compute all of the α’s and β’s.
– they’re the same in object and in image, i.e. invariant

• However, we don’t know the correspondences.

• Suggests another voting strategy:
– form α’s and β’s in image and vote for model points with same values
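
A quick numerical check of the invariance, as a sketch. The helper `affine_coords` solves the 2x2 system for α and β; the map in `transform` is an arbitrary example, and it includes a translation on the assumption (true for these coordinates) that the same cancellation holds for an affine map.

```python
def affine_coords(p, b1, b2, b3):
    """Solve p = b1 + alpha * (b2 - b1) + beta * (b3 - b1) for (alpha, beta)."""
    ux, uy = b2[0] - b1[0], b2[1] - b1[1]
    vx, vy = b3[0] - b1[0], b3[1] - b1[1]
    wx, wy = p[0] - b1[0], p[1] - b1[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("basis points are collinear")
    # Cramer's rule on [u v] [alpha beta]^T = w
    return (wx * vy - wy * vx) / det, (ux * wy - uy * wx) / det

def transform(p):
    """Arbitrary example map q = A p + t with A = [[2, 1], [-1, 3]], t = (5, -2)."""
    x, y = p
    return (2 * x + y + 5, -x + 3 * y - 2)

P1, P2, P3, Pk = (0, 0), (4, 0), (0, 2), (2, 3)
model_ab = affine_coords(Pk, P1, P2, P3)
image_ab = affine_coords(transform(Pk), transform(P1), transform(P2), transform(P3))
print(model_ab, image_ab)
```

Both calls give the same pair of coordinates, even though every point has moved.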

Page 35: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Geometric Hashing

1. Preprocess data by determining α’s and β’s for all sets of points in all objects in database. Store in “hash table”.
– This step is offline.

2. Pick a possible basis set (3 points) in the image.

3. Use image to compute α’s and β’s for remaining points and use them to look up possible matches in hash table.

4. If any object basis set gets enough consistent votes, then it is likely to be present in the image. Otherwise, repeat from step 2.

5. Perform verification.
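
Steps 1 to 4 can be sketched as follows. Assumptions for the example: 2D point features, a dictionary keyed on (α, β) quantized with an arbitrary step of 0.25, a single hypothetical model named "L", and an image basis that is simply given rather than chosen at random. Verification (step 5) is left out.

```python
from collections import Counter, defaultdict
from itertools import permutations

def affine_coords(p, b1, b2, b3):
    """Solve p = b1 + alpha * (b2 - b1) + beta * (b3 - b1) for (alpha, beta)."""
    ux, uy = b2[0] - b1[0], b2[1] - b1[1]
    vx, vy = b3[0] - b1[0], b3[1] - b1[1]
    wx, wy = p[0] - b1[0], p[1] - b1[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("basis points are collinear")
    return (wx * vy - wy * vx) / det, (ux * wy - uy * wx) / det

def key(alpha, beta, step=0.25):
    """Quantize (alpha, beta) into a hash table key."""
    return (round(alpha / step), round(beta / step))

def build_table(models):
    """Step 1 (offline): store (model name, basis) under the key of every other point."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis in permutations(range(len(pts)), 3):
            b1, b2, b3 = (pts[i] for i in basis)
            for k, p in enumerate(pts):
                if k in basis:
                    continue
                try:
                    table[key(*affine_coords(p, b1, b2, b3))].append((name, basis))
                except ValueError:
                    break  # collinear basis: skip it
    return table

def vote(table, image, image_basis):
    """Steps 2-4: look up every non-basis image point and tally (model, basis) votes."""
    b1, b2, b3 = (image[i] for i in image_basis)
    votes = Counter()
    for k, p in enumerate(image):
        if k not in image_basis:
            for entry in table.get(key(*affine_coords(p, b1, b2, b3)), ()):
                votes[entry] += 1
    return votes

models = {"L": [(0, 0), (4, 0), (0, 4), (2, 3), (3, 1), (1, 1)]}
table = build_table(models)
# Image: the model under x' = 2x + y + 5, y' = -x + 3y - 2, shuffled, plus one clutter point.
image = [(12, 5), (5, -2), (55, -132), (13, -6), (8, 0), (9, 10), (12, -2)]
votes = vote(table, image, image_basis=(1, 3, 5))  # images of model points 0, 1, 2
print(votes[("L", (0, 1, 2))])
```

The table is built once per model database; at recognition time each image basis costs only one lookup per remaining image point.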

Page 36: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Indexing With Invariants

• It would be nice to have invariants for more general cases (nonplanar objects, nonlinear transformations).

• Store invariants in a lookup table and index objects quickly.

• Invariants exist for:
– 4 planar points with a linear transformation
– 5 planar points with a perspective projection
– planar curves (lines and conics) with a perspective projection

• There is no (nontrivial) invariant for unrestricted sets of nonplanar points.

Page 37: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Figure from “Efficient model library access by projectively invariant indexing functions,” by C.A. Rothwell et al., Proc. Computer Vision and Pattern Recognition, 1992, copyright 1992, IEEE

Page 38: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.

Verification

• Edge score
– Are there image edges near predicted object edges?
– Can be unreliable; in textured areas there are many edges

• Oriented edge score
– Are there image edges near predicted object edges with the right orientation?
– Better, but still has false positives (see next slide)

• Could use texture, hue, etc.
– Does the tool have the same texture as the wood?

Page 39: Computer Vision Set: Object Recognition Slides by C.F. Olson 1 Object Recognition.
