Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros, CMU, CVPR’08

Apr 01, 2015


Gage Capel
Transcript
Page 1:

Recognition by Association: ask not “What is it?” ask “What is it like?”

Tomasz Malisiewicz and Alyosha Efros, CMU

CVPR’08

Page 3:

Object naming -> Object categorization

[Figure: street scene with labeled objects: sky, building, flag, wall, banner, bus, cars, bus, face, street lamp]

slide by Fei-Fei, Fergus & Torralba

Page 4:

Object categorization

[Figure: street scene with labeled objects: sky, building, flag, wall, banner, bus, cars, bus, face, street lamp]

Page 5:

Classical View of Categories

• Dates back to Plato & Aristotle

1. Categories are defined by a list of properties shared by all elements in a category

2. Category membership is binary

3. Every member in the category is equal

Page 6:

Problems with Classical View

• Humans don’t do this!

– People don’t rely on abstract definitions / lists of shared properties (Rosch 1973)
  • e.g. Are curtains furniture?

– Typicality
  • e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.

– Intransitivity
  • e.g. car seat is chair, chair is furniture, but …

– Not language-independent
  • e.g. “Women, Fire, and Dangerous Things” is a category in an Australian aboriginal language (Lakoff 1987)

– Doesn’t work even in human-defined domains
  • e.g. Is Pluto a planet?

Page 7:

Problems with Visual Categories

Chair

• A lot of categories are functional

Car

• Different views of same object can be visually dissimilar

Page 8:

Categorization in Modern Psychology

• Prototype Theory (Rosch 1973)
– One or more summary representations (prototypes) for each category
– Humans compute similarity between input and prototypes

• Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
– Categories represented in terms of remembered objects (exemplars)
– Similarity is measured between input and all exemplars
– Think non-parametric density estimation
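The link between exemplar theory and non-parametric density estimation can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a hand-picked bandwidth, and made-up two-dimensional features; none of these come from the slides, and the function name `exemplar_scores` is hypothetical.

```python
import math

def exemplar_scores(query, exemplars, bandwidth=1.0):
    """Sum a Gaussian kernel between the query and every remembered exemplar, per label."""
    scores = {}
    for features, label in exemplars:
        sq_dist = sum((q - f) ** 2 for q, f in zip(query, features))
        kernel = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores[label] = scores.get(label, 0.0) + kernel
    return scores

# Toy exemplars: (feature vector, label). Values are made up for illustration.
exemplars = [((0.0, 0.0), "bird"), ((0.2, 0.1), "bird"), ((3.0, 3.0), "car")]
scores = exemplar_scores((0.1, 0.0), exemplars)
best_label = max(scores, key=scores.get)
print(best_label, scores)
```

No prototype is ever computed here: the input is compared against every stored exemplar, and the label with the most kernel mass wins.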

Page 9:

Different way of looking at recognition

[Figure: Input Image with regions labeled Car, Car, Car, Road, Building]

Page 10:

Different way of looking at recognition

Page 11:

What is the ultimate goal?

• Parsing Images

• A “what is it like?” machine

• A kind of “visual memex”

Page 12:

Recognition as Association

LabelMe Dataset

12,905 Object Exemplars

171 unique ‘labels’

http://labelme.csail.mit.edu/

Page 13:

Our Contributions

• Posing Recognition as Association
– Use large number of object exemplars

• Learning Object Similarity
– Different distance function per exemplar

• Recognition-Based Object Segmentation
– Use multiple segmentation approach

Page 14:

Measuring Similarity

Page 15:

Exemplar Representation

Segment from LabelMe

Page 16:

Shape

Centered Mask

Bounding Box Dimensions

Pixel Area

Obj~Obj

Page 17:

Texture

Textons

top, bot, left, right boundary

Interior: Bag-of-Words

Page 18:

Color

Mean Color

Color std

Color Histogram

Page 19:

Location

Absolute Position Mask

Top Height

Bot Height

[Figure: vertical scale from 0 to 1 with example values .42 and .8]

Page 20:

Distance “Similarity” Functions

• Positive Linear Combinations of Elementary Distances Computed Over 14 Features

Building e Distance Function

Building e
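A positive linear combination of elementary distances can be sketched as follows. This is only an illustration: the function name `exemplar_distance`, the weights, and the distance values are hypothetical, and only the count of 14 features comes from the slide.

```python
def exemplar_distance(weights, elementary_distances):
    """Distance from one exemplar to an input: a non-negative weighted sum."""
    if len(weights) != len(elementary_distances):
        raise ValueError("need one weight per elementary distance")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary_distances))

# Hypothetical exemplar that cares about only 2 of its 14 features.
weights = [0.5, 0.0, 1.0] + [0.0] * 11
# Hypothetical elementary distances between this exemplar and one input segment.
distances = [0.2, 0.9, 0.3] + [1.0] * 11

d = exemplar_distance(weights, distances)
print(d)
```

Because each exemplar carries its own weight vector, a feature with zero weight is ignored for that exemplar even if it matters for others.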

Page 21:

Learning Object Similarity

• Learn a different distance function for each exemplar in training set

• Formulation is similar to Frome et al. [1,2]

[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS, 2006.

[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.
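One way to picture learning a distance function for a single focal exemplar is the sketch below. It is not the authors' or Frome et al.'s optimizer: it assumes hinge-style margin constraints around a threshold of 1, plain projected subgradient descent, and made-up two-feature data (shape, color). All names and hyperparameters are hypothetical.

```python
def dot(w, d):
    return sum(wk * dk for wk, dk in zip(w, d))

def learn_weights(similar, dissimilar, n_features, steps=500, lr=0.05, margin=0.1):
    """Learn non-negative weights so that similar inputs get distance < 1
    and dissimilar inputs get distance > 1 (assumed formulation)."""
    w = [0.0] * n_features
    for _ in range(steps):
        grad = [0.0] * n_features
        for d in similar:            # penalize similar inputs that are too far
            if dot(w, d) > 1.0 - margin:
                for k in range(n_features):
                    grad[k] += d[k]
        for d in dissimilar:         # penalize dissimilar inputs that are too close
            if dot(w, d) < 1.0 + margin:
                for k in range(n_features):
                    grad[k] -= d[k]
        # Gradient step, then project back onto the non-negative orthant.
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
    return w

# Elementary distances (shape, color) from the focal exemplar to other objects.
similar = [(0.1, 0.9), (0.2, 0.8)]     # close in shape, far in color
dissimilar = [(0.9, 0.1), (0.8, 0.9)]  # far in shape
w = learn_weights(similar, dissimilar, n_features=2)
print(w)
```

On this toy data the learned shape weight ends up larger than the color weight, which matches the idea that each exemplar decides which features matter for it.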

Page 22:

Non-parametric density estimation

[Figure: scatter plot of Class 1, Class 2, and Class 3; axes: Color Dimension, Shape Dimension]

Page 23:

Non-parametric density estimation

[Figure: scatter plot of Class 1, Class 2, and Class 3; axes: Color Dimension, Shape Dimension]

Page 24:

Non-parametric density estimation

[Figure: scatter plot of Class 1, Class 2, and Class 3; axes: Color Dimension, Shape Dimension]

Page 25:

Learning Distance Functions

[Figure: axes Dshape and Dcolor; Focal Exemplar]

Page 26:

Learning Distance Functions

[Figure: axes Dshape and Dcolor; Focal Exemplar; Decision Boundary between the “similar” side and the “dissimilar” side; Don’t Care region]

Page 27:

Visualizing Distance Functions (Training Set)

Query

Query

Top Neighbors with Tex-Hist Dist

Top Neighbors with Learned Dist

Page 28:

Visualizing Distance Functions (Training Set)

Page 29:

Visualizing Distance Functions (Training Set)

Page 30:

Visualizing Distance Functions (Training Set)

Page 31:

Visualizing Distance Functions (Training Set)

[Figure labels: person, person, person, person, person]

Page 32:

Visualizing Distance Functions (Training Set)

[Figure labels: person, person, person, standing person, woman, person]

Different Label on “similar” side of distance function

Page 33:

Labels Crossing Boundary

Page 34:

“Conventional” Recognition in Test Set

• Compute the similarity between an input and all exemplars

• All exemplars with D < 1 are “associated” with the input

• Most occurring label from associations is propagated onto input

• Association confidence score favors more associations and smaller distances
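The association and label-propagation steps above can be sketched as follows. The threshold D < 1 and the majority vote come from the slide; the confidence formula (sum of 1 - D over associations) is only an assumption chosen because it grows with more associations and with smaller distances. Function names and data are hypothetical.

```python
from collections import Counter

def associate(distances, threshold=1.0):
    """Indices of exemplars whose learned distance to the input is below the threshold."""
    return [i for i, d in enumerate(distances) if d < threshold]

def propagate_label(distances, labels, threshold=1.0):
    """Return (most frequent label among associations, confidence), or (None, 0.0)."""
    hits = associate(distances, threshold)
    if not hits:
        return None, 0.0
    votes = Counter(labels[i] for i in hits)
    label = votes.most_common(1)[0][0]
    # Assumed score: more associations and smaller distances both raise it.
    confidence = sum(threshold - distances[i] for i in hits)
    return label, confidence

# Hypothetical learned distances from four exemplars to one input segment.
distances = [0.3, 0.8, 1.4, 0.5]
labels = ["car", "car", "road", "bus"]
label, confidence = propagate_label(distances, labels)
print(label, confidence)
```

An input with no exemplar inside the threshold receives no label at all, which is how this scheme can decline to name something unfamiliar.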


Page 35:

Performance on labeling perfect segments (test set)

Page 36:

Object Segmentation via Recognition

• Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)

– Mean-Shift and Normalized Cuts

– Use pairs and triplets of adjacent segments

– Generate about 10,000 segments per image

• Enhance training with bad segments

• Apply learned distance functions to bottom-up segments
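Building candidate regions from pairs and triplets of adjacent segments can be sketched like this. The sketch assumes a precomputed adjacency map between bottom-up segments (for example from Mean-Shift or Normalized Cuts output) and also keeps single segments as candidates; both choices and the function name are assumptions, not details given on the slide.

```python
def candidate_segments(adjacency):
    """adjacency: dict mapping segment id -> set of neighboring segment ids.
    Returns singles, adjacent pairs, and connected triplets as frozensets."""
    candidates = {frozenset([s]) for s in adjacency}
    for a, neighbors in adjacency.items():
        for b in neighbors:
            candidates.add(frozenset([a, b]))
            # Grow the pair into a triplet with any segment touching a or b.
            for c in adjacency[a] | adjacency[b]:
                if c not in (a, b):
                    candidates.add(frozenset([a, b, c]))
    return candidates

# Four segments in a chain: 0 - 1 - 2 - 3
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
candidates = candidate_segments(adjacency)
print(len(candidates))
```

Each candidate would then be described with the same features as the exemplars and scored with the learned distance functions.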

Page 37:

Example Associations

Bottom-Up Segments

Page 38:

Quantitative Evaluation

Object hypothesis is correct if labels match and OS > .5

*We do not penalize for multiple correct overlapping associations

OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
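The overlap score and the correctness rule can be written out as a short sketch. Regions are represented here as sets of (row, column) pixel coordinates, which is an assumption made for illustration; the function names are hypothetical.

```python
def overlap_score(a, b):
    """OS(A, B) = |A intersect B| / |A union B| for two pixel sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def is_correct(pred_label, pred_pixels, true_label, true_pixels, threshold=0.5):
    """A hypothesis is correct if the labels match and OS exceeds the threshold."""
    return pred_label == true_label and overlap_score(pred_pixels, true_pixels) > threshold

# Two 2x2 regions that share one column of pixels.
region_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
region_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = overlap_score(region_a, region_b)
print(score)
```

In this example the regions share 2 of 6 pixels, so the score is 1/3 and the hypothesis would be rejected even if the labels matched.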

Page 39:

Toward Image Parsing

Page 40:

Conclusion: Main Points

• Object Association: defining an object in terms of a set of visually similar objects. Trying to get away from classes.

• Learning per-exemplar distances: each object gets to decide on its own distance function. Suddenly, NN distances are meaningful!

• Using multiple segmentations: partition the input image into manageable chunks that can then be matched

Page 41:

Thank You

Questions?