Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros, CMU, CVPR’08
Apr 01, 2015
Understanding an Image
slide by Fei Fei, Fergus & Torralba
Object naming -> Object categorization
Figure: image annotated with the labels sky, building, flag, wall, banner, bus, cars, bus, face, street lamp
slide by Fei Fei, Fergus & Torralba
Object categorization
Figure: image annotated with the labels sky, building, flag, wall, banner, bus, cars, bus, face, street lamp
Classical View of Categories
• Dates back to Plato & Aristotle
1. Categories are defined by a list of properties shared by all elements in a category
2. Category membership is binary
3. Every member in the category is equal
Problems with Classical View
• Humans don’t do this!
– People don’t rely on abstract definitions / lists of shared properties (Rosch 1973)
  • e.g. Are curtains furniture?
– Typicality
  • e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.
– Intransitivity
  • e.g. car seat is chair, chair is furniture, but …
– Not language-independent
  • e.g. the “Women, Fire, and Dangerous Things” category in an Australian Aboriginal language (Lakoff 1987)
– Doesn’t work even in human-defined domains
  • e.g. Is Pluto a planet?
Problems with Visual Categories
Chair
• A lot of categories are functional
Car
• Different views of the same object can be visually dissimilar
Categorization in Modern Psychology
• Prototype Theory (Rosch 1973)
– One or more summary representations (prototypes) for each category
– Humans compute similarity between input and prototypes
• Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
– Categories represented in terms of remembered objects (exemplars)
– Similarity is measured between input and all exemplars
– Think non-parametric density estimation
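The "think non-parametric density estimation" remark can be made concrete with a small sketch. It scores each category by summing kernel similarities between the input and every stored exemplar of that category. The Gaussian kernel, the `bandwidth` parameter, the function name `exemplar_classify`, and the toy 2-D feature values are all assumptions made for illustration; none of them come from the slides.

```python
import math
from collections import defaultdict


def exemplar_classify(query, exemplars, bandwidth=1.0):
    """Score each category by summing Gaussian-kernel similarities between
    the query and every stored exemplar of that category."""
    scores = defaultdict(float)
    for features, label in exemplars:
        sq_dist = sum((q - f) ** 2 for q, f in zip(query, features))
        scores[label] += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Toy exemplars with made-up 2-D features: (features, label)
exemplars = [
    ((0.0, 0.0), "chair"),
    ((0.2, 0.1), "chair"),
    ((3.0, 3.0), "car"),
    ((3.1, 2.8), "car"),
]

label, scores = exemplar_classify((0.1, 0.0), exemplars)
print(label, scores)
```

No prototype is ever computed here: the category score is built directly from the remembered exemplars, which is the contrast with prototype theory drawn above.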
Different way of looking at recognition
Figure: input image with regions labeled Car, Car, Car, Road, and Building
What is the ultimate goal?
• Parsing Images
• A “what is it like?” machine
• A kind of “visual memex”
Recognition as Association
LabelMe Dataset
12,905 Object Exemplars
171 unique ‘labels’
http://labelme.csail.mit.edu/
Our Contributions
• Posing Recognition as Association
– Use a large number of object exemplars
• Learning Object Similarity
– Different distance function per exemplar
• Recognition-Based Object Segmentation
– Use a multiple-segmentation approach
Measuring Similarity
Exemplar Representation
Segment from LabelMe
Shape
• Centered Mask
• Bounding Box Dimensions
• Pixel Area
Obj~Obj
Texture
• Textons
• Top, bottom, left, right boundary
• Interior: Bag-of-Words
Color
• Mean Color
• Color std
• Color Histogram
Location
• Absolute Position Mask
• Top Height
• Bot Height
Figure: vertical axis from 0 to 1 with example values .42 and .8
Distance “Similarity” Functions
• Positive Linear Combinations of Elementary Distances Computed Over 14 Features
Figure: a building exemplar e and its distance function
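As a rough sketch of what a positive linear combination of elementary distances looks like in code: each exemplar carries its own non-negative weight vector, and its distance to an input is the weighted sum of the per-feature elementary distances. The function name `exemplar_distance` and the three-feature toy values are hypothetical (the slides use 14 features and do not give weights).

```python
def exemplar_distance(weights, elementary_distances):
    """Distance from one exemplar to an input: a non-negative weighted sum
    of elementary (per-feature) distances."""
    if len(weights) != len(elementary_distances):
        raise ValueError("need exactly one weight per elementary distance")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary_distances))


# Hypothetical exemplar that cares about features 0 and 2 and ignores feature 1.
weights = [0.5, 0.0, 2.0]
dists_to_input = [0.4, 0.9, 0.1]
print(exemplar_distance(weights, dists_to_input))
```

A zero weight lets an exemplar ignore a feature entirely, so two exemplars with the same label can still rank the same input very differently.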
Learning Object Similarity
• Learn a different distance function for each exemplar in training set
• Formulation is similar to Frome et al. [1,2]
[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS, 2006.
[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.
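To give a feel for per-exemplar learning, here is a deliberately simplified sketch. It is not the formulation of Frome et al. and not necessarily the one used in this work; it is a perceptron-style hinge update with a projection that keeps the weights non-negative. It pushes the focal exemplar's distance below 1 for examples marked similar and above 1 for examples marked dissimilar. The function names, the margin, the learning rate, and the toy (shape, color) distance vectors are assumptions.

```python
def distance(weights, dists):
    """Weighted sum of elementary distances."""
    return sum(w * d for w, d in zip(weights, dists))


def learn_exemplar_weights(similar, dissimilar, margin=0.2, lr=0.1, epochs=200):
    """Learn non-negative weights for ONE focal exemplar.

    similar / dissimilar: lists of elementary-distance vectors measured
    from the focal exemplar to other training objects.
    """
    weights = [1.0] * len(similar[0])  # assumed uniform initialization
    data = [(d, -1.0) for d in similar] + [(d, 1.0) for d in dissimilar]
    for _ in range(epochs):
        updated = False
        for dists, target in data:
            # Hinge violation: similar wants D <= 1 - margin,
            # dissimilar wants D >= 1 + margin.
            if margin - target * (distance(weights, dists) - 1.0) > 0:
                # Subgradient step, then clip to keep weights non-negative.
                weights = [max(0.0, w + lr * target * d)
                           for w, d in zip(weights, dists)]
                updated = True
        if not updated:
            break
    return weights


# Toy data: (D_shape, D_color) from a focal exemplar; shape is the informative cue.
similar = [(0.1, 0.9), (0.2, 0.1), (0.15, 0.5)]
dissimilar = [(0.9, 0.2), (0.8, 0.8), (1.0, 0.1)]
w = learn_exemplar_weights(similar, dissimilar)
print(w)
```

On this toy data the color weight shrinks and the shape weight grows, which is the sense in which each exemplar decides which features matter for it.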
Non-parametric density estimation
Figure (three builds): scatter plot with axes Color Dimension and Shape Dimension, showing Class 1, Class 2, and Class 3
Learning Distance Functions
Figure (two builds): focal exemplar with axes Dshape and Dcolor; the second build adds a decision boundary separating the “similar” side from the “dissimilar” side, plus a “Don’t Care” label
Visualizing Distance Functions (Training Set)
Figure: query objects shown with their Top Neighbors with Tex-Hist Dist and their Top Neighbors with Learned Dist
Figure: neighbors labeled person, person, person, person, person
Figure: neighbors labeled person, person, person, standing person, woman, person
Different Label on “similar” side of distance function
Labels Crossing Boundary
“Conventional” Recognition in Test Set
• Compute the similarity between an input and all exemplars
• All exemplars with D < 1 are “associated” with the input
• The most frequent label among the associations is propagated onto the input
• Association confidence score favors more associations and smaller distances
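The four steps above can be sketched as follows, assuming the elementary distances between the input and each exemplar have already been computed. The function name `recognize`, the data layout, and the confidence formula (the sum of 1 - D over the winning label's associations) are assumptions; the slides only say that the score favors more associations and smaller distances.

```python
from collections import Counter


def recognize(exemplars, elementary_distances, threshold=1.0):
    """exemplars: list of (label, weights), one learned weight vector each.
    elementary_distances: one distance vector per exemplar, input to exemplar.
    Returns (label, confidence, associations)."""
    associations = []
    for (label, weights), dists in zip(exemplars, elementary_distances):
        d = sum(w * x for w, x in zip(weights, dists))
        if d < threshold:  # exemplar is "associated" with the input
            associations.append((label, d))
    if not associations:
        return None, 0.0, associations
    votes = Counter(label for label, _ in associations)
    best_label = votes.most_common(1)[0][0]
    # Assumed score: grows with more associations and with smaller distances.
    confidence = sum(threshold - d for label, d in associations
                     if label == best_label)
    return best_label, confidence, associations


# Toy example with made-up weights and distances.
exemplars = [("car", [1.0, 1.0]), ("car", [2.0, 0.0]),
             ("road", [1.0, 1.0]), ("building", [0.5, 0.5])]
dists = [[0.2, 0.3], [0.1, 0.9], [0.6, 0.7], [0.4, 0.4]]
label, confidence, associations = recognize(exemplars, dists)
print(label, confidence, associations)
```

An input with no exemplar closer than the threshold gets no label at all, so the same threshold doubles as a rejection rule.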
Performance on labeling perfect segments (test set)
Object Segmentation via Recognition
• Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)
– Mean-Shift and Normalized Cuts
– Use pairs and triplets of adjacent segments
– Generate about 10,000 segments per image
• Enhance training with bad segments
• Apply learned distance functions to bottom-up segments
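One way to enumerate the pairs and triplets of adjacent segments is sketched below, assuming the bottom-up segmentation has already been reduced to an adjacency map between segment ids. The function name `candidate_groups`, the adjacency format, and the choice to keep single segments as candidates are assumptions; running Mean-Shift or Normalized Cuts themselves is out of scope here.

```python
def candidate_groups(adjacency):
    """adjacency: dict mapping segment id -> set of neighboring segment ids.
    Returns the set of candidate regions (as frozensets of segment ids):
    single segments, adjacent pairs, and connected triplets."""
    singles = {frozenset([s]) for s in adjacency}
    pairs = {frozenset([a, b]) for a in adjacency for b in adjacency[a]}
    triplets = set()
    for pair in pairs:
        for member in pair:
            for neighbor in adjacency[member]:
                if neighbor not in pair:
                    # Growing a pair by a neighbor keeps the group connected.
                    triplets.add(pair | {neighbor})
    return singles | pairs | triplets


# Four segments in a chain: 0 - 1 - 2 - 3
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
groups = candidate_groups(adjacency)
print(sorted(sorted(g) for g in groups))
```

Each candidate group would then be merged into one region and scored with the learned distance functions like any other segment.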
Example Associations
Bottom-Up Segments
Quantitative Evaluation
Object hypothesis is correct if labels match and OS > .5
*We do not penalize for multiple correct overlapping associations
OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
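The overlap score and the correctness rule can be written directly from the definitions above. In this sketch, regions are represented as sets of (row, col) pixel coordinates; that representation and the function names `overlap_score` and `is_correct` are assumptions made for illustration.

```python
def overlap_score(a, b):
    """OS(A, B) = |A intersect B| / |A union B| for two pixel sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty regions: define the score as zero
    return len(a & b) / len(union)


def is_correct(pred_label, pred_pixels, gt_label, gt_pixels, threshold=0.5):
    """A hypothesis is correct if the labels match and OS > threshold."""
    return (pred_label == gt_label
            and overlap_score(pred_pixels, gt_pixels) > threshold)


# Two 4x4 boxes, the second shifted one column to the right.
pred = {(r, c) for r in range(4) for c in range(4)}
truth = {(r, c) for r in range(4) for c in range(1, 5)}
print(overlap_score(pred, truth))            # 12 shared pixels / 20 total
print(is_correct("car", pred, "car", truth))
```

Shifting the box by two columns instead of one drops the score to 8/24, which fails the 0.5 threshold even though the labels still match.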
Toward Image Parsing
Conclusion: Main Points
• Object Association: defining an object in terms of a set of visually similar objects. Trying to get away from classes.
• Learning per-exemplar distances: each object gets to decide on its own distance function. Suddenly, NN distances are meaningful!
• Using multiple segmentations: partition the input image into manageable chunks that can then be matched
Thank You
Questions?