Learning Shared Body Plans Ian Endres University of Illinois work with Derek Hoiem, Vivek Srikumar and Ming-Wei Chang
Dec 11, 2015
How should we represent multiple related object categories?
Want to detect, localize, and estimate the pose of a broad range of objects, including new ones
One option: independent detectors
Basic-Level Categories: Cat Detector, Dog Detector
Broad Categories: 4-Legged Animal Detector
Parts: Head Detector
…
Our previous work: Train separate detectors, Joint spatial model
[Figure labels: Vehicle, Wheel, Animal, Leg, Head, Four-legged Mammal; Can run, Can jump, Facing right, Moves on road]
Farhadi Endres Hoiem (2010)
Jointly trained multi-category models
• Train part/category detectors to jointly predict object structure
– Only need to perform well in context defined by others
• Spatial model encodes likely part positions, number of parts, likely categories, etc.
– Generalizes Felzenszwalb et al.: cross-category sharing, multiple parts with one model, variable size
Shared mixture of deformable parts: Body Plans
Include a body plan for background patches: no appearance models, just a bias
Anchor Point Score
Sa = bias + appearance score - deformation cost
• Appearance score: HOG-based deformable part model (Felzenszwalb et al.)
• Deformation cost: quadratic penalty in position and scale
• Overall score must be greater than 0 to be detected
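The anchor point score can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the `(dx, dy, ds)` offset layout, and the per-dimension quadratic weights are all assumptions, and a single anchor score stands in for the overall score when applying the threshold.

```python
def anchor_score(bias, appearance, offset, weights):
    """Score of one anchor point: bias + appearance score - deformation cost.

    offset  = (dx, dy, ds): displacement from the anchor in x, y and scale
              (hypothetical layout).
    weights = (wx, wy, ws): non-negative quadratic penalty weights
              (hypothetical parameters).
    """
    # Quadratic penalty in position and scale.
    deformation = sum(w * d * d for w, d in zip(weights, offset))
    return bias + appearance - deformation


def is_detected(overall_score):
    # The slides state that the overall score must exceed 0 for a detection.
    return overall_score > 0


# Illustrative numbers only.
example = anchor_score(bias=-1.0, appearance=2.5,
                       offset=(2.0, 1.0, 0.0), weights=(0.1, 0.1, 0.5))
print(example, is_detected(example))
```

A negative bias, as in the example, means a part needs enough appearance evidence to overcome it before it contributes to a detection.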
(Latent) Max Margin Structured Learning
Highest Scoring Valid Structure
Invalid Structure Loss
Soft margin slack
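The labels above annotate an objective whose formula did not survive in this transcript. One plausible form, assuming a standard margin-rescaled latent structural SVM, is sketched below. The symbols are assumptions introduced here: w (weights), Φ (joint feature map), 𝒴ᵢ (set of valid structures for image i), Δᵢ (invalid structure loss), ξᵢ (soft margin slack), and C (regularization trade-off).

```latex
\min_{w,\ \xi \ge 0}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad
\underbrace{\max_{y \in \mathcal{Y}_i} w^\top \Phi(x_i, y)}_{\text{highest scoring valid structure}}
\ \ge\
w^\top \Phi(x_i, \hat{y})
+ \underbrace{\Delta_i(\hat{y})}_{\text{invalid structure loss}}
- \underbrace{\xi_i}_{\text{soft margin slack}}
\qquad \forall\, \hat{y} \notin \mathcal{Y}_i
```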
Valid Structures
[Figure labels: Leg, Leg, Leg, Leg, Head, Four-legged, Elk]
Positive Examples
• Object Detectors: 50% overlap with ground truth
• Part Detectors: 25% overlap with ground truth
Negative Examples
• Must select BG body plan
Loss
[Figure labels: Leg, Leg, Leg, Head, Four-legged, Elk; Head, Leg]
Positive Examples
• False Positives: +1
• Duplicate Detections: +1
• Missed Detections: +1
Negative Examples
• Non-BG body plan: +1
• False Positives: +1
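The loss terms listed above can be sketched as simple counting functions. This is a minimal illustration under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, predictions are matched greedily to ground truth in the order given, and the function names are hypothetical. The slides do not specify the matching procedure.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def positive_example_loss(predicted, ground_truth, threshold=0.5):
    """+1 per false positive, duplicate detection, and missed detection.

    threshold: 0.5 for objects and 0.25 for parts, per the overlap criteria.
    """
    matched = set()
    loss = 0
    for box in predicted:
        hits = [i for i, gt in enumerate(ground_truth)
                if iou(box, gt) >= threshold]
        free = [i for i in hits if i not in matched]
        if not hits:
            loss += 1          # false positive
        elif not free:
            loss += 1          # duplicate detection
        else:
            matched.add(free[0])
    loss += len(ground_truth) - len(matched)   # missed detections
    return loss


def negative_example_loss(selected_plan, predicted):
    """+1 for choosing a non-BG body plan, +1 per false positive."""
    return (1 if selected_plan != "BG" else 0) + len(predicted)
```

For example, one correct box, one duplicate of it, one stray box, and one undetected ground-truth box give a loss of 3.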
Optimization
• Latent Structured SVM
– Non-convex: CCCP
• Stochastic gradient descent based cutting plane optimization
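As a rough sketch of one inner step: CCCP fixes the best valid structure under the current weights, which makes the remaining problem convex, and a stochastic subgradient step can then be taken on a single mined constraint. The function below is an assumption-laden illustration, not the authors' optimizer: `sgd_step`, the learning rate `lr`, and the regularization weight `reg` are all hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgd_step(w, phi_valid, phi_invalid, loss, lr=0.1, reg=0.01):
    """One subgradient step on a single margin constraint.

    phi_valid:   features of the (fixed) highest scoring valid structure.
    phi_invalid: features of a mined invalid structure.
    loss:        loss of that invalid structure.
    """
    # The constraint is violated when the valid structure does not
    # outscore the invalid one by at least the loss.
    violated = dot(w, phi_valid) - dot(w, phi_invalid) < loss
    new_w = []
    for wi, pv, pi in zip(w, phi_valid, phi_invalid):
        grad = reg * wi                 # regularizer term
        if violated:
            grad += pi - pv             # hinge subgradient
        new_w.append(wi - lr * grad)
    return new_w
```

When the constraint is violated, the step moves the weights toward the valid structure's features and away from the invalid one's.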
Optimization Challenges
1) Expensive search for violated constraints
– Mine many violated constraints at once
– Speeds convergence
2) Large feature vectors (100k+)
– Can’t store every mined violated constraint
– Requires careful caching
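The slides do not say how the cache is managed. One simple policy, shown purely as an assumption, is a fixed-capacity cache that keeps the most violated constraints and evicts the least violated one when full. The class name and eviction rule below are hypothetical.

```python
import heapq


class ConstraintCache:
    """Keep at most `capacity` mined constraints, evicting the least violated."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []    # min-heap of (violation, insertion index, constraint)
        self._count = 0    # tie-breaker so constraints are never compared

    def add(self, violation, constraint):
        entry = (violation, self._count, constraint)
        self._count += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif violation > self._heap[0][0]:
            # Replace the least violated cached constraint.
            heapq.heapreplace(self._heap, entry)

    def constraints(self):
        """Cached constraints, most violated first."""
        return [c for _, _, c in sorted(self._heap, reverse=True)]
```

With 100k+ dimensional feature vectors, capping the number of cached constraints bounds memory directly. Other policies, such as evicting constraints that are no longer active, would fit the same interface.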
Experimental Setup
• CORE: Train + Test– Familiar Categories: Camel, Dog, Elephant, Elk– Parts: Head, Leg, Torso– Unfamiliar Categories: Cat, Cow
• Pascal 2008: Test– Unfamiliar Categories: Cat, Cow, Horse, Sheep
Mixed Supervision
[Figure: example images labeled Leg, Head, Four-legged, Dog; Learning]
Mixed Supervision - Learning
• Unlabeled boxes become latent variables
– Compute most likely position
– No loss for missed detections
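The "compute most likely position" step can be sketched as filling each unlabeled box with its best-scoring candidate under the current model. This sketch covers only that step. The data layout (a dict mapping part names to a box or `None`), the candidate lists, and the `score` callable are assumptions for illustration.

```python
def fill_latent_boxes(annotation, candidates, score):
    """Replace unlabeled (None) boxes with the best-scoring candidate.

    annotation: dict part name -> box, or None if the part is unlabeled.
    candidates: dict part name -> list of candidate boxes.
    score:      callable (part, box) -> float under the current model.
    """
    completed = {}
    for part, box in annotation.items():
        if box is None:
            # Latent variable: pick the most likely position.
            completed[part] = max(candidates[part],
                                  key=lambda b: score(part, b))
        else:
            completed[part] = box       # labeled boxes are kept as given
    return completed


# Toy usage with a stand-in scoring function.
annotation = {"head": (0, 0, 5, 5), "leg": None}
candidates = {"leg": [(0, 5, 2, 10), (3, 5, 5, 10)]}
toy_score = lambda part, box: float(box[0])
completed = fill_latent_boxes(annotation, candidates, toy_score)
```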
Highest Scoring Valid Structure
Loss
Conclusions
• Jointly representing related categories leads to better performance and generalization to unfamiliar categories
• Joint training important to get full benefit of spatial model