Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework
Li-Jia Li, Richard Socher, Li Fei-Fei

Dec 19, 2015

Page 1: Title slide.

Page 2:
Example image tags: City, Travel, Pagoda, Sunrise, Sunshine, Sun

Page 3:
Example image tags: City, Travel, Pagoda, Sunrise, Sunshine, Sun
Related work, shown on the slide in three groups under the headings Classification, Annotation and Segmentation:
Weber et al. 00; Fergus et al. 03; Felzenszwalb et al. 04; Fei-Fei et al. 05; Sivic et al. 05; Bosch et al. 06; Oliva et al. 01; Lazebnik et al. 06
Shi et al. 00; Felzenszwalb et al. 04; Sali et al. 99; Winn et al. 05; Kumar et al. 05; Cao et al. 07; Russell et al. 06; Todorovic et al. 06
Duygulu et al. 02; Barnard et al. 03; Blei et al. 03; Gupta et al. 08; Alipr (Li et al. 03); Sudderth et al. 05
Remark: Approaches in yellow are compared with our model in the later experiments.

Page 4:
Same example tags and related-work groups as on Page 3 (Classification, Annotation, Segmentation), now brought together under: Total Scene Understanding

Page 5:
Application

Page 6:
Classification, Annotation, Segmentation
Mutually beneficial!

Page 7:
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Classification, Annotation, Segmentation
Label shown on the image: Horse
class: Polo

Page 8:
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Classification, Annotation, Segmentation
Labels shown on the image: Horse, Athlete, Sky, Tree, Grass
class: Polo

Page 9:
class: Polo
Label shown on the image: Horse
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Classification, Annotation, Segmentation

Page 10:
Related Work:
Tu et al. 03: Annotation, Segmentation
Li & Fei-Fei 07: Annotation, Classification
Heitz et al. 08: Classification, Segmentation
Example image used throughout: class Polo; labels Horse, Athlete, Sky, Tree, Grass

Page 11:
Outline
Model
Learning
Recognition & Experiment
Classification, Annotation, Segmentation

Page 12:
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle

Page 13:
[Graphical model build-up. Node label: C. Plate label: D. Components labeled Visual and Text.]
class: Polo
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Joint distribution of the random variables: Visual Component, Text Component.

Page 14:
[Graphical model build-up. Node labels: C, O. Plate label: D. Components labeled Visual, Text and Text Component.]
class: Polo

Page 15:
[Graphical model build-up. Node labels: C, O, R. Plate labels: D, NF. Components labeled Visual, Text and Text Component.]
R, NF: Color, Location, Texture, Shape
class: Polo

Page 16:
[Graphical model build-up. Node labels: C, O, R, X. Plate labels: D, NF, Ar. Components labeled Visual, Text and Text Component.]
class: Polo

Page 17:
[Graphical model build-up. Node labels: C, O, R, X, Z. Plate labels: D, NF, Ar, Nr, Nt. Components labeled Visual, Text and Text Component.]
Z: “Connector variable”
class: Polo
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle

Page 18:
[Graphical model build-up. Node labels: C, O, R, X, Z, S. Plate labels: D, NF, Ar, Nr, Nt. Components labeled Visual and Text.]
Z: “Connector variable”
S: “Switch variable” (Visible / Not visible)
class: Polo
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Labels shown on the image: Horse, Athlete

Page 19:
[Graphical model build-up. Node labels: C, O, R, X, Z, S, T. Plate labels: D, NF, Ar, Nr, Nt. Components labeled Visual and Text.]
Z: “Connector variable”
S: “Switch variable” (Visible / Not visible)
class: Polo
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle
Label shown on the image: Horse

Page 20:
[Graphical model diagram with Visual and Text components. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]
Outline
Model
Learning
Recognition & Experiment

Page 21:
Learning
Exact inference is intractable!
Relationship of the random variables
[Graphical model diagram with Visual and Text components. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]

Page 22:
Relationship of the random variables
[Graphical model diagram with Visual and Text components. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]
Top-down force
Bottom-up force from visual information
Bottom-up force from text information
Collapsed Gibbs Sampling (R. Neal, 2000)
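
Collapsed Gibbs sampling integrates out a model's multinomial parameters and resamples one discrete assignment at a time from its conditional given all the other assignments. The sketch below is a toy and not the model from this talk: it assumes a plain LDA-style model in which each image is a bag of visual-word ids and every patch receives a latent "object" index. The function name `collapsed_gibbs_lda`, the hyperparameters `alpha` and `beta`, and the example data are illustrative assumptions.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size,
                        alpha=0.5, beta=0.1, sweeps=50, seed=0):
    """Toy collapsed Gibbs sampler for an LDA-style model (assumed stand-in).

    docs: list of lists of word ids in [0, vocab_size).
    Returns (assignments, doc_topic_counts, topic_word_counts).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initialization of every token's latent index.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k_old = assign[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Unnormalized conditional with the multinomial
                # parameters integrated out (Dirichlet priors).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its new assignment.
                assign[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return assign, doc_topic, topic_word


if __name__ == "__main__":
    # Three tiny "images", each a bag of visual-word ids (made-up data).
    images = [[0, 0, 1, 1], [2, 3, 3, 2], [0, 1, 2, 3]]
    z, per_image, per_object = collapsed_gibbs_lda(images, 2, 4, sweeps=20)
    print(per_image)
```

In the talk's model the sampled variables and their conditionals differ (they combine the top-down and bottom-up forces listed on this slide); only the remove, reweight, resample, restore loop carries over.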

Page 23:
Scene/Event images from the Internet
Tags: Athlete, Horse, Grass, Tree, Saddle
There is no object-text correspondence…

Page 24:
Scene/Event images from the Internet
Tags: Athlete, Horse, Grass, Tree, Saddle
Our model builds the correspondence…
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 25:
Scene/Event images from the Internet
Tags (image 1): Athlete, Horse, Grass, Trees, Sky, Saddle
Tags (image 2): Athlete, Horse, Grass, Ball
However, a big obstacle is that many objects always co-occur.
[Image regions marked with question marks.]

Page 26:
Scene/Event images from the Internet
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]
One solution: a good initialization of O
Labels shown on the image: Grass, Athlete, Horse
Tags: Athlete, Horse, Grass, Trees, Sky, Saddle

Page 27:
Scene/Event images from the Internet
Initializing O: obtain Internet images for each O
Object images

Page 28:
Initializing O: train an object detector for each O
Any object detection & segmentation algorithm
Object images; Event/Scene images
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 29:
Initialize O in the scene image by the trained object detectors
Any object detection & segmentation algorithm, used as a black box
Object images; Event/Scene images
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 30:
Initialize O in the scene image by the trained object detectors
Black box object detection & segmentation: Cao & Fei-Fei, 2007
[Diagram labels shown with Cao & Fei-Fei, 2007 and Our Model: θ, C, X, R, O, Nr, Ar.]
Object images; Event/Scene images
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 31:
Auto-semi-supervised learning: Small # of initialized images + Large # of uninitialized images
Scene/Event images, with example tag sets:
Athlete, Horse, Grass, Tree, Saddle, Wind
Athlete, Rock, Grass, Tree, Sky, Rope
Athlete, Snow, Tree, Sky, Snowboard
Our Model
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 32:
Small # of automatically initialized images
Large # of uninitialized images
Example tag sets:
Athlete, Horse, Grass, Tree, Saddle, Wind
Athlete, Rock, Grass, Tree, Sky, Rope
Athlete, Snow, Tree, Sky, Snowboard
[Graphical model diagram with Visual and Text components. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]
Outline
Model
Learning
Recognition & Experiment
• Dataset
• Learned Model
• Results

Page 33:
8 Event/Scene Classes
Badminton, Bocce, Croquet, Polo
Remark: Tags are not used during testing

Page 34:
8 Event/Scene Classes
Rock climbing, Rowing, Sailing, Snowboarding

Page 35:
Learned model: O
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 36:
Learned model: R
Objects shown: Athlete, Grass, Horse
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 37:
Learned model: S
[Graphical model diagram. Node labels: C, O, R, X, Z, S, T. Plate labels: D, Nr, NF, Ar, Nt.]

Page 38:
Classification, Annotation, Segmentation
8-way classification: 54%

Page 39:
Classification, Annotation, Segmentation
Alipr: Li et al. 03
Corr-LDA: Blei et al. 03

Page 40:
Classification, Annotation, Segmentation

Page 41:
Effect of top-down class context
Model w/o top-down class [diagram node labels: O, R, X, Z, T, S]
Full Model [diagram node labels: C, O, R, X, Z, T, S]
Label shown on the example image: Horse

Page 42:
Learning: Small # of automatically initialized images, Large # of uninitialized images
Example tag sets:
Athlete, Horse, Grass, Tree, Saddle, Wind
Athlete, Rock, Grass, Tree, Sky, Rope
Athlete, Snow, Tree, Sky, Snowboard
Model
[Graphical model diagram with Visual and Text components. Node labels: C, O, R, X, Z, S, T. Plate labels: Nr, NF, Ar, Nt.]
Recognition & Experiment
Class: Rock climbing. Labels on the image: Sky, Athlete, Tree, Mountain, Rock. Tags: Athlete, Mountain, Tree, Rock, Sky, Ascent
Class: Sailing. Labels on the image: Sky, Athlete, Water, Tree, Sailboat. Tags: Athlete, Sailboat, Tree, Water, Sky, Wind
Class: Snowboarding. Labels on the image: Tree, Athlete, Snowboard, Snow. Tags: Athlete, Snowboard, Tree, Snow, Sky, Powder

Page 43:
Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, and the anonymous reviewers.
And you.