Feedforward semantic segmentation with zoom-out features MOSTAJABI, YADOLLAHPOUR AND SHAKHNAROVICH TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO

Jan 21, 2016


Avis Hancock
Transcript

Feedforward semantic segmentation with zoom-out features

MOSTAJABI, YADOLLAHPOUR AND SHAKHNAROVICH

TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO

2. Main Ideas

Casting semantic segmentation as classifying a set of superpixels.

Extracting CNN features from different levels of spatial context around the superpixel at hand.

Using an MLP as the classifier.

Photo credit: Mostajabi et al.

3. Zoom-out feature extraction

Photo credit: Mostajabi et al.

4. Zoom-out feature extraction

Subscene Level Features:

Bounding box of superpixels within radius three of the superpixel at hand

Warp bounding box to 256 x 256 pixels

Activations of the last fully connected layer

Scene Level Features:

Warp image to 256 x 256 pixels

Activations of the last fully connected layer
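
The subscene-level crop can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: "radius three" is read as graph distance in the superpixel adjacency graph, the warp is a plain nearest-neighbour resize, and all function names (`adjacency`, `within_radius`, `subscene_box`, `warp_nearest`) are hypothetical. The CNN that would consume the 256 x 256 crop is not shown.

```python
import numpy as np
from collections import deque

def adjacency(labels):
    """Map each superpixel id to the set of ids it touches (4-connectivity)."""
    adj = {int(l): set() for l in np.unique(labels)}
    # Compare every pixel with its right neighbour and its bottom neighbour.
    pairs = ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :]))
    for a, b in pairs:
        diff = a != b
        for u, v in zip(a[diff].tolist(), b[diff].tolist()):
            adj[u].add(v)
            adj[v].add(u)
    return adj

def within_radius(adj, start, radius=3):
    """Breadth-first search: ids at graph distance <= radius from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def subscene_box(labels, start, radius=3):
    """Bounding box (top, bottom, left, right), end-exclusive."""
    ids = list(within_radius(adjacency(labels), start, radius))
    mask = np.isin(labels, ids)
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

def warp_nearest(image, size=256):
    """Nearest-neighbour resize to size x size (stand-in for a real warp)."""
    h, w = image.shape[:2]
    r = np.arange(size) * h // size
    c = np.arange(size) * w // size
    return image[r][:, c]
```

Given a superpixel label map `labels` and an image, `warp_nearest(image[top:bottom, left:right])` with the box from `subscene_box` would yield the subscene input, and `warp_nearest(image)` the scene-level input.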

5. Training

Extracting features from the image and from its mirror image, then taking the element-wise max over the resulting two feature vectors.

12416-dimensional representation for each superpixel.
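
The mirror-image step can be illustrated with a toy example. The extractor `half_means` below is a hypothetical stand-in that returns two numbers; in the actual pipeline the extractor would return the 12416-dimensional zoom-out vector, but the combination rule is the same element-wise max.

```python
import numpy as np

def half_means(image):
    """Toy feature extractor (hypothetical): mean of left and right halves."""
    w = image.shape[1]
    return np.array([image[:, : w // 2].mean(), image[:, w // 2 :].mean()])

def mirror_max_features(image, extract):
    """Element-wise max of features from an image and its horizontal mirror."""
    f = extract(image)
    f_mirror = extract(image[:, ::-1])  # flip left-right
    return np.maximum(f, f_mirror)

image = np.array([[0.0, 0.0, 1.0, 1.0]])
features = mirror_max_features(image, half_means)
```

By construction the combined vector is the same whether the original or the mirrored image is passed in.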

Training 2 classifiers:

Linear classifier (Softmax)

MLP: Hidden layer (1024 neurons) + ReLU + Hidden layer (1024 neurons) with dropout
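
A forward pass through an MLP of this shape might look as follows. This is a sketch, not the authors' implementation: the layer sizes 12416 and 1024 come from the slide, while the 21-way softmax output layer, the weight initialization, the dropout rate of 0.5, and the function names are assumptions made for illustration.

```python
import numpy as np

def init_mlp(rng, d_in=12416, d_hidden=1024, n_classes=21):
    """Random weights for two hidden layers plus an output layer (assumed)."""
    def layer(m, n):
        return rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n)
    return [layer(d_in, d_hidden), layer(d_hidden, d_hidden), layer(d_hidden, n_classes)]

def mlp_forward(params, x, rng=None, p_drop=0.5):
    """Class probabilities for a batch x; dropout is applied only if rng is given."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = np.maximum(x @ w1 + b1, 0.0)   # hidden layer 1 + ReLU
    h2 = h1 @ w2 + b2                   # hidden layer 2
    if rng is not None:                 # inverted dropout at training time
        keep = rng.random(h2.shape) >= p_drop
        h2 = h2 * keep / (1.0 - p_drop)
    logits = h2 @ w3 + b3
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)
```

Calling `mlp_forward(params, x)` without `rng` gives the test-time prediction; passing a generator enables dropout for training.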

6. Loss Function

Imbalanced dataset: weighted loss function

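Loss function: Let $f_c$ be the frequency of class $c$ in the training data, with $\sum_c f_c = 1$. The log-loss of each superpixel is scaled by the inverse frequency of its class:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \frac{1}{f_{y_i}} \log \hat{P}\left(y_i \mid \phi(s_i)\right)$$

where $s_i$ is the $i$-th training superpixel, $y_i$ its label, and $\phi(s_i)$ its zoom-out feature vector.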
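
A frequency-weighted loss for an imbalanced dataset can be sketched as below. The specific weighting used here, dividing each sample's log-loss by the training frequency of its class, is an assumption about what the slide intends; the function names are hypothetical.

```python
import numpy as np

def class_frequencies(labels, n_classes):
    """Fraction of training samples belonging to each class."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts / counts.sum()

def weighted_log_loss(probs, labels, freqs):
    """Mean over samples of -log p(y_i) / f_{y_i}."""
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(p_true) / freqs[labels]))

labels = np.array([0, 0, 0, 1])
freqs = class_frequencies(labels, n_classes=2)   # [0.75, 0.25]
probs = np.full((4, 2), 0.5)                     # uninformative predictions
loss = weighted_log_loss(probs, labels, freqs)
```

With this weighting, a mistake on the rare class (frequency 0.25) costs three times as much as the same mistake on the frequent class (frequency 0.75).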

7. Effect of Zoom-out Levels

Photo and Table credit: Mostajabi et al.

Figure panels: Image, Ground Truth, G1:3, G1:5, G1:5+S1, G1:5+S1+S2

8. Quantitative Results

Softmax Results on VOC 2012

Table credit: Mostajabi et al.

9. Quantitative Results

MLP Results

Table credit: Mostajabi et al.

10. Qualitative Results

Photo credit: Mostajabi et al.

11. Learning Deconvolution Network for Semantic Segmentation

NOH, HONG AND HAN

POSTECH, KOREA

12. Motivations

Photo credit: Noh et al.

Figure panels: Image, Ground Truth, FCN Prediction

13. Motivations

Photo credit: Noh et al.

14. Deconvolution Network Architecture

Photo credit: Noh et al.

15. Unpooling

Photo credit: Noh et al.

16. Deconvolution

Photo credit: Noh et al.

17. Unpooling and Deconvolution Effects

Photo credit: Noh et al.

18. Pipeline

Generating 2K object proposals using Edge-Box and selecting top 50 based on their objectness scores.

Aggregating the segmentation maps generated for each proposal using the pixel-wise maximum or average.

Constructing the class conditional probability map using Softmax

Applying a fully-connected CRF to the probability map.

Ensemble with FCN: computing the mean of the probability maps generated by DeconvNet and FCN, then applying the CRF.
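
The aggregation, softmax, and ensemble steps of this pipeline can be sketched in NumPy. The sketch makes several assumptions not stated on the slide: each proposal's score map is already at the resolution of its box, pixels outside a proposal's box contribute a score of zero, and boxes are given by their top-left corner. Proposal generation and the CRF are not shown, and all function names are hypothetical.

```python
import numpy as np

def aggregate(image_shape, corners, score_maps, mode="max"):
    """Combine per-proposal score maps of shape (C, h, w) in image space."""
    n_classes = score_maps[0].shape[0]
    height, width = image_shape
    # Zero-pad every proposal map to the full image (assumed convention).
    padded = np.zeros((len(score_maps), n_classes, height, width))
    for i, ((top, left), s) in enumerate(zip(corners, score_maps)):
        padded[i, :, top:top + s.shape[1], left:left + s.shape[2]] = s
    return padded.max(axis=0) if mode == "max" else padded.mean(axis=0)

def softmax(scores):
    """Class-conditional probability map from scores of shape (C, H, W)."""
    z = scores - scores.max(axis=0, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

def ensemble(p_deconv, p_fcn):
    """Mean of two probability maps, to be passed on to the CRF."""
    return 0.5 * (p_deconv + p_fcn)

# Two overlapping 2x2 proposals on a 2x3 image with two classes.
s1 = np.stack([np.full((2, 2), 2.0), np.zeros((2, 2))])
s2 = np.stack([np.zeros((2, 2)), np.full((2, 2), 3.0)])
agg = aggregate((2, 3), [(0, 0), (0, 1)], [s1, s2], mode="max")
prob = softmax(agg)
```

In the overlapping column both proposals vote, and the pixel-wise maximum keeps the strongest score per class before the softmax normalizes across classes.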

Photo credit: Noh et al.

19. Training Deep Network

Adding a batch normalization layer to the output of every convolutional and deconvolutional layer.

Two-stage Training: train on easy examples first, then fine-tune with more challenging ones.

Constructing easy examples: Crop object instances using ground-truth annotations

Limiting the variations in object location and size reduces the search space for semantic segmentation substantially

20. Effect of Number of Proposals

Photo credit: Noh et al.

21. Quantitative Results

Table credit: Noh et al.

22. Qualitative Results

Photo credit: Noh et al.

23. Qualitative Results

Examples where FCN produces better results than DeconvNet.

Photo credit: Noh et al.

24. Qualitative Results

Examples where inaccurate predictions from our method and FCN are improved by the ensemble.

Photo credit: Noh et al.