Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
Deepak Pathak, Philipp Krähenbühl and Trevor Darrell

Constrained Convolutional Neural Networks for …vgg/rg/slides/ccnn1.pdf · Constrained Convolutional Neural Networks for Weakly Supervised Segmentation ... [CCNN] Convolutional Neural

Sep 26, 2018


lamanh
Transcript

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation

Deepak Pathak, Philipp Krähenbühl and Trevor Darrell


Multi-class Image Segmentation

• Assign a class label to each pixel in the image

[Figure: example image with regions labelled background, chair and table]

Multi-class Image Segmentation

• Pixel-level classification

• Train a classifier Q_i(l) for each pixel i and label l

• Convolutional neural network (CNN), trained end-to-end

[Figure: example image with regions labelled background, chair and table]

Fully Convolutional Networks for Semantic Segmentation [Long et al. CVPR 2015]
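The per-pixel classifier Q_i(l) above can be illustrated with a minimal sketch. It assumes the network ends in a softmax over labels at every pixel; the function name `pixel_label_distribution` and the random toy scores are illustrative and not taken from the slides.

```python
import numpy as np

def pixel_label_distribution(scores):
    """Softmax over the label axis: scores (H, W, L) -> Q (H, W, L)."""
    # Subtract the per-pixel maximum for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 5, 3))   # toy 4x5 "image" with 3 labels
Q = pixel_label_distribution(scores)  # Q[i, j, l] plays the role of Q_i(l)
labels = Q.argmax(axis=-1)            # hard segmentation: one label per pixel
```

Each pixel gets its own distribution over labels, and the segmentation is read off by taking the most probable label at every pixel.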


How does prior work train a CNN?

• back propagation

• stochastic gradient descent (SGD)

• large labeled dataset

Limitation : Training Supervision

• Need full supervision

• Time consuming to obtain

• “79s per label per image” [Russakovsky et al., arXiv 2015]

• Expensive to obtain

• Bottleneck for learning models at large scale


Weak Training Supervision

• Weak supervision

• Class labels or tags

• Cheap to obtain

• “1s per label per image” [Russakovsky et al., arXiv 2015]

• Scalable to large number of categories

[Figure: example image tagged with person, horse and background]

Training a CNN using weak supervision - Prior work

• Multiple instance learning

• Tag present: at least one pixel takes that label

• Tag absent: no pixel takes that label

• Shown promise for weak detection
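The two rules above can be written as a loss. This is a minimal sketch, assuming a max-pooling form of multiple instance learning; the function name `mil_loss` and the exact penalty are illustrative and are not claimed to match any specific prior method.

```python
import numpy as np

def mil_loss(Q, tags, eps=1e-12):
    """Q: (n_pixels, n_labels) per-pixel label probabilities.
    tags: set of label indices that are tagged as present in the image."""
    loss = 0.0
    for l in range(Q.shape[1]):
        best = Q[:, l].max()  # only the most confident pixel contributes
        if l in tags:
            # Tag present: at least one pixel should take the label.
            loss -= np.log(best + eps)
        else:
            # Tag absent: no pixel should take the label.
            loss -= np.log(1.0 - best + eps)
    return loss

Q = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.8, 0.0]])
consistent = mil_loss(Q, tags={0, 1})    # tags agree with the prediction
inconsistent = mil_loss(Q, tags={0, 2})  # label 2 tagged but never predicted
```

Because only the maximum over pixels enters the loss, a single pixel per label receives a training signal in this formulation.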

[Figure: example image tagged with person, car and background]

Multiple instance learning - Issues

• Very weak signal: one pixel per class per image

• Converges to bad local minima

• Requires good initialization!

• Heuristics to get out of local optima

[Figure: example image tagged with person, car and background]

Weakly Supervised Training

• Is there a better description of the desired solution?

[Figure: example image tagged with person, car and background; regions labelled person and car]

Idea : Weakly Supervised Training with constraints

[Figure: diagram labelled Constraint Hyperplanes, Space of Segmentation Masks, Space of Good Segmentation Label Masks]

Description Constraints

Suppression Constraint:

• suppress labels that do not appear in the image.

[Figure: example image tagged with person, car and background; other labels shown: Cat, Dog, Horse, Others]

Description Constraints

Foreground Constraint:

• label at least some pixels for each object present

[Figure: example image tagged with person, car and background; regions labelled Person and Car]

Description Constraints

Background Constraint:

• The number of background pixels in an image should be bounded, say between 10% and 75% of all pixels
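The suppression, foreground and background constraints can all be written as linear inequalities on the expected number of pixels per label. The sketch below assumes label 0 is background and stacks the constraints as `A @ counts >= b`; the helper names and the 5% foreground minimum (`fg_min`) are assumptions for illustration, while the 10% and 75% background bounds come from the slide.

```python
import numpy as np

def build_constraints(n_pixels, n_labels, tags,
                      fg_min=0.05, bg_min=0.10, bg_max=0.75):
    """Return (A, b) so that the constraints read A @ counts >= b,
    where counts[l] = sum_i P_i(l) and label 0 is background."""
    rows, rhs = [], []
    for l in range(1, n_labels):
        e = np.zeros(n_labels)
        e[l] = 1.0
        if l in tags:
            rows.append(e)           # foreground: count_l >= fg_min * n
            rhs.append(fg_min * n_pixels)
        else:
            rows.append(-e)          # suppression: count_l <= 0
            rhs.append(0.0)
    bg = np.zeros(n_labels)
    bg[0] = 1.0
    rows.append(bg)                  # background: count_0 >= bg_min * n
    rhs.append(bg_min * n_pixels)
    rows.append(-bg)                 # background: count_0 <= bg_max * n
    rhs.append(-bg_max * n_pixels)
    return np.array(rows), np.array(rhs)

def satisfied(P, A, b, tol=1e-9):
    """Check the constraints for per-pixel distributions P of shape (n, L)."""
    counts = P.sum(axis=0)
    return bool(np.all(A @ counts >= b - tol))

n_pixels, n_labels = 100, 4                  # 0 background, 1 person, 2 car, 3 horse
A, b = build_constraints(n_pixels, n_labels, tags={1, 2})
hard = np.repeat([0, 1, 2], [50, 30, 20])    # 50 background, 30 person, 20 car
P = np.eye(n_labels)[hard]                   # one-hot distributions, shape (100, 4)
all_background = np.eye(n_labels)[np.zeros(n_pixels, dtype=int)]
```

Upper bounds are expressed by negating both sides, so every constraint fits the same "greater than or equal" form.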

[Figure: example image tagged with person, car and background; region labelled Background]

Constrained Convolutional Neural Network [CCNN]

Convolutional Neural Network + Constraints

[Figure: diagram labelled Constraint Hyperplanes, Space of Segmentation Masks, Space of Good Segmentation Label Masks, CNN]

How to constrain CNN output?

• Constraints on CNN distribution Q_I

• Expensive and non-convex

[Figure: CNN with parameters θ and output distribution Q_I]

CCNN : Output as latent distribution

• Introduce latent variable P_I for distribution of network output

• Apply constraints on the latent distribution

• Minimize the distance between P_I and Q_I

[Figure: CNN with parameters θ, output distribution Q_I and latent distribution P_I]

CCNN Optimization

• KL-Divergence minimization between latent distribution and network output distribution

\[
\underset{\theta,\,P}{\text{minimize}} \quad D(P_I \,\|\, Q_I)
\qquad \text{subject to} \quad A_I \vec{P}_I \ge \vec{b}, \quad \vec{1}^{\top} \vec{P}_I = 1
\]

[Figure: CNN with parameters θ, output distribution Q_I and latent distribution P_I]

CCNN Optimization

• Solves same optimization problem

• Convex in P

• Standard convnet loss for Q

• log-likelihood / cross entropy

• Convex for log-linear model

• logistic regression
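Why the two blocks are easy to handle separately can be seen by expanding the KL divergence. The expansion below is a sketch that assumes both distributions factorize over pixels i and labels l, written P_i(l) and Q_i(l | θ); that factorized notation is an assumption carried over from the per-pixel classifier Q_i(l) earlier in the slides.

```latex
\begin{aligned}
D(P_I \,\|\, Q_I)
  &= \sum_i \sum_l P_i(l) \log \frac{P_i(l)}{Q_i(l \mid \theta)} \\
  &= \underbrace{\sum_i \sum_l P_i(l) \log P_i(l)}_{-H(P_I),\ \text{independent of } \theta}
   \;\underbrace{-\sum_i \sum_l P_i(l) \log Q_i(l \mid \theta)}_{\text{cross entropy with soft targets } P_I}
\end{aligned}
```

With θ fixed, the objective is convex in P and the constraints are linear. With P fixed, only the cross-entropy term depends on θ, which is the standard convnet loss with P as soft targets.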

\[
\underset{\theta,\,P}{\text{minimize}} \quad D(P_I \,\|\, Q_I)
\qquad \text{subject to} \quad A_I \vec{P}_I \ge \vec{b}, \quad \vec{1}^{\top} \vec{P}_I = 1
\]

CCNN Optimization : Alternating Minimization

[Figure: iterates Q(0), P(0), Q(1), P(1) relative to the constrained region, with SGD steps]

• Optimization using block coordinate descent :

• Solve for P while convnet parameters θ fixed

• Gradient step in θ while P fixed

• Each step guaranteed to decrease the overall objective
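The alternation above can be sketched on a toy log-linear model. Everything here is an illustrative assumption rather than the authors' implementation: the P-step is solved approximately by projected gradient ascent on the Lagrange multipliers of the count constraints, the θ-step is one plain gradient step on the cross entropy, and the names `latent_step` and `theta_step`, the step sizes and the random features are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def latent_step(Q, A, b, n_iters=200):
    """P-step: approximately minimize KL(P || Q) subject to
    A @ P.sum(axis=0) >= b, with Q (and hence theta) held fixed."""
    n = Q.shape[0]
    log_q = np.log(Q + 1e-12)
    lam = np.zeros(len(b))                   # one multiplier per constraint
    for _ in range(n_iters):
        P = softmax(log_q + lam @ A)         # P_i(l) proportional to Q_i(l) exp((A^T lam)_l)
        violation = b - A @ P.sum(axis=0)    # positive where a constraint is violated
        lam = np.maximum(0.0, lam + violation / n)
    return softmax(log_q + lam @ A)

def theta_step(W, X, P, lr=0.1):
    """theta-step: one gradient step on the cross entropy between the
    fixed latent P and the model output Q = softmax(X @ W)."""
    Q = softmax(X @ W)
    return W - lr * X.T @ (Q - P) / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))      # 50 toy "pixels" with 4 features each
W = np.zeros((4, 3))              # log-linear model with 3 labels
A = np.array([[0.0, 1.0, 0.0]])   # expected count of label 1 ...
b = np.array([20.0])              # ... should be at least 20 of the 50 pixels
for _ in range(100):
    P = latent_step(softmax(X @ W), A, b)   # solve for P, theta fixed
    W = theta_step(W, X, P)                 # gradient step in theta, P fixed
```

Because the P-step here is only solved approximately and the θ-step uses a fixed learning rate, this sketch illustrates the structure of the alternation and does not by itself demonstrate the monotone decrease stated on the slide.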

\[
\underset{\theta,\,P}{\text{minimize}} \quad D(P_I \,\|\, Q_I)
\qquad \text{subject to} \quad A_I \vec{P}_I \ge \vec{b}, \quad \vec{1}^{\top} \vec{P}_I = 1
\]

Summary : Constrained CNN

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation

Deepak Pathak, Philipp Krähenbühl, Trevor Darrell
University of California, Berkeley

{pathak,philkr,trevor}@cs.berkeley.edu

Abstract

We present an approach to learn a dense pixel-wise labeling from image-level tags. Each image-level tag imposes constraints on the output labeling of a Convolutional Neural Network classifier. We propose a novel loss function to optimize for any set of linear constraints on the output space (i.e. predicted label distribution) of a Convolutional Neural Network. Our loss formulation is easy to optimize and can be incorporated directly into standard stochastic gradient descent optimization. The key idea is to phrase the training objective as a biconvex optimization for linear models, which we then relax to nonlinear deep networks. Extensive experiments demonstrate the generality of our new learning framework. The constrained loss yields state-of-the-art results on weakly supervised semantic image segmentation. We further demonstrate that adding slightly more supervision can greatly improve the performance of the learning algorithm.

1. Introduction

In recent years, standard computer vision tasks, such as recognition or classification, have made tremendous progress. This is primarily due to the widespread adoption of Convolutional Neural Networks (CNNs) [11, 19, 20]. Existing models excel by their capacity to take advantage of massive amounts of fully supervised training data [28]. This reliance on full supervision is a major limitation on scalability with respect to the number of classes or tasks. For structured prediction problems, such as semantic segmentation, fully supervised, i.e. pixel-level, labels are both expensive and time consuming to obtain. Summarization of the semantic labels in terms of weak supervision, e.g. image-level tags or bounding box annotations, is often less costly. Leveraging the full potential of these weak annotations is challenging, and existing approaches are susceptible to diverging into bad local optima from which recovery is difficult [6, 16, 25].

[Figure 1 labels: Input Image, Weak Labels (Person, Car), CNN, Network Output, Latent Distribution, Constrained Region, Predicted labeling (Person, Car)]

Figure 1: We train convolutional neural networks from a set of linear constraints on the output variables. The network output is encouraged to follow a latent probability distribution, which lies in the constraint manifold. The resulting loss is easy to optimize and can incorporate arbitrary linear constraints.

In this paper, we present a framework to incorporate weak supervision into the learning procedure through a series of linear constraints. In general, it is easier to express simple constraints on the output space than to craft regularizers or ad hoc training procedures to guide the learning. In semantic segmentation, such constraints can describe the existence and expected distribution of labels from image-level tags. For example, given a car is present in an image, a certain number of pixels should be labeled as car.

We propose a novel loss function to optimize CNNs with arbitrary linear constraints on the structured output space of pixel labels. The non-convex nature of deep nets makes a direct optimization of the constraints difficult. Our key insight is to model a distribution over latent “ground truth” labels, while the output of the deep net follows this latent distribution as closely as possible. This allows us to enforce the


Evaluation

• VOC 2012 dataset

• Trained using 10,582 tagged images

• Training time: 8hrs

• Constraint satisfaction: 30ms per image on CPU

• Forward - Backward: 400ms per image on GPU

• Evaluated on Intersection over Union score
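The Intersection over Union score can be sketched as follows. This is a minimal per-image version for illustration; the function name `mean_iou` is hypothetical, and benchmark implementations typically accumulate the counts over the whole test set rather than averaging per image.

```python
import numpy as np

def mean_iou(pred, gt, n_labels):
    """Mean over labels of |pred == l AND gt == l| / |pred == l OR gt == l|."""
    ious = []
    for l in range(n_labels):
        inter = np.logical_and(pred == l, gt == l).sum()
        union = np.logical_or(pred == l, gt == l).sum()
        if union > 0:                 # skip labels absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

gt = np.array([[0, 0, 1],
               [0, 1, 1]])
pred = np.array([[0, 1, 1],
                 [0, 1, 1]])
score = mean_iou(pred, gt, n_labels=2)
```

In this toy case label 0 has intersection 2 and union 3, label 1 has intersection 3 and union 4, so the score is the mean of 2/3 and 3/4.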


Results : State of the Art

• State of the art weakly supervised semantic segmentation


Additional 1-bit Supervision

1-bit additional information:

• object size is ‘small’ (<10%) or ‘large’ (>10%)

• Size Constraints

• Boost large objects

• Limit small objects
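The 1-bit size information can be turned into one extra linear constraint per tagged object. This is a minimal sketch under the same `a @ counts >= b` convention, where counts[l] is the expected number of pixels with label l; the function name `size_constraint` is hypothetical, and the 10% threshold is the one quoted on the slide.

```python
import numpy as np

def size_constraint(label, is_large, n_pixels, n_labels, threshold=0.10):
    """Return (a, b) such that the constraint reads a @ counts >= b."""
    a = np.zeros(n_labels)
    if is_large:
        a[label] = 1.0
        return a, threshold * n_pixels    # boost: at least 10% of the pixels
    a[label] = -1.0
    return a, -threshold * n_pixels       # limit: at most 10% of the pixels

counts = np.array([65.0, 30.0, 5.0])      # toy counts: background, label 1, label 2
a_large, b_large = size_constraint(1, True, n_pixels=100, n_labels=3)
a_small, b_small = size_constraint(2, False, n_pixels=100, n_labels=3)
large_ok = bool(a_large @ counts >= b_large)
small_ok = bool(a_small @ counts >= b_small)
```

With these toy counts, label 1 covers 30 of 100 pixels and meets the "large" bound, and label 2 covers 5 of 100 pixels and meets the "small" bound.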

[Figure: example image tagged with person, car and background; regions labelled Person and Car, marked small and large]

Results : Comparison with Fully Supervised

• 10% improvement using 1-bit additional supervision at training time.


[Figure: example images labelled train, person, bicycle, sheep, dog, sofa, horse, cat]

[Figure: example images labelled person, person, table, bottle, bicycle, plant]

Questions?

• Paper (and code) is available: Constrained Convolutional Neural Networks for Weakly Supervised Segmentation, ICCV 2015, http://arxiv.org/abs/1506.03648