Instance Segmentation Riley Simmons-Edler, Berthy Feng

Instance Segmentation...Instance Segmentation Task Label each foreground pixel with object and instance Object detection + semantic segmentation Slide Credit: Kaiming He Microsoft

Apr 26, 2020


dariahiddleston
Page 1

Instance Segmentation

Riley Simmons-Edler, Berthy Feng

Page 2

Instance Segmentation Task

● Label each foreground pixel with object and instance

● Object detection + semantic segmentation
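
As a small illustration of the task definition above, the sketch below pairs each foreground pixel with both a class label and an instance id. It is not from the lecture: the toy arrays, the class ids, and the helper names `instance_labels` and `count_instances` are all assumptions chosen for this example.

```python
# Toy 3x6 scene with two objects of the same class.
# Assumptions (not from the lecture): class id 0 is background and
# class id 1 is a hypothetical "cat" class; instance ids start at 1.
semantic = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
instance = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def instance_labels(semantic, instance):
    """Pair every foreground pixel with (class id, instance id).

    Background pixels (class id 0) are left unlabeled as None.
    """
    return [
        [(c, i) if c != 0 else None for c, i in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


def count_instances(labels):
    """Count the distinct (class, instance) pairs among foreground pixels."""
    return len({lab for row in labels for lab in row if lab is not None})


labels = instance_labels(semantic, instance)
# Two pixels of the same class that belong to different instances.
print(labels[0][1], labels[0][4])
print(count_instances(labels))
```

Semantic segmentation alone would give both objects the same label; the instance id is what separates them.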

Slide Credit: Kaiming He

Page 3

In This Lecture...

● Microsoft COCO dataset

● Mask R-CNN (fully supervised)

● Mask^X R-CNN (partially supervised)

Page 4

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, Michael Maire, Serge Belongie, et al. “Microsoft COCO: Common Objects in Context.” arXiv, 2015.

Page 5

Previous Datasets

● ImageNet: many object categories

● PASCAL VOC: object detection in natural images, small number of classes

● SUN: labeling scene types and commonly occurring objects, but not many instances per category

Image Credit: Tsung-Yi Lin et al.

Page 6

Goal: Push research in scene understanding

1. Detecting non-iconic views

2. Contextual reasoning between objects

3. Precise 2D localization of objects

Page 7

MS COCO Dataset

❖ 91 object classes

❖ 328,000 images

❖ 2.5 million labeled instances

Image Credit: Tsung-Yi Lin et al.

Page 8

Image Collection & Annotation

Page 9

Object Categories

Image Credit: Tsung-Yi Lin et al.

Page 10

Non-Iconic Image Collection

Image Credit: Tsung-Yi Lin et al.

Page 11

Annotation

Image Credit: Tsung-Yi Lin et al.

Page 12

Dataset Evaluation

Page 13

Statistics

Image Credit: Tsung-Yi Lin et al.

Page 14

Statistics

Image Credit: Tsung-Yi Lin et al.

Page 15

COCO Detection Challenge

Image Credit: Tsung-Yi Lin et al.

Page 16

COCO Keypoint Challenge

Image Credit: Tsung-Yi Lin et al.

Page 17

COCO Stuff Challenge

Image Credit: Tsung-Yi Lin et al.

Page 18

COCO Places Challenges

Image Credit: Tsung-Yi Lin et al.

Page 19

Mask R-CNN

Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. “Mask R-CNN.” ICCV, 2017.

Page 20

Faster R-CNN

Page 21

Fast R-CNN

Image Credit: Shaoqing Ren et al.

Image Credit: Tomasz Grel

Page 22

Insight: Region Proposal and Detection Use Same Features

Image Credit: Shaoqing Ren et al.

Page 23

Faster R-CNN = RPN + Fast R-CNN

RPN = Fully Convolutional Network

Page 24

Extending to Instance Segmentation

Page 25

Visual Perception Problems

Slide Credit: Kaiming He

Page 26

Instance Segmentation Methods

Slide Credit: Kaiming He

Page 27

Insight: Mask Prediction in Parallel

Slide Credit: Kaiming He

Page 28

RoIPool

Image Credit: Tomasz Grel

Page 29

RoIPool

Slide Credit: Kaiming He

Page 30

RoIAlign

Slide Credit: Kaiming He

Page 31

Mask R-CNN

Page 32

Mask R-CNN Results

Page 33

Examples

Image Credit: Kaiming He et al.

● Mask AP = 35.7
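
Mask AP is built on the overlap between predicted and ground-truth masks. The sketch below shows only that overlap measure (intersection over union) and the COCO-style range of IoU thresholds from 0.50 to 0.95. The helper name `mask_iou` and the toy masks are assumptions for illustration, and the full AP computation (ranking detections by score and integrating a precision-recall curve) is deliberately left out.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as lists of 0/1 rows."""
    pairs = [
        (p, g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    ]
    intersection = sum(1 for p, g in pairs if p and g)
    union = sum(1 for p, g in pairs if p or g)
    # Two empty masks have no defined overlap; treat that case as 0.
    return intersection / union if union else 0.0


# Hypothetical predicted and ground-truth masks for one object.
pred = [
    [1, 1, 0],
    [1, 1, 0],
]
gt = [
    [0, 1, 1],
    [0, 1, 1],
]

iou = mask_iou(pred, gt)

# COCO-style thresholds: 0.50, 0.55, ..., 0.95.
thresholds = [0.5 + 0.05 * k for k in range(10)]
# A prediction counts as a match at threshold t only if iou >= t.
matches = [iou >= t for t in thresholds]
print(iou, matches)
```

Because the metric averages over increasingly strict thresholds, it rewards masks that follow object boundaries closely rather than masks that merely overlap the object.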

Page 34

Comparisons

Image Credit: Kaiming He et al.

Page 35

Comparisons

Image Credit: Kaiming He et al.

Page 36

Application: Human Pose Estimation

Image Credit: Kaiming He et al.

Page 37

Mask R-CNN Recap

● Add a parallel mask prediction head to Faster R-CNN

● RoIAlign allows for precise localization

● Mask R-CNN improves on the AP of the previous state of the art and can be applied to human pose estimation
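
To make the RoIAlign point concrete, the sketch below contrasts a quantized lookup (in the spirit of RoIPool, which snaps coordinates to the feature grid) with bilinear interpolation at the exact real-valued location (the idea behind RoIAlign). It is a minimal sketch of the sampling step only: the function names and the toy feature map are assumptions, and a full RoIAlign layer would also split each RoI into bins, sample several points per bin, and pool them.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map (list of rows) at real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    # Top-left neighbour, clamped to the map.
    y0 = min(max(int(math.floor(y)), 0), h - 1)
    x0 = min(max(int(math.floor(x)), 0), w - 1)
    # Bottom-right neighbour, clamped to the map.
    y1 = min(y0 + 1, h - 1)
    x1 = min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def quantized_sample(feature, y, x):
    """Quantized lookup: floor the coordinates, discarding the sub-cell offset."""
    return feature[int(math.floor(y))][int(math.floor(x))]


# Toy feature map whose value is 3 * row + column.
feature = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]

aligned = bilinear_sample(feature, 0.5, 1.5)
snapped = quantized_sample(feature, 0.5, 1.5)
print(aligned, snapped)
```

On this linear toy map the interpolated value matches the underlying function at (0.5, 1.5), while the quantized lookup returns the value of the cell at (0, 1). That half-cell shift is the kind of misalignment that matters for pixel-level masks.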

Page 38

Learning to Segment Every Thing

Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick. “Learning to Segment Every Thing.” arXiv, 2017.

Page 39

Partially Supervised Model

Page 40

Motivation for a Partially Supervised Model

Image Credit: Ronghang Hu et al.

A = set of object categories with complete mask annotations

B = set of object categories with only bounding boxes (no segmentation annotations)

How can we learn to segment every category in C = A ∪ B?

Page 41

Transfer Learning

Image Credit: Ronghang Hu et al.

Page 42

Weight Transfer Function

Image Credit: Ronghang Hu et al.

Page 43

Training

● Train bounding box head using standard box detection losses on all classes in A ∪ B

● Train mask head and weight transfer function using mask loss on classes in A
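
The training rule above can be sketched in a few lines: the box loss is applied to every category, the mask loss only to categories in A, and mask weights are predicted from detection weights by a transfer function. Everything concrete here is an assumption for illustration: the linear form of `transfer`, the parameter matrix `theta`, the category names, and the loss values are hypothetical and not taken from the paper.

```python
def transfer(w_det, theta):
    """Hypothetical linear weight transfer: w_seg = theta @ w_det.

    w_det is a category's detection weight vector, and theta is a
    class-agnostic parameter matrix shared by all categories.
    """
    return [sum(t * w for t, w in zip(row, w_det)) for row in theta]


def total_loss(samples, set_a):
    """Box loss for every sample; mask loss only for categories in A."""
    loss = 0.0
    for sample in samples:
        loss += sample["box_loss"]
        if sample["category"] in set_a:
            loss += sample["mask_loss"]
    return loss


# Hypothetical split: "person" has mask annotations (set A), "kite" has boxes only (set B).
set_a = {"person"}
samples = [
    {"category": "person", "box_loss": 0.5, "mask_loss": 0.25},
    {"category": "kite", "box_loss": 0.75, "mask_loss": None},
]

theta = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 2.0],
]
w_seg = transfer([1.0, 2.0], theta)
loss = total_loss(samples, set_a)
print(w_seg, loss)
```

Because `theta` does not depend on the category, a transfer function fitted on A can be applied to detection weights of categories in B, which never contribute a mask loss.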

Image Credit: Ronghang Hu et al.

Page 44

Stage-Wise Training

1. Detection training

2. Segmentation training

● Train detection once and then fine-tune weight transfer function

● Inferior performance compared with end-to-end joint training

Image Credit: Ronghang Hu et al.

Page 45

End-to-End Joint Training

● Jointly train detection head and mask head end-to-end

● Want detection weights to stay constant between A and B
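
One way to keep the mask loss from reshaping the detection weights during joint training is to block its gradient with respect to those weights, so that the mask loss only updates the transfer function's parameters. The sketch below illustrates that idea with hand-written gradients on a scalar toy problem. The scalar transfer `w_seg = theta * w_det`, the squared-error loss, and the function name are assumptions for illustration, not the lecture's formulation.

```python
def mask_loss_grads(w_det, theta, target, stop_gradient=True):
    """Gradients of the toy mask loss (theta * w_det - target) ** 2.

    Returns (grad wrt theta, grad wrt w_det). With stop_gradient=True the
    mask loss contributes nothing to the detection weight's gradient.
    """
    w_seg = theta * w_det
    residual = w_seg - target
    grad_theta = 2.0 * residual * w_det
    grad_w_det = 0.0 if stop_gradient else 2.0 * residual * theta
    return grad_theta, grad_w_det


blocked = mask_loss_grads(w_det=2.0, theta=0.5, target=3.0)
unblocked = mask_loss_grads(w_det=2.0, theta=0.5, target=3.0, stop_gradient=False)
print(blocked, unblocked)
```

With the gradient blocked, the detection weights are trained by the box loss alone for categories in both A and B, while the transfer parameter still receives the full mask-loss gradient.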

Image Credit: Ronghang Hu et al.

Page 46

End-to-End Training Better

Image Credit: Ronghang Hu et al.

Page 47

Mask Prediction

Baseline: Class-agnostic FCN mask prediction

Extension: FCN+MLP mask heads

Image Credit: Ronghang Hu et al.

Page 48

Results

Page 49

Examples

Image Credit: Ronghang Hu et al.

Page 50

Comparisons

Image Credit: Ronghang Hu et al.

Page 51

Segmenting Everything

Image Credit: Ronghang Hu et al.