Gang Yu, Megvii Research: Context For Semantic Segmentation

Context For Semantic Segmentation, valser.org/webinar/slide/slides/20190529/2019.05.29 俞刚.pdf, May 29, 2019

Oct 15, 2020

Page 1:

Gang Yu

Megvii Research

Context For Semantic Segmentation

Page 2:

Collaborators

Chao Peng, Jingbo Wang, Changqian Yu, Changxin Gao, Xiangyu Zhang, Gang Yu, Jian Sun, Nong Sang

Page 3:

Outline

• Revisit Semantic Segmentation
• Context for Semantic Segmentation
  • Backbone
  • Head
  • Loss
• Conclusion

Page 5:

What is Semantic Segmentation?

• Classification + Localization
• Visual Recognition
  • Classification
  • Semantic Segmentation
  • Instance Segmentation
  • Panoptic Segmentation
  • Detection
  • Keypoint Detection

Page 6:

Pipeline

• Backbone: VGG16, ResNet, ResNeXt
• Head: U-Shape, 4/8-Sampling + Dilation
• Loss: Softmax, L2

Page 7:

Challenges in Semantic Segmentation?

• Speed
• Performance
  • Per-pixel Accuracy
  • Boundary

Page 8:

What is Context?

• According to the dictionary:
  • the parts of a discourse that surround a word or passage and can throw light on its meaning

[Figure labels: Sports ball, Grass, Play Fields, Person]

Page 9:

Outline

• Revisit Semantic Segmentation
• Context for Semantic Segmentation
  • Backbone
  • Head
  • Loss
• Conclusion

Page 10:

Context in Backbone

• Motivation
  • Traditional Backbone is designed for Classification
    • Large Receptive field by compromising spatial resolution
  • Segmentation requires both Classification & Localization
    • Maintain both Receptive Field (context) & Spatial resolution
    • Computational cost?
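The trade-off between receptive field and spatial resolution can be made concrete with a small calculation. The sketch below is an illustration only: the function name `receptive_field` and the two toy layer stacks are assumptions, not architectures from the talk. It applies the standard recurrence for the receptive field of stacked convolutions and reports the total downsampling factor alongside it.

```python
def receptive_field(layers):
    """Return (receptive_field, total_stride) for a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf, jump


# Hypothetical classification-style stack: five 3x3 convs, each with stride 2.
strided = [(3, 2, 1)] * 5

# Hypothetical segmentation-style stack: stop downsampling after three layers
# and use dilation (rates 2 and 4) in the last two layers instead.
dilated = [(3, 2, 1)] * 3 + [(3, 1, 2), (3, 1, 4)]

print(receptive_field(strided))  # (63, 32)
print(receptive_field(dilated))  # (111, 8)
```

In this toy setting the dilated stack keeps the output at 1/8 resolution with a larger receptive field, but its last layers run on feature maps with 16 times more positions than the 1/32 variant, which is where the computational cost question comes from.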

Page 11:

Context in Backbone - BiSeNet

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, ECCV, 2018

• BiSeNet: Bilateral Segmentation Network

Page 12:

Context in Backbone - BiSeNet

• Pipeline

Page 13:

Context in Backbone - BiSeNet

• Results

Page 14:

Context in Backbone - BiSeNet

• Ablation Results

Page 15:

Context in Backbone - BiSeNet

• Speed

Page 16:

Context in Backbone - BiSeNet

• Summary
  • Two paths in backbone: Spatial path + Context path
  • Context is implicitly encoded in receptive field
  • Efficient speed
  • Code: https://github.com/ycszen/TorchSeg
• Context:
  • A branch encodes semantic meaning with large receptive field?
• Related work:
  • ICNet for Real-Time Semantic Segmentation on High-Resolution Images, Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, ECCV2018
  • Stacked Hourglass Networks for Human Pose Estimation, Alejandro Newell, Kaiyu Yang, Jia Deng, ECCV2016

Page 17:

Context in Head

• Motivation
  • Large Receptive field without compromising boundary results
• Why work on the Head?
  • Efficient speed
  • Obvious gain from increasing the receptive field
  • Simple to implement

Page 18:

Context in Head – Large Kernel

• Receptive Field vs Valid Receptive Field

Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network, Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun, CVPR, 2017

Page 19:

Context in Head – Large Kernel

• Large Kernel Matters

Page 20:

Context in Head – Large Kernel

• Large Kernel Matters
  • Why Boundary Refinement?
    • Large receptive field will blur the object boundary

Page 21:

Context in Head – Large Kernel

• Large Kernel Matters
  • Ablation: Why Boundary Refinement?
    • Large receptive field will blur the object boundary

Page 22:

Context in Head – Large Kernel

• Large Kernel Matters
  • Ablation: Different kernel size?

Page 23:

Context in Head – Large Kernel

• Large Kernel Matters
  • Ablation: Are more parameters helpful?

Page 24:

Context in Head – Large Kernel

• Large Kernel Matters
  • Ablation: GCN vs. Stack of small convolutions

Page 25:

Context in Head – Large Kernel

• Large Kernel Matters
  • Ablation: GCN in Backbone

Page 26:

Context in Head – Large Kernel

• Large Kernel Matters: illustrative examples

Page 27:

Context in Head – Large Kernel

• Summary
  • Global Convolutional Network to increase the receptive field
  • Large separable convolution is an efficient implementation
• Context
  • Large receptive field?
• Related work
  • PSPNet: Pyramid Scene Parsing Network, Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, CVPR2017
  • DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation, Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam
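To see why a large separable convolution is called efficient, it helps to count weights. The sketch below assumes a two-branch layout (a 1×k convolution followed by k×1, plus a k×1 followed by 1×k, with the two outputs summed) and ignores biases. The function names and the channel sizes (2048 input channels, 21 output classes, k = 15) are illustrative assumptions, not numbers taken from the slides.

```python
def dense_conv_params(k, c_in, c_out):
    """Weights in a single dense k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_gcn_params(k, c_in, c_out):
    """Weights in two summed branches of paired 1-D convolutions (no bias).

    Assumed layout: each branch maps c_in -> c_out with one k-tap 1-D conv,
    then c_out -> c_out with the perpendicular k-tap 1-D conv.
    """
    branch = k * c_in * c_out + k * c_out * c_out
    return 2 * branch


# Hypothetical sizes for illustration only.
k, c_in, c_out = 15, 2048, 21
print(dense_conv_params(k, c_in, c_out))     # 9676800
print(separable_gcn_params(k, c_in, c_out))  # 1303470
```

Under these assumptions the separable form covers the same k×k window with roughly 7 times fewer weights, and the gap widens as k grows because the dense cost scales with k² while the separable cost scales with k.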

Page 28:

Context in Head – DFN

• Motivation:
  • Large kernel (GCN) is computationally intensive
    • Global pooling is efficient to compute and can obtain the global context
  • Large receptive field does not equal good context
    • Attention strategy to adaptively aggregate the features

Learning a Discriminative Feature Network for Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, CVPR, 2018
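The two ideas in the motivation, cheap global context from global pooling and attention that reweights features, can be sketched in a few lines. This is a simplified illustration, not the DFN module itself: real attention blocks place learned layers between the pooling step and the sigmoid, which are omitted here, and the function names are hypothetical.

```python
import math


def global_average_pool(feature):
    """feature: list of channels, each an H x W list of lists.

    Returns one scalar per channel summarising the whole spatial extent.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature
    ]


def channel_attention(feature):
    """Scale each channel by a sigmoid gate computed from its global average.

    Assumption: no learned weights; the gate is sigmoid(mean) directly.
    """
    pooled = global_average_pool(feature)
    gates = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature, gates)
    ]


# Toy 2-channel, 2x2 feature map.
feature = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 3.0], [5.0, 7.0]],
]
reweighted = channel_attention(feature)
```

Global pooling costs one pass over the feature map regardless of how large the context is, which is the efficiency argument; the gate then decides per channel how strongly that feature contributes.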

Page 29:

Context in Head – DFN

• DFN: Pipeline

Page 30:

Context in Head – DFN

• DFN: Ablation

Page 31:

Context in Head – DFN

• DFN: Results

Page 32:

Context in Head – DFN

• Summary
  • Global pooling is efficient and effective to capture the long-range context
  • Attention for adaptively adjusting feature weights
  • Code: https://github.com/ycszen/TorchSeg/
• Context
  • Receptive field & feature aggregation?
• Related work
  • Non-local Neural Networks, Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He, CVPR2018
  • CCNet: Criss-Cross Attention for Semantic Segmentation, Zilong Huang, Xinggang Wang, Lichao Huang, Chang Huang, Yunchao Wei, Wenyu Liu
  • PSANet: Point-wise Spatial Attention Network for Scene Parsing, Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia, ECCV2018
  • OCNet: Object Context Network for Scene Parsing, Yuhui Yuan, Jingdong Wang
  • ParseNet: Looking Wider to See Better, Wei Liu, Andrew Rabinovich, Alexander C. Berg

Page 33:

Context in Loss

• Motivation
  • “Thing” may be important for stuff prediction

[Figure labels: Sports ball, Grass, Play Fields, Person]

COCO2018 Panoptic Segmentation Challenge, http://presentations.cocodataset.org/ECCV18/COCO18-Panoptic-Megvii.pdf

Page 35:

Context in Loss

• Pipeline

[Figure: pipeline diagram. Legend: Encoder, Res-Block, Train/Inference, Train Supervision, Inference Merge. Labels: Multi Types Context, Objects, Semantic, Stuff]

Page 36:

Context in Loss

• COCO2018 Panoptic Segmentation Challenge

[Figure: bar chart, "Results of Stuff Regions on COCO2018 Panoptic Segmentation Validation Dataset". Metric: Mean IoU (%). Configurations: Res50, +Encoder, +Extra Res Blocks, +Multi Context, +Huge Backbone, +Multi-Scale Flip Test. Values shown: 49.3, 49.6, 54.1, 54.5, 50.8]

Finally, we ensembled three models and achieved 55.9% mIoU on this dataset.
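For reference, the Mean IoU metric used for these stuff results averages the per-class intersection over union. The sketch below is a minimal version that works on flattened lists of integer class labels; the function name and the toy labels are assumptions, and details of the official evaluation (ignored labels, accumulation over the whole dataset) are not covered.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred, target: equal-length sequences of integer class ids, one per pixel.
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Because every class counts equally, a rare stuff class that is segmented poorly pulls the score down as much as a common one, which is one reason extra context can matter for this metric.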

Page 37:

Context in Loss

• COCO2018 Panoptic Segmentation Challenge

Page 38:

Context in Loss

• Summary
  • “Thing” and “stuff” are complementary
  • Loss is a good approach to encode the context
    • Better feature representation
• Context
  • A loss to encode the semantic meaning?
• Related work
  • Context Encoding for Semantic Segmentation, Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal, CVPR2018
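One common way to let a loss carry context is to add an auxiliary supervision term during training and drop the extra head at inference. The slides do not give a formula, so the sketch below is only an assumed form: a stuff cross-entropy plus a weighted "thing" cross-entropy, with a hypothetical weight `aux_weight` and hypothetical function names.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one pixel, computed in a numerically stable way."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def multi_task_loss(stuff_logits, stuff_label, thing_logits, thing_label, aux_weight=0.5):
    """Assumed training objective: stuff loss plus a weighted auxiliary thing loss.

    aux_weight is a hypothetical hyperparameter, not a value from the talk.
    """
    stuff_loss = cross_entropy(stuff_logits, stuff_label)
    thing_loss = cross_entropy(thing_logits, thing_label)
    return stuff_loss + aux_weight * thing_loss


# Toy example: uniform logits over 2 stuff classes and 4 thing classes.
loss = multi_task_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 1)
```

The auxiliary term only shapes the shared features while training; at test time the stuff prediction alone would be used, so the added context costs nothing at inference.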

Page 39:

Outline

• Revisit Semantic Segmentation
• Context for Semantic Segmentation
  • Backbone
  • Head
  • Loss
• Conclusion

Page 40:

Conclusion

• Context in different parts
  • Backbone, Head, Loss
• What is Context?
  • Large receptive field?
  • A semantic branch?
  • Spatial/feature aggregation?
• Future work
  • Explicitly show what a context is
  • Panoptic seg: Stuff vs Thing

Page 41:

Reference

• Pyramid Scene Parsing Network, Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, CVPR2017

• ICNet for Real-Time Semantic Segmentation on High-Resolution Images, Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, ECCV2018

• Context Encoding for Semantic Segmentation, Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal, CVPR2018

• Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam, ECCV2018

• Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network, Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun, CVPR, 2017

• Learning a Discriminative Feature Network for Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, CVPR, 2018

• BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, ECCV, 2018

Page 42:

Q&A

• Megvii Detection Zhihu column

• Webpage: http://www.skicyyu.org/

• Email: [email protected]
