Face Detection, Pose Estimation, and Landmark Localization in the Wild Xiangxin Zhu Deva Ramanan Dept. of Computer Science, University of California, Irvine

Dec 17, 2015

Kory Reed
Page 1

Face Detection, Pose Estimation, and Landmark Localization in the Wild

Xiangxin Zhu, Deva Ramanan
Dept. of Computer Science, University of California, Irvine

Page 2

Outline

• Introduction
• Model
• Inference
• Learning
• Experimental Results
• Conclusion

Page 3

Introduction

• A unified model for face detection, pose estimation, and landmark estimation.

• Based on a mixture of trees with a shared pool of parts.

• Uses global mixtures to capture topological changes due to viewpoint.

Page 4

Introduction

Page 5

mixture-of-trees model

Page 6

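Tree structured part model

• We write each tree $T_m = (V_m, E_m)$ as a linearly-parameterized, tree-structured pictorial structure, where $m$ indicates a mixture and $V_m \subseteq V$.

• $I$: the image; $l_i = (x_i, y_i)$: the pixel location of part $i$.

• We score a configuration of parts $L = \{l_i : i \in V\}$ as

$$S(I, L, m) = \mathrm{App}_m(I, L) + \mathrm{Shape}_m(L) + \alpha^m \quad (1)$$

• $\alpha^m$: a scalar bias associated with viewpoint mixture $m$.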

Page 7

Tree structured part model

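• $\mathrm{App}_m(I, L) = \sum_{i \in V_m} w_i^m \cdot \phi(I, l_i)$ (2) sums the appearance evidence for placing a template $w_i^m$ for part $i$, tuned for mixture $m$, at location $l_i$. Here $\phi(I, l_i)$ is the feature vector (e.g. a HoG descriptor) extracted at $l_i$.

• $\mathrm{Shape}_m(L) = \sum_{ij \in E_m} a_{ij}^m\,dx^2 + b_{ij}^m\,dx + c_{ij}^m\,dy^2 + d_{ij}^m\,dy$ (3) scores the mixture-specific spatial arrangement of parts $L$, where $dx = x_i - x_j$ and $dy = y_i - y_j$ are the displacements of part $i$ relative to part $j$.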
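
As a rough illustration of how the total score for one mixture combines an appearance sum over parts, a spring-like shape sum over tree edges (terms in dx², dx, dy², dy), and a scalar bias, here is a minimal Python sketch. The function name, the dictionary layout, and the assumption that template responses are precomputed per pixel are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np

def score_configuration(app_resp, locations, edges, springs, alpha):
    """Score one mixture as appearance + shape + bias.

    app_resp:  dict part -> 2-D array of template responses, indexed [y, x]
               (assumed precomputed, e.g. by filtering a HoG feature map).
    locations: dict part -> (x, y) pixel location of that part.
    edges:     list of (i, j) pairs forming the tree of the mixture.
    springs:   dict (i, j) -> (a, b, c, d) spring parameters of that edge.
    alpha:     scalar bias of the mixture.
    """
    # Appearance: sum of template responses at the chosen part locations.
    app = sum(app_resp[i][y, x] for i, (x, y) in locations.items())

    # Shape: quadratic + linear terms in the displacement of part i from part j.
    shape = 0.0
    for i, j in edges:
        a, b, c, d = springs[(i, j)]
        dx = locations[i][0] - locations[j][0]
        dy = locations[i][1] - locations[j][1]
        shape += a * dx**2 + b * dx + c * dy**2 + d * dy

    return float(app + shape + alpha)

# Toy example with two parts joined by a single edge (all numbers made up).
resp = {0: np.zeros((4, 4)), 1: np.zeros((4, 4))}
resp[0][1, 1] = 2.0   # part 0 responds at (x=1, y=1)
resp[1][1, 3] = 3.0   # part 1 responds at (x=3, y=1)
locs = {0: (1, 1), 1: (3, 1)}
springs = {(1, 0): (-1.0, 4.0, -1.0, 0.0)}
total = score_configuration(resp, locs, [(1, 0)], springs, alpha=0.5)
```

In this toy case the spring on edge (1, 0) prefers a horizontal offset of 2 pixels, which is exactly the offset chosen, so the shape term contributes its maximum value.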

Page 8

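Shape model

• Because the locations appear only in linear and quadratic terms, the shape model can be rewritten as

$$\mathrm{Shape}_m(L) = -(L - \mu_m)^T \Lambda_m (L - \mu_m) + \text{constant}$$

• $(\mu, \Lambda)$: re-parameterizations of the shape model $(a, b, c, d)$.

• $\Lambda_m$: a block sparse precision matrix, with non-zero entries corresponding to pairs of parts $i, j$ connected in $E_m$.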
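
To see why a spring with parameters (a, b) can be re-parameterized as a mean and a precision, it may help to complete the square for a single one-dimensional spring. The worked step below assumes $a < 0$; the symbols $\mu$ and $\lambda$ are introduced here only for this scalar illustration.

```latex
\begin{aligned}
a\,dx^2 + b\,dx
  &= a\left(dx + \frac{b}{2a}\right)^2 - \frac{b^2}{4a} \\
  &= -\lambda\,(dx - \mu)^2 + \text{constant},
  \qquad \lambda = -a > 0,\quad \mu = -\frac{b}{2a}.
\end{aligned}
```

The rest displacement of the spring is $\mu$ and its rigidity is $\lambda$; stacking these scalar terms over all edges and both coordinates gives a quadratic form in the part locations whose matrix is non-zero only for connected pairs of parts.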

Page 9

The mean shape and deformation modes

Page 10

Inference

• Inference corresponds to maximizing $S(I, L, m)$ in Eqn. 1 over $L$ and $m$:

$$S^*(I) = \max_m \left[ \max_L S(I, L, m) \right]$$

• Since each mixture $T_m = (V_m, E_m)$ is a tree, the inner maximization can be done efficiently with dynamic programming.
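
The maximization over part locations on a tree, followed by a maximization over mixtures, can be sketched as max-sum dynamic programming. The Python below is a simplified illustration under stated assumptions: each part has a small discrete set of candidate locations, pairwise spring scores are given as dense tables, and all function and variable names are hypothetical. It is not the authors' code.

```python
import numpy as np

def best_configuration(unary, children, pair_score, root=0):
    """Max-sum dynamic programming on a tree.

    unary:      dict part -> 1-D array of appearance scores per candidate location.
    children:   dict part -> list of child parts (defines the tree).
    pair_score: dict (child, parent) -> 2-D array [child_loc, parent_loc]
                of spring scores for that edge.
    Returns (best total score, dict part -> chosen candidate index).
    """
    msg_argmax = {}

    def upward(i):
        # Best score of the subtree rooted at i, for each location of i.
        total = unary[i].astype(float).copy()
        for c in children.get(i, []):
            sub = upward(c)                               # shape (n_c,)
            table = sub[:, None] + pair_score[(c, i)]     # shape (n_c, n_i)
            msg_argmax[c] = table.argmax(axis=0)          # best child loc per parent loc
            total += table.max(axis=0)
        return total

    root_scores = upward(root)
    best = {root: int(root_scores.argmax())}
    stack = [root]
    while stack:                                          # backtrack root -> leaves
        i = stack.pop()
        for c in children.get(i, []):
            best[c] = int(msg_argmax[c][best[i]])
            stack.append(c)
    return float(root_scores.max()), best

def best_over_mixtures(mixtures):
    """mixtures: list of (unary, children, pair_score, alpha) tuples."""
    results = []
    for m, (unary, children, pair, alpha) in enumerate(mixtures):
        s, locs = best_configuration(unary, children, pair)
        results.append((s + alpha, m, locs))
    return max(results, key=lambda r: r[0])

# Toy chain 0 - 1 - 2 with two candidate locations per part (numbers made up).
unary = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 2.0]), 2: np.array([0.5, 0.0])}
children = {0: [1], 1: [2]}
pair = {(1, 0): np.array([[0.0, -1.0], [-3.0, 0.0]]),
        (2, 1): np.array([[0.0, -1.0], [-1.0, 0.0]])}
score, locs = best_configuration(unary, children, pair)
best_score, best_m, best_locs = best_over_mixtures(
    [(unary, children, pair, 0.0), (unary, children, pair, -1.0)])
```

The dense pairwise tables cost time quadratic in the number of candidate locations per edge; this keeps the sketch short, whereas practical part-model implementations usually exploit the quadratic form of the springs to make each message cheaper.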

Page 11

Learning

• Given labeled positive examples $\{I_n, L_n, m_n\}$ and negative examples $\{I_n\}$, we define a structured prediction objective function similar to the one proposed in [41]. To do so, we write $z_n = \{L_n, m_n\}$.

• Concatenating Eqn. 1's parameters into a single vector $\beta$, the scoring function is linear in $\beta$: $S(I, z) = \beta \cdot \Phi(I, z)$.

[41] Y. Yang and D. Ramanan. Articulated pose estimation using flexible mixtures of parts. In CVPR 2011.

Page 12

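Learning

• Now we can learn a model of the form:

$$\arg\min_{\beta,\ \xi_n \ge 0} \ \tfrac{1}{2}\,\beta \cdot \beta + C \sum_n \xi_n$$
$$\text{s.t.} \quad \forall n \in \text{pos}: \ \beta \cdot \Phi(I_n, z_n) \ge 1 - \xi_n$$
$$\forall n \in \text{neg},\ \forall z: \ \beta \cdot \Phi(I_n, z) \le -1 + \xi_n$$
$$\forall k \in K: \ \beta_k \le 0$$

• The constraints state that positive examples should score better than 1 (the margin), while negative examples should score less than -1 for every configuration $z$ of part positions and mixtures.

• The objective function penalizes violations of these constraints using slack variables $\xi_n$.

• We write $K$ for the indices of the quadratic spring terms $(a, c)$ in parameter vector $\beta$.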
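
To make the role of the slack variables concrete, the sketch below evaluates a margin-based objective of this kind for a fixed parameter vector: each positive example pays for scoring below 1, each negative example pays for its best-scoring configuration exceeding -1, and the quadratic spring weights are kept non-positive. It assumes a finite candidate set of configurations per negative image, and uses simple clipping to enforce the sign constraint; the function names and data layout are hypothetical, and this is not the solver used by the authors.

```python
import numpy as np

def objective(beta, pos_feats, neg_feats, C):
    """Value of 0.5 * beta.beta + C * sum(slack), using the smallest feasible slacks.

    pos_feats: list of 1-D feature vectors, one per labeled positive example.
    neg_feats: list of 2-D arrays; each row is the feature vector of one
               candidate configuration of a negative image (assumed finite set).
    """
    slack = 0.0
    for phi in pos_feats:
        # Positive should score at least 1.
        slack += max(0.0, 1.0 - float(beta @ phi))
    for Phi in neg_feats:
        # Every configuration of a negative should score at most -1,
        # so only the highest-scoring configuration matters.
        slack += max(0.0, 1.0 + float((Phi @ beta).max()))
    return 0.5 * float(beta @ beta) + C * slack

def project_springs(beta, K):
    """Clip the entries indexed by K so that they are non-positive."""
    out = beta.copy()
    out[K] = np.minimum(out[K], 0.0)
    return out

# Toy example (all numbers made up); index 2 plays the role of a quadratic spring term.
beta = project_springs(np.array([1.0, -0.5, 0.2]), K=[2])
pos = [np.array([2.0, 0.0, 0.0]), np.array([0.5, 0.0, 0.0])]
neg = [np.array([[0.0, 2.0, 0.0], [-1.0, 0.0, 0.0]]),
       np.array([[0.5, 1.0, 0.0]])]
value = objective(beta, pos, neg, C=2.0)
```

In the toy data the first positive and the first negative satisfy their margins, while the second positive and second negative contribute slack of 0.5 and 1.0 respectively.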

Page 13

Experimental Results

Page 14

Dataset

• CMU MultiPIE

• Annotated Faces in-the-Wild (AFW) (from Flickr images)

Page 15

Dataset

Page 16

Sharing

• We explore 4 levels of sharing, denoting each model by the number of distinct templates encoded.
▫ Share-99 (i.e. fully shared model)
▫ Share-146
▫ Share-622
▫ Independent-1050 (i.e. independent model)

Page 17

Page 18

In-house baselines

•We define Multi.HoG to be rigid, multiview HoG template detectors, trained on the same data as our models.

•We define Star Model to be equivalent to Share-99 but defined using a “star” connectivity graph, where all parts are directly connected to a root nose part.

Page 19

Face detection on the AFW test set

[22] Z. Kalal, J. Matas, and K. Mikolajczyk. Weighted sampling for large-scale boosting. In BMVC 2008.

Page 20

Pose estimation

Page 21

Landmark localization

Page 22

Landmark localization

Page 23

AFW image

Page 24

Conclusion

•Our model outperforms state-of-the-art methods, including large-scale commercial systems, on all three tasks under both constrained and in-the-wild environments.

Page 25

Thanks for listening