SIGGRAPH Course 30: Performance-Driven Facial Animation Section: Markerless Face Capture and Automatic Model Construction Part 2: Li Zhang, Columbia University

Dec 19, 2015

Page 1

SIGGRAPH Course 30: Performance-Driven Facial Animation

Section:

Markerless Face Capture and Automatic Model Construction

Part 2: Li Zhang, Columbia University

Page 2

Part 1 vs Part 2

Part 1
• Arbitrary Videos
• Sparse Models

Part 2
• Controlled Environments
• Dense Models

Register 3D Models

Create Model Priors

Page 3

Outline

1. Scanning face models

– Triangulation methods

– Non-triangulation methods

2. Dense facial motion capture

– Marker-based capture

– Template fitting for face scans

Page 4

Principle 1: triangulation

Stereo

[Figure: labels I and J]

Page 5

Principle 1: triangulation

Active stereo

[Figure: labels I and J]

Page 6

Principle 1: triangulation

Structured light

[Figure: labels I and J]

Page 7

Laser scanner

Cyberware® face and head scanner

+ Very accurate: <0.01mm
− >10sec per scan

Page 8

Fast laser scanner (temporal)

A. Gruss, S. Tada, and T. Kanade, "A VLSI Smart Sensor for Fast Range Imaging," ICIRS 1992

Working Volume: 350-500mm - Accuracy: 0.1%
Spatial Resolution: 28x32 - Speed: 1000Hz

+ Fast (up to 1000Hz)
− Customized device

Page 9

Fast laser scanner (spatial)

Y. Oike, M. Ikeda, and K. Asada, “Design and implementation of real-time 3-D image sensor with 640x480 pixel resolution,” IEEE Journal of Solid-State Circuits, 2004.

Working Volume: 1200mm - Accuracy: 0.07%
Spatial Resolution: 640x480 - Speed: 65Hz

Possible issue: stripes within a range map are not measured simultaneously.

Page 10

Digital fringe range sensor

S. Zhang and P. Huang, “High-resolution Real-time 3-D Shape Measurement”, Journal of Optical Engineering, 2006

Working Volume: 10-2000mm - Accuracy: 0.025%
Spatial Resolution: 532x500 - Speed: 120Hz

+ Real-time performance
− Phase ambiguity near discontinuities
− Customized device
− Capture from one viewpoint at a time
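To illustrate where the phase ambiguity comes from, here is a minimal sketch of the standard three-step phase-shifting computation often used with digital fringe projection. It assumes fringe shifts of -2π/3, 0, and +2π/3; the function names and the simulated intensities are hypothetical and not taken from the cited paper. The recovered phase is only known modulo 2π, so it must be unwrapped, and unwrapping is what can fail across depth discontinuities.

```python
import numpy as np

def simulate_fringes(phase, offset=0.5, amplitude=0.4):
    """Return three fringe intensities for shifts of -2*pi/3, 0, +2*pi/3 (assumed setup)."""
    shifts = (-2.0 * np.pi / 3.0, 0.0, 2.0 * np.pi / 3.0)
    return [offset + amplitude * np.cos(phase + s) for s in shifts]

def wrapped_phase(i1, i2, i3):
    """Three-step phase shifting: recover the phase wrapped into (-pi, pi]."""
    return np.arctan2(np.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

# A phase ramp spanning three fringe periods.
true_phase = np.linspace(0.0, 6.0 * np.pi, 200)
i1, i2, i3 = simulate_fringes(true_phase)
phi = wrapped_phase(i1, i2, i3)

# phi agrees with true_phase only up to a multiple of 2*pi per pixel.
print(phi.min(), phi.max())
```

On a smooth surface the 2π jumps can be removed by unwrapping along neighboring pixels; near a depth discontinuity the number of skipped periods cannot be determined from the wrapped phase alone.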

Page 11

Active multi-baseline stereo

S. Kang, J.A. Webb, C. Zitnick, and T. Kanade, “A Multibaseline Stereo System with Active Illumination and Real-time Image Acquisition,” ICCV 1995.

Working Volume: 2000mm - Accuracy: 0.1%
Spatial Resolution: 100x100? - Speed: 30Hz

+ Only requires one image per camera
+ Simultaneous multi-view capture
− Less accurate than laser scanners or fringe scanners

Page 12

Spacetime stereo

[Figure: a 3D surface viewed in images I and J, with matching pixel positions x1 and x2]

Disparity: d = x1 – x2
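As a worked example of how a disparity becomes a depth, the sketch below uses the textbook relation for a rectified camera pair with parallel optical axes, Z = f·B/d. The slides do not state the camera model, so the rectified setup, the function name, and the parameter names (`focal_px`, `baseline_mm`) are assumptions made for illustration.

```python
def depth_from_disparity(x1, x2, focal_px, baseline_mm):
    """Depth (mm) of a point seen at column x1 in image I and x2 in image J.

    Assumes a rectified pair: focal length in pixels, baseline in mm.
    """
    d = x1 - x2  # disparity, as defined on the slide
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_mm / d

# Example with made-up numbers: 20 px disparity, 1000 px focal length, 100 mm baseline.
print(depth_from_disparity(420.0, 400.0, focal_px=1000.0, baseline_mm=100.0))  # 5000.0
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels costs more depth accuracy the farther the surface is from the cameras.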

Page 13

Spacetime stereo

[Figure: 3D surface observed in images I and J over time]

Page 14

Spacetime stereo

[Figure: 3D surface observed in images I and J over time]

Page 15

[Figure: surface motion observed in images I and J over time]

Page 16

Zhang et al. CVPR 2003

[Figure: surface motion observed in images I and J over time; affine window warp]

Key ideas:
• Matching volumetric window
• Local linear disparity change
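The volumetric-window idea can be sketched as a sum of squared differences (SSD) taken over a window that extends in x, y, and time. This is a simplified illustration, not the authors' implementation: it holds the disparity constant inside the window and searches integer disparities only, so it omits the locally linear disparity change (the affine window warp) named above. The function names and the array layout `(T, H, W)` are assumptions.

```python
import numpy as np

def spacetime_ssd(I, J, x, y, t, d, half=(2, 2, 1)):
    """SSD between a spacetime window in I centred at (t, y, x) and the
    window in J displaced by disparity d (so x2 = x1 - d)."""
    hx, hy, ht = half
    win_i = I[t - ht:t + ht + 1, y - hy:y + hy + 1, x - hx:x + hx + 1]
    win_j = J[t - ht:t + ht + 1, y - hy:y + hy + 1, x - d - hx:x - d + hx + 1]
    return float(np.sum((win_i - win_j) ** 2))

def best_disparity(I, J, x, y, t, d_range, half=(2, 2, 1)):
    """Brute-force search for the integer disparity with the lowest SSD."""
    costs = [spacetime_ssd(I, J, x, y, t, d, half) for d in d_range]
    return d_range[int(np.argmin(costs))]

# Synthetic check: I is J shifted right by 3 pixels in every frame.
rng = np.random.default_rng(0)
J = rng.random((5, 20, 40))
I = np.roll(J, 3, axis=2)
print(best_disparity(I, J, x=20, y=10, t=2, d_range=range(0, 8)))  # 3
```

Setting the temporal half-width to 0 reduces this to ordinary frame-by-frame window matching, which makes the two approaches easy to compare on the same data.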

Page 17

Spacetime stereo

Input stereo video:

656x494 videos at 60fps, captured by FireWire cameras

Page 18

Face Example: Result Comparison

Frame-by-frame stereo

WxH = 15x15 window (225 pixels)

Spacetime stereo

WxHxT = 9x5x5 window (225 pixels)

Page 19

Face Example: Mouth motion

Zhang, L., Curless, B., Seitz, S., “Spacetime stereo”, CVPR 2003

Working Volume: 300mm - Accuracy: 0.1%
Spatial Resolution: 640x480 - Speed: 60Hz

+ More accurate and stable than frame-by-frame stereo
+ Simultaneous multi-view capture
− Offline computation (3min per frame)

Page 20

Principle 2: Time-of-flight

+ No baseline, no parallax shadows
+ Mechanical alignment is not as critical
− Low depth accuracy
− Single viewpoint capture

Miyagawa, R., Kanade, T., “CCD-Based Range Finding Sensor”, IEEE Transactions on Electron Devices, 1997

Working Volume: 1500mm - Accuracy: 7%
Spatial Resolution: 1x32 - Speed: ??
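A short calculation shows why depth accuracy is hard to achieve with time-of-flight. The sketch assumes a pulsed sensor that measures the round-trip travel time of light directly; the slides do not say whether the cited sensor is pulsed or modulated, and the function names are hypothetical.

```python
C_MM_PER_NS = 299.792458  # speed of light in millimetres per nanosecond

def tof_depth_mm(round_trip_ns):
    """Depth from round-trip time: light travels to the surface and back."""
    return 0.5 * C_MM_PER_NS * round_trip_ns

def timing_resolution_ns(depth_resolution_mm):
    """Timing resolution needed to resolve a given depth difference."""
    return 2.0 * depth_resolution_mm / C_MM_PER_NS

print(tof_depth_mm(10.0))          # about 1499 mm for a 10 ns round trip
print(timing_resolution_ns(10.0))  # about 0.067 ns (67 ps) to resolve 1 cm
```

Resolving 1 cm requires timing on the order of tens of picoseconds, which is consistent with the centimetre-level accuracy quoted for time-of-flight devices.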

Page 21

Principle 3: Defocus

Page 22

Principle 3: Defocus

Nayar, S.K., Watanabe, M., Noguchi, M., “Real-Time Focus Range Sensor”, ICCV 1995

Working Volume: 300mm - Accuracy: 0.2%
Spatial Resolution: 512x480 - Speed: 30Hz

+ High resolution and accuracy, real-time
− Customized hardware
− Single view capture?

Page 23

Commercial products

Company      Working principle   XY resolution   Depth accuracy   Speed
Cyberware    Laser               >500x500        0.01mm           >10sec per scan
XYZRGB       Laser               Very high       0.01mm           >10sec per scan
Eyetronics   Structured light    High            <2mm             <0.1sec
3Q           Active stereo       High            ?                <0.1sec
3DV          Time of flight      720x486         1-2cm            30Hz
Canesta      Time of flight      64x64           1cm              30Hz

Page 24

Commercial products

Canesta

64x64 @ 30Hz
Accuracy: 1-2cm

Not accurate enough for face modeling, but good enough for layer extraction.

Page 25

Outline

1. Scanning face models

– Triangulation methods (created most accurate face models)

– Non-triangulation methods

2. Dense facial motion capture

– Marker-based capture

– Template fitting for face scans

Page 26

Marker-based approach

182 colored dots on a face
6 cameras videotaping performance

[Figure panels: dot removal for texture map; deforming face model; 3D dot motion]

Guenter et al., SIGGRAPH 1998

Page 27

Making faces

Guenter et al., SIGGRAPH 1998

+ Realistic appearance
− Limited geometry details
− The overhead of painting faces

MOVA Motion Capture
Phosphorescent paint

Page 28

Spacetime faces

[Figure: face capture rig with video projectors, color cameras, and black & white cameras]

Zhang et al., SIGGRAPH 2004

Page 29

Capture process

Page 30

Input videos (640x480, 60fps)

Page 31

Global spacetime stereo

Page 32

[Figure: over time, a sequence of color image pairs and a sequence of depth map pairs; the output is a sequence of meshes, starting from a template mesh]

Page 33

[Figure: a sequence of color image pairs and a sequence of depth map pairs over time; warped template]

Pages 34-36

[Figure: a sequence of color image pairs and a sequence of depth map pairs over time; warped template and fitted template]

Pages 37-39

[Figure: a sequence of color image pairs and a sequence of depth map pairs over time; fitted template]
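The fitting step shown in the sequence above can be sketched as a small linear least-squares problem: pull each template vertex toward a target point from the scan while keeping the edges between neighboring vertices close to their original shape. This is a generic illustration under stated assumptions, not the objective used by Zhang et al.; the per-vertex `targets` (for example, the closest scan point for each vertex), the `edges` list, and the weight `smooth` are all hypothetical inputs.

```python
import numpy as np

def fit_template(verts, targets, edges, smooth=1.0):
    """Return new vertex positions X (n x 3) minimizing
    sum_i |X_i - targets_i|^2
      + smooth * sum_(i,j) |(X_i - X_j) - (verts_i - verts_j)|^2
    """
    n = len(verts)
    w = np.sqrt(smooth)
    # One row per edge: weighted difference between the two endpoint vertices.
    E = np.zeros((len(edges), n))
    for k, (i, j) in enumerate(edges):
        E[k, i], E[k, j] = w, -w
    A = np.vstack([np.eye(n), E])      # data term stacked on edge-preservation term
    b = np.vstack([targets, E @ verts])
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

# Toy example: a 4-vertex chain whose targets are a pure translation.
verts = np.array([[0.0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]])
edges = [(0, 1), (1, 2), (2, 3)]
targets = verts + np.array([0.0, 0.5, 0.0])
print(fit_template(verts, targets, edges))
```

Running the fit frame by frame, with the previous frame's result as the starting template, gives one mesh per frame with consistent vertex correspondence, which is the point of fitting a template rather than meshing each depth map independently.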

Page 40

Spacetime faces

+ High resolution motion (~20K vertices)
− Not robust for very fast motion

Better skin models for template fitting

Fast cameras

Zhang et al., SIGGRAPH 2004

Page 41

High Resolution Acquisition of Dynamic 3-D expression

[Figure: a template and a sequence of shape measurements]

Problem: estimating 3D motion between shape measurements

Approach: template fitting

Wang et al., ICCV 2005

Page 42

High Resolution Acquisition of Dynamic 3-D expression

+ High resolution motion
+ More stable motion
− Less robust for larger inter-frame deformation

Wang et al., ICCV 2005