Monocular Human Pose Estimation with Bayesian Networks. Electronic Engineering Department, Fu Jen University. 2010/6/11. Yuan-Kai Wang. This work is licensed under the Creative Commons Attribution 3.0 Taiwan license.

Monocular Human Pose Estimation with Bayesian Networks

Dec 05, 2014

Yuan-Kai Wang

Slides for my academic talk on human motion capture, given in 2010. The presentation also covers some of our research results.
Transcript
Page 1: Monocular Human Pose Estimation with Bayesian Networks

Monocular Human Pose Estimation with Bayesian Networks

Electronic Engineering Department, Fu Jen University

2010/6/11

Yuan-Kai Wang

This work is licensed under the Creative Commons Attribution 3.0 Taiwan license.

Page 2: Monocular Human Pose Estimation with Bayesian Networks

Wang, Yuan-Kai Electronic Engineering Department, Fu Jen University 2

Outline
1. Introduction
2. Markerless Monocular Human Pose Estimation
3. Overview of the Approach
4. Model Learning by EM Algorithm
5. Pose Estimation by Approximate Inference
6. Feature Extraction
7. Experimental Results
8. Conclusions

Page 3: Monocular Human Pose Estimation with Bayesian Networks

1. Introduction
• Applications of Human Motion Capture
– Performance animation in movie making
– Games
– Medical diagnosis
– Sport & health
– Visual surveillance

Page 4: Monocular Human Pose Estimation with Bayesian Networks

Performance Animation
• Avatar
• The Lord of the Rings

Page 5: Monocular Human Pose Estimation with Bayesian Networks

Game
• Microsoft's Project Natal for Xbox 360

Page 6: Monocular Human Pose Estimation with Bayesian Networks

Medical Diagnosis
• Gait analysis for rehabilitation

Page 7: Monocular Human Pose Estimation with Bayesian Networks

Sport & Health
• Golf training

Page 8: Monocular Human Pose Estimation with Bayesian Networks

Visual Surveillance
• Behavior analysis for event detection
– Irregular movement, body language, unusual interactions, fighting
– Car crash
• Content-based retrieval

Page 9: Monocular Human Pose Estimation with Bayesian Networks

Sensor Approaches
• Active sensors
– Types
  • Electro-magnetic marker
  • Optical
  • Accelerometer
– Wired connection
– Drawbacks
  • Intrusive
  • Expensive
  • Time consuming
• Passive sensors by camera
– Marker-based
– Markerless
(Figure: too many wires)

Page 10: Monocular Human Pose Estimation with Bayesian Networks

Marker-based Sensors
• Add visual markers on body
– Active marker
  • Visible/non-visible light
– Passive marker
• Need computer vision algorithms
• Advantages
– No wires
• Drawbacks
– Semi-intrusive
– Time consuming
(Figures: active marker, passive marker)

Page 11: Monocular Human Pose Estimation with Bayesian Networks

Markerless Sensors
• No attachment on human body
• Heavily dependent on computer vision analyzer
– Stereo/multiple cameras
– Monocular cameras
• Pure vision solution

Page 12: Monocular Human Pose Estimation with Bayesian Networks

Sensor vs. Analyzer

T. B. Moeslund, "Computer vision-based human motion capture – a survey", Technical report LIA 99-02, University of Aalborg, 1999.

Page 13: Monocular Human Pose Estimation with Bayesian Networks

Pose Estimation vs. Gesture Recognition
(Figure: the same input is labeled "Walking" by gesture recognition and reconstructed as a body pose by pose estimation)

Page 14: Monocular Human Pose Estimation with Bayesian Networks

2D vs. 3D

Page 15: Monocular Human Pose Estimation with Bayesian Networks

2. Markerless Monocular Human Motion Capture
• Goal
– Markerless
– Single camera
– 3D poses
• Challenges
– Ill-posed
– Highly articulated
– Self-occluding
(Figure: depth ambiguities and occlusion using monocular silhouettes)

Page 16: Monocular Human Pose Estimation with Bayesian Networks

Joint Representation

• Articulated human body is linked by joints

Page 17: Monocular Human Pose Estimation with Bayesian Networks

Abstract Representation
(Figure: grid of body models; columns: 2D, 3D; rows: stick, surface/volume)

Page 18: Monocular Human Pose Estimation with Bayesian Networks

Literature Review
From low-level observation to high-level abstraction:
• Image space (pixel domain)
• Human segmentation (S)
– Background subtraction
– Object detection
– Full body or body parts
• Image feature descriptor (F)
– Shape, silhouette, color, appearance, motion, feature point (corner), ...
• 2D joint location (J)
– Marker-based
• 3D model parametric space (pose domain, P)
– Joint angle Θi
– Joint location Pi
Mappings used in the literature: P = f(S), P = f(F), P = f(J)
A two-stage approach is proposed: P = f1(f2(F))
(Figure: 3D stick figure in X, y, Z axes with joints neck, left/right shoulder, left/right elbow, left/right hand, bottom, left/right waist, left/right knee, left/right foot)

Page 19: Monocular Human Pose Estimation with Bayesian Networks

Approaches
• Model-free [Agarwal, 2006] [Loy, 2004]
– Joint articulation is not used to constrain the search for the mapping function P = f(X)
• Model-based [Rbert, 2006] [Rohr, 1994]
– A model of human articulation constrains the search for f and P
– Two kinds of approach
  • Discriminative
  • Generative: Bayesian networks (BNs)
Training: $$\hat{f} = \arg\max_{f} L_1(\text{Training}, f)$$
Inference: $$\hat{P} = \arg\max_{P} L_2(\hat{f} \mid X, P)$$

Page 20: Monocular Human Pose Estimation with Bayesian Networks

An Articulated Model = A Bayesian Network
• The human body is represented as a kinematic tree consisting of segments linked by joints
• Kinematic models are expressed as graphical probability networks
• Graphical probability models are computed via Bayesian networks
(Figure: 3D stick figure in X, y, Z axes with joints neck, left/right shoulder, left/right elbow, left/right hand, bottom, left/right waist, left/right knee, left/right foot)

Page 21: Monocular Human Pose Estimation with Bayesian Networks

Three Steps to Utilize BNs
• Representation, learning and inference
– Representation: joints and features are nodes X1, X2, X3, X4 of the network
– Learning: feature-joint correspondence by conditional probability P(X1|X2,X3,X4)
$$\hat{f} = \arg\max_{f} L_1(\text{Training}, f)$$
– Inference: pose estimation
$$\hat{P} = \arg\max_{P} L_2(\hat{f} \mid X, P)$$

Page 22: Monocular Human Pose Estimation with Bayesian Networks

Two Causal Models in BNs
• Undirected acyclic graph [Lan, 2008] [Hua, 2005]
– The network is a tree or graph model in which the edge linking two nodes has no direction.
• Directed acyclic graph [Ramanan, 2007] [Lee, 2006] [Leonid, 2003]
– Every edge is a directed arc from one node to another.
(Figure: nodes X1 and X2 joined by an undirected edge labeled P(X1,X2), and by a directed edge labeled P(X1|X2))

Page 23: Monocular Human Pose Estimation with Bayesian Networks

Directed Bayesian Articulated Model
• Nodes in a directed acyclic graph (DAG) are not influenced by their child nodes.
• Dependencies between human body parts are not regarded as two-way.
(Figure: directed graph over the 15 joint nodes h2d,1 ... h2d,15)

Page 24: Monocular Human Pose Estimation with Bayesian Networks

Inference of Bayesian Networks
• Top-down approach [Gavrila, 1996]
– Is strong at finding human body parts in the image.
• Bottom-up approach [Ren, 2005]
– Is strong at finding people in the image.
• Combined approach [Navaraman, 2005] [Lee, 2002]
– Benefits from the advantages of both.

Page 25: Monocular Human Pose Estimation with Bayesian Networks

3. Overview of the Approach
(Figure: the 2D and 3D stick-figure models, each with 15 joints: head, neck, left/right shoulder, left/right elbow, left/right hand, bottom, left/right waist, left/right knee, left/right foot; the 3D model is drawn in X, y, Z axes)
• Both models are belief propagation networks using an annealed Gibbs sampling algorithm.

Page 26: Monocular Human Pose Estimation with Bayesian Networks

System Architecture
• We estimate the 2D human joint positions before 3D estimation.
(Block diagram)
– 2D model training: training features → 2D Bayesian human model setting → EM training
– 3D model training: training features → 3D Bayesian human model setting → EM training
– Testing: testing image → feature extraction → 2D Bayesian inference with annealed Gibbs sampling → 3D Bayesian inference with annealed Gibbs sampling → result

Page 27: Monocular Human Pose Estimation with Bayesian Networks

2D Human Graphical Model
• The articulated structure of the 2D human body is represented by a 15-node graphical model.
$$H_{2D} = \{h_{2d,1}, \ldots, h_{2d,15}\}$$
(Figure: 2D stick figure (articulated model) with joints head, neck, left/right shoulder, left/right elbow, left/right hand, bottom, left/right waist, left/right knee, left/right foot, and the corresponding graph over nodes h2d,1 ... h2d,15)

Page 28: Monocular Human Pose Estimation with Bayesian Networks

3D Human Graphical Model
• The 3D human body model is described by a 45D vector H3D that holds the 3D position (three coordinates) of each of the 15 joint nodes.
$$H_{3D} = \{h_{3d,1}, \ldots, h_{3d,15}\}$$
(Figure: 3D stick figure (articulated model) in X, y, Z axes with joints neck, left/right shoulder, left/right elbow, left/right hand, bottom, left/right waist, left/right knee, left/right foot, and the corresponding graph over nodes h3d,1 ... h3d,15)

Page 29: Monocular Human Pose Estimation with Bayesian Networks

The BN Model
• A directed acyclic graph $$G = (V, E, C)$$
– V: vertex set {Vi, 1≤i≤N}
– E: a set of directed edges (i,j)
– C: (i,j) → R+, edge cost functions
• To encode probabilistic information
– An edge indicates a probabilistic dependence
– C : P(Vi | Vj): conditional probability function set
• The 2D and 3D BNs
$$G_{2D} = (V_{2D}, E_{2D}, C_{2D}), \qquad G_{3D} = (V_{3D}, E_{3D}, C_{3D})$$
(Figure: directed graph over the joint nodes h2d,1 ... h2d,15)

Page 30: Monocular Human Pose Estimation with Bayesian Networks

2D Graphical Model
$$V_{2D} = \{H_{2D}, O_{2D}\}$$
$$C_{2D} = \{P(h_{2d,i} \mid pa(h_{2d,i}))\}$$
• Observation nodes O2d: S, Nc, A, C
(Figure: directed graph linking the observation nodes to the joint nodes h2d,i)

Page 31: Monocular Human Pose Estimation with Bayesian Networks

3D Graphical Model
$$V_{3D} = \{H_{3D}, O_{3D}\}$$
$$C_{3D} = \{P(h_{3d,i} \mid pa(h_{3d,i}))\}$$
• Observation nodes O3d: h2d, wN, L
• Upper body: nodes hu3d,1 ... hu3d,7, with 2D joints h2d,1 and h2d,3 ... h2d,9
• Lower body: nodes hl3d,1 ... hl3d,7, with 2D joints h2d,9 ... h2d,15

Page 32: Monocular Human Pose Estimation with Bayesian Networks

Joint Probability Distribution (JPD)
• The two proposed graphical models specify two unique JPDs: P2D(V2D) and P3D(V3D)
• Let P(V) represent the two JPDs
$$P(V) = \prod_{i=1}^{n} P(V_i \mid pa(V_i))$$
• The factorization of the JPD comes from the local Markov property (Markov blanket)
• If we can learn the finite set of conditional probabilities, we can infer the human pose
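The factorization on this slide can be illustrated with a small sketch. The three-node chain, its binary states, and the probability values below are hypothetical and chosen only for illustration; they are not the 15-node model of the talk.

```python
from itertools import product

# Hypothetical 3-node chain: neck -> shoulder -> elbow, each node binary.
# cpt[node] maps a tuple of parent values to P(node = 1 | parents).
parents = {"neck": (), "shoulder": ("neck",), "elbow": ("shoulder",)}
cpt = {
    "neck": {(): 0.6},
    "shoulder": {(0,): 0.2, (1,): 0.7},
    "elbow": {(0,): 0.1, (1,): 0.8},
}


def joint_probability(assignment):
    """P(V) as the product over nodes of P(V_i | pa(V_i))."""
    prob = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob


# Summing the factorized joint over all assignments should give 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
```

Only the local conditional tables are stored; any entry of the full joint is recovered by multiplying one entry per node.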

Page 33: Monocular Human Pose Estimation with Bayesian Networks

Two Problems
• Training problem
– Given a training set: {O2d, O3d}
– How can we learn the edge cost function C = { P(h | pa(h)) }?
– We apply the EM algorithm
• Inference problem
– Given evidence O
– How can we infer the human pose P(H | O) from P(V)?
– We propose an annealed Gibbs sampling algorithm

Page 34: Monocular Human Pose Estimation with Bayesian Networks

4. Model Learning by EM
• Why apply the EM algorithm for model learning?
– The human poses and observations are incomplete and sparse
  • Incomplete: occlusion due to single camera
  • Sparse: few training samples in a high-dimensional space

Page 35: Monocular Human Pose Estimation with Bayesian Networks

The Likelihood Function
• The training set D = {D1, ..., DN}
– N represents the number of training samples
– Dl = {V1[l], ..., Vn[l]} is the l-th training sample
• Let θ be the learning model: C = { P(h | pa(h)) }
• The model is estimated as (the third equality treats the prior P(θ) as uniform)
$$\hat{\theta} = \arg\max_{\theta} P(\theta \mid D) = \arg\max_{\theta} \frac{P(D \mid \theta) P(\theta)}{P(D)} = \arg\max_{\theta} P(D \mid \theta) = \arg\max_{\theta} \prod_{l=1}^{N} P(D_l \mid \theta)$$
• A log-likelihood function is formulated based on the independence assumption of training samples
$$L_D(\theta) = \log P(D \mid \theta) = \log \prod_{l=1}^{N} P(V_1[l], \ldots, V_n[l] \mid \theta) = \sum_{i=1}^{n} \sum_{l=1}^{N} \log P(V_i[l] \mid pa(V_i)[l], \theta)$$

Page 36: Monocular Human Pose Estimation with Bayesian Networks

MLE vs. EM
• If D is complete, we can apply MLE (Maximum Likelihood Estimation) to find θ
• However, D is incomplete because of occlusion and partial observability
• Let D = Y ∪ U
– Y is the observed data
– U is the missing data

Page 37: Monocular Human Pose Estimation with Bayesian Networks

The EM
• Expectation step
– Computes the expectation of the log-likelihood function
$$Q(\theta \mid \theta^{(t)}) = E_{\theta^{(t)}}\left[\log P(D \mid \theta) \mid \theta^{(t)}, Y\right]$$
• Maximization step
– Updates the parameter θ(t+1) of step t+1 from the current parameter θ(t)
$$\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)})$$
• Stop condition of the E-M iteration
– $$L_D(\theta^{(t+1)}) - L_D(\theta^{(t)})$$ converges
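The E-step and M-step can be sketched on a toy problem. The two-node binary network A → B, the data, and the starting values below are hypothetical and serve only to illustrate how missing values (None) are replaced by expected counts; this is not the training code used in the talk.

```python
import math


def em_two_node(data, iters=50, tol=1e-9):
    """EM for a binary network A -> B where some A values are missing (None).

    data: list of (a, b) pairs with a in {0, 1, None} and b in {0, 1}.
    Returns (p_a, p_b, history), where p_a = P(A=1), p_b[a] = P(B=1 | A=a)
    and history holds the observed-data log-likelihood before each update.
    """
    p_a = 0.5            # arbitrary starting guess
    p_b = [0.3, 0.6]     # arbitrary starting guess
    history = []
    for _ in range(iters):
        # E-step: w = P(A = 1 | observed part of each sample, current parameters)
        weights, loglik = [], 0.0
        for a, b in data:
            like = [
                (1 - p_a) * (p_b[0] if b else 1 - p_b[0]),
                p_a * (p_b[1] if b else 1 - p_b[1]),
            ]
            if a is None:
                weights.append(like[1] / (like[0] + like[1]))
                loglik += math.log(like[0] + like[1])
            else:
                weights.append(float(a))
                loglik += math.log(like[a])
        history.append(loglik)
        # Stop once the log-likelihood no longer improves noticeably
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M-step: re-estimate the parameters from expected counts
        n1 = sum(weights)
        n0 = len(data) - n1
        p_a = n1 / len(data)
        p_b = [
            sum((1 - w) * b for w, (_, b) in zip(weights, data)) / n0,
            sum(w * b for w, (_, b) in zip(weights, data)) / n1,
        ]
    return p_a, p_b, history


data = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1),
        (None, 1), (None, 1), (None, 0), (None, 0), (1, 1), (0, 0)]
p_a, p_b, history = em_two_node(data)
```

The recorded log-likelihood should never decrease from one iteration to the next, which is the property the stop condition relies on.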

Page 38: Monocular Human Pose Estimation with Bayesian Networks

5. Pose Estimation by Approximate Inference
• Let the observed data be O' = O − U
– U is the set of hidden variables that are unobservable due to occlusion
• The best estimated pose is a vector H*, defined as the pose with the maximum probability given O'
$$H^* = \arg\max_{H} P(H \mid O') = \arg\max_{H} \int_{u \in U} P(H, u \mid O') \, du = \arg\max_{H} \int_{u \in U} P(H, O', u) \, du$$
• With V = H ∪ O' ∪ U and the factorization of P(V)
$$H^* = \arg\max_{H} \int_{u \in U} \prod_{i=1}^{n} P(V_i \mid pa(V_i)) \, du$$

Page 39: Monocular Human Pose Estimation with Bayesian Networks

Inference of Posterior Probability
• How to calculate the posterior probability?
$$H^* = \arg\max_{H} \int_{u \in U} \prod_{i=1 \ldots n} P(V_i \mid pa(V_i)) \, du$$
– Exact inference
  • Junction tree, message passing
– Approximate inference
  • Loopy belief propagation, variational methods
  • Markov chain Monte Carlo (MCMC) sampling
    – Metropolis-Hastings
    – Gibbs sampling

Page 40: Monocular Human Pose Estimation with Bayesian Networks

Approximate Inference (1/2)
• MCMC algorithms are based on sampling theory
• They approximate the posterior distribution P(V) by random number generation
• The key idea of MCMC is to simulate the sampling process as a Markov chain
• Definition
– A sample vector v of V
– A proposal distribution q(v*|v(t-1)) to generate v*
– An acceptance distribution α to accept v* as v(t)
$$\alpha(v^{(t-1)}, v^*) = \min\left\{1, \frac{p(v^*) \, q(v^{(t-1)} \mid v^*)}{p(v^{(t-1)}) \, q(v^* \mid v^{(t-1)})}\right\}$$
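The accept/reject rule on this slide can be sketched with a one-dimensional Metropolis-Hastings sampler. The Gaussian random-walk proposal, the standard normal target, and the function name are assumptions made for this example; with a symmetric proposal the q terms of the acceptance ratio cancel.

```python
import math
import random


def metropolis_hastings(log_p, v0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a 1D unnormalized log-density."""
    rng = random.Random(seed)
    v, samples = v0, []
    for _ in range(n_samples):
        v_star = v + rng.gauss(0.0, step)      # draw v* from q(v* | v)
        # alpha = min(1, p(v*) q(v|v*) / (p(v) q(v*|v))); q is symmetric here
        log_alpha = min(0.0, log_p(v_star) - log_p(v))
        if math.log(1.0 - rng.random()) < log_alpha:   # accept with prob. alpha
            v = v_star
        samples.append(v)                      # a rejected move repeats v
    return samples


# Target: standard normal, known only up to a constant.
samples = metropolis_hastings(lambda v: -0.5 * v * v, v0=3.0, n_samples=20000)
kept = samples[2000:]                          # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
```

After the burn-in samples are discarded, the sample mean and variance should be close to 0 and 1.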

Page 41: Monocular Human Pose Estimation with Bayesian Networks

Approximate Inference (2/2)
• MCMC generates a Markov chain (v(0), v(1), ..., v(k), ...), because the transition probability from v(t-1) to v(t)
– Depends only on v(t-1)
– But not on (v(0), v(1), ..., v(t-2))
• The chain approaches its stationary distribution
– Samples from the vector (v(k+1), ..., v(k+n)) are samples from P(V)
• However, if V is high-dimensional, MCMC does not converge easily

Page 42: Monocular Human Pose Estimation with Bayesian Networks

Annealed Gibbs Sampling (1/4)
• Gibbs sampling method
– Formally proposed by Geman & Geman in 1984 for Markov Random Fields (MRF)
– Here the sampler is revised for the proposed two-stage Bayesian network
– The basic idea
  • Sampling uni-variate conditional distributions
  • That is, the Markov chain (v(0), v(1), ..., v(k), ...) is obtained by changing only one variable of v at a time
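The basic idea of changing one variable at a time can be sketched with a plain Gibbs sampler. The bivariate normal target with correlation rho is an assumption chosen because its uni-variate conditionals are known in closed form; it is not the pose model of the talk.

```python
import random


def gibbs_bivariate_normal(rho, n_sweeps, seed=0):
    """Gibbs sampler for a zero-mean, unit-variance bivariate normal.

    Each step redraws exactly one coordinate from its conditional:
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    sd = (1.0 - rho * rho) ** 0.5
    chain = []
    for _ in range(n_sweeps):
        x = rng.gauss(rho * y, sd)     # update x, keep y fixed
        chain.append((x, y))
        y = rng.gauss(rho * x, sd)     # update y, keep x fixed
        chain.append((x, y))
    return chain


chain = gibbs_bivariate_normal(rho=0.8, n_sweeps=10000)
kept = chain[2000:]                    # discard burn-in
mean_xy = sum(x * y for x, y in kept) / len(kept)   # estimates E[xy] = rho
```

Consecutive states of the chain differ in at most one coordinate, and the average of x*y should approach rho.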

Page 43: Monocular Human Pose Estimation with Bayesian Networks

Annealed Gibbs Sampling (2/4)
• We draw from the distribution
$$v_j^{(t+1)} \sim P\left(V_j \mid v_1^{(t)}, \ldots, v_{j-1}^{(t)}, v_{j+1}^{(t)}, \ldots, v_n^{(t)}\right)$$
with the proposal
$$q(v^* \mid v^{(t)}) = \begin{cases} p(v_j^* \mid v_{-j}^{(t)}) & \text{if } v_{-j}^* = v_{-j}^{(t)} \\ 0 & \text{otherwise} \end{cases}$$
• The Annealed Gibbs (AG) sampler
– The sampling of the uni-variate conditional distributions is controlled by a stochastic process of simulated cooling
$$\alpha_{AG} = \min\left\{1, \frac{p(v^*)^{1/T(t)} \, q(v^{(t)} \mid v^*)}{p(v^{(t)})^{1/T(t)} \, q(v^* \mid v^{(t)})}\right\}$$

Page 44: Monocular Human Pose Estimation with Bayesian Networks

Annealed Gibbs Sampling (3/4)
• The function T(t) is called the cooling schedule
$$T(t) = T_0 \left(\frac{T_f}{T_0}\right)^{t/n}$$
• The particular value of T at any point in the chain is called the temperature
– T0 is the start temperature
– Tf is the final temperature, reached after n steps
• As the process proceeds, we decrease the probability of down-hill moves
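The geometric cooling schedule of this slide takes only a few lines. The function name and the example temperatures are hypothetical.

```python
def cooling_schedule(t, n, t0, tf):
    """Geometric cooling: T(t) = T0 * (Tf / T0) ** (t / n)."""
    return t0 * (tf / t0) ** (t / n)


# Temperatures for a run of n = 100 steps, cooling from 10.0 down to 0.1.
temperatures = [cooling_schedule(t, 100, 10.0, 0.1) for t in range(101)]
```

The schedule starts at T0 for t = 0, reaches Tf at t = n, and decreases monotonically in between.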

Page 45: Monocular Human Pose Estimation with Bayesian Networks

Annealed Gibbs Sampling (4/4)
• The AG sampler is a stochastic iterative algorithm that converges to the set of points that are the global maxima of the given function
• The advantage of the AG sampler is
– It is more efficient than the Gibbs sampler
• This is because, instead of approximating P(V),
– We want to find the global maximum, i.e., the mode of the posterior distribution
– We run a Markov chain with invariant distribution P(V) and estimate only the global mode
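A simplified sketch of annealing combined with single-variable updates is given below. It is an assumption-laden variant, not the sampler of the talk: each conditional is raised to the power 1/T and sampled directly, without the accept/reject step, and the target `log_p` is a hypothetical function with a single mode at (2, 1).

```python
import math
import random


def log_p(v):
    # Hypothetical unnormalized log-probability with a single mode at (2, 1).
    return -((v[0] - 2) ** 2) - ((v[1] - 1) ** 2)


def annealed_gibbs(log_p, v0, values, n_steps, t0=5.0, tf=0.05, seed=0):
    """Search for the mode by sampling tempered uni-variate conditionals."""
    rng = random.Random(seed)
    v = list(v0)
    for t in range(n_steps):
        temp = t0 * (tf / t0) ** (t / n_steps)   # geometric cooling schedule
        j = t % len(v)                           # change one variable per step
        # Conditional of v_j given the other variables, raised to 1 / T
        logs = [log_p(v[:j] + [x] + v[j + 1:]) / temp for x in values]
        top = max(logs)
        weights = [math.exp(value - top) for value in logs]
        v[j] = rng.choices(values, weights=weights)[0]
    return tuple(v)


mode = annealed_gibbs(log_p, v0=(0, 0), values=[0, 1, 2], n_steps=200)
```

At high temperature the tempered conditionals are nearly uniform, so the chain explores freely; near the final temperature almost all probability mass sits on the conditional maximum, so the chain settles on the mode.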

Page 46: Monocular Human Pose Estimation with Bayesian Networks

6. Feature Extraction
• Human silhouette sampling
• Normalized width
• Normalized center
• Spatial distribution of skin color
• Corners of silhouette
(Figure: silhouette bounding box labeled Width and Length)

Page 47: Monocular Human Pose Estimation with Bayesian Networks

Human Silhouette Sampling (S)
• Human segmentation
• Human silhouette capturing [Suzuki, 1985]
• Uniform sampling is used in human silhouette sampling.
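Uniform sampling of a silhouette can be sketched as follows, assuming the silhouette is already available as an ordered list of contour points. The function name is hypothetical, and spacing is uniform in point index rather than in arc length.

```python
def sample_contour_uniformly(contour, n_points):
    """Pick n_points contour points at (nearly) equal index spacing."""
    if n_points <= 0 or not contour:
        return []
    step = len(contour) / n_points
    return [contour[int(i * step)] for i in range(n_points)]


# Hypothetical 8-point contour of a small square, sampled down to 4 points.
contour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
sampled = sample_contour_uniformly(contour, 4)
```

This keeps the feature vector at a fixed length regardless of how many pixels the extracted contour contains.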

Page 48: Monocular Human Pose Estimation with Bayesian Networks

Normalized Width (wN)
• Human segmentation
• Binary image profile
• Width adjustment
$$w_N = x_R - x_L$$
$$x_L = x \quad \text{if } h_x \ge \text{threshold and } h_{x-1} < \text{threshold}, \quad \text{for } x = 1 \to w$$
$$x_R = x \quad \text{if } h_x \ge \text{threshold and } h_{x+1} < \text{threshold}, \quad \text{for } x = 1 \to w$$
(Figure: "Profile of X coordinate", plotting the pixel accumulation value against the x coordinate of the image, with the normalized width marked; silhouette bounding box labeled Width and Length)
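The width adjustment can be sketched from a binary silhouette mask. The function name, the toy mask, and the choice of the first and last columns whose profile reaches the threshold are assumptions for illustration.

```python
def normalized_width(mask, threshold):
    """Return (x_left, x_right, width) from a binary mask, or None.

    mask is a list of rows of 0/1 pixels; the profile h_x is the number of
    foreground pixels in column x.
    """
    n_cols = len(mask[0])
    profile = [sum(row[x] for row in mask) for x in range(n_cols)]   # h_x
    above = [x for x in range(n_cols) if profile[x] >= threshold]
    if not above:
        return None
    x_left, x_right = above[0], above[-1]
    return x_left, x_right, x_right - x_left


# Toy silhouette: a body in columns 3-5 plus a thin one-pixel part in column 1.
mask = [
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 0],
]
result = normalized_width(mask, threshold=2)
```

Thresholding the profile discards thin protrusions such as an outstretched arm, so the width reflects the main body only.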

Page 49: Monocular Human Pose Estimation with Bayesian Networks

Normalized Center (Nc)
• Boundary adjustment
• Center of new boundary
$$x_N = x_p + 0.5 \, w_N$$
$$y_N = y_p + 0.5 \, L$$
(Figure: silhouette bounding box labeled Width and Length)

Page 50: Monocular Human Pose Estimation with Bayesian Networks

Spatial Distribution of Skin Color (A)
• Skin color detection by GMM → morphology → region segmentation → spatial distribution of skin color

Page 51: Monocular Human Pose Estimation with Bayesian Networks

Corners of Silhouette (C)
• Human segmentation
• Human silhouette capturing
• The level curve curvature approach [Lindeberg, 1998]
• Adaptive corner choice
$$\tilde{I}(x, y) = \arg\max \left( D_y^2 D_{xx} + D_x^2 D_{yy} - 2 D_x D_y D_{xy} \right)$$
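The curvature measure inside the arg max can be sketched with finite differences. This is a simplification under stated assumptions: plain central differences on a small grid stand in for whatever derivative operators the original work used (the Gaussian-smoothed derivatives of the cited approach are omitted), and the function name is hypothetical.

```python
def level_curve_curvature(img, x, y):
    """Central-difference estimate of Dy^2*Dxx + Dx^2*Dyy - 2*Dx*Dy*Dxy.

    img is indexed as img[y][x]; (x, y) must not lie on the image border.
    """
    dx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    dy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    dxx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    dyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return dy * dy * dxx + dx * dx * dyy - 2.0 * dx * dy * dxy


# Test images: a paraboloid I = x^2 + y^2 and a linear ramp I = x.
bowl = [[float(x * x + y * y) for x in range(5)] for y in range(5)]
ramp = [[float(x) for x in range(5)] for y in range(5)]
bowl_value = level_curve_curvature(bowl, 2, 1)
ramp_value = level_curve_curvature(ramp, 2, 2)
```

For the paraboloid the measure equals 8(x² + y²), and for the ramp it is zero because the level curves are straight lines. Corner candidates would be the locations where the measure is largest.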

Page 52: Monocular Human Pose Estimation with Bayesian Networks

7. Experimental Results
• Experimental environment
– CPU: 1.86 GHz, RAM: 1 GB, Visual C++ 6.0
– HumanEva database I

Page 53: Monocular Human Pose Estimation with Bayesian Networks

HumanEva Database I
• Provider:
– Department of Computer Science, Brown University
• Actions of HumanEva I

Action       Description
Walking      Subjects walked in an elliptical path around the capture space.
Jog          Subjects jogged in an elliptical path around the capture space.
Gesture      Subjects performed "hello" and "good-bye" gestures in repetition.
Throw/Catch  Subjects tossed and caught a baseball with the help of the lab assistant.
Box          Subjects imitated boxing.
Combo        Subjects performed combinations of walking and jogging.

Page 54: Monocular Human Pose Estimation with Bayesian Networks

Environment Setting
• 7 cameras
– 3 color cameras (C1, C2, C3)
– 4 gray-level cameras (BW1, BW2, BW3, BW4)
(Figure: floor plan showing the control station and the 2 m × 3 m capture space with the positions of cameras BW1, BW2, C1, BW4, BW3, C2 and C3)

Page 55: Monocular Human Pose Estimation with Bayesian Networks

The Experimental Data
• Our proposed method was trained on 1900 images from the walking sequences of subjects 1 and 2, captured by camera C1
• 200 testing images:
– 100 images from subject 1
– 100 images from subject 2
• Difficulties:
– Self-occlusion
– Clothing variation
– Large variation of joint location

Page 56: Monocular Human Pose Estimation with Bayesian Networks

Evaluation of Accuracy
• Average distance error of poses between estimated results and ground truth
• Let H = {h1, h2, ..., hM}, where hm ∈ R3 (or hm ∈ R2 for the 2D body model), be the position vector of the body pose in the world (or in the image, respectively)
• D(H, H*): the error of the estimated pose H* with respect to the ground truth pose H
$$D(H, H^*) = \sum_{m=1}^{M} \frac{\lVert h_m - h_m^* \rVert}{M}$$
• The average error over all N × T evaluated poses
$$\xi = \frac{1}{NT} \sum_{n=1}^{N} \sum_{t=1}^{T} D(H_{n,t}, H_{n,t}^*)$$
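The error measure of this slide can be sketched as follows. The function names and the toy poses are hypothetical; poses are lists of joint coordinate tuples, and the average is taken over a flat list of pose pairs.

```python
import math


def pose_error(truth, estimate):
    """D(H, H*): mean Euclidean distance between corresponding joints."""
    distances = [math.dist(h, h_star) for h, h_star in zip(truth, estimate)]
    return sum(distances) / len(truth)


def average_error(truth_poses, estimated_poses):
    """Mean of D over a flat list of (ground truth, estimate) pose pairs."""
    errors = [pose_error(h, h_star)
              for h, h_star in zip(truth_poses, estimated_poses)]
    return sum(errors) / len(errors)


# Two-joint toy pose: one joint is off by a distance of 5, the other is exact.
truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
estimate = [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)]
error = pose_error(truth, estimate)
overall = average_error([truth, truth], [estimate, truth])
```

The same functions work for 2D poses, since `math.dist` accepts points of any matching dimension.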

Page 57: Monocular Human Pose Estimation with Bayesian Networks

Performance Comparison Between Two-stage and One-stage Methods
• The AG sampler performs better than the Gibbs sampler
• The two-stage approach performs better than the classical one-stage approach
• The AG sampler takes less inference time

Page 58: Monocular Human Pose Estimation with Bayesian Networks

Effect of Iteration Number on Accuracy

Page 59: Monocular Human Pose Estimation with Bayesian Networks

2D Results of Subject 1
(Figures: ground truth (GT) and annealed Gibbs sampling (AGs) results for frames 1122, 1149, 1172 and 1200)

Page 60: Monocular Human Pose Estimation with Bayesian Networks

2D Results of Subject 2
(Figures: ground truth (GT) and annealed Gibbs sampling (AGs) results for frames 804, 835, 875 and 899)

Page 61: Monocular Human Pose Estimation with Bayesian Networks

3D Results
• Frame 1110 of subject 1
(Figure: two 3D plots, ground truth and AGs estimation result)

Page 62: Monocular Human Pose Estimation with Bayesian Networks

3D Results (Cont.)
• Frame 1135 of subject 1
(Figure: two 3D plots, ground truth and AGs estimation result)

Page 63: Monocular Human Pose Estimation with Bayesian Networks

3D Results (Cont.)
• Frame 845 of subject 2
(Figure: two 3D plots, ground truth and AGs estimation result)

Page 64: Monocular Human Pose Estimation with Bayesian Networks

3D Results (Cont.)
• Frame 872 of subject 2
(Figure: two 3D plots, ground truth and AGs estimation result)

Page 65: Monocular Human Pose Estimation with Bayesian Networks

8. Conclusions
• A markerless and monocular motion capture problem is considered
• The proposed two-stage annealed Gibbs sampling method can estimate more accurate poses with less computation time
• The method can overcome three challenges of the problem
– Self-occlusion
– High-degree variation of joint locations
– Clothing variation

Page 66: Monocular Human Pose Estimation with Bayesian Networks

Future Work
• Use GMM to approximate the prior and posterior distributions of our human models
• Combine model-free and model-based methods to obtain the benefits of both
• Exploit HMM to infer human motions in time series
• Add human part detectors to help locate human joints

Page 68: Monocular Human Pose Estimation with Bayesian Networks

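License Statement for This Presentation
• The content of this presentation is licensed under the Creative Commons "Attribution-NonCommercial 3.0 Taiwan" license.
• Reproduction, distribution, or modification of the content of this presentation for non-commercial purposes is welcome, but please indicate: (1) the name of the original author: Yuan-Kai Wang (王元凱); (2) the license icon:
• Some of the graphics used in this presentation were taken from the Internet and are used only under a claim of fair use by the speaker in talks promoting free software. Readers must not reuse them unless they judge that their own use also qualifies as fair use, and they bear the related legal responsibility themselves.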