Cut-And-Stitch: Efficient Parallel Learning of Linear Dynamical Systems on SMPs Lei Li Computer Science Department School of Computer Science Carnegie.

1

Cut-And-Stitch: Efficient Parallel Learning of Linear Dynamical Systems on SMPs

Lei Li

Computer Science Department, School of Computer Science

Carnegie Mellon University

[email protected]

2

Motion stitching via effort minimization, with James McCann, Nancy Pollard and Christos Faloutsos [Eurographics 2008]

Parallel learning of linear dynamical systems, with Wenjie Fu, Fan Guo, Todd Mowry and Christos Faloutsos [KDD 2008]

3

Background

• Motion Capture

• Markers are placed on the human body; optical cameras capture the marker positions, which are then translated into body-local coordinates.

• Application:

– Movie, game, and medical industries

4

Outline

• Background

• Motivation: effortless motion stitching

• Parallel learning with Cut-And-Stitch

• Experiments and Results

• Conclusion

5

Motivation

• Given two human motion sequences, how can we stitch them together so that the result looks natural to the human eye? (e.g. walking to running)

• Given a human motion sequence, how can we find the most natural stitchable motion in a motion capture database?

6

Intuition

• Intuition:

– Laziness is a virtue: natural motion uses minimum energy

• Laziness-score (L-score) = energy used during stitching

• Objective:

– Minimize the laziness-score

7

Example

[Figure: motion frames labeled "Taking off" and "Landing"]

8

Example, Natural stitching

[Figure: motion frames labeled "Taking off" and "Landing"]

9

But, how about this way?

[Figure: motion frames labeled "Taking off" and "Landing"]

10

Observations

• Naturalness depends on smoothness

• Naturalness also depends on motion speed

11

Proposed Method

• Estimate stitching path using Linear Dynamical Systems

12

Proposed Method (cont’)

• Estimate the velocity and acceleration during the stitching, compute energy (defined as L-score)

13

Proposed Method (cont’)

• Minimize the L-score over all possible stitching hops (defined as the elastic L-score)
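The idea of scoring a stitch by its energy and then minimizing over hops can be sketched as follows. This is only an illustration under loud assumptions: the slides do not give the exact L-score formula, so the sum of squared accelerations (second differences) is used as a stand-in for energy, and linear interpolation replaces the LDS-estimated stitching path. The names `l_score`, `stitch`, and `elastic_l_score` are hypothetical, and each motion is reduced to a single 1-D joint coordinate.

```python
def l_score(path, dt=1.0):
    """Energy proxy (assumption): sum of squared accelerations,
    with acceleration estimated by second differences."""
    return sum(((path[i + 1] - 2 * path[i] + path[i - 1]) / dt ** 2) ** 2
               for i in range(1, len(path) - 1))


def stitch(a, b, hops):
    """Bridge the gap between a and b with `hops` in-between frames.
    Linear interpolation is a stand-in for the LDS-estimated path."""
    start, end = a[-1], b[0]
    bridge = [start + (end - start) * k / (hops + 1) for k in range(1, hops + 1)]
    return a + bridge + b


def elastic_l_score(a, b, max_hops):
    """Return (best score, best number of hops) over 0..max_hops."""
    return min((l_score(stitch(a, b, h)), h) for h in range(max_hops + 1))


# Both clips move at speed 1 per frame, so three in-between frames
# (3, 4, 5) join them without any change in velocity.
print(elastic_l_score([0, 1, 2], [6, 7, 8], max_hops=5))
```

With too few hops the bridge has to move faster than either clip, and with too many it has to slow down, so the proxy penalizes both. This matches the observation that naturalness depends on motion speed as well as smoothness.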

14

Example stitching

• Link to video

15


16

Parallel Learning for LDS

• Challenge:

– Learning a Linear Dynamical System is slow for long sequences

• Traditional method:

– Maximum Likelihood Estimation via the Expectation-Maximization (EM) algorithm

• Objective:

– Parallelize the learning algorithm

• Assumption:

– Shared-memory architecture

17

Linear Dynamical System (a.k.a. Kalman Filter)

• Parameters: θ = (u0, V0, A, Γ, C, Σ)

• Observations: y1 … yn

• Hidden variables: z1 … zn

• Model:

– z1 ~ N(u0, V0)

– z(n+1) ~ N(A∙zn, Γ)

– yn ~ N(C∙zn, Σ)

[Figure: chain graphical model z1 → z2 → z3 → z4 → z5, where each zn emits the observation yn]
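To make the generative model concrete, here is a minimal sketch that draws samples from a scalar (one-dimensional) LDS with the parameters θ = (u0, V0, A, Γ, C, Σ). The scalar restriction, the function name `sample_lds`, and the parameter values in the example call are assumptions made for illustration. Real motion data is multi-dimensional, with matrices in place of the scalars.

```python
import random


def sample_lds(n, u0, V0, A, Gamma, C, Sigma, seed=0):
    """Draw n hidden states z and observations y from a scalar LDS."""
    rng = random.Random(seed)
    zs, ys = [], []
    z = rng.gauss(u0, V0 ** 0.5)                    # z1 ~ N(u0, V0)
    for _ in range(n):
        zs.append(z)
        ys.append(rng.gauss(C * z, Sigma ** 0.5))   # yn ~ N(C zn, Sigma)
        z = rng.gauss(A * z, Gamma ** 0.5)          # z(n+1) ~ N(A zn, Gamma)
    return zs, ys


# Hypothetical parameter values, chosen only for the demonstration.
zs, ys = sample_lds(5, u0=0.0, V0=1.0, A=0.9, Gamma=0.1, C=1.0, Sigma=0.2)
print(len(zs), len(ys))
```

Note that `random.gauss` takes a standard deviation, so the variances are square-rooted before use.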

18

Example

Given the observed positions, estimate the dynamics (i.e. the parameters).

[Figure: hidden states z1 … z6 with observations y1 … y6; x-axis: Time, y-axis: Position of left elbow]

19

Traditional: How to learn LDS?

Sequential Learning (EM)

Compute P(z1 | y1)

[Figure: chain z1 … z6 with observations y1 … y6, and a plot of measured and estimated values over time]


21


From P(z1 | y1), compute P(z2 | y1, y2)

Intuition: z2 may be close to z1

From P(z2 | y1, y2), compute P(z3 | y1, y2, y3)

From P(z3 | y1, y2, y3), compute P(z4 | y1, y2, y3, y4)

From P(z4 | y1, y2, y3, y4), compute P(z5 | y1, y2, y3, y4, y5)

From P(z5 | y1, y2, y3, y4, y5), compute P(z6 | y1, y2, y3, y4, y5, y6)

[Figure: at each step one more estimate is added to the plot of measured and estimated values over time]
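The forward recursion above is the standard Kalman filter. The sketch below shows it for a scalar LDS. The scalar restriction and the name `kalman_filter` are assumptions for illustration, not taken from the authors' software. Each loop iteration depends on the result of the previous one, which is exactly the sequential dependency discussed later.

```python
def kalman_filter(ys, u0, V0, A, Gamma, C, Sigma):
    """Return the means and variances of P(z_n | y_1..y_n) for a scalar LDS."""
    means, variances = [], []
    pred_mean, pred_var = u0, V0              # prior on z1
    for y in ys:
        # Correct the prediction with the new observation y.
        gain = pred_var * C / (C * C * pred_var + Sigma)
        mean = pred_mean + gain * (y - C * pred_mean)
        var = (1.0 - gain * C) * pred_var
        means.append(mean)
        variances.append(var)
        # Predict the next hidden state from the current posterior.
        pred_mean = A * mean
        pred_var = A * A * var + Gamma
    return means, variances


# Hypothetical parameters and observations.
means, variances = kalman_filter([1.0, 1.0], u0=0.0, V0=1.0,
                                 A=1.0, Gamma=0.5, C=1.0, Sigma=1.0)
print(means, variances)
```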


26

From P(z6 | y1, y2, y3, y4, y5, y6), compute P(z5 | y1, y2, y3, y4, y5, y6)

Intuition: take the future backward

[Figure: the backward pass continues in the same way for the earlier hidden states, adjusting one estimate per step in the plot of measured and estimated values over time]
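The backward pass corresponds to the Rauch-Tung-Striebel smoother, which revises each filtered estimate using the already-smoothed estimate of its successor. Below is a minimal scalar sketch. The function name `rts_smoother` is hypothetical, and the filtered values in the example are illustrative inputs that a forward pass would normally supply.

```python
def rts_smoother(f_means, f_vars, A, Gamma):
    """Turn filtered estimates P(z_i | y_1..y_i) into smoothed
    estimates P(z_i | y_1..y_n) for a scalar LDS."""
    s_means, s_vars = list(f_means), list(f_vars)   # the last state is already smoothed
    for i in range(len(f_means) - 2, -1, -1):
        pred_var = A * A * f_vars[i] + Gamma        # Var(z_{i+1} | y_1..y_i)
        J = f_vars[i] * A / pred_var                # smoother gain
        s_means[i] = f_means[i] + J * (s_means[i + 1] - A * f_means[i])
        s_vars[i] = f_vars[i] + J * J * (s_vars[i + 1] - pred_var)
    return s_means, s_vars


# Illustrative filtered values (assumed to come from a forward pass).
s_means, s_vars = rts_smoother([0.5, 0.75], [0.5, 0.5], A=1.0, Gamma=0.5)
print(s_means, s_vars)
```

Like the forward pass, this loop is inherently sequential: step i cannot start until step i + 1 has finished.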



From all posteriors of z1, z2, z3, z4, z5, z6:

P(z1 | y1, y2, y3, y4, y5, y6), P(z2 | y1, y2, y3, y4, y5, y6), …

compute the sufficient statistics E[zi], E[zi zi'], E[zi-1 zi']
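For a scalar LDS the three sufficient statistics follow directly from the smoothed means, variances, and lag-one covariances. The sketch below assumes those quantities are already available (for example from a smoother). The name `sufficient_statistics`, the argument layout, and the numbers in the example call are hypothetical.

```python
def sufficient_statistics(s_means, s_vars, cross_covs):
    """Compute E[z_i], E[z_i^2] and E[z_{i-1} z_i] for a scalar LDS.

    cross_covs[i] is Cov(z_i, z_{i+1} | y_1..y_n), assumed given.
    """
    Ez = list(s_means)
    # E[z^2] = Var(z) + E[z]^2
    Ezz = [v + m * m for m, v in zip(s_means, s_vars)]
    # E[z_i z_{i+1}] = Cov(z_i, z_{i+1}) + E[z_i] E[z_{i+1}]
    Ezz_prev = [c + s_means[i] * s_means[i + 1]
                for i, c in enumerate(cross_covs)]
    return Ez, Ezz, Ezz_prev


# Illustrative smoothed values for a two-step sequence.
print(sufficient_statistics([0.625, 0.75], [0.375, 0.5], [0.25]))
```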

32


With the sufficient statistics, compute θ ← argmax_θ likelihood(θ)

[Figure: measured values and the reconstructed signal over time]
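In the scalar case the maximization has a closed form. The sketch below uses the standard textbook EM updates for an LDS, specialized to one dimension. It is not claimed to match the authors' implementation, the name `m_step` is hypothetical, and it needs at least two time steps.

```python
def m_step(ys, Ez, Ezz, Ezz_prev):
    """Closed-form parameter updates for a scalar LDS (requires len(ys) >= 2).

    Ez[i] = E[z_i], Ezz[i] = E[z_i^2], Ezz_prev[i] = E[z_i z_{i+1}].
    """
    n = len(ys)
    u0 = Ez[0]
    V0 = Ezz[0] - Ez[0] ** 2
    A = sum(Ezz_prev) / sum(Ezz[:-1])
    Gamma = sum(Ezz[i] - 2 * A * Ezz_prev[i - 1] + A * A * Ezz[i - 1]
                for i in range(1, n)) / (n - 1)
    C = sum(y * m for y, m in zip(ys, Ez)) / sum(Ezz)
    Sigma = sum(y * y - 2 * C * y * m + C * C * s
                for y, m, s in zip(ys, Ez, Ezz)) / n
    return u0, V0, A, Gamma, C, Sigma


# Noise-free toy data: z halves at every step and y = 2 z,
# so the updates should recover A = 0.5 and C = 2 with zero noise.
zs = [1.0, 0.5, 0.25]
ys = [2.0, 1.0, 0.5]
stats = (zs, [z * z for z in zs], [zs[i] * zs[i + 1] for i in range(2)])
print(m_step(ys, *stats))
```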


34

How to parallelize it?

Speed Bottleneck: sequential computation of posterior

[Figure: chain of hidden states z1 … z6 with observations y1 … y6]

35

“Leap of faith”

start computation without feedback from previous node (cut),

and reconcile later (stitch)

36

Proposed Method: Cut-And-Stitch

[Figure: the chain z1 … z6 is cut into three blocks, with extra nodes z'2 and z'4 at the block boundaries; block k has its own parameters υk, Φk, ηk, Ψk for k = 1, 2, 3]

• Start computation without feedback from the previous node (cut)

• Reconcile later (stitch)

Cut-And-Stitch

Cut step: Estimate posteriors (E)

Compute P(z1 | y1), P(z3 | y3), P(z5 | y5)

Intuition: compute all three at once

[Figure: blocks with parameters υk, Φk, ηk, Ψk, and a plot of measured and estimated values; x-axis: Time, y-axis: Position of left elbow]


Intuition: backward adjust all at once

[Figure: measured and estimated values over time, with the estimates of all blocks updated simultaneously]
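The cut step can be sketched as running an independent forward and backward pass on every block at the same time. The code below is a simplified illustration, not the authors' C++/OpenMP implementation. It assumes a scalar LDS, treats (υk, Φk) as the prior on the first hidden state of block k, and leaves out the backward block parameters (ηk, Ψk) entirely, so each block's backward pass simply starts from its own last filtered estimate. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def filter_block(ys, upsilon, Phi, A, Gamma, C, Sigma):
    """Forward pass on one block, started from the block prior (upsilon, Phi)."""
    means, variances = [], []
    pm, pv = upsilon, Phi
    for y in ys:
        k = pv * C / (C * C * pv + Sigma)
        m, v = pm + k * (y - C * pm), (1.0 - k * C) * pv
        means.append(m)
        variances.append(v)
        pm, pv = A * m, A * A * v + Gamma
    return means, variances


def smooth_block(means, variances, A, Gamma):
    """Backward pass on one block (assumption: no message from the next block)."""
    sm, sv = list(means), list(variances)
    for i in range(len(means) - 2, -1, -1):
        pv = A * A * variances[i] + Gamma
        j = variances[i] * A / pv
        sm[i] = means[i] + j * (sm[i + 1] - A * means[i])
        sv[i] = variances[i] + j * j * (sv[i + 1] - pv)
    return sm, sv


def cut_step(ys, block_priors, params, block_len):
    """Cut ys into blocks and process all blocks concurrently.

    params = (A, Gamma, C, Sigma); block_priors[k] = (upsilon_k, Phi_k).
    """
    blocks = [ys[i:i + block_len] for i in range(0, len(ys), block_len)]

    def run(job):
        block, (upsilon, Phi) = job
        fm, fv = filter_block(block, upsilon, Phi, *params)
        return smooth_block(fm, fv, params[0], params[1])

    with ThreadPoolExecutor() as pool:
        return list(pool.map(run, zip(blocks, block_priors)))


# Six observations cut into three blocks of two, with hypothetical priors.
params = (1.0, 0.5, 1.0, 1.0)
print(cut_step([1.0] * 6, [(0.0, 1.0)] * 3, params, block_len=2))
```

Python threads only illustrate the structure here, since the interpreter lock prevents a real speedup for this kind of arithmetic. On a shared-memory machine the same loop over blocks would typically become a parallel loop, with one block per core.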


Cut-And-Stitch

Stitch step: Collect sufficient statistics (C), maximize parameters (M)

[Figure: blocks with parameters υk, Φk, ηk, Ψk, and a plot of measured values over time]



Stitch together: Re-estimate block parameters (R)

Intuition: exchange messages across blocks

Iterate…

[Figure: measured values and the reconstructed signal over time]
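One way to picture the message exchange is to let each block take its new starting belief from the end of the block before it. The sketch below is a simplified stand-in for the re-estimation step and not necessarily the update used in the original software. It assumes a scalar LDS, it only refreshes the forward block parameters (υk, Φk), and it omits the backward parameters (ηk, Ψk). The name `reestimate_block_priors` is hypothetical.

```python
def reestimate_block_priors(last_filtered, first_prior, A, Gamma):
    """Derive new (upsilon_k, Phi_k) for every block.

    last_filtered[k] is the (mean, variance) of the last filtered hidden
    state in block k. Block 0 keeps the global prior; every later block
    receives its predecessor's last state pushed one step through the
    dynamics z_next ~ N(A z, Gamma).
    """
    priors = [first_prior]
    for mean, var in last_filtered[:-1]:
        priors.append((A * mean, A * A * var + Gamma))
    return priors


# Hypothetical end-of-block estimates for three blocks.
ends = [(0.75, 0.5), (0.6, 0.4), (0.2, 0.3)]
print(reestimate_block_priors(ends, first_prior=(0.0, 1.0), A=1.0, Gamma=0.5))
```

Because every block reads only its neighbour's result from the previous iteration, all the updates can again run in parallel, and repeated iterations let information spread across block boundaries.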


43


44

Experiments

Q1: How much speed up can we get?

Q2: How good is the reconstruction accuracy?
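Both questions reduce to simple ratios, sketched below. Speedup is the sequential running time divided by the parallel running time. The slides do not define the Normalized Reconstruction Error, so the formula used here (root of the squared error divided by the squared signal) is an assumption, as are the function names and the example numbers.

```python
import math


def speedup(t_sequential, t_parallel):
    """How many times faster the parallel run is."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, workers):
    """Speedup per worker; 1.0 corresponds to the ideal line."""
    return speedup(t_sequential, t_parallel) / workers


def normalized_reconstruction_error(measured, reconstructed):
    """Assumed definition: sqrt(sum (y - y_hat)^2 / sum y^2)."""
    num = sum((y - r) ** 2 for y, r in zip(measured, reconstructed))
    den = sum(y ** 2 for y in measured)
    return math.sqrt(num / den)


# Hypothetical timings in seconds and a toy signal.
print(speedup(10.0, 2.5), efficiency(10.0, 2.5, workers=4))
print(normalized_reconstruction_error([3.0, 4.0], [3.0, 4.0]))
```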

45

Experiments

• Dataset:

– 58 human motion sequences, 200–500 frames

– Each frame has 93 bone positions in body-local coordinates

– http://mocap.cs.cmu.edu

• Setup:

– Supercomputer: SGI Altix system, distributed shared memory architecture

– Multi-core desktop: 4 Intel Xeon cores, shared memory

• Task:

– Learn the dynamics and hidden variables, and reconstruct the motion

46

Q1: How much speed up? Supercomputer Result

[Figure: speedup vs. # of processors; curves: ideal, average of 58]

47

Q1: How much speed up? Multi-core Result

[Figure: speedup vs. # of cores; curves: ideal, average of 58]

48

Q2: How good?

[Figure: bar chart of Normalized Reconstruction Error (axis from 0.000% to 2.500%) for walking (#22), jumping (#1), and running (#45); series: Sequential alg, Cut-And-Stitch]

Result: ~ IDENTICAL accuracy

49

Conclusion & Contributions

• A distance function for motion stitching

– Based on first principle: minimize effort

• General approximate parallel learning algorithm for LDS

– Near-linear speedup

– Accuracy (NRE): ~ identical to sequential learning

– Easily extended to HMM and other chain Markovian models

• Software (C++ with OpenMP) and datasets: http://www.cs.cmu.edu/~leili/paralearn

50

Promising Extensions

• Extension:

– HMM

– Other Markov models (similar graphical model)

• Open problem:

– Can we prove an error bound?

51

Thank you

• Questions
