A Family of MCMC Methods on Implicitly Defined Manifolds

Marcus A. Brubaker+, Mathieu Salzmann and Raquel Urtasun

Toyota Technological Institute at Chicago
+ University of Toronto, Canada

Introduction:
•  Traditional MCMC methods (e.g., Gauss-Metropolis, HMC) assume the target distribution is over a Euclidean space
•  However, many problems are most naturally characterized over a non-linear manifold
•  Sampling from posteriors that arise in such problems has typically required the derivation of posterior-specific sampling schemes

Contributions:
•  Here we derive an MCMC scheme based on Hamiltonian dynamics on an implicitly defined manifold
•  We prove that, subject to suitable conditions, the Markov chain converges to the target posterior
•  We present constrained variants of several MCMC methods, including Gauss-Metropolis, Hamiltonian (and Langevin) Monte Carlo and Riemann Manifold HMC [6]
•  These algorithms are demonstrated on a range of problems, including:
   o  Sampling from a linearly constrained Gaussian distribution
   o  Sampling from the Bingham-von Mises-Fisher distribution over $S^n$
   o  Bayesian matrix factorization for collaborative filtering
   o  Human pose estimation
•  Matlab code available from: http://www.cs.toronto.edu/~mbrubake/

Previous Work:
•  Similar methods are commonly used in molecular dynamics to compute the free energy of a constrained system (e.g., [1-3])
•  Gibbs samplers have been derived for some distributions (e.g., [4]), but even those specialized methods are outperformed by the methods presented here

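Notation: $\mathcal{M} = \{q \in \mathbb{R}^n \mid c(q) = 0\}$ is the implicitly defined manifold, $\pi(q)$ is the target posterior, and $S^n$ is the manifold of the Bingham-von Mises-Fisher experiment.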
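To make the implicit definition concrete, here is a minimal Python sketch of a manifold given only by a constraint function c(q) and its Jacobian. The unit sphere, the helper names (`c_sphere`, `C_sphere`, `on_manifold`, `tangent_project`) and the test point are illustrative assumptions, not taken from the poster.

```python
import numpy as np

def c_sphere(q):
    """Constraint function c(q) = q^T q - 1; its zero set is the unit sphere."""
    return np.array([q @ q - 1.0])

def C_sphere(q):
    """Jacobian C(q) = dc/dq, a 1 x n matrix for a single constraint."""
    return 2.0 * q[None, :]

def on_manifold(q, c, tol=1e-10):
    """True when q satisfies c(q) = 0 up to the tolerance tol."""
    return bool(np.max(np.abs(c(q))) < tol)

def tangent_project(v, q, C):
    """Orthogonally project v onto the tangent space {v : C(q) v = 0}."""
    Cq = C(q)
    return v - Cq.T @ np.linalg.solve(Cq @ Cq.T, Cq @ v)

# Illustrative point on the sphere and an arbitrary vector projected to its tangent space.
q = np.array([0.6, 0.0, 0.8])
v = tangent_project(np.array([1.0, 2.0, 3.0]), q, C_sphere)
```

Only `c` and its Jacobian are needed, which is what lets the same sampler code run on different manifolds by swapping these two functions.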

Experimental Results:
•  Gaussian distribution in a linear subspace
•  Bingham-von Mises-Fisher
•  Collaborative filtering
•  Human pose estimation
   o  Pose is a set of 3D joint positions
   o  Manifold is induced by the limb length constraints of the skeleton
   o  Posterior combines noisy 2D joint projections with a PCA-based prior model of pose
   o  Compared with direct optimization for different levels of noise
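The limb-length constraints of the pose experiment can be written as an implicit constraint function on the stacked joint positions. The sketch below assumes squared-distance constraints and a made-up three-joint chain (`LIMBS`, `LENGTHS`, `q_pose` are hypothetical); the poster does not state the exact form used.

```python
import numpy as np

# Hypothetical 3-joint chain: limb (0, 1) and limb (1, 2), lengths in mm.
LIMBS = [(0, 1), (1, 2)]
LENGTHS = np.array([300.0, 250.0])

def c_limbs(q):
    """c(q): one entry per limb, squared joint distance minus squared limb length."""
    x = q.reshape(-1, 3)                       # rows are 3D joint positions
    d = np.array([x[a] - x[b] for a, b in LIMBS])
    return np.sum(d * d, axis=1) - LENGTHS ** 2

def C_limbs(q):
    """Jacobian C(q) = dc/dq, shape (number of limbs, 3 * number of joints)."""
    x = q.reshape(-1, 3)
    J = np.zeros((len(LIMBS), q.size))
    for k, (a, b) in enumerate(LIMBS):
        diff = 2.0 * (x[a] - x[b])
        J[k, 3 * a:3 * a + 3] = diff           # derivative w.r.t. joint a
        J[k, 3 * b:3 * b + 3] = -diff          # derivative w.r.t. joint b
    return J

# A pose that satisfies both limb-length constraints.
q_pose = np.array([0.0, 0.0, 0.0,   300.0, 0.0, 0.0,   300.0, 250.0, 0.0])
```

Squared distances keep c(q) polynomial, so the Jacobian stays well defined even when two joints coincide; this is a design choice of the sketch.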

References:
1. G. Ciccotti and J. P. Ryckaert. Molecular dynamics simulation of rigid molecules. Computer Physics Reports, 4(6):346–392, 1986
2. C. Hartmann. An ergodic sampling scheme for constrained Hamiltonian systems with applications to molecular dynamics. Journal of Statistical Physics, 130:687–711, 2008
3. T. Lelièvre, M. Rousset, and G. Stoltz. Free energy computations: A Mathematical Perspective. Imperial College Press, 2010
4. P. D. Hoff. Simulation of the matrix Bingham-von Mises-Fisher distribution, with applications to multivariate and relational data. Journal of Computational and Graphical Statistics, 18:438–456, 2009
5. E. Hairer, C. Lubich, and G. Wanner. Geometric Numerical Integration. Springer, 2nd edition, 2006
6. M. Girolami and B. Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B, 73:123–214, 2011

[Figure: legend CHMC (L = 4), CHMC (L = 3), CHMC (L = 2), CLangevin, CMetropolis, Gibbs; axis labels not recoverable]

[Figure: mean joint error [mm] vs. frame #; legend Constr opt, Ours MAP, Ours mean]

[Figure: mean joint error [mm] vs. noise std; legend Constr opt, Ours MAP, Ours mean]

Theoretical Result:
•  Assume that $\mathcal{M} = \{q \in \mathbb{R}^n \mid c(q) = 0\}$ is connected, smooth and differentiable with $C(q) = \frac{\partial c}{\partial q}$ full-rank everywhere, and the target posterior $\pi(q)$ is strictly positive on $\mathcal{M}$
•  Given:
   o  a mass matrix $M(q)$ which is positive definite on $\mathcal{M}$
   o  a simulation potential energy function $U(q)$ which is $C^2$ continuous
   o  a numerical integration method $\Phi_h^{\hat{H}} : T^*\mathcal{M} \to T^*\mathcal{M}$ which is symmetric, locally accessible, consistent with the Simulation Hamiltonian $\hat{H}$, and symplectic on the co-tangent bundle
      $$T^*\mathcal{M} = \left\{ (p, q) \;\middle|\; c(q) = 0 \text{ and } C(q) \frac{\partial \hat{H}}{\partial p}(p, q) = 0 \right\}$$
•  Theorem: For all $q_0 \in \mathcal{M}$,
   $$\lim_{n \to \infty} \left\| T^n(q_0 \to \cdot) - \pi(\cdot) \right\| = 0$$
   where $T^n(q_0 \to \cdot)$ denotes $n$ steps of the Markov transition kernel of the Constrained Hamiltonian Monte Carlo algorithm

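Simulation of constrained Hamiltonian systems:
•  Need a symplectic, consistent and symmetric integration method on $\mathcal{M}$
•  Generalized RATTLE Algorithm (see [5] for details and other options)

$$
\begin{aligned}
p_{1/2} &= p_0 - \frac{h}{2}\left( \frac{\partial H(p_{1/2}, q_0)}{\partial q} + C(q_0)^T \lambda \right) \\
q_1 &= q_0 + \frac{h}{2}\left( \frac{\partial H(p_{1/2}, q_0)}{\partial p} + \frac{\partial H(p_{1/2}, q_1)}{\partial p} \right) \\
0 &= c(q_1) \\
p_1 &= p_{1/2} - \frac{h}{2}\left( \frac{\partial H(p_{1/2}, q_1)}{\partial q} + C(q_1)^T \mu \right) \\
0 &= C(q_1) \frac{\partial H(p_1, q_1)}{\partial p}
\end{aligned}
$$

•  If $\mathcal{M} = \mathbb{R}^n$ and the mass matrix is constant, RATTLE reduces to Leapfrog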
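The RATTLE equations above simplify considerably when the mass matrix is the identity, because then dH/dp = p and dH/dq = grad U(q). The Python sketch below covers only that special case, solving for the multiplier lambda with a Newton iteration; the function name `rattle_step`, the iteration limits and the sphere example are assumptions for illustration, not the authors' Matlab code.

```python
import numpy as np

def rattle_step(p0, q0, h, grad_U, c, C, iters=50, tol=1e-12):
    """One RATTLE step for H(p, q) = p^T p / 2 + U(q) under the constraint c(q) = 0."""
    C0 = C(q0)
    lam = np.zeros(C0.shape[0])
    p_free = p0 - 0.5 * h * grad_U(q0)           # half kick without the constraint force
    for _ in range(iters):
        p_half = p_free - 0.5 * h * C0.T @ lam   # p_{1/2}
        q1 = q0 + h * p_half                     # dH/dp = p for identity mass
        g = c(q1)
        if np.max(np.abs(g)) < tol:              # 0 = c(q_1) is satisfied
            break
        # Newton update, using dg/dlam = -(h^2 / 2) C(q1) C(q0)^T
        lam = lam + np.linalg.solve(0.5 * h ** 2 * C(q1) @ C0.T, g)
    else:
        raise RuntimeError("Newton iteration for lambda did not converge")
    C1 = C(q1)
    p_tmp = p_half - 0.5 * h * grad_U(q1)
    # mu solves 0 = C(q1) p1 with p1 = p_tmp - (h / 2) C(q1)^T mu (a linear system)
    mu = np.linalg.solve(0.5 * h * C1 @ C1.T, C1 @ p_tmp)
    p1 = p_tmp - 0.5 * h * C1.T @ mu
    return p1, q1

# Usage on the unit sphere with a linear potential U(q) = -d^T q (illustrative values).
c_sphere = lambda q: np.array([q @ q - 1.0])
C_sphere = lambda q: 2.0 * q[None, :]
d = np.array([0.5, -1.0, 0.2])
grad_U = lambda q: -d
q0 = np.array([0.0, 0.0, 1.0])
p0 = np.array([1.0, 0.3, 0.0])                   # tangent at q0 since C(q0) p0 = 0
p1, q1 = rattle_step(p0, q0, 0.1, grad_U, c_sphere, C_sphere)
```

For a position-dependent mass matrix the first two equations become implicit in p_{1/2} as well, and a joint nonlinear solve is needed; see [5] for that general case.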

Instances of Constrained HMC:
•  Gauss-Metropolis with covariance $\Sigma$ can be expressed as HMC with $U(q) = 0$ and $M(q) = \Sigma^{-1}$. Constrained Gauss-Metropolis is thus similarly defined.
•  Constrained Langevin Monte Carlo arises with $L = 1$
•  Constrained Riemann Manifold HMC [6] arises for suitable choices of $M(q)$
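The first instance can be checked in the unconstrained case: with U(q) = 0 and M = Sigma^{-1}, a single leapfrog step leaves the momentum unchanged and moves the position by h M^{-1} p0, which is a Gaussian random-walk proposal with covariance h^2 Sigma. The sketch below is a hedged illustration; the function names and the use of one step of size h are assumptions of the example.

```python
import numpy as np

def gauss_metropolis_proposal(q0, Sigma, h, rng):
    """One leapfrog step with U(q) = 0 and mass matrix M = Sigma^{-1}."""
    M = np.linalg.inv(Sigma)
    p0 = rng.multivariate_normal(np.zeros(len(q0)), M)   # p0 ~ N(0, M)
    # With U = 0 the momentum half-kicks vanish, so only the position moves.
    return q0 + h * np.linalg.solve(M, p0)               # q1 = q0 + h M^{-1} p0

def proposal_covariance(Sigma, h):
    """Exact covariance of q1 - q0, computed from the linear map applied to z ~ N(0, I)."""
    M = np.linalg.inv(Sigma)
    chol_M = np.linalg.cholesky(M)        # p0 = chol_M z
    A = h * Sigma @ chol_M                # q1 - q0 = A z
    return A @ A.T                        # equals h^2 Sigma
```

Setting h = 1 recovers a proposal with covariance exactly Sigma.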

[Figure: three panels titled CHMC, CLangevin and CMetropolis; axis labels not recoverable]

Bingham-von Mises-Fisher: $\mathcal{M} = S^n$, $\pi(q) \propto \exp(d^T q + q^T A q)$

Method         E[− log π(q)]   ESS %   ESS/second
CHMC (L = 4)   -999.021        27.3    183.756
CHMC (L = 3)   -998.759        25.4    217.427
CHMC (L = 2)   -999.121        37.9    440.898
CLangevin      -998.757        33.0    619.339
CMetropolis    -998.82          3.8     90.1513
Gibbs [4]      -998.742        50.8    160.722
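For reference, the Bingham-von Mises-Fisher log density and the gradient an HMC-style sampler would need can be sketched in a few lines of Python. The parameter values for `d` and `A` and the helper names are illustrative assumptions; the poster does not list the parameters used in the experiment.

```python
import numpy as np

def log_pi_bvmf(q, d, A):
    """Unnormalized log density d^T q + q^T A q."""
    return d @ q + q @ A @ q

def grad_log_pi_bvmf(q, d, A):
    """Euclidean gradient of the log density; (A + A^T) q also covers non-symmetric A."""
    return d + (A + A.T) @ q

def grad_U_tangent(q, d, A):
    """Gradient of U = -log pi projected onto the tangent space of the unit sphere at q."""
    g = -grad_log_pi_bvmf(q, d, A)
    return g - (g @ q) * q

d = np.array([1.0, 0.0, -0.5])            # illustrative parameters, not from the poster
A = np.diag([2.0, 0.5, -1.0])
q = np.array([0.6, 0.0, 0.8])             # a unit vector
```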

Collaborative filtering: $\mathcal{M} = V_r(\mathbb{R}^N) \times V_r(\mathbb{R}^M) \times \mathbb{R}^r$

$$\pi(U, S, V) \propto \prod_{(i,j) \in E} \exp\left( -\frac{\left( f(U_i S V_j) - Y_{i,j} \right)^2}{2\sigma_p^2} \right)$$

                1M Movie Lens (RMSE)                             EachMovie (RMSE)
r         5               10              15              5               10              15
HMC       1.577 ± 0.39    2.001 ± 0.66    2.306 ± 0.25    1.153 ± 0.002   1.161 ± 0.002   1.204 ± 0.018
HMC-l     0.909 ± 0.008   0.949 ± 0.01    0.99 ± 0.01     1.155 ± 0.007   1.164 ± 0.001   1.184 ± 0.004
CHMC      0.893 ± 0.01    0.888 ± 0.01    0.889 ± 0.01    1.144 ± 0.002   1.121 ± 0.001   1.116 ± 0.001
CHMC-l    0.888 ± 0.01    0.881 ± 0.01    0.881 ± 0.01    1.137 ± 0.003   1.115 ± 0.002   1.11 ± 0.002
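The factor manifolds in the collaborative filtering model can be expressed through orthonormality constraints. The Python sketch below assumes that V_r(R^N) denotes the Stiefel manifold of N x r matrices with orthonormal columns, that S acts as diag(s), and that U_i and V_j are rows of U and V; the link function f defaults to the identity here, and the ratings are made up. None of these choices is confirmed by the poster.

```python
import numpy as np

def c_stiefel(U):
    """Orthonormality constraints U^T U - I = 0, upper triangle only (r(r+1)/2 equations)."""
    r = U.shape[1]
    G = U.T @ U - np.eye(r)
    return G[np.triu_indices(r)]

def log_likelihood(U, s, V, ratings, sigma_p, f=lambda x: x):
    """Sum over observed (i, j, y) of -(f(U_i S V_j) - y)^2 / (2 sigma_p^2), with S = diag(s)."""
    total = 0.0
    for i, j, y in ratings:
        pred = f(np.sum(U[i] * s * V[j]))
        total -= (pred - y) ** 2 / (2.0 * sigma_p ** 2)
    return total

rng = np.random.default_rng(0)
N, M, r = 6, 5, 2
U, _ = np.linalg.qr(rng.standard_normal((N, r)))    # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((M, r)))
s = np.array([3.0, 1.5])
ratings = [(0, 1, 4.0), (2, 3, 2.0), (5, 0, 5.0)]   # made-up (user, item, rating) triples
```

Using only the upper triangle of U^T U - I avoids duplicated constraints, which keeps the constraint Jacobian full rank as the convergence result requires.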

Constrained Hamiltonian Monte Carlo:
•  Input: $q_0, M(q), h, L, \pi(q), U(q)$
•  Define:
   o  Co-tangent Projection:
      $$P(q) = I - M(q)^{-T} C(q)^T \left( C(q) M(q)^{-1} M(q)^{-T} C(q)^T \right)^{-1} C(q) M(q)^{-1}$$
   o  Acceptance Hamiltonian:
      $$H(p, q) = \tfrac{1}{2} p^T M(q)^{-1} p + \tfrac{1}{2} \log \left| 2\pi P(q)^T M(q) P(q) \right| - \log \pi(q)$$
   o  Simulation Hamiltonian:
      $$\hat{H}(p, q) = \tfrac{1}{2} p^T M(q)^{-1} p + U(q)$$
1.  $p'_0 \sim \mathcal{N}(0, M(q_0))$, $p_0 \leftarrow P(q_0)\, p'_0$
2.  For $i = 1, \ldots, L$, $(p_i, q_i) \leftarrow \Phi_h^{\hat{H}}(p_{i-1}, q_{i-1})$
3.  With probability $\min\{1, \exp(H(p_0, q_0) - H(p_L, q_L))\}$
    o  Return $q_L$
4.  Else
    o  Return $q_0$
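The four steps above can be sketched end to end for the simplest setting: the unit sphere with identity mass matrix. This is an illustrative Python sketch under stated assumptions, not the authors' Matlab implementation. It assumes U(q) = -log pi(q), drops the log-determinant term of the Acceptance Hamiltonian (taken to be constant in q for M = I on the sphere), treats a failed constraint solve as a rejection, and uses made-up Bingham-von Mises-Fisher parameters.

```python
import numpy as np

def c(q):
    return np.array([q @ q - 1.0])          # unit sphere: c(q) = q^T q - 1

def C(q):
    return 2.0 * q[None, :]                 # Jacobian dc/dq

def project(q, p):
    """Co-tangent projection P(q) p for M = I: removes the part of p along C(q)^T."""
    Cq = C(q)
    return p - Cq.T @ np.linalg.solve(Cq @ Cq.T, Cq @ p)

def rattle(p, q, h, grad_U, iters=50, tol=1e-12):
    """One RATTLE step for H = p^T p / 2 + U(q); Newton iteration on the multiplier."""
    C0 = C(q)
    lam = np.zeros(C0.shape[0])
    p_free = p - 0.5 * h * grad_U(q)
    for _ in range(iters):
        p_half = p_free - 0.5 * h * C0.T @ lam
        q1 = q + h * p_half
        g = c(q1)
        if np.max(np.abs(g)) < tol:
            break
        lam = lam + np.linalg.solve(0.5 * h ** 2 * C(q1) @ C0.T, g)
    else:
        raise RuntimeError("constraint solve did not converge")
    # The second multiplier step is equivalent to projecting onto the tangent space at q1.
    return project(q1, p_half - 0.5 * h * grad_U(q1)), q1

def chmc_step(q0, h, L, log_pi, grad_log_pi, rng):
    grad_U = lambda q: -grad_log_pi(q)                  # assumption: simulate with U = -log pi
    H = lambda p, q: 0.5 * p @ p - log_pi(q)            # acceptance Hamiltonian, constant dropped
    p0 = project(q0, rng.standard_normal(q0.shape))     # step 1: sample and project momentum
    p, q = p0, q0
    try:
        for _ in range(L):                              # step 2: L integrator steps
            p, q = rattle(p, q, h, grad_U)
    except (RuntimeError, np.linalg.LinAlgError):
        return q0, False                                # assumption: failed solve counts as reject
    if np.log(rng.random()) < min(0.0, H(p0, q0) - H(p, q)):   # step 3: accept
        return q, True
    return q0, False                                    # step 4: reject

def run_chain(n_samples, seed=0, h=0.1, L=3):
    rng = np.random.default_rng(seed)
    d = np.array([1.0, 0.0, -0.5])                      # illustrative parameters
    A = np.diag([2.0, 0.5, -1.0])
    log_pi = lambda q: d @ q + q @ A @ q
    grad_log_pi = lambda q: d + 2.0 * A @ q             # A is symmetric here
    q = np.array([0.0, 0.0, 1.0])
    samples, accepted = [], 0
    for _ in range(n_samples):
        q, ok = chmc_step(q, h, L, log_pi, grad_log_pi, rng)
        samples.append(q)
        accepted += ok
    return np.array(samples), accepted / n_samples
```

Calling `run_chain(1000)` returns the samples and the acceptance rate. Because every proposal comes out of the constraint solve, each sample lies on the sphere up to the solver tolerance regardless of the step size h.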
