Page 1: CS774. Markov Random Field : Theory and Application Lecture 16 Kyomin Jung KAIST Nov 03 2009.

CS774. Markov Random Field : Theory and Application

Lecture 16

Kyomin Jung, KAIST

Nov 03 2009

Page 2:

Markov Chain Monte Carlo

Let π = (π_1, ..., π_n) be a probability density function on S = {1, 2, ..., n}.

f(·) is any function on S, and we want to estimate

    I = E_π[f] = Σ_{i=1}^{n} f(i) π_i.

Construct P = {P_ij}, the transition matrix of an irreducible Markov chain with state space S, where

    P_ij = Pr{X_{t+1} = j | X_t = i},  i, j ∈ S,

and π is its unique stationary distribution.

Page 3:

Markov Chain Monte Carlo

Run this Markov chain for times t = 1, ..., N and calculate the Monte Carlo sum

    Î = (1/N) Σ_{t=1}^{N} f(X_t);

then Î → I as N → ∞.
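The chain-plus-Monte-Carlo-sum recipe on Pages 2–3 can be sketched in a few lines of Python. The two-state chain, its transition matrix, and the function names below are made-up illustration values, not anything from the lecture; the matrix is chosen so its stationary distribution can be checked by hand.

```python
import random

# Hypothetical two-state chain on S = {0, 1}; this P is an illustration,
# chosen so its stationary distribution is pi = (0.6, 0.4)
# (check: 0.6 * 0.8 + 0.4 * 0.3 = 0.6).
P = [[0.8, 0.2],   # Pr{X_{t+1} = j | X_t = 0}
     [0.3, 0.7]]   # Pr{X_{t+1} = j | X_t = 1}

def f(i):
    return i       # so I = E_pi[f] = 0 * 0.6 + 1 * 0.4 = 0.4

def mcmc_estimate(N, seed=0):
    """Run the chain for t = 1, ..., N and return the Monte Carlo sum
    I_hat = (1/N) * sum_t f(X_t)."""
    rng = random.Random(seed)
    x, total = 0, 0.0
    for _ in range(N):
        x = 0 if rng.random() < P[x][0] else 1
        total += f(x)
    return total / N
```

As N grows, the estimate approaches I = 0.4, which is Î → I in miniature.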

Page 4:

Metropolis-Hastings Algorithm

A kind of MCMC method. The Metropolis-Hastings algorithm can draw samples from any probability distribution π(x), requiring only that a function proportional to the density can be calculated at x.

Process in three steps:
1. Set up a Markov chain;
2. Run the chain until stationary;
3. Estimate the function with Monte Carlo methods.

Page 5:

Metropolis-Hastings Algorithm

In order to perform this method for a given distribution π, we must construct a Markov chain transition matrix P with π as its stationary distribution, i.e. πP = π.

Suppose the matrix P is made to satisfy the reversibility condition: for all i and j,

    π_i P_ij = π_j P_ji.

This property ensures that

    Σ_i π_i P_ij = π_j  for all j,

and hence π is the stationary distribution for P.

Page 6:

Metropolis-Hastings Algorithm

Goal: Draw from a distribution π with support S. Start with some value x ∈ S.

Draw a proposed value y from some candidate density q(x, y) = q(y | x).

Accept a move to the candidate with probability

    α(x, y) = min{ 1, [π(y) q(y, x)] / [π(x) q(x, y)] };

otherwise, stay at state x.

If the move is accepted, assign x = y; otherwise, assign x = x. In either case, return to the proposal step.
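The accept/reject step on this slide translates almost line-for-line into code. The sketch below is generic (the function names `mh_step`, `q_draw`, `q_dens` are mine, not the lecture's), and it is exercised on an assumed example: an unnormalized standard normal target with a symmetric random-walk candidate.

```python
import math
import random

def mh_step(x, pi, q_draw, q_dens, rng):
    """One Metropolis-Hastings move from state x: draw y ~ q(x, .),
    accept with probability
    alpha(x, y) = min{1, pi(y) q(y, x) / (pi(x) q(x, y))}."""
    y = q_draw(x, rng)
    alpha = min(1.0, (pi(y) * q_dens(y, x)) / (pi(x) * q_dens(x, y)))
    return y if rng.random() < alpha else x

# Example target: pi(x) proportional to exp(-x^2 / 2).  Only a function
# proportional to the density is needed, exactly as the slides say.
def pi(x):
    return math.exp(-0.5 * x * x)

# Random-walk candidate y = x + Unif(-1, 1).  q is symmetric here, so
# the q-ratio cancels, but we keep it to match the general formula.
def q_draw(x, rng):
    return x + rng.uniform(-1.0, 1.0)

def q_dens(x, y):
    return 0.5 if abs(y - x) <= 1.0 else 0.0

rng = random.Random(1)
x, xs = 0.0, []
for _ in range(50000):
    x = mh_step(x, pi, q_draw, q_dens, rng)
    xs.append(x)
```

After many steps the samples `xs` have mean near 0 and variance near 1, the moments of the N(0, 1) target.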

Page 7:

Metropolis-Hastings Algorithm

Note that we do not need a symmetric q.

The method works best when q(x, y) is close to π(y).

We are creating a Markov chain with transition law

    P(x, y) = q(x, y) α(x, y)  for y ≠ x,

and

    P(x, {x}) = ∫ q(x, y) [1 − α(x, y)] dy.

Page 8:

Metropolis-Hastings Algorithm

We know that this transition law P will give a chain with stationary distribution π if the detailed balance condition holds:

    π(x) P(x, y) = π(y) P(y, x).

Obviously, this holds for y = x.

Page 9:

Metropolis-Hastings Algorithm

For x ≠ y,

    π(x) P(x, y) = π(x) q(x, y) α(x, y)
                 = π(x) q(x, y) min{ 1, [π(y) q(y, x)] / [π(x) q(x, y)] }
                 = min{ π(x) q(x, y), π(y) q(y, x) }
                 = π(y) q(y, x) min{ [π(x) q(x, y)] / [π(y) q(y, x)], 1 }
                 = π(y) q(y, x) α(y, x)
                 = π(y) P(y, x).

Page 10:

Metropolis-Hastings Algorithm

Some common choices for the proposal distribution:

“independent” candidate: q(x,y) = q(y)

symmetric candidate: q(x,y)=q(y,x)

“random walk” candidate: q(x,y)=q(x-y)=q(y-x)

etc.

Page 11:

The MH Algorithm: Independent Candidate

Example: Simulate a bivariate normal distribution with mean vector

    μ = (0, 0)ᵀ

and covariance matrix

    Σ = [  1   1/2 ]
        [ 1/2   1  ].

Page 12:

The MH Algorithm: Independent Candidate

In this case,

    π(x) = π(x_1, x_2) ∝ exp{ −x_1² + x_1 x_2 − x_2² }

for −∞ < x_1 < ∞, −∞ < x_2 < ∞.

We need a candidate distribution that draws values on R², and since exponentials are so easy to draw, we use:

Page 13:

The MH Algorithm: Independent Candidate

    q(x, y) = q(y) = q(y_1, y_2) = q(y_1) q(y_2),

where each coordinate has the double-exponential density

    q(y_i) = (1/2) e^{−y_i} I_{(0,∞)}(y_i) + (1/2) e^{y_i} I_{(−∞,0)}(y_i),  i = 1, 2.

Page 14:

The MH Algorithm: Independent Candidate

Simulation steps: consider the ratio

    R := [π(candidate) q(old)] / [π(old) q(candidate)].

If R > 1, new = candidate.
If R < 1, draw U ~ Unif(0, 1):
    if U < R, new = candidate;
    if U > R, new = old.
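Pages 11–14 fit together into one runnable sketch: independent double-exponential candidates for the bivariate normal target, accepted via the ratio R above. The inverse-CDF draw for the double-exponential and the function names are my own choices. Note that, taken literally, the exponent exp{−x_1² + x_1 x_2 − x_2²} implies Var(x_i) = 2/3 and correlation 1/2, which is what the test below checks.

```python
import math
import random

def pi_unnorm(x1, x2):
    # Target density up to a constant: exp{-x1^2 + x1*x2 - x2^2}.
    return math.exp(-x1 * x1 + x1 * x2 - x2 * x2)

def q_dens(y1, y2):
    # Independent double-exponential candidate:
    # q(y1, y2) = (1/2) e^{-|y1|} * (1/2) e^{-|y2|}.
    return 0.25 * math.exp(-abs(y1) - abs(y2))

def draw_double_exp(rng):
    # Inverse-CDF sampling of the standard double-exponential.
    u = rng.random()
    return math.log(2.0 * u) if u < 0.5 else -math.log(2.0 * (1.0 - u))

def independent_mh(N, seed=0):
    rng = random.Random(seed)
    x = (0.0, 0.0)
    out = []
    for _ in range(N):
        y = (draw_double_exp(rng), draw_double_exp(rng))
        # R = pi(candidate) q(old) / (pi(old) q(candidate))
        R = (pi_unnorm(*y) * q_dens(*x)) / (pi_unnorm(*x) * q_dens(*y))
        if R >= 1.0 or rng.random() < R:
            x = y            # new = candidate
        out.append(x)        # otherwise new = old
    return out
```

Because the candidate's exponential tails dominate the Gaussian tails of π, the ratio q/π is bounded below, so by the Mengersen–Tweedie condition quoted later in the lecture this independence sampler is uniformly ergodic.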

Page 15:

Total Variation Norm

The total variation norm distance between two probability measures μ_1 and μ_2 is

    ‖μ_1 − μ_2‖ := sup_{A ∈ B} |μ_1(A) − μ_2(A)|,

where B denotes the Borel sets in R.
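For discrete measures the supremum over Borel sets is attained at A = {x : μ_1(x) > μ_2(x)}, which reduces the norm to a simple sum. A minimal sketch (the dict-based representation is my choice):

```python
def tv_distance(mu, nu):
    """Total variation distance sup_A |mu(A) - nu(A)| between two
    distributions on a countable set, given as dicts state -> prob.
    For discrete measures this equals (1/2) * sum_x |mu(x) - nu(x)|."""
    states = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(s, 0.0) - nu.get(s, 0.0)) for s in states)
```

For example, the distance between (0.5, 0.5) and (0.8, 0.2) on two states is 0.3, and the distance between point masses on different states is 1.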

Page 16:

Rate of Convergence

A Markov chain with n-step transition probability P^(n)(x, y) and stationary distribution π is called uniformly ergodic if

    lim_{n→∞} ‖P^(n)(x, ·) − π‖ = 0

uniformly for all x in the state space.

Page 17:

The MH Algorithm: Independent Candidate

Theorem (Mengersen and Tweedie): The Metropolis-Hastings algorithm with independent candidate density q is uniformly ergodic if and only if there exists a constant β > 0 such that

    q(x) ≥ β π(x)

for all x in the state space. In this case,

    ‖P^(n)(x, ·) − π‖ ≤ (1 − β)^n.

Page 18:

Gibbs Sampling

A special case of single-site updating Metropolis.
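For the bivariate normal example from the earlier slides, Gibbs sampling updates one coordinate at a time from its exact full conditional: completing the square in the exponent −x_1² + x_1 x_2 − x_2² gives x_1 | x_2 ~ N(x_2/2, 1/2), and symmetrically for x_2. A sketch under those assumptions, with function names of my own:

```python
import math
import random

def gibbs(N, seed=0):
    """Gibbs sampler for the density proportional to
    exp(-x1^2 + x1*x2 - x2^2).  Each sweep draws each coordinate from
    its full conditional N(other/2, 1/2); because the proposal IS the
    conditional, every move is accepted -- the special case of
    single-site Metropolis mentioned on the slide."""
    rng = random.Random(seed)
    x1 = x2 = 0.0
    out = []
    for _ in range(N):
        x1 = rng.gauss(x2 / 2.0, math.sqrt(0.5))
        x2 = rng.gauss(x1 / 2.0, math.sqrt(0.5))
        out.append((x1, x2))
    return out
```

The samples should reproduce the target's moments: mean 0 in each coordinate, Var(x_i) = 2/3, Cov(x_1, x_2) = 1/3.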

Page 19:

Swendsen-Wang Algorithm for Ising Models

Swendsen-Wang (1987) is a smart idea that flips a patch at a time.

Each edge in the lattice e = <s, t> is associated with a probability q = e^{−b}.

1. If s and t have different labels at the current state, e is turned off. If s and t have the same label, e is turned off with probability q. Thus each object is broken into a number of connected components (subgraphs).
2. One or many components are chosen at random.
3. The collective label is changed randomly to any of the labels.

[Figure: lattice partitioned into connected components labeled V0, V1, V2, before and after relabeling.]
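Steps 1–3 above can be sketched for a general labelled graph. The union-find bookkeeping and the choice to relabel every component (rather than a random subset, which the slide also allows) are implementation choices of mine, not part of the lecture:

```python
import random

def swendsen_wang_step(labels, edges, q, num_labels, rng):
    """One Swendsen-Wang sweep.
    labels: dict node -> current label; edges: list of (s, t) pairs;
    q = e^{-b}: probability of turning OFF an edge whose endpoints
    carry the same label (different-label edges are always off)."""
    # Step 1: keep each same-label edge with probability 1 - q.
    kept = [(s, t) for (s, t) in edges
            if labels[s] == labels[t] and rng.random() >= q]

    # Connected components of the kept edges, via union-find.
    parent = {v: v for v in labels}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for s, t in kept:
        parent[find(s)] = find(t)

    # Steps 2-3: give each component a uniformly random new label,
    # flipping a whole patch at a time.
    new_label = {}
    for v in labels:
        root = find(v)
        if root not in new_label:
            new_label[root] = rng.randrange(num_labels)
        labels[v] = new_label[root]
    return labels
```

With q = 0 a monochromatic graph stays a single component and is flipped as one patch; with q = 1 every node becomes its own component, so the update degenerates to independent single-site relabeling.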

Page 20:

Swendsen-Wang Algorithm

Pros:
- Computationally efficient in sampling the Ising models

Cons:
- Limited to Ising models
- Not informed by data; slows down in the presence of an external field (data term)

Cf.) Swendsen-Wang Cuts:
- Generalizes Swendsen-Wang to arbitrary posterior probabilities
- Improves the clustering step by using the image data