Lecture 3: Advanced Sampling Techniques
Dahua Lin, The Chinese University of Hong Kong

Transcript
Page 1: MLPI Lecture 3: Advanced Sampling Techniques

Lecture 3

Advanced Sampling Techniques

Dahua Lin

The Chinese University of Hong Kong

Page 2: MLPI Lecture 3: Advanced Sampling Techniques

Overview

• Collapsed Gibbs Sampling

• Sampling with Auxiliary Variables

• Slice Sampling

• Simulated Tempering & Parallel Tempering

• Swendsen-Wang Algorithm

• Hamiltonian Monte Carlo

Page 3: MLPI Lecture 3: Advanced Sampling Techniques

Collapsed Gibbs Sampling

Page 4: MLPI Lecture 3: Advanced Sampling Techniques

Motivating Example

We want to sample from a joint model (the model and target distribution are not captured in the transcript).

Page 5: MLPI Lecture 3: Advanced Sampling Techniques

Gibbs Sampling

Draw each variable in turn from its full conditional distribution (the specific conditionals are not captured in the transcript).

Page 6: MLPI Lecture 3: Advanced Sampling Techniques

Gibbs Sampling (cont'd)

Draw the remaining variable from its full conditional (formula not captured in the transcript).

• How well can this sampler perform when (condition not captured in the transcript)?

Page 7: MLPI Lecture 3: Advanced Sampling Techniques

Collapsed Gibbs Sampling

• Basic idea: replace the original conditional distribution with a conditional distribution of a marginal distribution, often called a reduced conditional distribution.

• In the example above, we consider a marginal distribution (formula omitted):

Page 8: MLPI Lecture 3: Advanced Sampling Techniques

Collapsed Gibbs Sampling (cont'd)

• Draw the variables of interest from the reduced conditional, with the other component marginalized out (formula omitted).

• Draw the marginalized component given the updated values (formula omitted).

• Can we exchange the order of these two steps? Why?
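The slide's concrete model is not captured in the transcript, so as a hypothetical stand-in consider the conjugate chain mu ~ N(0, 1), z | mu ~ N(mu, 1), y | z ~ N(z, 1) with y observed. A full Gibbs sampler alternates z | mu, y and mu | z; the collapsed sampler integrates z out, so y | mu ~ N(mu, 2), and the reduced conditional mu | y ~ N(y/3, 2/3) can be drawn directly. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
y = 3.0          # observed value (hypothetical)
n_iter = 20000

# --- Full Gibbs: alternate z | mu, y  and  mu | z ---
mu, z = 0.0, 0.0
mu_full = np.empty(n_iter)
for t in range(n_iter):
    z = rng.normal((mu + y) / 2.0, np.sqrt(0.5))    # z | mu, y
    mu = rng.normal(z / 2.0, np.sqrt(0.5))          # mu | z
    mu_full[t] = mu

# --- Collapsed Gibbs: z marginalized out, so y | mu ~ N(mu, 2) ---
# Reduced conditional: mu | y ~ N(y/3, 2/3); the draws are i.i.d. here.
mu_coll = rng.normal(y / 3.0, np.sqrt(2.0 / 3.0), size=n_iter)

print(mu_full.mean(), mu_coll.mean())  # both approach y/3 = 1.0
```

Both chains target the same posterior for mu, but the collapsed draws are independent, whereas the full Gibbs chain is autocorrelated.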

Page 9: MLPI Lecture 3: Advanced Sampling Techniques

Basic Guidelines

• Order of steps matters!

• Generally, one can move components from "being sampled" to "being conditioned on".

• Replacing outputs with intermediates would change the stationary distribution.

• A variable can be updated multiple times in an iteration.

Page 10: MLPI Lecture 3: Advanced Sampling Techniques

Why do collapsed samplers often perform better than full-fledged Gibbs samplers?

Page 11: MLPI Lecture 3: Advanced Sampling Techniques

Rao-Blackwell Theorem

Consider an example where we want to estimate an expectation under a joint distribution (the model is not captured in the transcript). Suppose we have two tractable ways to do so:

(1) draw samples from the joint distribution, and compute the plain Monte Carlo average (formula omitted).

Page 12: MLPI Lecture 3: Advanced Sampling Techniques

Rao-Blackwell Theorem (cont'd)

(2) draw samples from the marginal distribution, and compute the average of the conditional expectations (formula omitted).

• Both are correct. By the strong law of large numbers, both estimators converge to the target expectation almost surely.

• Which one is better? Can you justify your answer?

Page 13: MLPI Lecture 3: Advanced Sampling Techniques

Rao-Blackwell Theorem (cont'd)

• (Rao-Blackwell Theorem) Sample variance is reduced when some components are marginalized out. With the setting above, we have Var(E[h(X, Y) | X]) <= Var(h(X, Y)).

• Generally, reducing sample variance also reduces the autocorrelation of the chain, thus improving the mixing performance.
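A quick numerical illustration of the theorem (a hypothetical model, not the one on the slide): with X ~ N(0, 1) and Y | X ~ N(X, 1), both Y and E[Y | X] = X are unbiased for E[Y] = 0, but the Rao-Blackwellized estimator has smaller per-sample variance (1 instead of 2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100000

x = rng.normal(0.0, 1.0, size=n)    # X ~ N(0, 1)
y = rng.normal(x, 1.0)              # Y | X ~ N(X, 1), so E[Y] = 0

plain = y                           # estimator (1): average the Y draws
rb = x                              # estimator (2): average E[Y | X] = X

print(plain.var(), rb.var())        # per-sample variances: about 2 vs 1
```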

Page 14: MLPI Lecture 3: Advanced Sampling Techniques

Sampling with Auxiliary Variables

• The Rao-Blackwell Theorem suggests that, to achieve better performance, one should try to marginalize out as many components as possible.

• However, in many cases one may want to do the opposite, that is, to introduce additional variables to facilitate the simulation.

• For example, when the target distribution is multimodal, one may use an auxiliary variable to help the chain escape from local traps.

Page 15: MLPI Lecture 3: Advanced Sampling Techniques

Use of Auxiliary Variables

• Specify an auxiliary variable and a joint distribution over the original and auxiliary variables, such that the target distribution is recovered by marginalization (the exact conditions are not captured in the transcript).

• Design a chain to update the joint state using the M-H algorithm or the Gibbs sampler.

• The samples of the original variables can then be obtained through marginalization or conditioning.

Page 16: MLPI Lecture 3: Advanced Sampling Techniques

Slice Sampling

Page 17: MLPI Lecture 3: Advanced Sampling Techniques

Slice Sampler

• Sampling x from a density p is equivalent to sampling uniformly from the area under p: {(x, u) : 0 < u < p(x)}.

• Gibbs sampling is based on the uniform distribution over this area. Each iteration consists of two steps:

• Given x, draw u uniformly from (0, p(x)).

• Given u, draw x uniformly from the slice {x : p(x) > u}.
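The two steps above can be implemented with a stepping-out and shrinkage procedure for locating and sampling the slice (Neal's scheme; the details beyond the two Gibbs steps are an assumption here, not on the slide). A minimal 1-D sketch on a hypothetical target, an unnormalized standard normal, working in log space to avoid underflow:

```python
import numpy as np

def slice_sample(logf, x0, n, w=1.0, rng=None):
    """1-D slice sampler with stepping-out and shrinkage."""
    rng = rng or np.random.default_rng()
    xs = np.empty(n)
    x = x0
    for i in range(n):
        # Step 1: given x, draw the auxiliary height u uniformly under f(x).
        logu = logf(x) + np.log(rng.uniform())
        # Step 2: given u, draw x uniformly from the slice {x : f(x) > u}.
        # Find an interval containing the slice by stepping out...
        L = x - w * rng.uniform()
        R = L + w
        while logf(L) > logu:
            L -= w
        while logf(R) > logu:
            R += w
        # ...then sample within it, shrinking the interval on rejection.
        while True:
            x1 = rng.uniform(L, R)
            if logf(x1) > logu:
                x = x1
                break
            if x1 < x:
                L = x1
            else:
                R = x1
        xs[i] = x
    return xs

# Hypothetical target: unnormalized standard normal, log f(x) = -x^2 / 2.
samples = slice_sample(lambda x: -0.5 * x * x, 0.0, 20000,
                       rng=np.random.default_rng(2))
print(samples.mean(), samples.std())   # close to 0 and 1
```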

Page 18: MLPI Lecture 3: Advanced Sampling Techniques

Slice Sampler (Illustration)

(figure not captured in the transcript)

Page 19: MLPI Lecture 3: Advanced Sampling Techniques

Slice Sampler (Discussion)

• The slice sampler can mix very rapidly, as it will not be locally trapped.

• The slice sampler is often nontrivial to implement in practice. Drawing uniformly from the slice is sometimes very difficult.

• For distributions of certain forms, which admit an easy way to draw from the slice, slice sampling is a good strategy.

Page 20: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering

Page 21: MLPI Lecture 3: Advanced Sampling Techniques

Gibbs Measure

A Gibbs measure is a probability measure with a density in the following form:

p(x) = (1 / Z(β)) exp(-β U(x))

Here, U is called the energy function, β is called the inverse temperature, and the normalizing constant Z(β) depends on β.

Page 22: MLPI Lecture 3: Advanced Sampling Techniques

Gibbs Measure (cont'd)

In the MCMC literature, we often parameterize a Gibbs measure using the temperature parameter T = 1/β, thus

p(x) = (1 / Z(T)) exp(-U(x) / T).

Page 23: MLPI Lecture 3: Advanced Sampling Techniques

Tempered MCMC

Typical MCMC methods usually rely on local moves to explore the state space. What is the problem?

Page 24: MLPI Lecture 3: Advanced Sampling Techniques

Tempered MCMC (cont'd)

Local traps often lead to very poor mixing. Can we improve this?

Page 25: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering

Suppose we intend to sample from a Gibbs measure at the target temperature (the exact target is not captured in the transcript).

Basic idea: augment the target distribution by including a temperature index k, with the joint distribution (in its standard form) given by

p(x, k) ∝ c_k exp(-U(x) / T_k)

where the c_k are adjustable level weights and T_1 < T_2 < ... is the temperature ladder.

Page 26: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering (cont'd)

• We only collect samples at the lowest temperature, i.e. the base level of the ladder.

• The chain mixes much faster at high temperatures, but we want to collect samples at the lowest temperature. So we have to constantly switch between temperatures.

Page 27: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering (Algorithm)

One iteration of Simulated Tempering has two steps:

• (Base transition): update x at the same temperature, i.e. holding the temperature index fixed.

• (Temperature switching): with x fixed, propose a move to an adjacent temperature level (proposal probabilities omitted).

• Accept the change with the corresponding Metropolis-Hastings probability (formula omitted).

• Any drawbacks?

Page 28: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering (Discussion)

• Given a temperature level, we should set the next one such that uphill moves (in energy) have a considerable probability of being accepted.

• Build the temperature ladder step by step until we have a sufficiently smooth distribution at the top.

• With m levels, the time spent on the base level is around 1/m. If we have too many levels, only a very small portion of samples can be used.

Page 29: MLPI Lecture 3: Advanced Sampling Techniques

Simulated Tempering (Discussion)

• All temperature levels play an important role, so it is desirable to spend a comparable amount of time at each level. Setting the level weight c_k ∝ 1/Z_k for each level k achieves this (derivation omitted).

• The normalizing constants Z_k are typically unknown, and estimating them is very difficult and expensive.

Page 30: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering

(Basic idea): rather than jumping between temperatures, simultaneously simulate multiple chains, each at a temperature level T_k, called a replica, and constantly swap samples between replicas.

Page 31: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering (Algorithm)

Each iteration consists of the following steps:

• (Parallel update): simulate each replica with its own transition kernel.

• (Replica exchange): propose to swap states between two replicas (say the i-th and the j-th, typically at adjacent temperature levels).

Page 32: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering (Algorithm)

• The proposal is accepted with probability min(1, r), where r is the ratio of the joint densities after and before the swap (formula omitted in the transcript).

• We collect samples from the base replica (the one at the lowest temperature).

• Why does this algorithm produce the desired distribution?
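The two steps can be sketched as follows, using a hypothetical bimodal mixture target and a four-level temperature ladder (both are illustrative choices, not from the slides). Replica k targets f(x)^(1/T_k), and adjacent replicas propose swaps accepted with ratio exp((1/T_i - 1/T_j)(log f(x_j) - log f(x_i))):

```python
import numpy as np

def logf(x):
    # Hypothetical bimodal target: equal mixture of N(-4, 1) and N(4, 1).
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

rng = np.random.default_rng(3)
temps = np.array([1.0, 2.0, 4.0, 8.0])  # ladder: replica k targets f^(1/T_k)
x = np.zeros(len(temps))                # one state per replica
base = []                               # samples from the base replica (T = 1)

for it in range(20000):
    # (Parallel update): one random-walk Metropolis step per replica.
    for k, T in enumerate(temps):
        prop = x[k] + rng.normal(0.0, 1.0 + T)          # wider steps when hot
        if np.log(rng.uniform()) < (logf(prop) - logf(x[k])) / T:
            x[k] = prop
    # (Replica exchange): propose to swap a random adjacent pair.
    i = rng.integers(len(temps) - 1)
    j = i + 1
    log_r = (1.0 / temps[i] - 1.0 / temps[j]) * (logf(x[j]) - logf(x[i]))
    if np.log(rng.uniform()) < log_r:
        x[i], x[j] = x[j], x[i]
    base.append(x[0])

base = np.array(base)
print((base < -2).mean(), (base > 2).mean())  # both modes are visited
```

A single random-walk chain at T = 1 would typically stay trapped in one mode; the hot replicas cross the barrier easily and feed both modes down the ladder.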

Page 33: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering (Justification)

Let x_k denote the state of the k-th replica. We define the joint invariant distribution over all replicas as the product of the tempered distributions (formula omitted in the transcript).

Obviously, the step of parallel update preserves this invariant distribution.

Page 34: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering (Justification)

Note that the step of replica exchange is symmetric, i.e. the probabilities of proposing to go up and down are equal; then, according to the Metropolis algorithm, the swap is accepted with probability min(1, r), with r given by the ratio of the joint densities after and before the swap (formula omitted).

Page 35: MLPI Lecture 3: Advanced Sampling Techniques

Parallel Tempering (Discussion)

• It is efficient and very easy to implement, especially in a parallel computing environment.

• Tuning a parallel tempering system is often an art instead of a technique.

• Parallel tempering is a special case of a large family of MCMC methods called Extended Ensemble Monte Carlo, which involve a collection of parallel Markov chains, with the simulation switching between them.

Page 36: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm

The Swendsen-Wang algorithm (R. Swendsen and J. Wang, 1987) is an efficient Gibbs sampling algorithm for sampling from the extended Ising model.

Page 37: MLPI Lecture 3: Advanced Sampling Techniques

Standard Ising Model

The standard Ising model is defined as

p(x) ∝ exp(β Σ_{(s,t) ∈ E} x_s x_t)

where x_s ∈ {-1, +1} for each site s is called a spin, and E is the set of edges of the grid.

• Gibbs sampling is extremely slow, especially when the temperature is low.

Page 38: MLPI Lecture 3: Advanced Sampling Techniques

Extended Ising Model

• We extend the model by introducing additional bond variables, one for each edge. Each bond has two states: 1 indicating connected and 0 indicating disconnected.

• We define a joint distribution that couples the spins and bonds (formula omitted in the transcript).

Page 39: MLPI Lecture 3: Advanced Sampling Techniques

Extended Ising Model (cont'd)

Here, the coupling factor g for an edge is described as below:

• When the bond is 0, g = e^(-β), for every setting of the two spins.

• When the bond is 1, g = e^(β) - e^(-β) if the two spins agree, and 0 otherwise.

Page 40: MLPI Lecture 3: Advanced Sampling Techniques

Extended Ising Model (cont'd)

With this setting, the conditional distribution of a bond given the spins can be written as follows:

• when the two spins disagree, the bond must be 0;

• when the two spins agree, the bond is set to zero with probability e^(-2β), and to one with probability p = 1 - e^(-2β).

Page 41: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm

Each iteration consists of two steps:

• (Clustering): conditioned on the spins, draw the bonds independently. For an edge (s, t):

• If x_s ≠ x_t, set the bond to 0.

• If x_s = x_t, set the bond to 1 with probability p = 1 - e^(-2β), or to 0 otherwise.

Page 42: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm

• (Swapping): conditioned on the bonds, draw the spins.

• For each connected component, draw +1 or -1 with equal chance, and assign the resultant value to all nodes in the component.
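Putting the clustering and swapping steps together, a small sketch on an n × m grid with free boundaries (the grid size, inverse temperature `beta`, and flood-fill component search are illustrative choices, not from the slides):

```python
import numpy as np
from collections import deque

def sw_sweep(spins, beta, rng):
    """One Swendsen-Wang iteration on a 2-D Ising grid (sketch)."""
    n, m = spins.shape
    p = 1.0 - np.exp(-2.0 * beta)
    # (Clustering): bonds form only between equal neighbouring spins, each on w.p. p.
    right = (spins[:, :-1] == spins[:, 1:]) & (rng.uniform(size=(n, m - 1)) < p)
    down = (spins[:-1, :] == spins[1:, :]) & (rng.uniform(size=(n - 1, m)) < p)
    # (Swapping): flood-fill each bond-connected component, resample its spin.
    seen = np.zeros((n, m), dtype=bool)
    for i in range(n):
        for j in range(m):
            if seen[i, j]:
                continue
            new_spin = rng.choice([-1, 1])    # fresh spin for this component
            q = deque([(i, j)])
            seen[i, j] = True
            while q:
                a, b = q.popleft()
                spins[a, b] = new_spin
                # neighbours reachable through an "on" bond
                if b + 1 < m and right[a, b] and not seen[a, b + 1]:
                    seen[a, b + 1] = True; q.append((a, b + 1))
                if b > 0 and right[a, b - 1] and not seen[a, b - 1]:
                    seen[a, b - 1] = True; q.append((a, b - 1))
                if a + 1 < n and down[a, b] and not seen[a + 1, b]:
                    seen[a + 1, b] = True; q.append((a + 1, b))
                if a > 0 and down[a - 1, b] and not seen[a - 1, b]:
                    seen[a - 1, b] = True; q.append((a - 1, b))
    return spins

rng = np.random.default_rng(4)
spins = rng.choice([-1, 1], size=(16, 16))
for _ in range(50):
    spins = sw_sweep(spins, beta=1.0, rng=rng)
print(abs(spins.mean()))  # deep in the ordered phase, |magnetization| is large
```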

Page 43: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm (Illustration)

In the case of a rectangular grid, this Gibbs sampling algorithm mixes very rapidly.

The following figures illustrate Gibbs sampling. Spin states up and down are shown by filled and empty circles. Bond states 1 and 0 are shown by thick lines and thin dotted lines. We start from a state with five connected components. (Remember that isolated spins count as connected components, albeit of size 1.)

First, let's update the bonds. The forbidden bonds are highlighted. Bonds are forbidden from forming wherever the two adjacent spins are in opposite states. The bonds that are not forbidden are set to the 1 state with probability p.

After updating the bonds, we update the spins, then update the bonds again. (The figures themselves are not captured in the transcript.)

Other properties of the extended model

We already mentioned that the partition function Z is the same as that of the Ising model.

The marginal P(x) is correct, because when we sum the factor g_m over d_m, we get f_m. Summing over d_m is easy because it appears in only one factor.

We have summed out d and obtained the Ising model. What if we sum out x? The marginal P(d) is called the random cluster model. Summing over x for given d, all factors are constants. The number of states is 2^(number of clusters). Thus

P(d) = (1/Z) ∏_m [ p^(d_m) (1 - p)^(1 - d_m) ] 2^(c(d))    (10)

where c(d) is the number of connected components in the state d. Isolated spins whose neighbouring bonds are all zero count as single connected components.

The random cluster model can be generalized by replacing the number 2 by a parameter q:

P^(q)(d) ∝ ∏_m [ p^(d_m) (1 - p)^(1 - d_m) ] q^(c(d))    (11)

The random cluster model can be simulated directly, just as the Ising model can be simulated directly; but the S-W method, augmenting the bonds with spins, is probably the most efficient way to simulate the model. For integer values of q, the appropriate spin system is the 'Potts model', the generalization of the Ising model from 2 spin states to q.


Page 44: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm (Discussion)

• When β is large, a bond between two equal spins has a high probability of being set to one, i.e. the two sites are likely to be connected.

• Experiments show that the Swendsen-Wang algorithm mixes very rapidly, especially for rectangular grids.

• Can you provide an intuitive explanation?

Page 45: MLPI Lecture 3: Advanced Sampling Techniques

Swendsen-Wang Algorithm (Discussion)

• The Swendsen-Wang algorithm can be generalized to Potts models (nodes can take values from a finite set).

• The Swendsen-Wang algorithm has been widely used in image analysis applications, e.g. image segmentation (in this case, it is called Swendsen-Wang cut).

Page 46: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo

• An MCMC method based on Hamiltonian Dynamics. It was originally devised for molecular simulation.

• In 1987, a seminal paper by Duane et al. unified MCMC and molecular dynamics. They called it Hybrid Monte Carlo, which abbreviates to HMC.

• In many articles, people call it Hamiltonian Monte Carlo, as this name is considered to be more specific and informative, and it retains the same abbreviation "HMC".

Page 47: MLPI Lecture 3: Advanced Sampling Techniques

Motivating Example: Free Fall

(figure not captured in the transcript)

Page 48: MLPI Lecture 3: Advanced Sampling Techniques

Motivating Example: Free Fall

• The change of momentum p is caused by the accumulation/release of the potential energy: dp/dt = -∂U/∂x.

• The change of location x is caused by the velocity, the derivative of the kinetic energy w.r.t. the momentum: dx/dt = ∂K/∂p.

Page 49: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Dynamics

• Hamiltonian Dynamics is a generalized theory of classical mechanics, which provides an elegant and flexible abstraction of a dynamic system in physics.

• In Hamiltonian Dynamics, a physical system is described by (q, p), where q_i and p_i are respectively the position and momentum of the i-th entity.

Page 50: MLPI Lecture 3: Advanced Sampling Techniques

Hamilton's Equations

The dynamics of the system is characterized by Hamilton's Equations:

dq_i/dt = ∂H/∂p_i,    dp_i/dt = -∂H/∂q_i

Here, H(q, p) is called the Hamiltonian, which can be interpreted as the total energy of the system.

Page 51: MLPI Lecture 3: Advanced Sampling Techniques

Hamilton's Equations (cont'd)

• The Hamiltonian H is often formulated as the sum of the potential energy U(q) and the kinetic energy K(p): H(q, p) = U(q) + K(p).

• With this setting, Hamilton's Equations become:

dq_i/dt = ∂K/∂p_i,    dp_i/dt = -∂U/∂q_i

Page 52: MLPI Lecture 3: Advanced Sampling Techniques

Conservation of the Hamiltonian

The Hamiltonian is conserved, i.e., it is invariant over time:

dH/dt = Σ_i (∂H/∂q_i dq_i/dt + ∂H/∂p_i dp_i/dt) = Σ_i (∂H/∂q_i ∂H/∂p_i - ∂H/∂p_i ∂H/∂q_i) = 0

Intuitively, this reflects the law of energy conservation.

Page 53: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Reversibility

• Hamiltonian dynamics is reversible.

• Let the initial state be (q(0), p(0)) and the state at time t be (q(t), p(t)). Then, if we reverse the process, starting at (q(t), -p(t)), the state after time t would be (q(0), -p(0)).

• In the context of MCMC, this leads to the reversibility of the underlying chain.

Page 54: MLPI Lecture 3: Advanced Sampling Techniques

Simulation of Hamiltonian Dynamics

A natural idea to simulate Hamiltonian dynamics is to use Euler's method over discretized time steps:

q(t + ε) = q(t) + ε ∂K/∂p (p(t)),    p(t + ε) = p(t) - ε ∂U/∂q (q(t))

Is this a good method?

Page 55: MLPI Lecture 3: Advanced Sampling Techniques

Leapfrog Method

Better results can be obtained with the leapfrog method:

p(t + ε/2) = p(t) - (ε/2) ∂U/∂q (q(t))
q(t + ε) = q(t) + ε ∂K/∂p (p(t + ε/2))
p(t + ε) = p(t + ε/2) - (ε/2) ∂U/∂q (q(t + ε))

More importantly, the leapfrog update is reversible.
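A minimal sketch of the leapfrog scheme for a Hamiltonian with K(p) = p^2/2, demonstrated on a hypothetical harmonic oscillator with U(q) = q^2/2. Note how negating the momentum and integrating again retraces the trajectory:

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, L):
    """Leapfrog integration of Hamiltonian dynamics with H = U(q) + p^2/2."""
    p = p - 0.5 * eps * grad_U(q)        # initial half-step for the momentum
    for _ in range(L - 1):
        q = q + eps * p                  # full position step (since K = p^2/2)
        p = p - eps * grad_U(q)          # full momentum step
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)        # final half-step for the momentum
    return q, p

# Hypothetical system: harmonic oscillator, U(q) = q^2 / 2.
grad_U = lambda q: q
q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_U, eps=0.1, L=100)

# Reversibility: negate the momentum, integrate again, negate once more.
q2, p2 = leapfrog(q1, -p1, grad_U, eps=0.1, L=100)
print(q2, -p2)  # recovers (q0, p0) up to round-off

# The Hamiltonian is nearly conserved despite the discretization.
H = lambda q, p: 0.5 * q * q + 0.5 * p * p
print(H(q0, p0), H(q1, p1))
```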

Page 56: MLPI Lecture 3: Advanced Sampling Techniques

Leapfrog Method (cont'd)

(figure not captured in the transcript)

Page 57: MLPI Lecture 3: Advanced Sampling Techniques

Example

Consider a Hamiltonian system, write down Hamilton's Equations, and derive the solution. (The specific system and its solution are not captured in the transcript.)

Page 58: MLPI Lecture 3: Advanced Sampling Techniques

Example (Simulation)

(figure not captured in the transcript)

Page 59: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo

(Basic idea): consider the potential energy as the Gibbs energy, and introduce the momenta as auxiliary variables to control the dynamics.

Page 60: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo (cont'd)

Suppose the target distribution is π(q) ∝ exp(-U(q)); then we form an augmented distribution as

π(q, p) ∝ exp(-U(q) - K(p)) = exp(-H(q, p))

Here, the locations q represent the variables of interest, and the momenta p control the dynamics of the simulation.

Page 61: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo (cont'd)

In practice, the kinetic energy is often formalized as

K(p) = (1/2) p^T M^(-1) p

with a positive definite mass matrix M (often the identity).

Page 62: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo (Algorithm)

Each iteration of HMC comprises two steps:

• Gibbs update: sample the momenta p from the Gaussian prior given by the kinetic energy, i.e. p ~ N(0, M).

Page 63: MLPI Lecture 3: Advanced Sampling Techniques

Hamiltonian Monte Carlo (Algorithm)

• Metropolis update: use Hamiltonian dynamics to propose a new state. Starting from (q, p), simulate the dynamic system with the leapfrog method for L steps with step size ε, which yields (q*, p*). The proposed state is accepted with probability:

min(1, exp(H(q, p) - H(q*, p*)))
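Combining the two steps, a compact sketch with identity mass matrix (M = I), using a hypothetical 2-D standard Gaussian target (the target, step size, and trajectory length are illustrative choices, not from the slides):

```python
import numpy as np

def hmc_step(q, U, grad_U, eps, L, rng):
    """One HMC iteration: Gibbs update of the momenta, then a Metropolis
    update driven by a leapfrog simulation of Hamiltonian dynamics."""
    p0 = rng.normal(size=q.shape)          # Gibbs update: p ~ N(0, I)
    q_new, p = q.copy(), p0.copy()
    p -= 0.5 * eps * grad_U(q_new)         # leapfrog: first half-kick
    for _ in range(L - 1):
        q_new += eps * p
        p -= eps * grad_U(q_new)
    q_new += eps * p
    p -= 0.5 * eps * grad_U(q_new)         # final half-kick
    # Metropolis accept/reject based on the change in the Hamiltonian.
    H0 = U(q) + 0.5 * p0 @ p0
    H1 = U(q_new) + 0.5 * p @ p
    return q_new if np.log(rng.uniform()) < H0 - H1 else q

# Hypothetical target: standard 2-D Gaussian, U(q) = ||q||^2 / 2.
U = lambda q: 0.5 * q @ q
grad_U = lambda q: q

rng = np.random.default_rng(5)
q = np.zeros(2)
samples = np.empty((5000, 2))
for i in range(5000):
    q = hmc_step(q, U, grad_U, eps=0.2, L=10, rng=rng)
    samples[i] = q
print(samples.mean(axis=0), samples.std(axis=0))  # close to (0, 0) and (1, 1)
```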

Page 64: MLPI Lecture 3: Advanced Sampling Techniques

HMC (Discussion)

• If the simulation were exact, we would have H(q*, p*) = H(q, p), and thus the proposed state would always be accepted.

• In practice, there can be some deviation due to discretization, so we have to use the Metropolis rule to guarantee correctness.

Page 65: MLPI Lecture 3: Advanced Sampling Techniques

HMC (Discussion)

• HMC has a high acceptance rate while allowing large moves along less-constrained directions at each iteration.

• This is a key advantage compared to random walk proposals, which, in order to maintain a reasonably high acceptance rate, have to keep a very small step size, resulting in substantial correlation between consecutive samples.

Page 66: MLPI Lecture 3: Advanced Sampling Techniques

Tuning HMC

• For efficient simulation, it is important to choose appropriate values for both the leapfrog step size and the number of leapfrog steps per iteration.

• Tuning HMC (and actually many generic sampling methods) often requires preliminary runs with different trial settings and different initial values, as well as careful analysis of the energy trajectories.

Page 67: MLPI Lecture 3: Advanced Sampling Techniques

Tuning HMC (cont'd)

• For most cases, the step size and the number of steps can be tuned independently.

• Too small a step size would waste computation time, while too large a step size would cause unstable simulation, and thus a low acceptance rate.

• One should choose the step size such that the energy trajectory is stable and the acceptance rate is maintained at a reasonably high level.

• One should choose the number of steps such that back-and-forth movement of the states can be observed.

Page 68: MLPI Lecture 3: Advanced Sampling Techniques

Generic Sampling Systems

A number of software systems are available for sampling from models specified by the user.

• WinBUGS: based on BUGS (Bayesian inference Using Gibbs Sampling).

• Provides a friendly language for users to specify the model.

• Runs only on Windows.

• Note: development has stopped since 2007.

Page 69: MLPI Lecture 3: Advanced Sampling Techniques

Generic Sampling Systems (cont'd)

• JAGS: "Just Another Gibbs Sampler"

• Cross-platform support.

• Uses a dialect of BUGS.

• Extensible: allows users to write customized functions, distributions, and samplers.

Page 70: MLPI Lecture 3: Advanced Sampling Techniques

Generic Sampling Systems (cont'd)

• Stan: "Sampling Through Adaptive Neighborhoods"

• Core written in C++, with ports available in Python, R, Matlab, and Julia.

• A user-friendly language for model specification.

• Uses Hamiltonian Monte Carlo (HMC) and the No-U-Turn Sampler (NUTS) as core algorithms.

• Open source (GPLv3 licensed) and under active development on GitHub.

Page 71: MLPI Lecture 3: Advanced Sampling Techniques

Stan Example

data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  for (n in 1:N)
    y[n] ~ normal(alpha + beta * x[n], sigma);
}

Page 72: MLPI Lecture 3: Advanced Sampling Techniques

Generic Sampling System vs. Dedicated Algorithms

Generic              | Dedicated
Easy to use          | Requires knowledge and experience
High productivity    | Time-consuming to develop
Slow                 | Often remarkably more efficient
Limited flexibility  | Necessary for many new models