Tutorial on ABC Algorithms
Dr Chris Drovandi
Queensland University of Technology, Australia
[email protected]
July 3, 2014
Chris Drovandi, ABC in Sydney 2014

Outline: Intro, ABC Rejection, MCMC ABC, SMC ABC, Closing, Refs
Notation
Model parameter θ with prior π(θ)
Likelihood f(y|θ) with observed data y
Posterior π(θ|y) ∝ f(y|θ)π(θ)
Simulated data x
ABC posterior π_ε(θ, x|y) ∝ K_ε(ρ(y, x)) f(x|θ) π(θ)
Ingredients for Implementing ABC Algorithm
Code to simulate data from model
Code to compute summary statistics
Code to compute discrepancy function
Combine within a rejection, MCMC or SMC algorithm (not too hard)
Importance Sampling
Define importance distribution g(θ)
Draw M iid samples from g: θ_i ~ g(θ), i = 1, …, M
Weight the samples
    w_i = f(y|θ_i) π(θ_i) / g(θ_i)
Normalise the weights to obtain W_i, i = 1, …, M
{θ_i, W_i}_{i=1}^M represents a weighted sample from π(θ|y)
Effective sample size:
    ESS = 1 / Σ_{i=1}^M (W_i)²
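The importance sampling steps above can be sketched in a few lines of NumPy. The conjugate-normal toy model, the prior and importance distribution choices, and all numerical settings here are illustrative assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

y = rng.normal(2.0, 1.0, size=50)                 # toy observed data

def log_likelihood(theta):                        # log f(y | theta), vectorised over theta
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

def log_prior(theta):                             # assumed prior: theta ~ N(0, 100)
    return -0.5 * theta**2 / 100.0

M = 20000
theta = rng.normal(0.0, 5.0, size=M)              # M iid draws from g = N(0, 25)
log_g = -0.5 * theta**2 / 25.0                    # log g(theta) up to a constant
log_w = log_likelihood(theta) + log_prior(theta) - log_g
W = np.exp(log_w - log_w.max())                   # stabilise before normalising
W /= W.sum()                                      # normalised weights W_i

ess = 1.0 / np.sum(W**2)                          # effective sample size
post_mean = np.sum(W * theta)                     # weighted posterior mean
```

Normalising constants that do not depend on θ are dropped, since they cancel when the weights are normalised.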
ABC Importance Sampling
Require a proposal on the joint space of θ and x
Let g(θ, x) = g(θ) f(x|θ)
Generate iid samples {θ_i, x_i}_{i=1}^M ~ g(θ, x)
Weight

    w_i = K_ε(ρ(y, x_i)) f(x_i|θ_i) π(θ_i) / [ g(θ_i) f(x_i|θ_i) ]
        = K_ε(ρ(y, x_i)) π(θ_i) / g(θ_i)

(the intractable likelihood f(x_i|θ_i) cancels)
If we choose g(θ) = π(θ) and let K_ε(ρ(y, x)) = 1(ρ(y, x) ≤ ε), then w_i is either 0 or 1
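The special case above (prior as importance distribution, indicator kernel) can be sketched as follows. The toy normal model, sample-mean summary statistic, and tolerance are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

y_obs = rng.normal(3.0, 1.0, size=100)            # toy observed data
s_obs = y_obs.mean()                              # summary statistic

M, eps = 50000, 0.05
theta = rng.normal(0.0, np.sqrt(10.0), size=M)    # g(theta) = pi(theta) = N(0, 10)
x = rng.normal(theta[:, None], 1.0, size=(M, 100))  # x_i ~ f(. | theta_i)
rho = np.abs(x.mean(axis=1) - s_obs)              # discrepancy rho(y, x_i)

w = (rho <= eps).astype(float)                    # indicator kernel: w_i is 0 or 1
accepted = theta[w == 1.0]                        # equally-weighted ABC sample
```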
ABC Rejection Algorithm
ABC rejection (post-determination of ε)

1: Draw θ_i ~ π(·) and simulate x_i ~ f(·|θ_i) for i = 1, …, M. This generates the collection {θ_i, x_i}_{i=1}^M.
2: Compute the discrepancy ρ_i = ρ(y, x_i). Obtain particles {θ_i, ρ_i}_{i=1}^M.
3: Sort {θ_i, ρ_i}_{i=1}^M based on ρ.
4: Keep the N = α × M values of θ_i with the lowest discrepancy (this defines ε).

The choice of α trades off accuracy against Monte Carlo error (Fearnhead and Prangle (2012))
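The four steps above can be sketched directly. The exponential toy model, uniform prior, and the settings M and α are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

y_obs = rng.exponential(1.0 / 2.0, size=200)      # toy data from a rate-2 exponential
s_obs = y_obs.mean()                              # summary statistic

def abc_rejection(M, alpha):
    theta = rng.uniform(0.1, 10.0, size=M)        # step 1: draws from a uniform prior
    x = rng.exponential(1.0 / theta[:, None], size=(M, 200))  # simulate datasets
    rho = np.abs(x.mean(axis=1) - s_obs)          # step 2: discrepancies
    order = np.argsort(rho)                       # step 3: sort particles by rho
    N = int(alpha * M)                            # step 4: keep the lowest alpha * M
    return theta[order[:N]], rho[order[N - 1]]    # kept sample and the implied eps

samples, eps = abc_rejection(M=20000, alpha=0.01)
```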
ABC Rejection Algorithm (Pritchard et al 1999)
ABC rejection (pre-specification of ε)

1: Draw θ ~ π(·)
2: Simulate x ~ f(·|θ)
3: If ρ(y, x) ≤ ε then accept θ
4: Repeat steps 1-3 until N samples are drawn

Avoids storage requirements, but choosing ε is difficult
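The pre-specified-ε variant can be sketched as a simple accept/reject loop; tracking the number of attempts shows how a poorly chosen ε makes the loop slow. The toy normal model and all settings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

y_obs = rng.normal(0.0, 1.0, size=50)             # toy observed data
s_obs = y_obs.mean()                              # summary statistic

def abc_rejection_fixed_eps(N, eps):
    accepted, attempts = [], 0
    while len(accepted) < N:                      # step 4: repeat until N accepted
        attempts += 1
        theta = rng.uniform(-5.0, 5.0)            # step 1: draw from the prior
        x = rng.normal(theta, 1.0, size=50)       # step 2: simulate a dataset
        if abs(x.mean() - s_obs) <= eps:          # step 3: accept if close enough
            accepted.append(theta)
    return np.array(accepted), attempts

samples, attempts = abc_rejection_fixed_eps(N=100, eps=0.2)
```

Only the accepted draws are stored, in contrast to the post-determination version, which must store and sort all M particles.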
ABC Rejection Example
Figure: Plot courtesy of Brenda Nho Vo. ABC posterior density of τ for different α (leading to different ε): ε_1 = 0.24 (20%), ε_2 = 0.13 (10%), ε_3 = 0.013 (1%), ε_4 = 0.0014 (0.1%).
ABC Rejection Algorithm
Advantages
Simplicity
Useful for testing different sets of summary statistics (Nunes and Balding 2010) or analysing multiple datasets from the same model (see Drovandi and Pettitt 2013 for a Bayesian experimental design example)
Disadvantages
Storage requirements (depending on implementation chosen)
Highly inefficient if the posterior is different to the prior (too high an ε is required to obtain a reasonable size sample from the ABC posterior)
Markov chain Monte Carlo
Motivation: keep proposals within non-negligible posterior regions
Construct a Markov chain whose limiting distribution is π(θ|y)
Assume the current value of the chain is θ. Propose the next value θ* ~ q(·|θ). Accept θ* as the next value of the chain with probability min(1, A), where

    A = f(y|θ*) π(θ*) q(θ|θ*) / [ f(y|θ) π(θ) q(θ*|θ) ],

otherwise set θ as the next value of the chain.
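For a tractable toy posterior, the Metropolis-Hastings scheme above reduces to a short loop. The normal model, prior, and random-walk proposal step are illustrative assumptions; with a symmetric proposal the q terms cancel in A:

```python
import numpy as np

rng = np.random.default_rng(4)

y = rng.normal(1.5, 1.0, size=100)                # toy observed data

def log_post(theta):                              # log f(y|theta) + log pi(theta)
    return -0.5 * np.sum((y - theta) ** 2) - 0.5 * theta**2 / 100.0

def mh(n_iter, step=0.3, theta0=0.0):
    chain = np.empty(n_iter)
    theta, lp = theta0, log_post(theta0)
    for i in range(n_iter):
        prop = theta + step * rng.normal()        # symmetric random-walk q
        lp_prop = log_post(prop)                  # so q cancels in the ratio A
        if np.log(rng.uniform()) < lp_prop - lp:  # accept with probability min(1, A)
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

chain = mh(20000)
```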
MCMC ABC
Require a proposal on the joint space of θ and x
Let q(θ*, x*|θ, x) = f(x*|θ*) q(θ*|θ)

Metropolis-Hastings ratio:

    A = K_ε(ρ(y, x*)) f(x*|θ*) π(θ*) q(θ|θ*) f(x|θ) / [ K_ε(ρ(y, x)) f(x|θ) π(θ) q(θ*|θ) f(x*|θ*) ]
      = K_ε(ρ(y, x*)) π(θ*) q(θ|θ*) / [ K_ε(ρ(y, x)) π(θ) q(θ*|θ) ]

This choice of proposal leads to cancellation of the intractable likelihoods!
MCMC ABC (Marjoram et al 2003, Sisson and Fan 2011)
1: Obtain θ_0, x_0 using a burn-in or from one sample of rejection ABC
2: Compute ψ_0 = K_ε(ρ(y, x_0))
3: for i = 1 to N do
4:   Draw θ* ~ q(·|θ_{i-1})
5:   Simulate x* ~ f(·|θ*)
6:   Compute ψ* = K_ε(ρ(y, x*))
7:   Compute r = [π(θ*) ψ* q(θ_{i-1}|θ*)] / [π(θ_{i-1}) ψ_{i-1} q(θ*|θ_{i-1})]
8:   if uniform(0, 1) < r then
9:     θ_i = θ*, ψ_i = ψ*
10:  else
11:    θ_i = θ_{i-1}, ψ_i = ψ_{i-1}
12:  end if
13: end for
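The listing above can be sketched for a uniform kernel K_ε = 1(ρ ≤ ε), where ψ is 0 or 1 and an accepted state always has ψ = 1. The toy normal model, the crude restart used as a burn-in, and all settings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

y_obs = rng.normal(2.0, 1.0, size=100)            # toy observed data
s_obs = y_obs.mean()                              # summary statistic
eps = 0.1

def log_prior(theta):                             # assumed prior: N(0, 100)
    return -0.5 * theta**2 / 100.0

def psi(theta):
    x = rng.normal(theta, 1.0, size=100)          # simulate x ~ f(. | theta)
    return float(abs(x.mean() - s_obs) <= eps)    # uniform kernel K_eps(rho(y, x))

def mcmc_abc(n_iter, step=0.2):
    theta, k = 0.0, 0.0
    while k == 0.0:                               # crude burn-in: restart until a
        theta = rng.normal(s_obs, 1.0)            # state with psi = 1 is found
        k = psi(theta)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + step * rng.normal()        # symmetric proposal q, so the
        k_prop = psi(prop)                        # q terms cancel in r
        log_r = log_prior(prop) - log_prior(theta)
        if k_prop > 0.0 and np.log(rng.uniform()) < log_r:
            theta = prop                          # accept (psi_i = psi* = 1)
        chain[i] = theta
    return chain

chain = mcmc_abc(20000)
```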
MCMC ABC
Advantages
Still pretty simple
Tends to be more efficient than ABC rejection for single dataanalysis
Disadvantages
Slightly more difficult to implement than ABC rejection
Requires tuning of proposal distribution q
Can get ‘stuck’ in low probability regions (thus long runs are often required)
Must be re-run and tuned for each new dataset or summary statistic selection
MCMC ABC Example of “stickiness”
Figure: Plot taken from Sisson and Fan 2011. From bottom to top, ε is 4.5, 4, 3.5, 3.
Variants on MCMC ABC
Bortot et al 2007 - include ε as a ‘parameter’ (every now and then propose a larger value of ε)
Baragatti et al 2013 - population MCMC (parallel tempering, swapping chains with different ε)
Picchini et al 2013 - early rejection (when using a uniform kernel, may reject a proposal without simulating data)
Aandahl et al 2014 - multiple-try MCMC ABC
Sequential Monte Carlo (SMC) ABC
SMC is for sampling from a sequence of target distributions
For ABC it is natural to define the sequence in terms of a non-increasing set of tolerances ε_1 ≥ ε_2 ≥ · · · ≥ ε_T:

    π_t(θ, x|y, ε_t) ∝ f(x|θ) π(θ) 1(ρ(y, x) ≤ ε_t), for t = 1, …, T

Traverse a set of N ‘particles’ through the sequence of targets by iteratively applying re-weighting (importance sampling), re-sampling and mutation steps
Ultimately obtain a set of weighted samples from π_T
A nice accessible reference for SMC is Chopin (2002) (see also Del Moral et al (2006))
SMC ABC Algorithm of a few papers
1: Initialise ε_1, …, ε_T and specify the initial importance sampling distribution π_0(·)
2: for t = 1 to T do
3:   for i = 1 to N do
4:     If t = 1, sample θ** from π_0(·). If t > 1, sample θ* from the previous population {θ_{t-1}^i, W_{t-1}^i}_{i=1}^N and perturb the particle: θ** ~ K_t(·|θ*).
5:     Generate a dataset x** ~ f(·|θ**).
6:     If ρ(y, x**) > ε_t then go back to step 4.
7:     Set θ_t^i = θ** and re-weight:

         w_t^i = π(θ_t^i) / π_0(θ_t^i)                                    if t = 1
         w_t^i = π(θ_t^i) / Σ_{j=1}^N W_{t-1}^j K_t(θ_t^i | θ_{t-1}^j)    if t > 1

8:   end for
9:   Normalise the weights: W_t^i = w_t^i / Σ_{j=1}^N w_t^j for i = 1, …, N.
10:  Update the tuning parameters of K_{t+1} using the set of particles {θ_t^i, W_t^i}_{i=1}^N.
11: end for
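The listing above can be sketched as follows for a toy normal model with the prior as π_0 and a fixed-scale Gaussian perturbation kernel K_t. The model, tolerance sequence, and kernel scale are illustrative assumptions (in practice K_t would be tuned adaptively, as in step 10):

```python
import numpy as np

rng = np.random.default_rng(6)

y_obs = rng.normal(1.0, 1.0, size=100)            # toy observed data
s_obs = y_obs.mean()                              # summary statistic

def rho(theta):
    x = rng.normal(theta, 1.0, size=100)          # simulate a dataset
    return abs(x.mean() - s_obs)                  # discrepancy rho(y, x)

def smc_abc(N=500, eps_seq=(1.0, 0.5, 0.2), tau=0.5):
    prior_var = 10.0                              # pi(theta) = N(0, 10)
    theta = np.zeros(N)
    W = np.full(N, 1.0 / N)
    for t, eps in enumerate(eps_seq):
        new_theta = np.empty(N)
        w = np.empty(N)
        for i in range(N):
            while True:                           # step 6: repeat until accepted
                if t == 0:
                    cand = rng.normal(0.0, np.sqrt(prior_var))  # from pi_0 = prior
                else:
                    j = rng.choice(N, p=W)        # resample a particle
                    cand = rng.normal(theta[j], tau)  # perturb with K_t
                if rho(cand) <= eps:
                    break
            new_theta[i] = cand
            if t == 0:
                w[i] = 1.0                        # pi / pi_0 = 1 here
            else:                                 # step 7: importance weight
                prior = np.exp(-0.5 * cand**2 / prior_var)
                kern = np.exp(-0.5 * (cand - theta) ** 2 / tau**2)
                w[i] = prior / np.sum(W * kern)
        theta, W = new_theta, w / w.sum()         # step 9: normalise the weights
    return theta, W

theta, W = smc_abc()
```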
SMC ABC
Advantages
Advantages of SMC over MCMC (deals better with multi-modal posteriors, easy adaptation of proposal)
Tends to be more efficient than ABC rejection and MCMC ABC (from the adaptive proposal)
Disadvantages
Not as simple to implement as ABC rejection and MCMC ABC
Must be re-run and tuned for each new dataset or summary statistic selection
Requires specification of sequence of tolerances
SMC ABC of Drovandi et al 2011
1: Set N_a as the integer part of αN
2: Perform the rejection sampling algorithm with ε_1. This produces a set of particles {θ^i, ρ^i}_{i=1}^N
3: Sort the particle set by ρ, so that ρ_1 ≤ ρ_2 ≤ · · · ≤ ρ_N, and set ε_t = ρ_{N-N_a} and ε_max = ρ_N. If ε_max ≤ ε_T then finish, otherwise go to 4
4: Compute the tuning parameters of the MCMC kernel q_t(·|·) using the particle set {θ^i}_{i=1}^{N-N_a}
5: for j = N - N_a + 1 to N do
6:   Resample θ^j from {θ^i}_{i=1}^{N-N_a}
7:   Perform R_t iterations of MCMC ABC using q_t and ε_t to update θ^j
8: end for
9: Compute R_t based on the overall MCMC acceptance rate of the previous iteration and go to 3

See also Del Moral et al 2012 for a similar algorithm
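The replenishment algorithm above can be sketched on a toy normal model. This is not the author's code: R_t is held fixed rather than adapted from the acceptance rate, the kernel tuning is a simple two-standard-deviations step, the outer loop is bounded for safety, and the model and all settings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

y_obs = rng.normal(0.5, 1.0, size=100)            # toy observed data
s_obs = y_obs.mean()                              # summary statistic

def simulate_rho(theta):
    x = rng.normal(theta, 1.0, size=100)          # simulate a dataset
    return abs(x.mean() - s_obs)                  # discrepancy rho(y, x)

def replenishment_smc(N=500, alpha=0.5, eps_T=0.15, R=10):
    Na = int(alpha * N)                           # step 1: number to replenish
    theta = rng.normal(0.0, np.sqrt(10.0), size=N)  # step 2: N(0, 10) prior draws
    rho = np.array([simulate_rho(t) for t in theta])
    for _ in range(50):                           # bounded outer loop (safety cap)
        order = np.argsort(rho)                   # step 3: sort particles by rho
        theta, rho = theta[order], rho[order]
        if rho[-1] <= eps_T:                      # eps_max <= eps_T: finish
            break
        eps_t = rho[N - Na - 1]                   # adaptive tolerance
        step = 2.0 * theta[: N - Na].std()        # step 4: crude tuning of q_t
        for j in range(N - Na, N):                # steps 5-8: replenish worst Na
            k = rng.integers(0, N - Na)           # resample a kept particle
            theta[j], rho[j] = theta[k], rho[k]
            for _ in range(R):                    # R_t MCMC ABC iterations
                prop = theta[j] + step * rng.normal()
                rho_prop = simulate_rho(prop)
                log_r = (theta[j] ** 2 - prop**2) * 0.05  # N(0, 10) prior ratio
                if rho_prop <= eps_t and np.log(rng.uniform()) < log_r:
                    theta[j], rho[j] = prop, rho_prop
    return theta

samples = replenishment_smc()
```

Unmoved particles remain as duplicates of kept particles, which illustrates the duplication issue noted below.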
Movie
It’s movie time!
SMC ABC of Drovandi et al 2011
Advantages
Same advantages as the previous algorithm, but in addition it does not require specifying a sequence of tolerances, removes the weight calculation, and has a stopping rule
Disadvantages
Same disadvantages as the previous algorithm, except that it does not require a sequence of tolerances; however, it uses MCMC ABC, so not all particles get moved (some duplication is inevitable)
Other Variants on SMC ABC
Silk et al (2013) - a different way to determine an adaptive sequence of tolerances
Filippi et al (2013) - determine an optimal proposal distribution for SMC ABC
Probably a few other papers out there...
ABC Software Packages
R Packages (ABC, EasyABC)
DIYABC, PopABC (ABC for population genetics models)
ABC-SysBio (Python package with GPU support(?))
ABCtoolbox
See the Wiki page: http://en.wikipedia.org/wiki/Approximate_Bayesian_computation
Closing Remarks on ABC Algorithms
ABC Algorithms generally easy to implement
MCMC and SMC versions tend to be more efficient, but ABC rejection still has its charm
Generally still quite computationally intensive
Regression adjustment (see Beaumont et al 2002) can be applied to the output of any ABC algorithm
Many other Likelihood-free algorithms out there...
References
Beaumont et al (2002). Approximate Bayesian Computation in Population Genetics. Genetics.
Marjoram et al (2003). Markov chain Monte Carlo without likelihoods. PNAS.
Sisson et al (2007). Sequential Monte Carlo without likelihoods. PNAS.
Drovandi and Pettitt (2011). Estimation of parameters for macroparasite population evolution using approximate Bayesian computation. Biometrics.
Del Moral et al (2012). An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics and Computing.
Drovandi and Pettitt (2013). Bayesian Experimental Design for Models with Intractable Likelihoods. Biometrics.