Background Pre-computation Simulation Study Conclusion
Pre-computation for ABC in image analysis
Matt Moores 1,2   Kerrie Mengersen 1,2   Christian Robert 3,4
1 Mathematical Sciences School, Queensland University of Technology, Brisbane, Australia
2 Institute for Health and Biomedical Innovation, QUT Kelvin Grove
3 CEREMADE, Université Paris Dauphine, France
4CREST, INSEE, France
MCMSki IV, Chamonix 2014
Outline
1 Background
  Approximate Bayesian Computation (ABC)
  Sequential Monte Carlo (SMC-ABC)
  Hidden Potts model
2 Pre-computation
3 Simulation Study
Background
Image analysis often involves:
Large datasets, with millions of pixels
Multiple images with similar characteristics
For example: satellite remote sensing (Landsat), computed tomography (CT)
Table: Scale of common types of images

Number of pixels | Landsat (900 m²/px) | CT slices (512×512)
2⁶               | 0.06 km²            | …
5⁶               | 14.06 km²           | 0.1
10⁶              | 900.00 km²          | 3.81
15⁶              | 10251.56 km²        | 43.5
Approximate Bayesian Computation (ABC)
Algorithm 1 ABC rejection sampler
1: for all iterations t ∈ 1 … T do
2:   Draw independent proposal θ′ ∼ π(θ)
3:   Generate x ∼ f(·|θ′)
4:   if |ρ(x) − ρ(y)| < ε then
5:     set θ_t ← θ′
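The rejection sampler above can be sketched in a few lines. A minimal Python sketch, using a toy Gaussian model with the sample mean as summary statistic (not the Potts model from the talk); all function names are illustrative:

```python
import numpy as np

def abc_rejection(y, prior_sample, simulate, stat, eps, T=10000, rng=None):
    """ABC rejection sampler (Algorithm 1): keep theta' whenever the
    summary statistic of its pseudo-data falls within eps of the data's."""
    rng = np.random.default_rng(rng)
    rho_y = stat(y)
    accepted = []
    for _ in range(T):
        theta = prior_sample(rng)        # draw proposal from the prior
        x = simulate(theta, rng)         # generate pseudo-data
        if abs(stat(x) - rho_y) < eps:   # compare summary statistics
            accepted.append(theta)
    return np.array(accepted)

# Toy example: infer a Gaussian mean, summary statistic = sample mean.
rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=100)
post = abc_rejection(
    y,
    prior_sample=lambda r: r.uniform(-5, 5),
    simulate=lambda th, r: r.normal(th, 1.0, size=100),
    stat=np.mean,
    eps=0.1,
)
print(len(post), float(post.mean()))
```

Note that every proposal requires a fresh pseudo-dataset, which is exactly the cost that dominates for image models.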
Adaptive ABC using Sequential Monte Carlo (SMC-ABC)
Algorithm 2 SMC-ABC
1: Draw N particles θ′_i ∼ π(θ)
2: Generate pseudo-data x_{i,m} ∼ f(·|θ′_i)
3: repeat
4:   Adaptively select ABC tolerance ε_t
5:   Update importance weights ω_i for each particle
6:   if effective sample size (ESS) < N_min then
7:     Resample particles according to their weights
8:   end if
9:   Update particles using random walk proposal (with adaptive RWMH bandwidth σ²_t)
10: until n_accept/N < 0.015 or ε_t = 0
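The ESS check and resampling step (lines 6–8) can be sketched as follows; the particle values, the Dirichlet stand-in for the importance weights, and the threshold of 0.75N are illustrative assumptions, not values from the talk:

```python
import numpy as np

def ess(w):
    """Effective sample size of normalised importance weights."""
    return 1.0 / np.sum(w ** 2)

def resample(particles, w, rng):
    """Multinomial resampling: draw N indices proportional to the weights,
    then reset to uniform weights."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = rng.normal(size=1000)
w = rng.dirichlet(np.ones(1000))  # stand-in for ABC importance weights
if ess(w) < 750:                  # illustrative N_min = 0.75 * N
    particles, w = resample(particles, w, rng)
print(round(ess(w)))
```

After resampling the weights are uniform, so the ESS returns to N.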
Del Moral, Doucet & Jasra (2012) Stat. Comput. 22(5)
Liu (2001) Monte Carlo Strategies in Scientific Computing. New York: Springer
Motivation
Computational cost is dominated by simulation of pseudo-data
e.g. the hidden Potts model in image analysis (Grelaud et al. 2009; Everitt 2012)
Model fitting with ABC can be separated into:
Learning about the summary statistic, given the parameter: ρ(x) | θ
Choosing parameter values, given a summary statistic: θ | ρ(y)
For latent models, there is an additional step of learning about the summary statistic, given the data: ρ(z) | y, θ
Potts (1952) Proceedings of the Cambridge Philosophical Society 48(1)
Inverse Temperature
Doubly-intractable likelihood
p(β | z) = C(β)⁻¹ π(β) exp{β S(z)}    (4)

The normalising constant of the Potts model has computational complexity of O(n²kⁿ), since it involves a sum over all possible combinations of the labels z ∈ Z:

C(β) = Σ_{z ∈ Z} exp{β S(z)}    (5)

S(z) is the sufficient statistic of the Potts model:

S(z) = Σ_{i∼ℓ ∈ L} δ(z_i, z_ℓ)    (6)

where L is the set of all unique neighbour pairs and δ(·,·) is the Kronecker delta function.
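To make the intractability concrete, here is a sketch that computes S(z) on a 2D lattice with a first-order neighbourhood and evaluates C(β) by brute-force enumeration; the function names are illustrative, and the enumeration is only feasible for toy lattices, which is exactly the point:

```python
import itertools
import math
import numpy as np

def potts_stat(z):
    """S(z): count of unique neighbour pairs i~l with delta(z_i, z_l) = 1,
    i.e. matching labels, on a 2D lattice (first-order neighbourhood)."""
    z = np.asarray(z)
    horiz = np.sum(z[:, :-1] == z[:, 1:])  # east-west neighbour pairs
    vert = np.sum(z[:-1, :] == z[1:, :])   # north-south neighbour pairs
    return int(horiz + vert)

def potts_norm(beta, rows, cols, k):
    """Brute-force C(beta): sum exp(beta * S(z)) over all k^n labelings."""
    total = 0.0
    for labels in itertools.product(range(k), repeat=rows * cols):
        z = np.reshape(labels, (rows, cols))
        total += math.exp(beta * potts_stat(z))
    return total

# A 2x2 lattice with k = 2 already has 2^4 = 16 labelings and 4 pairs.
print(potts_stat([[0, 0], [0, 0]]))     # all labels equal: S(z) = 4
print(round(potts_norm(0.0, 2, 2, 2)))  # beta = 0: C = k^n = 16
```

Already at n = 5×5 with k = 3 the sum has 3²⁵ ≈ 8.5 × 10¹¹ terms, so for images with millions of pixels C(β) is hopeless to compute directly.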
Pre-computation
The distribution of ρ(x) | θ is independent of the data.
By simulating pseudo-data for values of θ, we can create a mapping function f(θ) to approximate E[ρ(x) | θ]. This mapping function can be reused across multiple datasets, amortising its computational cost.
By mapping directly from θ → ρ(x), we avoid the need to simulate pseudo-data during model fitting.
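The idea can be sketched with a cheap toy simulator standing in for the Potts model (here a Poisson model with the sample mean as summary statistic; the grid sizes and names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline step (done once): estimate E[rho(x) | theta] on a grid of theta
# values by Monte Carlo simulation of pseudo-data.
theta_grid = np.linspace(0.1, 10.0, 50)
f_grid = np.array([
    rng.poisson(th, size=200).mean()  # 200 pseudo-data draws per grid point
    for th in theta_grid
])

def f(theta):
    """Precomputed mapping theta -> E[rho(x) | theta], by interpolation.
    Reused across datasets, so the offline cost is amortised."""
    return np.interp(theta, theta_grid, f_grid)

# Online step (per dataset, per particle): no simulation, just a lookup.
print(float(f(5.0)))
```

All of the expensive simulation happens before any dataset is seen; model fitting only ever evaluates the interpolant.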
Sufficient statistic of the Potts model
[Figure: Distribution of S(z) | β for n = 5⁶, k = 3, over β ∈ [0, 3]. (a) E(S(z) | β), rising from roughly 10000 to 30000; (b) σ(S(z) | β), on a scale of 0 to 250.]
Scalable SMC-ABC for the hidden Potts model
Algorithm 3 SMC-ABC using precomputed f(β)
1: Draw N particles β′_i ∼ π0(β)
2: Approximate sufficient statistics S(x_{i,m}) ≈ f(β′_i)
3: repeat
4:   Update S(z_t) | y, π_t(β)
5:   Adaptively select ABC tolerance ε_t
6:   Update importance weights ω_i for each particle
7:   if effective sample size (ESS) < N_min then
8:     Resample particles according to their weights
9:   end if
10:  Update particles using random walk proposal (with adaptive RWMH bandwidth σ²_t)
11: until n_accept/N < 0.015 or ε_t < 10⁻⁹ or t ≥ 100
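Step 2 can be sketched as an interpolation lookup; the grid values below are made up for illustration, and drawing S(x) | β as a Gaussian with the precomputed mean and heteroskedastic standard deviation is one plausible way to use the mapping, not necessarily the talk's exact construction:

```python
import numpy as np

# Hypothetical precomputed grid of the mean and std of S(z) | beta;
# the functional forms here are invented purely for illustration.
beta_grid = np.linspace(0.0, 3.0, 31)
mean_grid = 10000 + 20000 / (1 + np.exp(-5 * (beta_grid - 1.0)))
sd_grid = 250 * np.exp(-((beta_grid - 1.0) ** 2))

def approx_stat(beta, rng):
    """Approximate S(x) | beta without simulating pseudo-data:
    interpolate the precomputed mean and add Gaussian noise with the
    precomputed (beta-dependent) standard deviation."""
    mu = np.interp(beta, beta_grid, mean_grid)
    sd = np.interp(beta, beta_grid, sd_grid)
    return rng.normal(mu, sd)

rng = np.random.default_rng(0)
print(round(approx_stat(1.0, rng)))
```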
Simulation Study
20 images, n = 125× 125, k = 3:
β ∼ U(0, 1.005)
z ∼ f(·|β) using 2000 iterations of Swendsen-Wang
μ_j ∼ N(0, 100²)
1/σ²_j ∼ Γ(1, 100)
Comparison of 2 ABC algorithms:
Scalable SMC-ABC using precomputed f(β)
Standard SMC-ABC using 500 iterations of Gibbs sampling
Swendsen & Wang (1987) Physical Review Letters 58
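The simulation step of the standard algorithm can be sketched with a single-site Gibbs sampler for the Potts model (a minimal, unoptimised version; the 20×20 lattice, β value, and 50 sweeps are illustrative, not the study's 125×125 images or 500 iterations):

```python
import numpy as np

def gibbs_potts(beta, k, shape, iters, rng=None):
    """Single-site Gibbs sampler for the Potts model: each pixel is drawn
    from its full conditional p(z_i = j) proportional to exp(beta * n_j),
    where n_j counts first-order neighbours currently labelled j."""
    rng = np.random.default_rng(rng)
    rows, cols = shape
    z = rng.integers(k, size=shape)  # random initial labelling
    for _ in range(iters):
        for i in range(rows):
            for j in range(cols):
                counts = np.zeros(k)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        counts[z[ni, nj]] += 1
                p = np.exp(beta * counts)
                z[i, j] = rng.choice(k, p=p / p.sum())
    return z

z = gibbs_potts(beta=1.0, k=3, shape=(20, 20), iters=50, rng=0)
# Sufficient statistic of the sampled labelling (matching neighbour pairs).
S = int(np.sum(z[:, :-1] == z[:, 1:]) + np.sum(z[:-1, :] == z[1:, :]))
print(S)
```

Doing this for every particle at every SMC iteration is the cost that pre-computation removes; Swendsen-Wang replaces the single-site updates with cluster flips, which mix much better near the critical temperature.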
Accuracy of posterior estimates for β
[Figure: posterior distributions of β (both axes from 0 to 1) for (a) the pseudo-data algorithm and (b) the pre-computed algorithm.]
Distribution of posterior sampling error for β
[Figure: boxplots of error (0.0 to 0.6) by algorithm: pseudo-data vs. pre-computed.]
Improvement in runtime
[Figure: runtime by algorithm (pseudo-data vs. pre-computed): (a) elapsed (wall clock) time, on a log scale from 0.5 to 100 hours; (b) CPU time, on a log scale from 5 to 1000 hours.]
Summary
Scalability of SMC-ABC can be improved by pre-computing an approximate mapping θ → ρ(x)
Pre-computation took 8 minutes on a 16 core Xeon server
Average runtime for SMC-ABC improved from 74.4 hours to 39 minutes
The mapping function represents the nonlinear, heteroskedasticrelationship between the parameter and the summary statistic.
This method could be extended to multivariate applications, suchas estimating both β and k for the hidden Potts model.
Appendix
Acknowledgements
I gratefully acknowledge the financial support received from:
Mathematical Sciences School, Queensland University of Technology, Brisbane, Australia
Institute for Health and Biomedical Innovation, QUT
Bayesian section of the American Statistical Association
International Society for Bayesian Analysis
BayesComp section of ISBA
CEREMADE, Université Paris Dauphine, France
Department of Economics, University of Warwick, UK
Computational resources and services used in this work were provided by the HPC and Research Support Group, QUT.
Appendix
For Further Reading I
Jun S. Liu. Monte Carlo Strategies in Scientific Computing. Springer-Verlag, 2001.

Pierre Del Moral, Arnaud Doucet & Ajay Jasra. An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics & Computing, 22(5): 1009–20, 2012.

Richard Everitt. Bayesian parameter estimation for latent Markov random fields and social networks. J. Comput. Graph. Stat., 21(4): 940–60, 2012.

A. Grelaud, C. P. Robert, J.-M. Marin, F. Rodolphe & J.-F. Taly. ABC likelihood-free methods for model choice in Gibbs random fields. Bayesian Analysis, 4(2): 317–36, 2009.
Appendix
For Further Reading II
J.-M. Marin, P. Pudlo, C. P. Robert & R. J. Ryder. Approximate Bayesian computational methods. Statistics & Computing, 22(6): 1167–80, 2012.

J. K. Pritchard, M. T. Seielstad, A. Perez-Lezaun & M. W. Feldman. Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Mol. Biol. Evol., 16(12): 1791–8, 1999.

R. H. Swendsen & J.-S. Wang. Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58: 86–8, 1987.