Post on 28-May-2020
Parameter inference Model selection 2-3 examples
Approximate Bayesian Computation: algorithms, theory and applications
Michael G.B. Blum
Laboratoire TIMC-IMAG, UJF Grenoble, CNRS
MCEB, June 2012
“ABC is a ‘democratizing’ method in that it will attract, for example, biologists, who enjoy computer simulation but have little background in probability, into converting their favorite simulation into a tool for inference”
Beaumont and Rannala, Nat. Rev. Genet. 2004
[Diagram: different values of the parameter (random design) feed simulations that produce simulated DNA sequences; ABC compares these with the observed DNA sequences and returns the most probable values for the parameter.]
Parameter inference for ABC
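A coalescent example in population genetics
Estimating the mutation rate θ
[Figure: coalescent genealogy of four sequences A, B, C and D, with coalescence times T2, T3 and T4 back to the TMRCA and six mutations numbered 1 to 6 on the branches. Binary matrix of segregating sites as flattened in the source: 000100011000101000101011.]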
Number of segregating sites: S=6
Heterozygosity: H=3.16
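The two summary statistics on this slide can be recomputed from the binary matrix. The sketch below is illustrative only: it assumes the 24 flattened digits are to be read as 4 sequences (rows) by 6 sites (columns), which the source does not state, and it takes heterozygosity to mean the average number of pairwise differences. The function names are hypothetical.

```python
from itertools import combinations

# Flattened 0/1 matrix as it appears in the source (24 digits).
FLAT = "000100011000101000101011"
N_SEQ, N_SITES = 4, 6  # assumption: sequences in rows, sites in columns


def reshape(flat, n_seq, n_sites):
    """Cut the flattened string into one string per sequence."""
    return [flat[i * n_sites:(i + 1) * n_sites] for i in range(n_seq)]


def segregating_sites(seqs):
    """Number of columns in which at least two sequences differ."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def heterozygosity(seqs):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in pairs]
    return sum(diffs) / len(pairs)


seqs = reshape(FLAT, N_SEQ, N_SITES)
S = segregating_sites(seqs)
H = heterozygosity(seqs)
print(S, H)
```

Under this reading the code gives S = 6 and H = 19/6 ≈ 3.17, which agrees with the slide's S = 6 and, up to truncation, its H = 3.16.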
Two approximations in ABC
Replace full posterior p(θ|D) with partial posterior p(θ|sobs)
Nonparametric estimation of p(θ|sobs)
Rejection algorithm
Pritchard et al., MBE 1999
Simulate n values θi, i = 1, ..., n, from the prior π.
Simulate n (possibly multivariate) summary statistics si according to p(si|θi).
Consider the weighted sample (θi, Wi), i = 1, ..., n, where
$W_i = \begin{cases} 1 & \text{if } \|s_i - s_{obs}\| \le b, \\ 0 & \text{otherwise.} \end{cases}$
The parameter b is an acceptance threshold.
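The steps above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the talk: the toy model (20 draws from N(μ, 1)), the U(−5, 5) prior, the sample mean as summary statistic and the function name `abc_rejection` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 20 draws from N(mu, 1) with mu unknown.
y_obs = rng.normal(loc=2.0, scale=1.0, size=20)
s_obs = y_obs.mean()  # summary statistic: the sample mean


def abc_rejection(s_obs, n, b, rng):
    """Return the n prior draws, their summaries and the 0/1 weights W_i."""
    theta = rng.uniform(-5.0, 5.0, size=n)                 # theta_i from the (assumed) prior
    data = rng.normal(theta[:, None], 1.0, size=(n, 20))   # one simulated data set per theta_i
    s = data.mean(axis=1)                                  # s_i drawn from p(s | theta_i)
    w = (np.abs(s - s_obs) <= b).astype(float)             # W_i = 1 if |s_i - s_obs| <= b
    return theta, s, w


theta, s, w = abc_rejection(s_obs, n=50_000, b=0.05, rng=rng)
accepted = theta[w == 1]
print(len(accepted), accepted.mean(), accepted.std())
```

The accepted values of θ are a sample from the approximate posterior; shrinking b reduces the approximation error but also the number of accepted simulations.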
Rejection algorithm
[Figure: scatter plot of the model parameter θi against the summary statistic. Simulations whose summary statistic falls between −b and +b of the observed value S(y0) are accepted; the accepted θi form the posterior distribution.]
Regression adjustment
Beaumont et al., Genetics 2002
[Figure: scatter plot of the model parameter against the summary statistic. Within the window from −b to +b around sobs, each accepted θi is adjusted by regression to a value θi*, and the adjusted values form the posterior distribution.]
Linear regression adjustment
A model of local regression
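$\theta_i \mid s_i = m(s_i) + \varepsilon_i$
Local linear approximation
$m(s_i) = \alpha + s_i^{T} \beta$
Adjustment
$\theta_i^{*} = \hat{m}(s_{obs}) + \tilde{\varepsilon}_i$, where $\tilde{\varepsilon}_i = \theta_i - \hat{m}(s_i)$ are the empirical residuals of the regression.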
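The adjustment can be sketched as an ordinary least-squares fit on the accepted simulations. This is a simplified illustration, not the full method: the original approach also replaces the 0/1 weights with smooth kernel weights, which are omitted here, and the toy model (sample mean of 20 Gaussian draws, simulated directly as N(θ, 1/√20)), the prior and the function name `linear_adjustment` are assumptions.

```python
import numpy as np


def linear_adjustment(theta, s, s_obs):
    """Return theta*_i = m_hat(s_obs) + (theta_i - m_hat(s_i)) for accepted draws."""
    s = np.asarray(s, dtype=float).reshape(len(theta), -1)    # shape (n_accepted, d)
    s_obs = np.atleast_1d(np.asarray(s_obs, dtype=float))
    X = np.column_stack([np.ones(len(theta)), s - s_obs])     # design centred at s_obs
    coef, *_ = np.linalg.lstsq(X, theta, rcond=None)          # (alpha_hat, beta_hat)
    residuals = theta - X @ coef                              # empirical residuals
    return coef[0] + residuals                                # intercept equals m_hat(s_obs)


# Toy check with a deliberately wide acceptance window (b = 0.5).
rng = np.random.default_rng(1)
s_obs = 2.0
theta = rng.uniform(-5.0, 5.0, size=50_000)       # assumed prior U(-5, 5)
s = rng.normal(theta, 1.0 / np.sqrt(20))          # sample mean of 20 draws from N(theta, 1)
keep = np.abs(s - s_obs) <= 0.5
theta_adj = linear_adjustment(theta[keep], s[keep], s_obs)
print(theta[keep].std(), theta_adj.std())
```

In this toy setting the unadjusted sample is visibly too wide for such a large b, while the adjusted sample has a spread close to the exact posterior standard deviation 1/√20 ≈ 0.22.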
Main theorem
Blum, JASA 2010
Asymptotic bias of the estimated posterior mean (j = 0: rejection, j = 1: linear adjustment)
$C_{1,j}\, b^{2}$
Asymptotic variance
$\dfrac{C_3}{n\, b^{d}}$
d is the number of summary statistics and n is the number of simulations.
This result overemphasizes the curse of dimensionality: empirical evidence is much more optimistic.
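To see where the curse of dimensionality enters, one can combine the two rates into a mean squared error and optimize over b. The derivation below is a sketch that assumes the leading terms above are exact and that $C_{1,j} \neq 0$; it is a standard bias-variance calculation, not a statement taken from the slides.

```latex
\mathrm{MSE}_j(b) \approx C_{1,j}^{2}\, b^{4} + \frac{C_3}{n\, b^{d}},
\qquad
\frac{\partial\, \mathrm{MSE}_j}{\partial b}
  = 4\, C_{1,j}^{2}\, b^{3} - \frac{d\, C_3}{n\, b^{d+1}} = 0
\;\Longrightarrow\;
b^{\star} = \left( \frac{d\, C_3}{4\, C_{1,j}^{2}\, n} \right)^{1/(d+4)}
  \propto n^{-1/(d+4)},
\qquad
\mathrm{MSE}_j(b^{\star}) \propto n^{-4/(d+4)} .
```

The exponent 4/(d+4) shrinks as d grows, so under these rates each additional summary statistic slows convergence in the number of simulations n.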
Comparison between the two estimators with adjustment
When the model
$\theta_i = m(s_i) + \varepsilon_i$
is homoscedastic in the vicinity of sobs, the bias for the estimator with quadratic adjustment is
$o(b^{2})$.
Transformations of the summary statistics to make the model as homoscedastic as possible.
Non-linear adjustment.
Non-homoscedastic adjustment.
Regression adjustment for the mean and the variance
Blum and François, Stat and Comput 2010
[Figure: scatter plot of the model parameter against the summary statistic, with the acceptance window from −b to +b around S(y0). Accepted values θi are adjusted to θi*, which form the posterior distribution.]
ABC with and without adjustment
Estimating the mean in a Gaussian sample
[Figure: two panels showing density against μ, one with no regression adjustment and one with regression adjustment, each compared with the true posterior.]
How to check that ABC works when you do not know the posterior
Cook et al. 2006 J. Comp. Graph. Stat.
Take a pair (θi, si) drawn from π(θi)p(si|θi).
Perform ABC with sobs = si.
Compute the proportion pi of posterior samples smaller than θi.
If the algorithm provides samples from p(θi|si), pi should be uniformly distributed.
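The check can be sketched on a toy model. Everything specific here is an assumption made for illustration: the N(0, 2²) prior, the sample mean of 20 Gaussian draws as summary statistic (simulated directly), a single reference table shared by all test pairs, and a threshold b set implicitly by keeping the K nearest simulations instead of a fixed distance.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 20           # observations per data set (assumed)
N_SIM = 20_000   # size of the reference table
N_TEST = 200     # number of (theta_i, s_i) pairs used for the check
K = 200          # accepted simulations per ABC run

# Reference table drawn from pi(theta) p(s | theta).
theta = rng.normal(0.0, 2.0, size=N_SIM)     # assumed prior N(0, 2^2)
s = rng.normal(theta, 1.0 / np.sqrt(M))      # sample mean of M draws from N(theta, 1)

p = np.empty(N_TEST)
for i in range(N_TEST):
    dist = np.abs(s - s[i])                  # ABC with s_obs = s_i
    dist[i] = np.inf                         # leave the test pair itself out
    nearest = np.argpartition(dist, K)[:K]   # indices of the K closest simulations
    p[i] = np.mean(theta[nearest] < theta[i])

hist, _ = np.histogram(p, bins=10, range=(0.0, 1.0))
print(hist)
```

A roughly flat histogram of the pi supports the ABC approximation; a histogram piled up around 0.5 suggests that the approximate posterior is too wide.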
[Figure: histograms of the frequency of pi over the interval from 0 to 1, one panel with no regression adjustment and one panel with regression adjustment.]
Adaptive ABC
Sisson et al., PNAS 2007; Beaumont et al., Biometrika 2009; Del Moral et al., Stat and Comput 2011
Multi-step algorithms that sample θ from updated distributions that get closer and closer to the posterior distribution.
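One way to picture such a multi-step scheme is a population Monte Carlo loop with decreasing thresholds. The sketch below is in the spirit of that family, not a reproduction of any of the cited algorithms: the toy model, the N(0, 5²) prior, the tolerance schedule, the Gaussian perturbation kernel (variance set to twice the weighted variance of the previous population) and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
M, S_OBS = 20, 2.0                  # assumed toy setting: mean of M draws from N(theta, 1)
PRIOR_SD = 5.0                      # assumed prior: theta ~ N(0, 5^2)
TOLERANCES = [1.0, 0.5, 0.2, 0.1]   # decreasing thresholds b_1 > ... > b_T
N = 300                             # particles per step


def simulate(theta):
    """Simulate one data set and return its summary statistic."""
    return rng.normal(theta, 1.0, size=M).mean()


def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


particles, weights = None, None
for t, b in enumerate(TOLERANCES):
    if t > 0:
        mean_prev = np.sum(weights * particles)
        tau = np.sqrt(2.0 * np.sum(weights * (particles - mean_prev) ** 2))
    new_particles = np.empty(N)
    for i in range(N):
        while True:
            if t == 0:
                cand = rng.normal(0.0, PRIOR_SD)              # first step: draw from the prior
            else:
                centre = rng.choice(particles, p=weights)     # resample a previous particle
                cand = rng.normal(centre, tau)                # and perturb it
            if abs(simulate(cand) - S_OBS) <= b:
                break
        new_particles[i] = cand
    if t == 0:
        new_weights = np.full(N, 1.0 / N)
    else:
        # Importance weight: prior density divided by the proposal mixture density.
        proposal = np.array(
            [np.sum(weights * normal_pdf(x, particles, tau)) for x in new_particles]
        )
        new_weights = normal_pdf(new_particles, 0.0, PRIOR_SD) / proposal
        new_weights /= new_weights.sum()
    particles, weights = new_particles, new_weights

post_mean = np.sum(weights * particles)
print(post_mean)
```

Each step proposes from the previous weighted population instead of the prior, so far fewer simulations are wasted at the smallest threshold than in plain rejection sampling.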
Model selection and related criticisms
Distinguishing between models
An example in human evolution
Fagundes et al., PNAS 2007
Rejection algorithm
Pritchard et al., MBE 1999
Simulate the same number of simulations under each model Mk, k = 1, ..., K.
Accept the simulations for which ‖si − sobs‖ ≤ b.
The proportion of accepted simulations under each model k = 1, ..., K is an estimate of the posterior probability p(Mk|sobs).
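As a small illustration, the procedure can be run on the coin-tossing setup that appears later in these slides (M0: q = 0.5; M1: q ~ U(0, 1); 9 heads out of 20 tosses). The choices below are assumptions for the sketch: the number of heads is the summary statistic, the threshold is b = 0 (exact match), and simulating the same number of data sets per model amounts to equal prior model probabilities.

```python
import numpy as np

rng = np.random.default_rng(4)
N_PER_MODEL = 200_000
N_TOSSES, S_OBS, B = 20, 9, 0   # b = 0: exact match on the number of heads

# Model 0: fair coin. Model 1: q drawn from U(0, 1) for each simulation.
s0 = rng.binomial(N_TOSSES, 0.5, size=N_PER_MODEL)
q1 = rng.uniform(0.0, 1.0, size=N_PER_MODEL)
s1 = rng.binomial(N_TOSSES, q1)

# Accept the simulations whose summary statistic is within b of the observation.
acc0 = np.sum(np.abs(s0 - S_OBS) <= B)
acc1 = np.sum(np.abs(s1 - S_OBS) <= B)

# Proportion of accepted simulations coming from each model.
post_m0 = acc0 / (acc0 + acc1)
post_m1 = acc1 / (acc0 + acc1)
print(post_m0, post_m1)
```

The estimate for M0 should land near 0.77, in line with the ratio of about 3.36 quoted for this example.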
Model selection with logistic regression
Beaumont 2008
[Figure: posterior probability of model 1 (0 to 1) against the summary statistic (0 to 1); simulations from Model 0 and Model 1 are shown within the window from −b to +b around S(y0).]
Criticisms about model selection with ABC
Templeton, PNAS 2010
‘The probability of the nested special case must be less than or equal to the probability of the general model within which the special case is nested’.
In a proper Bayesian framework, two hypotheses represent non-nested events.
Coin tossing example: M0: q = 0.5; M1: q ~ U(0,1); 9 heads out of 20 tosses.
$p(M_0 \mid 9/20) \approx 3.36\, p(M_1 \mid 9/20)$
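The factor 3.36 can be checked by hand. The sketch below assumes equal prior probabilities for the two models (not stated on the slide) and uses the fact that, under a uniform prior on q, every number of heads from 0 to n has marginal probability 1/(n + 1).

```python
from math import comb

n, k = 20, 9
marginal_m0 = comb(n, k) * 0.5 ** n   # p(9/20 | M0): binomial probability with q = 0.5
marginal_m1 = 1.0 / (n + 1)           # p(9/20 | M1): binomial averaged over q ~ U(0, 1)
bayes_factor = marginal_m0 / marginal_m1
post_m0 = bayes_factor / (1.0 + bayes_factor)   # equal prior model probabilities assumed
print(round(bayes_factor, 2), round(post_m0, 3))   # prints 3.36 0.771
```

The point model M0 is favoured even though it is a special case of M1, because M1 spreads its prior mass over values of q that predict the data poorly.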
Criticisms about model selection with ABC
Robert et al., PNAS 2011
‘The algorithm involves an unknown loss of information induced by the use of insufficient summary statistics’.
Assume that s is sufficient for parameter inference in model 0 and model 1. Then
$\dfrac{p(M_0 \mid s_{obs})}{p(M_1 \mid s_{obs})} = g(D)\, \dfrac{p(M_0 \mid D)}{p(M_1 \mid D)}$,
with g(D) possibly different from 1.
Some elements of an answer if your NSF reviewer keeps bothering you with this.
The criticism pertains to situations where s is sufficient for parameter inference in model 0 and model 1. Exercise: for coalescent models, try to find one for a constant-size population model and a bottleneck model.
In approximate Bayesian computation, we target p(M|s) instead of p(M|D).
An important question to address might be: ‘Does s contain enough information to distinguish between models?’
Model selection with ABC: the right answer?
Fagundes et al., PNAS 2007
A deviance criterion for model selection with ABC
François and Laval, SAGMB 2011
Estimators of p(M|s) ignore regression adjustments on parameter samples.
DIC = E_post[deviance] + effective number of parameters
2-3 examples
Example 1: Models of origins for modern humans
Blum and Jakobsson, MBE 2011
[Figure: TMRCA distributions, density against millions of years (0 to 3). Panel A: autosomal genes. Panel B: X chromosome. Curves: data, single-origin model, low admixture with archaic humans.]
Models of origins for modern humans
[Figure: panel C, TMRCA density against millions of years (0 to 1) for mtDNA and the Y chromosome. Legend: single-origin model, low admixture with archaic humans, bottleneck.]
Example 1: Testing the human ‘speciation’ bottleneck
Sjödin et al., MBE 2012
Lahr and Foley 1998
Estimating the strength of the bottleneck
[Figure: densities of the magnitude of reduction b (−1.5 to 0; equivalently a ratio of population sizes NA/NB from 30 to 1) for San, Biaka and Mandenka, together with the prior.]
Support for a ‘no-bottleneck’ model against 2 bottleneck models
Pr(no bottleneck | sobs) ≥ 79%
[Figure: schematics of the no-bottleneck model and of two bottleneck models (founder hypothesis and fragmentation hypothesis), with population sizes N0, NA and NB, times Tgr and Tdur = 20-60 kya, and 130 kya marked; maps show the maximum expansion of the Sahara and the SW African desert.]
Is it possible to distinguish between models?
[Figure: proportion of assignment (0 to 1) to the no-bottleneck, founder and fragmentation models against the magnitude of reduction b (−1.5 to 0; ratio of population sizes NA/NB from 30 to 1). Panels: A. No bottleneck, B. Founder, C. Fragmentation.]
Goodness of fit
Posterior predictive checks + PCA
[Figure: PC2 against PC1 (both roughly −6 to 6) for San, Biaka and Mandenka under the no-bottleneck, founder-bottleneck and fragmentation-bottleneck models, with High Mut. and Low Mut. settings.]
Example 2: Species delimitation with ABC
Camargo et al., Evolution 2012
[Figure: diagram with lineages A, B and C and the grouping (A,B); the remaining labels are truncated in the source.]
Gene trees of loci sampled for species delimitation analyses
Prior predictive checks
Example 3: Fitting models of continuous trait evolution
Slater et al., Evolution 2012
Time-calibrated phylogeny of Carnivora used to estimate rates of trait evolution
Comparing a two-rate model M2 to a one-rate model M1
If p(M1|s) = x%, there is a probability of x% that s was generated from M1 (and (100 − x)% that s was generated from M2).
This is the check of Cook et al. 2006 in a model selection framework.
Checking the ‘consistency’ of the Bayes factor
Comparing pinniped and terrestrial carnivore body size evolutionary rates
Conclusion
ABC incorporates all aspects of Bayesian data analysis: formulation, fitting and model selection, and improvement of a model through model checking (Csilléry et al., TREE 2010).
To address issues related to model selection:
1. Ability to distinguish between models
2. ‘Consistency’ of the Bayes factor
3. Buy the reviewer a bottle of wine
The R package abc implements several ABC algorithms.
Colleagues