Efficient MCMC for cosmological parameters
Antony Lewis, Institute of Astronomy, Cambridge (http://cosmologist.info/)
Collaborator: Sarah Bridle
CosmoMC: http://cosmologist.info/cosmomc
Lewis & Bridle: astro-ph/0205436; notes: http://cosmologist.info/notes/cosmomc.ps.gz
Introduction

Data: CMB polarization, baryon oscillations, weak lensing, galaxy power spectrum, cluster gas fraction, Lyman alpha, etc.
Data + priors → cosmological parameters
(Figure credit: WMAP Science Team)
Bayesian parameter estimation
• Can compute P({θ} | data) using e.g. the assumption of Gaussianity of the CMB field and priors on the parameters
• Often want marginalized constraints, e.g.
  P(θ₁ | data) = ∫ dθ₂ dθ₃ ⋯ dθₙ P(θ₁, θ₂, …, θₙ | data)
• BUT: large n-dimensional integrals are very hard to compute!
• If we instead sample from P({θ} | data) then it is easy:
  P(θ₁ | data) ≈ (1/N) Σᵢ₌₁ᴺ δ(θ₁ − θ₁⁽ⁱ⁾)
• Generate samples from P
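As a sketch of why sampling makes marginalization trivial: given samples from the joint posterior, the marginal distribution of θ₁ is obtained by simply ignoring the other columns. The toy 3D Gaussian below is an assumed stand-in for real MCMC output.

```python
import numpy as np

# Toy stand-in for MCMC output: 100k samples from a 3D correlated Gaussian
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 2.0, 0.3],
                [0.2, 0.3, 1.5]])
samples = rng.multivariate_normal(np.zeros(3), cov, size=100_000)

# Marginalizing over theta_2, theta_3 = simply ignoring those columns
theta1 = samples[:, 0]
mean_est = theta1.mean()          # Monte Carlo estimate of E[theta_1 | data]
pdf, edges = np.histogram(theta1, bins=50, density=True)  # marginal P(theta_1 | data)
```

No n-dimensional integral is ever computed: the histogram of the θ₁ column is the marginalized posterior.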
[Figure: samples in 6D parameter space, old CMB data alone; colour = optical depth]
Markov Chain Monte Carlo sampling
• Chain distribution converges to P if all points can be reached (ergodic) and the transition probability T leaves the distribution invariant:
  ∫ dθ T(θ′, θ) P(θ) = P(θ′)  for all θ′
• Detailed balance is a sufficient condition:
  T(θ′, θ) P(θ) = T(θ, θ′) P(θ′)
• Metropolis-Hastings method: move in parameter space with transition probability
  T(θ′, θ) = q(θ′, θ) min[1, P(θ′) q(θ, θ′) / (P(θ) q(θ′, θ))]
for (nearly) arbitrary proposal function q. Simplest if q is chosen to be symmetric, so the acceptance probability reduces to min[1, P(θ′)/P(θ)].
[Figure: 1D distribution P(x), current point x₁ and proposed point x₂]
If the proposed position has higher probability: always move.
If the proposed position has lower probability: move to x₂ with probability P(x₂)/P(x₁); otherwise take another sample at x₁.
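The accept/reject rule above can be sketched as a minimal random-walk Metropolis sampler (symmetric Gaussian proposal, so the Hastings q-ratio cancels). The target and step size below are illustrative assumptions, not CosmoMC's implementation.

```python
import numpy as np

def metropolis(log_p, theta0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal (sketch)."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    logp = log_p(theta)
    chain, n_accept = [], 0
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        logp_prop = log_p(prop)
        # accept with probability min(1, P(prop)/P(theta))
        if np.log(rng.random()) < logp_prop - logp:
            theta, logp = prop, logp_prop
            n_accept += 1
        chain.append(theta.copy())   # rejected step: repeat the current sample
    return np.array(chain), n_accept / n_steps

# toy target: standard normal, deliberately started away from the peak
chain, acc = metropolis(lambda t: -0.5 * np.sum(t**2), [3.0], 20_000, step=1.0)
```

Note that a rejected proposal still contributes a (duplicate) sample at the current point, exactly as in the figure.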
Procedure:
- Choose priors and parameterization
- Choose a proposal distribution
- Start the chain at some random position (or start several chains at different positions)
- Iterate a large sequence of chain steps, storing the chain positions (correlated samples)
- Remove ‘some’ samples from the beginning (‘burn in’)
- Optionally thin the samples
- Check for convergence
- Estimate the desired quantities from the samples (or just publish the list of samples)
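The last few steps can be sketched as a small helper: the burn-in fraction, thinning factor, and percentile limits below are illustrative assumptions, not the talk's choices.

```python
import numpy as np

def chain_stats(chain, burn_frac=0.3, thin=1):
    """Remove burn-in, optionally thin, and estimate the mean and
    68%/95% central limits from the remaining (correlated) samples.
    Hypothetical helper for illustration only."""
    samples = np.asarray(chain, dtype=float)[int(burn_frac * len(chain))::thin]
    mean = samples.mean(axis=0)
    lims68 = np.percentile(samples, [16, 84], axis=0)
    lims95 = np.percentile(samples, [2.5, 97.5], axis=0)
    return mean, lims68, lims95
```

Because the samples are drawn from the posterior itself, limits are just percentiles of the stored positions; no reweighting is needed.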
Potential problems:
- Proposal width too low or directions badly chosen: slow sqrt(N) random walk through the space
- Proposal width too large: low probability of proposing a point of similar likelihood, so low acceptance rate
- Proposal distribution must be able to reach all points in parameter space; otherwise the chain can never cross a gap between separated regions of high probability
Assessing performance and convergence

From a single chain:
- Autocorrelation length
- Raftery and Lewis
- Chain-splitting variances

From multiple chains (e.g. MPI):
- Gelman-Rubin (last half of chains): variance of means; diagonalized variance of means; diagonalized variance of limits

All necessary but not sufficient!
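Two of the diagnostics above can be sketched in a few lines: a simple integrated-autocorrelation estimator for a single chain, and a basic Gelman-Rubin R statistic from the means of multiple chains (the diagonalized variants used by CosmoMC are not reproduced here).

```python
import numpy as np

def autocorr_time(x, max_lag=200):
    """Integrated autocorrelation length of a 1D chain
    (simple estimator, truncated at the first negative term)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.mean(x**2)
    tau = 1.0
    for lag in range(1, min(max_lag, len(x) - 1)):
        rho = np.mean(x[:-lag] * x[lag:]) / var
        if rho < 0:
            break
        tau += 2.0 * rho
    return tau

def gelman_rubin(chains):
    """Basic R statistic from the last half of m chains, shape (m, n)."""
    chains = np.asarray(chains, dtype=float)
    chains = chains[:, chains.shape[1] // 2:]   # use last-half chains
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)     # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()       # within-chain variance
    V = (n - 1) / n * W + B / n                 # pooled variance estimate
    return np.sqrt(V / W)
```

A long autocorrelation length means few effective samples per stored point; R far from 1 means the chains have not mixed. Both can look fine while a mode is still entirely unexplored, hence "necessary but not sufficient".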
Which proposal distribution? Possible approaches:

Use many different proposals and try to be as robust as possible (‘black box’):
- e.g. BayeSys http://www.inference.phy.cam.ac.uk/bayesys/
- Unnecessarily slow for ‘simple’ systems
- Less likely to give wrong answers
- Versatile (but work to integrate with cosmology codes)

Use physical knowledge to make the distribution ‘simple’, then use a simple fast method:
- e.g. CosmoMC
- Fast as long as the assumptions are met
- More likely to give wrong answers; needs e.g. close-to-unimodal distributions
- Less versatile
- Optimized to use multiple processors efficiently; integrated CAMB

Either of the above, but also try to calculate the ‘Evidence’:
- e.g. BayeSys, Nested Sampling (CosmoMC plug-in), fast method + thermodynamic integration, etc…
Essentially no method is guaranteed to give correct answers in reasonable time!
Variation of: Neal 1996; Statistics and Computing 6, 353
(Nested sampling??)
Choice of Priors
e.g. parameterize reionization history in a set of bins of xe. What is your prior?
- Physically 0 < xe < 1
- ‘Simple’ choice is uncorrelated uniform flat prior on optical depth from each bin:
Lewis et al. 2006, in prep
[Figure: uniform prior P(τₙ) on each bin, constant between 0 and the bin maximum, with height 1/τₙ,max]
What does this mean for our prior on the total optical depth?

The implied prior on the total optical depth is approximately Gaussian, strongly peaked away from zero: low optical depth is a priori very unlikely. Is this really what you meant??
The exact result is piecewise polynomial, tending to a Gaussian at large N (central limit theorem).
If you don’t have very good data, it is worth thinking carefully about your prior!
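The central-limit effect is easy to demonstrate numerically. The bin count and per-bin range below are invented for illustration; the point is only that independent flat bin priors concentrate the total far from zero.

```python
import numpy as np

# Hypothetical numbers: 20 reionization bins, each contributing an optical
# depth tau_n with an independent flat prior on [0, tau_max]
rng = np.random.default_rng(1)
n_bins, tau_max = 20, 0.02
tau_n = rng.uniform(0.0, tau_max, size=(100_000, n_bins))
tau_total = tau_n.sum(axis=1)     # implied prior on the total optical depth

# Central limit theorem: nearly Gaussian, peaked at n_bins * tau_max / 2 = 0.2,
# so a low total optical depth is a priori very unlikely
frac_low = np.mean(tau_total < 0.05)
```

With these (assumed) numbers the prior probability of a total optical depth below 0.05 is essentially zero, even though each individual bin prior looks innocuous.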
Another example: reconstructing the primordial power spectrum in n bins
Bridle et al. astro-ph/0302306
Which parameterization of the bin amplitudes, with flat uncorrelated priors on the bins? The choices differ by an effective prior on the optical depth!
Also, do we really mean to allow arbitrarily wiggly initial power spectra?
One sensible solution: Gaussian process prior (imposes smoothness) - result then converges for large n
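A draw from such a smoothness prior can be sketched as follows; the squared-exponential kernel, correlation length, and amplitude are assumptions for illustration, not the choices in the talk.

```python
import numpy as np

# Sketch of a Gaussian process prior on the binned spectrum: a squared-
# exponential kernel correlates neighbouring bins, imposing smoothness
rng = np.random.default_rng(2)
n = 40
lnk = np.linspace(-4.0, 0.0, n)        # bin centres in ln k (illustrative)
ell, sigma = 0.5, 0.1                  # correlation length and amplitude (assumed)
K = sigma**2 * np.exp(-0.5 * (lnk[:, None] - lnk[None, :])**2 / ell**2)
K += 1e-10 * np.eye(n)                 # tiny jitter for a stable Cholesky factor
sample = np.linalg.cholesky(K) @ rng.standard_normal(n)   # one smooth prior draw
```

Because neighbouring bins are strongly correlated, each draw varies smoothly with ln k; arbitrarily wiggly spectra are suppressed, and the effective number of free modes stays fixed as n grows.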
Conclusions
• MCMC is a powerful method, but should be used with care
• Judicious choice of parameters and proposal distribution greatly improves performance
• Can exploit difference between fast and slow parameters
• Worth thinking carefully about the prior
• Do not run CosmoMC blindly with flat priors on arbitrary new parameters
Also good for slice sampling (sampling_method=2,3) (Neal: physics/0009028)
Random directions? Don’t really want to head back in the same direction…
- Choose a random orientation of orthogonalized basis vectors
- Cycle through the directions
- Periodically choose a new random orientation
Helps in low dimensions
Can be generalized to n-dimensions:
- Make several proposals in the fast subspace
- Make proposals in the best slow eigendirections
- Saves lots of time if there are many fast parameters
- Less efficient per step because the basis is not optimal
- Can compensate by doing more ‘fast’ transitions at near-zero cost
- Ideally halves the number of ‘slow’ calculations
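The fast/slow split can be sketched as a proposal generator: one step in the slow block followed by several cheap fast-subspace steps. This is a hypothetical interface for illustration; in CosmoMC the gain comes from not recomputing the expensive slow part of the likelihood during the fast moves.

```python
import numpy as np

def propose_fast_slow(theta, n_slow, step_slow, step_fast, n_fast_sub, rng):
    """Sketch: one slow-block proposal followed by n_fast_sub fast-subspace
    proposals that leave the slow parameters fixed (illustrative only)."""
    proposals = []
    prop = np.asarray(theta, dtype=float).copy()
    prop[:n_slow] += step_slow * rng.standard_normal(n_slow)   # slow step
    proposals.append(prop.copy())
    for _ in range(n_fast_sub):
        prop = prop.copy()
        # fast step: slow parameters (and the slow likelihood) unchanged
        prop[n_slow:] += step_fast * rng.standard_normal(len(prop) - n_slow)
        proposals.append(prop.copy())
    return proposals
```

Each extra fast proposal reuses the slow computation, so many fast parameters cost almost nothing per additional sample.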