Top Banner
Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge http://cosmologist.info/ collaborator: Sarah Bridle CosmoMC: http://cosmologist.info/cosmomc Lewis, Bridle: astro-ph/0205436 http://cosmologist.info/notes/cosmomc.ps.gz
26

Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge collaborator: Sarah Bridle CosmoMC:.

Mar 26, 2015

Download

Documents

Jose Richards
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Efficient MCMC for cosmological parameters

Antony LewisInstitute of Astronomy, Cambridge

http://cosmologist.info/

collaborator: Sarah Bridle

CosmoMC: http://cosmologist.info/cosmomc

Lewis, Bridle: astro-ph/0205436

http://cosmologist.info/notes/cosmomc.ps.gz

Page 2: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

CMB PolarizationBaryon oscillationsWeak lensingGalaxy power spectrumCluster gas fractionLyman alphaetc…

+

Cosmological parameters

Introduction

+ Priors

WMAP Science Team

Page 3: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Bayesian parameter estimation

• Can compute P( {ө} | data) using e.g. assumption of Gaussianity of CMB field and priors on parameters

• Often want marginalized constraints. e.g.

nn ddddataPdata ..)|...(| 2132111

• BUT: Large n-integrals very hard to compute!

• If we instead sample from P( {ө} | data) then it is easy:

)(11

1| i

iNdata

Generate samples from P

Page 4: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

old CMB data alonecolor = optical depth

Samples in6D parameterspace

Page 5: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Markov Chain Monte Carlo sampling

• Detailed balance is a sufficient condition:

Metropolis-Hastings method:

for (nearly) arbitrary proposal function q and

• Move in parameter space with transition probability:

Simplest if q chosen to be symmetric

Chain distribution converges to P if all points can be reached (ergodic) andT leaves the distribution invariant

for all θ’

Page 6: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

x1 x2

0

If proposed position has higher probability: always move

P(x)

Page 7: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

x1x2

0

?

If proposed position has lower probability: move to x2 with probability P(x2)/P(x1)

P(x1)

P(x2)

P(x)

otherwise take another sample at x1

Page 8: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Procedure:

- Chose priors and parameterization

- Chose a proposal distribution

- Start chain in some random position (or several in different starting positions) - Iterate large sequence of chain steps storing chain positions (correlated samples)

- Remove ‘some’ samples from the beginning (‘burn in’)

- Optionally thin the samples

- Check for convergence

- Estimate desired quantities from samples (or just publish list of samples)

Page 9: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Potential Problems:

- Proposal width low or directions bad: slow sqrt(N) random walk through space

- Proposal width too large: low probability of proposing similar likelihood, low acceptance rate

- Proposal distribution must be able to reach all points in parameter space

Can never cross gap!

Page 10: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Assessing performance and convergence

- Autocorrelation length- Raftery and Lewis- Chain splitting variances

From single chain:

From multiple chains (e.g. MPI):

Gelman-Rubin (last half chains): - variance of means - diagonalized variance of means - diagonalized variance of limits

All necessary but not sufficient!

Page 11: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Which proposal distribution?Possible approaches:

Use many different proposals, try to be as robust as possible – ‘black box’:

- e.g. BayeSys http://www.inference.phy.cam.ac.uk/bayesys/- Unnecessarily slow for ‘simple’ systems. - Less likely to give wrong answers.- Versatile (but work to integrate with cosmology codes).

Use physical knowledge to make distribution ‘simple’ – use simple fast method:

- e.g. CosmoMC- Fast as long as assumptions are met. - More likely to give wrong answers; need e.g. close to unimodal distributions.- Less versatile- Optimized to use multi-processors efficiently; integrated CAMB

Either of above, but also try to calculate ‘Evidence’: - e.g. BayeSys, Nested Sampling (CosmoMC plug-in), fast method + thermodynamic integration, etc…

Essentially no method is guaranteed to give correct answers in reasonable time!

Page 12: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Proposal distribution: Use knowledge to change to physical parameters

E.g. peak position ~

sound horizon--------------------------------angular diameter distance

-use this as variable-if non-linear transformation beware of priors (Jacobian)

Kosowsky et al. astro-ph/0206014

Page 13: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Proposal distribution: changing variables, transform to near-isotropic Gaussian

Linear Transformation

No change in priors

Equivalent to proposing alongeigenvector directions

Easily done automatically from estimate of covariance matrix

e.g.

Page 14: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Proposal distribution: dynamically adjusting parameters

Beware of non-Markovian updates…

OK methods:

-Do short run, estimate covariance, change parameters, do final run

-Make periodic updates, but discard all samples before the final update

-Periodically update covariance, but use constant fraction of all previous samples; covariance will convergence, hence asymptotically Markovian

BAD methods (sample from wrong distribution, e.g. astro-ph/0307219):

- Estimate local covariance and use for proposal distribution

- Dynamically adjust proposal width based on local acceptance rate

- Propose change to some parameters then adjust other parameters to find good fit, then take new point as final proposal

The onus is on you to prove any new method leaves the target distribution invariant!

??

Page 15: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Symmetric Metropolis proposal distributions

- Most common choice is a n-D Gaussian (n may be a sub-space dimension)

- in 2D

‘Best’ scaling is r → r / 2.4 if posterior is GaussianDunkley et al astro-ph/0405462

- Scale with to n get same acceptance rate in any dimension

- In general corresponds to choosing random direction, and moving normalizeddistance r in that direction with probability

Q: Is this really best, or just shortest autocorrelation length??

Page 16: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Is using a large-n Gaussian proposal a good idea?

e.g. consider Pn(r) as a proposal in 1D

– be careful how you assess performance!

Higher n, shorter correlation length, worse Raftery & Lewis

Definitely not in 1D… maybe not in general

Page 17: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

1/3 exponential, n=2, 5 (n=2 used by CosmoMC)

Want robust to inaccurate size estimates: broaden tail, make non-zero at zero:e.g. mixture with exponential

+cycling

Page 18: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Fast and Slow Parameters

- Calculating theoretical CMB and P(k) transfer functions is slow (~ second) ‘Slow’ parameters:

- Changing initial power spectrum is fast (~ millisecond)

‘Fast’ parameters:

- Other nuisance parameters may also be fast (calibrations etc.), though often can be integrated numerically.

- ‘Fast’ only as fast as the data likelihood code

How do we use the fast/slow distinction efficiently?

Page 19: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Fast Slow

Usual choice of proposal directions:

Fast-slow optimizedchoice:

Page 20: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

‘Directional Gridding’(Radford Neal)

Slow

Fast

Other methods: e.g. tempered transitions etc. may work for very fast parameters

Neal 2005, math.ST/0502099

Slow grid width chosen randomlyat start of eachgridding step: fullgridding step leavesdistribution invariant

Variation of: Neal 1996; Statistics and Computing 6, 353

(Nested sampling??)

Page 21: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Choice of Priors

e.g. parameterize reionization history in a set of bins of xe. What is your prior?

- Physically 0 < xe < 1

- ‘Simple’ choice is uncorrelated uniform flat prior on optical depth from each bin:

Lewis et al. 2006, in prep

1/τn

0 τn

P(τn)

Page 22: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

What does this mean for our prior on the total optical depth?

Total optical depth prior ~ Gaussian (strongly peaked away from zero).Low optical depth is a priori very unlikely - is this really what you meant??

Exact result is piecewise polynomial

Large N

Page 23: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

If you don’t have very good data worth thinking carefully about your prior!

Another example: reconstructing primordial power spectrum in n bins

Bridle et al. astro-ph/0302306

or?

with flat uncorrelated priors on bins?

Differ by an effective prior on optical depth !

Also, do we really mean to allow arbitrarily wiggly initial power spectra?

One sensible solution: Gaussian process prior (imposes smoothness) - result then converges for large n

Page 24: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Conclusions

• MCMC is a powerful method, but should be used with care

• Judicious choice of parameters and proposal distribution greatly improves performance

• Can exploit difference between fast and slow parameters

• Worth thinking carefully about the prior

• Do not run CosmoMC blindly with flat priors on arbitrary new parameters

Page 25: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Also good for slice sampling (sampling_method=2,3) (Neal: physics/0009028)

Random directions? Don’t really want to head back in same direction…

- Chose random orientation of orthogonalized basis vectors- Cycle through directions- Periodically chose new random orientation

onlymixture

Helps in low dimensions

Page 26: Efficient MCMC for cosmological parameters Antony Lewis Institute of Astronomy, Cambridge  collaborator: Sarah Bridle CosmoMC:.

Can be generalized to n-dimensions:

- make several proposals in fast subspace - make proposals in best slow eigendirections - save lots of time if many fast parameters

- Less efficient per step because basis not optimal- Can compensate by doing more ‘fast’ transitions at near-zero cost- Ideally halves the number of ‘slow’ calculations

This is CosmoMC default

Using fast-slow basis: