July 9, 2007 Bayesian Inference and Maximum Entropy 2007: Lessons about Likelihood Functions from Nuclear Physics. Kenneth M. Hanson, T-16, Nuclear Physics; Theoretical Division, Los Alamos National Laboratory. This presentation available at http://www.lanl.gov/home/kmh/ (LA-UR-07-2971). Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007



Lessons about Likelihood Functions from Nuclear Physics

Kenneth M. Hanson

T-16, Nuclear Physics; Theoretical Division, Los Alamos National Laboratory

This presentation available at http://www.lanl.gov/home/kmh/

LA-UR-07-2971

Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007


Overview

• Uncertainties in physics experiments

• Particle Data Group (PDG)

• Lifetime data

• Outliers

• Uncertainty in the uncertainty

• Student t distributions vs. Normal distribution

• Analysis of lifetime data using t distributions


Physics experiments

• Experimenters state their measurement of physical quantity y as measurement ± standard error, or d ± σd

• Experimenter’s degree of belief in measurement described by a normal (Gaussian) distribution, with σd its standard deviation

• Experimental uncertainty often composed of two components:

► statistical uncertainty

• from noise in signal or event counting (Poisson)

• Type A – determined by repeated measurements, frequentist methods

► systematic uncertainty

• from equipment calibration, experimental procedure, corrections

• Type B – determined by nonfrequentist methods

• based on experimenter’s judgment, hence subjective; difficult to assess

► these usually added in quadrature (rms sum)
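The quadrature (rms) sum mentioned in the last bullet can be sketched in a few lines. The function name and the numbers are hypothetical and serve only as illustration; the sketch assumes the two components are independent.

```python
import math

def combine_in_quadrature(sigma_stat, sigma_syst):
    """Return the rms sum of two independent standard errors."""
    return math.sqrt(sigma_stat**2 + sigma_syst**2)

# Hypothetical error budget, for illustration only.
sigma_total = combine_in_quadrature(0.03, 0.04)
print(f"total standard error = {sigma_total:.3f}")
```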


Physics experiments – likelihood functions

• In probabilistic terms, experimentalist’s statement d ± σd is interpreted as a likelihood function

p( d | y σd I)

where I is background information about situation, including how experiment is performed

• Inference about the physical quantity y is obtained by Bayes law

p( y | d σd I) ∝ p( d | y σd I) p( y | I)

where p( y | I) is the prior information about y
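As a minimal sketch of Bayes law for one measurement, the posterior can be evaluated on a grid of y values. The grid, the flat prior, and the example numbers d = 10, σd = 1 are assumptions made for illustration, not values from the talk.

```python
import math

def normal_likelihood(d, y, sigma_d):
    """p(d | y, sigma_d, I) for a Normal likelihood."""
    z = (d - y) / sigma_d
    return math.exp(-0.5 * z * z) / (sigma_d * math.sqrt(2.0 * math.pi))

def grid_posterior(d, sigma_d, prior, y_grid):
    """Posterior density p(y | d, sigma_d, I) on an evenly spaced grid."""
    dy = y_grid[1] - y_grid[0]
    unnormalized = [normal_likelihood(d, y, sigma_d) * prior(y) for y in y_grid]
    norm = sum(unnormalized) * dy
    return [u / norm for u in unnormalized]

# Hypothetical measurement d = 10 +/- 1 with a flat prior on y.
y_grid = [0.01 * i for i in range(2001)]          # y from 0 to 20
dy = y_grid[1] - y_grid[0]
posterior = grid_posterior(10.0, 1.0, lambda y: 1.0, y_grid)

post_mean = sum(y * p for y, p in zip(y_grid, posterior)) * dy
post_var = sum((y - post_mean) ** 2 * p for y, p in zip(y_grid, posterior)) * dy
post_sd = math.sqrt(post_var)
print(f"posterior mean = {post_mean:.3f}, posterior sd = {post_sd:.3f}")
```

With a flat prior the posterior simply reproduces the stated d ± σd; a non-flat `prior` function would shift it.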


Exploratory data analysis

• John Tukey (1977) suggested each set of measurements of a quantity be scrutinized

► find quartile positions, Q1, Q2, Q3

► calculate the interquartile range IQR = Q3 – Q1 (for normal distr., IQR = 1.35 σ)

► determine fraction of data outside interval Q1 – 1.5 IQR < y < Q3 + 1.5 IQR, labeling these as suspected outliers, SO (for normal distr., 0.7%)

• IQR measures width of core

• SO measures extent of tail
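The quartile-based screening above can be sketched with the Python standard library. The data list is hypothetical, and the quartile convention (`method="inclusive"`) is one of several in common use, chosen here as an assumption. The last lines check the two reference values quoted for the normal distribution.

```python
import statistics

def tukey_summary(values):
    """Quartiles, IQR, and suspected outliers by the 1.5 IQR fence rule."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": iqr,
            "SO": len(outliers) / len(values), "outliers": outliers}

# Hypothetical normalized residuals, for illustration only.
data = [-1.2, -0.7, -0.4, -0.1, 0.0, 0.2, 0.5, 0.9, 1.3, 6.0]
summary = tukey_summary(data)
print(summary)

# Reference values for a unit normal distribution.
unit_normal = statistics.NormalDist()
normal_q3 = unit_normal.inv_cdf(0.75)
normal_iqr = 2.0 * normal_q3
normal_fence = normal_q3 + 1.5 * normal_iqr
normal_so = 2.0 * (1.0 - unit_normal.cdf(normal_fence))
print(f"normal IQR = {normal_iqr:.3f} sigma, normal SO = {100 * normal_so:.2f} %")
```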


Particle Data Group (PDG)

• Particle Data Group formed in 1957

► annually summarizes state-of-knowledge of properties of elementary particles

• For each particle property

► list all relevant experimental data

► committee decides which data to include in final analysis

► state best current value (usually least-squares average) and its standard error

• often magnified by sqrt[χ²/(N – 1)] (avg of 2.0, 50% of time)

• PDG reports are excellent source of information about measurements of unambiguous physical quantities

► available online, free

► provide insight into how physicists interpret data
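A least-squares (inverse-variance weighted) average with the sqrt[χ²/(N – 1)] magnification can be sketched as follows. The function name and data are hypothetical, and applying the factor only when it exceeds 1 is an assumption about the convention, not a statement of the PDG procedure in detail.

```python
import math

def scaled_weighted_average(values, sigmas):
    """Inverse-variance weighted mean with a chi-square based error scale factor."""
    weights = [1.0 / s**2 for s in sigmas]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    sigma = 1.0 / math.sqrt(sum(weights))
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    scale = math.sqrt(chi2 / (len(values) - 1))
    # Assumption: the standard error is inflated only when scale > 1.
    return mean, sigma, chi2, scale, sigma * max(1.0, scale)

# Hypothetical measurements with one discrepant value.
values = [10.0, 10.4, 9.8, 12.0]
sigmas = [0.2, 0.2, 0.2, 0.2]
mean, sigma, chi2, scale, sigma_scaled = scaled_weighted_average(values, sigmas)
print(f"mean = {mean:.3f}, sigma = {sigma:.3f}, chi2 = {chi2:.2f}, "
      f"scale = {scale:.2f}, scaled sigma = {sigma_scaled:.3f}")
```

The scale factor widens the quoted error but leaves the central value pulled toward the discrepant point, which is the behavior the later slides address.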


Lambda lifetime measurement in the 60s

• Hydrogen bubble chambers were used in 1950s and 60s to record elementary particles

• Picture shows reaction sequence:

K⁻ + p → Ξ⁻ + K⁺

Ξ⁻ → Λ⁰ + π⁻

Λ⁰ → p + π⁻

• Track lengths and particle momenta determined from curvature in magnetic field yield survival time of Ξ⁻ and Λ

• Hubbard et al. observed 828 such events to obtain lifetimes: τΞ = (1.69 ± 0.06) × 10⁻¹⁰ s, τΛ = (2.59 ± 0.09) × 10⁻¹⁰ s

Hydrogen bubble-chamber photo

From J.R. Hubbard et al., Phys. Rev. 135B (1964)


Measurements of neutron lifetime

• Because n lifetime is so long, it is difficult to measure without slowing neutrons or trapping them

• Plot shows all measurements of neutron lifetime

• Red line is PDG value, which includes 7 data sets, but excludes older ones and #2 because it is discrepant

• χ² for red line = 149/21 pts.

• Evidence of outliers

• Large systematic uncertainties

Neutron lifetime measurements


Measurements of lifetimes of other particles


Collection of lifetime measurements

• Goal: determine distribution of measurements relative to their estimated uncertainties

• Upper graph shows deviations of 99 lifetime measurements for 5 particles from PDG values, divided by their standard errors, i.e. Δx/σ

• Lower graph shows histogram

• Objective: characterize the distribution of Δx/σ for these expts

• χ2 = 367/99 DOF

• IQR = 1.83 (1.35 for normal)

• Suspected outliers = 6.6 %


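Uncertainty in the uncertainty

• Suppose there is uncertainty in the stated standard error σ0 for measurement d

• Dose and von der Linden (2000) gave plausible derivation:

► assume likelihood has underlying Normal distr

$$p(d \mid y\,\sigma\,I) \propto \exp\left[-\frac{1}{2}\left(\frac{d-y}{\sigma}\right)^{2}\right]$$

► assume uncertainty distr for ω, where σ is scaled by $\sigma = \sigma_0/\sqrt{\omega}$

$$p(\omega \mid I) \propto \omega^{a-1}\exp(-a\,\omega)$$

► marginalizing over ω, the likelihood is Student t distr., (2a = ν)

$$p(d \mid y\,\sigma_0\,I) \propto \left[1 + \frac{1}{\nu}\left(\frac{d-y}{\sigma_0}\right)^{2}\right]^{-\frac{\nu+1}{2}} \propto t_\nu\!\left(\frac{d-y}{\sigma_0}\right)$$

• Many have contributed to outlier story: Box and Tiao, O’Hagan, Fröhner, Press, Sivia, Hanson and Wolf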
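The marginalization step can be checked numerically. The sketch below assumes the scaling σ = σ0/√ω and a Gamma-type density ω^(a−1) exp(−aω) for ω, integrates the product with a simple midpoint rule, and compares the result (normalized at z = 0) with the Student t shape for ν = 2a. The integration settings and a = 1.5 are illustrative choices.

```python
import math

def marginal_likelihood(z, a, n_steps=20000, omega_max=40.0):
    """Unnormalized integral over omega of Normal(z; 1/sqrt(omega)) times the omega density."""
    h = omega_max / n_steps
    total = 0.0
    for k in range(n_steps):
        w = (k + 0.5) * h                      # midpoint of the k-th interval
        normal_part = math.sqrt(w) * math.exp(-0.5 * w * z * z)
        omega_part = w ** (a - 1.0) * math.exp(-a * w)
        total += normal_part * omega_part
    return total * h

def student_t_shape(z, nu):
    """Unnormalized Student t density, equal to 1 at z = 0."""
    return (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)

a = 1.5                                        # so that nu = 2a = 3
reference = marginal_likelihood(0.0, a)
for z in (0.5, 2.0, 5.0):
    numeric = marginal_likelihood(z, a) / reference
    print(f"z = {z}: marginalized = {numeric:.6f}, Student t = {student_t_shape(z, 2 * a):.6f}")
```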


Student t distribution

• Student* t distribution

$$t_\nu(z) \propto \left(1 + \frac{z^{2}}{\nu}\right)^{-\frac{\nu+1}{2}}$$

► long tail for ν < 9 (SO > 1%)

► outlier-tolerant likelihood function

► ν = 1 is Cauchy distr (solid red)

► ν = ∞ is Normal distr (solid blue)

Plot: t distributions for ν = 1, 5, ∞

* Student (1908) was pseudonym for W. S. Gosset, who was not allowed by his employer, the Guinness brewery, to publish under his own name
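To see how the suspected-outlier fraction SO grows as ν decreases, the t distribution's tail beyond the Tukey fences can be computed numerically. This is a sketch using Simpson's rule and bisection from the standard library; the function names, the step counts, and the use of ν = 10⁶ as a stand-in for the Normal limit are assumptions.

```python
import math

def t_pdf(z, nu):
    """Normalized Student t density."""
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))

def t_cdf(z, nu, n=2000):
    """CDF by Simpson's rule on [0, |z|], using symmetry about zero (n must be even)."""
    x = abs(z)
    h = x / n
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * t_pdf(k * h, nu)
    half = total * h / 3.0
    return 0.5 + half if z >= 0 else 0.5 - half

def upper_quartile(nu):
    """Q3 of the t distribution by bisection on the CDF."""
    low, high = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if t_cdf(mid, nu) < 0.75:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def suspected_outlier_fraction(nu):
    """Probability outside Q1 - 1.5 IQR < z < Q3 + 1.5 IQR."""
    q3 = upper_quartile(nu)
    fence = q3 + 1.5 * (2.0 * q3)              # IQR = 2 * Q3 by symmetry
    return 2.0 * (1.0 - t_cdf(fence, nu))

for nu in (1.0, 2.6, 5.0, 9.0, 1.0e6):
    print(f"nu = {nu:g}: SO = {100 * suspected_outlier_fraction(nu):.2f} %")
```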


Physical analogy of probability

• Drawing analogy between φ(Δx) = minus-log-posterior and a physical potential

• dφ/d(Δx), the derivative of the potential, is a force with which each datum pulls on fit model

• Outlier-tolerant likelihoods

► generally have long tails

► restoring force eventually decreases for large residuals

Plot legends: G, 2G, G+C, G+E
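The force analogy can be made concrete for two likelihoods discussed in the talk, the Normal and the Student t. The sketch below takes φ as the minus-log-likelihood of one datum in terms of the normalized residual z (additive constants dropped) and differentiates it; the function names and the choice ν = 3 are illustrative.

```python
import math

def potential_normal(z):
    """Minus-log of a Normal likelihood, up to a constant."""
    return 0.5 * z * z

def force_normal(z):
    """Derivative of the Normal potential: grows without bound."""
    return z

def potential_student_t(z, nu):
    """Minus-log of a Student t likelihood, up to a constant."""
    return 0.5 * (nu + 1.0) * math.log1p(z * z / nu)

def force_student_t(z, nu):
    """Derivative of the t potential: peaks at z = sqrt(nu), then falls off."""
    return (nu + 1.0) * z / (nu + z * z)

nu = 3.0
for z in (0.5, 1.0, math.sqrt(nu), 5.0, 20.0):
    print(f"z = {z:6.2f}: Normal force = {force_normal(z):6.2f}, "
          f"t force = {force_student_t(z, nu):5.3f}")
```

A datum 20 standard errors away pulls on the fit 20 times harder than one at 1 standard error under the Normal likelihood, but much less hard under the t likelihood.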


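Analysis of a collection of data

• To calculate “average” value of a data set, use the Student t distribution for the likelihood of each datum:

$$p(d_i \mid y\,\sigma_i\,s\,I) \propto \left[1 + \frac{1}{\nu}\left(\frac{d_i-y}{s\,\sigma_i}\right)^{2}\right]^{-\frac{\nu+1}{2}} \propto t_\nu\!\left(\frac{d_i-y}{s\,\sigma_i}\right)$$

where s is scaling factor of standard error for whole data set

• Select ν based on data using model selection

• Scale factor s marginalized out of posterior

$$p(y \mid \mathbf{d}\,\boldsymbol{\sigma}) \propto \int p(\mathbf{d} \mid y\,s\,\boldsymbol{\sigma})\; p(s)\; ds$$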
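A minimal grid-based sketch of this analysis follows. It is not the author's code: the data are hypothetical, the prior on y is taken as flat, the prior on s as proportional to 1/s, and the grids are arbitrary. The 1/(s σ_i) normalization of each t density is kept explicitly because it matters once s is allowed to vary.

```python
import math

def log_t_density(d, y, sigma, s, nu):
    """Log t density of one datum, up to a constant that depends only on nu."""
    z = (d - y) / (s * sigma)
    return -math.log(s * sigma) - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)

def posterior_for_y(data, sigmas, nu, y_grid, log_s_grid):
    """Posterior probabilities on y_grid with s marginalized out.

    The grid is uniform in log s, so summing over it corresponds to the
    prior p(s) proportional to 1/s (the Jacobian cancels the prior).
    """
    log_post = []
    for y in y_grid:
        terms = []
        for log_s in log_s_grid:
            s = math.exp(log_s)
            terms.append(sum(log_t_density(d, y, sig, s, nu)
                             for d, sig in zip(data, sigmas)))
        peak = max(terms)
        log_post.append(peak + math.log(sum(math.exp(t - peak) for t in terms)))
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]

# Hypothetical data set with one outlier; all standard errors equal.
data = [10.1, 9.9, 10.3, 10.0, 9.8, 13.0]
sigmas = [0.2] * len(data)
y_grid = [9.0 + 0.01 * i for i in range(601)]
log_s_grid = [math.log(0.2) + k * (math.log(20.0) - math.log(0.2)) / 39 for k in range(40)]

posterior = posterior_for_y(data, sigmas, 2.6, y_grid, log_s_grid)
y_robust = sum(y * p for y, p in zip(y_grid, posterior))
y_least_squares = sum(data) / len(data)       # equal weights, so the plain mean
print(f"t-likelihood posterior mean = {y_robust:.3f}, least squares = {y_least_squares:.3f}")
```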


Model selection

• Odds ratio of t distr (t) to Normal (N) is

$$\frac{p(t \mid \mathbf{d}\,\boldsymbol{\sigma}\,I)}{p(N \mid \mathbf{d}\,\boldsymbol{\sigma}\,I)} = \frac{p(\mathbf{d} \mid t\,\boldsymbol{\sigma}\,I)\; p(t \mid I)}{p(\mathbf{d} \mid N\,\boldsymbol{\sigma}\,I)\; p(N \mid I)} = \frac{1.32\times10^{-85}}{2.2\times10^{-90}} = 6\times10^{4}$$

► for prior ratio on models = 1

► evidence is integral over x (lifetime) and s; includes prior on s proportional to 1/s

• Thus, t distr is strongly preferred by data to Normal distr

► ν ≈ 2.6 (maximizes evidence)

Plot curves: t distr, Normal distr
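The odds-ratio calculation can be sketched on a small hypothetical data set. The evidence for each model is approximated by a sum over a grid in the location y and in log s (uniform spacing in log s corresponds to a prior proportional to 1/s). Both models share the same grids and priors, so the common factors cancel in the ratio. The data, grids, and function names are assumptions for illustration and do not reproduce the numbers on the slide.

```python
import math

def log_t(z, nu):
    """Log of the normalized Student t density in z."""
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi)
            - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))

def log_normal(z):
    """Log of the unit Normal density in z."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def log_sum_exp(values):
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_evidence(log_density, data, sigmas, y_grid, log_s_grid):
    """Log evidence up to grid-cell and prior constants shared by both models."""
    terms = []
    for y in y_grid:
        for log_s in log_s_grid:
            s = math.exp(log_s)
            terms.append(sum(log_density((d - y) / (s * sig)) - math.log(s * sig)
                             for d, sig in zip(data, sigmas)))
    return log_sum_exp(terms)

# Hypothetical data: eight consistent values and one outlier.
data = [10.1, 9.9, 10.3, 10.0, 9.8, 10.2, 9.7, 10.05, 14.0]
sigmas = [0.2] * len(data)
y_grid = [9.0 + 0.02 * i for i in range(301)]
log_s_grid = [math.log(0.2) + k * (math.log(20.0) - math.log(0.2)) / 39 for k in range(40)]

nu = 2.6
log_ev_t = log_evidence(lambda z: log_t(z, nu), data, sigmas, y_grid, log_s_grid)
log_ev_n = log_evidence(log_normal, data, sigmas, y_grid, log_s_grid)
odds = math.exp(log_ev_t - log_ev_n)          # prior ratio on models taken as 1
print(f"odds ratio, t (nu = {nu}) to Normal: {odds:.3g}")
```

Repeating the calculation over a range of ν values and picking the largest evidence is one way to carry out the selection of ν described above.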


Measurements of neutron lifetime

• Upper plot shows all measurements of neutron lifetime

• Lower plot shows results based on all 21 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with and w/o χ2 scaling)

► PDG results (using 7 selected data points, Serebrov rejected)

Neutron lifetime measurements


Measurements of π0 lifetime

• Upper plot shows all measurements of π0 lifetime

• Lower plot shows results based on all 13 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with χ2 scaling)

► PDG results (using 4 selected data points, excl. latest one)

π0 lifetime measurements


Measurements of lambda lifetime

• Upper plot shows all measurements of lambda lifetime

• Lower plot shows results based on all 27 data points:

► posterior for t-distr analysis (ν = 2.6, margin. over s)

► least-squares result (with χ2 scaling)

► PDG results (using 3 latest data points)

Lambda lifetime measurements


Tests

• Draw 20 data points from various t distrs. and analyze them using likelihoods:

a) t distr with ν = 3

b) Normal distr

► scale uncertainties according to data variance

► results from 10,000 random trials

• Conclude

► t distr results well behaved

► normal distr results unstable when data have significant outliers

Plot panels: t distr, Normal; dashed line = estimated σ
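A simplified version of this test can be run with the standard library. The sketch below departs from the slide in several labeled ways: it uses 500 trials rather than 10,000, keeps the scale fixed at 1 instead of scaling uncertainties by the data variance, and finds the t-likelihood location by iteratively reweighted averaging started from the median. Under a Normal likelihood with equal errors the estimate is the sample mean.

```python
import math
import random
import statistics

def draw_student_t(rng, nu):
    """One Student t variate: standard normal over sqrt(chi-square / nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

def t_location_estimate(data, nu=3.0, n_iter=25):
    """Maximize a t likelihood (unit scale) by iteratively reweighted averaging."""
    y = statistics.median(data)
    for _ in range(n_iter):
        weights = [(nu + 1.0) / (nu + (d - y) ** 2) for d in data]
        y = sum(w * d for w, d in zip(weights, data)) / sum(weights)
    return y

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def run_trials(nu_true, n_trials=500, n_points=20, seed=1):
    """Return rms error of the t-likelihood and Normal-likelihood estimates (true value 0)."""
    rng = random.Random(seed)
    t_estimates, normal_estimates = [], []
    for _ in range(n_trials):
        data = [draw_student_t(rng, nu_true) for _ in range(n_points)]
        t_estimates.append(t_location_estimate(data))
        normal_estimates.append(statistics.fmean(data))
    return rms(t_estimates), rms(normal_estimates)

results = {nu_true: run_trials(nu_true) for nu_true in (1.0, 3.0, 30.0)}
for nu_true, (rms_t, rms_normal) in results.items():
    print(f"data from t(nu = {nu_true:g}): rms error t-likelihood = {rms_t:.3f}, "
          f"Normal likelihood = {rms_normal:.3f}")
```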


Summary

• Technique presented for dealing gracefully with outliers

► is based on using the Student t distr. for the likelihood function instead of the Normal distr.

► copes with outliers, while treating every datum identically

• Particle lifetime data distribution matched by t distr. with ν ≈ 2.6 to 3.0

► likelihood functions based on t distr. produce stable results when outliers exist in data sets, whereas those based on Normal distr. do not


Bibliography

► “A further look at robustness via Bayes’s theorem,” G.E.P. Box and G.C. Tiao, Biometrika 49, pp. 419–432 (1962)

► “On outlier rejection phenomena in Bayes inference,” A. O’Hagan, J. Roy. Statist. Soc. B 41, pp. 358–367 (1979)

► “Bayesian evaluation of discrepant experimental data,” F.H. Fröhner, Maximum Entropy and Bayesian Methods, pp. 467–474 (Kluwer Academic, Dordrecht, 1989)

► “Estimators for the Cauchy distribution,” K.M. Hanson and D.R. Wolf, Maximum Entropy and Bayesian Methods, pp. 157–164 (Kluwer Academic, Dordrecht, 1993)

► “Dealing with duff data,” D. Sivia, Maximum Entropy and Bayesian Methods, pp. 157–164 (1996)

► “Understanding data better with Bayesian and global statistical methods,” W.H. Press, in Unsolved Problems in Astrophysics, pp. 49–60 (1997)

► “Outlier-tolerant parameter estimation,” V. Dose and W. von der Linden, Maximum Entropy and Bayesian Methods, pp. 157–164 (AIP, 2000)

This presentation available at http://www.lanl.gov/home/kmh/
