Conditional Tests of Goodness of Fit
Richard Lockhart
Simon Fraser University
Birthday ($2101_3$) Conference for Federico O’Reilly
Mexico City, November 27, 2009
Outline
1 Conclusions
2 Sufficiency
3 Von Mises regression
4 Goodness-of-fit
5 Conditional Tests of Fit
6 How do we implement conditional tests?
7 Conditional tests maximize average power
8 Comparison with the parametric bootstrap
9 Conclusions
For when I don’t finish
Conditional tests of fit are the right tests.
They have exactly the advertised level.
The best unbiased tests are conditional tests.
“Best” means maximal average power with respect to a Bayesian prior on the alternatives.
Implementation is via Markov chain Monte Carlo.
As a by-product we can implement Rao-Blackwell for tests and estimation.
The parametric bootstrap nearly implements conditional tests.
Results extend to exponential family regression models.
Ideas might have relevance in bump hunting (LHC) or ?
Sufficiency
Sufficiency plays several roles in goodness of fit:
1 Alternative to plug-in estimate of distribution function.
2 Control of level of test by conditioning.
3 Maximize average power by conditioning.
An example: von Mises distribution
Von Mises density:
$$f(\theta;\kappa,\theta_0) = \frac{1}{2\pi I_0(\kappa)} \exp\{\kappa\cos(\theta-\theta_0)\}\, 1(0 \le \theta < 2\pi)$$
Modal angle $\theta_0$.
Concentration parameter $\kappa$.
Continuous around the circle: $f(0;\kappa,\theta_0) = f(2\pi;\kappa,\theta_0)$.
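As a quick numerical check of the density above, here is a minimal sketch in Python. The function name `von_mises_density` and the parameter values are illustrative assumptions, not part of the talk; `np.i0` is NumPy's modified Bessel function of order zero.

```python
import numpy as np

def von_mises_density(theta, kappa, theta0):
    """Von Mises density at angle(s) theta (radians), treated as periodic in theta."""
    theta = np.asarray(theta, dtype=float)
    return np.exp(kappa * np.cos(theta - theta0)) / (2.0 * np.pi * np.i0(kappa))

# Riemann sum over one full turn (illustrative values kappa = 2, theta0 = 1).
grid = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
total = von_mises_density(grid, kappa=2.0, theta0=1.0).mean() * 2.0 * np.pi

# The density takes the same value at both ends of the interval.
f_start = von_mises_density(0.0, kappa=2.0, theta0=1.0)
f_end = von_mises_density(2.0 * np.pi, kappa=2.0, theta0=1.0)
```

The sum `total` should be very close to 1, and `f_start` and `f_end` agree, which is the continuity statement on the slide.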
Reparametrize as natural exponential family
For regression: recast data as unit vector:
$U = (\cos\theta, \sin\theta)$.
Define natural exponential family parameter:
$\psi = (\kappa\cos\theta_0, \kappa\sin\theta_0)$.
Density (relative to arc length on unit circle):
$$f(u;\psi) = \frac{1}{2\pi I_0(\|\psi\|)} \exp\{\psi' u\}.$$
Add covariates
Responses $U_i$.
Covariates for observation $i$ form matrix $x_i$ ($p \times 2$).
Model: for some $p \times 1$ parameter vector $\beta$ we have
$\psi_i = \beta' x_i$.
Likelihood is (ignoring powers of $2\pi$):
$$L(\beta) = \exp\Big\{\beta' S - \sum_i \log I_0(\|\psi_i\|)\Big\}$$
where
$$S = \sum_i x_i U_i$$
is a complete sufficient statistic.
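The regression likelihood can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the array layout (`x` of shape `(n, p, 2)`, `U` of shape `(n, 2)`), the function names and the toy data are all hypothetical, and the example uses the iid case in which every $x_i$ is the $2 \times 2$ identity.

```python
import numpy as np

def sufficient_statistic(x, U):
    """S = sum_i x_i U_i, with x of shape (n, p, 2) and U of shape (n, 2)."""
    return np.einsum("npk,nk->p", x, U)

def log_likelihood(beta, x, U):
    """log L(beta) = beta' S - sum_i log I_0(||psi_i||), ignoring powers of 2*pi."""
    psi = np.einsum("p,npk->nk", beta, x)      # psi_i = beta' x_i, one row per observation
    norms = np.linalg.norm(psi, axis=1)        # ||psi_i|| is the concentration for observation i
    return beta @ sufficient_statistic(x, U) - np.sum(np.log(np.i0(norms)))

# Toy data: five made-up directions, identity design (iid von Mises case).
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=5)
U = np.column_stack([np.cos(angles), np.sin(angles)])
x = np.tile(np.eye(2), (5, 1, 1))
beta = np.array([1.0, 0.5])

S = sufficient_statistic(x, U)
ll = log_likelihood(beta, x, U)
```

With the identity design, `S` reduces to the resultant vector $\sum_i U_i$, which is the sufficient statistic used in the iid part of the talk.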
Periwinkles from N. Fisher
Distance moved and direction moved for 31 periwinkles, whatever they are.
If the $i$th periwinkle moved distance $d_i$ then
$$x_i = \begin{pmatrix} d_i & 0 \\ 0 & d_i \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$$
Plot of $d_i$ versus angle on next slide.
Plot
[Figure: distance plotted against angle (degrees).]
Fitted values, parameter estimates
[Figure: fitted angle plotted against distance.]
[Figure: fitted kappa plotted against distance.]
Our focus: goodness-of-fit
Is the von Mises model any good?
What is the role of sufficiency in this question?
Answers we learned from and with Federico:
Define the statistic. Use Rao-Blackwell.
Control the level of the test.
Design most powerful unbiased tests.
From now on: iid case.
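Rao-Blackwell and goodness-of-fit tests
Our favourite statistics are based on the empirical distribution:
$$F(u_1, u_2] \equiv \frac{1}{n} \sum_i 1(U_i \in \mathrm{Sec}(u_1, u_2])$$
where $\mathrm{Sec}(u_1, u_2]$ is the sector of the circle from $u_1$ to $u_2$.
[Figure: sector of the circle from $u_1$ to $u_2$.]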
Test Statistics
Compare this function to its expected value
$$F_\psi(u_1, u_2] = P_\psi(U_i \in \mathrm{Sec}(u_1, u_2]).$$
We use Watson’s $U^2$:
$$U^2 = \int_0^{2\pi}\!\!\int_0^{2\pi} \{F(u_1,u_2] - F_\psi(u_1,u_2]\}^2 \, dF_\psi(u_1)\, dF_\psi(u_2).$$
Replace $F_\psi$ by an estimate; the Rao-Blackwell estimate is a good one:
$$\hat F(u_1, u_2] = E[\,1(U_i \in \mathrm{Sec}(u_1,u_2]) \mid S\,].$$
Hard to compute, though; more later.
Notice we get an unbiased estimate of $dF_\psi$, the density.
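For a fully specified null, Watson's statistic is usually computed from the probability integral transforms $z_i = F(\theta_i)$ rather than from the double integral. The sketch below uses the standard computing formula $U^2 = W^2 - n(\bar z - 1/2)^2$ with $W^2 = \sum_i \{z_{(i)} - (2i-1)/(2n)\}^2 + 1/(12n)$; its normalisation may differ by a constant factor from the double-integral form on the slide, and the function name and toy data are assumptions made for illustration.

```python
import numpy as np

def watson_u2(z):
    """Watson's U^2 from values z_i = F(theta_i) in [0, 1) under a fully specified null."""
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    i = np.arange(1, n + 1)
    # Cramer-von Mises W^2, then Watson's correction for the mean of the z values.
    w2 = np.sum((z - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n)
    return w2 - n * (z.mean() - 0.5) ** 2

# Toy check: the statistic does not depend on where the circle is cut.
rng = np.random.default_rng(0)
z = rng.uniform(size=20)
u2 = watson_u2(z)
u2_rotated = watson_u2((z + 0.3) % 1.0)
```

The two values `u2` and `u2_rotated` agree, which is the reason this statistic suits data on a circle: there is no natural origin.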
Conditional tests
Federico’s suggestion: test the fit using the conditional P-value:
$$p = P(U^2 > U^2_{\mathrm{obs}} \mid S).$$
If the data are a sample from any von Mises distribution then $p$ has a Uniform[0,1] distribution.
This is a great idea.
But how do we do it and how well does it work?
Implementation
How do we compute $p$?
Exact distribution is essentially impossible.
Simulation.
Draw many samples from the conditional distribution of the data given $S$.
Markov chain Monte Carlo if that can’t be done.
Approximation.
Direct Simulation
Three references:
O’Reilly, F., Rueda, R. (1998). A note on the fit of the Levy distribution.
O’Reilly, F. J. & Gracia-Medrano, L. (2006). On the Conditional Distribution of Goodness of Fit Tests.
Gonzalez Barrios, J. M., O’Reilly, F. J. & Rueda, R. (2006). Goodness of Fit for Discrete Random Variables using the Conditional Density.
Suggestion: generate iid samples from the conditional distribution of the data given the sufficient statistic $S$.
Sometimes: given $S_n$, can simulate an observation $X_n$.
Then compute $S_{n-1}$, simulate $X_{n-1}$ given $X_n, S_{n-1}$.
Continue to generate a random sample $X_1, \ldots, X_n$ from the conditional distribution given $S_n$.
Repeat for $M$ independent samples from the conditional distribution.
Compute the GOF statistic for each sample.
Estimate $p$ by the fraction of simulated GOF statistics larger than the GOF statistic for the data.
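The recipe above can be sketched for a model where the conditional law is available in closed form. The example below swaps in the exponential distribution (not the von Mises model of the talk): given $S = \sum_i X_i = s$, an exponential sample is distributed as $s$ times a Dirichlet$(1,\ldots,1)$ vector, so whole conditional samples can be drawn directly instead of one coordinate at a time. The choice of a Cramér-von Mises statistic, the function names and the simulation sizes are assumptions for illustration.

```python
import numpy as np

def cvm_exponential(x):
    """Cramer-von Mises statistic for exponentiality, mean estimated by the sample mean."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    z = 1.0 - np.exp(-x / x.mean())            # fitted exponential cdf at the data
    i = np.arange(1, n + 1)
    return np.sum((z - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n)

def conditional_p_value(x, stat, n_sim=2000, seed=0):
    """Monte Carlo estimate of P(stat > observed | S) for an exponential sample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n, s = x.size, x.sum()
    observed = stat(x)
    # Each row is one sample from the conditional law of the data given S = s.
    sims = s * rng.dirichlet(np.ones(n), size=n_sim)
    sim_stats = np.array([stat(row) for row in sims])
    return np.mean(sim_stats > observed)

# Made-up data for illustration only.
rng = np.random.default_rng(1)
x = rng.exponential(2.0, size=30)
p = conditional_p_value(x, cvm_exponential)
```

Every simulated sample has the same value of $S$ as the data, which is what makes the resulting p-value conditional.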
MCMC
But often cannot generate an observation from $X_n \mid S_n$.
So can’t easily generate samples $(X_1, \ldots, X_n) \mid S_n$.
So find a Markov chain whose stationary distribution is
$$\mathcal{L}(X_1, \ldots, X_n \mid S_n)$$
and which we can simulate.
We have so far used the Gibbs sampler.
Leads us back to Pearson, Kluyver, Rayleigh and Bennett.
Gibbs sampling for von Mises case
Our method goes like this:
Start with original sample $\theta_1, \ldots, \theta_n$ and $S_n$.
Compute the sufficient statistic $S_3$ for $\theta_1, \theta_2, \theta_3$.
Compute conditional density of $\theta_3$ given $S_3$: joint over marginal.
Doesn’t depend on the von Mises parameter! So do the uniform case!
Joint: easy by change of variables. Write $S_3$ in polar co-ordinates $R, \Theta$.
Marginal: angle $\Theta$ is uniform and independent of $R$.
Marginal: Stephens (1962) uses elliptic integrals to get the density of $R$.
Density of resultant
[Figure: density of the resultant length $R$.]
Finish off Gibbs sampling
Simulate new $X_3$, say $X_3^*$, from the conditional.
Implicitly $X_1, X_2$ also change but are not needed yet.
Now work on $X_1, X_2, X_4$.
Condition on $S$ worked out from $S_n, X_3^*, X_5, \ldots, X_n$.
Simulate $X_4^*$.
And so on to get $X_3^*, \ldots, X_n^*$.
Compute $X_1^*$ and $X_2^*$ by solving equations for known $S_n$.
Repeat whole process ad nauseam.
Gives, after burn-in, many samples (not independent of one another) from the conditional law of $X_1, \ldots, X_n$ given $S_n$.
Rest is like the direct method.
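One sweep of such a chain can be sketched as follows. This is not the exact Gibbs draw described on the slides: instead of sampling $\theta_k$ from its conditional density via Stephens's elliptic-integral formula, the sketch substitutes a Metropolis step with a uniform proposal, using the fact that under the uniform case the sum of two independent unit vectors has planar density proportional to $1/\{r\sqrt{4-r^2}\}$ at distance $r$ from the origin. After each accepted move the first two unit vectors are recomputed so that the resultant is unchanged. All names and the toy data are assumptions, and no claim is made about mixing speed.

```python
import numpy as np

def unit(theta):
    """Unit vector(s) (cos theta, sin theta)."""
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)

def pair_density(r):
    """Unnormalised planar density of the sum of two uniform unit vectors at radius r."""
    if r <= 0.0 or r >= 2.0:
        return 0.0
    return 1.0 / (r * np.sqrt(4.0 - r * r))

def sweep(u, rng):
    """Update u[2], ..., u[n-1] in turn; u[0] and u[1] absorb each change so the sum is fixed."""
    u = u.copy()
    for k in range(2, u.shape[0]):
        s3 = u[0] + u[1] + u[k]                      # sufficient statistic of the current triple
        d_cur = pair_density(np.linalg.norm(u[0] + u[1]))
        cand = unit(rng.uniform(0.0, 2.0 * np.pi))   # proposed replacement for u[k]
        w = s3 - cand                                # what u[0] + u[1] would have to equal
        r_new = np.linalg.norm(w)
        d_new = pair_density(r_new)
        if d_new == 0.0:
            continue                                 # no pair of unit vectors sums to w
        if d_cur > 0.0 and rng.uniform() * d_cur >= d_new:
            continue                                 # Metropolis rejection
        # Solve for the two unit vectors with sum w; the two solutions are mirror images.
        perp = np.array([-w[1], w[0]]) / r_new
        half = np.sqrt(1.0 - 0.25 * r_new ** 2)
        sign = rng.choice([-1.0, 1.0])
        u[0] = 0.5 * w + sign * half * perp
        u[1] = 0.5 * w - sign * half * perp
        u[k] = cand
    return u

# Toy run on made-up von Mises data.
rng = np.random.default_rng(1)
u_start = unit(rng.vonmises(0.5, 2.0, size=10))
u_end = u_start
for _ in range(200):
    u_end = sweep(u_end, rng)
```

Each state of the chain is a set of unit vectors with the same resultant as the data, so a goodness-of-fit statistic evaluated along the chain (after burn-in) estimates its conditional distribution given $S_n$.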
After How? comes Why?
Back to basics: Lehmann.
Any unbiased test must have, for every $\psi$ in the null,
$$P_\psi(\text{Reject}) \equiv \alpha.$$
In presence of complete sufficient $S$:
$$P(\text{Reject} \mid S) \equiv \alpha.$$
To maximize power against a simple alternative:
Find least favourable prior $\pi_0$ on null.
Use Neyman-Pearson test of simple alternative vs simple null (average out null parameters).
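The step from the unconditional to the conditional level statement is the usual completeness argument. A short sketch, writing $\phi$ for the test function (a symbol not used on the slides):

```latex
E_\psi[\phi] = \alpha \ \text{for all null } \psi
\;\Longrightarrow\;
E_\psi\big[\, E(\phi \mid S) - \alpha \,\big] = 0 \ \text{for all null } \psi
\;\Longrightarrow\;
E(\phi \mid S) = \alpha \ \text{almost surely.}
```

Sufficiency is what makes $E(\phi \mid S)$ free of $\psi$, and completeness of $S$ gives the last implication.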
Bayes makes alternative simple
Suggested strategy: maximize average power over a family of reasonable alternatives.
Implementation via Bayes ideas.
If the null hypothesis ($U_i$ has von Mises density) is false then the density of $U_i$ is some $g_i$.
Model $g_i$ as random (Bayesian prior).
First do simple null hypothesis: specific value of $\beta$.
Bayesian maximizes average power (with respect to the prior) subject to level $\alpha$ using Neyman-Pearson.
Simple Null
For a Bayesian the marginal distribution of the data on the alternative is
$$E\Big[\prod_{i=1}^n g_i(u_i)\Big].$$
Get simple interpretation if we model
$$\epsilon Z_i(u) = \log\{g_i(u)/f_i(u;\beta)\}$$
as a stochastic process.
So
$$E\Big[\prod_{i=1}^n g_i(u_i)\Big] = E\Big[\exp\Big\{\epsilon\sum_{i=1}^n Z_i(u_i)\Big\}\Big] \prod_i f_i(u_i;\beta).$$
Neyman Pearson lemma
Neyman-Pearson: reject for large values of
$$\Psi_\beta(U_1, \ldots, U_n) = E\Big[\exp\Big\{\epsilon\sum_{i=1}^n Z_i(u_i)\Big\}\Big]\Big|_{u_i = U_i}.$$
This has to be made computable!
This happens approximately for $\epsilon = O(n^{-1/2})$ and $Z_i$ Gaussian (corrected).
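To see why the Gaussian choice helps, here is a sketch under simplifying assumptions that are not on the slides: take a common process $Z_i \equiv Z$, mean-zero Gaussian with covariance kernel $K$, and ignore the correction needed to make each $g_i$ integrate to one. Then $\sum_i Z(u_i)$ is normal with variance $\sum_{i,j} K(u_i, u_j)$, and the Gaussian moment generating function gives:

```latex
E\Big[\exp\Big\{\epsilon \sum_{i=1}^n Z(u_i)\Big\}\Big]
= \exp\Big\{\frac{\epsilon^2}{2} \sum_{i=1}^n \sum_{j=1}^n K(u_i, u_j)\Big\},
\qquad
\epsilon = c\,n^{-1/2}
\;\Longrightarrow\;
\frac{\epsilon^2}{2} \sum_{i,j} K(U_i, U_j) = \frac{c^2}{2}\,\frac{1}{n} \sum_{i,j} K(U_i, U_j).
```

The exponent is a quadratic functional of the empirical distribution, which is the general shape of EDF statistics.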
Best statistics
The test statistics are approximately functions of EDF type tests.
For instance, one choice of $Z_i$ leads to Watson’s $U^2$.
Now extend this to composite hypotheses as in real goodness-of-fit problems.
Think of the null hypothesis as a curve in the space of distributions.
Then the alternative averages over a tube around the curve.
The prior
Build up prior in steps.
First pick parameter value $\beta$ at random, with density $\pi_1$.
Then model the likelihood ratio of $g_i(u_i)$ to $f_i(u_i;\beta)$ as before.
Choose $\epsilon$ of order $n^{-1/2}$ to get
$$\alpha < \text{Asymptotic Power} < 1.$$
So the tube is effectively about $n^{-1/2}$ in diameter.
Structure of solution
Goal: maximize average power subject to level $\alpha$.
Method: from Lehmann, find least favourable prior $\pi_0$ on null.
Use Neyman-Pearson test: simple null vs simple alternative.
Adjust prior so NP test has desired level.
Test statistic becomes
$$\frac{\int \Psi_\beta(U_1, \ldots, U_n) \prod_i f_i(U_i;\beta)\,\pi_1(\beta)\,d\beta}{\int \prod_i f_i(U_i;\beta)\,\pi_0(\beta)\,d\beta}.$$
Multiply and divide by
$$\int \prod_i f_i(U_i;\beta)\,\pi_1(\beta)\,d\beta.$$
Recognize structure!
The structure interpreted
Neyman-Pearson likelihood ratio is product of two terms.
First term in likelihood ratio is posterior expectation
$$\Psi(u) \equiv \int \Psi_\beta(u)\,\pi_1(d\beta \mid u)$$
where $\pi_1(d\beta \mid u)$ is the posterior computed under the null for prior $\pi_1$.
Second is ratio of marginals under null:
$$\frac{\int \prod_i f_i(u_i;\beta)\,\pi_1(\beta)\,d\beta}{\int \prod_i f_i(u_i;\beta)\,\pi_0(\beta)\,d\beta}.$$
Second term depends only on sufficient statistic!
So: average test statistic with respect to the posterior for $\pi_1$.
Reject if this exceeds some function of $S$.
Adjust function to get level $\alpha$.
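The claim that the second term depends only on the sufficient statistic can be checked from the exponential family form of the von Mises regression likelihood given earlier. A sketch, where $c(\beta)$ is shorthand introduced here and not on the slides:

```latex
\prod_i f_i(u_i;\beta) = (2\pi)^{-n} \exp\{\beta' S - c(\beta)\},
\qquad c(\beta) = \sum_i \log I_0(\|\beta' x_i\|),
\\[4pt]
\frac{\int \prod_i f_i(u_i;\beta)\,\pi_1(\beta)\,d\beta}{\int \prod_i f_i(u_i;\beta)\,\pi_0(\beta)\,d\beta}
= \frac{\int \exp\{\beta' S - c(\beta)\}\,\pi_1(\beta)\,d\beta}{\int \exp\{\beta' S - c(\beta)\}\,\pi_0(\beta)\,d\beta}.
```

The factor $(2\pi)^{-n}$ cancels, and what remains involves the data only through $S = \sum_i x_i u_i$.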
Section summary
Average power is maximized subject to level by a conditional test.
For a suitable prior on the alternative, Watson’s $U^2$ provides the right test statistic.
Test statistic for composite null is the average of the test statistic for simple null with respect to the posterior.
Contrast with usual statistic based on a single $\beta$ value:
$U^2(\beta)$.
Implementation: some details 0
MCMC methods can compute most powerful test against reasonable alternatives.
Method can be made as exact as desired.
Lots of room for improvement of MCMC method.
As part of MCMC can compute sample of βs from posterior to help compute statistic by simulation.
In addition simulation produces samples from conditional law of data given S so we can compute other Rao-Blackwell estimates by simulation.
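The idea of simulating from the conditional law of the data given S can be sketched in Python for a toy case. The assumptions are stated loudly: the model is iid exponential (not the talk's von Mises regression), S is the sample sum, and the conditional law is sampled exactly by rescaling, which stands in for the MCMC step the talk uses. The statistic is Watson's U² at the fitted rate; all function names are hypothetical.

```python
import math
import random


def watson_u2(z):
    """Watson's U^2 from probability integral transforms z in [0, 1]."""
    n = len(z)
    zs = sorted(z)
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2 for i, zi in enumerate(zs))
    w2 += 1 / (12 * n)
    zbar = sum(zs) / n
    return w2 - n * (zbar - 0.5) ** 2


def fitted_u2(x):
    """U^2 with the exponential rate replaced by its MLE n / sum(x)."""
    rate = len(x) / sum(x)
    return watson_u2([1.0 - math.exp(-rate * xi) for xi in x])


def sample_given_sum(n, s, rng):
    """Draw n iid exponential values conditioned on their sum equalling s.

    Exact for the exponential model: rescaling unit exponentials to the
    required sum gives a uniform draw on the simplex {x >= 0, sum(x) = s}.
    """
    e = [rng.expovariate(1.0) for _ in range(n)]
    scale = s / sum(e)
    return [scale * ei for ei in e]


def conditional_p_value(x, reps=999, rng=None):
    """Monte Carlo p-value of fitted_u2 under the law of the data given S."""
    rng = rng or random.Random(0)
    n, s = len(x), sum(x)
    observed = fitted_u2(x)
    exceed = sum(
        fitted_u2(sample_given_sum(n, s, rng)) >= observed for _ in range(reps)
    )
    # Counting the observed data set itself keeps the p-value in (0, 1].
    return (1 + exceed) / (reps + 1)
```

Because every simulated data set shares the observed value of S, the same draws could be reused to average any other function of the data, which is the Rao-Blackwell by-product mentioned above.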
Comparison with other methods 1
We tried to show our conditional tests were better than parametric bootstrap.
Generate many data sets.
For each data set run MCMC to compute conditional p-value.
For each data set do parametric bootstrap: estimate parameters by maximum likelihood; simulate new data sets from the fitted model and compute the test statistic for each; compare observed test statistic to simulations to get unconditional p-value.
Plot the two p-values against each other.
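The parametric bootstrap step in the list above can be sketched in Python as follows. As before, this is an illustration under assumptions rather than the talk's code: the null model is iid exponential, the statistic is Watson's U² at the fitted rate, and the function names are hypothetical.

```python
import math
import random


def watson_u2(z):
    """Watson's U^2 from probability integral transforms z in [0, 1]."""
    n = len(z)
    zs = sorted(z)
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2 for i, zi in enumerate(zs))
    w2 += 1 / (12 * n)
    zbar = sum(zs) / n
    return w2 - n * (zbar - 0.5) ** 2


def fitted_u2(x):
    """U^2 with the exponential rate replaced by its MLE n / sum(x)."""
    rate = len(x) / sum(x)
    return watson_u2([1.0 - math.exp(-rate * xi) for xi in x])


def bootstrap_p_value(x, reps=999, rng=None):
    """Parametric bootstrap p-value for the fitted exponential model."""
    rng = rng or random.Random(0)
    n = len(x)
    rate_hat = n / sum(x)          # step 1: maximum likelihood estimate
    observed = fitted_u2(x)
    exceed = 0
    for _ in range(reps):
        # step 2: simulate from the fitted model, then refit inside fitted_u2
        x_star = [rng.expovariate(rate_hat) for _ in range(n)]
        if fitted_u2(x_star) >= observed:
            exceed += 1
    # step 3: compare the observed statistic with the simulated ones
    return (1 + exceed) / (reps + 1)
```

In this toy scale family the fitted statistic depends on the data only through x / sum(x), so the bootstrap and conditional p-values should differ only by Monte Carlo error; in the talk's models the agreement is approximate.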
P-value comparisons 2
[Figure: parametric bootstrap p-value (vertical axis) plotted against Gibbs sampling p-value (horizontal axis); both axes run from 0.0 to 1.0.]
The kicker 3
Two methods produce nearly equal P-values.
So nearly equal levels and powers.
This is a theorem.
Starting with work of Lars Holst (Ann Prob).
Conditional law of data given S is singular wrt unconditional law.
But: conditional dist of gof tests asymptotically same as unconditional.
Goodness of fit for the periwinkle data 4
Fit the model that angles are iid von Mises.
Find U² = 0.314.
Very significant (large-sample percentage points).
Fit the model discussed earlier.
Find U² = 0.049.
Much smaller. Not significant.
Confession is good for the soul.
I have not implemented either of the methods discussed above (yet?) for this model.
Conclusions reiterated 5
I learned from, with and because of Federico:
Conditional tests of fit are the right tests.
They have exactly the advertised level.
The best unbiased tests are conditional tests.
“Best” is for Bayesian prior.
Implementation via Markov Chain Monte Carlo.
As a by-product can implement Rao-Blackwell for tests and estimation.
The parametric bootstrap nearly implements conditional tests.
Results extend to exponential family regression models.
Ideas might have relevance in bump hunting (LHC) or ?
Thanks 6
To the organizers for a great opportunity to speak.
To you all for putting up with me and my English.
To Federico for so much fun over so much time.