A Bayesian method for the analysis of deterministic and stochastic time series
Coryn Bailer-Jones
Max Planck Institute for Astronomy, Heidelberg
DPG, Berlin, March 2015
Coryn Bailer-Jones, MPI for Astronomy, Heidelberg
Time series modelling
• heteroscedastic, asymmetric noise on time and signal
• non-uniform time sampling
[Figure: measured signal, y, against measured time, s]
$$ P(D_j \mid \sigma_j, \theta, M) = \int_{t_j, z_j} \underbrace{P(D_j \mid t_j, z_j, \sigma_j)}_{\text{Measurement model}} \; \underbrace{P(t_j, z_j \mid \theta, M)}_{\text{Time series model}} \; dt_j \, dz_j $$
Measured data $D_j = (s_j, y_j)$ and uncertainties $\sigma_j = (\sigma_{s_j}, \sigma_{y_j})$
Likelihood of single data point: integrate over unknown true time (t) and signal (z)
Model M with parameters $\theta$
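The single-point likelihood integral can be approximated on a grid. The sketch below is only an illustration under stated assumptions, not the author's implementation: it assumes a Gaussian measurement model in both time and signal, and a time series model that is uniform in true time over a span `t_span` with a Gaussian of standard deviation `omega` about a mean `eta(t)`. The names `point_likelihood`, `gauss`, `t_span` and `n` are hypothetical.

```python
import numpy as np

def gauss(x, mean, sd):
    """Normal density with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (np.sqrt(2.0 * np.pi) * sd)

def point_likelihood(s, y, sig_s, sig_y, eta, omega, t_span, n=401):
    """Grid approximation to P(D_j | sigma_j, theta, M).

    Assumptions (not from the slides): Gaussian measurement model in s and y;
    time series model = uniform density in true time over t_span, times a
    Gaussian of standard deviation omega about the mean eta(t).
    """
    # The measurement model confines the integrand to a few sigma around (s, y).
    t = np.linspace(s - 6.0 * sig_s, s + 6.0 * sig_s, n)
    z = np.linspace(y - 6.0 * sig_y, y + 6.0 * sig_y, n)
    T, Z = np.meshgrid(t, z, indexing="ij")
    measurement = gauss(s, T, sig_s) * gauss(y, Z, sig_y)
    time_series = gauss(Z, eta(T), omega) / (t_span[1] - t_span[0])
    # Riemann sum over the (t, z) grid.
    dt, dz = t[1] - t[0], z[1] - z[0]
    return float(np.sum(measurement * time_series) * dt * dz)

# Example: one data point under a sinusoidal mean with hypothetical parameters.
eta = lambda t: 0.25 * np.cos(2.0 * np.pi * 0.1 * t)
L_j = point_likelihood(s=12.0, y=0.2, sig_s=0.5, sig_y=0.1,
                       eta=eta, omega=0.2, t_span=(0.0, 100.0))
```

For a constant mean the integral has a closed form (a Gaussian of variance `sig_y**2 + omega**2`, divided by the length of the time span), which gives a convenient check on the grid resolution.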
Model comparison
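Likelihood of all data points is

$$ P(D \mid \sigma, \theta, M) = \prod_j P(D_j \mid \sigma_j, \theta, M) $$

Evidence is the likelihood marginalized over the parameter prior

$$ P(D \mid \sigma, M) = \int_\theta \underbrace{P(D \mid \sigma, \theta, M)}_{\text{likelihood}} \; \underbrace{P(\theta \mid M)}_{\text{prior}} \; d\theta $$

A more robust alternative is the leave-one-out cross validation likelihood

$$ P(D_j \mid D_{-j}, \sigma, M) = \int_\theta \underbrace{P(D_j \mid \sigma_j, \theta, M)}_{\text{likelihood}} \; \underbrace{P(\theta \mid D_{-j}, \sigma_{-j}, M)}_{\text{posterior}} \; d\theta $$

$$ L_{CV} = \prod_{j=1}^{J} P(D_j \mid D_{-j}, \sigma, M) $$

Calculate integrals by MCMC sampling of posterior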
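A minimal sketch of the leave-one-out calculation follows, under several assumptions that are not taken from the slides: the model is a constant offset `b` plus a Gaussian stochastic component of standard deviation `omega`, timing uncertainties are negligible, the priors in `log_prior` are hypothetical, and the sampler is a basic random-walk Metropolis routine. It refits the posterior once per held-out point, the most direct reading of the formula, and averages the held-out point's likelihood over the posterior samples.

```python
import numpy as np

def log_like_points(theta, y, sig_y):
    """Per-point log likelihood for offset b plus Gaussian scatter omega."""
    b, log_omega = theta
    var = sig_y**2 + np.exp(2.0 * log_omega)
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (y - b) ** 2 / var

def log_prior(theta):
    """Hypothetical priors: unit Gaussians on b and on log(omega)."""
    b, log_omega = theta
    return -0.5 * b**2 - 0.5 * log_omega**2

def metropolis(log_post, theta0, n_samp, step, rng):
    """Random-walk Metropolis sampler; returns an (n_samp, n_par) array."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    out = np.empty((n_samp, theta.size))
    for i in range(n_samp):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        out[i] = theta
    return out

def log10_loo_cv(y, sig_y, n_samp=2000, burn=500, step=0.3, seed=0):
    """log10 of L_CV: product over j of P(D_j | D_-j, sigma, M)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for j in range(len(y)):
        keep = np.arange(len(y)) != j
        def log_post(th):
            return log_prior(th) + np.sum(log_like_points(th, y[keep], sig_y[keep]))
        samples = metropolis(log_post, [0.0, 0.0], n_samp, step, rng)[burn:]
        # Average the held-out point's likelihood over the posterior samples.
        p_j = np.mean(np.exp(log_like_points(samples.T, y[j], sig_y[j])))
        total += np.log10(p_j)
    return total
```

Because each term averages a likelihood over a posterior rather than over the prior, the result depends less strongly on the prior width than the evidence does.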
Time series model
[Figure: true signal, z, against true time, t]
• red solid: deterministic component
• red dashed: standard deviation of stochastic component
• black: true data
Deterministic mean plus stochastic variation of constant variance
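$$ P(z_j \mid t_j, \theta, M) = \frac{1}{\sqrt{2\pi}\,\omega} \exp\left[-\frac{(z_j - \eta(t_j))^2}{2\omega^2}\right] \qquad \text{(Gaussian)} $$

$$ \eta(t_j) = \frac{a}{2} \cos[2\pi(\nu t_j + \phi)] + b \qquad \text{(sinusoidal)} $$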
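The sinusoid-plus-stochastic model can be sketched in a few lines. This is an illustrative sketch rather than the author's software; the function names are hypothetical, and the parameter names follow the symbols above (`a`, `nu`, `phi`, `b`, `omega`).

```python
import numpy as np

def eta(t, a, nu, phi, b):
    """Sinusoidal mean: (a/2) cos[2 pi (nu t + phi)] + b."""
    return 0.5 * a * np.cos(2.0 * np.pi * (nu * t + phi)) + b

def draw_true_signal(t, a, nu, phi, b, omega, rng):
    """Draw z_j from a Gaussian of standard deviation omega about eta(t_j)."""
    return eta(t, a, nu, phi, b) + omega * rng.standard_normal(np.shape(t))

def log_density(z, t, a, nu, phi, b, omega):
    """Sum over j of log P(z_j | t_j, theta, M)."""
    r = (z - eta(t, a, nu, phi, b)) / omega
    return float(np.sum(-0.5 * r**2 - np.log(np.sqrt(2.0 * np.pi) * omega)))

# Example with hypothetical parameter values on a non-uniform time grid.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, size=50))
z = draw_true_signal(t, a=0.04, nu=0.02, phi=0.0, b=0.0, omega=0.01, rng=rng)
```

Since the cosine is multiplied by a/2, `a` is the peak-to-peak amplitude of the deterministic component.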
Time series model
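Ornstein-Uhlenbeck process
A stationary, Markov, Gaussian process

$$ dz(t) = -\frac{1}{\tau} z(t)\, dt + c^{1/2}\, \mathcal{N}(t; 0, dt) $$

$$ P(z_j \mid t_j, \theta, M) = \frac{1}{\sqrt{2\pi V_z}} \exp\left[-\frac{(z_j - \mu_z)^2}{2 V_z}\right] $$

with

$$ \mu_z = z_0\, e^{-(t - t_0)/\tau}, \qquad V_z = \frac{c\tau}{2}\left(1 - e^{-2(t - t_0)/\tau}\right) \qquad \text{for } t > t_0 $$

$\tau$: relaxation time
c: diffusion constant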
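Because the process is Markov and Gaussian, a realization can be drawn one step at a time from the mean and variance given above. The sketch below assumes that $z_0$ and $t_0$ denote the previous true signal value and its time; the function names are hypothetical, and the code handles noise-free true values only (no measurement model).

```python
import numpy as np

def simulate_ou(t, tau, c, z_start, rng):
    """Draw an OU realization at (possibly non-uniform) times t."""
    z = np.empty(len(t))
    z[0] = z_start
    for j in range(1, len(t)):
        decay = np.exp(-(t[j] - t[j - 1]) / tau)
        mean = z[j - 1] * decay                    # mu_z
        var = 0.5 * c * tau * (1.0 - decay**2)     # V_z
        z[j] = mean + np.sqrt(var) * rng.standard_normal()
    return z

def ou_log_density(z, t, tau, c):
    """log P(z_2, ..., z_J | z_1) for a sequence of true signal values."""
    decay = np.exp(-np.diff(t) / tau)
    mean = z[:-1] * decay
    var = 0.5 * c * tau * (1.0 - decay**2)
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * var)
                        - 0.5 * (z[1:] - mean) ** 2 / var))

# Example: hypothetical parameters, unit time steps.
rng = np.random.default_rng(0)
t = np.arange(50000.0)
z = simulate_ou(t, tau=10.0, c=0.2, z_start=0.0, rng=rng)
```

For time steps much longer than τ the variance tends to cτ/2, so a long realization should have a sample variance close to that value (1.0 for the parameters in the example).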
Examples of OU process realizations
[Figure: grid of OU process realizations, signal against time; columns: relaxation time, τ = 1, 10, 100, 1000; rows: different randomisations]
Luminosity variations in ultra cool dwarf stars
[Figure: light curves, one panel each: 2m0345, 2m0913, 2m1145a, 2m1145b, 2m1146, 2m1334, calar3, sdss0539, sori31, sori33, sori45; x-axis: time / hrs; y-axis: signal / mag]
Luminosity variations in ultra cool dwarf stars
Models compared:
• constant (variability just due to measurement noise)
• constant with Gaussian stochastic component
• sinusoid with Gaussian stochastic component
• OU process
Luminosity variations in ultra cool dwarf stars
[Figure: the same light curve panels (2m0345, 2m0913, 2m1145a, 2m1145b, 2m1146, 2m1334, calar3, sdss0539, sori31, sori33, sori45; x-axis: time / hrs; y-axis: signal / mag), with annotations: OU process; Sinusoid + stochastic; Sinusoid (8.3 h, 13.3 h)]
Periodicity in biodiversity over past 550 Myr?
[Figure: panels of no. genera (- cubic fit) against time BP / Myr, from 550 to 50 Myr; black = data, red = model fit; panel labels: periodic model with additional fitted Gaussian noise; stochastic process (OU process)]
CV likelihood is much higher for the stochastic (OU process) model
Rohde & Muller 2005
Summary
• a Bayesian method for modelling time series
‣ arbitrary time sampling and error models
‣ deterministic and stochastic time series
‣ use of cross-validation likelihood, a robust alternative to the evidence
• applications
‣ light curves of some very cool stars (and quasars) evolve stochastically
‣ no evidence for periodic variation of biodiversity over past 550 Myr
• more information and software: tinyurl.com/ctsmod
Ultra cool dwarf model comparison results
C. A. L. Bailer-Jones: Bayesian time series analysis
Table 4. Log (base 10) LOO-CV likelihood of each model relative to that for the no-model for each light curve (log L_LOO-CV − log L_NM).

Light curve  OU process  Off+Stoch    Sin  Sin+Stoch  Off+Sin+Stoch  No-model  p-value
2m0345             3.26       2.07   0.15       2.06           2.66    -13.60     4e-4
2m0913             0.44       0.72   0.23       0.97           0.10    -53.39     7e-4
2m1145a           15.23       8.59   3.01      12.26          11.70    -63.83    <1e-9
2m1145b           -0.73       1.96   2.00       2.69           2.95    -39.71     1e-3
2m1146             0.67       0.56  -0.08       0.21           1.17    -26.83     3e-3
2m1334            14.95      12.82   4.06      16.86          16.12    -65.88     1e-9
sdss0539           5.50       1.99   4.93       4.48           4.67    -19.62     3e-5
calar3             3.60       1.43   5.65       5.11           4.28    -28.06     6e-4
sori31             2.04       2.12   1.02       2.59           1.90    -11.16     4e-5
sori33             1.49       0.66   2.14       1.85           2.12     -8.39     2e-3
sori45             6.70       4.32   5.08       6.23           6.32    -29.93     5e-9

Notes. The penultimate column gives the value of the log likelihood for the no-model, log L_NM. The last column is the p-value for the hypothesis test from BJM.
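The relative log likelihoods in Table 4 are differences in log10, so a value of x means the model is 10^x times more likely than the no-model. The sketch below transcribes the five model columns of Table 4 and picks out, for each light curve, the model with the highest value. The one-dex threshold (a factor of ten) used in `not_distinguished` is an illustrative choice, and all function and variable names are hypothetical.

```python
MODELS = ["OU process", "Off+Stoch", "Sin", "Sin+Stoch", "Off+Sin+Stoch"]

# log10 LOO-CV likelihood relative to the no-model, transcribed from Table 4.
TABLE4 = {
    "2m0345":   [3.26, 2.07, 0.15, 2.06, 2.66],
    "2m0913":   [0.44, 0.72, 0.23, 0.97, 0.10],
    "2m1145a":  [15.23, 8.59, 3.01, 12.26, 11.70],
    "2m1145b":  [-0.73, 1.96, 2.00, 2.69, 2.95],
    "2m1146":   [0.67, 0.56, -0.08, 0.21, 1.17],
    "2m1334":   [14.95, 12.82, 4.06, 16.86, 16.12],
    "sdss0539": [5.50, 1.99, 4.93, 4.48, 4.67],
    "calar3":   [3.60, 1.43, 5.65, 5.11, 4.28],
    "sori31":   [2.04, 2.12, 1.02, 2.59, 1.90],
    "sori33":   [1.49, 0.66, 2.14, 1.85, 2.12],
    "sori45":   [6.70, 4.32, 5.08, 6.23, 6.32],
}

def best_models(table, models):
    """Map each light curve to the model with the highest relative log likelihood."""
    return {name: models[row.index(max(row))] for name, row in table.items()}

def not_distinguished(table, threshold_dex=1.0):
    """Light curves for which no model beats the no-model by threshold_dex."""
    return [name for name, row in table.items() if max(row) < threshold_dex]

def likelihood_factor(delta_log10):
    """Convert a log10 likelihood difference into a likelihood ratio."""
    return 10.0 ** delta_log10

best = best_models(TABLE4, MODELS)
weak = not_distinguished(TABLE4)
```

The highest value alone does not establish that a model is significantly better than the runner-up; that depends on the size of the difference between columns.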
Although the measured data points have negligible timing uncertainties, they do have a finite duration (the integration time of the observations), either 5 or 8 min (fixed for a given light curve). This could be accommodated into the measurement model (Sect. 2.2) by using a top-hat distribution instead of a Gaussian. I nonetheless approximate this as a delta function, for two reasons: (1) it accelerates considerably the likelihood calculations, because it allows us to replace the 2D likelihood integral (Eq. (7)) with an analytic calculation (Sect. B.2); (2) the method of calculating the likelihood of the OU process (derived in Sect. A.2) is only defined for this limit.
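As a hedged illustration of why the delta-function limit makes the likelihood analytic, assume (this is not stated in the excerpt) that the time series model factorizes as $P(t_j, z_j \mid \theta, M) = P(z_j \mid t_j, \theta, M)\, P(t_j \mid \theta, M)$, that the signal measurement model is Gaussian with standard deviation $\sigma_{y_j}$, and that the stochastic component is Gaussian with mean $\eta$ and standard deviation $\omega$. Writing $\mathcal{N}(x; m, v)$ for a Gaussian in $x$ with mean $m$ and variance $v$, the 2D integral then collapses to a convolution of two Gaussians:

```latex
\begin{align}
P(D_j \mid \sigma_j, \theta, M)
 &= \iint \delta(s_j - t_j)\, \mathcal{N}(y_j; z_j, \sigma_{y_j}^2)\,
    P(z_j \mid t_j, \theta, M)\, P(t_j \mid \theta, M)\, dt_j\, dz_j \\
 &= P(t_j = s_j \mid \theta, M) \int \mathcal{N}(y_j; z_j, \sigma_{y_j}^2)\,
    \mathcal{N}(z_j; \eta(s_j), \omega^2)\, dz_j \\
 &= P(t_j = s_j \mid \theta, M)\;
    \mathcal{N}\!\left(y_j;\, \eta(s_j),\, \sigma_{y_j}^2 + \omega^2\right)
\end{align}
```

Under these assumptions the measurement variance and the stochastic variance simply add, so no numerical integration is needed for the Gaussian models.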
6.2. Results: LOO-CV likelihood
I follow the procedure outlined in Sect. 4 to define the priors and to sample the posterior with MCMC. The results are summarized in Table 4. A first glance over the table shows that for ten of the light curves, most of the models are significantly better than the no-model at explaining the data, often by a large amount.
According to the χ² test of BJM, all of these objects have a variability which is inconsistent with Gaussian noise on the scale of the error bars, so there should be a better model than the no-model (although it may not be among those I have tested). We see from the Table that the no-model is not favoured for any light curve. However, for 2m0913, none of the models is significantly more likely than the no-model, so there is no reason to “reject” it. As the no-model is equivalent to the null hypothesis of BJM’s χ² test, and this gave a p-value of 7e−4, this shows that the p-value is not a reliable metric for “rejecting” the null hypothesis.
On the other hand, in the three cases where the p-value is very low – 2m1145a, 2m1334, sori45 – the relative log likelihood for at least one model is high. This suggests that a very low p-value sometimes correctly indicates that another model explains the data better, although this is of limited use as we do not know how low the p-value has to be. But at least it might motivate us to define and test other models. The converse is not true: a relatively high p-value does not indicate that the null hypothesis is the best fitting model.
We turn now to identifying the best models. For all light curves, there is no significant difference between Off+Sin+Stoch and Sin+Stoch, which just means that the offset is not needed. That is not surprising, because the light curves have zero mean by construction. For eight of the light curves, the LOO-CV likelihood for Sin+Stoch is significantly larger than for Sin, implying there is a source of (Gaussian) stochastic variability which is not accounted for by the error bars in the data, $\{\sigma_{y_j}\}$. This indicates either an additional source of variance (variability), or that the error bars have been underestimated. (In only two of these cases – 2m1145a and 2m1334 – are the differences between Sin and Sin+Stoch very large.)
Of course, there is no reason a priori to assume that a sinusoidal model is the appropriate one. In 9 of the 11 light curves, the sinusoidal models give a higher likelihood than the Off+Stoch model, and in the other two cases the value is not significantly lower. We can therefore state that for none of the 11 light curves is Off+Stoch significantly better than the sinusoidal models. But only with five or six light curves can we say that a sinusoidal model is significantly better than Off+Stoch. For the remaining light curves, the data (and priors!) do not discriminate sufficiently between the models, so neither can be “rejected”.
Turning now to the OU process, we see that this is significantly better than all other models only for 2m1145a, but by a confident margin. In seven other cases the OU process is still better than the other models, or at least not significantly worse than the best model, so cannot be discounted as an explanation. In the remaining three cases – 2m1145b, 2m1334, calar3 – at least one other model is significantly better than the OU process.
The results for 2m1145a and 2m1145b are interesting, as these are light curves of the same object observed a year apart. At one time the OU process is the best explanation, at the other either a sinusoidal model or Off+Stoch. Although it is plausible that the object shows different behaviour at different times, e.g. according to the degree of cloud coverage, we should not over-interpret this. We should also not forget that another, untested model could be better than any of these.
To summarize: based just on the LOO-CV likelihood, I conclude that 10 of 11 light curves are explained much better by some model other than the no-model, by a factor of 100 or more in likelihood. The exception is 2m0913, for which all models are equally plausible (likelihoods within a factor of ten). Three light curves can be associated with one particular model: 2m1145a is best described by the OU process; 2m1334 and calar3 are best described by a sinusoidal model, the former requiring an additional stochastic component (Sin+Stoch), the latter could be either with or without it (Sin). This would seem to be consistent with a rotational modulation of the light curve (but see the next section). For the remaining seven light curves, no single model emerges as the clear winner, although some models are significantly disfavoured. In three of these seven cases – 2m0345, sdss0539, sori45 – both the OU process and a sinusoid model explain the data equally well (for 2m0345 and sori45
A89, page 9 of 16
Parameter posterior PDFs:
[Figure: posterior PDFs, density against: frequency, ν / hr⁻¹; amplitude, a / mag; phase, φ; black = posterior, red = prior]
Parameter posterior PDFs: 2m1145a
[Figure: posterior PDFs, density against: τ / hr; b / mag; c / mag² hr⁻¹; µ[z1] / mag; V^{1/2}[z1] / mag; black = posterior, red = prior]
Parameter posterior PDFs: 2m1334
[Figure: posterior PDFs, density against: offset, b / mag; frequency, ν / hr⁻¹; amplitude, a / mag; phase, φ; standard deviation, ω / mag; black = posterior, red = prior]