Multi-state modelling software, and encouraging statistical software development. Chris Jackson MRC Biostatistics Unit, Cambridge, U.K. MRC Biostatistics Unit Centenary Conference, 25 March 2014 Chris Jackson, MRC-BSU Cambridge Multi-state modelling, and encouraging more software 1/ 24
Multi-state modelling software, and encouraging statistical software development.
Chris Jackson, MRC Biostatistics Unit, Cambridge, U.K.
MRC Biostatistics Unit Centenary Conference, 25 March 2014
Chris Jackson, MRC-BSU Cambridge Multi-state modelling, and encouraging more software 1/ 24
Overview
Part 1. Software for multi-state models
- Two different types of multi-state model
- msm package for R: features, design principles, potential developments
- survival + mstate packages for R.
Part 2. Encouraging statistical software development in biostatistics research
- What’s needed, and how to do it. Start discussion...
Part I
Software for multi-state modelling —
state of the art and future.
Multi-state models in continuous time
Example:
[Figure: disease states STAGE 1, STAGE 2, ..., STAGE n−2, STAGE n−1, and ABSORBING STATE n.]
Defined by matrix Q of transition intensities: instantaneous risk of moving from state r to state s ≠ r at time t:

$$q_{rs}(t, F(t^-)) = \lim_{\delta t \to 0} P(S(t + \delta t) = s \mid S(t) = r, F(t^-)) / \delta t.$$

e.g. in a Markov model, $q_{rs}$ is independent of $F(t^-)$; in a time-homogeneous model, $q_{rs}$ is independent of t.
- Single period (sojourn time) in state r ∼ Exp(mean = $-1/q_{rr}$)
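As a minimal numerical sketch of these definitions, the base R code below builds a hypothetical three-state intensity matrix, checks that each row sums to zero, and computes the mean sojourn time $-1/q_{rr}$ in each transient state. The rates are made-up values for illustration, not numbers from the talk.

```r
# Hypothetical intensity matrix for a progressive 3-state model
# (state 3 is absorbing); the rates are invented for illustration.
Q <- rbind(c(-0.15,  0.10, 0.05),
           c( 0.00, -0.20, 0.20),
           c( 0.00,  0.00, 0.00))

# Each row of an intensity matrix sums to zero,
# so q_rr = -sum of q_rs over s != r.
row_sums <- rowSums(Q)

# Mean sojourn time in each transient state is -1 / q_rr.
transient <- diag(Q) < 0
mean_sojourn <- -1 / diag(Q)[transient]
print(mean_sojourn)
```

The absorbing state is excluded because its diagonal entry is zero, so its sojourn time is infinite.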
Data for multi-state models (1): intermittently-observed
[Figure: state (State 1 to State 4) against time, with observation times t0, t1, t2, t3, t4.]
Panel data. State only observed at a finite number of times $t_j$.
- Don’t know the state between these times
- e.g. chronic disease only measurable at clinic visit / screening test
- Likelihood is product of transition probabilities between states $S(t_j)$ observed at successive $t_j$ (Kalbfleisch & Lawless, JASA 1985).

$$L(Q) = \prod_j P_{S(t_j),\,S(t_{j+1})}(t_{j+1} - t_j).$$

- Closed form for corresponding matrix $P(t) = \mathrm{Exp}(tQ)$ only if Q is constant / piecewise constant with time t.
- Non-Markov models difficult (see later...)
msm package for R — designed for this type of data
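The panel-data likelihood can be sketched in base R for a single subject, assuming a time-homogeneous Q. Everything named here is hypothetical: `mat_exp` is a simple scaling-and-squaring Taylor approximation to the matrix exponential written for this example, and the intensity matrix, states and times are invented. This is an illustration of the formula, not how msm computes it internally.

```r
# Approximate matrix exponential by a truncated Taylor series
# with scaling and squaring (illustrative helper, not from msm).
mat_exp <- function(A, terms = 20, squarings = 10) {
  A <- A / 2^squarings
  P <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(terms)) {
    term <- term %*% A / k
    P <- P + term
  }
  for (i in seq_len(squarings)) P <- P %*% P
  P
}

# Log-likelihood contribution of one subject: sum over successive
# observations of log P_{S(t_j), S(t_{j+1})}(t_{j+1} - t_j).
panel_loglik <- function(Q, states, times) {
  ll <- 0
  for (j in seq_len(length(states) - 1)) {
    P <- mat_exp(Q * (times[j + 1] - times[j]))
    ll <- ll + log(P[states[j], states[j + 1]])
  }
  ll
}

# Hypothetical intensities and one subject's panel observations.
Q <- rbind(c(-0.15,  0.10, 0.05),
           c( 0.00, -0.20, 0.20),
           c( 0.00,  0.00, 0.00))
states <- c(1, 1, 2, 3)
times  <- c(0, 1, 2.5, 4)

P1 <- mat_exp(Q * 1)   # transition probabilities over one time unit
ll <- panel_loglik(Q, states, times)
print(ll)
```

For several subjects the log-likelihood would be the sum of these contributions, which is then maximised over the entries of Q.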
Data for multi-state models (2): completely-observed
[Figure: completely observed process through State 1, State 2, State 3 and Death; time axis from 0 to 12.]
Observe all changes of state
- know the complete process history.
- e.g. changes of state represent events
- MI, stroke, periods in hospital.
- event times may be known from administrative data.
- Time-to-event data with competing event times censored.
  - substantial literature on survival / competing risks
- Only Markov models supported by msm.
  - exponential / piecewise-exponential event times.
- Can estimate transition rates under more flexible models (e.g. Cox semi-Markov) using standard survival analysis software.
survival and mstate packages for R designed for this
msm R package for multi-state modelling
http://CRAN.R-project.org/package=msm
Jackson (J Stat. Soft. 2011), Jackson et al. (Statistician 2003)
Used in health, finance, ecology, social science, engineering. . .
General and flexible. Fit continuous-time Markov models
- with any state structure / transition matrix
- covariates (proportional intensities) for any / all transitions
  - subject-specific time-constant or
  - piecewise-constant time-dependent, including time itself
- to various patterns of observation, particularly intermittently-observed...
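library(msm)  # provides msm()
# Three-state model: transitions 1->2, 1->3 and 2->3 allowed; state 3 absorbing
msm(state ~ time, subject = subj, data = mydata,
    # transition-specific covariates on the intensities
    covariates = list("1-2" = ~ age,
                      "2-3" = ~ age + treatment),
    # nonzero entries mark the permitted instantaneous transitions
    qmatrix = rbind(c(0, 1, 1),
                    c(0, 0, 1),
                    c(0, 0, 0)),
    gen.inits = TRUE)  # generate crude initial values from the data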
- MCMC approaches (JAGS / BUGS / Stan software) or Monte Carlo EM (Sutradhar and Cook, JRSS C, 2008)?
- experimental facility available in msm to generate code to fit the same model in JAGS
Time-inhomogeneous models for panel data
[Figure: state (State 1 to State 4) against time, with observation times t0, t1, t2, t3, t4.]
- Likelihood needs transition probability matrix P with r, s entry $\Pr(S(t_1) = s \mid S(t_0) = r)$.
- Kolmogorov forward equations: $dP(t_0, t)/dt = P(t_0, t)\,Q(t)$, with $P(t_0, t_0) = I$, solved up to $t = t_1$.
- Q not constant or piecewise constant with time → no analytic solution.
Can instead numerically solve the differential equation (Titman, Biometrics 2011).
- Allows e.g. Weibull or spline functions for Q(t): smoother / more realistic than piecewise constant
- Need to solve for each distinct covariate value: hard for continuous covariates / big datasets.
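To illustrate what numerically solving the forward equations involves, here is a small base R sketch using a fixed-step fourth-order Runge-Kutta scheme. This is a generic textbook solver chosen for brevity, not the method of Titman (2011) or anything implemented in msm. The function names and the Weibull-type intensity for the 1→2 transition are assumptions made for this example.

```r
# Hypothetical time-dependent intensity matrix: the 1 -> 2 intensity
# follows a Weibull-type hazard a * b * t^(b - 1) with a = 0.1, b = 1.5;
# the other intensities are constant. All values are illustrative.
Q_fun <- function(t) {
  q12 <- 0.1 * 1.5 * t^0.5
  q13 <- 0.05
  q23 <- 0.2
  rbind(c(-(q12 + q13), q12, q13),
        c(0, -q23, q23),
        c(0, 0, 0))
}

# Solve dP/dt = P Q(t), P(t0) = I, from t0 to t1 with classical RK4.
solve_forward <- function(Q_fun, t0, t1, n_steps = 1000) {
  h <- (t1 - t0) / n_steps
  P <- diag(nrow(Q_fun(t0)))
  t <- t0
  f <- function(t, P) P %*% Q_fun(t)
  for (i in seq_len(n_steps)) {
    k1 <- f(t, P)
    k2 <- f(t + h / 2, P + h / 2 * k1)
    k3 <- f(t + h / 2, P + h / 2 * k2)
    k4 <- f(t + h, P + h * k3)
    P <- P + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t <- t + h
  }
  P
}

P <- solve_forward(Q_fun, 0, 2)
print(round(P, 4))
```

In this example the probability of remaining in state 1 has the closed form exp(−(0.1·t^1.5 + 0.05·t)), which gives a check on the solver. In a real likelihood the solve would be repeated for every observation interval and every distinct covariate value, which is the computational burden noted above.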
Future developments in msm
Ideally any new methods should work with any multi-state structure.
- or at least for common structures: e.g. progressive disease, or progression and death
- unmaintainable if too many special cases are handled in different ways.
All transition times known: survival/mstate system
Model fitting: survival R package (Therneau)
- Estimate event-specific hazards == transition rates
  - under Cox or fully parametric models for times to each event.
Prediction: mstate R package (de Wreede et al. J. Stat. Soft. 2011)
- Estimate cumulative incidences from a Cox model (Breslow)
- Convert these to transition probabilities over a time period
  - Aalen-Johansen estimator for inhomogeneous Markov models
  - Individual patient simulation for semi-Markov models
- No documentation for using parametric models
  - needed e.g. for extrapolation in health economic evaluations
- Data need some awkward manipulation
Tutorials / courses to clear up msm vs. mstate confusion / make all their methods more accessible?
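The model-fitting step can be sketched with the survival package alone. The example simulates hypothetical data for subjects starting in state 1 who either move to illness (state 2) or die (state 3), then fits a separate Cox model for each transition, treating the competing event as censoring. The data, variable names and effect sizes are all invented for illustration. The prediction step with mstate is not shown.

```r
library(survival)

set.seed(1)
n <- 500
age <- rnorm(n)  # standardised covariate (hypothetical)

# Latent exponential times to illness and to death from state 1.
t_ill   <- rexp(n, rate = 0.10 * exp(0.5 * age))
t_death <- rexp(n, rate = 0.05)
t_cens  <- 10    # administrative censoring time

time  <- pmin(t_ill, t_death, t_cens)
# 0 = censored, 1 = illness observed first, 2 = death observed first
event <- ifelse(time == t_cens, 0, ifelse(t_ill < t_death, 1, 2))
d <- data.frame(time, event, age)

# Event-specific hazards: one Cox model per transition,
# with the competing event counted as censored.
fit12 <- coxph(Surv(time, event == 1) ~ age, data = d)
fit13 <- coxph(Surv(time, event == 2) ~ age, data = d)

print(coef(fit12))
print(coef(fit13))
```

Replacing `coxph` with a parametric fit such as `survreg` would give the fully parametric alternative mentioned above, at the cost of a different parameterisation.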
Part II
Encouraging more software development
in biostatistics research
Why encourage software development?
Statistical methods need accessible software.
- allows the method to be used by more people, especially non-experts
- saves time for everyone, even experts
- increases transparency / trust in research results
- good for promoting a new method
- “the other way our ideas get out there is through software ... software implementation is a kind of publication, indeed, one of the best kinds.” (http://andrewgelman.com/2014/03/12/publishing-journals)
- see, e.g. popularity of DIC in BUGS
- impact, citations...
→ good for science (and scientists).
Standalone software / large libraries (e.g. BUGS, JAGS, Stan).
- Often to accompany a major methodological advance (e.g. MCMC).
- Needs advanced programming / software engineering skills to develop.
- Still needs users for feedback / bug reports / testing / support
How can we develop more accessible software?
Culture shift: software viewed as a valuable research output
- Funding bodies / grant reviewers, research assessors, PhD examiners, journal editors, supervisors and line managers...
Time and money. . .
- Priorities: consider a methodology project not finished without usable software, not an “optional extra”
- A lot of tedious work involved in writing software, but the same is true for writing papers!
“...[our academic culture] has traditionally put a very high premium on being clever and a relatively low premium on being willing to go through the schlep. [As applied statistics grows] the schlep becomes just as important as the clever idea. If you aren’t willing to put in the time to code your methods up and make them accessible to other investigators, then who will be?” Jeff Leek (http://simplystatistics.org/2012/05/28/schlep-blindness-in-statistics)
“creating an R package is building something. It is something you can point to and say, "I made that". Leaving aside all the tangible benefits to your career, the profession, etc. it is maybe the most gratifying feeling you get when working on research.” (Jeff Leek, https://github.com/jtleek/rpackages)
People and skills. . .
- More collaborative programming, just like we do collaborative writing (tools could help e.g. GitHub)
- Informal / internal peer review by local software experts
- Training of students / researchers in software development
- Software user / discussion groups (e.g. for R techniques)
- Courses and online resources. A lot to be learnt from “open source” community!
- Collaborations with computing specialists, especially for major software projects: http://www.timeshighereducation.co.uk/news/