
Survival Analysis I (CHL5209H)

Olli Saarela

Likelihood construction under non-informative censoring

Piecewise constant hazard model


Dalla Lana School of Public Health, University of Toronto


January 14, 2015



Likelihood construction under non-informative censoring



Parametrized hazard function

- Usually, we are interested in modeling the dependence of the hazard rate on some covariates $Z_i$.

- This dependency is parametrized through a vector of parameters $\theta$.

- The resulting individual-level cause-specific hazard function is

$$\lambda_{ij}(t; \theta) = \frac{P(t \le T_i < t + dt, E_i = j \mid T_i \ge t, z_i, \theta)}{dt}.$$

- How to estimate $\theta$?



Likelihood contributions

- The observed data consist of realizations of random vectors $(T_i, E_i, Z_i)$, $i = 1, \ldots, n$.

- The likelihood contribution of a given individual $i$, conditional on the covariates, is the probability of an event of type $e_i$ occurring at time $t_i$, that is,

$$
\begin{aligned}
P(t_i \le T_i < t_i + dt, E_i = e_i \mid z_i, \theta)
&= P(t_i \le T_i < t_i + dt, E_i = e_i \mid T_i \ge t_i, z_i, \theta) \\
&\quad \times P(T_i \ge t_i \mid z_i, \theta) \\
&= \prod_{j=0}^{1} \lambda_{ij}(t_i; \theta)^{1\{e_i = j\}} \, dt \, \exp\left\{-\int_0^{t_i} \sum_{j=0}^{1} \lambda_{ij}(u; \theta) \, du\right\}.
\end{aligned}
$$
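As a quick numerical check of the expression above, the following Python sketch assumes constant cause-specific hazards and verifies that the contributions, integrated over time and summed over the two event types, give total probability one. The hazard values and the helper names `contribution_density` and `integrate` are hypothetical and chosen only for illustration.

```python
import math

def contribution_density(t, e, lam):
    """Density of an event of type e at time t under constant hazards.

    lam is a pair (lam0, lam1): assumed constant hazards for
    censoring (j = 0) and the event of interest (j = 1).
    """
    total = lam[0] + lam[1]
    # Hazard of the observed event type times the overall survival function
    return lam[e] * math.exp(-total * t)

def integrate(f, upper, steps=40_000):
    """Trapezoidal rule on [0, upper]."""
    h = upper / steps
    inner = sum(f(i * h) for i in range(1, steps))
    return h * (0.5 * f(0.0) + inner + 0.5 * f(upper))

lam = (0.3, 0.7)  # assumed values, not taken from the lecture
p_censored = integrate(lambda t: contribution_density(t, 0, lam), 40.0)
p_event = integrate(lambda t: contribution_density(t, 1, lam), 40.0)
print(p_censored, p_event, p_censored + p_event)
```

Under constant hazards each event type occurs first with probability proportional to its hazard, which is what the two integrals approximate.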



Likelihood function

- The individual-level contributions are assumed conditionally independent, giving the likelihood function

$$L(\theta) \equiv \prod_{i=1}^{n} \left[ \prod_{j=0}^{1} \lambda_{ij}(t_i; \theta)^{1\{e_i = j\}} \, dt \, \exp\left\{-\int_0^{t_i} \sum_{j=0}^{1} \lambda_{ij}(u; \theta) \, du\right\} \right].$$

- Note that this is a function of both the hazard function $\lambda_{i1}(t; \theta)$ characterizing the events of interest, and $\lambda_{i0}(t; \theta)$ characterizing the censoring events.

- We don't want to estimate the latter; how can we eliminate it from the likelihood expression?

- We need to make further assumptions concerning the censoring mechanism.



Non-informative censoring

- Suppose that we can decompose the parameter vector into $\theta = (\theta_0, \theta_1)$, where $\theta_1$ characterizes the events of interest and $\theta_0$ the censoring events.

- Now $\theta_0$ is a nuisance parameter, while $\theta_1$ is a parameter of interest.

- The assumption of non-informative censoring means that the censoring mechanism is not informative of the parameters of interest.

- We can express this by

$$\lambda_{ij}(t; \theta) = \lambda_{ij}(t; \theta_0, \theta_1) = \lambda_{ij}(t; \theta_j).$$

- Referring to the definition of the hazard function, an equivalent (Bayesian) definition would be that at any $t$

$$(t \le T_i < t + dt, E_i = 0) \perp\!\!\!\perp \Theta_1 \mid (T_i \ge t, z_i, \theta_0)$$

and

$$(t \le T_i < t + dt, E_i = 1) \perp\!\!\!\perp \Theta_0 \mid (T_i \ge t, z_i, \theta_1).$$



Interpretation?

- Non-informative censoring is a rather abstract property, so examples will follow.

- Note that the independence of the censoring mechanism was conditional on the covariates $z_i$.

- The non-informative censoring assumption is satisfied if we can condition on all common determinants of the censoring events and the events of interest.

- Example: suppose that incident myocardial infarction (MI) events censor the follow-up for incident ischemic stroke (IS) events. What are the common determinants of MI and IS?



Simplified likelihood

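- The likelihood expression can now be simplified as

$$
\begin{aligned}
&\prod_{i=1}^{n} \Big[ \prod_{j=0}^{1} \lambda_{ij}(t_i; \theta)^{1\{e_i = j\}} \, dt \, \exp\Big\{-\int_0^{t_i} \sum_{j=0}^{1} \lambda_{ij}(u; \theta) \, du\Big\} \Big] \\
&\quad = \prod_{i=1}^{n} \Big[ \lambda_{i0}(t_i; \theta_0)^{1 - e_i} \lambda_{i1}(t_i; \theta_1)^{e_i} \, dt \\
&\qquad\quad \times \exp\Big\{-\int_0^{t_i} \lambda_{i0}(u; \theta_0) \, du\Big\} \exp\Big\{-\int_0^{t_i} \lambda_{i1}(u; \theta_1) \, du\Big\} \Big] \\
&\quad \overset{\theta_1}{\propto} \prod_{i=1}^{n} \Big[ \lambda_{i1}(t_i; \theta_1)^{e_i} \exp\Big\{-\int_0^{t_i} \lambda_{i1}(u; \theta_1) \, du\Big\} \Big].
\end{aligned}
$$

- The nuisance parameters $\theta_0$ were eliminated; this is a function of the parameters $\theta_1$ only.

- The maximum likelihood criterion can now be applied as usual to estimate $\theta_1$.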
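To see why the censoring terms drop out, it may help to write the factorization on the log scale. The following is a sketch under the non-informative censoring assumption stated earlier; the symbols $\ell_0$ and $\ell_1$ are introduced here for illustration only.

```latex
\begin{aligned}
\log L(\theta_0, \theta_1)
&= \underbrace{\sum_{i=1}^{n} \left[ (1 - e_i) \log \lambda_{i0}(t_i; \theta_0)
   - \int_0^{t_i} \lambda_{i0}(u; \theta_0) \, du \right]}_{\ell_0(\theta_0)} \\
&\quad + \underbrace{\sum_{i=1}^{n} \left[ e_i \log \lambda_{i1}(t_i; \theta_1)
   - \int_0^{t_i} \lambda_{i1}(u; \theta_1) \, du \right]}_{\ell_1(\theta_1)}
   + n \log dt .
\end{aligned}
```

Since $\ell_0(\theta_0)$ and $n \log dt$ do not involve $\theta_1$, maximizing over $\theta_1$ only requires $\ell_1(\theta_1)$.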



Piecewise constant hazard model



Parametric survival models

- For this, we need to parametrize the hazard function through a regression equation.

- For example,

$$\lambda_{i1}(u; \theta_1) = \exp\{\alpha + \beta' Z_i\},$$

where $\theta_1 = (\alpha, \beta)$, would specify a Poisson regression model, with the baseline hazard given by $\exp\{\alpha\}$ and the regression coefficients interpreted as log-rate ratios (this is a special case of a proportional hazards model).

- Usually, we would not want to assume the hazard to be constant over time.

- A generalization of this model is obtained if we assume the hazard to be constant over pre-specified intervals.

- This also allows us to easily incorporate more than one time scale.
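For the constant hazard model with a single binary covariate, the maximum likelihood estimates are available in closed form, which makes the log-rate-ratio interpretation concrete. The Python sketch below assumes $Z_i \in \{0, 1\}$ and at least one event in each group; the function names and the toy data are invented for illustration.

```python
import math

def fit_constant_hazard(times, events, z):
    """Closed-form MLE for lambda_i = exp(alpha + beta * z_i), z_i in {0, 1}.

    times  : follow-up times t_i
    events : indicators e_i (1 = event of interest, 0 = censored)
    z      : binary covariate values
    Assumes each covariate group has at least one event.
    """
    deaths = [0, 0]
    person_time = [0.0, 0.0]
    for t, e, g in zip(times, events, z):
        deaths[g] += e
        person_time[g] += t
    rate0 = deaths[0] / person_time[0]  # events per unit time, z = 0
    rate1 = deaths[1] / person_time[1]  # events per unit time, z = 1
    alpha = math.log(rate0)             # log baseline hazard
    beta = math.log(rate1 / rate0)      # log-rate ratio
    return alpha, beta

def log_likelihood(alpha, beta, times, events, z):
    """Sum of e_i * log(lambda_i) - lambda_i * t_i over individuals."""
    return sum(e * (alpha + beta * g) - t * math.exp(alpha + beta * g)
               for t, e, g in zip(times, events, z))

# Toy data, invented for illustration
times = [2.0, 5.0, 3.0, 4.0, 1.0, 6.0]
events = [1, 0, 1, 1, 1, 0]
z = [0, 0, 0, 1, 1, 1]

alpha_hat, beta_hat = fit_constant_hazard(times, events, z)
print(math.exp(alpha_hat), math.exp(beta_hat))
```

The estimates are simply the observed event rates (events divided by person-time) in each group, which is the sense in which this is a Poisson regression model.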



Lexis diagram

[Figure: Lexis diagram of the cohort follow-up. Horizontal axis: calendar year (1982 to 2011). Vertical axis: age, with group boundaries at 25, 35, 45, 55, 65, 70, 75, 80, 85 and 94.]



Notation for grouped follow-up data

- The Lexis diagram depicted the follow-up for total mortality of 9029 individuals recruited as a cross-sectional cohort in 1982 (then aged 25-65) until the end of year 2010.

- Assume that the mortality rate is constant within the age groups $k = 1, \ldots, 9$ in the Lexis diagram, and within one-year calendar time intervals $l = 1, \ldots, 29$.

- Let $d_{ijkl} \in \{0, 1\}$ denote whether individual $i$ experienced a death due to cause $j$ in age group $k$ in year $l$.

- Let $y_{ikl}$ denote the person-years individual $i$ contributed in age group $k$ and year $l$.

- If we have no other individual-level information, the hazard rate of any individual $i$ in age group $k$ and year $l$ is assumed to be $\lambda_{jkl}$.

- This is why the model is called piecewise constant.
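The quantities $d_{ijkl}$ and $y_{ikl}$ can be obtained by splitting each individual's follow-up along both time scales. The Python sketch below is one possible way to do this, assuming the age group boundaries shown in the Lexis diagram and one-year calendar intervals starting in 1982; the function name `split_follow_up`, its arguments, and the example individual are hypothetical.

```python
from bisect import bisect_right

AGE_BREAKS = [25, 35, 45, 55, 65, 70, 75, 80, 85, 94]  # 9 age groups, k = 1..9
YEAR_BREAKS = list(range(1982, 2012))                   # 29 one-year intervals, l = 1..29

def split_follow_up(entry_age, entry_year, duration, died):
    """Return {(k, l): (person_years, deaths)} for one individual.

    Age and calendar time advance together, so the follow-up is cut
    wherever either scale crosses a boundary. Assumes duration > 0 and
    that the follow-up stays within the age and calendar ranges above.
    """
    cuts = {0.0, duration}
    for b in AGE_BREAKS:
        s = b - entry_age
        if 0.0 < s < duration:
            cuts.add(s)
    for c in YEAR_BREAKS:
        s = c - entry_year
        if 0.0 < s < duration:
            cuts.add(s)
    times = sorted(cuts)

    cells = {}
    last = None
    for start, stop in zip(times, times[1:]):
        mid = 0.5 * (start + stop)  # the midpoint identifies the cell
        k = bisect_right(AGE_BREAKS, entry_age + mid)
        l = bisect_right(YEAR_BREAKS, entry_year + mid)
        y, d = cells.get((k, l), (0.0, 0))
        cells[(k, l)] = (y + (stop - start), d)
        last = (k, l)
    if died:
        # The death is recorded in the cell where follow-up ends
        y, d = cells[last]
        cells[last] = (y, d + 1)
    return cells

# Hypothetical individual: aged 44.5 at entry in 1982, dies after 1.2 years
cells = split_follow_up(44.5, 1982.0, 1.2, True)
for key in sorted(cells):
    print(key, cells[key])
```

Summing the resulting person-years and deaths over individuals within each cell gives the grouped data used in the likelihood.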



Likelihood

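- The likelihood contribution of individual $i$ is given by

$$
\begin{aligned}
&\prod_{k=1}^{9} \prod_{l=1}^{29} \lambda_{jkl}^{d_{ijkl}} \exp\left\{-\sum_{k=1}^{9} \sum_{l=1}^{29} \int_0^{y_{ikl}} \lambda_{jkl} \, dt\right\} \\
&\quad = \prod_{k=1}^{9} \prod_{l=1}^{29} \left[ \lambda_{jkl}^{d_{ijkl}} \exp\{-\lambda_{jkl} y_{ikl}\} \right].
\end{aligned}
$$

- The likelihood expression from $n$ individuals is then

$$\prod_{i=1}^{n} \prod_{k=1}^{9} \prod_{l=1}^{29} \left[ \lambda_{jkl}^{d_{ijkl}} \exp\{-\lambda_{jkl} y_{ikl}\} \right].$$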



Connection to Poisson model

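- Since $\lambda_{jkl}$ does not depend on $i$, we get

$$\prod_{i=1}^{n} \prod_{k=1}^{9} \prod_{l=1}^{29} \left[ \lambda_{jkl}^{d_{ijkl}} \exp\{-\lambda_{jkl} y_{ikl}\} \right] = \prod_{k=1}^{9} \prod_{l=1}^{29} \left[ \lambda_{jkl}^{\sum_{i=1}^{n} d_{ijkl}} \exp\left\{-\lambda_{jkl} \sum_{i=1}^{n} y_{ikl}\right\} \right].$$

- We would get the same likelihood expression (up to a factor that does not depend on $\lambda_{jkl}$) if we assumed the total number of deaths of type $j$ in each age group/year to be independently Poisson distributed as

$$\sum_{i=1}^{n} d_{ijkl} \sim \mathrm{Poisson}\left(\lambda_{jkl} \sum_{i=1}^{n} y_{ikl}\right).$$

- Thus, the model can be fitted using any available Poisson regression software (as examples, we consider the glm function in R, and the Bayesian software JAGS).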
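The equivalence with the Poisson model can be checked numerically for a single age group/year cell. The Python sketch below compares the individual-level log-likelihood with the Poisson log-probability of the aggregated count and shows that their difference does not change with the rate; the data values and function names are invented for illustration.

```python
import math

def individual_log_lik(lam, d, y):
    """Sum over individuals of d_i * log(lam) - lam * y_i for one cell."""
    return sum(di * math.log(lam) - lam * yi for di, yi in zip(d, y))

def poisson_log_pmf(count, mean):
    """Log of the Poisson probability mass function."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)

# Toy data for one age group/year cell (invented)
d = [0, 1, 0, 0, 1]            # death indicators
y = [1.0, 0.4, 1.0, 0.7, 0.9]  # person-years
D, Y = sum(d), sum(y)

def difference(lam):
    """Poisson log-pmf of the total minus the individual-level log-likelihood."""
    return poisson_log_pmf(D, lam * Y) - individual_log_lik(lam, d, y)

# The difference equals D * log(Y) - log(D!) whatever the rate
print(difference(0.1), difference(2.0))

# Both likelihoods are therefore maximized by the same rate estimate
lam_hat = D / Y
print(lam_hat)
```

Because the two log-likelihoods differ only by a constant, they lead to the same estimate, deaths divided by person-years.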



More general models

- We can allow the piecewise constant hazard rates to further depend on individual-level covariates $Z_i$, in which case the likelihood expression is of the form

$$\prod_{i=1}^{n} \prod_{k=1}^{9} \prod_{l=1}^{29} \left[ \lambda_{ijkl}^{d_{ijkl}} \exp\{-\lambda_{ijkl} y_{ikl}\} \right].$$

- While there are no Poisson distributed counts here, the model can still be fitted as a Poisson regression. (Why?)
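One way to approach the question is to write down the Poisson probability of a single indicator $d \in \{0, 1\}$ with mean $\lambda_{ijkl} y_{ikl}$ and compare it with the corresponding likelihood factor. This is a sketch of the argument, not necessarily the answer intended in the lecture; $D$ denotes a hypothetical Poisson variable introduced for the comparison.

```latex
P(D = d)
= \frac{(\lambda_{ijkl} \, y_{ikl})^{d} \exp\{-\lambda_{ijkl} \, y_{ikl}\}}{d!}
= y_{ikl}^{d} \; \lambda_{ijkl}^{d} \exp\{-\lambda_{ijkl} \, y_{ikl}\},
\qquad d \in \{0, 1\},
```

because $d! = 1$ for $d \in \{0, 1\}$. The factor $y_{ikl}^{d}$ does not depend on the parameters, so treating each $d_{ijkl}$ as a Poisson observation with $\log(y_{ikl})$ as an offset gives the same likelihood as a function of the parameters.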



Model parametrization

- Even without further individual-level characteristics, the example model involved $9 \times 29 = 261$ hazard rate parameters $\lambda_{jkl}$.

- Not all of these can be estimated, since some age group/year combinations have no observed deaths. (Why?)

- Estimating this many parameters would also be inefficient, and interpretation of the results would be difficult.

- Suppose that we are mainly interested in the calendar time trend in mortality, while removing the age effect.

- Further, we allow the hazard rate to depend on the covariates sex ($x_{i1} \in \{0, 1\}$, men/women) and region ($x_{i2} \in \{0, 1\}$, eastern/western Finland).

- A more parsimonious parametrization can now be specified through a regression equation.



Regression equation

- For example, we can specify the model as

$$\log(\lambda_{ijkl}) = \alpha_{jk} + \beta_{jl} + \gamma_{j1} x_{i1} + \gamma_{j2} x_{i2}.$$

- This model involves only $9 + 29 + 2 = 40$ parameters.

- Now we are mainly interested in the calendar time effect parameters $\beta_{jl}$, $l = 1, \ldots, 29$.

- Adjustment for age through the parameters $\alpha_{jk}$, $k = 1, \ldots, 9$, is needed to extract the calendar time effect. (What would happen to the calendar time effect if we did not adjust for age?)

- Note that for this model to be identifiable, one parameter restriction, for example

$$\alpha_{j1} = 0,$$

is needed.
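The need for a restriction can be illustrated numerically: adding a constant to every age effect and subtracting it from every calendar time effect leaves all hazard rates unchanged. The Python sketch below uses hypothetical parameter values and a helper `log_rate` that evaluates the regression equation above.

```python
def log_rate(alpha, beta, gamma, k, l, x1, x2):
    """log(lambda) = alpha_k + beta_l + gamma_1 * x1 + gamma_2 * x2 (k, l are 1-based)."""
    return alpha[k - 1] + beta[l - 1] + gamma[0] * x1 + gamma[1] * x2

# Hypothetical parameter values, for illustration only
alpha = [-9.0 + 0.5 * k for k in range(9)]   # 9 age effects
beta = [-0.02 * l for l in range(29)]        # 29 calendar time effects
gamma = [-0.6, 0.1]                          # sex and region effects

# Shift the age effects up and the calendar time effects down by the same constant
c = 1.7
alpha_shifted = [a + c for a in alpha]
beta_shifted = [b - c for b in beta]

max_gap = max(
    abs(log_rate(alpha, beta, gamma, k, l, x1, x2)
        - log_rate(alpha_shifted, beta_shifted, gamma, k, l, x1, x2))
    for k in range(1, 10) for l in range(1, 30)
    for x1 in (0, 1) for x2 in (0, 1)
)
print(max_gap)  # zero up to rounding error: the data cannot distinguish the two sets

# Imposing alpha_1 = 0 picks one representative of each equivalent family
alpha_constrained = [a - alpha[0] for a in alpha]
beta_constrained = [b + alpha[0] for b in beta]
print(alpha_constrained[0])
```

With the restriction in place, 39 free parameters remain, and each set of hazard rates corresponds to exactly one parameter vector.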



Parametrization with an intercept term

- An alternative way to parametrize the same model would be

$$\log(\lambda_{ijkl}) = \mu + \alpha_{jk} + \beta_{jl} + \gamma_{j1} x_{i1} + \gamma_{j2} x_{i2},$$

which includes a separate intercept term $\mu$.

- Now two parameter restrictions are required, for example

$$\alpha_{j1} = 0 \quad \text{and} \quad \beta_{j1} = 0.$$

- The interpretation of the calendar time effects $\beta_{jl}$, $l = 2, \ldots, 29$, is now different; they represent log-rate ratios with respect to the first year.
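The link between the two parametrizations can be written out explicitly. In the sketch below, primes mark the parameters of the version with an intercept; this notation is introduced here for illustration and is not used elsewhere in the slides.

```latex
\begin{aligned}
\mu &= \beta_{j1}, \qquad
\alpha'_{jk} = \alpha_{jk}, \qquad
\beta'_{jl} = \beta_{jl} - \beta_{j1}, \\
\exp\{\beta'_{jl}\} &= \frac{\lambda_{ijkl}}{\lambda_{ijk1}}
\quad \text{for any fixed age group } k \text{ and covariates } x_{i1}, x_{i2}.
\end{aligned}
```

Both versions give the same hazard rates; only the reference point against which the calendar time effects are measured changes.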



More advanced models

- Estimating 29 separate calendar time effect parameters is still a lot, as there are not many deaths during any given year in this cohort.

- Moreover, the estimated effect will not be smooth.

- If we are Bayesians (in this course, we mostly won't be), we can force these parameters to be dependent through a prior distribution.

- The motivation for this would be to obtain smoother estimates by borrowing strength from the previous years.



Autoregressive smoothing

- Possible autoregressive prior models to achieve this would be, for example, the first-order normal random walk

$$\beta_{jl} \mid \beta_{j(l-1)}, \beta_{j(l-2)}, \ldots, \beta_{j1} \sim N(\beta_{j(l-1)}, 1/\phi_j)$$

or the second-order normal random walk

$$\beta_{jl} \mid \beta_{j(l-1)}, \beta_{j(l-2)}, \ldots, \beta_{j1} \sim N(2\beta_{j(l-1)} - \beta_{j(l-2)}, 1/\phi_j).$$

- In both models the precision parameter $\phi_j$ is a smoothing/penalty parameter which controls variation around the assumed mean.

- Using Bayesian methods, this parameter can also be estimated from the data.

- JAGS is a general-purpose Bayesian software that can be used from within R.
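To illustrate how the second-order random walk acts as a smoothness penalty, the Python sketch below evaluates its log density for a perfectly linear sequence of effects and for a sequence with an added zigzag. The function name `rw2_log_density` and all numerical values are hypothetical.

```python
import math

def rw2_log_density(beta, phi):
    """Log density of beta_3, ..., beta_L given beta_1 and beta_2 under a
    second-order normal random walk with precision phi."""
    total = 0.0
    for l in range(2, len(beta)):
        mean = 2.0 * beta[l - 1] - beta[l - 2]  # linear extrapolation
        resid = beta[l] - mean                  # second difference
        total += 0.5 * math.log(phi / (2.0 * math.pi)) - 0.5 * phi * resid ** 2
    return total

# Hypothetical calendar time effects over 29 years
linear = [0.1 * l for l in range(29)]
wiggly = [0.1 * l + (0.3 if l % 2 else -0.3) for l in range(29)]

phi = 10.0
print(rw2_log_density(linear, phi))
print(rw2_log_density(wiggly, phi))
```

A linear trend has zero second differences and is not penalized at all, while the zigzag sequence receives a much lower prior density; larger values of `phi` strengthen this penalty.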



JAGS script (likelihood)

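model {
    # Likelihood: d[i] is the number of deaths and y[i] the person-years of record i
    for (i in 1:N) {
        d[i] ~ dpois(mu[i])
        # Expected count = person-years * hazard rate (log-linear in the covariates)
        mu[i] <- y[i] * exp(alpha[age[i]] + beta[year[i]] +
                            gamma[1] * sex[i] + gamma[2] * area[i])
    }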



JAGS script (prior distributions)

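    # Second-order random walk prior for the calendar time effects;
    # the first two effects get vague priors (precision phi * 0.001)
    betamean[1] <- 0.0
    betaprec[1] <- phi * 0.001
    betamean[2] <- 0.0
    betaprec[2] <- phi * 0.001
    for (i in 3:29) {
        betamean[i] <- 2 * beta[i-1] - beta[i-2]
        betaprec[i] <- phi
    }
    for (i in 1:29) {
        beta[i] ~ dnorm(betamean[i], betaprec[i])
        # Log-rate ratio relative to the first calendar year
        logRR[i] <- beta[i] - beta[1]
    }
    # Smoothing (precision) parameter
    phi ~ dgamma(0.001, 0.001)
    # Age effects; the last age group is the reference category
    for (i in 1:8) {
        alpha[i] ~ dnorm(0.0, 0.001)
    }
    alpha[9] <- 0.0
    # Effects of sex and area
    for (i in 1:2) {
        gamma[i] ~ dnorm(0.0, 0.001)
    }
}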
