
The use of flexible parametric survival models in epidemiology

Paul C Lambert1,2

1 Department of Health Sciences, University of Leicester, UK

2 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

Research Seminars in Medicine, Epidemiology and Public Health

Department of Public Health, Aarhus University

28th October 2014

What I am covering today

Session 1

Introduction to flexible parametric survival models.

Example (Henrik Møller).

Session 2

Some extensions

Age as the time-scale.

Standardised survival curves.

Competing risks.

Example (Henrik Størving).

Lecture notes online - web address will be circulated.

Paul C Lambert Flexible parametric survival models 28th October 2014 2

Session 1

In the first session I will,

Explain why I use parametric models.

Briefly review the Cox model.

Give an introduction to flexible parametric survival models.

The use of spline functions.

Brief introduction to theory.

Proportional hazards example.

Various predictions.


Why I use parametric models

I analyse large population-based datasets where

The proportional hazards assumption is often not appropriate.

The hazard function is of interest.

I fit excess mortality/relative survival models in population-based cancer studies.

This was not an easy adaptation for the Cox model.

Proportional excess hazards are rarely true.

The excess hazard is of interest.

Quantification of absolute risks and rates.

I believe this should be done more than it is.

It is much easier if you estimate the baseline.


The Cox model I

Web of Science: over 26,938 citations (February 2013).

Has an h-index of 13 from repeat mis-citations (footnote 1).

\[ h_i(t \mid x_i) = h_0(t) \exp(x_i \beta) \]

Estimates (log) hazard ratios.

Advantage: The baseline hazard, $h_0(t)$, is not estimated from a Cox model.

Disadvantage: The baseline hazard, $h_0(t)$, is not estimated from a Cox model.

Footnote 1: http://occamstypewriter.org/boboh/2008/06/24/outdone_by_mis_prints/

The Cox model II

The crucial assumption of the Cox model is that the estimated parameters are not associated with time, i.e., we assume proportional hazards.

If you are only interested in the relative effect of a covariate on the hazard rate and the assumption of proportional hazards is reasonable, then the Cox model is probably the most appropriate model. In other situations alternative models may be more appropriate.

However, whenever we estimate a relative effect we should ask “relative to what?”


Quote from Sir David Cox (Reid 1994 [1])

Reid “What do you think of the cottage industry that’s grown up around [the Cox model]?”

Cox “In the light of further results one knows since, I think I would normally want to tackle the problem parametrically ... I’m not keen on non-parametric formulations normally.”

Reid “So if you had a set of censored survival data today, you might rather fit a parametric model, even though there was a feeling among the medical statisticians that that wasn’t quite right.”

Cox “That’s right, but since then various people have shown that the answers are very insensitive to the parametric formulation of the underlying distribution. And if you want to do things like predict the outcome for a particular patient, it’s much more convenient to do that parametrically.”


What are splines?

Flexible mathematical functions defined by piecewise polynomials.

Used to model non-linear functions.

The points at which the polynomials join are called knots.

Constraints ensure the function is smooth.

The most common splines used in practice are cubic splines.

However, splines can be of any degree, n.

For cubic splines, the function is forced to have continuous 0th, 1st and 2nd derivatives.

Regression splines can be incorporated into any regression model with a linear predictor.


Cubic splines

Cubic spline functions can be used in any regression model by calculation of some extra variables.

After defining K knots, $t_1, \ldots, t_K$, the spline function is

\[ S(x) = \sum_{j=0}^{3} \beta_{0j} x^{j} + \sum_{i=1}^{K} \beta_{i3} (x - t_i)_{+}^{3} \]

Note the “+” notation means that $u_+ = u$ if $u > 0$ and $u_+ = 0$ if $u \le 0$.

There will be K + 4 parameters (including the intercept) needed in the linear predictor.
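The "extra variables" for a cubic spline can be generated in a few lines. The following is a minimal Python sketch of the truncated power basis, assuming three arbitrary knot positions chosen only for illustration; the function name cubic_spline_basis is hypothetical and is not part of any package mentioned in these notes.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, then (x - t_i)_+^3 for each knot."""
    x = np.asarray(x, dtype=float)
    poly = [x**j for j in range(4)]                         # global cubic polynomial
    trunc = [np.clip(x - t, 0.0, None) ** 3 for t in knots]  # one term per knot
    return np.column_stack(poly + trunc)

knots = [1.0, 2.0, 3.0]            # illustrative knot positions (K = 3), an assumption
x = np.linspace(0.0, 4.0, 9)
X = cubic_spline_basis(x, knots)
print(X.shape)                     # columns = K + 4 parameters in the linear predictor
```

Each truncated term is zero to the left of its knot, which is what lets the cubic change shape at the knot while the 0th, 1st and 2nd derivatives stay continuous.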


Using splines to estimate non-linear functions.

[Figure: mortality rate (per 1000 person-years) against years from diagnosis (0 to 5); interval length: 1 week.]


No continuity corrections

[Figure: mortality rate (per 1000 person-years); panel title: No Constraints.]


Function forced to join at knots

[Figure: mortality rate (per 1000 person-years); panel title: Forced to Join at Knots.]


Continuous first derivative

[Figure: mortality rate (per 1000 person-years); panel title: Continuous 1st Derivatives.]


Continuous second derivative

[Figure: mortality rate (per 1000 person-years); panel title: Continuous 2nd Derivatives.]


Restricted cubic splines

Cubic splines can behave poorly in the tails.

Extension is restricted cubic splines [2].

Forced to be linear before the first knot and after the final knot.

This is where there is often less data and standard cubic splines tend to be sensitive to a few extreme values.

For the same number of knots, they need 4 fewer parameters than cubic splines.

To understand splines further, play with some interactive graphs I have developed.

http://www2.le.ac.uk/Members/pl4/interactive-graphs
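To see the linearity restriction at work, here is a minimal Python sketch of one common way of writing a restricted cubic spline basis (the unorthogonalised form used by Royston and Parmar). The function name rcs_basis and the knot positions are assumptions made for illustration; software such as stpm2 may additionally orthogonalise the basis, so the actual generated variables can differ.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis for sorted knots (boundary knots first and last).

    Returns K - 1 columns: x itself, then one column per interior knot.
    """
    x = np.asarray(x, dtype=float)
    k_min, k_max = knots[0], knots[-1]
    pos3 = lambda u: np.clip(u, 0.0, None) ** 3          # (u)_+^3
    cols = [x]
    for k_j in knots[1:-1]:
        lam = (k_max - k_j) / (k_max - k_min)
        cols.append(pos3(x - k_j) - lam * pos3(x - k_min) - (1 - lam) * pos3(x - k_max))
    return np.column_stack(cols)

knots = [0.0, 1.0, 2.0, 3.0]             # K = 4 knots, illustrative values
x = np.array([3.5, 4.0, 4.5])            # equally spaced points beyond the last knot
Z = rcs_basis(x, knots)
second_diff = Z[0] - 2 * Z[1] + Z[2]     # zero when the function is linear
print(Z.shape[1], np.allclose(second_diff, 0.0))
```

With K = 4 knots this gives 3 basis columns plus an intercept, that is K parameters, compared with K + 4 for the unrestricted cubic spline.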

Flexible Parametric Survival Models

Parametric estimate of the survival and hazard functions.

Useful for ‘standard’ and relative survival models.

First introduced by Royston and Parmar (2002) [3].

Parametric Models have advantages for

Understanding.

Prediction.

Extrapolation.

Quantification (e.g., absolute and relative measures of risk).

Modelling time-dependent effects.

All cause, cause-specific or relative survival.


Flexible parametric models: basic idea

Consider a Weibull survival curve.

\[ S(t) = \exp(-\lambda t^{\gamma}) \]

If we transform to the log cumulative hazard scale,

\[ \ln[H(t)] = \ln[-\ln(S(t))] \]

\[ \ln[H(t)] = \ln(\lambda) + \gamma \ln(t) \]

This is a linear function of ln(t).

Introducing covariates gives

\[ \ln[H(t \mid x_i)] = \ln(\lambda) + \gamma \ln(t) + x_i \beta \]

Rather than assuming linearity with ln(t), flexible parametric models use restricted cubic splines for ln(t).


Flexible parametric models: incorporating splines

We thus model on the log cumulative hazard scale.

\[ \ln[H(t \mid x_i)] = \ln[H_0(t)] + x_i \beta \]

This is a proportional hazards model.

Restricted cubic splines with knots, $k_0$, are used to model the log baseline cumulative hazard.

\[ \ln[H(t \mid x_i)] = \eta_i = s(\ln(t) \mid \gamma, k_0) + x_i \beta \]

For example, with 4 knots we can write

\[ \ln[H(t \mid x_i)] = \eta_i = \underbrace{\gamma_0 + \gamma_1 z_{1i} + \gamma_2 z_{2i} + \gamma_3 z_{3i}}_{\text{log baseline cumulative hazard}} + \underbrace{x_i \beta}_{\text{log hazard ratios}} \]

We are fitting a linear predictor on the log cumulative hazard scale.




Survival and hazard functions

We can transform to the survival scale

\[ S(t \mid x_i) = \exp(-\exp(\eta_i)) \]

The hazard function is a bit more complex.

\[ h(t \mid x_i) = \frac{d\, s(\ln(t) \mid \gamma, k_0)}{dt} \exp(\eta_i) \]

This involves the derivatives of the restricted cubic spline functions.

However, these are easy to calculate.
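As a quick check of the two transformations, the Weibull model is the special case where the spline is simply linear in ln(t). The Python sketch below uses arbitrary illustrative parameter values (an assumption) and confirms that the general formulae reproduce the familiar closed-form Weibull survival and hazard functions.

```python
import numpy as np

lam, gam = 0.2, 1.5                       # illustrative Weibull parameters (assumed)
t = np.linspace(0.5, 5.0, 10)

eta = np.log(lam) + gam * np.log(t)       # log cumulative hazard, linear in ln(t)
deta_dt = gam / t                         # derivative of eta with respect to t
surv = np.exp(-np.exp(eta))               # S(t) = exp(-exp(eta))
haz = deta_dt * np.exp(eta)               # h(t) = d eta/dt * exp(eta)

# Closed-form Weibull quantities for comparison
print(np.allclose(surv, np.exp(-lam * t**gam)))
print(np.allclose(haz, lam * gam * t**(gam - 1)))
```

For a spline-based model the only change is that the derivative of the spline replaces gam / t.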


Fitting a proportional hazards model

Example: 24,889 women aged under 50 diagnosed with breast cancer in England and Wales 1986-1990.

Compare five deprivation groups from most affluent to most deprived.

No information on cause of death, but given their age, most women who die will die of their breast cancer.

Proportional hazards models

. stcox dep2-dep5,

. stpm2 dep2-dep5, df(5) scale(hazard) eform

The df(5) option implies using 4 internal knots and 2 boundary knots at their default locations.

The scale(hazard) option requests the model to be fitted on the log cumulative hazard scale.


Cox Model

Cox proportional hazards model

. stcox dep2-dep5,

         failure _d:  dead == 1
   analysis time _t:  survtime
  exit on or before:  time 5

Iteration 0:   log likelihood = -73334.091
Iteration 1:   log likelihood = -73303.081
Iteration 2:   log likelihood = -73302.997
Iteration 3:   log likelihood = -73302.997
Refining estimates:
Iteration 0:   log likelihood = -73302.997

Cox regression -- Breslow method for ties

No. of subjects =        24889                  Number of obs   =     24889
No. of failures =         7366
Time at risk    =   104638.953
                                                LR chi2(4)      =     62.19
Log likelihood  =   -73302.997                  Prob > chi2     =    0.0000

          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
        dep2 |   1.048716   .0353999     1.41   0.159     .9815786    1.120445
        dep3 |    1.10618   .0383344     2.91   0.004      1.03354    1.183924
        dep4 |   1.212892   .0437501     5.35   0.000     1.130104    1.301744
        dep5 |   1.309478   .0513313     6.88   0.000     1.212638    1.414051


Flexible parametric proportional hazards model

Flexible Parametric Proportional Hazards Model

. stpm2 dep2-dep5, df(5) scale(hazard) eform

Iteration 0:   log likelihood = -22507.096
Iteration 1:   log likelihood = -22502.639
Iteration 2:   log likelihood = -22502.633
Iteration 3:   log likelihood = -22502.633

Log likelihood = -22502.633                     Number of obs   =     24889

             |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
xb           |
        dep2 |   1.048752   .0354011     1.41   0.158     .9816125    1.120483
        dep3 |    1.10615   .0383334     2.91   0.004     1.033513    1.183893
        dep4 |   1.212872   .0437493     5.35   0.000     1.130085    1.301722
        dep5 |   1.309479   .0513313     6.88   0.000     1.212639    1.414052
       _rcs1 |   2.126897   .0203615    78.83   0.000     2.087361    2.167182
       _rcs2 |   .9812977   .0074041    -2.50   0.012     .9668927    .9959173
       _rcs3 |   1.057255   .0043746    13.46   0.000     1.048715    1.065863
       _rcs4 |   1.005372   .0020877     2.58   0.010     1.001288    1.009472
       _rcs5 |   1.002216   .0010203     2.17   0.030     1.000218    1.004218


Proportional hazards models

The hazard ratios and 95% confidence intervals are very similar.

I have yet to find an example of a proportional hazards model where there is a large difference in the estimated hazard ratios.

If you are just interested in hazard ratios in a proportional hazards model, then you can get away with poor modelling of the baseline hazard.

One important exception is when the follow-up time differs between groups.

It is of course better to model the baseline hazard well!


Simple predictions

To predict the survival and hazard functions use the following

The predict command

. predict survpred, survival

. predict hazpred, hazard

To estimate confidence intervals use the ci option.

To predict for particular covariate patterns use the at() option.

The at() option

. predict haz_male_age50, hazard ci at(male 1 age 50)


Simple predictions 2

The zeros option sets the values of all covariates, other than those specified in the at() option, to zero. For example, the baseline survival function can be estimated using

The zeros option

. predict surv_baseline, survival ci zeros


Survival Function

[Figure: proportion alive (0.6 to 1) against time from diagnosis (0 to 5 years), by deprivation group: Least Deprived, 2, 3, 4, Most Deprived.]


Hazard Function ×1000

[Figure: predicted mortality rate (per 1000 person-years, 0 to 150), by deprivation group: Least Deprived, 2, 3, 4, Most Deprived.]


Sensitivity to knots

When using splines it is important to ask if the fitted values are sensitive to the number and the location of the knots.

Too many knots will overfit with local ‘humps and bumps’.

Too few knots will underfit.

In most situations the choice of knots is not crucial.

We can use the AIC and BIC to help us select how many knots to use, but a simple sensitivity analysis is recommended.


Example of different knots for baseline hazard

[Figure: predicted baseline mortality rate (per 1000 person-years, 0 to 100) for models with 1 to 10 df.]


1 df: AIC = 53746.92, BIC = 53788.35

2 df: AIC = 53723.60, BIC = 53771.93

3 df: AIC = 53521.06, BIC = 53576.29

4 df: AIC = 53510.33, BIC = 53572.47

5 df: AIC = 53507.78, BIC = 53576.83

6 df: AIC = 53511.59, BIC = 53587.54

7 df: AIC = 53510.06, BIC = 53592.91

8 df: AIC = 53510.78, BIC = 53600.54

9 df: AIC = 53509.62, BIC = 53606.28

10 df: AIC = 53512.35, BIC = 53615.92
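The comparison across degrees of freedom can be tabulated directly. This short Python sketch copies the AIC and BIC values listed above and picks out the model each criterion prefers; the 5-unit window used to flag "similar" models is an arbitrary illustrative choice, not a recommendation from these notes.

```python
# AIC and BIC values copied from the list above, indexed by degrees of freedom
aic = {1: 53746.92, 2: 53723.60, 3: 53521.06, 4: 53510.33, 5: 53507.78,
       6: 53511.59, 7: 53510.06, 8: 53510.78, 9: 53509.62, 10: 53512.35}
bic = {1: 53788.35, 2: 53771.93, 3: 53576.29, 4: 53572.47, 5: 53576.83,
       6: 53587.54, 7: 53592.91, 8: 53600.54, 9: 53606.28, 10: 53615.92}

best_aic = min(aic, key=aic.get)      # df with the lowest AIC
best_bic = min(bic, key=bic.get)      # df with the lowest BIC
print(best_aic, best_bic)

# Models whose AIC is within 5 units of the best (arbitrary window, an assumption)
close = [df for df, value in aic.items() if value - aic[best_aic] <= 5]
print(close)
```

The AIC is lowest at 5 df and the BIC at 4 df, while every model with 4 or more df sits within a few AIC units of the best, which fits the point that the exact number of knots is rarely crucial.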


Where to place the knots?

The default knots positions tend to work fairly well.

Unless the knots are in silly places then there is usually very little difference in the fitted values.

The graphs on the following pages show, for 5 df (4 interior knots), the fitted hazard and survival functions with the interior knot locations randomly selected.


Random knot positions for baseline hazard

[Figure: predicted baseline mortality rate (per 1000 person-years, 0 to 100) for ten sets of randomly selected interior knot positions, listed below.]


13.7 55.8 60.5 64.3

6.1 10.9 61.8 68.4

4.5 25.5 55.5 87.1

42.4 52.2 84.1 89.8

21.1 26.5 56.4 94.8

11.8 27.7 40.8 72.2

42.2 46.1 87.2 89.4

5.8 67.6 69.9 71.5

9.8 23.2 35.3 59.5

10.2 10.9 57.7 80.7


Effect of location of knots on baseline survival

[Figure: predicted baseline survival (0.7 to 1) for the same ten sets of randomly selected interior knot positions as in the previous figure.]


How well do splines approximate the hazard?[4]

Journal of Statistical Computation and Simulation, 2013

http://dx.doi.org/10.1080/00949655.2013.845890

The use of restricted cubic splines to approximate complex hazard functions in the analysis of time-to-event data: a simulation study

Mark J. Rutherford, Michael J. Crowther and Paul C. Lambert

We do not believe the spline function is the true model, but it provides a very good approximation.

We assessed this in a simulation study.


Simulation Study (Rutherford et al.)

Want to assess how well splines approximate the true function.

Generate data assuming a mixture Weibull distribution,

\[ S(t) = \pi \exp(-\lambda_1 t^{\gamma_1}) + (1 - \pi) \exp(-\lambda_2 t^{\gamma_2}) \]

For various scenarios,

Generate 1000 data sets under proportional hazards.

Fit various restricted cubic spline models (varying degrees of freedom).

Fit Cox model.

Fit true model (mixture Weibull).
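The mixture Weibull survival function above implies a hazard function that can take more complex shapes than a single Weibull. A minimal Python sketch, with parameter values invented purely for illustration (they are not the scenarios used in the paper), computes the hazard as density over survival and checks it against a numerical derivative of the log survival function.

```python
import numpy as np

# Assumed illustrative values, not the scenarios from the simulation study
pi_, lam1, gam1, lam2, gam2 = 0.5, 0.6, 0.8, 0.1, 1.6

def surv(t):
    return pi_ * np.exp(-lam1 * t**gam1) + (1 - pi_) * np.exp(-lam2 * t**gam2)

def dens(t):
    # density f(t): minus the derivative of S(t)
    return (pi_ * lam1 * gam1 * t**(gam1 - 1) * np.exp(-lam1 * t**gam1)
            + (1 - pi_) * lam2 * gam2 * t**(gam2 - 1) * np.exp(-lam2 * t**gam2))

def hazard(t):
    return dens(t) / surv(t)              # h(t) = f(t) / S(t)

t = np.linspace(0.5, 10.0, 20)
eps = 1e-6
numeric = -(np.log(surv(t + eps)) - np.log(surv(t - eps))) / (2 * eps)
print(np.allclose(hazard(t), numeric, rtol=1e-5))
```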


True hazard functions

[Figure: true hazard functions for Scenarios 1 to 4; hazard rate (0 to 2.5) against time since diagnosis (0 to 10 years).]


True survival functions

[Figure: true survival functions for Scenarios 1 to 4; survival (0 to 1).]


Comparison of log hazard ratios (scenario 3)

[Figure: log hazard ratios from the Cox model plotted against those from the flexible parametric model; both axes from -0.6 to -0.4.]

Similar for other scenarios

Near perfect agreement for standard errors as well


Evaluating hazard and survival functions

For each model the absolute area difference between the fitted and true functions was calculated over the 10 years of follow-up.

[Figure: two panels, hazard function and survival function against follow-up time (0 to 10 years), each showing the true function, a fitted Weibull model and the integral area between them.]


Restricted cubic splines vs true model (hazard)

[Figure: percentage of total area difference on the hazard scale (0 to 60) against degrees of freedom (1 to 10), for sample sizes 300, 3000 and 30,000; Scenario 3.]


Restricted cubic splines vs true model (survival)

[Figure: percentage of total area difference on the survival scale (0 to 25) against degrees of freedom (1 to 10), for sample sizes 300, 3000 and 30,000; Scenario 3.]

See Mark’s paper for more details [4].

Modelling time-dependent effects

In studies I am involved in we frequently have non-proportional hazards.

Time-dependent effects can be introduced.

If we have non-proportional hazards, there is a covariate × time interaction.

With D covariates with time-dependent effects,

\[ \ln[H_i(t \mid x_i)] = s(\ln(t) \mid \gamma, k_0) + \sum_{j=1}^{D} s(\ln(t) \mid \delta_j, k_j)\, x_{ij} + x_i \beta \]

Generally have fewer knots for interaction term than for baseline.

Hazard ratio as a function of (log) time is a simple case.

Need some caution with interpretation with multiple time-dependent effects.


Predicting hazard ratios

. stpm2 dep5, scale(hazard) df(5) tvc(dep5) dftvc(3)

. predict hr_tvc, hrnumerator(dep5 1) hrdenominator(dep5 0) ci

[Figure: predicted hazard ratio (1 to 3.5), most deprived vs least deprived.]



More useful predictions

A key advantage of using a parametric model over the Cox model is that we can transform the model parameters to express differences between groups in different ways.

The hazard ratio is a relative measure and a greater understanding of the impact of an exposure can be obtained by also looking at absolute differences.

For two covariate patterns, $x_1$ and $x_2$, we can obtain

Differences in hazard rates

\[ h(t \mid x_1) - h(t \mid x_2) \]

Differences in survival functions

\[ S(t \mid x_1) - S(t \mid x_2) \]

Use the delta-method to calculate confidence intervals.

Difference in hazard functions

Most Deprived - Least Deprived

. predict hdiff, hdiff1(dep5 1) hdiff2(dep5 0) ci

[Figure: difference in mortality rate (per 1000 person-years, 0 to 200).]



Predicted survival functions

[Figure: proportion alive (0.6 to 1.0) for the Least Deprived and Most Deprived groups.]


Difference in survival proportions

Most Deprived - Least Deprived

. predict sdiff, sdiff1(dep5 1) sdiff2(dep5 0) ci

[Figure: difference in survival curves (-0.10 to 0.02).]



Software

Log cumulative hazard scale

Stata - stpm2 [5]

R - Rstpm2 (http://rstpm2.r-forge.r-project.org/), flexsurv (http://cran.r-project.org/web/packages/flexsurv)

Log hazard scale

Stata - stgenreg [6], strcs

Summary Session 1

The hazard and survival functions are of interest and it is easier if they are directly estimated within our model.

We need to improve the way we quantify what our model parameters mean at both the population and individual level. Generally need estimates of absolute rates/risks for this.

Particularly useful when we have non-proportional hazards.

‘Reasonable’ choices of knots lead to very similar fitted values.

Parametric models particularly useful for extrapolation.


Introduction to Session 2

I have introduced the basic idea of flexible parametric models.

In this session I will cover three extensions,

Attained age as the time-scale.

Adjusted (standardised) survival curves.

Competing risks.

I am only giving a brief overview of each extension.


Example of Attained Age as the Time-scale

Study from Sweden [7] comparing incidence of hip fracture among

17,731 men diagnosed with prostate cancer treated with bilateral orchiectomy.

43,230 men diagnosed with prostate cancer not treated with bilateral orchiectomy.

362,354 men randomly selected from the general population.

Study entry is 6 months post diagnosis.

Outcome is femoral neck fracture.

Risk of fracture varies by age.

Attained age is used as the main time-scale.

Alternative way of “adjusting” for age.

Gives the age specific incidence rates.

Actually, two timescales, but will initially ignore time from diagnosis.


Estimates from a proportional hazards model

Cox Model

Incidence rate ratio (no orchiectomy) = 1.37 (1.28 to 1.46)

Incidence rate ratio (orchiectomy) = 2.09 (1.93 to 2.27)

Flexible Parametric Model

Incidence rate ratio (no orchiectomy) = 1.37 (1.28 to 1.46)

Incidence rate ratio (orchiectomy) = 2.09 (1.93 to 2.27)


Proportional Hazards

[Figure: incidence rate (per 1000 person-years, log scale from 0.1 to 75) against age (50 to 100) for the Control, No Orchiectomy and Orchiectomy groups.]


Non Proportional Hazards

[Figure: incidence rate (per 1000 person-years, log scale from 0.1 to 75) against age (50 to 100) for the Control, No Orchiectomy and Orchiectomy groups.]


Incidence Rate Ratio

[Figure: incidence rate ratio (log scale, 1 to 50) against age (50 to 100); Orchiectomy vs Control.]


Incidence Rate Difference

[Figure: difference in incidence rates (per 1000 person-years, 0 to 30) against age (50 to 100); Orchiectomy vs Control.]


Adjusted (Standardised) Survival Curves

When exploring our data we produce descriptive plots of the survival curve by our exposure variable.

These curves are not adjusted for confounding and any differences could be explained by imbalance between covariates.

Adjustment for measured confounders is usually performed through fitting a regression model.

The reported parameter is often an adjusted hazard ratio. This is often all that is reported.

We can still obtain survival curves from these models (Cox, flexible parametric, etc.), which can help in our interpretation of the impact of the hazard ratio on probabilities of survival.


Average survival curves

There is not a single definition of ‘average survival curve’. Approaches include using the mean value of all covariates [8], the mean of all predicted survival curves [9] and inverse probability weighting [10].

Most software (e.g., stcurve) uses the mean covariate method. This gives the survival for an individual who happens to have the mean value of all covariates. For example, for a Cox model the mean survival is,

\[ \hat{S}_{ind}(t) = \exp\left(-H_0(t) \exp(\beta_1 \bar{x}_1 + \beta_2 \bar{x}_2)\right) \]

This is the survival of an ‘average’ individual, who happens to have the average values of all covariates.

Problem with categorical covariates. May be someone with a proportion of each stage and who is 50% male.


Standardized survival curves

Also known as direct adjustment.

The predicted survival for individual i is

\[ \hat{S}_i(t) = \exp\left(-H_0(t) \exp(\beta_1 x_{1i} + \beta_2 x_{2i})\right) \]

We can also average over all predicted survival curves

\[ \hat{S}_P(t) = \frac{1}{N} \sum_{i=1}^{N} \hat{S}_i(t) \]

Note that the model can be as complex as we want (continuous covariates, interactions, non-linear functions, non-proportional hazards).

$\hat{S}_P(t)$ will be smaller than $\hat{S}_{ind}(t)$.
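The difference between the two kinds of 'average' curve can be seen numerically. The Python sketch below uses simulated covariates and made-up coefficients (all assumptions, not estimates from any data in these notes) and evaluates both averages at a single time point where the cumulative hazards are all below 1; in that range the mean of the predicted survival values falls below the survival at the mean covariates.

```python
import numpy as np

rng = np.random.default_rng(2014)
n = 1000
age = rng.normal(60.0, 10.0, n)        # hypothetical continuous covariate
male = rng.integers(0, 2, n)           # hypothetical binary covariate

beta_age, beta_male = 0.03, 0.4        # assumed log hazard ratios
H0_t = 0.02                            # assumed baseline cumulative hazard at one time t

lp = beta_age * age + beta_male * male
s_i = np.exp(-H0_t * np.exp(lp))       # predicted survival for each individual

s_pop = s_i.mean()                               # mean of predicted survival curves
s_ind = np.exp(-H0_t * np.exp(lp.mean()))        # survival at the mean covariate values
print(round(s_pop, 3), round(s_ind, 3), s_pop < s_ind)
```

Note that the mean covariate pattern here corresponds to someone who is roughly 50% male, which is the interpretational problem with categorical covariates mentioned earlier.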


Standardized survival curves

When interest lies in comparing the survival of (two) exposure groups we need to standardize to the same covariate distribution.

Let X be the exposure of interest.

Let Z denote the set of measured covariates.

\[ \hat{S}_P(t \mid X = x, Z) = \frac{1}{N} \sum_{i=1}^{N} \hat{S}_i(t \mid X = x, Z_i) \]

Note that the average is over the marginal distribution of Z, not over the conditional distribution of Z among those with X = x.

This is needed to compare like-with-like.

We are forcing the same covariate distribution on both exposure groups.
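A small simulated example may help separate the marginal from the conditional average. In this Python sketch the exposure, the age distribution and the coefficients are all invented for illustration; surv is a hypothetical helper standing in for a fitted model's predicted survival at one time point.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.integers(0, 2, n)                       # hypothetical binary exposure
z = rng.normal(55.0 + 7.0 * (1 - x), 12.0)      # age, older on average in the unexposed

beta_x, beta_z = 0.1, 0.05                      # assumed log hazard ratios
H0_t = 0.01                                     # assumed baseline cumulative hazard at time t

def surv(x_val, z_val):
    """Predicted survival at time t for exposure x_val and covariate z_val."""
    return np.exp(-H0_t * np.exp(beta_x * x_val + beta_z * z_val))

# Standardised: set X = x for everyone and average over the whole sample's Z
s_std0 = surv(0, z).mean()
s_std1 = surv(1, z).mean()

# Not standardised: average only over Z among those who actually have X = x
s_cond0 = surv(0, z[x == 0]).mean()
s_cond1 = surv(1, z[x == 1]).mean()

print(round(s_std0, 3), round(s_std1, 3))
print(round(s_cond0, 3), round(s_cond1, 3))
```

Because both standardised values use the same set of Z values, any difference between them reflects the modelled exposure effect only, whereas the conditional averages also reflect the different age distributions.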


Average and Adjusted survival in stpm2

In stpm2 the meansurv option obtains an average over all predicted survival curves.

Works for continuous covariates, time-dependent effects etc.

For adjusted survival curves we force the distribution of one or more covariates (e.g., age) to be the same when comparing groups of interest.

When comparing calendar periods we can predict survival in one calendar period assuming it has the age distribution of another (reference) calendar period.


Renal Example

252 patients entering a renal dialysis program in Leicestershire, England 1982-1991 with follow-up to the end of 1994.

Interest in difference in survival by ethnicity (Non-South Asian vs South Asian).

At the time of the study approximately 25% of the population in Leicester was of South Asian origin (currently around 36%).


Kaplan-Meier Curves - Renal Replacement Therapy

Unadjusted HR = 0.62 (0.41, 0.94)

Age adjusted HR = 1.14 (0.73, 1.79)

[Figure: Kaplan-Meier survival function (0 to 1) against survival time (0 to 8 years) for Non-Asian and Asian patients, annotated with Mean Age = 62.9 and Mean Age = 55.5.]


Predictions for Standardised Survival Curves

The meansurv option

stpm2 asian age, df(3) scale(hazard)

/* Age distribution for study population as a whole */

predict meansurv_pop0, meansurv at(asian 0)

predict meansurv_pop1, meansurv at(asian 1)

Survival curve calculated for each subject in the study population and then averaged.

The adjusted curves show the survival we would expect to see in both groups if each had the age distribution of the study population as a whole.

In large studies use the timevar() option to predict survival for each individual at fewer time points.


Adjusted Survival Curve 1

[Figure: adjusted survival function (0 to 1) against survival time (0 to 8 years) for Non-Asian and Asian patients; age distribution in whole study population.]


Summary: Standardised Survival Curves

Standardised survival curves provide a useful summary.

However, standardisation provides an average and hides important and interesting variation in survival.

Survival model can be as complex as we like (non-proportional hazards, non-linear effects, various interactions, etc.), but so much easier in a parametric framework.

Related to age-standardisation in relative survival.

Also possible to externally standardise [11].

Use to ask ‘What if?’ questions.

How many fewer breast cancer deaths would we see in England if the stage distribution of those living in deprived areas matched that of the most affluent areas? [12]


Competing risks... been around for a while....

Daniel Bernoulli (1700-1782)

Seminal paper (1766) describing how to separate the risk of dying from smallpox from that of other causes.

Also, gain in life expectancy if deaths from smallpox could be eliminated.


Survival analysis - basic requirements

Our outcome of interest is death from (some specific) cancer.

[Diagram: timeline of person-time at risk, from date of cancer diagnosis to cancer death.]

Require: Precise definitions of start/end of follow-up, and a relevant time-scale (e.g., time since diagnosis).



We now introduce censoring (e.g., emigration/administrative).

[Diagram: follow-up ends at emigration; what happens afterwards is unknown (?).]

Assumption: Individuals who are censored can be represented by those who remain in the risk set (non-informative censoring).

For censored individuals, all we know is that their survival time is after their censoring time.


Competing risks in the setting of cause-specific survival

Let’s assume that the patient dies from a myocardial infarction (from death certificate).

[Diagram: follow-up ends at MI death.]

Definition: Competing events are any events that preclude the event of interest from occurring.


Competing risks in the setting of cause-specific survival

Competing deaths are typically also handled by censoring.

[Diagram: follow-up censored at MI death; what would have happened afterwards is unknown (?).]

An MI death makes death due to cancer impossible. However, under certain assumptions, censoring for competing events provides us with an estimate of net survival.


Net survival

In cancer patient survival we often want to ‘eliminate’ the competing events to estimate net survival.

It is important to recognise that net survival is interpreted in a hypothetical world where competing risks are assumed to be eliminated, i.e. it is not possible to die from other causes.

Important for comparisons between populations (e.g. demographic groups, regions, countries) where mortality due to other causes may vary.

As we never observe net survival, we have to make assumptions to estimate it.


Can we interpret cause-specific survival?

[Figure: survival probability (0 to 1) against years since diagnosis (0 to 20); colon cancer, age 75+.]

The independence assumption is key

The time to death from the cancer in question is conditionally independent of the time to death from other causes, i.e., there should be no factors that influence both cancer and non-cancer mortality other than those factors that have been controlled for in the estimation.


Independence assumption - interpretation of survival curves

The independence assumption is not satisfied

The survival curves do not provide an estimate of net survival.

If it is not possible to control for the mechanism that introduces the dependence the survival curves should be interpreted with care.

However, the cause-specific hazard rates still have a useful interpretation as the rates that are observed when competing risks are present.


Independence assumption - interpretation of survival curves

The independence assumption is satisfied

Cause-specific survival curves provide estimates of net survival (provided that the classification of cause of death is accurate).

The survival curves are interpreted as the survival that we would observe if it was possible to eliminate all competing causes of death.

This is a strictly hypothetical (but useful!) construct.

The cause-specific hazard rates provide estimates of the rates that we would observe in the absence of competing causes of death.

In the competing risks literature, net survival and hazard are typically referred to as marginal survival and hazard, respectively.


Cause-specific survival

Colon cancer age 75+

Cause-specific Kaplan-Meier estimates do not give probability of having the event.

[Figure: probability of death (0 to 1); curves: Cause-specific (Cancer) and Cause-specific (Other Causes).]

Probabilities sum to > 1.

Cumulative Incidence Functions (CIF)

We want the probability of dying of cause k accounting for the competing risks.

For cause k,

\[ CIF_k(t) = P(T \le t, \text{event} = k) \]

\[ CIF_k(t) = \int_0^t S(u)\, h_k(u)\, du \]

Note: CIF does not require independence between causes.

For further details on competing risks see references [13, 14, 15].

Post estimation command stpm2cif will estimate CIFs and related measures after using stpm2 [16, 17].
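The integral defining the CIF is easy to evaluate numerically. The Python sketch below is not how stpm2cif is implemented; it simply assumes two constant cause-specific hazards (invented values) so that the integral can be approximated with the trapezoidal rule and compared with 1 minus the all-cause survival.

```python
import numpy as np

h1, h2 = 0.10, 0.25                 # assumed constant cause-specific hazards
t = np.linspace(0.0, 20.0, 2001)
surv = np.exp(-(h1 + h2) * t)       # all-cause survival S(u)

def cif(h_k):
    """Trapezoidal approximation of the integral of S(u) h_k du from 0 to each t."""
    integrand = surv * h_k
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)
    return np.concatenate([[0.0], np.cumsum(steps)])

cif1, cif2 = cif(h1), cif(h2)
naive1 = 1 - np.exp(-h1 * t)        # 1 - cause-specific survival, cause 1
naive2 = 1 - np.exp(-h2 * t)        # 1 - cause-specific survival, cause 2

print(np.allclose(cif1 + cif2, 1 - surv, atol=1e-4))   # CIFs add up to all-cause death
print(naive1[-1] + naive2[-1] > 1)                      # naive probabilities can exceed 1
```

The two CIFs add up to the all-cause probability of death, whereas the complements of the cause-specific survival curves can sum to more than 1, as in the colon cancer figure.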


Partitioning

Cause-specific hazards

\[ h(t) = \sum_{k=1}^{K} h_k(t) \]

e.g. all-cause mortality rate is sum of cause-specific mortality rates.

Cause-specific incidence functions

\[ CIF(t) = \sum_{k=1}^{K} CIF_k(t) \]

e.g. all-cause probability of death is sum of probability of death from each cause under consideration.


Modelling cause-specific hazards

Expanding the data

. expand 2

. by id, sort: generate cause = _n

. gen cancer = cause==1

. gen other = cause==2

. generate event = (cause==status)

. stset surv_mm, failure(event) scale(12)

Data setup

. list id cause _t event in 1/6, noobs sepby(id)

  id   cause              _t   event

   1   cancer          1.375       1
   1   other causes    1.375       0

   2   cancer          6.875       0
   2   other causes    6.875       1

   3   cancer           .125       1
   3   other causes     .125       0
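The same data expansion can be written out step by step. This Python sketch mirrors the logic of the Stata commands for the three subjects listed above; the status coding (1 = cancer death, 2 = death from other causes) is an assumption inferred from the listing.

```python
# (id, survival time in years, status); status coding is assumed:
# 1 = cancer death, 2 = death from other causes
subjects = [(1, 1.375, 1), (2, 6.875, 2), (3, 0.125, 1)]
cause_labels = {1: "cancer", 2: "other causes"}

rows = []
for subject_id, time, status in subjects:
    for cause in (1, 2):                       # one row per cause, as with expand 2
        rows.append({
            "id": subject_id,
            "cause": cause_labels[cause],
            "t": time,
            "event": int(cause == status),     # event only on the row matching the cause of death
        })

for row in rows:
    print(row)
```

Each subject contributes the same follow-up time to both rows, but is counted as an event only for the cause they died from and as censored for the other.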


Fitting the model

. stpm2 cancer other, scale(hazard) tvc(cancer other) rcsbaseoff dftvc(4) nocons

Iteration 0:   log likelihood = -13724.324
Iteration 1:   log likelihood = -13123.992
Iteration 2:   log likelihood = -12795.284
Iteration 3:   log likelihood = -12776.812
Iteration 4:   log likelihood = -12776.767
Iteration 5:   log likelihood = -12776.767

Log likelihood = -12776.767                     Number of obs   =     11736

             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
xb           |
      cancer |  -.8825017   .0191499   -46.08   0.000    -.9200348   -.8449686
       other |  -2.560891   .0478737   -53.49   0.000    -2.654721    -2.46706
_rcs_cancer1 |    1.00009   .0156182    64.03   0.000     .9694788    1.030701
_rcs_cancer2 |   .3121854   .0115509    27.03   0.000      .289546    .3348249
_rcs_cancer3 |  -.0555807   .0069943    -7.95   0.000    -.0692892   -.0418722
_rcs_cancer4 |   .0224981   .0039218     5.74   0.000     .0148116    .0301847
 _rcs_other1 |     1.5319   .0454149    33.73   0.000     1.442889    1.620912
 _rcs_other2 |  -.0856169   .0329804    -2.60   0.009    -.1502572   -.0209766
 _rcs_other3 |  -.2336573   .0200657   -11.64   0.000    -.2729853   -.1943293
 _rcs_other4 |  -.0193576   .0124958    -1.55   0.121     -.043849    .0051338

Should think about separate knot positions by cause.


Predicted CIF (Age 75+)

Predict CIFs

stpm2cif cancer0 other0, cause1(cancer 1) cause2(other 1) ci

[Figure: probability of death (0 to 1); curves: Cause-specific (Cancer) and Cause-specific (Other Causes).]


Summary: Competing Risks

Cumulative incidence functions are useful for understanding individual risk.

They will help in our interpretation of the importance of any differences we see on the hazard scale.

Advantages of standard flexible parametric models carry over to competing risks models.

For etiological associations, CIFs (and Fine and Gray models) can mislead[18].

I have recently extended the ability of stpm2 to fit a parametric equivalent of the Fine and Gray model (and other models), but see warning above.
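A small worked example may help show why CIFs can mislead for etiological questions. The sketch below assumes constant cause-specific hazards, for which the CIF has a closed form; all hazard values are hypothetical. The exposure has no effect on the cancer hazard (hazard ratio of 1) and only raises other-cause mortality.

```python
import math


def cif_constant(h_cause, h_competing, t):
    """CIF at time t for one cause when both cause-specific hazards are constant."""
    total = h_cause + h_competing
    return h_cause / total * (1.0 - math.exp(-total * t))


# Hypothetical hazards per year, for illustration only.
h_cancer = 0.10              # identical in both groups: cancer hazard ratio is 1
h_other_unexposed = 0.05
h_other_exposed = 0.20       # exposure raises other-cause mortality only

t = 10.0
cif_unexposed = cif_constant(h_cancer, h_other_unexposed, t)
cif_exposed = cif_constant(h_cancer, h_other_exposed, t)
print(cif_unexposed, cif_exposed)
```

The exposed group has the lower probability of dying from cancer, even though exposure does nothing to the cancer hazard: more of the exposed die of other causes first. Read on its own, the CIF comparison could be mistaken for a protective effect on cancer.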


Summary Session 2

All results presented in the second session could in theory be obtained using a Cox model.

However, predictions are so much easier within a parametric setting.

Particularly so with time-dependent effects.

Think about presenting more than just an adjusted hazard ratio.


Further extensions

Modelling excess mortality / relative survival[19].

Crude probabilities[20, 21].

Loss in expectation of life[22].

Extrapolation.

Partitioning excess mortality[23, 24].

Avoidable deaths[21, 12].

Cure models[25, 26].

Modelling on other scales (proportional odds, etc)[27, 3]

Predicting conditional survival

Predicting centiles of survival distribution.

Restricted mean survival time[28]


References

[1] Reid N. A conversation with Sir David Cox. Statistical Science 1994;9:439–455.

[2] Durrleman S, Simon R. Flexible regression models with cubic splines. Statistics in Medicine 1989;8:551–561.

[3] Royston P, Parmar MKB. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 2002;21:2175–2197.

[4] Rutherford MJ, Crowther MJ, Lambert PC. The use of restricted cubic splines to approximate complex hazard functions in the analysis of time-to-event data: a simulation study. Journal of Statistical Computation and Simulation 2014 (in press).

[5] Lambert PC, Royston P. Further development of flexible parametric models for survival analysis. The Stata Journal 2009;9:265–290.

[6] Crowther MJ, Lambert PC. stgenreg: A Stata package for general parametric survival analysis. Journal of Statistical Software 2013;53:1–17.

[7] Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Stat Med 2004;23:51–64.

[8] Cupples LA, Gagnon DR, Ramaswamy R, D’Agostino RB. Age-adjusted survival curves with application in the Framingham study. Statistics in Medicine 1995;14:1731–1744.



[9] Nieto FJ, Coresh J. Adjusting survival curves for confounders: a review and a new method. American Journal of Epidemiology 1996;143:1059–1068.

[10] Cole SR, Hernán MA. Adjusted survival curves with inverse probability weights. Comput Methods Programs Biomed 2004;75:45–49.

[11] Colzani E, Liljegren A, Johansson ALV, Adolfsson J, Hellborg H, Hall PFL, Czene K. Prognosis of patients with breast cancer: causes of death and effects of time since diagnosis, age, and tumor characteristics. J Clin Oncol 2011;29:4014–4021.

[12] Rutherford MJ, Hinchliffe SR, Abel GA, Lyratzopoulos G, Lambert PC, Greenberg DC. How much of the deprivation gap in cancer survival can be explained by variation in stage at diagnosis: An example from breast cancer in the East of England. International Journal of Cancer 2013.

[13] Andersen PK, Abildstrom SZ, Rosthøj S. Competing risks as a multi-state model. Stat Methods Med Res 2002;11:203–215.

[14] Coviello V, Boggess M. Cumulative incidence estimation in the presence of competing risks. The Stata Journal 2004;4:103–112.

[15] Putter H, Fiocco M, Geskus RB. Tutorial in biostatistics: competing risks and multi-state models. Stat Med 2007;26:2389–2430.



[16] Hinchliffe SR, Lambert PC. Flexible parametric modelling of cause-specific hazards to estimate cumulative incidence functions. BMC Medical Research Methodology 2013;13:13.

[17] Hinchliffe SR, Lambert PC. Extending the flexible parametric survival model for competing risks. The Stata Journal 2013;13:344–355.

[18] Bhaskaran K, Rachet B, Evans S, Smeeth L. Re: Helene Hartvedt Grytli, Morten Wang Fagerland, Sophie D. Fosså, Kristin Austlid Taskén. Association between use of β-blockers and prostate cancer-specific survival: a cohort study of 3561 prostate cancer patients with high-risk or metastatic disease. Eur Urol. In press. http://dx.doi.org/10.1016/j.eururo.2013.01.007: beta-blockers and prostate cancer survival–interpretation of competing risks models. Eur Urol 2013;64:e86–e87.

[19] Nelson CP, Lambert PC, Squire IB, Jones DR. Flexible parametric models for relative survival, with application in coronary heart disease. Stat Med 2007;26:5486–5498.

[20] Lambert PC, Dickman PW, Nelson CP, Royston P. Estimating the crude probability of death due to cancer and other causes using relative survival models. Stat Med 2010;29:885–895.

[21] Lambert PC, Holmberg L, Sandin F, Bray F, Linklater KM, Purushotham A, et al. Quantifying differences in breast cancer survival between England and Norway. Cancer Epidemiology 2011;35:526–533.



[22] Andersson TML, Dickman PW, Eloranta S, Lambe M, Lambert PC. Estimating the loss in expectation of life due to cancer using flexible parametric survival models. Statistics in Medicine 2013;32:5286–5300.

[23] Eloranta S, Lambert PC, Andersson TML, Czene K, Hall P, Björkholm M, Dickman PW. Partitioning of excess mortality in population-based cancer patient survival studies using flexible parametric survival models. BMC Med Res Methodol 2012;12:86.

[24] Eloranta S, Lambert PC, Sjöberg J, Andersson TML, Björkholm M, Dickman PW. Temporal trends in mortality from diseases of the circulatory system after treatment for Hodgkin lymphoma: a population-based cohort study in Sweden (1973 to 2006). Journal of Clinical Oncology 2013;31:1435–1441.

[25] Andersson TML, Dickman PW, Eloranta S, Lambert PC. Estimating and modelling cure in population-based cancer studies within the framework of flexible parametric survival models. BMC Med Res Methodol 2011;11:96.

[26] Andersson TML, Eriksson H, Hansson J, Månsson-Brahme E, Dickman PW, Eloranta S, et al. Estimating the cure proportion of malignant melanoma, an alternative approach to assess long term survival: a population-based study. Cancer Epidemiol 2014;38:93–99.

[27] Royston P, Lambert PC. Flexible parametric survival analysis in Stata: Beyond the Cox model. Stata Press, 2011.



[28] Royston P, Parmar MKB. The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Stat Med 2011;30:2409–2421.

