Survival Analysis Diane Stockton
Feb 10, 2016
Survival Curves
The Y axis gives the proportion of people surviving, from 1 at the top (100% survival) to 0 at the bottom (0% survival).
The X axis gives the time after diagnosis.
A survival curve is a statistical picture of the survival experience of a group of patients in the form of a graph showing the percentage surviving versus time.
Survival Curves
Any point on the curve gives the proportion or percentage surviving at a particular time after the start of observation. E.g. the blue dot on the example curve shows that at one year, about 75% of patients were alive.
A survival curve always starts out with 100% survival at time zero, the beginning.
Questions in survival analyses
Which survival?
• Observed (or crude) survival
• Cause-specific (also known as net or corrected) survival
• Relative survival
• Period survival
Data issues
• Censoring
• Life tables
• Standardisation
Modelling survival
Observed (crude) survival
Observed survival = (number surviving the interval) / (number alive at the start of the interval)
Frequently, 1- and 5-year survival rates are reported; these are interpreted as the proportion surviving 1 or 5 years after diagnosis.
The median survival is the time at which the percentage surviving is 50%.
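As a minimal sketch of how these summaries are read off a curve, the snippet below takes a made-up step curve and returns the proportion surviving at a given time, plus the median survival. The times, proportions and function names are illustrative assumptions, not taken from any dataset in these slides.

```python
from bisect import bisect_right

# Illustrative step curve (assumed values): the proportion surviving
# applies from each listed time until the next one.
times = [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]        # years since diagnosis
surv = [1.00, 0.85, 0.75, 0.60, 0.48, 0.35]   # proportion surviving


def survival_at(t, times, surv):
    """Proportion surviving at time t, read from a step curve."""
    idx = bisect_right(times, t) - 1
    if idx < 0:
        raise ValueError("t is before the start of observation")
    return surv[idx]


def median_survival(times, surv):
    """First time at which the proportion surviving is 50% or less."""
    for t, s in zip(times, surv):
        if s <= 0.5:
            return t
    return None  # the curve never reaches 50% within follow-up


one_year = survival_at(1.0, times, surv)
five_year = survival_at(5.0, times, surv)
median = median_survival(times, surv)
print(one_year, five_year, median)
```

If the curve never drops to 50% during follow-up, the median cannot be estimated, which is why the sketch returns `None` in that case.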
Survival Curves
• The experience of a particular group, as represented by a staircase curve, can be considered an estimate or sample of what the "real" survival curve is for all people with the same circumstances.
• As with other estimates, the accuracy improves as the sample size increases.
• With staircase curves, the larger the group of patients, the smaller the step down caused by each death.
• If the times of the deaths are plotted accurately, then you can see that as the size of the group increases the staircase will become closer and closer to the ideal of a smooth curve.
Observed (crude) survival
Observed survival = (number surviving the interval) / (number alive at the start of the interval)
Often the length of follow-up is not the same for all patients, and some become “censored” during the interval. Usually we assume that each “censored” patient was at risk for only half of the interval, so:
Observed survival = (number surviving the interval) / (number alive at start of interval – 0.5 × number censored)
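The half-interval adjustment can be sketched as follows. This is a minimal illustration that assumes the usual actuarial form, 1 − deaths / (alive at start − 0.5 × censored), so "number surviving" is taken as the adjusted number at risk minus the deaths. The interval counts and function name are made up for the example.

```python
# Hypothetical yearly intervals: (alive at start, deaths, censored).
# Each start count equals the previous start minus deaths and censored.
intervals = [(100, 20, 4), (76, 10, 6), (60, 6, 8)]


def interval_survival(alive_at_start, deaths, censored):
    """Actuarial survival for one interval.

    Censored patients are assumed to be at risk for half the interval.
    """
    effective_at_risk = alive_at_start - 0.5 * censored
    return 1.0 - deaths / effective_at_risk


# Cumulative survival is the product of the interval-specific estimates.
cumulative = 1.0
cumulative_by_year = []
for alive, deaths, censored in intervals:
    cumulative *= interval_survival(alive, deaths, censored)
    cumulative_by_year.append(cumulative)

print([round(s, 3) for s in cumulative_by_year])
```

With no censoring the function reduces to the simple ratio of survivors to those alive at the start of the interval.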
Censored data
From Paul W Dickman Gothenburg slides
Censoring
• When a patient is censored the curve doesn't take a step down as it does when a patient dies.
• But censoring the patient reduces the number of patients who are contributing to the curve, so each death after that point represents a higher proportion of the remaining population, and so every step down afterwards will be a little bit larger than it would have been.
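The effect of censoring on step size can be seen in a small Kaplan-Meier sketch. The five-patient datasets and the function name are hypothetical. The code handles one record at a time, so tied deaths would appear as separate consecutive steps.

```python
def kaplan_meier(records):
    """Kaplan-Meier steps for (time, event) records.

    event is 1 for a death and 0 for a censored patient.
    Returns a list of (time, survival, step_size), one per death.
    """
    at_risk = len(records)
    surv = 1.0
    steps = []
    # At tied times, deaths are processed before censorings.
    for time, event in sorted(records, key=lambda r: (r[0], -r[1])):
        if event == 1:
            new_surv = surv * (1 - 1 / at_risk)
            steps.append((time, new_surv, surv - new_surv))
            surv = new_surv
        at_risk -= 1  # a death or a censoring both leave the risk set
    return steps


# Five patients who all die: every step down is 1/5 = 0.20.
steps_all_die = kaplan_meier([(1, 1), (2, 1), (3, 1), (4, 1), (5, 1)])

# Same patients, but the second is censored at time 2: no step at time 2,
# and each later death removes a larger share (0.8 / 3, about 0.267).
steps_one_censored = kaplan_meier([(1, 1), (2, 0), (3, 1), (4, 1), (5, 1)])

for time, surv, step in steps_one_censored:
    print(time, round(surv, 3), round(step, 3))
```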
Censoring
[Figure: Kaplan-Meier survival estimates, by drug (drug = 1, drug = 2, drug = 3). Survival proportion (0.00 to 1.00) against analysis time (0 to 40).]
From Robert A Yaffee, Survival analysis with STATA
Cause-specific (or net or corrected) Survival
Cause of death from the death certificate is used to attribute the death to
• the disease of interest
• other causes
BUT…
• Which deaths should be considered attributable to the disease of interest?
• Are the death certificates available and accurate?
The analysis is exactly the same as for observed survival (actuarial or Kaplan-Meier), but those dying from other causes are counted as censored at their time of death.
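The recoding step can be sketched like this: the same records are analysed twice, once counting every death (crude) and once treating deaths from other causes as censored (cause-specific). The six patient records, status labels and function names are all hypothetical. The product-limit helper assumes no tied times.

```python
# Hypothetical records: (time in years, status).
patients = [
    (0.5, "dead_disease"),
    (1.0, "dead_other"),
    (1.5, "dead_disease"),
    (2.0, "alive"),
    (2.5, "dead_other"),
    (3.0, "alive"),
]


def to_events(patients, cause_specific):
    """Return (time, died) pairs; died is 1 for a counted death, 0 for censored."""
    events = []
    for time, status in patients:
        if status == "dead_disease":
            died = 1
        elif status == "dead_other":
            # Other-cause deaths are censored in a cause-specific analysis.
            died = 0 if cause_specific else 1
        else:
            died = 0
        events.append((time, died))
    return events


def product_limit(events):
    """Kaplan-Meier survival at the end of follow-up (no tied times assumed)."""
    at_risk = len(events)
    surv = 1.0
    for _, died in sorted(events):
        if died:
            surv *= 1 - 1 / at_risk
        at_risk -= 1
    return surv


crude = product_limit(to_events(patients, cause_specific=False))
cause_specific = product_limit(to_events(patients, cause_specific=True))
print(round(crude, 3), round(cause_specific, 3))
```

In this toy data the cause-specific estimate is higher than the crude one because two of the four deaths are no longer counted against the disease.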
Comparison of survival methods
[Figure: survival (%) against time since diagnosis (0 to 10 years), for crude and cause-specific survival.]
Relative Survival
Relative survival = observed survival / expected survival
where:
Expected survival = survival that would have been expected if the patients had been subject only to the mortality rates of the general population.
It can be interpreted as the proportion of patients alive after i years of follow-up in the hypothetical situation where the disease in question is the only possible cause of death.
Calculating the expected survival
Tables of the mortality rates of the general population, by
• age (single year of age at death, 0-99)
• sex
• calendar period of death
And by other important factors such as
• geographical area
• deprivation category
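A minimal sketch of the lookup is shown below for a single patient profile. It assumes the life table stores annual probabilities of death keyed by sex, age and calendar year. The keys, values, observed survival figure and function name are all illustrative. How expected survival is combined across a whole patient group is where the published methods differ, and that is not shown here.

```python
# Hypothetical general-population life table:
# (sex, age, calendar year) -> probability of dying within that year.
life_table = {
    ("F", 70, 2000): 0.020,
    ("F", 71, 2001): 0.022,
    ("F", 72, 2002): 0.024,
}


def expected_survival(sex, age_at_diagnosis, year_of_diagnosis, years, life_table):
    """Expected survival over `years` for a matched member of the general population."""
    surv = 1.0
    for i in range(years):
        # The matched person ages one year and moves one calendar year on.
        q = life_table[(sex, age_at_diagnosis + i, year_of_diagnosis + i)]
        surv *= 1 - q
    return surv


expected = expected_survival("F", 70, 2000, 3, life_table)
observed = 0.60  # assumed 3-year observed survival for comparable patients
relative = observed / expected
print(round(expected, 4), round(relative, 4))
```

Because expected survival is below 1, the relative survival ratio is always at least as large as the observed survival.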
Life tables
[Figure: general mortality rates. Rate per 100,000 (log scale, 10 to 100,000) by age at death (0 to 100 years), for the most deprived and least deprived groups.]
Life tables and bias in deprivation gradient - 1
[Figure: observed, expected and relative survival (%) by deprivation category (Affluent, 2, 3, 4, Deprived), using the general life table. Affluent: observed 60, expected 90, relative 67. Deprived: observed 40, expected 90, relative 44.]
23% gap in relative survival between affluent and deprived
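Life tables and bias in deprivation gradient - 2
[Figure: observed, expected and relative survival (%) by deprivation category (Affluent, 2, 3, 4, Deprived), with expected survival now taken from deprivation life tables. Affluent: observed 60, expected 95. Deprived: observed 40, expected 85. Relative survival is still shown as 67 and 44.]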
Life tables and bias in deprivation gradient - 3
[Figure: observed, expected and relative survival (%) by deprivation category (Affluent, 2, 3, 4, Deprived), using deprivation life tables. Affluent: observed 60, expected 95, relative 63. Deprived: observed 40, expected 85, relative 47.]
16% gap in relative survival between affluent and deprived
Life Tables
Using the same life table for groups whose general mortality is known to differ can lead to bias,
because the expected survival will be underestimated for the groups who have better than average survival, and hence their relative survival will be overestimated,
and vice versa for the groups who have worse than average survival.
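The size of this bias can be checked with the figures from the deprivation gradient example: observed survival of 60% (affluent) and 40% (deprived), expected survival of 90% for both under a general life table, and 95% and 85% under deprivation-specific tables. The sketch below only redoes that arithmetic. The variable and function names are illustrative.

```python
observed = {"affluent": 60.0, "deprived": 40.0}            # observed survival (%)
expected_general = {"affluent": 90.0, "deprived": 90.0}    # one life table for all
expected_specific = {"affluent": 95.0, "deprived": 85.0}   # deprivation life tables


def relative_survival(observed, expected):
    """Relative survival (%) = 100 * observed / expected, per group."""
    return {group: 100.0 * observed[group] / expected[group] for group in observed}


rel_general = relative_survival(observed, expected_general)
rel_specific = relative_survival(observed, expected_specific)

gap_general = rel_general["affluent"] - rel_general["deprived"]
gap_specific = rel_specific["affluent"] - rel_specific["deprived"]

print({g: round(v, 1) for g, v in rel_general.items()}, round(gap_general, 1))
print({g: round(v, 1) for g, v in rel_specific.items()}, round(gap_specific, 1))
```

The general life table gives relative survival of about 67% and 44%, and the deprivation-specific tables about 63% and 47%. Subtracting the rounded values gives the 23-point and 16-point gaps quoted above. The unrounded gap under the general table is nearer 22 points.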
Appropriate life tables are important!
Deprivation-specific relative survival estimates:

             Deprivation-specific life table    General life table
Cancer       Affl.   Depr.   Diff               Affl.   Depr.   Diff
Oesophagus    6.3     6.7     0.4                6.5     6.4     0.1
Larynx       68.7    58.9     9.8               70.9    56.8    14.4
Lung          6.1     5.1     1.0                6.3     4.9     1.4
Breast       70.5    62.5     8.0               71.3    61.5     9.8
Bladder      65.5    58.7     6.8               67.7    56.4    11.3
Different relative survival methods
There are different ways of computing relative survival:
• Ederer I (not recommended)
• Ederer II (not recommended for estimating cumulative expected survival, but a good estimator of interval-specific expected survival)
• Hakulinen (recommended for estimating cumulative expected survival for the purpose of estimating relative survival ratios, but not recommended for interval-specific expected survival)
• Maximum likelihood (Esteve) (similar results to the Hakulinen method)
Survival analysis for population studies
• Patients do not all die of the disease you are monitoring
• Observed (crude) survival
– “Real” survival of the patients
– survival from disease of interest and all causes of death combined
– intuitive; easy to explain
– easily computed in a wide variety of statistical software
• Net survival (corrected, cause-specific)
– separates risk from disease of interest and background risk (everyone)
– deaths from other causes are censored
– survival from cancer in the absence of other causes
– requires agreement on which causes of death are due to the disease
– requires death certification that is precise, stable over time and comparable
– requires coding of death certificates that is accurate and consistent
• Relative survival
– also separates risk from disease of interest and background risk (everyone)
– all deaths in study period are included
– uses vital statistics to account for background risk
– ratio of observed and expected survival
– survival relative to that of general population
– does not require information on cause of death
– avoids need for attribution of death to disease or other cause
– long-term survival (disease hazard falls, other hazard rises)
– needs appropriate (and accurate) life tables
– different methods give slightly different results
– not as easy to explain
Comparison of survival methods
[Figure: survival (%) against time since diagnosis (0 to 10 years), for crude, cause-specific and relative (Esteve) survival.]
From Paul W Dickman Gothenburg slides
Age-Standardised Relative Survival Rate
• The calculation of the expected survival probability adjusts only for the age-specific mortality from other causes.
• If an overall (all-ages) estimate of relative survival for patients is used to compare survival rates for two populations with very different age structures, the results may be misleading.
• It is therefore desirable to age-standardise the relative survival rates.
• Age-adjustment is also important for the analysis of time trends in relative survival because if survival varies markedly with age, a change in the age distribution of patients over time can produce spurious survival trends (or obscure real trends).
Survival varies markedly by age for many diseases
Why we should standardise …cont
2 year survival estimates identical in both places

Age band                   2-year survival   Land's End population (%)   John o' Groats population (%)   Weight
0-44                       27%               15%                         30%                             0.25
45-64                      17%               25%                         30%                             0.25
65-74                      8%                30%                         25%                             0.25
75+                        2%                30%                         15%                             0.25
All ages, unstandardised                     11.3                        15.5
All ages, standardised                       13.5                        13.5
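The Land's End and John o' Groats example can be reproduced with a few lines of arithmetic. The sketch below weights the same age-specific survival estimates first by each place's own age distribution (unstandardised) and then by the common weights of 0.25 (standardised). The variable and function names are illustrative.

```python
age_bands = ["0-44", "45-64", "65-74", "75+"]
survival = [27.0, 17.0, 8.0, 2.0]  # 2-year survival (%), identical in both places

# Share of patients in each age band, per place.
population = {
    "Land's End": [0.15, 0.25, 0.30, 0.30],
    "John o' Groats": [0.30, 0.30, 0.25, 0.15],
}
standard_weights = [0.25, 0.25, 0.25, 0.25]


def weighted_survival(survival, weights):
    """Weighted average of age-specific survival estimates."""
    return sum(s * w for s, w in zip(survival, weights)) / sum(weights)


unstandardised = {
    place: weighted_survival(survival, shares)
    for place, shares in population.items()
}
standardised = weighted_survival(survival, standard_weights)

for place, value in unstandardised.items():
    print(place, round(value, 1), round(standardised, 1))
```

The unstandardised figures differ (about 11.3 and 15.5) purely because of the age mix, while the standardised figure is 13.5 for both places.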
Age-standardisation: Advantages
• None of the routine survival methods adjust for age or sex (not even relative survival)
• Standardising allows comparison with other populations or over time when the population structures are not the same
… the other option is modelling which is discussed later
Age-standardisation: Limitations
• No routinely-used common standard population for survival analyses – create a standard that is sensible for your data
• Unclear what to do when there is no survival estimate for an age-group
• Obviously more time consuming than “all ages” estimates
If there is no estimate in an age group then:
– Merge with another age group until there are enough cases
– Increase the time band for that estimate (changes in time are usually smaller than changes between age bands, so I prefer this option)
– If all else fails, produce a truncated standardised rate
ALSO REMEMBER: when producing estimates for sexes combined, analyse the sexes separately and then sex-standardise them (add together the two estimates and then halve).
Period survival
• Also known as the Brenner method (after its proposer)
• Gives an estimate of e.g. 5-year survival using the most up-to-date information
• Trade-off between recency of data and number of patients

Interval   Includes patients diagnosed
1          1996 1997 1998 1999 2000
2          1995 1996 1997 1998 1999
3          1994 1995 1996 1997 1998
4          1993 1994 1995 1996 1997
5          1992 1993 1994 1995 1996
From Paul W Dickman Gothenburg slides
Period survival continued (Finland example)
Interval-specific relative survival estimates (data for 1978 to 1982):

Interval   Used by cohort method   Used by period method
1          0.627                   0.790
2          0.823                   0.839
3          0.865                   0.879
4          0.967                   0.958
5          0.978                   0.978

Most up-to-date five-year survival estimate available would be:
Using cohort method = 0.627*0.823*0.865*0.967*0.978 = 0.431
Using period method = 0.790*0.839*0.879*0.958*0.978 = 0.557
The actual five-year survival estimate for patients diagnosed in 1982 was 0.583 so the period method did not over-estimate survival
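Both estimates are simply products of interval-specific values, which the sketch below recomputes from the figures as printed on the slide. The variable names are illustrative. Multiplying the rounded values shown gives roughly 0.42 and 0.55, slightly below the quoted 0.431 and 0.557. The reason for the difference is not clear from the slide, so the quoted figures are left as they are.

```python
from math import prod

# Interval-specific relative survival, as printed on the slide.
cohort_intervals = [0.627, 0.823, 0.865, 0.967, 0.978]  # one cohort followed for 5 years
period_intervals = [0.790, 0.839, 0.879, 0.958, 0.978]  # most recent estimate per interval

# Five-year survival is the product of the five interval-specific values.
cohort_estimate = prod(cohort_intervals)
period_estimate = prod(period_intervals)

actual_1982 = 0.583  # actual five-year survival for patients diagnosed in 1982

print(round(cohort_estimate, 3), round(period_estimate, 3))
print(period_estimate <= actual_1982)
```

On either set of figures the period estimate is higher than the cohort estimate and does not exceed the survival later observed for the 1982 patients.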
Modelling survival
• For crude or cause-specific survival
– Cox Proportional Hazards regression
– Poisson regression
• For relative survival
– Programmes/macros written in SAS, STATA and R; see http://www.pauldickman.com/
• Grouped or individual data can be modelled
• Splines and fractional polynomials can be modelled
Recommendations
• Always display relative survival estimates
• Display crude survival along with relative survival to satisfy those who like to see the “real” thing
• Only use cause-specific survival for a specific study where the cause of death flag can be reviewed
• Choice of relative survival technique depends on ease of use and comparability
• Age-standardise if appropriate (include weights used in your footnotes).
• Include confidence intervals
Useful links / contacts
• Presentations, statistical programme code, scientific papers and examples of survival analyses http://www.pauldickman.com/
• STATA ado files for the maximum likelihood relative survival method
http://www.lshtm.ac.uk/eph/ncde/cancersurvival/tools/
My details:[email protected] 0131 275 6817