
Survival Analysis - TAUrshamir/abdbm/pres/17/Survival.pdf4 What is survival analysis? •Statistical methods for analyzing longitudinal data on the occurrence of event. •Possible

Feb 11, 2020


Survival Analysis

Sources:
• Slides: Kristin Sainani, Stanford, http://www.stanford.edu/~kcobb
• Johnson and Shih, An Introduction to Survival Analysis, in Principles and Practice of Clinical Research, 2E (2007)
• Rich et al., A practical guide to understanding Kaplan-Meier curves, Otolaryngology – Head and Neck Surgery (2010)

ABDBM © Ron Shamir


Overview

• Intro, terminology
• Survival/hazard functions
• Kaplan-Meier curves
• The log rank test
• Cox PH


Early example of survival analysis, 1669

Christiaan Huygens' 1669 curve showing how many out of 100 people survive until 86 years.

From: Howard Wainer- STATISTICAL GRAPHICS: Mapping the Pathways of Science. Annual Review of Psychology. Vol. 52: 305-335.


What is survival analysis?

• Statistical methods for analyzing longitudinal data on the occurrence of events.
• Possible events:
 – death, injury, onset of disease, recovery from illness, recurrence-free survival for 5 years (binary variables)
 – transition above or below the clinical threshold of a continuous variable (e.g. blood glucose level).
• Accommodates data from a randomized clinical trial or a cohort study design.

Randomized Clinical Trial (RCT)

[Diagram: from the target population, a disease-free, at-risk cohort is randomly assigned to intervention or control; over time, subjects in each arm end up with disease or disease-free.]

Randomized Clinical Trial (RCT)

[Diagram: from the target population, a patient population is randomly assigned to treatment or control; over time, subjects in each arm end up cured or not cured.]

Randomized Clinical Trial (RCT)

[Diagram: from the target population, a patient population is randomly assigned to treatment or control; over time, subjects in each arm end up dead or alive.]

Cohort study (prospective/retrospective)

[Diagram: from the target population, a disease-free cohort is split into exposed and unexposed; over time, subjects in each group end up with disease or disease-free.]


Examples of survival analysis in medicine


RCT: Women’s Health Initiative (JAMA, 2002)

[Figure: cumulative incidence over time, on hormones vs. on placebo.]

Women’s Health Initiative Writing Group. JAMA. 2002;288:321-333.


Breast cancer and low-fat diet

[Figure: curves for the control and low-fat diet groups.]

Prentice et al. JAMA, February 8, 2006; 295: 629-642.


Aspirin, ibuprofen, and mortality after myocardial infarction: retrospective cohort study

Curtis et al. BMJ 2003;327:1322-1323.

Why survival analysis?

1. Why not compare mean time-to-event between groups using a t-test or linear regression?
 -- For some patients we may not know if and when an event occurred: the study terminated or we lost touch with them.

2. Why not compare the proportion of events in each group using risk/odds ratios or logistic regression?
 -- Ignores time.
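A minimal sketch of the first point, using made-up numbers (the event times and the 10-year study end are assumptions for illustration, not data from the slides). It shows how a plain mean of the recorded times understates the true mean time-to-event once some subjects are censored:

```python
# Hypothetical true event times (years). In a real study, values beyond
# the study end would never be observed.
true_times = [2, 5, 8, 12, 20]
STUDY_END = 10  # assumed end of follow-up

# What the study actually records: time of event, or time of censoring.
observed = [min(t, STUDY_END) for t in true_times]
had_event = [1 if t <= STUDY_END else 0 for t in true_times]


def mean(values):
    return sum(values) / len(values)


true_mean = mean(true_times)          # not available in practice
naive_mean_all = mean(observed)       # treats censoring times as event times
naive_mean_events_only = mean(
    [t for t, e in zip(observed, had_event) if e == 1]
)                                     # throws the censored subjects away

print(true_mean, naive_mean_all, naive_mean_events_only)
```

Both naive summaries fall below the true mean here, which is the motivation for methods that use the censored observations for as long as they were followed.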


Terminology

• The event of interest: the outcome sought
• Time-to-event: the time from entry into a study until a subject has the outcome
• Censoring: subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they have the outcome. They are counted as alive / disease-free for the time they were enrolled in the study.
 – Must assume censoring is independent of the outcome; otherwise censoring will create bias.


An example

[Figure legend: solid circles = uncensored; open circles = censored.]

Moving all start times to 0


A better view only if time homogeneity holds


Data of a hypothetical study

Source: Johnson and Shih


Data

Two-variable outcome:
• ti = time at last disease-free observation or time at event
• ci = 1 if had the event; ci = 0 if no event by time ti
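One way to hold this two-variable outcome in code is sketched below. The `Observation` type and the six data points are hypothetical, chosen only to illustrate the (ti, ci) encoding:

```python
from typing import NamedTuple


class Observation(NamedTuple):
    t: float  # time at event, or time at last disease-free observation
    c: int    # 1 = had the event at time t, 0 = no event by time t (censored)


# Hypothetical sample of six subjects.
sample = [
    Observation(2, 1),
    Observation(3, 0),
    Observation(6, 1),
    Observation(6, 1),
    Observation(7, 0),
    Observation(10, 1),
]

n_events = sum(o.c for o in sample)
n_censored = len(sample) - n_events
person_time = sum(o.t for o in sample)  # total follow-up contributed

print(n_events, n_censored, person_time)
```

Censored subjects still contribute their follow-up time; only the event indicator distinguishes them from subjects who had the outcome.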


Survival function

• S(t): the probability of an individual surviving at least until time t
• Usually unknown, evaluated based on a sample
• Survival experience – the empirical function


Cumulative survival


Probability density function: f(t)

T: the event time for an individual (a random variable)

$$f(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t}$$

The probability of the event time occurring at exactly time t

F(t) = CDF of f(t)

S(t) = 1-F(t)


The hazard function

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$

The probability that if you survive to t, you will succumb to the event in the next instant.

Hazard from density and survival:

$$h(t) = \frac{f(t)}{S(t)}$$

Derivation (Bayes’ rule):

$$h(t)\,dt = P(t \le T < t + dt \mid T \ge t) = \frac{P(t \le T < t + dt \;\&\; T \ge t)}{P(T \ge t)} = \frac{P(t \le T < t + dt)}{P(T \ge t)} = \frac{f(t)\,dt}{S(t)}$$
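The relation h(t) = f(t)/S(t) can be checked numerically. The sketch below assumes an exponential event-time distribution with an arbitrary rate of 0.2 (an illustrative choice, not something from the slides), for which the hazard should be constant and equal to the rate:

```python
import math

RATE = 0.2  # assumed rate of a hypothetical exponential distribution


def survival(t):
    """S(t) = P(T >= t) for the exponential distribution."""
    return math.exp(-RATE * t)


def density(t):
    """f(t) for the exponential distribution."""
    return RATE * math.exp(-RATE * t)


def hazard(t):
    """h(t) = f(t) / S(t)."""
    return density(t) / survival(t)


def hazard_from_limit(t, dt=1e-6):
    """Finite-dt version of the limit definition:
    P(t <= T < t + dt | T >= t) / dt."""
    return (survival(t) - survival(t + dt)) / (survival(t) * dt)


for t in (1.0, 5.0, 20.0):
    print(t, hazard(t), hazard_from_limit(t))
```

Both routes give values close to 0.2 at every t, illustrating that the two definitions of the hazard agree.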

[Figure: plot with x-axis labeled AGE.]

A possible set of probability density, failure, survival, and hazard functions.

• f(t) = density function
• F(t) = cumulative failure
• S(t) = cumulative survival
• h(t) = hazard function


The Kaplan-Meier curve

Sorted events t_1 < t_2 < … < t_n. No censoring: Pr(surviving to t_i) = (n-i+1)/n

What to do when some subjects are censored?

Sorted events t_1 < t_2 < … < t_n;
d_i – number of events in (t_{i-1}, t_i];
n_i – number of individuals at risk (remaining in the study) in (t_{i-1}, t_i].

Pr(survival to t_i) = Pr(survival to t_{i-1}) × Pr(surviving interval (t_{i-1}, t_i]) = Pr(survival to t_{i-1}) × (n_i - d_i)/n_i

This is the K-M, or product-limit, estimator.
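The product-limit recursion can be sketched in a few lines. The function name and the sample data are hypothetical; the code assumes the usual convention that a subject censored at time t is still counted as at risk at t:

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] at each distinct event time.

    times[k]  : event or censoring time of subject k
    events[k] : 1 if subject k had the event, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    surv = 1.0
    for t in sorted(deaths):
        n_at_risk = sum(1 for u in times if u >= t)  # n_i
        d = deaths[t]                                # d_i
        surv *= (n_at_risk - d) / n_at_risk          # multiply by (n_i - d_i)/n_i
        curve.append((t, surv))
    return curve


# Hypothetical data: six subjects, two of them censored (at t=3 and t=7).
km_times = [2, 3, 6, 6, 7, 10]
km_events = [1, 0, 1, 1, 0, 1]
km_curve = kaplan_meier(km_times, km_events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Censored subjects never produce a step in the curve, but they shrink the at-risk count for later event times, which is how the estimator takes censoring into account.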


K-M estimate and curve

• Non-parametric estimate of the survival function
• Empirical probability of surviving past certain times in the sample (taking into account censoring)
• Describes survivorship of study population/s
• Commonly used to compare two study populations
• Intuitive graphical presentation


Paul Meier (1924-2011)

Edward L. Kaplan

Comparing two survival curves

• Two methods:
 – Compare the curves at a pre-specified time point t. The result depends on t, and there is a tendency to pick the “best” t.
 – Compare the overall plots over the entire time range.

[Figure: hormones vs. placebo. Women’s Health Initiative Writing Group. JAMA. 2002;288:321-333.]

Comparing two curves: Log rank test

• H0: S1(t) = S2(t) for all t
• Log rank test: use the ranks of events, not times.

Sorted events t_1 < t_2 < … < t_K. For time t_j:

| | Events | Surviving | Total |
|---|---|---|---|
| Group 1 | a_j | b_j | a_j+b_j |
| Group 2 | c_j | d_j | c_j+d_j |
| Total | a_j+c_j | b_j+d_j | n_j |

Under H0, E(a_j) = total events × (# at risk in group 1) / (# at risk) = (a_j+c_j) × (a_j+b_j) / n_j

The test statistic Z is approximately standard normal; use it to evaluate the p-value.
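A sketch of the test follows. The slides do not spell out how Z is formed, so this assumes the common form Z = (Σ a_j − Σ E(a_j)) / sqrt(Σ Var(a_j)), with the hypergeometric variance of each 2×2 table; the function name and the tiny data set are hypothetical:

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-group log rank test. Returns (Z, two-sided p-value).

    Assumes the hypergeometric variance for each 2x2 table, which is
    the usual choice but is not shown in the source slides.
    """
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed = expected = variance = 0.0
    for t in event_times:
        n1 = sum(1 for u in times1 if u >= t)   # at risk in group 1: a_j + b_j
        n2 = sum(1 for u in times2 if u >= t)   # at risk in group 2: c_j + d_j
        n = n1 + n2                             # n_j
        a = sum(1 for u, e in zip(times1, events1) if u == t and e == 1)
        c = sum(1 for u, e in zip(times2, events2) if u == t and e == 1)
        m = a + c                               # total events at t_j
        observed += a
        expected += m * n1 / n                  # E(a_j) under H0
        if n > 1:
            variance += m * (n1 / n) * (n2 / n) * (n - m) / (n - 1)

    z = (observed - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Hypothetical data: two subjects per group, no censoring.
z_example, p_example = logrank_test([1, 3], [1, 1], [2, 4], [1, 1])
print(z_example, p_example)
```

With so few subjects the normal approximation is not trustworthy; the example only exercises the arithmetic. The function would divide by zero if the summed variance were 0 (for instance, if one group is empty).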


Example: breast cancer survival signature

[Figure: survival curves from Van de Vijver, NEJM 2002, annotated “Small numbers left”.]

• Caveats:
 – No mention of mean survival
 – Visual inspection can be misleading
 – Must predefine the groups in advance

Certain characteristics (age, sex, ...) can be related to survival: confounding / prognostic factors can change the relation of treatment to outcome.

Need to stratify the test and compare survival differences within each level of these factors.

WHI and breast cancer

Women’s Health Initiative Writing Group. JAMA. 2002;288:321-333.

Cox Proportional Hazard Model

• K-M curves and log rank: univariate analysis; describe survival using one categorical factor
• Cox PH: allows many prognostic factors, categorical or real-valued
• Semi-parametric
• Models the effect of predictors and covariates on the hazard rate but leaves the baseline hazard rate unspecified
• Estimates relative rather than absolute hazard


The model

$$h_i(t) = \lambda_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}$$

Components:

• A baseline hazard function λ_0(t) that is left unspecified but must be positive (= the hazard when all covariates are 0). It can take on any form!

• A linear function of a set of k fixed covariates, which is exponentiated.

$$\log h_i(t) = \log \lambda_0(t) + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$


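The model

$$HR_{i,j} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}}{\lambda_0(t)\, e^{\beta_1 x_{j1} + \dots + \beta_k x_{jk}}} = e^{\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})}$$

Here h_i(t) is the hazard for person i (e.g. a smoker), h_j(t) is the hazard for person j (e.g. a non-smoker), and HR_{i,j} is the hazard ratio.

Proportional hazards:

Hazard functions should be strictly parallel. Produces covariate-adjusted hazard ratios.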
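A small sketch of why the baseline hazard cancels in the ratio. The coefficients, covariate vectors and baseline function below are all made up for illustration; any positive baseline would give the same hazard ratio:

```python
import math


def hazard(t, x, betas, baseline):
    """Cox model hazard: baseline(t) * exp(beta_1*x_1 + ... + beta_k*x_k)."""
    return baseline(t) * math.exp(sum(b * v for b, v in zip(betas, x)))


def hazard_ratio(x_i, x_j, betas):
    """HR_{i,j} = exp(sum_k beta_k * (x_ik - x_jk)); no baseline needed."""
    return math.exp(sum(b * (vi - vj) for b, vi, vj in zip(betas, x_i, x_j)))


# Hypothetical coefficients for (smoker indicator, age in years).
example_betas = [0.7, 0.1]
smoker = [1, 60]
non_smoker = [0, 60]


def example_baseline(t):
    """An arbitrary positive baseline hazard, chosen only for the demo."""
    return 0.01 * (1.0 + t)


for t in (1.0, 20.0):
    ratio = hazard(t, smoker, example_betas, example_baseline) / hazard(
        t, non_smoker, example_betas, example_baseline
    )
    print(t, ratio, hazard_ratio(smoker, non_smoker, example_betas))
```

The ratio is the same at every t, which is what "proportional hazards" means: the two hazard curves differ by a constant factor.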


The model

The point is to compare the hazard rates of individuals who have different covariates:

$$HR = \frac{h_1(t)}{h_2(t)} = \frac{h_0(t)\, e^{\beta x_1}}{h_0(t)\, e^{\beta x_2}} = e^{\beta (x_1 - x_2)}$$

where h_0(t) is the baseline hazard. Hence the name proportional hazards:

Hazard functions should be strictly parallel (on a log scale).

For binary x: β is the log of the hazard ratio between the two categories, so exp(β) is the multiplicative increase in hazard. For numerical x: β is the log hazard ratio per unit increase (e.g. per year).


Cox PH - computation

• The coefficients β_1, …, β_k can be estimated using numerical optimization (details not shown).
• For a large enough sample, the estimate of each β_i is approximately normally distributed, so its p-value and confidence interval can be computed.
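The slides leave the optimization unspecified. One common approach, sketched here as an assumption rather than as the method used in the course, is Newton-Raphson on the Cox partial log-likelihood. The sketch handles a single covariate, treats tied event times with the Breslow-style risk set, and uses a made-up four-subject data set:

```python
import math


def cox_score_and_hessian(beta, times, events, x):
    """First and second derivative of the partial log-likelihood
    for one covariate (risk set of subject i = everyone with time >= t_i)."""
    score = 0.0
    hessian = 0.0
    for t_i, e_i, x_i in zip(times, events, x):
        if e_i != 1:
            continue  # censored subjects only appear inside risk sets
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        weights = [math.exp(beta * x_j) for x_j in risk]
        total = sum(weights)
        mean = sum(w * x_j for w, x_j in zip(weights, risk)) / total
        mean_sq = sum(w * x_j * x_j for w, x_j in zip(weights, risk)) / total
        score += x_i - mean
        hessian -= mean_sq - mean * mean
    return score, hessian


def fit_cox_one_covariate(times, events, x, tol=1e-10, max_iter=50):
    """Newton-Raphson estimate of beta and its approximate standard error."""
    beta = 0.0
    for _ in range(max_iter):
        score, hessian = cox_score_and_hessian(beta, times, events, x)
        step = score / hessian
        beta -= step
        if abs(step) < tol:
            break
    _, hessian = cox_score_and_hessian(beta, times, events, x)
    std_err = math.sqrt(-1.0 / hessian)
    return beta, std_err


# Hypothetical data: four subjects, all with events, one binary covariate.
cox_times = [1, 2, 3, 4]
cox_events = [1, 1, 1, 1]
cox_x = [1, 0, 0, 1]

beta_hat, se_hat = fit_cox_one_covariate(cox_times, cox_events, cox_x)
wald_z = beta_hat / se_hat
wald_p = math.erfc(abs(wald_z) / math.sqrt(2))  # two-sided, normal approximation
print(beta_hat, se_hat, wald_p)
```

The standard error comes from the curvature of the partial likelihood at the estimate, and the Wald p-value relies on the large-sample normality mentioned above, so with four subjects it is purely illustrative. Production code would normally use an established survival analysis library.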


Example: Framingham heart study

• Cohort of 5,180 aged 45-82, followed until time of death or up to 10 years. 46% males, 402 deaths.

| | Die (n=402) | Do Not Die (n=4778) |
|---|---|---|
| Mean (SD) age, years | 65.6 (8.7) | 56.1 (7.5) |
| N (%) male | 221 (55%) | 2145 (45%) |

• Cox PH model for age and sex as factors:

| Risk Factor | Parameter Estimate | P-Value |
|---|---|---|
| Age, years | 0.11149 | 0.0001 |
| Male Sex | 0.67958 | 0.0001 |

• Both factors increase risk.
 – Age: exp(0.11149) = 1.118, so 11.8% higher risk per year of age.
 – Male: exp(0.67958) = 1.973, so almost twice the risk for males, holding age constant.

Source: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Survival/
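The two hazard ratios can be reproduced directly from the parameter estimates in the table. The sketch below also shows how a per-year effect compounds over a 10-year age difference under the model; that extrapolation is an illustration, not a figure from the source:

```python
import math

# Parameter estimates quoted in the table above.
BETA_AGE = 0.11149   # per year of age
BETA_MALE = 0.67958  # male vs. female

hr_age_per_year = math.exp(BETA_AGE)
hr_male = math.exp(BETA_MALE)

# Under the model, effects multiply: a 10-year age difference corresponds
# to exp(10 * beta), not to 10 times the per-year increase.
hr_age_per_decade = math.exp(10 * BETA_AGE)

print(round(hr_age_per_year, 3))  # 1.118
print(round(hr_male, 3))          # 1.973
print(hr_age_per_decade)
```

Because the model is linear on the log-hazard scale, the hazard ratio for any covariate difference is obtained by exponentiating the coefficient times that difference.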


Model with more covariates

| Risk Factor | Parameter Estimate | P-Value | Hazard Ratio (HR) | 95% CI for HR |
|---|---|---|---|---|
| Age, years | 0.11691 | 0.0001 | 1.124 | 1.111-1.138 |
| Male Sex | 0.40359 | 0.0002 | 1.497 | 1.215-1.845 |
| Systolic Blood Pressure | 0.01645 | 0.0001 | 1.017 | 1.012-1.021 |
| Current Smoker | 0.76798 | 0.0001 | 2.155 | 1.758-2.643 |
| Total Serum Cholesterol | -0.00209 | 0.0963 | 0.998 | 0.995-1.000 |
| Diabetes | -0.20366 | 0.1585 | 0.816 | 0.615-1.083 |

• Significant factors have a CI that does not include 1 (the null).
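A short sketch of how the HR column and the significance rule relate to the other columns, using four rows of the table above (age, sex, blood pressure, smoking) as input; the helper name is hypothetical:

```python
import math

# (risk factor, parameter estimate, reported HR, reported 95% CI)
table_rows = [
    ("Age, years", 0.11691, 1.124, (1.111, 1.138)),
    ("Male Sex", 0.40359, 1.497, (1.215, 1.845)),
    ("Systolic Blood Pressure", 0.01645, 1.017, (1.012, 1.021)),
    ("Current Smoker", 0.76798, 2.155, (1.758, 2.643)),
]


def ci_excludes_null(ci, null_value=1.0):
    """A factor is significant at the 5% level if its 95% CI excludes HR = 1."""
    low, high = ci
    return not (low <= null_value <= high)


for name, beta, reported_hr, ci in table_rows:
    computed_hr = math.exp(beta)  # HR = exp(parameter estimate)
    print(name, round(computed_hr, 3), reported_hr, ci_excludes_null(ci))
```

The same check applied to an interval that straddles 1, such as (0.615, 1.083), returns False, matching a p-value above 0.05.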


Example 1: Study of publication bias

By Kaplan-Meier methods

From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects BMJ 1997;315:640-645 (13 September)


Univariate Cox regression

From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects. BMJ 1997;315:640-645 (13 September)

Table 4. Risk factors for time to publication using univariate Cox regression analysis

| Characteristic | # not published | # published | Hazard ratio (95% CI) |
|---|---|---|---|
| Null | 29 | 23 | 1.00 |
| Non-significant trend | 16 | 4 | 0.39 (0.13 to 1.12) |
| Significant | 47 | 99 | 2.32 (1.47 to 3.66) |

Interpretation: significant results have a roughly 2.3-fold higher rate of publication compared to null results.


Example 2: Study of mortality in academy award winners for screenwriting

Kaplan-Meier methods

From: Longevity of screenwriters who win an academy award: longitudinal study. BMJ 2001;323:1491-1496 (22-29 December)

Table 2. Death rates for screenwriters who have won an academy award.* Values are percentages (95% confidence intervals) and are adjusted for the factor indicated

| Analysis | Relative increase in death rate for winners |
|---|---|
| Basic analysis | 37 (10 to 70) |
| Adjusted analysis, demographic: | |
| Year of birth | 32 (6 to 64) |
| Sex | 36 (10 to 69) |
| Documented education | 39 (12 to 73) |
| All three factors | 33 (7 to 65) |
| Adjusted analysis, professional: | |
| Film genre | 37 (10 to 70) |
| Total films | 39 (12 to 73) |
| Total four star films | 40 (13 to 75) |
| Total nominations | 43 (14 to 79) |
| Age at first film | 36 (9 to 68) |
| Age at first nomination | 32 (6 to 64) |
| All six factors | 40 (11 to 76) |
| All nine factors | 35 (7 to 70) |

Basic analysis: HR=1.37; interpretation: 37% higher incidence of death for winners compared with nominees.

All nine factors: HR=1.35; interpretation: 35% higher incidence of death for winners compared with nominees, even after adjusting for potential confounders.
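The percentages in the table and the hazard ratios in the interpretation are two views of the same number. A minimal sketch of the conversion (the function names are hypothetical):

```python
def relative_increase_percent(hazard_ratio):
    """Convert a hazard ratio into a percentage increase in the rate."""
    return (hazard_ratio - 1.0) * 100.0


def hazard_ratio_from_percent(percent):
    """Convert a percentage increase in the rate back into a hazard ratio."""
    return 1.0 + percent / 100.0


print(round(relative_increase_percent(1.37)))  # 37
print(round(relative_increase_percent(1.35)))  # 35
print(hazard_ratio_from_percent(33))
```

The same conversion applies to the confidence limits: an interval of 10 to 70 percent corresponds to a hazard ratio interval of 1.10 to 1.70.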


Sir David Cox

• Born 1924
• Cambridge, Imperial College London, Oxford
• Books:
 – Planning of experiments (1958)
 – Queues (Methuen, 1961). With Walter L. Smith
 – Renewal Theory (Methuen, 1962)
 – The theory of stochastic processes (1965). With Hilton David Miller
 – Analysis of binary data (1969). With Joyce E. Snell
 – Theoretical statistics (1974). With D. V. Hinkley
 – Point processes (Chapman & Hall/CRC, 1980). With Valerie Isham
 – Applied statistics, principles and examples (Chapman & Hall/CRC, 1981). With Joyce E. Snell
 – Analysis of survival data (Chapman & Hall/CRC, 1984). With David Oakes
 – Asymptotic techniques for use in statistics (1989). With Ole E. Barndorff-Nielsen
 – Inference and asymptotics (Chapman & Hall/CRC, 1994). With Ole E. Barndorff-Nielsen
 – Multivariate dependencies, models, analysis and interpretation (Chapman & Hall, 1995). With Nanny Wermuth
 – The theory of design of experiments (Chapman & Hall/CRC, 2000). With Nancy M. Reid
 – Complex stochastic systems (Chapman & Hall/CRC, 2000). With Ole E. Barndorff-Nielsen and Claudia Klüppelberg
 – Components of variance (Chapman & Hall/CRC, 2003). With P. J. Solomon
 – Principles of Statistical Inference (Cambridge University Press, 2006). ISBN 978-0-521-68567-2
 – Selected Statistical Papers of Sir David Cox, 2 Volume Set
 – Principles of Applied Statistics (CUP). With Christl A. Donnelly