
Survival Analysis using R

Bruce L. Jones

Department of Statistical and Actuarial Sciences, The University of Western Ontario

March 24, 2010

Outline

• What is R?

• Why use R?

• A bit about R

• What is Survival Analysis?

• The survival package in R

• Example

What is R?

• R is a free software environment for statistical computing and graphics.

• It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

• R is very popular among researchers in statistics.

• R is similar in appearance to S.

• R was initially written by Ross Ihaka and Robert Gentleman.

Why use R?

• It contains advanced statistical routines not yet available in other packages.

• It provides an unparalleled platform for programming new statistical methods in an easy and straightforward manner.

• It has state-of-the-art graphics capabilities.

• It’s free. Just go to http://www.r-project.org

Assignment, Vectors and Arrays

> 1+2*3

[1] 7

> x=3

> y<-2

> x+y

[1] 5

> z=c(2,3,4,5)

> z

[1] 2 3 4 5

> 2*z

[1] 4 6 8 10

>

Assignment, Vectors and Arrays

> z=2:5

> z

[1] 2 3 4 5

> z=seq(2,5,1)

> z

[1] 2 3 4 5

> zz=seq(10,300,3)

> zz

 [1]  10  13  16  19  22  25  28  31  34  37  40  43  46  49  52  55  58  61  64
[20]  67  70  73  76  79  82  85  88  91  94  97 100 103 106 109 112 115 118 121
[39] 124 127 130 133 136 139 142 145 148 151 154 157 160 163 166 169 172 175 178
[58] 181 184 187 190 193 196 199 202 205 208 211 214 217 220 223 226 229 232 235
[77] 238 241 244 247 250 253 256 259 262 265 268 271 274 277 280 283 286 289 292
[96] 295 298

>

Assignment, Vectors and Arrays

> mat=array(1:12,c(3,4))

> mat

[,1] [,2] [,3] [,4]

[1,] 1 4 7 10

[2,] 2 5 8 11

[3,] 3 6 9 12

> mat=matrix(1:12,3,4)

> mat

[,1] [,2] [,3] [,4]

[1,] 1 4 7 10

[2,] 2 5 8 11

[3,] 3 6 9 12

>

Functions

> plus=function(a,b) a+b
> plus(3,4)
[1] 7
> plus(3)
Error in plus(3) : element 2 is empty;
   the part of the args list of '+' being evaluated was:
   (a, b)
> plus=function(a,b=0) a+b
> plus(3,4)
[1] 7
> plus(3)
[1] 3
> plus(1:3,4:5)
[1] 5 7 7
Warning message:
In a + b : longer object length is not a multiple of shorter object length

>
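
The warning in the last call comes from R's recycling rule: when two vectors of different lengths are added, the shorter one is reused from its start until it matches the longer one. A minimal sketch of what happens in plus(1:3,4:5); the names recycled, by_hand and direct are illustrative and do not appear in the slides.

```r
# plus() as defined on the slide, with a default value for b
plus <- function(a, b = 0) a + b

# 4:5 is shorter than 1:3, so R reuses its elements: 4, 5, 4
recycled <- rep(4:5, length.out = 3)

# Adding element by element reproduces the slide's result
by_hand <- 1:3 + recycled

# The direct call gives the same values (the warning is silenced here)
direct <- suppressWarnings(plus(1:3, 4:5))

print(by_hand)
print(direct)
```

The warning is only raised because 3 is not a multiple of 2; plus(1:4,4:5) would recycle silently.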

What is Survival Analysis?

Survival Analysis is the study of lifetimes and their distributions. It usually involves one or more of the following objectives:

• to explore the behaviour of the distribution of a lifetime.

• to model the distribution of a lifetime.

• to test for differences between the distributions of two or more lifetimes.

• to model the impact of one or more explanatory variables on a lifetime distribution.

The Nature of Lifetime Data

• It’s almost always incomplete.

– It often involves right-censoring.

– It sometimes involves left-truncation.

• The methods of survival analysis allow for this incompleteness.

The survival Package in R

> install.packages("survival") # first time only

--- Please select a CRAN mirror for use in this session ---
trying URL 'http://probability.ca/cran/bin/windows/contrib/2.10/survival_2.35-8.zip'
Content type 'application/zip' length 2445387 bytes (2.3 Mb)
opened URL
downloaded 2.3 Mb

package 'survival' successfully unpacked and MD5 sums checked

The downloaded packages are in
        C:\Documents and Settings\jones\Local Settings\Temp\RtmpEQ5ZaF\downloaded_packages

> library(survival)

Loading required package: splines

>

Creating a Survival Object

Example 1. Complete data lifetimes: 26, 42, 71, 85, 92.

> ex1.times=c(26,42,71,85,92)

> ex1.surv=Surv(ex1.times)

> ex1.surv

[1] 26 42 71 85 92

> class(ex1.surv)

[1] "Surv"

> class(ex1.times)

[1] "numeric"

>

Creating a Survival Object

Example 2. Right-censored lifetimes: 26, 42, 71, 80+, 80+.

> ex2.times=c(26,42,71,80,80)

> ex2.events=c(1,1,1,0,0)

> ex2.surv=Surv(ex2.times,ex2.events)

> ex2.surv

[1] 26 42 71 80+ 80+

>

Creating a Survival Object

Example 3. Left-truncated and right-censored lifetimes:
Left-truncation time is 40 for all individuals;
event/right-censoring times are 42, 71, 80+, 80+.

> ex3.lttimes=rep(40,4)

> ex3.times=c(42,71,80,80)

> ex3.events=c(1,1,0,0)

> ex3.surv=Surv(ex3.lttimes,ex3.times,ex3.events)

> ex3.surv

[1] (40,42 ] (40,71 ] (40,80+] (40,80+]

>
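
The three examples differ only in the arguments passed to Surv. The sketch below rebuilds them and inspects how each object is stored; it assumes the survival package is installed, and the exact printed form of a Surv object may differ between package versions.

```r
library(survival)

# Example 1: complete data, so every observation is treated as an event
ex1.surv <- Surv(c(26, 42, 71, 85, 92))

# Example 2: right-censored data, where status 0 marks a censored time
ex2.surv <- Surv(c(26, 42, 71, 80, 80), c(1, 1, 1, 0, 0))

# Example 3: left-truncated and right-censored data (entry, exit, status)
ex3.surv <- Surv(rep(40, 4), c(42, 71, 80, 80), c(1, 1, 0, 0))

# A Surv object is stored as a matrix with a "type" attribute
attr(ex1.surv, "type")   # "right"
attr(ex3.surv, "type")   # "counting"
colnames(ex2.surv)       # "time" "status"
colnames(ex3.surv)       # "start" "stop" "status"
```

The type matters later: some functions accept only right-censored objects, while others also accept the (start, stop] form.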

Real Data Example

Lifetimes: Times until death of 26 psychiatric patients

Number of deaths: 14

Number of censored observations: 12

Covariates: patient age and sex (15 females, 11 males)

Real Data Example

The Data

patient sex age time death    patient sex age time death
      1   2  51    1     1         14   2  30   37     0
      2   2  58    1     1         15   2  33   35     0
      3   2  55    2     1         16   1  36   25     1
      4   2  28   22     1         17   1  30   31     0
      5   1  21   30     0         18   1  41   22     1
      6   1  19   28     1         19   2  43   26     1
      7   2  25   32     1         20   2  45   24     1
      8   2  48   11     1         21   2  35   35     0
      9   2  47   14     1         22   1  29   34     0
     10   2  25   36     0         23   1  35   30     0
     11   2  31   31     0         24   1  32   35     1
     12   1  24   33     0         25   2  36   40     1
     13   1  25   33     0         26   1  32   39     0
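
For readers who want to work with these values without installing a data package, the table can be typed in directly. The sketch below transcribes it into a data frame; the name psych_tab is hypothetical and is not used elsewhere in the slides.

```r
# Data transcribed from the table above, patients 1 to 26 in order
psych_tab <- data.frame(
  patient = 1:26,
  sex   = c(2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1,
            2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1),
  age   = c(51, 58, 55, 28, 21, 19, 25, 48, 47, 25, 31, 24, 25,
            30, 33, 36, 30, 41, 43, 45, 35, 29, 35, 32, 36, 32),
  time  = c(1, 1, 2, 22, 30, 28, 32, 11, 14, 36, 31, 33, 33,
            37, 35, 25, 31, 22, 26, 24, 35, 34, 30, 35, 40, 39),
  death = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,
            0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0)
)

table(psych_tab$sex)       # number of patients per sex code
sum(psych_tab$death)       # number of deaths
sum(psych_tab$death == 0)  # number of censored observations
```

The counts agree with the summary slide (14 deaths, 12 censored; 15 patients in one sex group and 11 in the other), which suggests that code 2 denotes females and code 1 denotes males.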

Real Data Example

Questions

• Does the lifetime distribution behave the way we expect?

• Are the lifetimes different for females and males?

• Do the lifetimes depend on age?

Estimating the Survival Function

We can explore the lifetime distribution by examining nonparametric estimates of the survival function.

The R function survfit allows us to do this.

> library(KMsurv) # get the data
> data(psych)
> attach(psych)
> names(psych)
[1] "sex" "age" "time" "death"
> psych.surv=Surv(age,age+time,death) # create a survival object
> psych.fit1=survfit(psych.surv~1) # obtain the estimates
> plot(psych.fit1,xlim=c(40,80),xlab="age",ylab="probability",
+ main="Survival Function Estimates") # plot the estimates
>
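
To see what survfit computes, it may help to reproduce a product-limit (Kaplan-Meier) estimate by hand. The sketch below uses the small right-censored sample from Example 2 rather than the psychiatric data, so there is no left-truncation; with left-truncated data the number at risk at each death time would instead count only individuals who have already entered and not yet left. The helper names are illustrative.

```r
library(survival)

# Right-censored sample from Example 2: 26, 42, 71, 80+, 80+
times  <- c(26, 42, 71, 80, 80)
events <- c(1, 1, 1, 0, 0)

# At each distinct death time, count those still at risk and those dying
death_times <- sort(unique(times[events == 1]))
at_risk <- sapply(death_times, function(u) sum(times >= u))
deaths  <- sapply(death_times, function(u) sum(times == u & events == 1))

# Product-limit estimate: running product of (1 - deaths / at risk)
km_by_hand <- cumprod(1 - deaths / at_risk)

# The same estimate from survfit, reported at the death times
fit <- survfit(Surv(times, events) ~ 1)
km_survfit <- summary(fit)$surv

print(km_by_hand)
print(km_survfit)
```

Both vectors should be 0.8, 0.6, 0.4: the estimate drops by a factor of 4/5, then 3/4, then 2/3, and stays at 0.4 because the last two observations are censored.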

Estimating the Survival Function

[Figure: plot titled "Survival Function Estimates"; x-axis: age (40 to 80); y-axis: probability (0.0 to 1.0)]

Estimating the Survival Function

Now let’s consider females and males separately.

> psych.fit2=survfit(psych.surv~sex) # separate by sex
> plot(psych.fit2,xlim=c(40,80),xlab="age",ylab="probability",
+ main="Survival Function Estimates for Males (red) and Females",
+ col=c("red","blue"))
> plot(psych.fit2,xlim=c(40,80),xlab="age",ylab="probability",
+ main="Survival Function Estimates for Males (red) and Females",
+ col=c("red","blue"), conf.int=T)
>

Estimating the Survival Function

[Figure: plot titled "Survival Function Estimates for Females (blue) and Males"; x-axis: age (40 to 80); y-axis: probability (0.0 to 1.0)]

Testing for Differences

The R function survdiff allows us to test for differences between lifetime distributions.

> survdiff(psych.surv~sex)
Error in survdiff(psych.surv ~ sex) : Right censored data only
> psych.surv2=Surv(time,death) # create new survival object
> survdiff(psych.surv2~sex)
Call:
survdiff(formula = psych.surv2 ~ sex)

       N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 11        4     6.24     0.807      1.61
sex=2 15       10     7.76     0.650      1.61

Chisq= 1.6 on 1 degrees of freedom, p= 0.205

>
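
The columns of this output can be checked with a few lines of arithmetic. The sketch below uses only the rounded values printed above, so small discrepancies in the third decimal place are expected; the variable names are illustrative.

```r
# Observed and expected numbers of deaths, copied from the output above
observed <- c(4, 10)
expected <- c(6.24, 7.76)

# The (O-E)^2/E column, up to rounding of the printed expected counts
oe_stat <- (observed - expected)^2 / expected
round(oe_stat, 3)

# p-value for the test statistic 1.61 on 1 degree of freedom
p_value <- pchisq(1.61, df = 1, lower.tail = FALSE)
round(p_value, 3)
```

The p-value is the upper tail probability of a chi-square distribution with one degree of freedom, which is why a statistic of about 1.6 gives no evidence of a difference between the two groups.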

Fitting a Proportional Hazards Model

The model: $h(t \mid x_1, \ldots, x_p) = h_0(t)\exp(\beta_1 x_1 + \cdots + \beta_p x_p)$

• The PH model is often used when we are interested in the impact of the covariates, $x_1, \ldots, x_p$, but not the lifetime distributions themselves.

• We can estimate and make inferences about $\beta_1, \ldots, \beta_p$ without estimating $h_0$.

• The R function coxph allows us to do this.

Fitting a Proportional Hazards Model

> psych.coxph1=coxph(psych.surv~sex)
> summary(psych.coxph1)
Call:
coxph(formula = psych.surv ~ sex)

  n= 26

      coef exp(coef) se(coef)     z Pr(>|z|)
sex 0.3900    1.4770   0.6102 0.639    0.523

    exp(coef) exp(-coef) lower .95 upper .95
sex     1.477      0.677    0.4466     4.884

Rsquare= 0.016   (max possible= 0.926 )
Likelihood ratio test= 0.43  on 1 df,   p=0.5141
Wald test            = 0.41  on 1 df,   p=0.5227
Score (logrank) test = 0.41  on 1 df,   p=0.5203

Fitting a Proportional Hazards Model

Next we use our survival object psych.surv2, which does not involve left-truncation.

> psych.coxph2=coxph(psych.surv2~sex)
> summary(psych.coxph2)
Call:
coxph(formula = psych.surv2 ~ sex)

  n= 26

      coef exp(coef) se(coef)     z Pr(>|z|)
sex 0.7511    2.1194   0.6055 1.241    0.215

    exp(coef) exp(-coef) lower .95 upper .95
sex     2.119     0.4718    0.6469     6.944

Rsquare= 0.062   (max possible= 0.945 )
Likelihood ratio test= 1.66  on 1 df,   p=0.1981
Wald test            = 1.54  on 1 df,   p=0.2148
Score (logrank) test = 1.61  on 1 df,   p=0.2046

Note that the last test is exactly that performed using survdiff.

Fitting a Proportional Hazards Model

Finally, consider

> psych.coxph3=coxph(psych.surv2~age+sex)
> summary(psych.coxph3)
Call:
coxph(formula = psych.surv2 ~ age + sex)

  n= 26

        coef exp(coef) se(coef)      z Pr(>|z|)
age  0.20753   1.23063  0.05828  3.561  0.00037 ***
sex -0.52374   0.59230  0.73753 -0.710  0.47762
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    exp(coef) exp(-coef) lower .95 upper .95
age    1.2306     0.8126    1.0978     1.380
sex    0.5923     1.6883    0.1396     2.514

Rsquare= 0.553   (max possible= 0.945 )
Likelihood ratio test= 20.91  on 2 df,   p=2.879e-05
Wald test            = 14.3  on 2 df,   p=0.0007866
Score (logrank) test = 21.27  on 2 df,   p=2.409e-05
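
The other entries in the age row follow from its coefficient and standard error. The sketch below recomputes them from the printed values, assuming the usual normal approximation for the Wald statistic and the 95% interval; the variable names are illustrative.

```r
# Estimate and standard error for age, copied from the summary above
coef_age <- 0.20753
se_age   <- 0.05828

# exp(coef): hazard ratio for a one-year increase in age
hazard_ratio <- exp(coef_age)

# z and Pr(>|z|): Wald statistic and its two-sided p-value
z_value <- coef_age / se_age
p_value <- 2 * pnorm(-abs(z_value))

# lower .95 and upper .95: exponentiated interval for the coefficient
ci <- exp(coef_age + c(-1, 1) * qnorm(0.975) * se_age)

round(c(hazard_ratio, z_value, p_value), 5)
round(ci, 4)
```

Under the proportional hazards model, each additional year of age multiplies the hazard by about 1.23, holding sex fixed, and the interval excludes 1.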

Conclusions about this Example

• There is great uncertainty due to the small number of observations.

• Times until death depend on age at first admission to the hospital.

• We cannot conclude that the lifetimes are different for females and males.

Fitting an Accelerated Failure Time Model

• This is a popular fully parametric model for which the lifetime distribution is the same for different covariate values, except that the time scale is multiplied by a different constant.

• The R function survreg can be used to fit an AFT model.
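
The slides do not show a survreg call, so the following is only a sketch of how one might look for the psychiatric data. The Weibull distribution, the data frame name psych_tab and the choice of covariates are assumptions made for illustration; the data are transcribed from the table on the slide "The Data".

```r
library(survival)

# Data transcribed from the slide "The Data" (patients 1 to 26)
psych_tab <- data.frame(
  sex   = c(2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1,
            2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1),
  age   = c(51, 58, 55, 28, 21, 19, 25, 48, 47, 25, 31, 24, 25,
            30, 33, 36, 30, 41, 43, 45, 35, 29, 35, 32, 36, 32),
  time  = c(1, 1, 2, 22, 30, 28, 32, 11, 14, 36, 31, 33, 33,
            37, 35, 25, 31, 22, 26, 24, 35, 34, 30, 35, 40, 39),
  death = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,
            0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0)
)

# Weibull AFT model; the distribution is an assumption, not from the slides
aft_fit <- survreg(Surv(time, death) ~ age + sex, data = psych_tab,
                   dist = "weibull")

# survreg models log(time), so exp(coefficient) is the factor by which
# the time scale is multiplied for a one-unit increase in the covariate
exp(coef(aft_fit))
```

A factor below 1 shortens lifetimes and a factor above 1 lengthens them, which is the "time scale multiplied by a constant" interpretation described above.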

Summary

• R is a flexible and free software environment for statistical computing and graphics.

• The survival package contains functions for survival analysis.

– Surv creates a survival object.

– survfit estimates (nonparametrically) the survival function.

– survdiff performs tests for differences in lifetime distributions.

– coxph fits the proportional hazards model.

– survreg fits the accelerated failure time model.

These slides are here:

http://www.stats.uwo.ca/faculty/jones/survival_talk.pdf
