Survival Analysis - Purdue University · 2011-05-04


Survival Analysis

STAT 526

Professor Olga Vitek

May 4, 2011

9


Survival Data and Survival Functions

• Statistical analysis of time-to-event data

– Lifetime of machines and/or parts (called failure time analysis in engineering)

– Time to default on bonds or credit cards (called duration analysis in economics)

– Patients' survival time under different treatments (called survival analysis in clinical trials)

– Event-history analysis (sociology)

• Why a special topic on survival analysis?

– Non-normal and skewed distributions

– Need to answer questions related to conditional probabilities such as

$P(X > t + t_0 \mid X \ge t_0)$

– Censored/truncated data

– Here we focus only on right-censored data


Survival Function

• Continuous survival time $T$

– its probability density function is $f(t)$

– its cumulative probability function is $F(t)$

$$F(t) = P(T \le t) = \int_0^t f(s)\,ds$$

• The survival function of $T$ is

$$S(t) = P(T > t) = 1 - F(t)$$

– also called survival rate

– steeper curve → shorter survival

– $S'(t) = -f(t)$

• Mean survival time

$$\mu = E\{T\} = \int_0^\infty t f(t)\,dt = \int_0^\infty t\,dF(t) = \int_0^\infty t\,d[1 - S(t)]$$

$$= \int_0^\infty \left[\int_0^t dx\right] d[1 - S(t)] = \int_0^\infty \left[\int_x^\infty d[1 - S(t)]\right] dx = \int_0^\infty S(x)\,dx$$
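The identity between the mean and the area under the survival curve can be checked numerically. The R sketch below does this for an exponential distribution; the rate 0.5 is an arbitrary illustrative choice, not a value from the lecture.

```r
# Illustrative check (assumed rate; not from the lecture):
# E{T} equals the area under the survival curve.
rate <- 0.5

# Mean via the density: integral of t * f(t) over (0, Inf)
mean_via_f <- integrate(function(t) t * dexp(t, rate = rate), 0, Inf)$value

# Mean via the survival function: integral of S(t) = P(T > t) over (0, Inf)
mean_via_S <- integrate(function(t) pexp(t, rate = rate, lower.tail = FALSE),
                        0, Inf)$value

c(mean_via_f = mean_via_f, mean_via_S = mean_via_S, exact = 1 / rate)
```

Both integrals should agree with the known exponential mean 1/rate up to numerical integration error.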


Hazard Function

• The hazard function of $T$ is

$$\lambda(t) = \lim_{\Delta t \searrow 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$

– proportion of subjects with the event per unit time, around time $t$; $\lambda(t) \ge 0$

– measure of 'proneness' to the event as a function of time

– $\lambda(t) \ne f(t)$: $\lambda(t)$ is conditional on survival to $t$

• Relates to the survival function

$$\lambda(t) = \lim_{\Delta t \searrow 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} \Big/ S(t) = \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)} = -\frac{d}{dt} \log S(t)$$

(negative slope of log-survival)

• The cumulative hazard function of $T$ is defined as $\Lambda(t) = \int_0^t \lambda(s)\,ds$

– $\Lambda(t) = -\log S(t)$

– $S(t) = \exp\{-\Lambda(t)\}$

– any one of $\lambda(t)$, $\Lambda(t)$ or $S(t)$ defines the distribution
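A minimal R sketch of the relation $S(t) = \exp\{-\Lambda(t)\}$: the hazard is computed as $f(t)/S(t)$, integrated numerically, and the result is compared with the survival function. The Weibull shape and rate below are assumed values chosen only for illustration.

```r
# Assumed illustrative parameters (not from the lecture)
p   <- 1.5   # Weibull shape
lam <- 0.2   # Weibull rate; R's `scale` argument equals 1 / lam

# Hazard as the ratio f(t) / S(t)
hazard <- function(t) {
  dweibull(t, shape = p, scale = 1 / lam) /
    pweibull(t, shape = p, scale = 1 / lam, lower.tail = FALSE)
}

t0 <- 4
Lambda_t0     <- integrate(hazard, lower = 0, upper = t0)$value  # cumulative hazard
S_from_hazard <- exp(-Lambda_t0)                                  # exp{-Lambda(t0)}
S_direct      <- pweibull(t0, shape = p, scale = 1 / lam, lower.tail = FALSE)

c(Lambda = Lambda_t0, closed_form = (lam * t0)^p,
  S_from_hazard = S_from_hazard, S_direct = S_direct)
```

For the Weibull model the cumulative hazard has the closed form $(\lambda t)^p$, which gives a second check on the numerical integral.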


Parametric Survival Models

• Exponential distribution:

– $\lambda(t) = \rho$, where $\rho > 0$ is a constant, and $t > 0$

– $S(t) = e^{-\rho t}$ ⇒ $f(t) = -S'(t) = \rho e^{-\rho t}$

• Weibull distribution:

– $\lambda(t) = \lambda p (\lambda t)^{p-1}$; $\lambda, p > 0$ are constants, $t > 0$

– $S(t) = e^{-(\lambda t)^p}$ ⇒ $f(t) = -S'(t) = e^{-(\lambda t)^p} \lambda p (\lambda t)^{p-1}$

– is the Exponential distribution when $p = 1$

• Gompertz distribution:

– $\lambda(t) = \alpha e^{\beta t}$; $\alpha, \beta > 0$ are constants, $t > 0$

– is the Exponential distribution when $\beta = 0$

• Gompertz-Makeham distribution:

– $\lambda(t) = \lambda + \alpha e^{\beta t}$; $\alpha, \beta > 0$ are constants, $t > 0$

– adds an initial fixed component to the hazard


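Non-Parametric Survival Models

• Approximate time by discrete intervals, with possible event times $x_1 < x_2 < \cdots$

– $f(x) = P\{T = x_i\}$ if $x = x_i$, and $0$ otherwise

– $S(x_i) = \sum_{j:\, x_j \ge x_i} f(x_j)$

– $\lambda(x_i) = f(x_i) / S(x_i)$

• After substitution, writing $\lambda_j = \lambda(x_j)$:

– $f(x_i) = \lambda_i \prod_{j=1}^{i-1} (1 - \lambda_j)$

– $S(x_i) = \prod_{j=1}^{i-1} (1 - \lambda_j)$

– $S(t) = \prod_{j:\, x_j < t} (1 - \lambda_j)$, $t > 0$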


Right-Censored Data

• Survival time of the $i$-th subject: $T_i$, $i = 1, \cdots, n$.

• Censoring time of the $i$-th subject: $C_i$, $i = 1, \cdots, n$.

• Observed event for the $i$-th subject:

$$Y_i = \min(T_i, C_i), \qquad \delta_i = 1_{\{T_i \le C_i\}} = \begin{cases} 1, & \text{if } T_i \le C_i \\ 0, & \text{if } T_i > C_i \end{cases}$$

– Data are reported in pairs $(Y_i, \delta_i)$ = (observedTime, hasEvent)

• Assumption:

– $C_i$ are predetermined and fixed, or

– $C_i$ are random, mutually independent, and independent of $T_i$
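To make the notation concrete, the following R sketch simulates right-censored pairs $(Y_i, \delta_i)$ from latent survival and censoring times. The exponential distributions, their rates, the sample size and the seed are all assumptions made for illustration only.

```r
# Hypothetical simulation: distributions and rates are assumed, chosen only
# to show how (Y_i, delta_i) arise from (T_i, C_i).
set.seed(526)
n <- 8
T_true <- rexp(n, rate = 0.10)    # latent survival times T_i
C      <- rexp(n, rate = 0.05)    # random censoring times C_i, independent of T_i

Y     <- pmin(T_true, C)          # observed time Y_i = min(T_i, C_i)
delta <- as.integer(T_true <= C)  # 1 = event observed, 0 = right-censored

data.frame(observedTime = round(Y, 2), hasEvent = delta)
```

Only `Y` and `delta` would be available to the analyst; `T_true` is known here because the data are simulated.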


Non-parametric Estimation of Survival Function and Comparison Between Groups With Censoring


Example: Remission Times of Leukaemia Patients

• 21 leukemia patients treated with drug (6-mercaptopurine)

• 21 matched controls, no covariates.

> library(MASS)   # Described in Venables & Ripley, Ch. 13
> data(gehan)
> gehan
   pair time cens   treat
1     1    1    1 control
2     1   10    1    6-MP
3     2   22    1 control
4     2    7    1    6-MP
5     3    3    1 control
6     3   32    0    6-MP
7     4   12    1 control
8     4   23    1    6-MP
9     5    8    1 control
10    5   22    1    6-MP
11    6   17    1 control
12    6    6    1    6-MP
13    7    2    1 control
14    7   16    1    6-MP
15    8   11    1 control
16    8   34    0    6-MP
17    9    8    1 control
.........................


Kaplan-Meier Estimator of Survival Function

• Also called product limit estimator

• Re-organize the data

Distinct event times              $t_1$ · · · $t_i$ · · · $t_k$
# of events                       $d_1$ · · · $d_i$ · · · $d_k$
# survivors right before $t_i$    $n_1$ · · · $n_i$ · · · $n_k$

• Kaplan-Meier estimator $\hat S(t)$:

– Estimate $P(T > t_1)$ by $\hat p_1 = (n_1 - d_1)/n_1$

– Estimate $P(T > t_i \mid T > t_{i-1})$ by $\hat p_i = (n_i - d_i)/n_i$, $i = 2, \cdots, k$

– If $t_{i-1} < t < t_i$,

$$S(t) = P(T > t) = P(T > t_1)\,P(T > t \mid T > t_1) = \cdots = P(T > t_1) \left\{ \prod_{m=2}^{i-1} P(T > t_m \mid T > t_{m-1}) \right\} P(T > t \mid T > t_{i-1})$$

– Estimate $S(t)$ by the recursion $\hat S(t_i) = \hat S(t_{i-1}) \frac{n_i - d_i}{n_i}$, which gives

$$\hat S(t) = \prod_{i:\, t_i \le t} \frac{n_i - d_i}{n_i}$$

(Kaplan & Meier, 1958, JASA)
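The product-limit formula can be evaluated by hand in a few lines of R. The sketch below does so for the 6-MP arm of the `gehan` data from the MASS package; the variable names (`trt`, `t_i`, `n_i`, `d_i`, `S_hat`) are introduced here for illustration.

```r
library(MASS)   # provides the gehan data

trt <- subset(gehan, treat == "6-MP")

# Distinct event times t_i (censored times are not event times)
t_i <- sort(unique(trt$time[trt$cens == 1]))

# n_i: subjects still at risk right before t_i; d_i: events at t_i
n_i <- sapply(t_i, function(t) sum(trt$time >= t))
d_i <- sapply(t_i, function(t) sum(trt$time == t & trt$cens == 1))

# Kaplan-Meier estimate: running product of (n_i - d_i) / n_i
S_hat <- cumprod((n_i - d_i) / n_i)

data.frame(time = t_i, n.risk = n_i, n.event = d_i, survival = round(S_hat, 3))
```

The columns correspond to `time`, `n.risk`, `n.event` and `survival` in the output of `summary(survfit(...))` for the same group.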


Non-Parametric Fit of Survival Curves

• Represent data in the censored survival form

– create a data structure 'Surv'

– '+' represents censored time

> library(survival)
> Surv(gehan$time, gehan$cens)
  1  10  22   7   3  32+ 12  23   8  22  17   6   2  16
 11  34+  8  32+ 12  25+  2  11+  5  20+  4  19+ 15   6
  8  17+ 23  35+  5   6  11  13   4   9+  1   6+  8  10+

• K-M estimation of survival function

– subset of results for the treatment group:

> fit <- survfit(Surv(time, cens) ~ treat, data=gehan)
> summary(fit)

treat=6-MP
 time n.risk n.event survival std.err lower95%CI upper95%CI
    6     21       3    0.857  0.0764      0.720      1.000
    7     17       1    0.807  0.0869      0.653      0.996
   10     15       1    0.753  0.0963      0.586      0.968
   13     12       1    0.690  0.1068      0.510      0.935
   16     11       1    0.627  0.1141      0.439      0.896
   22      7       1    0.538  0.1282      0.337      0.858
   23      6       1    0.448  0.1346      0.249      0.807


Interval K-M Estimator of Survival Function

• Direct interval estimation of $S(t)$

– The variance of $\hat S(t)$ can be estimated by

$$\widehat{\mathrm{var}}(\hat S(t)) = [\hat S(t)]^2 \sum_{i:\, t_i \le t} \frac{d_i}{n_i (n_i - d_i)}$$

– This is Greenwood's formula (Greenwood, 1926)

– CI for $S(t)$ is $\hat S(t) \pm 1.96 \sqrt{\widehat{\mathrm{var}}[\hat S(t)]}$

– $S(t) \in [0, 1]$, but $\widehat{\mathrm{var}}(\hat S(t))$ may be large enough that the usual CI will exceed this interval

• Interval estimation using $\log S(t)$ (preferred)

– The variance of $\log \hat S(t)$ can be estimated as

$$\widehat{\mathrm{var}}[\log \hat S(t)] = \sum_{i:\, t_i \le t} \frac{d_i}{n_i (n_i - d_i)}$$

– Construct the CI for $\log S(t)$

$$[L, U] = \log \hat S(t) \pm 1.96 \sqrt{\widehat{\mathrm{var}}[\log \hat S(t)]}$$

– For the CI of $S(t)$, transform the limits: $[e^L, e^U]$.
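As a sketch of how these formulas are applied, the R code below computes the Greenwood standard error and the log-scale confidence limits by hand for the 6-MP arm of `gehan`. Variable names are illustrative, and the upper limit is capped at 1 as an assumption about how such limits are usually reported.

```r
library(MASS)   # provides the gehan data

trt <- subset(gehan, treat == "6-MP")
t_j <- sort(unique(trt$time[trt$cens == 1]))              # distinct event times
n_j <- sapply(t_j, function(t) sum(trt$time >= t))        # at risk right before t_j
d_j <- sapply(t_j, function(t) sum(trt$time == t & trt$cens == 1))

S_hat   <- cumprod((n_j - d_j) / n_j)             # Kaplan-Meier estimate
var_log <- cumsum(d_j / (n_j * (n_j - d_j)))      # variance of log S_hat
se_S    <- S_hat * sqrt(var_log)                  # Greenwood standard error of S_hat

# CI built on the log scale, then transformed back; upper limit capped at 1
lower <- exp(log(S_hat) - 1.96 * sqrt(var_log))
upper <- pmin(1, exp(log(S_hat) + 1.96 * sqrt(var_log)))

round(data.frame(time = t_j, survival = S_hat, std.err = se_S, lower, upper), 4)
```

At $t = 6$, for example, $\hat S = 18/21$ and the variance sum is $3/(21 \cdot 18)$, which gives a standard error of about 0.0764.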


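Estimate the Cumulative Hazard Function $\Lambda(t) = -\log S(t)$

• Using the Kaplan-Meier estimator of $S(t)$:

$$\hat\Lambda(t) = -\log \hat S(t)$$

• Alternatively, the Nelson-Aalen estimator:

$$\tilde\Lambda(t) = \begin{cases} 0, & \text{if } t < t_1 \\ \sum_{i:\, t_i \le t} \frac{d_i}{n_i}, & \text{if } t \ge t_1 \end{cases} \qquad \mathrm{Var}(\tilde\Lambda(t)) = \sum_{i:\, t_i \le t} \frac{d_i}{n_i^2}$$

– better small-sample-size performance than the estimator based on the K-M procedure

– useful in comparing the fit of a parametric model to its non-parametric alternative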
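The two cumulative hazard estimators can be compared directly. This R sketch computes both by hand for the 6-MP arm of `gehan`; the variable names are chosen here for illustration.

```r
library(MASS)   # provides the gehan data

trt <- subset(gehan, treat == "6-MP")
t_j <- sort(unique(trt$time[trt$cens == 1]))
n_j <- sapply(t_j, function(t) sum(trt$time >= t))
d_j <- sapply(t_j, function(t) sum(trt$time == t & trt$cens == 1))

Lambda_NA <- cumsum(d_j / n_j)                      # Nelson-Aalen estimate
var_NA    <- cumsum(d_j / n_j^2)                    # its estimated variance
Lambda_KM <- -log(cumprod((n_j - d_j) / n_j))       # -log of the K-M estimate

round(data.frame(time = t_j, Lambda_NA, se_NA = sqrt(var_NA), Lambda_KM), 4)
```

Because $-\log(1 - x) \ge x$ for $0 \le x < 1$, the Nelson-Aalen estimate is never larger than the one derived from Kaplan-Meier; the two are close when $d_i / n_i$ is small.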


Non-Parametric Fit of Survival Curves

> plot(fit, conf.int=TRUE, lty=3:2, log=TRUE,
       xlab="Remission (weeks)", ylab="Log-Survival", main="Gehan")

• Plot curves and CI; default CI are on the log scale

• 'log=TRUE' plots the y axis (i.e. S(t)) on the log scale (i.e. the y axis shows the negative cumulative hazard)

[Figure: "Gehan". Kaplan-Meier curves with confidence intervals for the control and 6-MP groups; x axis: Remission (weeks), 0 to 35; y axis: Survival on the log scale, 0.05 to 1.00.]


Log-Rank Test for Homogeneity

• Non-parametric test

– Compare two populations with hazard functions $\lambda_i(t)$, $i = 1, 2$.

– Collect one sample from each population.

– Construct a pooled sample with $k$ distinct event times

Distinct failure time                             $t_1$ · · · $t_i$ · · · $t_k$
Pooled sample   # of failures                     $d_1$ · · · $d_i$ · · · $d_k$
                # survivors right before $t_i$    $n_1$ · · · $n_i$ · · · $n_k$
Sample 1        # of failures                     $d_{11}$ · · · $d_{1i}$ · · · $d_{1k}$
                # survivors right before $t_i$    $n_{11}$ · · · $n_{1i}$ · · · $n_{1k}$
Sample 2        # of failures                     $d_{21}$ · · · $d_{2i}$ · · · $d_{2k}$
                # survivors right before $t_i$    $n_{21}$ · · · $n_{2i}$ · · · $n_{2k}$

– Note $d_i = d_{1i} + d_{2i}$, $n_i = n_{1i} + n_{2i}$, $i = 1, \cdots, k$

– Test the hypotheses

$H_0: \lambda_1(t) = \lambda_2(t)$, $t \le \tau$ vs. $H_a: \lambda_1(t) \ne \lambda_2(t)$ for some $t \le \tau$


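Log-Rank Test for Homogeneity: Procedure

• Consider $[t_i, t_i + \Delta)$ for small $\Delta$

                                  Sample 1             Sample 2             Pooled sample
# failures                        $d_{1i}$             $d_{2i}$             $d_i$
# survivors at $t_i$              $n_{1i} - d_{1i}$    $n_{2i} - d_{2i}$    $n_i - d_i$
# survivors right before $t_i$    $n_{1i}$             $n_{2i}$             $n_i$

• If $\lambda_1(t) \ne \lambda_2(t)$ around $t_i$, there should be association between sample and event in this 2 × 2 table

• Under $H_0$, $d_{1i} \mid (n_i, d_i, n_{1i}) \sim \mathrm{Hypergeometric}(n_i, d_i, n_{1i})$:

$$P(d_{1i} = x \mid n_i, d_i, n_{1i}) = \frac{\binom{d_i}{x} \binom{n_i - d_i}{n_{1i} - x}}{\binom{n_i}{n_{1i}}} \implies E[d_{1i} \mid n_i, d_i, n_{1i}] = \frac{d_i n_{1i}}{n_i}$$

• Define $U = \sum_{i=1}^{k} \left[ d_{1i} - \frac{d_i n_{1i}}{n_i} \right]$

$$E[U] = 0, \qquad \mathrm{var}(U) = \sum_{i=1}^{k} \frac{d_i n_{1i} (n_i - d_i)(n_i - n_{1i})}{n_i^2 (n_i - 1)}$$

$$U / \mathrm{var}(U)^{1/2} \sim N(0, 1) \text{ asymptotically under } H_0$$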
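The procedure above can be sketched in R by building the per-time counts by hand and comparing the result with `survdiff` from the survival package. The `gehan` data serve as the example, with 6-MP taken as "sample 1"; all variable names are illustrative.

```r
library(MASS)       # gehan data
library(survival)   # survdiff, used only for comparison

in1 <- gehan$treat == "6-MP"                           # membership in sample 1
t_j <- sort(unique(gehan$time[gehan$cens == 1]))       # pooled distinct event times

n  <- sapply(t_j, function(t) sum(gehan$time >= t))          # pooled at risk
n1 <- sapply(t_j, function(t) sum(gehan$time >= t & in1))    # sample 1 at risk
d  <- sapply(t_j, function(t) sum(gehan$time == t & gehan$cens == 1))
d1 <- sapply(t_j, function(t) sum(gehan$time == t & gehan$cens == 1 & in1))

U <- sum(d1 - d * n1 / n)                                    # observed - expected
V <- sum(d * n1 * (n - d) * (n - n1) / (n^2 * (n - 1)))      # hypergeometric variance
chisq   <- U^2 / V                                           # (U / sqrt(V))^2
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)

c(observed = sum(d1), expected = sum(d * n1 / n), chisq = chisq, p = p_value)
survdiff(Surv(time, cens) ~ treat, data = gehan)$chisq       # should agree with chisq
```

Squaring the standardized statistic gives a chi-square statistic on 1 degree of freedom, which is the form `survdiff` reports.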


Log-Rank Test in R (Venables & Ripley Sec. 13.2)

> ?survdiff
> survdiff(Surv(time,cens)~treat,data=gehan)

               N Observed Expected (O-E)^2/E (O-E)^2/V
treat=6-MP    21        9     19.3      5.46      16.8
treat=control 21       21     10.7      9.77      16.8

 Chisq= 16.8  on 1 degrees of freedom, p= 4.17e-05

• Conclusion: Reject H0

• Warning: this test does not adjust for covariates, and may often be inappropriate

– Appropriate for this study, since the individuals are matched pairs


Parametric Estimation of Survival Function and Comparison Between Groups


The Likelihood Function

• The likelihood of the $i$-th observation $(y_i, \delta_i, X_i)$:

$$\begin{cases} f(y_i \mid X_i), & \text{if } \delta_i = 1 \text{ (not censored)} \\ S(y_i \mid X_i), & \text{if } \delta_i = 0 \text{ (right-censored)} \end{cases}$$

• The likelihood of the data is

$$L(\beta) \propto \prod_{i=1}^{n} \left\{ [f(y_i \mid X_i)]^{\delta_i} [S(y_i \mid X_i)]^{1 - \delta_i} \right\}$$

– The inference approaches based on the likelihood function can be applied.

∗ See chapter by Lee on ML parameter estimation

– Use AIC or BIC to compare nested models.

– Graphical check of model adequacy

∗ Plot $\hat\Lambda(t)$ vs. $t$ for exponential models;

∗ Plot $\log \hat\Lambda(t)$ vs. $\log t$ for Weibull models;

∗ Can also plot deviance residuals.
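For the exponential model without covariates, the censored log-likelihood is $\sum_i [\delta_i \log\rho - \rho y_i]$, which is maximized at $\hat\rho = \sum_i \delta_i / \sum_i y_i$. The R sketch below checks this closed form against a numerical maximization and against `survreg` for the 6-MP arm of `gehan`; the function name `loglik` and the search interval are assumptions made for illustration.

```r
library(MASS)       # gehan data
library(survival)   # survreg, used for comparison

trt <- subset(gehan, treat == "6-MP")

# Censored exponential log-likelihood:
# delta * log f(y) + (1 - delta) * log S(y) = delta * log(rho) - rho * y
loglik <- function(rho) sum(trt$cens * log(rho) - rho * trt$time)

rho_closed <- sum(trt$cens) / sum(trt$time)     # events / total time at risk
rho_numeric <- optimize(loglik, interval = c(1e-4, 1), maximum = TRUE)$maximum

# survreg models log T, so its intercept corresponds to -log(rho)
fit <- survreg(Surv(time, cens) ~ 1, data = trt, dist = "exponential")
rho_survreg <- unname(exp(-coef(fit)))

c(closed_form = rho_closed, numeric = rho_numeric, survreg = rho_survreg)
```

The closed form makes the role of censoring visible: censored subjects contribute follow-up time to the denominator but no event to the numerator.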


Accounting for Covariates: Models for Hazard Function

• General form: $\lambda(t) = \lambda_0(t) e^{x'\beta}$

– $\lambda_0(t)$: the baseline hazard

– $e^{x'\beta}$: multiplicative effect, independent of time

• Implications for the Survival Function:

– $S(t) = e^{-\Lambda(t)} = e^{-\int_0^t \lambda(s)\,ds} = e^{-\Lambda_0(t) e^{x'\beta}}$

– $\log \Lambda(t) = \log[-\log S(t)] = \log \Lambda_0(t) + x'\beta$

– If the model is appropriate, a plot of $\log[-\log \hat S_{KM}(t)]$ for different groups yields roughly parallel lines


Accounting for Covariates: Models for Hazard Function

• Exponential distribution, no predictors:

– $\lambda(t) = \rho$ - constant hazard function

– $S(t) = e^{-\rho t}$ - survival function

• Exponential distribution, one predictor:

– $\lambda(t) = e^{\beta_0 + \beta_1 X} = e^{\beta_0} \cdot e^{\beta_1 X}$

– $e^{\beta_0}$ plays the role of $\lambda_0$, i.e. the constant baseline hazard

• Weibull distribution, no predictors:

– $\lambda(t) = \lambda^p p\, t^{p-1}$ - hazard function

– $S(t) = e^{-(\lambda t)^p}$ - survival function

• Weibull distribution, one predictor:

– $\lambda(t) = p\, t^{p-1} \cdot e^{(\beta_0 + \beta_1 X) p} = p\, t^{p-1} e^{\beta_0 p} \cdot e^{p \beta_1 X}$

– $e^{\beta_0 + \beta_1 X}$ plays the role of $\lambda$ in overall hazard

– $p\, t^{p-1} e^{\beta_0 p}$ is the baseline hazard

– $e^{\beta_0}$ plays the role of $\lambda$ in the baseline hazard


More on Proportional Hazard

• Assumption of proportional hazard implies:

– The hazard ratio for two subjects $i$ and $j$ with different covariates, at a given time, is

$$\frac{\lambda_i(t)}{\lambda_j(t)} = \frac{\lambda_0(t)}{\lambda_0(t)} \cdot \frac{e^{\beta_0 + \beta_1 X_i}}{e^{\beta_0 + \beta_1 X_j}} = e^{(X_i - X_j)\beta_1}$$

– constant over time

– only function of covariates

• Can graphically verify the assumption of proportional hazard.

– Check for parallel lines of $\hat S_{KM}(t)$ on the complementary log-log scale

– For the gehan dataset:

> plot(fit, lty=3:4, col=2:3, fun="cloglog",
       xlab="Remission (weeks)", ylab="log Lambda(t)")
> legend("topleft", c("control", "6-MP"), lty=4:3, col=3:2)


Verify the Assumption of Proportional Hazard: Gehan Study

[Figure: complementary log-log plot of the Kaplan-Meier curves for the control and 6-MP groups; x axis: Remission (weeks), log scale from 1 to 20; y axis: log Lambda(t), from −2.0 to 1.0.]

• Good agreement with the additive structure on the log-log scale

• The assumption of proportional hazard is plausible


Accounting for Covariates: Models for Survival Function

• Also called Accelerated Failure Models

$$S(t) = S_0(t \cdot e^{x'\beta})$$

– $S_0$ is the baseline survival function

– Covariates accelerate or contract the time to event

– Survival time $T_0 = T \cdot e^{x'\beta}$ has a fixed distribution

$$\log T_0 = \log T + x'\beta \implies \log T = \log T_0 - x'\beta$$

– Weibull (and its special case Exponential) are the only distributions that can be simultaneously (and equivalently) specified as proportional hazard and accelerated failure models.


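Accelerated Failure: Exponential

• Exponential distribution:

– $\lambda(t) = \rho$, $S(t) = e^{-\rho t}$

– $S_0(t) = e^{-t}$ - survival of the standard exponential

• Effect of the covariates:

– $S(t) = S_0(t \cdot e^{\beta_0 + \beta_1 X})$, $e^{\beta_0 + \beta_1 X} = \rho$

– If $T_0 \sim \mathrm{Exp}(1)$, and $T$ are the observed times

$$T_0 = T \cdot e^{\beta_0 + \beta_1 X}$$

$$\log T_0 = \log T + \beta_0 + \beta_1 X, \text{ and}$$

$$\log T = -\beta_0 - \beta_1 X + \log T_0$$

• In R:

$$\log T = \beta_0 + \beta_1 X + \sigma \cdot \log \varepsilon$$

– $\sigma$ is a scale parameter fixed at $\sigma = 1$

– $\varepsilon \sim \mathrm{Exp}(1)$

– use opposite sign of $\beta$ to estimate survival:

$$S(t) = S_0(t \cdot e^{-\beta_0 - \beta_1 X})$$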


Accelerated Failure: Exponential

> fit.exponential <- survreg(Surv(time,cens) ~ treat,
                             data=gehan, dist="exponential")
> summary(fit.exponential)
              Value Std. Error     z        p
(Intercept)    3.69      0.333 11.06 2.00e-28
treatcontrol  -1.53      0.398 -3.83 1.27e-04

Scale fixed at 1

Exponential distribution
Loglik(model)= -108.5   Loglik(intercept only)= -116.8
Chisq= 16.49 on 1 degrees of freedom, p= 4.9e-05

• Parameter interpretation:

– $-\beta_{treatcontrol} = 1.53 > 0$: the hazard is higher for controls

– time in remission is shorter for controls

• Predicted survival above T = 10 for trt:

– exp(-10*exp(-3.69))=0.7790189

– comparable to $\hat S^{trt}_{KM}(10) = 0.753$


Accelerated Failure: Weibull

• Weibull distribution:

– $\lambda(t) = \lambda^p p\, t^{p-1}$ - hazard function

– $S(t) = e^{-(\lambda t)^p}$ - survival function

– Can be viewed as the survival function of the exponential random variable $T' = T^p \sim \mathrm{Exp}(\lambda^p)$

• Effect of the covariates:

– Identical to the exponential: $S(t') = S_0(t' \cdot e^{\beta_0 + \beta_1 X})$, $e^{\beta_0 + \beta_1 X} = \lambda^p$

– If $T_0 \sim \mathrm{Exp}(1)$, and $T'$ are the observed times

$$T_0 = T' \cdot e^{\beta_0 + \beta_1 X}$$

$$\log T_0 = \log T' + \beta_0 + \beta_1 X$$

– Returning to the original notation $T' = T^p$:

$$\log T_0 = p \log T + \beta_0 + \beta_1 X$$

$$\log T = -\frac{1}{p}\beta_0 - \frac{1}{p}\beta_1 X + \frac{1}{p} \log T_0$$


Accelerated Failure: Weibull

• In R Weibull is the default distribution:

$$\log T = \beta_0 + \beta_1 X + \sigma \cdot \log \varepsilon$$

– $\sigma$ is a scale parameter, $\sigma = 1/p$, $\varepsilon \sim \mathrm{Exp}(1)$

– use $-\beta/\sigma$ to estimate survival:

$$S(t) = S_0(t^{1/\sigma} \cdot e^{-\beta_0/\sigma - \beta_1 X/\sigma})$$

> fit.weibull <- survreg(Surv(time,cens)~treat, data=gehan)
> summary(fit.weibull)
              Value Std. Error     z        p
(Intercept)   3.516      0.252 13.96 2.61e-44
treatcontrol -1.267      0.311 -4.08 4.51e-05
Log(scale)   -0.312      0.147 -2.12 3.43e-02

Scale= 0.732

Weibull distribution
Loglik(model)= -106.6   Loglik(intercept only)= -116.4
Chisq= 19.65 on 1 degrees of freedom, p= 9.3e-06

• Predicted survival above T = 10 for trt:

– exp(-10^(1/0.732)*exp(-3.516/0.732)) ≈ 0.826

– compare with $\hat S^{trt}_{KM}(10) = 0.753$
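Typing rounded coefficients into the survival formula is error-prone, so a safer sketch reads them from the fitted object. The R code below evaluates the predicted survival at $t = 10$ for the 6-MP group (the reference level of `treat`) and cross-checks it against `pweibull`, using the correspondence shape $= 1/\sigma$ and scale $= e^{\beta_0}$. The variable names `b0`, `sigma`, `t0`, `S_trt` and `S_check` are illustrative.

```r
library(MASS)       # gehan data
library(survival)   # survreg

fit.weibull <- survreg(Surv(time, cens) ~ treat, data = gehan)

b0    <- unname(coef(fit.weibull)["(Intercept)"])  # 6-MP is the reference level
sigma <- fit.weibull$scale                         # sigma = 1 / p
t0    <- 10

# S(t) = S0(t^(1/sigma) * exp(-b0/sigma)) with S0(u) = exp(-u)
S_trt <- exp(-t0^(1 / sigma) * exp(-b0 / sigma))

# Same quantity through R's Weibull parameterization
S_check <- pweibull(t0, shape = 1 / sigma, scale = exp(b0), lower.tail = FALSE)

c(S_trt = S_trt, S_check = S_check)
```

For the control group, the same formula applies with `b0` replaced by the sum of the intercept and the `treatcontrol` coefficient.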


Variable Selection and Prediction

• Compare models with and without pair as blocking factor

> anova( survreg(Surv(time,cens) ~ treat, data=gehan),
         survreg(Surv(time,cens) ~ factor(pair)+treat, data=gehan))
         Terms Res.Df   -2*LL    Test Df Deviance P(>|Chi|)
         treat     39 213.159         NA       NA        NA
  (pair)+treat     19 181.343 +(pair) 20 31.81597   0.04529

• Prediction for the median survival

– On the linear predictor scale

> fit.weibull.nopairs <- survreg(Surv(time,cens)~treat, data=gehan)
> ?predict.survreg
> pred.contr <- predict(fit.weibull.nopairs,
                        data.frame(treat='control'), type="uquantile",
                        p=0.5, se=TRUE)

– On the original time scale (weeks)

> exp( c(L=pred.contr$fit - 2*pred.contr$se.fit,
         U=pred.contr$fit + 2*pred.contr$se.fit) )
      L.1       U.1
 5.045779 10.396147


Cox Proportional Hazards Model: A semi-parametric approach


Cox Proportional Hazards (Relative Risk) Model

• Assume:

– $C_i$ and $T_i$ conditionally independent, given $X_i$.

– Observe $k$ distinct exact failure times (i.e. $\delta_i = 1$): $t_1 < t_2 < \cdots < t_k$.

– No ties in observed exact failure times (i.e., if $\delta_i = \delta_j = 1$, then $y_i \ne y_j$ for $i \ne j$).

• Define the risk set at time $t$

$$R(t) = \{i : y_i \ge t\} = \{\text{individuals alive right before time } t\}$$

• Cox (1975) considered the conditional probability

– Probability of event in a small interval around $t_j$, given that the individual is in the risk set

$$P(\text{ind. } i \text{ failed at } [t_j, t_j + \Delta) \mid i \in R(t_j)) \approx \frac{\lambda_i(t_j)\Delta}{\sum_{k \in R(t_j)} \lambda_k(t_j)\Delta}$$


Partial Likelihood Approach to Estimate β (Cox, 1975)

• With covariate $X_i(t)$, assume a proportional hazards model

$$\lambda(t) = \lambda_0(t) \exp\{X_i(t)'\beta(t)\}$$

where $\lambda_0(t)$ is a baseline hazard and $e^{X_i'\beta(t)}$ is the relative risk. Treating $\lambda_0(t)$ as a nuisance parameter, one may estimate $\beta(t)$ by maximizing the partial likelihood

$$L(\beta) = \prod_j \frac{\exp\{X_{i(j)}(t_j)'\beta(t_j)\}}{\sum_{l \in R(t_j)} \exp\{X_l(t_j)'\beta(t_j)\}}$$

where the $i(j)$th item fails at $t_j$ and $R(t_j)$ is the risk set at $t_j$.

• Conditional on the history up to $t_j$ and the fact that one item fails at $t_j$, each term within the product is proportional to the likelihood of a multinomial model.

• $\lambda_0(t)$ is not specified, which is the non-parametric part of the model.

• $X_l(t)'\beta(t_j)$ is the parametric part of the model.


Partial Likelihood Approach to Estimate β (Cox, 1975)

• The partial likelihood function by Cox (1975) is

$$L(\beta) = \prod_{j=1}^{n_{obs}} \frac{e^{X_{i(j)}'\beta}}{\sum_{k \in R(t_j)} e^{X_k'\beta}} = \prod_{i=1}^{n} \left\{ \frac{e^{X_i'\beta}}{\sum_{k \in R(y_i)} e^{X_k'\beta}} \right\}^{\delta_i}$$

– Cox argued that the partial likelihood has all properties of a usual likelihood, e.g., maximizing it over $\beta$ yields the maximum partial likelihood estimator (MPLE) $\hat\beta$

• Breslow's estimator of the baseline cumulative hazard rate:

$$\hat\Lambda_0(t) = \sum_{j:\, t_j \le t} \frac{1}{\sum_{i \in R(t_j)} e^{X_i'\hat\beta}} = \sum_{i:\, y_i \le t} \frac{\delta_i}{\sum_{j \in R(y_i)} e^{X_j'\hat\beta}}$$

• There are different ways to relax Assumption 3 (no ties) to handle tied failure times, e.g., exact method, Efron's method, and Breslow's method.

Method      R option           Comment
Exact       method="exact"     accurate, long
Efron's     method="efron"     approximate, better
Breslow's   method="breslow"   approximate
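To see the partial likelihood at work, the R sketch below evaluates its logarithm by hand on a tiny data set without tied event times and compares the maximizer with `coxph`. The data frame `toy` and the function `log_pl` are hypothetical, made up for this illustration; they are not part of the lecture or of the survival package.

```r
library(survival)   # coxph, used for comparison

# Hypothetical toy data: no ties, one binary covariate x
toy <- data.frame(time   = c(2, 3, 5, 7, 8, 11),
                  status = c(1, 0, 1, 1, 0, 1),
                  x      = c(1, 0, 1, 0, 1, 0))

# Log partial likelihood: for each event, x_i * beta minus the log of the
# sum of exp(x_k * beta) over the risk set R(y_i) = {k : y_k >= y_i}
log_pl <- function(beta, d) {
  events <- which(d$status == 1)
  sum(sapply(events, function(i) {
    risk <- d$time >= d$time[i]
    beta * d$x[i] - log(sum(exp(beta * d$x[risk])))
  }))
}

beta_numeric <- optimize(log_pl, interval = c(-10, 10), d = toy,
                         maximum = TRUE)$maximum

fit <- coxph(Surv(time, status) ~ x, data = toy)
beta_coxph <- unname(coef(fit))

c(numeric = beta_numeric, coxph = beta_coxph,
  log_pl_at_mple = log_pl(beta_coxph, toy), coxph_loglik = fit$loglik[2])
```

Since the toy data contain no ties, the choice of tie-handling method does not affect the result, and the hand-computed log partial likelihood at $\hat\beta$ should match the final entry of `fit$loglik`.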


Cox Proportional Hazard

• Use exact method to account for ties

• β > 0 as expected

> fit.cox1 <- coxph(Surv(time,cens)~treat, method="exact", data=gehan)
> summary(fit.cox1)

  n= 42, number of events= 30
               coef exp(coef) se(coef)     z Pr(>|z|)
treatcontrol 1.6282    5.0949   0.4331 3.759 0.000170 ***

             exp(coef) exp(-coef) lower .95 upper .95
treatcontrol     5.095     0.1963      2.18     11.91

Rsquare= 0.321   (max possible= 0.98 )
Likelihood ratio test= 16.25  on 1 df,   p=5.544e-05
Wald test            = 14.13  on 1 df,   p=0.0001704
Score (logrank) test = 16.79  on 1 df,   p=4.169e-05


Cox Proportional Hazard

• Add pair as block

> fit.cox2 <- coxph(Surv(time,cens)~treat+factor(pair),
                    method="exact", data=gehan)
> summary(fit.cox2)
                   coef exp(coef)  se(coef)      z Pr(>|z|)
treatcontrol   3.314679 27.513571  0.742620  4.463 8.06e-06 ***
factor(pair)2 -5.015219  0.006636  1.550131 -3.235 0.001215 **
factor(pair)3 -3.598195  0.027373  1.547371 -2.325 0.020053 *
.................................................
Rsquare= 0.662   (max possible= 0.98 )
Likelihood ratio test= 45.51  on 21 df,   p=0.001484
Wald test            = 27.42  on 21 df,   p=0.1573
Score (logrank) test = 39.73  on 21 df,   p=0.008023

• The LR test compares log partial likelihoods rather than full log-likelihoods, but the test has similar properties

> anova(fit.cox1, fit.cox2, test="Chisq")
Analysis of Deviance Table
 Cox model: response is Surv(time, cens)
 Model 1: ~ treat
 Model 2: ~ treat + factor(pair)
   loglik  Chisq Df P(>|Chi|)
1 -74.543
2 -59.915 29.256 20   0.08283 .


Visualize Model Fit

• Survival curve and CI for an individual with average covariates

– K-M curves refer to unadjusted populations, Cox curves refer to an 'average' patient

– can make survival curves for the Cox model for specific values of covariates by providing newdata

> plot(survfit(fit.cox1), lty=2:3, xlab="Remission (weeks)",
       ylab="Survival", main="Gehan", cex=1.5)

[Figure: "Gehan". Survival curve from survfit(fit.cox1); x axis: Remission (weeks), 0 to 35; y axis: Survival, 0.0 to 1.0.]


Model Diagnostics

• Cox-Snell residuals are most useful for examining the overall fit of a model

$$r_{c,i} = -\log[\hat S(y_i \mid x_i)] = \hat\Lambda(y_i \mid x_i) = \hat\Lambda_0(y_i) e^{X_i'\hat\beta}$$

– $\hat\Lambda_0(t)$ is estimated by Breslow's estimator.

– If the estimates $\hat\Lambda_0(t)$ and $\hat\beta$ were accurate, $\{(r_{c,i}, \delta_i), i = 1, \cdots, n\}$ are right-censored observations of Exponential(1).

– For the right-censored data $\{(r_{c,i}, \delta_i), i = 1, \cdots, n\}$, construct the Nelson-Aalen estimator $\tilde\Lambda(t)$ and plot $\tilde\Lambda(t)$ vs. $t$ (a line through the origin with slope 1 indicates a good fit).

• Martingale residuals can be used to determine the functional form of a covariate

$$r_{m,i} = \delta_i - r_{c,i}$$

– Fit the Cox model with all covariates except the one of interest, and plot the martingale residuals against the covariate of interest.

• Deviance residuals are approximately symmetrically distributed about zero and large values may indicate outliers.

$$r_{d,i} = \mathrm{sign}(r_{m,i}) \sqrt{-2[r_{m,i} + \delta_i \log(r_{c,i})]}$$
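The three residual types are linked by the formulas above, which the following R sketch illustrates on the `gehan` data. It assumes a Cox fit with Breslow's tie handling (so the baseline hazard matches Breslow's estimator) and recovers the Cox-Snell residuals from the martingale residuals returned by `residuals()`; the names `fit`, `r_m`, `r_c` and `r_d` are illustrative.

```r
library(MASS)       # gehan data
library(survival)   # coxph and its residuals method

# Breslow tie handling is an assumption made here for consistency with
# Breslow's estimator of the baseline cumulative hazard
fit <- coxph(Surv(time, cens) ~ treat, data = gehan, ties = "breslow")

r_m <- residuals(fit, type = "martingale")   # r_m = delta - r_c
r_c <- gehan$cens - r_m                      # Cox-Snell residuals

# Deviance residuals from the formula; for censored subjects the log term
# is multiplied by delta = 0
r_d <- sign(r_m) * sqrt(-2 * (r_m + gehan$cens * log(r_c)))

# Cumulative hazard of the Cox-Snell residuals, treated as censored data;
# a good fit gives points near the line y = x
cs_fit <- survfit(Surv(r_c, gehan$cens) ~ 1)
plot(cs_fit$time, cs_fit$cumhaz, type = "s",
     xlab = "Cox-Snell residual", ylab = "Estimated cumulative hazard")
abline(0, 1, lty = 2)
```

The hand-computed `r_d` can be compared with `residuals(fit, type = "deviance")` as a consistency check.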
