Lecture 12: Cox Proportional Hazards Model Introduction.
Post on 14-Dec-2015
Cox Proportional Hazards Model
• Names
  – Cox regression
  – Semi-parametric proportional hazards
  – Proportional hazards model
  – Multiplicative hazards model
• When?
  – 1972
• Why?
  – Allows for adjustment of covariates (continuous and categorical) in a survival setting
  – Allows prediction of survival based on a set of covariates
• Analogous to linear and logistic regression in many ways
Cox PHM Notation (K & M)
• Data on n individuals:
  – Tj : time on study for individual j
  – δj : event indicator for individual j (1 if the event was observed, 0 if censored)
  – Zj : vector of covariates for individual j
• More complicated: Zj(t)
  – Covariates are time dependent
  – May change with time/age
Basic Model
$$h(t \mid Z) = h_0(t)\, c(\beta^t Z)$$

$$h(t \mid Z) = h_0(t)\, e^{\beta^t Z} = h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z_k\right)$$
Comments on the Basic Model
• h0(t):
  – Arbitrary baseline hazard
  – Notice that it varies by t
• β:
  – Regression coefficient vector
  – Interpretation is a log hazard ratio
• Semi-parametric form
  – Non-parametric baseline hazard
  – Parametric form assumed for covariate effects only
Linear Model Formulation
• Usual formulation
• Coding of covariates similar to linear and logistic (and other generalized linear models)
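The equation for the usual formulation did not survive in this transcript. A plausible reconstruction, assuming it is the log-linear form obtained by dividing the basic model by the baseline hazard and taking logs, is:

$$\ln\left[\frac{h(t \mid Z)}{h_0(t)}\right] = \beta^t Z = \sum_{k=1}^{p} \beta_k Z_k$$

The right-hand side is the same kind of linear predictor used in linear and logistic regression, which is why covariates are coded the same way. There is no separate intercept term because it is absorbed into h0(t).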
Refresher of Coding Covariates
• Should be nothing new
• Two kinds of “independent” variables
  – Quantitative
  – Qualitative
• Quantitative are continuous
  – Need to determine scale
    • Units
    • Transformations?
• Qualitative are generally categorical
  – Ordered
  – Nominal
  – Coding affects interpretation
Why Proportional
• Hazard ratio
• Does not depend on t (i.e. it is constant over time)
• The hazards are therefore proportional (they differ by a constant multiplicative factor)
• Also referred to (sometimes) as the relative risk
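To see why the ratio does not depend on t, one can take two individuals with covariate vectors Z and Z* (Z* is notation introduced here purely for illustration) and divide their hazards under the basic model:

$$\frac{h(t \mid Z)}{h(t \mid Z^*)} = \frac{h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z_k\right)}{h_0(t) \exp\left(\sum_{k=1}^{p} \beta_k Z^*_k\right)} = \exp\left[\sum_{k=1}^{p} \beta_k \left(Z_k - Z^*_k\right)\right]$$

The baseline hazard h0(t) cancels, so the ratio is a constant that depends only on the coefficients and the difference in covariates.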
Simple Example
• One covariate: z = 1 for the new treatment, z = 0 for the standard treatment
• Hazard ratio: $h(t \mid z = 1) / h(t \mid z = 0) = \exp\{\beta_{trt}\}$
• Interpretation: exp{β_trt} is the risk of having the event in the new treatment group relative to the standard treatment group
• Interpretation: At any point in time, the risk of the event in the new treatment group is exp{β_trt} times the risk in the standard treatment group
[Figure: Fig 3 from Cantù M G et al. JCO 2002;20:1232-1237. ©2002 by American Society of Clinical Oncology]
Hazard Ratios
• Assumption: “proportional hazards”
• The risk does not depend on time
• That is “the risk is constant over time”
• But that is still vague…
• Hypothetical example: Assume the hazard ratio is 0.5
  – Patients in the new therapy group are at half the risk of death of those in the standard treatment group, at any given point in time
• Hazard function = P(die at time t | survived to time t)
Hazard Ratios
• Hazard ratio = (hazard function, New) / (hazard function, Std)
• Makes assumption that this ratio is constant over time
Interpretation Again
• For any fixed point in time, individuals in the new treatment group have half the risk of death of individuals in the standard treatment group.
A Slightly More Complicated Example
• What if we had 2 binary covariates?
• How is the hazard ratio estimated in this case?
• What about the proportional hazards assumption?
A Slightly More Complicated Example
• Consider a model that includes
A Slightly More Complicated Example
• Our model looks like:
A Slightly More Complicated Example
• From this we can estimate 4 possible hazard rates
A Slightly More Complicated Example
• And if we “compare” the different hazards by taking the ratio we get
A Slightly More Complicated Example
• But what does this mean in terms of proportional hazards?
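The equations from these slides are missing from the transcript, so the following is only a sketch of how the example likely runs. It assumes two binary covariates labeled z1 and z2 (hypothetical labels, since the actual covariates are not recoverable) and no interaction term:

$$h(t \mid z_1, z_2) = h_0(t) \exp(\beta_1 z_1 + \beta_2 z_2)$$

$$\begin{aligned}
h(t \mid 0, 0) &= h_0(t) \\
h(t \mid 1, 0) &= h_0(t)\, e^{\beta_1} \\
h(t \mid 0, 1) &= h_0(t)\, e^{\beta_2} \\
h(t \mid 1, 1) &= h_0(t)\, e^{\beta_1 + \beta_2}
\end{aligned}$$

$$\frac{h(t \mid 1, 0)}{h(t \mid 0, 0)} = e^{\beta_1}, \qquad \frac{h(t \mid 0, 1)}{h(t \mid 0, 0)} = e^{\beta_2}, \qquad \frac{h(t \mid 1, 1)}{h(t \mid 0, 0)} = e^{\beta_1 + \beta_2}$$

Under this assumed model every ratio is free of t, so all four hazards are proportional to one another. The hazard ratio for z1 is also the same, exp(β1), at either level of z2.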
Hazard ratio is not always valid…
Hazard ratio = 0.71
Let’s Think About the Likelihood…
Partial Likelihood
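• The partial likelihood is defined as

$$L(\beta) = \prod_{i=1}^{D} \frac{\exp\left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right)}{\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}$$

• Where
  – j = 1, 2, …, n
  – No ties
  – t1 < t2 < … < tD (the D ordered event times)
  – Z(i)k is the kth covariate associated with the individual whose failure time is ti
  – R(ti) = Yi is the risk set at time ti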
Things to Notice
• Numerator only depends on information from a patient who experiences the event
• The denominator incorporates information across all patients in the risk set
Constructing the Likelihood
• Without Censoring…
• Say we have the following data on n = 5 subjects
  – Observed times and event indicators:
    • ti = 11, 12, 14, 16, 21
    • δi = 1, 1, 1, 1, 1
  – And a single binary covariate
    • zi = 0, 1, 0, 1, 1
Constructing the Likelihood
• First let’s construct our risk set for each unique time
Constructing the Likelihood
• Now, we can construct our likelihood…
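The worked risk sets and likelihood are missing from the transcript. The following reconstructs them from the five uncensored observations listed earlier, numbering the subjects 1 to 5 in order of their observed times (the numbering is an assumption made here for illustration):

$$R(11) = \{1,2,3,4,5\}, \quad R(12) = \{2,3,4,5\}, \quad R(14) = \{3,4,5\}, \quad R(16) = \{4,5\}, \quad R(21) = \{5\}$$

$$L(\beta) = \frac{1}{2 + 3e^{\beta}} \times \frac{e^{\beta}}{1 + 3e^{\beta}} \times \frac{1}{1 + 2e^{\beta}} \times \frac{e^{\beta}}{2e^{\beta}} \times \frac{e^{\beta}}{e^{\beta}} = \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})(1 + 2e^{\beta})}$$

Each factor has the failing subject's exp(βz) in the numerator and the sum of exp(βz) over that time's risk set in the denominator. For example, at t = 11 the failing subject has z = 0 and the risk set holds two subjects with z = 0 and three with z = 1.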
Constructing the Likelihood
• But what if we have censoring?
• Consider the revised data:
  – Observed times and event indicators:
    • ti = 11, 12, 14, 16, 21
    • δi = 1, 1, 0, 1, 0
  – And a single binary covariate
    • zi = 0, 1, 0, 1, 1
Constructing the Likelihood
• Again let’s construct our risk set for each unique time
Constructing the Likelihood
• And again we can construct our likelihood…
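The worked result is again missing from the transcript, so this is a reconstruction from the revised data. The risk sets are unchanged, because censored subjects remain at risk until their censoring time, but only the three event times (11, 12, and 16) contribute a factor:

$$L(\beta) = \frac{1}{2 + 3e^{\beta}} \times \frac{e^{\beta}}{1 + 3e^{\beta}} \times \frac{e^{\beta}}{2e^{\beta}} = \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})}$$

The subjects censored at 14 and 21 never appear in a numerator. The subject censored at 14 still appears in the denominators for t = 11 and t = 12, and the subject censored at 21 in the denominators for t = 11, 12, and 16.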
Estimation
• The log-likelihood

$$LL(\beta) = \sum_{i=1}^{D} \sum_{k=1}^{p} \beta_k Z_{(i)k} - \sum_{i=1}^{D} \ln\left[\sum_{j \in R(t_i)} \exp\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)\right]$$

• Maximize log-likelihood to solve for estimates of β
• Score equations and information matrices are found using standard approaches
• Solving for estimates can be done numerically (e.g. Newton-Raphson)
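As a rough illustration of numerical maximization, the sketch below evaluates the log partial likelihood for the censored five-subject toy data from the "Constructing the Likelihood" slides and maximizes it over a single β. It uses base R's `optimize` (a one-dimensional search, not Newton-Raphson) and assumes no tied event times. The function name `log_pl` is made up for this example.

```r
# Toy data from the "Constructing the Likelihood" slides (censored version)
time   <- c(11, 12, 14, 16, 21)
status <- c(1, 1, 0, 1, 0)   # 1 = event observed, 0 = censored
z      <- c(0, 1, 0, 1, 1)   # single binary covariate

# Log partial likelihood for one covariate, assuming no tied event times.
# log_pl is a hypothetical helper name, not part of any package.
log_pl <- function(beta, time, status, z) {
  event_idx <- which(status == 1)
  contrib <- sapply(event_idx, function(i) {
    at_risk <- time >= time[i]               # risk set at this event time
    beta * z[i] - log(sum(exp(beta * z[at_risk])))
  })
  sum(contrib)
}

# One-dimensional numerical maximization over a wide (arbitrary) interval
fit <- optimize(log_pl, interval = c(-10, 10), maximum = TRUE,
                time = time, status = status, z = z)
fit$maximum   # estimate of beta
```

For this particular data set the score equation reduces to 9e^{2β} = 2, so the numerical answer can be checked against β = ln(√2/3) ≈ -0.752. In practice `coxph` from the survival package does this fitting, including for several covariates.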
Tests of the Model
• Testing that βk = 0 for all k = 1, 2, …, p
• Three main tests
  – Chi-square/ Wald test
  – Likelihood ratio test
  – Score test
• Under the null hypothesis, all three have (in large samples) a chi-square distribution with p degrees of freedom
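For reference, the three statistics for the global null hypothesis β = 0 take the standard forms below. Here U(β) denotes the score vector (first derivatives of the log partial likelihood) and I(β) the observed information matrix; this notation is assumed here and may differ from what the lecture uses next time.

$$X^2_{W} = \hat{\beta}^t\, I(\hat{\beta})\, \hat{\beta}$$

$$X^2_{LR} = 2\left[LL(\hat{\beta}) - LL(0)\right]$$

$$X^2_{SC} = U(0)^t\, I(0)^{-1}\, U(0)$$

The score test needs only quantities evaluated at β = 0, so it does not require fitting the model. With a single binary covariate and no ties it coincides with the log-rank test.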
Example: CGD
• Study examining the impact of gamma interferon treatment on infection in people with chronic granulomatous disease
• 203 subjects
  – Main variable of interest is treatment
    • Placebo
    • Gamma interferon
  – Other variables
    • Demographics (age, height, weight)
    • Steroid use
    • Pattern of inheritance
    • Treatment center …
• Outcome: Time to first major infection
Cox PHM Approach
> library(survival)
> data(cgd)
> st<-Surv(cgd$time, cgd$infect)
> reg1<-coxph(st~cgd$treat)
> reg1
Call:
coxph(formula = st ~ cgd$treat)

                 coef exp(coef) se(coef)     z       p
cgd$treatrIFN-g -1.09     0.337    0.268 -4.06 4.9e-05

Likelihood ratio test=18.9  on 1 df, p=1.36e-05  n= 203, number of events= 76

> attributes(reg1)
$names
 [1] "coefficients" "var" "loglik" "score" "iter" "linear.predictors"
 [7] "residuals" "means" "concordance" "method" "n" "nevent"
[13] "terms" "assign" "wald.test" "y" "formula" "xlevels" "contrasts" "call"

$class
[1] "coxph"
Results
> summary(reg1)
Call:
coxph(formula = st ~ cgd$treat)

  n= 203, number of events= 76

                   coef exp(coef) se(coef)      z Pr(>|z|)
cgd$treatrIFN-g -1.0864    0.3374   0.2677 -4.059 4.93e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                exp(coef) exp(-coef) lower .95 upper .95
cgd$treatrIFN-g    0.3374      2.964    0.1997    0.5702

Likelihood ratio test= 18.92  on 1 df,   p=1.364e-05
Wald test            = 16.47  on 1 df,   p=4.933e-05
Score (logrank) test = 18.07  on 1 df,   p=2.124e-05
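The quantities in this output can be reproduced by hand from the coefficient and its standard error. The short sketch below does so in base R using the rounded values printed by `summary(reg1)`, so results agree with the output only up to rounding. The variable names are chosen here for illustration.

```r
# Values copied from the summary(reg1) output
coef_trt <- -1.0864
se_trt   <- 0.2677

hr    <- exp(coef_trt)                  # hazard ratio, rIFN-g vs. placebo
z     <- coef_trt / se_trt              # Wald z statistic
ci    <- exp(coef_trt + c(-1, 1) * qnorm(0.975) * se_trt)  # 95% CI for the HR
wald  <- z^2                            # Wald chi-square on 1 df
p_val <- pchisq(wald, df = 1, lower.tail = FALSE)

round(c(hr = hr, lower = ci[1], upper = ci[2], wald = wald), 4)
```

Read this way, the estimated hazard of a first major infection in the gamma interferon group is about one third of the hazard in the placebo group at any point in time. The confidence interval excludes 1.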
Fitting More Covariates in R
> reg2<-coxph(st~treat+steroids+inherit+hos.cat+sex+age+weight, data=cgd)
> reg2
Call:
coxph(formula = st ~ treat + steroids + inherit + hos.cat + sex + age + weight, data = cgd)

                           coef exp(coef) se(coef)      z       p
treatrIFN-g             -1.2025     0.300   0.2828 -4.253 2.1e-05
steroids                 1.7743     5.896   0.5852  3.032 2.4e-03
inheritautosomal         0.6169     1.853   0.2824  2.184 2.9e-02
hos.catUS:other          0.0589     1.061   0.3208  0.184 8.5e-01
hos.catEurope:Amsterdam -0.5687     0.566   0.4432 -1.283 2.0e-01
hos.catEurope:other     -0.6232     0.536   0.4956 -1.257 2.1e-01
sexfemale               -0.6193     0.538   0.3872 -1.600 1.1e-01
age                     -0.0861     0.917   0.0336 -2.566 1.0e-02
weight                   0.0235     1.024   0.0127  1.858 6.3e-02

Likelihood ratio test=41.2  on 9 df, p=4.65e-06  n= 203, number of events= 76
Next Time
More on constructing our hypothesis tests next time…