Lecture 17: Regression Diagnostics II

Residuals

• Residuals are used to investigate the lack of fit of a model to a given subject

• For Cox regression, there’s no easy analog to the usual “observed minus predicted” residuals from linear regression

Using Residuals for Diagnostics

• Several types
– Cox-Snell residuals
– Martingale residuals
– Deviance residuals (similar to Martingale)
– Schoenfeld residuals

• What are residuals?
– Linear regression: $r_i = y_i - \hat{y}_i$
– What is the interpretation of a residual in Cox regression?
– Not necessarily the same

Cox-Snell “residuals”

$$r_j = \hat{H}_0(T_j)\,\exp\Big(\sum_{k=1}^{p} Z_{jk} b_k\Big)$$

• Where
– H0(Tj) = baseline cumulative hazard at time Tj
– Zjk is the kth covariate value for the jth person
– bk is the coefficient estimate for the kth covariate

• Interpretation
– CS residuals are the expected number of events for each observation
– Think of them as expected counts (not residuals)

• Theory
– If the model fits, the r's should look like a censored sample from a unit exponential distribution (i.e. λ = 1)
– That is, deviations from expected should be small

Why? Some Theory Behind This…

• Assume X has survival distribution $S_X(x)$
– $S_X(x) = P(X > x)$
– $H_X(x) = -\log S_X(x)$

• Then $S_X(x) = \exp\{-H_X(x)\}$

• Define $Y = H_X(X)$ (Y is the cumulative hazard evaluated at X)
• Then

$$S_Y(y) = P(Y > y) = P(H_X(X) > y) = P\big(X > H_X^{-1}(y)\big) = S_X\big(H_X^{-1}(y)\big) = \exp\{-H_X(H_X^{-1}(y))\} = e^{-y}$$

• Thus Y ~ exp(1) regardless of the distribution of X
• And so $H_X(X)$ should look ~ exp(1)
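The claim that the cumulative hazard evaluated at X is unit exponential can be checked by simulation. The sketch below is only an illustration: the Weibull distribution, its parameters, and the sample size are arbitrary assumptions, not part of the lecture.

```r
# Simulate event times from a Weibull distribution (an arbitrary choice).
set.seed(17)
shape <- 1.5
scale <- 2
x <- rweibull(5000, shape = shape, scale = scale)

# Cumulative hazard of this Weibull: H(x) = (x / scale)^shape
y <- (x / scale)^shape

# If Y ~ exp(1), its mean and variance are both 1 and P(Y > 1) = exp(-1).
mean.y <- mean(y)
var.y <- var(y)
p.gt1 <- mean(y > 1)
print(c(mean = mean.y, var = var.y, p.gt1 = p.gt1, target = exp(-1)))
```

Changing `shape` and `scale` should leave the three summaries essentially unchanged, which is the point of the result.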

Cox-Snell Residuals

• How do we use these residuals in linear regression?

• Assess:
– Model fit
– Model assumptions
– Shape of covariates

• What should we compare these Cox-Snell residuals to?

Empirical vs. Fitted

• rj = Cox-Snell residuals
• CS residuals are always >0
• Hcs(t) = Nelson-Aalen H(t) = empirical cumulative hazard
– Estimate by fitting a Cox model with the residuals as the time variable and dj as the event indicator

• If the model fits, i.e. the assumptions hold and the covariates are appropriately modeled, then:
– Plot of rj’s vs. Hcs(t) should be…

Implementation

• Get Cox-Snell residuals
– Get linear predictor(s) Zb
– Get baseline cumulative hazard
– Multiply the baseline cumulative hazard by exp(Zb)
– OR… get them from R

• Get cumulative hazard estimates:
– Estimate SNA(t) using KM approach with event time = rj and event indicator = dj
– Transform to Hcs(t) scale
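Both routes to the Cox-Snell residuals can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `lung` data shipped with the survival package stands in for the lecture's BMT data, the covariates are chosen arbitrarily, and Breslow ties are used so that the two routes agree.

```r
library(survival)

# Assumption: lung data as a stand-in; complete cases only.
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.karno")])
d$event <- as.integer(d$status == 2)   # lung codes death as status == 2

fit <- coxph(Surv(time, event) ~ age + sex + ph.karno, data = d,
             ties = "breslow")

# Route 1: baseline cumulative hazard times exp(linear predictor)
lp <- drop(model.matrix(fit) %*% coef(fit))   # uncentered Zb
bh <- basehaz(fit, centered = FALSE)          # H0(t) at Z = 0
H0 <- bh$hazard[match(d$time, bh$time)]       # H0 at each subject's time
cs.manual <- H0 * exp(lp)

# Route 2: event indicator minus martingale residual
cs.resid <- d$event - resid(fit, type = "martingale")

# Nelson-Aalen cumulative hazard with the residuals as the "time" variable
na.fit <- survfit(Surv(cs.resid, d$event) ~ 1)
H.cs <- cumsum(na.fit$n.event / na.fit$n.risk)
plot(na.fit$time, H.cs, type = "s", xlab = "Cox-Snell Residuals",
     ylab = "Estimated Cumulative Hazard Function", lwd = 2)
abline(0, 1, col = 2, lwd = 2)                # reference line, slope 1
```

With a tie-handling method other than Breslow, the two routes can differ slightly at tied event times.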

Getting Residuals

#fit regression
reg<-coxph(st~dx+fab+ttrans+mtx+dnage+ptage+dnage*ptage, method="breslow")

#get cox-snell residuals
par(mfrow=c(1,2))
mart.res<-resid(reg, type="martingale") #or just use mart.res<-reg$resid
cs.res<-event-mart.res

#Plot of residuals vs. cum hazard
fitres<-survfit(coxph(Surv(cs.res,bmt$Either)~1,method="breslow"),type="aalen")
plot(fitres$time,-log(fitres$surv),type="s",xlab="Cox-Snell Residuals",
     ylab="Estimated Cumulative Hazard Function", lwd=2)
abline(0,1,col=2, lwd=2)

## Alternatively, use
reg2<-survfit(Surv(cs.res,bmt$Either)~1)
Htilde<-cumsum(reg2$n.event/reg2$n.risk)
plot(reg2$time,Htilde,type="s",xlab="Cox-Snell Residuals",
     ylab="Estimated Cumulative Hazard Function", lwd=2, col=4)
abline(0,1,col=2,lwd=2)

Diagnostic Plots

[Figure: two panels, each showing the estimated cumulative hazard function against the Cox-Snell residuals (x-axis from 0 to 3)]

What about MTX?

• Recall MTX did not meet the proportional hazards assumption

• Let’s look at each MTX group to see how this affects fit

• Also look at MTX stratified model

R Code: MTX Groups

st0<-Surv(dfs0, event0)
reg0<-coxph(st0~dx0+fab0+ttrans0+dnage0+ptage0+dnage0*ptage0, method="breslow")

st1<-Surv(dfs1, event1)
reg1<-coxph(st1~dx1+fab1+ttrans1+dnage1+ptage1+dnage1*ptage1, method="breslow")

cs.res0<-event0-resid(reg0, type="martingale")
cs.res1<-event1-resid(reg1, type="martingale")

fitres0<-survfit(coxph(Surv(cs.res0,event0)~1,method="breslow"),type="aalen")
fitres1<-survfit(coxph(Surv(cs.res1,event1)~1,method="breslow"),type="aalen")

plot(fitres$time,-log(fitres$surv),type="s",xlab="Cox-Snell Residuals",
     ylab="Estimated Cumulative Hazard Function", lwd=2, ylim=c(0, 3))

lines(fitres0$time,-log(fitres0$surv),type="s", col=3, lwd=2, lty=2)
lines(fitres1$time,-log(fitres1$surv),type="s", col=4, lwd=2, lty=2)
abline(0,1,col=2, lwd=2)
#legend colours match the curves: full=1, MTX=0 is 3, MTX=1 is 4
legend(2, .5, c("Full","MTX=0","MTX=1"), col=c(1,3,4), lty=c(1,2,2), lwd=2, bty="n")

R Code: MTX Stratified

st<-Surv(dfs, event)
reg.strat<-coxph(st~dx+fab+ttrans+dnage+ptage+dnage*ptage+strata(mtx), method="breslow")

cs.strat<-event-resid(reg.strat, type="martingale")

fitres.strat<-survfit(coxph(Surv(cs.strat,event)~1,method="breslow"),type="aalen")

plot(fitres$time,-log(fitres$surv), type="s", xlab="Cox-Snell Residuals",
     ylab="Estimated Cumulative Hazard Function", lwd=2, ylim=c(0, 3))

lines(fitres.strat$time, -log(fitres.strat$surv), type="s", col=5, lwd=2)
abline(0,1,col=2, lwd=2)
legend(2, .5, c("Full","MTX stratified"), col=c(1,5), lwd=2, bty="n")

R: MTX Groups

[Figure: estimated cumulative hazard vs. Cox-Snell residuals for the full model, MTX=0, and MTX=1]

MTX Stratified Model

[Figure: estimated cumulative hazard vs. Cox-Snell residuals for the full model and the MTX stratified model]

Alternative Plots for CS Residuals

• There are alternative plots you can consider but they tend to require larger N

• We can plot:– CS residuals vs. Exp(1)– CS residuals vs. CS-NA cumulative hazard estimate

• Let’s consider a larger dataset…

Large Data Set

• Study examining factors that impact the time until first-time mothers weaned their infants

• Data includes information on 927 mothers

• Variables in the data include
– Race (white, black, other)
– Mother in poverty
– Smoking at childbirth
– Alcohol use at childbirth
– Age of the mother
– Years of school
– Prenatal care after the 3rd month

Model

> data(bfeed)
> mod<-coxph(Surv(duration, delta)~factor(race)+poverty+smoke+alcohol+agemth+yschool+pc3mth, data=bfeed)
> mod
Call:
coxph(formula = Surv(duration, delta) ~ factor(race) + poverty + smoke + yschool + pc3mth, data = bfeed)

                coef exp(coef) se(coef)     z      p
factor(race)2  0.177     1.193   0.1040  1.70 0.0890
factor(race)3  0.306     1.358   0.0970  3.16 0.0016
poverty       -0.222     0.801   0.0929 -2.39 0.0170
smoke          0.249     1.283   0.0791  3.14 0.0017
alcohol        0.170     1.185   0.1227  1.38 0.1700
yschool       -0.040     0.961   0.0192 -2.08 0.0370

Likelihood ratio test=29.7 on 6 df, p=4.51e-05  n= 927, number of events= 892

CS As Step Function vs. 45° Line

###Obtaining CS residuals
cs.resid<-bfeed$delta-mod$resid

### Fitting Cox model for residuals
### Compare to 45° line
fitres<-survfit(coxph(Surv(cs.resid,bfeed$delta)~ 1,method="breslow"),type="aalen")
plot(fitres$time, -log(fitres$surv), type="s",xlab="Cox-Snell Residuals",
     ylab="Estimated Cumulative Hazard Function", lwd=2, ylim=c(0, 8))
abline(0,1,col=2, lwd=2)

Comparison to Exp(1)

###Obtaining CS residuals
cs.resid<-bfeed$delta-mod$resid

### Also compare to exponential
efit<-survfit(Surv(duration, delta)~1, data=bfeed)
exp1<-rexp(10000, 1)
plot(density(exp1), lwd=2, col=1)
lines(density(cs.resid), col=2, lwd=2, lty=2)

Comparing CS and NA

###Comparing NA to CS
NAe<- -log(efit$surv)
cs<-cbind(cs.resid, bfeed$duration)
na<-cbind(NAe, efit$time)
all<-merge(cs,na, by=2, all=TRUE)
cs<-all$cs.resid[-927]
na<-all$NAe[-927]
plot(cs, na, pch=16, xlab="Cox-Snell", ylab="Cox-Snell - Nelson-Aalen")
abline(0,1, col=2, lwd=2)
fit1<-lm(na~cs)
abline(fit1, lwd=2, col=3)

Problem with Cox-Snell Approach

• CS residuals can diagnose that the model does not fit

• But they don’t help figure out why or where
• Note, the overall pattern can be helpful (e.g. CS > NA or vice versa)
• Martingale residuals are better

Martingale Residuals

$$\hat{M}_j = \delta_j - \hat{H}_0(T_j)\,\exp\Big(\sum_{k=1}^{p} Z_{jk} b_k\Big)$$

where $\delta_j$ is the event indicator for the jth subject

• When the model is correct, E(Mj) = 0
• Range between -∞ and 1
• Difference over time between observed and expected number of events
– Mj tends to be negative if the estimated cumulative hazard is too large
– Mj tends to be positive if the estimated cumulative hazard is too small

Martingale Residuals

$$\hat{M}_j = \delta_j - \hat{r}_j$$

• Average martingale residual can be computed for different values of a covariate
– Or range of covariate values
– Determines if the Mj's tend to be positive or negative in the range
• Helps to find improper specification of the effect of a covariate on the hazard

Use of Martingale Residuals

• To examine the best functional form of a given covariate, Z1
• Approach:
– Assume the optimal model is:

$$H(t \mid Z^*, Z_1) = H_0(t)\,\exp(\beta^* Z^*)\,\exp\{f(Z_1)\}$$

– Fit the model with only Z* (the covariates other than Z1)
– Save Martingale residuals from the Z* model
– Plot Martingale residuals versus Z1
– Use a smoother to help find the best transformation
• Only works on continuous or ordinal variables!
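The approach above can be tried on simulated data where the true functional form is known. Everything in this sketch is a hypothetical setup rather than the lecture's data: `z.star` plays the role of Z*, `z1` is the covariate of unknown form, and the true effect is deliberately chosen to be log(z1).

```r
library(survival)

set.seed(722)
n <- 500
z.star <- rbinom(n, 1, 0.5)     # covariate already in the model
z1 <- runif(n, 0.5, 20)         # covariate whose functional form is unknown

# Hypothetical truth: the hazard depends on log(z1), not on z1 itself
t.event <- rexp(n, rate = 0.1 * exp(0.5 * z.star + 1.0 * log(z1)))
t.cens <- rexp(n, rate = 0.05)
time <- pmin(t.event, t.cens)
status <- as.integer(t.event <= t.cens)

# Fit the model with only Z*, then save its martingale residuals
fit.star <- coxph(Surv(time, status) ~ z.star)
mres <- resid(fit.star, type = "martingale")

# Plot the residuals against z1 and add a smoother
plot(z1, mres, xlab = "z1", ylab = "Martingale Residuals", pch = 16)
sm <- lowess(z1, mres)
lines(sm, col = 2, lwd = 2)
```

The smoothed curve should rise steeply at small `z1` and flatten out, which is what suggests a log transformation in this constructed example.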

Example

###Martingale Residuals- NHL vs HOD in BMT
bmt2<-read.csv("H:/BMTRY_722_Summer2013/BMT2.csv")

#model without waiting time; residuals are plotted against wait
fit1<-coxph(Surv(time, cens)~factor(dis)+factor(graft)+karn+factor(dis)*factor(graft), data=bmt2)
res1<-resid(fit1, type="martingale")

#model without Karnofsky score; residuals are plotted against karn
fit2<-coxph(Surv(time, cens)~factor(dis)+factor(graft)+wait+factor(dis)*factor(graft), data=bmt2)
res2<-resid(fit2, type="martingale")

par(mfrow=c(1,2))
plot(bmt2$wait, res1, xlab="Waiting Time (months)", ylab="Martingale Residuals", pch=16)
lines(lowess(bmt2$wait, res1),col=2, lwd=2)
lines(bmt2$wait, fitted(lm(res1~bmt2$wait)), col=4, lwd=2)

plot(bmt2$karn, res2, xlab="Karnofsky Score", ylab="Martingale Residuals", pch=16)
lines(lowess(bmt2$karn, res2),col=2, lwd=2)
lines(bmt2$karn, fitted(lm(res2~bmt2$karn)), col=4, lwd=2)

Martingale Plots

Model with “Inappropriate” Waiting Time

> fit<-coxph(Surv(time, cens)~wait+factor(dis)+factor(graft)+karn+factor(dis)*factor(graft), data=bmt2)

> summary(fit)
Call:
coxph(formula = Surv(time, cens) ~ wait + factor(dis) + factor(graft) + karn + factor(dis) * factor(graft), data = bmt2)

  n= 43, number of events= 26

                   coef exp(coef) se(coef)      z Pr(>|z|)
wait           -0.01656   0.98357  0.01021 -1.623  0.10461
factor(dis)2    2.76033  15.8050   0.94738  2.914  0.00357 **
factor(graft)2  0.63940   1.89534  0.59372  1.077  0.28151
karn           -0.04948   0.95172  0.01242 -3.984 6.77e-05 ***
dis*graft      -2.37093   0.09339  1.03548 -2.290  0.02204 *

How Do We Find Best Cutpoint?

• Want cutoff that gives largest difference between individuals in the two data-defined groups

• Clinically chosen value (i.e. what do clinicians find meaningful?)

• Choose based on data (often good choice)– Recall Contal and O’Quigley

• Just keep in mind this may bias the model towards inclusion of covariate

Outcome-Oriented Choice

• Contal and O’Quigley steps
1. Identify possible unique cut points
2. Construct dichotomized predictor for all cut points
3. Conduct log-rank test for each dichotomized version of the variable
4. Choose cutoff based on largest log-rank statistic

• Based on this procedure, waiting time of 84 months is “best” cut point
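The four steps can be sketched as a simple search loop. This is a rough illustration, not the exact Contal and O’Quigley procedure: it maximizes the ordinary log-rank chi-square from `survdiff()` and does not compute their adjusted p-value. The `lung` data, the choice of `age`, and the 10%/90% trimming of candidate cut points are all assumptions made for the example.

```r
library(survival)

# Assumption: lung data with age as the covariate to dichotomize.
d <- na.omit(lung[, c("time", "status", "age")])
d$event <- as.integer(d$status == 2)

# 1. Candidate cut points: unique values, trimmed so both groups are non-empty
cuts <- sort(unique(d$age))
cuts <- cuts[cuts > quantile(d$age, 0.10) & cuts <= quantile(d$age, 0.90)]

# 2-3. Dichotomize at each cut point and run a log-rank test
lr <- sapply(cuts, function(cc) {
  grp <- as.integer(d$age >= cc)
  survdiff(Surv(time, event) ~ grp, data = d)$chisq
})

# 4. Choose the cut point with the largest log-rank statistic
best.cut <- cuts[which.max(lr)]
plot(cuts, lr, type = "b", xlab = "Cut point (age)",
     ylab = "Log-rank chi-square")
abline(v = best.cut, col = 2, lty = 2)
```

Because the cut point is chosen to maximize the statistic, the unadjusted p-value at `best.cut` is too optimistic.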

Model with Dichotomized Waiting Time

> #Model with dichotomized waiting time
> iwait<-ifelse(84<=bmt2$wait, 1, 0)
> fit.d<-coxph(Surv(time, cens)~iwait+factor(dis)+factor(graft)+karn+
+ factor(dis)*factor(graft), data=bmt2)
> summary(fit.d)
Call:
coxph(formula = Surv(time, cens) ~ iwait + factor(dis) + factor(graft) + karn + factor(dis) * factor(graft), data = bmt2)

                   coef exp(coef) se(coef)      z Pr(>|z|)
iwait          -2.05982   0.12748  1.05066 -1.961   0.0499 *
factor(dis)2    2.32733  10.25050  0.73324  3.174   0.0015 **
factor(graft)2  0.66509   1.94467  0.59432  1.119   0.2631
karn           -0.05504   0.94644  0.01234 -4.459 8.23e-06 ***
dis:graft      -2.06684   0.12658  0.92577 -2.233   0.0256 *

*NOTE: we cannot use the p-value for our waiting time indicator. We must adjust for multiple comparisons because we consider MANY cut points for waiting time (pg. 273 in text).
– Here the adjusted p-value is 0.679

What About Other Transformations

• Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver (1974 to 1984)

• 424 PBC patients
• 312 randomized to placebo or D-penicillamine
• Clinical, biochemical, and histologic measures also collected
• Goal:
– develop natural history model (ignoring treatment) to determine how baseline status impacts survival

PBC Survival

PBC Example

• Covariates of interest
– Age
– Albumin
– Prothrombin time (i.e. clotting time)
– Presence of edema
– Serum bilirubin (mg/dL)

• Edema is a factor variable and is used “as is”
• Age appeared to be okay as well
• What about the other 3?

Where to Start

1. Fit a model with age and edema
2. Get the martingale residuals from this fit
3. Plot the martingale residuals
– vs. albumin
– vs. bilirubin
– vs. prothrombin time
4. Check possible transformations where necessary

Albumin

Bilirubin

The Problem Child… Clotting Time

• Log transformation is a good first guess but it doesn’t always work

• Deviations in the plot don’t necessarily lead us easily to the best functional form

• There are many we can try
– Z×log(Z)
– exp{Z}
– Power transformations (think Box-Cox)

• So let’s “explore” a little

Try Z×ln(Z) and exp{Z}

Power Transformations?

The Point?

• Sometimes it is difficult to find a good transformation

• Choose among the set of possibilities
– Is one transformation more interpretable?
– Does a particular transformation make clinical sense?
• Add log(bilirubin) and log(albumin) to the model with age and edema to see if this helps

Model and Residuals

> ### Model including bilirubin and albumin
> fit<-coxph(Surv(time, status)~age + factor(edema)+log(bili)+log(albumin))
> summary(fit)
Call:
coxph(formula = Surv(time, status) ~ age + factor(edema) + log(bili) + log(albumin))

                       coef exp(coef)  se(coef)      z Pr(>|z|)
age                0.036521  1.037196  0.007777  4.696 2.65e-06 ***
factor(edema)0.5   0.282970  1.327065  0.225888  1.253 0.210315
factor(edema)1     1.051715  2.862556  0.287048  3.664 0.000248 ***
log(bili)          0.928080  2.529648  0.083925 11.058  < 2e-16 ***
log(albumin)      -2.716231  0.066123  0.648245 -4.190 2.79e-05 ***

> res1<-resid(fit, type="martingale")

Looking Again at Clotting Time

### Model with JUST age and edema
fit<-coxph(Surv(time, status)~age + factor(edema))
res1<-resid(fit, type="martingale") #residuals from the age + edema model
plot(protime, res1, xlab="Clotting Time", ylab="Martingale Residuals", pch=16, main="Model w/ Age & Edema")
lines(lowess(protime, res1), col=2, lwd=2)
lines(protime, fitted(lm(res1~protime)), col=4, lwd=4)

### Model including bilirubin and albumin
fit<-coxph(Surv(time, status)~age + factor(edema)+log(bili)+log(albumin))
res2<-resid(fit, type="martingale")
plot(protime, res2, xlab="Clotting Time", ylab="Martingale Residuals", pch=16, main="Model with 4 covariates")
lines(lowess(protime, res2), col=2, lwd=2)
lines(protime, fitted(lm(res2~protime)), col=4, lwd=4)

Compare Residual Plots

Transformations in the Models?

What to Conclude?

• Transformations better but still not great

> pt1.5<- (protime)^(-1.5)
> fit<-coxph(Surv(time, status)~age + factor(edema)+log(bili)+log(albumin)+pt1.5)
> fit
Call:
coxph(formula = Surv(time, status) ~ age + factor(edema) + log(bili) + log(albumin) + pt1.5)

                     coef exp(coef) se(coef)     z       p
age                0.0237  1.02e+00  0.00723  3.28 1.1e-03
factor(edema)0.5   0.2821  1.33e+00  0.21378  1.32 1.9e-01
factor(edema)1     0.9004  2.46e+00  0.28150  3.20 1.4e-03
log(bili)          0.8360  2.31e+00  0.07702 10.86 0.0e+00
log(albumin)      -2.633   7.19e-02  0.60271 -4.37 1.3e-05
pt1.5            -46.76    4.90e-21 20.35634 -2.30 2.2e-02

Likelihood ratio test=229 on 6 df, p=0  n= 418, number of events= 186

Concerns with Martingale Residuals?

• One problem with Martingale residuals… they tend to be asymmetric

• Range from -∞ to 1
• These are therefore best used to assess covariate form, NOT general goodness of fit.
• Also note, there is susceptibility to overfitting when playing around with functional form

Outliers

• Defined in survival as
– an unusual observed failure time given the covariate value Zj

• Martingale residuals do measure the degree to which the jth subject is an outlier

• BUT as we mentioned the distribution is heavily skewed

• Makes it hard to identify outliers

Deviance Residuals

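$$\hat{D}_j = \operatorname{sign}(\hat{M}_j)\,\Big[-2\big\{\hat{M}_j + \delta_j \log(\delta_j - \hat{M}_j)\big\}\Big]^{1/2}$$

• Deviance residuals are a transformation of Martingale residuals
• Better behaved than Martingale residuals
– More like ~N(0,1)
• Helpful for determining outliers
• Negative for survival times that are longer than expected (the estimated cumulative hazard is too large, so the Martingale residual is negative)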
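The deviance residual, D = sign(M) × sqrt(-2 × [M + δ log(δ - M)]), can be computed by hand from the martingale residuals and compared with what R returns. The data set and covariates below are assumptions chosen only so the sketch runs on data shipped with the survival package.

```r
library(survival)

# Assumption: lung data as a stand-in example.
d <- na.omit(lung[, c("time", "status", "age", "sex")])
d$event <- as.integer(d$status == 2)

fit <- coxph(Surv(time, event) ~ age + sex, data = d)
M <- resid(fit, type = "martingale")
delta <- d$event

# For censored subjects (delta = 0) the log term is defined to be 0
log.term <- ifelse(delta == 0, 0, delta * log(delta - M))
D.manual <- sign(M) * sqrt(-2 * (M + log.term))

# Residuals as returned by the survival package
D.auto <- resid(fit, type = "deviance")
print(summary(D.manual - D.auto))
```

The two versions should agree to numerical precision, and the sign of each deviance residual matches the sign of its martingale residual.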

Deviance vs. Martingale Residuals

• Deviance residuals have shorter left and longer right tails

• Distribution more closely resembles ~N(0,1)
• Because deviance residuals ~N, we can think of outliers as values outside the range
– (-3, 3)
– More conservative? (-2.5, 2.5)

Compare to ~N(0, 1)

###### DEVIANCE RESIDUALS ######
fit2<-coxph(Surv(time, delta) ~ factor(dtype) + factor(gtype) + score + wtime +
            factor(dtype)*factor(gtype), data=hodg)

#Comparing Density of Martingale and deviance residuals to ~N(0, 1)
par(mfrow=c(1,2))
mart.res<-resid(fit2, type="martingale")
plot(density(mart.res), main="Martingale Residuals", lwd=2)
lines(seq(-3,3,0.1), dnorm(seq(-3,3,0.1)), col=2, lwd=2)

dev.res<-resid(fit2, type="deviance")
plot(density(dev.res), main="Deviance Residuals", lwd=2, ylim=c(0, 0.4))
lines(seq(-3,3,0.1), dnorm(seq(-3,3,0.1)), col=2, lwd=2)

Compare to ~N(0, 1)

Martingale vs. Deviance Residuals

#### Compare deviance to martingale residuals
par(mfrow=c(2,2))
fit1<-coxph(Surv(time, delta)~factor(dtype)+factor(gtype)+score+factor(dtype)*factor(gtype), data=hodg)
mart.res<-resid(fit1,type="martingale")
plot(hodg$wtime, mart.res, xlab="Time to Transplant (months)", ylab="Martingale Residuals", pch=16)
lines(lowess(hodg$wtime, mart.res),col=2, lwd=2)
lines(hodg$wtime, fitted(lm(mart.res~hodg$wtime)), col=4, lwd=2)

dev.res<-resid(fit1,type="deviance")
plot(hodg$wtime, dev.res, xlab="Time to Transplant (months)", ylab="Deviance Residuals", pch=16)
lines(lowess(hodg$wtime, dev.res),col=2, lwd=2)
lines(hodg$wtime, fitted(lm(dev.res~hodg$wtime)), col=4, lwd=2)

fit2<-coxph(Surv(time, delta)~factor(dtype)+factor(gtype)+wtime+factor(dtype)*factor(gtype), data=hodg)
mart.res<-resid(fit2,type="martingale")
plot(hodg$score, mart.res, xlab="Karnofsky Score", ylab="Martingale Residuals", pch=16)
lines(lowess(hodg$score, mart.res),col=2, lwd=2)
lines(hodg$score, fitted(lm(mart.res~hodg$score)), col=4, lwd=2)

dev.res<-resid(fit2,type="deviance")
plot(hodg$score, dev.res, xlab="Karnofsky Score", ylab="Deviance Residuals", pch=16)
lines(lowess(hodg$score, dev.res),col=2, lwd=2)
lines(hodg$score, fitted(lm(dev.res~hodg$score)), col=4, lwd=2)

Back to Outliers

• To use our deviance residuals to identify potential outliers
– Plot Dj versus the risk score, $\sum_k b_k Z_{jk}$

• Again, flag anything outside of (-3, 3), or (-2.5, 2.5) to be more conservative

R Code

> fit<-coxph(Surv(time, status) ~ age + factor(edema) + log(bili) + log(albumin))
> fit
Call:
coxph(formula = Surv(time, status) ~ age + factor(edema) + log(bili) + log(albumin))

                     coef exp(coef) se(coef)     z       p
age                0.0244    1.0247  0.00731  3.34 8.3e-04
factor(edema)0.5   0.3350    1.3980  0.21269  1.58 1.2e-01
factor(edema)1     1.0680    2.9096  0.27476  3.89 1.0e-04
log(bili)          0.8571    2.3563  0.07629 11.23 0.0e+00
log(albumin)      -2.6560    0.0702  0.60476 -4.39 1.1e-05

Likelihood ratio test=224 on 5 df, p=0  n= 418, number of events= 186

> dev.res<-resid(fit, type="deviance")
> lp<-predict(fit, type="lp")
> plot(lp, dev.res, xlab="Risk Score", ylab="Deviance Residual", pch=16)
> abline(h=c(-2.5, 2.5), col="red", lwd=2)
> abline(h=c(-3, 3), col=4, lwd=2)

Outlier Plot

Investigating Outliers

> summary(dev.res)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -2.648  -0.737  -0.438  -0.024   0.915   2.997
> summary(cbind(time, status, age, log(albumin), log(bili), edema))
      time           status            age         log(albumin)
 Min.   :  41   Min.   :0.000   Min.   :26.28   Min.   :0.6729
 1st Qu.:1093   1st Qu.:0.000   1st Qu.:42.83   1st Qu.:1.1763
 Median :1730   Median :0.000   Median :51.00   Median :1.2613
 Mean   :1918   Mean   :0.445   Mean   :50.74   Mean   :1.2442
 3rd Qu.:2614   3rd Qu.:1.000   3rd Qu.:58.24   3rd Qu.:1.3271
 Max.   :4795   Max.   :1.000   Max.   :78.44   Max.   :1.5347
   log(bili)           edema
 Min.   :-1.2040   Min.   :1.000
 1st Qu.:-0.2231   1st Qu.:1.000
 Median : 0.337    Median :1.000
 Mean   : 0.5715   Mean   :1.201
 3rd Qu.: 1.2238   3rd Qu.:1.000
 Max.   : 3.3322   Max.   :3.000

Investigating Outliers

> fit
                     coef exp(coef) se(coef)     z       p
age                0.0244    1.0247  0.00731  3.34 8.3e-04
factor(edema)0.5   0.3350    1.3980  0.21269  1.58 1.2e-01
factor(edema)1     1.0680    2.9096  0.27476  3.89 1.0e-04
log(bili)          0.8571    2.3563  0.07629 11.23 0.0e+00
log(albumin)      -2.6560    0.0702  0.60476 -4.39 1.1e-05

> cbind(dev.res, cbind(time, status, age, log(albumin), log(bili), edema))[abs(dev.res) >= 2.5,]
    dev.res time status   age log(alb) log(bili) edema
87   2.9905  198      1 37.28    1.482    0.0953   0
119  2.5753  515      1 54.26    1.343   -0.5108   0
293 -2.6483 1320      0 57.20    1.092    2.1401   1
319  2.9967   41      1 46.00    1.075    1.6094   0.5
331  2.6294   94      1 46.00    1.270    1.1314   1
393  2.5264  193      1 52.00    1.085   -0.3567   1

Caveat with Deviance Residuals

• As we’ve seen, deviance residuals can be helpful for identifying outliers

• However, given that we are assuming a normal approximation for our residuals, we need to think about sample size

• In data with a large number of censored observations (>25%), deviance residuals will tend to be too large.

Influence

• Consider only fixed-time covariates
• High leverage
– An unusual observation with respect to the covariate vector Zi
• High influence
– An observation for which the combination of the degree to which it is an outlier and its leverage gives it a strong influence on the estimates of b

Delta-Betas

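• Let $\hat{\beta}$ be the estimate of β from all the data
• Let $\hat{\beta}_{(i)}$ be the estimate of β from the data with the ith subject removed
• Then the delta-beta is

$$\Delta\hat{\beta}_i = \hat{\beta} - \hat{\beta}_{(i)}$$

• This is a measure of the influence of the ith subject on the estimate of β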

Delta-Betas

• However, this is computationally intensive
– Fit model n times

• There is an approximation that uses score residuals and the estimated variance-covariance matrix to calculate the delta-betas

• Each subject has one delta-beta for each covariate in the model
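To see how close the approximation is, the approximate delta-betas can be compared with exact leave-one-out refits on a small data set. The sketch below assumes the `veteran` data from the survival package and an arbitrary two-covariate model; neither comes from the lecture.

```r
library(survival)

# Assumption: veteran data (n = 137) as a small stand-in example.
fit <- coxph(Surv(time, status) ~ karno + age, data = veteran)

# Approximate delta-betas: one row per subject, one column per coefficient
approx.db <- residuals(fit, type = "dfbeta")

# Exact delta-betas: refit the model n times, dropping one subject each time
exact.db <- t(sapply(seq_len(nrow(veteran)), function(i) {
  coef(fit) - coef(update(fit, data = veteran[-i, ]))
}))

# Compare the two for the first coefficient (karno)
plot(exact.db[, 1], approx.db[, 1], pch = 16,
     xlab = "Exact delta-beta (karno)",
     ylab = "Approximate delta-beta (karno)")
abline(0, 1, col = 2, lwd = 2)
```

With only 137 subjects the refits are quick; the approximation matters for large data sets, where n refits become expensive.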

Assessing Influence

> ### A look at delta-betas for influential points
> fit<-coxph(Surv(time, status)~age + factor(edema)+log(bili)+log(albumin))
> dfbeta<-residuals(fit, type="dfbeta")
> colnames(dfbeta)<-names(fit$coef)
> head(round(dfbeta, 5))
       age edema=0.5  edema=1 log(bili) log(albumin)
1 -0.00011   0.00234 -0.01934  -0.00102      0.01364
2 -0.00026   0.00194 -0.00243   0.00078     -0.03575
3  0.00061   0.02693  0.01020  -0.00350      0.03205
4  0.00011  -0.00432  0.00191   0.00309      0.00215
5 -0.00051  -0.00259 -0.00101   0.00133      0.00124
6  0.00054  -0.00371  0.00224  -0.00174      0.02208

Assessing Influence> summary(dfbeta)

      age             factor(edema)0.5     factor(edema)1
 Min.   :-0.0038764   Min.   :-0.0961947   Min.   :-2.021e-01
 1st Qu.:-0.000104    1st Qu.:-0.0028430   1st Qu.:-1.495e-03
 Median : 0.0000331   Median : 0.0002851   Median : 4.948e-05
 Mean   : 0.0000000   Mean   : 0.0000000   Mean   : 0.000e+00
 3rd Qu.: 0.0001296   3rd Qu.: 0.0016899   3rd Qu.: 1.910e-03
 Max.   : 0.0013937   Max.   : 0.0442688   Max.   : 5.544e-02
   log(bili)            log(albumin)
 Min.   :-0.0296886   Min.   :-0.114491
 1st Qu.:-0.0002346   1st Qu.:-0.009506
 Median : 0.0008752   Median :-0.001562
 Mean   : 0.0000000   Mean   : 0.000000
 3rd Qu.: 0.0016908   3rd Qu.: 0.007670
 Max.   : 0.0079822   Max.   : 0.140274

Influence Plots

> plot(pbc$id[-ids], dfbeta[,4], xlab="Patient ID", ylab="log(bilirubin) delta-beta", pch=16)

> pbc[ dfbeta[,"log(bili)"] < -.029, c(1,2,3,5,10,11,13)]
  id time status   age edema log(bili) log(albumin)
  81 2540      1 63.26     0      2.67         1.29
 362 2267      1 49.00     0      2.89         1.11

Assessment of Influence

• Subject 81 is older and both subjects have high serum bilirubin (2 sd on log scale)

• Bilirubin is an important predictor of high risk, but subjects are in the upper 40th percentile of survival times

• We may want to do a sensitivity analysis with and without observations 81 and 362

• BUT unless we have very good reason (i.e. data entry error) to remove 81 and 362, we should not delete them

Schoenfeld Residuals

• Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages
– Schoenfeld D. Partial residuals for the proportional hazards regression model. Biometrika, 1982, 69(1): 239-241.

• Instead of single residual for each individual, there is a separate residual for each individual for each covariate

• Based on the individual contributions to the derivative of the log partial likelihood
– see chapter 6 in Hosmer and Lemeshow for more detail, pg. 198-199

• Note, Schoenfeld residuals are not defined for censored individuals

• Consider a single covariate
• At time ti we have
– R(ti) = set of people at risk at time ti

• Now say one person fails at time ti

• Probability that the lth subject fails at this specific time point is…

$$P(\text{subject } l \text{ fails at } t_i) = w_l(\hat{\beta}, t_i) = \frac{e^{\hat{\beta} z_l}}{\sum_{j \in R(t_i)} e^{\hat{\beta} z_j}}$$

• Where l ϵ R(ti)

• The weighted average of the kth covariate values for subjects in the risk set R(ti) is

$$\bar{z}_k(\hat{\beta}, t_i) = \sum_{l \in R(t_i)} z_{lk}\, w_l(\hat{\beta}, t_i)$$

• Where the weights are the estimated probability that the lth subject fails

• For the kth covariate…
– Covariate value, zik, for the subject that died at time ti
– $\bar{z}_k(\hat{\beta}, t_i)$ is the expected value of the covariate for the risk set at ti

$$\text{Schoenfeld residual} = z_{ik} - \bar{z}_k(\hat{\beta}, t_i)$$

• Plot Schoenfeld residuals against time to evaluate PH assumption
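The definition can be verified by computing the residuals by hand, as the observed covariate minus the risk-weighted average over the risk set at each event time. The sketch uses simulated data with a single covariate and continuous (untied) times; the simulation settings are arbitrary assumptions.

```r
library(survival)

set.seed(1)
n <- 60
z <- rnorm(n)                            # single covariate
time <- rexp(n, rate = exp(0.5 * z))     # continuous times, so no ties
status <- rbinom(n, 1, 0.8)              # roughly 20% censored at random

fit <- coxph(Surv(time, status) ~ z)
b <- coef(fit)

# One residual per event time, in increasing time order
ev.times <- sort(time[status == 1])
manual <- sapply(ev.times, function(t) {
  at.risk <- time >= t                                    # risk set R(t)
  w <- exp(b * z[at.risk]) / sum(exp(b * z[at.risk]))     # failure probabilities
  z[time == t] - sum(w * z[at.risk])                      # observed minus expected
})

# Residuals as returned by the survival package (also ordered by event time)
auto <- as.numeric(resid(fit, type = "schoenfeld"))
print(max(abs(manual - auto)))
```

Censored subjects contribute to the risk sets but get no residual of their own, which is why the number of residuals equals the number of events.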

Schoenfeld Residuals: Properties

• Represent the difference between the observed covariate and the average over the risk set at that time

• Calculated for each covariate
• Not defined for censored failure times.
• Useful for assessing time trend or lack of proportionality, based on plotting versus event time

• Generally use scaled Schoenfeld residuals

• Scaled Schoenfeld residuals are the residuals after multiplication by the inverse of the estimated variance-covariance matrix of the Schoenfeld residuals

Using Schoenfeld Residuals

• Test PH assumption per covariate
– Tests correlation between survival time and scaled Schoenfeld residuals
• Graphical Display
– Scaled Schoenfeld residuals vs. time
– Include smoother
– Look for evidence of difference from 0

Using Schoenfeld Residuals

• We can test each covariate in our data to determine if the PH assumption is violated

• Do this using the cox.zph() function in R

PBC Example

> ### Schoenfeld Residuals
> fit<-coxph(Surv(time, status)~age + factor(edema)+log(bili)+log(albumin))
> par(mfrow=c(2, 4))
> test<-cox.zph(fit)
> test
                      rho  chisq     p
age               -0.0191 0.0649 0.799
factor(edema)0.5  -0.1067 2.2257 0.136
factor(edema)1    -0.0697 0.8652 0.352
log(bili)          0.0813 1.0670 0.302
log(albumin)       0.0494 0.4851 0.486
GLOBAL                 NA 4.5730 0.470

> plot(test)

Next Time

• Predictions from a Cox PHM
