MODULE 4: INTRODUCTION TO SURVIVAL ANALYSIS
Summer Institute in Statistics for Clinical Research
University of Washington, July, 2016
Barbara McKnight, Ph.D., Professor, Department of Biostatistics, University of Washington

SESSION 1: SURVIVAL DATA: EXAMPLES

1-3

OVERVIEW

•  Session 1
   –  Introductory examples
   –  The survival function
   –  Survival distributions
   –  Mean and median survival time
•  Session 2
   –  Censored data
   –  Risk sets
   –  Censoring assumptions
   –  Kaplan-Meier estimator and CI
   –  Median and CI
•  Session 3
   –  Two-group comparisons: logrank test
   –  Trend and heterogeneity tests for more than two groups
•  Session 4
   –  Introduction to Cox regression

SISCR 2016: Module 4 Intro Survival, Barbara McKnight

1-4

OVERVIEW – MODULE 8

Module 8: Survival analysis for Observational Data
•  More complicated Cox models
   –  Adjustment
   –  Interaction
•  Hazard function estimation
•  Competing risks
•  Choice of time variable
•  Left entry
•  Time-dependent covariates

1-5

OVERVIEW – MODULE 12

Module 12: Survival analysis in Clinical Trials
•  Estimating survival after Cox model fit
•  More two-sample tests
   –  Weighted logrank
   –  Additional tests
•  Adjustment, precision and post-randomization variables
•  Power
•  Choice of outcome
•  Information accrual in sequential monitoring

1-6

PRELIMINARIES

•  No prior knowledge of survival analysis techniques assumed
•  Familiarity with standard one- and two-sample statistical methods (estimation and testing) is assumed
•  Emphasis on application rather than mathematical details
•  Examples

1-7

SESSIONS/BREAKS

•  8:30–10:00
   –  Break until 10:30
•  10:30–12:00
   –  Break until 1:30
•  1:30–3:00
   –  Break until 3:30
•  3:30–5:00

1-8

WHAT IS SURVIVAL ANALYSIS ABOUT?

•  Studies the occurrence of an event over time
   –  Time from randomization to death (cancer RCT)
   –  Time from acceptance into a heart transplant program to death
   –  Time from randomization to diagnosis of Alzheimer's Disease
   –  Time from birth to removal of supplementary oxygen therapy
   –  Time from marriage until separation or divorce
   –  Time until failure of light bulb
•  Explores factors that are thought to influence the chance that the event occurs
   –  Treatment
   –  Age
   –  Gender
   –  Body Mass Index
   –  Depression
   –  ….. others

1-9

YOUR EXAMPLES

1-10

EXAMPLE 1

•  Levamisole and Fluorouracil for adjuvant therapy of resected colon carcinoma (Moertel et al, 1990, 1995)
•  1296 patients
•  Stage B2 or C
•  3 unblinded treatment groups
   –  Observation only
   –  Levamisole (oral, 1 yr)
   –  Levamisole (oral, 1 yr) + fluorouracil (intravenous, 1 yr)

1-11

EXAMPLE 1

•  Randomization
   –  Adaptive
   –  B2: extent of invasion, time since surgery
   –  C: extent of invasion, time since surgery, number of lymph nodes involved

1-12

EXAMPLE 1

•  Statistical analysis
   –  Kaplan-Meier survival curves
   –  Log-rank statistic
   –  Cox proportional-hazards model for all multivariable analysis
   –  Backward regression, maximal partial-likelihood estimate statistic
   –  O'Brien-Fleming boundary for sequential monitoring

1-13

EXAMPLE 1

Figure 1: Recurrence-free interval according to treatment arm. Patients who died without recurrence have been censored. 5-FU = fluorouracil.

1-14

EXAMPLE 1

•  Results (stage C) after 2nd interim analysis
•  Fluorouracil + Levamisole reduced the
   –  Recurrence rate by 40% (p < 0.0001)
   –  Death rate by 33% (p < 0.0007)
•  Levamisole reduced the
   –  Recurrence rate by 2%
   –  Death rate by 6%
•  Toxicity was mild (with few exceptions)
•  Patient compliance excellent

1-15

EXAMPLE 1

•  R survival package data "colon"
   –  929 eligible patients (971 randomized – 42 ineligible)
   –  Treatment groups (rx)
   –  Sex, age
   –  Obstruction of colon by tumor (obstruct)
   –  Perforation of colon (perfor)
   –  Adherence to nearby organs (adhere)
   –  Number of lymph nodes with detectable cancer (nodes)
   –  Days until event or censoring (time)
   –  Censoring status (status)

1-16

EXAMPLE 1

•  Multivariable analysis:
   –  Proportional hazards model
   –  "we kept the variable of treatment in the model and used backward regression for other covariates"
   –  Other covariates (P < 0.01)
      •  Depth of primary tumor invasion
      •  Invasion of adjacent structures
      •  Regional implants
      •  Number of metastatic lymph nodes
      •  Histological differentiation
      •  Preoperative carcinoembryonic antigen level

1-17

EXAMPLE 1

•  Multivariable results:
   –  "After adjustment for minor imbalances in prognostic variables among treatment arms, therapy with fluorouracil plus levamisole was again found to have an advantage over observation (40% reduction in recurrence rate; P < 0.0001)."
   –  "Levamisole alone had no detectable advantage (2% reduction in recurrence rate; P = 0.86)."

1-18

EXAMPLE 2 – ALZHEIMER'S

•  Petersen et al. 2005, NEJM
•  Mild cognitive impairment
•  Vitamin E and Donepezil and Placebo
•  Time from randomization to AD diagnosis
•  Length of treatment 3 years
•  Double blind
•  Outcome: Possible or probable AD

1-19

EXAMPLE 2 – ALZHEIMER'S

•  769 enrolled
•  212 developed possible or probable AD
•  "There were no significant differences … during the three years of treatment"
•  Vitamin E vs Placebo
   –  Hazard Ratio 1.02 (95% CI, 0.74, 1.41), p-value 0.91
•  Donepezil vs Placebo
   –  Hazard Ratio 0.80 (95% CI, 0.57, 1.13), p-value 0.42

1-20

EXAMPLE 2 – ALZHEIMER'S

•  Prespecified analyses
•  At 6 month intervals
   –  Donepezil vs Placebo: significantly reduced likelihood of progression to AD during the first 12 months (p-value 0.04)
   –  Finding supported by secondary outcome measures
   –  Subgroup with ≥ 1 apolipoprotein E ϵ4 allele: significantly reduced likelihood of progression to AD over 3 years
   –  Vitamin E vs Placebo: no significant differences
   –  Vitamin E vs Placebo: also no significance for above subgroup

1-21

EXAMPLE 2 – RESULTS

•  Overall and at 6 and 12 months

1-22

EXAMPLE 2 – RESULTS

•  APOE ϵ4 results

1-23

EDITORIAL

•  "long-awaited results"
•  Donepezil standard therapy for AD
•  "Implications …. Enormous"
   –  Clear-cut negative findings for Vitamin E
   –  Especially noteworthy
   –  Despite dearth of evidence of its efficacy
   –  Findings for donepezil "much less clear"
   –  "not quite as disappointing"

1-24

EDITORIAL COMMENTS

•  "rate of progression … somewhat lower in the treatment group during the first year of the study"
•  "by two years, even this small effect had worn off"
•  Possible explanation: "Reduced statistical power later in the study as the number of subjects at risk declined owing to death, withdrawal and development of AD"
•  Secondary analyses suggest … benefits wore off

1-25

EXAMPLE 2 – RESULTS

•  Interesting steps …..

1-26

"COUNTER" EXAMPLE

•  Resuscitation Outcomes Consortium
   –  Out-of-hospital cardiac arrest
   –  Traumatic injury
•  Prehospital interventions
•  Exception from informed consent
•  10 Regional Centers
   –  7 US
   –  3 Canada

1-27

"COUNTER" EXAMPLE

•  Times
   –  Event (cardiac arrest, traumatic injury)
   –  911 call
   –  Arrival of EMS
   –  Treatment start
   –  Potential outcomes
      •  Return of spontaneous circulation (cardiac arrest)
      •  ED admission
      •  Survival to hospital discharge
      •  Neurologically intact survival
      •  28-day survival
      •  6-month neurological outcomes

1-28

"COUNTER" EXAMPLE

•  Time of injury / cardiac arrest (ordinarily unknown)
•  911 call
•  Cardiac arrest: Many deaths before admission to hospital
•  Trauma: Many deaths within the first 24–48 hours

1-29

SURVIVAL DATA AND FUNCTION

•  Original applications in biometry were to survival times in cancer clinical trials
•  Many other applications in biometry: e.g. disease onset ages
•  Interest centers not only on average or median survival time but also on probability of surviving beyond 2 years, 5 years, 10 years, etc.
•  Best described with the entire survival function S(t).
   –  For T = a subject's survival time, S(t) = P[T > t].
   –  Characterizes the entire distribution of survival times T.
   –  Gives useful information for each t.
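The definition S(t) = P[T > t] can be evaluated directly once a distribution for T is assumed. The sketch below is only an illustration: the Weibull shape and scale are hypothetical values chosen for the example and do not come from the slides.

```r
# Hypothetical survival distribution, chosen only for illustration:
# Weibull with shape 1.5 and scale 6 (time in years).
shape <- 1.5
scale <- 6

# S(t) = P[T > t] is the upper tail of the distribution function.
S <- function(t) pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)

# Probability of surviving beyond 2, 5 and 10 years.
t_years <- c(2, 5, 10)
surv_prob <- S(t_years)
print(data.frame(t = t_years, S = round(surv_prob, 3)))
```

A single function S gives the survival probability at every t, which is the sense in which it characterizes the entire distribution.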

1-30

SURVIVAL FUNCTION

[Figure: Survival Function; S(t) = Pr[T > t], from 0.0 to 1.0, plotted against t from 0 to 8.]

1-31

SURVIVAL DISTRIBUTION

•  Continuous probability distribution of times T
•  Only non-negative T's are possible: Pr(T < 0) = 0
•  Density function f(t):

$$ f(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \Pr[t \le T < t + \Delta t] $$

•  Area under the curve between two points is the probability T is between the two points.

1-32

DENSITY AND SURVIVAL FUNCTIONS

[Figure: two panels against t (0 to 8). Density Function: f(t), vertical axis 0.00 to 0.30. Survival Function: S(t) = Pr[T > t], vertical axis 0.0 to 1.0.]

1-33

MEDIAN SURVIVAL TIME

[Figure: Median Survival Time; S(t) = Pr[T > t] against t (0 to 8), with the median marked where the curve reaches 0.5.]

[Figure: Density Function; f(t) against t (0 to 8), with the median marked on the time axis.]

1-35

ILLUSTRATIVE DATA

[Figure: follow-up lines for six subjects (id 1 to 6) on a time axis from 0 to 8, each ending in a death (D).]

id   Y
1    5
2    3
3    6.5
4    2
5    4
6    1

1-36

SURVIVAL FUNCTION ESTIMATE

•  Nonparametric estimate: reduce the estimate by 1/n every time there is an event (death). This is the empirical survival function estimate.

[Figure: Survival Function Estimate; step function S(t), from 0.0 to 1.0, plotted against t from 0 to 6.]
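A minimal sketch of the empirical survival function for the six illustrative survival times (5, 3, 6.5, 2, 4, 1), using base R only. The names y, S_hat and est are introduced here for illustration.

```r
# Illustrative survival times from the slides (no censoring).
y <- c(5, 3, 6.5, 2, 4, 1)
n <- length(y)

# Empirical survival function: proportion of subjects with T > t.
S_hat <- function(t) sapply(t, function(u) mean(y > u))

# The estimate drops by 1/n at each observed death time.
death_times <- sort(y)
est <- data.frame(t = death_times, S = S_hat(death_times))
print(est)
```

Each of the n = 6 deaths lowers the estimate by 1/6, so the curve reaches zero at the largest observed time.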

1-37

MEDIAN ESTIMATE

[Figure: Median Estimate; step function S(t), from 0.0 to 1.0, against t, with the median marked at t = 3 on the time axis.]

By convention: median is earliest time where survival estimate ≤ .5
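The convention can be applied to the six illustrative survival times in a few lines of base R. This is a sketch; the variable names are introduced here for illustration.

```r
# Illustrative survival times from the slides (no censoring).
y <- c(5, 3, 6.5, 2, 4, 1)
death_times <- sort(y)

# Empirical survival estimate just after each death time.
S_at <- sapply(death_times, function(u) mean(y > u))

# Convention: the median is the earliest time where the estimate is <= 0.5.
median_hat <- min(death_times[S_at <= 0.5])
print(median_hat)
```

Here the estimate falls to exactly 3/6 at t = 3, so the median estimate is 3.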

1-38

OTHER WAYS TO DESCRIBE A SURVIVAL DISTRIBUTION

•  So far we have looked at the density function f(t) and survival function S(t).
•  Also of interest: "hazard" function λ(t)
•  Instantaneous rate at which death occurs at t in those who are alive at t
•  Examples:
   –  Age-specific death rate
   –  Age-specific disease incidence rate

1-39

HAZARD FUNCTION FOR HUMANS

[Figure: Human Mortality; hazard λ(t) plotted against age in years, 0 to 100.]

1-40

EQUIVALENT CHARACTERIZATIONS

•  Any one of the density function (f(t)), the survival function (S(t)) or the hazard function (λ(t)) is enough to determine the survival distribution.
•  They are each functions of each other:
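The formulas on this slide did not survive extraction. For a continuous survival time T, the standard relations among the three functions are as follows; they are supplied here as textbook identities, not as a transcription of the slide.

```latex
\begin{aligned}
\lambda(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
S(t) &= \exp\!\left(-\int_0^t \lambda(u)\,du\right) = \int_t^\infty f(u)\,du, \\
f(t) &= -\frac{d}{dt}S(t) = \lambda(t)\,S(t).
\end{aligned}
```

Knowing any one of f(t), S(t) or λ(t) therefore determines the other two.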

1-41

EQUIVALENT CHARACTERIZATIONS

[Figure: three panels against t (0 to 8). Hazard Function: λ(t), vertical axis 0.0 to 1.5. Survival Function: S(t), vertical axis 0.0 to 0.8. Density Function: f(t), vertical axis 0.00 to 0.30.]

In R

Call up the packages we will use (assumes they are installed):

library(survival)
library(ggplot2)
library(ggfortify)
library(rms)

Get the data (included in the survival package):

data(veteran)

Look at data.

head(veteran)

##   trt celltype time status karno diagtime age prior
## 1   1 squamous   72      1    60        7  69     0
## 2   1 squamous  411      1    70        5  64    10
## 3   1 squamous  228      1    60        3  38     0
## 4   1 squamous  126      1    60        9  63    10
## 5   1 squamous  118      1    70       11  65    10
## 6   1 squamous   10      1    20        5  49     0

Survival Curve

Use the survival time variable to make a survival object and get descriptives. Surv() is called here with the time only, so every observation is treated as an observed death.

Y <- Surv(veteran$time)
Shat <- survfit(Y ~ 1)
Shat

## Call: survfit(formula = Y ~ 1)
##
##       n  events  median 0.95LCL 0.95UCL
##     137     137      80      52      99

Plot Survival Curve

plot(Shat, xlab = "Days", ylab = "Survival Probability")

[Figure: estimated survival curve; Survival Probability (0.0 to 1.0) against Days (0 to 1000).]

Plot Survival Curve: Other Options

plot(Shat, conf.int = FALSE, xlab = "Days",
     ylab = "Survival Probability")

[Figure: the same curve drawn with conf.int = FALSE; Survival Probability (0.0 to 1.0) against Days (0 to 1000).]

Using ggplot2 and ggfortify

autoplot(Shat) + labs(x = "Days", y = "Survival Probability")

[Figure: survival curve; Survival Probability (0% to 100%) against Days (0 to 1000).]

Adding black and white theme.

autoplot(Shat) + theme_bw() + labs(x = "Days", y = "Survival Probability")

[Figure: the same curve with the black and white theme; Survival Probability (0% to 100%) against Days (0 to 1000).]

Using rms

Shat2 <- npsurv(Y ~ 1)
survplot(Shat2, xlab = "Days")

[Figure: survival curve; Survival Probability (0.0 to 1.0) against Days (0 to 900).]

Subset of the data: squamous tumors

with(veteran, table(celltype))

## celltype
##  squamous smallcell     adeno     large
##        35        48        27        27

sqdata <- veteran[veteran$celltype == "squamous", ]
Ysq <- Surv(sqdata$time)
Shatsq <- npsurv(Ysq ~ 1)

Plot for Subset of the data: squamous tumors

survplot(Shatsq, xlab = "Days", n.risk = TRUE)

[Figure: survival curve for squamous tumors; Survival Probability (0.0 to 1.0) against Days (0 to 900), with the numbers at risk 35, 20, 13, 8, 5, 3, 2, 2, 2, 2 shown along the time axis.]

Your turn

Plot the survival curve for small cell tumors.
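One possible approach to the exercise, sketched with survfit() and base plot() from the survival package rather than rms. It mirrors the squamous example, so the time is again used without the status variable. The names smdata, Ysm and Shatsm are introduced here for illustration.

```r
library(survival)

# Keep only the small cell tumors (level name as shown in the celltype table).
smdata <- veteran[veteran$celltype == "smallcell", ]

# As in the squamous example, the time is used alone,
# so every observation is treated as an observed death.
Ysm <- Surv(smdata$time)
Shatsm <- survfit(Ysm ~ 1)

plot(Shatsm, xlab = "Days", ylab = "Survival Probability")
```

Replacing survfit() and plot() with npsurv() and survplot() should give the rms version of the same plot.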

2-1

Module 4: Introduction to Survival Analysis
Summer Institute in Statistics for Clinical Research
University of Washington, June, 2016
Barbara McKnight, Ph.D., Professor, Department of Biostatistics, University of Washington

SESSION 2: ONE-SAMPLE METHODS

2-2

OUTLINE

•  Session 2:
   –  Censored data
   –  Risk sets
   –  Censoring assumptions
   –  Kaplan-Meier estimator
   –  Median estimator
   –  Standard errors and CIs

2-4

CLINICAL TRIAL

[Figure: six subjects (id 1 to 6) followed in calendar time (0 to 8) between the Start and End of the study; each follow-up line ends in D, L or A.]

2-5

CENSORED DATA

[Figure: the same six subjects (id 1 to 6) plotted by survival time (0 to 8); each follow-up line ends in D, L or A.]

2-6

CENSORED DATA

"Censored" observations give some information about their survival time.

id   Y     δ
1    5     1
2    3     1
3    6.5   0
4    2     0
5    4     1
6    1     1

[Figure: the six subjects (id 1 to 6) plotted by survival time (0 to 8); each follow-up line ends in D, L or A.]

2-8

ESTIMATION

•  Can we use the partial information in the censored observations?
•  Two off-the-top-of-the-head answers:
   –  Full sample: Yes. Count them as observations that never experienced the event and estimate S(t) as if there were no censored observations.
   –  Reduced sample: No. Omit them from the sample and estimate S(t) from the reduced data as if they were the full data.

2-9

CENSORED DATA

[Figure: the six subjects (id 1 to 6) plotted by survival time (0 to 8); each follow-up line ends in D, L or A.]

Problem: How to estimate:

                   Pr[T > 3.5]    Pr[T > 6]
Full Sample:       4/6 = .67      2/6 = .33
Reduced Sample:    2/4 = .5       0/4 = 0
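The two naive estimates can be reproduced from the six-subject data (Y and δ as tabulated in this session). This is a base R sketch; the function names full_sample and reduced_sample are introduced here for illustration.

```r
# Six-subject example: observed time y and death indicator delta (0 = censored).
dat <- data.frame(id = 1:6,
                  y = c(5, 3, 6.5, 2, 4, 1),
                  delta = c(1, 1, 0, 0, 1, 1))

# Full sample: censored subjects are counted as never having the event,
# so they contribute to "T > t" for every t.
full_sample <- function(t) mean(dat$delta == 0 | dat$y > t)

# Reduced sample: censored subjects are dropped before estimating.
reduced_sample <- function(t) mean(dat$y[dat$delta == 1] > t)

print(c(full_3.5 = full_sample(3.5), full_6 = full_sample(6)))
print(c(reduced_3.5 = reduced_sample(3.5), reduced_6 = reduced_sample(6)))
```

The gap between the two sets of numbers is a starting point for the bias questions on the following slide.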

2-10

CENSOREDDATA

Based on the data and estimates on the previous page,

Q: Are the Full Sample estimates biased? Why or why not?

A:

Q: Are the Reduced Sample estimates biased? Why or why not?

A:


2-11

CENSORED DATA

[Figure: the six subjects (id 1 to 6) plotted by survival time (0 to 8); each follow-up line ends in D, L or A.]

2-13

RISK SETS

[Figure: the six subjects (id 1 to 6) plotted by survival time (0 to 8), with the risk sets R1, R2, R3 and R4 marked at the four observed death times.]

R1 = {1, 2, 3, 4, 5, 6}
R2 = {1, 2, 3, 5}
R3 = {1, 3, 5}
R4 = {1, 3}
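The risk sets can be computed from the six-subject data by listing, at each observed death time, the subjects whose observed time is at least that large. This is a base R sketch with illustrative names.

```r
# Six-subject example: observed time y and death indicator delta (0 = censored).
dat <- data.frame(id = 1:6,
                  y = c(5, 3, 6.5, 2, 4, 1),
                  delta = c(1, 1, 0, 0, 1, 1))

# Distinct observed death times, in ascending order.
death_times <- sort(unique(dat$y[dat$delta == 1]))

# Risk set at each death time: subjects still under observation just before it.
risk_sets <- lapply(death_times, function(t) dat$id[dat$y >= t])
names(risk_sets) <- paste0("R", seq_along(risk_sets))
print(risk_sets)
```

Subject 4, censored at time 2, appears only in R1; subject 3, censored at 6.5, appears in every risk set.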

2-15

CENSORED DATA ASSUMPTION

•  Important assumption: subjects who are censored at time t are at the same risk of dying at t as those at risk but not censored at time t.
   –  When would you expect this to be true (or false) for subjects lost to follow-up?
   –  When would you expect this to be true (or false) for subjects still alive at the time of the analysis?

2-16

CENSORED DATA ASSUMPTION

•  Important assumption: subjects who are censored at time t are at the same risk of dying at t as those at risk but not censored at time t.
•  This means the risk set at time t is an unbiased sample of the population still alive at time t.
•  Can use information from the unbiased risk sets to estimate S(t) using the method of Kaplan and Meier (Product-Limit Estimator).

2-18

USING RISK SETS INFO TO ESTIMATE S(t)

•  Repeatedly use the fact that for t2 > t1,

   Pr[T > t2] = Pr[T > t2 and T > t1] = Pr[T > t2 | T > t1] Pr[T > t1]

•  An observation censored between t1 and t2 can contribute to the estimation of Pr[T > t2] by its unbiased contribution to estimation of Pr[T > t1].

[Figure: time line marked at 0, t1 and t2.]

2-19

PRODUCT-LIMIT (KAPLAN-MEIER) ESTIMATE

Notation: Let $t_{(1)}, t_{(2)}, \ldots, t_{(J)}$ be the failure times in the sample, in ascending order.

$t_{(1)}$ = smallest $Y_i$ for which $\delta_i = 1$   ($t_{(1)} = 1$)
$t_{(2)}$ = 2nd smallest $Y_i$ for which $\delta_i = 1$   ($t_{(2)} = 3$)
...
$t_{(J)}$ = largest $Y_i$ for which $\delta_i = 1$   ($t_{(4)} = 5$)

Q: Does J = the number of observed deaths in the sample?

A:

Q: When does J = n?

A:

2-20

t(j)

[Figure: the six subjects (id 1 to 6) plotted by survival time (0 to 8); each follow-up line ends in D, L or A.]

2-21

MORE NOTATION

For each $t_{(j)}$:

$D_{(j)}$ = number that die at time $t_{(j)}$
$S_{(j)}$ = number known to have survived beyond $t_{(j)}$ (by convention: includes those known to have been censored at $t_{(j)}$)
$N_{(j)}$ = number "at risk" of being observed to die at time $t_{(j)}$ (i.e., number still alive and under observation just before $t_{(j)}$)

$S_{(j)} = N_{(j)} - D_{(j)}$

2-22

FOR EXAMPLE DATA

[Figure: the example observations marked on a time axis from 0 to 8.]

t(j)   N(j)   D(j)   S(j)
1      6      1      5
3      4      1      3
4      3      1      2
5      2      1      1

Product-limit (Kaplan-Meier) Estimator:

$$ \hat{S}(t) = \prod_{j:\, t_{(j)} \le t} \left(1 - \frac{D_{(j)}}{N_{(j)}}\right) = \prod_{j:\, t_{(j)} \le t} \frac{S_{(j)}}{N_{(j)}} $$

for t in     Ŝ(t)
[0, 1)       1 (empty product)
[1, 3)       1 × 5/6 = .833
[3, 4)       1 × 5/6 × 3/4 = .625
[4, 5)       1 × 5/6 × 3/4 × 2/3 = .417
[5, ∞)       1 × 5/6 × 3/4 × 2/3 × 1/2 = .208
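The table and the product-limit values can be rebuilt by hand from the six-subject data. The sketch below uses base R for the calculation; the optional comparison with survfit() runs only if the survival package is installed. Variable names follow the slide notation.

```r
# Six-subject example: observed time y and death indicator delta (0 = censored).
dat <- data.frame(y = c(5, 3, 6.5, 2, 4, 1),
                  delta = c(1, 1, 0, 0, 1, 1))

# Ordered distinct death times t_(j).
t_j <- sort(unique(dat$y[dat$delta == 1]))

# N_(j): number at risk just before t_(j); D_(j): deaths at t_(j).
N_j <- sapply(t_j, function(t) sum(dat$y >= t))
D_j <- sapply(t_j, function(t) sum(dat$y == t & dat$delta == 1))
S_j <- N_j - D_j

# Product-limit estimate: running product of S_(j) / N_(j).
km <- cumprod(S_j / N_j)
print(data.frame(t_j, N_j, D_j, S_j, km = round(km, 3)))

# Optional check against the survival package, if it is available.
if (requireNamespace("survival", quietly = TRUE)) {
  fit <- survival::survfit(survival::Surv(y, delta) ~ 1, data = dat)
  print(summary(fit))
}
```

The censored subjects never produce a jump, but they do count in N_(j) for every death time up to their censoring time.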

2-23

K-M ESTIMATOR

[Figure: Survival Function Estimate; step function S(t), from 0.0 to 1.0, plotted against t from 0 to 6.]

Note: does not descend to zero here (since last observation is censored).

Q: Since the estimate jumps only at observed death times, how does information from the censored observations contribute to it?

A:

2-25

MEDIAN SURVIVAL CENSORED DATA

[Figure: Median Estimate, Censored Data; step function S(t), from 0.0 to 1.0, against t, with the median marked at t = 4 on the time axis.]

2-27

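KM STANDARD ERRORS

Greenwood's Formula:

•  $\widehat{\mathrm{Var}}(\hat{S}(t)) = \hat{S}^2(t) \sum_{j:\, t_{(j)} \le t} \frac{D_{(j)}}{N_{(j)} S_{(j)}}$

•  $\mathrm{se}(\hat{S}(t)) = \sqrt{\widehat{\mathrm{Var}}(\hat{S}(t))}$

•  Pointwise CI: $\left(\hat{S}(t) - z_{\alpha/2}\,\mathrm{se}(\hat{S}(t)),\ \hat{S}(t) + z_{\alpha/2}\,\mathrm{se}(\hat{S}(t))\right)$
   –  Can include values < 0 or > 1.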

2-28

LOG–LOG KM STANDARD ERRORS

Use complementary log log transformation to keep CI within (0,1):

•  $\widehat{\mathrm{Var}}(\log(-\log \hat{S}(t))) = \frac{\sum_{j:\, t_{(j)} \le t} \frac{D_{(j)}}{N_{(j)} S_{(j)}}}{[\log \hat{S}(t)]^2}$

•  $\mathrm{se} = \sqrt{\widehat{\mathrm{Var}}(\log(-\log \hat{S}(t)))}$

•  CI for $\log(-\log S(t))$: $\left(\log(-\log \hat{S}(t)) - z_{\alpha/2}\,\mathrm{se},\ \log(-\log \hat{S}(t)) + z_{\alpha/2}\,\mathrm{se}\right)$

•  CI for $S(t)$: $\left([\hat{S}(t)]^{e^{z_{\alpha/2}\,\mathrm{se}}},\ [\hat{S}(t)]^{e^{-z_{\alpha/2}\,\mathrm{se}}}\right)$
   –  CI remains within (0,1).
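Both interval types can be computed by hand for the six-subject example, starting from the t(j), N(j), D(j), S(j) table. This is a base R sketch with 95% limits; the variable names are introduced here for illustration.

```r
# Example data table: death times, number at risk, deaths, survivors.
t_j <- c(1, 3, 4, 5)
N_j <- c(6, 4, 3, 2)
D_j <- c(1, 1, 1, 1)
S_j <- N_j - D_j

km <- cumprod(S_j / N_j)              # Kaplan-Meier estimate at each t_(j)
gw_sum <- cumsum(D_j / (N_j * S_j))   # running Greenwood sum
z <- qnorm(0.975)                     # z_{alpha/2} for a 95% interval

# Greenwood standard error and plain pointwise CI.
se_plain <- km * sqrt(gw_sum)
ci_plain <- cbind(lower = km - z * se_plain, upper = km + z * se_plain)

# Standard error on the complementary log-log scale and back-transformed CI.
se_cll <- sqrt(gw_sum) / abs(log(km))
ci_cll <- cbind(lower = km^exp(z * se_cll), upper = km^exp(-z * se_cll))

print(round(cbind(t_j, km, ci_plain), 3))
print(round(cbind(t_j, km, ci_cll), 3))
```

With only six subjects the plain interval exceeds 1 at t = 1 and drops below 0 at t = 5, while the log-log interval stays inside (0, 1).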

2-29

GREENWOOD'S FORMULA

[Figure: Survival Function Estimate; S(t), from 0.0 to 1.0, plotted against t from 0 to 6.]

COMPLEMENTARY LOG-LOG

[Figure: Survival Function Estimate; S(t), from 0.0 to 1.0, plotted against t from 0 to 6.]

2-31

MEDIAN CONFIDENCE INTERVAL

Confidence interval for the median is obtained by inverting the sign test of H0: median = M (Brookmeyer and Crowley, 1982).

•  With complete data $T_1, T_2, \ldots, T_n$, the sign test of H0: median = M is performed by seeing if the observed proportion, P[Y > M], is too big (Binomial Distribution or Normal Approximation).

•  With censored data $(Y_1, \delta_1), (Y_2, \delta_2), \ldots, (Y_n, \delta_n)$ giving incomplete data about $T_1, T_2, \ldots, T_n$, we cannot always tell whether $T_i > M$:

   When $Y_i \le M$, $\delta_i = 1$   (observed death before M)        we know $T_i \le M$
   When $Y_i > M$                     (death or censored after M)      we know $T_i > M$
   When $Y_i \le M$, $\delta_i = 0$   (censored before M)              we don't know if $T_i \le M$ or $T_i > M$

2-32

MEDIAN CONFIDENCE INTERVAL

Solution: Following Efron (self-consistency of KM), we estimate $\Pr[T > M]$ when $Y_i \le M$, $\delta_i = 0$ using $\hat S(M) / \hat S(Y_i)$.

• For complete data, we let

$U_i = \begin{cases} 1 & T_i > M \\ 0 & T_i \le M \end{cases}$

and our test is based on $\sum_{i=1}^{n} U_i$.

• For censored data, we let

$U_i = \begin{cases} 1 & Y_i > M \\ \hat S(M)/\hat S(Y_i) & Y_i \le M;\ \delta_i = 0 \\ 0 & Y_i \le M;\ \delta_i = 1 \end{cases}$

and our test is based on $\sum_{i=1}^{n} U_i$.

MEDIAN CONFIDENCE INTERVAL

• It turns out, this is the same as basing our test of $H_0$: median = M on a test of $H_0: S(M) = \frac{1}{2}$.

• So a 95% CI for the median contains all potential M for which the test of $H_0: S(M) = \frac{1}{2}$ cannot reject at $\alpha = .05$ (2 sided).

• Since $\hat S(M)$ only changes value at observed event times, the test need only be checked at $M = t_{(1)}, t_{(2)}, \ldots, t_{(J)}$.

• Originally proposed for Greenwood’s formula CIs for S(M), but any good CIs are OK.

• Implemented in many software packages.
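The inversion described above can be sketched in a few lines: scan the observed times and find where the point estimate and each confidence limit for S(M) first reach 1/2. The helper name `first_at_or_below_half` is made up for this illustration, and the sketch ignores the special handling software may apply when the curve equals exactly 0.5 over an interval.

```r
library(survival)

# Levamisole-only arm of the colon data, death endpoint (etype == 2)
lev <- subset(survival::colon, etype == 2 & rx == "Lev")
fit <- survfit(Surv(time, status) ~ 1, data = lev, conf.type = "log-log")

# Hypothetical helper: first time at which a step function drops to 0.5 or below
first_at_or_below_half <- function(curve, times) {
  hit <- which(!is.na(curve) & curve <= 0.5)
  if (length(hit) == 0) NA_real_ else times[hit[1]]
}

median_hat <- first_at_or_below_half(fit$surv,  fit$time)  # KM median
lcl        <- first_at_or_below_half(fit$lower, fit$time)  # lower limit for the median
ucl        <- first_at_or_below_half(fit$upper, fit$time)  # NA if upper limit never reaches 0.5

print(c(median = median_hat, lower = lcl, upper = ucl))
```

An `NA` upper limit corresponds to the "∞" shown in presentation tables: the upper confidence limit for S(t) never drops to 0.5 during follow-up.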

MEDIAN CONFIDENCE INTERVAL

[Figure: "Median Confidence Interval, Censored Data": S(t) plotted against t, with the level 0.5 and the median marked on the axes.]

COLON CANCER EXAMPLE

• Clinical trial at Mayo Clinic (Moertel et al. (1990) NEJM)

• Stage B2 and C colon cancer patients; adjuvant therapy

• Three arms
– Observation only
– Levamisole
– 5-FU + Levamisole

• Stage C patients only
• Two treatment arms only

COLON CANCER EXAMPLE

[Figure: survival probability against days from diagnosis (0 to 3000) for the Lev and Lev+5FU arms.]

COLON CANCER EXAMPLE

[Figure: "Greenwood's Formula": survival probability against days from diagnosis (0 to 3000) for the Lev and Lev+5FU arms.]

COLON CANCER EXAMPLE

[Figure: "Complementary log-log Transformation": survival probability against days from diagnosis (0 to 3000) for the Lev and Lev+5FU arms.]

PRESENTATION

                   N     Events   Median (days)   95% CI
Levamisole Only    310   161      2152            (1509, ∞)
5FU + Levamisole   304   123      --              (2725, ∞)

COLON CANCER EXAMPLE

[Figure: "Complementary log-log Transformation": survival probability against days from diagnosis (0 to 3000) for the Lev and Lev+5FU arms.]

ESTIMATION

• Estimate S(t) using KM curve (nonparametric).
– Pointwise standard errors and CIs
– Almost always presented
– Not appropriate when the event of interest happens only to some (more on this tomorrow)

• Median: based on KM curve: often presented (too often?)

TO WATCH OUT FOR

• Mean survival time hard to estimate without parametric assumptions
– Censoring means incomplete information about largest times
– Mean over restricted time interval may be useful in some settings (some on this tomorrow)

• Median estimate more complicated than median of times

• Even with CIs, evaluating differences between curves visually is subjective

• Interpretation of survival function estimates depends on validity of censoring assumptions
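The "mean over a restricted time interval" is the area under the survival curve up to a chosen time. A minimal sketch, assuming a single KM curve and a user-chosen limit `tau`; the function name `restricted_mean` and the toy data are made up for illustration. With no censoring and `tau` equal to the largest time, the restricted mean is just the sample mean, which gives an easy check.

```r
library(survival)

# Hypothetical toy data with no censoring
time   <- c(1, 2, 3, 4)
status <- c(1, 1, 1, 1)
fit <- survfit(Surv(time, status) ~ 1)

# Area under the KM step function from 0 to tau (single curve only)
restricted_mean <- function(fit, tau) {
  keep   <- fit$time < tau
  t_grid <- c(0, fit$time[keep], tau)   # interval endpoints
  s_left <- c(1, fit$surv[keep])        # curve height on each interval
  sum(s_left * diff(t_grid))
}

rm4 <- restricted_mean(fit, tau = 4)    # whole follow-up
rm2 <- restricted_mean(fit, tau = 2)    # first two time units only
print(c(rm4 = rm4, rm2 = rm2, sample_mean = mean(time)))
```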

In R

Load packages.

library(survival)
library(rms)

Get data.

data(colon) # in survival package
head(colon)

##   id study      rx sex age obstruct perfor adhere nodes status differ
## 1  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 2  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 3  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 4  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 5  3     1     Obs   0  71        0      0      1     7      1      2
## 6  3     1     Obs   0  71        0      0      1     7      1      2
##   extent surg node4 time etype
## 1      3    0     1 1521     2
## 2      3    0     1  968     1
## 3      3    0     0 3087     2
## 4      3    0     0 3087     1
## 5      2    0     1  963     2
## 6      2    0     1  542     1

Process data and compute survival curves.

df <- colon[colon$etype == 2,] # Use death times.
df <- df[df$rx != "Obs",] # Omit observation only arm.
Y <- with(df, Surv(time, status))
Shats <- survfit(Y ~ rx, data = df)

Plot survival curves.

colors <- c("slateblue", "goldenrod")
plot(Shats, lty = c(1,2),
     col = colors, lwd = 2,
     xlab = "Days", ylab = "Survival Probability")
legend("bottomleft", lty = c(1,2),
       col = colors, lwd = 2,
       legend = c("Lev only", "Lev + 5FU"), bty = "n")

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms.]

With censoring tick marks

plot(Shats, lty = c(1,2),
     col = colors, lwd = 2,
     xlab = "Days", ylab = "Probability Survival",
     mark.time = TRUE)
legend("bottomleft", lty = c(1,2),
       col = colors, lwd = 2,
       legend = c("Lev only", "Lev + 5FU"), bty = "n")

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms, with censoring tick marks.]

With CIs: Greenwood’s formula

ShatsG <- survfit(Y ~ rx, data = df, conf.type = "plain")
plot(ShatsG, lty = c(1,2),
     col = colors, lwd = 2,
     xlab = "Days", ylab = "Probability Survival",
     mark.time = TRUE, conf.int = TRUE)
legend("bottomleft", lty = c(1,2),
       col = colors, lwd = 2,
       legend = c("Lev only", "Lev + 5FU"), bty = "n")

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms, with Greenwood's formula confidence intervals.]

With CIs: Complementary log-log formula

ShatsL <- survfit(Y ~ rx, data = df, conf.type = "log-log")
plot(ShatsL, lty = c(1,2),
     col = colors, lwd = 2,
     xlab = "Days", ylab = "Probability Survival",
     mark.time = TRUE, conf.int = TRUE)
legend("bottomleft", lty = c(1,2),
       col = colors, lwd = 2,
       legend = c("Lev only", "Lev + 5FU"), bty = "n")

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms, with complementary log-log confidence intervals.]

Median CIs: Complementary log-log formula

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms, illustrating the median confidence intervals.]

Median CI summary

ShatsL

## Call: survfit(formula = Y ~ rx, data = df, conf.type = "log-log")
##
##              n events median 0.95LCL 0.95UCL
## rx=Lev     310    161   2152    1509      NA
## rx=Lev+5FU 304    123     NA    2725      NA

With rms

Shat2 <- npsurv(Y ~ rx, data = df, conf.type = "log-log")
survplot(Shat2, xlab = "Days", col = colors)

[Figure: survival probability against days (0 to 3500) for rx=Lev and rx=Lev+5FU.]

With numbers at risk

survplot(Shat2, xlab = "Days", col = colors, n.risk = TRUE)

[Figure: survival probability against days (0 to 3150) for rx=Lev and rx=Lev+5FU, with numbers at risk printed inside the plot.]

Numbers at risk:
rx=Lev       310 284 242 198 175 166 135 65 13 3
rx=Lev+5FU   304 281 245 226 209 195 152 79 26 4

With numbers at risk below

par(mar = c(8, 4, 4, 2) + .1)
survplot(Shat2, xlab = "Days", col = colors,
         n.risk = TRUE, y.n.risk = -.6)
abline(h = .5, lty = 3)

[Figure: survival probability against days (0 to 3150) for rx=Lev and rx=Lev+5FU, with a dotted horizontal line at 0.5 and numbers at risk printed below the plot.]

Numbers at risk:
rx=Lev       310 284 242 198 175 166 135 65 13 3
rx=Lev+5FU   304 281 245 226 209 195 152 79 26 4

Your turn

1. Using the colon cancer data, plot treatment group survival curves comparing the Observation only arm to the Levamisole only and Levamisole + 5FU arms.

2. Compute median survival times and CIs for each group.


Module 4: Introduction to Survival Analysis
Summer Institute in Statistics for Clinical Research

University of Washington
June, 2016

Barbara McKnight, Ph.D.

Professor
Department of Biostatistics
University of Washington

SESSION 3: TWO AND K-SAMPLE METHODS

OVERVIEW

• Session 1
– Introductory examples
– The survival function
– Survival Distributions
– Mean and Median survival time

• Session 2
– Censored data
– Risk sets
– Censoring Assumptions
– Kaplan-Meier Estimator and CI
– Median and CI

• Session 3
– Two-group comparisons: logrank test
– Trend and heterogeneity tests for more than two groups

• Session 4
– Introduction to Cox regression

TESTING

• Group comparisons
– Two groups
– k-group heterogeneity
– k-group trend

• Assume, H0: no differences between groups

COLON CANCER EXAMPLE

[Figure: "Complementary log-log Transformation": survival probability against days from diagnosis (0 to 3000) for the Lev and Lev+5FU arms.]

THE P-VALUE QUESTION

• Statistical significance?

COMPARING SURVIVAL DISTRIBUTIONS

• Two-sample data: comparing $S_1(t)$ and $S_2(t)$
– $(Y_{1i}, \delta_{1i})$, $i = 1, \ldots, n_1$, $T \sim S_1(t)$
– $(Y_{2i}, \delta_{2i})$, $i = 1, \ldots, n_2$, $T \sim S_2(t)$

• Could look at $S_2(t) - S_1(t)$ at a single time t, but this might be misleading unless all you care about is survival at that time.

COMPARISON AT 5 YEARS

[Figure: two panels of S(t) against t (vertical axis 0 to 1), each with the 5-year time point marked.]
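A single-time comparison like the one pictured can be sketched with `summary()` on a `survfit` object. This is only an illustration under stated assumptions: five years is taken as 5 × 365.25 days, the two arms are treated as independent so that variances add, and the object names (`s5`, `est`, `diff5`) are made up here.

```r
library(survival)

# Two treatment arms of the colon data, death endpoint
df <- subset(survival::colon, etype == 2 & rx != "Obs")
df$rx <- droplevels(df$rx)

fit <- survfit(Surv(time, status) ~ rx, data = df, conf.type = "log-log")

# KM estimates and Greenwood standard errors at 5 years
s5  <- summary(fit, times = 5 * 365.25)
est <- data.frame(group = s5$strata, surv = s5$surv, se = s5$std.err)
print(est)

# Difference in 5-year survival with a normal-approximation z statistic
diff5   <- est$surv[2] - est$surv[1]
se_diff <- sqrt(sum(est$se^2))
z_stat  <- diff5 / se_diff
print(c(difference = diff5, se = se_diff, z = z_stat))
```

As the slide notes, this answers a question about one time point only; it says nothing about how the curves differ elsewhere.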

COMPARING SURVIVAL DISTRIBUTIONS

• There are many ways to measure $S_2(t) - S_1(t)$, the distance between two functions of time

• Here: focus on most commonly used test: the logrank test, which compares consistent ratios of hazard functions

• Module 12 will consider other tests

RISK SETS

[Figure: follow-up lines for subjects id 1 to 6 against survival time (0 to 8), each ending in a status label (D, L or A).]

R1 = {1,2,3,4,5,6}
R2 = {1,2,3,5}
R3 = {1,3,5}
R4 = {1,3}

LOGRANK TEST

• The test is based on a 2x2 table of group by current status at each observed failure time (i.e., for each risk set)

• $t_{(j)}$, j = 1, …, m, as shown in the Table below.

Event/Group   1                       2                       Total
Die           d1(j)                   d2(j)                   D(j)
Survive       n1(j)-d1(j) = s1(j)     n2(j)-d2(j) = s2(j)     N(j)-D(j) = S(j)
At Risk       n1(j)                   n2(j)                   N(j)

TWO-GROUP COMPARISONS

• The contribution to the test statistic at each event time is obtained by calculating the expected number of deaths in one group, assuming that the risk of death at that time is the same in each of the two groups.

• This yields the usual “row total times column total divided by grand total” estimator. For example, for group 1, the expected number is

$\hat E_{1(j)} = \frac{n_{1(j)} D_{(j)}}{N_{(j)}}$

• Most software packages base their estimator of the variance on the hypergeometric distribution, defined as follows:

$V_{(j)} = \frac{n_{1(j)}\, n_{2(j)}\, D_{(j)} \left(N_{(j)} - D_{(j)}\right)}{N_{(j)}^2 \left(N_{(j)} - 1\right)}$

LOGRANK TWO-GROUP COMPARISONS

• Each test may be expressed in the form of a ratio of sums over the observed survival times as follows

$Q = \frac{\left[\sum_{j=1}^{J} \left(d_{1(j)} - \hat E_{1(j)}\right)\right]^2}{\sum_{j=1}^{J} V_{(j)}} = \frac{\left[\sum_{j=1}^{J} \left(\frac{n_{1(j)} n_{2(j)}}{n_{1(j)} + n_{2(j)}}\right)\left(\frac{d_{1(j)}}{n_{1(j)}} - \frac{d_{2(j)}}{n_{2(j)}}\right)\right]^2}{\sum_{j=1}^{J} V_{(j)}}$

• Where $t_j$, j = 1, …, J, are the unique ordered event times
• Under the null hypothesis of no difference in survival distribution, the p-value for Q may be obtained using the chi-square distribution with one degree-of-freedom, when the expected number of events is large.

$p = \Pr\left(\chi^2_1 \ge Q\right)$
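The statistic Q can be assembled directly from the 2x2 tables at each event time. The sketch below is an illustration rather than the course's own code: it loops over the unique death times in the two treatment arms of the colon data, builds $\hat E_{1(j)}$ and $V_{(j)}$ from the formulas above, and compares the result with `survdiff()`. The object names (`pieces`, `Q`, `p_value`) are made up for this example.

```r
library(survival)

# Two treatment arms of the colon data, death endpoint
df <- subset(survival::colon, etype == 2 & rx != "Obs")
g1 <- df$rx == "Lev"                       # group 1 indicator

event_times <- sort(unique(df$time[df$status == 1]))

# One row per event time: deaths in group 1, expected deaths, hypergeometric variance
pieces <- t(sapply(event_times, function(tj) {
  at_risk <- df$time >= tj
  died    <- df$time == tj & df$status == 1
  n1 <- as.numeric(sum(at_risk & g1))
  n2 <- as.numeric(sum(at_risk & !g1))
  d1 <- sum(died & g1)
  D  <- sum(died)
  N  <- n1 + n2
  E1 <- n1 * D / N
  V  <- if (N > 1) n1 * n2 * D * (N - D) / (N^2 * (N - 1)) else 0
  c(d1 = d1, E1 = E1, V = V)
}))

Q       <- sum(pieces[, "d1"] - pieces[, "E1"])^2 / sum(pieces[, "V"])
p_value <- pchisq(Q, df = 1, lower.tail = FALSE)
print(c(Q = Q, p = p_value))

# Same test from the survival package
print(survdiff(Surv(time, status) ~ rx, data = df))
```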

COLON CANCER EXAMPLE

• Comparing Lev and Lev+5FU:

Group      N     Obs   Exp
Lev        310   161   136.9
Lev+5FU    304   123   147.1
Total      614   284   284.0

• Log-rank test: $\chi^2_1 = 8.2$, p-value = 0.0042

LOGRANK TEST

[Figure: two panels of S(t) against t (vertical axis 0 to 1), titled "Can Detect This" and "But Not This".]

Other tests (generalized Wilcoxon and others) can give more weight to early or late differences.

LOGRANK TEST

• Detects consistent differences between survival curves over time.

• Best power when:

– $H_0: S_1(t) = S_2(t)$ for all t vs $H_A: S_1(t) = [S_2(t)]^c$, or

– $H_0: \lambda_1(t) = \lambda_2(t)$ for all t vs $H_A: \lambda_1(t) = c\,\lambda_2(t)$

• Good power whenever survival curve difference is in consistent direction

STRATIFIED LOGRANK TEST

• In a large-enough clinical trial, confounding bias due to imbalance between treatment arms is unlikely.

• However, better power can be obtained by adjusting for strongly prognostic variables.

• One way to adjust: stratified logrank test
• Can also use Cox regression (Modules 8 and 12)

STRATIFIED LOGRANK TEST

• Assume R strata (r = 1, …, R)

• Recall (non-stratified) log-rank test statistic

$Q = \frac{\left[\sum_{j=1}^{J} \left(d_{1(j)} - \hat E_{1(j)}\right)\right]^2}{\sum_{j=1}^{J} V_{(j)}}$

• Stratified log-rank test

$Q = \frac{\left[\sum_{j_1=1}^{J_1} \left(d_{11(j)} - \hat E_{11(j)}\right) + \ldots + \sum_{j_r=1}^{J_r} \left(d_{1r(j)} - \hat E_{1r(j)}\right) + \ldots + \sum_{j_R=1}^{J_R} \left(d_{1R(j)} - \hat E_{1R(j)}\right)\right]^2}{\sum_{j_1=1}^{J_1} V_{1(j)} + \ldots + \sum_{j_r=1}^{J_r} V_{r(j)} + \ldots + \sum_{j_R=1}^{J_R} V_{R(j)}}$

STRATIFIED LOG-RANK TEST

• $H_0$: $\lambda_{1r}(t) = \lambda_{2r}(t)$ for all t and r = 1, …, R
• $H_A$: $\lambda_{1r}(t) = c\,\lambda_{2r}(t)$, $c \ne 1$, for all t and r = 1, …, R

• Under $H_0$ test statistic ~ $\chi^2_1$ when the number of events is large

• The $d_{1r(j)}$, $\hat E_{1r(j)}$ and $V_{r(j)}$ are based solely on subjects from the r-th stratum

• Will be powerful when direction of group difference is consistent across strata and over time.
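The stratified statistic just adds the stratum-specific numerator and variance pieces before forming the ratio. A minimal sketch under stated assumptions: it uses the colon data with sex as an illustrative stratification variable (this choice is not from the course), and the helper `logrank_pieces` is a made-up name.

```r
library(survival)

# Two treatment arms of the colon data, death endpoint
df <- subset(survival::colon, etype == 2 & rx != "Obs")
df$rx <- droplevels(df$rx)

# Hypothetical helper: sum of (d1 - E1) and sum of V within one stratum
logrank_pieces <- function(time, status, g1) {
  oe <- 0
  v  <- 0
  for (tj in sort(unique(time[status == 1]))) {
    at_risk <- time >= tj
    died    <- time == tj & status == 1
    n1 <- as.numeric(sum(at_risk & g1))
    n2 <- as.numeric(sum(at_risk & !g1))
    d1 <- sum(died & g1)
    D  <- sum(died)
    N  <- n1 + n2
    oe <- oe + d1 - n1 * D / N
    if (N > 1) v <- v + n1 * n2 * D * (N - D) / (N^2 * (N - 1))
  }
  c(OminusE = oe, V = v)
}

# Risk sets are formed within each stratum only
by_stratum <- sapply(split(df, df$sex), function(d) {
  logrank_pieces(d$time, d$status, d$rx == "Lev")
})

Q_strat <- sum(by_stratum["OminusE", ])^2 / sum(by_stratum["V", ])
p_strat <- pchisq(Q_strat, df = 1, lower.tail = FALSE)
print(c(Q = Q_strat, p = p_strat))

# Same test from the survival package
print(survdiff(Surv(time, status) ~ rx + strata(sex), data = df))
```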

EXAMPLE - WHAS

• Example: The Worcester Heart Attack Study (WHAS)
• Goal: study factors and time trends associated with long term survival following acute myocardial infarction (MI) among residents of the Worcester, Massachusetts Standard Metropolitan Statistical Area (SMSA)
• Study began in 1975
• Data collection approximately every other year
• Most recent cohort: subjects who experienced an MI in 2001
• The main study: over 11,000 subjects
• Here: a small sample from the main study with n = 100

EXAMPLE - WHAS

• $t_0$: time of hospital admission following an acute myocardial infarction (MI)

• Event: Death from any cause following hospitalization for an MI

• Time: Time from hospital admission to
– Death
– End of study
– Last contact

• Interest in effect of gender adjusted for age

TESTING GENDER BY AGE GROUP

survdiff(formula = Yw[age_trend == 46] ~ gender[age_trend == 46], data = whas100)

n=25, 1 observation deleted due to missingness.

                                N Observed Expected (O-E)^2/E (O-E)^2/V
gender[age_trend == 46]=Male   20        5     6.53     0.357      1.95
gender[age_trend == 46]=Female  5        3     1.47     1.584      1.95

 Chisq= 1.9  on 1 degrees of freedom, p= 0.163

TESTING GENDER BY AGE GROUP

> survdiff(Yw[age_trend == 65] ~ gender[age_trend == 65], data = whas100)
Call:
survdiff(formula = Yw[age_trend == 65] ~ gender[age_trend == 65], data = whas100)

n=23, 1 observation deleted due to missingness.

                                N Observed Expected (O-E)^2/E (O-E)^2/V
gender[age_trend == 65]=Male   17        4      5.6     0.458      2.41
gender[age_trend == 65]=Female  6        3      1.4     1.833      2.41

 Chisq= 2.4  on 1 degrees of freedom, p= 0.121

TESTING GENDER BY AGE GROUP

> survdiff(Yw[age_trend == 75] ~ gender[age_trend == 75], data = whas100)
Call:
survdiff(formula = Yw[age_trend == 75] ~ gender[age_trend == 75], data = whas100)

n=22, 1 observation deleted due to missingness.

                                N Observed Expected (O-E)^2/E (O-E)^2/V
gender[age_trend == 75]=Male   15       10     9.07    0.0947     0.273
gender[age_trend == 75]=Female  7        4     4.93    0.1743     0.273

 Chisq= 0.3  on 1 degrees of freedom, p= 0.602

TESTING GENDER BY AGE GROUP

> survdiff(Yw[age_trend == 86] ~ gender[age_trend == 86], data = whas100)
Call:
survdiff(formula = Yw[age_trend == 86] ~ gender[age_trend == 86], data = whas100)

n=30, 1 observation deleted due to missingness.

                                N Observed Expected (O-E)^2/E (O-E)^2/V
gender[age_trend == 86]=Male   13        9     8.83   0.00318   0.00574
gender[age_trend == 86]=Female 17       13    13.17   0.00213   0.00574

 Chisq= 0  on 1 degrees of freedom, p= 0.94

STRATIFIED TEST

> survdiff(Yw ~ gender + strata(age_trend), data = whas100)
Call:
survdiff(formula = Yw ~ gender + strata(age_trend), data = whas100)

n=100, 1 observation deleted due to missingness.

               N Observed Expected (O-E)^2/E (O-E)^2/V
gender=Male   65       28       30     0.138     0.402
gender=Female 35       23       21     0.197     0.402

 Chisq= 0.4  on 1 degrees of freedom, p= 0.526

UN-STRATIFIED TEST

> survdiff(Yw ~ gender, data = whas100)
Call:
survdiff(formula = Yw ~ gender, data = whas100)

n=100, 1 observation deleted due to missingness.

               N Observed Expected (O-E)^2/E (O-E)^2/V
gender=Male   65       28     34.7      1.29      4.06
gender=Female 35       23     16.3      2.74      4.06

 Chisq= 4.1  on 1 degrees of freedom, p= 0.044

WHY?

> with(whas100,table(age_trend, gender))
         gender
age_trend Male Female
       46   20      5
       65   17      6
       75   15      7
       86   13     17

WHY?

[Figure: "Age and Gender": mosaic plot of gender (Male, Female) by age group (46, 65, 75, 86).]

HETEROGENEITY

• When there are more than two groups, can test for difference somewhere between groups:

• Null hypothesis: $\lambda_1(t) \equiv \lambda_2(t) \equiv \ldots \equiv \lambda_k(t)$
• Alternative hypothesis: $\not\equiv$ somewhere

COLON DATA: THREE TREATMENT GROUPS

           Observed Events   Expected Events
Obs        168               148.4
Lev        161               146.1
Lev+5FU    123               157.5
Total      452               452

• $\chi^2_2 = 11.7$ (df = one fewer than number of groups)

• P-value: 0.003

TREND

• When there are more than two “ordered” groups, it is sometimes of interest to test the null hypothesis of no difference against a “trend” alternative

• $\lambda_1(t) \le \lambda_2(t) \le \ldots \le \lambda_k(t)$ with < somewhere, or
• $\lambda_1(t) \ge \lambda_2(t) \ge \ldots \ge \lambda_k(t)$ with > somewhere
• Placebo and two or more doses of a therapeutic agent

• Pre-hypothesized

TREND

• The test statistic for trend uses “scores”: $s_1, s_2, \ldots, s_k$

$\frac{\left(\sum_{i=1}^{k} s_i \sum_{j=1}^{J} \left(d_{ij} - E_{ij}\right)\right)^2}{s' V s}$

• Null hypothesis: $\lambda_1(t) \equiv \lambda_2(t) \equiv \ldots \equiv \lambda_k(t)$
• Specific alternative hypothesis: $c^{s_1}\lambda_1(t) \equiv c^{s_2}\lambda_2(t) \equiv \ldots \equiv c^{s_k}\lambda_k(t)$, $c \ne 1$

• Good power when average difference between observed and expected events grows or diminishes with increasing $s_i$

TREND

Tumor differentiation and all-cause mortality:

Group                       N     Observed   Expected
Well Differentiated         93    42         47.5
Moderately Differentiated   663   311        334.9
Poorly Differentiated       150   88         58.6

Tarone trend test: $\chi^2_1 = 11.57$, $P = 6.6 \times 10^{-4}$
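The score-based trend statistic can be sketched from the pieces that `survdiff()` returns: observed and expected counts and the variance matrix V of the observed-minus-expected vector. This is an illustration under stated assumptions: equally spaced scores 1, 2, 3 for tumor differentiation are an assumed choice, and the value need not match a Cox score test exactly, since software may handle tied event times differently.

```r
library(survival)

# All three arms of the colon data, death endpoint
df2 <- subset(survival::colon, etype == 2)

# k-group logrank pieces for tumor differentiation (1 = well, 2 = moderate, 3 = poor)
lr <- survdiff(Surv(time, status) ~ differ, data = df2)

scores    <- c(1, 2, 3)                  # assumed equally spaced scores
O_minus_E <- lr$obs - lr$exp

# (s'(O - E))^2 / (s' V s), referred to a chi-square with 1 df
trend_chisq <- sum(scores * O_minus_E)^2 /
  drop(t(scores) %*% lr$var %*% scores)
trend_p <- pchisq(trend_chisq, df = 1, lower.tail = FALSE)

print(c(heterogeneity_chisq = lr$chisq, trend_chisq = trend_chisq, trend_p = trend_p))
```

With only two groups and scores 0 and 1, the same expression reduces to the ordinary two-group logrank statistic, which is a useful sanity check.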

SUMMARY

• Can use logrank test to detect consistent differences (over time) in the hazard of dying (the event occurring) using censored survival data
– Can stratify on prognostic variables

• Can test for differences between more than two groups

• When alternative is ordered by prior hypothesis, can test for trend rather than heterogeneity

TO WATCH OUT FOR:

• Only ranks are used for “standard” tests
• Observations with time = 0
• Crossing hazard functions
• P-value not valid if you decide between trend and heterogeneity test after looking at the data
– Data told you what your hypothesis was

In R

Load packages.

library(survival)
library(rms)
library(survMisc)
library(foreign)

Get data.

data(colon) # in survival package
head(colon)

##   id study      rx sex age obstruct perfor adhere nodes status differ
## 1  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 2  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 3  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 4  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 5  3     1     Obs   0  71        0      0      1     7      1      2
## 6  3     1     Obs   0  71        0      0      1     7      1      2
##   extent surg node4 time etype
## 1      3    0     1 1521     2
## 2      3    0     1  968     1
## 3      3    0     0 3087     2
## 4      3    0     0 3087     1
## 5      2    0     1  963     2
## 6      2    0     1  542     1

Process data and compute survival curves.

df <- colon[colon$etype == 2,] # Use death times.
df <- df[df$rx != "Obs",] # Omit observation only arm.
Y <- with(df, Surv(time, status))
Shats <- survfit(Y ~ rx, data = df)

Plot survival curves.

colors <- c("slateblue", "goldenrod")
plot(Shats, lty = c(1,2),
     col = colors, lwd = 2,
     xlab = "Days", ylab = "Survival Probability")
legend("bottomleft", lty = c(1,2),
       col = colors, lwd = 2,
       legend = c("Lev only", "Lev + 5FU"), bty = "n")

[Figure: survival probability against days (0 to 3000) for the Lev only and Lev + 5FU arms.]

Logrank test

survdiff(Y ~ rx, data = df)

## Call:
## survdiff(formula = Y ~ rx, data = df)
##
##              N Observed Expected (O-E)^2/E (O-E)^2/V
## rx=Lev     310      161      137      4.24      8.21
## rx=Lev+5FU 304      123      147      3.95      8.21
##
##  Chisq= 8.2  on 1 degrees of freedom, p= 0.00417

Stratified logrank test

Get data.

whas100 <- read.dta("/Users/barb1/Documents/Biostat/Class/SIB/SISCR2016/Module4-Intro/whas100.dta")
Yw <- with(whas100, Surv(surv, fstat == "Dead"))

Stratum-specific test

survdiff(Yw[age_trend == 46] ~ gender[age_trend == 46], data = whas100)

## Call:
## survdiff(formula = Yw[age_trend == 46] ~ gender[age_trend ==
##     46], data = whas100)
##
## n=25, 1 observation deleted due to missingness.
##
##                                 N Observed Expected (O-E)^2/E (O-E)^2/V
## gender[age_trend == 46]=Male   20        5     6.53     0.357      1.95
## gender[age_trend == 46]=Female  5        3     1.47     1.584      1.95
##
##  Chisq= 1.9  on 1 degrees of freedom, p= 0.163

Stratum-specific test

survdiff(Yw[age_trend == 65] ~ gender[age_trend == 65], data = whas100)

## Call:
## survdiff(formula = Yw[age_trend == 65] ~ gender[age_trend ==
##     65], data = whas100)
##
## n=23, 1 observation deleted due to missingness.
##
##                                 N Observed Expected (O-E)^2/E (O-E)^2/V
## gender[age_trend == 65]=Male   17        4      5.6     0.458      2.41
## gender[age_trend == 65]=Female  6        3      1.4     1.833      2.41
##
##  Chisq= 2.4  on 1 degrees of freedom, p= 0.121

Stratum-specific test

survdiff(Yw[age_trend == 75] ~ gender[age_trend == 75], data = whas100)

## Call:
## survdiff(formula = Yw[age_trend == 75] ~ gender[age_trend ==
##     75], data = whas100)
##
## n=22, 1 observation deleted due to missingness.
##
##                                 N Observed Expected (O-E)^2/E (O-E)^2/V
## gender[age_trend == 75]=Male   15       10     9.07    0.0947     0.273
## gender[age_trend == 75]=Female  7        4     4.93    0.1743     0.273
##
##  Chisq= 0.3  on 1 degrees of freedom, p= 0.602

Stratum-specific test

survdiff(Yw[age_trend == 86] ~ gender[age_trend == 86], data = whas100)

## Call:
## survdiff(formula = Yw[age_trend == 86] ~ gender[age_trend ==
##     86], data = whas100)
##
## n=30, 1 observation deleted due to missingness.
##
##                                 N Observed Expected (O-E)^2/E (O-E)^2/V
## gender[age_trend == 86]=Male   13        9     8.83   0.00318   0.00574
## gender[age_trend == 86]=Female 17       13    13.17   0.00213   0.00574
##
##  Chisq= 0  on 1 degrees of freedom, p= 0.94

Stratified logrank test

survdiff(Yw ~ gender + strata(age_trend), data = whas100)

## Call:
## survdiff(formula = Yw ~ gender + strata(age_trend), data = whas100)
##
## n=100, 1 observation deleted due to missingness.
##
##                N Observed Expected (O-E)^2/E (O-E)^2/V
## gender=Male   65       28       30     0.138     0.402
## gender=Female 35       23       21     0.197     0.402
##
##  Chisq= 0.4  on 1 degrees of freedom, p= 0.526

Un-stratified logrank test

survdiff(Yw ~ gender, data = whas100)

## Call:
## survdiff(formula = Yw ~ gender, data = whas100)
##
## n=100, 1 observation deleted due to missingness.
##
##                N Observed Expected (O-E)^2/E (O-E)^2/V
## gender=Male   65       28     34.7      1.29      4.06
## gender=Female 35       23     16.3      2.74      4.06
##
##  Chisq= 4.1  on 1 degrees of freedom, p= 0.044

Why?

with(whas100,table(age_trend, gender))

##          gender
## age_trend Male Female
##        46   20      5
##        65   17      6
##        75   15      7
##        86   13     17

Why?

mosaicplot(age_trend ~ gender, data = whas100, dir = "v",
           col = c("slateblue", "goldenrod"), main = "Age and Gender",
           xlab = "Age Group")

[Figure: "Age and Gender": mosaic plot of gender (Male, Female) by age group (46, 65, 75, 86).]

Three group test data

df2 <- colon[colon$etype == 2,] # Use death times.
Y2 <- with(df2, Surv(time, status))
Shats3 <- survfit(Y2 ~ rx, data = df2, conf.type = "log-log")

Three group test

survdiff(Y2 ~ rx, data = df2)

## Call:
## survdiff(formula = Y2 ~ rx, data = df2)
##
##              N Observed Expected (O-E)^2/E (O-E)^2/V
## rx=Obs     315      168      148      2.58      3.85
## rx=Lev     310      161      146      1.52      2.25
## rx=Lev+5FU 304      123      157      7.55     11.62
##
##  Chisq= 11.7  on 2 degrees of freedom, p= 0.0029

Trend test function (Dan Gillen)

## survtrend() : Function to compute the Tarone trend test
##   formula     : Surv() response and covariate for trend test
##   data        : dataset containing response and predictor
##   print.table : if TRUE, prints observed and expected failures
##                 under H_0 in each group
survtrend <- function(formula, data, print.table = TRUE) {
  lrfit <- survdiff(formula, data = data)
  df <- length(lrfit$n) - 1
  score <- coxph(formula, data = data)$score
  if (print.table) {
    oetable <- cbind(lrfit$n, lrfit$obs, lrfit$exp)
    colnames(oetable) <- c("N", "Observed", "Expected")
    print(oetable)
  }
  cat("\nLogrank Test : Chi(", df, ") = ", lrfit$chisq, ", p-value = ",
      1 - pchisq(lrfit$chisq, df), sep = "")
  cat("\nTarone Test Trend : Chi(1) = ", score, ", p-value = ",
      1 - pchisq(score, 1), sep = "")
}

Trend test

Shats3t <-survfit(Y2 ~ differ, data = df2, conf.type = "log-log")colors <- c("slateblue", "goldenrod", "forestgreen")plot(Shats3t, lty = c(1:3),

col = colors, lwd = 2,xlab = "Days", ylab = "Survival Probability")

legend("bottomleft", lty = c(1:3),col = colors, lwd = 2,legend = c("Well", "Moderately", "Poorly"), bty = "n")

Trend test

[Figure: survival curves by tumor differentiation (Well, Moderately, Poorly); x-axis: Days (0 to 3000); y-axis: Survival Probability (0.0 to 1.0)]

Trend test

survtrend(Y2 ~ differ, data = df2)

##            N Observed Expected
## differ=1  93       42  47.5287
## differ=2 663      311 334.9173
## differ=3 150       88  58.5540
##
## Logrank Test : Chi(2) = 17.18909, p-value = 0.0001851124
## Tarone Test Trend : Chi(1) = 11.57379, p-value = 0.0006688778

Your turn

Using the colon cancer data in the survival package with overall survival (after loading the survival package, type ?colon for documentation):

1. Perform the logrank test of whether the hazard ratio for all-cause mortality associated with having more than 4 lymph nodes positive for cancer at diagnosis is one.

2. Perform the logrank test of whether the hazard ratio for all-cause mortality associated with having more than 4 lymph nodes positive for cancer at diagnosis is one after stratification adjustment for treatment arm.

3. Perform the logrank test of whether the all-cause-mortality hazard depends on extent of disease at diagnosis (heterogeneity test).

4. Perform the logrank test of whether the all-cause-mortality hazard is higher or lower for greater extent of disease at diagnosis (trend test).

Write a “results” sentence or two for each of these analyses.

Module 4: Introduction to Survival Analysis
Summer Institute in Statistics for Clinical Research
University of Washington
July, 2016

Barbara McKnight, Ph.D.
Professor
Department of Biostatistics
University of Washington

SESSION 4: INTRODUCTION TO COX REGRESSION

OVERVIEW

• Session 1
  – Introductory examples
  – The survival function
  – Survival distributions
  – Mean and median survival time
• Session 2
  – Censored data
  – Risk sets
  – Censoring assumptions
  – Kaplan-Meier estimator and CI
  – Median and CI
• Session 3
  – Two-group comparisons: logrank test
  – Trend and heterogeneity tests for more than two groups
• Session 4
  – Introduction to Cox regression

SISCR 2016: Module 4 Intro Survival, Barbara McKnight

OUTLINE

• Motivation:
  – Confounding and stratified randomization designs
• Cox Regression model
  – Coefficient interpretation
  – Estimation and testing
  – Relationship to 2- and K-sample tests
• Examples throughout

CONFOUNDING

• Observational data: sometimes observed associations between an explanatory variable and outcome can be due to their joint association with another variable.
  – Age related to both sex and risk of death.
  – Other examples?

PRECISION IN RCTS

• Because of randomization, confounding/imbalance is usually not an issue except in small trials.
• As in linear regression, regression models for censored survival data allow group comparisons among subjects with similar values of adjustment or “precision” variables (more later).
• Fairer and possibly more powerful comparison as long as adjustment variables are not the result of treatment.

STRATIFIED RANDOMIZATION

• For strong predictors: concern about possible randomization imbalance
  – Clinic or center
  – Stage of disease
  – Sex
  – Age
• Adjust for stratification variables in analysis
  – More powerful if predictors are strong
  – Same conditioning as the sampling
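One way to adjust for a stratification variable in the analysis is to stratify the test or the model on it. The sketch below uses the colon data from the survival package and treats sex as if it were a randomization stratum. That choice is an assumption made only for illustration, since these slides do not say which factors the trial stratified on, and the object names are arbitrary.

```r
library(survival)

# Overall survival records only (etype == 2), as elsewhere in these notes.
df2 <- subset(colon, etype == 2)

# ASSUMPTION: sex stands in for a randomization stratum (illustration only).
# Stratified logrank test: treatment arms are compared within each sex.
strat_logrank <- survdiff(Surv(time, status) ~ rx + strata(sex), data = df2)

# Stratified Cox model: each sex gets its own baseline hazard,
# while the treatment hazard ratios are shared across strata.
strat_cox <- coxph(Surv(time, status) ~ rx + strata(sex), data = df2)

strat_logrank$chisq
exp(coef(strat_cox))
```

Stratifying lets the baseline hazard differ freely between strata, which matches the "same conditioning as the sampling" idea without assuming proportional hazards for the stratification variable itself.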


COX REGRESSION MODEL

• Usually written in terms of the hazard function

• As a function of independent variables $x_1, x_2, \ldots, x_k$,

$$\lambda(t) = \lambda_0(t)\, e^{\beta_1 x_1 + \cdots + \beta_k x_k}$$

where the factor $e^{\beta_1 x_1 + \cdots + \beta_k x_k}$ is the relative risk / hazard ratio.

$$\log \lambda(t) = \log \lambda_0(t) + \beta_1 x_1 + \cdots + \beta_k x_k$$

where $\log \lambda_0(t)$ plays the role of the intercept.

RELATIVE RISK / HAZARD RATIO

$$\lambda(t \mid x_1, \ldots, x_k) = \lambda_0(t)\, e^{\beta_1 x_1 + \cdots + \beta_k x_k}$$

$$\frac{\lambda(t \mid x_1, \ldots, x_k)}{\lambda(t \mid 0, \ldots, 0)} = e^{\beta_1 x_1 + \cdots + \beta_k x_k}$$
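A small numeric sketch of why the ratio above does not depend on t. The coefficients, covariate values, and baseline hazard function below are all made-up values chosen for illustration, not quantities from the course data.

```r
# Hypothetical coefficients and covariate values (illustration only).
beta <- c(0.5, -0.2)
x    <- c(1, 3)

# Hypothetical baseline hazard; any positive function of t would do.
lambda0 <- function(t) 0.01 * t

# Hazard under the Cox model: baseline hazard times exp(linear predictor).
hazard <- function(t, x) lambda0(t) * exp(sum(beta * x))

# The ratio to the x = 0 hazard is the same at every time point.
times  <- c(1, 10, 100)
ratios <- sapply(times, function(t) hazard(t, x) / hazard(t, c(0, 0)))
ratios
exp(sum(beta * x))   # exp(0.5 * 1 - 0.2 * 3) = exp(-0.1)
```

The baseline hazard cancels in the ratio, so changing lambda0 to any other positive function leaves ratios unchanged.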

REGRESSION MODELS

LS Linear Regression: $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$

          Distribution of outcome variable    Dependence of distribution on $x_1, \ldots, x_k$
Linear:   $Y \sim N(\mu, \sigma^2)$            $\mu = EY = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
Cox:      $T \sim S(t)$                        $\lambda(t) = \lambda_0(t)\, e^{\beta_1 x_1 + \cdots + \beta_k x_k}$

PROPORTIONAL HAZARDS MODEL

EXAMPLE

Single binary $x$:

$$x = \begin{cases} 1 & \text{Test treatment} \\ 0 & \text{Standard treatment} \end{cases}$$

$$\lambda(t) = \lambda_0(t)\, e^{\beta x}$$

Interpretation of $e^{\beta}$:

"Relative risk (or hazard ratio) comparing test treatment to standard".

$\lambda(t)$ for $x = 1$: $\lambda_0(t)\, e^{\beta \cdot 1} = \lambda_0(t)\, e^{\beta}$

$\lambda(t)$ for $x = 0$: $\lambda_0(t)\, e^{\beta \cdot 0} = \lambda_0(t)$

ratio: $e^{\beta(1 - 0)} = e^{\beta}$

EXAMPLE

[Figure: two panels, "Proportional Hazards" ($\lambda(t)$ against $t$) and "Parallel Log Hazards" ($\log \lambda(t)$ against $t$)]

RELATIONSHIP TO SURVIVAL FUNCTION
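Under the proportional hazards model with a single covariate $x$, the hazard relationship carries over to the survival function through $S(t) = \exp\left(-\int_0^t \lambda(s)\, ds\right)$. The short derivation below is a sketch of this standard result rather than a transcription of the slide; $S_0(t)$, the survival function at $x = 0$, is notation assumed for this note.

```latex
\begin{aligned}
S(t \mid x) &= \exp\!\left(-\int_0^t \lambda_0(s)\, e^{\beta x}\, ds\right) \\
            &= \left[\exp\!\left(-\int_0^t \lambda_0(s)\, ds\right)\right]^{e^{\beta x}} \\
            &= S_0(t)^{\,e^{\beta x}}
\end{aligned}
```

Taking logs twice gives $\log(-\log S(t \mid x)) = \log(-\log S_0(t)) + \beta x$, so on the complementary log-log scale the survival curves for different values of $x$ are parallel when hazards are proportional.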

PICTURE

[Figure: two panels, "Hazard Function" ($\lambda(t)$ against $t$) and "Survival Function" ($S(t)$ against $t$)]

ESTIMATES AND CONFIDENCE INTERVALS

• We estimate $\beta$ by maximizing the "partial likelihood function"

• Requires iteration on computer

• $\hat\beta$ is a MPLE (Maximum Partial Likelihood Estimator)

• We do not need to estimate $\lambda_0(t)$ to do this

• Most packages will estimate $se(\hat\beta)$ using the information matrix from this PL.

• 95% CI for $\beta$: $(\hat\beta - 1.96\, se(\hat\beta),\ \hat\beta + 1.96\, se(\hat\beta))$

• 95% CI for $RR = e^{\beta}$: $(e^{\hat\beta - 1.96\, se(\hat\beta)},\ e^{\hat\beta + 1.96\, se(\hat\beta)})$
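The interval formulas above can be checked by hand against what coxph() reports. This is a sketch using the three-arm colon data from the survival package; the object names are arbitrary, and qnorm(0.975) is used in place of the rounded value 1.96.

```r
library(survival)

# Overall survival records only (etype == 2).
df2 <- subset(colon, etype == 2)
fit <- coxph(Surv(time, status) ~ rx, data = df2)

beta_hat <- coef(fit)                 # MPLEs
se_hat   <- sqrt(diag(vcov(fit)))     # from the partial-likelihood information
z        <- qnorm(0.975)              # about 1.96

# 95% CI for each beta, then exponentiate to get the CI for each hazard ratio.
ci_beta <- cbind(lower = beta_hat - z * se_hat,
                 upper = beta_hat + z * se_hat)
ci_hr   <- exp(ci_beta)

ci_hr
summary(fit)$conf.int   # the "lower .95" and "upper .95" columns should match
```

The interval is built on the log hazard ratio scale and then exponentiated, which is why it is not symmetric around the estimated hazard ratio.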

PARTIAL LIKELIHOOD

Data for the $i$th subject: $(t_i, \delta_i, x_{1i}, \ldots, x_{ki})$

For subject with the $j$th ordered failure time: $(t_{(j)}, 1, x_{1(j)}, \ldots, x_{k(j)})$

$$PL(\beta_1, \ldots, \beta_k) = \prod_{j=1}^{J} \frac{e^{\beta_1 x_{1(j)} + \cdots + \beta_k x_{k(j)}}}{\sum_{i:\, t_i \ge t_{(j)}} e^{\beta_1 x_{1i} + \cdots + \beta_k x_{ki}}}$$

• $(\hat\beta_1, \ldots, \hat\beta_k)$ are the values of $(\beta_1, \ldots, \beta_k)$ that maximize $PL(\beta_1, \ldots, \beta_k)$. (MPLEs)

• Compares $x$ values for the subject who failed at time $t_{(j)}$ to those of all subjects at risk at time $t_{(j)}$.

• Does not depend on the values of the $t_i$, only on their order.

• Does not depend on $\lambda_0(t)$.
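To make the formula concrete, the log partial likelihood for a single covariate can be computed directly and compared with the value stored by coxph(). The toy data set and the helper name log_pl are hypothetical, invented for this sketch, and the helper assumes there are no tied failure times.

```r
library(survival)

# Log partial likelihood for one covariate, assuming no tied failure times.
log_pl <- function(beta, time, status, x) {
  total <- 0
  for (j in which(status == 1)) {
    at_risk <- time >= time[j]          # risk set at this failure time
    total <- total + beta * x[j] - log(sum(exp(beta * x[at_risk])))
  }
  total
}

# Hypothetical toy data (not from the course): 6 subjects, no ties.
toy <- data.frame(time   = c(2, 3, 4, 5, 7, 8),
                  status = c(1, 1, 0, 1, 0, 1),
                  x      = c(1, 0, 1, 0, 0, 1))

fit <- coxph(Surv(time, status) ~ x, data = toy)

# Value at the MPLE, by hand and as stored by coxph().
log_pl(unname(coef(fit)), toy$time, toy$status, toy$x)
fit$loglik[2]

# Value at beta = 0: each failure contributes -log(size of its risk set).
log_pl(0, toy$time, toy$status, toy$x)
-log(6 * 5 * 3 * 1)
```

Only the ordering of the times enters through the risk sets, which is the sense in which the partial likelihood does not depend on the actual values of the $t_i$.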

RISK SET PICTURE

[Figure: "Risk Sets and Treatment". Six subjects plotted against survival time (0 to 8), each labeled with a treatment indicator x (1, 1, 0, 0, 0, 1) and a status code (D, D, L, A, D, D). Annotations at the failure times read "1 vs 0.5", "0 vs 0.5", "1 vs 0.67", "1 vs 0.5".]

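FULL LIKELIHOOD

$$L(\beta, \lambda_0(t)) = \prod_{\text{Failures}} \Pr[T = t_i] \prod_{\text{Censorings}} \Pr[T > t_i]$$

$$= \prod_{\text{Failures}} \lambda(t_i \mid x_i)\, S(t_i \mid x_i) \prod_{\text{Censorings}} S(t_i \mid x_i)$$

$$= \prod_{i=1}^{n} [\lambda(t_i \mid x_i)]^{\delta_i}\, S(t_i \mid x_i)$$

$$= \prod_{i=1}^{n} [\lambda_0(t_i)\, e^{\beta x_i}]^{\delta_i}\, e^{-\int_0^{t_i} \lambda_0(s)\, e^{\beta x_i}\, ds}$$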

PARTIAL LIKELIHOOD

Let $H_t$ represent the entire history of failure, censoring and $x$ in the sample before time $t$.

Then the likelihood can be rewritten as follows:

$$L(\beta, \lambda_0(t)) = \prod_{j=1}^{J} \Pr[i\text{th subject fails at } t_{(j)} \mid H_{t_{(j)}}, \text{some subject fails at } t_{(j)}] \cdot \Pr[H_{t_{(j)}}, \text{some subject fails at } t_{(j)}]$$

$$= \prod_{j=1}^{J} \frac{\lambda(t_{(j)} \mid x_{(j)})}{\sum_{i:\, t_i \ge t_{(j)}} \lambda(t_{(j)} \mid x_i)} \cdot \prod_{j=1}^{J} \Pr[H_{t_{(j)}}, \text{some subject fails at } t_{(j)}]$$

$$= \prod_{j=1}^{J} \frac{\lambda_0(t_{(j)})\, e^{\beta x_{(j)}}}{\sum_{i:\, t_i \ge t_{(j)}} \lambda_0(t_{(j)})\, e^{\beta x_i}} \cdot \prod_{j=1}^{J} \Pr[H_{t_{(j)}}, \text{some subject fails at } t_{(j)}]$$

$$= \underbrace{\prod_{j=1}^{J} \frac{e^{\beta x_{(j)}}}{\sum_{i:\, t_i \ge t_{(j)}} e^{\beta x_i}}}_{\text{Partial Likelihood: depends only on } \beta} \cdot \underbrace{\prod_{j=1}^{J} \Pr[H_{t_{(j)}}, \text{some subject fails at } t_{(j)}]}_{\text{Depends on } \lambda_0(\cdot) \text{ and } \beta}$$

HYPOTHESIS TESTS

Three tests of $H_0: \beta = 0$ are possible:

1. Wald test: $\hat\beta / se(\hat\beta)$

2. (Partial) Likelihood ratio test

3. Score test: ($\approx$ logrank test)

Likelihood ratio test is best, but requires fitting full ($\beta = \hat\beta$) and reduced ($\beta = 0$) models.
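All three tests can be pulled out of a fitted coxph object. The sketch below uses the three-arm colon data from the survival package; the object names are arbitrary, and the comparison with survdiff() is included only to show how close the score and logrank statistics are in this example.

```r
library(survival)

# Overall survival records only (etype == 2).
df2 <- subset(colon, etype == 2)
fit <- coxph(Surv(time, status) ~ rx, data = df2)
s   <- summary(fit)

# Each element holds the test statistic, degrees of freedom, and p-value.
s$waldtest
s$logtest    # (partial) likelihood ratio test
s$sctest     # score test

# Likelihood ratio statistic by hand: twice the gain in log partial
# likelihood from the reduced (beta = 0) model to the full model.
lrt <- 2 * (fit$loglik[2] - fit$loglik[1])

# Logrank statistic from survdiff(), for comparison with the score test.
logrank <- survdiff(Surv(time, status) ~ rx, data = df2)$chisq

c(lrt = lrt, score = unname(s$sctest["test"]), logrank = logrank)
```

coxph() stores the log partial likelihood at the starting value (zero by default) and at the MPLE in fit$loglik, so the reduced model does not need a separate fit when the null hypothesis is that all coefficients are zero.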

LIKELIHOODS AND TESTS

[Figure: "Log Likelihood Function": log likelihood plotted against $\beta$, with $\hat\beta$ and 0 marked on the axis and annotations "Likelihood Ratio Test", "Slope = Score", and "Wald test"]

COLON CANCER EXAMPLE

• Levamisole and Fluorouracil for adjuvant therapy of resected colon carcinoma
  – Moertel et al. New England Journal of Medicine. 1990;322(6):352–358.
  – Moertel et al. Annals of Internal Medicine. 1995;122(5):321–326.
• 1296 patients
• Stage B2 or C
• 3 unblinded treatment groups
  – Observation only
  – Levamisole (oral, 1 yr)
  – Levamisole (oral, 1 yr) + 5 fluorouracil (intravenous 1 yr)

COLON CANCER EXAMPLE

• Clinical trial at Mayo Clinic
• Stage B2 and C colon cancer patients; adjuvant therapy
• Three arms
  – Observation only
  – Levamisole
  – 5-FU + Levamisole at Mayo Clinic
• Stage C patients only
• Two treatment arms only

COLON CANCER EXAMPLE

[Figure: "Complementary log-log Transformation": survival curves for the Lev and Lev+5FU arms; x-axis: Days from Diagnosis (0 to 3000); y-axis: Survival Probability (0.0 to 1.0)]

COLON CANCER EXAMPLE

Variable           n     Deaths   Hazard ratio      CI             P-value
Levamisole Only    310   161      1.0 (reference)   --             --
Levamisole + 5FU   304   123      0.71              (0.56, 0.90)   .004

Q: Which group has better survival?
A:

TEST COMPARISON

Test               Statistic   P-value
Wald's             8.13        .004
Score              8.21        .004
Likelihood Ratio   8.21        .004

Two-sided tests

ANOTHER EXAMPLE

Three groups: use indicators for two

$$x_1 = \begin{cases} 1 & \text{Levamisole Only} \\ 0 & \text{otherwise} \end{cases} \qquad x_2 = \begin{cases} 1 & \text{Levamisole + 5FU} \\ 0 & \text{otherwise} \end{cases}$$

Model: $\lambda(t) = \lambda_0(t)\, e^{\beta_1 x_1 + \beta_2 x_2}$

RRs:
Levamisole Only vs. Observation: $e^{\beta_1}$
Levamisole + 5FU vs. Observation: $e^{\beta_2}$
Levamisole + 5FU vs. Levamisole Only: $e^{\beta_2 - \beta_1}$
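The third contrast is not printed by default but follows directly from the fitted coefficients. A sketch with the colon data from the survival package, where the names rx_lev_ref and fit_lev are made up for this example:

```r
library(survival)

# Overall survival records only (etype == 2).
df2 <- subset(colon, etype == 2)
fit <- coxph(Surv(time, status) ~ rx, data = df2)   # reference level: Obs
b   <- coef(fit)

# Hazard ratios against observation only: exp(beta1) and exp(beta2).
exp(b)

# Levamisole + 5FU versus Levamisole only: exp(beta2 - beta1).
hr_combo_vs_lev <- unname(exp(b["rxLev+5FU"] - b["rxLev"]))

# Same contrast obtained by refitting with Levamisole as the reference level.
df2$rx_lev_ref <- relevel(df2$rx, ref = "Lev")
fit_lev <- coxph(Surv(time, status) ~ rx_lev_ref, data = df2)

c(by_difference = hr_combo_vs_lev,
  by_releveling = unname(exp(coef(fit_lev)["rx_lev_refLev+5FU"])))
```

Refitting with a different reference level is the easier route when a confidence interval for the contrast is also needed, since the standard error of $\hat\beta_2 - \hat\beta_1$ involves the covariance between the two estimates.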

HEURISTIC HAZARDS

[Figure: two panels, "Proportional Hazards" ($\lambda(t)$ against $t$) and "Parallel Log Hazards" ($\log(\lambda(t))$ against $t$)]

COLON CANCER

Variable           n     Deaths   Hazard Ratio      95% CI         P-value
Observation Only   315   168      1.0 (reference)   --             --
Levamisole Only    310   161      0.97              (0.78, 1.21)   0.81
Levamisole + 5FU   304   123      0.69              (0.55, 0.87)   0.002

Q: Which group has best survival?
A:

TEST COMPARISON

Test               Statistic   P-value
Wald's             11.56       .003
Score              11.68       .003
Likelihood Ratio   12.15       .002

Same hypothesis as 3-group heterogeneity test. Score test is same in large samples.

COLON CANCER TRIAL DATA

[Figure: "Colon Cancer Trial: All Three Groups": survival curves for the Obs, Lev, and Lev+5FU arms; x-axis: Days from Diagnosis (0 to 3000); y-axis: Survival Probability (0.0 to 1.0)]

TREND

• When there are several groups, it is sometimes of interest to test whether risk increases from one group to the next:
  – Several dose groups
  – Other ordered variable
  – Example: tumor differentiation

• For
$$x = \begin{cases} 1 & \text{well differentiated} \\ 2 & \text{moderately differentiated} \\ 3 & \text{poorly differentiated} \end{cases}$$
Model: $\lambda(t) = \lambda_0(t)\, e^{\beta x}$

• Score test is the same as the trend test

• Could use other values for $x$ (actual dose levels)

TREND

For
$$x = \begin{cases} 1 & \text{well differentiated} \\ 2 & \text{moderately differentiated} \\ 3 & \text{poorly differentiated} \end{cases}$$

Model: $\lambda(t) = \lambda_0(t)\, e^{\beta x}$

Interpretation of $e^{\beta}$: HR associated with the comparison of one worse differentiation group to one better:

• poorly differentiated to moderately differentiated, or

• moderately differentiated to well differentiated

Q: What is HR comparing poorly differentiated to well differentiated?

A:

TREND

[Figure: two panels with curves for the Well, Moderately, and Poor groups: "Proportional Hazards" ($\lambda(t)$ against $t$) and "Parallel Log Hazards" ($\log(\lambda(t))$ against $t$)]

TREND WITH DIFFERENTIATION

One presentation based entirely on trend (“grouped linear”) model:

                                                              Hazard Ratio   95% CI
One category worse differentiation (well, moderately, poor)   1.4            (1.1, 1.8)

P = .003 (trend)

I prefer presenting hazard ratios and CI’s based on dummy variable model, and providing P-value for trend.

TREND WITH DIFFERENTIATION

My preferred presentation based on dummy variable model with trend P-value:

                            n     Deaths   Hazard Ratio      95% CI
Well differentiated         66    26       1.0 (reference)   --
Moderately differentiated   434   196      1.2               (0.80, 1.8)
Poorly differentiated       98    54       1.8               (1.2, 3.0)

P = .003 (trend)

I usually would not present this for an a priori trend hypothesis, but for comparison here, the heterogeneity P-value (2 df) is 0.009.

TO WATCH OUT FOR:

• Coefficients in Cox regression are positively associated with risk, not survival.
  – Positive β means large values of x are associated with shorter survival.
• Without certain types of time-dependent covariates, Cox regression does not depend on the actual times, just their order.
  – Can add a constant to all times to remove zeros (some packages remove observations with time = 0) without changing inference
• For LRT, nested models must be compared based on same subjects.
  – If some values of variables in larger model are missing, these subjects must be removed from fit of smaller model.
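The last two cautions can be checked directly. This sketch uses the colon data from the survival package; the shift of 100 days and the object names are arbitrary choices for illustration.

```r
library(survival)

# Overall survival records only (etype == 2).
df2 <- subset(colon, etype == 2)

# 1. Shifting every time by a constant keeps the ordering, so the fit is unchanged.
fit_orig    <- coxph(Surv(time, status) ~ rx, data = df2)
fit_shifted <- coxph(Surv(time + 100, status) ~ rx, data = df2)
all.equal(coef(fit_orig), coef(fit_shifted))

# 2. differ has missing values, so restrict both nested models to the
#    subjects with differ observed before comparing them.
cc    <- df2[!is.na(df2$differ), ]
small <- coxph(Surv(time, status) ~ rx, data = cc)
large <- coxph(Surv(time, status) ~ rx + factor(differ), data = cc)

c(small = small$n, large = large$n)   # same subjects in both fits
anova(small, large)                   # likelihood ratio test
```

Fitting the smaller model on the full data instead would compare log partial likelihoods computed on different sets of subjects, which makes the likelihood ratio statistic meaningless.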

In R

Load packages.

library(survival)
library(rms)
library(survMisc)
library(foreign)

Get data.

data(colon) # in survival package
head(colon)

##   id study      rx sex age obstruct perfor adhere nodes status differ
## 1  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 2  1     1 Lev+5FU   1  43        0      0      0     5      1      2
## 3  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 4  2     1 Lev+5FU   1  63        0      0      0     1      0      2
## 5  3     1     Obs   0  71        0      0      1     7      1      2
## 6  3     1     Obs   0  71        0      0      1     7      1      2
##   extent surg node4 time etype
## 1      3    0     1 1521     2
## 2      3    0     1  968     1
## 3      3    0     0 3087     2
## 4      3    0     0 3087     1
## 5      2    0     1  963     2
## 6      2    0     1  542     1

Process data and compute survival curves.

df <- colon[colon$etype == 2,] # Use death times.
df <- df[df$rx != "Obs",] # Omit observation only arm.
Y <- with(df, Surv(time, status))
Shats <- survfit(Y ~ rx, data = df)

Plot survival curves.

[Figure: survival curves for the Lev only and Lev + 5FU arms; x-axis: Days (0 to 3000); y-axis: Survival Probability (0.0 to 1.0)]

Fit Cox model

model1 <- coxph(Y ~ rx, data = df)

## Warning in coxph(Y ~ rx, data = df): X matrix deemed to be singular;
## variable 2

Results

summary(model1)

## Call:
## coxph(formula = Y ~ rx, data = df)
##
##   n= 614, number of events= 284
##
##             coef exp(coef) se(coef)     z Pr(>|z|)
## rxLev     0.3417    1.4073   0.1199 2.851  0.00436 **
## rxLev+5FU     NA        NA   0.0000    NA       NA
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##           exp(coef) exp(-coef) lower .95 upper .95
## rxLev         1.407     0.7106     1.113      1.78
## rxLev+5FU        NA         NA        NA        NA
##
## Concordance= 0.541  (se = 0.015 )
## Rsquare= 0.013   (max possible= 0.996 )
## Likelihood ratio test= 8.21  on 1 df,   p=0.00416
## Wald test            = 8.13  on 1 df,   p=0.00436
## Score (logrank) test = 8.21  on 1 df,   p=0.004174
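The singular-matrix warning and the NA row arise because rx still carries the unused "Obs" level after the observation arm is filtered out, so the reported rxLev coefficient compares Levamisole only with Levamisole + 5FU. A possible variation, not part of the original course code, is to drop the unused level so that Levamisole only is the reference; the names df_alt and model1_alt are made up for this sketch.

```r
library(survival)

# Death records for the two active arms only.
df_alt <- subset(colon, etype == 2 & rx != "Obs")
df_alt$rx <- droplevels(df_alt$rx)     # levels are now "Lev" and "Lev+5FU"

# With two levels there is a single coefficient and no singularity warning.
model1_alt <- coxph(Surv(time, status) ~ rx, data = df_alt)

exp(coef(model1_alt))   # hazard ratio, Lev+5FU relative to Lev only
```

The resulting hazard ratio is the reciprocal of the 1.407 shown above, that is, the exp(-coef) value of 0.7106, which is the 0.71 reported on the slide.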

Data with All Three Groups

df2 <- colon[colon$etype == 2,] # Use death times.
Y2 <- with(df2, Surv(time, status))

Dummy variable model

model2 <- coxph(Y2 ~ rx, data = df2)

Summary

summary(model2)

## Call:
## coxph(formula = Y2 ~ rx, data = df2)
##
##   n= 929, number of events= 452
##
##               coef exp(coef) se(coef)      z Pr(>|z|)
## rxLev     -0.02664   0.97371  0.11030 -0.241  0.80917
## rxLev+5FU -0.37171   0.68955  0.11875 -3.130  0.00175 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##           exp(coef) exp(-coef) lower .95 upper .95
## rxLev        0.9737      1.027    0.7844    1.2087
## rxLev+5FU    0.6896      1.450    0.5464    0.8703
##
## Concordance= 0.536  (se = 0.013 )
## Rsquare= 0.013   (max possible= 0.998 )
## Likelihood ratio test= 12.15  on 2 df,   p=0.002302
## Wald test            = 11.56  on 2 df,   p=0.003092
## Score (logrank) test = 11.68  on 2 df,   p=0.002906

Trend model

model3 <- coxph(Y2 ~ differ, data = df2)

Summary

summary(model3)

## Call:
## coxph(formula = Y2 ~ differ, data = df2)
##
##   n= 906, number of events= 441
##    (23 observations deleted due to missingness)
##
##           coef exp(coef) se(coef)     z Pr(>|z|)
## differ 0.32788   1.38803  0.09618 3.409 0.000651 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##        exp(coef) exp(-coef) lower .95 upper .95
## differ     1.388     0.7204      1.15     1.676
##
## Concordance= 0.544  (se = 0.011 )
## Rsquare= 0.013   (max possible= 0.998 )
## Likelihood ratio test= 11.51  on 1 df,   p=0.0006916
## Wald test            = 11.62  on 1 df,   p=0.0006515
## Score (logrank) test = 11.57  on 1 df,   p=0.0006689

Dummy Variables for Differentiation

model4 <- coxph(Y2 ~ factor(differ), data = df2)

Summary

summary(model4)

## Call:
## coxph(formula = Y2 ~ factor(differ), data = df2)
##
##   n= 906, number of events= 441
##    (23 observations deleted due to missingness)
##
##                    coef exp(coef) se(coef)     z Pr(>|z|)
## factor(differ)2 0.04963   1.05088  0.16441 0.302  0.76275
## factor(differ)3 0.53196   1.70226  0.18764 2.835  0.00458 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##                 exp(coef) exp(-coef) lower .95 upper .95
## factor(differ)2     1.051     0.9516    0.7614     1.450
## factor(differ)3     1.702     0.5875    1.1784     2.459
##
## Concordance= 0.544  (se = 0.011 )
## Rsquare= 0.017   (max possible= 0.998 )
## Likelihood ratio test= 15.25  on 2 df,   p=0.0004872
## Wald test            = 16.85  on 2 df,   p=0.0002195
## Score (logrank) test = 17.19  on 2 df,   p=0.0001855

Your turn

Using all-cause mortality as the outcome for the colon data in the survival package in R:

1. Fit a Cox model with a binary indicator for whether more than 4 lymph nodes were positive for disease at diagnosis, to assess whether it is related to the hazard of all-cause mortality.

2. Fit a Cox model with dummy-variable indicators for extent of disease at diagnosis, to assess whether it is related to the hazard of all-cause mortality.

3. Fit a Cox model with a “grouped-linear” measure of extent of disease at diagnosis, to assess how it is related to the hazard of all-cause mortality.

Write a “results” sentence or two for each of these analyses.