2/20/2012
Brook I. Martin, PhD, MPH
Dartmouth College
Introduction to Survival Analysis
About me
• Health services PhD from University of Washington.
• Health services faculty at Dartmouth College.
• Affiliated with Dartmouth‐Hitchcock Medical Center Department of Orthopaedics.
• Primary research interest is in the quality of care for musculoskeletal & spinal problems.
Overall goals
Review framework for choosing an analysis
1) Introduction to survival analysis.
2) Descriptive analysis of survival data.
3) Applied introduction to Cox‐Proportional Hazard regression models.
   1) Modeling an outcome
   2) Diagnostic tests of PH assumptions
Data types
Type                    Example
Continuous              Age; annual salary; WBC
Dichotomous             Infection (yes/no); Death (yes/no)
Count                   Number (rates) of surgical procedures
Survival                Number of days until death or other event
Categorical (Ordinal)   Health rating: 1 = Excellent; 2 = Very good; 3 = Good; 4 = Fair; 5 = Poor
Categorical (Nominal)   Insurance: 1 = Medicare/aid; 2 = Private; 4 = HMO; 5 = Other
Analysis framework
Dependent variable      Univariable analysis (e.g. descriptive)   Bivariable analysis                            Multivariable analysis
Continuous              Means; T‐test                             Correlation; T‐test                            Analysis of covariance
Dichotomous             Proportion                                Chi‐square                                     Logistic regression
Count                   Incidence & prevalence                    Rate‐difference or ratio                       Poisson regression
Survival                Kaplan‐Meier survival                     Log‐rank; Wilcoxon (other) for survival data   Cox‐proportional hazard regression
Categorical (Ordinal)   Proportion; Wilcoxon signed rank          Spearman’s test; Mann‐Whitney test             Ordered logistic regression
Categorical (Nominal)   Proportion                                Chi‐square; Mantel‐Haenszel test               Multinomial logistic regression
Examples
Martin BI, Mirza SK, Comstock BA, Gray DT, Kreuter W, Deyo RA. Reoperation rates following lumbar spine surgery and the influence of spinal fusion procedures. Spine (Phila Pa 1976). 2007 Feb 1;32(3):382‐7.
Bederman SS, Kreder HJ, Weller I, Finkelstein JA, Ford MH, Yee AJ. The who, what and when of surgery for the degenerative lumbar spine: a population‐based study of surgeon factors, surgical procedures, recent trends and reoperation rates. Can J Surg. 2009 Aug;52(4):283‐290.
Part I: Survival Analysis
1) Introduction to survival analysis
A. What is survival analysis
B. Goals of survival analysis
C. Survival time and censored data
D. Calculating survival
E. Why use survival analysis
What is survival analysis?
• Statistical procedures used when the outcome of interest is time until an event occurs.
• Example: Time until death; time until repeat surgery.
• Definitions:
  Time = survival time (days/years)
  Event = indication of failure
  Censoring = when the survival time is not fully known (e.g., follow‐up ends before the event occurs).
Goals of survival analysis
• To estimate and interpret survivor and/or hazard functions.
• To compare survivor and/or hazard functions between groups of interest.
• To assess the relationship of explanatory variables to survival time (Cox‐Proportional Hazard regression).
Survival data:
[Figure: follow‐up timelines for study subjects A–E on a time axis of 0–12 weeks, each labeled by whether follow‐up ended in the event (Yes: B, E) or in censoring (No: A, C, D).]
Survival data:
ID   Time   Event (δ)   Censor
A    12     0           1
B    8      1           0
C    8      0           1
D    4      0           1
E    6      1           0
…
n    tn     δn
δ = (0,1): 1 if failure, 0 if censored
T = random variable for survival time (T ≥ 0)
t = specific value for T
Survival > 8 weeks? H0: T > t = 8
Survivor function:
Survivor function S(t): the probability that a subject survives longer than a specific time t.
• Never increases as t increases
• At t = 0, S(t) = 1
• At t = ∞, S(t) = 0
• In practice represented by a step curve.
[Figure: S(t) plotted against t, starting at S(0) = 1 and falling toward S(∞) = 0.]
Hazard function:
The limit, as Δt approaches 0, of the probability that the event occurs in the interval [t, t + Δt), given survival to time t, divided by Δt:
h(t) = lim(Δt→0) P(t ≤ T < t + Δt | T ≥ t) / Δt
h(t) ≥ 0; has no upper limit.
View the hazard as an “instantaneous” rate of the event per unit of time, given that the subject has survived to time t.
S(t): focuses on not failing
h(t): focuses on failing
[Figure: h(t) plotted against t; the vertical axis starts at 0 and has no upper bound.]
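The survivor and hazard functions describe the same distribution from two angles. As a sketch using standard identities that are not on the slides (assuming T is continuous with density f(t), a symbol introduced here), either function can be recovered from the other:

```latex
h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\,\ln S(t),
\qquad
S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)

% Example: a constant hazard h(t) = \lambda gives an exponential survivor curve
S(t) = e^{-\lambda t}, \qquad S(0) = 1, \qquad \lim_{t \to \infty} S(t) = 0
```

The constant‐hazard case is only an illustration; real hazards usually change with time, which is why the hazard is estimated from the data.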
Calculating survival:
Observed data
ID   T    δ   cen
A    12   0   1
B    8    1   0
C    8    0   1
D    4    0   1
E    6    1   0
F    10   1   0
G    10   0   1
H    9    0   1
I    8    1   0
J    4    1   0
K    5    0   1

Data ordered by survival time
ID   T    δ   cen
‐‐   0    0   0
D    4    0   1
J    4    1   0
K    5    0   1
E    6    1   0
B    8    1   0
C    8    0   1
I    8    1   0
H    9    0   1
F    10   1   0
G    10   0   1
A    12   0   1

Table of ordered failure times (# cen = number censored from t(j) up to the next failure time)
t(j)    # fail   # cen   # at risk
t(0)    0        0       11
t(4)    1        2       11
t(6)    1        0       8
t(8)    2        2       7
t(10)   1        2       3

Average survival time (ignoring censoring) = ∑ ti / n = 84/11 = 7.6 weeks
Average hazard rate = # fail / ∑ ti = 5/84 = 0.06 per week
Calculating survival:
General Kaplan‐Meier survival formula:
S(t(j)) = S(t(j‐1)) * P(T > t(j) | T ≥ t(j))
Table of ordered failure times
t(j)    # fail   # cen   # at risk   S(t)   Calculation
t(0)    0        0       11          1.00
t(4)    1        2       11          0.91   (1.00 * 10/11)
t(6)    1        0       8           0.80   (0.91 * 7/8)
t(8)    2        2       7           0.57   (0.80 * 5/7)
t(10)   1        2       3           0.38   (0.57 * 2/3)
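Unrolling the recursion, the Kaplan‐Meier estimate at each ordered failure time is a running product of conditional survival probabilities. The sketch below writes this out with d_i failures among n_i subjects at risk at the i‐th failure time (these two symbols are introduced here and are not the slides' notation), then applies it to the table at 10 weeks:

```latex
\hat S\!\left(t_{(j)}\right) = \prod_{i \le j} \left(1 - \frac{d_i}{n_i}\right)

\hat S(10) = \frac{10}{11} \cdot \frac{7}{8} \cdot \frac{5}{7} \cdot \frac{2}{3}
           = \frac{25}{66} \approx 0.38
```

Censored subjects never appear as failures in the product; they only shrink the number at risk at later failure times.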
Part 2: Describing survival data
Examples in Stata using:
stset
stdescribe
sts list
sts graph
sts test
Example
Data source: Washington State Comprehensive Hospital Abstract Reporting System (CHARS).
Design: Retrospective analysis of administrative data.
Patients: Patients who underwent a lumbar herniated disc surgery.
Primary outcome: Time until the first instance of a repeat surgery (“reoperation”)
Independent variables: Age, sex, comorbidity, previous surgery, Insurance.
Citation: Martin BI, 2007. Spine.
Analysis using survival data
First, use stset to tell Stata that you are working with survival data.
stset reoper, failure(resurgmax)
reoper = name of time variable; resurgmax = name of event indicator variable
Univariate analysis
Describe basic survival data
stdescribe
Analysis using survival data
Use sts list to list survival data.
sts list, by(PAYGRP) compare
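To see what sts list reports, one option is to key in the 11‐subject example from Part I and compare Stata's output with the hand calculation. This is a minimal sketch: the variable names id, time and fail are made up for the illustration and are not part of the CHARS data.

```stata
* Toy data from the 11-subject example (time in weeks; fail = 1 if event, 0 if censored)
clear
input str1 id time fail
"A" 12 0
"B"  8 1
"C"  8 0
"D"  4 0
"E"  6 1
"F" 10 1
"G" 10 0
"H"  9 0
"I"  8 1
"J"  4 1
"K"  5 0
end

* Declare the data to be survival-time data
stset time, failure(fail)

* Kaplan-Meier survivor function at each distinct time
sts list

* Total time at risk and incidence rate (failures per week of follow-up)
stsum
```

If the arithmetic in Part I is right, the survivor function should be about 0.91, 0.80, 0.57 and 0.38 at weeks 4, 6, 8 and 10, and stsum should show 84 weeks at risk with an incidence rate near 0.06.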
Analysis using survival data
Use sts graph to examine the Kaplan‐Meier survival curve.
sts graph
Analysis using survival data
sts graph, by(PAYGRP) risktable xlabel(0(2)12) scheme(s2mono)
Analysis using survival data
sts graph, by(PAYGRP) xlabel(0(2)12) scheme(s2mono) failure
Analysis using survival data
sts graph, by(PAYGRP) xlabel(0(2)12) scheme(s2mono) hazard
Bivariate testing
Test for a difference between insurance groups using log‐rank test:
sts test PAYGRP
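The CHARS data are not distributed with these slides, so the sketch below runs the same kind of test on the cancer example dataset that ships with Stata. The variables studytime, died and drug belong to that dataset, not to the reoperation study.

```stata
* Illustration only: Stata's shipped cancer dataset stands in for the CHARS data
sysuse cancer, clear
stset studytime, failure(died)

* Log-rank test of equal survivor functions across drug groups (the default)
sts test drug

* Wilcoxon (Breslow) version, which gives more weight to early failures
sts test drug, wilcoxon
```

A small p‐value from either test suggests that the survivor functions differ between groups; the two tests can disagree when the curves separate early but converge later, or the reverse.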
Part 3: Cox Proportional Hazard Regression Analysis
• Multivariable regression for time‐to‐event data.
• Testing PH‐model assumption
Cox Proportional Hazard Model:
General form
h(t, X) = h0(t) exp[β1X1 + … + βnXn]
Hazard Ratio:
The measure of effect used in survival analysis that describes the association between an exposure and an outcome.
Calculated as the hazard for one set of characteristics (exposed) compared to an alternative set of characteristics (unexposed).
HR = ĥ(t, X*) / ĥ(t, X) = exp[∑ βi(X*i − Xi)]
Typically expressed as the exponential of a regression coefficient:
HR = 1: no association
HR > 1: exposure associated with greater risk of outcome
HR < 1: exposure associated with lower risk of outcome
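The reason the hazard ratio carries no t can be seen by dividing the general form h(t, X) = h0(t) exp[β1X1 + … + βnXn] for two covariate patterns: the baseline hazard cancels. The worked line at the end uses a hypothetical coefficient of 0.5 chosen only for illustration.

```latex
\mathrm{HR}
= \frac{h_0(t)\,\exp\!\left[\sum_i \beta_i X_i^{*}\right]}
       {h_0(t)\,\exp\!\left[\sum_i \beta_i X_i\right]}
= \exp\!\left[\sum_i \beta_i \left(X_i^{*} - X_i\right)\right]

% Single binary exposure (X^* = 1, X = 0) with a hypothetical \beta = 0.5:
\mathrm{HR} = e^{0.5} \approx 1.65
```

Because h0(t) cancels, the model never has to specify the shape of the baseline hazard in order to estimate the ratio.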
Cox‐Proportional Hazard Regression
Stata command for Cox‐PH Regression:
stcox i.PAYGRP female i.AGE4 positive_quan
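As a rough illustration of how the output relates to the model, the sketch below fits a Cox model on Stata's shipped cancer dataset (an assumption made for reproducibility; its variables drug and age are not the slides' covariates) and shows both the hazard‐ratio and the coefficient scale.

```stata
* Illustration only: Stata's shipped cancer dataset stands in for the CHARS data
sysuse cancer, clear
stset studytime, failure(died)

* Default output: hazard ratios, i.e. exp(coefficient)
stcox i.drug age

* Same model reported as coefficients (log hazard ratios)
stcox i.drug age, nohr

* Recover the hazard ratio for a one-year increase in age by hand
display exp(_b[age])
```

The value printed by display should match the hazard ratio shown for age in the first table, since stcox only exponentiates the same coefficient.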
PH assumption
The requirement for the HR to be constant over time…
…or, that the hazard for one exposure is proportional (not dependent on time) to the hazard for an alternative exposure.
ĥ(t, X*) = constant * ĥ(t, X)
Can be evaluated
• Graphically
• Through GOF statistics
[Figure: hazard curves h(t, X) for X = 1 and X = 0 plotted over 0–12 weeks; the curves cross.]
HR at 2 weeks: ĥ(t=2, X = 1) / ĥ(t=2, X = 0) < 1
HR at 10 weeks: ĥ(t=10, X = 1) / ĥ(t=10, X = 0) > 1
Thus, the HR is not constant over time, the PH assumption is not met, and Cox‐PH regression is not appropriate.
Graphically examine PH assumption
Log‐Log survival curves:
stphplot, by(female) adjust(positive_quan PAYGRP AGE4)
Graphically examine PH assumption
Observed (K‐M) versus expected survival (Cox PH) plot:
stcoxkm, by(PAYGRP)
PH assumption
Analysis of Schoenfeld’s residuals (fits a smooth function of time to the residuals)
estat phtest
estat phtest, detail
estat phtest, plot(age)
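The tests above are run after a Cox model has been fitted. A minimal sketch of the full sequence, again assuming Stata's shipped cancer dataset rather than the CHARS data:

```stata
* Illustration only: Stata's shipped cancer dataset stands in for the CHARS data
sysuse cancer, clear
stset studytime, failure(died)
stcox i.drug age

* Global test of the proportional-hazards assumption
estat phtest

* One test per covariate, followed by the global test
estat phtest, detail

* Scaled Schoenfeld residuals for age plotted against time, with a smoothed curve
estat phtest, plot(age)
```

A small p‐value, or a smoothed curve with a clear slope, points to a covariate whose effect changes over time; a roughly flat curve is consistent with proportional hazards.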
When PH‐assumption is not met
• Model separately for groups of interest.
• Stratified Cox‐PH regression.
• Include time‐varying covariates.
Separate models for each stratum
Report separate survival models for groups of interest
stcox female i.AGE4 positive_quan if PAYGRP == 0
stcox female i.AGE4 positive_quan if PAYGRP == 1
stcox female i.AGE4 positive_quan if PAYGRP == 2
Stratified Cox‐PH model
Separately report survival for groups of interest using stratified Cox‐PH regression.
For stratum 0: h(t, X) = h00(t) exp[β1X1 + … + βnXn]
For stratum 1: h(t, X) = h01(t) exp[β1X1 + … + βnXn]
For stratum 2: h(t, X) = h02(t) exp[β1X1 + … + βnXn]
For stratum 3: h(t, X) = h03(t) exp[β1X1 + … + βnXn]
stcox female i.AGE4 positive_quan, strata(PAYGRP)
Coefficients are for variables that meet the PH assumption, adjusted for the stratification variable; it is not possible to obtain an HR for the effect of PAYGRP.
Use time‐varying coefficients
Original model assumes no interaction with time:
h(t, X) = h0(t) exp[β1X1 + … + βnXn]
Time‐varying model allows the effect of a covariate Xj to vary with time:
h*(t, X) = h0(t) exp[β1X1 + … + βnXn + δjXj(t)]
stcox PAYGRP female i.AGE4 positive_quan, tvc(AGE4) texp(year5)
Split data by time
stset time, failure(resurgmax) id(id)
stsplit cat, at(4, 8)
gen year4 = PAYGRP*(cat==4)
gen year8 = PAYGRP*(cat==8)
stcox PAYGRP female i.AGE4 positive_quan year4 year8
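The three remedies can be tried side by side on a small dataset. The sketch below is one possible setup, assuming Stata's shipped cancer dataset; the variables id, treated, period and treated_late are created here for illustration, and the split point of 10 months is arbitrary.

```stata
* Illustration only: Stata's shipped cancer dataset stands in for the CHARS data
sysuse cancer, clear
* stsplit needs a subject identifier declared in stset
gen id = _n
stset studytime, failure(died) id(id)

* 1) Stratified model: a separate baseline hazard per drug group, common age effect
stcox age, strata(drug)

* 2) Time-varying coefficient: the age effect may change with log analysis time
stcox i.drug age, tvc(age) texp(ln(_t))

* 3) Split follow-up at 10 months and let the treatment effect differ afterwards
*    (drug == 1 is placebo in this dataset)
gen byte treated = drug > 1
stsplit period, at(10)
gen treated_late = treated*(period == 10)
stcox treated age treated_late
```

In the stratified fit no hazard ratio is reported for drug, which matches the point made above about the stratification variable. In the split fit, the hazard ratio for treated_late is the factor by which the treatment effect changes after month 10.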
Special situation
• Known survival distribution
  – Use parametric survival analysis (“streg”)
• Correlated data
  – Termed “shared frailty” in survival models
  – Use the “shared()” option
• Survey data
  – Use “svyset”, followed by “svy: stcox”
• Competing risks: when multiple causes of failure may occur.
  – Use “stcrreg”
Summary
• Survival analysis is used to examine risk of an event over time.
• Demonstrated how to calculate the survival function and the hazard function.
• Non‐parametric methods for evaluating and describing survival data.
• Performing Cox‐PH regression requires a detailed evaluation of the PH assumption.
Thanks!
Brook.I.Martin@Dartmouth.edu