Stata for Survival Analysis 2020 · people.umass.edu/biep640w/pdf/Stata for Survival Analysis... · 2020-04-21

Aug 08, 2020

  • BIOSTATS 640 – Spring 2020 8. Survival Analysis Stata Illustration

    ….2020\Stata for Survival Analysis 2020.docx Page 1 of 16

    8. Introduction to Survival Analysis Illustration – Stata Users

    Spring 2020

    1. Illustration: DPCA Study of Primary Biliary Cirrhosis
    2. Prepare Data for Survival Analysis
    3. Model Free Approaches
       a. Descriptives
       b. Kaplan-Meier Curve Estimation
       c. Kaplan-Meier Curve Plot
       d. Log Rank Test for Equality of Survival Distributions
    4. Cox PH Model Regression
       a. Fit Cox PH Model
       b. Multivariable Model Development
       c. Side-by-side Comparison of Models
    5. Regression Diagnostics for Cox PH Model
       a. Test of Proportional Hazards
       b. Graphical Assessment of Proportional Hazards
       c. Test of Overall Goodness-of-Fit

    1. Illustration

    DPCA Study of Primary Biliary Cirrhosis

    Preliminary – Download the Stata data set pbc.dta from the course website.

    DPCA Study of Primary Biliary Cirrhosis source: Dickson ER, Grambsch PM and Fleming TR (1989) Prognosis in primary biliary-cirrhosis - model for decision making. Hepatology, 10, 1-7.

    Introduction. Bile is a fluid produced in your liver that aids in the digestion of food and in ridding your body of worn-out red blood cells, cholesterol and toxins. Primary biliary cirrhosis is an autoimmune disease in which the body turns against its own cells, in this case the bile ducts. As the bile ducts are increasingly damaged, harmful substances can accumulate. This can lead to irreversible scarring of liver tissue (this is cirrhosis). Among other things, the sufferer can experience abdominal pain, internal bleeding and, ultimately, liver failure. Primary biliary cirrhosis is also a risk factor for liver cancer.

    This illustration utilizes data from a randomized controlled trial of D-penicillamine (DPCA) for the treatment of primary biliary cirrhosis. A total of n=312 consenting subjects were enrolled and randomized to either active treatment or placebo-control (presumably this group received standard care). Time zero is the date of diagnosis and initiation of treatment. Study participants were followed to event of end-stage liver disease or censoring. Thus, these are an example of "right" censored data. In the analyses below, the event is death (status = 1). Over the approximately 10 years of follow-up, 125 deaths (40%) were observed. The goal of these analyses was to assess the benefit of randomization to DPCA on survival, overall and after adjustment for selected, important covariates.

    Data Dictionary/Coding Manual. This illustration utilizes the following variables in pbc.dta.

    Variable    Codings                             Label
    ----------------------------------------------------------------------------
    years       Continuous (range: 0.11 – 12.47)    Time to death (in years)
    status      1 = dead, 0 = censored              Event/censoring indicator
    rx          1 = DPCA, 0 = Control               Treatment/randomization
    histol      1 = lowest, 2, 3, 4 = highest       Severity of liver damage at dx
    bilirubin   Continuous, mg/dl                   Serum bilirubin

    2. Prepare Data for Survival Analysis

    . use "/Users/cbigelow/Desktop/pbc.dta"
    (PBC Natural Hx Data)

    . * Check data set (variables of interest only)
    . codebook years status rx histol bilirubin, compact

    Variable    Obs Unique      Mean       Min       Max  Label
    ---------------------------------------------------------------------------------
    years       312    301   5.49312  .1122519  12.47365  Time to Death (in Years)
    status      312      2   .400641         0         1  Alive/Dead
    rx          312      2  .4935897         0         1  treatment
    histol      312      4  3.032051         1         4  Histologic stage of disease
    bilirubin   312     85   3.25609        .3        28  Serum Bilirubin in mg/dl
    ---------------------------------------------------------------------------------

    . * ---- Declare Data to be Survival Data ------*
    . * Time to event: years
    . * Censoring: status (1=dead, 0=censored)
    . * Command is stset TIMETOEVENT, failure(CENSORVARIABLE)
    . stset years, failure(status)

         failure event:  status != 0 & status < .
    obs. time interval:  (0, years]
     exit on or before:  failure

    ------------------------------------------------------------------------------
            312  total observations
              0  exclusions
    ------------------------------------------------------------------------------
            312  observations remaining, representing
            125  failures in single-record/single-failure data
       1713.854  total analysis time at risk and under observation
                                                    at risk from t =         0
                                         earliest observed entry t =         0
                                              last observed exit t =  12.47365

    . * Describe survival data using command stsum
    . stsum

             failure _d:  status
       analysis time _t:  years

             |               incidence       no. of    |------ Survival time -----|
             | time at risk     rate        subjects        25%       50%       75%
    ---------+---------------------------------------------------------------------
       total |  1713.853528   .0729351           312   4.071184  9.295004         .

    Interpretation: The 25th and 50th percentiles of survival time are shown. The 25th percentile is 4.07 years: an estimated 25% of participants have survival times less than 4.07 years. The median is 9.30 years. The 75th percentile is missing because of the high prevalence of censoring in this cohort: fewer than 75% of participants are estimated to have died by the end of follow-up, so this percentile cannot be estimated.
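As a possible follow-up to stsum, the sketch below asks Stata for confidence intervals around the survival-time percentiles and for a compact description of the survival-time structure. It assumes the data have already been declared with stset as above. The commands stci and stdescribe are official Stata commands; the choice of the 25th percentile simply mirrors the stsum output and is not part of the original handout.

```stata
* Assumes: stset years, failure(status) has already been run
* Median survival time with a confidence interval (the median is the stci default)
stci
* 25th percentile of survival time with a confidence interval
stci, p(25)
* The same summary separately by randomization arm
stci, by(rx)
* Entry and exit times, time at risk, and number of failures per subject
stdescribe
```

Because stci works from the same Kaplan-Meier estimate as stsum, the point estimates should agree with the percentiles reported above.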

    3. Model Free Approaches

    a. Descriptives

    Note. fre is a user-written command; if it is not installed, ssc install fre should obtain it.

    . * Continuous variables
    . sort rx
    . tabstat years bilirubin, by(rx) statistics(n mean sd min q max) columns(statistics) format(%8.2f) longstub

    rx       variable  |       N    mean      sd     min     p25     p50     p75     max
    -------------------+-----------------------------------------------------------------
    Placebo  years     |  158.00    5.52    3.00    0.11    3.37    5.19    7.24   12.47
             bilirubin |  158.00    2.87    3.63    0.30    0.80    1.40    3.20   20.00
    -------------------+-----------------------------------------------------------------
    DPCA     years     |  154.00    5.47    3.16    0.14    3.15    4.96    7.59   12.38
             bilirubin |  154.00    3.65    5.28    0.30    0.70    1.30    3.60   28.00
    -------------------+-----------------------------------------------------------------
    Total    years     |  312.00    5.49    3.08    0.11    3.26    5.04    7.40   12.47
             bilirubin |  312.00    3.26    4.53    0.30    0.80    1.35    3.45   28.00
    -------------------------------------------------------------------------------------

    . * Discrete variables
    . fre rx histol status

    rx -- treatment
    ---------------------------------------------------------------
                      |      Freq.    Percent      Valid       Cum.
    ------------------+--------------------------------------------
    Valid   0 Placebo |        158      50.64      50.64      50.64
            1 DPCA    |        154      49.36      49.36     100.00
            Total     |        312     100.00     100.00
    ---------------------------------------------------------------

    histol -- Histologic stage of disease
    -----------------------------------------------------------
                  |      Freq.    Percent      Valid       Cum.
    --------------+--------------------------------------------
    Valid   1     |         16       5.13       5.13       5.13
            2     |         67      21.47      21.47      26.60
            3     |        120      38.46      38.46      65.06
            4     |        109      34.94      34.94     100.00
            Total |        312     100.00     100.00
    -----------------------------------------------------------

    status -- Alive/Dead
    ----------------------------------------------------------------
                       |      Freq.    Percent      Valid       Cum.
    -------------------+--------------------------------------------
    Valid   0 Censored |        187      59.94      59.94      59.94
            1 Dead     |        125      40.06      40.06     100.00
            Total      |        312     100.00     100.00
    ----------------------------------------------------------------

    . tab2 rx status, row column

    -> tabulation of rx by status

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    |  row percentage   |
    | column percentage |
    +-------------------+

               |      Alive/Dead
     treatment |  Censored       Dead |     Total
    -----------+----------------------+----------
       Placebo |        93         65 |       158
               |     58.86      41.14 |    100.00
               |     49.73      52.00 |     50.64
    -----------+----------------------+----------
          DPCA |        94         60 |       154
               |     61.04      38.96 |    100.00
               |     50.27      48.00 |     49.36
    -----------+----------------------+----------
         Total |       187        125 |       312
               |     59.94      40.06 |    100.00
               |    100.00     100.00 |    100.00

    Interpretation:
    Among n=158 randomized to PLACEBO, there were 65 deaths (41%).
    Among n=154 randomized to active treatment DPCA, there were 60 deaths (39%).

    b. Kaplan-Meier Curve Estimation

    Note – You must have previously issued the command stset to declare the data as survival data (see Section 2, Prepare Data for Survival Analysis).

    . * Single Group Kaplan-Meier Curve Estimation
    . * Command is sts list
    . sts list

             failure _d:  status
       analysis time _t:  years

    Kaplan-Meier Estimates
               Beg.          Net            Survivor      Std.
      Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
    -------------------------------------------------------------------------------
     .1123      312      1      0             0.9968    0.0032     0.9775    0.9995
     .1396      311      1      0             0.9936    0.0045     0.9746    0.9984
     .1944      310      1      0             0.9904    0.0055     0.9705    0.9969
     .2108      309      1      0             0.9872    0.0064     0.9662    0.9952
     --- rows omitted ---
     12.19        7      0      1             0.3406    0.0528     0.2398    0.4438
     12.21        6      0      1             0.3406    0.0528     0.2398    0.4438
     12.23        5      0      1             0.3406    0.0528     0.2398    0.4438
     12.32        4      0      1             0.3406    0.0528     0.2398    0.4438
     12.34        3      0      1             0.3406    0.0528     0.2398    0.4438
     12.38        2      0      1             0.3406    0.0528     0.2398    0.4438
     12.47        1      0      1             0.3406    0.0528     0.2398    0.4438
    -------------------------------------------------------------------------------

    . * Two Group Kaplan-Meier Curve Estimation
    . * Command is sts list, by(GROUPVAR)      Note: Must have sorted by GROUPVAR first
    . sort rx
    . sts list, by(rx)

             failure _d:  status
       analysis time _t:  years

    Kaplan-Meier Estimates
               Beg.          Net            Survivor      Std.
      Time    Total   Fail   Lost           Function     Error     [95% Conf. Int.]
    -------------------------------------------------------------------------------
    Placebo
     .1123      158      1      0             0.9937    0.0063     0.9559    0.9991
     .1944      157      1      0             0.9873    0.0089     0.9503    0.9968
     .3587      156      1      0             0.9810    0.0109     0.9423    0.9938
     .3833      155      1      0             0.9747    0.0125     0.9340    0.9904
     ---- rows omitted --
    DPCA
     .1396      154      1      0             0.9935    0.0065     0.9548    0.9991
     .2108      153      1      0             0.9870    0.0091     0.9491    0.9967
     .3012      152      1      0             0.9805    0.0111     0.9408    0.9937
     ---- rows omitted --
    -------------------------------------------------------------------------------
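The first rows of the single-group listing above can be checked by hand, since each failure multiplies the running Kaplan-Meier product by (number at risk minus failures) divided by number at risk. The sketch below does that arithmetic with display and then, as an optional extra, lists the estimates at a few follow-up times by arm. The times 1, 5 and 10 years are illustrative choices and do not come from the handout.

```stata
* Hand check of the first two single-group estimates
* Row 1: 312 at risk, 1 failure. Row 2: 311 at risk, 1 failure.
display %6.4f 311/312
display %6.4f (311/312)*(310/311)

* KM estimates at selected follow-up times, by randomization arm
* Assumes stset has been run; the times in at() are illustrative
sts list, at(1 5 10) by(rx)
```

The two display lines should reproduce the listed survivor function values 0.9968 and 0.9936.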

    c. Kaplan-Meier Curve Plot

    . * ---- Single Group: Kaplan Meier Curve ---*
    . * --- no frills plot --*
    . sts graph

    . * with frills --*
    . sts graph, xlabel(0(1)13) ylabel(0(.20)1) xtitle("Years Since Diagnosis")
    >     ytitle("KM Estimated Percent Alive") title("DPCA Study of Primary Biliary Cirrhosis")
    >     subtitle("n=312, # events=125") caption("graph02.png", size(vsmall))

    . * With Greenwood CI limits
    . sts graph, gwood legend(off) xlabel(0(1)13) ylabel(0(.20)1) xtitle("Years Since Diagnosis")
    >     ytitle("KM Estimated Percent Alive (95%CI)") title("DPCA Study of Primary Biliary Cirrhosis")
    >     subtitle("n=312, # events=125") caption("graph03.png", size(vsmall))

    [Figures: three Kaplan-Meier plots. Panels: No Frills; With Frills; With Greenwood CI Limits]

    . * Two Group Kaplan-Meier Curve Plot
    . * Command is sts graph, by(GROUPVAR) OPTION OPTION OPTION      Note: Must have sorted by GROUPVAR first
    . sort rx
    . sts list, by(rx)

    . * with frills ---*
    . sts graph, by(rx) xlabel(0(1)13) ylabel(0(.20)1) xtitle("Years Since Diagnosis")
    >     ytitle("KM Estimated Percent Alive") title("DPCA Study of Primary Biliary Cirrhosis")
    >     subtitle("n=312, # events=125") caption("graph04.png", size(vsmall))
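One optional refinement of the two-group plot above is to print the number still at risk beneath the time axis and to save the figure to disk. The sketch below is an illustration under the assumption that stset has been run; the file name km_by_rx.png is hypothetical.

```stata
* Two-group KM plot with a number-at-risk table beneath the x-axis
sts graph, by(rx) risktable xlabel(0(2)12) xtitle("Years Since Diagnosis")
* Save the graph currently in memory (the file name is hypothetical)
graph export "km_by_rx.png", replace
```

A risk table makes it easier to see how few participants remain under observation late in follow-up, where the curves are least precise.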

    d. Log Rank Test for Equality of Survival Distributions

    . * ---- Log Rank Test (NULL: equality of survival distributions among rx groups)
    . * Command is sts test GROUPVAR
    . sts test rx

             failure _d:  status
       analysis time _t:  years

    Log-rank test for equality of survivor functions

            |   Events         Events
    rx      |  observed       expected
    --------+-------------------------
    Placebo |        65          63.22
    DPCA    |        60          61.78
    --------+-------------------------
    Total   |       125         125.00

                  chi2(1) =       0.10
                  Pr>chi2 =     0.7498

    Interpretation: Do NOT reject. Assumption of the null hypothesis has NOT led to an unlikely result (p-value = .75). We have no statistically significant evidence that the survival distributions differ.
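The observed and expected counts above allow a rough hand check. The sum of (O - E)^2 / E over the two groups is a common approximation to the log-rank statistic; it is not the variance-based formula that sts test uses, so it should be close to the reported chi2 rather than identical. The sketch also shows the Wilcoxon variant, which the handout does not run and which is included here only as an illustration.

```stata
* Rough approximation to the log-rank statistic from the table above
display (65 - 63.22)^2/63.22 + (60 - 61.78)^2/61.78
* Upper-tail p-value for a chi-square of 0.10 on 1 degree of freedom
display chi2tail(1, 0.10)
* Wilcoxon (Breslow-Gehan) test, which gives more weight to early failures
* Assumes stset has been run
sts test rx, wilcoxon
```

The first line should come out near 0.10 and the second near 0.75, in line with the reported log-rank result.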

    4. Cox PH Model Regression

    Recall. The Cox PH model models the hazard of event (in this case death) at time "t" as the product of a baseline hazard and exp(linear model in the predictors X1, X2, …, Xp):

    h(t; X1,...,Xp) = h0(t) exp[ β1X1 + ... + βpXp ]

    Here, p=3 because we have 3 predictors of interest:

    X1 = rx, 0/1 indicator of randomization
    X2 = histol, ordinal measure of degree of tissue damage at diagnosis
    X3 = bilirubin, continuous (mg/dl)

    Note. The predictor histol is an ordinal predictor. So we will need to replace it with appropriately defined design variables prior to modeling.

    a. Fit Cox PH Model

    . * Single Predictor Model: rx (User coded as 0/1 already)
    . stcox rx

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(1)       =        0.10
    Log likelihood  =   -639.92903                  Prob > chi2      =      0.7498

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              rx |   .9444768   .1692173    -0.32   0.750     .6647918    1.341828
    ------------------------------------------------------------------------------

    Interpretation: Relative to control patients, patients treated with DPCA have a slightly lower estimated hazard of death (HR = .94, about 6% lower) at all times of follow-up. This very small benefit is not statistically significant (p-value = .75). Notice that the 95% CI for the HR includes the null value of 1.

    . * Single Predictor Model: rx (Stata defined design variable)
    . stcox i.rx

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(1)       =        0.10
    Log likelihood  =   -639.92903                  Prob > chi2      =      0.7498

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              rx |
           DPCA  |   .9444768   .1692173    -0.32   0.750     .6647918    1.341828
    ------------------------------------------------------------------------------

    Interpretation: SAME. Relative to control patients, patients treated with DPCA have a slightly lower estimated hazard of death (HR = .94) at all times of follow-up. This very small benefit is not statistically significant (p-value = .75). Notice that the 95% CI for the HR includes the null value of 1.
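To see how the reported z statistic and confidence interval for rx relate to the hazard ratio, the arithmetic can be redone by hand. The sketch below assumes the usual construction: the interval is built on the log scale and exponentiated, and the standard error shown for the hazard ratio is the delta-method value HR times se(b). The scalar names b and se are hypothetical.

```stata
* Log hazard ratio recovered from the reported HR for rx
scalar b = ln(.9444768)
* Standard error of b, assuming the reported se(HR) equals HR * se(b)
scalar se = .1692173/.9444768
* Wald z statistic
display b/se
* 95% confidence limits for the hazard ratio
display exp(b - invnormal(.975)*se)
display exp(b + invnormal(.975)*se)
```

Up to rounding, these should reproduce z = -0.32 and the interval .665 to 1.342 shown in the stcox output.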

    . * Single Predictor Model: histol (user created design variables)
    . generate histol2=0
    . replace histol2=1 if histol==2
    (67 real changes made)
    . generate histol3=0
    . replace histol3=1 if histol==3
    (120 real changes made)
    . generate histol4=0
    . replace histol4=1 if histol==4
    (109 real changes made)

    . stcox histol2 histol3 histol4

             failure _d:  status
       analysis time _t:  years

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(3)       =       52.72
    Log likelihood  =   -613.62114                  Prob > chi2      =      0.0000

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         histol2 |   4.987976   5.143153     1.56   0.119     .6610611    37.63631
         histol3 |   8.580321   8.685371     2.12   0.034     1.179996    62.39165
         histol4 |   21.38031   21.57046     3.04   0.002     2.959663    154.4493
    ------------------------------------------------------------------------------

    Interpretation: Recall. Higher scores on histol (valid scores = 1, 2, 3, 4) represent a greater level of liver tissue damage present at diagnosis. This model shows that higher ("worse") values of histol at diagnosis are associated with poorer prognosis (hazard ratio estimates increase from 1 to 4.99 to 8.58 to 21.4, relative to the referent group histol=1). Overall, this is highly statistically significant (LR chi2(3) = 52.72, p-value < .0001). Caveat: Note that the confidence intervals are wide.

    . * Single Predictor Model: histol (Stata generated design variables)
    . stcox i.histol

             failure _d:  status
       analysis time _t:  years

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(3)       =       52.72
    Log likelihood  =   -613.62114                  Prob > chi2      =      0.0000

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          histol |
              2  |   4.987976   5.143153     1.56   0.119     .6610611    37.63631
              3  |   8.580321   8.685371     2.12   0.034     1.179996    62.39165
              4  |   21.38031   21.57046     3.04   0.002     2.959663    154.4493
    ------------------------------------------------------------------------------

    Interpretation: SAME. Higher ("worse") values of histol at diagnosis are associated with poorer prognosis (hazard ratio estimates increase from 1 to 4.99 to 8.58 to 21.4, relative to the referent group histol=1). Overall, this is highly statistically significant (LR chi2(3) = 52.72, p-value < .0001). Caveat: Note that the confidence intervals are wide.
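The wide confidence intervals above reflect the small referent group (only 16 participants have histol=1). The sketch below shows three follow-ups that are not part of the original handout: a joint Wald test of the three indicators, a refit with a larger group as referent, and a likelihood-ratio comparison against a single linear trend term. The stored-estimate names hist_ind and hist_lin are hypothetical.

```stata
* Joint Wald test of the three histol indicators (3 df)
quietly stcox i.histol
testparm i.histol

* Same model with stage 3 (the largest group, n=120) as the referent
stcox ib3.histol

* Is a single linear trend in histol adequate? The trend model is
* nested in the indicator model, so a likelihood-ratio test applies
quietly stcox i.histol
estimates store hist_ind
quietly stcox histol
estimates store hist_lin
lrtest hist_lin hist_ind
```

Changing the referent does not change the fit of the model; it only changes which contrasts are displayed and how precisely each is estimated.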

    . * Single Predictor Model: bilirubin
    . stcox bilirubin

             failure _d:  status
       analysis time _t:  years

    Iteration 0:   log likelihood = -639.97989
    Iteration 1:   log likelihood = -611.85115
    Iteration 2:   log likelihood =  -597.6878
    Iteration 3:   log likelihood =  -597.6845
    Refining estimates:
    Iteration 0:   log likelihood =  -597.6845

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(1)       =       84.59
    Log likelihood  =    -597.6845                  Prob > chi2      =      0.0000

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       bilirubin |   1.160509   .0151044    11.44   0.000     1.131279    1.190494
    ------------------------------------------------------------------------------

    Interpretation: Each 1 unit (1 mg/dl) increase in bilirubin is associated with an estimated 16% higher hazard of death at all times of follow-up (HR = 1.16, 95% CI = 1.13 – 1.19). This is highly statistically significant (p-value << .0001).

    . * Use option nohr to obtain betas instead of hazard ratios
    . stcox bilirubin, nohr

             failure _d:  status
       analysis time _t:  years

    Cox regression -- Breslow method for ties

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(1)       =       84.59
    Log likelihood  =    -597.6845                  Prob > chi2      =      0.0000

    ------------------------------------------------------------------------------
              _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
       bilirubin |   .1488587   .0130153    11.44   0.000     .1233492    .1743683
    ------------------------------------------------------------------------------
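The two outputs above are linked by exponentiation: the hazard ratio is exp(beta). The same relationship gives the hazard ratio for any difference in bilirubin, not just 1 mg/dl. In the sketch below, the 5 mg/dl difference is an illustrative choice and is not taken from the handout.

```stata
* One-unit check: exp(beta) should match the reported hazard ratio
display exp(.1488587)

* Hazard ratio for a 5 mg/dl difference in bilirubin, by hand
display exp(5*.1488587)

* The same quantity with a confidence interval, from the fitted model
quietly stcox bilirubin
lincom 5*bilirubin, hr
```

The first line should return the reported hazard ratio of 1.160509, up to display rounding.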

    b. Multivariable Model Development

    Note. eststo (used here) and estout (used in part c) are user-written commands from the estout package; if they are not installed, ssc install estout should obtain them.

    . *------------ LR Tests ---------*
    . * --- rx controlling for histol ------*
    . quietly: stcox i.histol
    . eststo model_histol
    . quietly: stcox i.histol i.rx
    . eststo model_histolrx
    . lrtest model_histol model_histolrx

    Likelihood-ratio test                                 LR chi2(1)  =      0.67
    (Assumption: model_histol nested in model_histolrx)   Prob > chi2 =    0.4138

    Interpretation: Do not reject. After adjustment for histol, randomization to DPCA is NOT associated with survival (LR Test p-value = .41).

    . * --- rx controlling for bilirubin ------*
    . quietly: stcox bilirubin
    . eststo model_bili
    . quietly: stcox bilirubin i.rx
    . eststo model_bilirx
    . lrtest model_bili model_bilirx

    Likelihood-ratio test                                 LR chi2(1)  =      1.20
    (Assumption: model_bili nested in model_bilirx)       Prob > chi2 =    0.2732

    Interpretation: Do not reject. After adjustment for bilirubin, randomization to DPCA is NOT associated with survival (LR Test p-value = .27).

    . * --- rx controlling for both histol and bilirubin
    . quietly: stcox i.histol bilirubin
    . eststo model_both
    . quietly: stcox i.histol bilirubin i.rx
    . eststo model_bothrx
    . lrtest model_both model_bothrx

    Likelihood-ratio test                                 LR chi2(1)  =      0.76
    (Assumption: model_both nested in model_bothrx)       Prob > chi2 =    0.3837

    Interpretation: Do not reject. After adjustment for both histol and bilirubin, randomization to DPCA is NOT associated with survival (LR Test p-value = .38).

    c. Side-by-side Comparison of Models

    . quietly: stcox i.rx
    . eststo model1
    . quietly: stcox bilirubin i.rx
    . eststo model2
    . quietly: stcox i.histol i.rx
    . eststo model3

    . quietly: stcox bilirubin i.histol i.rx
    . eststo model4

    . * Display Betas and Summary Statistics
    . estout model1 model2 model3 model4, stats(n chi2 bic, star(chi2)) prehead("Betas")

    Betas
    ----------------------------------------------------------------------------
                       model1          model2          model3          model4
                            b               b               b               b
    ----------------------------------------------------------------------------
    0b.rx                   0               0               0               0
    1.rx            -.0571242       -.2006959       -.1469394       -.1579999
    bilirubin                        .1513976                        .1475188
    1b.histol                                               0               0
    2.histol                                         1.628527         1.52564
    3.histol                                         2.176732         1.92305
    4.histol                                          3.09258        2.796094
    ----------------------------------------------------------------------------
    n
    chi2             .1017198     85.79155***     53.38545***      127.504***
    bic              1285.601        1205.654        1249.546        1181.171
    ----------------------------------------------------------------------------

    KEY:
    chi2 = Value of the LR test comparing the model fit ("full") to the model with no predictors ("reduced")
    bic = Schwarz' Bayesian Information Criterion. It is a function of the log-likelihood. Smaller values indicate a better fit.
    n = This row is blank in the output. Most likely this is because Stata stores the number of observations under the name N (upper case), so stats(N chi2 bic) would be needed to display it.

    . * Display Hazard Ratios and Model Fit Statistics. Option eform produces hazard ratios *
    . estout model1 model2 model3 model4, eform stats(n chi2 bic, star(chi2)) prehead("Hazard Ratios")

    Hazard Ratios
    ----------------------------------------------------------------------------
                       model1          model2          model3          model4
                            b               b               b               b
    ----------------------------------------------------------------------------
    0b.rx                   1               1               1               1
    1.rx             .9444768        .8181612        .8633463        .8538499
    bilirubin                        1.163459                        1.158955
    1b.histol                                               1               1
    2.histol                                         5.096362        4.598085
    3.histol                                         8.817444        6.841793
    4.histol                                         22.03384        16.38054
    ----------------------------------------------------------------------------
    n
    chi2             .1017198     85.79155***     53.38545***      127.504***
    bic              1285.601        1205.654        1249.546        1181.171
    ----------------------------------------------------------------------------
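For readers without the estout package, a similar side-by-side table can be sketched with official Stata commands only. This is an alternative, not the handout's method; the stored-estimate names m1 to m4 are hypothetical, and the layout of the resulting table will differ from the estout output above.

```stata
* Fit and store the same four models using official commands
quietly stcox i.rx
estimates store m1
quietly stcox bilirubin i.rx
estimates store m2
quietly stcox i.histol i.rx
estimates store m3
quietly stcox bilirubin i.histol i.rx
estimates store m4

* Hazard ratios side by side, with sample size, model chi-square and BIC
estimates table m1 m2 m3 m4, eform b(%9.3f) stats(N chi2 bic)

* Log likelihood, AIC and BIC for each stored model
estimates stats m1 m2 m3 m4
```

Note the upper-case N in stats(): that is the name under which the number of observations is stored after estimation.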

    5. Regression Diagnostics for Cox PH Model

    a. Test of Proportional Hazards

    . * Test of proportional hazards
    . quietly: stcox bilirubin i.histol i.rx
    . estat phtest, detail

    Test of proportional-hazards assumption

    Time:  Time
    ----------------------------------------------------------------
                |       rho            chi2       df       Prob>chi2
    ------------+---------------------------------------------------
    bilirubin   |      0.09686         0.88        1         0.3485
    1b.histol   |            .            .        1              .
    2.histol    |      0.01775         0.04        1         0.8424
    3.histol    |      0.00187         0.00        1         0.9834
    4.histol    |     -0.04811         0.29        1         0.5914
    0b.rx       |            .            .        1              .
    1.rx        |     -0.09026         0.99        1         0.3204
    ------------+---------------------------------------------------
    global test |                     13.19        5         0.0216
    ----------------------------------------------------------------

    Interpretation: The global test is significant (p-value = .02) … but … for each predictor individually, do not reject the assumption of proportional hazards.

    b. Graphical Assessment of Proportional Hazards

    . * Assessment of PH Assumption: Randomization/Treatment
    . stphplot, by(rx) adjust(bilirubin histol) nolntime
    >     plot1opts(symbol(none) color(red) lpattern(dash)) plot2opts(symbol(none) color(navy))
    >     title("Assessment of PH Assumption") subtitle(" Predictor is rx") xtitle("Years")

    Interpretation: Looks reasonable.
    Note: adjust(bilirubin histol) tells Stata to set these variables to their mean values.
    The option nolntime tells Stata to plot time on the horizontal axis, not the logarithm of time.
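Because the global test and the per-predictor tests above point in different directions, a closer look at one predictor may help. The sketch below is not part of the original handout. It plots the scaled Schoenfeld residuals for bilirubin against time and then refits the model with a time-varying bilirubin effect; using log time as the interaction is an assumption made for illustration.

```stata
quietly stcox bilirubin i.histol i.rx

* Scaled Schoenfeld residuals for bilirubin against time, with a smoother;
* a roughly flat trend is consistent with proportional hazards
estat phtest, plot(bilirubin)

* Let the bilirubin effect change with log analysis time; a tvc coefficient
* far from zero would suggest non-proportional hazards for bilirubin
stcox bilirubin i.histol i.rx, tvc(bilirubin) texp(ln(_t))
```

The residual plot and the tvc term address the same question from two angles, so they are best read together.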

    . * Assessment of PH Assumption: Histol
    . stphplot, by(histol) adjust(bilirubin rx) nolntime
    >     plot1opts(symbol(none) color(black) lpattern(dash)) plot2opts(symbol(none) color(navy))
    >     plot3opts(symbol(none) color(green)) plot4opts(symbol(none) color(red))
    >     title("Assessment of PH Assumption") subtitle(" Predictor is histol") xtitle("Years")

    Interpretation: Looks reasonable, except for the group histol=1. I checked on this (not shown); there was just 1 death in this group.

    . * Assessment of PH Assumption: Bilirubin, using Quartile Groupings
    . centile bilirubin, c(25,50,75)

                                                           -- Binom. Interp. --
        Variable |       Obs  Percentile    Centile        [95% Conf. Interval]
    -------------+-------------------------------------------------------------
       bilirubin |       312         25          .8              .7          .9
                 |                   50        1.35             1.2         1.8
                 |                   75       3.475         3.12635         4.5

    . generate bilicat=bilirubin
    . recode bilicat (min/0.8=1) (0.81/1.35=2) (1.351/3.475=3) (3.4751/max=4) if bilirubin !=.
    (bilicat: 307 changes made)

    . stphplot, by(bilicat) adjust(rx) nolntime
    >     plot1opts(symbol(none) color(black) lpattern(dash)) plot2opts(symbol(none) color(navy))
    >     plot3opts(symbol(none) color(green)) plot4opts(symbol(none) color(red))
    >     title("Assessment of PH Assumption") subtitle(" Predictor is Quartile of Bilirubin") xtitle("Years")

    Interpretation: Looks reasonable.
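The recode step above depends on cut points typed by hand, with small gaps left between adjacent ranges. As a possible alternative, the sketch below lets Stata form the quartile groups directly with xtile and then checks the result. The variable name biliq is hypothetical, and group membership at tied cut points may differ slightly from the recode version.

```stata
* Quartile groups of bilirubin without hand-typed cut points
xtile biliq = bilirubin, nq(4)

* Check group sizes and the bilirubin range within each group
tabulate biliq
tabstat bilirubin, by(biliq) statistics(n min max)
```

The tabstat line is a quick way to confirm that the groups do not overlap and that no observation was left unassigned.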

    c. Test of Overall Goodness of Fit

    Note #1. This test utilizes a command stcoxgof that must be downloaded from the internet.

    Note #2. The command stcoxgof will not work with factor variables. Therefore, in fitting my model, I replaced histol with the 0/1 indicators of levels 2, 3, and 4 (see Section 4a, user created design variables).

    Note #3. The command stcoxgof also requires that you have saved the martingale residuals. This is accomplished with the option mgale(NAMEYOUPROVIDE).

    . findit stcoxgof
    ----- not shown: Downloading of stcoxgof ---

    . stcox bilirubin histol2 histol3 histol4 rx, mgale(mgale)
    . stcoxgof

    Goodness-of-fit test for the inclusion of design variables
    based on 3 quantiles of risk
    (Added variables version of the Groennesby and Borgan test)

    Score test              chi2(2)     =     2.318
                            Prob > chi2 =    0.3137
    Likelihood-ratio test   LR chi2(2)  =     2.356
                            Prob > chi2 =    0.3079

    (Table collapsed on quantiles of linear predictor)
    --------------------------------------------------------------------------------
    Quantile  |
    of Risk   |   Observed    Expected         z     p-Norm    Observations
    ----------+---------------------------------------------------------------------
    1         |         19       22.87     -.809       .418             108
    2         |         36      34.409      .271       .786             102
    3         |         70      67.721      .277       .782             102
              |
    Total     |        125         125                                  312
    --------------------------------------------------------------------------------

    Interpretation: Good. Do not reject. We do not have statistically significant evidence of a poor fit (p-value = .31). Caveat: It is quite possible that additional regression diagnostics will reveal issues!
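One additional diagnostic of the kind the caveat above alludes to is a Cox-Snell residual plot, which needs only official Stata commands. The sketch below is an illustration rather than the handout's method. It assumes a recent Stata in which residuals are obtained with predict after stcox, and it assumes the indicators histol2, histol3 and histol4 from Section 4a exist. The names cs and H are hypothetical.

```stata
quietly stcox bilirubin histol2 histol3 histol4 rx

* Cox-Snell residuals from the fitted model
predict cs, csnell

* Temporarily re-declare the data with the residuals as the time variable;
* preserve/restore puts the original stset declaration back afterwards
preserve
stset cs, failure(status)

* Nelson-Aalen cumulative hazard of the residuals
sts generate H = na

* If the model fits well, H plotted against cs should track the 45-degree line
line H cs cs, sort xtitle("Cox-Snell residual") ytitle("Cumulative hazard")
restore
```

Some departure from the 45-degree line in the right-hand tail is common, because few observations contribute there.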