Survival/Failure Analysis (AKA Event History Analysis)

T & F Chapter 11

Data Example 1. A medical doctor wished to compare the efficacy of two drugs for treating a sometimes fatal illness. Two groups of patients with the disease were identified. One group was given Drug A. The other group was given Drug B.

The age of the patient at the time of drug administration was recorded.

The patients were then monitored by a special team of patient observers.

The age of the patient at time of death was recorded and the survival duration from the time the patient began taking the drug computed. Several of the patients lived for many years.

The study was terminated when the last patient in the two groups died – more than 60 years after the beginning of the study. (The original researcher died while waiting for the last patient to die. The original researcher’s grandchildren were available to continue the analyses.)

The grandchildren used the Mann-Whitney U-test to compare survival times between the two groups. (The U-test was used because survival times are notoriously positively skewed.)

This is the appropriate way to compare the efficacy of the two drugs.

Problems:

1) The long time it takes to observe the survival times of all patients.

2) What to do about persons who get “lost”, i.e., persons with whom contact was lost. These patients give incomplete data. Should they be ignored and treated as missing values?

Because of Problem 1 above, we typically do NOT wait until every participant in our research has died before analyzing.

Instead we define a Window of Observation, and observe participants only while that window is open.

Survival Analysis – 1 Printed on 10/26/2016


Window of Observation

The problem is that we don’t have an infinite period of time to wait until everyone quits or dies. So what do we do about the persons who are still alive when the window of observation is closed?

Plus, it may be the case that we lose contact with people so for some people we won’t know how long they survived regardless of the length of the window of observation. All we know about them is that they were alive until a specific time. We don’t know whether they’re still alive or not after that time.

The window of observation is the specific time period in which participant survival is recorded.

At some time, we begin recording whether or not each person is surviving. At some later time, we quit monitoring each patient.

Because the window is of finite duration, this necessarily results in incomplete information on some participants.

Of particular importance is the fact that some will still be alive/working when we quit observing.

This means that we won’t have accurate survival times for some people.
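To make the idea of incomplete survival times concrete, here is a minimal Python sketch of how each case might be turned into a (duration, event) pair as seen through the window. The function name `observe`, the day numbers, and the three cases are hypothetical, chosen only for illustration; this is not how SPSS stores the data.

```python
def observe(start, end, window_close, last_contact=None):
    """Return (duration, event) for one case as seen through the window.

    event is 1 if the ending (death/termination) was observed,
    0 if the case is right-censored.
    """
    # We stop watching at the window close, or earlier if contact was lost.
    cutoff = window_close if last_contact is None else min(last_contact, window_close)
    if end is not None and end <= cutoff:
        return (end - start, 1)
    return (cutoff - start, 0)


WINDOW_CLOSE = 100  # hypothetical day on which monitoring ends

# (start, true end or None if unknown, last contact or None)
cases = [
    (0, 30, None),    # ending observed inside the window
    (10, 200, None),  # still surviving when the window closes
    (20, None, 60),   # lost to follow-up at day 60
]

records = [observe(s, e, WINDOW_CLOSE, lc) for s, e, lc in cases]
print(records)
```

For the last two cases the recorded duration is only a lower bound on the true survival time, which is exactly the incomplete information described above.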

Medical literature

Two treatments for a disease are given. We attempt to record

1) whether or not each patient died (the dichotomous outcome), and

2) how long each patient survived until death (the continuous outcome).

Group A given Drug A.

Group B given Drug B.

Turnover literature

Persons are hired by an organization into two different buildings. We attempt to record

1) whether or not each employee quits or retires, and

2) how long each employee is employed before leaving the organization.

Building A: Kill and debone chickens

Building B: Cook the chicken carcasses


Overview of Types of cases in survival analysis

-------|------------------------------------------------------|--------

Ideal Cases – each starting time and ending time is known.

Right Censored Cases: Cases whose ending times (time of termination/death) are unknown. These are the most common problem cases.

The above cases are still employed/surviving at the time monitoring ends.

The above case is lost to follow-up (quit answering phone, left state, etc.)

Left Censored Cases: Cases whose starting times are unknown.

We will not include such cases in the analyses conducted here.

Cases whose starting times and ending times are both unknown: Fagettaboutit. These are not analyzable.

[Figure: timeline of the case types above. Horizontal axis: Time. Left vertical line: monitoring of cases begins, i.e., the window opens. Right vertical line: monitoring of cases ends, i.e., the window closes.]


Incorrect Analysis 1: Use death/quit rates as a proxy for survival

Assuming that persons with long survival times will be less likely to die within the window of observation, we could use death or quit rates as an indicator of survival time.

We could use logistic regression to assess the relation of death or quit rates to independent variables.

(Use linear regression in a pinch praying that the God of statistics won’t strike you down).

Problem – it’s possible to create situations in which most people would agree that the distributions of survival times are different even though proportions of outcomes are identical.

Consider the following . . . Assume we’re dealing with employment.

In the figures, each arrow represents duration of employment for a person. The horizontal axis is time. The vertical line at the left represents the time at which the window of observation opened. The vertical line at the right represents the time at which the window closed. The head of each arrow represents death/termination.

Group A – Termination Rate = 100%

Group B – Termination Rate = 100%

Clearly, Group A has longer average employment times, but both have the exact same proportion of turnovers – 100% in this example.

So, comparison of death/quit rates may certainly give us an inaccurate picture of the differences between the groups.
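The point can be checked with a few lines of Python. The durations below are made up purely for illustration (they are not from any dataset in this handout): every person in both groups terminates inside the window, so the termination rates are identical, yet the employment times differ greatly.

```python
# Each record is (days employed, terminated flag); all values are hypothetical.
group_a = [(300, 1), (320, 1), (350, 1), (310, 1)]
group_b = [(30, 1), (45, 1), (60, 1), (20, 1)]


def termination_rate(group):
    """Proportion of cases whose termination was observed."""
    return sum(event for _, event in group) / len(group)


def mean_duration(group):
    """Average observed duration, ignoring the event flag."""
    return sum(days for days, _ in group) / len(group)


rate_a, rate_b = termination_rate(group_a), termination_rate(group_b)
mean_a, mean_b = mean_duration(group_a), mean_duration(group_b)

print(rate_a, rate_b)  # same proportion terminating in both groups
print(mean_a, mean_b)  # very different average employment times
```

A logistic regression on the terminated flag would see no difference between these two groups at all.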


Incorrect Analysis 2 – Analyze only the durations within the window of observation. Ignore the deaths/turnovers.

Group A. Average Survival time =

Group B. Average Survival time =

In the example above, the two groups have equal (ultimate) death rates but different survival times within the window of observation. In Group A all subjects had “time to die.” In Group B, subjects were still living when the window of observation closed, so their recorded durations stop at the close of the window. In this case, analyzing only the survival times recorded within the finite window of observation gives a misleading picture of how the two groups compare.

Each type of incomplete analysis ignores the other aspect of the complete dependent variable. We need a method of analysis that takes into account both aspects.

Survival analysis is an analytic technique that combines both aspects.

Comparisons of different groups include . . .

Comparison of proportion dying / leaving

Comparison of time surviving / staying.


Survival Analysis (also called Event History Analysis)

An analytic technique that models both survival times and proportions of deaths / quits.

3 separate techniques available in SPSS – Life Table, Kaplan-Meier, Cox Regression

Key concept common to all techniques

Survival function – most important one of all of them

A plot of proportion surviving from time 0 up to a given time vs. time

A cumulative plot.

Generally decreasing curve, since proportion surviving can only remain constant or decrease across time.

Separate curves for separate groups

The curve represents both aspects of survival.

1) The height of the curve at a point represents the proportion surviving up to that time.

2) The curve also represents duration of stay/life (how far the curve has progressed to the right from t=0). The distance along the X-axis to the point where the curve reaches a given height is the time by which the proportion surviving has fallen to that level. At a height of 50%, that distance is the median survival time.

So, the survival curve is a two-dimensional representation of the two aspects of survival – survival rates and length of life/employment.
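Both readings of the curve can be sketched in Python for the simplest situation, in which every survival time is fully observed (no censoring). The ten durations and the function names are hypothetical and serve only to show the two ways of reading the curve.

```python
def proportion_surviving(durations, t):
    """Height of the curve at time t: proportion whose duration exceeds t."""
    return sum(1 for d in durations if d > t) / len(durations)


def time_to_reach(durations, level):
    """Distance along the time axis: first time the curve is at or below level."""
    for t in sorted(set(durations)):
        if proportion_surviving(durations, t) <= level:
            return t
    return None  # the curve never drops that far


# Hypothetical complete (uncensored) durations
durations = [5, 12, 17, 30, 42, 58, 63, 71, 80, 95]

height_at_30 = proportion_surviving(durations, 30)  # vertical reading
time_at_60pct = time_to_reach(durations, 0.60)      # horizontal reading
print(height_at_30, time_at_60pct)
```

With censored cases this simple counting no longer works, which is why the Life Table and Kaplan-Meier estimators are needed.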

[Figure: two survival functions, Proportion Surviving (0% to 100%) plotted against Time (0 to 80). Vertical-slice annotations: “At time 30, 63% have survived” and “At time 30, 54% have survived.” Horizontal-slice annotations at the 60% level: “At 60% survival rate, average length of time was 17” and “At 60% survival rate, average length of time was 85.”]

Comparing groups.

The vertical axis represents the proportion surviving (staying).

Within a vertical slice at any point, survival (or turnover) rates up to a particular time can be compared. In the following, we see that at time t, Group A had a higher survival rate than Group B.

Comparing Average survival times between two groups.

Within a horizontal slice at any point, average survival times can be compared.

In the following, we see that for a 70% survival rate, average survival time was longer for Group A than it was for Group B.

When comparing groups we will usually compare the whole curve for each group. The group whose curve is generally above the others is the group with best survival.

[Figures: two panels showing survival curves for Groups A and B against Time. One panel marks a vertical slice at time t; the other marks a horizontal slice at the 70% survival level.]

Three general types of Survival Analysis

1. Life Tables analysis.

The window of observation is cut up into n equal-length intervals.

Proportions of persons surviving/dying within each interval are computed.

This is the original method.

Useful for analysis of one group or for comparison of a few groups defined by levels of a single categorical factor.

Can’t incorporate quantitative predictors.

Can’t incorporate more than 2 qualitative predictors in SPSS.

Cannot analyze interactions of 2 or more predictors.

2. Kaplan-Meier analysis.

Event-based. Rather than defining intervals based on time, intervals are defined based on occurrence of death/termination. Each death/termination marks the end of one interval and the beginning of a subsequent interval.

Can’t incorporate quantitative predictors.

Can’t incorporate more than 2 qualitative predictors in SPSS.

Cannot analyze interactions of 2 or more predictors.

Survival function graphs printed by SPSS’s K-M procedure show censored cases, a plus.


3. Cox Proportional Hazards Regression (Cox Regression)

A very general procedure.

Based on a specific mathematical model of survival developed by Cox.

Estimates hazard probabilities for the whole sample.

Then estimates ratios of hazards to this overall hazard function for groups/persons with different values of the IVs.

As implemented in SPSS, output and analyses look a lot like logistic regression.

Can incorporate quantitative predictors.

Can incorporate multiple qualitative and quantitative factors.

Can incorporate interactions.


Based on Tabachnick Table 11.1, p. 515

Analyzed using SPSS Life Tables

Suppose the efficacy of Drug 0 is being compared with that of Drug 1. Each was formulated to prolong life of patients with a usually terminal form of cancer. Seven patients were given Drug 0 and five were given Drug 1. Patients were observed for up to 12 months. After 12 months, the window of observation closed and the results were entered into SPSS. Note that this problem is analogous to a turnover problem in organizational research with two groups of employees treated differently.

The SPSS syntax to invoke the analysis.

SAVE OUTFILE='G:\MdbT\P595\P595AL07-Survival analysis\TAndFDancingData.sav'
  /COMPRESSED.
SURVIVAL TABLE=months BY drug(0 1)
  /INTERVAL=THRU 12 BY 1
  /STATUS=outcome(1)
  /PRINT=TABLE
  /PLOTS (SURVIVAL)=months BY drug.


Like ANOVA

Like multiple t-tests


Survival Analysis

[DataSet0] G:\MdbT\P595\P595AL07-Survival analysis\TAndFDancingData.sav

Survival Variable: months

Life Table

Column key: drug = first-order control; Start = interval start time; Enter = number entering interval; Withdr = number withdrawing during interval; Exposed = number exposed to risk; Events = number of terminal events; PropTerm = proportion terminating; PropSurv = proportion surviving; CumSurv = cumulative proportion surviving at end of interval; SE(Cum) = std. error of CumSurv; Dens = probability density; SE(Dens) = std. error of probability density; Hazard = hazard rate; SE(Haz) = std. error of hazard rate.

drug  Start  Enter  Withdr  Exposed  Events  PropTerm  PropSurv  CumSurv  SE(Cum)  Dens  SE(Dens)  Hazard  SE(Haz)
0     0      7      0       7.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      1      7      0       7.000    1       .14       .86       .86      .13      .143  .132      .15     .15
      2      6      0       6.000    2       .33       .67       .57      .19      .286  .171      .40     .28
      3      4      0       4.000    1       .25       .75       .43      .19      .143  .132      .29     .28
      4      3      0       3.000    1       .33       .67       .29      .17      .143  .132      .40     .39
      5      2      0       2.000    1       .50       .50       .14      .13      .143  .132      .67     .63
      6      1      0       1.000    0       .00       1.00      .14      .13      .000  .000      .00     .00
      7      1      0       1.000    0       .00       1.00      .14      .13      .000  .000      .00     .00
      8      1      0       1.000    0       .00       1.00      .14      .13      .000  .000      .00     .00
      9      1      0       1.000    0       .00       1.00      .14      .13      .000  .000      .00     .00
      10     1      0       1.000    0       .00       1.00      .14      .13      .000  .000      .00     .00
      11     1      0       1.000    1       1.00      .00       .00      .00      .143  .132      2.00    .00

1     0      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      1      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      2      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      3      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      4      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      5      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      6      5      0       5.000    0       .00       1.00      1.00     .00      .000  .000      .00     .00
      7      5      0       5.000    1       .20       .80       .80      .18      .200  .179      .22     .22
      8      4      0       4.000    1       .25       .75       .60      .22      .200  .179      .29     .28
      9      3      0       3.000    0       .00       1.00      .60      .22      .000  .000      .00     .00
      10     3      0       3.000    2       .67       .33       .20      .18      .400  .219      1.00    .61
      11     1      0       1.000    0       .00       1.00      .20      .18      .000  .000      .00     .00
      12     1      1       .500     0       .00       1.00      .20      .18      .000  .000      .00     .00

The results suggest that survival is significantly longer with Drug 1 – the top (orange) curve.
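The main columns of the life table can be reproduced by hand. The Python sketch below is an illustration, not SPSS's code. It assumes the usual actuarial convention that a case withdrawing during an interval counts as half a case at risk, which agrees with the .500 "exposed to risk" entry in the last Drug 1 row. The survival times are the ones listed in the Kaplan-Meier survival table for these data, and the function and key names are made up for the example.

```python
def life_table(times, events, last_start, width=1):
    """Actuarial life table; one dict per interval [start, start + width)."""
    rows, cumulative, start = [], 1.0, 0
    while start <= last_start:
        end = start + width
        entering = sum(1 for t in times if t >= start)
        if entering == 0:
            break  # nobody left to follow
        withdrawn = sum(1 for t, e in zip(times, events) if start <= t < end and e == 0)
        terminal = sum(1 for t, e in zip(times, events) if start <= t < end and e == 1)
        exposed = entering - withdrawn / 2  # censored cases count as half
        prop_term = terminal / exposed if exposed > 0 else 0.0
        cumulative *= 1 - prop_term
        rows.append({"start": start, "entering": entering, "withdrawn": withdrawn,
                     "exposed": exposed, "events": terminal,
                     "prop_term": prop_term, "cum_surviving": cumulative})
        start = end
    return rows


# Months survived and status (1 = died, 0 = censored) for each drug
drug0 = life_table([1, 2, 2, 3, 4, 5, 11], [1, 1, 1, 1, 1, 1, 1], last_start=12)
drug1 = life_table([7, 8, 10, 10, 12], [1, 1, 1, 1, 0], last_start=12)

for row in drug0:
    print(row["start"], row["entering"], row["events"], round(row["cum_surviving"], 2))
```

The cumulative proportion surviving is simply the running product of the interval-by-interval proportions surviving.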


It’s my assumption that all the times are collected and the median of those times is reported here. It should correspond closely to the intersection of survival functions and a horizontal line at 50% survival.


Tabachnick Table 11.1, p. 511 Analyzed using SPSS Kaplan-Meier

Analyze -> Survival -> Kaplan-Meier . . .

KM months BY drug
  /STATUS=outcome(1)
  /PRINT TABLE MEAN
  /PLOT SURVIVAL
  /TEST LOGRANK BRESLOW TARONE
  /COMPARE OVERALL POOLED.


[Define Event] had already been pressed when this screen shot was taken.


Kaplan-Meier [DataSet2] G:\MdbT\InClassDatasets\Survival(T&Bp511).sav

Case Processing Summary

drug     Total N  N of Events  Censored N  Censored Percent
0        7        7            0           .0%
1        5        4            1           20.0%
Overall  12       11           1           8.3%

Survival Table

Column key: Estimate and Std. Error refer to the cumulative proportion surviving at the time.

drug  Case  Time    Status  Estimate  Std. Error  N of Cumulative Events  N of Remaining Cases
0     1     1.000   1       .857      .132        1                       6
      2     2.000   1       .         .           2                       5
      3     2.000   1       .571      .187        3                       4
      4     3.000   1       .429      .187        4                       3
      5     4.000   1       .286      .171        5                       2
      6     5.000   1       .143      .132        6                       1
      7     11.000  1       .000      .000        7                       0

1     1     7.000   1       .800      .179        1                       4
      2     8.000   1       .600      .219        2                       3
      3     10.000  1       .         .           3                       2
      4     10.000  1       .200      .179        4                       1
      5     12.000  0       .         .           4                       0

Means and Medians for Survival Time

         Mean(a)                                       Median
drug     Estimate  Std. Error  95% Lower  95% Upper    Estimate  Std. Error  95% Lower  95% Upper
0        4.000     1.272       1.506      6.494        3.000     1.309       .434       5.566
1        9.400     .780        7.872      10.928       10.000    .894        8.247      11.753
Overall  6.250     1.081       4.131      8.369        5.000     2.598       .000       10.092

a. Estimation is limited to the largest survival time if it is censored.
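The Kaplan-Meier estimates, means, and medians above can be reproduced with a short Python sketch. It is an illustration under two assumptions, not SPSS's implementation: the mean is taken as the area under the step function out to the largest observed time (compare footnote a), and the median is taken as the first event time at which the estimate is at or below .50. The function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated proportion surviving)] at each distinct event time."""
    curve, surviving = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surviving *= 1 - deaths / at_risk
        curve.append((t, surviving))
    return curve


def km_mean(curve, largest_time):
    """Area under the step function from 0 to the largest observed time."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (largest_time - prev_t)


def km_median(curve):
    """First event time at which the estimate is at or below .50, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


times0, events0 = [1, 2, 2, 3, 4, 5, 11], [1, 1, 1, 1, 1, 1, 1]
times1, events1 = [7, 8, 10, 10, 12], [1, 1, 1, 1, 0]  # the 12-month case is censored

curve0, curve1 = kaplan_meier(times0, events0), kaplan_meier(times1, events1)
mean0, mean1 = km_mean(curve0, max(times0)), km_mean(curve1, max(times1))
median0, median1 = km_median(curve0), km_median(curve1)
print([(t, round(s, 3)) for t, s in curve0], mean0, median0)
print([(t, round(s, 3)) for t, s in curve1], mean1, median1)
```

Note that the censored case at 12 months never lowers the Drug 1 curve; it only leaves the set of cases at risk.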

Overall Comparisons

                                Chi-Square  df  Sig.
Log Rank (Mantel-Cox)           3.747       1   .053
Breslow (Generalized Wilcoxon)  4.926       1   .026
Tarone-Ware                     4.522       1   .033

Test of equality of survival distributions for the different levels of drug.
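For readers who want to see where the Log Rank chi-square comes from, here is a minimal Python sketch of the standard two-group log-rank computation: at each event time, compare the deaths observed in one group with the number expected if both groups shared the same hazard. The function name is hypothetical, and the p-value uses the fact that a 1-df chi-square tail probability equals erfc(sqrt(x/2)).

```python
import math


def logrank(times, events, groups):
    """Two-group log-rank test; groups are coded 0 and 1. Returns (chi2, p)."""
    observed = expected = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)                             # at risk, both groups
        n0 = sum(1 for u, g in zip(times, groups) if u >= t and g == 0)  # at risk, group 0
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)   # deaths at t
        d0 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 0)                        # deaths in group 0
        observed += d0
        expected += d * n0 / n
        if n > 1:
            variance += d * (n0 / n) * (1 - n0 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


times = [1, 2, 2, 3, 4, 5, 11, 7, 8, 10, 10, 12]
events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
drug = [0] * 7 + [1] * 5

chi2, p = logrank(times, events, drug)
print(round(chi2, 3), round(p, 3))
```

The Breslow and Tarone-Ware tests use the same observed-minus-expected differences but weight early event times more heavily, which is why they can disagree with the log-rank result in a small sample like this one.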


As was the case with the analysis using the LIFE TABLES procedure, the results suggest that survival is longer with Drug 1. The Breslow and Tarone-Ware tests are significant at the .05 level; the Log Rank test is marginal (p = .053).


Note that censored cases are denoted with a + on the survival function.


Tabachnick Table 11.1, p. 511 (start here on 11/6/17)

Analyzed using SPSS Cox Regression

The program will not produce a survival curve for a group of cases defined by the value of a variable unless that variable is a categorical variable. (Reminds me of the RCMDR Factor issue.)

For that reason, I told the program that drug is a categorical variable so that survival curves for each value of drug could be obtained.

Since drug is a dichotomy, the analysis could be done without labeling it categorical, but in that case the survival curves for each value of drug could not have been generated.


The left panel would yield 1 plot. The right panel yields a plot for each value of drug.

COXREG months
  /STATUS=outcome(1)
  /PATTERN BY drug
  /CONTRAST (drug)=Indicator(1)
  /METHOD=ENTER drug
  /PLOT SURVIVAL
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Cox Regression

[DataSet2] G:\MdbT\InClassDatasets\Survival(T&Bp511).sav

Case Processing Summary

                                                                               N   Percent
Cases available in analysis  Event(a)                                          11  91.7%
                             Censored                                          1   8.3%
                             Total                                             12  100.0%
Cases dropped                Cases with missing values                         0   .0%
                             Cases with negative time                          0   .0%
                             Censored cases before the earliest event in a stratum  0   .0%
                             Total                                             0   .0%
Total                                                                          12  100.0%

a. Dependent Variable: months

Categorical Variable Codings(b)

            Frequency  (1)
drug(a)  0  7          0
         1  5          1

a. Indicator Parameter Coding
b. Category variable: drug

As mentioned above, if you want separate predicted survival functions for each value of a categorical variable, put the name of that categorical variable here.

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
40.740

Block 1: Method = Enter

Omnibus Tests of Model Coefficients(a)

                   Overall (score)       Change From Previous Step  Change From Previous Block
-2 Log Likelihood  Chi-square  df  Sig.  Chi-square  df  Sig.       Chi-square  df  Sig.
37.394             3.469       1   .063  3.346       1   .067       3.346       1   .067

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

      B       SE    Wald   df  Sig.  Exp(B)
drug  -1.176  .658  3.192  1   .074  .309

Covariate Means and Pattern Values

            Pattern
      Mean  1     2
drug  .417  .000  1.000

I strongly recommend that you create a plot such as the one immediately above by hand to make sure you understand the Cox Regression results. I do it every time I use this procedure.


Cox regression coefficient signs are relative to death, not survival. So, a positive sign means that larger values of the independent variable have higher death rates. And negative signs mean that larger values of the independent variable have lower death rates.
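To see where B, Exp(B), and the -2 Log Likelihood values come from, the following Python sketch fits a one-predictor Cox model to the 12 cases by Newton-Raphson on the partial likelihood. It is an illustration, not SPSS's code: it assumes tied death times are handled with the Breslow approximation, an assumption supported by the fact that it gives 40.740 for the beginning block. The function names are hypothetical.

```python
import math


def partial_likelihood(beta, times, events, x):
    """Breslow partial log-likelihood, its first derivative, and information."""
    loglik = score = info = 0.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        risk_x = [xi for u, xi in zip(times, x) if u >= t]            # still at risk at t
        dead_x = [xi for u, e, xi in zip(times, events, x) if u == t and e == 1]
        w = [math.exp(beta * xi) for xi in risk_x]
        s0 = sum(w)
        s1 = sum(wi * xi for wi, xi in zip(w, risk_x))
        s2 = sum(wi * xi * xi for wi, xi in zip(w, risk_x))
        d = len(dead_x)
        loglik += beta * sum(dead_x) - d * math.log(s0)
        score += sum(dead_x) - d * s1 / s0
        info += d * (s2 / s0 - (s1 / s0) ** 2)
    return loglik, score, info


def cox_fit(times, events, x, iterations=20):
    """Newton-Raphson for a single predictor, starting from beta = 0."""
    beta = 0.0
    for _ in range(iterations):
        _, score, info = partial_likelihood(beta, times, events, x)
        beta += score / info
    loglik, _, info = partial_likelihood(beta, times, events, x)
    return beta, math.sqrt(1 / info), -2 * loglik


times = [1, 2, 2, 3, 4, 5, 11, 7, 8, 10, 10, 12]
events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
drug = [0] * 7 + [1] * 5

null_m2ll = -2 * partial_likelihood(0.0, times, events, drug)[0]
b, se, model_m2ll = cox_fit(times, events, drug)
hazard_ratio = math.exp(b)
print(round(null_m2ll, 3), round(model_m2ll, 3))
print(round(b, 3), round(se, 3), round(hazard_ratio, 3))
```

The negative B means that the hazard of death is lower for drug = 1. Exp(B) is about .31, so at any given time the estimated risk of dying with Drug 1 is roughly 31% of the risk with Drug 0. The change chi-square of 3.346 is the difference between the two -2 Log Likelihood values.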

[Hand-drawn sketch: Death (vertical axis) plotted against Drug (horizontal axis, values 1 and 0).]

In Cox Regression, we’re predicting DEATH, not survival.


COXREG plots are plots of predicted survival, not actual survival. In this sense, they’re like the tables and plots of estimated marginal means from GLM. I usually report observed survival functions, using Kaplan-Meier, rather than these predicted survival functions. However, these are certainly useful in situations in which you want to show what survival should be for specific groups at specific times controlling for the other variables in the equation.


Y-hats

The Cox-Regression plots are y-hat plots, not observed survival functions.

They are predicted survival, not actual survival.

Note, however, that they are predicted SURVIVAL curves, not death curves.
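The link between the death-oriented coefficient and the predicted survival curves can be sketched in a few lines of Python. Under the proportional hazards model, the predicted survival for a case with predictor value x is the baseline survival raised to the power exp(B*x). The baseline values below are hypothetical numbers chosen for illustration; only B = -1.176 comes from the output above.

```python
import math

B = -1.176  # coefficient for drug from the Variables in the Equation table

# Hypothetical baseline survival (drug = 0) at a few months; NOT taken from SPSS output
baseline = {1: 0.90, 3: 0.60, 6: 0.30, 12: 0.10}

# Proportional hazards: S(t | x) = S0(t) ** exp(B * x)
predicted_drug1 = {t: s ** math.exp(B * 1) for t, s in baseline.items()}

for t in baseline:
    print(t, baseline[t], round(predicted_drug1[t], 3))
```

Because exp(B) is less than 1 here, every predicted Drug 1 value lies above the corresponding baseline value: a negative coefficient for death shows up as a higher predicted survival curve.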

From Kaplan-Meier


Real Life Example: Turnover at a local Manufacturing Plant

1. Effect of Friends and/or family at the plant

In this study, turnover at a local manufacturing plant was studied. On the application blank, applicants were asked to indicate whether or not they had friends or family already working at the plant.

Some did not respond to this question. They’re included in the analysis.

A screen shot of the data editor

The variable, wsfr2, represents whether or not the applicant had friends at the company.

wsfr2 = 0.50 means yes.
wsfr2 = -0.50 means no.
wsfr2 = 0.15 means no info.

Wsfr2 was created to deal with missing values in a special way. The fact that the values are fractional has no bearing on the analyses. They could just as well have been 0, 1, 2 or 1, 2, 3.

Having said that, because the LIFE TABLES procedure requires integer values of each factor, I’ll skip it here.

Kaplan-Meier analysis is shown

Some of SPSS’s procedures are written so that a grouping variable can have any kind of values. K-M is one of them.

K-M allows you to simply specify the name of the factor, and the program figures out how many groups are implied by the values of the factor.

That’s good unless you have a grouping variable with some incidental values representing unique cases or groups of cases – cases you wish to be excluded from the analysis.


KM dos BY wsfr2
  /STATUS=status(1)
  /PRINT TABLE MEAN
  /PLOT SURVIVAL
  /TEST LOGRANK BRESLOW TARONE
  /COMPARE OVERALL POOLED.


Kaplan-Meier

[DataSet3] G:\MdbR\1TurnoverArticle\TurnoverArticleDataset061005.sav

Large table was deleted.

Case Processing Summary

wsfr2 (Whether F/F at company for whole sample analyses)

wsfr2                            Total N  N of Events  Censored N  Censored Percent
-.50                             423      174          249         58.9%
.15 Whole sample missing value   100      40           60          60.0%
.50                              778      220          558         71.7%
Overall                          1301     434          867         66.6%

Means and Medians for Survival Time

wsfr2 (Whether F/F at company for whole sample analyses)

                                 Mean(a)                                       Median
wsfr2                            Estimate  Std. Error  95% Lower  95% Upper    Estimate  Std. Error  95% Lower  95% Upper
-.50                             610.597   25.559      560.500    660.693      667.000   .           .          .
.15 Whole sample missing value   579.795   49.233      483.299    676.291      528.000   151.013     232.014    823.986
.50                              769.900   18.559      733.524    806.277      .         .           .          .
Overall                          706.965   15.009      677.548    736.383      .         .           .          .

a. Estimation is limited to the largest survival time if it is censored.

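Note that there is no estimate of median survival for the 0.50 group. I’m not absolutely sure why, but I believe it’s because the estimated proportion surviving in that group never dropped to 50% within the observation window; that is, more than half of the group was estimated to be still on the job when the window closed. For that reason, a median was not computable. (The percent censored alone doesn’t settle this: the -.50 group was 58.9% censored and still has a median, because what matters is whether the estimated survival function reaches 50%.)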
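A small Python sketch shows how a Kaplan-Meier median can fail to exist. The eight tenure values below are hypothetical, not the plant data, and the function names are made up; the median is taken, as a working assumption, to be the first event time at which the estimated proportion surviving is at or below .50.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated proportion surviving)] at each distinct event time."""
    curve, surviving = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        quits = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surviving *= 1 - quits / at_risk
        curve.append((t, surviving))
    return curve


def km_median(curve):
    """First event time with estimate at or below .50; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical days on the job; status 1 = quit, 0 = still employed at window close
times = [100, 250, 400, 900, 900, 900, 900, 900]
status = [1, 1, 1, 0, 0, 0, 0, 0]

curve = kaplan_meier(times, status)
median = km_median(curve)
print(curve, median)
```

The estimated curve bottoms out at .625, so there is no time at which half the group has left, and the median comes back empty, just as SPSS prints a period.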

Overall Comparisons

                                Chi-Square  df  Sig.
Log Rank (Mantel-Cox)           25.344      2   .000
Breslow (Generalized Wilcoxon)  25.325      2   .000
Tarone-Ware                     25.004      2   .000

Test of equality of survival distributions for the different levels of wsfr2 (Whether F/F at company for whole sample analyses).

Clearly there are significant differences in overall survival between the groups.

Legend: -.50 = No friends; .15 = No info; .50 = Had friends

The data strongly suggest that applicants who had friends or family at the company had higher survival rates at all times up to 1100 days (about 3 years).

For example, at the end of 1 year (leftmost arrow in the above figure), the survival rate of those with friends or family was about 70%, while that for those who said they did not have friends or family at the organization was about 60%.

By two years (middle arrow), the rate of retention of those with friends or family was about 68%, while the rate of those without had decreased to 50%.

The fact that the curve for those for whom no information was available was between the other two curves suggests that those employees for whom no information was available were a mixture of some who did have friends and family and those who did not.

[Figure: Kaplan-Meier survival functions for three groups (Had friends or family, No friends or family, Missing response), with arrows marking 1 year, 2 years, and 3 years.]

Note the huge difference in proportion surviving after two years: almost 20 percentage points between those with friends and those without friends.

Same analysis using SPSS Cox Regression

Analyze -> Survival -> Cox Regression . . .

In my limited experience with group coding variables in survival analysis, I’ve found that Dummy Variable (Indicator in SPSS) coding is the one that is most useful.


COXREG dos
  /STATUS=status(1)
  /PATTERN BY wsfr2
  /CONTRAST (wsfr2)=Indicator
  /METHOD=ENTER wsfr2
  /PLOT SURVIVAL
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Cox Regression

Case Processing Summary

                                                                               N     Percent
Cases available in analysis  Event(a)                                          434   33.4%
                             Censored                                          867   66.6%
                             Total                                             1301  100.0%
Cases dropped                Cases with missing values                         0     0.0%
                             Cases with negative time                          0     0.0%
                             Censored cases before the earliest event in a stratum  0     0.0%
                             Total                                             0     0.0%
Total                                                                          1301  100.0%

a. Dependent Variable: dos

About 1/3 of the employees (33.4%) terminated within the window of observation. The other 2/3 (66.6%) are censored cases, i.e., they had not been observed to terminate when the window of observation closed.

Remember, the variable representing different groups must have been specified as categorical.

Categorical Variable Codings(a)

                                          Frequency  (1)  (2)
wsfr2(b)  -.50=-.50                       423        1    0
          .15=Whole sample missing value  100        0    1
          .50=.50                         778        0    0

a. Category variable: wsfr2 (Whether F/F at company for whole sample analyses)
b. Indicator Parameter Coding

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
5871.672

Block 1: Method = Enter

Omnibus Tests of Model Coefficients(a)

                   Overall (score)       Change From Previous Step  Change From Previous Block
-2 Log Likelihood  Chi-square  df  Sig.  Chi-square  df  Sig.       Chi-square  df  Sig.
5847.172           25.290      2   .000  24.499      2   .000       24.499      2   .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

          B     SE    Wald    df  Sig.  Exp(B)
wsfr2                 24.812  2   .000
wsfr2(1)  .489  .102  23.230  1   .000  1.631
wsfr2(2)  .425  .172  6.104   1   .013  1.529

Recall that the sign of each coefficient is relative to “Termination”.

wsfr2(1) compares the risk (hazard) of terminating in the -.50 group to that in the +.50 group.

Since the coefficient is +.489, this says that the -.50 group has a larger probability of terminating at any given time than the .50 group.

Same for wsfr2(2): the no response group has a greater probability of terminating than the +.50 group.
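The Exp(B) and Wald columns can be checked by hand from B and SE. The Python sketch below does only that arithmetic; the dictionary layout and variable names are hypothetical. Because the printed B and SE are rounded to three decimals, the recomputed values agree with the SPSS output only approximately.

```python
import math

# (B, SE) as printed in the Variables in the Equation table
coefficients = {
    "wsfr2(1)": (0.489, 0.102),  # no friends/family vs. had friends/family
    "wsfr2(2)": (0.425, 0.172),  # no response       vs. had friends/family
}

results = {}
for name, (b, se) in coefficients.items():
    hazard_ratio = math.exp(b)  # Exp(B): ratio of hazards to the reference group
    wald = (b / se) ** 2        # Wald chi-square with 1 df
    results[name] = (hazard_ratio, wald)
    print(name, round(hazard_ratio, 3), round(wald, 2))
```

An Exp(B) of about 1.63 says that, at any given time, the estimated risk of terminating for those without friends or family is about 63% higher than for those with friends or family.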

Covariate Means and Pattern Values

Mean

Pattern

1 2 3

wsfr2(1) .325 1.000 .000 .000

wsfr2(2) .077 .000 1.000 .000

This table gives the covariate values used for each predicted survival curve: Pattern 1 is the -.50 group, Pattern 2 is the missing-value group, and Pattern 3 is the +.50 group.

The reference group is the wsfr2 = +0.50 “Have Friends” group.

Predicted survival for the whole sample.

These are predicted survival curves, which is why they’re so smooth.

We might use these data to read the minds of those who did not respond to the “Do you have friends or family?” question. Their survival function resembles the “No Friends” function, which suggests that most of them did not have friends or family at the organization.

Using Survival Analysis to score and validate selection test questions

An I/O consulting firm gave a 30-question pre-employment questionnaire to 1000+ employees of a local company. Each question had from one to five alternatives. The consulting company wanted to identify questions that predicted long tenure with the organization. (They would have preferred to identify questions that predicted high performance, but it was not possible to get good performance data. Don’t get me started on why organizations don’t gather good performance data.)

In order to identify responses associated with long tenure, a survival analysis was conducted for each question. A few of the analyses are presented below.

For each survival function, each curve is the survival function of persons who made a particular response to the item. I picked only those for which the difference in survival curves was significant or approached significance.

Question 1

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             5.382        2    .068
Breslow (Generalized Wilcoxon)    4.307        2    .116
Tarone-Ware                       4.756        2    .093

The numbers represent the 3 possible responses to the question, coded as +1, 0, -1.

For this question, I believe we treated +1 as an indicator of long tenure and both 0 and -1 as indicators of short tenure.

[Figure: survival curves for Question 1; curve labels: +1, 0, -1?]

Question 2

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             7.647        4    .105
Breslow (Generalized Wilcoxon)    6.950        4    .139
Tarone-Ware                       7.298        4    .121

[Figure: survival curves for Question 2; curve labels: +1, 0, -1?]

As in the case of the question on the previous page, the response coded as +1 was treated as an indicator of long tenure and all other responses were treated as indicators of short tenure.

Question 3

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             5.070        3    .167
Breslow (Generalized Wilcoxon)    5.525        3    .137
Tarone-Ware                       5.493        3    .139

Test of equality of survival distributions for the different levels of GenQ4 (GenQ4 L: I prefer a job that / S: How often you experience conflict with a co-worker?).

[Figure: survival curves for Question 3; curve labels: +1, 0]

There were very few persons who responded +1 or 0, but those who did were treated as long tenure and those who responded 0 as short tenure.

Question 4

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             7.753        4    .101
Breslow (Generalized Wilcoxon)    6.762        4    .149
Tarone-Ware                       7.439        4    .114

Test of equality of survival distributions for the different levels of GenQ3 (GenQ3 L: Received safety training? / S: You are asked to do more physically demanding work than you were hired to do because someone out sick, how do you react?).

[Figure: survival curves for Question 4; curve labels: +1 and 0, -1]

+1: Long tenure. Else: Short tenure.

Question 5

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             10.971       4    .027
Breslow (Generalized Wilcoxon)     9.931       4    .042
Tarone-Ware                       10.597       4    .031

Test of equality of survival distributions for the different levels of GenQ2 (GenQ2 L: Your team in disagreement over who will clean the floor. What method is fair? / S: Recent supervisor rate dependability?).

[Figure: survival curves for Question 5; curve labels: +1, 0; annotations: Long Tenure, Short Tenure]

Question 6

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)              8.052       3    .045
Breslow (Generalized Wilcoxon)    12.729       3    .005
Tarone-Ware                       10.614       3    .014

Test of equality of survival distributions for the different levels of GenQ1 (GenQ1 L: Which strategies inspire a team and help be more effective? / S: Your team in disagreement over who will clean the floor. What method is fair?).

[Figure: survival curves for Question 6; curve labels: +1, 0]

Creation of an overall Tenure Index

Thirty questions were evaluated in the above fashion.

The individual survival analyses for the 30 questions were then examined, as shown above, to identify the questions for which survival differed significantly between responses.

Finally, an overall index was calculated, using syntax like the following . . .

In this particular case, the response associated with long survival added 1 to the index.

The response associated with short survival subtracted 1 from the index.

Tenure Scale Computation

* Start every case at 0, add 1 for each long-tenure response, subtract 1 for each short-tenure response.
compute genshort=0.
if (genq1=3 or genq1=4) genshort=genshort + 1.
if (genq1=1 or genq1=2) genshort=genshort - 1.
if (genq2=3 or genq2=4) genshort=genshort + 1.
if (genq2=1 or genq2=2 or genq2=5) genshort=genshort - 1.
if (genq6=3) genshort=genshort + 1.
if (genq6=1 or genq6=2) genshort=genshort - 1.
if (genq12=1) genshort=genshort + 1.
if (genq12=3) genshort=genshort - 1.
if (genq13=1) genshort=genshort + 1.
if (genq13=2 or genq13=3 or genq13=4) genshort=genshort - 1.
if (genq21=1 or genq21=3) genshort=genshort + 1.
if (genq21=2) genshort=genshort - 1.
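The same scoring rule can be written as a lookup table, which makes it easier to see which responses are keyed for each question. This is a minimal Python sketch, not the syntax actually used in the project: the key is transcribed from the SPSS statements above, while the function name and the example applicant are hypothetical.

```python
# Scoring key transcribed from the SPSS syntax above:
# question -> (responses that add 1, responses that subtract 1).
SCORING_KEY = {
    "genq1": ({3, 4}, {1, 2}),
    "genq2": ({3, 4}, {1, 2, 5}),
    "genq6": ({3}, {1, 2}),
    "genq12": ({1}, {3}),
    "genq13": ({1}, {2, 3, 4}),
    "genq21": ({1, 3}, {2}),
}

def tenure_index(responses):
    """Sum +1 / -1 over the keyed questions; unkeyed or missing responses add 0."""
    index = 0
    for question, (plus, minus) in SCORING_KEY.items():
        answer = responses.get(question)
        if answer in plus:
            index += 1
        elif answer in minus:
            index -= 1
    return index

# Hypothetical applicant, made up for illustration.
applicant = {"genq1": 3, "genq2": 5, "genq6": 3, "genq12": 2, "genq13": 1, "genq21": 2}
print(tenure_index(applicant))
```

With six keyed questions, the index from this key can range from -6 to +6.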


Validity of the Tenure Index

The following is not based on the scale above but on a similar scale.

The median score on the scale was determined to be -14. Group 0 was all employees with an index value less than or equal to -14 – persons who generally responded with the “short tenure” answer.

Group 1 was all employees with an index value greater than -14 – persons who generally responded with the “long tenure” answer.

The graph indicates that those in Group 1, with large values of the index, had a nearly 70% retention rate after 50 months.

Those in Group 0 had a 40% retention rate after the same length of time.

The implication of this analysis would be to recommend to the company to use the scale in hiring of employees, giving preference in hiring to those with higher scores on the scale.

Remember that these responses were obtained at time of application. The effect lasted for 4 years.

Potential problems

The above curves were based on the same sample that was used to select the questions, so clearly there is capitalization on chance. The scale should be tested on a different sample. That is, the results need to be cross-validated.
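One common way to set up such a cross-validation is to split the cases at random before any questions are selected: questions are chosen on one half, and the finished scale is evaluated on the other. The following is only a sketch of that split in Python; the function name, the 50/50 fraction, the seed, and the case IDs 1 to 1000 are all assumptions for illustration, not details of the actual study.

```python
import random

def split_sample(case_ids, holdout_fraction=0.5, seed=0):
    """Randomly split case IDs into a development sample and a holdout sample."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split can be reproduced
    n_holdout = int(len(ids) * holdout_fraction)
    return ids[n_holdout:], ids[:n_holdout]

# Hypothetical case IDs 1..1000.
development, holdout = split_sample(range(1, 1001))
print(len(development), len(holdout))
```

The survival curves for the high and low index groups would then be drawn using only the holdout cases.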

[Figure: survival curves for Group 0 (low tenure) and Group 1 (high tenure); time axis marked at 1 yr, 2 yr, 3 yr, 4 yr]

Multivariate Analysis using Cox Regression

Turnover as a function of 1) friends at the organization (wsfr2), 2) sex of the employee (nsex), and 3) ethnic group of the employee (neth).

COXREG dos
  /STATUS=status(1)
  /PATTERN BY wsfr2
  /CONTRAST (neth)=Indicator(1)
  /CONTRAST (wsfr2)=Indicator
  /METHOD=ENTER wsfr2 nsex neth
  /PLOT SURVIVAL
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Wsfr2   -0.50   Does not have friends at company
         0.15   No info on whether has friends
         0.50   Friends at the company

Nsex    1   Female
        2   Male

Neth    1   Employee is White
        2   Employee is Black
        3   Employee is American Indian or Asian or Hispanic

Cox Regression

[DataSet1] G:\MDBR\1TurnoverArticle\TurnoverArticleDataset061005.sav

Case Processing Summary

                                                                  N      Percent
Cases available in analysis   Event (a)                           434     33.4%
                              Censored                            867     66.6%
                              Total                              1301    100.0%
Cases dropped                 Cases with missing values             0      0.0%
                              Cases with negative time              0      0.0%
                              Censored cases before the
                              earliest event in a stratum           0      0.0%
                              Total                                 0      0.0%
Total                                                            1301    100.0%

a. Dependent Variable: dos Days of service: termdate-effdate or 3/1/1-effdate or 12/31/4-effdate

Categorical Variable Codings (a, c)

                                            Frequency   (1)   (2)
wsfr2 (b)   -.50=-.50                          423       1     0
            .15=Whole sample missing value     100       0     1
            .50=.50                            778       0     0
neth (b)    1.00=White                         903       0     0
            2.00=Black                         324       1     0
            3.00=Am Ind,Asian,Hisp              74       0     1

a. Category variable: wsfr2 (Whether F/F at company for whole sample analyses)

b. Indicator Parameter Coding

c. Category variable: neth (1=White, 2=Black, 3=Am Ind,Asian, Hisp)

No interactions were included in this analysis.

Note that wsfr2 = 0.50 (friends) is the reference group for wsfr2, and neth = 1 (White) is the reference group for neth.

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
5871.672

Block 1: Method = Enter

Omnibus Tests of Model Coefficients (a)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
5827.342      42.322       5    .000    44.330       5    .000      44.330       5    .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

            B       SE     Wald     df   Sig.   Exp(B)
wsfr2                      22.427   2    .000
wsfr2(1)    .464    .102   20.763   1    .000   1.590
wsfr2(2)    .421    .173    5.969   1    .015   1.524
nsex        -.223   .100    4.952   1    .026    .800
neth                       10.799   2    .005
neth(1)     .088    .109     .657   1    .417   1.092
neth(2)     -.908   .295    9.490   1    .002    .403

Covariate Means and Pattern Values

            Mean     Pattern
                     1       2       3
wsfr2(1)    .325     1.000   .000    .000
wsfr2(2)    .077     .000    1.000   .000
nsex        1.421    1.421   1.421   1.421
neth(1)     .249     .249    .249    .249
neth(2)     .057     .057    .057    .057

Remember we’re predicting “Termination”, not survival, so a positive B means a higher termination rate than in the group coded 0.

wsfr2(1): 1 = No Friends, 0 = Friends. We found this previously.

wsfr2(2): 1 = missing value (no info), 0 = Friends. We found this previously.

nsex: 1 = Female, 2 = Male.

neth(1): 1 = Black, 0 = White.

neth(2): 1 = Am Ind/Asian/Hisp, 0 = White.

The Kaplan-Meier Curve, for comparison . . .


Predicted

These graphs replicate what we found above without other variables (NSEX and NETH) in the equation.


Testing for Interactions in Cox Regression

1. The interaction of Friends and Nsex

Block 1: Method = Enter

Omnibus Tests of Model Coefficients (a)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
5824.879      44.989       7    .000    46.792       7    .000      46.792       7    .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

                 B       SE     Wald     df   Sig.   Exp(B)
wsfr2                            3.022   2    .221
wsfr2(1)         .429    .306    1.975   1    .160   1.536
wsfr2(2)         -.333   .530     .394   1    .530    .717
nsex             -.282   .138    4.158   1    .041    .754
neth                            10.964   2    .004
neth(1)          .097    .109     .800   1    .371   1.102
neth(2)          -.907   .295    9.464   1    .002    .404
nsex*wsfr2                       2.517   2    .284
nsex*wsfr2(1)    .023    .213     .011   1    .915   1.023
nsex*wsfr2(2)    .541    .347    2.424   1    .119   1.717

So the lack of a significant interaction means there is no evidence that the effect of having friends differs between Females and Males.

To specify that an interaction be tested, click on the 1st variable name, then while holding down the CTRL key or Command on the Mac, click on the 2nd variable name.

Finally, click on the >a*b> button.


2. The interaction of Friends and Neth – assessed in a separate analysis.

Block 1: Method = Enter

Omnibus Tests of Model Coefficients (a)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
5820.584      49.194       9    .000    51.088       9    .000      51.088       9    .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

                    B        SE      Wald     df   Sig.   Exp(B)
wsfr2                                27.320   2    .000
wsfr2(1)            .599     .121    24.386   1    .000   1.820
wsfr2(2)            .623     .209     8.846   1    .003   1.864
nsex                -.224    .100     4.973   1    .026    .799
neth                                  6.603   2    .037
neth(1)             .298     .150     3.934   1    .047   1.347
neth(2)             -.465    .344     1.835   1    .176    .628
neth*wsfr2                            6.377   4    .173
neth(1)*wsfr2(1)    -.392    .230     2.906   1    .088    .675
neth(2)*wsfr2(1)    -1.222   .791     2.385   1    .123    .295
neth(1)*wsfr2(2)    -.534    .378     1.995   1    .158    .586
neth(2)*wsfr2(2)    -1.093   1.075    1.035   1    .309    .335

Again, the lack of a significant interaction means there is no evidence that the effect of Friends differs across the ethnic groups.

What the heck? What about the interaction of nsex and neth?

Block 1: Method = Enter – assessed again in a separate analysis.

Omnibus Tests of Model Coefficients (a)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
5827.298      42.395       7    .000    44.373       7    .000      44.373       7    .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

                B       SE     Wald     df   Sig.   Exp(B)
wsfr2                          22.438   2    .000
wsfr2(1)        .465    .102   20.791   1    .000   1.591
wsfr2(2)        .421    .173    5.941   1    .015   1.523
nsex            -.216   .118    3.365   1    .067    .806
neth                             .888   2    .642
neth(1)         .112    .327     .117   1    .732   1.118
neth(2)         -.739   .884     .700   1    .403    .477
neth*nsex                        .043   2    .979
neth(1)*nsex    -.018   .234     .006   1    .940    .982
neth(2)*nsex    -.125   .624     .040   1    .842    .883

Nope.
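Each interaction can also be tested by comparing -2 Log Likelihood values: the main-effects model gave 5827.342, and the drop in that value when an interaction is added is a chi-square with df equal to the number of added terms. The Python sketch below (not SPSS output) works this out for the three models above. The helper names are made up, and the p-value formula used is a closed form that holds only for even df, which happens to cover the 2 and 4 df needed here.

```python
import math

MAIN_EFFECTS_NEG2LL = 5827.342  # -2 Log Likelihood, model with wsfr2, nsex, neth only

# -2 Log Likelihood and number of added interaction terms for each model above.
interaction_models = {
    "nsex*wsfr2": (5824.879, 2),
    "neth*wsfr2": (5820.584, 4),
    "neth*nsex": (5827.298, 2),
}

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def lr_test(neg2ll_full, df):
    """Likelihood-ratio chi-square and p-value against the main-effects model."""
    chi_square = MAIN_EFFECTS_NEG2LL - neg2ll_full
    return chi_square, chi2_sf_even_df(chi_square, df)

for name, (neg2ll, df) in interaction_models.items():
    chi_square, p = lr_test(neg2ll, df)
    print(f"{name}: chi-square = {chi_square:.3f}, df = {df}, p = {p:.3f}")
```

These likelihood-ratio tests lead to the same conclusion as the Wald tests in the tables: none of the three interactions is significant.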

Comparing Turnover in two plants

A company was interested in determining the causes of turnover in two of its plants.

Plant A: One part of the preparation of food for sale to retailers is undertaken.

Plant B: A different part of the preparation of food for sale to retailers is undertaken.

Each plant is managed by a different person.

The overall “survival” of employees in the two plants, reploc=1 and reploc=2, is as follows . . .

filter off.
compute reploc = newloc.
value labels reploc 1 "A" 2 "B".
filter by useme.
KM dayswrkd by reploc
  /STATUS=termed(1)
  /PRINT MEAN
  /PLOT SURVIVAL
  /TEST LOGRANK BRESLOW TARONE
  /COMPARE OVERALL POOLED.

Kaplan-Meier

[DataSet1] G:\MDBR\???\AllEmployeesNN041025.sav

Case Processing Summary

                                      Censored
reploc     Total N    N of Events     N      Percent
1.00 A     310        126             184    59.4%
2.00 B     837        285             552    65.9%
Overall    1147       411             736    64.2%

Means and Medians for Survival Time

           Mean (a)                                               Median
                                   95% Confidence Interval                             95% Confidence Interval
reploc     Estimate   Std. Error   Lower Bound   Upper Bound      Estimate   Std. Error   Lower Bound   Upper Bound
1.00 A     355.796    16.345       323.760       387.832          377.000    39.815       298.962       455.038
2.00 B     424.911     9.197       406.884       442.938          559.000    .            .             .
Overall    407.357     8.081       391.519       423.195          489.000    33.040       424.242       553.758

a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             13.633       1    .000
Breslow (Generalized Wilcoxon)    10.203       1    .001
Tarone-Ware                       11.880       1    .001

Test of equality of survival distributions for the different levels of reploc.

filter off.

Clearly, employee “retention/survival” is best in Plant B – reploc = 2.

The manager of Plant A was pretty defensive.

[Figure: survival curves by plant; curve labels: Plant B, Plant A]

Are these differences in survival rates the same for the different ethnic groups employed by the company?

Perhaps the differences between buildings are due to the fact that the different buildings have different proportions of ethnic groups coupled with the fact that the different ethnic groups have different survival rates.

neweth * reploc Crosstabulation

                                                 reploc
                                                 1.00 A    2.00 B    Total
neweth   .00 White or Black   Count              130       219       349
                              % within reploc    41.4%     25.7%     29.9%
         1.00 Hispanic        Count              184       634       818
                              % within reploc    58.6%     74.3%     70.1%
Total                         Count              314       853       1167
                              % within reploc    100.0%    100.0%    100.0%
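The “% within reploc” rows are just each count divided by its plant's column total. As a quick check, a short Python sketch (the function name is made up for illustration) reproduces them from the counts in the crosstabulation:

```python
# Counts copied from the crosstabulation above: counts[ethnic group][plant].
counts = {
    "White or Black": {"A": 130, "B": 219},
    "Hispanic": {"A": 184, "B": 634},
}

def percent_within_plant(counts):
    """Return each ethnic group's share of each plant's employees, in percent."""
    plants = ["A", "B"]
    totals = {p: sum(row[p] for row in counts.values()) for p in plants}
    return {
        group: {p: 100.0 * row[p] / totals[p] for p in plants}
        for group, row in counts.items()
    }

shares = percent_within_plant(counts)
for group, row in shares.items():
    print(group, {p: round(v, 1) for p, v in row.items()})
```

The point of the comparison is the gap between the plants: Plant B has a noticeably larger share of Hispanic employees than Plant A.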

These differences suggest that the difference in survival between buildings might be a side-effect of the difference in proportion of Hispanics in the two buildings combined with the difference in survival between Hispanics and White/Black employees.

The way to resolve this issue is to perform a multivariate analysis, assessing the Plant effect while controlling for the Ethnic Group effect.

Of the procedures covered here, Cox Regression is the one that can do this with both variables in the same equation.

[Figure: survival curves by ethnic group; curve labels: Hispanic, White/Black]

Multivariate analysis: joint effect of plant and ethnic group.

filter off.
filter by useme.
COXREG dayswrkd
  /STATUS=termed(1)
  /METHOD=ENTER reploc neweth
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Cox Regression

Case Processing Summary

                                                                  N      Percent
Cases available in analysis   Event (a)                           411     33.9%
                              Censored                            736     60.7%
                              Total                              1147     94.6%
Cases dropped                 Cases with missing values            65      5.4%
                              Cases with negative time              0      0.0%
                              Censored cases before the
                              earliest event in a stratum           0      0.0%
                              Total                                65      5.4%
Total                                                            1212    100.0%

a. Dependent Variable: dayswrkd

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
5312.092

Block 1: Method = Enter

Omnibus Tests of Model Coefficients (a)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
5222.145      101.652      2    .000    89.947       2    .000      89.947       2    .000

a. Beginning Block Number 1. Method = Enter

Variables in the Equation

          B       SE     Wald     df   Sig.   Exp(B)
reploc    -.159   .111    2.072   1    .150   .853
neweth    -.916   .102   80.462   1    .000   .400

Covariate Means

          Mean
reploc    1.730
neweth    .697

filter off.

So, when controlling for differences in ethnic groups, no significant difference in survival (turnover) between the two buildings was found (reploc: B = -.159, p = .150). The manager of Building A was very happy with this result.
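One way to see the same result is to turn each B and SE into a hazard ratio with an approximate 95% confidence interval, exp(B ± 1.96·SE); an interval that includes 1 corresponds to a nonsignificant Wald test. The Python sketch below does this with the rounded values from the table. It is an illustration only: the function name is made up, and SPSS did not print these intervals in the output shown.

```python
import math

# B and SE copied from the "Variables in the Equation" table above.
estimates = {"reploc": (-0.159, 0.111), "neweth": (-0.916, 0.102)}

def hazard_ratio_ci(b, se, z=1.96):
    """Exp(B) with an approximate 95% Wald interval, exp(B - z*SE) to exp(B + z*SE)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

for name, (b, se) in estimates.items():
    hr, lower, upper = hazard_ratio_ci(b, se)
    print(f"{name}: Exp(B) = {hr:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

The interval for reploc straddles 1, while the interval for neweth lies entirely below 1.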


Since both factors – reploc and neweth – are dichotomous, I did not bother to identify them as categorical variables for SPSS. I will not be able to get survival curves for the individual combinations, though, because they’re not identified as categorical.

Survival Analysis of a phenomenon with a positive outcome

PEG vs. PEGJ Example – skipped in 2018

The data for this example compared two methods of feeding trauma patients, one using a percutaneous endoscopic gastrojejunostomy (PEGJ) and the other using a percutaneous endoscopic gastrostomy (PEG). It was hoped that the data would show that the PEGJ technique would provide continuous uninterrupted nutrition with greater consistency than PEG. Time to reach a nutrition goal was the continuous dependent variable. Patients were observed for 14 days. Whether or not a patient reached the goal was the status. Reaching the goal was the +1 state. A patient who had not reached the goal in 14 days was treated as a censored case. Group=1 is the PEGJ group. Group=2 is the PEG group.

NUTRSD    NUTRGOAL  DAYSGOAL  GOALIN14  GROUP  ISS  AGE
02/15/98  02/16/98   1        1         1      29   43
01/10/98  01/12/98   2        1         1       5   88
02/14/98  02/18/98   4        1         1      29   37
02/02/98  02/06/98   4        1         1      27   36
01/10/98  01/13/98   3        1         1      13   92
01/09/98  .         15        0         2      19   73
01/02/98  01/04/98   2        1         2      26   42
01/20/98  01/22/98   2        1         2      36   55
03/18/98  .          5        1         1      27   23
02/04/98  02/06/98   2        1         2      13   72
01/23/98  .         15        0         2      10   45
02/01/98  02/02/98   1        1         1      22   59
02/20/98  02/21/98   1        1         1      17   54
02/03/98  02/04/98   1        1         2      14   78
03/31/98  04/02/98   2        1         2      18   30
04/13/98  04/15/98   2        1         2      27   49
05/08/98  05/09/98   1        1         2       9   22
04/14/98  04/20/98   6        1         2       9   60
05/27/98  05/28/98   1        1         1      17   27
05/13/98  .         15        0         2      29   95
05/07/98  05/16/98   9        1         2      25   31
04/16/98  04/17/98   1        1         2      32   31
03/23/98  03/25/98   2        1         2      20   41
04/07/98  04/08/98   1        1         2      16   29
03/29/98  03/30/98   1        1         1      25   24
04/30/98  05/01/98   1        1         2      29   52
05/05/98  05/08/98   3        1         2      38   79
05/28/98  05/30/98   2        1         1       4   76
06/08/98  06/10/98   2        1         2      16   70
05/27/98  05/28/98   1        1         1       9   27
04/27/98  04/29/98   2        1         1      22   87
04/10/98  04/11/98   1        1         1      27   36
02/26/98  03/04/98   6        1         1      25   54
03/27/98  03/28/98   1        1         1      29   22
04/17/98  04/18/98   1        1         1      22   22
02/25/98  03/05/98   8        1         1      25   79
03/18/98  03/19/98   1        1         1      25   56
01/28/98  01/29/98   1        1         1      17   66
03/23/98  03/24/98   1        1         1      16   20
04/29/98  05/03/98   4        1         1      26   22
07/19/98  08/02/98  14        1         2      34   33
08/13/98  08/15/98   2        1         1      25   49
08/25/98  .         15        0         2      26   77
10/06/98  10/07/98   1        1         2      34   19
09/10/98  09/11/98   1        1         2      27   36
08/14/98  08/15/98   1        1         1      30   35
08/25/98  08/27/98   2        1         2      27   29
09/20/98  09/21/98   1        1         2      36   62
09/29/98  10/01/98   2        1         2      17   19
10/09/98  .         15        0         2      38   74

DAYSGOAL is the “length of the arrow” variable in the first handout.

GOALIN14 is a variable which represents whether the goal was reached or not.

GOALIN14=1 means that the goal was reached.

GOALIN14=0 means that the case is right-censored.
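For most rows in the listing, DAYSGOAL and GOALIN14 are consistent with a simple rule: count the days from NUTRSD to NUTRGOAL, and if no goal date was recorded, code the case as censored with DAYSGOAL = 15. The Python sketch below spells out that rule. It is an assumption about how the variables were derived, not the original data-preparation code, and the function and constant names are made up. One listed case (start date 03/18/98) has a missing goal date but DAYSGOAL = 5 and GOALIN14 = 1, which this rule would not reproduce.

```python
from datetime import datetime

WINDOW_DAYS = 14      # patients were observed for 14 days
CENSORED_VALUE = 15   # DAYSGOAL value the listing shows for censored cases

def days_and_status(start, goal):
    """Return (DAYSGOAL, GOALIN14) from mm/dd/yy strings; goal is '.' when missing."""
    if goal == ".":
        return CENSORED_VALUE, 0
    fmt = "%m/%d/%y"
    days = (datetime.strptime(goal, fmt) - datetime.strptime(start, fmt)).days
    if days > WINDOW_DAYS:
        # Assumed handling: a goal reached after the window counts as censored.
        return CENSORED_VALUE, 0
    return days, 1

print(days_and_status("02/26/98", "03/04/98"))
print(days_and_status("01/09/98", "."))
```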

NUTRSD    NUTRGOAL  DAYSGOAL  GOALIN14  GROUP  ISS  AGE
10/02/98  10/03/98   1        1         1      10   40
08/26/98  09/04/98   9        1         2      18   48
08/19/98  08/21/98   2        1         1      18   31
08/03/98  08/04/98   1        1         1      41   46
08/25/98  08/28/98   3        1         2      24   37
09/17/98  .         15        0         2      26   75
07/02/98  .         15        0         1      19   28
08/03/98  08/05/98   2        1         2      13   52
07/15/98  07/17/98   2        1         2      38   71
07/27/98  08/01/98   5        1         2      34   33
04/30/98  05/02/98   2        1         2       4   61
05/29/98  05/30/98   1        1         1      29   58
05/16/98  05/18/98   2        1         2      19   42
06/20/98  06/23/98   3        1         1      25   19
08/30/98  .         15        0         1      25   70
04/30/98  05/02/98   2        1         2      43   33
07/01/98  07/02/98   1        1         1      43   79
09/29/98  .         15        0         2      17   18
05/28/98  06/08/98  11        1         2      36   57
07/15/98  07/16/98   1        1         2      27   59
08/11/98  08/12/98   1        1         1      19   43
10/12/98  10/13/98   1        1         1      36   18
08/24/98  08/25/98   1        1         1      20   84
10/22/98  .         15        0         1      25   17
10/08/98  10/09/98   1        1         2      25   20
10/06/98  .         15        0         2      17   31
07/30/98  08/02/98   3        1         1      22   26
04/16/98  04/17/98   1        1         1      38   18
10/08/98  10/09/98   1        1         1      25   34
08/19/98  08/21/98   2        1         1      34   22
03/20/98  03/21/98   1        1         1      25   48
06/20/98  06/21/98   1        1         1      11   45
07/30/98  07/31/98   1        1         1      25   33
09/07/98  .         15        0         2      36   28
07/17/98  07/18/98   1        1         1      22   62
09/15/98  09/17/98   2        1         2      20   47
07/07/98  07/08/98   1        1         1      33   27
10/01/98  10/02/98   1        1         2      25   33
09/11/98  09/12/98   1        1         1      41   31

Specifying the analysis using Life Tables . . .

The output of LIFE TABLES

SURVIVAL TABLE=DAYSGOAL BY GROUP(1 2)
  /INTERVAL=THRU 15 BY 1
  /STATUS=GOALIN14(1)
  /PRINT=TABLE
  /PLOTS (SURVIVAL)=DAYSGOAL BY GROUP.

Survival Analysis

G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Survival Variable: DAYSGOAL

Life Table

Columns: Start = Interval Start Time; Enter = Number Entering Interval; Wdrw = Number Withdrawing during Interval; Expo = Number Exposed to Risk; Events = Number of Terminal Events; PTerm = Proportion Terminating; PSurv = Proportion Surviving; CumSurv = Cumulative Proportion Surviving at End of Interval; SE(Cum) = Std. Error of Cumulative Proportion Surviving at End of Interval; Dens = Probability Density; SE(Dens) = Std. Error of Probability Density; Haz = Hazard Rate; SE(Haz) = Std. Error of Hazard Rate.

First-order Controls: GROUP

GROUP  Start    Enter  Wdrw  Expo    Events  PTerm  PSurv  CumSurv  SE(Cum)  Dens  SE(Dens)  Haz  SE(Haz)
1        .000   46     0     46.000   0      .00    1.00   1.00     .00      .000  .000      .00  .00
1       1.000   46     0     46.000  28      .61     .39    .39     .07      .609  .072      .88  .15
1       2.000   18     0     18.000   6      .33     .67    .26     .06      .130  .050      .40  .16
1       3.000   12     0     12.000   3      .25     .75    .20     .06      .065  .036      .29  .16
1       4.000    9     0      9.000   3      .33     .67    .13     .05      .065  .036      .40  .23
1       5.000    6     0      6.000   1      .17     .83    .11     .05      .022  .022      .18  .18
1       6.000    5     0      5.000   1      .20     .80    .09     .04      .022  .022      .22  .22
1       7.000    4     0      4.000   0      .00    1.00    .09     .04      .000  .000      .00  .00
1       8.000    4     0      4.000   1      .25     .75    .07     .04      .022  .022      .29  .28
1       9.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
1      10.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
1      11.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
1      12.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
1      13.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
1      14.000    3     0      3.000   0      .00    1.00    .07     .04      .000  .000      .00  .00
2        .000   43     0     43.000   0      .00    1.00   1.00     .00      .000  .000      .00  .00
2       1.000   43     0     43.000  11      .26     .74    .74     .07      .256  .067      .29  .09
2       2.000   32     0     32.000  15      .47     .53    .40     .07      .349  .073      .61  .15
2       3.000   17     0     17.000   2      .12     .88    .35     .07      .047  .032      .13  .09
2       4.000   15     0     15.000   0      .00    1.00    .35     .07      .000  .000      .00  .00
2       5.000   15     0     15.000   1      .07     .93    .33     .07      .023  .023      .07  .07
2       6.000   14     0     14.000   1      .07     .93    .30     .07      .023  .023      .07  .07
2       7.000   13     0     13.000   0      .00    1.00    .30     .07      .000  .000      .00  .00
2       8.000   13     0     13.000   0      .00    1.00    .30     .07      .000  .000      .00  .00
2       9.000   13     0     13.000   2      .15     .85    .26     .07      .047  .032      .17  .12
2      10.000   11     0     11.000   0      .00    1.00    .26     .07      .000  .000      .00  .00
2      11.000   11     0     11.000   1      .09     .91    .23     .06      .023  .023      .10  .10
2      12.000   10     0     10.000   0      .00    1.00    .23     .06      .000  .000      .00  .00
2      13.000   10     0     10.000   0      .00    1.00    .23     .06      .000  .000      .00  .00
2      14.000   10     0     10.000   1      .10     .90    .21     .06      .023  .023      .11  .11

Median Survival Time

First-order Controls: GROUP

GROUP   Med Time
1       1.82
2       2.70
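Several of the life-table columns can be rebuilt from just the number entering and the number of terminal events in each interval. The Python sketch below does this for GROUP 1, under two simplifications that hold for this table: every interval is one day wide and no one withdraws during an interval. The function name and dictionary keys are made up for illustration; the hazard formula (events divided by the number exposed minus half the events) reproduces the printed Hazard Rate column for these data.

```python
def life_table(n_start, events_per_interval):
    """Actuarial life table for unit-width intervals with no withdrawals."""
    rows = []
    entering = n_start
    cumulative = 1.0
    for start, events in enumerate(events_per_interval):
        p_term = events / entering if entering else 0.0
        cumulative *= 1.0 - p_term
        # Hazard per unit time: events over the average number at risk in the interval.
        hazard = events / (entering - events / 2.0) if entering else 0.0
        rows.append({
            "start": start,
            "entering": entering,
            "events": events,
            "prop_terminating": p_term,
            "cum_surviving": cumulative,
            "hazard": hazard,
        })
        entering -= events
    return rows

# Terminal events per interval for GROUP 1, read from the table above.
group1_events = [0, 28, 6, 3, 3, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

for row in life_table(46, group1_events):
    print(f"{row['start']:>2} {row['entering']:>3} {row['events']:>3} "
          f"{row['prop_terminating']:.2f} {row['cum_surviving']:.2f} {row['hazard']:.2f}")
```

The printed columns are interval start, number entering, events, proportion terminating, cumulative proportion surviving, and hazard rate.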


[Figure: survival functions by GROUP (First-order Control: GROUP)]

These data are unusual because the “event” is something that is sought after (reaching a feeding goal) rather than something that is to be avoided (death or termination). So for these data, lower "survival" is preferred: the sooner a patient reached the nutrition goal, the better. Thus, the investigators hoped that patients in the PEGJ condition would reach the goal faster, leading to a lower "survival" curve. In this case, survival should be called "Failure to reach feeding goal."

Since the outcome is a good event, the faster the curve falls to zero, the better.

So the group performing best is the group with the lowest curve.


Analysis of the same data using Kaplan-Meier

KM DAYSGOAL BY GROUP
  /STATUS=GOALIN14(1)
  /PRINT TABLE MEAN
  /PLOT SURVIVAL HAZARD
  /TEST LOGRANK BRESLOW TARONE
  /COMPARE OVERALL POOLED.

Kaplan-Meier

G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Case Processing Summary

                                     Censored
GROUP      Total N    N of Events    N     Percent
1          46         43             3     6.5%
2          43         34             9     20.9%
Overall    89         77             12    13.5%

Means and Medians for Survival Time

           Mean (a)                                               Median
                                   95% Confidence Interval                             95% Confidence Interval
GROUP      Estimate   Std. Error   Lower Bound   Upper Bound      Estimate   Std. Error   Lower Bound   Upper Bound
1          2.717      .527         1.685         3.750            1.000      .            .             .
2          5.488      .857         3.808         7.169            2.000      .214         1.581         2.419
Overall    4.056      .517         3.043         5.069            2.000      .211         1.587         2.413

a. Estimation is limited to the largest survival time if it is censored.
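The GROUP 1 mean and median can be reproduced from the raw times with the product-limit (Kaplan-Meier) estimate. The following Python sketch is an illustration, not SPSS's implementation: the function names are made up, the event counts per day are taken from the life table for GROUP 1, and the three censored cases are entered at DAYSGOAL = 15. The mean is the area under the step curve up to the largest time, which is what footnote a describes.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, S(t)) in time order."""
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        died = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve

def median_time(curve):
    """Smallest event time at which S(t) is at or below .5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

def restricted_mean(curve, last_time):
    """Area under the step function from 0 to last_time."""
    area, previous_t, previous_s = 0.0, 0.0, 1.0
    for t, s in curve:
        area += previous_s * (t - previous_t)
        previous_t, previous_s = t, s
    return area + previous_s * (last_time - previous_t)

# GROUP 1: event counts by DAYSGOAL from the life table, plus 3 cases censored at 15.
event_counts = {1: 28, 2: 6, 3: 3, 4: 3, 5: 1, 6: 1, 8: 1}
times = [t for t, n in event_counts.items() for _ in range(n)] + [15] * 3
events = [1] * 43 + [0] * 3

curve = kaplan_meier(times, events)
print(median_time(curve), round(restricted_mean(curve, max(times)), 3))
```

The two printed values correspond to the Median and Mean estimates for GROUP 1 in the table.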


Overall Comparisons

                                  Chi-Square   df   Sig.
Log Rank (Mantel-Cox)             8.479        1    .004
Breslow (Generalized Wilcoxon)    9.588        1    .002
Tarone-Ware                       9.306        1    .002

Test of equality of survival distributions for the different levels of GROUP.

The same analysis using Cox Regression


One requirement of the Cox Regression analysis is that the hazard functions be proportional. That means that for any two values of a covariate, the ratio of the hazards for those two values must be constant across time.

This rules out hazard functions that cross, and also hazard functions that are parallel on the raw scale (a constant difference rather than a constant ratio).

Roughly speaking, the hazard functions should look like the following . . .

That is, when the hazard is rising over time, the hazard functions diverge over time.
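A small numerical illustration may help. In the Python sketch below, the baseline hazard is a hypothetical straight line chosen only for illustration, and B = -.542 is the GROUP coefficient that the Cox regression reports later in this example. Under proportional hazards the second group's hazard is the baseline multiplied by Exp(B) at every time point, so the ratio stays fixed while the difference between the two hazards grows as the baseline rises.

```python
import math

B = -0.542  # GROUP coefficient reported later in this example

def baseline_hazard(t):
    """Hypothetical rising baseline hazard, chosen only for illustration."""
    return 0.05 * t

def group_hazard(t):
    """Hazard for the comparison group under proportional hazards."""
    return math.exp(B) * baseline_hazard(t)

for t in (1, 5, 10):
    h0, h1 = baseline_hazard(t), group_hazard(t)
    print(f"t={t:>2}  ratio={h1 / h0:.3f}  difference={h0 - h1:.3f}")
```

If the baseline hazard were falling instead, the ratio would still be constant but the two curves would move closer together.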

COXREG DAYSGOAL
  /STATUS=GOALIN14(1)
  /PATTERN BY GROUP
  /CONTRAST (GROUP)=Indicator(1)
  /METHOD=ENTER GROUP
  /PLOT SURVIVAL HAZARD
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20).

Cox Regression

G:\MdbT\P595\P595AL07-Survival analysis\PEGPEGJData.sav

Case Processing Summary

                                                                  N     Percent
Cases available in analysis   Event (a)                           77     86.5%
                              Censored                            12     13.5%
                              Total                               89    100.0%
Cases dropped                 Cases with missing values            0       .0%
                              Cases with negative time             0       .0%
                              Censored cases before the
                              earliest event in a stratum          0       .0%
                              Total                                0       .0%
Total                                                             89    100.0%

a. Dependent Variable: DAYSGOAL

Categorical Variable Codings (b)

               Frequency   (1)
GROUP (a)  1   46          0
           2   43          1

a. Indicator Parameter Coding

b. Category variable: GROUP

Block 0: Beginning Block

Omnibus Tests of Model Coefficients

-2 Log Likelihood
618.281

Block 1: Method = Enter

Omnibus Tests of Model Coefficients (a, b)

-2 Log        Overall (score)           Change From Previous Step   Change From Previous Block
Likelihood    Chi-square   df   Sig.    Chi-square   df   Sig.      Chi-square   df   Sig.
612.895       5.448        1    .020    5.385        1    .020      5.385        1    .020

a. Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 618.281

b. Beginning Block Number 1. Method = Enter

Variables in the Equation

         B       SE     Wald    df   Sig.   Exp(B)
GROUP    -.542   .235   5.332   1    .021   .582

Covariate Means and Pattern Values

         Mean    Pattern
                 1       2
GROUP    .483    .000    1.000

The above graph presents predicted proportions. They are analogous to plots of y-hats vs. predictors in a regression analysis.

When you perform a Cox-regression analysis, you may also have to run a Kaplan-Meier analysis just for the observed survival curves the K-M procedure produces.

[Figure labels: curves 1 and 2; annotations: No Goal, Goal]