
A Short Introduction to Survival Analysis*

Ulrich Matter**

Last revision: 20 June 2012

Abstract

Survival analysis has become a widely used methodology in diverse fields of research such as medicine, economics and political science. This script gives a brief introduction to these statistical methods. It is meant to serve as a self-learning text, combining the basic theoretical background of survival analysis with practical applications. Moreover, it provides some recommendations on the basic literature as well as hints for reading on more advanced topics. It is intended for students beginning to study survival analysis as a companion to the introductory literature. The theoretical background of this script is mainly based on Alisson (2004), Kiefer (1988) and Kleinbaum and Klein (2005), while the practical applications build on Fox and Weisberg (2011).

*I am grateful to Thomas Braendle, Thorsten Henne, and Reto Odermatt for their helpful comments.

**University of Basel, Faculty of Business and Economics, Peter-Merian-Weg 6, 4002 Basel, Switzerland, phone: 0041-(0)61 267 33 03, [email protected]


Contents

1 Introduction: what is survival analysis?

2 Properties of survival data
2.1 Sampling methods
2.2 Censoring
2.3 A data set: financial aid for released prisoners

3 Analyzing survival data
3.1 Why not simple OLS or logit?
3.2 Basic concepts
3.3 Estimating survivor functions

4 Proportional hazards models
4.1 Parametric proportional hazards models
4.2 The Cox proportional hazards model
4.3 The Cox model with time-varying covariates
4.4 Tests and diagnostics

5 Further topics
5.1 Repeated events and competing risks
5.2 Unobserved heterogeneity

6 Recommended literature


1 Introduction: what is survival analysis?

Survival analysis is a branch of statistics that comprises a variety of “statistical methods designed to describe, explain or predict the occurrence of events” (Alisson 2004: 369). Originating in biostatistics, survival analysis has become a widely used methodology in many fields of research. Depending on the field, it is also called event history analysis (in sociology), failure time analysis (in engineering/reliability theory), transition analysis or duration analysis (in economics/econometrics). Survival analysis is applied to answer questions such as: Does smoking decrease the lifespan? Which manufacturing process increases the lifespan of a light bulb? What drives the duration of an individual’s unemployment status, the duration of a strike, or the duration of a recession? What makes some states adopt a certain policy earlier than others?

2 Properties of survival data

Generating survival data means observing a sample of research subjects (individuals) over a predefined time period and recording whether and when the individuals experience the event. Here, “[...] an event may be defined as a qualitative change that occurs at some particular point in time” (Alisson 2004: 369). Examples of such events are: an individual dies, an individual gets a job, or a state adopts a certain policy. Basic survival data consist of a variable measuring the time that has passed (the duration) before an individual experiences the event (or until the study ends) and a variable indicating whether the individual experiences that event during the observation period. The way the data are generated can have important implications for the analysis (Jenkins 2005: 3). The key consideration in generating survival data is how individuals enter the sample, or in other words: when does the observation period start?

2.1 Sampling methods

There are two important sampling methods to be considered: stock sampling and flow sampling. Collecting data by stock sampling means randomly choosing individuals who are currently in the state of interest (alive, unemployed, policy not yet adopted) and following them until a pre-specified date. This means that for all individuals the observation period begins at the same date and ends at the same date. For example, to study what drives the duration of unemployment one might take a random sample from all persons registered as unemployed on the first of August 2011 and observe them for one year.

Alternatively, collecting data with flow sampling implies randomly selecting individuals who enter the state of interest (being born, losing one’s job, facing a new policy invention) during a predefined interval of time and following them until a pre-specified date or for a pre-specified period of time. In terms of the unemployment example: take a random selection of all persons registering as unemployed between the first of August and the first of September 2011 and follow them until the first of September 2012. Figure 1 gives a graphical impression of survival data generated by flow or stock sampling.


Figure 1: Stock sampling, flow sampling and censoring

[Figure: two panels, “Flow Sampling” and “Stock Sampling”. Each panel plots individuals 1 to 4 against time. Vertical lines mark the start and stop of sampling (s.) and the stop of observation (obs.) in the flow panel, and the start and stop of observation in the stock panel.]

Notes: The vertical lines indicate the beginning and end of the observation (obs.) period and of the sampling (s.).

2.2 Censoring

Flow sampling usually has the advantage that one knows exactly when somebody enters the state of interest, which might not be the case for stock sampling.[1] In some cases, however, this is not clear under flow sampling either. Imagine you want to study the duration between an HIV infection and the outbreak of AIDS. Using flow sampling in that case involves a practical problem: HIV carriers can only enter your sample at the moment they have tested positive for HIV, but it is hardly possible to find out when they were actually infected (Kleinbaum/Klein 2005: 8). The problem of not knowing the exact point in time at which an individual has entered the state of interest is referred to as left-censoring. Another obvious problem, which occurs independently of the choice of sampling method, is that not all individuals will experience the event during the observation period. In such a case we only know that an individual has not experienced the event before the end of the observation period. This problem is referred to as right-censoring.[2] In Figure 1 censoring is indicated with a dotted line. In the case of flow sampling we know exactly when individual 4 has entered the state of interest, but not when her state will change (the event will occur). We only know that it has not occurred before the end of the observation period. In the case of stock sampling, by contrast, we know neither when individual 4 has entered the state of interest nor when her state will change.

Because left censoring is not very common whereas right censoring is an issue in almost all applications of survival data, the rest of this script is focused only on the case of right-censored data.

[1] In our unemployment example it might be possible to figure out when somebody has registered as unemployed, even when using stock sampling. In such cases the choice between flow and stock sampling is rather based on practical considerations.

[2] The issue of right-censoring also comes up if an individual leaves the risk set before the end of the observation period without having experienced the event. This is often the case in clinical studies in hospitals. Some patients might leave the risk set simply because they change hospitals and cannot be followed further.

Table 1: Example of survival data

week  arrest  fin  age  race  wexp  mar  paro  prio
  20       1    0   27     1     0    0     1     3
  17       1    0   18     1     0    0     1     8
  25       1    0   19     0     1    0     1    13
  52       0    1   23     1     1    1     1     1
  52       0    0   19     0     1    0     1     3
  52       0    0   24     1     1    0     0     2
  23       1    0   25     1     1    1     1     0
  52       0    1   21     1     1    0     1     4
  52       0    0   22     1     0    0     0     6
  52       0    0   20     1     1    0     0     0

Notes: Example of a data set with survival data: the first rows of the recidivism study data from Rossi et al. (1980).

2.3 A data set: financial aid for released prisoners

A real-world example of what a data set with survival data looks like is given in Table 1. It shows the first rows and columns of a data set based on an experimental study of recidivism by Rossi et al. (1980)[3], which is widely used as an example in the survival analysis literature outside medical research.[4] The data contain variables on 432 male prisoners who were followed for a year after having been released from prison. The focus of the study was on whether randomly assigned financial aid has a significant effect on a released prisoner being rearrested. The state of interest in this setting is “being a free man” and the event is being rearrested.

The first two columns are typical for survival data. The variable “week” is the week of the first arrest after having been released from prison or, if the released prisoner had not been arrested again, the week the observation period ended. In other words: the duration or the censoring time. The variable “arrest” indicates whether the released prisoner had been rearrested (arrest=1) or not (arrest=0). In survival analysis jargon: whether the event had occurred or the observation was censored. The rest of the columns in Table 1 show time-invariant personal characteristics of the released prisoners.
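The coding of the first two columns can be sketched in a few lines. This is only an illustration under assumptions: the function name `encode_record`, the raw input format (arrest week or `None`) and the sample values are hypothetical and are not taken from Rossi et al.

```python
def encode_record(arrest_week, study_end=52):
    """Return (week, arrest) for one released prisoner.

    arrest_week: week of the first rearrest, or None if no rearrest was observed.
    study_end: last week of the observation period (52 in the study described above).
    """
    if arrest_week is not None and arrest_week <= study_end:
        return arrest_week, 1   # event observed: the duration is complete
    return study_end, 0         # right-censored: we only know T exceeds study_end


# Hypothetical raw follow-up data (not the Rossi data).
raw = [20, 17, None, 52, None]
records = [encode_record(w) for w in raw]
print(records)
```

Note that a rearrest in week 52 and a censored observation share the same duration and differ only in the indicator, which is why both columns are needed.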

[3] The data are available as a text file at http://stat.ethz.ch/education/semesters/ss2011/seminar/Rossi.txt.

[4] See Fox and Weisberg (2011), whose examples are the basis for many examples in this script.

3 Analyzing survival data

In this script, I will focus on the most basic concepts and the most common methods of survival analysis likely to be relevant to economists. Before going into the methodological details of survival analysis, I will give some reasons why we need such distinctive methods.

3.1 Why not simple OLS or logit?

Given the data from above, one could be tempted to estimate a model like

\[ \log(T_i) = X_i'\beta + \varepsilon_i \tag{1} \]

where $T_i$ is the “week” variable and $X_i$ a vector of explanatory variables such as financial aid. Or alternatively

\[ p_i = P(Y_i = 1) = F(X_i'\beta) \tag{2} \]

where $Y_i$ is the “arrest” variable and $X_i$ a vector of explanatory variables such as financial aid.

As noted by Jenkins (2005), estimating (1) with plain OLS runs into a problem with the censored cases. Keeping them in the data means treating them as complete durations (occurred events) and therefore leads to a disproportionately high number of events at the censoring time. Excluding these cases from the data would over-represent events with short durations. In both cases the fitted OLS line would have the wrong slope. A usual way for an economist to deal with this problem would be to use a so-called Tobit model and estimate it by maximum likelihood. This might work if the explanatory variables do not vary over time. If time-varying covariates are included, a new problem arises because $T_i$ does not vary over time, so it is not clear which value of a time-varying covariate should be chosen.

Why not use an approach like (2)? Besides similar problems with the censored cases, estimating the probability that a released prisoner gets rearrested does not take account of differences in duration $T_i$. In other words: the timing of the arrest would be ignored (Alisson 2004: 370).
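The problem with censored cases can be made visible with a small simulation. The sketch below is a deliberately simplified stand-in for the regression in (1): it compares plain averages (an intercept-only model) instead of fitting OLS, and all numbers (exponential durations, a true mean of 100 weeks, the seed) are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)

CENSOR_AT = 52            # end of the observation period, in weeks
TRUE_MEAN = 100.0         # hypothetical true mean duration

# Hypothetical exponential durations; none of this is the Rossi data.
true_durations = [random.expovariate(1 / TRUE_MEAN) for _ in range(20000)]

# What the analyst sees: durations are cut off at the censoring time.
observed = [min(t, CENSOR_AT) for t in true_durations]
event = [t <= CENSOR_AT for t in true_durations]

# Option 1: keep censored cases and treat them as completed durations.
mean_keep = statistics.mean(observed)
# Option 2: drop the censored cases.
mean_drop = statistics.mean(t for t, e in zip(observed, event) if e)

print(round(statistics.mean(true_durations), 1), round(mean_keep, 1), round(mean_drop, 1))
```

Both shortcuts understate the mean duration, and dropping the censored cases does so more strongly because only the short durations remain.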

3.2 Basic concepts

For simplicity we assume time (and therefore duration) to be continuous. A very simple way to specify the probability distribution of continuous durations is the distribution function

\[ F(t) = P(T < t). \tag{3} \]

The distribution function gives the probability that a realization of the random variable $T$ is less than a value $t$. Furthermore, $f(t)$ is the density function corresponding to (3) and can thus be written as

\[ f(t) = \frac{dF(t)}{dt}. \tag{4} \]

An alternative specification of the probability distribution of duration, and an important concept in survival analysis, is the survivor function

\[ S(t) = 1 - F(t) = P(T \ge t), \tag{5} \]

which is the probability that a realization of the random variable $T$ is greater than or equal to $t$. In other words: the probability that the event has not yet occurred by time $t$.


To understand another important representation of such a distribution, imagine a game of dice (example based on Kiefer 1988: 648).[5] The game goes as follows: You start in round one and toss a die. If it is a 6 you win and get to round two. If it is not a 6 you lose and the game is over. In round two you toss the die again, and again you win with a 6 or lose with any other number, and so on. The game continues until you lose. We are interested in the probability of losing in round $t$; let's call this probability $f(t)$. Losing in round $t$ requires winning every round before $t$ and then losing round $t$, so $f(t)$ can be written as a product of conditional probabilities. Let's call the conditional probability of losing in round $t$, given a win in all previous rounds, $\lambda(t)$. If $\lambda(t) = \lambda$ is the same in every round, the probability of losing in, for example, round 5 can now be expressed as

\[ f(5) = \lambda (1 - \lambda)^4. \tag{6} \]

In survival analysis, $\lambda(t)$ is referred to as the hazard function of $t$. In this example, $\lambda(t)$ is the hazard function of “rounds played”. In the example of released prisoners $\lambda(t)$ would be the hazard function of “weeks in freedom”. Note that in the example of the dice game, $\lambda(t)$ is $5/6$ for all $t$ since all rounds are played with the same fair die. This might not be the case in other applications such as studying the duration of unemployment (this is an important aspect throughout survival analysis to which I will turn later in this script).
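The dice game is small enough to compute exactly. The following sketch uses exact fractions; the function names `f` and `S` mirror the notation of the text, and `S(t)` is taken here to be the probability of reaching round $t$, in line with definition (5).

```python
from fractions import Fraction

LAMBDA = Fraction(5, 6)   # hazard: probability of losing a round that is reached


def f(t):
    """Probability of losing exactly in round t (t = 1, 2, ...)."""
    return LAMBDA * (1 - LAMBDA) ** (t - 1)


def S(t):
    """Probability of reaching round t, i.e. of not having lost before round t."""
    return (1 - LAMBDA) ** (t - 1)


print(f(5))            # equation (6): lambda * (1 - lambda)^4
print(f(5) / S(5))     # the hazard recovered as f(t) / S(t)
```

Dividing the probability of losing in round $t$ by the probability of reaching that round returns the constant hazard in every round.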

The relation between the hazard function $\lambda(t)$ and the density function $f(t)$ as shown in (6) implies that “for each specification in terms of a hazard function there is a mathematically equivalent specification in terms of a probability distribution” (Kiefer 1988: 649). One can even display a more specific relation between the concepts shown above. Assuming continuous time, the hazard function is formally defined as

\[ \lambda(t) = \lim_{\Delta t \to 0} \frac{P[(t \le T < t + \Delta t) \mid T \ge t]}{\Delta t} \tag{7} \]

and some math[6] can show that this is the same as

\[ \lambda(t) = \frac{f(t)}{S(t)}. \tag{8} \]

This represents nicely that the hazard function “[...] assesses the instantaneous risk of demise at time t, conditional on survival to that time” (Fox and Weisberg 2011: 2). Moreover, it shows how to derive one representation of the duration distribution from another.
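The step from (7) to (8) can be sketched in a few lines using only definitions (3) to (5). This is a compact version that assumes $F$ is differentiable at $t$ and $S(t) > 0$:

```latex
\begin{align*}
\lambda(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
            = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t \, P(T \ge t)} \\
           &= \frac{1}{S(t)} \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t}
            = \frac{f(t)}{S(t)}.
\end{align*}
```

The first equality uses the definition of conditional probability (the event $t \le T < t + \Delta t$ already implies $T \ge t$), the second uses (3) and (5), and the last uses (4).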

So far this section has dealt with how to represent different duration distributions and not with what these distributions could look like. Some distributions have convenient features for survival analysis and are therefore frequently used to model duration data. Kiefer (1988: 652 ff.) mentions the exponential, the Weibull and the log-logistic distribution. Figure 2 illustrates the density function of the Weibull distribution for certain parameter values.

[5] Note that in this example, unlike in the rest of this script, t is discrete by construction.

[6] See Appendix AI of this document for a detailed derivation.


Figure 2: Weibull distribution

[Figure: density f(t) plotted against t for t between 0 and 4.]

Notes: The density function of the Weibull distribution with parameters gamma=0.86 and alpha=1.5.

The reason to choose one distribution over another lies mainly in different assumptions about the hazard function. In the example of released prisoners, it seems reasonable to assume that the hazard of being rearrested does not depend systematically on the duration. In other words: there is no theoretical argument that a released prisoner is more likely to commit a crime two weeks after release than 52 weeks after. Thus, we would assume that there is no duration dependence, which implies a constant hazard. In this case we should choose the exponential distribution.[7]

In the example of unemployment, on the other hand, assuming a constant hazard might be difficult to justify. The fact that somebody has been unemployed for a long time could be seen as a bad signal by potential employers, reducing the chances of being hired. In such a case it would be reasonable to assume negative duration dependence (a decreasing hazard), and it would therefore be advisable to choose a specification of the Weibull distribution to model the data.[8] While no assumption about the underlying distribution of duration is needed in basic survival analysis, it can be crucial for more sophisticated methods.
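The link between the choice of distribution and the shape of the hazard can be checked numerically. The sketch below assumes the Weibull form $F(t) = 1 - \exp(-\gamma t^{\alpha})$ from the footnotes of this section, applies relation (8) with a finite-difference approximation of the density, and compares the result with the closed-form hazard $\gamma \alpha t^{\alpha - 1}$. The value $\gamma = 0.86$ is borrowed from Figure 2; the three values of $\alpha$ and the function names are chosen only for illustration.

```python
import math


def weibull_F(t, gamma, alpha):
    """Distribution function F(t) = 1 - exp(-gamma * t**alpha)."""
    return 1.0 - math.exp(-gamma * t ** alpha)


def hazard_numeric(t, gamma, alpha, h=1e-6):
    """Approximate lambda(t) = f(t) / S(t) with a central difference for f(t)."""
    f = (weibull_F(t + h, gamma, alpha) - weibull_F(t - h, gamma, alpha)) / (2 * h)
    S = 1.0 - weibull_F(t, gamma, alpha)
    return f / S


def hazard_closed_form(t, gamma, alpha):
    """Closed-form Weibull hazard gamma * alpha * t**(alpha - 1)."""
    return gamma * alpha * t ** (alpha - 1)


for alpha in (0.5, 1.0, 1.5):   # decreasing, constant, increasing hazard
    values = [hazard_numeric(t, 0.86, alpha) for t in (0.5, 1.0, 2.0)]
    print(alpha, [round(v, 3) for v in values])
```

With $\alpha = 1$ the Weibull reduces to the exponential distribution and the hazard stays at $\gamma$ for every $t$.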

[7] The exponential distribution (for $\gamma > 0$) is defined as $F(t) = 1 - \exp(-\gamma t)$. Applying (4), (5) and (8) shows that $\lambda(t) = \gamma$. Thus, the hazard is constant for any $t$. For a derivation and more details on this matter see Appendix AII.

[8] The Weibull distribution is defined as $F(t) = 1 - \exp(-\gamma t^{\alpha})$. With (4), (5) and (8) one can show that $\lambda(t) = \gamma \alpha t^{\alpha - 1}$. Thus, the hazard is increasing in $t$ if $\alpha > 1$, constant if $\alpha = 1$ and decreasing in $t$ if $\alpha < 1$ (Kiefer 1988: 655). For a derivation and more details on this matter see Appendix AIII.

3.3 Estimating survivor functions

Although time is indeed continuous in the real world, survival data from the real world normally are not. To stay with the released prisoners example: a former prisoner might theoretically be arrested at any point in time after having been released, but we only observe whether he was arrested within week one, week two and so forth. Consequently, any empirical representation of the concepts introduced above has de facto discrete durations, in the form of chosen intervals of continuous time, as an input.[9] To estimate the survivor function $S(t)$ for the released prisoners it is straightforward to estimate the probability of being rearrested in week one, the probability of being rearrested in week two, and so forth until week 52. A convenient way to compute this is to use a table where all observations are aggregated and ordered by their duration, as shown in Table 2 for the released prisoners example.

Table 2: Estimation of a survivor function

 j  Time  N.Risk  N.Event  Survival      SE  CI 95% low  CI 95% up
 1     1     432        1    0.9977  0.0023      0.9932     1.0000
 2     2     431        1    0.9954  0.0033      0.9890     1.0000
 3     3     430        1    0.9931  0.0040      0.9853     1.0000
 4     4     429        1    0.9907  0.0046      0.9817     0.9998
 5     5     428        1    0.9884  0.0051      0.9784     0.9986
 6     6     427        1    0.9861  0.0056      0.9751     0.9972
 7     7     426        1    0.9838  0.0061      0.9720     0.9958
 8     8     425        5    0.9722  0.0079      0.9568     0.9878
 9     9     420        2    0.9676  0.0085      0.9510     0.9844
10    10     418        1    0.9653  0.0088      0.9482     0.9827

Notes: The first ten rows of a life table to estimate the survivor function for the released prisoners example. Data source: Rossi et al. (1980).

[9] Although most models (like the ones presented in the following chapters) do actually assume t to be continuous, they are very often applied to data with de facto discrete durations (as in the released prisoners example). For practical purposes the models’ characteristics are still adequate.

The row number in Table 2 thus stands for the ordered duration number $j$. “Time” stands for the week after having been released (the duration measured in weeks), “N.Risk” for the number of prisoners still in freedom at the beginning of that week and “N.Event” for the number of prisoners rearrested during that week. What we are interested in is the column “Survival”, which shows an estimate of the probability of still being free at the end of that week. How are those values computed?

1. Compute the empirical hazard rate for every week $j$: $\lambda(t_j) = d_j / n_j$, where $d_j$ is the number of rearrests during week number $j$ and $n_j$ the number of released prisoners at risk of being rearrested at the beginning of that week (still in freedom).

2. Then compute $S(t)$ for that week as $S(t_j) = \prod_{i=1}^{j} (1 - \lambda(t_i))$.

The value of “Survival” for week one in Table 2 was computed accordingly:

\[ S(1) = 1 - (1/432) = 0.9976852 \]

and for week two:

\[ S(2) = [1 - (1/432)] \times [1 - (1/431)] = 0.9953704. \]
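The two steps above can be sketched as a short loop over the “N.Risk” and “N.Event” columns of Table 2. The standard error is computed with Greenwood's formula, $S(t_j) \sqrt{\sum_{i \le j} d_i / (n_i (n_i - d_i))}$; the variable names are chosen for this sketch, and the confidence interval columns are not reproduced.

```python
import math

# (N.Risk, N.Event) for weeks 1 to 10, copied from Table 2.
n_risk = [432, 431, 430, 429, 428, 427, 426, 425, 420, 418]
n_event = [1, 1, 1, 1, 1, 1, 1, 5, 2, 1]

survival = []       # S(t_j)
std_err = []        # Greenwood standard error of S(t_j)
s = 1.0
greenwood_sum = 0.0
for n, d in zip(n_risk, n_event):
    hazard = d / n                      # step 1: empirical hazard d_j / n_j
    s *= 1.0 - hazard                   # step 2: running product of (1 - hazard)
    greenwood_sum += d / (n * (n - d))  # one term of Greenwood's formula
    survival.append(s)
    std_err.append(s * math.sqrt(greenwood_sum))

for week, (s_j, se_j) in enumerate(zip(survival, std_err), start=1):
    print(week, round(s_j, 4), round(se_j, 4))
```

Rounded to four digits, the loop gives the “Survival” and “SE” columns of Table 2.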

Furthermore, standard errors and confidence intervals can be computed as presented in columns "SE", "CI 95% low" and "CI 95% up" in Table 2.[10] If the smallest time units in the data set are used as intervals to estimate the survivor function (as done in the example above), this method is called the Kaplan-Meier method (or Kaplan-Meier estimate of the survivor function). Using the Kaplan-Meier method means estimating the survivor function as finely as the data allow. Because the number of intervals can be very large, it often does not make sense to interpret the table computed with Kaplan-Meier directly. It is often more useful to plot the estimated survivor function. Figure 3 shows the plot of our estimated survivor function for the released prisoners. The dashed lines represent the estimated confidence interval.

Figure 3: Plot of the survivor function for released prisoners

[Figure: estimated survivor function S(t) with its confidence interval, plotted against t from 0 to 50 weeks on a vertical axis from 0 to 1.]

Notes: Plot of the estimated survivor function (S(t)) against duration (t). Data source: Rossi et al. (1980).

The plot indicates that the probability of “still being a free man” decreases with the number of weeks spent in freedom. This is of course not surprising. However, it might be more interesting to compare the survivor functions of different groups. In our example, we want to know whether financial aid actually prevents released prisoners from being rearrested. Figure 4 therefore displays plots of the estimated survivor functions for the groups of prisoners with and without financial aid.

[10] Standard errors can be computed with the method of Greenwood (Glantz 2002: 398): $S(t_j) \sqrt{\sum_{i=1}^{j} \frac{d_i}{n_i (n_i - d_i)}}$.


Figure 4: Comparison of survivor functions

[Figure: estimated survivor functions S(t) for the groups “no financial aid” and “financial aid”, plotted against t from 0 to 50 weeks on a vertical axis from 0 to 1.]

Notes: Plot of the estimated survivor functions for the two groups of released prisoners (with and without financial aid). Data source: Rossi et al. (1980).

This graphical analysis suggests that financial aid indeed prevents released prisoners from going back to prison. Before drawing conclusions, we would like to test whether this effect is statistically significant. This can be done with a log-rank test of the null hypothesis that the two groups have the same survivor function. In the case of financial aid the null hypothesis can only be rejected at the 10% significance level. Hence there is only weak evidence that financial aid has an effect. Moreover, we might also be interested in possible effects of other variables, which cannot be included in the same graph and test. Other methods are needed that allow for the inclusion of covariates.
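The script does not spell out the log-rank statistic, so the following is a minimal sketch of the standard two-sample version, not necessarily the exact computation behind the result quoted above. At every event time it compares the observed number of events in one group with the number expected if both groups shared the same hazard. The function name `logrank` and the toy data are hypothetical.

```python
import math


def logrank(group_a, group_b):
    """Two-sample log-rank test for lists of (duration, event) pairs.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    event_times = sorted({t for t, e in group_a + group_b if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _ in group_a if u >= t)     # at risk in group A
        n_b = sum(1 for u, _ in group_b if u >= t)     # at risk in group B
        d_a = sum(1 for u, e in group_a if u == t and e == 1)
        d_b = sum(1 for u, e in group_b if u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n                      # expected events under H0
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi2 / 2))           # chi-square tail, 1 df
    return chi2, p_value


# Hypothetical toy data: (week, arrest) pairs, 0 = censored.
aid = [(10, 1), (30, 1), (52, 0), (52, 0), (52, 0), (52, 0)]
no_aid = [(5, 1), (8, 1), (15, 1), (20, 1), (40, 1), (52, 0)]
chi2, p = logrank(aid, no_aid)
print(round(chi2, 3), round(p, 3))
```

Two groups with identical records give a statistic of zero, so the null hypothesis of equal survivor functions is not rejected.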

4 Proportional hazards models

4.1 Parametric proportional hazards models

Including explanatory variables in duration models is unfortunately not as straightforward as in linear regression models. A widely used approach is the proportional hazards specification. In that specification one assumes that the effect of the regressors (covariates) is to multiply the baseline hazard by a factor which does not depend on duration $t$ (Kiefer 1988: 664). Therefore, “[t]he term ‘proportional hazards’ refers to the effect of any covariate having a proportional and constant effect that is invariant to when in the process the values of the covariate changes” (Box-Steffensmeier and Jones 1997: 1433). This can be expressed with an exponential regression specification

\[ \lambda(t, x, \beta, \lambda_0) = \lambda_0(t) \exp(x'\beta) \tag{9} \]

where $\lambda_0$ stands for the baseline hazard (the basic distribution of durations). The effect of the covariates in such a model is thus to “shift” the baseline hazard up and down.

To estimate the betas in such a model parametrically, an assumption about the underlying baseline hazard $\lambda_0$ is necessary. This means one has to make an assumption about the basic duration distribution as discussed in section 3.2. When assuming that the durations have a Weibull distribution, the baseline hazard is $\lambda_0(t) = \gamma \alpha t^{\alpha - 1}$. Thus, the Weibull proportional hazards model is

\[ \lambda(t, x, \beta, \alpha, \gamma) = \gamma \alpha t^{\alpha - 1} \exp(x'\beta). \tag{10} \]

This is a parametric proportional hazards model because, except for the values of the unknown parameters $\alpha$, $\gamma$ and the $\beta$s, its functional form is completely specified (Kleinbaum and Klein 2005: 96). The betas in such a model can be estimated (parametrically) using maximum likelihood.[11] Although this approach can be useful if theory suggests a certain duration distribution, it can be problematic if the assumption about the baseline hazard is arbitrary.
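What “proportional” means in (10) can be seen by evaluating the hazard for two individuals who differ in one covariate. All parameter values below are hypothetical and chosen only for illustration; the function name `weibull_ph_hazard` is likewise an assumption of this sketch.

```python
import math


def weibull_ph_hazard(t, x, beta, gamma, alpha):
    """Hazard of model (10): gamma * alpha * t**(alpha - 1) * exp(x'beta)."""
    linear_predictor = sum(xi * bi for xi, bi in zip(x, beta))
    return gamma * alpha * t ** (alpha - 1) * math.exp(linear_predictor)


# Hypothetical parameter values, chosen only for illustration.
gamma, alpha = 0.01, 1.3
beta = [-0.4, 0.1]            # e.g. effects of financial aid and prior convictions
x_aid = [1, 3]                # received aid, three prior convictions
x_no_aid = [0, 3]             # no aid, same number of prior convictions

ratios = []
for t in (1, 10, 52):
    ratio = (weibull_ph_hazard(t, x_aid, beta, gamma, alpha)
             / weibull_ph_hazard(t, x_no_aid, beta, gamma, alpha))
    ratios.append(ratio)
    print(t, round(ratio, 4))  # the same value in every week
```

The baseline hazard changes with $t$, but it cancels in the ratio, which equals $\exp(-0.4)$ regardless of the week.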

4.2 The Cox proportional hazards model

Unlike parametric proportional hazards models, the Cox proportional hazards model does not need any assumption about the duration distribution. Cox (1972) showed that with his partial-likelihood approach the $\beta$s can be estimated without any specification of the baseline hazard $\lambda_0$. In this approach, only the probabilities for those subjects who experience the event are explicitly considered in the likelihood function in order to estimate the $\beta$s, hence the term “partial likelihood” (Kleinbaum and Klein 2005: Chapter 3). Unlike typical formulations of the likelihood function, the Cox likelihood is not based on the distribution of the outcome but only on the observed order of events. Setting up the likelihood function in this manner automatically leads to the baseline hazard $\lambda_0$ canceling out.[12]
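A minimal sketch of the partial likelihood for a single covariate may help to see why the baseline hazard drops out. It uses the standard form in which each observed event contributes $\exp(x_i \beta)$ divided by the sum of $\exp(x_j \beta)$ over everyone still at risk. The sketch assumes no tied event times; the toy data, the function name and the crude grid search are illustrative choices, not the method used for the estimates in this script.

```python
import math


def cox_partial_loglik(beta, data):
    """Partial log-likelihood for one covariate and no tied event times.

    data: list of (duration, event, x) with event = 1 if observed, 0 if censored.
    """
    loglik = 0.0
    for t_i, event_i, x_i in data:
        if event_i != 1:
            continue                      # censored cases contribute no factor
        # Risk set: everyone whose duration is at least t_i.
        risk_sum = sum(math.exp(beta * x_j) for t_j, _, x_j in data if t_j >= t_i)
        loglik += beta * x_i - math.log(risk_sum)
    return loglik


# Hypothetical toy data: (week, arrest, fin).
toy = [(5, 1, 0), (9, 1, 0), (14, 1, 1), (20, 0, 0), (31, 1, 1), (52, 0, 1)]

# Crude grid search for the maximizing beta (a real fit would use Newton-Raphson).
grid = [b / 100 for b in range(-500, 501)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, toy))
print(beta_hat, round(cox_partial_loglik(beta_hat, toy), 4))
```

Only the ordering of the durations enters the function: censored cases appear in the risk sets but never in the numerator, and no baseline hazard has to be specified.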

While not depending on an arbitrary assumption about the baseline hazard is clearly an advantage, estimating a proportional hazards model with Cox's method also has its weaknesses. Compared to a parametric model it uses less information and the estimation is therefore less efficient. Furthermore, if more than one event occurs at the same time, the likelihood function cannot be set up using partial likelihood. There are computational approximation methods[13] to deal with this matter, but the problem remains if the number of events occurring at the same time is large (Box-Steffensmeier and Jones 1997: 1434).

[11] Assuming a distribution of the underlying baseline hazard technically permits setting up and maximizing the log-likelihood.

[12] See Appendix AIV for more details.

[13] See Therneau and Grambsch (2000: 48) for a short discussion of these methods.

As mentioned in section 3.3, there is a difference in the duration of time spent in freedom between released prisoners with and without financial aid. To test simultaneously the effects of financial aid and several additional covariates on the hazard of being rearrested, we fit a Cox proportional hazards model to these data. In a first step only the following time-invariant variables are included:

• fin: financial aid indicator (1 = received financial aid).

• age: age of the prisoner in years when released.

• race: indicator for race (1 = Afro-American, 0 = other).

• wexp: indicator for full-time work experience prior to arrest (1 = had experience, 0 = not).

• mar: indicator for marriage (1 = individual was married when released, 0 = individual was not married).

• paro: parole indicator (1 = individual was released on parole, 0 = not).

• prio: number of convictions prior to release.

The results are presented in Table 3. The exponentiated coefficients can be read as “multiplicative effects on the hazard” (Fox and Weisberg 2011: 6). Hence, holding the other covariates constant, financial aid reduces the weekly hazard of being rearrested by 32 percent (0.68 − 1 = −0.32).[14] This is a remarkable effect which is also statistically significant at the 5% significance level. Note that, unlike in the simple comparison of the survivor functions in section 3.3, the effect of financial aid now appears more significant and is comparable to the effects of other covariates.

Table 3: Determinants of recidivism

          coef  exp(coef)  se(coef)        z  Pr(>|z|)
fin    -0.3794     0.6843    0.1914  -1.9826    0.0474
age    -0.0574     0.9442    0.0220  -2.6109    0.0090
race    0.3139     1.3688    0.3080   1.0192    0.3081
wexp   -0.1498     0.8609    0.2122  -0.7058    0.4803
mar    -0.4337     0.6481    0.3819  -1.1357    0.2561
paro   -0.0849     0.9186    0.1958  -0.4336    0.6646
prio    0.0915     1.0958    0.0286   3.1938    0.0014

Notes: Estimated coefficients of the Cox proportional hazards model. Data source: Rossi et al.(1980).
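The reading of the exponentiated coefficients as multiplicative effects can be retraced with a few lines of Python. This is only a minimal sketch: the coefficients are copied from Table 3, while the function name `percent_change_in_hazard` and the dictionary names are chosen here for illustration and do not stem from the original analysis.

```python
import math

# Coefficients copied from Table 3 (Cox model, recidivism data).
coefs = {
    "fin": -0.3794, "age": -0.0574, "race": 0.3139, "wexp": -0.1498,
    "mar": -0.4337, "paro": -0.0849, "prio": 0.0915,
}

def percent_change_in_hazard(coef):
    """Percentage change in the hazard for a one-unit increase in the covariate."""
    return (math.exp(coef) - 1.0) * 100.0

# Hypothetical helper dictionary: covariate name -> percentage change.
effects = {name: percent_change_in_hazard(b) for name, b in coefs.items()}

for name, pct in effects.items():
    print(f"{name:5s} exp(coef) = {math.exp(coefs[name]):.4f}  change = {pct:+.1f}%")
```

For fin this gives a change of roughly −32 percent, in line with the interpretation in the text.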

¹⁴ Note that for small values up to around ±0.05 the coefficients can also be interpreted directly as effects (0.05 ≈ exp(0.05) − 1).

4.3 The Cox model with time-varying covariates

So far, this analysis included covariates that do not change over the period of observation (in this example 52 weeks). This might be a reason for concern because some time-varying variables possibly explain better why some released prisoners were rearrested earlier than others. The data set of the example contains such a variable. For each week it indicates whether a released prisoner was employed (1) or not (0). The problem is that such a variable cannot simply be added to the Cox model estimated above: in that form, the Cox proportional hazards model can only deal with time-invariant variables. However, there is a way to include time-varying covariates. The trick is to modify the data appropriately. In most cases the values of time-varying variables are not known for every point during the observation period. Rather, they are known for certain intervals such as days or weeks. These intervals have to be coded in the right manner to include time-varying variables in a Cox proportional hazards model.

According to Alisson (2004: 377) the data must be formatted as follows: Individuals are represented in several records, where each record stands for an interval of time during which all covariates are constant. Furthermore, the following variables have to be defined for each record: the starting time of the interval, the stopping time and an indicator variable, equal to 1 only in the interval (record) during which the event occurred.¹⁵ Transforming the survival data in this way is called episode splitting (Alisson 2004: 377).¹⁶ Table 4 shows the first columns and rows of the released prisoners’ data set after episode splitting.¹⁷

Table 4: Survival data after episode splitting

    id  time  start  stop  event.time  week  arrest  emp
 1   1     1      0     1           0    20       1    0
 2   1     2      1     2           0    20       1    0
 3   1     3      2     3           0    20       1    0
 4   1     4      3     4           0    20       1    0
 5   1     5      4     5           0    20       1    0
 6   1     6      5     6           0    20       1    0
 7   1     7      6     7           0    20       1    0
 8   1     8      7     8           0    20       1    0
 9   1     9      8     9           0    20       1    0
10   1    10      9    10           0    20       1    0
11   1    11     10    11           0    20       1    0
12   1    12     11    12           0    20       1    0
13   1    13     12    13           0    20       1    0
14   1    14     13    14           0    20       1    0
15   1    15     14    15           0    20       1    0
16   1    16     15    16           0    20       1    0
17   1    17     16    17           0    20       1    0
18   1    18     17    18           0    20       1    0
19   1    19     18    19           0    20       1    0
20   1    20     19    20           1    20       1    0
21   2     1      0     1           0    17       1    0

Notes: The first rows of the recidivism study data from Rossi et al. (1980) after episode splitting.
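The episode splitting described above can be sketched in plain Python. The function `split_episodes` and its argument names are hypothetical and only mimic the layout of Table 4; they are not the routine that was used to produce the table.

```python
def split_episodes(subject_id, week, arrest, emp_by_week):
    """Split one subject's record into weekly (start, stop] intervals.

    week        : week of arrest or of censoring
    arrest      : 1 if the subject was rearrested, 0 if censored
    emp_by_week : list of 0/1 employment indicators, one per observed week
    """
    rows = []
    for stop in range(1, week + 1):
        rows.append({
            "id": subject_id,
            "start": stop - 1,
            "stop": stop,
            # The event indicator is 1 only in the interval in which the arrest occurred.
            "event": 1 if (arrest == 1 and stop == week) else 0,
            "emp": emp_by_week[stop - 1],
        })
    return rows

# First subject of Table 4: rearrested in week 20, not employed in any week.
rows = split_episodes(1, 20, 1, [0] * 20)
print(rows[0])
print(rows[-1])
```

Each record covers one week during which all covariates are constant, so a weekly employment indicator can enter the model as an ordinary column.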

With this data set one can include the current employment status of the released prisoners as a covariate in the Cox model. Estimating the coefficients in this extended model leads to the results presented in Table 5.

¹⁵ To economists such a data set might look like a “long panel” with some additional variables for the time dimension.

¹⁶ An alternative to modifying the data with episode splitting is to write a program which is executed as part of the estimation process and assigns the appropriate value of the time-varying variable each time an event occurs (Alisson 2004: 377). Note that, depending on the software you use for your estimations, one or the other method might already be implemented and applied automatically once you include time-varying covariates in the model.

¹⁷ Kleinbaum and Klein (2005: 271) call this representation of the data “counting process data layout”.


Table 5: Determinants of recidivism II

          coef  exp(coef)  se(coef)        z  Pr(>|z|)
fin    -0.3567     0.7000    0.1911  -1.8664    0.0620
age    -0.0463     0.9547    0.0217  -2.1320    0.0330
race    0.3387     1.4031    0.3096   1.0938    0.2740
wexp   -0.0256     0.9748    0.2114  -0.1209    0.9038
mar    -0.2937     0.7455    0.3830  -0.7669    0.4431
paro   -0.0642     0.9378    0.1947  -0.3298    0.7416
prio    0.0851     1.0889    0.0290   2.9401    0.0033
emp    -1.3283     0.2649    0.2507  -5.2981    0.0000

Notes: Estimated coefficients of the Cox proportional hazards model including a time-varying covariate. Data source: Rossi et al. (1980).

The results clearly indicate that having a job reduces the hazard of being rearrested more (and more significantly) than getting financial aid. Note that with this model specification financial aid is statistically less significant than before, while the employment indicator is highly statistically significant.
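How the z and Pr(>|z|) columns relate to the coefficients and standard errors can be illustrated as follows. The sketch assumes that the reported statistics are Wald statistics (coefficient divided by standard error) with a two-sided normal p-value; the values are copied from Table 5 and small deviations from the table are due to rounding of the inputs. The function name `wald_test` is an illustrative choice.

```python
import math

# (coef, se(coef)) pairs copied from Table 5.
table5 = {
    "fin": (-0.3567, 0.1911),
    "emp": (-1.3283, 0.2507),
}

def wald_test(coef, se):
    """Return the z statistic and the two-sided p-value under a standard normal."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

results = {name: wald_test(b, se) for name, (b, se) in table5.items()}

for name, (z, p) in results.items():
    print(f"{name:4s} z = {z:.3f}  p = {p:.4f}")
```

With these inputs, financial aid is no longer significant at the 5% level, whereas the employment indicator has a p-value far below 0.0001.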

4.4 Tests and diagnostics

As with other statistical models, it might be desirable to test the joint hypothesis that all estimated coefficients of a Cox model are equal to zero. Analogous to the F-test in a linear regression and similar to other applications of maximum likelihood, this can be done with a chi-square test such as the Wald, likelihood ratio or log-rank test (Kiefer 1988: 674). In the case of the extended Cox model above, all three tests show highly significant results. The joint hypothesis that all coefficients are equal to zero can thus be rejected.

Furthermore, it is advisable to determine whether the fitted Cox model describes the data properly. One important aspect that should be checked is whether the proportional hazards assumption (that each covariate has the same effect on the hazard at any point of the observation period) holds. This can be done using so-called scaled Schoenfeld residuals¹⁸ (Fox and Weisberg 2011: 13).

As a first step to evaluate whether the proportional hazards assumption holds, we plot the scaled Schoenfeld residuals against time, as shown in Figure 5 for the variables age and prio.

¹⁸ See Schoenfeld (1982) and Winnett and Sasieni (2001) for further details.


Figure 5: Diagnostics with Schoenfeld residuals

[Figure: two panels, one for Beta(t) for age and one for Beta(t) for prio, each plotted against Time. Each panel shows the scaled Schoenfeld residuals as points, together with a smoothing spline and its confidence interval.]

Note: Scaled Schoenfeld residuals plotted against event time (based on the estimated Cox model presented in Table 5).

The solid black line in the plots is a smoothing spline of the plotted scaled Schoenfeld residuals. If this line systematically deviates from a horizontal line (here represented as a grey line), there is probably an issue with non-proportional hazards (Fox and Weisberg 2011: 14). In this example the variable age appears to have a downward trend over time rather than being horizontal. For this variable the proportional hazards assumption might not hold, and the model above would therefore suffer from misspecification. To check whether this is the case, one can test for nonzero slopes of a fitted trend line in these residual plots (Grambsch and Therneau 1994: 523). Table 6 shows the results from testing the proportional hazards assumption in this manner for each covariate and for the model as a whole. As the graphical diagnostics indicated, there is indeed evidence for non-proportional hazards for age. Additionally, the variable indicating full-time work experience prior to arrest (“wexp”) also seems to violate the proportional hazards assumption.


Table 6: Proportional hazards tests

           rho  chisq     p
fin       0.03   0.09  0.77
age      -0.26  11.05  0.00
race     -0.11   1.37  0.24
wexp      0.22   6.29  0.01
mar       0.06   0.44  0.51
paro     -0.04   0.19  0.66
prio     -0.01   0.01  0.94
emp       0.04   0.22  0.64
GLOBAL          16.77  0.03

Note: Tests of the proportional hazards assumption using scaled Schoenfeld residuals (based on the estimated Cox model presented in Table 5).
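The p-values in Table 6 can be retraced from the chi-square statistics. The following sketch assumes that each covariate-specific test has one degree of freedom and that the global test has eight (one per covariate); these degrees of freedom are an assumption made here, not a statement from the original analysis. The function names are illustrative, and the closed-form expression in `chi2_sf_even_df` is only valid for an even number of degrees of freedom.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_even_df(x, df):
    """Upper-tail probability for an even number of degrees of freedom (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Test statistics copied from Table 6.
p_age = chi2_sf_1df(11.05)
p_wexp = chi2_sf_1df(6.29)
p_global = chi2_sf_even_df(16.77, 8)  # assumes 8 degrees of freedom

print(f"age: {p_age:.4f}  wexp: {p_wexp:.4f}  GLOBAL: {p_global:.4f}")
```

Rounded to two decimals, these values agree with the p column of Table 6.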

Hence the model above is misspecified. There is, however, an easy way to correct that misspecification by building interactions between covariates and time into the Cox model (Fox and Weisberg 2011: 14). Estimating a Cox model including a linear interaction of time and age leads to the results presented in Table 7.

Table 7: Determinants of recidivism III

              coef  exp(coef)  se(coef)        z  Pr(>|z|)
fin        -0.3617     0.6965    0.1911  -1.8932    0.0583
age         0.0471     1.0482    0.0392   1.2028    0.2291
race        0.3297     1.3906    0.3093   1.0659    0.2865
wexp       -0.0034     0.9966    0.2128  -0.0160    0.9872
mar        -0.2637     0.7682    0.3829  -0.6886    0.4910
paro       -0.0674     0.9348    0.1949  -0.3458    0.7295
prio        0.0861     1.0899    0.0290   2.9696    0.0030
emp        -1.3265     0.2654    0.2508  -5.2891    0.0000
age:stop   -0.0036     0.9964    0.0014  -2.5056    0.0122

Notes: Estimated coefficients of the Cox proportional hazards model including a linear interaction of time and age. Data source: Rossi et al. (1980).

The interaction term of age and time (here the variable “stop”) is statistically significant at the 5% level, confirming the non-proportional effect of the variable age on the hazard over time.
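To see what the interaction implies, the effect of age on the log-hazard can be written as a linear function of time, using the two age-related coefficients from Table 7. The following sketch is an illustration only; the constant and function names are chosen here, and the calculation ignores the estimation uncertainty of both coefficients.

```python
import math

B_AGE = 0.0471        # coefficient of age (Table 7)
B_AGE_TIME = -0.0036  # coefficient of the age:stop interaction (Table 7)

def age_log_hazard_effect(week):
    """Effect of one additional year of age on the log-hazard in a given week."""
    return B_AGE + B_AGE_TIME * week

def age_hazard_ratio(week):
    """Multiplicative effect of one additional year of age on the hazard."""
    return math.exp(age_log_hazard_effect(week))

# Week in which the estimated effect of age changes sign.
crossover_week = -B_AGE / B_AGE_TIME

for week in (1, 13, 26, 52):
    print(f"week {week:2d}: hazard ratio per year of age = {age_hazard_ratio(week):.4f}")
print(f"sign change after about {crossover_week:.1f} weeks")
```

Under these point estimates the effect of age is slightly positive in the first weeks after release and becomes increasingly negative later in the observation period.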


5 Further topics

This section is meant to give a short overview of some more advanced topics in survival analysis. In particular, it gives an idea of how to analyze survival data with repeated events and how to deal with unobserved heterogeneity, two topics that are likely to be important when using survival analysis in an economic setting.

5.1 Repeated events and competing risks

More complex survival data can also consist of events that can be experienced several times, or of different types of events as possible outcomes. Job terminations and marriages are examples of repeatable events. One possibility to deal with repeated events is to treat each event as a separate observation when estimating a Cox model. The data are therefore structured similarly to the episode splitting layout used in the case of time-varying covariates. In this case, fixed or random effects as well as robust standard errors can be applied to deal with the dependence between events experienced by the same individual (Alisson 2004: 382).¹⁹ Here, I want to give a brief overview of the frequently used models with robust standard errors.²⁰

Probably the simplest approach to estimate a Cox-like model with repeated events is the model suggested by Andersen and Gill (1982) (hereafter the AG-model). The AG-model builds on the assumption that events experienced by the same individual are independent of each other and has therefore also been called the “Independent Increment model” (Therneau and Hamilton 1997: 2034). This means that the fact that an individual has already experienced one or several events has no influence on its risk of experiencing further events (Box-Steffensmeier and Zorn 2002: 1073). While such an assumption can be problematic, it can be dealt with by explicitly modeling the effect of previously experienced events (by including the number of previously experienced events as a covariate).

Alternative methods do not need such strong assumptions and take the order of events into account. In the marginal model proposed by Wei et al. (1989) the data are stratified by event number (rank of event). The risk set for the kth event at any point in time consequently contains all observations that have not yet experienced k events (Box-Steffensmeier and Zorn 2002: 1074). Thus, each individual normally appears in all of the strata (Therneau and Hamilton 1997: 2035). Alternatively, in the conditional model suggested by Prentice et al. (1981) the risk set at time t for the kth event only contains those observations under study at t who have already experienced k − 1 events of that type (Box-Steffensmeier and Zorn 2002: 1075). Hence the conditional model takes the exact sequence of events into account.
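The difference between the three approaches is easiest to see in the way the records of one individual are laid out. The following sketch builds such records for a made-up individual with events at times 10 and 25 and follow-up until time 40. The function names, the dictionary keys and the choice of at most three events per individual are assumptions made for illustration; the layout follows the general descriptions above, not a specific software implementation.

```python
def ag_rows(event_times, end):
    """AG-model layout: consecutive (start, stop] intervals, no strata."""
    rows, start = [], 0
    for t in event_times:
        rows.append({"start": start, "stop": t, "event": 1})
        start = t
    if start < end:
        # Remaining follow-up without a further event (censored interval).
        rows.append({"start": start, "stop": end, "event": 0})
    return rows

def conditional_rows(event_times, end):
    """Conditional model layout: same intervals as AG, stratified by event number."""
    return [dict(r, stratum=k) for k, r in enumerate(ag_rows(event_times, end), start=1)]

def marginal_rows(event_times, end, max_events):
    """Marginal model layout: one record per stratum, each measured from time 0."""
    rows = []
    for k in range(1, max_events + 1):
        if k <= len(event_times):
            rows.append({"start": 0, "stop": event_times[k - 1], "event": 1, "stratum": k})
        else:
            # The kth event was not observed: censored at the end of follow-up.
            rows.append({"start": 0, "stop": end, "event": 0, "stratum": k})
    return rows

# Made-up individual: events at t = 10 and t = 25, observed until t = 40.
ag = ag_rows([10, 25], 40)
cond = conditional_rows([10, 25], 40)
marg = marginal_rows([10, 25], 40, max_events=3)

for label, rows in (("AG", ag), ("conditional", cond), ("marginal", marg)):
    print(label, rows)
```

In the marginal layout the individual is at risk for every event number from time 0 onwards, whereas in the conditional layout it enters the stratum of the kth event only after having experienced k − 1 events.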

If an individual can experience one out of several possible types of events, this is called competing risks. An example of competing risks are different causes of death. In that case one would define a separate proportional hazards equation for each event type (cause of death) and estimate each of them separately in a Cox model for the specific event type, treating all other events as though the individual was censored at the time one of those events occurs (Alisson 2004: 380).
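The recoding for such cause-specific models is simple: for each event type, all other event types are treated as censored. A minimal sketch with made-up records follows; the function name `cause_specific_indicator`, the keys `time` and `cause`, and the cause labels are hypothetical.

```python
def cause_specific_indicator(records, cause):
    """Return (time, event) pairs for one cause; all other outcomes count as censored."""
    return [(r["time"], 1 if r["cause"] == cause else 0) for r in records]

# Made-up records: time of exit and cause of death (None = censored).
records = [
    {"time": 5, "cause": "heart"},
    {"time": 8, "cause": "cancer"},
    {"time": 12, "cause": None},
]

heart = cause_specific_indicator(records, "heart")
cancer = cause_specific_indicator(records, "cancer")
print(heart)
print(cancer)
```

Each recoded data set can then be used to estimate a separate Cox model for the respective event type.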

¹⁹ The idea and implications of either random or fixed effects in survival analysis are very similar to their application in linear regression models. While fixed effects in a linear regression model allow each individual to have a distinct intercept, fixed effects in a proportional hazards model allow each individual to have a distinct baseline hazard function.

²⁰ This group of models has been referred to as “variance-correction models for repeated events” (Box-Steffensmeier and Zorn 2002: 1071).

5.2 Unobserved heterogeneity

As shown in equation (9), proportional hazards models do not contain a random error term. It is unlikely that all variation in the hazard is completely explained by the covariates that are included in the model, so such models are prone to unobserved heterogeneity. The model can be extended by a random variable ε, which represents the unobserved individual effect, and then be estimated using random effects (Jenkins 2005: 82). Note that fixed effects as described above can only be applied if the data contain repeated events. According to Alisson (2004: 382), however, any attempt to control for unobserved heterogeneity “is futile when no more than one event is observed for each individual, as in the case of death.”
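One common way to write the extension of the proportional hazards model by the random variable ε mentioned in this section is shown below. The notation (baseline hazard λ₀(t), covariate vector x and coefficient vector β) follows the appendix; the subscript i for the individual and the additive placement of ε inside the exponential function are assumptions made here for illustration and may differ from the notation used by Jenkins (2005).

```latex
$$
\lambda_i(t) = \lambda_0(t) \exp(x_i'\beta + \varepsilon_i)
$$
```

Written this way, exp(ε_i) acts as an individual-specific multiplicative factor on the hazard that is not captured by the observed covariates.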

6 Recommended literature

This script is meant to give a short introduction to the statistical methods of survival analysis and covers only a very small part of the field. This section gives some recommendations for further reading on survival analysis. The literature mentioned here has been a reliable source for writing this script and should therefore be well suited to build on the knowledge acquired so far.

As an alternative brief non-technical introduction, see Alisson (2004). This essay gives many examples and all one needs to get a good overview. For those who want an extended introduction, I recommend Kleinbaum and Klein (2005). Their introductory book is easy to follow and helps a lot to understand the matter intuitively. Its disadvantage (for economists) might be its strong focus on biostatistics and medical research. Many examples build on controlled experiments such as clinical studies, which are rare in empirical economic research. A focus on economics as well as more mathematical and statistical depth and detail can be found in Kiefer (1988). This rather advanced text covers all crucial aspects and might still be a very sound introduction for someone with solid knowledge of statistics and econometrics. Last but not least, for the econometricians I recommend Lancaster (1992). Unlike many advanced books about survival analysis, Lancaster’s book has a very clear econometrics/economics focus. It contains all the mathematical background of modeling and inference in the analysis of survival data in economics. Detailed information on all of this literature can be found in the references of this script.


Appendix

AI: Representation of the hazard rate

The hazard rate in continuous time λ(t) can be written as

$$
\lambda(t) = \lim_{\Delta t \to 0} \frac{P[(t \le T < t + \Delta t) \mid T \ge t]}{\Delta t}. \tag{11}
$$

The expression

$$
P[(t \le T < t + \Delta t) \mid T \ge t] \tag{12}
$$

stands for the conditional probability of dying in the interval [t, t + ∆t), given that one is still alive at t (hence the term T ≥ t). This probability is divided by the “size” of the interval (∆t). As ∆t tends to 0, λ(t) describes “the instantaneous exit rate”. The conditional probability (12) can also be written as

$$
\frac{P(t \le T < t + \Delta t,\; t \le T)}{P(t \le T)} = \frac{P(t \le T < t + \Delta t)}{P(t \le T)}. \tag{13}
$$

Now we write these probabilities in terms of the distribution function of durations F(t):

$$
\frac{F(t + \Delta t) - F(t)}{1 - F(t)}. \tag{14}
$$

With this we can rewrite (11) as

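$$
\lambda(t) = \lim_{\Delta t \to 0} \frac{\dfrac{F(t + \Delta t) - F(t)}{1 - F(t)}}{\Delta t} = \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} \times \frac{1}{1 - F(t)}. \tag{15}
$$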

Differential calculus tells us that

$$
\lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} = \frac{dF(t)}{dt} = f(t). \tag{16}
$$

Hence

$$
\lambda(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)}. \tag{17}
$$


AII: The hazard function with exponentially distributed durations

The exponential distribution (for γ > 0) is defined as

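$$
F(t) = 1 - \exp(-\gamma t). \tag{18}
$$

Applying (16) and (17) we get

$$
\frac{dF(t)}{dt} = \gamma \exp(-\gamma t) = f(t) \tag{19}
$$

and

$$
1 - F(t) = \exp(-\gamma t) = S(t). \tag{20}
$$

Therefore

$$
\lambda(t) = \frac{f(t)}{S(t)} = \frac{\gamma \exp(-\gamma t)}{\exp(-\gamma t)} = \gamma. \tag{21}
$$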

Hence, the hazard is constant.
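The constant hazard can also be checked numerically by evaluating the definition (11) with a small but finite ∆t instead of taking the limit. The sketch below uses an arbitrary illustrative value for γ; the names `GAMMA`, `F` and `approx_hazard` are chosen here, and the finite-difference result is only an approximation of the limit.

```python
import math

GAMMA = 0.85  # arbitrary illustrative parameter value

def F(t):
    """Distribution function of exponentially distributed durations, as in (18)."""
    return 1.0 - math.exp(-GAMMA * t)

def approx_hazard(t, dt=1e-6):
    """Finite-difference version of (11): P(t <= T < t + dt | T >= t) / dt."""
    return (F(t + dt) - F(t)) / (1.0 - F(t)) / dt

hazards = [approx_hazard(t) for t in (0.5, 1.0, 2.0, 4.0)]
print(hazards)
```

Up to the approximation error, all values equal γ, whatever the point in time at which the hazard is evaluated.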

AIII: The hazard function with Weibull distributed durations

The Weibull distribution is defined as

$$
F(t) = 1 - \exp(-\gamma t^{\alpha}). \tag{22}
$$

Applying (16) and (17) we get

$$
\frac{dF(t)}{dt} = \alpha \gamma t^{\alpha - 1} \exp(-\gamma t^{\alpha}) = f(t) \tag{23}
$$

and

$$
1 - F(t) = \exp(-\gamma t^{\alpha}) = S(t). \tag{24}
$$

Therefore

$$
\lambda(t) = \frac{f(t)}{S(t)} = \frac{\alpha \gamma t^{\alpha - 1} \exp(-\gamma t^{\alpha})}{\exp(-\gamma t^{\alpha})} = \alpha \gamma t^{\alpha - 1}. \tag{25}
$$

Thus, the hazard is increasing with t if α > 1, constant if α = 1 and decreasing with t if α < 1. The following graph illustrates this point.

Figure 6: Weibull hazard function

[Figure: λ(t) plotted against t (from 0 to 4) for alpha = 0.5, alpha = 1 and alpha = 1.5.]

Note: Increasing, decreasing and constant Weibull hazard functions with the same gamma parameter value (0.85).
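The three cases can be reproduced with a short Python sketch that evaluates the Weibull hazard on a small grid of time points, using the gamma value and the three alpha values named in the figure. The function names and the grid are illustrative choices.

```python
import math

GAMMA = 0.85  # gamma value stated in the note to the figure

def weibull_hazard(t, alpha, gamma=GAMMA):
    """Closed-form hazard as in (25): alpha * gamma * t**(alpha - 1)."""
    return alpha * gamma * t ** (alpha - 1.0)

def hazard_from_f_and_s(t, alpha, gamma=GAMMA):
    """Hazard computed as f(t) / S(t), using (23) and (24)."""
    f = alpha * gamma * t ** (alpha - 1.0) * math.exp(-gamma * t ** alpha)
    s = math.exp(-gamma * t ** alpha)
    return f / s

grid = [0.5, 1.0, 2.0, 4.0]
curves = {a: [weibull_hazard(t, a) for t in grid] for a in (0.5, 1.0, 1.5)}

for a, values in curves.items():
    print(f"alpha = {a}: " + ", ".join(f"{v:.3f}" for v in values))
```

For alpha = 1 the Weibull hazard reduces to the constant hazard of the exponential distribution.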

AIV: Cox’ partial-likelihood approach

As Kiefer (1988: 668) puts it, “the intuition [of the partial-likelihood approach] is that, in the absence of all information about the baseline hazard, only the order of the durations provides information about the unknown coefficients”. Hence, the first step when applying the partial-likelihood approach is to order the data according to durations from the shortest to the longest. Assuming no censoring and no ties, the contribution to the likelihood of the first observation is then

$$
\frac{\lambda_0(t) \exp(x_1'\beta)}{\sum_{i=1}^{n} \lambda_0(t) \exp(x_i'\beta)} = \frac{\exp(x_1'\beta)}{\sum_{i=1}^{n} \exp(x_i'\beta)}. \tag{26}
$$

This is the conditional probability that the first individual experiences the event at the smallest duration, given that any of the individuals could have experienced it at that duration. Note that the unknown baseline hazard λ0(t) cancels out and therefore does not enter the likelihood function. The likelihood function can then be set up as usual as the product of the individual contributions to the likelihood:

$$
L(\beta) = \prod_{i=1}^{n} \frac{\exp(x_i'\beta)}{\sum_{j=i}^{n} \exp(x_j'\beta)} \tag{27}
$$

from which the log-likelihood

$$
l(\beta) = \sum_{i=1}^{n} \left[ x_i'\beta - \ln \sum_{j=i}^{n} \exp(x_j'\beta) \right] \tag{28}
$$

results. Note that (27) and (28) are significantly more complex expressions if the data contain censored observations and ties.
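For the simple case without censoring and ties, the log-likelihood (28) can be evaluated directly. The following sketch does so for a single covariate; the function name `partial_log_likelihood` and the covariate values are made up for illustration, and the code is not meant as a substitute for the estimation routines of statistical software.

```python
import math

def partial_log_likelihood(x_sorted, beta):
    """Log-likelihood (28) for a single covariate, without ties and censoring.

    x_sorted: covariate values ordered by duration, shortest duration first.
    """
    total = 0.0
    for i, x_i in enumerate(x_sorted):
        # Individuals who have not yet experienced the event at the ith duration.
        risk_set = x_sorted[i:]
        total += x_i * beta - math.log(sum(math.exp(x_j * beta) for x_j in risk_set))
    return total

x = [1.0, 0.0, 2.0]  # made-up covariate values, already ordered by duration
ll_zero = partial_log_likelihood(x, 0.0)
ll_half = partial_log_likelihood(x, 0.5)
print(ll_zero, ll_half)
```

Note that only the ordering of the durations enters the function, not the durations themselves. For β = 0 every ordering of the n durations is equally likely, so the log-likelihood equals −ln(n!).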


References

Alisson, P. (2004). Event History Analysis. In Hardy, M. A. and Alan, B., editors, Handbook of Data Analysis. Sage Publications Ltd., London.

Andersen, P. K. and Gill, R. D. (1982). Cox’s Regression Model for Counting Processes: A Large Sample Study. The Annals of Statistics, 10(4):1100–1120.

Box-Steffensmeier, J. M. and Jones, B. S. (1997). Time is of the Essence: Event History Models in Political Science. American Journal of Political Science, 41(4):1414–1461.

Box-Steffensmeier, J. M. and Zorn, C. (2002). Duration Models for Repeated Events. The Journal of Politics, 64(4):1069–1094.

Cox, D. R. (1972). Regression Models and Life-Tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2):187–220.

Fox, J. and Weisberg, S. (2011). Cox Proportional-Hazards Regression for Survival Data in R. Online Appendix to An R Companion to Applied Regression, Second Edition, http://socserv.mcmaster.ca/jfox/Books/Companion/appendix/Appendix-Cox-Regression.pdf.

Glantz, S. (2002). Primer of biostatistics. McGraw-Hill Medical Pub. Division, New York.

Grambsch, P. M. and Therneau, T. M. (1994). Proportional Hazards Tests and Diagnostics Based on Weighted Residuals. Biometrika, 81(3):515–526.

Jenkins, S. P. (2005). Survival Analysis. Unpublished manuscript, Institute for Social and Economic Research, University of Essex, Colchester, UK.

Kiefer, N. M. (1988). Economic Duration Data and Hazard Functions. Journal of Economic Literature, 26(2):646–679.

Kleinbaum, D. and Klein, M. (2005). Survival Analysis: a Self-Learning Text. Statistics for biology and health. Springer, New York.

Lancaster, T. (1992). The Econometric Analysis of Transition Data. Econometric Society monographs. Cambridge University Press, Cambridge, UK.

Prentice, R. L., Williams, B. J., and Peterson, A. V. (1981). On the Regression Analysis of Multivariate Failure Time Data. Biometrika, 68(2):373–379.

Rossi, P., Berk, R., and Lenihan, K. (1980). Money, Work, and Crime: Experimental Evidence. Quantitative studies in social relations. Academic Press, New York.

Schoenfeld, D. (1982). Partial Residuals for The Proportional Hazards Regression Model. Biometrika, 69(1):239–241.

Therneau, T. and Grambsch, P. (2000). Modeling survival data: extending the Cox model. Statistics for biology and health. Springer, New York.

Therneau, T. M. and Hamilton, S. A. (1997). rhDNase as an example of recurrent event analysis. Statistics in Medicine, 16(18):2029–2047.

Wei, L. J., Lin, D. Y., and Weissfeld, L. (1989). Regression Analysis of Multivariate Incomplete Failure Time Data by Modeling Marginal Distributions. Journal of the American Statistical Association, 84(408):1065–1073.

Winnett, A. and Sasieni, P. (2001). A Note on Scaled Schoenfeld Residuals for the Proportional Hazards Model. Biometrika, 88(2):565–571.