Introduction to survival analysis
Per Kragh Andersen, Section of Biostatistics, University of Copenhagen
DSBS Course: Survival Analysis in Clinical Trials, January 2018
Outline: Definitions and examples · Non-parametric estimation · Parametric models · Non-parametric testing · Delayed entry · Miscellaneous
Obviously, one can also consider the mean life time

E(T) = µ = ∫_0^∞ S(t) dt.

This, however, depends critically on the right-hand tail of the distribution of T, which we typically do not see because of censoring. Sometimes one studies the restricted mean life time

E(T ∧ τ) = µ(τ) = ∫_0^τ S(t) dt.

However, the interpretation is less nice (average time lived before time τ) and its value (obviously!) depends on the choice of τ.
We are used to considering our data as a sample from some (target) population, and the parameters refer to this population.
That is no different in survival analysis; however, it is important to realize that the target population is a complete population, i.e., without censoring.
Our ambition in survival analysis is therefore to draw inference on parameters like the survival function S(t) or the hazard function λ(t) for a potentially completely observed population based on incomplete (censored) data.
This is quite ambitious and requires certain assumptions.
Requirement 2 is the assumption of independent censoring (by some denoted non-informative censoring).
This means that individuals censored at any given time t should not be a biased sample of those who are at risk at time t.
Stated in other words: the hazard function λ(t) gives the event rate at time t, i.e. the failure rate given that the subject is still alive (T > t).
Independent censoring then means that the extra information that the subject is not only alive, but also uncensored, at time t does not change the failure rate.
Typically, independent censoring cannot be tested from the available data; it is a matter of discussion.
Censoring caused by being alive at the end of study can usually safely be taken to be “independent”. However, one should be more suspicious of other kinds of loss to follow-up before end of study.
It is strongly advisable always to keep track of subjects who are lost to follow-up and to note the reasons for loss to follow-up (e.g., drop-out of follow-up schedule or emigration).
The above discussion of independent censoring should be thought of as ‘for given covariates’. This means that censoring may depend on covariates as long as these covariates are accounted for in the hazard model (e.g., using the Cox regression model).
Multi-centre randomized trial in patients with primary biliary cirrhosis.
Patients (n = 349) recruited 1 Jan, 1983 - 1 Jan, 1987 from six European hospitals and randomized to CyA (176) or placebo (173).
Followed until death or liver transplantation (no longer than 31 Dec, 1989); CyA: 30 died, 14 were transplanted; placebo: 31 died, 15 were transplanted; 4 patients were lost to follow-up before 1989.
Primary outcome variable: time to death, incompletely observed (right-censoring) due to: liver transplantation, loss to follow-up, alive 31 Dec, 1989.
In some analyses, the outcome is defined as “time to failure of medical treatment”, i.e. the composite end-point of either death or liver transplantation.
Notation:
Distinct failure or censoring times: 0 < t1 < t2 < ...
Number of failures observed at those times: d(t1), d(t2), ... (NB: these are typically 0 or 1.)
Number of subjects at risk at (i.e., just before) those times: Y(t1), Y(t2), ...
For t > s: to survive beyond t one must first survive beyond s! That is, S(t) = S(s) · P(T > t | T > s). Applying this over the successive failure times gives the Kaplan-Meier estimator:

S(t) = ∏_{tj ≤ t} (1 − d(tj)/Y(tj)).
The standard error (SD) of the Kaplan-Meier estimator may be estimated by Greenwood’s formula:

SD(S(t)) = S(t) √( Σ_{tj ≤ t} d(tj) / (Y(tj)(Y(tj) − d(tj))) ).
To get an approximate 95% confidence interval for S(t), one may use simple linear limits S(t) ± 1.96 · SD(S(t)).
However, to eliminate problems with range restrictions when S(t) is close to 0 or 1, transformations (i.e., using the delta method) may be used, e.g. the log(−log) transformation, which leads to the interval

(S(t))^a ≤ S(t) ≤ (S(t))^b,

where b = 1/a and a = exp(1.96 · SD(S(t))/(−log(S(t)))).
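The estimator, Greenwood’s formula, and the log(−log) interval above can be sketched in a few lines of code. This is a minimal illustration on a small made-up data set; the data and all names are purely illustrative, not from the CyA trial.

```python
# Minimal sketch of the Kaplan-Meier estimator with Greenwood's formula.
# Data are (time, event) pairs: event = 1 for failure, 0 for censoring.
import math

def kaplan_meier(data):
    """Return [(t, S(t), SD(S(t)))] at each distinct failure time."""
    data = sorted(data)
    n = len(data)
    s, gw = 1.0, 0.0          # survival estimate and Greenwood sum
    out = []
    at_risk = n
    i = 0
    while i < n:
        t = data[i][0]
        d = c = 0             # failures / censorings at time t
        while i < n and data[i][0] == t:
            d += data[i][1]
            c += 1 - data[i][1]
            i += 1
        if d > 0:
            s *= 1 - d / at_risk
            gw += d / (at_risk * (at_risk - d))
            out.append((t, s, s * math.sqrt(gw)))
        at_risk -= d + c
    return out

obs = [(2, 1), (3, 0), (4, 1), (5, 1), (7, 0), (8, 0)]  # illustrative
km = kaplan_meier(obs)

# log(-log)-transformed 95% interval at the last failure time,
# using the formula from the slide:
t, s, sd = km[-1]
a = math.exp(1.96 * sd / (-math.log(s)))
ci = (s ** a, s ** (1 / a))
```

Note how the censored observation at t = 3 causes no jump but shrinks the risk set for later failure times, exactly as described above.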
This estimator of the integrated hazard function Λ(t) builds on the same idea as the Kaplan-Meier estimator: estimate λ(t)dt ≈ P(T ≤ t + dt | T > t) by d(tj)/Y(tj) when t = tj. That is,

Λ(t) = Σ_{tj ≤ t} d(tj)/Y(tj).

Further,

SD(Λ(t)) = √( Σ_{tj ≤ t} d(tj)/(Y(tj))² ).
Note how censored observations are used for both Kaplan-Meier and Nelson-Aalen: a subject censored at tj gives rise to no jump in the estimator but contributes to the size, Y(t), of the risk set for t ≤ tj.
Why not estimate Λ(t) by −log(S(t)), or S(t) by exp(−Λ(t))? This is because the relation S(t) = exp(−Λ(t)) holds for absolutely continuous distributions, and our estimators are discrete distributions.
For discrete distributions, the relationship between the cumulative hazard (measure) and the survival function is given by the product-integral:

S(t) = ∏_{u ≤ t} (1 − dΛ(u)),

and S(t) is, indeed, the product-integral of Λ(t). Properties of the Kaplan-Meier estimator follow from those of Nelson-Aalen via this relationship (the product-integral is a continuous and differentiable mapping). In practice, it makes little difference whether one uses S(t) or exp(−Λ(t)).
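A minimal sketch of the Nelson-Aalen estimator on small made-up (time, event) data; the final line compares exp(−Λ(t)) with the Kaplan-Meier value, illustrating that the two are close but not identical for discrete data. All numbers are illustrative.

```python
# Nelson-Aalen estimator of the cumulative hazard, with its SD.
import math

def nelson_aalen(data):
    """Return [(t, Lambda(t), SD(Lambda(t)))] at each failure time."""
    data = sorted(data)
    n = len(data)
    at_risk = n
    cum = var = 0.0
    out = []
    i = 0
    while i < n:
        t = data[i][0]
        d = c = 0
        while i < n and data[i][0] == t:
            d += data[i][1]
            c += 1 - data[i][1]
            i += 1
        if d > 0:
            cum += d / at_risk          # d(tj)/Y(tj)
            var += d / at_risk ** 2     # d(tj)/Y(tj)^2
            out.append((t, cum, math.sqrt(var)))
        at_risk -= d + c
    return out

obs = [(2, 1), (3, 0), (4, 1), (5, 1), (7, 0), (8, 0)]
na = nelson_aalen(obs)
# exp(-Lambda) is close to, but not equal to, the Kaplan-Meier estimate:
# here exp(-3/4) ≈ 0.472 versus the Kaplan-Meier value 5/12 ≈ 0.417.
s_approx = math.exp(-na[-1][1])
```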
If the option NELSON (or AALEN) is not specified, then the ‘−log(KM)’ estimator is obtained.
The option METHOD=CH (or BRESLOW) is the default and, hence, not needed.
Non-parametric inference (including the Cox model - more later) has become the standard method in survival analysis.
However, useful parametric models do exist:
The exponential distribution has constant hazard: λ(t) = λ for all t. This is a restrictive assumption which is often not justified; however, this is the model underlying the calculation of simple ‘occurrence/exposure’ rates.
Piecewise exponential models have piecewise constant hazards: λ(t) = λj when s_{j−1} ≤ t < s_j for pre-specified intervals 0 = s_0 < s_1 < ... < s_J = ∞. This leads to interval-specific occurrence/exposure rates and provides the basis for Poisson regression models.
Another simple extension of the exponential model is the Weibull model with λ(t) = λαt^{α−1}. It is mathematically simple and rather flexible (allowing increasing, constant, and decreasing hazard functions), but rarely used in practice.
Log-normal models also exist (with no simple hazard function).
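For the exponential model, the occurrence/exposure rate mentioned above is simply the number of failures divided by the total time at risk; its standard error follows from the observed information. A minimal sketch on illustrative data:

```python
# Occurrence/exposure rate: the MLE of the constant hazard lambda in the
# exponential model. Data are purely illustrative.
import math

times  = [2.0, 3.0, 4.0, 5.0, 7.0, 8.0]   # follow-up times T_i
events = [1, 0, 1, 1, 0, 0]               # D_i: 1 = failure, 0 = censored

d = sum(events)                # observed failures
exposure = sum(times)          # total time at risk
lam_hat = d / exposure         # occurrence/exposure rate
se = lam_hat / math.sqrt(d)    # SE from the observed information
```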
Likelihood function when the hazard function is λ_θ(t):

L(θ) = ∏_{i=1}^n λ_θ(Ti)^{Di} exp( −∫_0^{Ti} λ_θ(t) dt ).
Standard inference via score function, observed information, etc. Martingale-based proof of ‘standard’ asymptotic properties: the score D log L(θ0) is a martingale at the true parameter value θ0.
When the full distribution of T is parametrically specified via θ, parameters like the mean and median are also functions of θ. However, since the right-hand tail of the distribution is not observed because of censoring, one is reluctant to quote the mean.
The hazard function is λ(t) = λj when s_{j−1} ≤ t < s_j for pre-specified intervals 0 = s_0 < s_1 < ... < s_J = ∞.
The maximum likelihood estimator is most easily expressed in counting process notation:

N(t) = Σ_i I(Ti ≤ t, Di = 1),  Y(t) = Σ_i I(Ti ≥ t).

Then

λj = (N(s_j) − N(s_{j−1})) / ∫_{s_{j−1}}^{s_j} Y(t) dt,

i.e., the number of failures in interval j divided by the total time at risk in interval j. Further, from the observed information, SD(λj) ≈ λj/√(N(s_j) − N(s_{j−1})).
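These interval-specific occurrence/exposure rates can be sketched directly from the counting-process formula: count the failures in each interval and divide by the time at risk spent in that interval. Cut points and data below are illustrative assumptions.

```python
# Interval-specific occurrence/exposure rates for the piecewise
# exponential model (minimal sketch).
import math

def piecewise_rates(times, events, cuts):
    """cuts = [s_1, ..., s_{J-1}]; s_0 = 0 and s_J = infinity are implied."""
    bounds = [0.0] + list(cuts) + [math.inf]
    rates = []
    for j in range(len(bounds) - 1):
        lo, hi = bounds[j], bounds[j + 1]
        # failures observed inside [lo, hi)
        d = sum(e for t, e in zip(times, events) if lo <= t < hi)
        # exposure: time each subject spends at risk inside [lo, hi)
        expo = sum(max(0.0, min(t, hi) - lo) for t in times)
        rates.append(d / expo if expo > 0 else float("nan"))
    return rates

rates = piecewise_rates([2, 3, 4, 5, 7, 8], [1, 0, 1, 1, 0, 0], cuts=[5])
```

With the single cut point s_1 = 5, this gives 2 failures over 24 time units at risk in [0, 5) and 1 failure over 5 time units at risk in [5, ∞).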
We want to compare the hazard functions λ1(t) and λ2(t) in two groups.
Counting process notation: in group j we have Nj(t) = number of observed events in [0, t] and Yj(t) = number at risk just before time t. The Nelson-Aalen estimators for Λj(t) = ∫_0^t λj(u) du are:

Λj(t) = ∫_0^t ( I(Yj(u) > 0) / Yj(u) ) dNj(u),  j = 1, 2.
Idea in the general test statistic: look at K-weighted differences between the increments of the Nelson-Aalen estimators,

U(t) = ∫_0^t K(u) ( dN1(u)/Y1(u) − dN2(u)/Y2(u) ).
The logrank test (as we shall see later) has optimality propertiesagainst proportional hazards alternatives:
λ2(t) = θλ1(t).
Using instead the weights K(t) = Y1(t)Y2(t), a test statistic is obtained where values of ‘observed − expected’ at earlier time points are given larger weight. This test statistic, in fact, when there are no censored observations, is the two-sample Wilcoxon (Mann-Whitney) test.
For either choice of K(·), the statistic (U(∞))², properly normalized, is referred to the χ²-distribution with 1 degree of freedom.
The logrank test has developed into the test of choice, and anypaper using a different test will be looked upon with suspicion.
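On made-up data, the logrank statistic can be sketched in its familiar observed-minus-expected form, which (for the logrank weight K(t) = Y1(t)Y2(t)/(Y1(t)+Y2(t))) coincides with the K-weighted statistic above; the data and the function name are illustrative.

```python
# Two-sample logrank test: observed minus expected events in group 1
# at each failure time, with hypergeometric variance terms.
def logrank_chi2(times1, events1, times2, events2):
    data = ([(t, e, 1) for t, e in zip(times1, events1)]
            + [(t, e, 2) for t, e in zip(times2, events2)])
    u = v = 0.0
    for t in sorted({s for s, e, g in data if e == 1}):
        y1 = sum(1 for s, e, g in data if s >= t and g == 1)
        y2 = sum(1 for s, e, g in data if s >= t and g == 2)
        y = y1 + y2
        d = sum(1 for s, e, g in data if s == t and e == 1)
        d1 = sum(1 for s, e, g in data if s == t and e == 1 and g == 1)
        u += d1 - d * y1 / y             # observed - expected, group 1
        if y > 1:                        # hypergeometric variance term
            v += d * (y1 / y) * (y2 / y) * (y - d) / (y - 1)
    return u * u / v                     # refer to chi-square with 1 df

# Extreme illustrative data: all group-1 failures precede group 2's.
stat = logrank_chi2([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```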
Doob-Meyer decomposition for each j = 1, 2 (Mj is a martingale):

Nj(t) = ∫_0^t Yj(u) λj(u) du + Mj(t).
U(t) = ∫_0^t K(u) ( dN1(u)/Y1(u) − dN2(u)/Y2(u) ).

Using the decomposition, under H0: λ1(t) = λ2(t) we see that

U(t) = ∫_0^t K(u) ( dM1(u)/Y1(u) − dM2(u)/Y2(u) )

is a martingale, i.e. E(U(t)) = 0, and the asymptotic distribution (a normal distribution), together with the normalizing variance, can be found by a martingale CLT.
Comparison of two groups (say, Z = 1 and Z = 0) after adjustment for a categorical variable (X) can be performed using the stratified logrank test.
Here, observed and expected numbers of failures (e.g., for Z = 1) are first computed within strata given by values of X and, subsequently, added across strata.
In SAS, X should be the STRATA variable and the stratified test statistic is obtained using a TEST Z; command.
Sometimes, subjects are not observed from time 0 but only from a later entry time, Vi; that is, subject i is only observed conditionally on having survived until Vi.
This is denoted delayed entry or left truncation and is often present if age is the primary time variable.
A change of time variable causing delayed entry changes how risk sets are composed - see the graph for the small data set.
So far, we have focused on models for the hazard function and estimation of hazard ratios (which then implied models for S(t) = 1 − F(t)). Other parameters may be targeted:

The risk difference at τ: F1(τ) − F2(τ).

The τ-restricted mean life time: E(T ∧ τ) = ∫_0^τ S(t) dt.

Log-linear models for T: accelerated failure time models.
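The τ-restricted mean can be computed by integrating the Kaplan-Meier step function up to τ. A minimal sketch on illustrative data; ties among failure times are assumed away for simplicity.

```python
# tau-restricted mean E(T ∧ tau) = ∫_0^tau S(t) dt, by integrating a
# Kaplan-Meier step function (no ties among failures assumed).
def rmst(data, tau):
    """data: (time, event) pairs, event = 1 failure / 0 censored."""
    s, area, last = 1.0, 0.0, 0.0
    at_risk = len(data)
    for t, e in sorted(data):
        if t >= tau:
            break
        if e == 1:
            area += s * (t - last)   # rectangle up to this failure time
            s *= 1 - 1 / at_risk
            last = t
        at_risk -= 1
    return area + s * (tau - last)   # final rectangle up to tau

mu = rmst([(1, 1), (2, 1), (3, 1), (4, 1)], tau=2.5)
```

As a sanity check: without censoring this equals the sample mean of min(Ti, τ), here (1 + 2 + 2.5 + 2.5)/4 = 2.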
Without covariates, this can be estimated from the Kaplan-Meier estimator: S2(τ) − S1(τ).
From a Cox model with treatment variable Z and other covariates X, the risk difference at τ between treatment groups could be estimated by direct adjustment/standardization:

(1/n) ( Σ_i S(τ | Z = 1, Xi) − Σ_i S(τ | Z = 0, Xi) ).

This is also known as the g-formula in (modern) causal inference.
What if we want direct covariate effects on F(τ) (instead of indirectly via the hazard function)?
Let S be the Kaplan-Meier estimator and S^(−i) the same estimator applied to the data set (of size n − 1) obtained by eliminating subject i. Then the pseudo-observation for the (possibly incompletely observed) survival indicator I(Ti > τ) is:

Si(τ) = n · S(τ) − (n − 1) · S^(−i)(τ).

This may be used as response variable for a generalized linear model

g(S(τ | X)) = α0 + α1X1 + ... + αpXp,

and parameters may be estimated by solving the generalized estimating equations (GEE, with working (co-)variance Vi).
Without censoring, the pseudo-observation is then simply Si(τ) = I(Ti > τ).
With censoring (NB: censoring should be independent of covariates),

E(Si(τ) | X) ≈ E(I(Ti > τ) | X),

that is, the pseudo-observation has approximately the correct conditional expectation given covariates, and the GEE are unbiased.
The estimating equations may be solved using standard software(e.g., SAS PROC GENMOD), and there is a SAS MACRO available forcomputing the pseudo-observations.
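The leave-one-out construction of the pseudo-observations can be sketched directly from the definition Si(τ) = n·S(τ) − (n−1)·S^(−i)(τ). A minimal illustration on made-up, uncensored data (ties among failures assumed away), where the pseudo-observations should reduce to the indicators I(Ti > τ):

```python
# Pseudo-observations for the survival indicator I(T_i > tau).
def km_at(data, tau):
    """Kaplan-Meier estimate of S(tau) from (time, event) pairs."""
    s = 1.0
    at_risk = len(data)
    for t, e in sorted(data):
        if t > tau:
            break
        if e == 1:
            s *= 1 - 1 / at_risk   # assumes no ties among failures
        at_risk -= 1
    return s

def pseudo_obs(data, tau):
    n = len(data)
    s_full = km_at(data, tau)
    return [n * s_full - (n - 1) * km_at(data[:i] + data[i + 1:], tau)
            for i in range(n)]

# Without censoring the pseudo-observations reduce to the indicators:
po = pseudo_obs([(1, 1), (2, 1), (3, 1), (4, 1)], tau=2.5)
# → approximately [0, 0, 1, 1]
```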
The restricted mean life time is enjoying a revival in survival analysis, e.g. Royston and Parmar (BMC Med. Res. Meth., 2013).
How to do regression? As for the risk difference, one may use plug-in based on a Cox model or (as promoted by R&P) based on some flexible parametric model.
An alternative is to use pseudo-observations. Compute:

µi(τ) = n · µ(τ) − (n − 1) · µ^(−i)(τ)

and use them as responses in GEE (working (co-)variance Vi):

U(α) = Σ_i (∂/∂α)(g^{−1}(αᵀXi)) Vi^{−1} (µi(τ) − g^{−1}(αᵀXi)) = 0

when fitting a generalized linear model: g(µ(τ | X)) = α0 + α1X1 + ... + αpXp.
Survival analysis deals with a quantitative non-negative response variable Ti and, therefore, a regression model of choice could be

log(Ti) = α0 + α1X1 + ... + αpXp + σεi.

Such models do, indeed, exist, and mainly parametric models are used where the error term εi is assumed to follow a specified distribution.
The parameters have nice interpretations as acceleration factors: for a binary X, ‘time moves exp(α) faster’ for X = 1 compared to X = 0.
The model may be fitted using SAS PROC LIFEREG, which offers choices of normal, logistic, and extreme value distributions. The latter corresponds to a Weibull distribution for T and is the only model that is both an AFT model and a proportional hazards model (a Cox model with Weibull baseline).
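The acceleration-factor interpretation can be made concrete with a two-line calculation: holding the error term fixed in log T = α0 + α1·X + σ·ε, switching X from 0 to 1 multiplies the survival time by exp(α1). All parameter values below are purely illustrative.

```python
# Acceleration-factor interpretation of the AFT model (minimal sketch).
import math

a0, a1, sigma = 1.0, 0.5, 0.3          # illustrative parameters

def survival_time(x, eps):
    """T = exp(a0 + a1*x + sigma*eps) for covariate x and error eps."""
    return math.exp(a0 + a1 * x + sigma * eps)

eps = 0.7                              # same error term for both subjects
ratio = survival_time(1, eps) / survival_time(0, eps)
# ratio equals exp(a1), the acceleration factor for X = 1 vs X = 0
```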