Lecture 7: Survival Analysis

Antonello Maruotti

Lecturer in Medical Statistics, S3RI and School of Mathematics, University of Southampton

Southampton, 6 February 2013

Outline

Introduction

Basic definitions

The hazard

Introduction

A couple of questions and...

- What makes survival data so special that their analysis needs a special treatment, even as long as a one-term course?
- Why isn't it simply covered as a sub-topic in, let's say, regression analysis?

...a clarification

- Survival data subsume more than only times from birth to death for some individuals.
- Survival analysis is the analysis of duration data, that is, the time from a well-defined starting point until the event of interest occurs.

Examples

- how long patients survived after diagnosis or treatment
- the length of unemployment spells
- how long a marriage lasts
- how long PhD students need to finish writing their theses
- and more...

Features

- Survival data result from a dynamic process, and we want to capture these dynamics properly in the analysis.
- The observation scheme for duration data can be rather complex, leading to data that are cut off in some way (for example, censored).

Basic definitions

The basic functions

In the following we will assume that time runs continuously, and we will therefore describe duration by a continuous random variable, denoted by T.

- T ≥ 0
- f(t) ⇒ density function
- F(t) ⇒ cumulative distribution function (cdf)
- S(t) ⇒ survival function

Recall that...

- The density function f(t) describes how the total probability of 1 is distributed over the domain of T.
- The function f(t) itself is not a probability and can take values bigger than 1. But one can still derive basic properties from looking at the density.
- For regions where the density has large values, the area under the curve over an interval of given length will be larger than over an interval of the same length where the density is lower.
- Regions over which the density is high are regions where we expect to observe more data points than in regions with low densities.

- The cdf F(t) is defined as F(t) := P(T ≤ t), which can be computed from the density as

$$F(t) = \int_0^t f(s)\,ds.$$

- A cdf is an increasing function, even strictly increasing if the density f(t) > 0 everywhere.
- F(0) = 0 and $\lim_{t\to\infty} F(t) = 1$.
- There is a one-to-one link between f(t) and F(t), as F′(t) = f(t). Knowing one of the functions means, at least in principle, knowing the other (you may have to take the derivative or perhaps solve an ugly integral).

- Instead of looking at the cdf, which gives the probability of surviving at most t time units, one prefers to look at survival beyond a given point in time. This is described by the survival function S(t):

$$S(t) = P(T > t) = 1 - P(T \le t) = 1 - F(t)$$

- Consequently, S(t) starts at 1 for t = 0 and then declines to 0 for t → ∞.
- It should be obvious that knowing any one of f(t), F(t) and S(t) allows one to derive the other two functions.
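
The link between the three functions can be checked numerically. The following is only a sketch: it assumes an exponential density with an arbitrary rate of 0.5 (an illustrative choice, not taken from the lecture), integrates it with the trapezoidal rule to obtain F(t), and then sets S(t) = 1 − F(t).

```python
import math

def density(t, lam=0.5):
    """Density f(t) of an exponential duration; lam = 0.5 is an arbitrary illustrative rate."""
    return lam * math.exp(-lam * t)

def cdf_from_density(t, lam=0.5, steps=10_000):
    """F(t) = integral of f(s) from 0 to t, approximated with the trapezoidal rule."""
    if t == 0:
        return 0.0
    width = t / steps
    total = 0.5 * (density(0.0, lam) + density(t, lam))
    total += sum(density(k * width, lam) for k in range(1, steps))
    return total * width

def survival_from_density(t, lam=0.5):
    """S(t) = 1 - F(t)."""
    return 1.0 - cdf_from_density(t, lam)

# Compare the numerical S(2) with the closed form exp(-lam * t) for this density.
print(survival_from_density(2.0), math.exp(-0.5 * 2.0))
```

For this particular density the closed form S(t) = exp(−λt) is known, which is why it makes a convenient check; any other density on t ≥ 0 could be substituted for `density`.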

To summarize

Pr(a ≤ T ≤ b)

All three functions introduced so far allow us to describe, in one way or another, how the survival times are distributed over the potential range.

The hazard

The dynamic process

- Density, cdf and survival function look at the marginal distribution.
- Conditioning on the survival experience so far, we have

$$\Pr(t < T \le t + \Delta t \mid T > t)$$

- Defining the hazard rate

$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t + \Delta t \mid T > t)}{\Delta t}$$

The hazard in more detail

The basic information in the hazard is, first of all, its qualitative behavior.

Some useful identities

- $h(t) = \dfrac{f(t)}{S(t)} \;\Rightarrow\; f(t) = h(t)\,S(t)$
- $h(t) = [-\log S(t)]'$
- $S(t) = \exp\left\{-\int_0^t h(s)\,ds\right\}$
- Define the cumulative hazard H(t):

$$H(t) = \int_0^t h(s)\,ds \;\Rightarrow\; S(t) = \exp\{-H(t)\} \quad\text{or}\quad \log S(t) = -H(t)$$
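
These identities can be verified numerically for a concrete distribution. The sketch below assumes a Weibull hazard with hypothetical parameters (λ = 0.3, a = 1.5); the density is obtained by differentiating 1 − S(t) numerically, so that h(t) = f(t)/S(t) is a genuine check rather than a restatement.

```python
import math

LAM, A = 0.3, 1.5  # hypothetical Weibull parameters, chosen only for illustration

def hazard(t):
    """Weibull hazard h(t) = lam * a * t^(a-1)."""
    return LAM * A * t ** (A - 1)

def survival(t):
    """Weibull survival function S(t) = exp(-lam * t^a)."""
    return math.exp(-LAM * t ** A)

def density_numeric(t, eps=1e-5):
    """f(t) = -S'(t), approximated by a central difference."""
    return (survival(t - eps) - survival(t + eps)) / (2 * eps)

def cumulative_hazard(t, steps=20_000):
    """H(t) = integral of h(s) from 0 to t, approximated with the midpoint rule."""
    width = t / steps
    return sum(hazard((k + 0.5) * width) for k in range(steps)) * width

def neg_log_survival_derivative(t, eps=1e-5):
    """[-log S(t)]', approximated by a central difference."""
    return (math.log(survival(t - eps)) - math.log(survival(t + eps))) / (2 * eps)

t = 2.0
print(hazard(t), density_numeric(t) / survival(t), neg_log_survival_derivative(t))
print(survival(t), math.exp(-cumulative_hazard(t)))
```

The first line prints three numerical versions of h(2) and the second prints two versions of S(2); within each line the values should agree up to discretisation error.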

By using the definition of conditional probabilities

$$\Pr(t < T \le t + \Delta t \mid T > t) = \frac{\Pr([t < T \le t + \Delta t] \cap [T > t])}{\Pr(T > t)} = \frac{\Pr(t < T \le t + \Delta t)}{\Pr(T > t)}$$

It may be helpful to sketch this relation graphically
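
As a worked step, the conditional probability can be carried through to the hazard rate. The numerator is an increment of the cdf and the denominator is the survival function, so dividing by Δt and letting Δt → 0 gives (assuming F is differentiable at t):

```latex
\begin{aligned}
h(t) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
        \frac{\Pr(t < T \le t + \Delta t)}{\Pr(T > t)} \\
     &= \frac{1}{S(t)} \lim_{\Delta t \to 0}
        \frac{F(t + \Delta t) - F(t)}{\Delta t} \\
     &= \frac{F'(t)}{S(t)} = \frac{f(t)}{S(t)} .
\end{aligned}
```

This is the identity h(t) = f(t)/S(t) used in the rest of the lecture.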

Survival Analysis: Non Parametric Estimation

Outline

General Concepts

Non Parametric Estimation (no censoring)

Non Parametric Estimation (including censoring)

General Concepts

A few remarks before starting

- Each subject has a beginning and an end anywhere along the time line of the complete study.
- In many clinical trials, subjects may enter or begin the study and reach the end-point at vastly differing points.
- Each subject is characterized by
  1. Survival time
  2. Status at the end of the survival time (event occurrence or censored)
  3. The study group they are in.

Censoring

- Censoring occurs when the total survival time for a subject cannot be accurately determined.
- The subject drops out, is lost to follow-up, or the required data are not available.
- The study ends before the subject has had the event of interest, i.e., the subject survived at least until the end of the study.
- There is no knowledge of what happened thereafter.

- Right censoring: the period of observation expires, or an individual is removed from the study, before the event occurs.
- Left censoring: the initial time at risk is unknown.
- Interval censoring: both right and left censored

Non Parametric Estimation (no censoring)

Estimation

- Random variable T with cdf F(t)
- S(t) = 1 − F(t)
- With no censored observations:

$$\hat S(t) = 1 - \hat F(t)$$

- To estimate F(t) at each time t:
  - data $t_1, \ldots, t_n$
  - parameter of interest θ = F(t) = Pr(T ≤ t)
  - estimator

$$\hat\theta = \frac{\#\{\text{obs.} \le t\}}{n} = \frac{1}{n}\sum_{i=1}^{n} I_{(0,t]}(t_i)$$
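
As a minimal sketch of this estimator without censoring, the function below counts the observed durations that do not exceed t and divides by n. The six durations are made up for illustration and are assumed to be fully observed.

```python
def ecdf(times, t):
    """Empirical cdf: the fraction of observed durations that are <= t."""
    return sum(1 for ti in times if ti <= t) / len(times)

# Hypothetical, fully observed durations (no censoring).
durations = [1.0, 2.0, 3.0, 4.0, 4.5, 5.0]

theta_hat = ecdf(durations, 3.0)   # estimate of F(3)
survival_hat = 1 - theta_hat       # estimate of S(3)
print(theta_hat, survival_hat)
```

Three of the six durations are at most 3, so both estimates equal 0.5 here.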

Confidence intervals

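- Confidence interval for F(t):

$$\hat\theta \mp z_{\alpha/2}\sqrt{\frac{\hat\theta(1-\hat\theta)}{n}}$$

- Confidence interval for S(t):

$$1 - \hat\theta \mp z_{\alpha/2}\sqrt{\frac{\hat\theta(1-\hat\theta)}{n}}$$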
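
The two intervals can be sketched in a few lines. The function names and the default α = 0.05 are choices made for this illustration; the normal quantile comes from Python's standard library. Note that nothing here clips the limits to [0, 1], so for estimates close to 0 or 1 the interval can leave the unit interval.

```python
import math
from statistics import NormalDist

def cdf_interval(theta_hat, n, alpha=0.05):
    """Normal-approximation confidence interval for F(t), given theta_hat and sample size n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(theta_hat * (1 - theta_hat) / n)
    return theta_hat - half_width, theta_hat + half_width

def survival_interval(theta_hat, n, alpha=0.05):
    """Interval for S(t) = 1 - F(t): same half-width, centred at 1 - theta_hat."""
    lower, upper = cdf_interval(theta_hat, n, alpha)
    return 1 - upper, 1 - lower

# Hypothetical inputs: theta_hat = 0.3 estimated from n = 100 uncensored observations.
print(cdf_interval(0.3, 100))
print(survival_interval(0.3, 100))
```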

Non Parametric Estimation (including censoring)

Estimation

- To estimate the proportions $\theta_i$:
  - $n_i$ = # of individuals at risk at the beginning of the i-th interval
  - $d_i$ = # of individuals experiencing the event

$$\hat\theta_i = \frac{n_i - d_i}{n_i}$$

- Kaplan-Meier estimator

$$\hat S(t) = \prod_{i:\, t_i \le t} \frac{n_i - d_i}{n_i}$$

- It reduces to $1 - \hat F(t)$ with no censored observations

Example

Subject  Group  Survival  # at risk in   Event  # surviving   Cumulative survival rate
                time      the interval          after event
1        1      1         6              1      5             1 × 5/6
2        1      2         5              1      4             1 × 5/6 × 4/5
3        1      3         4              1      3             1 × 5/6 × 4/5 × 3/4
4        1      4         3              1      2             1 × 5/6 × 4/5 × 3/4 × 2/3
5        1      4.5       2              1      1             1 × 5/6 × 4/5 × 3/4 × 2/3 × 1/2
6        1      5                        0
7        2      0.5       6              1      5             1 × 5/6
8        2      0.75      5              1      4             1 × 5/6 × 4/5
9        2      1         4              1      3             1 × 5/6 × 4/5 × 3/4
10       2      1.5                      0
11       2      2         2              1      1             1 × 5/6 × 4/5 × 3/4 × 1/2
12       2      3.5       1              1      0             1 × 5/6 × 4/5 × 3/4 × 1/2 × 0/1
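
The product-limit calculation for this example can be reproduced with a short function. This is a sketch rather than a full implementation: it takes the twelve survival times from the example, codes subjects 6 and 10 as censored (event = 0), and uses exact fractions so the results can be compared with the products written out by hand.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored data.

    times  : observed times (event or censoring)
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (event_time, estimated survival just after that time).
    """
    surv = Fraction(1)
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        n_i = sum(1 for ti in times if ti >= t)                       # at risk just before t
        d_i = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        surv *= Fraction(n_i - d_i, n_i)
        curve.append((t, surv))
    return curve

group1 = kaplan_meier([1, 2, 3, 4, 4.5, 5], [1, 1, 1, 1, 1, 0])
group2 = kaplan_meier([0.5, 0.75, 1, 1.5, 2, 3.5], [1, 1, 1, 0, 1, 1])

for label, curve in (("group 1", group1), ("group 2", group2)):
    print(label, [(t, str(s)) for t, s in curve])
```

In group 2 the censored subject at time 1.5 leaves the risk set without producing a step, which is why the factor at time 2 is 1/2 rather than 2/3.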

[Figure: Surviving functions by group (group = 1, group = 2); horizontal axis: analysis time, 0 to 5; vertical axis: 0.00 to 1.00]

Understanding KM analysis

- The lengths of the horizontal lines represent the survival duration for that interval.
- The interval is terminated by the occurrence of the event of interest.
- The vertical distances between horizontal lines illustrate the change in the cumulative probability.
- The KM curve is a step-wise estimator, not a smooth function.
- What about the estimate of survival at a given time point?
- What is the effect of censoring?

Comparison of KM estimates

- It is simple to visualize the difference between two survival curves.
- The difference must be quantified in order to assess statistical significance.
- Methods
  - log-rank test ⇒ most sensitive to consistent differences
  - Wilcoxon test ⇒ most sensitive to early differences
  - hazard ratio ⇒ gives the relative event rate in the groups

Log-Rank test: Example

Time   Group 1  Group 2  Group 1   Group 2   Group 1    Group 2
       Event    Event    At Risk   At Risk   Expected   Expected
0.5    0        1        6         6         0.50       0.50
0.75   0        1        6         5         0.55       0.45
1      1        1        6         4         1.20       0.80
2      1        1        5         2         1.43       0.57
3      1        0        4         1         0.80       0.20
3.5    0        1        3         1         0.75       0.25
4      1        0        3         0         1.00       0.00
4.5    1        0        2         0         1.00       0.00

The log-rank test statistic is constructed by computing the observed and expected number of events in one of the groups at each observed event time and then adding these to obtain an overall summary across all time points where there is an event.

χ² = 3.07; p-value = 0.0798
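
The table and the test statistic can be reproduced from the raw survival times of the two groups. The sketch below makes one assumption that the slide does not state: the variance of the observed-minus-expected sum is taken from the hypergeometric distribution at each event time, which is the usual form of the log-rank test. With that choice the statistic agrees with the value reported above after rounding.

```python
import math

def logrank(times1, events1, times2, events2):
    """Two-group log-rank test.

    Returns (observed events in group 1, expected events in group 1,
    chi-square statistic with 1 df, p-value).
    """
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        n1 = sum(1 for ti in times1 if ti >= t)      # at risk in group 1
        n2 = sum(1 for ti in times2 if ti >= t)      # at risk in group 2
        d1 = sum(1 for ti, ei in zip(times1, events1) if ti == t and ei == 1)
        d2 = sum(1 for ti, ei in zip(times2, events2) if ti == t and ei == 1)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins (assumed form).
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    chi2 = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))         # upper tail of chi-square, 1 df
    return observed1, expected1, chi2, p_value

obs1, exp1, chi2, p = logrank(
    [1, 2, 3, 4, 4.5, 5], [1, 1, 1, 1, 1, 0],            # group 1: last subject censored
    [0.5, 0.75, 1, 1.5, 2, 3.5], [1, 1, 1, 0, 1, 1],     # group 2: subject at 1.5 censored
)
print(obs1, round(exp1, 2), round(chi2, 2), round(p, 4))
```

The expected count for group 1 is the sum of the "Group 1 Expected" column, and the per-time risk sets computed inside the loop match the "At Risk" columns.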

What to avoid

- Comparing mean survival ⇒ censoring makes this meaningless
- Overinterpreting the tail of a survival curve ⇒ there are generally few subjects in the tails
- Comparing proportions surviving at a fixed time ⇒ fine for description, not for hypothesis testing

Cox Proportional Hazards Regression for Survival Data

Outline

Some simple distributions

The Cox PH model

Model diagnostics

Some simple distributions

Survival distributions

- Survival analysis focuses on the distribution of survival times.
- Although there are well-known methods for estimating unconditional survival distributions, most interesting survival modeling examines the relationship between survival and one or more predictors.
- In principle, every distribution on R+ can serve to characterize survival data.
  - Constant hazard
  - Gompertz distribution
  - Weibull distribution

Modeling of survival data usually employs the hazard function

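$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t + \Delta t \mid T > t)}{\Delta t}$$

- Constant hazard: $h(t) = \lambda \;\Rightarrow\; S(t) = e^{-\lambda t}$
- Gompertz: $h(t) = a e^{bt},\ a > 0,\ b > 0 \;\Rightarrow\; S(t) = \exp\left\{\tfrac{a}{b}\left[1 - e^{bt}\right]\right\}$
- Weibull: $h(t) = \lambda a t^{a-1} \;\Rightarrow\; S(t) = e^{-\lambda t^{a}}$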
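
Each survival function follows from its hazard through the cumulative hazard, S(t) = exp{−H(t)} with H(t) the integral of h from 0 to t. As a worked example, the Gompertz and Weibull cases can be written out as:

```latex
\begin{aligned}
\text{Gompertz:}\quad
H(t) &= \int_0^t a e^{bs}\,ds = \frac{a}{b}\left(e^{bt} - 1\right)
&&\Rightarrow\quad
S(t) = \exp\{-H(t)\} = \exp\left\{\frac{a}{b}\left[1 - e^{bt}\right]\right\}, \\[4pt]
\text{Weibull:}\quad
H(t) &= \int_0^t \lambda a s^{a-1}\,ds = \lambda t^{a}
&&\Rightarrow\quad
S(t) = \exp\{-\lambda t^{a}\}.
\end{aligned}
```

Setting a = 1 in the Weibull case recovers the constant hazard λ and S(t) = exp{−λt}.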

The Cox PH model

Regression-like model

A parametric model based on the exponential distribution may be written as

$$\log h_i(t) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}$$

log-baseline hazard

The constant $\beta_0$ represents a kind of log-baseline hazard.

The Cox model

The Cox model leaves the log-baseline hazard function $\beta_0(t) = \log h_0(t)$ unspecified:

$$\log h_i(t) = \beta_0(t) + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}$$

The model is semiparametric because, while the baseline hazard can take any form, the covariates enter the model linearly.

- The baseline hazard does not depend on covariates, but only on time
- The covariates are time-constant
- The proportional hazards assumption follows

The hazard ratio

For two observations i and j , the hazard ratio

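$$\frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\exp(\beta_1 x_{i1} + \cdots + \beta_p x_{ip})}{h_0(t)\exp(\beta_1 x_{j1} + \cdots + \beta_p x_{jp})} = \frac{\exp(\beta_1 x_{i1} + \cdots + \beta_p x_{ip})}{\exp(\beta_1 x_{j1} + \cdots + \beta_p x_{jp})} = \exp\left(\sum_{l=1}^{p} \beta_l (x_{il} - x_{jl})\right)$$

is independent of time t. Consequently, the Cox model is a proportional hazards model.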
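
A small numerical sketch makes the cancellation of the baseline hazard visible. Everything specific here is hypothetical: the Weibull-type baseline, the two coefficients, and the covariate vectors are chosen only to illustrate that the ratio of two hazards does not change with t.

```python
import math

def hazard(t, x, beta, baseline):
    """Hazard h_i(t) = h_0(t) * exp(beta_1 * x_1 + ... + beta_p * x_p)."""
    return baseline(t) * math.exp(sum(b * xl for b, xl in zip(beta, x)))

def hazard_ratio(x_i, x_j, beta):
    """Closed-form ratio exp(sum_l beta_l * (x_il - x_jl)); no baseline needed."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_i, x_j)))

def baseline(t):
    """Hypothetical baseline hazard (Weibull-type); any positive function would do."""
    return 0.2 * 1.5 * t ** 0.5

beta = [0.7, -0.02]            # hypothetical coefficients
x_i, x_j = [1, 60], [0, 50]    # hypothetical covariate vectors for subjects i and j

ratios = [hazard(t, x_i, beta, baseline) / hazard(t, x_j, beta, baseline)
          for t in (0.5, 1.0, 5.0)]
print(ratios, hazard_ratio(x_i, x_j, beta))
```

The three ratios evaluated at different times coincide with the closed-form value, which depends only on the coefficients and the covariate differences.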

The hazard ratio: an example

- Only one covariate: Treatment
  - $x_i = 1$ ⇒ Placebo
  - $x_j = 0$ ⇒ Treatment
- The hazard ratio is then $\exp(\beta_1)$
- We expect the hazard to be larger in the placebo group, i.e. the hazard ratio is expected to be greater than 1.

Time-constant covariates

- Not changing over time (e.g. gender)
- Values are set at time t = 0
- Variables unlikely to change are often considered time-constant
- Other variables are sometimes treated as time-independent
- Time-dependent covariates are allowed, but the PH assumption is then not satisfied (an extended Cox model is needed)

Advantages

- Robustness
- Because of the model form, the estimated hazards are always non-negative
- We can estimate fixed effects and compute the hazard ratio even though the baseline hazard is left unspecified

Can we use a logistic model?

Model diagnostics

Checking proportional hazards

- Tests and graphical diagnostics for PH may be based on scaled Schoenfeld residuals
- Influential observations
- Nonlinearity

Survival Analysis - Stata

Outline

Introduction

Coding

Kaplan-Meier

PH Cox model

Introduction

Aim

Illustrate how to use Stata to

- prepare survival data for analysis
- estimate hazard and survival functions

Data manipulation

A manipulation of the data is needed to facilitate summary and analysis.

help st

Assumptions

- Continuous time survival data
- Single failure data, i.e. one record per unit
- No complications such as truncation and/or missing values
- Data do not need to be weighted

Data structure

The data have a very simple structure

- One row per unit (e.g. subject)
- The survival time and the censoring status must be included as variables (1 = failure, 0 = otherwise)
- Covariates (explanatory variables) could be included

Coding

stset

stset declares the data in memory to be st data

- Main
  - Time variable ⇒ survival time
  - Failure variable ⇒ censoring status
- Options
  - Origin time expression sets when a subject becomes at risk
  - Enter time expression specifies when a subject first comes under observation
  - Exit time expression specifies the latest time under which the subject is both under observation and at risk.

stset in practice

stset: example

Using stset

New variables in the data: why? What is their meaning? Should you use them?

- _st is a binary variable indicating cases included (1) or excluded (0) from the analysis
- _d is a censoring indicator
- _t is the survival time
- _t0 is the time at which units are observed to be at risk

Summary statistics

You must stset your data before using

- stdescribe produces a summary of the st data
- stsum summarizes survival-time data

Kaplan-Meier

- Simple single-spell type
- Right censoring
- No left censoring (truncation)

sts

Survival times are treated as observations on a continuous variable

- sts list
- sts graph
- sts test
- sts generate

sts list

sts list: example

sts graph

sts graph: example

[Figure: Kaplan-Meier survival estimate; horizontal axis: analysis time, 0 to 1000; vertical axis: 0.00 to 1.00]

[Figure: Nelson-Aalen cumulative hazard estimate; horizontal axis: analysis time, 0 to 1000; vertical axis: 0.00 to 3.00]

[Figure: Smoothed hazard estimate; horizontal axis: analysis time, 0 to 800; vertical axis: .001 to .005]

sts graph: stratification

[Figure: Kaplan-Meier survival estimates by sex (sex = 1, sex = 2); horizontal axis: analysis time, 0 to 1000; vertical axis: 0.00 to 1.00]

sts test

PH Cox model

stcox

stcox: options for model checking

stcox: example

stphplot: model checking

[Figure: −ln[−ln(Survival Probability)] against ln(analysis time), by sex (sex = 1, sex = 2); horizontal axis: 2 to 7; vertical axis: −2 to 4]

estat phtest: model checking
