Survival Analysis for Randomized Clinical Trials Ziad Taib March 9, 2012

Jan 31, 2016

Page 1: Survival Analysis for Randomized Clinical Trials

Survival Analysis for Randomized Clinical Trials

Ziad Taib

March 9, 2012

Page 2: Survival Analysis for Randomized Clinical Trials

I The log-rank test

1. INTRODUCTION TO SURVIVAL TIME DATA

2. ESTIMATING THE SURVIVAL FUNCTION

3. THE LOG RANK TEST

Page 3: Survival Analysis for Randomized Clinical Trials

Survival Analysis

Page 4: Survival Analysis for Randomized Clinical Trials

Survival Analysis in RCT

• For survival analysis, the best observation plan is prospective. In clinical investigation, that is a randomized clinical trial (RCT).

• Random treatment assignments.

• Well-defined starting points.

• Substantial follow-up time.

• Exact time records of the interesting events.

Page 5: Survival Analysis for Randomized Clinical Trials

Survival Analysis in Observational Studies

• Survival analysis can be used in observational studies (cohort, case-control, etc.) as long as you recognize its limitations.

• Lack of causal interpretation.

• Unbalanced subject characteristics.

• Determination of the starting points.

• Loss to follow-up.

• Ascertainment of event times.

Page 6: Survival Analysis for Randomized Clinical Trials

Elements of Survival Experiments

• Event definition (death, adverse events, …)

• Starting time

• Length of follow-up (equal length of follow-up, common stop time)

• Failure time (observed time of event since start of trial)

• Unobserved event time (censoring, no event recorded in the follow-up, early termination, etc.)

[Figure: patient timelines on a time axis from start to end of follow-up, showing an event and an early termination.]

Page 7: Survival Analysis for Randomized Clinical Trials

When to use survival analysis

• Examples
– Time to death or clinical endpoint
– Time in remission after treatment of CA
– Recidivism rate after alcohol treatment

• When one believes that one or more explanatory variables explain the differences in time to an event

• Especially when follow-up is incomplete or variable

Page 8: Survival Analysis for Randomized Clinical Trials

Standard Notation for Survival Data

• Ti -- Survival (failure) time

• Ci -- Censoring time

• Xi =min (Ti ,Ci) -- Observed time

• Δi = I(Ti ≤ Ci) -- Failure indicator: if the ith subject had an event before being censored, Δi = 1; otherwise Δi = 0.

• Zi(t) – covariate vector at time t.

• Data: {Xi , Δi , Zi(·) }, where i=1,2,…n.
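The notation above can be made concrete with a small sketch. The latent times below are hypothetical values chosen only for illustration; in a real trial only Xi and Δi are observed, never both Ti and Ci.

```python
def observe(event_times, censor_times):
    """Return (X_i, Delta_i) pairs from latent T_i and C_i."""
    observed = []
    for t, c in zip(event_times, censor_times):
        x = min(t, c)               # X_i = min(T_i, C_i)
        delta = 1 if t <= c else 0  # Delta_i = I(T_i <= C_i)
        observed.append((x, delta))
    return observed

# Hypothetical latent survival and censoring times (assumed for illustration).
T = [5.0, 8.0, 2.5, 7.0]
C = [6.0, 4.0, 9.0, 7.0]
data = observe(T, C)
print(data)
```

The second subject is censored at time 4.0 (Δ = 0); the other three have observed events.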

Page 9: Survival Analysis for Randomized Clinical Trials

Describing Survival Experiments

• Central idea: the event times are realizations of an unobserved stochastic process, that can be described by a probability distribution.

• Description of a probability distribution:

1. Cumulative distribution function

2. Survival function

3. Probability density function

4. Hazard function

5. Cumulative hazard function

Page 10: Survival Analysis for Randomized Clinical Trials

Relationships Among Different Representations

• Given any one, we can recover the others.

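$$S(t) = P(T > t) = 1 - F(t) = \exp\left\{-\int_0^t h(u)\,du\right\}$$

$$f(t) = h(t)\exp\left\{-\int_0^t h(u)\,du\right\}$$

$$h(t) = -\frac{\partial}{\partial t}\log S(t)$$

$$h(t) = \lim_{\Delta t \to 0}\frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$$

$$H(t) = \int_0^t h(u)\,du$$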

Page 11: Survival Analysis for Randomized Clinical Trials

Descriptive statistics

• Average survival
– Can we calculate this with censored data?

• Average hazard rate
– Total # of failures divided by observed survival time (units are therefore 1/t or 1/pt-yrs)
– An incidence rate, with a higher value indicating lower survival probability

• Provides an overall statistic only
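As a minimal sketch of the average hazard rate described above, the function below divides the number of failures by the total observed follow-up time. The follow-up data are hypothetical and the function name is an assumption made for this example.

```python
def average_hazard_rate(times, events):
    """Total number of failures divided by total observed time."""
    total_time = sum(times)   # person-time at risk
    failures = sum(events)    # events coded 1, censored observations coded 0
    return failures / total_time

# Hypothetical follow-up times (years) and event indicators.
times = [2.0, 3.5, 1.0, 4.0, 5.5]
events = [1, 0, 1, 1, 0]
rate = average_hazard_rate(times, events)
print(rate)  # failures per person-year
```

Censored subjects still contribute their follow-up time to the denominator, which is why this statistic remains computable when the average survival time is not.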

Page 12: Survival Analysis for Randomized Clinical Trials

Estimating the survival function

There are two slightly different methods to create a survival curve.

• With the actuarial method, the x axis is divided up into regular intervals, perhaps months or years, and survival is calculated for each interval.

• With the Kaplan-Meier method, survival is recalculated every time a patient dies. This method is preferred, unless the number of patients is huge.

The term life-table analysis is used inconsistently, but usually includes both methods.

Page 13: Survival Analysis for Randomized Clinical Trials

Life Tables (no censoring)

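In survival analysis, the object of primary interest is the survival function S(t). Therefore we need to develop methods for estimating it in a good manner. The most obvious estimate is the empirical survival function:

$$\hat S(t) = \frac{\#\text{ patients with survival times larger than } t}{\text{Total } \#\text{ patients}}$$

[Figure: survival times of 10 patients on a time axis from 1 to 10.]

$$\hat S(0) = \frac{10}{10} = 1, \qquad \hat S(1) = \frac{10}{10} = 1, \qquad \hat S(2) = \frac{9}{10} = 0.9, \qquad \hat S(5) = \frac{6}{10} = 0.6$$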
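The empirical survival function can be sketched in a few lines. The ten survival times below are hypothetical; they were chosen only so that the estimates at t = 0, 1, 2 and 5 agree with the values on the slide, and they are not the slide's actual data.

```python
def empirical_survival(times, t):
    """Fraction of patients whose survival time is larger than t (no censoring)."""
    return sum(1 for x in times if x > t) / len(times)

# Hypothetical survival times for 10 patients (assumed for illustration).
times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10]
for t in (0, 1, 2, 5):
    print(t, empirical_survival(times, t))
```

This estimator only works when every survival time is fully observed; it has no way to use a censored observation.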

Page 14: Survival Analysis for Randomized Clinical Trials

Example: A RAT SURVIVAL STUDY

In an experiment, 20 rats exposed to a particular type of radiation were followed over time. The start time of follow-up was the same for each rat. This is an important difference from clinical studies, where patients are recruited into the study over time and, at the date of the analysis, have been followed for different lengths of time. In this simple experiment all individuals have the same potential follow-up time. The potential follow-up time for each of the 20 rats is 5 days.

Page 15: Survival Analysis for Randomized Clinical Trials

Survival Function for Rats

[Table not preserved in the transcript. Column headers: $\hat P[T = t]$ and $\hat S(t) = \hat P[T \ge t]$.]

Page 16: Survival Analysis for Randomized Clinical Trials

[Figure: Proportion of rats dying on each of 5 days]

[Figure: Survival Curve for Rat Study]

Page 17: Survival Analysis for Randomized Clinical Trials

Confidence Intervals for Survival Probabilities

From above we see that the "cumulative" probability of surviving three days in the rat study is 0.25. We may want to report this probability along with its standard error. This sample proportion of 0.25 is based on the 20 rats that started the study. If we assume that (i) each rat has the same unknown probability of surviving three days, S(3), and (ii) the probability of one rat dying is not influenced by whether or not another rat dies, then we can use results associated with the binomial probability distribution to obtain the variance of this proportion. The variance is given by

Page 18: Survival Analysis for Randomized Clinical Trials

$$\mathrm{Var}[\hat S(3)] = \frac{S(3)\,[1 - S(3)]}{n}, \qquad \widehat{\mathrm{Var}}[\hat S(3)] = \frac{\hat S(3)\,[1 - \hat S(3)]}{20} = \frac{0.25 \times 0.75}{20} = 0.009375 \approx 0.0094$$

$$Z = \frac{\hat S(3) - S(3)}{\sqrt{\widehat{\mathrm{Var}}[\hat S(3)]}} \sim N(0,1)$$

• This can be used to test hypotheses about the theoretical probability of surviving three days as well as to construct confidence intervals.

• For example, the 95% confidence interval for S(3) is given by 0.25 ± 1.96 × 0.0968, or (0.060, 0.440), where 0.0968 = √0.009375 is the standard error. We are 95% confident that the probability of surviving 3 days, meaning THREE OR MORE DAYS, lies between 0.060 and 0.440.
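The arithmetic behind this interval can be checked with a short sketch. It uses the values from the rat study (proportion 0.25, n = 20); the function name and the default z = 1.96 are choices made for this example.

```python
import math

def binomial_ci(p_hat, n, z=1.96):
    """Variance, standard error and normal-approximation CI for a proportion."""
    variance = p_hat * (1.0 - p_hat) / n
    std_err = math.sqrt(variance)
    return variance, std_err, (p_hat - z * std_err, p_hat + z * std_err)

variance, std_err, (lower, upper) = binomial_ci(0.25, 20)
print(round(variance, 6), round(std_err, 4), round(lower, 3), round(upper, 3))
```

With only 20 subjects the normal approximation is rough, so the interval should be read as indicative rather than exact.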

Page 19: Survival Analysis for Randomized Clinical Trials

In general

This approach has many drawbacks

1. Patients are recruited at different time periods

2. Some observations are censored

3. Patients can differ with respect to many covariates

4. Continuous data is discretised

Page 20: Survival Analysis for Randomized Clinical Trials

Kaplan-Meier survival curves

• Also known as product-limit formula

• Accounts for censoring

• Generates the characteristic “stair step” survival curves

• Does not account for confounding or effect modification by other covariates
– Is that a problem?

Page 21: Survival Analysis for Randomized Clinical Trials

In general

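$$\hat S(2) = \hat P[T \ge 1] \times \hat P[T \ge 2 \mid T \ge 1] = \frac{16}{20} \times \frac{9}{16} = \frac{9}{20} = 0.45$$

The same as before! Similarly,

$$\hat S(3) = \hat P[T \ge 1] \times \hat P[T \ge 2 \mid T \ge 1] \times \hat P[T \ge 3 \mid T \ge 2] = 0.80 \times 0.5625 \times 0.5556 = \frac{16}{20} \times \frac{9}{16} \times \frac{5}{9} = \frac{5}{20} = 0.25$$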
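The chain of conditional probabilities can be reproduced with a small sketch. The counts 20, 16, 9 and 5 are the numbers of rats still alive that appear in the fractions above; the function name is an assumption for this example.

```python
def survival_by_chain(survivors):
    """Multiply conditional survival proportions day by day."""
    s = 1.0
    curve = []
    for previous, current in zip(survivors, survivors[1:]):
        s *= current / previous  # proportion surviving the day among those at risk
        curve.append(s)
    return curve

# Rats alive at the start and after days 1, 2 and 3 (from the fractions above).
survivors = [20, 16, 9, 5]
curve = survival_by_chain(survivors)
print([round(s, 4) for s in curve])
```

Without censoring the product telescopes to the simple proportion alive, which is why the result matches the earlier estimate.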

Page 22: Survival Analysis for Randomized Clinical Trials

Censored Observations (Kaplan-Meier)

We proceed as in the case without censoring:

$$S(k) = P_1 \times P_2 \times \dots \times P_k, \qquad P_i = \Pr[T \ge i \mid T \ge i - 1]$$

$P_i$ stands for the proportion of patients who survive day i among those who survive day i-1. Therefore it can be estimated according to

$$\hat P_i = \frac{(\text{Number of patients at risk day } i) - (\text{Total number of events during day } i)}{(\text{Number of patients at risk day } i)}$$

$$\hat P_1 = \frac{(\text{Total number of patients}) - (\text{Total number of events during day } 1)}{(\text{Total number of patients})}$$

Page 23: Survival Analysis for Randomized Clinical Trials

K-M Estimate: General Formula

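• Rank the survival times as t(1) ≤ t(2) ≤ … ≤ t(n).

• Formula:

$$\hat S(t) = \prod_{t_{(i)} \le t} \frac{n_i - d_i}{n_i}$$

• ni patients at risk

• di failures

Example:

$$P_1 = \frac{19}{20} = 0.95, \qquad P_3 = \frac{17}{19} = 0.89474$$

$$S(1) = S(0) \times P_1 = 1 \times 0.95 = 0.95$$

$$S(3) = S(0) \times P_1 \times P_3 = 1 \times \frac{19}{20} \times \frac{17}{19} = 0.95 \times 0.89474 = 0.85$$

PROC LIFETEST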
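A minimal sketch of the product-limit formula follows. The data set is hypothetical: 20 patients with one event at time 1, two events at time 3 and the remaining 17 censored at time 5, chosen only so that the factors 19/20 and 17/19 from the example appear. It is not the output of PROC LIFETEST.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0   # d_i: events at time t
        leaving = 0  # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            s *= (n_at_risk - deaths) / n_at_risk  # factor (n_i - d_i) / n_i
            curve.append((t, s))
        n_at_risk -= leaving
    return curve

# Hypothetical data (assumed): event = 1, censored = 0.
times = [1, 3, 3] + [5] * 17
events = [1, 1, 1] + [0] * 17
curve = kaplan_meier(times, events)
print([(t, round(s, 4)) for t, s in curve])
```

Censored subjects leave the risk set without producing a factor, which is how the estimator uses their partial follow-up.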

Page 24: Survival Analysis for Randomized Clinical Trials

Confidence Intervals

Page 25: Survival Analysis for Randomized Clinical Trials

Using SAS

Page 26: Survival Analysis for Randomized Clinical Trials

Using SAS

Page 27: Survival Analysis for Randomized Clinical Trials
Page 28: Survival Analysis for Randomized Clinical Trials

Comparing Survival Functions

• Question: Did the treatment make a difference in the survival experience of the two groups?

• Hypothesis: H0: S1(t)=S2(t) for all t ≥ 0.

• Two tests often used:

1. Log-rank test (Mantel-Haenszel test);

2. Cox regression

Page 29: Survival Analysis for Randomized Clinical Trials

A numerical Example

Page 30: Survival Analysis for Randomized Clinical Trials
Page 31: Survival Analysis for Randomized Clinical Trials

The Log-rank test

During the t-th interval, let

• Nt be the number of patients at risk in the drug group at the beginning of the interval.

• Mt be the number of patients at risk in the placebo group at the beginning of the interval.

• At be the number of events during the interval in the drug group.

• Ct be the number of events during the interval in the placebo group.

• Tt = Nt + Mt

Page 32: Survival Analysis for Randomized Clinical Trials

The Log-rank test

• The above contingency table is a way of summarising the data at hand. Notice though that the marginals Nt and Mt are fixed.

• In principle the problem can be formulated as a formal test of

• H0: Drug has no effect D(t) = C(t)

against

• H1: Drug is effective D(t) = K(t)

Under H0, random allocation

Page 33: Survival Analysis for Randomized Clinical Trials

This situation is similar to one where we have a total of Tt balls in an urn. These balls are of two different kinds, D and C. Dt balls are drawn at random (those who experience events).

• We denote by At the number of balls of type D among the Dt drawn.

[Figure: urn containing Tt = Nt + Mt balls, from which Dt balls are drawn.]

Page 34: Survival Analysis for Randomized Clinical Trials

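We can thus calculate probabilities of events like {At = a} using the hypergeometric distribution:

$$P(A_t = a) = \frac{\binom{N_t}{a}\binom{M_t}{D_t - a}}{\binom{T_t}{D_t}}$$

The corresponding mean and variance are

$$\mu_t = E(A_t) = \frac{D_t N_t}{T_t}; \qquad \sigma_t^2 = \mathrm{Var}(A_t) = \frac{D_t N_t M_t (T_t - D_t)}{T_t^2\,(T_t - 1)}$$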
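The hypergeometric mean and variance can be checked numerically against the probability mass function. The counts used below (10 drug patients, 12 placebo patients, 4 events) are hypothetical.

```python
from math import comb

def hypergeom_pmf(a, n_drug, n_placebo, n_events):
    """P(A_t = a): a of the D_t events fall in the drug group."""
    total = n_drug + n_placebo
    return comb(n_drug, a) * comb(n_placebo, n_events - a) / comb(total, n_events)

def hypergeom_mean_var(n_drug, n_placebo, n_events):
    """Closed-form mean and variance of A_t."""
    total = n_drug + n_placebo
    mean = n_events * n_drug / total
    var = (n_events * n_drug * n_placebo * (total - n_events)
           / (total ** 2 * (total - 1)))
    return mean, var

# Hypothetical interval: N_t = 10, M_t = 12, D_t = 4.
N, M, D = 10, 12, 4
support = range(max(0, D - M), min(N, D) + 1)
probs = {a: hypergeom_pmf(a, N, M, D) for a in support}
mean_by_sum = sum(a * p for a, p in probs.items())
var_by_sum = sum((a - mean_by_sum) ** 2 * p for a, p in probs.items())
mean, var = hypergeom_mean_var(N, M, D)
print(mean, mean_by_sum, var, var_by_sum)
```

The closed-form values and the values obtained by summing over the distribution agree, which is a quick sanity check on the formulas.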

Page 35: Survival Analysis for Randomized Clinical Trials

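The above can be used to derive the following test statistic where Nt and Mt are supposed to be large, which is often the case in RCT:

$$Z_t = \frac{A_t - \mu_t}{\sigma_t} \sim N(0,1) \quad \text{and} \quad Z_t^2 \sim \chi^2_1,$$

where $\mu_t = E(A_t)$ and $\sigma_t^2 = \mathrm{Var}(A_t)$.

Assume now that we want to combine data from k successive such intervals. We can then define U according to

$$U = \sum_{t=1}^{k} (A_t - \mu_t)$$

And use the statistic

$$Q_{M\text{-}H} = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1$$

Reject H0 for large values of QM-H.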
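A minimal sketch of the log-rank computation, assuming the intervals are independent so that Var(U) is the sum of the per-interval hypergeometric variances. The interval tables are hypothetical, and the p-value uses the identity P(χ²₁ > q) = erfc(√(q/2)).

```python
import math

def log_rank(tables):
    """tables: list of (N_t, M_t, A_t, C_t) per interval. Returns (U, Var(U), Q, p)."""
    u = 0.0
    var_u = 0.0
    for n_drug, n_placebo, a, c in tables:
        total = n_drug + n_placebo
        events = a + c
        if total < 2 or events == 0:
            continue  # interval carries no information
        u += a - events * n_drug / total
        var_u += (events * n_drug * n_placebo * (total - events)
                  / (total ** 2 * (total - 1)))
    q = u * u / var_u
    p_value = math.erfc(math.sqrt(q / 2.0))  # upper tail of chi-square, 1 df
    return u, var_u, q, p_value

# Hypothetical data: (at risk drug, at risk placebo, events drug, events placebo).
tables = [(10, 10, 1, 3), (9, 7, 1, 2), (8, 5, 0, 2), (8, 3, 1, 1)]
u, var_u, q, p_value = log_rank(tables)
print(round(u, 4), round(var_u, 4), round(q, 4), round(p_value, 4))
```

A negative U here means fewer events in the drug group than expected under H0.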

Page 36: Survival Analysis for Randomized Clinical Trials

Some comments

1. The log-rank test is a significance test. It does not say anything about the size of the difference between the two groups.

2. The parameter can be used as measure of the size of that difference. It is also possible to compute confidence intervals for .

Page 37: Survival Analysis for Randomized Clinical Trials

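Further comments

3. Instead of QM-H we can formulate variants with weights

$$Q = \frac{\left[\sum_{t=1}^{k} w_t\,(A_t - \mu_t)\right]^2}{\sum_{t=1}^{k} w_t^2\,\mathrm{Var}(A_t)} \sim \chi^2_1 \quad \text{under } H_0.$$

1. $w_t = 1$: M-H

2. $w_t = T_t$: Gehan 1965

3. $w_t = \sqrt{T_t}$: Tarone and Ware 1977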
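The three weighting schemes can be compared with a short sketch. The interval tables are hypothetical and the weight names are labels chosen for this example.

```python
import math

def weighted_log_rank(tables, weight="mh"):
    """Weighted statistic Q; tables hold (N_t, M_t, A_t, C_t) per interval."""
    numerator = 0.0
    denominator = 0.0
    for n_drug, n_placebo, a, c in tables:
        total = n_drug + n_placebo
        events = a + c
        if total < 2 or events == 0:
            continue
        w = {"mh": 1.0,                            # Mantel-Haenszel
             "gehan": float(total),                # w_t = T_t
             "tarone-ware": math.sqrt(total)}[weight]  # w_t = sqrt(T_t)
        mean = events * n_drug / total
        var = (events * n_drug * n_placebo * (total - events)
               / (total ** 2 * (total - 1)))
        numerator += w * (a - mean)
        denominator += w * w * var
    return numerator ** 2 / denominator

# Hypothetical data (assumed for illustration).
tables = [(10, 10, 1, 3), (9, 7, 1, 2), (8, 5, 0, 2), (8, 3, 1, 1)]
for name in ("mh", "gehan", "tarone-ware"):
    print(name, round(weighted_log_rank(tables, name), 4))
```

Because Tt shrinks over time, the Gehan and Tarone-Ware weights put more emphasis on early intervals than the unweighted version does.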

Page 38: Survival Analysis for Randomized Clinical Trials

A numerical example

Page 39: Survival Analysis for Randomized Clinical Trials

Using SAS

Page 40: Survival Analysis for Randomized Clinical Trials

Limitation of Kaplan-Meier curves

• What happens when you have several covariates that you believe contribute to survival?

• Example
– Smoking, hyperlipidemia, diabetes and hypertension contribute to time to myocardial infarct

• Can use stratified K-M curves, but the combinatorial complexity of more than two or three covariates prevents practical use

• Need another approach: the multivariate Cox proportional hazards model is most commonly used (think multivariate regression or logistic regression)

Page 41: Survival Analysis for Randomized Clinical Trials

II Cox Regression

• Introduction to the proportional hazard model (PH)

• Partial likelihood

• Comparing two groups

• A numerical example

• Comparison with the log-rank test

Page 42: Survival Analysis for Randomized Clinical Trials

The model

$$h_i(t) = h_0(t)\exp(\beta_1 x_{i1} + \dots + \beta_k x_{ik})$$

Understanding “baseline hazard” h0(t)

$$\frac{h_i(t)}{h_j(t)} = \exp[\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})]$$
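The cancellation of the baseline hazard in the ratio can be illustrated with a sketch. The coefficients, covariate vectors and baseline values below are hypothetical.

```python
import math

def hazard(h0_t, beta, x):
    """h_i(t) = h0(t) * exp(beta_1 x_i1 + ... + beta_k x_ik)."""
    return h0_t * math.exp(sum(b * v for b, v in zip(beta, x)))

def hazard_ratio(beta, x_i, x_j):
    """h_i(t) / h_j(t); the baseline hazard does not appear."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_i, x_j)))

# Hypothetical coefficients and covariates (assumed for illustration).
beta = [0.5, -0.2]
x_i = [1.0, 3.0]
x_j = [0.0, 1.0]
ratio = hazard_ratio(beta, x_i, x_j)
for h0_t in (0.01, 0.5, 2.0):  # any baseline value gives the same ratio
    print(h0_t, hazard(h0_t, beta, x_i) / hazard(h0_t, beta, x_j), ratio)
```

The ratio is the same at every baseline value, which is the proportional hazards property in numerical form.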

Page 43: Survival Analysis for Randomized Clinical Trials

Cox Regression Model

• Proportional hazard.

• No specific distributional assumptions (but includes several important parametric models as special cases).

• Partial likelihood estimation (Semi-parametric in nature).

• Easy implementation (SAS procedure PHREG).

• Parametric approaches are an alternative, but they require stronger assumptions about h(t).

Page 44: Survival Analysis for Randomized Clinical Trials

Cox proportional hazards model, continued

• Can handle both continuous and categorical predictor variables (think: logistic, linear regression)

• Without knowing the baseline hazard h0(t), can still calculate coefficients for each covariate, and therefore the hazard ratio

• Assumes multiplicative risk; this is the proportional hazard assumption
– Can be compensated in part with interaction terms

Page 45: Survival Analysis for Randomized Clinical Trials

Cox Regression

• In 1972 Cox suggested a model for survival data that would make it possible to take covariates into account. Up to then it was customary to discretise continuous variables and build subgroups.

• Cox's idea was to model the hazard rate function

$$P(t \le T < t + \Delta t \mid T \ge t) \approx h(t)\,\Delta t$$

where h(t) is to be understood as an intensity, i.e. a probability per time unit. Multiplied by time we get a probability. Think of the analogy with speed as distance per time unit. Multiplied by time we get distance.

Page 46: Survival Analysis for Randomized Clinical Trials

Cox's suggestion is to model h using

$$h_i(t) = h_0(t)\,e^{\beta^T Z_i};$$

$$\beta = (\beta_1, \dots, \beta_k)^T \text{ the parameter vector;}$$

$$Z_i = (Z_{i1}, \dots, Z_{ik}) \text{ the covariate vector.}$$

where each parameter is a measure of the importance of the corresponding variable.

A consequence of this is that two individuals with different covariate values will have hazard rate functions which differ by a multiplicative term which is the same for all values of t.

Page 47: Survival Analysis for Randomized Clinical Trials

The covariates can be discrete, continuous or even categorical. It is also possible to generalise the model to allow time varying covariates.

$$h_i(t) = h_0(t)\,e^{\beta^T x_i}, \quad i = 1, 2;$$

$$\frac{h_1(t)}{h_2(t)} = \frac{h_0(t)\,e^{\beta^T x_1}}{h_0(t)\,e^{\beta^T x_2}} = e^{\beta^T (x_1 - x_2)} = C;$$

$$h_1(t) = C\,h_2(t).$$

Page 48: Survival Analysis for Randomized Clinical Trials

Example

Assume we have a situation with one covariate that takes two different values, 0 and 1. This is the case when we wish to compare two treatments.

$$k = 1; \quad x_1 = 0; \quad x_2 = 1;$$

$$h_1(t) = h_0(t);$$

$$h_2(t) = h_0(t)\,e^{\beta}.$$

[Figure: the hazard functions h1(t) and h2(t) plotted against t.]

Page 49: Survival Analysis for Randomized Clinical Trials

From the definition we see immediately that

$$S_i(t) = e^{-\int_0^t h_i(u)\,du} = e^{-e^{\beta^T Z_i}\int_0^t h_0(u)\,du} = \left[e^{-\int_0^t h_0(u)\,du}\right]^{e^{\beta^T Z_i}} = \left[S_0(t)\right]^{e^{\beta^T Z_i}}$$

Comments

• The baseline can be interpreted as the case corresponding to an individual with covariate values zero.

• The name semi-parametric is due to the fact that we do not model h0 explicitly. Usual likelihood theory does not apply.
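The relation between an individual's survival function and the baseline survival function, S_i(t) = S_0(t) raised to the power exp(βᵀZ_i), can be verified numerically. The baseline hazard h0(u) = 0.1u and the values of β, Z and t below are hypothetical choices for illustration.

```python
import math

def baseline_hazard(u):
    # Hypothetical baseline hazard h0(u) = 0.1 * u (an assumption, not from the slides).
    return 0.1 * u

def survival_from_hazard(hazard, t, steps=1000):
    """S(t) = exp(-integral_0^t h(u) du), with the integral by the midpoint rule."""
    width = t / steps
    cumulative = sum(hazard((k + 0.5) * width) for k in range(steps)) * width
    return math.exp(-cumulative)

beta, z, t = 0.7, 1.0, 3.0  # hypothetical coefficient, covariate and time
s0 = survival_from_hazard(baseline_hazard, t)
s_i = survival_from_hazard(lambda u: baseline_hazard(u) * math.exp(beta * z), t)
print(s_i, s0 ** math.exp(beta * z))
```

Both numbers agree, so a covariate effect that multiplies the hazard acts as a power on the baseline survival curve.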

Page 50: Survival Analysis for Randomized Clinical Trials

Maximum likelihood

Page 51: Survival Analysis for Randomized Clinical Trials
Page 52: Survival Analysis for Randomized Clinical Trials
Page 53: Survival Analysis for Randomized Clinical Trials

Partial likelihood

Page 54: Survival Analysis for Randomized Clinical Trials
Page 55: Survival Analysis for Randomized Clinical Trials
Page 56: Survival Analysis for Randomized Clinical Trials

“Partial” Likelihood estimates

Let L(β) stand for the likelihood function and β for some parameter (vector) of interest. Then the maximum likelihood estimates are found by solving the set of equations

$$\frac{\partial \log L(\beta)}{\partial \beta_i} = 0, \quad i = 1, 2, \dots$$

Define the information matrix I(β) to have elements

$$I_{ij}(\beta) = -E\left[\frac{\partial^2 \log L(\beta)}{\partial \beta_i\,\partial \beta_j}\right].$$

It can be shown that the estimates have an asymptotically normal distribution with means βi and variance/covariance matrix

$$I(\beta)^{-1}$$

Page 57: Survival Analysis for Randomized Clinical Trials

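• We take log Lj = lj

$$\log L(\beta) = \sum_{j=1}^{J} \log L_j = \sum_{j=1}^{J}\left\{\beta^T x_j - \log \sum_{i \in R_j} e^{\beta^T x_i}\right\} = \sum_{j=1}^{J} l_j$$

where Rj is the set of subjects at risk at time tj.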

Page 58: Survival Analysis for Randomized Clinical Trials

Score and information

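$$U_k(\beta):\quad \frac{\partial l_j}{\partial \beta_k} = x_{jk} - \frac{\sum_{i \in R_j} x_{ik}\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}}$$

$$\frac{\partial^2 l_j}{\partial \beta_k^2} = -\left[\frac{\sum_{i \in R_j} x_{ik}^2\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}} - \left(\frac{\sum_{i \in R_j} x_{ik}\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}}\right)^2\right]$$

• The two derivatives can be used to get parameter estimates and information.

• One difficulty we avoided so far is the occurrence of ties.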
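To show how the score and information are used in practice, here is a minimal Newton-Raphson sketch for a single covariate without ties. The data are hypothetical, and the fixed iteration count and starting value β = 0 are simplifying assumptions; production software such as PROC PHREG uses more careful convergence checks.

```python
import math

def score_and_information(beta, times, events, x):
    """Score U(beta) and information I(beta) of the log partial likelihood."""
    u = 0.0
    info = 0.0
    for j, (t_j, event_j) in enumerate(zip(times, events)):
        if not event_j:
            continue  # censored subjects contribute only through risk sets
        risk = [i for i, t_i in enumerate(times) if t_i >= t_j]  # risk set R_j
        s0 = sum(math.exp(beta * x[i]) for i in risk)
        s1 = sum(x[i] * math.exp(beta * x[i]) for i in risk)
        s2 = sum(x[i] ** 2 * math.exp(beta * x[i]) for i in risk)
        u += x[j] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return u, info

def fit(times, events, x, iterations=25):
    """Newton-Raphson: beta <- beta + U(beta) / I(beta)."""
    beta = 0.0
    for _ in range(iterations):
        u, info = score_and_information(beta, times, events, x)
        beta += u / info
    return beta

# Hypothetical data: distinct times, event = 1 / censored = 0, binary covariate.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 0, 1, 1, 0, 1]
x = [1, 0, 1, 1, 0, 0, 1, 0]
beta_hat = fit(times, events, x)
u_hat, info_hat = score_and_information(beta_hat, times, events, x)
print(beta_hat, u_hat, 1.0 / info_hat)  # estimate, score at estimate, variance
```

At the estimate the score is essentially zero, and the inverse of the information gives the estimated variance of the coefficient.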

Page 59: Survival Analysis for Randomized Clinical Trials

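Ties

• dj = the # of failures at tj.

• sj = the sum of the covariate values for patients who die at time tj.

• Notice that some (arbitrary) order must be imposed on the ties.

$$\log L(\beta) = \sum_{j=1}^{J}\left\{\beta^T s_j - d_j \log \sum_{i \in R_j} e^{\beta^T x_i}\right\} = \sum_{j=1}^{J} l_j$$

$$\frac{\partial l_j}{\partial \beta_k} = s_{jk} - d_j\,\frac{\sum_{i \in R_j} x_{ik}\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}}$$

$$\frac{\partial^2 l_j}{\partial \beta_k^2} = -d_j\left[\frac{\sum_{i \in R_j} x_{ik}^2\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}} - \left(\frac{\sum_{i \in R_j} x_{ik}\,e^{\beta^T x_i}}{\sum_{i \in R_j} e^{\beta^T x_i}}\right)^2\right]$$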

Page 60: Survival Analysis for Randomized Clinical Trials

A numerical Example

TIME TO RELIEF OF ITCH SYMPTOMS FOR PATIENTS USING A STANDARD AND EXPERIMENTAL CREAM

Page 61: Survival Analysis for Randomized Clinical Trials

What about a t-test?

The mean difference in the time to “cure” of 1.2 days is not statistically significant between the two groups.

Page 62: Survival Analysis for Randomized Clinical Trials

Using SAS

PROC PHREG;
  MODEL RELIEFTIME*STATUS(0) = DRUG;
RUN;

Page 63: Survival Analysis for Randomized Clinical Trials

Note that the estimate of the DRUG variable is -1.3396 with a p value of 0.0346. The negative sign indicates a negative association between the hazard of being cured and the DRUG variable. But the variable DRUG is coded 1 for the new drug and coded 2 for the standard drug. Therefore the hazard of being cured is lower in the group given the standard drug. This is an awkward but accurate way of saying that the new drug tends to produce a cure more quickly than the standard drug. The mean time to cure is lower in the group given the new drug. There is an inverse relationship between the average time to an event and the hazard of that event.

Page 64: Survival Analysis for Randomized Clinical Trials

The ratio of the hazards is given by

$$\frac{h_2(t)}{h_1(t)} = \frac{e^{2\hat\beta}}{e^{1\hat\beta}} = e^{\hat\beta(2 - 1)} = e^{-1.3396} = 0.262$$

• At each time point the cure rate of the standard drug is about 26% of that of the new drug. Put more positively, we might state that the cure rate is 3.8 times higher in the group given the experimental cream compared to the group given the standard cream.
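The hazard ratio and its reciprocal follow directly from the reported coefficient. The sketch below uses the estimate -1.3396 and the coding DRUG = 1 (new) and DRUG = 2 (standard) stated above; the variable names are chosen for this example.

```python
import math

beta_hat = -1.3396          # PHREG estimate for DRUG reported above
code_new, code_standard = 1, 2

# Hazard of cure, standard cream relative to experimental cream.
hr_standard_vs_new = math.exp(beta_hat * (code_standard - code_new))
# Reciprocal: experimental cream relative to standard cream.
hr_new_vs_standard = 1.0 / hr_standard_vs_new

print(round(hr_standard_vs_new, 3), round(hr_new_vs_standard, 2))
```

Because the coding runs from 1 to 2 rather than 0 to 1, only the difference between the codes matters for the ratio.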

Page 65: Survival Analysis for Randomized Clinical Trials

Comparing two groups in general

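Let

• nAj be the number at risk in group 1 at time tj

• nBj be the number at risk in group 2 at time tj

then

$$U(\beta) = \sum_{j=1}^{J}\left[s_j - d_j\,\frac{n_{Aj}\,e^{\beta}}{n_{Aj}\,e^{\beta} + n_{Bj}}\right];$$

$$I(\beta) = \sum_{j=1}^{J} d_j\left[\frac{n_{Aj}\,e^{\beta}}{n_{Aj}\,e^{\beta} + n_{Bj}} - \left(\frac{n_{Aj}\,e^{\beta}}{n_{Aj}\,e^{\beta} + n_{Bj}}\right)^2\right]$$

The score test

$$Q = \frac{U^2(0)}{I(0)} \sim \chi^2_1$$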

Page 66: Survival Analysis for Randomized Clinical Trials

A numerical example

• Assume that a study aiming at comparing two groups resulted in the following data

• U(0)=-10.2505

• I(0)=6.5957

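$$h_i(t) = h_0(t)\,e^{\beta^T Z_i}, \quad i = 1, 2;$$

$$\frac{h_1(t)}{h_2(t)} = \frac{h_0(t)\,e^{\hat\beta \cdot 1}}{h_0(t)\,e^{\hat\beta \cdot 0}} = e^{\hat\beta};$$

$$\hat\beta = -1.509192; \qquad e^{-1.509192} = 0.2211$$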

The treatment reduces the hazard rate to almost ¼ compared with placebo.
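The score statistic can be computed directly from the two reported quantities. This sketch uses U(0) = -10.2505, I(0) = 6.5957 and the estimate -1.509192 from the example; the p-value relies on the identity P(χ²₁ > q) = erfc(√(q/2)).

```python
import math

u0 = -10.2505          # score at beta = 0, from the example
i0 = 6.5957            # information at beta = 0, from the example
beta_hat = -1.509192   # estimate reported in the example

q = u0 ** 2 / i0                          # score test statistic
p_value = math.erfc(math.sqrt(q / 2.0))   # chi-square with 1 df, upper tail
hazard_ratio = math.exp(beta_hat)

print(round(q, 3), p_value, round(hazard_ratio, 4))
```

The statistic is far in the tail of the chi-square distribution with one degree of freedom, so H0 is rejected at conventional levels.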

Page 67: Survival Analysis for Randomized Clinical Trials

Log-rank vs the score test (PHREG)

• The two methods give very similar answers, why?

• On one hand the score test can be based on the statistic

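$$Q = \frac{U^2(0)}{I(0)} = \frac{\left[\sum_{j=1}^{J}\left(d_{Aj} - d_j\,\dfrac{n_{Aj}}{n_j}\right)\right]^2}{\sum_{j=1}^{J} d_j\,\dfrac{n_{Aj}\,n_{Bj}}{n_j^2}};$$

where

$$U(0) = \sum_{j=1}^{J}\left(d_{Aj} - d_j\,\frac{n_{Aj}}{n_j}\right); \qquad I(0) = \sum_{j=1}^{J} d_j\,\frac{n_{Aj}\,n_{Bj}}{n_j^2};$$

$$d_j = d_{Aj} + d_{Bj} = \#\text{ events in both groups};$$

$$n_j = n_{Aj} + n_{Bj} = \#\text{ at risk in both groups};$$

$$s_j = d_{Aj}.$$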

Page 68: Survival Analysis for Randomized Clinical Trials

            CURE    NO CURE      TOTAL

DRUG        dAj     nAj-dAj      nAj

CONTROL     dBj     nBj-dBj      nBj

TOTAL       dj      nj-dj        nj

Page 69: Survival Analysis for Randomized Clinical Trials

• Compare this to the Log-rank statistic

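$$Q_{M\text{-}H} = \frac{U^2}{\mathrm{Var}(U)} \sim \chi^2_1, \qquad U = \sum_{t=1}^{k} (A_t - \mu_t)$$

$$\mu_t = E(A_t) = \frac{D_t N_t}{T_t}; \qquad \sigma_t^2 = \mathrm{Var}(A_t) = \frac{D_t N_t M_t (T_t - D_t)}{T_t^2\,(T_t - 1)}$$

In the present notation:

$$\mu_{Aj} = E(d_{Aj}) = d_j\,\frac{n_{Aj}}{n_j}; \qquad \sigma_{Aj}^2 = \mathrm{Var}(d_{Aj}) = \frac{d_j\,(n_j - d_j)\,n_{Aj}\,n_{Bj}}{n_j^2\,(n_j - 1)}$$

$$Q_{M\text{-}H} = \frac{\left[\sum_{j=1}^{J}\left(d_{Aj} - d_j\,\dfrac{n_{Aj}}{n_j}\right)\right]^2}{\sum_{j=1}^{J} \dfrac{d_j\,(n_j - d_j)\,n_{Aj}\,n_{Bj}}{n_j^2\,(n_j - 1)}};$$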

Page 70: Survival Analysis for Randomized Clinical Trials

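SCORE TEST

$$Q = \frac{\left[\sum_{j=1}^{J}\left(d_{Aj} - d_j\,\dfrac{n_{Aj}}{n_j}\right)\right]^2}{\sum_{j=1}^{J} d_j\,\dfrac{n_{Aj}\,n_{Bj}}{n_j^2}};$$

LOG-RANK TEST

$$Q_{M\text{-}H} = \frac{\left[\sum_{j=1}^{J}\left(d_{Aj} - d_j\,\dfrac{n_{Aj}}{n_j}\right)\right]^2}{\sum_{j=1}^{J} \dfrac{d_j\,(n_j - d_j)\,n_{Aj}\,n_{Bj}}{n_j^2\,(n_j - 1)}};$$

1. If dj = 1, i.e. there are no ties, the two formulas are identical. In general the two formulas are approximately the same when nj is large.

2. For the example, a case with ties, there is a slight difference:

- Q=15.930

- QM-H=16.793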
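The effect of ties on the two statistics can be seen in a small sketch. Both data sets are hypothetical; each row holds the events and numbers at risk in the two groups at one event time.

```python
def score_and_log_rank(tables):
    """tables: list of (d_A, n_A, d_B, n_B). Returns (Q score test, Q log-rank)."""
    u = 0.0
    info = 0.0   # denominator of the score test, I(0)
    var_u = 0.0  # denominator of the log-rank test, Var(U)
    for d_a, n_a, d_b, n_b in tables:
        d = d_a + d_b
        n = n_a + n_b
        if n < 2:
            continue
        u += d_a - d * n_a / n
        info += d * n_a * n_b / n ** 2
        var_u += d * (n - d) * n_a * n_b / (n ** 2 * (n - 1))
    return u * u / info, u * u / var_u

# Hypothetical data without ties (one event per time point).
no_ties = [(1, 5, 0, 5), (0, 4, 1, 5), (1, 4, 0, 4)]
# Hypothetical data with ties (several events at a time point).
with_ties = [(2, 10, 1, 10), (1, 8, 3, 9)]

print(score_and_log_rank(no_ties))
print(score_and_log_rank(with_ties))
```

Without ties the two statistics coincide; with ties the factor (nj - dj)/(nj - 1) shrinks the log-rank variance, so QM-H comes out slightly larger than Q, as in the example above.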

Page 71: Survival Analysis for Randomized Clinical Trials

Generalizations of Cox regression

1. Time dependent covariates

2. Stratification

3. General link function

4. Likelihood ratio tests

5. Sample size determination

6. Goodness of fit

7. SAS

Page 72: Survival Analysis for Randomized Clinical Trials

References

• Cox & Oakes (1984) “Analysis of survival data”. Chapman & Hall.

• Fleming & Harrington (1991) “Counting processes and survival analysis”. Wiley & Sons.

• Allison (1995). “Survival analysis using the SAS System”. SAS Institute.

• Therneau & Grambsch (2000) “Modeling Survival Data”. Springer.

• Hougaard (2000) “Analysis of Multivariate survival data”. Springer.

Page 73: Survival Analysis for Randomized Clinical Trials

Questions or Comments?