EPI 5344: Survival Analysis in Epidemiology

Cox regression: Introduction

March 17, 2015

Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa

01/2015


Objectives

• Review proportional hazards
• Introduce Cox model and methods of estimation
• Tied data


Exponential model (1R)

• Exponential model
  – Most common parametric model in epidemiology
  – Assumes a constant h(t) = λ
  – How did we create the likelihood function?

• Subjects can have two types of ‘ends’
  – Death
  – Censored

• Each contributes to the likelihood function but in different ways


Exponential Model (2R)

• Likelihood contribution of a death at time ti: $f(t_i) = \lambda e^{-\lambda t_i}$

• Likelihood contribution if censored at time ti: $S(t_i) = e^{-\lambda t_i}$
  – Actual time of ‘failure’ is unknown.
  – Must survive until at least time ti

• Multiply these across all deaths and all censored events to get the full likelihood



Full likelihood: $L(\lambda) = \lambda^{N} e^{-\lambda \cdot PT}$

Where:
N = # events
PT = Person-time of follow-up


Exponential Model (4R)

• How do we find the MLE for λ?
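For the constant-hazard model, the log-likelihood is N ln(λ) − λ·PT, and setting its derivative to zero gives the estimate N / PT (events divided by person-time). The sketch below checks this numerically; the follow-up times and event indicators are made up for illustration and are not from the course data.

```python
import math

def exp_log_likelihood(lam, times, events):
    """Log-likelihood of a constant-hazard (exponential) model.

    Deaths contribute log(lam) - lam * t; censored subjects contribute -lam * t.
    """
    n_events = sum(events)
    person_time = sum(times)
    return n_events * math.log(lam) - lam * person_time

def exp_mle(times, events):
    """Closed-form MLE: number of events divided by total person-time."""
    return sum(events) / sum(times)

# Hypothetical follow-up times (years) and event indicators (1 = death, 0 = censored)
times = [2.0, 3.5, 1.0, 4.0, 5.5]
events = [1, 0, 1, 1, 0]

lam_hat = exp_mle(times, events)
print(f"lambda-hat = {lam_hat:.4f} events per person-year")
```

Evaluating `exp_log_likelihood` slightly above and below `lam_hat` should give smaller values, which is a quick way to confirm the maximum.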



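• What if we want to examine predictors of the outcome?
  – λ is allowed to vary by sex, age, cholesterol, etc.
• Use the same approach but now, instead of ‘λ’, we have the following in the likelihood function: $\lambda_i = e^{\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}}$ (the usual log-linear form, with one λ per subject)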


End of review


Proportional hazard models (1)

• Now, use this approach BUT do not pre-specify form for h(t)

• We start with proportional hazards

• Hazard (h(t)) = rate of change in survival conditional on having survived to that point in time.


Hazard models (2)

• Suppose we want to compare two treatment groups
  – Different survival is expected, so they have different hazards
  – How can we summarize this? With the hazard ratio HR(t): the ratio of the two groups’ hazards at time t.

In general, HR(t) will be different at different follow-up times

[Figure: hazard curves h2(t) and h1(t)]

This can be hard to describe and interpret
• Effect of the treatment varies with length of follow-up

[Figure: hazard curves h2(t) and h1(t)]

• HR could switch from below to above 1.0


Hazard models (3)

• SUPPOSE that HR(t) were constant at all follow-up times.
  – Effect of the treatment is the same at all times

PROPORTIONAL HAZARDS model (PH)

• This does not require that h(t) be constant;
• It can vary in an unconstrained manner.
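To make the last point concrete, here is a small sketch in which the baseline hazard rises and then falls over follow-up while the hazard ratio stays fixed. The baseline shape and the HR of 2.0 are arbitrary choices for illustration, not values from the course.

```python
import math

def baseline_hazard(t):
    """Hypothetical baseline hazard that rises and then falls over follow-up."""
    return 0.05 + 0.10 * t * math.exp(-0.5 * t)

HR = 2.0  # assumed constant hazard ratio for the comparison group

def comparison_hazard(t):
    """Hazard in the comparison group under proportional hazards."""
    return baseline_hazard(t) * HR

follow_up_times = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]
baseline_values = [baseline_hazard(t) for t in follow_up_times]
ratios = [comparison_hazard(t) / baseline_hazard(t) for t in follow_up_times]

for t, h0, ratio in zip(follow_up_times, baseline_values, ratios):
    print(f"t = {t:5.1f}   h0(t) = {h0:.4f}   HR(t) = {ratio:.2f}")
```

The printed hazards change from one time to the next, but the ratio column does not.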


[Figure: hazard curves h2(t) and h1(t)]

[Figure: hazard curves h2(t) and h1(t), with HR]

Cox models (1)

• For most of the rest of this course, we will assume a Proportional hazards model:

h1(t) = h0(t) * HR

• h0(t) is the ‘baseline’ or reference hazard.
  – Contains all of the time variability of the hazard.

• HR is assumed to remain the same (constant) over all follow-up time.


Cox models (2)

• HR can still be affected by predictor variables
  – Race
  – Exposure (low/mid/high)
  – Sex
  – Caloric intake

• For now, we will assume that these are
  – measured at baseline (time ‘0’)
  – remain fixed during follow-up


Cox models (3)

• In general, we have: HR is some function of the predictors x1, ..., xk

• Most common model assumes that ln(HR) is a linear function of the predictors: $\ln(HR) = \beta_1 x_1 + \dots + \beta_k x_k$. This is similar to the model for logistic regression and linear regression.

• NOTE: there is no intercept!
  – This is ‘subsumed’ into the baseline hazard term h0(t)


Cox models (4)

• HR model can be written: $HR = e^{\beta_1 x_1 + \dots + \beta_k x_k}$

• How does this fit into our ‘hazard’ model? Our base model is: $h(t) = h_0(t) \times HR$


Cox models (5)

• This implies: $h(t) = h_0(t) \times e^{\beta_1 x_1 + \dots + \beta_k x_k}$

• But, so what? How do we estimate the Betas?
  – As with the exponential model, it appears we need to know the shape of h0(t)


Cox models (6)

• COX (1972) SHOWED THAT THIS IS WRONG!
  – Can estimate the Betas without needing to model h0(t)
  – Semi-parametric model
  – Based on:
    • Risk sets
    • Partial likelihoods

• We will skip a lot of math
  – Use an intuitive approach
  – Method relates to approach used with exponential model


Cox models (7)

• Start off trying to build a likelihood for the data based on the whole model (with baseline hazard included)

• Concentrate on the times when events happened
  – Similar to the Kaplan-Meier method
    • S(t) only changes when an event happens
    • Can ignore losses between events

• Action happens within Risk Set at the event times.


Cox models (8)

• Action happens within Risk Set at the event times.

• The theory assumes that only one event happens at any point in time
  – This is not the ‘real world’
  – In theory, time is continuous.
    • So no two events happen at the same time
  – We’ll deal with ‘ties’ later on


Cox models (9)

• Consider the risk set at time ‘ti’ when an event happens
  – Each subject in the risk set has a probability of being the one having the event
    • Higher hazard means higher probability

• ‘Likelihood’ contribution from person ‘j’ in the risk set is: Pr(person j has the event at ti | one event occurs in the risk set at ti)


Cox models (10)

• Using the definition of conditional probability, this is: Pr(person j has the event at ti) / Pr(someone in the risk set has the event at ti)

• How do we get the numerator and denominator?

• The hazard is a measure of how likely an event is to occur for a person
  – Higher hazard means an event is more likely


Cox models (11)

• We can get: $\dfrac{h_j(t_i)}{\sum_{k \in \text{risk set}} h_k(t_i)}$


Cox models (12)

Now, because the hazards are proportional, we have: $h_k(t_i) = h_0(t_i) \times HR_k$ for each subject k in the risk set


Cox models (13)

• The likelihood contribution from this event (risk set) can be written: $\dfrac{h_0(t_i) \times HR_j}{\sum_{k \in \text{risk set}} h_0(t_i) \times HR_k}$

Cancel out the h0(t)


Cox models (14)

• The final likelihood contribution from this risk set is: $\dfrac{HR_j}{\sum_{k \in \text{risk set}} HR_k}$

• Which does not depend on h0(t)
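A quick numerical check of the cancellation, as a sketch: the contribution of one risk set is computed once from the hazard ratios alone and again from full hazards h0(t) × HR with two different baseline values. The five hazard ratios and the baseline values are invented for illustration.

```python
def risk_set_contribution(hazard_ratios, event_subject):
    """HR of the subject with the event divided by the sum of HRs in the risk set."""
    return hazard_ratios[event_subject] / sum(hazard_ratios.values())

def contribution_with_baseline(hazard_ratios, event_subject, h0_at_t):
    """Same quantity, computed from the full hazards h0(t) * HR."""
    hazards = {subject: h0_at_t * hr for subject, hr in hazard_ratios.items()}
    return hazards[event_subject] / sum(hazards.values())

# Hypothetical hazard ratios for five subjects in one risk set; subject 3 has the event
hrs = {1: 1.0, 2: 1.5, 3: 2.0, 4: 0.5, 5: 1.0}

without_h0 = risk_set_contribution(hrs, 3)
with_small_h0 = contribution_with_baseline(hrs, 3, 0.01)
with_large_h0 = contribution_with_baseline(hrs, 3, 7.3)

print(without_h0, with_small_h0, with_large_h0)
```

All three values agree (up to floating-point rounding), whatever baseline hazard is plugged in.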


Cox models (15)

• Now, multiply all of the contributions from each risk set (defined when an event occurs)

• Produces a Partial Likelihood
• Estimate the Betas using MLE.


Cox models (16)

• We can ignore the times at which subjects are censored since we are not estimating the actual hazard (censored subjects still count in the risk sets while they are under follow-up)

• Betas depend only on the ranking of events, not on the actual event times
  – Implies that Cox does not give the same estimates as Person-time epidemiology analyses
  – Standard Cox models do not estimate survival, just the relative hazard (the HR)
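The dependence on ranking alone can be checked with a short sketch: the log partial likelihood is computed for a small made-up data set, and again after the times are stretched by an order-preserving transformation. The function assumes one predictor and no tied event times; the data, the value of Beta, and the transformation are all illustrative choices.

```python
import math

def log_partial_likelihood(beta, times, events, x):
    """Cox log partial likelihood for one predictor, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 0:
            continue  # censored subjects add no term of their own
        # Risk set: everyone still under follow-up at t_i
        risk = [k for k, t_k in enumerate(times) if t_k >= t_i]
        total += beta * x[i] - math.log(sum(math.exp(beta * x[k]) for k in risk))
    return total

# Hypothetical data: follow-up time, event indicator (1 = event), binary exposure
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
x = [1, 0, 1, 0, 0]

stretched = [t ** 3 + 10 for t in times]  # changes the times but keeps their order

original = log_partial_likelihood(0.7, times, events, x)
transformed = log_partial_likelihood(0.7, stretched, events, x)
print(original, transformed)
```

Because the risk sets are identical before and after the transformation, the two values match, which is why a person-time analysis (which does use the actual times) can give different answers.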

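[Figure: follow-up timelines for five subjects; three end in an event (D) and two are censored (C); event times t1, t2, t3]

Let’s consider a simple example.
• Three events, so three risk sets to consider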


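For subject ‘m’, the hazard function is: $h_m(t) = h_0(t) \times HR_m$

1st event.
risk set: 1/2/3/4/5
Subject with event: 3
Likelihood contribution: $\dfrac{h_3(t_1)}{h_1(t_1) + h_2(t_1) + h_3(t_1) + h_4(t_1) + h_5(t_1)}$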


But, we have: $h_m(t_1) = h_0(t_1) \times HR_m$

So, likelihood contribution from risk set #1 is: $\dfrac{HR_3}{HR_1 + HR_2 + HR_3 + HR_4 + HR_5}$


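Extending this to the other risk sets:

2nd event.
risk set: 1/2/4
Likelihood contribution: $\dfrac{HR_j}{HR_1 + HR_2 + HR_4}$, where j is the subject with the 2nd event

3rd event.
risk set: 4
Likelihood contribution: $\dfrac{HR_4}{HR_4} = 1$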




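Overall Partial Likelihood is: the product of the three risk-set contributions, $\dfrac{HR_3}{HR_1 + HR_2 + HR_3 + HR_4 + HR_5} \times \dfrac{HR_j}{HR_1 + HR_2 + HR_4} \times \dfrac{HR_4}{HR_4}$ (j = subject with the 2nd event)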

This can easily be extended to very large data sets.

• Writing out the entire partial likelihood function would be ‘crazy’

• But, this is what our computer has to do


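Suppose that we are using the Cox model. Let’s also limit to one predictor. Then, we have: $HR_m = e^{\beta x_m}$

• Partial Likelihood form is now: $PL(\beta) = \prod_{\text{events } i} \dfrac{e^{\beta x_{j(i)}}}{\sum_{k \in \text{risk set}_i} e^{\beta x_k}}$, where j(i) is the subject with the i-th event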

• We will see this layout again
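As a rough sketch of what the computer does with a one-predictor partial likelihood, the code below maximizes it by Newton-Raphson, using the score and information computed risk set by risk set. Newton-Raphson is one common choice and is an assumption here, not necessarily what any particular package uses; the eight-subject data set is invented and has no tied event times.

```python
import math

def risk_sets(times, events):
    """Yield (index of subject with the event, indices at risk) for each event."""
    for j, (t_j, d_j) in enumerate(zip(times, events)):
        if d_j == 1:
            yield j, [k for k, t_k in enumerate(times) if t_k >= t_j]

def log_pl(beta, times, events, x):
    """Log partial likelihood: sum over events of beta*x_j - log(sum of exp(beta*x_k))."""
    return sum(
        beta * x[j] - math.log(sum(math.exp(beta * x[k]) for k in risk))
        for j, risk in risk_sets(times, events)
    )

def score_and_information(beta, times, events, x):
    """First derivative of log_pl and minus its second derivative."""
    score, info = 0.0, 0.0
    for j, risk in risk_sets(times, events):
        weights = [math.exp(beta * x[k]) for k in risk]
        total = sum(weights)
        mean = sum(w * x[k] for w, k in zip(weights, risk)) / total
        mean_sq = sum(w * x[k] ** 2 for w, k in zip(weights, risk)) / total
        score += x[j] - mean          # observed x minus weighted average in risk set
        info += mean_sq - mean ** 2   # weighted variance of x in risk set
    return score, info

def fit_beta(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson iterations starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical data: follow-up time, event indicator (1 = event), binary exposure
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [1, 1, 0, 1, 0, 0, 1, 0]

beta_hat = fit_beta(times, events, x)
print(f"beta-hat = {beta_hat:.3f}, HR = {math.exp(beta_hat):.2f}")
```

The baseline hazard never appears anywhere in the calculation, which is the point of the partial likelihood.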


‘Ties’ (1)

• Above assumed that only one event happened at any given time
  – True ‘in theory’ because time is a continuous variable.
  – Not true in reality because time is measured ‘coarsely’.
    • For example
      – Only get measurement data every year
      – Time of event measured to the day, not hour/min/second


‘Ties’ (2)

• More than one event at the same time is called a ‘tied’ event.

• How do we modify the method to handle tied event times?


‘Ties’ (3)

• Two main approaches to ‘ties’
  – Discrete models
    • Change the basic theory underlying the model
    • Assumes that event times are discrete points
    • Relates to logistic regression
    • Useful when event time can only occur at fixed points
      – graduation from high school
  – Exact method
    • Often implemented using an approximation.


‘Ties’ (4)

• Exact method
  – Suppose we have two events (s1 & s2) which occur at the same time due to imprecise measurement of the event time.
  – IF we had been able to measure the event time with enough precision, we would know if s1 occurred first or second
    • Birth of twins
  – We don’t know, so we assume that the two possibilities are equally likely.


‘Ties’ (5)

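• Suppose s1 occurred before s2.
  – Likelihood contribution would be: $\dfrac{HR_{s1}}{\sum_{k \in \text{risk set}} HR_k} \times \dfrac{HR_{s2}}{\sum_{k \in \text{risk set}} HR_k - HR_{s1}}$

• Suppose s2 occurred before s1.
  – Likelihood contribution would be: $\dfrac{HR_{s2}}{\sum_{k \in \text{risk set}} HR_k} \times \dfrac{HR_{s1}}{\sum_{k \in \text{risk set}} HR_k - HR_{s2}}$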


‘Ties’ (6)

• Don’t know order. Each is equally likely.
• Overall likelihood contribution is: the sum of the contributions from the two possible orders


‘Ties’ (7)

• A bit messy but not too bad.

• However, consider the recidivism data.
  – 5 arrests occurred in week 8
  – We don’t know which order they occurred in
  – 120 potential orders (= 5!)
  – Each order contributes a likelihood product with 5 terms
  – Need to add up 120 of these products to give ONE contribution.

• Can rapidly get even worse!
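The bookkeeping can be sketched directly: enumerate every possible order of the tied events, form the usual product of risk-set terms for that order, and add the products up. The hazard ratios below are invented; the function is a brute-force illustration of the idea, not how statistical packages implement the exact method.

```python
from itertools import permutations
from math import factorial

def exact_tied_contribution(tied_hrs, other_hrs):
    """Sum, over every ordering of the tied events, of the product of risk-set terms."""
    total = 0.0
    for order in permutations(tied_hrs):
        remaining = sum(tied_hrs) + sum(other_hrs)  # sum of HRs still at risk
        product = 1.0
        for hr in order:
            product *= hr / remaining
            remaining -= hr  # this subject leaves the risk set before the next event
        total += product
    return total

# Two tied events (hypothetical HRs 2.0 and 1.0) with three other subjects at risk
two_ties = exact_tied_contribution([2.0, 1.0], [1.0, 1.0, 1.0])

# Number of orderings that must be summed when five events are tied
n_orders_for_five = factorial(5)

print(two_ties, n_orders_for_five)
```

With d tied events the loop runs d! times, so the work grows very quickly as the number of ties at one time point increases.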


‘Ties’ (8)

• Computationally demanding
  – Not that big a task for modern computers

• Two approximate methods have been developed
  – Breslow
  – Efron

• Both are ‘OK’ as long as number of ties is not too big
  – Efron is better.

• With modern computers, using the exact approach is likely fine.
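For one tied risk set, the two approximations can be compared with the brute-force sum over orderings. The sketch below uses the Breslow and Efron formulas as they are usually stated: Breslow keeps the full risk-set sum in every denominator, while Efron removes a growing fraction of the tied subjects' total. To put all three on the same scale, the exact sum is divided by d! here (a constant that does not affect the estimates of the Betas). The hazard ratios are invented for illustration.

```python
from itertools import permutations
from math import factorial, prod

def exact_average(tied_hrs, other_hrs):
    """Sum over all orderings of the tied events, divided by the number of orderings."""
    total = 0.0
    for order in permutations(tied_hrs):
        remaining = sum(tied_hrs) + sum(other_hrs)
        term = 1.0
        for hr in order:
            term *= hr / remaining
            remaining -= hr
        total += term
    return total / factorial(len(tied_hrs))

def breslow(tied_hrs, other_hrs):
    """Breslow: every tied event uses the full risk-set sum as its denominator."""
    risk_total = sum(tied_hrs) + sum(other_hrs)
    return prod(tied_hrs) / risk_total ** len(tied_hrs)

def efron(tied_hrs, other_hrs):
    """Efron: the m-th denominator drops m/d of the tied subjects' total HR."""
    d = len(tied_hrs)
    tied_total = sum(tied_hrs)
    risk_total = tied_total + sum(other_hrs)
    denominator = prod(risk_total - (m / d) * tied_total for m in range(d))
    return prod(tied_hrs) / denominator

# Three tied events (hypothetical HRs) with five other subjects at risk
tied = [2.0, 1.0, 1.5]
others = [1.0] * 5

exact_value = exact_average(tied, others)
breslow_value = breslow(tied, others)
efron_value = efron(tied, others)
print(exact_value, breslow_value, efron_value)
```

In this example the Efron value sits much closer to the exact one than the Breslow value does, and with a single event (no ties) all three functions return the same number.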


‘Ties’ (9): Summary

Situation          Comment
No ties            All methods give the same results
A few ties (<2%)   All methods give similar results
Many ties          Approximations are all biased towards ‘0’. Prefer Efron to Breslow. Exact methods are best but be careful about computational demands.

SAS default method is Breslow
