Event History Analysis 7 Sociology 8811 Lecture 21 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission

Dec 20, 2015


Event History Analysis 7

Sociology 8811 Lecture 21

Copyright © 2007 by Evan Schofer. Do not copy or distribute without permission.


Announcements

• Paper Assignment #2 Due April 26

• Try to find a dataset soon

• Class topic:

• Parametric EHA models; diagnostics

• Later (if time allows): AFT models, discrete time models


Parametric Proportional Hazard Models

• Cox models do not specify a functional form for the hazard curve, h(t)

• Rather, they examine effects of variables net of a baseline hazard trend (to be inferred from the data)

• $h(t) = h_0(t)e^{\beta X} = h_0(t)\exp(\beta X)$

• Parametric models specify the general shape of the hazard curve

• Approach is more familiar – more like regression

– We can model Y as a constant, a linear function, a logit function, a count function (Poisson), etc.

• For instance, we could assume h(t) is a linear function of time

– Then solve for the value of the hazard slope that best fits the data (plus effects of other covariates on hazard rate).


Parametric Proportional Hazard Models

• Parametric models work best when you choose a curve that fits the data

• Just like OLS regression – which works best when the relationship between two variables is roughly linear

• If the actual relationship between two variables is non-linear, coefficient estimates may be incorrect

– Though sometimes one can transform variables (e.g., logging them) to get a good fit…

– Parametric models are more efficient than Cox models

• They can generate more precise estimates for a given sample size

• But, they can also be more wildly incorrect if you mis-specify h(t)!

– Note: These are proportional hazard models – like Cox!

• You must still check the proportional hazard assumption.


Exponential (Constant Rate) Model

• Exponential models are simplest: $h(t) = e^{a + b_1X_1 + b_2X_2 + \dots + b_nX_n} = e^{a + \beta X}$

• Note that there is no “t” in the equation… no coefficient that specifies time dependence of the hazard rate

– Rather, there are just exponentiated BXs

– PLUS: a, the constant

• Note 2: Box-Steffensmeier & Jones: $h(t) = e^{-(\beta X)}$

• An exponential model solves for the constant value (a) that best fits the data…

• Along with values of Bs, which reflect effects of X vars

• In effect, the model assumes a constant hazard rate.


Exponential (Constant Rate) Model

• Another way of looking at it: An exponential model is a lot like a Cox model

• But, with the assumption that the baseline hazard is a constant!

Cox: $h(t) = h_0(t)e^{\beta X}$

Exponential: $h(t) = e^{a + \beta X} = e^{a}e^{\beta X}$


Exponential (Constant Rate) Model

• Basic Model. Constant reflects base rate

. streg gdp degradation education democracy ngo ingo, dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects      =         92               Number of obs   =       1938
No. of failures      =         77
Time at risk         =       1938               Wald chi2(6)    =      94.29
Log pseudolikelihood =  282.11796               Prob > chi2     =     0.0000

------------------------------------------------------------------------------
             |              Robust
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   -.044568   .1842564    -0.24   0.809    -.4057039    .3165679
 degradation |  -.4766958   .1044108    -4.57   0.000    -.6813372   -.2720543
   education |   .0377531   .0130314     2.90   0.004     .0122121    .0632942
   democracy |   .2295392   .0959669     2.39   0.017     .0414475     .417631
         ngo |   .4258148   .1576803     2.70   0.007     .1167671    .7348624
        ingo |   .3114173    .365112     0.85   0.394    -.4041891    1.027024
       _cons |  -4.565513   1.864396    -2.45   0.014    -8.219663   -.9113642
------------------------------------------------------------------------------

Constant shows base hazard rate estimated from data:

exp(-4.57) = .01


Exponential (Constant Rate) Model

• Suppose we plotted the baseline hazard rate estimated from our exponential model

• It would be a flat line: h(t) = .01

– This is the estimated hazard if all X vars are zero

• If we plotted the estimated hazard for some values of X (ex: democracy = 10), we would get a higher value

– Since democracy has a positive effect, Democ = 10 would yield a higher hazard than democ = 0

– But, again, the estimated hazard rate trend would be a flat line over time…
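To make the arithmetic concrete, here is a minimal Python sketch (not part of the original lecture) that recomputes the flat baseline hazard and the democracy contrast from the exponential output. The constant and the democracy coefficient are copied from that output; the function name is invented for illustration, and all other X variables are assumed to be zero.

```python
import math

# Coefficients copied from the exponential streg output
# (log relative-hazard form, i.e. not exponentiated).
CONS = -4.565513        # _cons: log of the baseline hazard
B_DEMOCRACY = 0.2295392

def exponential_hazard(democracy):
    """Hazard with democracy set as given and all other X variables at zero."""
    return math.exp(CONS + B_DEMOCRACY * democracy)

baseline = exponential_hazard(0)
# The ratio does not depend on time: it equals exp(10 * B_DEMOCRACY).
hazard_ratio = exponential_hazard(10) / baseline

print(f"baseline hazard: {baseline:.4f}")
print(f"hazard ratio, democracy 10 vs 0: {hazard_ratio:.2f}")
```

Because there is no t anywhere in the function, both hazards are flat lines over time; only their levels differ.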


Exponential Model: Baseline Hazard

• Ex: stcurve, hazard

[Figure: "Exponential regression". Hazard function plotted against analysis time, 1970 to 2000.]

See, the estimated baseline hazard really is flat!


Exponential Model: Estimated Hazard

• stcurve, hazard at1(democ=1) at2(democ=10)

[Figure: "Exponential regression". Hazard function (y-axis, .05 to .3) plotted against analysis time, 1970 to 2000, with one line for democracy=1 and one for democracy=10.]

Here are estimated hazards for 2 groups

Other vars pegged at mean


Exponential Model: Baseline Hazard

• Issue: Actual hazard is rising. A problem?

[Figure: "Smoothed hazard estimate". Hazard (y-axis, 0 to .1) plotted against analysis time, 1970 to 2000.]

Is an exponential model appropriate?

Answer: It can be, IF we have X variables that account for increasing hazard

If not, fit will be poor!


Exponential (Constant Rate) Model

• Cleves et al. 2004, p. 216:

• In the exponential model, h(t) being constant means that the failure rate is independent of time, and thus the failure process is said to lack memory.

• You may be tempted to view exponential regression as suitable for use only in the simplest of cases. This would be unfair. There is another sense in which the exponential model is the basis for all other models.

• The baseline hazard… is constant … the way in which the overall hazard varies is purely a function of X. The overall hazard need not be constant with time; it is just that every bit of how the hazard varies must be specified in BX. If you fully understand a process, you should be able to do that.

• When you do not understand a process, you are forced to assign a role to time, and in that way, you hope, put to the side your ignorance and still describe the part of the process that you do understand.

• In addition, exponential models can be used to model the overall hazard as a function of time, if they include t or functions of t as covariates.


Exponential (Constant Rate) Model

• The exponential model is extremely flexible…

• You specify substantive covariates (X variables) to explain failures

– A rising or falling hazard is probably not due to some inherent feature of time, but rather due to some variable that you hope to control for

– If you do a great job, you will fully explain why hazard rate appears to go up (or down) over time

• And, you can include functions of time as independent variables to address temporal variation

– Independent (X) variables can include time dummies, log time, linear time, time interactions, etc.

– That is, if you can’t explain time variation with substantive X variables, you can add time variables to model it

• But, if you mis-specify your model, results will be biased

– In that case, you might be better off with a Cox model…


Piecewise Exponential Model

• If you have a lot of cases, you can estimate a piecewise model

– Essentially a separate model for different chunks of time

• Model will yield different coefficients and base rate (constant) for multiple chunks of time

• Even if hazard is not constant over time, it may be more or less constant in each period

– This allows you to effectively model any hazard trend

– A related approach: Put in time-period dummies

• This gives a single set of bX coefficient estimates

• But, allows you to specify changes in the hazard rate over different periods

– NOTE: Don’t forget to omit one of the time dummies!
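The time-dummy version of the piecewise idea can be sketched in a few lines of Python. This is only an illustration: the cutpoints and the log base rates below are made-up values, not estimates from the lecture data, and the function name is hypothetical.

```python
import bisect
import math

# Hypothetical period boundaries and log base rates (illustrative values only).
CUTPOINTS = [1980, 1990]              # periods: before 1980, 1980-1989, 1990 on
LOG_BASE_RATES = [-5.0, -4.0, -3.0]   # one constant per period

def piecewise_hazard(t, xb=0.0):
    """Hazard at time t: constant within a period, shifted by covariates xb."""
    period = bisect.bisect_right(CUTPOINTS, t)
    return math.exp(LOG_BASE_RATES[period] + xb)

for year in (1975, 1985, 1995):
    print(year, round(piecewise_hazard(year), 4))
```

The hazard is a step function: flat inside each period, with a single covariate effect (xb) applied in every period, which is what the time-dummy specification implies.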


Parametric Models

• Let’s try a more complex parametric model

• Example: Let’s specify a linear time trend

Linear: $h(t) = (a + \beta_0 t)e^{\beta X}$

Exponential: $h(t) = e^{a + \beta X} = e^{a}e^{\beta X}$

• In this case, we estimate a constant (a) and slope ($\beta_0$) which best summarize the time dependence of the hazard rate

• Note: this isn’t common – we have better options…


Gompertz Models

• Another option: an exponentiated line

• Rather than a linear function of time and exponentiated function of X, we’ll exponentiate everything:

Exponentiated Linear (Gompertz): $h(t) = e^{(a + \beta_0 t)}e^{\beta X} = e^{(a + \gamma t)}e^{\beta X}$

• Slope coefficient is often represented by gamma: $\gamma$

• Note: Exponentiation alters the line… it isn’t a simple linear function anymore.

– It is flat if gamma = 0

– It is monotonically increasing if gamma > 0

– It is monotonically decreasing if gamma < 0


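Gompertz Models

• Exponentiating a linear function generates a curve defined by the value of gamma ($\gamma$)

• Model estimates value of $\gamma$ that best fits the data

[Figure: Gompertz hazard curves labeled $\gamma = 0$, $\gamma < 0$, $\gamma > 0$, and $\gamma \gg 0$.]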


Gompertz Model

• Example: streg gdp degradation education democracy ngo ingo, robust nohr dist(gompertz)

Gompertz regression -- log relative-hazard form

No. of subjects      =         92               Number of obs   =       1938
No. of failures      =         77
Time at risk         =       1938               Wald chi2(6)    =      46.48
Log pseudolikelihood =  307.64758               Prob > chi2     =     0.0000

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4633559   .2104244     2.20   0.028     .0509316    .8757802
 degradation |  -.4394712   .1434178    -3.06   0.002     -.720565   -.1583775
   education |   .0026837   .0145341     0.18   0.854    -.0258026      .03117
   democracy |   .2890106    .092612     3.12   0.002     .1074943    .4705268
         ngo |   .2522894   .1658275     1.52   0.128    -.0727265    .5773054
        ingo |   .0037688   .2275176     0.02   0.987    -.4421575    .4496952
       _cons |   -253.035   45.28363    -5.59   0.000    -341.7892   -164.2807
-------------+----------------------------------------------------------------
       gamma |    .124117   .0224506     5.53   0.000     .0801146    .1681195
------------------------------------------------------------------------------

Model estimates gamma to be positive, significant. Implies increasing baseline hazard
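As a rough check on what a gamma of about .12 means, the sketch below evaluates the Gompertz baseline hazard exp(a + gamma*t) using the constant and gamma copied from the output. It assumes analysis time is measured in calendar years (the plots run from 1970 to 2000); that assumption, and the function name, are not stated in the slides.

```python
import math

# Values copied from the Gompertz streg output.
CONS = -253.035
GAMMA = 0.124117

def gompertz_baseline(t):
    """Baseline hazard exp(a + gamma * t), with all X variables at zero."""
    return math.exp(CONS + GAMMA * t)

# The ratio of baseline hazards one time unit apart is exp(gamma) for any t.
yearly_growth = gompertz_baseline(1991) / gompertz_baseline(1990)
# Ratio across the whole plotted window, 1970 to 2000.
window_growth = gompertz_baseline(2000) / gompertz_baseline(1970)

print(f"growth per time unit: {yearly_growth:.3f}")
print(f"growth 1970 to 2000:  {window_growth:.1f}")
```

Under that assumption, the large negative constant is simply the log hazard extrapolated back to year zero, which is why it looks so extreme next to the exponential model's constant.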


Gompertz Model: Estimated Hazard

• stcurve, hazard at1(democ=1) at2(democ=10)

[Figure: "Gompertz regression". Hazard function (y-axis, 0 to 4) plotted against analysis time, 1970 to 2000, with one curve for democracy=1 and one for democracy=10.]

Estimated hazards for 2 groups

Other vars pegged at mean

Note: curves are actually proportional – hard to see because bottom curve is nearly zero…


Weibull Models

• Another option: the Weibull curve

• Another curve that can fit monotonic hazards

• Model estimates p to best fit the data

– Hazard is flat if p = 1

– Hazard is monotonically increasing if p > 1

– Hazard is monotonically decreasing if p < 1.

Weibull: $h(t) = p\,t^{p-1}e^{a + \beta X}$
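The three cases for p can be verified numerically. The following Python sketch evaluates the Weibull hazard p * t^(p-1) * exp(a + βX) at a few arbitrary time points; the function name and the chosen values of t and p are illustrative only, with a and βX set to zero.

```python
import math

def weibull_hazard(t, p, a=0.0, xb=0.0):
    """Weibull hazard p * t**(p - 1) * exp(a + xb); t must be positive."""
    return p * t ** (p - 1) * math.exp(a + xb)

times = [0.5, 1.0, 2.0, 4.0]
for p in (0.5, 1.0, 2.0):
    # p < 1 falls over time, p = 1 is flat, p > 1 rises.
    print(p, [round(weibull_hazard(t, p), 3) for t in times])
```

With p = 1 the time term drops out entirely, which is the sense in which the Weibull model contains the exponential model as a special case.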


Weibull: Visually

• The Weibull family: Monotonic increasing or decreasing, depending on p

[Figure: Hazard rate plotted against time for Weibull curves with p = .5, p = 1, p = 2, and p = 4.]


Weibull Model

• Example: streg gdp degradation education democracy ngo ingo, robust nohr dist(weibull)

Weibull regression -- log relative-hazard form

No. of subjects =         92                    Number of obs   =       1938
No. of failures =         77
Time at risk    =       1938                    LR chi2(6)      =      23.71
Log likelihood  =   307.6045                    Prob > chi2     =     0.0006

------------------------------------------------------------------------------
          _t |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4631871   .2360589     1.96   0.050     .0005202    .9258541
 degradation |  -.4396978   .1486662    -2.96   0.003    -.7310781   -.1483175
   education |   .0027319   .0141652     0.19   0.847    -.0250314    .0304953
   democracy |    .288927   .0913855     3.16   0.002     .1098147    .4680394
         ngo |   .2522595   .1610192     1.57   0.117    -.0633324    .5678514
        ingo |    .004058   .1835743     0.02   0.982     -.355741     .363857
       _cons |  -1884.071   280.0398    -6.73   0.000    -2432.939   -1335.203
-------------+----------------------------------------------------------------
       /ln_p |   5.511481   .1486542    37.08   0.000     5.220124    5.802837
-------------+----------------------------------------------------------------
           p |   247.5173   36.79449                      184.9571    331.2381
         1/p |   .0040401   .0006006                       .003019    .0054067
------------------------------------------------------------------------------


Ancillary Parameters

• Gompertz & Weibull models have parameters that determine the shape of the curve

• Gamma ($\gamma$), p

• Ex: Bigger $\gamma$ = greater increase of h(t) over time

– You can actually specify covariate effects on those parameters

• Effectively allowing a different curve shape across values of X variables

• Ex: If you think that hazard increases more for men than women, you can look to see if Dmale affects $\gamma$

– streg male educ, dist(gompertz) ancillary(male)

– Model estimates effect of male on hazard AND on gamma…


Parametric: Model Fit

• Parametric models use maximum likelihood estimation (MLE)

• Comparisons among nested models can be made using a likelihood ratio test (LR test)

• Just like logit: Addition of groups of variables can be tested with lrtest

– Some parametric models are themselves nested

• Ex: A Weibull model simplifies to an exponential model if p = 1

– Thus, exponential is nested within Weibull

• LR tests can be used to see if Weibull is preferable to exponential.
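The arithmetic of such a test can be sketched in Python using the log likelihoods reported in the exponential and Weibull outputs. Treat this as an illustration of the calculation only: the exponential model was fit with robust standard errors (a pseudolikelihood), and LR tests are not strictly valid in that case, so a real comparison would refit both models without the robust option.

```python
import math

# Log (pseudo)likelihoods copied from the streg outputs.
LL_EXPONENTIAL = 282.11796
LL_WEIBULL = 307.6045

# LR statistic: twice the gain in log likelihood from freeing p.
lr_stat = 2 * (LL_WEIBULL - LL_EXPONENTIAL)

# Upper-tail chi-square probability; this closed form holds for 1 df only.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"LR chi2(1) = {lr_stat:.2f}, p = {p_value:.2g}")
```

The test has one degree of freedom because the Weibull model adds exactly one parameter (p) to the exponential model.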



Parametric Model Fit: AIC

• Non-nested parametric models can be compared via the Akaike Information Criterion

$AIC = -2\ln(L) + 2(k + c)$

• k = # independent variables in the model

• c = # shape parameters in model (ex: p in Weibull)

– Exponential has one parameter (a); Weibull has 2.

• AIC compares likelihoods, but corrects for parameters in the model – rewarding simpler models…

• Low values = better model fit

– Even for negative values… -100 is better than -50.
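Applying the formula to the three models fit earlier gives a quick comparison. The sketch below plugs in the log likelihoods from the exponential, Gompertz, and Weibull outputs, with k = 6 covariates in each and c = 1 for the exponential and c = 2 for the other two; treating Gompertz as having c = 2 (constant plus gamma) is an assumption by analogy with Weibull.

```python
def aic(log_likelihood, k, c):
    """AIC = -2 ln(L) + 2 (k + c)."""
    return -2 * log_likelihood + 2 * (k + c)

# (log likelihood, k covariates, c distributional parameters) per model.
models = {
    "exponential": (282.11796, 6, 1),
    "gompertz": (307.64758, 6, 2),
    "weibull": (307.6045, 6, 2),
}
scores = {name: aic(*spec) for name, spec in models.items()}
best = min(scores, key=scores.get)  # lowest AIC wins, even when negative

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {score:.2f}")
```

Gompertz and Weibull come out nearly tied, and both score well below the exponential model, which is consistent with the rising hazard seen in the smoothed estimate.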


Frailty

• Two kinds of models:

– Shared Frailty – a “random effects” model

• Useful for clustered data (non-independent cases)

• Can be used with Cox & parametric models

• We’ll discuss this in detail in coming weeks

– Unshared Frailty

• Models for “unobserved heterogeneity”

• Only available for parametric models

• Refers to individual-specific (unknown) characteristics that affect likelihood of failure.


Unobserved Heterogeneity

• Unobserved heterogeneity = differences among cases in risk set that affect failure

• Think of it as “omitted variable bias”

• Example: Effect of drug on mortality

• Question: What if half of the patients are smokers but you didn’t know that?

• An “unobserved” attribute that makes them different

• Answer: The smokers and non-smokers might have very different hazard rates…

– But, you wouldn’t know to control for this…


Unobserved Heterogeneity

• Visually:

[Figure: Hazard rate plotted against time (months, 0 to 70), with curves for Smokers, Non-Smokers, and the Observed h(t).]

Smokers die early… exhausting the sample. Then h(t) drops off

The observed hazard rate is modeled w/o controlling for the cause of the drop off…


Unobserved Heterogeneity

• Result of unobserved heterogeneity:

• 1. Bias in the effects of covariates

• Due to “uncontrolled antecedents” (Yamaguchi 1991)

• 2. Problems estimating duration effects

• Because some leave the risk set early, resulting in a “depressed” rate later on

• Evidence of decline in hazard rate may be misleading.
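The "depressed" later rate can be reproduced with a small calculation. In the Python sketch below, two groups each have a perfectly constant hazard, yet the hazard observed in the pooled risk set declines because the high-risk group leaves early. The group share and the two rates are made-up illustrative values, and the function name is hypothetical.

```python
import math

def observed_hazard(t, share_a=0.5, rate_a=0.10, rate_b=0.02):
    """Population hazard at time t for a mix of two constant-hazard groups.

    share_a, rate_a, rate_b are made-up illustrative values
    (group a = high-risk cases, e.g. smokers).
    """
    surv_a = share_a * math.exp(-rate_a * t)        # high-risk survivors
    surv_b = (1 - share_a) * math.exp(-rate_b * t)  # low-risk survivors
    # Hazard among survivors = survivor-weighted average of the group hazards.
    return (rate_a * surv_a + rate_b * surv_b) / (surv_a + surv_b)

for month in (0, 12, 36, 60):
    print(month, round(observed_hazard(month), 4))
```

Neither group's risk changes with time here, so any apparent duration dependence in the pooled hazard is purely a composition effect.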


Unobserved Heterogeneity

• Strategies:

– 1. Develop fully-specified models

• The best solution

– 2. Specify the form of the heterogeneity (frailty)

• Approach: assume unobserved alpha ($\alpha$) – case-specific factor that makes events more (or less) likely

Frailty Model: $h(t \mid \alpha) = \alpha\,h(t)$

• Where h(t) is some familiar model (ex: Weibull)

• Requires functional form assumptions to estimate

– Ex: Assume $\alpha$ is gamma (or inv gaussian) distributed…


PH Assumption & Outliers

• Models discussed today are proportional hazard models…

• Require the same assumption as Cox models

• But, most of the “tests” of proportionality are only available in Cox models

• But: You can still use piecewise models and interaction terms to check the assumption

• Cumulative Cox-Snell residuals can be used to identify outliers

• Use “predict”: predict ccs, ccsnell

• Then, plot residuals by case ID, time, etc.


Parametric Models: Outliers

• Cumulative Cox-Snell residuals vs case ID

[Figure: Scatterplot of cumulative Cox-Snell residuals (y-axis, 0 to 3) against caseid (x-axis, 0 to 3000), with each point labeled by country.]

Note that Scandinavia has highest residuals

Probably not outliers, but interesting nevertheless


Accelerated Failure Time Models

• An alternative approach: model log time

• Using parametric approach like exponential or Weibull

• Focus is time rather than hazard rate

– But, models are similar to hazard rate models – just in a different “metric”

$\ln(t) = \beta X + e$

• Where last term “e” is assumed to have a distribution that defines the model (e.g., making it Weibull)

– AFT models aren’t very common in sociology

• But, don’t be intimidated by them… they are similar to parametric proportional hazard models…

– But some software presents coefficient signs that are opposite!
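For the Weibull case, the sign flip can be written out explicitly. The block below states the standard correspondence between the two metrics; the subscripts PH and AFT are labels introduced here for illustration and do not appear in the slides.

```latex
\[
h(t) = p\,t^{p-1}e^{a + \beta_{PH}X}
\quad\Longleftrightarrow\quad
\ln(t) = \beta_{AFT}X + e,
\qquad
\beta_{AFT} = -\frac{\beta_{PH}}{p}
\]
```

A covariate that raises the hazard shortens expected time to the event, so the two coefficients carry opposite signs. With p = 1 (the exponential model) they differ only in sign.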


Discrete Time EHA Models

• Another completely different approach to EHA

– Described in Yamaguchi reading

• Break time into discrete chunks (ex: months, years)

• Model dichotomous outcome (event vs. non-event) for all chunks of time

• Allows use of simple model, like logit

– Other common discrete time models: Probit, complementary log log models (“cloglog”)

– Data structure is similar to what we did for time-varying covariates, but…

• All records must cover the same length of time

– Logit models don’t weight cases based on start/end time

– Instead, time in analysis is represented simply by the number of cases.
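The data structure described here can be sketched as follows. This Python example expands each spell into one record per year with a 0/1 outcome, ready for a logit model; the function name, field names, and the two example cases are hypothetical.

```python
def person_period(case_id, start, end, event):
    """Split one spell into yearly records with a 0/1 outcome.

    The event indicator is 1 only in the final record, and only if the
    spell ended in an event rather than in censoring.
    """
    rows = []
    for year in range(start, end + 1):
        rows.append({
            "id": case_id,
            "year": year,
            "event": int(event and year == end),
        })
    return rows

# Hypothetical cases: A fails in 1973, B is censored after 1972.
records = person_period("A", 1970, 1973, True) + person_period("B", 1970, 1972, False)
for row in records:
    print(row)
```

Every record covers exactly one year, so a case observed longer simply contributes more rows, which is how time at risk enters the analysis.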


Choosing a Hazard Model

• A Cox model is a good starting point

• Fewer problems due to accidental mis-specification of the time-dependence of the hazard rate

• Box-Steffensmeier & Jones point to cites: Cox models are 95% as efficient as parametric models under many circumstances

– Cox models treat time dependence as a “nuisance”, put the focus on substantive covariates

• Which is often desirable.


Choosing a Hazard Model

• Parametric models are good when

• 1. You have strong theoretical expectations about the hazard rate

• 2. You are confident that you can fit the time dependence well with a parametric model

• 3. You need the most efficient estimates possible

• AGAIN: Substantive model specification is typically more important

• Biases due to omitted variables are often greater than biases due to poor model choice (e.g., Cox vs. Weibull)

• Also: In small samples, outliers are likely to be more important.