Event History Analysis 6 Sociology 8811 Lecture 20 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission

Dec 20, 2015

Announcements

• Paper Assignment #2 handed out previously
  – Due April 26
• Class topic:
  – More details on Cox model diagnostics
  – Next class: parametric models, AFT models (if time)…
  – Then: new topics!

Review: Cox Model Basics

• Choosing how to deal with ties
  – Several algorithms available
  – Efron method is better than Breslow (default)
  – Exact marginal is even better, but time consuming
• The baseline hazard rate
  – Can be computed in Stata…
  – Plus, you can estimate the hazard rates at particular values of variables
• The proportional hazard assumption
  – We’ll continue discussing several methods…

Proportional Hazard Assumption

• Key assumption: Proportional hazards
  – Hazards of different groups are proportional: the hazard ratio is constant over time
  – i.e., Estimates of a hazard ratio do NOT vary over time
• Example: Effect of “abstinence” program on sexual behavior
  – Issue: Do abstinence programs lower the rate in a consistent manner across time?
  – Or, perhaps the rate is lower initially… but then the rate jumps back up (maybe even exceeds the control group).
• Groups are assumed to have “parallel” hazards
  – Rather than rates that diverge, converge (or cross).
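The assumption can be written out with the standard Cox model. The symbols below ($h_0$ for the baseline hazard, $\beta$ for the coefficient, $x$ for a 0/1 group indicator) are notation introduced here for illustration; they do not appear on the slides.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta x)
\qquad\Longrightarrow\qquad
\frac{h(t \mid x = 1)}{h(t \mid x = 0)}
  = \frac{h_0(t)\,e^{\beta}}{h_0(t)}
  = e^{\beta}
```

The baseline hazard cancels, so the ratio does not depend on $t$. On a log scale the two hazards differ by the constant $\beta$, which is the sense in which they are "parallel".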

Proportional Hazard Assumption

• Strategies:

• 1. Visually examine raw hazard plots for sub-groups in your data

  – Watch for non-parallel trends
  – A simple, crude method… but often identifies big violations

Proportional Hazard Assumption

• Visual examination of raw hazard rate

[Figure: Smoothed hazard estimates, by west (lines for west = 0 and west = 1); x-axis: analysis time, 1970 to 2000]

Parallel trends in hazard rate indicate proportionality

Proportional Hazard Assumption

• 2. Plot -ln(-ln(survival probability)) versus ln(time) across values of X variables
  – What Stata calls “stphplot”
  – Parallel lines indicate proportional hazards
  – Again, convergence and divergence (or crossing) indicates violation
• A less-common approach: compare observed survivor plot to predicted values (for different values of X)
  – What Stata calls “stcoxkm”
  – If observed are similar to predicted, assumption is not likely to be violated.
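To show what goes into a log-log plot of this kind, here is a minimal sketch in plain Python (used only as a language-neutral illustration; it is not Stata code and not how stphplot is implemented). It computes a Kaplan-Meier survivor estimate per group and applies the -ln(-ln(S)) transform. The function names and the toy durations are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (Kaplan-Meier estimate)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all records tied at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def log_log_points(curve):
    """Map (t, S) to (ln t, -ln(-ln S)); skip points where the transform is undefined."""
    return [(math.log(t), -math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0.0 < s < 1.0]

# Hypothetical toy data: durations and event flags (1 = event, 0 = censored)
group0 = kaplan_meier([2, 3, 5, 7, 8], [1, 1, 0, 1, 1])
group1 = kaplan_meier([4, 6, 9, 10, 12], [1, 0, 1, 1, 0])
pts0 = log_log_points(group0)
pts1 = log_log_points(group1)
```

Plotting pts0 and pts1 on the same axes gives the two curves to compare: under proportional hazards the vertical gap between them should stay roughly constant.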

Proportional Hazard Assumption

• -ln(-ln(survivor)) vs. ln(time) – “stphplot”

[Figure: stphplot output; y-axis: -ln[-ln(Survival Probability)]; x-axis: ln(analysis time); lines for west = 0 and west = 1]

Convergence suggests violation of proportional hazard assumption

(But, I’ve seen worse!)

Proportional Hazard Assumption

• Cox estimate vs. observed KM – “stcoxkm”

[Figure: stcoxkm output; y-axis: Survival Probability (0 to 1); x-axis: analysis time, 1970 to 2000; observed and predicted curves for west = 0 and west = 1]

Predicted differs from observed for countries in West

Proportional Hazard Assumption

• 3. Piecewise Models
  – Piecewise = break model up into pieces (by time)
  – Ex: Split analysis into “early” vs. “late” time
• If coefficients vary in different time periods, hazards are not proportional
• Example:
  – stcox var1 var2 var3 if _t < 10
  – stcox var1 var2 var3 if _t >= 10
  – Look for large changes in coefficients!
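One detail behind the early/late comparison: a spell that runs across the cut point belongs partly to each period. The sketch below (plain Python as a language-neutral illustration, not Stata code) splits such a spell so that the early piece ends at the cut without an event and the late piece carries the original outcome. The function name, the row layout, and the toy spells are all hypothetical.

```python
def split_episode(start, stop, event, cut):
    """Split one (start, stop] spell at `cut`; return (start, stop, event, period) rows."""
    if stop <= cut:
        return [(start, stop, event, "early")]
    if start >= cut:
        return [(start, stop, event, "late")]
    # Spell straddles the cut: the early piece ends at the cut with no event (censored)
    return [(start, cut, 0, "early"), (cut, stop, event, "late")]

# Hypothetical spells: (entry time, exit time, 1 = event / 0 = censored)
spells = [(0, 6, 1), (0, 14, 1), (0, 20, 0), (10, 12, 1)]
rows = [row for s in spells for row in split_episode(*s, cut=10)]
early = [r for r in rows if r[3] == "early"]
late = [r for r in rows if r[3] == "late"]
```

Fitting the same model separately on the early and late rows then gives the two sets of coefficients to compare. If the data are already one record per case per time unit, no splitting is needed.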

Proportional Hazard Assumption

• In a piecewise model, coefficients would differ in non-proportional models

[Figure: two panels, each divided into Early and Late periods]
  – Proportional: Here, the effect is the same in both time periods
  – Non-Proportional: Here, the effect is negative in the early period and positive in the late period

Piecewise Models

• Look at coefficients at 2 (or more) spans of time

EARLY
. stcox gdp degradation education democracy ngo ingo if year < 1985, robust nohr

          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4465818   .4255587     1.05   0.294    -.3874979    1.280661
 degradation |   -.282548   .1572746    -1.80   0.072    -.5908005    .0257045
   education |  -.0195118   .0328195    -0.59   0.552    -.0838368    .0448131
   democracy |   .2295673   .2625205     0.87   0.382    -.2849634     .744098
         ngo |   .6792462   .3110294     2.18   0.029     .0696399    1.288853
        ingo |   .6664661   .4804229     1.39   0.165    -.2751456    1.608078
------------------------------------------------------------------------------

LATE
. stcox gdp degradation education democracy ngo ingo if year >= 1985, robust nohr

          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4963942    .357739     1.39   0.165    -.2047613     1.19755
 degradation |  -.5702894   .2395257    -2.38   0.017    -1.039751   -.1008277
   education |   .0142118   .0143762     0.99   0.323    -.0139649    .0423886
   democracy |   .2541799   .0981386     2.59   0.010     .0618317    .4465281
         ngo |   .1742862   .1448187     1.20   0.229    -.1095532    .4581256
        ingo |  -.1134661   .2104308    -0.54   0.590    -.5259028    .2989707
------------------------------------------------------------------------------

Note: Effect of ngo is larger in early period

Proportional Hazard Assumption

• 4. Tests based on re-estimating model
  – Try including time interactions in your model
  – Recall: Interactions = effect of A on C varies with B
  – If effect of variable X on hazard rate (or ratio) varies with time, then hazards aren’t proportional
• Recall example: Abstinence programs
  – Perhaps abstinence programs have a big effect initially, but the effect diminishes (or reverses) later on

Proportional Hazard Assumption

• Red = Abstinence group; green = control

[Figure: two panels: “No time interaction” and “Positive time interaction”]

In non-proportional case, the effect of abstinence programs varies across time

Proportional Hazard Assumption

• Strategy: Create variables that reflect the interaction of X variables with time

• Significant effects of time interactions indicate non-proportional hazard

• Fortunately, inclusion of the interaction term in the model corrects the problem.

• Issue: X variables can interact with time in multiple ways…

  – Linearly
  – With “log time” or time squared
  – With time dummies
  – You may have to try a range of things…
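In equation form, a time interaction adds a term that multiplies X by some function of time. The notation here ($\beta_1$, $\beta_2$, and $g(t)$) is introduced for illustration and is not from the slides; $g(t)$ stands for whichever form is chosen, such as $t$, $\ln t$, $t^2$, or a period dummy.

```latex
h(t \mid x) = h_0(t)\,\exp\bigl(\beta_1 x + \beta_2\, x\, g(t)\bigr)
\qquad\Longrightarrow\qquad
\mathrm{HR}(t) = \frac{h(t \mid x = 1)}{h(t \mid x = 0)}
  = \exp\bigl(\beta_1 + \beta_2\, g(t)\bigr)
```

When $\beta_2 = 0$ the hazard ratio reduces to the constant $e^{\beta_1}$, so a test of $\beta_2 = 0$ is a test of proportionality for that variable under the chosen $g(t)$.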

Proportional Hazard Assumption

• Red = Abstinence group; green = control

[Figure: two panels]
  – Linear time interaction: Effect grows consistently over time. Try “Abstinence*time”
  – Interaction with time-period: Effect differs early vs. late. Try “Abstinence*DLate”

Proportional Hazard Assumption

• 5. Grambsch & Therneau test
  – Ex: Stata “estat phtest”
• Test for non-zero slope of Schoenfeld residuals vs. time
  – A zero slope implies the log hazard ratio is constant over time (i.e., hazards are proportional)
• Can be applied to general model, or for each variable

. stcox gdp degradation education democracy ngo ingo, robust nohr scaledsch(sca*) schoenfeld(sch*)

. estat phtest

      Test of proportional hazards assumption

      Time:  Time
      ----------------------------------------------------------------
                  |                   chi2       df       Prob>chi2
      ------------+---------------------------------------------------
      global test |                  18.14        6         0.0059
      ----------------------------------------------------------------

Significant chi-square indicates violation of proportional hazard assumption

Proportional Hazard Assumption

• Variable-by-variable test “estat phtest”:

. estat phtest, detail

Test of proportional hazards assumption

      Time:  Time
      ----------------------------------------------------------------
                  |       rho        chi2       df       Prob>chi2
      ------------+---------------------------------------------------
      gdp         |     0.09035      0.63        1         0.4277
      degradation |    -0.22735      3.41        1         0.0646
      education   |     0.06915      0.47        1         0.4950
      democracy   |    -0.04929      0.20        1         0.6560
      ngo         |    -0.18691      4.56        1         0.0327
      ingo        |    -0.03759      0.34        1         0.5609
      ------------+---------------------------------------------------
      global test |                 18.14        6         0.0059
      ----------------------------------------------------------------

Note: Certain variables are especially problematic…

Proportional Hazard Assumption

• Notes on estat phtest:
  – 1. Requires that you calculate “Schoenfeld residuals” when you run the original Cox model
    • And, if you want a test for each variable, you must also request scaled Schoenfeld residuals
  – 2. Test is based on identifying non-zero time trend… but how should we characterize time?
    • Options: normal/linear time, log time, time dummies, etc.
    • Results may differ depending on your choice
    • Ex: estat phtest, log (specifies “log time”)
• Plot of smoothed Schoenfeld residuals can indicate best way to characterize time
  – Linear trend (not a curve) indicates that time is characterized OK
  – Ex: estat phtest, plot(ngo) OR estat phtest, log plot(ngo)
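To make the idea of a "time trend in Schoenfeld residuals" concrete, here is a rough single-covariate sketch in plain Python (a language-neutral illustration, not Stata's implementation of estat phtest). It computes each failure's residual as the failing case's X minus the risk-weighted mean of X among cases still at risk, then correlates the residuals with time and with log time. The function names, the toy data, and the choice of beta = 0 are assumptions made for the example; in practice beta would be the estimated coefficient, and the formal test uses scaled residuals and a chi-square statistic.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Schoenfeld residuals for one covariate x at coefficient beta.

    Returns [(failure time, residual)], one entry per observed failure.
    """
    out = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored cases contribute no Schoenfeld residual
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk_set]
        x_bar = sum(w * x[j] for w, j in zip(weights, risk_set)) / sum(weights)
        out.append((t_i, x[i] - x_bar))
    return out

def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy data (1 = failure, 0 = censored)
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 0, 1, 0]
res = schoenfeld_residuals(times, events, x, beta=0.0)
rho_linear = pearson([t for t, _ in res], [r for _, r in res])
rho_log = pearson([math.log(t) for t, _ in res], [r for _, r in res])
```

Comparing rho_linear with rho_log mirrors the choice between the default and the log option: the same residuals can show a stronger or weaker trend depending on how time is characterized.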

Proportional Hazard Assumption

• What if the assumption is violated?

• 1. Improve model specification
  – Add time interactions to address nonproportionality
  – Ex: If high democracies are not proportional to low democracies, try adding “highdemoc*time”
  – Variables can be interacted with linear time, log time, time dummies, etc., to address the issue
• 2. Model groups separately
  – Split sample along variables that are non-proportional.

Proportional Hazard Assumption

• What if the assumption is violated?

• 3. Use a stratified Cox model
  – Allows a different baseline hazard for each group
  – But, you can’t estimate effect of stratifying variable!
  – Ex: stcox var1 var2 var3, strata(Dhighdemoc)
• 4. Use a piecewise model
  – Split time into chunks… in which PH assumption is met
  – Requires sufficient sample size in all time periods!

Proportional Hazard Assumption

• What if the assumption is violated?

• 5. Live with it (but temper your conclusions)
  – Violation of proportional hazard assumption tends to:
    • Overestimate the effect of variables whose hazard ratios are increasing over time
    • And, underestimate those whose hazard ratios are decreasing
  – However, Allison points out: Cox model is reasonably robust
    • Other issues (e.g., model misspecification) are bigger concerns

Cox Model: Residuals

• OLS regression: Residuals = difference between observed value of Y and predicted value
  – Yi - Y-hat
• EHA: Residuals are more complicated
  – You could compute predicted failure minus observed…
  – But, what about censored cases? What is observed?
  – There are a number of different ways to calculate residuals… each with different properties.

Cox Model: Residuals – Summary

• From Cleves et al. (2004) An Introduction to Survival Analysis Using Stata, p. 184:
• 1. Cox-Snell residuals
  – … are useful for assessing overall model fit
• 2. Martingale residuals
  – Are useful in determining the functional form of the covariates to be included in the model
• 3. Schoenfeld residuals (scaled & unscaled), score residuals, and efficient score residuals
  – Are useful for checking & testing the proportional hazard assumption, examining leverage points, and identifying outliers
• 4. Deviance residuals
  – Are useful in examining model accuracy and identifying outliers.

Martingale/Deviance Residuals: Outliers

• Martingale residuals: difference over time of observed failures minus expected failures
  – Feature: range from +1 to –infinity
• Deviance residuals = martingale residuals that are rescaled to be symmetric around zero
  – Easier to interpret
• Extreme martingale or deviance residuals may indicate outliers
  – Plot residuals vs. time, case number, IVs, etc.
  – Or simply sort data by residuals & list the cases.
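The rescaling can be sketched with the standard deviance-residual formula, d = sign(M) * sqrt(-2 * [M + delta * ln(delta - M)]), where M is the martingale residual and delta is 1 for a failure and 0 for a censored case. The short Python function below (a language-neutral illustration, not Stata code) applies it; the function name and the example values are hypothetical.

```python
import math

def deviance_residual(martingale, event):
    """Rescale a martingale residual (event: 1 = failed, 0 = censored).

    Assumes martingale < 1 for failures and martingale <= 0 for censored cases.
    """
    m, d = martingale, event
    # The log term only applies to failures; for censored cases it drops out
    inner = m + (math.log(1.0 - m) if d == 1 else 0.0)
    sign = (m > 0) - (m < 0)
    # max() guards against tiny negative values from floating-point rounding
    return sign * math.sqrt(max(0.0, -2.0 * inner))

# Hypothetical martingale residuals
r_mid = deviance_residual(0.5, 1)     # failure that occurred earlier than expected
r_tail = deviance_residual(-3.0, 1)   # failure that occurred much later than expected
r_cens = deviance_residual(-2.0, 0)   # censored case with expected count of 2
```

A martingale residual of -3 maps to a deviance residual of about -1.8, which illustrates how the long negative tail is pulled in toward a more symmetric scale.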

Martingale & Deviance Residuals: Outliers

• Stata code to identify outliers:

* Run Cox model, calculate martingale residuals
stcox var1 var2 var3, robust nohr mgale(mg)
* Creates variable “mg” which contains martingale residuals
* Next, compute deviance residuals using “predict”
predict dev, deviance
gen caseid = _n
* Create plots of various types
scatter mg caseid
* Deviance residual plots are generally easier to interpret
scatter dev caseid
scatter dev caseid, mlabel(newname2)
scatter dev _t

Deviance Residuals Plot

• Extreme values may be outliers

[Figure: deviance residual (y-axis, about -2 to 2) plotted against caseid (x-axis, 0 to 3000), with points labeled by country name]

Here, no obvious outliers are visible

Efficient Score Residuals: Influential Cases

• Procedure for identifying outliers using ESRs
  – It is possible to compute DFBETAs based on ESRs
  – DFBETA: Change in a variable’s coefficient due to a particular case in the analysis
  – Cases with big DFBETAs may be overly influential
• Issue: Stata cannot automatically compute DFBETAs…
  – You have to compute them manually
  – Also, computation = limited to 800 cases (for “intercooled stata”)
  – Hopefully Stata will improve this in the future.

ESRs: Influential Cases

• Stata code to estimate DFBETAs:

* Run Cox model, request efficient score residuals
* Creates vars: esr1 to esr4 corresponding to vars listed in model
* (no "robust" here, so that e(V) below is the model-based variance matrix)
stcox var1 var2 var3 var4, nohr esr(esr*)
* Create room for a matrix of up to 800 rows (for your cases)
set matsize 800
* Create esr matrix (one column per variable in the model)
mkmat esr1 esr2 esr3 esr4, matrix(esr)
* Multiply ESRs and Var/Cov matrix to estimate DFBETAs, save results
mat V = e(V)
mat Inf = esr*V
svmat Inf, names(s)
* Label estimates for subsequent plots
label var s1 "dfbeta - var 1"
label var s2 "dfbeta - var 2"
label var s3 "dfbeta - var 3"
label var s4 "dfbeta - var 4"
* Plot DFBETAs for each variable vs. time or case number
scatter s1 _t, yline(0) mlab(caseID) s(i)
scatter s1 casenumber, yline(0) mlab(caseID) s(i)
* Look for extreme values (for each IV: s1 to s4)

DFBETA Example

• DFBETA for NGOs (plotted by casenumber)

[Figure: DFBETA for ngo (y-axis, about -.05 to .1) plotted against casenumber (x-axis, 0 to 800), with points labeled by country name]

DFBETA value indicates that presence of Latvia changes NGO coefficient by +.075 standard deviations

Cox-Snell residuals: Model Fit

• Cox-Snell residuals can be plotted to assess model fit

• If model fits well, graph of integrated (cumulative) hazard conditional on Cox-Snell residuals vs. Cox-Snell residuals will fall on a 45-degree line (slope of 1)
• Strategy in Stata:
  – Run Cox model, request martingale residuals
  – Use “predict” to compute Cox-Snell residuals
  – Stset your data again, with Cox-Snell as time variable
  – Compute integrated hazard
  – Graph integrated hazard versus residuals.
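The logic of these steps can be sketched outside Stata. The plain-Python example below (a language-neutral illustration, not the Stata procedure itself) derives Cox-Snell residuals from martingale residuals using the identity Cox-Snell = event indicator minus martingale, treats them as the "time" variable, and computes a Nelson-Aalen cumulative hazard to compare against the residuals. The function name and the toy residuals are hypothetical.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time (Nelson-Aalen estimate)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    cum = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather all records tied at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            cum += deaths / at_risk
            curve.append((t, cum))
        at_risk -= removed
    return curve

# Hypothetical martingale residuals and event flags (1 = failed, 0 = censored)
mgale = [0.7, -0.4, 0.1, -1.2, 0.5]
event = [1, 0, 1, 0, 1]
# Cox-Snell residual = event indicator minus martingale residual
cox_snell = [d - m for d, m in zip(event, mgale)]
# Treat the residuals as "time" and estimate the cumulative hazard
curve = nelson_aalen(cox_snell, event)
# Vertical distance from the 45-degree line at each point
gaps = [h - r for r, h in curve]
```

If the model fits well, the values in gaps should stay close to zero except where few cases remain at risk.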

Cox-Snell Model Fit Example

• Cox-Snell Plot for Environmental Law data

[Figure: Nelson-Aalen cumulative hazard (y-axis, 0 to 2) plotted against partial Cox-Snell residual (x-axis, 0 to .6), with the partial Cox-Snell residual itself drawn as the reference line]

This looks quite bad. Cumulative hazard should fall on the line… Instead, there is a sizable gap.

Note: Don’t worry about small deviations from the line at the right edge of the plot. There are typically few cases there…

Stratified Cox Models

• Stratified models allow different baseline hazards for sub-groups in your data

• But constrain all model coefficients to be the same across all groups

• Useful if we think that some sub-groups have very different hazard curves over time

– AND we aren’t really interested in differences across those groups – it is just a nuisance

• Another option is to simply analyze groups separately
  – But, we lose sample size. Stratifying avoids that.
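Written as an equation, the stratified model gives each group its own baseline hazard while sharing one coefficient vector. The subscript $g$ for the stratum and the symbol $G$ for the number of strata are notation introduced here for illustration.

```latex
h_g(t \mid x) = h_{0g}(t)\,\exp(\beta x), \qquad g = 1, \dots, G
```

Because $h_{0g}(t)$ is left unspecified and differs freely across strata, the model has no coefficient for the stratifying variable, which is why its effect cannot be estimated.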

Cox Models for Grouped Data

• Sometimes cases are not independent
  – Ex: Students in same class; people in same family
• Two useful options:
  – 1. Stata’s “cluster” option: Adjusts standard errors based on group membership
    • stcox var1 var2 var3, cluster(FamilyID)
  – 2. Cox model with shared frailty
    • Another name for a random effects model
    • We’ll discuss this later in the course
    • Don’t confuse with non-shared frailty models!
    • stcox var1 var2 var3, shared(FamilyID)

Stata Note: TVC

• Note: Stata has a special option for “time varying covariates”

• You DO NOT need to use this!!!
  – It is designed for cases where you wish to specify the character of time variation (e.g., a rate of decay)
• You can simply create time-varying variables in your dataset…
  – Stata will analyze it properly using the stcox command.