Heteroskedasticity & Dependence
Paul Schrimpf
UBC Economics 326
March 6, 2018
1 Introduction
2 Consequences of heteroskedasticity
3 Var(β) with heteroskedasticity
    Calculating in R
4 Examples
5 Detecting heteroskedasticity
6 Heteroskedasticity and efficiency
7 Standard errors for dependent data
    Clustering
    Autocorrelation
References
• Wooldridge (2013) chapter 8
• Stock and Watson (2009) chapter 10.5
• Angrist and Pischke (2014) chapter 5
Section 1
Introduction
Introduction
• MLR.5 (homoskedasticity), $\mathrm{Var}(\varepsilon_i \mid X) = \sigma_\varepsilon^2$, is often an unrealistic assumption
• There are still papers about estimating Engel curves
    • Banks, Blundell, and Lewbel (1997), Blundell, Duncan, and Pendakur (1998), Blundell, Browning, and Crawford (2003)
• The next slides use data from the British Family Expenditure Surveys (FES) for 1980-1982 (the same data as Blundell, Duncan, and Pendakur (1998))
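Consequences of heteroskedasticity
• OLS is still unbiased and consistent, assuming
    MLR.1 (linear model)
    MLR.2 (independence) $\{(x_{1,i}, x_{2,i}, y_i)\}_{i=1}^n$ is an independent random sample
    MLR.3 (rank condition) no multicollinearity: no $x_{j,i}$ is constant and there is no exact linear relationship among the $x_{j,i}$
    MLR.4 (exogeneity) $E[\varepsilon_i \mid x_{1,i}, ..., x_{k,i}] = 0$
• Homoskedasticity-only standard errors are incorrect:

$$\mathrm{Var}(\hat\beta_j \mid X) \neq \frac{\sigma_\varepsilon^2}{\sum_{i=1}^n x_{j,i}^2} \quad \text{and} \quad \operatorname{plim} \frac{\frac{1}{n}\sum_{i=1}^n \hat\varepsilon_i^2}{\sum_{i=1}^n x_{j,i}^2} \neq \mathrm{Var}(\hat\beta_j)$$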
• t (and F) statistics formed using homoskedasticity-only standard errors do not have t (or F) distributions (not even asymptotically)
• Hypothesis tests and confidence intervals formed using homoskedasticity-only standard errors are invalid
• OLS is not BLUE
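The consequences listed above can be illustrated with a small Monte Carlo experiment. The following is only a sketch under assumed settings: the data generating process, sample size, number of replications, and all variable names are illustrative choices, not taken from the slides. It compares the actual sampling variability of the OLS slope with the average homoskedasticity-only standard error and with the average heteroskedasticity robust standard error (the bivariate form derived in Section 3).

```r
# Monte Carlo sketch: heteroskedastic errors, true slope = 2 (all values illustrative)
set.seed(326)
n <- 500      # observations per simulated sample
reps <- 500   # number of simulated samples

one_draw <- function() {
  x <- rnorm(n)
  eps <- x * rnorm(n)                 # Var(eps | x) = x^2, so MLR.5 fails
  y <- 1 + 2 * x + eps
  fit <- lm(y ~ x)
  xc <- x - mean(x)
  ehat <- residuals(fit)
  c(beta1     = unname(coef(fit)[2]),
    se_homosk = sqrt(vcov(fit)[2, 2]),                   # homoskedasticity-only
    se_robust = sqrt(sum(xc^2 * ehat^2) / sum(xc^2)^2))  # heteroskedasticity robust
}

draws <- replicate(reps, one_draw())        # 3 x reps matrix, one column per sample
sd_beta1    <- sd(draws["beta1", ])         # actual sampling variability of the slope
mean_homosk <- mean(draws["se_homosk", ])
mean_robust <- mean(draws["se_robust", ])
bias_beta1  <- mean(draws["beta1", ]) - 2   # should be near zero: OLS is still unbiased

print(round(c(sd_beta1 = sd_beta1, homosk = mean_homosk,
              robust = mean_robust, bias = bias_beta1), 3))
```

In this design the homoskedasticity-only standard error should understate the true variability of the slope, while the robust standard error should track it closely and the slope estimate itself stays centered on the true value.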
Section 3
Var(β) with heteroskedasticity
Var(β) with heteroskedasticity
• Central limit theorem: for i.i.d. data, if $E[y_i] = \mu$ and $\mathrm{Var}(y_i) = \sigma^2$ for all $i$, then

$$\sqrt{n}\,(\bar{y}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$$
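• Recall how to show that $\hat\beta$ is asymptotically normal in bivariate regression

$$\begin{aligned}
\sqrt{n}(\hat\beta_1 - \beta_1)
 &= \sqrt{n}\left(\frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x}) y_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} - \beta_1\right) \\
 &= \sqrt{n}\left(\frac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(\beta_0 + \beta_1 x_i + \varepsilon_i)}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2} - \beta_1\right) \\
 &= \frac{\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2}
\end{aligned}$$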
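The last equality in this derivation relies on two properties of deviations from the sample mean. As a short worked step (a reader's aid, not part of the original slides):

```latex
\begin{align*}
\sum_{i=1}^n (x_i - \bar{x}) &= 0, \\
\sum_{i=1}^n (x_i - \bar{x})\, x_i &= \sum_{i=1}^n (x_i - \bar{x})^2
  + \bar{x} \sum_{i=1}^n (x_i - \bar{x}) = \sum_{i=1}^n (x_i - \bar{x})^2, \\
\text{so} \quad
\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(\beta_0 + \beta_1 x_i + \varepsilon_i)
  &= \beta_1 \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2
   + \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i .
\end{align*}
```

Dividing by $\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2$ and subtracting $\beta_1$ leaves only the term involving $\varepsilon_i$.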
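• Already showed that $\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2 \xrightarrow{p} \mathrm{Var}(x)$
• Need to apply the CLT to $\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i$. First note that

$$\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i
 = \sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - E[x])\varepsilon_i
 + \underbrace{(E[x] - \bar{x})}_{\xrightarrow{p}\, 0}\;
   \underbrace{\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n \varepsilon_i}_{\xrightarrow{d}\, N(0, \mathrm{Var}(\varepsilon))}$$

• Let $w_i = (x_i - E[x])\varepsilon_i$. We can apply the CLT to $w_i$ because
    • $E[w_i] = E[x_i \varepsilon_i] = 0$
    • Observations are independent
    • Assume $\mathrm{Var}(w_i) = E[(x_i - E[x])^2 \varepsilon_i^2]$ exists
  then $\frac{1}{\sqrt{n}}\sum_{i=1}^n (x_i - E[x])\varepsilon_i \xrightarrow{d} N\left(0, E[(x_i - E[x])^2 \varepsilon_i^2]\right)$
• Can conclude that

$$\frac{1}{\sqrt{n}}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i \xrightarrow{d} N\left(0, E\left[(x_i - E[x])^2 \varepsilon_i^2\right]\right)$$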
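Two steps in this argument are stated without justification. A brief sketch of each, assuming MLR.4 holds so that $E[\varepsilon_i \mid x_i] = 0$:

```latex
\begin{align*}
E[w_i] &= E\big[(x_i - E[x])\,\varepsilon_i\big]
        = E\big[(x_i - E[x])\,E[\varepsilon_i \mid x_i]\big] = 0, \\
(E[x] - \bar{x}) \cdot \sqrt{n}\,\frac{1}{n}\sum_{i=1}^n \varepsilon_i
       &\xrightarrow{p} 0 .
\end{align*}
```

The first line uses the law of iterated expectations. The second holds because a term converging in probability to zero multiplied by a term converging in distribution converges in probability to zero, which is why the cross term drops out of the limit.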
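• By Slutsky’s theorem,

$$\sqrt{n}(\hat\beta_1 - \beta_1)
 = \frac{\sqrt{n}\,\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})\varepsilon_i}{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2}
 \xrightarrow{d} N\left(0, \frac{E[(x_i - E[x])^2 \varepsilon_i^2]}{\mathrm{Var}(x)^2}\right)$$

or equivalently,

$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\dfrac{E[(x_i - E[x])^2 \varepsilon_i^2]}{n\,\mathrm{Var}(x)^2}}} \xrightarrow{d} N(0, 1)$$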
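As a check on this result, consider the special case where homoskedasticity does hold, so $E[\varepsilon_i^2 \mid x_i] = \sigma_\varepsilon^2$. The following worked step (added here as an illustration) shows that the asymptotic variance then collapses to the familiar homoskedasticity-only expression:

```latex
\begin{align*}
E\big[(x_i - E[x])^2 \varepsilon_i^2\big]
  &= E\big[(x_i - E[x])^2\, E[\varepsilon_i^2 \mid x_i]\big]
   = \sigma_\varepsilon^2\, \mathrm{Var}(x), \\
\frac{E\big[(x_i - E[x])^2 \varepsilon_i^2\big]}{\mathrm{Var}(x)^2}
  &= \frac{\sigma_\varepsilon^2}{\mathrm{Var}(x)} .
\end{align*}
```

This is why the robust and homoskedasticity-only standard errors agree in large samples when the errors really are homoskedastic.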
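• By Slutsky’s theorem we can replace $E[(x_i - E[x])^2 \varepsilon_i^2]$ and $\mathrm{Var}(x)^2$ by consistent estimators, and

$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\dfrac{\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2 \hat\varepsilon_i^2}{n\left(\frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2\right)^2}}} \xrightarrow{d} N(0, 1)$$

• Similar reasoning applies to multivariate regression

$$\frac{\hat\beta_j - \beta_j}{\sqrt{\dfrac{\frac{1}{n}\sum_{i=1}^n x_{j,i}^2 \hat\varepsilon_i^2}{n\left(\frac{1}{n}\sum_{i=1}^n x_{j,i}^2\right)^2}}} \xrightarrow{d} N(0, 1)$$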
• $\sqrt{\dfrac{\frac{1}{n}\sum_{i=1}^n x_{j,i}^2 \hat\varepsilon_i^2}{n\left(\frac{1}{n}\sum_{i=1}^n x_{j,i}^2\right)^2}}$ is called the heteroskedasticity robust standard error or the Eicker-Huber-White standard error
    • Statistics: Eicker (1967), Huber (1967)
    • Econometrics: White (1980)
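To make the formula concrete, the sketch below computes the bivariate version of this standard error on simulated data using base R only, and checks it against the matrix "sandwich" form $(X'X)^{-1} X' \mathrm{diag}(\hat\varepsilon_i^2) X (X'X)^{-1}$. The simulated data and all variable names are assumptions made for illustration. The result should also agree with `vcovHC(fit, type = "HC0")` from the sandwich package, which is not loaded here.

```r
# Simulated bivariate regression with heteroskedastic errors (illustrative values)
set.seed(326)
n <- 500
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = x)    # error standard deviation grows with x
fit <- lm(y ~ x)
ehat <- residuals(fit)

# Bivariate formula: sqrt( mean(xc^2 * ehat^2) / (n * mean(xc^2)^2) )
xc <- x - mean(x)
se_robust <- sqrt(mean(xc^2 * ehat^2) / (n * mean(xc^2)^2))

# Same quantity from the matrix sandwich form
X <- model.matrix(fit)               # n x 2 matrix: intercept and x
bread <- solve(crossprod(X))         # (X'X)^{-1}
meat <- crossprod(X * ehat)          # sum over i of ehat_i^2 * x_i x_i'
V <- bread %*% meat %*% bread
se_sandwich <- unname(sqrt(diag(V)[2]))

# Homoskedasticity-only standard error, for comparison
se_homosk <- unname(sqrt(diag(vcov(fit))[2]))

print(c(robust = se_robust, sandwich = se_sandwich, homosk = se_homosk))
```

The first two numbers should coincide up to floating point error, since the scalar formula is the slope element of the sandwich matrix.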
R: Heteroskedasticity robust standard errors

    ## calculating heteroskedasticity robust standard errors
    ## (assumes a data frame `engel` with columns foodexp and income is already loaded)
    engelCurve <- lm(foodexp ~ income, data = engel)
    ## calculate by hand
    hetSE <- sqrt(mean((engel$income - mean(engel$income))^2 *
                       residuals(engelCurve)^2) /
                  (nrow(engel) * mean((engel$income - mean(engel$income))^2)^2))

    ## use "sandwich" package
    library(sandwich)
    sqrt(vcovHC(engelCurve, type = "HC0")[2, 2])

    ## test H_0: each coefficient = 0 separately
    library(lmtest)
    coeftest(engelCurve, vcov = vcovHC(engelCurve, type = "HC0"))
    ## compare with homoskedastic standard errors
    coeftest(engelCurve)

    ## we would do F-tests with
    ## waldtest(engelCurve, vcov = vcovHC(engelCurve, type = "HC0")) or
    ## lrtest(engelCurve, vcov = vcovHC(engelCurve, type = "HC0"))

    ## Or even easier, just use lfe package
    library(lfe)
    engelCurve2 <- felm(foodexp ~ income, data = engel)
    summary(engelCurve2, robust = TRUE)
Detecting heteroskedasticity
• Generally best to assume heteroskedasticity and use heteroskedasticity consistent standard errors
    • If you have homoskedasticity but use heteroskedasticity consistent standard errors, the heteroskedasticity consistent standard error converges to the homoskedasticity-only standard error, so in large samples it will make no difference
    • But if you assume homoskedasticity when there is heteroskedasticity, your standard errors will be inconsistent, and your hypothesis tests and confidence intervals will be invalid
• Nonetheless, you might occasionally want to check for heteroskedasticity
    • Visually: plot $\hat\varepsilon_i$ against $x_{j,i}$ and/or $\hat{y}_i$
    • Formally: can test $H_0$: homoskedasticity; see Wooldridge (2013) chapter 8 for details
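Both checks can be sketched in a few lines of base R. The slides do not say which formal test to use; the example below assumes a Breusch-Pagan style LM test (one of the standard tests of this kind), applied to simulated data whose errors are heteroskedastic by construction. All names and parameter values are illustrative.

```r
# Simulated data with error variance that depends on x (illustrative values)
set.seed(326)
n <- 500
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = x)
fit <- lm(y ~ x)
ehat <- residuals(fit)

# Visual check: a fan shape in this plot suggests heteroskedasticity
plot(x, ehat, xlab = "x", ylab = "OLS residual")

# Breusch-Pagan style LM test: regress squared residuals on the regressors
aux <- lm(I(ehat^2) ~ x)
lm_stat <- n * summary(aux)$r.squared                    # LM statistic = n * R^2
p_value <- pchisq(lm_stat, df = 1, lower.tail = FALSE)   # df = number of regressors in aux

print(c(LM = lm_stat, p_value = p_value))
```

A small p-value rejects the null hypothesis of homoskedasticity. Failing to reject does not establish homoskedasticity, which is one reason the slides recommend robust standard errors by default.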
Heteroskedasticity and efficiency
• If you do not know the variance function $h(x_i)$ but can estimate it, weighting by the estimate is called feasible generalized least squares (FGLS)
• In most empirical work we do not know $h(x_i)$ and often cannot estimate it well, so we usually just use OLS
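For completeness, here is a minimal sketch of what FGLS can look like. It assumes that $\mathrm{Var}(\varepsilon_i \mid x_i)$ is proportional to $h(x_i)$ and that $h$ has an exponential form, $h(x) = \exp(\delta_0 + \delta_1 x)$. This functional form, the simulated data, and all variable names are assumptions made for illustration, since the slides do not specify them.

```r
# Simulated data where the exponential variance model is correct (illustrative)
set.seed(326)
n <- 500
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = exp(0.2 * x))   # Var(eps | x) = exp(0.4 * x)
ols <- lm(y ~ x)

# Step 1: estimate h(x) by regressing log squared OLS residuals on x
log_ehat2 <- log(residuals(ols)^2)
aux <- lm(log_ehat2 ~ x)
h_hat <- exp(fitted(aux))                      # estimated variance function, up to scale

# Step 2: weighted least squares with weights 1 / h_hat
fgls <- lm(y ~ x, weights = 1 / h_hat)

print(rbind(OLS = coef(ols), FGLS = coef(fgls)))
```

Both estimators target the same coefficients. FGLS can be more precise when the variance model is reasonably accurate, but a poorly estimated $h$ removes that advantage, which is consistent with the advice above to default to OLS with robust standard errors.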
Section 7
Standard errors for dependent data
Standard errors for dependent data
• Assume:
    MLR.1 (linear model)
    MLR.2 (not too dependent) $\{(x_{j,i}, y_i)\}_{i=1}^n$ is not too dependent
    MLR.3 (rank condition) no multicollinearity: no $x_{j,i}$ is constant and there is no exact linear relationship among the $x_{j,i}$
    MLR.4 (strict exogeneity) $E[\varepsilon_i \mid \{x_{1,j}, ..., x_{k,j}\}_{j=1}^n] = 0$
• As with heteroskedasticity, when observations are not independent (and not too dependent):
    • OLS remains unbiased and consistent
    • The usual homoskedasticity-only or heteroskedasticity robust standard errors are incorrect
    • If you do not correct the standard errors, hypothesis tests and confidence intervals are invalid
    • OLS is not BLUE
• To correct standard errors for dependent data, we need to know something about the form of dependence
• Common forms of dependence:
    • Clustering: observations can be organized into clusters (groups); pairs of observations in the same cluster are dependent, pairs of observations in different clusters are independent
    • Autocorrelation (or serial correlation): observations are taken over time; $y_t$ is correlated with $y_{t-k}$
• Can correct standard errors for clustering and autocorrelation
• Clustered standard errors and autocorrelation consistent standard errors are often much larger than standard errors that assume independence
Clustering
• Clustering: observations can be organized into clusters (groups); pairs of observations in the same cluster are dependent, pairs of observations in different clusters are independent
Clustering - examples
• Blimpo (2014) (from 2015 midterm)
    • Observations of students’ test scores, about 1500 students from 100 schools
    • Students’ test scores in the same school are unlikely to be independent, but it is reasonable to think students in different schools are independent
• Data on 20 industries from 118 countries
    • $\log v_{ij}$ = log labor productivity in manufacturing industry $i$ in country $j$ ten years in the past
    • $\Delta v_{ij}$ = the annual growth rate of labor productivity in industry $i$ in country $j$ over the past ten years
• Unrealistic to assume that industries within the same country are independent, so $E[\varepsilon_{ij}\varepsilon_{kj}] \neq 0$ for industries $i \neq k$ in the same country $j$
• More reasonable to assume that observations in different countries are independent
Clustering
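$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \varepsilon_{ij}$$

• $J$ clusters, $n(j)$ observations in cluster $j$
• Assume observations in different clusters are independent (observations in the same cluster can be arbitrarily dependent)
• Then

$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\dfrac{\sum_{j=1}^J \left(\sum_{i=1}^{n(j)} (x_{ij} - \bar{x})\hat\varepsilon_{ij}\right)^2}{\left(\sum_{i,j} (x_{ij} - \bar{x})^2\right)^2}}} \xrightarrow{d} N(0, 1)$$

  as $J \to \infty$
• Can use these standard errors and perform t and F tests as usual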
• These standard errors allow heteroskedasticity and clustering
• This is an asymptotic result for a large number of clusters. Many empirical applications have relatively few clusters. There is no consensus on the best thing to do when you have few clusters, but there are a number of papers about it; see Angrist and Pischke (2009) chapter 8
• For clustered standard errors in R, use the felm function from the lfe package
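The clustered standard error can also be computed by hand, which makes clear what it does: residual-weighted deviations are summed within each cluster before being squared. The sketch below uses base R on simulated data with a school-like cluster structure. The number of clusters, the cluster sizes, and all variable names are illustrative assumptions.

```r
# Simulated clustered data (illustrative values)
set.seed(326)
J <- 100                                   # number of clusters, e.g. schools
n_j <- 15                                  # observations per cluster
g <- rep(seq_len(J), each = n_j)           # cluster id for each observation
x <- rnorm(J)[g] + rnorm(J * n_j)          # regressor with a cluster-level component
eps <- rnorm(J)[g] + rnorm(J * n_j)        # errors correlated within cluster
y <- 1 + 2 * x + eps
fit <- lm(y ~ x)

xc <- x - mean(x)
score <- xc * residuals(fit)               # (x_ij - xbar) * ehat_ij

# Clustered: sum scores within each cluster, then square and add up
cluster_sums <- tapply(score, g, sum)
se_cluster <- sqrt(sum(cluster_sums^2) / sum(xc^2)^2)

# Heteroskedasticity robust (no clustering): square each score separately
se_robust <- sqrt(sum(score^2) / sum(xc^2)^2)

print(c(clustered = se_cluster, robust = se_robust))
```

With the lfe package, a call such as `felm(y ~ x | 0 | 0 | g)` requests standard errors clustered on `g`; its output may differ slightly from this hand calculation because of finite-sample adjustments.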
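Autocorrelation
• Observations over time

$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$$

• Unlikely that $\varepsilon_t$ is independent of $\varepsilon_{t-k}$
• Need to correct the standard error of $\hat\beta$
• The correct formula is called the Newey-West, autocorrelation consistent, or HAC (heteroskedasticity and autocorrelation consistent) standard error

$$se(\hat\beta_1) = \sqrt{\frac{\sum_{\ell=-T^{1/3}}^{T^{1/3}} \left[\left(1 - \left|\frac{\ell}{T^{1/3}}\right|\right) \frac{1}{T}\sum_{t} (x_t - \bar{x})\hat\varepsilon_t (x_{t+\ell} - \bar{x})\hat\varepsilon_{t+\ell}\right]}{T\left(\frac{1}{T}\sum_{t=1}^T (x_t - \bar{x})^2\right)^2}}$$

  where the inner sum runs over those $t$ for which both $t$ and $t+\ell$ lie between 1 and $T$
• See Wooldridge (2013) section 12.5 for details, or Mikusheva and Schrimpf (2007) lectures 2 & 3, & recitation 2 for more discussion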
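A by-hand version of this standard error can be sketched in base R. The function name `hac_se`, its argument `S` (the bandwidth, defaulting to $T^{1/3}$), and the choice to truncate the lag sum at `floor(S)` are assumptions made for this illustration. The simulated series are AR(1) processes chosen only to produce autocorrelation.

```r
# HAC standard error for the slope in a bivariate regression (illustrative sketch)
hac_se <- function(x, ehat, S = length(x)^(1/3)) {
  n_obs <- length(x)
  xc <- x - mean(x)
  g <- xc * ehat                           # g_t = (x_t - xbar) * ehat_t
  omega <- mean(g^2)                       # lag 0 term, weight 1
  for (l in seq_len(floor(S))) {
    w <- 1 - l / S                         # weight declines linearly in the lag
    gamma_l <- sum(g[1:(n_obs - l)] * g[(1 + l):n_obs]) / n_obs
    omega <- omega + 2 * w * gamma_l       # lags +l and -l contribute equally
  }
  sqrt(omega / (n_obs * mean(xc^2)^2))
}

# Simulated time series with autocorrelated regressor and errors
set.seed(326)
n_obs <- 400
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n_obs))
eps <- as.numeric(arima.sim(model = list(ar = 0.7), n = n_obs))
y <- 1 + 2 * x + eps
fit <- lm(y ~ x)

se_hac <- hac_se(x, residuals(fit))
se_white <- hac_se(x, residuals(fit), S = 1)   # S = 1 puts zero weight on every lag

print(c(hac = se_hac, no_lags = se_white))
```

Setting `S = 1` removes all lagged terms and reproduces the heteroskedasticity robust standard error, so the gap between the two numbers shows how much the autocorrelation correction matters in this sample.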
References
Angrist, J.D. and J.S. Pischke. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton University Press.
Angrist, Joshua D and Jörn-Steffen Pischke. 2014. Mastering ’Metrics: The Path from Cause to Effect. Princeton University Press.
Banks, James, Richard Blundell, and Arthur Lewbel. 1997. “Quadratic Engel curves and consumer demand.” Review of Economics and Statistics 79 (4):527–539.
Blimpo, Moussa P. 2014. “Team Incentives for Education in Developing Countries: A Randomized Field Experiment in Benin.” American Economic Journal: Applied Economics 6 (4):90–109. URL http://www.aeaweb.org/articles.php?doi=10.1257/app.6.4.90.
Blundell, Richard, Alan Duncan, and Krishna Pendakur. 1998. “Semiparametric estimation and consumer demand.” Journal of Applied Econometrics 13 (5):435–461.
Blundell, Richard W, Martin Browning, and Ian A Crawford. 2003. “Nonparametric Engel curves and revealed preference.” Econometrica 71 (1):205–240.
Chai, Andreas and Alessio Moneta. 2010. “Retrospectives: Engel curves.” The Journal of Economic Perspectives 24 (1):225–240.
Eicker, Friedhelm. 1967. “Limit theorems for regressions with unequal and dependent errors.” In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 1. 59–82.
Engel, Ernst. 1857. “Die Productions- und Consumptionsverhaeltnisse des Koenigsreichs Sachsen.” Zeitschrift des Statistischen Bureaus des Koniglich Sachsischen Ministeriums des Inneren (8-9).
Huber, Peter J. 1967. “The behavior of maximum likelihood estimates under nonstandard conditions.” In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 1. 221–33.
Lewbel, Arthur. 2008. “Engel curve.” In The New Palgrave Dictionary of Economics, edited by Steven N. Durlauf and Lawrence E. Blume. Basingstoke: Palgrave Macmillan. URL http://www.dictionaryofeconomics.com/article?id=pde2008_E000085.
Mikusheva, Anna and Paul Schrimpf. 2007. “14.384 Time Series Analysis, Fall 2007 (revised 2009).” URL http://ocw.mit.edu/courses/economics/14-384-time-series-analysis-fall-2013/lecture-notes/.
Rodrik, Dani. 2013. “Unconditional Convergence in Manufacturing.” The Quarterly Journal of Economics 128 (1):165–204. URL http://qje.oxfordjournals.org/content/128/1/165.abstract.
Stock, J.H. and M.W. Watson. 2009. Introduction to Econometrics, 2/E. Addison-Wesley.
White, Halbert. 1980. “A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity.” Econometrica: Journal of the Econometric Society :817–838. URL http://www.jstor.org/stable/1912934.