Spatial Econometrics Jul9

Apr 03, 2018

sankha80
  • 7/28/2019 Spatial Econometrics Jul9

    1/41

    Introduction to Spatial Econometrics

    Alfonso Flores-Lagunes (1, 2)

    Workshop prepared for CEPS/INSTEAD, July 9, 2012

    (1) State University of New York at Binghamton

    (2) Institute for the Study of Labor (IZA)

    1. Introduction: Two Examples

    2. Basic Concepts

    3. SAL Model

    4. SAE Model

    5. SAL-SAE Model

    6. Testing for Spatial Dependence

    7. GMM

    8. Software

    1. What is spatial econometrics?

    The set of econometric models and methods that allow for spatial dependence across observations.

    Spatial dependence refers to dependence over space, as well as cross-sectional dependence over some concept of distance (e.g., economic distance).

    Introduced by Paelinck and Klaassen (1979) and popularized by, among others, Anselin (1980).

    By now, spatial econometrics has been employed in almost all fields of economics, e.g., agricultural, labor, public finance, environmental, industrial organization, development.

    1.2 A Hedonic Housing Price Model

    Suppose we are interested in estimating a hedonic housing price model.

    Clearly, houses that share a common location will be affected by common geographic characteristics (e.g., proximity to a good such as a park, or to a bad such as a waste site or airport noise).

    If we have data on all the detailed location characteristics, then including that information will take care of such factors.

    To the extent that we cannot control for all location characteristics, there will be unobservables that vary over space and that will lead to inefficiencies in our estimated model(s).

    2. Basic Concepts

    There are at least two approaches to spatial dependence.

    We will introduce the approach that consists of imposing structure on a model in order to specify the spatial dependence.

    This structure takes the form of (a) the specification of a spatial weighting matrix and (b) a spatial autocorrelation parameter to be estimated.

    This approach has been advocated and employed by, among others, L. Anselin, H. Kelejian, I. Prucha, and L.-f. Lee.

    Another approach nonparametrically estimates the spatial structure under certain conditions (e.g., Conley, 1999).

    In general, the issue to take into account is the increased similarity or dissimilarity of units as they are closer to each other in space.

    $\mathrm{cov}(y_i, y_j) \neq 0$ for $i \neq j$, and the dependence is stronger the closer $i$ is to $j$.

    Similarity $\Rightarrow$ positive spatial correlation.

    Dissimilarity $\Rightarrow$ negative spatial correlation.

    For a single cross-sectional sample, the problem is that spatial correlation implies the existence of an $N \times N$ variance-covariance matrix (VCM) of spatial correlations.

    That many correlation parameters cannot be estimated from a single cross-sectional sample.

    One solution: impose (assume) some structure on this VCM.

    Assume that for each unit there is a neighborhood of data within which spatial dependence arises.

    This solution implies setting this neighborhood up in the form of a spatial weighting matrix.

    The spatial weighting matrix ($W$) contains, for each observation $i$ (rows), the locations (columns) of the other observations that belong to the assumed neighborhood set of $i$. Thus, it is an $N \times N$ matrix.

    Notes:

    It is typically specified by the researcher, so the choice may be ad hoc.

    It is typically row-standardized (each row adds up to 1), so that $Wy$ is interpreted as an average of neighboring values.

    The diagonal is composed of zeroes.

    Examples of $W$: (a) neighbors that share a common border (first-order contiguity), (b) neighbors within a given distance of each other.

    Importantly, $W$ is not estimated.
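
    As a minimal sketch of a distance-based weighting matrix, the Python snippet below marks two units as neighbors when they lie within a cutoff distance and then row-standardizes. The function name `distance_band_weights`, the coordinates, and the cutoff are illustrative assumptions, not part of the workshop material.

```python
import numpy as np

def distance_band_weights(coords, cutoff):
    """Row-standardized W: j is a neighbor of i when 0 < dist(i, j) <= cutoff."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    W = ((dist > 0) & (dist <= cutoff)).astype(float)   # zero diagonal by construction
    row_sums = W.sum(axis=1, keepdims=True)
    # Units with no neighbors keep a row of zeros instead of dividing by zero.
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

coords = [(0, 0), (1, 0), (2, 0), (5, 5)]   # hypothetical locations
W = distance_band_weights(coords, cutoff=1.5)
y = np.array([10.0, 20.0, 30.0, 40.0])
Wy = W @ y   # spatial lag: average of y over each unit's neighbors
```

    Here the fourth unit has no neighbor within the cutoff, so its row of `W` (and its spatial lag) is zero; how to treat such isolated units is a modeling choice left to the researcher.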

    3. The Spatial Autoregressive Lag (SAL) Model

    Also known as the mixed-regressive spatial autoregressive (MRSA) model or the spatial autoregressive (SAR) model.

    $$y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim \text{iid}(0, \sigma^2)$$

    where $\rho$ is the SAL parameter with $\rho \in (-1, 1)$, and $Wy$ is the spatially lagged dependent variable.

    An issue with this model is the endogeneity of $Wy$, as it is correlated with $\varepsilon$.

    As a consequence, OLS is inconsistent (an exception is in Lee, 2002).

    3. SAL Model

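    The model can be written in the following reduced form:

    $$y - \rho W y = X\beta + \varepsilon \;\Longrightarrow\; (I - \rho W)\,y = X\beta + \varepsilon$$

    $$y = (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}\varepsilon$$

    Also, note that (treating $X$ as fixed):

    $$\mathrm{var}[y] = \mathrm{var}[(I - \rho W)^{-1}X\beta] + \mathrm{var}[(I - \rho W)^{-1}\varepsilon]$$

    $$= (I - \rho W)^{-1}\,\mathrm{var}[\varepsilon]\,(I - \rho W')^{-1}$$

    $$= \sigma^2 (I - \rho W)^{-1}(I - \rho W')^{-1}$$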
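
    The reduced form can be checked numerically. The sketch below simulates a small SAL process, writing `rho` for the SAL parameter; the ring-shaped `W` (each unit has two neighbors), the parameter values, and the seed are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, sigma2 = 6, 0.4, 1.0
beta = np.array([1.0, 2.0])

# Hypothetical ring-shaped W: each unit's neighbors are the units on either side.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5

X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(scale=np.sqrt(sigma2), size=n)

A = np.eye(n) - rho * W
y = np.linalg.solve(A, X @ beta + eps)      # reduced form: y = A^{-1}(X beta + eps)

A_inv = np.linalg.inv(A)
var_y = sigma2 * A_inv @ A_inv.T            # variance of y given X
```

    Solving the linear system instead of forming the inverse is usually preferable for generating `y`; the explicit inverse is used here only to display the variance expression.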

    3. SAL Model: Estimation

    The SAL model can be estimated by maximum likelihood (MLE; Ord, 1975; formally, Lee, 2007).

    Take the model in reduced form, $y = (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}\varepsilon$, and assume that $\varepsilon$ is normally distributed.

    Note that the transformation of $\varepsilon$ into $(I - \rho W)^{-1}\varepsilon$ implies that the likelihood function will contain the following determinant: $|I - \rho W|$.

    This is the determinant of an $N \times N$ matrix and thus it is difficult to compute for large $N$.

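    Ord (1975) showed that $|I - \rho W| = \prod_{i=1}^{N}(1 - \rho\,\omega_i)$, where the $\omega_i$ are the eigenvalues of $W$, which are somewhat easier to compute.

    Then, the log-likelihood function ($\ell$) takes the form:

    $$\ell_{SAL} = \sum_{i=1}^{N}\ln(1 - \rho\,\omega_i) - \frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) - \frac{(y - \rho Wy - X\beta)'(y - \rho Wy - X\beta)}{2\sigma^2}$$

    which is maximized with respect to $\rho$, $\beta$, and $\sigma^2$.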
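
    A minimal numerical sketch of the eigenvalue shortcut and the resulting SAL log-likelihood follows, with `rho` standing for the SAL parameter. The function name `sal_loglik`, the ring-shaped `W`, and all parameter values are illustrative assumptions.

```python
import numpy as np

def sal_loglik(rho, beta, sigma2, y, X, W, eigvals=None):
    """SAL log-likelihood, using the eigenvalues of W for the log-determinant."""
    n = len(y)
    if eigvals is None:
        eigvals = np.linalg.eigvals(W)   # compute once and reuse across rho values
    # Cast to complex so the log is defined even if W has complex eigenvalues.
    log_det = np.sum(np.log(1.0 - rho * eigvals.astype(complex))).real
    resid = y - rho * (W @ y) - X @ beta
    return (log_det - 0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2)
            - resid @ resid / (2 * sigma2))

# Hypothetical ring-shaped W: each unit's neighbors are the units on either side.
n, rho = 8, 0.3
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5

omega = np.linalg.eigvals(W)
log_det_ord = np.sum(np.log(1.0 - rho * omega.astype(complex))).real
sign, log_det_direct = np.linalg.slogdet(np.eye(n) - rho * W)   # direct computation

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + rng.normal(size=n))
ll = sal_loglik(rho, beta, 1.0, y, X, W, eigvals=omega)
```

    The practical gain is that the eigenvalues are computed once, after which the log-determinant is a cheap sum for every trial value of `rho` inside an optimizer.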

    Notes:

    MLE is computationally expensive and potentially inaccurate for large $N$.

    There were no clear theoretical properties until Lee (2007), who showed that the estimator has the usual MLE properties.

    We will cover another estimation method in the GMM section.

    It is typically useful to use sparse matrices in programming, as $W$ is typically sparse.

    Note that the marginal effects are no longer $\beta$ but instead $(I - \rho W)^{-1}\beta$ (from the reduced form).
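
    To illustrate the marginal effects, the sketch below computes the $N \times N$ matrix of effects of one regressor, where entry $(i, j)$ is the effect on $y_i$ of a unit change in that regressor for unit $j$. The ring-shaped `W`, the values of `rho` and `beta_k`, and the labels "own" and "total" are assumptions made for illustration.

```python
import numpy as np

n, rho, beta_k = 5, 0.5, 2.0

# Hypothetical ring-shaped, row-standardized W.
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5

# effects[i, j] = d y_i / d x_jk, from the reduced form y = (I - rho W)^{-1}(X beta + eps)
effects = beta_k * np.linalg.inv(np.eye(n) - rho * W)

own = np.diag(effects)          # effect of a unit's own regressor on its own outcome
total = effects.sum(axis=1)     # effect on y_i when the regressor rises by one for every unit
```

    With a row-standardized `W` and `|rho| < 1`, each row of `effects` sums to `beta_k / (1 - rho)`, so the total effect exceeds `beta_k` whenever `rho` is positive.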

    4. The Spatial Autoregressive Error (SAE) Model

    Also known as the spatial autoregressive error (SARE) model or the spatial error model (SEM).

    $$y = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad \varepsilon \sim \text{iid}(0, \sigma^2)$$

    where $\lambda$ is the SAE parameter with $\lambda \in (-1, 1)$, and $Wu$ is the spatially lagged error term.

    Note that in this model there is no endogeneity, unlike in the SAL model.

    However, the error is clearly non-spherical. As a consequence, OLS is inefficient.

    4. SAE Model

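    The model can be written in the following reduced form:

    Note: $(I - \lambda W)\,u = \varepsilon \;\Longrightarrow\; u = (I - \lambda W)^{-1}\varepsilon$

    Thus: $y = X\beta + (I - \lambda W)^{-1}\varepsilon$

    and

    $$E[uu'] = E[(I - \lambda W)^{-1}\varepsilon\varepsilon'(I - \lambda W')^{-1}] = \sigma^2 (I - \lambda W)^{-1}(I - \lambda W')^{-1}$$

    The VCM of the errors (last line) is a full matrix, with both heteroskedasticity and spatial correlation.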
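
    A small numerical sketch makes the "full matrix" point concrete. It assumes five units on a line (end units have one neighbor, interior units two), a row-standardized `W`, and `lam = 0.5` for the SAE parameter; all of these are illustrative choices.

```python
import numpy as np

n, lam, sigma2 = 5, 0.5, 1.0

# Hypothetical line-shaped neighborhood: unit i is adjacent to i-1 and i+1.
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
W = adj / adj.sum(axis=1, keepdims=True)     # row-standardize

B_inv = np.linalg.inv(np.eye(n) - lam * W)
vcm_u = sigma2 * B_inv @ B_inv.T             # E[uu']

variances = np.diag(vcm_u)                   # differ across units: heteroskedasticity
```

    Every entry of `vcm_u` is nonzero here even though `W` only links adjacent units, and the variances on the diagonal differ between end units and interior units.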

    The SAL and SAE models are closely related (through non-linear constraints), and thus it is difficult to tell them apart in practice with statistical tests (more later).

    Thus, it is important to choose judiciously between the two and to justify the choice carefully.

    SAL is consistent with models of spillovers and strategic behavior in $y$, e.g., strategic competition as in example 1.

    SAE is consistent with spatial dependence arising from correlation in unobservables, e.g., natural heterogeneity over space as in example 2.

    4. SAE Model: Estimation

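    The SAE model can be estimated with MLE.

    Take the model in reduced form, define $\Omega(\lambda) \equiv (I - \lambda W)^{-1}(I - \lambda W')^{-1}$, and assume that $\varepsilon$ is normally distributed.

    Then the model can be seen as a GLS-type model with corresponding log-likelihood function:

    $$\ell_{SAE} = -\frac{1}{2}\ln|\Omega(\lambda)| - \frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) - \frac{(y - X\beta)'\,\Omega(\lambda)^{-1}(y - X\beta)}{2\sigma^2}$$

    which is maximized with respect to $\lambda$, $\beta$, and $\sigma^2$.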
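
    As a sketch of how this log-likelihood might be evaluated without forming the full error VCM, note that the log-determinant term equals $\ln|I - \lambda W|$ and the quadratic form equals the sum of squares of $(I - \lambda W)(y - X\beta)$. The function name `sae_loglik`, the ring-shaped `W`, and the data below are assumptions for illustration.

```python
import numpy as np

def sae_loglik(lam, beta, sigma2, y, X, W):
    """SAE log-likelihood evaluated through the filter B = I - lam * W."""
    n = len(y)
    B = np.eye(n) - lam * W
    _, log_det_B = np.linalg.slogdet(B)      # equals -0.5 * ln|Omega(lam)|
    v = B @ (y - X @ beta)                   # v'v = (y - Xb)' Omega(lam)^{-1} (y - Xb)
    return (log_det_B - 0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2)
            - v @ v / (2 * sigma2))

# Hypothetical data on a ring-shaped, row-standardized W.
rng = np.random.default_rng(0)
n, lam, sigma2 = 10, 0.4, 1.5
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
u = np.linalg.solve(np.eye(n) - lam * W, rng.normal(scale=np.sqrt(sigma2), size=n))
y = X @ beta + u
ll = sae_loglik(lam, beta, sigma2, y, X, W)
```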

    5. SAL-SAE Model

    Also known as the SAC model or the SAR-SARE model. It combines the previous two models.

    $$y = \rho W y + X\beta + u, \qquad u = \lambda M u + \varepsilon, \qquad \varepsilon \sim \text{iid}(0, \sigma^2)$$

    with all the variables defined as before and $M$ another spatial weighting matrix.

    In general, $W \neq M$, since identification issues can arise when $W = M$ (see, e.g., Anselin and Bera, 1998).

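    The model can be written in the following reduced form:

    $$y = (I - \rho W)^{-1}X\beta + (I - \rho W)^{-1}(I - \lambda M)^{-1}\varepsilon$$

    with variances given by:

    $$E[uu'] = \sigma^2 (I - \lambda M)^{-1}(I - \lambda M')^{-1}$$

    $$\mathrm{var}(y) = \sigma^2 (I - \rho W)^{-1}(I - \lambda M)^{-1}(I - \lambda M')^{-1}(I - \rho W')^{-1}$$

    The VCM of the errors is a full matrix.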

    5. SAL-SAE Model: Estimation

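    The SAL-SAE model can be estimated with MLE.

    Note that the reduced-form error is $(I - \rho W)^{-1}(I - \lambda M)^{-1}\varepsilon$, and assume that $\varepsilon$ is normal.

    Then the log-likelihood function for SAL-SAE, with $\Omega(\lambda) \equiv (I - \lambda M)^{-1}(I - \lambda M')^{-1}$, is:

    $$\ell = \ln|I - \rho W| - \frac{1}{2}\ln|\Omega(\lambda)| - \frac{N}{2}\ln(2\pi) - \frac{N}{2}\ln(\sigma^2) - \frac{(y - \rho Wy - X\beta)'\,\Omega(\lambda)^{-1}\,(y - \rho Wy - X\beta)}{2\sigma^2}$$

    which is maximized with respect to $\rho$, $\lambda$, $\beta$, and $\sigma^2$.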

    6. Testing

    Given the similarities in the models covered and the relative complexity of estimation, it is important to perform specification tests.

    The first specification test, Moran's I, is a commonly used general test for the presence of any spatial dependence.

    It tests $H_0$: no spatial dependence, with no particular $H_A$ in mind.

    6. Testing for Spatial Dependence

    The test statistic depends on whether $W$ is row-standardized.

    The expressions are as follows. The basis is

    $$I(R) = R\,\frac{e'We}{e'e}$$

    which is standardized by its mean and variance:

    $$E[I(R)] = R\,\frac{\mathrm{Tr}(M_X W)}{N - k}$$

    $$\mathrm{Var}[I(R)] = R^2\,\frac{\mathrm{Tr}(M_X W M_X W') + \mathrm{Tr}[(M_X W)^2] + [\mathrm{Tr}(M_X W)]^2}{(N - k)(N - k + 2)} - [E(I)]^2$$

    Then the test statistic becomes:

    $$\frac{I(R) - E[I(R)]}{V[I(R)]^{1/2}} \sim N(0, 1)$$

    where $R = 1$ if $W$ is row-standardized or $R = N/S$ if $W$ is not row-standardized, $e = y - X\hat\beta_{OLS}$, $M_X = (I - P_X) = (I - X(X'X)^{-1}X')$, $S = \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}$, $w_{ij}$ are the elements of $W$, $N$ is the sample size, and $k$ is the number of regressors.
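
    The standardized Moran's I statistic can be sketched in a few lines of NumPy. The function name `moran_test`, the ring-shaped `W`, and the simulated data are assumptions for illustration; the formulas follow the mean and variance expressions for the statistic based on OLS residuals.

```python
import numpy as np

def moran_test(y, X, W, row_standardized=True):
    """Return Moran's I, its mean and variance under H0, and the standardized statistic."""
    n, k = X.shape
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # M_X = I - X(X'X)^{-1}X'
    e = M @ y                                           # OLS residuals
    R = 1.0 if row_standardized else n / W.sum()
    I = R * (e @ W @ e) / (e @ e)
    MW = M @ W
    mean = R * np.trace(MW) / (n - k)
    var = (R ** 2 * (np.trace(MW @ M @ W.T) + np.trace(MW @ MW) + np.trace(MW) ** 2)
           / ((n - k) * (n - k + 2)) - mean ** 2)
    return I, mean, var, (I - mean) / np.sqrt(var)

# Hypothetical data with positively autocorrelated errors on a ring-shaped W.
rng = np.random.default_rng(0)
n = 100
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.linalg.solve(np.eye(n) - 0.8 * W, rng.normal(size=n))
y = X @ np.array([1.0, 2.0]) + u
I, mean, var, z = moran_test(y, X, W)
```

    The value `z` would then be compared with standard normal critical values.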

    Notes about Moran's I test:

    It is simple to implement, as it only requires OLS residuals and a plausible $W$.

    Since it is a general test, it is likely to have low power and does not point to any particular alternative.

    It has been extended to a large family of limited dependent variable (LDV) models (Kelejian and Prucha, 2001).

    6. Testing for SAE

    Several tests for $H_0: \lambda = 0$ are available based on MLE theory.

    For instance, an LR test comparing OLS ($\lambda = 0$) with the SAE model, done in the usual way ($LR \sim \chi^2(1)$).

    As usual in MLE theory, the LM test only requires estimation under the null (OLS), which avoids estimation of the SAE model.

    The LM test takes the form:

    $$LM = \left(\frac{e'We}{\hat\sigma^2}\right)^2 \frac{1}{\mathrm{Tr}[(W' + W)W]} \sim \chi^2(1)$$

    where $e$ are the OLS residuals and $\hat\sigma^2 = e'e/N$.
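
    A minimal sketch of the LM test for spatial error dependence, which needs only OLS residuals and `W`, is given below. The function name `lm_error`, the ring-shaped `W`, and the simulated data (with error parameter 0.8) are illustrative assumptions.

```python
import numpy as np

def lm_error(y, X, W):
    """LM statistic for H0: no spatial error dependence, computed from OLS residuals."""
    n = len(y)
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]   # OLS residuals
    sigma2 = e @ e / n
    T = np.trace((W.T + W) @ W)
    return (e @ W @ e / sigma2) ** 2 / T

# Hypothetical data with strong spatial error dependence.
rng = np.random.default_rng(1)
n, lam = 100, 0.8
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.linalg.solve(np.eye(n) - lam * W, rng.normal(size=n))
y = X @ np.array([1.0, 2.0]) + u
stat = lm_error(y, X, W)   # compare with the chi-square(1) 5% critical value, 3.84
```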

    6. Testing for SAL

    This implies $H_0: \rho = 0$, and tests can be devised based on MLE theory.

    For example, an LM test that only requires estimation under the null (OLS) is:

    $$LM = \frac{\left(e'Wy/\hat\sigma^2\right)^2}{(WX\hat\beta)'M_X(WX\hat\beta)/\hat\sigma^2 + \mathrm{Tr}[(W' + W)W]} \sim \chi^2(1)$$
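
    The LM test for a spatial lag can be sketched in the same way, again from OLS output only. The function name `lm_lag`, the ring-shaped `W`, and the simulated SAL data (with lag parameter 0.6) are assumptions for illustration.

```python
import numpy as np

def lm_lag(y, X, W):
    """LM statistic for H0: no spatial lag dependence, computed from OLS output."""
    n = len(y)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    sigma2 = e @ e / n
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # M_X
    WXb = W @ X @ beta
    T = np.trace((W.T + W) @ W)
    return (e @ W @ y / sigma2) ** 2 / (WXb @ M @ WXb / sigma2 + T)

# Hypothetical SAL data on a ring-shaped, row-standardized W.
rng = np.random.default_rng(2)
n, rho = 200, 0.6
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.linalg.solve(np.eye(n) - rho * W, X @ np.array([1.0, 2.0]) + rng.normal(size=n))
stat = lm_lag(y, X, W)   # compare with the chi-square(1) 5% critical value, 3.84
```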

    6. Other Tests

    Note that the tests for SAE and SAL above implicitly assume that $\rho = 0$ and $\lambda = 0$, respectively.

    If that assumption is not true, the test is not valid.

    Anselin (1988) proposed an LM test for $H_0: \rho = \lambda = 0$ based on OLS residuals, by comparing to the SAL-SAE model.

    But a problem is that rejection does not offer guidance about which one of SAE or SAL is present.

    One can test for SAE once SAL has been allowed for, to help determine if SAL adequately accounts for spatial dependence (the reverse is not as straightforward).

    The idea is to estimate the SAL model and test $H_0: \lambda = 0$ with an LM test:

    $$LM = \left(\frac{e'Me}{\hat\sigma^2}\right)^2 \frac{1}{\mathrm{Tr}[MM + M'M] - \left\{\mathrm{Tr}[MW(I - \hat\rho W)^{-1} + M'W(I - \hat\rho W)^{-1}]\right\}^2 \mathrm{var}(\hat\rho)} \sim \chi^2(1)$$

    A drawback is that it requires computation of the SAL model.

    Anselin, Bera, Florax and Yoon (1996) developed tests for SAE or SAL that only require OLS residuals and allow for local misspecification.

    That is, they allow for small misspecification of the value of $\rho$ ($\lambda$) in testing for SAE (SAL).

    Based on the work by Bera and Yoon (1993).

    Details of the expressions are in ABFY (1996).

    7. GMM

    Thus far we have employed MLE theory for estimation and testing.

    The use of GMM for spatial models was pioneered mainly by H. Kelejian and I. Prucha in the late 1990s.

    While MLE bases estimation and inference on the likelihood function, GMM is based on moment conditions.

    Take OLS as an example. You can obtain OLS estimates by minimizing the sum of squared residuals or maximizing the corresponding likelihood function.

    Or you can obtain them using the implied moment conditions: $E[X'e] = 0$.

    Using those moment conditions gives rise to a method-of-moments estimator.

    GMM is obtained by setting up a quadratic form in the moment conditions and minimizing it with respect to the parameters.
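
    The OLS example can be made concrete with a short sketch: the sample moment conditions are the averages of regressors times residuals, and the quadratic form in those moments is driven to zero at the OLS estimate. The function names, the identity weighting matrix, and the simulated data are illustrative assumptions.

```python
import numpy as np

def moment_conditions(b, y, X):
    """Sample analog of E[X'e] = 0 evaluated at coefficient vector b."""
    return X.T @ (y - X @ b) / len(y)

def gmm_objective(b, y, X, A=None):
    """Quadratic form g(b)' A g(b); A defaults to the identity matrix."""
    g = moment_conditions(b, y, X)
    A = np.eye(len(g)) if A is None else A
    return g @ A @ g

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]     # least squares
b_mm = np.linalg.solve(X.T @ X, X.T @ y)         # solves the moment conditions exactly
q_at_ols = gmm_objective(b_ols, y, X)
q_elsewhere = gmm_objective(b_ols + 0.1, y, X)
```

    Because there are as many moment conditions as parameters here, the choice of weighting matrix does not affect the estimate; it matters once the moments outnumber the parameters.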

    7. The GMM Revolution

    An influential paper is Kelejian and Prucha (1999; KP).

    Start with the simple model $u = \lambda M u + \varepsilon$, $\varepsilon \sim \text{iid}(0, \sigma^2)$, and assume for the moment that $u$ is observed.

    The model can be estimated with MLE ($\lambda$ and $\sigma^2$), but it is computationally expensive.

    KP (1999) showed that moments can also be employed, which results in a less computationally demanding algorithm.

    And when $u$ is not observed (e.g., when it is an error term), a consistent estimator of $u$ can be employed instead (e.g., residuals).

    Importantly, this allows employing GLS-type procedures in the estimation of SAE and SAL models!

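    7. KP (1999)

    In the simple model above, we know that $E[uu'] = \Omega(\lambda) = \sigma^2 (I - \lambda M)^{-1}(I - \lambda M')^{-1}$.

    We use the following notation in the expressions for the moments: $\bar u = Mu$, $\bar{\bar u} = MMu$, $\hat{\bar u} = M\hat u$, $\hat{\bar{\bar u}} = MM\hat u$, with $\hat u$ an estimate of $u$. Also $\bar\varepsilon = M\varepsilon$. Then, it is not hard to show that the following moments hold:

    $$E\left[\tfrac{1}{N}\varepsilon'\varepsilon\right] = \sigma^2$$

    $$E\left[\tfrac{1}{N}\bar\varepsilon'\bar\varepsilon\right] = \frac{\sigma^2}{N}\mathrm{Tr}(M'M)$$

    $$E\left[\tfrac{1}{N}\bar\varepsilon'\varepsilon\right] = 0$$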

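    Using these 3 moments we are able to estimate $\lambda$ and $\sigma^2$, since $\varepsilon = u - \lambda\bar u$ and $\bar\varepsilon = \bar u - \lambda\bar{\bar u}$. Thus, substituting them into the moments we have

    $$\begin{bmatrix} \frac{2}{N}E(u'\bar u) & -\frac{1}{N}E(\bar u'\bar u) & 1 \\ \frac{2}{N}E(\bar{\bar u}'\bar u) & -\frac{1}{N}E(\bar{\bar u}'\bar{\bar u}) & \frac{1}{N}\mathrm{Tr}(M'M) \\ \frac{1}{N}E(u'\bar{\bar u} + \bar u'\bar u) & -\frac{1}{N}E(\bar u'\bar{\bar u}) & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ \lambda^2 \\ \sigma^2 \end{bmatrix} = \begin{bmatrix} \frac{1}{N}E(u'u) \\ \frac{1}{N}E(\bar u'\bar u) \\ \frac{1}{N}E(u'\bar u) \end{bmatrix}$$

    or, compactly,

    $$\Gamma_N \begin{bmatrix} \lambda \\ \lambda^2 \\ \sigma^2 \end{bmatrix} = \gamma_N$$

    which can be used to obtain $\hat\lambda$ and $\hat\sigma^2$ by substituting sample moments and using GMM or NLLS. The estimates are consistent and asymptotically normal (KP, 1999).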
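
    A rough sketch of the moment-based estimator follows. It builds the sample versions of the three moment equations and minimizes the sum of squared moment residuals; for simplicity it uses a grid search over the autoregressive parameter with the variance concentrated out in closed form, which is a stand-in for a proper NLLS routine. The function names, the grid, the ring-shaped `M`, and the simulated data are assumptions.

```python
import numpy as np

def kp_moments(u, M):
    """Sample analogs of the 3x3 matrix Gamma_N and the vector gamma_N."""
    n = len(u)
    u1 = M @ u          # u-bar
    u2 = M @ u1         # u-double-bar
    Gamma = np.array([
        [2 * (u @ u1), -(u1 @ u1), n],
        [2 * (u2 @ u1), -(u2 @ u2), np.trace(M.T @ M)],
        [u @ u2 + u1 @ u1, -(u1 @ u2), 0.0],
    ]) / n
    gamma = np.array([u @ u, u1 @ u1, u @ u1]) / n
    return Gamma, gamma

def kp_estimate(u, M, grid=np.linspace(-0.99, 0.99, 1981)):
    """Least squares on the moment equations: grid over lam, closed form for sigma2."""
    Gamma, gamma = kp_moments(u, M)
    c = Gamma[:, 2]
    best = None
    for lam in grid:
        target = gamma - Gamma[:, 0] * lam - Gamma[:, 1] * lam ** 2
        s2 = c @ target / (c @ c)                 # best sigma2 for this lam
        ssr = np.sum((target - c * s2) ** 2)
        if best is None or ssr < best[0]:
            best = (ssr, lam, s2)
    return best[1], best[2]

# Hypothetical data: u = lam * M u + eps on a ring-shaped M, with lam = 0.5, sigma2 = 1.
rng = np.random.default_rng(0)
n, lam_true = 500, 0.5
idx = np.arange(n)
M = np.zeros((n, n))
M[idx, (idx + 1) % n] = 0.5
M[idx, (idx - 1) % n] = 0.5
eps = rng.normal(size=n)
u = np.linalg.solve(np.eye(n) - lam_true * M, eps)
lam_hat, sigma2_hat = kp_estimate(u, M)
```

    In applications `u` would be replaced by residuals from a consistent first-step estimator.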

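    The previous results can be applied to the SAE model: $y = X\beta + u$, $u = \lambda M u + \varepsilon$, and $\varepsilon \sim \text{iid}(0, \sigma^2)$.

    Recall that OLS is consistent; then $\hat u = y - X\hat\beta_{OLS}$ is consistent for $u$.

    Now FGLS can be applied as follows:

    1. Do OLS and obtain $\hat\beta_{OLS}$ and $\hat u$.

    2. Use $\hat u$ to estimate $\lambda$ using the KP moments.

    3. Obtain $\Omega(\hat\lambda) = \hat\sigma^2 (I - \hat\lambda M)^{-1}(I - \hat\lambda M')^{-1}$.

    4. Obtain $\hat\beta_{FGLS} = [X'\Omega^{-1}(\hat\lambda)X]^{-1}X'\Omega^{-1}(\hat\lambda)y$.

    Advantages: We avoid computationally intensive MLE.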
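
    The final FGLS step can be sketched without inverting the error VCM: premultiplying the data by the spatial filter (identity minus the estimated error parameter times `M`) and running OLS on the filtered data gives the same coefficients as the GLS formula, because the variance scale cancels. The function name `sae_fgls` and the value passed as `lam_hat` are assumptions; in practice `lam_hat` would come from the KP moment step.

```python
import numpy as np

def sae_fgls(y, X, M, lam_hat):
    """FGLS for the SAE model via OLS on spatially filtered data."""
    B = np.eye(len(y)) - lam_hat * M
    return np.linalg.lstsq(B @ X, B @ y, rcond=None)[0]

# Hypothetical data on a ring-shaped, row-standardized M.
rng = np.random.default_rng(0)
n, lam = 60, 0.5
idx = np.arange(n)
M = np.zeros((n, n))
M[idx, (idx + 1) % n] = 0.5
M[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.linalg.solve(np.eye(n) - lam * M, rng.normal(size=n))
y = X @ np.array([1.0, 2.0]) + u

lam_hat = 0.45                      # placeholder for an estimate from the KP moments
beta_fgls = sae_fgls(y, X, M, lam_hat)
```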

    7. Using IVs to Estimate the SAL Model

    Recall that the problem in the SAL model is the endogeneity of $Wy$: $y = \rho Wy + X\beta + \varepsilon$, $\varepsilon \sim \text{iid}(0, \sigma^2)$.

    What about using instrumental variables (IVs) instead of MLE?

    The optimal IVs are $E[Wy] = W E[y] = W(I - \rho W)^{-1}X\beta$.

    But the optimal IVs depend on unknown parameters!

    One approximation to the optimal IVs (KP, 1998): $E[y] = (I - \rho W)^{-1}X\beta = \sum_{i=0}^{\infty}\rho^i W^i X\beta$, if $|\rho| < 1$ and with $W^0 = I$.

    Thus, a practical approximation to the optimal IVs is the set of linearly independent columns of $[X, WX, W^2X, W^3X, \ldots]$.

    Then, using those KP instruments we can employ TSLS:

    1. Regress $Wy$ on the KP IVs and obtain the fitted values $\widehat{Wy}$.

    2. Run $y = \rho\,\widehat{Wy} + X\beta + \varepsilon$ to obtain $\hat\rho_{TSLS}$ and $\hat\beta_{TSLS}$.
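
    The two steps above can be sketched as follows, using the spatial lags of the regressors up to second order as instruments. The function name `spatial_tsls`, the number of lags, the ring-shaped `W`, and the simulated data (lag parameter 0.5) are illustrative assumptions.

```python
import numpy as np

def spatial_tsls(y, X, W, n_lags=2):
    """TSLS for the SAL model with instruments [X, WX, ..., W^n_lags X]."""
    blocks, WX = [X], X
    for _ in range(n_lags):
        WX = W @ WX
        blocks.append(WX)
    H = np.column_stack(blocks)
    Wy = W @ y
    # Stage 1: project Wy on the instruments. lstsq copes with collinear columns,
    # e.g. W times the constant equals the constant when W is row-standardized.
    Wy_hat = H @ np.linalg.lstsq(H, Wy, rcond=None)[0]
    # Stage 2: regress y on the fitted spatial lag and X.
    coef = np.linalg.lstsq(np.column_stack([Wy_hat, X]), y, rcond=None)[0]
    return coef[0], coef[1:]

# Hypothetical SAL data on a ring-shaped, row-standardized W.
rng = np.random.default_rng(0)
n, rho = 400, 0.5
beta = np.array([1.0, 2.0])
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + rng.normal(size=n))
rho_hat, beta_hat = spatial_tsls(y, X, W)
```

    Standard errors are omitted here; they would need the usual TSLS variance formula based on residuals computed with the actual spatial lag rather than the fitted one.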

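    7. Application to the SAL-SAE Model

    Recall: $y = \rho Wy + X\beta + u$, $u = \lambda M u + \varepsilon$, $\varepsilon \sim \text{iid}(0, \sigma^2)$.

    To avoid MLE, use a combination of the previous two methods: Generalized Spatial TSLS (GSTSLS) by KP (1998).

    1. Use TSLS as in the previous slide to obtain $\hat u$, which is a consistent estimate of $u$.

    2. Use the KP moments to estimate $\lambda$: $\hat\lambda$.

    3. Do TSLS in the following (Cochrane-Orcutt-type) transformed model that uses $\hat\lambda$: $(y - \hat\lambda My) = \rho(Wy - \hat\lambda MWy) + (X - \hat\lambda MX)\beta + \varepsilon$, or $y(\hat\lambda) = \rho\,Wy(\hat\lambda) + X(\hat\lambda)\beta + \varepsilon$.

    4. Apply TSLS again using as KP IVs the extended set $[X, WX, W^2X, MX, MWX, MW^2X]$ to get $\widehat{Wy}(\hat\lambda)$.

    5. Run $y(\hat\lambda) = \rho\,\widehat{Wy}(\hat\lambda) + X(\hat\lambda)\beta + \varepsilon$ to obtain $\hat\rho_{GSTSLS}$ and $\hat\beta_{GSTSLS}$.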
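
    The five steps above can be sketched end to end as follows. This is a rough illustration under stated assumptions, not a reference implementation: the helper names (`ring`, `tsls`, `kp_lambda`, `gstsls`), the grid search used in place of NLLS for the moment step, the two ring-shaped matrices, and the simulated data are all hypothetical choices.

```python
import numpy as np

def ring(n, step):
    """Hypothetical row-standardized weights: neighbors are the units `step` places away."""
    idx, A = np.arange(n), np.zeros((n, n))
    A[idx, (idx + step) % n] = 0.5
    A[idx, (idx - step) % n] = 0.5
    return A

def tsls(y, Z, H):
    """TSLS of y on Z with instruments H (lstsq tolerates collinear instruments)."""
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    return np.linalg.lstsq(Z_hat, y, rcond=None)[0]

def kp_lambda(u, M, grid=np.linspace(-0.99, 0.99, 1981)):
    """Moment-based estimate of the error parameter; sigma2 is concentrated out."""
    n, u1 = len(u), M @ u
    u2 = M @ u1
    G = np.array([[2 * (u @ u1), -(u1 @ u1), n],
                  [2 * (u2 @ u1), -(u2 @ u2), np.trace(M.T @ M)],
                  [u @ u2 + u1 @ u1, -(u1 @ u2), 0.0]]) / n
    g = np.array([u @ u, u1 @ u1, u @ u1]) / n
    c = G[:, 2]
    def ssr(lam):
        t = g - G[:, 0] * lam - G[:, 1] * lam ** 2
        return np.sum((t - c * (c @ t) / (c @ c)) ** 2)
    return min(grid, key=ssr)

def gstsls(y, X, W, M):
    WX, WWX = W @ X, W @ W @ X
    Z = np.column_stack([W @ y, X])
    delta = tsls(y, Z, np.column_stack([X, WX, WWX]))             # step 1
    lam_hat = kp_lambda(y - Z @ delta, M)                         # step 2
    B = np.eye(len(y)) - lam_hat * M                              # step 3: filter
    H = np.column_stack([X, WX, WWX, M @ X, M @ WX, M @ WWX])     # step 4: extended IVs
    delta = tsls(B @ y, B @ Z, H)                                 # step 5
    return delta[0], delta[1:], lam_hat

rng = np.random.default_rng(0)
n, rho, lam = 400, 0.4, 0.5
beta = np.array([1.0, 2.0])
W, M = ring(n, 1), ring(n, 2)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = np.linalg.solve(np.eye(n) - lam * M, rng.normal(size=n))
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + u)
rho_hat, beta_hat, lam_hat = gstsls(y, X, W, M)
```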

    8. Software

    Two common programs for spatial econometric analysis:

    MATLAB has a spatial econometrics toolbox available, developed by J. LeSage: http://www.spatial-econometrics.com/

    R has a spatial package available: http://r-spatial.sourceforge.net/

    Both offer routines for many of the methods reviewed above (unfortunately, Stata does not have many spatial econometrics commands).

    To conduct empirical work using spatial data, it is a great advantage to have a working knowledge of ArcGIS.

    8. Active Areas of Research

    Spatial Econometrics is a very active area of research.

    Panel Data Models: e.g., Lee and Yu (2010), Baltagi (2005), Kapoor, Kelejian and Prucha (2007).

    Limited Dependent Variable Models: e.g., Pinkse and Slade (1998), Klier and McMillen (2008), LeSage and Pace (2009), Flores-Lagunes and Schnier (2012).

    Nonparametric and Semiparametric Models: e.g., McMillen and Redfearn (2010), McMillen (2010).
