DEM 7263 Fall 2015 - Spatially Autoregressive Models 1

Corey S. Sparks, Ph.D.

September 9, 2015

Introduction to Spatial Regression Models

Up until now, we have been concerned with describing the structure of spatial data through correlation and the methods of exploratory spatial data analysis (http://rpubs.com/corey_sparks/105700). Through ESDA, we examined data for patterns, and using the Moran I and Local Moran I statistics, we examined clustering of variables. Now we consider regression models for continuous outcomes. We begin with a review of the Ordinary Least Squares model for a continuous outcome.

OLS Model

The basic OLS model is an attempt to estimate the effect of an independent variable(s) on the value of a dependent variable. This is written as:

$y = \beta_0 + \beta_1 x + e$

where y is the dependent variable that we want to model, x is the independent variable we think has an association with y, $\beta_0$ is the model intercept, or grand mean of y when x = 0, and $\beta_1$ is the slope parameter that defines the strength of the linear relationship between x and y. e is the error in the model for y that is unaccounted for by the values of x and the grand mean $\beta_0$. The average, or expected, value of y is $E[y|x] = \beta_0 + \beta_1 x$, which is the linear mean function for y, conditional on x, and this gives us the customary linear regression plot:

set.seed(1234)
x <- rnorm(100, 10, 5)
beta0 <- 1
beta1 <- 1.5
y <- beta0 + beta1*x + rnorm(100, 0, 5)
plot(x, y)
abline(coef = coef(lm(y~x)), lwd = 1.5)
##             Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 1.446620  1.0879494  1.329676 1.867119e-01
## x           1.473915  0.1037759 14.202863 1.585002e-25
Where the line shows $E[y|x] = \beta_0 + \beta_1 x_i$.

We assume that the errors, $e_i$, are independent, Normally distributed and homoskedastic, with variance $\sigma^2$: $e_i \sim N(0, \sigma^2)$.

This is the simple model with one predictor. We can easily add more predictors to the equation and rewrite it:

$y_i = \beta_0 + \sum_k \beta_k x_{ik} + e_i$

So, now the mean of y is modeled with multiple x variables. We can write this relationship more compactly using matrix notation:

$Y = X'\beta + e$

Where Y is now a $n \times 1$ vector of observations of our dependent variable, X is a $n \times k$ matrix of independent variables, with the first column being all 1's, and e is the $n \times 1$ vector of errors for each observation.
The residuals are uncorrelated, with covariance matrix $\text{Cov}(e) = \sigma^2 I$.
To estimate the coefficients, we use the customary OLS estimator:

$\hat{\beta} = (X'X)^{-1} X'Y$

This is the estimator that minimizes the residual sum of squares:

$\sum_i (y_i - x_i'\beta)^2$

or

$e'e = (Y - X\beta)'(Y - X\beta)$
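As a check on the algebra, the estimator can be computed directly in base R and compared against lm(); a minimal sketch, re-creating the simulated x and y from above so the chunk is self-contained:

```r
# OLS by hand: beta-hat = (X'X)^{-1} X'Y, compared against lm()
set.seed(1234)
x <- rnorm(100, 10, 5)
y <- 1 + 1.5*x + rnorm(100, 0, 5)

X <- cbind(1, x)                          # design matrix, first column all 1's
bhat <- solve(t(X) %*% X) %*% t(X) %*% y  # (X'X)^{-1} X'Y

fit <- lm(y ~ x)
all.equal(as.numeric(bhat), as.numeric(coef(fit)))  # TRUE
```

The two sets of coefficients agree to numerical precision; lm() uses a QR decomposition rather than the normal equations, but the least-squares solution is the same.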
We can inspect the properties of the estimates by examining the residuals, $e_i$, of the model. Since we assume the data are normal, a quantile-quantile (Q-Q) plot of the residuals against the expected quantiles of the standard normal distribution should be a straight line. Formal tests of normality can also be used.
We may also inspect the association between $e_i$, or more appropriately the studentized/standardized residuals, and the predictors and the dependent variable. If we see evidence of association, then homoskedasticity is a poor assumption.
fit <- lm(y ~ x)   # the model fit to the simulated data above
par(mfrow = c(2, 2))
plot(fit)
par(mfrow=c(1,1))
Model-data agreement

Do we (meaning our data) meet the statistical assumptions of our analytical models? Always ask this of any analysis you do; if your model is wrong, your inference will also be wrong.
Since spatial data often display correlations amongst closely located observations (autocorrelation), we should probably test for autocorrelation in the model residuals, as that would violate the assumptions of the OLS model.
One method for doing this is to calculate the value of Moran’s I for the OLS residuals.
## Neighbour list object:
## Number of regions: 235 
## Number of nonzero links: 1106 
## Percentage nonzero weights: 2.002716 
## Average number of links: 4.706383 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 
##  4 10 30 62 66 34 24  3  2 
## 4 least connected regions:
## 61 82 147 205 with 1 link
## 2 most connected regions:
## 31 55 with 9 links
## 
##  Global Moran's I for regression residuals
## 
## data:  
## model: lm(formula = I(viol3yr/acs_poptot) ~ pfemhh + hwy +
## p5yrinmig + log(MEDHHINC), data = dat)
## weights: salw
## 
## Moran I statistic standard deviate = 0.75475, p-value = 0.2252
## alternative hypothesis: greater
## sample estimates:
## Observed Moran's I        Expectation           Variance 
##        0.021326432       -0.011406176        0.001880845
In this case, there appears to be no clustering in the residuals, since the observed value of Moran's I is 0.021, with a z-test of 0.75, p = 0.225.
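To see what the residual Moran test is computing, Moran's I of a residual vector can be done by hand; a toy sketch with a made-up 4-area weight matrix (illustrative values, not the lecture's San Antonio data):

```r
# Moran's I of residuals by hand: I = (n/S0) * (e' W e) / (e' e)
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              0, 1, 1, 0), nrow = 4, byrow = TRUE)
W <- W / rowSums(W)              # row-standardize ("W" style weights)

e  <- c(0.5, -0.2, 0.1, -0.4)    # stand-in residuals (mean zero)
n  <- length(e)
S0 <- sum(W)                     # sum of all weights
I  <- (n / S0) * as.numeric(t(e) %*% W %*% e) / sum(e^2)
I                                # about -0.022: essentially no clustering
```

lm.morantest() additionally computes the expectation and variance of I under the null, which is where the z-score and p-value above come from.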
Extending the OLS model to accommodate spatial structure

Now assume we measure our Y and X's at specific spatial locations (s), so that we have Y(s) and X(s).
In most analyses, the spatial location (i.e. the county or census tract) only serves to link X and Y so we can collect our data on them, and the subsequent analysis ignores this spatial information rather than explicitly considering the spatial relationships between the variables or the locations.
In fact, even though we measure Y(s) and X(s), what we end up analyzing is X and Y, and we apply the ordinary regression methods to these data to understand the effects of X on Y.
Moreover, we could move them around in space (as long as we keep the observations $y_i$ together with $x_i$) and still get the same results. Such analyses have been called a-spatial. This is the kind of regression model you are used to fitting, where we ignore any information on the locations of the observations themselves.
However, we can extend the simple regression case to include the information on (s) and incorporate it into our models explicitly, so they are no longer a-spatial.
There are several alternative methods by which to incorporate the (s) locations into our models:

- The structured linear mixed (multi-level) model, or GLMM (generalized linear mixed model)
- Spatial filtering of observations
- Spatially autoregressive models
- Geographically weighted regression
We will first deal with the case of the spatially autoregressive model, or SAR model, as its structure is just a modification of the OLS model from above.
Spatially autoregressive models

We saw in the normal OLS model that some of the basic assumptions of the model are that: 1) the model residuals are distributed as iid standard normal random variates, and 2) they have common (and constant, meaning homoskedastic) unit variance.
Spatial data, however, present a series of problems to the standard OLS regression model. These problems are typically seen as various representations of spatial structure or dependence within the data. The spatial structure of the data can introduce spatial dependence into the outcome, the predictors, and the model residuals.
This can be observed as neighboring observations both having high (or low) values (positive autocorrelation) for the dependent variable, the model predictors, or the model residuals. We can also observe situations where areas with high values are surrounded by areas with low values (negative autocorrelation).
Since the standard OLS model assumes the residuals (and the outcomes themselves) are uncorrelated, as previously stated, the autocorrelation inherent to most spatial data introduces factors that violate the iid distributional assumptions for the residuals, and could violate the assumption of common variance for the OLS residuals. To account for the expected spatial association in the data, we would like a model that accounts for the spatial structure of the data. One way of doing this is by allowing there to be correlation between residuals in our model, or correlation in the dependent variable itself.
We are familiar with the concept of autoregression amongst neighboring observations. This concept is that a particular observation is a linear combination of its neighboring values. This autoregression introduces dependence into the data. Instead of specifying the autoregression structure directly, we introduce spatial autocorrelation through a global autocorrelation coefficient and a spatial proximity measure.
There are 2 basic forms of the spatial autoregressive model: the spatial lag and the spatial error models.
Both of these models build on the basic OLS regression model: $Y = X'\beta + e$
Where Y is the dependent variable, X is the matrix of independent variables, $\beta$ is the vector of regression parameters to be estimated from the data, and e are the model residuals, which are assumed to be distributed as a Gaussian random variable with mean 0 and constant variance-covariance matrix $\sigma^2 I$.
The spatial lag model

The spatial lag model introduces autocorrelation into the regression model by lagging the dependent variables themselves, much like in a time-series approach. The model is specified as:

$Y = \rho W Y + X'\beta + e$
where $\rho$ is the autoregressive coefficient, which tells us how strong the resemblance is, on average, between $Y_i$ and its neighbors. The matrix W is the spatial weight matrix, describing the spatial network structure of the observations, as we described in the ESDA lecture.
In the lag model, we are specifying the spatial component on the dependent variable. This leads to a spatial filtering of the variable, where the values are averaged over the surrounding neighborhood defined in W, called the spatially lagged variable.
The specification that is used most often is a spatially filtered Y variable that can then be regressed on X, which can directly be seen in a re-expression of the OLS model as:

$Y - \rho W Y = (I - \rho W) Y = X'\beta + e$
where the direct effect of the spatial lagging of the dependent variable is seen.
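The re-expression above also shows how to simulate from the lag model, since $Y = (I - \rho W)^{-1}(X'\beta + e)$; a base-R sketch with a toy 4-area weight matrix (illustrative values, not the lecture's data):

```r
# Simulate from Y = rho*W*Y + X*beta + e by solving (I - rho*W) Y = X*beta + e
set.seed(42)
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              0, 1, 1, 0), 4, 4, byrow = TRUE)
W <- W / rowSums(W)                 # row-standardized weights
rho  <- 0.4                         # autoregressive coefficient
X    <- cbind(1, rnorm(4))
beta <- c(1, 2)
e    <- rnorm(4, 0, 0.1)

Y <- solve(diag(4) - rho * W, X %*% beta + e)  # reduced-form draw

# check: Y satisfies the structural form Y = rho*W*Y + X*beta + e
all.equal(as.numeric(Y), as.numeric(rho * W %*% Y + X %*% beta + e))  # TRUE
```

With row-standardized weights the eigenvalues of W lie in [-1, 1], so (I - ρW) is invertible whenever |ρ| < 1; this is also why estimated ρ values are bounded.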
To estimate these models we can use either GeoDa or R. In R, we use the spdep package and the lagsarlm() function.
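The call that produced the output below can be read off its Call: line; a reconstruction (dat and salw are the lecture's San Antonio tract data and listw weights, which are not defined in this excerpt):

```r
# Reconstructed from the Call: line of the summary output that follows;
# 'dat' and 'salw' come from earlier in the lecture and are assumed to exist.
library(spdep)
fit.lag <- lagsarlm(I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + log(MEDHHINC),
                    data = dat, listw = salw)
summary(fit.lag, Nagelkerke = TRUE)
```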
## 
## Call:
## lagsarlm(formula = I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + 
##     log(MEDHHINC), data = dat, listw = salw)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0635446 -0.0147641 -0.0036721  0.0090372  0.3252902 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                 Estimate Std. Error z value  Pr(>|z|)
## (Intercept)    0.3136449  0.0924307  3.3933 0.0006906
## pfemhh         0.1913535  0.0336049  5.6942 1.239e-08
## hwy            0.0075802  0.0056013  1.3533 0.1759604
## p5yrinmig      0.0794330  0.0202592  3.9208 8.824e-05
## log(MEDHHINC) -0.0337148  0.0082612 -4.0811 4.482e-05
## 
## Rho: 0.034756, LR test value: 0.18517, p-value: 0.66697
## Asymptotic standard error: 0.082235
##     z-value: 0.42264, p-value: 0.67256
## Wald statistic: 0.17862, p-value: 0.67256
## 
## Log likelihood: 441.5604 for lag model
## ML residual variance (sigma squared): 0.0013657, (sigma: 0.036955)
## Nagelkerke pseudo-R-squared: 0.36486 
## Number of observations: 235 
## Number of parameters estimated: 7 
## AIC: -869.12, (AIC for lm: -870.94)
## LM test for residual autocorrelation
## test value: 0.4691, p-value: 0.4934
We see that $\rho$ is estimated to be 0.034, and the likelihood ratio test shows that this is not significantly different from 0.
The spatial error model

The spatial error model says that the autocorrelation is not in the outcome itself; instead, any autocorrelation is attributable to there being missing spatial covariates in the data. If these spatially patterned covariates could be measured, then the autocorrelation would be 0. This model is written:

$Y = X'\beta + e$

$e = \lambda W e + v$
This model, in effect, controls for the nuisance of correlated errors in the data that are attributable to an inherently spatial process, or to spatial autocorrelation in the measurement errors of the measured and possibly unmeasured variables in the model.
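The error process $e = \lambda W e + v$ can likewise be simulated by solving $(I - \lambda W)e = v$; a base-R sketch with a toy 3-area chain of neighbors (illustrative only):

```r
# Spatially autocorrelated errors: e = lambda*W*e + v  =>  e = (I - lambda*W)^{-1} v
set.seed(7)
W <- rbind(c(0,   1, 0),
           c(0.5, 0, 0.5),
           c(0,   1, 0))           # row-standardized chain: 1-2, 2-3
lambda <- 0.5
v <- rnorm(3)                      # iid innovations

e <- solve(diag(3) - lambda * W, v)

# check: e satisfies e = lambda*W*e + v
all.equal(as.numeric(e), as.numeric(lambda * W %*% e + v))  # TRUE
```

The point of the check is that even though v is iid, the resulting e is spatially correlated, which is exactly the violation of the OLS assumptions the error model absorbs.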
This model is estimated in R using errorsarlm() in the spdep library.
These fit statistics, while slightly lower than for the OLS model, show little evidence of favoring the spatial regression models in this case.
Examination of Model Specification

To some degree, both of the SAR specifications allow us to model spatial dependence in the data. The primary difference between them is where we model said dependence.
The lag model says that the dependence affects the dependent variable only; we can liken this to a diffusion scenario, where your neighbors have a diffusive effect on you.
The error model says that dependence affects the residuals only. We can liken this to the missing spatially dependent covariate situation, where, if only we could measure another really important spatially associated predictor, we could account for the spatial dependence. But alas, we cannot, and we instead model dependence in our errors.
These are inherently two completely different ways to think about specifying a model, and we should really make our decision based upon how we think our process of interest operates.
That being said, this way of thinking isn't necessarily popular among practitioners. Most practitioners want the best fitting model, 'nuff said. So methods have been developed that test for alternate model specifications, to see which kind of model best summarizes the observed variation in the dependent variable and the spatial dependence.
These are a set of so-called Lagrange Multiplier (econometrician's jargon for a score test (https://en.wikipedia.org/wiki/Score_test)) tests. These tests compare the model fits from the OLS, spatial error, and spatial lag models using the method of the score test.
For those who don't remember, the score test is a test based on the relative change in the first derivative of the likelihood function around the maximum likelihood. The particular thing here that is affecting the value of this derivative is the autoregressive parameter, $\rho$ or $\lambda$. In the OLS model, $\rho$ or $\lambda$ = 0 (so both the lag and error models simplify to OLS), but as this parameter changes, so does the likelihood for the model, hence why the derivative of the likelihood function is used. This is all related to how the estimation routines estimate the value of $\rho$ or $\lambda$.
Using the Lagrange Multiplier Test (LMT)

In general, you fit the OLS model to your dependent variable, then submit the OLS model fit to the LMT testing procedure.
Then you look to see which model (spatial error or spatial lag) has the highest value for the test, and choose that specification if its value is drastically bigger. What counts as drastically bigger? If the LMT for the error model is 2500 and the LMT for the lag model is 2480, this is NOT A BIG DIFFERENCE, only about 1%. If you see a LMT for the error model of 2500 and a LMT for the lag model of 250, THIS IS A BIG DIFFERENCE.
So what if you don’t see a BIG DIFFERENCE, HOW DO YOU DECIDE WHICH MODEL TO USE???
Well, you could think more, but who has time for that.
The econometricians have thought up a "better" LMT test, the so-called robust LMT (robust to what, I'm not sure), but it is said that it can settle such problems of a "not so big difference" between the lag and error model specifications.
So what do you do? In general, think about your problem before you run your analysis; should this fail you, proceed with using the LMT; if this is inconclusive, look at the robust LMT, and choose the model which has the larger value for this test.
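The whole procedure maps onto a single spdep call; a sketch where fit.ols and lw are placeholders for your fitted lm() model and neighbor weights (names are hypothetical, not from the lecture):

```r
# Sketch: Lagrange Multiplier tests on an OLS fit (fit.ols) with weights (lw)
library(spdep)
lm.LMtests(fit.ols, listw = lw,
           test = c("LMerr", "LMlag", "RLMerr", "RLMlag"))
# Compare LMerr with LMlag first; if the difference is not big, compare the
# robust versions RLMerr and RLMlag and pick the specification with the
# larger (significant) statistic.
```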
More Data Examples

San Antonio, TX mortality data
#create a mortality rate, 3 year average
dat$mort3 <- apply(dat@data[, c("deaths09", "deaths10", "deaths11")], 1, mean)
dat$mortrate <- 1000*dat$mort3/dat$acs_poptot
#just a histogram
hist(dat$mortrate)
Chi and Zhu (http://link.springer.com/article/10.1007/s11113-007-9051-8#page-1) suggest using a wide array of neighbor specifications, then picking the one that maximizes the autocorrelation coefficient. So, here I emulate their results:
## 
##  Global Moran's I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wtr
## 
## Moran I statistic standard deviate = 0.65308, p-value = 0.2569
## alternative hypothesis: greater
## sample estimates:
## Observed Moran's I        Expectation           Variance 
##        0.017905792       -0.010601820        0.001905428
#looks like we have minimal autocorrelation in our residuals, but the distance based
#weight does show significant autocorrelation
#Let's look at the local autocorrelation in our residuals
#get the values of I
dat$lmfit1 <- localmoran(dat$mortrate, sa.wt5, zero.policy = T)[, 1]
brks <- classIntervals(dat$lmfit1, n = 5, style = "quantile")
spplot(dat, "lmfit1", at = brks$brks, col.regions = brewer.pal(5, "RdBu"),
       main = "Local Moran Plot of Mortality Rate")
#Now we fit the spatial lag model
#The lag model is fit with lagsarlm() in the spdep library
#we basically specify the same model as in the lm() fit above
#But we need to specify the spatial weight matrix and the type
#of lag model to fit
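The chunk those comments describe is not shown in this excerpt; a reconstruction, assuming the same formula and weights as the error-model fit that follows (fit.lag.sa is a hypothetical name):

```r
# Reconstruction of the lag-model fit described by the comments above;
# formula and listw mirror the errorsarlm() call used next in the lecture.
fit.lag.sa <- lagsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) +
                       scale(dissim) + scale(ppop65plus), data = dat, listw = sa.wt2)
summary(fit.lag.sa, Nagelkerke = TRUE)
```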
#Next we fit the spatial error model
fit.err <- errorsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) +
                      scale(dissim) + scale(ppop65plus), data = dat, listw = sa.wt2)
summary(fit.err, Nagelkerke = T)
This example shows a lot more in terms of spatial effects.
spdat <- readShapePoly("~/Google Drive/dem7263/data/usdata_mort.shp")
#Create a good representative set of neighbor types
us.nb6 <- knearneigh(coordinates(spdat), k = 6)
us.nb6 <- knn2nb(us.nb6)
us.wt6 <- nb2listw(us.nb6, style = "W")
#do some basic regression models, without spatial structure
fit.1.us <- lm(scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + scale(pblack_1) +
               scale(phisp) + factor(RUCC), spdat)
summary(fit.1.us)
#extract studentized residuals from the fit, and examine them
spdat$residfit1 <- rstudent(fit.1.us)
cols <- brewer.pal(5, "RdBu")
spplot(spdat, "residfit1", at = quantile(spdat$residfit1), col.regions = cols,
       main = "Residuals from Model fit of US Mortality Rate")
## 
##  Global Moran's I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wtr
## 
## Moran I statistic standard deviate = 32.728, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran's I        Expectation           Variance 
##       0.3693002497      -0.0016900532       0.0001284917
#Now we fit the spatial lag model
#The lag model is fit with lagsarlm() in the spdep library
#we basically specify the same model as in the lm() fit above
#But we need to specify the spatial weight matrix and the type
#of lag model to fit
#Next we fit the spatial error model
fit.err.us <- errorsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) +
                         scale(pblack_1) + scale(phisp) + factor(RUCC),
                         spdat, listw = us.wt2)
summary(fit.err.us, Nagelkerke = T)