International Journal of Humanities, Religion and Social Science ISSN : 2548-5725 | Volume 2, Issue 1 2017 www.doarj.org
GEOGRAPHICALLY WEIGHTED REGRESSION AND
BAYESIAN GEOGRAPHICALLY WEIGHTED REGRESSION
MODELLING WITH ADAPTIVE GAUSSIAN KERNEL
WEIGHT FUNCTION ON THE POVERTY LEVEL IN WEST
JAVA PROVINCE
Ikin Sodikin¹, Henny Pramoedyo², and Suci Astutik²
¹ Master Student of Statistics Department, Brawijaya University, Malang, Indonesia; and
² Lecturer of Statistics Department, Brawijaya University, Malang, Indonesia
Abstract: Geographically Weighted Regression (GWR) analysis is an expansion of global regression analysis that
generates a separate set of parameter estimates for each point or location where the data are observed and
collected. The analysis can therefore accommodate spatial influence in the estimation of the regression model.
One of the important issues that arise in GWR modeling is non-constant variance between observations. Bayesian
GWR analysis (BGWR) is considered one of the best solutions to this problem. Through the Bayesian approach,
observations that potentially generate non-constant variance can be detected and down-weighted directly, which
reduces their effect on the estimation of the model parameters. In this study, the weights come from the adaptive
Gaussian kernel function, in which the bandwidth varies with the location of observation. This weighting is
applied to both models in order to compare the GWR and BGWR parameter estimates. The results show that the BGWR
model is better than the GWR model at explaining the effect of literacy rate (%), percentage of households with a
joint latrine (%), and percentage of households receiving rice for the poor (%) on the district poverty level in
West Java Province. The comparison is based on the Mean Square Error (MSE), which is used as the goodness-of-fit
criterion: the MSE of the BGWR model, 0.353×10^2, is lower than the MSE of the GWR model, 0.382×10^2.

Keywords: spatial, bayesian, Geographically Weighted Regression, adaptive gaussian kernel, non-constant variance,
poverty

I. Introduction
As a developing country, Indonesia still faces poverty as one of its most serious problems. To
overcome it, the government has made various efforts, among others by estimating which areas are
categorized as poor, down to the level of village administration, in the hope that poverty alleviation
will become better targeted. Regression analysis has often been used to predict poverty rates, but the
models are usually global: one set of coefficients is applied to all observed locations without taking
into account their geographical position (longitude and latitude). The spatial influences that arise make
the assumption of independence between observations required by global regression difficult to fulfill
(A.S. Fotheringham, C. Brunsdon, and M. Charlton, 2002). One of the models that has been developed to
overcome such spatial problems is Geographically Weighted Regression (GWR).
GWR analysis is an expansion of global regression analysis that generates parameter estimates
for each point or location where the data are observed and collected (A.S. Fotheringham, et al,
2002). It can thereby accommodate spatial influence in the estimation of the regression model. Let
$\mathbf{Y}$ be an $n \times 1$ vector of the response variable, $\mathbf{X}$ an $n \times (p+1)$ matrix of
explanatory variables, $\mathbf{W}_i$ the $n \times n$ spatial weight matrix for the $i$-th location,
$\boldsymbol{\beta}_i$ the $(p+1) \times 1$ vector of parameter coefficients for the $i$-th location, and
$\boldsymbol{\varepsilon}_i$ the $n \times 1$ error vector for the $i$-th location, where
$\boldsymbol{\varepsilon}_i \sim N(0, \sigma^2)$ (Y. Leung, C. L. Mei, and W. X. Zhang, 2000).
Mathematically, the GWR model can be written as follows:
$$\mathbf{W}_i \mathbf{Y} = \mathbf{W}_i \mathbf{X} \boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$$
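The estimator of the GWR model parameters for each location $i$, obtained through the Weighted Least
Squares (WLS) method, is written as follows:
$$\hat{\boldsymbol{\beta}}_i = (\mathbf{X}^T \mathbf{W}_i \mathbf{X})^{-1} \mathbf{X}^T \mathbf{W}_i \mathbf{Y}$$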
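The WLS estimator above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the kernel form w_ij = exp(-0.5 (d_ij / h_i)^2), the choice of h_i as the distance from location i to its k-th nearest neighbour, and the names `adaptive_gaussian_weights`, `gwr_fit`, `coords`, and `k` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def adaptive_gaussian_weights(coords, i, k):
    """Diagonal of W_i under an assumed adaptive Gaussian kernel."""
    d = np.linalg.norm(coords - coords[i], axis=1)  # distances from location i
    h_i = np.sort(d)[k]          # assumed bandwidth: distance to the k-th nearest neighbour
    return np.exp(-0.5 * (d / h_i) ** 2)

def gwr_fit(X, y, coords, k):
    """Local WLS estimates (X^T W_i X)^{-1} X^T W_i Y, one row per location."""
    n, p1 = X.shape
    betas = np.empty((n, p1))
    for i in range(n):
        w = adaptive_gaussian_weights(coords, i, k)
        XtW = X.T * w            # X^T W_i without building the n x n matrix
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Toy data (synthetic, not the West Java data): a noise-free global model,
# so every local fit should reproduce the same coefficients.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(30, 2))
X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true
betas = gwr_fit(X, y, coords, k=10)
print(betas[:3])
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical convenience; the result is the same estimator.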
One of the important issues that arise in GWR modeling is non-constant variance between
observations (H. S. Chan, 2008). It appears because the regression coefficients differ at each
location of observation. As a result, the error variance may also differ from location to location,
and the normality assumption on the errors may not be fulfilled.
The Bayesian GWR (BGWR) analysis introduced by LeSage is regarded as one of the appropriate solutions
to the problems that arise in GWR modeling (J. P. LeSage, 2001). The Bayesian approach applied to
the GWR model is able to produce parameter estimators more effectively than the classical approach (I.
Ntzoufras, 2009). In BGWR analysis, the error variance is assumed to be non-constant between the
observed locations, i.e. $\boldsymbol{\varepsilon}_i \sim N(0, \sigma^2 \mathbf{V}_i)$, where $\mathbf{V}_i$ is an
$n \times n$ diagonal matrix whose diagonal elements $(v_1, v_2, \ldots, v_n)$ capture the non-constant
variance between observation sites (H. S. Chan, 2008).
Unlike the GWR model, whose parameters are estimated with the Weighted Least Squares (WLS) method (I.
M. Hutabarat, A. Saefuddin, A. Djuraidah, and I. W. Mangki, 2013), the BGWR model applies the Gibbs
sampling algorithm. This algorithm is a simulation method based on the Markov Chain Monte Carlo
(MCMC) approach. It generates a sequence of sample draws from the posterior distribution, so that the
resulting set of draws approximates the joint posterior distribution of the parameters
(I. Ntzoufras, 2009). The posterior distribution is formed by combining the prior information and the
sample information expressed by the likelihood function.
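As a rough illustration of how such a sampler cycles through the parameters, the sketch below runs Gibbs sampling for a single location on already weighted data. It assumes the error model and improper priors adopted in this study (flat prior on the coefficients, f(σ) ∝ 1/σ, and r/v_i chi-square distributed with r degrees of freedom) and uses the standard full conditional distributions for that heteroscedastic linear model. The function name, the default r = 4, the number of draws, and the synthetic data are illustrative choices, not values from the paper.

```python
import numpy as np

def gibbs_single_location(y_star, X_star, r=4.0, n_draws=2000, burn_in=500, seed=0):
    """Gibbs sampler sketch for y* = X* beta + e, e_i ~ N(0, sigma2 * v_i)."""
    rng = np.random.default_rng(seed)
    n, p1 = X_star.shape
    sigma2, v = 1.0, np.ones(n)                  # arbitrary starting values
    beta_draws = np.empty((n_draws, p1))
    v_draws = np.empty((n_draws, n))
    for t in range(burn_in + n_draws):
        # beta | sigma2, V ~ Normal(GLS estimate, sigma2 * (X*' V^-1 X*)^-1)
        Xv = X_star / v[:, None]                 # V^-1 X*
        cov = np.linalg.inv(Xv.T @ X_star)
        cov = 0.5 * (cov + cov.T)                # remove rounding asymmetry
        mean = cov @ (Xv.T @ y_star)
        beta = rng.multivariate_normal(mean, sigma2 * cov)
        # sigma2 | beta, V: sum(e_i^2 / v_i) / sigma2 ~ chi-square(n)
        e = y_star - X_star @ beta
        sigma2 = np.sum(e**2 / v) / rng.chisquare(n)
        # v_i | beta, sigma2: (e_i^2 / sigma2 + r) / v_i ~ chi-square(r + 1)
        v = (e**2 / sigma2 + r) / rng.chisquare(r + 1, size=n)
        if t >= burn_in:
            beta_draws[t - burn_in] = beta
            v_draws[t - burn_in] = v
    return beta_draws, v_draws

# Synthetic example with one aberrant observation (index 5).
rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)
y[5] += 5.0
beta_draws, v_draws = gibbs_single_location(y, X)
print(beta_draws.mean(axis=0))                   # posterior mean of the coefficients
print(v_draws.mean(axis=0).argmax())             # observation with the largest mean v_i
```

In this sketch a large posterior mean of v_i flags an observation whose error variance looks inflated, which is how such observations end up with less influence on the coefficient draws.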
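The likelihood function of the BGWR model can be written as follows:
$$L(\mathbf{Y} \mid \boldsymbol{\beta}, \sigma^2, \mathbf{V}) = \frac{1}{(2\pi)^{n/2}} \, \frac{1}{(\sigma^2)^{n/2}} \prod_{i=1}^{n} \frac{1}{(v_i)^{1/2}} \exp\left\{ -\sum_{i=1}^{n} \frac{(\mathbf{Y}_i^* - \mathbf{X}_i^* \boldsymbol{\beta})^2}{2\sigma^2 v_i} \right\}$$
$$L(\mathbf{Y} \mid \boldsymbol{\beta}, \sigma^2, \mathbf{V}) \propto \sigma^{-n} \prod_{i=1}^{n} \frac{1}{(v_i)^{1/2}} \exp\left\{ -\sum_{i=1}^{n} \frac{(\boldsymbol{\varepsilon}_i)^2}{2\sigma^2 v_i} \right\}$$
where $\boldsymbol{\varepsilon}_i = \mathbf{Y}_i^* - \mathbf{X}_i^* \boldsymbol{\beta}$, $\mathbf{Y}_i^* = \mathbf{W}_i \mathbf{Y}$, and $\mathbf{X}_i^* = \mathbf{W}_i \mathbf{X}$.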
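For numerical work the likelihood is usually evaluated on the log scale. The following is a minimal sketch of that computation, treating each residual as the scalar residual of one weighted observation; the function name and argument names are hypothetical, and the toy numbers are made up for illustration.

```python
import numpy as np

def bgwr_log_likelihood(y_star, X_star, beta, sigma2, v):
    """Log of L(Y | beta, sigma^2, V) for weighted data y* = W_i Y, X* = W_i X."""
    eps = y_star - X_star @ beta                  # residuals of the weighted model
    n = y_star.shape[0]
    return (-0.5 * n * np.log(2.0 * np.pi)        # 1 / (2 pi)^(n/2)
            - 0.5 * n * np.log(sigma2)            # 1 / (sigma^2)^(n/2)
            - 0.5 * np.sum(np.log(v))             # product of v_i^(-1/2)
            - 0.5 * np.sum(eps**2 / (sigma2 * v)))  # exponent term

# Toy check: two observations, the first with a large residual.
X_toy = np.array([[1.0], [1.0]])
y_toy = np.array([3.0, 0.0])
beta_toy = np.array([0.0])
ll_equal_var = bgwr_log_likelihood(y_toy, X_toy, beta_toy, 1.0, np.array([1.0, 1.0]))
ll_inflated = bgwr_log_likelihood(y_toy, X_toy, beta_toy, 1.0, np.array([9.0, 1.0]))
print(ll_equal_var, ll_inflated)
```

In the toy check, inflating v for the observation with the large residual raises the log-likelihood, which is the mechanism that lets the model absorb aberrant observations through the variance terms.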
In this research, the BGWR model is completed with an improper prior for each parameter, as follows
(J. Geweke, 1993):
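$$f(\boldsymbol{\beta}_i) \propto \text{constant}, \qquad f(\sigma) \propto \sigma^{-1}, \qquad \text{and}$$
$$f\left(\frac{r}{v_i}\right) \sim \text{iid} \; \frac{\chi^2_{(r)}}{r}, \quad i = 1, 2, \ldots, n, \quad \text{so} \quad f(\mathbf{V}) \propto \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\left(\frac{-r}{2 v_i}\right)$$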
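The step from the chi-square statement to the form of f(V) can be checked with a change of variables. The derivation below is a sketch that assumes the prior is read in the sense that r/v_i follows a chi-square distribution with r degrees of freedom; the auxiliary symbol u_i is introduced here only for the derivation.

```latex
\begin{align*}
u_i &= \frac{r}{v_i}, \qquad
f(u_i) \propto u_i^{\,r/2 - 1} \exp\left(-\frac{u_i}{2}\right), \qquad
\left|\frac{du_i}{dv_i}\right| = \frac{r}{v_i^{2}} \\
f(v_i) &\propto \left(\frac{r}{v_i}\right)^{r/2 - 1}
          \exp\left(-\frac{r}{2 v_i}\right) \frac{r}{v_i^{2}}
        \;\propto\; v_i^{-(r+2)/2} \exp\left(\frac{-r}{2 v_i}\right) \\
f(\mathbf{V}) &= \prod_{i=1}^{n} f(v_i)
        \;\propto\; \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\left(\frac{-r}{2 v_i}\right)
\end{align*}
```

Under the same reading, 1/v_i has prior mean 1 and prior variance 2/r, so a large r pulls every v_i toward 1 (nearly constant variance), while a small r lets individual v_i depart far from 1.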
1.1 Joint Posterior Distribution
Based on Bayes' theorem and the assumption that the prior distributions are mutually independent,
$f(\boldsymbol{\beta}, \sigma^2, \mathbf{V}) \propto f(\boldsymbol{\beta}) \times f(\sigma) \times f(\mathbf{V})$, the joint posterior distribution can be written as follows: