Tools for the exploratory analysis of
two-dimensional spatial point patterns
An introduction to spgrid and spkde
Maurizio Pisati
Department of Sociology and Social Research
University of Milano-Bicocca (Italy)
14th UK Stata Users Group meeting
Cass Business School (London), September 8-9, 2008
Maurizio Pisati Stata tools for the analysis of spatial point patterns
Outline
1 Overview
    The programs
    Background
2 Description
    spgrid
    spkde
3 Applications
    Creating two-dimensional grids
    Estimating density and intensity functions
    Estimating bivariate densities for non-spatial data
4 Conclusion
The programs
The purpose of this talk is to introduce spgrid and spkde, two novel user-written Stata programs for the exploratory analysis of two-dimensional spatial point patterns
spgrid generates several kinds of two-dimensional grids covering rectangular or irregular study regions
spkde implements a variety of nonparametric kernel-based estimators of the probability density function and the intensity function of two-dimensional spatial point patterns
Two-dimensional spatial point patterns
A two-dimensional spatial point pattern S can be defined as a set of points s_i (i = 1, ..., n) located in a two-dimensional study region R at coordinates (s_i1, s_i2)
Each point s_i represents the location in R of an “object” of some kind: people, events, sites, buildings, plants, cases of a disease, etc.
Points s_i will be referred to as the data points
[Figure: Italian provinces (centroids)]
In the analysis of spatial point patterns we are often interested in determining whether the observed data points exhibit some form of clustering, as opposed to being distributed uniformly within R
To explore the possibility of point clustering, it may be useful to describe the spatial point pattern of interest by means of its probability density function p(s) and/or its intensity function λ(s)
The probability density function p(s) defines the probability of observing an object per unit area at location s ∈ R, while the intensity function λ(s) defines the expected number of objects per unit area at location s ∈ R
The probability density function and the intensity function differ only by a constant of proportionality
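One common way to write this proportionality is sketched below. It assumes the constant is the expected number of objects in R, denoted here by μ (a symbol introduced for illustration, not used in the talk):

```latex
\[
\lambda(s) = \mu \, p(s), \qquad
\mu = \int_R \lambda(u)\, du, \qquad
\int_R p(s)\, ds = 1
\]
```

Under this reading, p(s) integrates to one over R while λ(s) integrates to the expected number of objects, so the two surfaces have the same shape and differ only in scale.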
Both the probability density function p(s) and the intensity function λ(s) of a given two-dimensional spatial point pattern can be easily estimated by means of nonparametric estimators, e.g., kernel estimators
Kernel estimators are used to generate a spatially smooth estimate of p(s) and/or λ(s) at a fine grid of points s_g (g = 1, ..., G) covering the study region R
Specifically, the intensity λ(s_g) at each grid point s_g is estimated by:

\[
\hat{\lambda}(s_g) = \frac{c}{A_g} \sum_{i=1}^{n} k\!\left( \frac{s_i - s_g}{h} \right) w_i
\]

where k(·) is the kernel function, usually a unimodal symmetrical bivariate probability density function; h is the kernel bandwidth, i.e., the radius of the kernel function; w_i is the value taken on by an optional weighting variable W; A_g is the area of the subregion of R over which the kernel function is evaluated, possibly corrected for edge effects; and c is a constant of proportionality
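To make the estimator concrete, here is a minimal sketch in plain Stata that evaluates it at a single grid point. It is not the spkde implementation. It assumes a quartic kernel on the unit disc, unit weights (w_i = 1), no edge correction, and the scaling c/A_g = 1/h^2, which is the usual normalization when the kernel integrates to one. The coordinates, the grid point and all variable names are made up for illustration.

```stata
* Toy data: four unweighted data points (coordinates are made up)
clear
input double(x y)
0 0
1 0
0 2
3 4
end

* Assumed settings: one grid point at (0,0) and a fixed bandwidth h = 2
scalar gx = 0
scalar gy = 0
scalar h  = 2

* Scaled distance between each data point and the grid point
generate double u = sqrt((x - gx)^2 + (y - gy)^2) / h

* Quartic kernel on the unit disc; zero at or beyond the bandwidth
generate double k = cond(u < 1, (3/_pi) * (1 - u^2)^2, 0)

* Sum the kernel contributions and rescale by 1/h^2 (assumed c/A_g)
quietly summarize k
scalar lambda_hat = r(sum) / h^2
display "Estimated intensity at the grid point: " lambda_hat
```

Only the two data points closer than h to the grid point contribute to the sum. Looping the same computation over every grid point would give the full intensity surface, and dividing by the number of data points would turn it into a density under the assumptions above.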
spgrid
The purpose of spgrid is to generate two-dimensional grids that can be subsequently used by other programs to carry out several kinds of spatial data analysis, e.g., kernel estimation of densities and intensities for two-dimensional spatial point patterns
In the context of spatial data analysis, a grid is a regular tessellation of the study region R that divides it into a set of contiguous cells whose centers are referred to as the grid points
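As a rough illustration of what a square grid over a rectangular region involves, the following plain-Stata sketch builds the cell centers by hand. It does not call spgrid and is not meant to reproduce its output. The bounding box, the cell width w and the variable names are illustrative assumptions.

```stata
* Hypothetical rectangular study region and cell width (same units)
clear
local xmin = 0
local xmax = 4
local ymin = 0
local ymax = 2
local w    = 1

* Number of cells along each axis, rounded up to cover the region
local nx = ceil((`xmax' - `xmin') / `w')
local ny = ceil((`ymax' - `ymin') / `w')
local G  = `nx' * `ny'

* One observation per cell, numbered row by row from the lower left
set obs `G'
generate long   cell = _n
generate double gx   = `xmin' + (mod(cell - 1, `nx') + 0.5) * `w'
generate double gy   = `ymin' + (floor((cell - 1) / `nx') + 0.5) * `w'

list cell gx gy, noobs
```

Each observation holds the center of one cell, i.e., one grid point. An irregular study region would additionally require dropping the cells whose centers fall outside the region outline, which this sketch does not attempt.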
spgrid can generate both square and hexagonal grids, i.e., grids whose cells are either square or hexagonal
spgrid can generate grids covering both rectangular and irregular study regions, possibly made up of more than one polygon
spgrid can generate grids with gaps, i.e., grids that exclude one or more subareas of the study region from the analysis
spkde
spkde implements a variety of nonparametric kernel-based estimators of the probability density function and the intensity function of two-dimensional spatial point patterns
spkde offers a choice among eight kernel functions: uniform, normal, truncated normal, negative exponential, truncated negative exponential, quartic, triangular, and Epanechnikov
The kernel bandwidth can be fixed, variable (based on a minimum number of weighted or unweighted data points), or a combination of the two (adaptive)
spkde applies an approximate edge correction to the estimates of the quantities of interest
Creating two-dimensional grids
Let’s see how spgrid can be used to generate several kinds of two-dimensional grids
Example 1: Rectangular study region - Square grid cells
. spmap using "Italy2-GridCells(HexValid).dta", ///
      id(spgrid_id) ///
      poly(data("Italy-OutlineCoordinates.dta") ///
      ocolor(red) osize(medium))
Estimating density and intensity functions
Now, let’s see how we can use spkde and the two-dimensional grids generated by spgrid to estimate the probability density function p(s) and the intensity function λ(s) of any given spatial point pattern
To this end, we will use data pertaining to the 103 Italian provinces, taking province centroids as the observed data points s_i (i = 1, ..., 103)
p(s) and λ(s) will be estimated at each point s_g (g = 1, ..., 3,483) of the grid generated in Example 4 above
Example 1: Simple point pattern
Quartic kernel function - Fixed bandwidth (100 km)
. spmap density using ///
      "Italy-GridCells(HexValid).dta", ///
      id(spgrid_id) clnum(20) fcolor(Rainbow) ///
      ocolor(none ..) legend(off) ///
      point(data("Italy-DataPoints.dta") ///
      x(xcoord) y(ycoord) size(*0.5))
Example 2: Simple point pattern
Normal kernel function - Fixed bandwidth (69.35 km)
The chosen bandwidth equals the average distance between each data point and its 5 nearest neighbors
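The bandwidth rule described in Example 2 can be sketched in plain Stata as follows. This is an illustrative computation on made-up coordinates, using K = 2 neighbors instead of 5 to keep the toy data small. It is not taken from spkde, and the variable names are assumptions.

```stata
* Toy data: four data points (coordinates are made up)
clear
input double(x y)
0 0
1 0
0 1
3 0
end

* Assumed number of nearest neighbors (the talk uses 5)
local K = 2
generate long   id  = _n
generate double knn = .

* For each point, average the distances to its K nearest neighbors
forvalues i = 1/`=_N' {
    sort id
    quietly generate double d = sqrt((x - x[`i'])^2 + (y - y[`i'])^2) if id != `i'
    sort d
    quietly summarize d in 1/`K'
    quietly replace knn = r(mean) if id == `i'
    drop d
}

* The bandwidth is the mean of those per-point averages
sort id
quietly summarize knn
scalar bw = r(mean)
display "Average distance to the " `K' " nearest neighbors: " bw
```

The loop is quadratic in the number of data points, which is acceptable for a pattern as small as 103 province centroids but would need a faster neighbor search for large data sets.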