more than just maps

GIS for eHealth

Mar 21, 2016


mahala

Transcript
Page 1: GIS for  eHealth

more than just maps

Page 2: GIS for  eHealth

A Toolkit for Spatial Analysis

• GUI access to the most frequently used tools
• ArcToolbox – an expandable collection of ready-to-use tools
• ModelBuilder – a visual programming environment
• Python – a FOSS scripting language integrated with ArcGIS

Page 3: GIS for  eHealth

Most analyses involve repeated use of common tools like “Select by Attribute” (SQL)
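The idea behind an attribute selection can be sketched outside ArcGIS with Python's built-in sqlite3 module: the SQL WHERE clause plays the role of the selection expression. This is only an illustration, not the ArcGIS tool itself; the `patients` table, its columns, and its values are hypothetical.

```python
import sqlite3

# Hypothetical attribute table; names and values are illustrative only.
rows = [
    (1, "Ada", 34, "North"),
    (2, "Ben", 71, "South"),
    (3, "Cy", 68, "North"),
    (4, "Di", 12, "South"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (id INTEGER, name TEXT, age INTEGER, district TEXT)"
)
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?)", rows)

# The WHERE clause is the part a user would type into a selection dialog.
where_clause = "age >= 65 AND district = 'North'"
selected_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM patients WHERE {where_clause} ORDER BY id"
    )
]
conn.close()

print(selected_ids)
```

Only the rows that satisfy every condition in the clause end up in the selection.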

Page 4: GIS for  eHealth

We can also create a selection interactively
• The selected river is highlighted
• The associated rows in the attribute table are also highlighted

Page 5: GIS for  eHealth

Creating a Buffer allows identification of all objects that fall within a specified distance of a selected feature

Page 6: GIS for  eHealth

Buffers allow further selection, i.e., all patients within 1000 meters of the river
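A buffer selection boils down to a distance test. The sketch below uses plain Python rather than the ArcGIS Buffer tool and assumes planar (projected) coordinates in meters; the river vertices and patient locations are made up for illustration.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_buffer(point, polyline, distance):
    """True if point lies within `distance` of any segment of the polyline."""
    return any(
        distance_to_segment(point, a, b) <= distance
        for a, b in zip(polyline, polyline[1:])
    )

# Hypothetical data, coordinates in meters.
river = [(0, 0), (5000, 0), (8000, 4000)]
patients = {
    "P1": (1000, 800),
    "P2": (2500, 1500),
    "P3": (7000, 2000),
    "P4": (9000, 4500),
}

selected = sorted(
    name for name, xy in patients.items() if within_buffer(xy, river, 1000)
)
print(selected)
```

With geographic (latitude/longitude) coordinates the distance function would need to change; the planar formula is an assumption of this sketch.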

Page 7: GIS for  eHealth

ModelBuilder is a visual programming environment where data and tools can be dragged onto a blank slate to create a working program.
• Models can be saved
• Models can be exported as graphic documentation
• Models can be exported as scripts for further development

Page 8: GIS for  eHealth

Models are an excellent way to write bug-free code fragments, which can then be assembled into larger scripts

Models and scripts can be called from other models and scripts, and can be saved as tools in ArcToolbox

Page 9: GIS for  eHealth

Python is a modern language that is well supported and easy to learn.

Page 10: GIS for  eHealth

Regression analysis

Ordinary Least Squares (OLS)

Geographically Weighted Regression (GWR)

[Figure: plot of observed values and predicted values]

Regression analysis allows you to:
– Model, examine, and explore spatial relationships
– Better understand the factors behind observed spatial patterns
– Predict outcomes based on that understanding

Page 11: GIS for  eHealth

What’s the big deal?

• Pattern analysis (without regression):
– Are there places where people persistently die young?
– Where are test scores consistently high?
– Where are 911 emergency call hot spots?

Page 12: GIS for  eHealth

Why use regression?

• Understand key factors
– What are the most important habitat characteristics for an endangered bird?
• Predict unknown values
– How much rainfall will occur in a given location?
• Test hypotheses
– “Broken Window” Theory: Is there a positive relationship between vandalism and residential burglary?

Page 13: GIS for  eHealth

Applications

• Education
– Why are literacy rates so low in particular regions?
• Natural resource management
– What are the key variables that explain high forest fire frequency?
• Ecology
– Which environments should be protected, to encourage reintroduction of an endangered species?
• Transportation
– What demographic characteristics contribute to high rates of public transportation usage?
• Many more…
– Business, crime prevention, epidemiology, finances, public safety, public health

Page 14: GIS for  eHealth

Regression analysis terms and concepts

Dependent variable (Y): What you are trying to model or predict (e.g., residential burglary).

Explanatory variables (X): Variables you believe cause or explain the dependent variable (e.g., income, vandalism, number of households).

Coefficients (β): Values, computed by the regression tool, reflecting the relationship between explanatory variables and the dependent variable.

Residuals (ε): The portion of the dependent variable that isn’t explained by the model; the model under- and over-predictions.
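These four terms can be made concrete with a one-variable least-squares fit in plain Python. The numbers are invented for illustration (they are not the burglary data behind the slides), and the fit is done by hand, not with the ArcGIS OLS tool.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical values: X = vandalism incidents, Y = residential burglaries.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = simple_ols(x, y)                        # coefficients
predicted = [b0 + b1 * xi for xi in x]           # modeled values of Y
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # unexplained part

print(b0, b1)
print(residuals)
```

Positive residuals are under-predictions (observed above modeled), negative residuals are over-predictions, and with an intercept in the model they sum to zero.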


Page 15: GIS for  eHealth

Regression model coefficients

Coefficient sign (+/-) and magnitude reflect each explanatory variable’s relationship to the dependent variable.

Variable       Coefficient
Intercept       1.625506
INCOME         -0.000030
VANDALISM       0.133712
HOUSEHOLDS      0.012425
LOWER CITY      0.136569

The asterisk * indicates that the explanatory variable is statistically significant.
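To see how sign and magnitude work together, the coefficients listed on this slide can be applied to one feature. The input values are hypothetical, and treating LOWER CITY as a 0/1 indicator is an assumption; the slides do not state the units or any transformation of the dependent variable.

```python
# Coefficients as listed on the slide.
intercept = 1.625506
coefficients = {
    "INCOME": -0.000030,
    "VANDALISM": 0.133712,
    "HOUSEHOLDS": 0.012425,
    "LOWER CITY": 0.136569,
}

# Hypothetical feature; LOWER CITY is assumed to be a 0/1 indicator.
feature = {"INCOME": 40000, "VANDALISM": 10, "HOUSEHOLDS": 100, "LOWER CITY": 1}

# Each term is coefficient * value; the prediction is their sum plus the intercept.
contributions = {name: coefficients[name] * feature[name] for name in coefficients}
prediction = intercept + sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:<11}{value:+.6f}")
print(f"prediction {prediction:.6f}")
```

The tiny INCOME coefficient still matters because income values are large: here it contributes about as much, in absolute terms, as HOUSEHOLDS.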

Page 16: GIS for  eHealth

Building a global OLS regression model

• Choose your dependent variable (Y).
• Identify potential explanatory variables (X).
• Explore those explanatory variables.
• Run OLS regression with different combinations of explanatory variables, until you find a properly specified model.

Page 17: GIS for  eHealth

Adjusted R-Squared [2]: 0.37407
Akaike’s Information Criterion (AIC) [2]: 5813.121

Page 18: GIS for  eHealth

Why are people dying young in South Dakota?

Do economic factors explain this spatial pattern?

Poverty rates explain 66% of the variation in the average age of death dependent variable: Adjusted R-Squared [2]: 0.659

However, significant spatial autocorrelation among model residuals indicates important explanatory variables are missing from the model.


Page 19: GIS for  eHealth

Build a multivariate regression model

• Explore variable relationships using the scatterplot matrix
• Consult theory and field experts
• Look for spatial variables
• Run OLS (this is an iterative, often tedious, trial-and-error process)

Page 20: GIS for  eHealth

1 Coefficients have the expected sign.
2 No redundancy among model explanatory variables.
4 Residuals are normally distributed.
5 Strong Adjusted R-Square value.
6 Residuals are not spatially autocorrelated.

Page 21: GIS for  eHealth

Online help is … helpful!

Page 22: GIS for  eHealth

Coefficient significance

Look for statistically significant explanatory variables.

* Statistically significant at the 0.05 level.

Page 23: GIS for  eHealth

Multicollinearity

Find a set of explanatory variables that have low VIF values.

In a strong model, each explanatory variable gets at a different facet of the dependent variable.

What did one regression coefficient say to the other regression coefficient? …I’m partial to you!

VIF
--------
2.351229
1.556498
1.051207
1.400358
3.232363

[1] Large VIF (> 7.5, for example) indicates explanatory variable redundancy.
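The variance inflation factor for an explanatory variable is 1 / (1 - R²), where R² comes from regressing that variable on the other explanatory variables. A minimal sketch for the special case of exactly two explanatory variables, where that R² is simply the squared correlation between them; the data are invented and the function names are illustrative.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_variables(a, b):
    """VIF shared by both variables when the model has exactly two of them."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r * r)

# Hypothetical explanatory variables.
income = [1, 2, 3, 4, 5]
households = [2, 1, 4, 3, 5]             # moderately correlated with income
near_copy = [1.1, 2.0, 2.9, 4.2, 5.0]    # almost the same information as income

vif_moderate = vif_two_variables(income, households)
vif_redundant = vif_two_variables(income, near_copy)

print(round(vif_moderate, 3), vif_redundant > 7.5)
```

With more than two explanatory variables, each VIF needs its own multiple regression; the correlation shortcut no longer applies.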

Page 24: GIS for  eHealth

Model performance

• Compare models by looking for the lowest AIC value. As long as the dependent variable remains fixed, the AIC values for different OLS/GWR models are comparable.
• Look for a model with a high Adjusted R-Squared value.

Akaike’s Information Criterion (AIC) [2]: 524.9762
Adjusted R-Squared [2]: 0.864823

[2] Measure of model fit/performance.
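Both measures can be computed from observed and modeled values. The sketch below uses the standard adjusted R² formula and a textbook Gaussian form of AIC, stated only up to an additive constant. The regression tool may report a corrected variant, so values from this sketch are not expected to match its output; the data are hypothetical.

```python
import math

def fit_statistics(observed, predicted, num_explanatory):
    """Return (R-squared, adjusted R-squared, textbook AIC up to a constant)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # unexplained
    tss = sum((o - mean_obs) ** 2 for o in observed)              # total
    r2 = 1.0 - rss / tss
    # Penalize R-squared for the number of explanatory variables used.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - num_explanatory - 1)
    # Assumed form: n * ln(RSS / n) + 2 * (number of coefficients incl. intercept).
    aic = n * math.log(rss / n) + 2 * (num_explanatory + 1)
    return r2, adj_r2, aic

# Hypothetical observed values and predictions from a one-variable model.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r2, adj_r2, aic = fit_statistics(observed, predicted, num_explanatory=1)
print(round(r2, 4), round(adj_r2, 4), round(aic, 4))
```

Adjusted R² is always below R² once any explanatory variable is used, which is why it is the fairer number when comparing models of different size.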

Page 25: GIS for  eHealth

Model bias

When the Jarque-Bera test is statistically significant:
• The model is biased
• Results are not reliable
• Often indicates that a key variable is missing from the model

Jarque-Bera Statistic [6]: 4.207198  Prob(>chi-sq), (2) degrees of freedom: 0.122017

[6] Significant p-value indicates residuals deviate from a normal distribution.
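The Jarque-Bera statistic combines the skewness S and kurtosis K of the residuals, JB = n/6 · (S² + (K − 3)²/4), and is compared with a chi-squared distribution with 2 degrees of freedom, whose upper-tail probability is exp(−JB/2). A small sketch in plain Python; the sample residuals are invented, while the last lines reuse the statistic printed on this slide.

```python
import math

def jarque_bera(residuals):
    """Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def chi2_upper_tail_2df(x):
    """P(chi-squared with 2 degrees of freedom > x)."""
    return math.exp(-x / 2.0)

# Hypothetical residuals.
jb_sample = jarque_bera([-2, -1, 0, 1, 2])

# Statistic reported on the slide.
p_slide = chi2_upper_tail_2df(4.207198)
print(round(jb_sample, 6), round(p_slide, 6))
```

For the slide's statistic the probability comes out at about 0.122, matching the reported value; since that is above 0.05, the residuals in that example do not deviate significantly from a normal distribution.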

Page 26: GIS for  eHealth

• Statistically significant clustering of under- and over-predictions.
• Random spatial pattern of under- and over-predictions.
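One common way to quantify whether residuals cluster is Moran's I; the slides do not name the statistic, so its use here is an assumption. The sketch computes only the index for a made-up chain of four neighboring features with binary adjacency weights, and omits the significance test that a full analysis would need.

```python
def morans_i(values, weights):
    """Global Moran's I for values with a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * (cross / sum(zi * zi for zi in z))

# Four features in a row; each is a neighbor of the next one (hypothetical).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([2, 1, -1, -2], adjacency)    # similar residuals adjoin
alternating = morans_i([2, -2, 2, -2], adjacency)  # opposite residuals adjoin

print(clustered, alternating)
```

Positive values indicate that similar residuals sit next to each other (clustering), negative values indicate a checkerboard pattern, and values near zero are consistent with a random arrangement.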

Page 27: GIS for  eHealth

Global vs. local regression models

OLS
• Global regression model
• One equation, calibrated using data from all features
• Relationships are fixed

GWR
• Local regression model
• One equation for every feature, calibrated using data from nearby features
• Relationships are allowed to vary across the study area

For each explanatory variable, GWR creates a coefficient surface showing you where relationships are strongest.
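The difference between one global equation and one equation per feature can be illustrated with a distance-weighted least-squares line. This is a simplified one-dimensional sketch, not the ArcGIS GWR tool: the Gaussian distance decay, the bandwidth of 2, and the data (a positive relationship in the west, a negative one in the east) are all assumptions made for illustration.

```python
import math

def weighted_line(x, y, w):
    """Slope and intercept of a weighted least-squares line."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical features along a west-east transect.
position = [0, 1, 2, 3, 10, 11, 12, 13]
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [2, 4, 6, 8, 9, 8, 7, 6]   # west: y = 2x, east: y = 10 - x

# Global model: every feature gets the same weight.
global_slope, _ = weighted_line(x, y, [1.0] * len(x))

def local_slope(at, bandwidth=2.0):
    """Slope of the line calibrated mostly on features near `at`."""
    w = [math.exp(-0.5 * ((p - at) / bandwidth) ** 2) for p in position]
    return weighted_line(x, y, w)[0]

west_slope = local_slope(0)
east_slope = local_slope(13)
print(global_slope, round(west_slope, 3), round(east_slope, 3))
```

The global slope averages two opposite relationships into a weak positive one, while the local fits recover a slope close to 2 in the west and close to -1 in the east.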

Page 28: GIS for  eHealth

Running GWR

• GWR is a local spatial regression model
– Modeled relationships are allowed to vary
• GWR variables are the same as OLS, except:
– Do not include spatial regime (dummy) variables
– Do not include variables with little value variation

Page 29: GIS for  eHealth

Defining local

• GWR constructs an equation for each feature
• Coefficients are estimated using nearby feature values
• GWR requires a definition for nearby

Kernel type
– Fixed: Nearby is determined by a fixed distance band
– Adaptive: Nearby is determined by a fixed number of neighbors

Bandwidth method
– AIC or Cross Validation (CV): GWR will find the optimal distance or optimal number of neighbors
– Bandwidth parameter: User-provided distance or user-provided number of neighbors
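The two kernel types differ only in how the bandwidth is chosen. In the sketch below, a fixed kernel uses the same distance everywhere, while an adaptive kernel sets the bandwidth at each feature to the distance of its k-th nearest neighbor. The Gaussian weighting function is one common choice and an assumption here; the slides do not say which function the tool uses. Distances are hypothetical.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Weight that decays smoothly with distance; 1.0 at distance zero."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def fixed_kernel_weights(distances, bandwidth):
    """Fixed kernel: the same distance band is used for every feature."""
    return [gaussian_weight(d, bandwidth) for d in distances]

def adaptive_kernel_weights(distances, num_neighbors):
    """Adaptive kernel: bandwidth is the distance to the k-th nearest neighbor."""
    bandwidth = sorted(distances)[num_neighbors - 1]
    return [gaussian_weight(d, bandwidth) for d in distances]

# Hypothetical distances (meters) from one feature to its neighbors.
dense_area = [100, 250, 400, 900]
sparse_area = [1000, 2500, 4000, 9000]

fixed_dense = fixed_kernel_weights(dense_area, 500)
fixed_sparse = fixed_kernel_weights(sparse_area, 500)
adaptive_dense = adaptive_kernel_weights(dense_area, 2)
adaptive_sparse = adaptive_kernel_weights(sparse_area, 2)

print([round(w, 3) for w in fixed_sparse])
print([round(w, 3) for w in adaptive_sparse])
```

With the fixed 500 m band, the feature in the sparse area has almost no effective neighbors, whereas the adaptive kernel gives both features the same weight pattern. That is the usual argument for adaptive kernels when feature density varies.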

Page 30: GIS for  eHealth

Interpreting GWR results

• Compare GWR R² and AIC values to OLS R² and AIC values. The better model has a lower AIC and a high R².
• Residual maps show model under- and over-predictions. They shouldn’t be clustered.
• Coefficient maps show how modeled relationships vary across the study area.
• Model predictions, residuals, standard errors, coefficients, and condition numbers are written to the output feature class.
• Check condition numbers: > 30 indicates a less reliable result

Page 31: GIS for  eHealth

[Figure: map panels labeled Observed, Modeled, and Predicted]

Calibrate the GWR model using known values for the dependent variable and all of the explanatory variables.

Provide a feature class of prediction locations containing values for all of the explanatory variables.

GWR will create an output feature class with the computed predictions.