Why Is It There? Lecture 6 Introduction to Geographic Information Systems Geography 176A 2006 Summer, Session B Department of Geography University of California, Santa Barbara

Dec 20, 2015

Page 1: Why Is It There? Lecture 6 Introduction to Geographic Information Systems Geography 176A 2006 Summer, Session B Department of Geography University of California,

Why Is It There?

Lecture 6
Introduction to Geographic Information Systems

Geography 176A
2006 Summer, Session B

Department of Geography
University of California, Santa Barbara

Page 2:

Review: Dueker’s (1979) Definition

“a geographic information system is a special case of information systems where the database consists of observations on spatially distributed features, activities or events, which are definable in space as points, lines, or areas. A geographic information system manipulates data about these points, lines, and areas to retrieve data for ad hoc queries and analyses”.

Page 3:

GIS is capable of data analysis

Attribute Data
• Describe with statistics
• Analyze with hypothesis testing

Spatial Data
• Describe with maps
• Analyze with spatial analysis

Page 4:

Describing one attribute

Flat File Database

        Attribute  Attribute  Attribute
Record  Value      Value      Value
Record  Value      Value      Value
Record  Value      Value      Value

Page 5:

Attribute Description

The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute.

A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar.

For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.
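
A minimal Python sketch of these descriptions. The elevation readings and the 50-meter bin width are hypothetical values chosen for illustration; they are not the lecture's data.

```python
from collections import Counter

# Hypothetical elevation readings in meters (illustrative, not from the lecture).
elevations = [412.0, 455.5, 459.0, 461.2, 470.8, 488.3, 503.1, 530.6]

# Extremes and range, in the units of the attribute (meters).
lowest = min(elevations)
highest = max(elevations)
value_range = highest - lowest


def histogram(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter()
    for v in values:
        bin_start = (v // bin_width) * bin_width  # lower edge of the bin
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


bins = histogram(elevations, bin_width=50.0)

# Draw one variable-length bar per group of values.
for bin_start, count in bins.items():
    print(f"{bin_start:6.0f}-{bin_start + 50.0:<6.0f} {'#' * count}")
```

With only eight readings the bars are coarse; the bell shape the slide describes appears only as the number of records grows.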

Page 6:

If the records are:

Text
• Semantics of text, e.g. “Hampton”
• Word frequency, e.g. “Creek”, “Kill”
• Address matching

Example: Display all places called “State Street”
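
A small sketch of word frequency and name matching on text records. The place names and coordinates below are invented for illustration, and exact string comparison is only the simplest stand-in for real address matching.

```python
from collections import Counter

# Hypothetical place-name records: (name, x, y). Invented for illustration.
places = [
    ("Mill Creek", 1.0, 2.0),
    ("Bear Creek", 3.0, 1.0),
    ("State Street", 2.0, 2.0),
    ("Fish Kill", 5.0, 4.0),
    ("State Street", 7.0, 3.0),
]

# Word frequency across all names, ignoring case.
word_counts = Counter(
    word.lower() for name, _, _ in places for word in name.split()
)

# Retrieve the locations of every record called "State Street".
matches = [(x, y) for name, x, y in places if name.lower() == "state street"]

print(word_counts.most_common(3))
print(matches)
```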

Page 7:

If the records are:

Classes
• Histogram by class
• Numbers in class
• Contiguity description, e.g. average neighbor (roads, commercial)

Page 8:

Describing a classed raster grid

[Figure: classed raster grid; count axis marked 5, 10, 15, 20]

P (blue) = 19/48
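
A sketch of how a proportion such as P(blue) = 19/48 can be computed. The 6 x 8 grid below is hypothetical: its layout and the class letters (B, G, R) are assumptions, constructed only so that 19 of the 48 cells are blue as on the slide.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical classed raster: B = blue, G = green, R = red.
# The arrangement is invented; only the 19-of-48 blue count matches the slide.
grid = [
    "BBBGGGRR",
    "BBBGGRRR",
    "BBBBGGRR",
    "BBBGGGRR",
    "BBBGGGGR",
    "BBBGGGGG",
]

cells = [cell for row in grid for cell in row]
counts = Counter(cells)          # numbers in class
total = len(cells)

# Proportion of the grid occupied by each class.
proportions = {cls: Fraction(n, total) for cls, n in counts.items()}

print(f"P(blue) = {counts['B']}/{total}")
```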

Page 9:

If the records are:

Numbers
• Statistical description
• Min, max, range
• Variance
• Standard deviation

Page 10:

Measurement

One: all I have! [6:00pm]
Two: do they agree? [6:00pm; 6:04pm]
Three: level of agreement [6:00pm; 6:04pm; 7:23pm]
Many: average all, average without extremes
Precision: 6:00pm. “About six o’clock”

Page 11:

Statistical description

Range: min, max, max-min
Central tendency: mode, median (odd, even), mean
Variation: variance, standard deviation

Page 12:

Statistical description

Range: outliers
mode, median, mean
Variation: variance, standard deviation

Page 13:

Elevation (book example)

Page 14:

GPS Example Data: Elevation

Page 15:

Mean

Statistical average
Sum of the values for one attribute divided by the number of records

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Page 16:

Computing the Mean

Sum of attribute values across all records, divided by the number of records.
Add all attribute values down a column, then divide by the number of records.
A representative value, and for measurements with normally distributed error, converges on the true reading.
A value lacking sufficient data for computation is called a missing value. It is not included in the sum or in n.
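
A minimal sketch of the mean with missing values left out of both the sum and n. Representing a missing value as `None` is an assumption of this example, and the readings are hypothetical.

```python
def mean_ignoring_missing(values):
    """Mean of the values that are present; None marks a missing value."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # nothing to average
    return sum(present) / len(present)


# Hypothetical elevation column with two missing readings.
readings = [450.0, None, 460.0, 470.0, None]

# Sum is 1380.0 over n = 3 records, not n = 5.
print(mean_ignoring_missing(readings))
```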

Page 17:

Variance

The total variance is the sum, over all records, of each value with the mean subtracted and then multiplied by itself.
The standard deviation is the square root of the total variance divided by the number of records less one.
For two values, there is only one variance.

Page 18:

Standard Deviation

Average difference from the mean

The mean is subtracted from the value for each record and the difference is squared; the squares are summed, divided by the number of records minus one, and square rooted.

$$\text{st.dev.} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
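
A short sketch of the total variance and the standard deviation as described on these slides, dividing by the number of records less one. The readings are hypothetical values picked so the arithmetic is easy to follow.

```python
import math


def total_variance(values):
    """Sum of squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def standard_deviation(values):
    """Square root of the total variance divided by (n - 1)."""
    return math.sqrt(total_variance(values) / (len(values) - 1))


# Hypothetical readings with a mean of 5.0.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(total_variance(readings))
print(standard_deviation(readings))
```

Python's standard library offers `statistics.stdev`, which uses the same n - 1 divisor and could replace the hand-written function.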

Page 19:

GPS Example Data: Elevation
Standard deviation

Same units as the values of the records, in this case meters.
Average amount readings differ from the average
Can be above or below the mean
Elevation is the mean (459.2 meters) plus or minus the expected error of 82.92 meters
Elevation is most likely to lie between 376.28 meters and 542.12 meters.
These limits are called the error band or margin of error.
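
The error band is simple arithmetic on the two numbers given on the slide. A sketch, with variable names chosen for this example:

```python
mean_elevation = 459.2   # meters, from the slide
expected_error = 82.92   # standard deviation in meters, from the slide

# One standard deviation below and above the mean.
lower_limit = mean_elevation - expected_error
upper_limit = mean_elevation + expected_error

print(f"Error band: {lower_limit:.2f} m to {upper_limit:.2f} m")
```

If the errors are normally distributed, roughly two thirds of readings fall inside a band of one standard deviation either side of the mean.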

Page 20:

The Bell Curve

[Figure: bell curve with the mean (459.2) and the reading 484.5 marked; 12.17 % of the area lies between them and 37.83 % lies beyond 484.5]

Page 21:

Samples and populations

A sample is a set of measurements taken from a larger group or population.
Sample means and variances can serve as estimates for their populations.
Easier to measure with samples, then draw conclusions about entire population.

Page 22:

Testing Means

Mean elevation of 459.2 meters
Standard deviation 82.92 meters
What is the chance of a GPS reading of 484.5 meters?
484.5 is 25.3 meters above the mean
0.31 standard deviations (Z-score)
0.1217 of the curve lies between the mean and this value
0.3783 beyond it
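
The steps on this slide can be reproduced with the standard normal distribution. This sketch assumes normally distributed readings and rounds the Z-score to two decimals, as a printed Z table would; the helper `normal_cdf` is written here for the example.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean = 459.2      # meters, from the slide
std_dev = 82.92   # meters, from the slide
reading = 484.5   # meters, from the slide

difference = reading - mean     # meters above the mean
z = difference / std_dev        # Z-score, unrounded
z_table = round(z, 2)           # two decimals, as in a Z table

between = normal_cdf(z_table) - 0.5   # area between the mean and the reading
beyond = 1.0 - normal_cdf(z_table)    # area beyond the reading

print(f"Z = {z_table}, between = {between:.4f}, beyond = {beyond:.4f}")
```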

Page 23:

Hypothesis testing

Set up NULL hypothesis (e.g. Values or Means are the same) as H0
Set up ALTERNATIVE hypothesis, H1
Test hypothesis. Try to reject NULL.
If null hypothesis is rejected, alternative is accepted with a calculable level of confidence.

Page 24:

Testing the Mean

Mathematical version of the normal distribution can be used to compute probabilities associated with measurements with known means and standard deviations.
A test of means can establish whether two samples from a population are different from each other, or whether the different measures they have are the result of random variation.
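
One possible form of a test of means is sketched below. It is a two-sample z-test based on the normal distribution, which is an assumption of this example and not necessarily the test used in the lecture; with samples as small as these a t-test would normally be preferred. Both samples are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def z_test_of_means(sample_a, sample_b):
    """Return (z, two-sided p-value) for the difference of two sample means.

    Uses a normal approximation, so it is only a rough guide for small samples.
    """
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical elevation samples in meters.
sample_a = [450, 455, 460, 465, 470]
sample_b = [480, 485, 490, 495, 500]

z, p_value = z_test_of_means(sample_a, sample_b)

# Null hypothesis: the two means are the same. Reject it when p is small.
if p_value < 0.05:
    print("Reject the null hypothesis: the means differ.")
else:
    print("Cannot reject the null hypothesis.")
```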

Page 25:

Alternative attribute histograms

Page 26:

Accuracy

Determined by testing measurements against an independent source of higher fidelity and reliability.
Must pay attention to units and significant digits.
Can be expressed as a number using statistics (e.g. expected error).
Accuracy measures imply accuracy users.

Page 27:

The difference is the map

GIS data description answers the question: Where?
GIS data analysis answers the question: Why is it there?
GIS data description is different from statistics because the results can be placed onto a map for visual analysis.

Page 28:

Spatial Statistical Description

For coordinates, data extremes define the two corners of a bounding rectangle.
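
A minimal sketch of a bounding rectangle built from coordinate extremes. The points are hypothetical (x, y) pairs in an unspecified planar coordinate system.

```python
# Hypothetical point coordinates (x, y).
points = [(3.0, 7.5), (1.2, 4.0), (5.8, 6.1), (2.4, 9.3)]

xs = [x for x, _ in points]
ys = [y for _, y in points]

# The extremes of x and y give two opposite corners of the rectangle.
lower_left = (min(xs), min(ys))
upper_right = (max(xs), max(ys))

print(lower_left, upper_right)
```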

Page 29:

Geographic extremes

Southernmost point in the continental United States.
Range: e.g. elevation difference; map extent
Depends on projection, datum, etc.

Page 30:

Spatial Statistical Description

For coordinates, the means and standard deviations correspond to the mean center and the standard distance.
A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice.
The standard distance for a set of point spatial measurements is the expected spatial error.
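
A sketch of the mean center and standard distance for a set of points. The points are hypothetical, and the divisor is an assumption: definitions of standard distance differ on whether to divide by n or by a smaller number, so the `ddof` parameter (a name chosen for this example) lets the caller pick.

```python
import math


def mean_center(points):
    """Point whose coordinates are the mean x and the mean y."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points, ddof=0):
    """Root of the summed squared distances to the mean center over (n - ddof)."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / (len(points) - ddof))


# Hypothetical points at the corners of a 2 x 2 square.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

center = mean_center(points)
spread = standard_distance(points)

print(center, spread)
```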

Page 31:

Mean Center

[Figure: scatter of points with the mean center located at (mean x, mean y)]

Page 32:

Centroid: mean center of a feature

Page 33:

Mean center?

Page 34:

Comparing spatial means

Page 35:

GIS and Spatial Analysis

Descriptions of geographic properties such as shape, pattern, and distribution are often verbal.
Quantitative measures can be devised, although few are computed by GIS.
GIS statistical computations are most often done using retrieval options such as buffer and spread.
Also by manipulating attributes with arithmetic commands (map algebra).

Page 36:

Example: Intervisibility

Source: Mineter, Dowers, Gittings, Caldwell, ESRI Proceedings

Page 37:

An example

Lower 48 United States
2000 Data from the U.S. Census on gender
Gender Ratio = # males per 100 females
Range is 89.00 - 103.90
What does the spatial distribution look like?

Page 38:

Gender Ratio by State: 1996

Page 39:

Searching for Spatial Pattern

A linear relationship is a predictable straight-line link between the values of a dependent and an independent variable (y = a + bx). It is a simple model of the relationship.
A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination (r-squared) is a measure of the degree of fit, and the amount of variance explained.
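
A minimal sketch of an ordinary least squares fit of y = a + bx together with r-squared. The function name and the sample data are assumptions of this example; the data are hypothetical.

```python
def least_squares(xs, ys):
    """Fit y = a + b*x by least squares; return (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Gradient b and intercept a of the best fit line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x

    # r-squared: share of the variance in y explained by the line.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_residual / ss_total
    return a, b, r_squared


# Hypothetical observations that follow a line only approximately.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, r_squared = least_squares(xs, ys)
print(f"y = {a:.3f} + {b:.3f}x, r-squared = {r_squared:.3f}")
```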

Page 40:

Simple linear relationship

[Figure: scatter plot of observations with the best fit regression line y = a + bx; horizontal axis: independent variable; vertical axis: dependent variable; a is the intercept and b is the gradient]

Page 41:

Testing the relationship

Gender Ratio = -0.1438 × Longitude + 83.285

R-squared = 61.8%

Page 42:

Patterns in Residual Mapping

Differences between observed values of the dependent variable and those predicted by a model are called residuals.
A GIS allows residuals to be mapped and examined for spatial patterns.
A model helps explanation and prediction after the GIS analysis.
A model should be simple, should explain what it represents, and should be examined in the limits before use.
We should always examine the limits of the model’s applicability (e.g. Does the regression apply to Europe?)
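
A sketch of computing residuals from the regression reported in the lecture (Gender Ratio = -0.1438 × Longitude + 83.285). The three observations are hypothetical: the labels, longitudes, and observed ratios are invented for illustration and are not Census values.

```python
def predicted_gender_ratio(longitude):
    """Regression equation from the lecture; longitude in degrees (west is negative)."""
    return -0.1438 * longitude + 83.285


# Hypothetical observations: (label, longitude, observed gender ratio).
observations = [
    ("A", -120.0, 99.0),
    ("B", -100.0, 98.5),
    ("C", -75.0, 93.0),
]

# Residual = observed value minus the value predicted by the model.
residuals = {
    label: observed - predicted_gender_ratio(longitude)
    for label, longitude, observed in observations
}

for label, residual in residuals.items():
    print(f"{label}: residual = {residual:+.3f}")
```

Joining such residuals back to the state polygons is what lets a GIS map them and reveal any spatial pattern the model missed.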

Page 43:

Unexplained variance

More variables?
Different extent?
More records?
More spatial dimensions?
More complexity?
Another model?
Another approach?

Page 44:

Spatial Interpolation

http://www.eia.doe.gov/cneaf/solar.renewables/rea_issues/html/fig2ntrans.gif

Page 45:

Issues: Spatial Interpolation

[Figure: sample points labeled with meters to water table (10, 12, 14, 14, 19, 40, 25, 30, 12, 6, 11) and an unsampled location marked “?”]

Resolution? Extent? Accuracy? Precision? Boundary effects? Point spacing? Method?
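
The choice of method matters. As one illustration, the sketch below uses inverse distance weighting (IDW), which is only one of several interpolation methods and is not stated in the lecture to be the one used. Four of the depth values come from the figure, but their coordinates are hypothetical, as are the function name and the `power` parameter.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) as a distance-weighted average of samples.

    samples: list of (sample_x, sample_y, value) tuples.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power  # nearer samples count for more
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Depths to water table in meters; coordinates are invented for illustration.
samples = [
    (0.0, 0.0, 10.0),
    (4.0, 0.0, 12.0),
    (0.0, 4.0, 14.0),
    (4.0, 4.0, 19.0),
]

estimate = idw(samples, 2.0, 2.0)
print(f"Estimated depth at (2, 2): {estimate:.2f} m")
```

Changing `power`, the point spacing, or the set of neighbors changes the estimate, which is the slide's point about method, extent, and boundary effects.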

Page 46:

GIS and Spatial Analysis

Geographic inquiry examines the relationships between geographic features collectively to help describe and understand the real-world phenomena that the map represents.
Spatial analysis compares maps, investigates variation over space, and predicts future or unknown maps.

Page 47:

Analytic Tools and GIS

Tools for searching out spatial relationships and for modeling are only lately being integrated into GIS.
Statistical and spatial analytical tools are also only now being integrated into GIS, and many people use separate software systems outside the GIS.
Real geographic phenomena are dynamic, but GISs have been mostly static. Time-slice and animation methods can help in visualizing and analyzing spatial trends.
GIS places real-world data into an organizational framework that allows numerical description and allows the analyst to model, analyze, and predict with both the map and the attribute data.

Page 48:

You can lie with...

Maps
Statistics
• Correlation is not causation!
• Hypothesis vs. Action

Page 49:

Coming next ...

Making Maps with GIS