Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina

Dec 13, 2015


Dwain Rice
Transcript
Page 1: Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina.

Why is it there?

(How can a GIS analyze data?)

Getting Started, Chapter 6

Paula Messina

Page 2:

GIS is capable of data analysis

Attribute Data
• Describe with statistics
• Analyze with hypothesis testing

Spatial Data
• Describe with maps
• Analyze with spatial analysis

Page 3:

Describing one attribute

Flat File Database

          Attribute  Attribute  Attribute
Record    Value      Value      Value
Record    Value      Value      Value
Record    Value      Value      Value

Page 4:

Attribute Description

The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute.

A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar.

For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.
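A minimal sketch of the attribute description above: it finds the extremes and range of one attribute and groups the values into a text histogram. The elevation values and the 100-meter bin width are hypothetical, chosen only for illustration.

```python
def extremes_and_range(values):
    """Return (lowest, highest, range) for one attribute."""
    low, high = min(values), max(values)
    return low, high, high - low


def histogram(values, bin_width):
    """Count records per magnitude group; keys are each group's lower edge."""
    counts = {}
    for value in values:
        lower_edge = (value // bin_width) * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical elevation attribute, in meters (not data from the slides).
elevations = [247, 310, 402, 455, 460, 471, 498, 530, 610]

low, high, value_range = extremes_and_range(elevations)
bars = histogram(elevations, bin_width=100)

print(f"extremes: {low} and {high}, range: {value_range}")
for lower_edge, count in bars.items():
    # One '#' per record gives the variable-length bar.
    print(f"{lower_edge}-{lower_edge + 99}: {'#' * count}")
```

With more records and random measurement error, the bars would be expected to pile up around the mean in the bell shape described above.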

Page 5:

Describing a classed raster grid

[Figure: classed raster grid; axis ticks at 5, 10, 15, and 20]

% (blue) = 19/48

Page 6:

If the attributes are:

Numbers: statistical description
• min, max, range
• variance
• standard deviation

Page 7:

Statistical description

• Range: max-min
• Central tendency: mode, median, mean
• Variation: variance, standard deviation

Page 8:

Statistical description

• Range: outliers
• Central tendency: mode, median, mean
• Variation: variance, standard deviation

Page 9:

Elevation (book example)

Page 10:

GPS Example Data: Elevation

Table 6.2: Sample GPS Readings

Data Extreme  Date     Time      D   M   S      D   M   S      Elev
Minimum       6/14/95  10:47am   42  30  54.8   75  41  13.8   247
Maximum       6/15/95  10:47pm   42  31  03.3   75  41  20.0   610
Range         1 Day    12 hours      00  8.5        00  6.2    363

Page 11:

Mean

Statistical average

Sum of the values for one attribute divided by the number of records

$\bar{X} = \sum_{i=1}^{n} X_i / n$

Page 12:

Variance

The total variance is the sum, over all records, of each value with the mean subtracted and the result multiplied by itself (squared).

The standard deviation is the square root of the total variance divided by the number of records less one.

Page 13:

Standard Deviation

Average difference from the mean

For each record, subtract the mean from the value and square the result; sum these squares, divide by the number of records minus one, and take the square root.

$\text{st.dev.} = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

Page 14:

GPS Example Data: Elevation Standard Deviation

• Same units as the values of the records, in this case meters.
• Elevation is the mean (459.2 meters) plus or minus the expected error of 82.92 meters.
• Elevation is most likely to lie between 376.28 meters and 542.12 meters.
• These limits are called the error band or margin of error.
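As a rough sketch, the mean, standard deviation (with the n - 1 divisor), and the resulting error band can be computed as follows. The readings are hypothetical and are not the Table 6.2 data; the function names are illustrative only.

```python
import math


def mean(values):
    """Sum of the values divided by the number of records."""
    return sum(values) / len(values)


def standard_deviation(values):
    """Square root of the total variance divided by (number of records - 1)."""
    m = mean(values)
    total_variance = sum((x - m) ** 2 for x in values)
    return math.sqrt(total_variance / (len(values) - 1))


def error_band(values):
    """Mean minus and plus one standard deviation."""
    m, s = mean(values), standard_deviation(values)
    return m - s, m + s


# Hypothetical readings in meters (not the Table 6.2 data).
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
m = mean(readings)
s = standard_deviation(readings)
lower, upper = error_band(readings)
print(f"mean = {m:.2f}, st.dev. = {s:.2f}, band = {lower:.2f} to {upper:.2f}")
```

Applied to the slide's own figures, the same band calculation is 459.2 - 82.92 = 376.28 and 459.2 + 82.92 = 542.12 meters.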

Page 15:

Standard Deviations and the Bell Curve

[Figure: bell curve centered on the mean (459.2), with one standard deviation below the mean at 376.3 and one standard deviation above the mean at 542.1]

Page 16:

Testing Means (1)

Mean elevation of 459.2 meters

Standard deviation 82.92 meters

What is the chance of a GPS reading of 484.5 meters?
• 484.5 is 25.3 meters above the mean
• 0.31 standard deviations (Z-score)
• 0.1217 of the curve lies between the mean and this value
• 0.3783 beyond it
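The arithmetic above can be checked with a short sketch. It assumes the readings follow a normal distribution and rounds the Z-score to two decimals, as one would when reading a printed Z table; `normal_cdf` is a helper written for this example.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_elevation = 459.2  # meters, from the slide
std_dev = 82.92         # meters, from the slide
reading = 484.5         # meters, from the slide

# Distance from the mean in standard deviations, rounded as for a Z table.
z = round((reading - mean_elevation) / std_dev, 2)

between = normal_cdf(z) - 0.5  # share of the curve between the mean and the reading
beyond = 1.0 - normal_cdf(z)   # share of the curve beyond the reading

print(f"Z = {z}, between = {between:.4f}, beyond = {beyond:.4f}")
```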

Page 17:

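Testing Means (2)

[Figure: bell curve with the mean at 459.2 and the reading at 484.5; 12.17 % of the area lies between the mean and the reading, and 37.83 % lies beyond the reading]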

Page 18:

Accuracy

Determined by testing measurements against an independent source of higher fidelity and reliability.

Must pay attention to units and significant digits.

Not to be confused with precision!

Page 19:

The difference is the map

GIS data description answers the question: Where?

GIS data analysis answers the question: Why is it there?

GIS data description is different from statistics because the results can be placed onto a map for visual analysis.

Page 20:

Spatial Statistical Description

For coordinates, the means and standard deviations correspond to the mean center and the standard distance.

A centroid is any point chosen to represent a higher-dimension geographic feature, of which the mean center is only one choice.
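A minimal sketch of the two spatial statistics named above. It assumes planar (projected) coordinates and takes the standard distance as the root-mean-square distance of the points from the mean center, dividing by n; other texts divide by n - 1. The point coordinates are hypothetical.

```python
import math


def mean_center(points):
    """Mean x and mean y of a set of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root-mean-square distance of the points from their mean center."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))


# Hypothetical point coordinates in map units.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
center = mean_center(points)
spread = standard_distance(points)
print(f"mean center = {center}, standard distance = {spread}")
```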

Page 21:

Spatial Statistical Description

For coordinates, data extremes define the two corners of a bounding rectangle.
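As an illustration, the bounding rectangle follows directly from the coordinate extremes. The points are hypothetical, and the function name is chosen for this sketch.

```python
def bounding_rectangle(points):
    """Return the (min x, min y) and (max x, max y) corners of the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Hypothetical point coordinates in map units.
points = [(3.0, 7.0), (1.0, 2.0), (6.0, 4.0)]
lower_left, upper_right = bounding_rectangle(points)
print(f"lower left = {lower_left}, upper right = {upper_right}")
```

Note that neither corner has to coincide with an input point: each corner combines the extreme x from one point with the extreme y from another.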

Page 22:

Geographic extremes

Southernmost point in the continental United States.

Range: e.g. elevation difference; map extent

Depends on projection, datum etc.

Page 23:

Mean Center

[Figure: point distribution with the mean center plotted at (mean x, mean y)]

Page 24:

Centroid: mean center of a feature

Page 25:

Mean center?

Page 26:

Comparing spatial means

Page 27:

Spatial Analysis

• Lower 48 United States
• 1996 Data from the U.S. Census on gender
• Gender Ratio = # females per 100 males
• Range is 96.4 - 114.4
• What does the spatial distribution look like?

Page 28:

Gender Ratio by State: 1996

Page 29:

Searching for Spatial Pattern

A linear relation is a predictable straight-line link between the values of a dependent and an independent variable (y = a + bx). It is a simple model of correlation.

A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained.
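A minimal sketch of a least-squares fit of y = a + bx, returning the intercept a, the gradient b, and r-squared. The observations are hypothetical, not the census data, and `linear_fit` is a name chosen for this example.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # gradient
    a = mean_y - b * mean_x  # intercept
    # r-squared: share of the variance in y explained by the line.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, 1.0 - ss_residual / ss_total


# Hypothetical observations of an independent (x) and dependent (y) variable.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, r_squared = linear_fit(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x, r-squared = {r_squared:.3f}")
```

A high r-squared only says the line fits well; it does not by itself show that x causes y.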

Page 30:

Simple linear relation

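[Figure: scatter plot of observations with the dependent variable on the vertical axis and the independent variable on the horizontal axis; the best-fit regression line y = a + bx has intercept a and gradient b]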

Page 31:

Testing the relation

gr = 117.46 + 0.138 × long. (gr = gender ratio, long. = longitude)

Page 32:

GIS and Spatial Analysis

Geographic inquiry examines the relationships between geographic features collectively to help describe and understand the real-world phenomena that the map represents.

Spatial analysis compares maps, investigates variation over space, and predicts future or unknown maps.

Many GIS systems have to be coaxed to generate a full set of spatial statistics.

Page 33:

You can lie with...

• Maps
• Statistics

Correlation is not causation!

Page 34:

Terrain Analysis

Paula Messina

Page 35:

Introduction to Terrain Analysis

• What is terrain analysis?
• How are data points interpolated to a grid?
• How are topographic data sets produced from non-point data?
• How are derivative data sets (i.e., slope and aspect maps) produced by ArcView?

Page 36:

What is Terrain Analysis?

Terrain Analysis: the study of ground-surface relief and pattern by numerical methods (a.k.a. geomorphometry).

Geomorphology = qualitative

Geomorphometry = quantitative

Page 37:

Interpolation to a Grid

Assumptions:
• Elevations are continuously distributed
• The influence of one known point over an unknown point increases as distance between them decreases

[Figure: known points with elevations 46, 58, 70, 86, and 97 around an unknown location marked “?”]

Page 38:

Interpolation Using the Neighborhood Model

[Figure: known points with elevations 46, 58, 70, 86, and 97, with the unknown point x]

Inverse-Distance theory dictates:
• The value of X > 58
• The value of X < 97
• The value of X is closer to 58 than 97

Page 39:

Neighborhood Interpolation Using Inverse Distance Weighting

[Figure: known points with elevations 46, 58, 70, 86, and 97, with the unknown point x]

$Z_x = \dfrac{\sum_{p=1}^{R} Z_p \, d_p^{-n}}{\sum_{p=1}^{R} d_p^{-n}}$

Z_x = elevation at kernel (point x)
Z_p = elevation at known point p
d_p = distance from point x to point p
n = “friction of distance” value; usually between 1 and 6
R = number of known points used

When n=2, the technique is called “inverse-squared distance weighting.”

ArcGIS calls this IDW
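The inverse distance weighting formula can be sketched as below. The elevations 58 and 97 come from the slide, but their coordinates are hypothetical, and the function name and `power` parameter (the “friction of distance” n) are chosen for this example. It is an illustration, not how ArcGIS implements IDW.

```python
import math


def idw(x, y, known_points, power=2):
    """Inverse-distance-weighted elevation at (x, y).

    known_points holds (px, py, elevation) tuples; power is the
    "friction of distance" exponent n.
    """
    numerator = 0.0
    denominator = 0.0
    for px, py, elevation in known_points:
        distance = math.hypot(px - x, py - y)
        if distance == 0.0:
            return elevation  # x sits exactly on a known point
        weight = distance ** -power
        numerator += elevation * weight
        denominator += weight
    return numerator / denominator


# Two known points with hypothetical coordinates; elevations from the slide.
known = [(0.0, 0.0, 58.0), (2.0, 0.0, 97.0)]

midway = idw(1.0, 0.0, known)       # equally far from both points
near_58 = idw(0.5, 0.0, known)      # closer to the 58 point
on_point = idw(0.0, 0.0, known)     # exactly on the 58 point
print(midway, round(near_58, 1), on_point)
```

The estimate near the 58 point stays between 58 and 97 and lands closer to 58, which is the behavior the neighborhood model predicts.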

Page 40:

Types of “Neighborhoods” used with IDW

Nearest n Neighbors
• in this example, n = 3
• this method isn’t effective when there are clusters of points
• “nearest in quadrants,” and “nearest in octants” searches can help

Fixed Radius
• a radius is selected
• points are selected only if they lie within that fixed radius

[Figure: two copies of the known points (46, 58, 70, 86, 97) around point x, one for each neighborhood type]
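A rough sketch of the two neighborhood searches. The elevations match the slide's known points, but the coordinates, the query location, and the radius are hypothetical.

```python
import math


def nearest_neighbors(x, y, points, count):
    """The `count` known points closest to (x, y)."""
    by_distance = sorted(points, key=lambda p: math.hypot(p[0] - x, p[1] - y))
    return by_distance[:count]


def within_radius(x, y, points, radius):
    """All known points no farther than `radius` from (x, y)."""
    return [p for p in points if math.hypot(p[0] - x, p[1] - y) <= radius]


# (x, y, elevation); coordinates are hypothetical.
known = [(1, 1, 58), (5, 5, 97), (2, 4, 46), (6, 1, 86), (4, 2, 70)]

nearest_three = nearest_neighbors(2, 2, known, count=3)
in_radius = within_radius(2, 2, known, radius=2.5)
print([p[2] for p in nearest_three], [p[2] for p in in_radius])
```

Here both searches happen to select the same three points; with clustered data the nearest-n search could take all of its points from one side of x, which is the weakness that quadrant and octant searches address.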

Page 41:

Interpolation using the Spline Method

The spline interpolator fits a minimum-curvature surface through input points. “Rubber sheet fit”

The spline interpolator fits a mathematical function to a specified number of nearest points

Page 42:

Interpolation Using Kriging

• Based on regionalized variable theory
• Drift, random correlated component, noise
• This method produces a statistically optimal surface, but it is very computationally intensive
• Kriging is used frequently in soil science and geology

Page 43:

Trend Interpolator

• Fits a mathematical function (a polynomial of specified order) to input points
• Points may be chosen by nearest neighbor or radius searches --or-- all points may be used
• Uses a least-squares regression fit
• The surface produced does not necessarily pass through the points used
• This is an excellent choice when data points are sparse

Not available as a menu item in ArcGIS
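As an illustration of the idea, the sketch below fits the simplest trend surface, a first-order polynomial z = a + bx + cy (a plane), to all input points by least squares. It solves the normal equations with a small Gaussian elimination written for this example; the point data are hypothetical, and higher-order surfaces would need more terms.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - known) / m[r][r]
    return solution


def fit_plane(points):
    """Least-squares coefficients [a, b, c] of z = a + b*x + c*y."""
    rows = [(1.0, x, y) for x, y, _ in points]
    zs = [z for _, _, z in points]
    # Normal equations: (A^T A) coefficients = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve_linear_system(ata, atz)


# Hypothetical (x, y, elevation) points lying exactly on z = 10 + 2x + 3y.
planar = [(0, 0, 10), (1, 0, 12), (0, 1, 13), (1, 1, 15), (2, 1, 17)]
a, b, c = fit_plane(planar)

# Raise one elevation: the fitted plane no longer passes through every point.
bumpy = planar[:-1] + [(2, 1, 20)]
a2, b2, c2 = fit_plane(bumpy)
residual = 20 - (a2 + b2 * 2 + c2 * 1)
print([round(v, 3) for v in (a, b, c)], round(residual, 3))
```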

Page 44:

Which ArcView menu interpolator is better?

IDW
• Assumption: The variable being mapped decreases in influence with distance
• Example: interpolating consumer purchasing power for a retail site analysis

Spline
• Assumption: The variable being mapped is a smooth, continuous surface; it is not particularly good for surfaces with large variability over small horizontal distances
• Examples: terrain, water table heights, pollution concentration, etc.

Page 45:

The Finished Grid

The Messina “Eyeball” Interpolator was used

[Figure: the known points (46, 58, 70, 86, 97) and point x overlaid on the interpolated grid values below]

56 58 65 74

46 56 54

86 84 80

70 75 78 86 94 94 80

66 69 73 80 90 88 86

72 76 80 84 90 89 84

50 52 60 64 68 80 80

48 50 54 56

46 48 50 52 46 46 44

Grids are subject to the “layer cake effect”

Page 46:

Point Data Collection in the Field

• It is critical to obtain data at the corners of the grid extent
• It is advisable to obtain the VIPs (Very Important Points) such as the highest and lowest elevations

Page 47:

Other Continuous Surface Sources

USGS DEMs
• Produced directly from USGS Topographic Maps
• Elevations of an area are averaged within the grid cell (pixel)
• High and low points can never be saved as a grid cell value
• Various techniques (e.g., stereograms) were used to accomplish this process
• Original datum (e.g., NAD27, NAD83) is preserved in the DEM
• Spatial resolution: 30m (7.5 minute data), 1 arc-second (1 degree data), 10m*, 5m* (*limited coverage)

Page 48:

Other Continuous Surface Sources

Synthetic Aperture Radar, Side-looking Airborne Radar

Shuttle Missions:
• Shuttle Radar Topography Mission, 2/00
• SIR-C, 1994

Other Orbiters
• Magellan Mapping Mission of Venus, 1990-1994

Airborne Radar Mappers
• AirSAR/TopSAR
• GeoSAR: California mapping

Page 49:

How is Slope Computed?

$\text{Slope} = \arctan\sqrt{\left(\dfrac{dZ}{dX}\right)^2 + \left(\dfrac{dZ}{dY}\right)^2}$

100 130 140
120 150 160
160 170 200

Grid cell = 100m x 100m

Calculate the slope for the central pixel.
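One way to work the exercise is sketched below. The slide does not say how dZ/dX and dZ/dY are estimated, so this assumes simple central differences across the four edge-adjacent neighbors, with the top row of the window taken as north; ArcView's own slope routine may weight all eight neighbors and give a different figure.

```python
import math


def central_gradients(window, cell_size):
    """Central-difference gradients for the middle cell of a 3x3 window.

    Assumes row 0 is the northern row and column 0 the western column.
    Returns (dZ/dX, dZ/dY) with X positive east and Y positive north.
    """
    dz_dx = (window[1][2] - window[1][0]) / (2 * cell_size)
    dz_dy = (window[0][1] - window[2][1]) / (2 * cell_size)
    return dz_dx, dz_dy


def slope_degrees(window, cell_size):
    """Slope of the central cell: arctan of the gradient magnitude."""
    dz_dx, dz_dy = central_gradients(window, cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Elevations from the slide; cell size is 100 m.
window = [
    [100, 130, 140],
    [120, 150, 160],
    [160, 170, 200],
]
dz_dx, dz_dy = central_gradients(window, 100)
slope = slope_degrees(window, 100)
print(f"dZ/dX = {dz_dx}, dZ/dY = {dz_dy}, slope = {slope:.2f} degrees")
```

Under these assumptions the gradients are 0.2 and -0.2, giving a slope of about 15.8 degrees for the central pixel.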

Page 50:

How is Aspect Computed?

$A' = \arctan\left(-\dfrac{dZ/dY}{dZ/dX}\right)$

100 130 140
120 150 160
160 170 200

Grid cell = 100m x 100m

Calculate the aspect for the central pixel.

• If dZ/dX is negative, add 90 to A’
• If dZ/dX is positive, and dZ/dY is negative: add 270 to A’
• If dZ/dX is positive, and dZ/dY is positive: subtract A’ from 270
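A hedged sketch of an aspect calculation for the same window. It swaps in `atan2` to resolve the quadrant directly instead of applying the add-90/add-270 adjustments by hand, and it assumes central differences, a north-up window, X positive east, and Y positive north. Aspect is reported as the downslope direction in degrees clockwise from north; none of these conventions is confirmed by the slide.

```python
import math


def central_gradients(window, cell_size):
    """(dZ/dX, dZ/dY) for the middle cell; row 0 is assumed to be north."""
    dz_dx = (window[1][2] - window[1][0]) / (2 * cell_size)
    dz_dy = (window[0][1] - window[2][1]) / (2 * cell_size)
    return dz_dx, dz_dy


def aspect_degrees(window, cell_size):
    """Downslope direction in degrees clockwise from north, or None if flat."""
    dz_dx, dz_dy = central_gradients(window, cell_size)
    if dz_dx == 0 and dz_dy == 0:
        return None  # a flat cell has no aspect
    # The surface falls in the direction opposite to the gradient.
    # atan2(east component, north component) gives a compass bearing.
    return math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0


# Elevations from the slide; cell size is 100 m.
window = [
    [100, 130, 140],
    [120, 150, 160],
    [160, 170, 200],
]
aspect = aspect_degrees(window, 100)
print(f"aspect = {aspect:.1f} degrees")
```

Under these assumptions the central pixel rises toward the east and south, so it faces northwest (315 degrees).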