Spatial analysis in GIS. GIS for mineral and hydrocarbon exploration Used for integrating data (map layers) to identify most prospective areas ∫ Integrating.

Spatial analysis in GIS

GIS for mineral and hydrocarbon exploration

Used for integrating data (map layers) to identify the most prospective areas.

Input spatial datasets
• Categoric or numeric
• Binary or multi-class

∫ Integrating function
• Linear or non-linear
• Parameters

Output mineral potential map
• Grey-scale or binary

Data Types

Nominal/categorical

Ordinal

Interval

Ratio

Nominal data are items differentiated by a simple label, usually a name. They may have numbers assigned to them, which can make them appear ordinal, but they are not. Nominal items are usually categorical, in that they belong to a definable category. They can be counted, but not ordered or measured.

Ordinal data can be ranked (put in order) or have a rating scale attached. Can be counted and ordered, but not measured.

Interval data are data in which the distance between any two adjacent units of measurement (or 'intervals') is the same, but the zero point is arbitrary.

Ratio data are measured in terms of the ratio between a magnitude of a continuous quantity and a unit magnitude of the same kind. The zero value is absolute.

Data Types

Parametric vs. Non-parametric

Interval and ratio data are parametric, and are used with parametric tools in which distributions are predictable (e.g., Normal).

Nominal and ordinal data are non-parametric, and do not assume any particular distribution. They are used with non-parametric tools such as the histogram.

[Figure: distribution curves for the height of women and the height of men; height axes run from 4'7" to 6'8"]

Normal distribution – parameters are mean and standard deviation

Data Types

Continuous and Discrete

Continuous data are measured along a continuous scale.

Discrete data have a set of fixed values.

[Figure: examples of continuous data and discrete data]

Multi-class/continuous and binary data

[Figure panels: continuous map; multi-class map; binary magnetic map; binary geological map]

What is GIS?

• GIS = Geographic Information System
– Links databases and maps
– Manages information about places
– Helps answer questions such as:
  • Where is it?
  • What else is nearby?
  • Where is the highest concentration of ‘X’?
  • Where can I find things with characteristic ‘Y’?
  • Where is the closest ‘Z’ to my location?

Definition of GIS (Ron Briggs, UT Dallas)

A system of integrated computer-based tools for end-to-end processing (capture, storage, retrieval, analysis, display) of data using location on the earth’s surface.

• Set of integrated tools for spatial analysis
• Encompasses end-to-end processing of data
– capture, storage, retrieval, analysis/modification, display
• Uses explicit location on the earth’s surface to relate data
• Aimed at decision support, as well as ongoing operations and scientific inquiry

Because of the link between spatial locations and non-spatial data, it is possible to apply non-spatial statistical modeling methods to spatial data

SPATIAL DATA MODELS

What do you mean by spatial data?

How are real-world spatial data represented? How would you represent a real-world river? Land use?

SPATIAL DATA MODELS

Two models:
1. Vector model
2. Raster model

SPATIAL DATA TYPES

Spatial data come in three basic forms:
• Map data
• Image data
• Attribute data

Vector Model: Map data

Map data contain the location and shape of geographic features. Maps use three basic shapes to present real-world features:
• points,
• lines, and
• areas (called polygons/regions).

Vector Model

The spatial locations of features are defined on the basis of coordinate pairs.

• These can be discrete, taking the form of points (point or node data), lines (arc or polyline data) or areas (area or polygon data).

• Attribute data pertaining to the individual spatial features are maintained in an external database.

• Topology – A set of rules that models how points, lines and polygons share geometry and are related to each other.

[Figure: vector map layers with linked attribute tables (Area, Population; ROCK)]

SPATIAL DATA MODELS: Vector Model

VECTOR MODEL

Points represent anything that can be described as an x, y location on the earth’s surface, for example, mineral deposits or gas fields.

Lines represent objects described by length only (zero width), such as faults, streets, highways, and rivers.

A polygon describes a geographic feature that is characterized by a boundary, whether natural or artificial, such as the boundaries of countries, states, cities, census tracts, postal zones, market areas or rock types.

SPATIAL DATA TYPES: Image data (Raster Model)

Image data range from satellite images, digital elevation models, potential field data and aerial photographs to scanned maps (maps that have been converted from printed to digital format).

We can represent point, line and polygon data in image form

• Every cell represents a unit area on the ground. All unit areas are equal.

• The smaller the area the cells represent, the higher the resolution.

• Cell values represent a specific property of the ground in that unit area, for example:
- Surface reflectance
- Magnetic field
- Gravity field
- Elevation
- Rock type

• The values can be nominal, ordinal, interval or ratio; they can be integers or floating-point numbers.

• Georeferenced

10 m x 10 m grid cell

SPATIAL DATA MODELS: Raster Model

- Most spatial analyses are done in raster format because it facilitates cell-by-cell mathematical calculations, e.g.,

INGRID1 / INGRID2
INGRID1 * INGRID2
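As a rough illustration of the cell-by-cell arithmetic above, the sketch below treats two rasters as NumPy arrays on the same grid. The array names mirror INGRID1 and INGRID2, and the cell values are made up for the example; they are not taken from the slides.

```python
import numpy as np

# Two small rasters on the same grid (hypothetical cell values, for illustration only).
ingrid1 = np.array([[10.0, 20.0], [30.0, 40.0]])
ingrid2 = np.array([[2.0, 4.0], [5.0, 8.0]])

# Cell-by-cell division and multiplication, as in INGRID1 / INGRID2 and INGRID1 * INGRID2.
ratio = ingrid1 / ingrid2
product = ingrid1 * ingrid2

print(ratio)
print(product)
```

Because both grids share the same cell size and extent, each output cell depends only on the two input cells at the same position.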

SPATIAL DATA MODELS: Raster Model

VECTOR TO RASTER CONVERSION

The area of interest is covered by a fine mesh or matrix of grid cells and the surface attribute value occurring at the centre of each cell point is recorded as the value for that cell.

[Figure: three polygons labelled 1, 2 and 3 in vector form and the same units as a grid of cells]

Id   Type        Area
1    Granite     25
2    Sandstone   63
3    Limestone   42
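The cell-centre rule described above can be sketched in plain Python. This is only a minimal illustration: the rectangular polygon outlines, the grid size and the helper names `point_in_polygon` and `rasterize` are assumptions made for the example, not the shapes or tools used in the slides.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Count edges crossed by a horizontal ray running to the right of (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, n_rows, n_cols, cell_size):
    """Give each cell the Id of the polygon found at the cell centre (0 = none)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):          # row 0 is the southernmost row
        for c in range(n_cols):
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            for poly_id, polygon in polygons.items():
                if point_in_polygon(cx, cy, polygon):
                    grid[r][c] = poly_id
                    break
    return grid


# Hypothetical rectangular outlines for the three units (not the shapes in the slide).
polygons = {
    1: [(0, 0), (2, 0), (2, 4), (0, 4)],   # e.g. Granite
    2: [(2, 0), (4, 0), (4, 2), (2, 2)],   # e.g. Sandstone
    3: [(2, 2), (4, 2), (4, 4), (2, 4)],   # e.g. Limestone
}
grid = rasterize(polygons, n_rows=4, n_cols=4, cell_size=1.0)
for row in reversed(grid):               # print north at the top
    print(row)
```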

Raster to vector conversion (Digitization)

For vectorization, trace the boundaries using a digitizing tablet or on-screen digitizing.

Essentially, the X, Y coordinates of the features are stored.

However, it is often necessary to convert raster to vector format, and then back to raster format (why?).

SPATIAL DATA TYPES: Attribute Data

Attribute (tabular) data are the descriptive data that a GIS links to map features.

Attribute data is collected and compiled for specific areas like states, census tracts, cities, and so on and often comes packaged with map data.

GEOPROCESSING IN GIS

• Processing of spatial data to derive predictor map layers

Primary data
- Geological map
- Structural map
- Remote sensing
- Geophysical data
- Geochemical data

PROCESSING & INTERPRETATION

Derivative (input) layers
- Proximity to granites
- Proximity to deep faults
- Proximity to fold axes
- Reactive rocks
- Competency differences
- Alteration
- Metal anomalies

GEOPROCESSING IN GIS

• Querying and conditional evaluation
• Density calculations
• Distance calculations
• Interpolation
• Reclassification

QUERYING IN GIS

• Query by attributes
• Query by location

SELECT BY ATTRIBUTES

SQL is used for selecting features in a map layer by attributes that fulfil a specified condition.

For example,

SELECT * FROM MapLayer WHERE "field1" >= 10

OPTIONS:
• NEW_SELECTION
• ADD_TO_SELECTION
• REMOVE_FROM_SELECTION
• SUBSET_SELECTION
• SWITCH_SELECTION

IMPORTANT OPERATORS
• =
• >
• <
• <>
• >=
• <=
• LIKE
• AND
• OR
• NOT

QUERY BY ATTRIBUTES

SELECT * FROM GEOLOGY WHERE "ROCK" = 'Dolerite'

[Figure: geological map symbolised by the ROCK attribute, and the resulting map of dolerite]
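The attribute query above can be tried outside a GIS with Python's built-in `sqlite3` module. The sketch below is only an illustration: the in-memory GEOLOGY table and its four rows are invented for the example, and a real GIS would run the equivalent query against the layer's attribute table.

```python
import sqlite3

# Build a tiny stand-in for the GEOLOGY attribute table (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE GEOLOGY (Id INTEGER, "ROCK" TEXT)')
conn.executemany(
    "INSERT INTO GEOLOGY VALUES (?, ?)",
    [(1, "Granite"), (2, "Dolerite"), (3, "Sandstone"), (4, "Dolerite")],
)

# Same condition as the slide: keep only features whose ROCK attribute is Dolerite.
rows = conn.execute("""SELECT * FROM GEOLOGY WHERE "ROCK" = 'Dolerite'""").fetchall()
conn.close()

print(rows)
```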

SELECT BY LOCATION

Used for selecting features from a map layer based on their spatial relationship (adjacency, connectivity, containment) with another layer.

For example, SELECT * FROM MapLayer1 CONTAINS MapLayer2

ArcGIS syntax:
SelectLayerByLocation MapLayer1 Type_of_relationship MapLayer2 Buffer_distance NEW_SELECTION

Types of spatial relationships that can be queried:
• Intersect
• Are within a distance of
• Contain
• Completely contain
• Are within
• Are completely within
• Have their centroid in
• Share a line segment with
• Are identical to

SELECT BY LOCATION

Example: gold deposits within 1 km of faults

SELECT * FROM GOLD_DEPOSITS WITHIN_1_km FROM FAULTS

[Figure: faults and gold deposits, with the deposits within 1 km of faults selected]
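A "within a distance of" query like the one above reduces to computing the shortest distance from each point to the nearest line segment. The sketch below shows one way to do that in plain Python; the coordinates, deposit names and helper functions are hypothetical and stand in for what a GIS does internally.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project P onto AB and clamp the projection to the segment ends.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def select_within_distance(points, polylines, max_dist):
    """Return the names of points lying within max_dist of any polyline."""
    selected = []
    for name, (px, py) in points.items():
        nearest = min(
            point_segment_distance(px, py, *line[k], *line[k + 1])
            for line in polylines
            for k in range(len(line) - 1)
        )
        if nearest <= max_dist:
            selected.append(name)
    return selected


# Hypothetical coordinates in metres.
faults = [[(0.0, 0.0), (5000.0, 0.0)], [(0.0, 3000.0), (3000.0, 6000.0)]]
gold_deposits = {"A": (1000.0, 500.0), "B": (2500.0, 1500.0), "C": (1000.0, 4500.0)}

selected = select_within_distance(gold_deposits, faults, max_dist=1000.0)
print(selected)
```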

Density estimation

Density is defined as the number of (point or line) features per unit area.

Density surfaces show where point or line features are concentrated.

For example, you have a point shapefile showing mineral deposit locations and you want to learn more about the metal distribution in the area.

Density can be used for cluster studies (mineral deposits, population, roads and infrastructure, natural resources such as minerals, forests and agriculture, animal habitats, ecology, etc.).

[Figure: gold deposits and the derived density surface (distribution of gold)]

[Figure: faults and the derived fault density surface (distribution of faults)]

[Figure: distribution of gold compared with distribution of faults]
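One simple way to turn the definition of density (features per unit area) into a surface is to count the points within a search radius of each cell centre and divide by the area of the search circle. The sketch below assumes that approach; the deposit coordinates, grid size and radius are hypothetical, and GIS packages offer other estimators (for example kernel-based ones) that will give different surfaces.

```python
import numpy as np


def point_density(points, n_rows, n_cols, cell_size, radius):
    """Count points within `radius` of each cell centre, divided by the circle area."""
    pts = np.asarray(points, dtype=float)
    xs = (np.arange(n_cols) + 0.5) * cell_size
    ys = (np.arange(n_rows) + 0.5) * cell_size
    cx, cy = np.meshgrid(xs, ys)
    # Distance from every cell centre to every point: shape (n_rows, n_cols, n_points).
    d = np.hypot(cx[..., None] - pts[:, 0], cy[..., None] - pts[:, 1])
    counts = (d <= radius).sum(axis=-1)
    return counts / (np.pi * radius ** 2)


# Hypothetical deposit locations in km.
deposits = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (8.0, 8.0)]
density = point_density(deposits, n_rows=10, n_cols=10, cell_size=1.0, radius=1.5)
print(density.round(3))
```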

Distance estimation

Euclidean distance is calculated from the center of the source cells to the center of each of the surrounding cells. True Euclidean distance is calculated to each cell in the distance functions.

For each cell, the distance is calculated to each source cell by calculating the hypotenuse, with the x-max and y-max as the other two legs of the triangle. This calculation derives the true Euclidean, not cell, distance. The shortest distance to a source is determined, and if it is less than the specified maximum distance, the value is assigned to the cell location on the output raster.

[Figure: faults and the derived distance-to-faults raster]
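The hypotenuse calculation described above can be sketched with NumPy. This brute-force version compares every cell with every source cell, which is fine for a small illustration but not how a production GIS would do it; the function name, the 4 x 4 grid and the 100 m cell size are assumptions for the example.

```python
import numpy as np


def euclidean_distance(source_mask, cell_size, max_distance=None):
    """Distance from each cell centre to the nearest source cell centre."""
    rows, cols = np.indices(source_mask.shape)
    src_r, src_c = np.nonzero(source_mask)
    # Legs of the triangle (row and column offsets) for every cell/source pair.
    dr = rows[..., None] - src_r
    dc = cols[..., None] - src_c
    # Hypotenuse to each source, then keep the shortest one.
    dist = np.hypot(dr, dc).min(axis=-1) * cell_size
    if max_distance is not None:
        # Cells beyond the maximum distance get no value.
        dist = np.where(dist < max_distance, dist, np.nan)
    return dist


# Hypothetical 4 x 4 raster with a "fault" running along the first column.
faults = np.zeros((4, 4), dtype=bool)
faults[:, 0] = True

distance = euclidean_distance(faults, cell_size=100.0)
print(distance)
```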

GEOPROCESSING IN GIS

• Interpolation: used for determining the unknown value at any point from the known values at the given sample points in the spatial neighbourhood.

• Non-interpolative methods

• Interpolative methods

Non-interpolative methods

1. Assign each sample point to a grid cell (or pixel).
2. Buffer the sample points.
3. Draw a Thiessen or Voronoi polygon around each sample point; assign the value at the sample point to the entire area within the Voronoi polygon.

Delaunay triangles

A Delaunay triangulation for a set of points is a triangulation of the points in such a way that no point is inside the circumcircle of any triangle.

Delaunay triangulations maximize the minimum angle of all the angles of the triangles in the triangulation.

• Voronoi polygons: connecting the centres of the circumcircles produces the Voronoi polygons.

The defining property of the Voronoi polygon of a sample point is that every location within that polygon is closer to that sample point than to any other.
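Because every location inside a Voronoi polygon is closest to that polygon's sample point, the Thiessen/Voronoi assignment can be computed without drawing the polygons at all: each query location simply takes the value of its nearest sample. The sketch below assumes that shortcut; the sample coordinates, values and the function name `thiessen_assign` are illustrative.

```python
import numpy as np


def thiessen_assign(sample_xy, sample_values, query_xy):
    """Give each query point the value of its nearest sample point."""
    s = np.asarray(sample_xy, dtype=float)
    q = np.asarray(query_xy, dtype=float)
    # Distance matrix with one row per query point and one column per sample.
    d = np.hypot(q[:, None, 0] - s[None, :, 0], q[:, None, 1] - s[None, :, 1])
    return np.asarray(sample_values)[d.argmin(axis=1)]


# Hypothetical sample points and values.
samples = [(1.0, 1.0), (4.0, 2.0), (2.0, 3.0)]
values = [26, 32, 28]

result = thiessen_assign(samples, values, [(0.0, 0.0), (4.5, 2.5), (2.0, 2.8)])
print(result)
```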

Interpolation: estimating values at points intermediate between sample points.

• Triangulation
• Inverse distance weighting
• Natural neighbours
• Kriging

Triangulation

• Draw Delaunay triangles for all sample points

[Figure: sample points 1 to 5 joined by Delaunay triangles, with query point 6 inside the triangulation]

FID   X   Y   Z
1     1   1   26
2     4   2   32
3     2   3   28
4     5   4   35
5     3   5   42
6     3   4   ?

The equation for every triangular facet is given by

$$ z = a + bx + cy $$

where z is the value, x and y are the X and Y coordinates of a sample point, respectively, and a, b and c are unknown coefficients.

Each facet has three vertices, which gives three equations in three unknown coefficients, so the coefficients can be solved for. Once the coefficients are known, the value can be estimated at any point within the triangle.
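The sketch below works through the facet equation for query point 6 at (3, 4), using the sample table above. It assumes, for illustration only, that point 6 falls in the facet with vertices 3, 4 and 5; the figure that would confirm the triangle is not recoverable here, and a different facet would give a different estimate. The barycentric check simply confirms that the query point lies inside the assumed triangle.

```python
import numpy as np

# Vertices (x, y, z) of the facet assumed to contain the query point: points 3, 4 and 5.
vertices = np.array([[2.0, 3.0, 28.0], [5.0, 4.0, 35.0], [3.0, 5.0, 42.0]])
query_x, query_y = 3.0, 4.0  # point 6

# Three equations z = a + b*x + c*y, one per vertex, in the unknowns a, b, c.
A = np.column_stack([np.ones(3), vertices[:, 0], vertices[:, 1]])
a, b, c = np.linalg.solve(A, vertices[:, 2])

# Barycentric coordinates: all non-negative means the query point is inside the facet.
bary = np.linalg.solve(A.T, np.array([1.0, query_x, query_y]))
inside = bool(np.all(bary >= 0))

z_query = a + b * query_x + c * query_y
print(inside, round(float(z_query), 3))
```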

Inverse distance weighting

[Figure: sample points 1 to 5 and query point 6]

$$ z_j = \frac{\sum_{i=1}^{n} w_i z_i}{\sum_{i=1}^{n} w_i}, \qquad w_i = \frac{1}{d(j,i)^{p}} $$

where z_i is the value at point i; w_i is the weight of point i; d(j,i) is the distance between point i and the point j at which the value is to be calculated; p is the power; and n is the total number of points with known values in the neighbourhood.

Point   Z    Distance from 6
1       26   3.6
2       32   2.2
3       28   1.4
4       35   2
5       42   1
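A minimal sketch of inverse distance weighting for query point 6 at (3, 4), using the five sample points from the triangulation table. The slides do not state the power p, so p = 2 is an assumption here, as is the function name `idw`.

```python
import math

# (x, y, z) for sample points 1 to 5.
samples = [(1, 1, 26), (4, 2, 32), (2, 3, 28), (5, 4, 35), (3, 5, 42)]


def idw(x, y, samples, power=2.0):
    """Weighted mean of sample values with weights 1 / distance**power."""
    num = 0.0
    den = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return float(zi)  # query point coincides with a sample
        w = 1.0 / d ** power
        num += w * zi
        den += w
    return num / den


z6 = idw(3, 4, samples)
print(round(z6, 2))
```

With p = 2 the nearest sample (point 5, at distance 1) dominates, so the estimate is pulled towards 42; a larger power would pull it further in that direction.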

Natural neighbor

• Draw Voronoi polygons for all points (green colour)

• Draw a Voronoi polygon around the point at which the value is to be determined (orange colour)

• Apply weights to each point value in proportion to the area of intersection between the Voronoi polygon of that point and the Voronoi polygon of the query point.

Natural neighbor interpolation finds the closest subset of sample points for the query point and applies weights to them based on proportionate areas.

$$ z_j = \frac{\sum_{i=1}^{n} w_i z_i}{\sum_{i=1}^{n} w_i}, \qquad w_i = A_{ij} $$

where A_{ij} is the area of intersection between the Voronoi polygons of the points i and j.

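Kriging

The value at the queried point is given by:

$$ \hat{z} = \sum_{i=1}^{n} w_i z_i $$

where z_i are the values at the sample points and w_i are the weights of the sample points.

The weights are obtained by solving

$$
\begin{bmatrix}
C_{11} & \cdots & C_{1n} & 1 \\
\vdots & \ddots & \vdots & \vdots \\
C_{n1} & \cdots & C_{nn} & 1 \\
1 & \cdots & 1 & 0
\end{bmatrix}
\begin{bmatrix} w_1 \\ \vdots \\ w_n \\ \mu \end{bmatrix}
=
\begin{bmatrix} C_{10} \\ \vdots \\ C_{n0} \\ 1 \end{bmatrix}
$$

or, in matrix form,

$$ C \cdot w = D $$

$$ C^{-1} \cdot C \cdot w = C^{-1} \cdot D $$

$$ w = C^{-1} \cdot D $$

C: spatial covariance values between each pair of sample points.
D: spatial covariances between the sample points and the point (index 0) where the value is to be estimated.
The final row and column of ones constrain the weights to sum to 1; μ is the associated Lagrange multiplier.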
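The sketch below solves the system C · w = D for query point 6 at (3, 4), using the five sample points from the triangulation table. Several things are assumptions made for illustration: the exponential covariance model and its sill and range values are hypothetical, and the row and column of ones (with a Lagrange multiplier as the last unknown) follow the usual ordinary-kriging setup in which the weights sum to 1.

```python
import numpy as np

# Sample coordinates and values (points 1 to 5).
xy = np.array([[1.0, 1.0], [4.0, 2.0], [2.0, 3.0], [5.0, 4.0], [3.0, 5.0]])
z = np.array([26.0, 32.0, 28.0, 35.0, 42.0])


def covariance(h, sill=1.0, rng=4.0):
    """Hypothetical exponential covariance model, C(h) = sill * exp(-3h / range)."""
    return sill * np.exp(-3.0 * h / rng)


def ordinary_kriging(x0, y0):
    n = len(z)
    # C: covariances between sample pairs, bordered by ones for the sum-to-one constraint.
    h = np.hypot(xy[:, None, 0] - xy[None, :, 0], xy[:, None, 1] - xy[None, :, 1])
    C = np.ones((n + 1, n + 1))
    C[:n, :n] = covariance(h)
    C[n, n] = 0.0
    # D: covariances between each sample and the query point, plus the constraint entry.
    D = np.ones(n + 1)
    D[:n] = covariance(np.hypot(xy[:, 0] - x0, xy[:, 1] - y0))
    solution = np.linalg.solve(C, D)   # w = C^-1 D
    weights = solution[:n]             # the last entry is the Lagrange multiplier
    return float(weights @ z), weights


estimate, weights = ordinary_kriging(3.0, 4.0)
print(round(estimate, 2), weights.round(3))
```

Solving with `np.linalg.solve` avoids forming the inverse explicitly, which is numerically safer than computing C⁻¹ and multiplying.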

Kriging: Spatial covariance

Covariance between two variables x and y is given by

$$ C = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) $$

It measures the degree to which x co-varies with y.

Moment of inertia measures the deviation from perfect correlation:

$$ \gamma = \frac{1}{2n} \sum_{i=1}^{n} (x_i - y_i)^2 $$

In the above equations, suppose we substitute z(t) for x and z(t+h) for y, where z is a spatial variable measured at a location t and at another location (t+h), and h is the separation distance, called a shift or lag.

The spatial covariance of z with itself at a separation distance h can then be measured by γ (or by C).

Kriging: Variograms

By changing the separation distance h (called lag or shift), a series of scatter plots can be generated showing how the variable z is correlated with itself as a function of h.

The plot of the moment of inertia as a function of h is called the variogram; the plot of the covariance as a function of h is called the autocovariance diagram.

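[Figure: scatter plot of the moment of inertia against h with an exponential model fitted to it; the sill and range are marked]

Exponential model:

$$ \gamma(h) = \begin{cases} C_0 & \text{if } h = 0 \\ C_0 + C_1\left(1 - \exp(-3h/a)\right) & \text{if } h > 0 \end{cases} $$

Here C_0 is the nugget effect, C_0 + C_1 is the sill and a is the range. Sill and range are estimated so that the model is a reasonable fit to the observed data.

[Figure: variogram and autocovariance diagram]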
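The exponential model above is easy to evaluate directly. In the sketch below the parameter values for c0, c1 and a are hypothetical (chosen only to give readable numbers), and the function name is an assumption; the formula itself follows the model as written in the slide.

```python
import math


def exponential_variogram(h, c0, c1, a):
    """gamma(h) = c0 if h == 0, else c0 + c1 * (1 - exp(-3h / a))."""
    if h == 0:
        return c0
    return c0 + c1 * (1.0 - math.exp(-3.0 * h / a))


# Hypothetical parameters: c0, c1 and range a (a in the same units as h).
c0, c1, a = 2.0, 40.0, 300.0
curve = [(h, exponential_variogram(h, c0, c1, a)) for h in (0, 100, 200, 300, 600)]
for h, gamma in curve:
    print(h, round(gamma, 2))
```

At h = a the term 1 - exp(-3) is about 0.95, so the model has reached roughly 95% of c1 above c0 by the time the lag equals the range, and it flattens towards c0 + c1 beyond that.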

Kriging: Variogram Models

[Figure: variogram models]

Kriging: Variogram, fitting a model to data

[Figure: fitted variogram models with a longer range and a smaller range]

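Kriging: Spatial covariance

The autocovariance diagram can be used to calculate the covariance at different distances, and hence the different covariances in the equations below:

$$
\begin{bmatrix}
C_{11} & \cdots & C_{1n} & 1 \\
\vdots & \ddots & \vdots & \vdots \\
C_{n1} & \cdots & C_{nn} & 1 \\
1 & \cdots & 1 & 0
\end{bmatrix}
\begin{bmatrix} w_1 \\ \vdots \\ w_n \\ \mu \end{bmatrix}
=
\begin{bmatrix} C_{10} \\ \vdots \\ C_{n0} \\ 1 \end{bmatrix}
$$

$$ C \cdot w = D $$

The following equation is then used to estimate the value at the query point:

$$ \hat{z} = \sum_{i=1}^{n} w_i z_i $$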

Worked example: elevation grid (elevation in meters, 100 m cell spacing)

45   49   55   56
43   45   50   52
40   42   44   48

Auto-covariance

Case 1: Shift of 100 meters

X    Y = X+100
40   42
40   43
42   44
42   45
44   48
44   50
48   52
43   45
43   45
45   50
45   49
50   52
50   55
52   56

Mean X = 44.85
Mean Y = 48.28
Covariance = 13.94
MOI = 6.71

Case 2: Shift of 200 meters

X    Y = X+200
40   44
40   45
42   48
42   49
44   55
48   56
43   50
45   52

Mean X = 43
Mean Y = 49.8
Covariance = 8.718
MOI = 25.56

Case 3: Shift of 300 meters

X    Y = X+300
40   48
43   52
45   56

Mean X = 42.66
Mean Y = 52
Covariance = 7.8889
MOI = 44.33
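The moment-of-inertia figures in the worked example can be recomputed from the listed pairs with a few lines of Python. The sketch below uses the formula MOI = (1/2n) Σ (x − y)² and the X, Y pairs exactly as tabulated for the 100 m, 200 m and 300 m shifts; the function and variable names are illustrative. Printed means may differ from the slide values in the last digit, since the slides appear to truncate rather than round.

```python
# X, Y pairs for each shift, as listed in the worked example.
pairs_by_lag = {
    100: [(40, 42), (40, 43), (42, 44), (42, 45), (44, 48), (44, 50), (48, 52),
          (43, 45), (43, 45), (45, 50), (45, 49), (50, 52), (50, 55), (52, 56)],
    200: [(40, 44), (40, 45), (42, 48), (42, 49), (44, 55), (48, 56), (43, 50), (45, 52)],
    300: [(40, 48), (43, 52), (45, 56)],
}


def moment_of_inertia(pairs):
    """MOI = (1 / 2n) * sum of (x - y)^2 over the n pairs."""
    return sum((x - y) ** 2 for x, y in pairs) / (2 * len(pairs))


def mean(values):
    return sum(values) / len(values)


for lag, pairs in pairs_by_lag.items():
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    print(lag, round(mean(xs), 2), round(mean(ys), 2), round(moment_of_inertia(pairs), 2))
```

Plotting the three MOI values against their lags gives the points of the experimental variogram.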

Auto-covariance: Distance vs. Covariance

Distance (m)   Covariance
100            13.94
200            8.718
300            7.8889

[Figure: plot of covariance against distance, with the range marked]

Variogram: Distance vs. MOI

Distance (m)   MOI
100            6.71
200            25.56
300            44.33

[Figure: plot of MOI against distance, with the sill and nugget effect marked]
