7/28/2019 Trodd Overlay Areas and Surfaces
OVERLAY ANALYSIS: AREAS AND SURFACES
TABLE OF CONTENTS
1 The importance of overlay in GIS
2 Learning objectives
3 Area and surface overlay analysis
    3.1 A brief history
    3.2 Field and feature perspectives
4 Polygon-on-polygon overlay operations
    4.1 Topological issues
    4.2 Creating new geometries
    4.3 Weighted overlay and the vector data model
    4.4 Problems with overlay on vector data model
        4.4.1 Computational demands
        4.4.2 Sliver polygons
5 Area-on-area overlay operations on raster data
    5.1 Overlay analysis on raster data is easy!
        e-Tutorial Exercise 1
    5.2 Difficulties of overlay on raster data models
        5.2.1 What do the numbers mean?
        e-Tutorial Exercise 2
        5.2.2 Constructing area entities
        5.2.3 Cell resolution
6 Some more issues in overlay analysis
    6.1 Scales of measurement
    6.2 Scale and overlay analysis
7 What have you learnt in this lesson?
2005 Nigel Trodd
1 The importance of overlay in GIS
Overlay is a fundamental spatial operation. It is one of the functions that distinguishes
GIS from other systems such as CAD and DBMS. The UK Chorley Report (Department of
the Environment, 1986) illustrates what a GIS should be able to do by giving the example
of an industrial siting case study that uses overlay to 'sieve' the various siting criteria and
identify suitable locations.
Overlay operators combine data from the same entity type or different entity types. In
both cases they create new geometries and can change entity type and/or attribute value.
There are four overlay operators in common use:
point-in-area (also known as point-in-polygon)
line-in-area
area-on-area (also known as polygon-on-polygon)
weighted overlay
In this lesson we will concentrate on two of these operators, namely area-on-area and
weighted overlay. These operators process area and surface entity types respectively.
You will find that overlay techniques vary with the data model employed by your GIS. This
means that the results of overlay analysis depend on the data model and, in general,
techniques to analyse vector data are time consuming and computationally intensive
whereas overlay of raster data is relatively straightforward, quick and efficient.
2 Learning objectives
Upon completion of this lesson you should be able to:
- Identify and explain techniques to perform area-on-area overlay on raster and
  vector data.
- Identify and explain techniques to perform weighted overlay on raster and vector
  surfaces.
- Understand the main weaknesses of these overlay operations as they are
  implemented in GIS.
3 Area and surface overlay analysis
3.1 A brief history
The principles of area-on-area overlay pre-date GIS. Until the arrival of GIS, map overlay
analysis was performed manually by superimposing transparent acetates of map layers on
a light table. The stack of acetates was used to visually identify sites that met a number of
criteria.
In the 1960s the 'quantitative revolution' heralded a new era for spatial analysis. Several
influential figures emerged including Ian McHarg whose work in landscape ecology is most
well known for its attempt to explain the distribution of plants by combining information
about the environment. His approach was to apply sieve-mapping techniques. This was
a substantial step forward in computational spatial analysis, allowing considerably more
work to be performed using the computer than was originally possible from field
observation and single coverage cartography alone (Simpson, 1989). At the same time
the first GIS prototypes were being developed and it is not coincidental that early products
were designed, in part, to automate this 'sieve mapping'. Speedy and efficient analytical
techniques such as these were of particular interest to governments intent on examining
spatial relationships of large regions. Work done under the auspices of the Canadian
government in developing CGIS was largely responsible for increasing the prevalence of
polygon overlay in GIS.
3.2 Field and feature perspectives
Overlay analysis has generally taken the form of either area-on-area overlay or weighted
overlay depending on your perspective. The former is more concerned with the analysis
of particular features and adopts a discrete object perspective. The objectives of area-on-
area overlay are to determine whether two features overlap (the technical term is to
'intersect') and, if so, to define the identity of areas formed by the overlap as one or more
new area objects.
Weighted overlay operations combine two or more complete map layers consisting of
areas or surfaces. In addition to computing the identity of the new geometries the
objective of weighted overlay is to compute new attribute values. Because the operation
processes the complete data set, boundaries from all inputs will be retained but
broken into shorter fragments by intersections that occur between boundaries in one input
dataset and boundaries in another.
In both area-on-area and weighted overlay the output entity type is an area or surface
respectively but the overlay operation has generated new geometries and new attributes.
4 Polygon-on-polygon overlay operations
4.1 Topological issues
If you are overlaying two vector map layers you need to ensure before you start that the
input map layers are topologically correct. If this is so then the output maps will also be
topologically correct (Figure 1).
Figure 1. Polygon-on-polygon overlay. (Panels: polygon-on-polygon overlay; new geometries.)
In polygon overlay it is necessary to add new intersections (nodes) and create new
polygons to retain topology. Overlaying 2 sets of polygons can produce a large number of
new polygons and increase the number of nodes and arcs. In Figure 2, for example, the
number of nodes increased by 75% and the number of arcs by 83%. Warning!! Increasing
the number of input data sets can rapidly increase the number of output features.
The algorithms to compute the location of new nodes are the same as those used for line
intersection. Once the nodes have been identified, the arcs need to be split and the
new topology constructed.
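The node-finding step can be illustrated with a minimal sketch of a line intersection test between two straight arc segments. It is only an illustration under stated assumptions: the function names segment_intersection and split_arc are hypothetical, points are plain (x, y) tuples, and parallel or collinear segments are simply reported as not intersecting. A production GIS would also handle those special cases and rebuild the polygon topology afterwards.

```python
def segment_intersection(p1, p2, p3, p4):
    """Return the point where segment p1-p2 crosses segment p3-p4, or None.

    Points are (x, y) tuples. Parallel and collinear segments return None
    in this simplified sketch.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear
    # t and u are positions along each segment (0 = start, 1 = end)
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def split_arc(start, end, node):
    """Split one arc into two shorter arcs that meet at the new node."""
    return [(start, node), (node, end)]


node = segment_intersection((0, 0), (2, 2), (0, 2), (2, 0))
arcs = split_arc((0, 0), (2, 2), node)
```

Here the two diagonal segments cross at (1.0, 1.0), so the first arc is split into two arcs that share the new node.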
4.2 Creating new geometries
Once the new set of nodes, arcs and polygons has been created the task is to extract a
meaningful set of polygons. It may be desirable to retain only that area that is common to
both input features. For example, a farmer is interested in knowing that part of a field that
has a loam soil. He is able to overlay the map of loam soil polygons on the field polygon to
extract a feature that meets both criteria (loam soil AND in-field).
Figure 2. Creating geometries in overlay operations on the vector data model.
Figure 3. Polygon overlay: intersection (polygon a AND polygon b). The new feature
geometry is shown with the old boundaries dissolved.
It is worth noting that the variables the farmer is processing are both of categorical (or
nominal) data type. This matters because mathematicians have developed a suite of
algorithms to analyse such data, known as Boolean operators, that GIS analysts exploit in
area-on-area overlay analysis.
In the example the farmer was analysing 2 criteria and applied an algorithm to create a
new geometry that met both criteria: the area of intersection of polygon a AND polygon
b. In other situations you may be more interested in features that meet either criterion
(polygon a OR polygon b). The algorithm is known as union and the effect is to retain all
parts of both input polygons in the output feature. Other Boolean operators
frequently used in GIS are NOT and XOR.
Figure 4. Polygon overlay: (a) Union, (b) NOT and (c) XOR.
a) polygon a OR polygon b: union.
b) polygon a NOT polygon b: only the parts of polygon a that are outside polygon b.
c) polygon a XOR polygon b: the inverse of intersection within the union, i.e.
   (polygon a OR polygon b) NOT (polygon a AND polygon b).
As well as the mathematical rigour conveyed by the use of Boolean operators, a strength
of the basic polygon overlay is that it is intuitive when applied to a vector data model
because we are handling discrete area objects and nominal attributes.
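The four Boolean operators can be tried out with a rough sketch in which each polygon is stood in for by the set of unit cells it covers. This is an assumption made purely to keep the example short: a vector GIS works on the polygon boundaries themselves, and the helper name cells and the two rectangles are invented for illustration.

```python
def cells(x0, y0, x1, y1):
    """All unit cells (col, row) inside the rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


polygon_a = cells(0, 0, 4, 4)   # 16 cells
polygon_b = cells(2, 2, 6, 6)   # 16 cells, overlapping a in a 2 x 2 block

intersection = polygon_a & polygon_b   # a AND b
union = polygon_a | polygon_b          # a OR b
a_not_b = polygon_a - polygon_b        # a NOT b
xor = polygon_a ^ polygon_b            # a XOR b

print(len(intersection), len(union), len(a_not_b), len(xor))  # 4 28 12 24
```

Note that the XOR result is the union with the intersection removed (28 - 4 = 24 cells).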
4.3 Weighted overlay and the vector data model
In the basic area-on-area overlay on a vector data model the objective was to identify one
or more parts of the new geometry that met simple criteria. Areas that did not meet the
criteria were discarded. This was processed as a single task.
The objective of weighted overlay is to calculate a new set of values for the complete
coverage based on a combination of input values. When working with a vector data
model there are two tasks to perform: (i) create a new set of geometries for the entire
area and (ii) compute a new set of attributes for those geometries.
The latter task is a matter of describing a mathematical equation to process the input
values. The first task, however, requires you to extend the basic polygon overlay
operation to consider every intersection between all polygons in every data layer. As you
can imagine this can be computationally demanding, especially if the GIS you are using
computes topology 'on the fly' and does not store it in the data structure. As we shall see
this is one of the reasons why weighted overlay is more frequently applied to a raster data
model.
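The second task, computing new attributes, might look something like the sketch below. It assumes the geometric overlay has already been done, so that every output polygon carries one score inherited from each input layer. The polygon ids, field names, scores and weights are all invented for illustration, and the scores are assumed to be on a scale for which arithmetic is meaningful.

```python
# Each output polygon inherits one score from every input layer.
# The ids, scores and weights below are invented for illustration.
output_polygons = [
    {"id": 1, "soil_score": 3, "slope_score": 1},
    {"id": 2, "soil_score": 2, "slope_score": 3},
    {"id": 3, "soil_score": 1, "slope_score": 2},
]
weights = {"soil_score": 0.6, "slope_score": 0.4}


def weighted_value(polygon, weights):
    """Weighted sum of the input scores carried by one output polygon."""
    return sum(weight * polygon[field] for field, weight in weights.items())


for polygon in output_polygons:
    polygon["suitability"] = round(weighted_value(polygon, weights), 2)
```

The equation itself is trivial. As the text says, the expensive part on a vector data model is building the output polygons in the first place.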
4.4 Problems with overlay on vector data model
4.4.1 Computational demands
The data file produced as a result of polygon overlay may be considerably larger than the
original because lines have been split into smaller segments and new nodes and polygons
have been created. Although more file space is required to store the outputs, a more
common problem is that some implementations of polygon overlay in GIS require large
amounts of memory or temporary file space to hold intermediate products during the
processing. The result is that most GIS are limited in the number of polygons that they
can handle in a polygon overlay operation.
It is fairly obvious that larger map layers will take longer to process. It is therefore prudent
to develop a strategy to minimise processing time (and memory use). A data processing
strategy is particularly important if your GIS has to compute topology 'on the fly', e.g. ESRI
ArcView 3 and ArcGIS 8, because this increases the computational demands. My advice
is to design your analysis so that the fewest possible features are overlaid.
Example: generate information on the area of coniferous forest in Bavaria.
Poor strategy:
1. Intersect all states in Germany with all land cover types (wait three hours...).
2. Select by attribute to extract coniferous forests in Bavaria.
Smart strategy:
1. Select Bavaria.
2. Reclass by attribute all coniferous forest land cover (and all other
   non-coniferous forest land cover).
3. Intersect reclassified coniferous forest with Bavaria (wait 5 mins...).
The smart strategy requires an extra step in processing but will substantially reduce the
computational load.
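A back-of-envelope sketch shows why the order of operations matters. It assumes a naive overlay that tests every feature in one layer against every feature in the other, with no spatial index. The feature counts are hypothetical, and the function name candidate_pairs is invented for illustration. For brevity the sketch selects the coniferous polygons rather than reclassifying the whole land cover layer.

```python
def candidate_pairs(layer_a, layer_b):
    """Number of feature pairs a naive overlay would have to test."""
    return len(layer_a) * len(layer_b)


# Hypothetical layers: 16 states and 8000 land cover polygons.
states = [{"name": "Bavaria"}] + [{"name": f"state_{i}"} for i in range(2, 17)]
land_cover = [{"class": "coniferous forest"}] * 800 + [{"class": "other"}] * 7200

# Poor strategy: intersect everything, then select by attribute.
poor = candidate_pairs(states, land_cover)

# Smart strategy: select by attribute first, then intersect.
bavaria = [s for s in states if s["name"] == "Bavaria"]
coniferous = [p for p in land_cover if p["class"] == "coniferous forest"]
smart = candidate_pairs(bavaria, coniferous)

print(poor, smart)  # 128000 800
```

With these invented counts the attribute selections, which are cheap, cut the number of geometric tests by a factor of 160.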
4.4.2 Sliver polygons
Rogue or spurious polygons that are produced as a result of overlay are commonly known
as sliver polygons. If you overlay two sets of data with the same area entities that have
been acquired from different sources or have been digitised twice from the same source
then you will almost certainly encounter such polygons.
Figure 5. Sliver polygons caused by digitising the same line twice.
The two versions of such boundaries will not be coincident and as a result large numbers
of small sliver polygons will be created by the polygon overlay process.
Figure 6. Sliver polygons along the boundaries of administrative units.
There are two approaches to eliminating them:
1. close them during processing.
2. eliminate them after processing.
Removing them automatically during processing is normally done using a user-defined
tolerance. The analyst adjusts the tolerance to create an optimal solution. If the tolerance
is too big then lines which are close together, but actually separate, may be joined.
Figure 7. Setting tolerance to close sliver polygons.
The alternative is to remove slivers after processing. This may speed up the actual
overlay processing but requires a degree of intelligence for a computer to be able to
distinguish between real and sliver polygons. There are several differences between
typical (real) polygons and sliver polygons.
Figure 8. Real and sliver polygons.

'Real' polygons                                          Sliver polygons
Size and shape vary                                      Generally small, long and thin
Generally more than two bounding arcs                    Generally only two bounding arcs
Attributes vary randomly between neighbouring polygons   Attributes may alternate between adjacent polygons
Usually three arc intersections                          Generally four arc intersections
Once the sliver polygons have been identified they can be closed by replacing them with a
central line.
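One way a computer might apply the 'small, long and thin' rule is sketched below. The compactness measure 4*pi*area/perimeter^2 and both thresholds are assumptions introduced here for illustration. They are not taken from the lesson, and a real workflow would tune them to the data and probably combine them with the arc-count and attribute tests listed above.

```python
import math


def area_and_perimeter(ring):
    """Shoelace area and perimeter of a closed ring given as (x, y) vertices."""
    area2 = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter


def is_sliver(ring, max_area=1.0, max_compactness=0.2):
    """Flag polygons that are both small and thin (hypothetical thresholds)."""
    area, perimeter = area_and_perimeter(ring)
    compactness = 4 * math.pi * area / perimeter ** 2   # 1.0 for a circle
    return area <= max_area and compactness <= max_compactness


field = [(0, 0), (10, 0), (10, 10), (0, 10)]        # large square
sliver = [(0, 0), (10, 0), (10, 0.05), (0, 0.05)]   # long, thin strip
small_plot = [(0, 0), (0.5, 0), (0.5, 0.5), (0, 0.5)]  # small but compact
```

Only the long thin strip is flagged. The small square passes the area test but is too compact to be treated as a sliver.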
5 Area-on-area overlay operations on raster data
5.1 Overlay analysis on raster data is easy!
If two grids are aligned and have the same grid cell size then it is relatively easy to
perform overlay operations. A new layer of values is produced from each pair of
coincident cells. The values of these cells can be added, subtracted, divided or multiplied,
the maximum value can be extracted, mean value calculated, a logical expression
computed and so on. The output cell simply takes on a value equal to the result of the
calculation.
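The cell-by-cell idea can be shown in a few lines. This is a minimal sketch assuming two small aligned grids stored as lists of rows. The grid values and the function name overlay are invented for illustration.

```python
import operator


def overlay(grid_a, grid_b, op):
    """Apply op to each pair of coincident cells of two aligned grids."""
    if len(grid_a) != len(grid_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    ):
        raise ValueError("grids must have the same number of rows and columns")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


A = [[1, 2], [2, 1]]
B = [[1, 1], [2, 2]]

C = overlay(A, B, operator.add)   # A + B  -> [[2, 3], [4, 3]]
D = overlay(A, B, operator.mul)   # A * B  -> [[1, 2], [4, 2]]
M = overlay(A, B, max)            # cell-wise maximum -> [[1, 2], [2, 2]]
```

Any function of two values can be passed as op, which is exactly why the operation is so easy to apply and so easy to misuse.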
e-Tutorial Exercise 1
Time: 20 mins
Let us return to Klinkenberg's excellent demonstrations of GIS
operations.
http://www.geog.ubc.ca/courses/klink/java/java_examples.html
Use the Binary Overlays demonstration to investigate the
effects of different Boolean operators on two layers.
Figure 9. Some mathematical operators for overlay operations on the raster data model.
Input layers A and B are combined to give the following output layers:
Simple addition: A + B = C
Multiplication: A * B = D
Unique conditions (output layer E):
If A = 1, B = 1 then E = 1
If A = 2, B = 1 then E = 2
If A = 1, B = 2 then E = 3
If A = 2, B = 2 then E = 4
5.2 Difficulties of overlay on raster data models
The main problems are not technical GIS problems; they are data problems.
5.2.1 What do the numbers mean?
The simplicity of the operator makes the overlay process very easy to implement.
Problems usually start with interpreting the outputs. For example, identifying an area that
meets criteria on two inputs (intersection) can be done in one of two ways. The most logical
approach is to reclass the cell values in each layer as either 0 or 1 to indicate whether
they meet the criteria or not and then multiply. The extra effort to reclassify 2 layers is
time consuming and many analysts will seek to multiply the inputs and then reclassify the
output. This reduces the effort but might not always produce a set of unambiguous output
values. For example, if the farmer interested in identifying that part of a field that has
a loam soil has coded loam soil as 3 and the field number as 2, then he should
look for cells in the output layer with a value of 6. The problem is that other
combinations of inputs can generate the same value, e.g. 2 and 3, or 6 and 1. The problem is
caused by the analyst and in my experience it happens far too frequently with the results
being published without anyone being aware of the consequences. Perhaps this is
because of the widespread availability of such easy-to-use operators.
The problem of an output layer not having unique records is not restricted to multiplication.
Jenks has illustrated how the same problem can be caused by addition and it is easy to
show the problem arises in all mathematical operators if the analyst is unaware of the
meaning behind the data.
Figure 10. Ambiguities in the output of overlay operations on raster data.
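The ambiguity in the farmer example is easy to check. The sketch below assumes soil codes 1 to 6 and field numbers 1 to 3, which are invented ranges. It lists every input pair whose product is 6, then shows the two safer alternatives: reclassify to 0/1 before multiplying, or give every input pair its own output code in the manner of a unique conditions overlay. The function names reclass and unique_code are hypothetical.

```python
from itertools import product

soil_codes = [1, 2, 3, 4, 5, 6]   # assumed range; 3 = loam
field_codes = [1, 2, 3]           # assumed range; 2 = the farmer's field

# Every (soil, field) pair whose product is 6: only (3, 2) is the target.
ambiguous = [(s, f) for s, f in product(soil_codes, field_codes) if s * f == 6]
# -> [(2, 3), (3, 2), (6, 1)]


def reclass(value, target):
    """1 where the cell meets the criterion, 0 elsewhere."""
    return 1 if value == target else 0


# Reclass each layer to 0/1 first, then multiply: only the target survives.
hits = [(s, f) for s, f in product(soil_codes, field_codes)
        if reclass(s, 3) * reclass(f, 2) == 1]
# -> [(3, 2)]


def unique_code(soil, field, n_field_codes=len(field_codes)):
    """One distinct output code for each distinct pair of input codes."""
    return (soil - 1) * n_field_codes + field


codes = {unique_code(s, f) for s, f in product(soil_codes, field_codes)}
```

With 6 soil codes and 3 field codes the unique coding yields 18 distinct output values, one per combination, so no two input pairs can be confused.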
5.2.2 Constructing area entities
A problem in the overlay analysis of raster data models is the correct identification of area
features in the output because, unlike polygon-on-polygon analysis on the vector data
model, there is no intuitive geometry. Each cell is processed individually and the analyst
has to create new geometries based on only the new cell attributes. The operator does
not distinguish between area entities and surface entities. The analyst is faced with at
least 2 questions: (i) has the mathematical computation produced unambiguous cell
values, i.e. does each value have a single, distinctive meaning? and (ii) should diagonally
adjacent cells with the same value be part of the same area feature in the output, or should
adjacency be defined in terms of horizontal and vertical neighbouring cells only?
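The effect of the adjacency decision in question (ii) can be seen by counting area features under both rules. This is a minimal sketch using a simple flood fill. The grid and the function name count_regions are invented for illustration.

```python
def count_regions(grid, target, diagonal=False):
    """Count connected groups of cells equal to target.

    diagonal=False uses 4 neighbours (edge-sharing only);
    diagonal=True uses 8 neighbours (edges and corners).
    """
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or (r, c) in seen:
                continue
            regions += 1          # found a cell of a new area feature
            stack = [(r, c)]
            seen.add((r, c))
            while stack:          # flood fill the rest of the feature
                cr, cc = stack.pop()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == target
                            and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
    return regions


grid = [
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
four_way = count_regions(grid, 1)                   # 2 area features
eight_way = count_regions(grid, 1, diagonal=True)   # 1 area feature
```

The same output layer therefore contains either two area features or one, depending only on the analyst's choice of rule.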
5.2.3 Cell resolution
Resolution is the pixel, grid cell or mesh size of spatial data. For example, remotely
sensed multispectral data from the SPOT satellite has a resolution of 20m. This means
that each pixel in the image represents a ground area of size 20m by 20m. Imagine you
wish to overlay a SPOT image with a raster representation of urban fox population
densities which has been coded with a 10m pixel size. Will your output have a resolution
of 10m or 20m or 30m?
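Before two grids can be overlaid cell by cell they have to share a cell size. The sketch below shows only the mechanics of one option, splitting each 20m cell into four 10m cells that carry the same value. It is not offered as the answer to the question above. The function name disaggregate and the cell values are invented, and copying values in this way adds no real detail to the coarser layer.

```python
def disaggregate(grid, factor):
    """Split every cell into factor x factor smaller cells with the same value."""
    out = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        out.extend([list(fine_row) for _ in range(factor)])
    return out


spot_20m = [[5, 7],
            [6, 8]]
spot_10m = disaggregate(spot_20m, 2)
# [[5, 5, 7, 7],
#  [5, 5, 7, 7],
#  [6, 6, 8, 8],
#  [6, 6, 8, 8]]
```

The output grid has 10m cells, but the information in it is still only as fine as the original 20m data.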
e-Tutorial Exercise 2
Time: 20 mins
Let us visit Klinkenberg's excellent demonstrations of GIS
operations.
http://www.geog.ubc.ca/courses/klink/java/java_examples.html
Can you solve the problem posed in the CROSSTAB and
reclassification demonstration? This requires an
understanding of how new output values are created for each
unique combination of input values.
6 Some more issues in overlay analysis
6.1 Scales of measurement
GIS gives you immense flexibility in the way you can overlay raster data - probably too
much flexibility for the casual user. There are some computations you can achieve which
simply do not make sense! For instance imagine you have one map layer coded with
different soil types given codes such as 1 (clay) or 5 (loam). And you have a second map
layer with rainfall totals. It is perfectly possible to add, subtract, multiply etc. these two
map layers, but in all cases the answers are nonsense. Why? Because the two sets of
data have been collected using different scales of measurement. Rainfall is generally
measured using values on what is known as a RATIO scale, and soil classes are
NOMINAL or categorical data. The rainfall values are fine, you can add, subtract, divide
etc. using ratio numbers, but you cannot apply these operators to nominal scale numbers.
Using a nominal scale, the numbers allocated are simply labels: they may as well be
letters A - E. 5 on the soil scale is not five times larger than 1, neither is it one unit more
than 4. In fact, the only thing we can say about different values on a nominal scale is
that the property is 'different'. Therefore, multiplying, or adding soil type to rainfall
produces a meaningless result.
So, although the GIS will let you perform these operations, it will not tell you when they
produce meaningless answers. It is important, therefore, to know what scale of
measurement has been used for the measurement of your data. Scales of measurement
are summarised below together with the details of the operations possible on each type of
data. Although the problem is often associated with analysis on raster data models,
because many vector GIS are supported by a database that recognises alphanumeric
values, you should be aware that knowledge of measurement scale is fundamental to any
data processing work.
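Since the GIS will not warn you, one option is to record the scale of measurement alongside each layer and check it yourself before combining layers. The sketch below is a hypothetical guard, not a feature of any particular GIS. The operation names and the function check_operation are invented, and the permitted operations follow those listed for each level in Table 2.

```python
# Operations that are meaningful at each level of measurement (after Table 2).
ALLOWED = {
    "nominal": {"equal"},
    "ordinal": {"equal", "rank"},
    "interval": {"equal", "rank", "add", "subtract"},
    "ratio": {"equal", "rank", "add", "subtract", "multiply", "divide"},
}


def check_operation(operation, *scales):
    """Raise ValueError if operation is not meaningful for every input scale."""
    for scale in scales:
        if operation not in ALLOWED[scale]:
            raise ValueError(f"'{operation}' is not meaningful on {scale} data")
    return True


check_operation("add", "ratio", "ratio")          # rainfall + rainfall: fine

try:
    check_operation("add", "nominal", "ratio")    # soil type + rainfall
except ValueError as error:
    message = str(error)   # "'add' is not meaningful on nominal data"
```

Adding soil type to rainfall is rejected before any cell values are touched.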
The table below summarises the operations possible on the different types of
measurement (adapted from Unwin, 1981):
Table 2. Scales of measurement.
Level     Basic operations               Examples
Nominal   Frequency (count),             Name (of person or road), Address (postcode),
          Recognition of equality        House type (detached, semi-detached, terrace,
                                         apartment), Colour
Ordinal   Determination of order (rank)  Grade (A-B-C-D-E), Tax band (High rate,
                                         Standard rate, Basic rate)
Interval  Addition, subtraction          Temperature (degrees Celsius), date
Ratio     Addition, subtraction,         Distance, rainfall, income
          multiplication, division

Scales of Measurement summary
Nominal: such as I.D. number or soil type. Such numbers have no meaning;
they simply represent distinct categories. So, cities given a reference number,
or telephone numbers, are examples of measurements on a nominal scale. The
only relationship between numbers on a nominal scale is one of identity.
Ordinal: such as positions in a competition - the order is important. It is
possible to rank data, but we know nothing about other numerical relationships
between data. For example, we can rank cities in terms of their population
totals, with city number 1 having the highest total. However, the city with rank 2
will not necessarily have a population half that of the city with rank 1, but we do
know that the population of city 2 is smaller than that of city 1.
Interval: such as temperature measured in centigrade or Fahrenheit. There is
no real zero but intervals between integers are equal. Temperature data, in
common with other interval data, can be added and subtracted (for example to
find daily temperature range from the maximum and minimum) but we cannot
say 20 °C is twice as hot as 10 °C.
Ratio: such as distance. There is a real zero, negatives are possible, intervals
between numbers are equal and so is order. Ratio scale data can be added and
subtracted and have ratio properties. Thus we can say 20/10 equals 30/15.
Each scale has the property described by its name and, for the scales listed
below nominal, has all the properties of the scales listed above it.
6.2 Scale and overlay analysis
Berry (1991) identified scale as a cause of error that may be incorporated almost
effortlessly into overlay analysis. Overlay analysis can be implemented on any pair of
inputs if they cover the same spatial extent. Many analysts, however, ignore the
consequences of combining data of different scales. Two maps at very different scales
are frequently the product of very different data modelling exercises e.g. the GB Ordnance
Survey produces 1:10,000, 1:50,000 and 1:250,000 data products that have been
maintained separately and are designed for different purposes.
7 What have you learnt in this lesson?
The integration of spatial data is at the heart of GIS and area-on-area and weighted
overlay epitomise the analysis of multiple data layers. They were some of the first
operators implemented in GIS in its early years and have attracted considerable attention
both from researchers, who have extended the range of algorithms and investigated the
consequences of different algorithms, and from software developers, who have improved
the efficiency with which the algorithms are implemented.
Vector algorithms for area-on-area overlay analysis are elegant and intuitive but
computationally demanding. They also produce sliver polygons in their thousands. These
problems can be reduced by adopting a set of heuristics and implementing additional
processing to clean up the unwanted artefacts. It remains highly desirable to design a
smart strategy when using polygon overlay. Even so, I suspect that more overlay analysis
is performed on the raster data model.
Area-on-area and weighted overlay are simple and quick to apply to the raster data model
if the grids are aligned and of equal cell size. The inherent weaknesses of the raster data
model become apparent in post-processing when the analyst might be faced with making
some arbitrary decisions as to the meaning of the output.