SPATIAL DATA STRUCTURES
INTRODUCTION
Spatial data structures refer to the organization of
spatial data in a form suitable for digital computers.
The choice of an optimal data structure depends on the
nature of the data and how they are used.
RASTER STRUCTURES
FULL RASTER STRUCTURE
A rectangular array of pixel values, in which the row
and column coordinates define a particular location.
Most digital image processing systems use full raster
structures.
The structures differ from one another mainly in the way
that attribute data are organized and represented.
The sequencing of pixels in a full raster is usually by
row order, starting in the upper left and scanning
left-to-right, top-to-bottom.
The full raster structure can be organized as:
Band sequential (BSQ)
The values of a single attribute are arranged in row order.
If there is more than one attribute, the second attribute starts where the first attribute finishes.
Band interleaved by line (BIL)
Each row of pixels is stored m times, once per attribute, where m is the number of attributes, before moving to the next row.
Band interleaved by pixel (BIP)
The band values of each pixel are stored together, so that for a 7-attribute image the first seven values refer to the first pixel, followed by the next seven values of the second pixel, and so on.
Band interleaved by pixel (BIP) and band
interleaved by line (BIL) formats are advantageous
for operations involving the combination of images,
because the physical addresses of the same pixel
are close together.
For very rapid display of single attributes from a
large multi-band dataset, band sequential (BSQ) is
more efficient, because all values of one attribute are stored contiguously.
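The remark about physical addresses can be made concrete with a small sketch. It computes the position of one value in the one-dimensional stream for each of the three layouts. The function names, the zero-based indexing and the 1000 x 1000 x 7 image size are assumptions chosen for illustration; they do not come from any particular file format specification.

```python
def bsq_offset(row, col, band, n_rows, n_cols, n_bands):
    """Band sequential: one complete band after another."""
    return band * n_rows * n_cols + row * n_cols + col


def bil_offset(row, col, band, n_rows, n_cols, n_bands):
    """Band interleaved by line: each row is stored once per band."""
    return row * n_bands * n_cols + band * n_cols + col


def bip_offset(row, col, band, n_rows, n_cols, n_bands):
    """Band interleaved by pixel: all band values of a pixel together."""
    return (row * n_cols + col) * n_bands + band


# Hypothetical image: 1000 rows, 1000 columns, 7 bands.
shape = (1000, 1000, 7)

# Distance in the stream between band 0 and band 6 of the same pixel.
gaps = {
    name: f(10, 20, 6, *shape) - f(10, 20, 0, *shape)
    for name, f in [("BSQ", bsq_offset), ("BIL", bil_offset), ("BIP", bip_offset)]
}
print(gaps)
```

Under these assumptions the seven values of one pixel are adjacent in BIP, one row apart in BIL, and a whole band apart in BSQ, which is why combining bands favours BIP and BIL while reading a single band favours BSQ.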
Raster Data Structures:
Raster Array Representations for multiple layers
Raster data comprise rows and columns for one or more characteristics (layers or arrays),
e.g. elevation, rainfall, and temperature, or multiple spectral channels (bands) for remotely sensed data.
How should these be organized into a one-dimensional data stream for computer storage and processing?
Band Sequential (BSQ)
each characteristic in a separate file
elevation file, temperature file, etc.
good for compression
good if focus on one characteristic
bad if focus on one area
Band Interleaved by Pixel (BIP)
all measurements for a pixel grouped together
good if focus on multiple characteristics of geographical area
bad if want to remove or add a layer
Band Interleaved by Line (BIL)
rows follow each other for each characteristic
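Example: a 2 x 2 raster with three layers (rows shown top to bottom)
Veg:        B    B
            A    B
Soil:       III  IV
            I    II
Elevation:  150  160
            120  140
BSQ (one file per layer):
File 1: Veg A,B,B,B
File 2: Soil I,II,III,IV
File 3: El. 120,140,150,160
BIP: A,I,120, B,II,140, B,III,150, B,IV,160
BIL: A,B,I,II,120,140, B,B,III,IV,150,160
Note that we start in the lower left.
Starting in the upper left is the alternative.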
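The three orderings of the small example can be reproduced with a short sketch. The layer values are taken from the example; the function names and the use of nested lists are assumptions made for illustration. Rows are listed in storage order, so the first row is the bottom row of the map, matching the lower-left start used above. Unlike the example, `to_bsq` returns one concatenated stream rather than three separate files.

```python
# Layers from the example, rows in storage order (bottom row of the map first).
layers = {
    "veg": [["A", "B"], ["B", "B"]],
    "soil": [["I", "II"], ["III", "IV"]],
    "elev": [[120, 140], [150, 160]],
}


def to_bsq(layers):
    """Band sequential: every value of one layer, then the next layer."""
    return [v for band in layers.values() for row in band for v in row]


def to_bil(layers):
    """Band interleaved by line: row r of every layer, then row r + 1."""
    bands = list(layers.values())
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]


def to_bip(layers):
    """Band interleaved by pixel: every layer value of one pixel together."""
    bands = list(layers.values())
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    return [band[r][c] for r in range(n_rows) for c in range(n_cols) for band in bands]


print(to_bsq(layers))
print(to_bil(layers))
print(to_bip(layers))
```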
RASTER STRUCTURES
RUN-LENGTH ENCODING
The storage requirements for full raster images grow geometrically as pixel size decreases, causing storage space problems.
Compression methods are therefore needed.
Run-length encoding is a simple data structure that can reduce the space requirements of some images drastically.
It is efficient for image display and for some processing algorithms.
Adjacent pixels having the same value are combined together as a run, represented as a pair of numbers.
Each run pair consists of a number for the length of the run in pixels, followed by a second number for the attribute value of the run.
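A minimal sketch of this (length, value) form of run-length encoding for a single row is shown here. The function names and the sample row are assumptions made for illustration.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse runs of equal adjacent values into (length, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode(pairs):
    """Expand (length, value) pairs back into the original row."""
    return [value for length, value in pairs for _ in range(length)]


row = [1, 1, 1, 2, 2, 3]          # hypothetical row of pixel values
pairs = rle_encode(row)
print(pairs)
print(rle_decode(pairs) == row)
```

The saving depends on the image: long runs compress well, while a row in which every pixel differs from its neighbour doubles in size.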
Raster Data Structures: Run-length Compression (for a single layer)
Full matrix: 162 bytes
111111122222222223
111111122222222233
111111122222222333
111111222222223333
111113333333333333
111113333333333333
111113333333333333
111333333333333333
111333333333333333
Run length (by row): 44 bytes
1,7,2,17,3,18
1,7,2,16,3,18
1,7,2,15,3,18
1,6,2,14,3,18
1,5,3,18
1,5,3,18
1,5,3,18
1,3,3,18
1,3,3,18
This is “value thru column” coding: the 1st number of each pair is the value, the 2nd is the last column with that value.
This is a “lossless” compression, as opposed to “lossy,” since the original data can be exactly reproduced.
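The figures in this example can be checked with a short sketch of "value thru column" coding. The matrix rows are copied from the example; the function names are assumptions, and the byte counts assume one byte per stored number.

```python
rows = [
    "111111122222222223",
    "111111122222222233",
    "111111122222222333",
    "111111222222223333",
    "111113333333333333",
    "111113333333333333",
    "111113333333333333",
    "111333333333333333",
    "111333333333333333",
]


def encode_value_thru_column(row):
    """Return a flat list [value, last_column, value, last_column, ...]."""
    codes = []
    for col, ch in enumerate(row, start=1):   # columns are numbered from 1
        value = int(ch)
        if codes and codes[-2] == value:
            codes[-1] = col                   # extend the current run
        else:
            codes += [value, col]             # start a new run
    return codes


def decode_value_thru_column(codes):
    """Rebuild the row of values from [value, last_column, ...] pairs."""
    row, start = [], 0
    for value, last in zip(codes[0::2], codes[1::2]):
        row += [value] * (last - start)
        start = last
    return "".join(str(v) for v in row)


encoded = [encode_value_thru_column(r) for r in rows]
full_size = sum(len(r) for r in rows)         # one number per pixel
coded_size = sum(len(c) for c in encoded)     # one number per code entry
print(full_size, coded_size)
print(all(decode_value_thru_column(c) == r for c, r in zip(encoded, rows)))
```

The round trip through `decode_value_thru_column` is what makes the scheme lossless: every row is recovered exactly.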
VECTOR DATA STRUCTURE
SPAGHETTI STRUCTURE
Tables of locational coordinates are associated with each of the spatial objects (points, lines, or polygons)
No topological attributes are used, so that navigating around a map must be accomplished by searching lists of spatial coordinates
Costly for search operations, but efficient for display purposes
Separate tables are used for points, lines and polygons
Linkages between objects are determined by computations from the spatial coordinates
Sometimes called unstructured because topological relationships must be derived through computation
Point tables
Each point is a row of the table, with the locational
attributes as columns
Lines are strings of connected straight-line segments
defined by ordered sequences of points or vertices.
Polygon tables are similar to line tables, except that the
last vertex is the same as the first vertex
SPAGHETTI STRUCTURE
point table
ID#  X   Y   A1   A2   …  An
1    X1  Y1  a11  a12  …  a1n
2    X2  Y2  a21  a22  …  a2n
3    X3  Y3  a31  a32  …  a3n
m    Xm  Ym  am1  am2  …  amn
Point table. X and Y are locational coordinates. A1, A2, … An are thematic attributes. Each record or row is a single point object.
SPAGHETTI STRUCTURE
line table
1 5 2 7      Header for line 1
X1 Y1        Coordinates of vertices for line 1
X2 Y2
X3 Y3
X4 Y4
X5 Y5
• Many lines are held in the same file
• Each new line begins with a header
• The records after the header contain the locational coordinates of the vertices or points defining
the line
• In the header, the first field is the line ID#, the second field is the number of vertices, and the third
and following fields are attributes
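Reading such a file can be sketched as follows. The exact text layout is an assumption: each header is taken to be one whitespace-separated record, followed by one "x y" record per vertex. The function name `parse_line_table` and the sample coordinates are hypothetical; only the header `1 5 2 7` comes from the table above.

```python
def parse_line_table(text):
    """Parse header records followed by their vertex records."""
    records = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    lines, i = [], 0
    while i < len(records):
        # Header: line ID#, number of vertices, then any attributes.
        line_id, n_vertices, *attrs = (int(tok) for tok in records[i])
        vertices = [(float(x), float(y)) for x, y in records[i + 1 : i + 1 + n_vertices]]
        lines.append({"id": line_id, "attrs": attrs, "vertices": vertices})
        i += 1 + n_vertices              # jump to the next header
    return lines


# Hypothetical file contents: two lines held in the same file.
sample = """
1 5 2 7
0 0
1 1
2 1
3 2
4 2
2 2 9 4
5 5
6 5
"""

parsed = parse_line_table(sample)
print(parsed[0]["id"], parsed[0]["attrs"], len(parsed[0]["vertices"]))
```

Because the header carries the vertex count, the reader never has to search for the end of a line; this sequential layout is what makes the structure convenient for plotting.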
SPAGHETTI STRUCTURE
polygon table
• The same as for the lines, except that the last vertex has the same coordinates
as the first vertex
• Each polygon may have many attributes, in which case the attribute data are
held in a separate table, linked by polygon number
• One attribute must define priority for plotting to take care of the presence of
islands
Spaghetti data structure: example
Example: points, lines and polygons are stored separately
Polygon 1    Polygon 2
X1,Y1        X8,Y8
X2,Y2        X9,Y9
X3,Y3        X10,Y10
X4,Y4        X11,Y11
X5,Y5        X3,Y3
X6,Y6        X2,Y2
X7,Y7        X1,Y1
X1,Y1        X8,Y8
[Figure: polygons 1 and 2 drawn with their labelled boundary points]
For each polygon, we store an ordered list of the coordinates of the points on its
boundary
Spaghetti data structure: remarks
NOTE 1: the coordinates of the points along the common boundary, here (X1,Y1), (X2,Y2) and (X3,Y3), are recorded twice.
Redundancy: if we update the coordinates of a point, we need to update them everywhere they appear.
Spaghetti data structure: remarks (cont'd)
NOTE 2: there is no easy way of answering queries such as: “Do Polygon 1 and 2 share a common bounding line?”
We need to compare all coordinates of the points of Polygon 1 with those of Polygon 2 and check whether two consecutive points are the same in both lists, which is inefficient.
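The brute-force comparison can be sketched as follows. The point lists are those of the two example polygons; the symbolic pairs such as ("X1", "Y1") stand in for real coordinates, and the names `P` and `shared_segments` are hypothetical.

```python
def P(i):
    """Placeholder for the coordinate pair (Xi, Yi) of the example."""
    return (f"X{i}", f"Y{i}")


polygon_1 = [P(i) for i in (1, 2, 3, 4, 5, 6, 7, 1)]
polygon_2 = [P(i) for i in (8, 9, 10, 11, 3, 2, 1, 8)]


def shared_segments(poly_a, poly_b):
    """Compare every boundary segment of poly_a with every one of poly_b."""
    shared = []
    for a1, a2 in zip(poly_a, poly_a[1:]):
        for b1, b2 in zip(poly_b, poly_b[1:]):
            # The same segment may be stored in either direction.
            if (a1, a2) == (b1, b2) or (a1, a2) == (b2, b1):
                shared.append((a1, a2))
    return shared


common = shared_segments(polygon_1, polygon_2)
print(common)
```

The nested loop makes the cost proportional to the product of the two boundary lengths, and every polygon pair of interest has to be tested this way because no adjacency information is stored.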
Advantages
The sequential organization is efficient for digital plotting
Disadvantages
Redundancy because of repetition of polygon
boundaries
Computational expense due to the absence of
topological attributes
VECTOR DATA STRUCTURE
TOPOLOGICAL DATA STRUCTURE
Points are vertices
A line is a sequence of ordered vertices, where the beginning of the line is a special vertex called a start node and the end is a special vertex called an end node
A chain is a line which is part of one or more polygons; chains are also called arcs or edges
A node is a point where lines or chains meet or terminate
A polygon consists of one outer ring and zero or more inner rings
A ring consists of one or more chains.
A simple polygon has no inner rings
A non-simple or complex polygon has one or more inner rings and is said to have “holes” or “islands”
The advantages of this structure over the spaghetti
structure are:
There is no repetition of spatial coordinates between
one polygon and the next, except at nodes, so that the
repeat lines are eliminated
Topological information is explicitly stored and is
separated from the spatial coordinates, facilitating
searches that require adjacency, containment and
connectivity information
Polygon # Ring # Ring sequence #
1 2 1
2 1 1
2 3 2
3 3 1
A- polygon topology table
Ring # Chain # Chain sequence #
2 3 1
2 2 2
1 2 1
1 4 2
3 1 1
3 5 2
B- ring topology table
Chain # Start node Stop node Left polygon Right polygon
1 1 2 2 3
2 3 4 1 2
3 4 3 1 0
4 4 3 0 1
5 1 2 3 2
C- chain topology table
Node# Vertex#
1 14
2 11
3 1
4 3
D- node-to-vertex table
Chain# Vertex# Vertex sequence#
1 14 1
1 9 2
1 10 3
1 11 4
2 1 1
2 7 2
2 8 3
2 3 4
3 3 1
3 2 2
3 1 3
4 3 1
E- chain-to-vertex table
Vertex# X Y
1 X1 Y1
2 X2 Y2
3 X3 Y3
4 X4 Y4
… .. ..
14 X14 Y14
F- coordinates of vertices table
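With the topology stored explicitly, the adjacency query that was expensive in the spaghetti structure reduces to a lookup in the chain topology table (table C). The sketch below copies that table; the function names are hypothetical, and treating polygon 0 as the area outside all polygons is an assumption, since the tables do not say what 0 denotes.

```python
# Chain topology table (C): chain -> (start node, stop node, left polygon, right polygon)
chains = {
    1: (1, 2, 2, 3),
    2: (3, 4, 1, 2),
    3: (4, 3, 1, 0),
    4: (4, 3, 0, 1),
    5: (1, 2, 3, 2),
}


def shared_chains(p, q):
    """Chains that have polygon p on one side and polygon q on the other."""
    return [c for c, (_, _, left, right) in chains.items() if {left, right} == {p, q}]


def neighbours(p, outside=0):
    """Polygons adjacent to p, ignoring the assumed outside region."""
    found = set()
    for _, _, left, right in chains.values():
        if p in (left, right):
            found |= {left, right} - {p, outside}
    return found


print(shared_chains(1, 2))
print(shared_chains(2, 3))
print(neighbours(2))
```

No coordinates are read at all: the query touches one row per chain, and the vertex tables (E and F) are only needed when the geometry itself has to be drawn or measured.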