SPATIAL DATA STRUCTURES - Delta Univdeltauniv.edu.eg/.../ch-3-SPATIAL-DATA-STRUCTURES.pdf · SPAGHETTI STRUCTURE Tables of locational coordinates are associated with each of the spatial

SPATIAL DATA STRUCTURES

INTRODUCTION

Spatial data structures refer to the organization of spatial data in a form suitable for digital computers

Choice of an optimal data structure depends on the nature of the data and how they are used

RASTER STRUCTURES

FULL RASTER STRUCTURE

A rectangular array of pixel values, in which the row and column coordinates define a particular location

Most digital image processing systems use full raster structures.

The structures differ from one another mainly in the way that attribute data are organized and represented.

The sequencing of pixels in a full raster is usually by row order, starting in the upper left and scanning left-to-right, top-to-bottom

The full raster structure can be organized as:

Band sequential (BSQ)

The values of a single attribute are arranged in row order.

If there is more than one attribute, the second attribute starts where the first attribute finishes

Band interleaved by line (BIL)

Each row of pixels is repeated m times where m is the number of attributes, before moving to the next row

Band interleaved by pixel (BIP)

The band values of each pixel are stored together, so that for a 7-attribute image the first seven values refer to the first pixel, followed by the next seven values of the second pixel, and so on.

Band interleaved by pixel (BIP) and band interleaved by line (BIL) formats are advantageous for operations involving the combination of images, because the physical addresses of the same pixel are close together.

For very rapid display of single attributes from a large multi-band dataset, band sequential (BSQ) is more efficient
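
The locality argument above can be made concrete by computing where a value lands in the one-dimensional stream. The following is a minimal Python sketch, assuming zero-based band, row and column indices and one storage slot per value. The function names are illustrative and do not come from any particular image processing package.

```python
def bsq_offset(band, row, col, n_bands, n_rows, n_cols):
    """Band sequential: a whole band is stored before the next band starts."""
    return band * n_rows * n_cols + row * n_cols + col


def bil_offset(band, row, col, n_bands, n_rows, n_cols):
    """Band interleaved by line: one row of every band, then the next row."""
    return row * n_bands * n_cols + band * n_cols + col


def bip_offset(band, row, col, n_bands, n_rows, n_cols):
    """Band interleaved by pixel: all band values of a pixel are adjacent."""
    return (row * n_cols + col) * n_bands + band


if __name__ == "__main__":
    # Hypothetical 7-band image of 1000 x 1000 pixels; look at pixel (10, 20).
    dims = (7, 1000, 1000)
    for name, fn in [("BSQ", bsq_offset), ("BIL", bil_offset), ("BIP", bip_offset)]:
        offsets = [fn(band, 10, 20, *dims) for band in range(dims[0])]
        print(name, offsets)
```

Under these assumptions the seven values of one pixel sit in consecutive slots for BIP, one row length apart for BIL, and one full band apart for BSQ, which is why BSQ favours reading a single attribute and the interleaved layouts favour combining attributes.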

Raster Data Structures:

Raster Array Representations for multiple layers

Raster data comprise rows and columns, with one or more characteristics (arrays) per cell:

elevation, rainfall, & temperature; or multiple spectral channels (bands) for remotely sensed data

How should they be organized into a one-dimensional data stream for computer storage & processing?

Band Sequential (BSQ)

each characteristic in a separate file

elevation file, temperature file, etc.

good for compression

good if focus on one characteristic

bad if focus on one area

Band Interleaved by Pixel (BIP)

all measurements for a pixel grouped together

good if focus on multiple characteristics of geographical area

bad if want to remove or add a layer

Band Interleaved by Line (BIL)

for each row, the values of every characteristic are stored in turn before moving to the next row

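Example: a 2 x 2 raster with three layers (rows shown top to bottom)

Veg          Soil         Elevation
B   B        III  IV      150  160
A   B        I    II      120  140

Band Sequential (BSQ):

File 1: Veg A,B,B,B

File 2: Soil I,II,III,IV

File 3: El. 120,140,150,160

Band Interleaved by Pixel (BIP):

A,I,120, B,II,140, B,III,150, B,IV,160

Band Interleaved by Line (BIL):

A,B,I,II,120,140, B,B,III,IV,150,160

Note that we start in the lower left.

Starting in the upper left is the alternative.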
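
The three orderings of the Veg, Soil and Elevation example can be reproduced with a few lines of Python. This is a sketch only: each layer is assumed to be a list of rows already given in storage order (lower-left pixel first, as in the example), and the function names `to_bsq`, `to_bip` and `to_bil` are made up for illustration.

```python
# Layers in storage order: first row is the bottom row of the example grid.
veg = [["A", "B"], ["B", "B"]]
soil = [["I", "II"], ["III", "IV"]]
elev = [[120, 140], [150, 160]]
layers = [veg, soil, elev]


def to_bsq(layers):
    """One flat list ("file") per layer, each in row order."""
    return [[value for row in layer for value in row] for layer in layers]


def to_bip(layers):
    """All layer values of a pixel together, pixel after pixel."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    return [layer[r][c]
            for r in range(n_rows)
            for c in range(n_cols)
            for layer in layers]


def to_bil(layers):
    """Row r of every layer in turn, then row r + 1 of every layer."""
    n_rows = len(layers[0])
    return [value
            for r in range(n_rows)
            for layer in layers
            for value in layer[r]]


if __name__ == "__main__":
    print("BSQ:", to_bsq(layers))
    print("BIP:", to_bip(layers))
    print("BIL:", to_bil(layers))
```

The only difference between the three functions is the nesting order of the loops over layers, rows and columns, which is the whole of the design choice between BSQ, BIP and BIL.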

RASTER STRUCTURES

RUN-LENGTH ENCODING

The storage requirements for full raster images increase geometrically as the pixel size decreases, causing storage space problems

This requires compression methods

Run-length encoding is a simple data structure that can reduce the space requirements of some images drastically.

It is efficient for image display and for some processing algorithms

Adjacent pixels having the same value are combined together as a run, represented as a pair of numbers.

Each run pair consists of a number for the length of the run in pixels, followed by a second number for the attribute value of the run
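
The run-pair idea described above can be sketched in Python as follows. This is a minimal illustration, assuming a row is any sequence of comparable attribute values; the names `rle_encode` and `rle_decode` are chosen for the example.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse adjacent equal values into (run length, attribute value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]


def rle_decode(pairs):
    """Expand (run length, attribute value) pairs back into the full row."""
    return [value for length, value in pairs for _ in range(length)]


if __name__ == "__main__":
    row = [1, 1, 1, 2, 2, 3]
    pairs = rle_encode(row)
    print(pairs)               # [(3, 1), (2, 2), (1, 3)]
    print(rle_decode(pairs))   # [1, 1, 1, 2, 2, 3]
```

The saving depends on the image: long runs compress well, while a row in which every pixel differs from its neighbour needs two numbers per pixel and therefore grows.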

Raster Data Structures: Run-length Compression (for single layer)

Full Matrix--162 bytes      Run Length (row)--44 bytes

111111122222222223          1,7,2,17,3,18
111111122222222233          1,7,2,16,3,18
111111122222222333          1,7,2,15,3,18
111111222222223333          1,6,2,14,3,18
111113333333333333          1,5,3,18
111113333333333333          1,5,3,18
111113333333333333          1,5,3,18
111333333333333333          1,3,3,18
111333333333333333          1,3,3,18

“Value thru column” coding. 1st number is value, 2nd is last column with that value.

This is a “lossless” compression, as opposed to “lossy,” since the original data can be exactly reproduced.
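
The “value thru column” variant differs from length/value pairs in that it stores the last column of each run rather than its length. The sketch below applies it to the 9 x 18 matrix from the example and checks the byte counts and the lossless round trip. It assumes one byte per stored number, as the slide's counts imply, and the function names are illustrative.

```python
MATRIX = [
    "111111122222222223",
    "111111122222222233",
    "111111122222222333",
    "111111222222223333",
    "111113333333333333",
    "111113333333333333",
    "111113333333333333",
    "111333333333333333",
    "111333333333333333",
]


def encode_row(row):
    """Return [value, last_column] pairs; columns are counted from 1."""
    pairs = []
    for col, value in enumerate(row, start=1):
        if pairs and pairs[-1][0] == value:
            pairs[-1][1] = col          # extend the current run
        else:
            pairs.append([value, col])  # start a new run
    return pairs


def decode_row(pairs):
    """Rebuild the full row from [value, last_column] pairs."""
    row, previous_last = [], 0
    for value, last in pairs:
        row.extend([value] * (last - previous_last))
        previous_last = last
    return row


if __name__ == "__main__":
    rows = [[int(ch) for ch in line] for line in MATRIX]
    encoded = [encode_row(r) for r in rows]
    print("full matrix:", sum(len(r) for r in rows), "numbers")
    print("run length: ", sum(2 * len(e) for e in encoded), "numbers")
    print("lossless:   ", all(decode_row(e) == r for e, r in zip(encoded, rows)))
```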

VECTOR DATA STRUCTURE

SPAGHETTI STRUCTURE

Tables of locational coordinates are associated with each of the spatial objects (points, lines, or polygons)

No topological attributes are used, so that navigating around a map must be accomplished by searching lists of spatial coordinates

Costly for search operations, but efficient for display purposes

Separate tables are used for points, lines and polygons

Linkages between objects are determined by computations from the spatial coordinates

Sometimes called unstructured because topological relationships must be derived through computation


SPAGHETTI STRUCTURE

Point tables

Each point is a row of the table, with the locational attributes as columns

Lines are strings of connected straight-line segments defined by ordered sequences of points or vertices.

Polygon tables are similar to line tables, except that the last vertex is the same as the first vertex

SPAGHETTI STRUCTURE

point table

ID#   X    Y    A1    A2    …   An
1     X1   Y1   a11   a12   …   a1n
2     X2   Y2   a21   a22   …   a2n
3     X3   Y3   a31   a32   …   a3n
m     Xm   Ym   am1   am2   …   amn

Point table. X and Y are locational coordinates. A1, A2, … An are thematic attributes. Each record or row is a single point object

SPAGHETTI STRUCTURE

line table

1   5   2   7     Header for line 1
X1  Y1
X2  Y2
X3  Y3
X4  Y4
X5  Y5            Coordinates of vertices for line 1

• Many lines are held in the same file

• Each new line begins with a header

• The records after the header contain the locational coordinates of the vertices or points defining the line

• The first field of the header is the line ID#, the second field is the number of vertices, and the third, fourth and any further fields are attributes
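
A reader for such a line table might look like the sketch below. The exact file layout is an assumption: a whitespace-separated header record (line ID, number of vertices, then attributes) followed by one "x y" record per vertex. The function name `parse_line_table` and the sample numbers are hypothetical.

```python
def parse_line_table(text):
    """Parse headers and vertex records into a list of line features."""
    records = [ln.split() for ln in text.splitlines() if ln.strip()]
    features, i = [], 0
    while i < len(records):
        line_id, n_vertices, *attributes = records[i]
        n = int(n_vertices)
        # The next n records hold the vertex coordinates of this line.
        vertices = [(float(x), float(y)) for x, y in records[i + 1:i + 1 + n]]
        features.append({
            "id": int(line_id),
            "attributes": attributes,
            "vertices": vertices,
        })
        i += 1 + n
    return features


if __name__ == "__main__":
    sample = """\
1 5 2 7
0 0
1 1
2 1
3 2
4 2
2 2 9 9
0 5
5 5
"""
    for feature in parse_line_table(sample):
        print(feature["id"], feature["attributes"], len(feature["vertices"]))
```

Because the header states the number of vertices, the reader never has to guess where one line ends and the next begins, which suits the sequential access used for plotting.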

SPAGHETTI STRUCTURE

polygon table

• The same as for the lines, except that the last vertex has the same coordinates as the first vertex

• Each polygon may have many attributes, in which case the attribute data are held in a separate table, linked by polygon number

• One attribute must define priority for plotting to take care of the presence of islands

Spaghetti data structure: example

Example: points, lines and polygons are stored separately

Polygon 1      Polygon 2
X1,Y1          X8,Y8
X2,Y2          X9,Y9
X3,Y3          X10,Y10
X4,Y4          X11,Y11
X5,Y5          X3,Y3
X6,Y6          X2,Y2
X7,Y7          X1,Y1
X1,Y1          X8,Y8

[Figure: adjacent polygons 1 and 2 with their vertices labelled X1,Y1 to X11,Y11]

For each polygon, we store an (ordered) list of the coordinates of the points on its boundary

Spaghetti data structure: remarks

Example:

NOTE 1: coordinates of points along common boundary are recorded twice!

Redundancy: if we update coordinates of a point, we need to update them everywhere!


Spaghetti data structure: remarks (cont'd)

Example:

NOTE 2: no easy way of solving queries such as: “Do Polygon 1 and 2 share a common bounding line?”

Need to analyse all coordinates of points of Polygon 1 and compare with those of Polygon 2 and see if two consecutive pairs are the same: inefficient!!
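
The comparison described in NOTE 2 can be sketched as follows. The numeric coordinates are invented stand-ins for X1,Y1 to X11,Y11, vertices are assumed to match only when their coordinates are exactly equal, and the function names are illustrative.

```python
def boundary_segments(ring):
    """Set of undirected segments between consecutive vertices of a closed ring."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


def shared_segments(poly_a, poly_b):
    """Segments that appear in both boundaries, regardless of direction."""
    return boundary_segments(poly_a) & boundary_segments(poly_b)


# Hypothetical coordinates for vertices 1..11 of the example.
P = {1: (4, 0), 2: (4, 2), 3: (4, 4), 4: (2, 5), 5: (0, 4), 6: (0, 1),
     7: (2, 0), 8: (6, 0), 9: (8, 1), 10: (8, 4), 11: (6, 5)}

polygon_1 = [P[i] for i in (1, 2, 3, 4, 5, 6, 7, 1)]
polygon_2 = [P[i] for i in (8, 9, 10, 11, 3, 2, 1, 8)]

if __name__ == "__main__":
    common = shared_segments(polygon_1, polygon_2)
    print("share a boundary:", bool(common))
    print("shared segments: ", len(common))
```

Segments are stored as unordered pairs because the two polygons traverse the common boundary in opposite directions. Even with set lookups, every coordinate of both polygons has to be read for each query, since nothing in the spaghetti structure records which polygons are neighbours.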



SPAGHETTI STRUCTURE

Advantages

The sequential organization is well suited to digital plotting

Disadvantages

Redundancy because of the repetition of polygon boundaries

Computational expense due to the absence of topological attributes


TOPOLOGICAL DATA STRUCTURE

Points are vertices

A line is a sequence of ordered vertices, where the beginning of the line is a special vertex called a start node and the end is a special vertex called an end node

A chain is a line that is part of one or more polygons; chains are also called arcs or edges

A node is a point where lines or chains meet or terminate

A polygon consists of one outer ring and zero or more inner rings

A ring consists of one or more chains.

A simple polygon has no inner rings

A non-simple or complex polygon has one or more inner rings and is said to have “holes” or “islands”



The advantages of this structure over the spaghetti structure are:

There is no repetition of spatial coordinates between one polygon and the next, except at nodes, so that repeated lines are eliminated

Topological information is explicitly stored and is separated from the spatial coordinates, facilitating searches that require adjacency, containment and connectivity information



Polygon #   Ring #   Ring sequence #
1           2        1
2           1        1
2           3        2
3           3        1

A- polygon topology table



Ring #   Chain #   Chain sequence #
2        3         1
2        2         2
1        2         1
1        4         2
3        1         1
3        5         2

B- ring topology table



Chain #   Start node   Stop node   Left polygon   Right polygon
1         1            2           2              3
2         3            4           1              2
3         4            3           1              0
4         4            3           0              1
5         1            2           3              2

C- chain topology table

Node#   Vertex#
1       14
2       11
3       1
4       3

D- node-to-vertex table



Chain#   Vertex#   Vertex sequence#
1        14        1
1        9         2
1        10        3
1        11        4
2        1         1
2        7         2
2        8         3
2        3         4
3        3         1
3        2         2
3        1         3
4        3         1

E- chain-to-vertex table

Vertex#   X     Y
1         X1    Y1
2         X2    Y2
3         X3    Y3
4         X4    Y4
…         ..    ..
14        X14   Y14

F- coordinates of vertices table
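
With the topology stored explicitly, adjacency questions reduce to lookups in the chain topology table (table C), with no coordinate comparison at all. The sketch below copies the rows of table C as printed. Treating polygon 0 as the outside region is an assumption, and the function names are illustrative.

```python
# (chain, start node, stop node, left polygon, right polygon), as in table C.
CHAINS = [
    (1, 1, 2, 2, 3),
    (2, 3, 4, 1, 2),
    (3, 4, 3, 1, 0),
    (4, 4, 3, 0, 1),
    (5, 1, 2, 3, 2),
]


def shared_chains(poly_a, poly_b):
    """Chains that have poly_a on one side and poly_b on the other."""
    return [chain for chain, _start, _stop, left, right in CHAINS
            if {left, right} == {poly_a, poly_b}]


def neighbours(poly):
    """Polygons found on the opposite side of any chain bounding poly."""
    result = set()
    for _chain, _start, _stop, left, right in CHAINS:
        if left == poly:
            result.add(right)
        elif right == poly:
            result.add(left)
    return result


if __name__ == "__main__":
    print("chains between polygons 1 and 2:", shared_chains(1, 2))
    print("chains between polygons 2 and 3:", shared_chains(2, 3))
    print("neighbours of polygon 2:", sorted(neighbours(2)))
```

The question "do Polygon 1 and 2 share a common bounding line?" becomes a scan of one small table, which is the search advantage the topological structure has over the spaghetti structure.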
