Data models and management

May 30, 2018

Hari Prasad
  • 8/14/2019 Data models and management

    1/70

    DATA MODELS & MANAGEMENT - I

    Outline

    Introduction

    Raster Data

    Vector Data

    Raster and Vector Structures

    Raster and Vector Advantages and Disadvantages

    Introduction

    Geographic data and information are the heart of GIS.

    Geographic data have two fundamental components: space (expressed as spatial data) and qualities (attributes).

    Both of these are stored in a database.

    Data and Information Definitions

    Information is the primary purpose of GIS, not just data.

    Data is the input; information is the output.

    Types of data

    Maps

    Images

    Spatial / non-spatial

    Postcodes/ZIP codes

    Oblique photographs

    Videography

    Financial statements

    Films

    Schematic diagrams

    KT1 2EE
    RH8 9AA
    SW1P 3AD

    12,000 23.45 56789
    23,456 12.45 23456
    45,987 29.57 87634

    Introduction

    Spatial data in GIS has two primary data formats: raster and vector.

    Raster uses a grid cell structure, whereas vector is more like a drawn map.

    Spatial Data: Vector format

    Vector data are defined spatially:

    Point - a pair of x and y coordinates (x1, y1)

    Line - a sequence of points

    Polygon - a closed set of lines

    [Figure labels: Node, vertex]
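    The three vector definitions can be written down directly as data. The following is a minimal Python sketch; the type names, the helper is_closed and the sample coordinates are illustrative assumptions, not part of any GIS package.

```python
from typing import List, Tuple

# Hypothetical type aliases for illustration only.
Point = Tuple[float, float]   # a pair of x and y coordinates (x1, y1)
Line = List[Point]            # a sequence of points
Polygon = List[Point]         # a closed set of lines: the last point repeats the first


def is_closed(ring: List[Point]) -> bool:
    """A boundary is closed when it has at least three distinct corners
    and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


# Made-up sample features.
well: Point = (2.0, 3.0)
road: Line = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0)]
field: Polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]

print(is_closed(field), is_closed(road))
```

    The only thing separating a line from a polygon in this sketch is closure, which mirrors the definition of a polygon as a closed set of lines.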

    Raster and Vector Data

    [Figure: a point, a line and a polygon shown side by side in vector and raster form; the raster side is labeled "Zone of cells"]

    Raster data are described by a cell grid, one value per cell.

    Raster and Vector Data

    Vector format has points, lines, and polygons that appear normal, much like a map.

    Raster format generalizes the scene into a grid of cells, each with a code to indicate the feature being depicted. The cell is the minimum mapping unit.

    Raster has generalized reality: all of the features in the cell area are reduced to a single cell identity.

    Raster and Vector Data Models

    Raster: because the raster cell's value or code represents all of the features within the grid cell, it does not maintain true size, shape, or location for individual features. Even where nothing exists (no data), the cells must be coded.

    Vector: vectors are data elements describing position and direction. In GIS, vector is the map-like drawing of features, without the generalizing effect of a raster grid. Therefore, shape is better retained. Vector is much more spatially accurate than the raster format.

    Raster Data

    Raster Coding

    Resolution

    Gridding and Linear Features

    Raster Precision and Accuracy

    Raster Coding

    In the data entry process, maps can be digitized or scanned at a selected cell size and each cell assigned a code or value.

    The cell size, also termed resolution, can be set in terms of the grid structure or in ground units.

    There are three basic schemes and one advanced scheme for assigning cell codes.

    Raster Coding

    Presence/Absence: the most basic method; a feature is recorded if any of it occurs in the cell space.

    Cell Center: reads only the center of the cell and assigns the code accordingly. Not good for points or lines.

    Dominant Area: assigns the cell the code of the feature with the largest (dominant) share of the cell. This is suitable primarily for polygons.

    Percent Coverage: a more advanced method. Each feature is separated into its own theme for coding, and each cell is assigned a value showing that feature's percent cover.
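    The four coding schemes can be compared on a single cell. The sketch below assumes the cell's true contents are known as a small 3 x 3 sub-grid of feature codes ("W" water, "F" forest, "R" road); that sub-grid and all function names are made up for illustration.

```python
from collections import Counter

# Hypothetical detail inside ONE raster cell, sampled on a 3 x 3 sub-grid.
cell = [
    ["W", "W", "F"],
    ["W", "F", "F"],
    ["R", "F", "F"],
]
values = [code for row in cell for code in row]


def presence_absence(values, feature):
    """1 if any part of the feature occurs in the cell, else 0."""
    return 1 if feature in values else 0


def cell_center(cell):
    """Code found at the center of the cell only."""
    mid = len(cell) // 2
    return cell[mid][mid]


def dominant_area(values):
    """Code of the feature with the largest share of the cell."""
    return Counter(values).most_common(1)[0][0]


def percent_coverage(values, feature):
    """Percent of the cell covered by one feature (one theme per feature)."""
    return 100.0 * values.count(feature) / len(values)


print(presence_absence(values, "R"))   # the road is recorded even though it is tiny
print(cell_center(cell))               # only the middle sample counts
print(dominant_area(values))           # forest has 5 of 9 samples
print(percent_coverage(values, "W"))   # water theme value for this cell
```

    The road occupies one sample in nine: presence/absence keeps it, while cell center and dominant area both drop it in favour of forest, which is why those two schemes suit polygons better than points or lines.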

    Raster Coding Problems

    Raster coding produces spatial inaccuracies.

    Raster Coding Problems

    One possible solution is to increase the resolution by increasing the number of cells, making each one smaller and therefore more sensitive to accurate classification.

    Raster Mapping

    A major problem with the raster structure is that the shape of features is forced into an artificial grid cell format.

    For right-angled features, such as square agricultural fields or rectangular political districts, this may not present a major problem. However, for many features, size and shape can become undesirably distorted.

    Resolution

    Increasing the number of cells in a data set increases spatial resolution, which helps to increase spatial accuracy.

    One advantage of using relatively few cells is the short processing time and ease of analysis.

    Gridding and Linear Features

    A low-resolution raster results in a rather generalized and crude shape.

    A high-resolution raster shape appears more realistic, though still a long way from the vector shape and spatial accuracy.

    Raster Precision and Accuracy

    Questions of raster data precision (the exact location) and accuracy (maximum spatial truth) are often a problem.

    Because the raster cell is the maximum resolution and the minimum mapping unit, there is no way to know exactly where a small feature occurs within it.

    Smaller cells have less spatial error because the area of doubt is smaller.

    Uncertainty becomes greater when measuring across cells.

    Area measurements are also generalized.
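    How cell size generalizes an area measurement can be illustrated with a circle of known area. The sketch below assumes a circle of radius 10 ground units centred on a grid corner and uses cell-center coding; the function name and the chosen cell sizes are illustrative.

```python
import math


def raster_area(radius, cell_size):
    """Area of a circle centred at the origin after cell-center rasterization:
    a cell counts as 'circle' only if its center falls inside the circle."""
    n = math.ceil(radius / cell_size)
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            x = (i + 0.5) * cell_size   # center of cell (i, j)
            y = (j + 0.5) * cell_size
            if x * x + y * y <= radius * radius:
                count += 1
    return count * cell_size ** 2


true_area = math.pi * 10.0 ** 2          # about 314.16 square units
coarse = raster_area(10.0, 5.0)          # 12 cells of 25 square units = 300.0
fine = raster_area(10.0, 1.0)            # 316 cells of 1 square unit = 316.0

print(abs(coarse - true_area), abs(fine - true_area))
```

    In this example the 5-unit grid misses the true area by roughly 14 square units and the 1-unit grid by roughly 2, at the cost of visiting 25 times as many cells.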

    Vector Data

    Vector features appear more realistic than raster features and have better spatial accuracy.

    Vector features are defined primarily by their shapes, more specifically by the outline of their shapes. In GIS, the vector system is a coordinate-based data structure.

    Vector Data

    Shape points are the ends and bends that define the feature's outline.

    At the beginning and end of every line or polygon feature is a node.

    At each bend (change of direction) is a vertex (plural: vertices).

    Nodes are end points and vertices lie between them, defining the shape.

    Point features are standalone nodes.

    Vector Data

    Chains connect the shape points to draw the feature's outline.

    Chains are vectors or data structure paths that are not part of the actual stored data elements; they are not real lines, but define and present the connection between shape points.

    Vector system data files store only the coordinates of each node and vertex; the hardware draws the connecting chain segments. The chain is a virtual component.

    The vector data structure is also known as an arc-node model because it uses chains (arcs) and end points (nodes).
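    The idea that only shape points are stored and chains are derived on demand can be sketched as follows. The record layout (from_node, vertices, to_node) and the coordinates are assumptions made for illustration, not a real file format.

```python
import math

# Hypothetical stored record for one line feature: coordinates only, no segments.
arc = {
    "from_node": (0.0, 0.0),
    "vertices": [(3.0, 4.0), (3.0, 10.0)],
    "to_node": (11.0, 10.0),
}


def chain_segments(arc):
    """Derive the connecting chain segments from the stored shape points."""
    points = [arc["from_node"], *arc["vertices"], arc["to_node"]]
    return list(zip(points, points[1:]))


def chain_length(arc):
    """Total length of the derived chain, in coordinate units."""
    return sum(math.dist(a, b) for a, b in chain_segments(arc))


print(len(chain_segments(arc)))   # 3 segments derived from 4 stored points
print(chain_length(arc))          # 5 + 6 + 8 = 19.0
```

    Nothing about the segments is written to the record; they exist only while chain_segments runs, which is the sense in which the chain is a virtual component.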

    Raster and Vector Structures

    Raster and vector structures have different methods of storing and displaying spatial data.

    Raster cells are stored and displayed as cells, but in the vector format only the nodes and vertices are stored. This results in considerable data storage differences.

    Raster and Vector Structures

    A point in a raster system is a single cell, but in a vector system it is only a node represented by a symbol with its coordinate position noted.

    A simple line in a raster system consists of a sequence of cells. In a vector system, a simple line consists of two nodes and a chain that connects them.

    A more complex raster line consists of connected cells, sometimes in stair-step fashion when they are diagonal. Complex lines in the vector format have vertices to mark changes in direction, with nodes at each end.
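    The stair-step effect and the storage difference can be seen by gridding one straight line. The slides do not name a gridding method, so this sketch swaps in Bresenham's line algorithm as one common choice; the function name and the (row, column) cell addressing are assumptions.

```python
def raster_line(r0, c0, r1, c1):
    """Cells (row, col) of a straight line between two cells,
    chosen with Bresenham's line algorithm."""
    cells = []
    d_row, d_col = abs(r1 - r0), abs(c1 - c0)
    step_row = 1 if r1 >= r0 else -1
    step_col = 1 if c1 >= c0 else -1
    err = d_col - d_row
    r, c = r0, c0
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -d_row:      # step along the columns
            err -= d_row
            c += step_col
        if e2 < d_col:       # step along the rows
            err += d_col
            r += step_row
    return cells


vector_line = [(0, 0), (2, 5)]            # vector: two nodes, chain implied
raster_cells = raster_line(0, 0, 2, 5)    # raster: every cell on the way

print(raster_cells)
# [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4), (2, 5)]
```

    The same line costs two coordinate pairs in vector form and six cells in raster form, and the row index jumps at (1, 2) and (2, 4), which is the stair-step the slide describes.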

    Raster and Vector Structures

    Raster polygons are filled with cells. For single polygons, the vector format usually has a single node and several vertices to mark the boundary direction changes.

    Connected polygons are simply two blocks of cells in the raster format, but in vector they share a common border and some common nodes.

    Raster to Vector Conversion

    There are at least four basic reasons to convert from raster to vector:

    (1) better visual appearance of vector features;

    (2) some plotters work only on vector data;

    (3) comparison with vector data is best when both data files have identical formats;

    (4) some GIS systems have vectors as the central operating data structure.

    Rasterization of vector data is often called gridding.

    Raster Advantages

    A relatively simple data structure;

    The simple grid structure makes analysis easier.

    The computer platform can be low tech and inexpensive.

    Remote sensing imagery is typically obtained in raster format.

    Modeling is the creation of a generalized data file or a set of universal procedures to accomplish a certain GIS task.

    Raster Disadvantages

    Spatial inaccuracies

    Because each cell tends to generalize a landscape, the result is relatively low resolution compared to the vector format.

    Because of spatial inaccuracies caused by data generalization, a raster format cannot tell precisely what exists at a given location.

    Each cell must have a code, even where nothing exists.

    Vector Advantages

    In general, vector data is more map-like.

    It has very high resolution.

    The high resolution supports high spatial accuracy.

    Vector formats have storage advantages.

    The general public usually understands what is shown on vector maps.

    Vector data can be topological.

    Vector Disadvantages

    May be more difficult to manage than raster formats.

    Require more powerful, high-tech machines.

    The use of better computers, increased management needs, and other considerations often make the vector format more expensive.

    Learning the technical aspects of a vector system is more difficult than understanding the simplicity of the raster format, particularly when topology is introduced.

    GIS Data Characteristics

    Location, or position, is a major starting point of spatial measurement. Location can be descriptive, or it can use a latitude-longitude system.

    Size characteristics: polygons have area and perimeter; lines have length.

    Shape: an important descriptive element used in map and image interpretation. The shape of a feature often indicates its identity and role on the landscape.
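    The size characteristics listed above follow directly from vector coordinates. This is a minimal sketch that assumes planar (projected) x, y coordinates rather than latitude-longitude; the function names and the sample parcel are illustrative, and the area uses the standard shoelace formula.

```python
import math


def line_length(points):
    """Length of a line given as a sequence of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def _closed(ring):
    """Return the ring with its first point repeated at the end."""
    return ring if ring[0] == ring[-1] else ring + [ring[0]]


def polygon_perimeter(ring):
    """Perimeter of a polygon boundary (open or closed ring)."""
    return line_length(_closed(ring))


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula."""
    pts = _closed(ring)
    twice = sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return abs(twice) / 2.0


parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # made-up 4 x 3 rectangle
print(polygon_area(parcel), polygon_perimeter(parcel))      # 12.0 14.0
```

    On latitude-longitude coordinates these formulas would give results in squared degrees and degrees, so real measurements are normally taken after projecting the data.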

    GIS Data Characteristics

    Point features have no real shape or spatial dimension, only the position of objects or occurrences. They are represented by symbols, such as dots, geometric shapes, or icons.

    A line feature has length from beginning to end.

    Polygon features have a wide variety of shapes, from easily interpreted circles and squares to complicated shapes that defy description.

    Spatial Data Relationships

    Spatial relationships are how features relate to each other in space.

    They include distance, distribution, density, and pattern.

    Spatial Data Relationships

    Distance from one feature to another is an elementary but important relationship. It is available through simple measurement.

    Distribution is the collective location of features; the geographic dispersal or range. There are two basic ways of perceiving distribution: features among themselves and their spatial relationship with other features.

    Spatial Data Relationships

    Density is the number of items per unit area; how close features are to each other.

    Pattern is the consistent arrangement of features, similar to (and can include) distribution and density.
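    Distance and density are simple enough to compute by hand, as the following sketch suggests. It assumes planar coordinates and a known study area; the point set, the area value and the function names are invented for illustration.

```python
import math


def density(points, area):
    """Number of items per unit area."""
    return len(points) / area


def nearest_neighbour_distances(points):
    """For each point, the distance to the closest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


trees = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]   # hypothetical point features
study_area = 10.0                               # hypothetical area, square units

print(density(trees, study_area))
print(nearest_neighbour_distances(trees))
```

    Density says how many features share the area, while the nearest-neighbour distances say how close each one is to its neighbours; the two together start to describe pattern.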

    The Data Model

    Geographical variation in the real world is infinitely complex. Therefore, we require a set of rules (the data model) to convert real geographical variation into discrete objects.

    A data model is a set of guidelines for the representation of the logical organisation of the data in a database, consisting of named logical units of data and the relationships between them.

    The GIS Model: example

    [Figure: three map layers (roads, hydrology, topography), each drawn against longitude and latitude axes]

    Here we have three layers or themes:

    -- roads,

    -- hydrology (water),

    -- topography (land elevation)

    They can be related because precise geographic coordinates are recorded for each theme.

    Layers comprise two data types:

    -- spatial data, which describes location (where)

    -- attribute data, specifying what, how much, when

    Layers may be represented in two ways:

    -- in vector format, as points and lines

    -- in raster (or image) format, as pixels

    All geographic data has four properties: projection, scale, accuracy and resolution.

    Types of data model

    The Raster Model

    Equivalent of a continuous grid covering the surface, whereby each cell in the grid represents a square on the ground.

    The Vector Model

    Attempts to represent objects as exactly and precisely as possible by storing points, lines (arcs) and polygons (areas) in a continuous co-ordinate space.

    Raster-Vector Data Model

    [Figure: the real world and its raster and vector representations]

    Representing Data with Raster and Vector Models

    Raster Model

    The area is covered by a grid with (usually) equal-sized, square cells.

    Attributes are recorded by assigning each cell a single value based on the majority feature (attribute) in the cell, such as land use type.

    Image data is a special case of raster data in which the attribute is a reflectance value from the electromagnetic spectrum.

    Cells in image data are often called pixels (picture elements).

    Representing Data with Raster and Vector Models

    Vector Model

    The fundamental concept of vector GIS is that all geographic features in the real world can be represented as:

    points or dots (nodes): trees, poles, fireplugs, airports, cities

    lines (arcs): streams, streets, sewers

    areas (polygons): land parcels, cities, counties, forest, rock type

    Vector and Raster Models in GIS

    [Figure: representation of lines in raster and vector form]

    TOPOLOGY (for vector data)

    What is topology? Why is it important?

    Three types of topological models in GIS

    Spatial operations of topology
        Contiguity
        Connectivity

    Trade-offs of topological structure

    Application model
        Triangular Irregular Network (TIN): vector-based GIS

    Spatial features and spatial relationships

    Spatial features in maps
        Points, lines and polygons

    Humans interpret additional information from maps about the spatial relationships between features
        A route traced from an airport to a house
        Land contiguity: parcels adjacent to the streets along which they are located

    The definition of Topology

    Spatial relationships that can be interpreted:
        identification of connecting lines along a path
        definition of the areas enclosed within these lines
        identification of contiguous areas

    In digital maps, these relationships are depicted using topology.

    Topology = a mathematical procedure for explicitly defining spatial relationships.

    Topology is the description of how spatial objects are related, with spatial meaning.

    Topological data models

    Three types of topological concepts: arc, node and polygon topologies

    Arc
        Arcs have directions and left and right polygons (= contiguity)

    Node
        Nodes link arcs with start and end nodes (= connectivity)

    Polygon
        Arcs that connect to surround an area define a polygon (= area definition)
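    The three topologies can all be read off one arc table. The sketch below assumes a made-up map of two polygons, A and B, that share one border between nodes n1 and n2, with OUT standing for the outside world; the table layout and function names are illustrative only.

```python
from collections import defaultdict

# Hypothetical arc table: arc id -> (from_node, to_node, left_polygon, right_polygon)
arcs = {
    "a1": ("n2", "n1", "A", "OUT"),   # outer boundary of A
    "a2": ("n1", "n2", "B", "OUT"),   # outer boundary of B
    "a3": ("n1", "n2", "A", "B"),     # border shared by A and B
}


def contiguity(arcs):
    """Pairs of polygons that lie left and right of the same arc."""
    return {frozenset((left, right)) for _, _, left, right in arcs.values()}


def connectivity(arcs):
    """For each node, the arcs that start or end there."""
    at_node = defaultdict(set)
    for arc_id, (start, end, _, _) in arcs.items():
        at_node[start].add(arc_id)
        at_node[end].add(arc_id)
    return dict(at_node)


def area_definition(arcs):
    """For each polygon, the arcs that surround it."""
    boundary = defaultdict(set)
    for arc_id, (_, _, left, right) in arcs.items():
        boundary[left].add(arc_id)
        boundary[right].add(arc_id)
    return dict(boundary)


print(frozenset(("A", "B")) in contiguity(arcs))   # A and B are neighbours
print(sorted(connectivity(arcs)["n1"]))            # arcs meeting at n1
print(sorted(area_definition(arcs)["A"]))          # arcs enclosing A
```

    No coordinates are consulted at any point: once the table exists, neighbours, junctions and polygon boundaries are lookups, which is what makes topological analysis fast.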

    Terms and concepts

    [Figure: an arc running from its From Node to its To Node, with the Left Polygon and Right Polygon on either side]

    Connectivity - from and to nodes

    Contiguity - Polygon Enclosure

    Adjacency - from Direction

    Spatial operations of topology

    Connectivity and contiguity (Aronoff, 1989)
        Basic but core spatial analysis operations in GIS

    Contiguity
        A biologist might be interested in the habitats that occur next to each other
        A city planner might be interested in zoning conflicts, such as industrial zones bordering recreation areas

    Connectivity
        Transportation networks, telecommunication systems, river systems
        To find optimum routings, the most efficient delivery routes or the fastest travel route
        To predict loading at critical points in a river channel
        To estimate water flow at a bridge crossing that will result from a heavy flood
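    A connectivity question such as the fastest travel route can be sketched as a shortest-path search over connected arcs. The slides do not name a method; Dijkstra's algorithm is swapped in here as one standard choice, and the road network, node names and travel costs are invented for illustration.

```python
import heapq


def shortest_route(network, start, goal):
    """Dijkstra's algorithm over {node: {neighbour: cost}}.
    Returns (total_cost, path), or (inf, []) if the goal is unreachable."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in network.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical road network; costs are travel times in minutes.
roads = {
    "airport": {"junction": 4.0, "bridge": 10.0},
    "junction": {"airport": 4.0, "bridge": 3.0, "house": 9.0},
    "bridge": {"airport": 10.0, "junction": 3.0, "house": 2.0},
    "house": {"junction": 9.0, "bridge": 2.0},
}

print(shortest_route(roads, "airport", "house"))
# (9.0, ['airport', 'junction', 'bridge', 'house'])
```

    The search needs only the node-to-node links and their costs, which is exactly the information that node (connectivity) topology records.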

    Trade-offs of topology

    Advantages
        Spatial data is stored more efficiently
        Analysis is faster and more efficient for large data sets
        Topological relationships let us perform spatial analysis functions:
            Modelling flow through the connection of lines in a network (i.e. buffering)
            Combining adjacent polygons with similar characteristics (i.e. spatial merge)
            Overlaying geographical features (i.e. spatial overlay)

    Disadvantages

    Extra cost and time
        Creating a topological structure does impose a cost
        Topology should always be updated when a new map is added or an existing map is updated

    Additional batch job working
        To avoid the extra effort, GIS systems need to run a batch job (i.e. a process that can be run without user interaction); 70% of total GIS costs
        Autoexec.bat in DOS
        Macro languages such as AML (Arc/Info), Avenue (ArcView), MapBasic (MapInfo), etc.

    Conclusions of topology

    When topology is created, for each spatial feature we can:

    Know its position

    Know what is around it

    Understand its geographical characteristics by virtue of recognising its surroundings

    Know how to get from A to B

    Thank You