in conjunction with Data Management & Warehousing http://www.datamgmt.com

Implementing Netezza Spatial

Oct 22, 2014

davidmwalker

Introducing Netezza Spatial: The ability to analyse information in a geographic context:
– Where is the nearest petrol station?
– Which road am I on?
– How many ATMs are in this area?
Transcript
Page 1: Implementing Netezza Spatial

in conjunction with

Data Management & Warehousing http://www.datamgmt.com

Page 2: Implementing Netezza Spatial

What is the Spatial Module?

• It’s the ability to analyse information in a geographic context:
  – Where is the nearest petrol station?
  – Which road am I on?
  – How many ATMs are in this area?
• It’s not maps and images
  – These come later with tools that help present the information

Wednesday, July 28, 2010 © 2010 Data Management & Warehousing 2

Page 3: Implementing Netezza Spatial

The three types of data & many questions

• Points
  – OS Grid
  – Latitude & Longitude
• Lines
  – Pairs of points
  – e.g. Road Segments
• Polygons
  – A series of points that define a boundary
  – e.g. Postcode Boundaries

• How close are two points?
• Does a point touch a line?
• Is a point inside or outside a polygon?
• Does a line cross a polygon?
• How many points are in a polygon?
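Three of these questions can be sketched in a few lines of plain Python. This is only an illustration of the ideas, not how Netezza Spatial implements them: it assumes flat (planar) co-ordinates such as OS Grid eastings and northings in metres, and the function names are made up for this example.

```python
import math


def distance(p, q):
    """Straight-line distance between two planar points (x, y)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex is joined back
    to the first. Points exactly on the boundary are not treated specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_in_polygon(points, polygon):
    """How many of the given points fall inside the polygon?"""
    return sum(1 for p in points if point_in_polygon(p, polygon))


# A hypothetical 10 x 10 square "postcode boundary" and three "ATMs".
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
atms = [(1, 1), (5, 5), (11, 1)]
atms_in_area = count_points_in_polygon(atms, square)
```

On real latitude and longitude data the flat-surface assumption breaks down, which is the subject of the next slide.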

Page 4: Implementing Netezza Spatial

Using Spatial Data Is Complex

• Different distances between points at different longitudes and latitudes
• Measurement over a curved irregular surface
• Multiple input and output formats
• Multiple co-ordinate systems (see: A Guide to Coordinate Systems in Great Britain)
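The first point can be made concrete with the haversine formula. The sketch below is an approximation under a stated assumption: it treats the Earth as a perfect sphere with a mean radius of 6371.0088 km, so it ignores the irregular surface mentioned above. The function name is chosen for this example only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumption: mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude, measured at two different latitudes.
at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)
at_60_north = haversine_km(60.0, 0.0, 60.0, 1.0)
```

Under this model one degree of longitude is roughly 111 km at the equator and about half that at 60° north, so a fixed difference in degrees cannot be used as a fixed distance.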

Page 5: Implementing Netezza Spatial

Sources of Information – GPS

• In Car Device
  – Sends frequent data sets to processing centre
  – Point Data
    • Speed, Direction, Location and G-force
  – Aggregate Data
    • Speed and Direction
• Other Devices
  – Sat Nav Systems
  – Smart Phone Apps e.g. ‘GPS Tracker’
  – Cameras

Page 6: Implementing Netezza Spatial

Sources of Information – Ordnance Survey

• Integrated Road Network: A series of 3 million ‘linestrings’ and 17 million points that describe every road in the UK
• Linestrings have between 2 and 655 points; most have fewer than 10
• 23 points for this picture

Page 7: Implementing Netezza Spatial

Sources of Information – Post Office/GADM

• Postal Address File: A series of c.1.75M UK postcodes
  – Postcode Boundaries
  – Over 28M complete addresses
• Global Admin Boundaries
  – National and regional boundaries for c.245 countries
  – http://www.gadm.org

Page 8: Implementing Netezza Spatial

Data Layers – Enriching what you have

• Data Layers are sets of information tied to a geographic point
  – Road Speed for a given road segment
  – ATM Location
  – House Price for a postcode
• Where data has location information it is known as ‘Geo-tagged’

Page 9: Implementing Netezza Spatial

Data Layer Sources (1)

• Ordnance Survey
  – Road Types, Limits, Closures, etc.
• Government
  – UK Government now providing masses of geo-tagged info (http://data.gov.uk)
• Met Office / HM Nautical Almanac Office
  – Weather, Daylight to Postcode Level

Page 10: Implementing Netezza Spatial

Data Layer Sources (2)

• Wikipedia
  – Geo-tag Access API – what’s nearby?
• Google Maps
  – Road level photographic images
• Commercial Sources
  – Fast Food Outlets, Supermarkets, Petrol Stations, ATMs, etc.
• Massive growth in both commercial and public domain geo-tagged data

Page 11: Implementing Netezza Spatial

Issues with Geo-tagged data

• Geo-tagging uses different formats
  – Longitude & Latitude, OS Grid Reference, etc.
• Geo-tagging at different levels
  – Data for a postcode or an entire county, which makes it difficult to compare
• Geo-tagging coverage is patchy and/or historic
  – Rate of change of fine detail data is very high
  – e.g. OS issues monthly updates to the UK mapping
• Multiple standards and formats
  – XML & CSV, different file formats, etc.

Page 12: Implementing Netezza Spatial

Our Model For Delivering Spatial Data

[Diagram: six data sources feed a (small) Postgres database, which feeds Netezza. Netezza holds Spatial Analysis (Proximity, Contains, Excludes) and Spatial Presentation (sets of data with spatial attributes), which are used by Query & Presentation Tools (Tableau, Google Maps, etc.). The numbers 1 to 5 on the diagram mark the steps below.]

1. Load Multiple File Formats
2. Standardise Geo-Tagging
3. Extract & Load CSVs
4. Perform Spatial Analysis
5. Create User Access Area
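Steps 2 and 3 of the model (standardise the geo-tagging, then extract CSVs for loading) might look something like the Python sketch below. It is illustrative only: the input field names (`lon`/`lat` versus `longitude`/`latitude`), the function names, and the choice of well-known text `POINT(lon lat)` as the load format are assumptions, not details from the slides. Converting OS Grid references to latitude and longitude is deliberately left out.

```python
import csv
import io


def standardise(record):
    """Return (id, lon, lat) from a record that may use either naming style."""
    lon = record.get("lon", record.get("longitude"))
    lat = record.get("lat", record.get("latitude"))
    if lon is None or lat is None:
        raise ValueError(f"record {record.get('id')!r} has no usable location")
    return record["id"], float(lon), float(lat)


def to_csv(records):
    """Write one CSV line per record: id, then the point as WKT text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for rec in records:
        rec_id, lon, lat = standardise(rec)
        writer.writerow([rec_id, f"POINT({lon} {lat})"])
    return buf.getvalue()


# Two hypothetical source records that tag location in different ways.
rows = [
    {"id": "A1", "lon": "-0.1276", "lat": "51.5072"},
    {"id": "B2", "longitude": 1.5, "latitude": 52.0},
]
csv_text = to_csv(rows)
```

Doing the format clean-up before the load keeps the analysis step working against a single, consistent representation of location.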

Page 13: Implementing Netezza Spatial

Netezza Spatial Value Add

• Netezza Spatial is fast
  – Analysis
    • Look up a typical 18 point trip in the 3M linestrings to find the roads that the vehicle was on in less than 1 second
    • Overnight batch process matching 300,000 points to road names in under 30 minutes
  – Presentation
    • Tools rely on fast query access to render any queried map with sub-second response times
• Netezza Spatial is easy
  – Distance and proximity calculations are simple
  – ‘Touches’, ‘Overlaps’ & ‘Contains’ queries allow instant value add
• Netezza Spatial integrates
  – Works well with Tableau
  – Easy to generate KML for use with Google Earth and Google Maps
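As a rough illustration of the KML point, the Python sketch below turns a handful of named points into a KML document using only the standard library. The function name and the sample row are hypothetical; in practice the rows would come from a query against the spatial tables. KML lists longitude before latitude inside `<coordinates>`.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def points_to_kml(points):
    """Build a KML document from an iterable of (name, longitude, latitude)."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML order is longitude,latitude
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical query result: one ATM location.
kml_text = points_to_kml([("ATM 1", -0.1276, 51.5072)])
```

The resulting text can be saved with a `.kml` extension and opened in Google Earth.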

Page 14: Implementing Netezza Spatial

Netezza Spatial Limitations

• Fails the Slartibartfast Test:
  – Polygons for very detailed maps are too big to be loaded as Netezza limits the maximum block size to 64000 characters
  – Named after the Hitchhiker’s Guide to the Galaxy coastline designer responsible for the twiddly bits around the Norwegian fjords
• Work-around:
  – Use regional boundaries (e.g. UK Counties, US States, etc.) and then aggregate into national boundaries
  – If a point is in Berkshire then by definition it is also in England
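The work-around can be sketched in Python as two small pieces: a size check before loading a boundary, and a roll-up from regional matches to national ones. Everything here is illustrative: the names are invented for this example, the region-to-nation table holds only two sample entries, and treating a boundary of exactly 64000 characters as loadable is an assumption.

```python
MAX_BLOCK_CHARS = 64000  # limit quoted on the slide

# Hypothetical lookup; a real one would cover every regional boundary loaded.
REGION_TO_NATION = {
    "Berkshire": "England",
    "Fife": "Scotland",
}


def fits_in_block(boundary_text):
    """True if a boundary's text form is short enough to load (assumed inclusive)."""
    return len(boundary_text) <= MAX_BLOCK_CHARS


def nation_for_region(region):
    """Roll a regional match up to its nation."""
    return REGION_TO_NATION[region]


def nations_for_points(point_regions):
    """Map point id -> nation, given point id -> region from a regional 'contains' query."""
    return {pid: nation_for_region(region) for pid, region in point_regions.items()}


# Hypothetical result of testing two points against regional polygons.
matches = {"p1": "Berkshire", "p2": "Fife"}
national = nations_for_points(matches)
```

The expensive spatial test only ever runs against regional polygons that are small enough to load; the national answer is then a plain lookup.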

[Images: Norway; Slartibartfast]

Page 15: Implementing Netezza Spatial

Current Uses …

• M/A/B road driving profiles
• Time of day driving profiles
• Speed Limits vs. Driven Speed
• Matching GPS positions to road names
• Out of bounds driving
• Customer Demographic Profiles

… but this is only the start in a very short time

Page 16: Implementing Netezza Spatial

in conjunction with

Data Management & Warehousing http://www.datamgmt.com