No(Geo)SQL Geographic search in (No)SQL

Dec 01, 2014

An overview of how to handle Geo in DBMS from a NoSQL point of view
Hibernate Search spatial module
Transcript
Page 1: No(Geo)SQL

No(Geo)SQL
Geographic search in (No)SQL

Page 2: No(Geo)SQL

Me

8 years at mappy.com as platform architect and deputy CTO

Founding partner of NovaCodex since 2008

@NHelleringer

Page 3: No(Geo)SQL

Why

Geo in databases
What is the point ?

Page 4: No(Geo)SQL

Why

Geo in databases challenges

Data is complex to store in SQL

Data is two-dimensional

Data is dense

Data is huge

Page 5: No(Geo)SQL

Origin (challenge)

Multiple dimensions, but B-trees sort on only one
Query-dependent index sorting calculation

New data structures and algorithms to handle dimensions
A two-phase search : select, then filter

Page 6: No(Geo)SQL

Origin (needs)

Geographic Information Systems
handling of geometric objects

The origins of geography in information systems lie in the need administrations had to handle real-world data :

Geology / Geography
Roads, administrative areas for cadastral surveys
Census data
Infrastructure elements (water delivery network, electrical delivery network, communication network)

Other needs arose once the data became available, and they use the same tools :

Geo marketing (market areas)

Page 7: No(Geo)SQL

How

All you ever hated about SQL … and more !

Complex SQL additions

Full-size, complex standardized API

Vendor dependent implementations

Not scalable

Page 8: No(Geo)SQL

Current Implementations (traditional DBMS)

Oracle
Quad-trees / R-trees
Side development on Oracle 4 (1984), integrated in Oracle 7 (1992)

SQL Server
4-level grid index
Since the 2008 version (2007)

Spatialite
R-trees
Since 3.6.0 (Mar 2008)

PostgreSQL
R-tree-over-GiST
Since PostGIS 1.0 for 8.0 (Apr 2005)

The Open Geospatial Consortium publishes a standard : OpenGIS

MySQL since Feb 2005, DB2 Spatial Extender since July 2006, Ingres added support very recently

Hibernate Spatial provides generic access to OpenGIS implementations

GIS software such as ESRI, MapInfo, GeoConcept and QuantumGIS uses this standard to access data

Page 9: No(Geo)SQL
Page 10: No(Geo)SQL

Puzzled ?

Do we need all this ?

Is Geo only for geo centric companies ?

Page 11: No(Geo)SQL

How

LBS changed everything !

Maps, geocoding & route planning available

Platforms handle millions of hits/day

Available through multiple APIs

Often for free

Page 12: No(Geo)SQL

How

GEOCODING
Data is huge
Not a geo problem
Expertise extremely valued
Provided

MAPS
Data is huge and made of complex objects
Indexing is geo
Processing capabilities required
Provided

ROUTE PLANNING
Data is huge
Not a geo problem
Not shardable
Provided

POI SEARCH
Data is less huge (your business size)
Indexing is geo
May shard
Less relevant

Page 13: No(Geo)SQL

Origin (needs)

Location-aware data
handling of data associated with a latitude/longitude tuple

Location became a search criterion :

Geo search
The map/the geography is the center of the search process

Proximity search
The location is one of many criteria to refine a search

Page 14: No(Geo)SQL
Page 15: No(Geo)SQL

New Solutions ?

Does NoSQL help ?

Page 16: No(Geo)SQL

Geo as a NoSQL Technology

Why does Geo fit a NoSQL approach ?

Geo does not fit in a traditional ‘pure’ DBMS : under first normal form (1NF), many dimensions in one column break the rules

(48,23) <?> (47,25)

Geo objects are hard to define strictly with SQL types : they are fickle

Tim Anglade ‘No SQL for fun and profit’ : Geo/hierarchical is one of seven forms of NoSQL to date

Page 17: No(Geo)SQL

Geo as a NoSQL Technology

Extensions to SQL or NoSQL data stores
Quad-trees
R-trees

Page 18: No(Geo)SQL

quad-tree

Page 19: No(Geo)SQL

How does it work ?

Search steps

1) Select
Compute level
Compute boxes ids
Fetch boxes

2) Filter
Compute distance
Select result set

Limits
High levels
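
The select phase above can be sketched in a few lines. This is a minimal illustration, not the code of any product named in this deck: it assumes a fixed grid over the latitude/longitude plane where level n splits the world into 2^n x 2^n boxes. The class name QuadGridSelect, the MAX_LEVEL constant and the "level|x|y" id format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "select" phase on a fixed quad-tree grid over lat/lon. */
final class QuadGridSelect {
    static final int MAX_LEVEL = 20; // assumed depth limit of the grid

    /** Compute level: deepest level whose box width (degrees of longitude) still covers the span. */
    static int computeLevel(double spanDegrees) {
        int level = 0;
        // Box width is 360 degrees at level 0 and halves at every level.
        while (level < MAX_LEVEL && 360.0 / (1 << (level + 1)) >= spanDegrees) {
            level++;
        }
        return level;
    }

    /** Index of the box containing a value along one axis, clamped to the grid. */
    static int cellIndex(double value, double min, double max, int level) {
        int cells = 1 << level;
        int index = (int) Math.floor((value - min) / (max - min) * cells);
        return Math.max(0, Math.min(cells - 1, index));
    }

    /** Compute boxes ids: every box of the level that intersects the bounding box. */
    static List<String> computeBoxIds(double minLat, double minLon,
                                      double maxLat, double maxLon, int level) {
        int x0 = cellIndex(minLon, -180, 180, level);
        int x1 = cellIndex(maxLon, -180, 180, level);
        int y0 = cellIndex(minLat, -90, 90, level);
        int y1 = cellIndex(maxLat, -90, 90, level);
        List<String> ids = new ArrayList<>();
        for (int x = x0; x <= x1; x++) {
            for (int y = y0; y <= y1; y++) {
                ids.add(level + "|" + x + "|" + y);
            }
        }
        return ids;
    }
}
```

Fetching the boxes is then a plain key lookup on those ids in whatever store holds the index. The filter phase computes the real distance for each fetched candidate, since a box only approximates the search circle.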

Page 20: No(Geo)SQL

r-tree

Page 21: No(Geo)SQL

Current Implementations (NoSQL databases)

Spatial Lucene/Solr, Elastic Search
Quad tree labels in Lucene tokens
Tile indices or GeoHash labels

GeoCouch
R-tree in Erlang

Neo4J Spatial
R-tree & quad-tree
Objects can be stored as graph elements

Page 22: No(Geo)SQL

Current Implementations (NoSQL databases)

MongoDB
Geo hashes into MongoDB B-trees
Shard support incoming
Spherical model since 1.7

Pincaster
In-memory quad tree
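
To illustrate how a geo hash lets a one-dimensional B-tree index two dimensions, here is a minimal sketch of the public base-32 geohash encoding. It is a general illustration and is not claimed to be the internal format of MongoDB or Lucene. The class name GeoHash is hypothetical.

```java
/** Hypothetical sketch of the public base-32 geohash encoding. */
final class GeoHash {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair into a geohash of the requested length. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean longitudeBit = true; // bits alternate, starting with longitude
        int bits = 0;                // bits collected for the current character
        int current = 0;             // value of the current 5-bit group
        while (hash.length() < length) {
            if (longitudeBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { current = (current << 1) | 1; lonMin = mid; }
                else            { current = current << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { current = (current << 1) | 1; latMin = mid; }
                else            { current = current << 1;       latMax = mid; }
            }
            longitudeBit = !longitudeBit;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(current));
                bits = 0;
                current = 0;
            }
        }
        return hash.toString();
    }
}
```

A shorter hash of a point is always a prefix of its longer hash, so points in the same box share a prefix and a range scan on a sorted string index selects a box. Neighbouring points on either side of a box boundary can still have very different hashes, which is why adjacent boxes are usually queried as well.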

Page 23: No(Geo)SQL

How

How do I build PoI search ?

Page 24: No(Geo)SQL

POI Search

Do it in pure SQL !!

Use a clustered (longitude, latitude) index :
o Select is done by the cluster on longitude (which is more selective than latitude !)
o Bounding box requests are handled at the index level, as latitude is included
o Filter with distance calculation can be done by a stored procedure on the database side or in application code
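
The application-code variant of this recipe might look like the sketch below. It is an illustration under stated assumptions: the table poi and its columns are hypothetical, the Earth is treated as a sphere of radius 6371 km, and the bounding box ignores the poles and the antimeridian. The class name PoiBoxSearch and its methods are also hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: bounding-box "select" in SQL, distance "filter" in application code. */
final class PoiBoxSearch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, spherical model (assumption)

    // Hypothetical table and columns; the clustered (longitude, latitude) index serves both ranges.
    static final String SELECT_SQL =
        "SELECT id, latitude, longitude FROM poi "
      + "WHERE longitude BETWEEN ? AND ? AND latitude BETWEEN ? AND ?";

    /** Returns {minLat, minLon, maxLat, maxLon} enclosing the search circle. */
    static double[] boundingBox(double lat, double lon, double radiusKm) {
        double angular = radiusKm / EARTH_RADIUS_KM; // radius as an angle, in radians
        double dLat = Math.toDegrees(angular);
        double dLon = Math.toDegrees(
            Math.asin(Math.sin(angular) / Math.cos(Math.toRadians(lat))));
        return new double[] { lat - dLat, lon - dLon, lat + dLat, lon + dLon };
    }

    /** Great-circle distance in kilometres (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Filter phase: keeps the {lat, lon} candidates that really lie within the radius. */
    static List<double[]> filter(List<double[]> candidates,
                                 double lat, double lon, double radiusKm) {
        List<double[]> result = new ArrayList<>();
        for (double[] p : candidates) {
            if (haversineKm(lat, lon, p[0], p[1]) <= radiusKm) {
                result.add(p);
            }
        }
        return result;
    }
}
```

The four values from boundingBox would be bound to the parameters of SELECT_SQL (longitude range first, then latitude range), and the returned rows passed to filter to drop the candidates in the corners of the box.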

Page 25: No(Geo)SQL

POI Search

Lucene via Hibernate Search

o Available in 4.2 beta 1
o Annotation based
o Simple to step in
o Refine by usage
o DSL supported

Page 26: No(Geo)SQL

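Sample indexing code

import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Latitude;
import org.hibernate.search.annotations.Longitude;
import org.hibernate.search.annotations.Spatial;

@Indexed
@Spatial
public class Hotel {
    @Latitude Double latitude;    // indexed as the latitude of the entity
    @Longitude Double longitude;  // indexed as the longitude of the entity
    [...]
}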

Page 27: No(Geo)SQL

Sample search code

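import org.apache.lucene.search.Query;
import org.hibernate.search.query.dsl.QueryBuilder;
import org.hibernate.search.query.dsl.Unit;

// fullTextSession is an already opened Hibernate Search FullTextSession
QueryBuilder builder = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity( PoI.class ).get();

double centerLatitude = 24;
double centerLongitude = 31.5;

// Everything within 50 km of the center point
Query luceneQuery = builder.spatial()
    .onCoordinates( PoI.class.getName() )
    .within( 50, Unit.KM )
    .ofLatitude( centerLatitude )
    .andLongitude( centerLongitude )
    .createQuery();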

Page 28: No(Geo)SQL

End !

Thank you for listening !

Page 29: No(Geo)SQL

Ref

http://www.slideshare.net/timanglade/nosql-for-fun-profit

http://en.wikipedia.org/wiki/First_normal_form

http://en.wikipedia.org/wiki/Quadtree

http://technet.microsoft.com/en-us/library/bb964712.aspx

http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene_v2.html

http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-geospatial-queries-with-couchdb:2008-10-26:en,CouchDB,Python,geo

http://wiki.neo4j.org/content/Neo4j_Spatial

http://www.osgeo.org/

http://relation.to/Bloggers/SpatialQueriesFirstBetaForHibernateSearch42IsAvailable

http://www.novacodex.net/