MySQL 5.7 and MongoDB Geospatial Introduction

Alexander Rubin - Principal Architect, Percona
Mike Benshoof - Technical Account Manager, Percona


Jun 25, 2018



Agenda

• Introduction
• GIS in MySQL
• GIS in MongoDB

What is Geodata?

● Encompasses a wide array of topics
● Revolves around geo-positioning data (latitude/longitude)
  ○ Point data (single lat/lon)
  ○ Bounded areas (think radius from point)
  ○ Defined area (think city limits outline on map)
● Often includes some function of distance
  ○ Distance between points
  ○ All points within distance x of point y

Why do we care?

● Here are some commonly asked questions based around geodata:
  ○ What are the 5 closest restaurants to my hotel?
  ○ How far is it from here to the airport?
  ○ How many restaurants are there within 25 miles of my hotel?
● These are all fairly common questions, especially with the prevalence of geo-enabled devices (anyone here ever enable “Location Services” on their phone?)

Other Industries

● Oil/gas exploration
● Meteorology
● Logistics companies
● Travel companies
● < INSERT YOUR INDUSTRY HERE >

● The point: geodata is so readily available, you are likely already using it or will be soon!

High-level theory and formulas… everyone’s favorite

Distance between points on a sphere (the Haversine formula)

Everyone get out your calculators and slide rules...
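For reference, the haversine formula in its standard textbook form is sketched below. The symbols are the conventional ones rather than anything taken from the slide: \varphi_1, \varphi_2 are the two latitudes and \lambda_1, \lambda_2 the two longitudes (all in radians), R is the radius of the sphere (roughly 3959 miles for Earth), and d is the great-circle distance.

```latex
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
  + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right)
\qquad
d = 2R \, \arcsin\!\left(\sqrt{a}\right)
```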

En MySQL por favor...

● MySQL prior to 5.6
  ○ Get out your calculators
● MySQL 5.6
  ○ Introduced st_distance (built-in)
● MySQL 5.7
  ○ New spatial functions, InnoDB spatial indexes
● MySQL 8.0 (upcoming)
  ○ The world is not flat any more… (more below)

Enough with the theory already!

    SET @src_lat = 37;  SET @src_lng = -133;
    SET @dest_lat = 38; SET @dest_lng = -133;

    SELECT (3959 * acos(cos(radians(@src_lat)) * cos(radians(@dest_lat))
                        * cos(radians(@dest_lng) - radians(@src_lng))
                        + sin(radians(@src_lat)) * sin(radians(@dest_lat))))
           AS GreatCircleDistance;

    +---------------------+
    | GreatCircleDistance |
    +---------------------+
    |   69.09758508647379 |
    +---------------------+

Huzzah!! The distance between two lines of latitude one degree apart is ~69 miles!
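The acos(...) expression in the query is the spherical law of cosines, which on a sphere gives the same great-circle distance as the haversine formula. A minimal Python sketch of both, assuming the same 3959-mile Earth radius as the query; the function names are illustrative and not part of the original deck.

```python
import math

EARTH_RADIUS_MILES = 3959  # same constant as the SQL query


def law_of_cosines_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance, mirroring the acos(...) expression in the query."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2) - math.radians(lng1)
    cos_angle = (math.cos(p1) * math.cos(p2) * math.cos(dlng)
                 + math.sin(p1) * math.sin(p2))
    # Clamp to [-1, 1]: rounding can push the value just outside acos's domain.
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, cos_angle)))


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2 - lng1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlng / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Same two points as the SQL example: one degree of latitude apart.
    print(law_of_cosines_miles(37, -133, 38, -133))
    print(haversine_miles(37, -133, 38, -133))
```

The clamp is a defensive choice for nearly identical points, where the law-of-cosines form is numerically weaker than haversine.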

… now let’s put it all together

Find me all zip codes within 25 miles of my current zip code...

InnoDB Table Structure (pre 5.6)

    CREATE TABLE `zipcodes` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `zip` varchar(255) DEFAULT NULL,
      `latitude` decimal(10,8) DEFAULT NULL,
      `longitude` decimal(11,8) DEFAULT NULL,
      `city` varchar(255) DEFAULT NULL,
      `state` varchar(255) DEFAULT NULL,
      `county` varchar(255) DEFAULT NULL,
      `type` varchar(255) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `lat` (`latitude`,`longitude`),
      KEY `city` (`city`,`state`),
      KEY `state` (`state`),
      KEY `zip` (`zip`),
      KEY `county` (`county`)
    ) ENGINE=InnoDB

… add some help

Helper Function (pre 5.6)

    DELIMITER $$

    DROP FUNCTION IF EXISTS DistanceInMiles$$

    CREATE FUNCTION DistanceInMiles (
      src_lat decimal(10,8), src_lng decimal(11,8),
      dest_lat decimal(10,8), dest_lng decimal(11,8))
    RETURNS decimal(15,8) DETERMINISTIC
    BEGIN
      RETURN CAST((3959 * acos(cos(radians(src_lat)) * cos(radians(dest_lat))
                               * cos(radians(dest_lng) - radians(src_lng))
                               + sin(radians(src_lat)) * sin(radians(dest_lat))))
                  AS decimal(15,8));
    END $$

    DELIMITER ;

… and results!

    mysql> # Norman, OK Post Office (73071)
    mysql> SET @srcLat = 35.254049;
    Query OK, 0 rows affected (0.00 sec)

    mysql> SET @srcLng = -97.300313;
    Query OK, 0 rows affected (0.00 sec)

    mysql> SET @dist = 25;
    Query OK, 0 rows affected (0.00 sec)

    mysql> SELECT z.ZIP, z.City, z.State,
        ->        DistanceInMiles(@srcLat, @srcLng, z.Latitude, z.Longitude) as distance
        ->   FROM zipcodes z
        -> HAVING distance < @dist
        ->  ORDER BY distance
        ->  LIMIT 10;
    +-------+---------------+-------+-------------+
    | ZIP   | City          | State | distance    |
    +-------+---------------+-------+-------------+
    | 73071 | NORMAN        | OK    |  0.00000000 |
    | 73026 | NORMAN        | OK    |  1.45586169 |
    | 73072 | NORMAN        | OK    |  4.30645795 |
    | 73165 | OKLAHOMA CITY | OK    |  5.84183161 |
    | 73070 | NORMAN        | OK    |  7.15379176 |
    | 73068 | NOBLE         | OK    |  7.15998471 |
    | 73160 | OKLAHOMA CITY | OK    |  7.83641939 |
    | 73069 | NORMAN        | OK    |  7.91904816 |
    | 73019 | NORMAN        | OK    |  8.72433716 |
    | 73150 | OKLAHOMA CITY | OK    | 10.62626848 |
    +-------+---------------+-------+-------------+
    10 rows in set (0.47 sec)

Not so great...

    mysql> EXPLAIN SELECT z.ZIP, z.City, z.State,
        ->        DistanceInMiles(@srcLat, @srcLng, z.Latitude, z.Longitude) as distance
        ->   FROM zipcodes z
        -> HAVING distance < @dist
        ->  ORDER BY distance
        ->  LIMIT 10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: z
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 42347
         filtered: 100.00
            Extra: Using temporary; Using filesort
    1 row in set, 1 warning (0.00 sec)

More math to the rescue!

Rather than scan the whole table, let’s just look at a small rectangle of data (i.e. a bounding box):

    1° Latitude  ~= 69 miles
    1° Longitude ~= cos(lat) * 69 miles

    $lat_range = $radius / 69
    $lng_range = abs($radius / (cos(radians($mylat)) * 69))

    $lon1 = $mylng + $lng_range
    $lon2 = $mylng - $lng_range
    $lat1 = $mylat + $lat_range
    $lat2 = $mylat - $lat_range
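The bounding-box arithmetic can be sketched in Python as follows. This is a minimal illustration under the slide's 69-miles-per-degree approximation; the function name and return order are assumptions made here. For the Norman, OK coordinates and the 25-mile radius used in this deck, the result should land very close to the BETWEEN limits in the bounded query.

```python
import math

MILES_PER_DEGREE_LAT = 69  # approximation used on the slide


def bounding_box(lat, lng, radius_miles):
    """Return (lat_min, lat_max, lng_min, lng_max) for a box around a point."""
    lat_range = radius_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks by cos(latitude); math.cos expects radians.
    lng_range = abs(radius_miles
                    / (math.cos(math.radians(lat)) * MILES_PER_DEGREE_LAT))
    return (lat - lat_range, lat + lat_range,
            lng - lng_range, lng + lng_range)


if __name__ == "__main__":
    # Norman, OK post office coordinates from the slides, 25-mile radius.
    print(bounding_box(35.254049, -97.300313, 25))
```

The box is only a coarse pre-filter: its corners are farther than the radius from the center, so an exact distance check is still needed afterwards.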

… and respectable!

    mysql> SELECT z.ZIP, z.City, z.State,
        ->        DistanceInMiles(@srcLat, @srcLng, z.Latitude, z.Longitude) as distance
        ->   FROM zipcodes z
        ->  WHERE z.Longitude BETWEEN -97.744004 AND -96.856621
        ->    AND z.Latitude BETWEEN 34.891730 AND 35.616367
        -> HAVING distance < @dist
        ->  ORDER BY distance
        ->  LIMIT 10;
    +-------+---------------+-------+-------------+
    | ZIP   | City          | State | distance    |
    +-------+---------------+-------+-------------+
    | 73071 | NORMAN        | OK    |  0.00000000 |
    | 73026 | NORMAN        | OK    |  1.45586169 |
    | 73072 | NORMAN        | OK    |  4.30645795 |
    | 73165 | OKLAHOMA CITY | OK    |  5.84183161 |
    | 73070 | NORMAN        | OK    |  7.15379176 |
    | 73068 | NOBLE         | OK    |  7.15998471 |
    | 73160 | OKLAHOMA CITY | OK    |  7.83641939 |
    | 73069 | NORMAN        | OK    |  7.91904816 |
    | 73019 | NORMAN        | OK    |  8.72433716 |
    | 73150 | OKLAHOMA CITY | OK    | 10.62626848 |
    +-------+---------------+-------+-------------+
    10 rows in set (0.00 sec)

Updated Explain Plan:

               id: 1
      select_type: SIMPLE
            table: z
       partitions: NULL
             type: range
    possible_keys: lat
              key: lat
          key_len: 13
              ref: NULL
             rows: 1616
         filtered: 11.11
            Extra: Using index condition; Using temporary; ...

Onwards and upwards to Spatial Data Types!

Converting our table to geospatial...

● Non-Spatial way:
  ○ Lat / Lon as individual decimal columns
  ○ Composite B-Tree index covering each column
● Spatial way:
  ○ Use the geometry type with a POINT object (single lat/lon)
  ○ Use the geometry type with a POLYGON object (defined border)
  ○ Use a SPATIAL index on the geometry column (MyISAM only through 5.6, added to InnoDB in 5.7)

And voila!

    CREATE TABLE `legacy`.`zipcodes` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `zip` varchar(255) DEFAULT NULL,
      `latitude` decimal(10,8) DEFAULT NULL,
      `longitude` decimal(11,8) DEFAULT NULL,
      `city` varchar(255) DEFAULT NULL,
      `state` varchar(255) DEFAULT NULL,
      `county` varchar(255) DEFAULT NULL,
      `type` varchar(255) DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `lat` (`latitude`,`longitude`),
      KEY `city` (`city`,`state`),
      KEY `state` (`state`),
      KEY `zip` (`zip`),
      KEY `county` (`county`)
    ) ENGINE=InnoDB

    CREATE TABLE `geospatial`.`zipcodes` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `zip` varchar(255) DEFAULT NULL,
      `geo` geometry NOT NULL,
      `city` varchar(255) DEFAULT NULL,
      `state` varchar(255) DEFAULT NULL,
      `county` varchar(255) DEFAULT NULL,
      `type` varchar(255) DEFAULT NULL,
      PRIMARY KEY (`id`),
      SPATIAL KEY `geo` (`geo`),
      KEY `city` (`city`,`state`),
      KEY `state` (`state`)
    ) ENGINE=InnoDB

Now, we can use GIS notation to find our zip code:

    SELECT zip, city, state, county, type, ST_AsText(geo) lon_lat
      FROM zipcodes
     WHERE ST_CONTAINS(geo, ST_GeomFromText('POINT(35.254049 -97.300313)', 4326));

So What?!

● We had already written a query to do just that
● And this one is still doing the same amount of handler operations!
● This opens up the opportunity to run other GIS-based calculations, as we'll see now...

Not just points!

● In earlier slides, we based everything on a single lat/lon coordinate
● Several “zip code” databases follow this model
● In recent years, the prevalence of full polygon region databases has increased
● Example: the tl_2014_us_zcta510 schema provided by the US Census...

Not just points!

● This table defines full regions for postal codes as opposed to a centered lat/lon for an actual building:

    CREATE TABLE `zipcodes` (
      `ogr_fid` int(11) NOT NULL AUTO_INCREMENT,
      `shape` geometry NOT NULL,
      `zcta5ce10` varchar(5) DEFAULT NULL,
      `geoid10` varchar(5) DEFAULT NULL,
      `classfp10` varchar(2) DEFAULT NULL,
      `mtfcc10` varchar(5) DEFAULT NULL,
      `funcstat10` varchar(1) DEFAULT NULL,
      `aland10` double DEFAULT NULL,
      `awater10` double DEFAULT NULL,
      `intptlat10` varchar(11) DEFAULT NULL,
      `intptlon10` varchar(12) DEFAULT NULL,
      PRIMARY KEY (`ogr_fid`),
      SPATIAL KEY `shape` (`shape`),
      KEY `zipcode` (`zcta5ce10`)
    ) ENGINE=InnoDB

Difference in storage

● Old version for Norman zip code 73071:

    ST_AsText(geo): POINT(35.254049 -97.300313)

● New polygon-based version:

    ST_AsText(shape): POLYGON((35.228675 -97.44109,35.228673 -97.442413,35.228669 -97.442713,35.229179 -97.442722,35.229632 -97.442729,35.230083 -97.442727,35.230542 -97.442726,35.230998 -97.442725,35.23099 -97.441071,35.231275 -97.441072,35.231437 -97.441072,35.231464 -97.442724,35.231912 -97.442722,35.231898 -97.442647,35.231884 -97.441073,35.232346 -97.441073,35.232849 -97.441074,35.234002 -97.441073,35.234021 -97.442529,...

● Note that now, we have a full region as opposed to a point...

GIS Functions

● st_contains(g1, g2)
  ○ Check whether g1 completely contains g2 (g2 is entirely within g1)
● st_within(g1, g2)
  ○ Check whether g1 is spatially within g2 (the inverse of st_contains)
● st_*
  ○ Detailed list here: https://dev.mysql.com/doc/refman/8.0/en/spatial-function-reference.html
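To make the contains/within relationship concrete, here is a small planar point-in-polygon sketch in Python using ray casting. It is an illustration only, not how MySQL implements st_contains or st_within; the function names are hypothetical, coordinates are treated as flat x/y, and points exactly on the boundary are not handled specially.

```python
def polygon_contains_point(polygon, point):
    """Ray-casting test: True if point lies inside the polygon ring.

    polygon is a list of (x, y) vertices; point is an (x, y) tuple.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_within_polygon(point, polygon):
    """Same relationship with the arguments swapped."""
    return polygon_contains_point(polygon, point)


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(polygon_contains_point(square, (2, 2)))
    print(point_within_polygon((5, 2), square))
```

The second function exists only to show that the two predicates answer the same question with the argument order reversed.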

Some Sample GIS Queries

● # Find me all the zipcodes (polygons) for my points of interest

    SELECT zip.zcta5ce10 AS zipcode, pts.name
      FROM my_points pts
      JOIN zipcodes zip ON st_within(pts.loc, zip.shape);

● # Find me all points of interest within a zip code (polygon)

    SELECT zip.zcta5ce10 AS zipcode, pts.name
      FROM my_points pts
      JOIN zipcodes zip ON st_contains(zip.SHAPE, pts.loc)
     WHERE zip.zcta5ce10 = "73071";

Some Sample GIS Queries

● # Find me all the points of interest within a county

    SELECT zip.zcta5ce10 AS zipcode, pts.name
      FROM my_points pts
      JOIN zipcodes zip ON st_contains(zip.shape, pts.loc)
      JOIN legacy.zipcodes leg ON leg.zip = zip.zcta5ce10
     WHERE leg.county = "Cleveland";

Impact of SPATIAL KEY

● I mentioned earlier that SPATIAL keys are available for InnoDB starting in 5.7
● Here is just a quick sample of the same query, with and without a SPATIAL key on the polygon column
● Let’s grab one of the earlier queries:

    SELECT raw.zcta5ce10 AS zipcode, pts.name
      FROM my_points pts
      JOIN tl_2014_us_zcta510 raw ON st_within(pts.loc, raw.SHAPE);

Impact of SPATIAL KEY

With Key:

    +---------+-------+
    | zipcode | name  |
    +---------+-------+
    | 73071   | POI 2 |
    | 73069   | POI 3 |
    +---------+-------+
    2 rows in set (0.00 sec)

    +----------------------------+-------+
    | Variable_name              | Value |
    +----------------------------+-------+
    | Handler_read_first         | 1     |
    | Handler_read_key           | 4     |
    | Handler_read_last          | 0     |
    | Handler_read_next          | 4     |
    | Handler_read_prev          | 0     |
    | Handler_read_rnd           | 0     |
    | Handler_read_rnd_next      | 4     |
    | Handler_write              | 0     |
    +----------------------------+-------+

Without Key:

    +---------+-------+
    | zipcode | name  |
    +---------+-------+
    | 73069   | POI 3 |
    | 73071   | POI 2 |
    +---------+-------+
    2 rows in set (57.38 sec)

    +----------------------------+-------+
    | Variable_name              | Value |
    +----------------------------+-------+
    | Handler_read_first         | 2     |
    | Handler_read_key           | 2     |
    | Handler_read_last          | 0     |
    | Handler_read_next          | 0     |
    | Handler_read_prev          | 0     |
    | Handler_read_rnd           | 0     |
    | Handler_read_rnd_next      | 33149 |
    | Handler_write              | 0     |
    +----------------------------+-------+

MySQL 8 - GIS Updates

● Spatial indexes for InnoDB (added in 5.7)
● ST_Distance()
  ○ Returns geodetic distance on the ellipsoid in meters!
  ○ Other functions still need helper functions (st_length() for example), but are slated to be updated in future versions
● ST_AsText()
  ○ Returns lat/lon order based on spatial reference system
● MySQL now stores information about spatial reference systems other than SRID 0, for use with spatial data
  ○ Note - all previous slides used SRID 4326 (geographical coordinates)

Quick Demo

● Collect points along a trip (GPX format)
● Load the raw file
● Determine geodata about each point
● Calculate total distance driven per county
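The last step, total distance per county, can be sketched in Python once each trip point has been tagged with its county. This is a hypothetical illustration, not the demo's actual code: the tuple layout, the function names, and the rule that each leg is credited to the county of its starting point are all assumptions made here.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3959


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance between two lat/lng points in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2 - lng1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlng / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def miles_per_county(points):
    """Sum leg lengths per county.

    points is an ordered list of (lat, lng, county) tuples along the trip.
    Each leg is credited to the county of the leg's starting point.
    """
    totals = defaultdict(float)
    for (lat1, lng1, county), (lat2, lng2, _) in zip(points, points[1:]):
        totals[county] += haversine_miles(lat1, lng1, lat2, lng2)
    return dict(totals)


if __name__ == "__main__":
    # Made-up trip: four points one degree of latitude apart.
    trip = [(37, -133, "A"), (38, -133, "A"), (39, -133, "B"), (40, -133, "B")]
    print(miles_per_county(trip))
```

With densely sampled GPX points the crediting rule matters little; with sparse points a leg that crosses a county line would need to be split.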

Quick Demo

    mysql> explain SELECT raw.zcta5ce10 AS zipcode, raw.city, raw.county, raw.state, st_astext(pts.pt)
        ->   FROM trip_tracker.raw_trip_points pts
        ->   JOIN geopoly.geo_zips raw ON st_contains(raw.SHAPE, pts.pt)
        ->  WHERE pts.trip_id = 16\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: pts
       partitions: NULL
             type: ref
    possible_keys: loc,trip_idx
              key: trip_idx
          key_len: 4
              ref: const
             rows: 5321
         filtered: 100.00
            Extra: NULL
    *************************** 2. row ***************************
               id: 1
      select_type: SIMPLE
            table: raw
       partitions: NULL
             type: ALL
    possible_keys: SHAPE
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 26601
         filtered: 100.00
            Extra: Range checked for each record (index map: 0x2)

Quick Demo

    +----------------------------+-------+
    | Variable_name              | Value |
    +----------------------------+-------+
    | Handler_commit             | 1     |
    | Handler_delete             | 0     |
    | Handler_discover           | 0     |
    | Handler_external_lock      | 4     |
    | Handler_mrr_init           | 0     |
    | Handler_prepare            | 0     |
    | Handler_read_first         | 0     |
    | Handler_read_key           | 5323  |
    | Handler_read_last          | 0     |
    | Handler_read_next          | 14754 |
    | Handler_read_prev          | 0     |
    | Handler_read_rnd           | 0     |
    | Handler_read_rnd_next      | 0     |
    | Handler_rollback           | 0     |
    | Handler_savepoint          | 0     |
    | Handler_savepoint_rollback | 0     |
    | Handler_update             | 0     |
    | Handler_write              | 0     |
    +----------------------------+-------+

MySQL 5.7: GeoJSON and Points of Interest

    set @lat = 35.9974084;
    set @lon = -78.90455519999999;
    set @p = ST_GeomFromText(concat('POINT(', @lon, ' ', @lat, ')'), 1);

    SELECT CONCAT('{ "type": "FeatureCollection", "features": [ ',
             GROUP_CONCAT('{ "type": "Feature", "geometry": ', ST_AsGeoJSON(shape),
                          ', "properties": {"distance":', st_distance_sphere(shape, @p),
                          ', "name":"', name, '"} }'
                          order by st_distance_sphere(shape, @p)),
             ']}') as j
      FROM points_new
     WHERE st_within(shape, create_envelope(@lat, @lon, 10))
       and (other_tags like '%"amenity"=>"cafe"%'
            or other_tags like '%"amenity"=>"restaurant"%')
       and name is not null
       and st_distance_sphere(shape, @p) < 450
     limit 10;
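Because the query assembles JSON by string concatenation, a name containing a double quote would yield invalid JSON. As a hedged alternative, the same FeatureCollection can be built in application code from plain rows and serialized with a JSON library, which handles the escaping. The sketch below assumes rows of (name, lon, lat, distance in meters); the function name and the sample rows are hypothetical.

```python
import json


def feature_collection(rows):
    """Build a GeoJSON FeatureCollection string, nearest feature first.

    rows is an iterable of (name, lon, lat, distance_m) tuples.
    """
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"distance": distance_m, "name": name},
        }
        for name, lon, lat, distance_m in sorted(rows, key=lambda r: r[3])
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


if __name__ == "__main__":
    # Hypothetical rows; the second name contains double quotes on purpose.
    rows = [
        ("Far Cafe", -78.91, 36.0, 400.0),
        ('Joe\'s "Best" Diner', -78.90, 35.99, 120.5),
    ]
    print(feature_collection(rows))
```

Sorting in the function mirrors the ORDER BY inside GROUP_CONCAT, so the nearest place comes first.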

Walk to lunch from Percona office

Points of Interest: data

OpenStreetMap: http://www.openstreetmap.org/

    $ ogr2ogr -overwrite -progress -f "MySQL" MYSQL:osm,user=root north-america-latest.osm.pbf

Points of Interest: data

    mysql> select name, st_astext(shape), other_tags
        ->   from points_new
        ->  where (other_tags like '%"amenity"=>"cafe"%'
        ->         or other_tags like '%"amenity"=>"restaurant"%')
        ->    and name is not null
        ->  limit 10\G
    *************************** 1. row ***************************
                name: The Widow's Tavern and Grille
    st_astext(shape): POINT(-75.2623818 40.7551579)
          other_tags: "amenity"=>"restaurant","old_name"=>"Widow Brown's Tavern"

MySQL 5.7: Creating bounding box rectangle

    CREATE DEFINER = current_user() FUNCTION create_envelope(
      lat decimal(20, 14), lon decimal(20, 14), dist int)
    RETURNS geometry DETERMINISTIC
    begin
      declare point_text varchar(255);
      declare l varchar(255);
      declare p geometry;
      declare env geometry;
      declare rlon1 double;
      declare rlon2 double;
      declare rlat1 double;
      declare rlat2 double;

      set point_text = concat('POINT(', lon, ' ', lat, ')');
      set p = ST_GeomFromText(point_text, 1);
      set rlon1 = lon-dist/abs(cos(radians(lat))*69);
      set rlon2 = lon+dist/abs(cos(radians(lat))*69);
      set rlat1 = lat-(dist/69);
      set rlat2 = lat+(dist/69);
      set l = concat('LineString(', rlon1, ' ', rlat1, ',', rlon2 , ' ', rlat2, ')');
      set env = ST_Envelope(ST_GeomFromText(l, 1));
      return env;
    end //

MySQL Spatial Indexes in 5.7

No create_envelope:

    mysql> explain SELECT .. FROM points_new
        ->  WHERE (other_tags like '%"amenity"=>"cafe"%'
        ->         or other_tags like '%"amenity"=>"restaurant"%')
        ->    and name is not null
        ->    and st_distance_sphere(shape, @p) < 450
        ->  limit 10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: points_new
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 11368798
         filtered: 18.89
            Extra: Using where

With create_envelope:

    mysql> explain SELECT .. FROM points_new
        ->  WHERE st_within(shape, create_envelope(@lat, @lon, 10))
        ->    and (other_tags like '%"amenity"=>"cafe"%'
        ->         or other_tags like '%"amenity"=>"restaurant"%')
        ->    and name is not null
        ->    and st_distance_sphere(shape, @p) < 450
        ->  limit 10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: points_new
       partitions: NULL
             type: range
    possible_keys: SHAPE
              key: SHAPE
          key_len: 34
              ref: NULL
             rows: 665
         filtered: 18.89
            Extra: Using where

From MySQL to MongoDB

MySQL 5.7:

    SELECT osm_id, name,
           round(st_distance_sphere(shape,
                 st_geomfromtext('POINT (-78.9064543 35.9975194)', 1)), 2) as dist,
           st_astext(shape)
      FROM points_new
     WHERE st_within(shape, create_envelope(@lat, @lon, 10))
       and (other_tags like '%"amenity"=>"cafe"%'
            or other_tags like '%"amenity"=>"restaurant"%')
       and name is not null
     ORDER BY dist asc LIMIT 10;

MongoDB 3.2:

    db.runCommand( {
      geoNear: "points",
      near: { type: "Point", coordinates: [ -78.9064543, 35.9975194 ] },
      spherical: true,
      query: { name: { $exists: true, $ne: null },
               "other_tags": { $in: [ /.*amenity=>restaurant.*/, /.*amenity=>cafe.*/ ] } },
      "limit": 5,
      "maxDistance": 10000
    } )
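The geoNear command matches the MongoDB 3.2 release used in this deck; it was removed in MongoDB 4.2 in favor of the $geoNear aggregation stage. The sketch below builds a roughly equivalent pipeline as plain Python data and makes no driver call; handing it to a driver's aggregate method is left as an assumption about your setup. The function name is hypothetical, the "dis" distance field is chosen here to mirror the command's output, and a single regular expression stands in for the $in list of patterns.

```python
def nearby_food_pipeline(lon, lat, max_distance_m=10000, limit=5):
    """Return an aggregation pipeline whose first stage is $geoNear."""
    return [
        {
            "$geoNear": {
                # GeoJSON point: [longitude, latitude]
                "near": {"type": "Point", "coordinates": [lon, lat]},
                "distanceField": "dis",
                "maxDistance": max_distance_m,  # meters for GeoJSON points
                "spherical": True,
                "query": {
                    "name": {"$exists": True, "$ne": None},
                    "other_tags": {"$regex": "amenity=>(restaurant|cafe)"},
                },
            }
        },
        {"$limit": limit},
    ]


if __name__ == "__main__":
    print(nearby_food_pipeline(-78.9064543, 35.9975194))
```

$geoNear has to be the first stage of the pipeline, which is why the limit is applied as a separate stage after it.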

From MySQL to MongoDB

MySQL 5.7:

    mysql> SELECT osm_id, name, round(st_distance_sphere(shape, st_geomfromtext('POINT ...
        -> ORDER BY dist asc LIMIT 10;
    +------------+----------------------------+--------+--------------------------------------+
    | osm_id     | name                       | dist   | st_astext(shape)                     |
    +------------+----------------------------+--------+--------------------------------------+
    |  880747417 | Pop's                      | 127.16 | POINT(-78.9071795 35.998501)         |
    | 1520441350 | toast                      | 240.55 | POINT(-78.9039761 35.9967069)        |
    | 2012463902 | Pizzeria Toro              | 256.44 | POINT(-78.9036457 35.997125)         |
    |  398941519 | Parker & Otis              | 273.39 | POINT(-78.9088833 35.998997)         |
    |  881029843 | Torero's                   | 279.96 | POINT(-78.90829140000001 35.9995516) |
    |  299540833 | Fishmonger's               | 300.01 | POINT(-78.90850250000001 35.9996487) |
    | 1801595418 | Lilly's Pizza              | 319.83 | POINT(-78.9094462 35.9990732)        |
    | 1598401100 | Dame's Chicken and Waffles | 323.82 | POINT(-78.9031929 35.9962871)        |
    |  685493947 | El Rodeo                   | 379.18 | POINT(-78.909865 35.999523)          |
    |  685504784 | Piazza Italia              | 389.06 | POINT(-78.9096472 35.9998794)        |
    +------------+----------------------------+--------+--------------------------------------+
    10 rows in set (0.21 sec)

MongoDB 3.2 ("time" is in milliseconds):

    db.runCommand( { geoNear: "points", ...} )

    "stats" : {
      "nscanned" : 1728,
      "objectsLoaded" : 1139,
      "avgDistance" : 235.76379903759667,
      "maxDistance" : 280.2681226202938,
      "time" : 17
    },
    "ok" : 1

From MySQL to MongoDB

Export from MySQL 5.7:

    mysql> SELECT JSON_OBJECT('name', replace(name, '"', ''),
        ->                    'other_tags', replace(other_tags, '"', ''),
        ->                    'geometry', st_asgeojson(shape)) as j
        ->   FROM `points`
        ->   INTO OUTFILE '/var/lib/mysql-files/points.json';
    Query OK, 13660667 rows affected (4 min 1.35 sec)

From MySQL to MongoDB

Load to MongoDB (parallel):

    mongoimport --db osm --collection points -j 24 --file /var/lib/mysql-files/points.json
    2016-04-11T22:38:10.029+0000 connected to: localhost
    2016-04-11T22:38:13.026+0000 [........................] osm.points 31.8 MB/2.2 GB (1.4%)
    2016-04-11T22:38:16.026+0000 [........................] osm.points 31.8 MB/2.2 GB (1.4%)
    2016-04-11T22:38:19.026+0000 [........................] osm.points 31.8 MB/2.2 GB (1.4%)
    …
    2016-04-11T23:12:13.447+0000 [########################] osm.points 2.2 GB/2.2 GB (100.0%)
    2016-04-11T23:12:15.614+0000 imported 13660667 documents

From MySQL to MongoDB

Points JSON example (OpenStreetMap data); the "geometry" field is GeoJSON:

    {
      "name": "Wendy's",
      "geometry": { "type": "Point", "coordinates": [-83.5137825, 41.7223277] },
      "other_tags": "amenity=>fast_food"
    }

From MySQL to MongoDB

Creating index

    > use osm
    switched to db osm
    > db.points.createIndex({ geometry : "2dsphere" } )
    {
      "createdCollectionAutomatically" : false,
      "numIndexesBefore" : 1,
      "numIndexesAfter" : 2,
      "ok" : 1
    }

From MySQL to MongoDB

Now trying “lines” collection … ooops!

    > db.lines.createIndex({ geometry : "2dsphere" } )
    {
      "createdCollectionAutomatically" : false,
      "numIndexesBefore" : 1,
      "errmsg" : "exception: Can't extract geo keys: { _id: ObjectId('570308864f45f7f0d6dfbed2'), name: \"75 North\", geometry: { type: \"LineString\", coordinates: [ [ -85.808852, 41.245582 ], [ -85.808852, 41.245582 ] ] }, other_tags: \"tiger:cfcc=>A41,tiger:county=>Kosciusko, IN,tiger:name_base=>75,tiger:name_direction_suffix=>N,tiger:reviewed=>no\" } GeoJSON LineString must have at least 2 vertices: [ [ -85.808852, 41.245582 ], [ -85.808852, 41.245582 ] ]",
      "code" : 16755,
      "ok" : 0
    }

The cause: same coordinates for start and end
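One way to sidestep this error is to screen the exported file before running mongoimport. The Python sketch below assumes one JSON document per line, shaped like the "75 North" document in the error message, and drops LineStrings whose vertices are all the same point. The function names are hypothetical, and the check is a simplification: it only looks for fewer than two distinct vertices, which may not cover every geometry a 2dsphere index rejects.

```python
import json


def is_degenerate_linestring(doc):
    """True if doc's geometry is a LineString with fewer than 2 distinct vertices."""
    geometry = doc.get("geometry") or {}
    if geometry.get("type") != "LineString":
        return False
    distinct = {tuple(pt) for pt in geometry.get("coordinates", [])}
    return len(distinct) < 2


def filter_lines(json_lines):
    """Yield only the JSON lines that pass the degenerate-LineString check."""
    for line in json_lines:
        line = line.strip()
        if line and not is_degenerate_linestring(json.loads(line)):
            yield line


if __name__ == "__main__":
    bad = ('{"name": "75 North", "geometry": {"type": "LineString", '
           '"coordinates": [[-85.808852, 41.245582], [-85.808852, 41.245582]]}}')
    good = ('{"name": "hypothetical road", "geometry": {"type": "LineString", '
            '"coordinates": [[-85.8, 41.2], [-85.9, 41.3]]}}')
    for kept in filter_lines([bad, good]):
        print(kept)
```

Filtering on the way in avoids a server-side scan of the collection after the data has already been loaded.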

From MySQL to MongoDB

Removing the “bad” data is painful … and slow … but it fixed the issue

    > db.lines.remove({
        "geometry.type": "LineString",
        "geometry.coordinates": {$size:2},
        $where: "this.geometry.coordinates[0][0] == this.geometry.coordinates[1][0] && this.geometry.coordinates[0][1] == this.geometry.coordinates[1][1]"
      })
    WriteResult({ "nRemoved" : 22 })

$where is slow … and acquires a global lock

Find all cafes near our Durham, NC office:

    > db.runCommand( { geoNear: "points",
    ...   near: { type: "Point", coordinates: [ -78.9064543, 35.9975194 ] },
    ...   spherical: true,
    ...   query: { "other_tags": { $in: [ /.*amenity=>restaurant.*/, /.*amenity=>cafe.*/ ] } },
    ...   "limit": 5, "maxDistance": 10000 } )
    {
      "results" : [
        {
          "dis" : 127.30183814835166,
          "obj" : {
            "_id" : ObjectId("570329164f45f7f0d66f8f13"),
            "name" : "Pop's",
            "geometry" : { "type" : "Point", "coordinates" : [ -78.9071795, 35.998501 ] },
            "other_tags" : "addr:city=>Durham,addr:country=>US,addr:housenumber=>605,addr:street=>West Main Street,amenity=>restaurant,building=>yes"
          }
        }
        ...

Questions?

Thanks!