
Demolitions and Dali : Web Dev and Data in a Graph Database

Jan 22, 2018

Nicholas Doiron
Page 1: Demolitions and Dali : Web Dev and Data in a Graph Database
Page 2: Demolitions and Dali : Web Dev and Data in a Graph Database

• — TO —>

Page 3: Demolitions and Dali : Web Dev and Data in a Graph Database

Also

Page 4: Demolitions and Dali : Web Dev and Data in a Graph Database

OK, graph databases

• Instead of tables and SQL

• Nodes and relationships

• Specialized queries

• Not everything is a graph (and this is not sponsored)

Page 5: Demolitions and Dali : Web Dev and Data in a Graph Database

Install / Update Neo4j

• Neo4j

• http://localhost:7474

• Community Edition 3.0.3

• Python, PIP, and Py2Neo

• py2neo.__version__ = '3b1'

Page 6: Demolitions and Dali : Web Dev and Data in a Graph Database

Step 0 - installing

• Install Neo4j - neo4j.com/install

• brew on Mac

• DigitalOcean has Linux instructions

• change default password

• Trouble installing locally?

• heroku addons:add graphene

Page 7: Demolitions and Dali : Web Dev and Data in a Graph Database

Who uses graphs?

• Panama Papers

• IMDB / Six Degrees of Kevin Bacon

• Especially:

• social networks, research data, maps

• anywhere number of joins is large, indefinite, or unlimited
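
A minimal sketch of why an indefinite number of joins suits a graph: a "six degrees" question is a breadth-first walk over relationships, with no need to know the number of hops in advance. The function name and the toy edge list are hypothetical and are not from the talk.

```python
from collections import deque

def degrees_of_separation(edges, start, goal):
    """Count the fewest hops between two nodes, or return None if unconnected."""
    # Build an undirected adjacency map from (a, b) pairs.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search: the first time we reach the goal is the shortest path.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

# Made-up chain of co-stars: A worked with B, B with C, C with D.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D")]
hops = degrees_of_separation(toy_edges, "A", "D")
```

In a relational schema each extra hop would be another self-join; here the loop simply keeps going until the goal is found or the graph is exhausted.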

Page 8: Demolitions and Dali : Web Dev and Data in a Graph Database
Page 9: Demolitions and Dali : Web Dev and Data in a Graph Database

Cypher

Page 10: Demolitions and Dali : Web Dev and Data in a Graph Database

MoMA.org

• PostgreSQL sync to “The Museum System” CMS, which is outside our control

Page 11: Demolitions and Dali : Web Dev and Data in a Graph Database

Who uses MoMA.org?

• Tourists

• Researchers

• Distant art fans

• Members

Page 12: Demolitions and Dali : Web Dev and Data in a Graph Database

The trouble with tables

• Many joins to get people, titles, photos, additional relationship info

• Speed of query

• Difficult to write new queries

Page 13: Demolitions and Dali : Web Dev and Data in a Graph Database

Art Graph DB

• did Picasso collaborate with other artists in his lifetime?

• are any artists credited as painter, director, sculptor, etc.? (maybe an art EGOT)

Page 14: Demolitions and Dali : Web Dev and Data in a Graph Database

Let’s build that graph

• Artists and artworks

• Basic bio data, MoMA ID -> Artist node

• Future DB: all people connected

• Title, date, MoMA ID -> Artwork node

• ARTIST_OF relationship (include order)
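
The node and relationship plan above could be sketched as parameterized Cypher statements generated from Python. This is an illustration under assumptions, not the script from the talk: the property names (moma_id, name, title, date, order), the function name, and the sample IDs are all hypothetical. Parameters use the {name} style accepted by Neo4j 3.0; newer servers write them as $name.

```python
# Hypothetical property names; the slides only say which kinds of fields
# go on each node and that ARTIST_OF should carry the credit order.
MERGE_ARTWORK = (
    "MERGE (w:Artwork {moma_id: {artwork_id}}) "
    "SET w.title = {title}, w.date = {date}"
)
MERGE_ARTIST = (
    "MERGE (a:Artist {moma_id: {artist_id}}) "
    "SET a.name = {name}"
)
LINK_ARTIST = (
    "MATCH (a:Artist {moma_id: {artist_id}}), "
    "(w:Artwork {moma_id: {artwork_id}}) "
    "MERGE (a)-[r:ARTIST_OF]->(w) SET r.order = {order}"
)

def statements_for(artwork):
    """Return (query, params) pairs for one artwork and its credited artists."""
    out = [(MERGE_ARTWORK, {
        "artwork_id": artwork["id"],
        "title": artwork["title"],
        "date": artwork["date"],
    })]
    # enumerate keeps the order in which artists are credited.
    for order, artist in enumerate(artwork["artists"], start=1):
        out.append((MERGE_ARTIST, {
            "artist_id": artist["id"],
            "name": artist["name"],
        }))
        out.append((LINK_ARTIST, {
            "artist_id": artist["id"],
            "artwork_id": artwork["id"],
            "order": order,
        }))
    return out

# The IDs below are made up for illustration, not real MoMA IDs.
sample = {
    "id": 100,
    "title": "The Persistence of Memory",
    "date": "1931",
    "artists": [{"id": 1, "name": "Salvador Dalí"}],
}
statements = statements_for(sample)
```

Each pair could then be handed to a driver call that accepts a query string and a parameter dictionary, such as py2neo's Graph.run. Using MERGE rather than CREATE means a repeated scrape does not duplicate nodes.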

Page 15: Demolitions and Dali : Web Dev and Data in a Graph Database

Let’s build that graph

• git clone https://github.com/mapmeld/graph

• Building a scraper for MoMA

Page 16: Demolitions and Dali : Web Dev and Data in a Graph Database

Demolitions and Dalí in a Graph Database

Nick Doiron - @mapmeld

Page 17: Demolitions and Dali : Web Dev and Data in a Graph Database

Cypher

Page 18: Demolitions and Dali : Web Dev and Data in a Graph Database

Cypher

Page 19: Demolitions and Dali : Web Dev and Data in a Graph Database
Page 20: Demolitions and Dali : Web Dev and Data in a Graph Database

On to OSM

Page 21: Demolitions and Dali : Web Dev and Data in a Graph Database

If you’re interested

• Google: MapZen Extracts

• download a city

• for this script, download the OSM XML file

• if you like PostGIS, there is a download (no import script)

Page 22: Demolitions and Dali : Web Dev and Data in a Graph Database

Benefits of OSM

• Open to use / full data

• Open to edit / choose tags

• HOT community

• Civil e-mail lists (Crimea)

Page 23: Demolitions and Dali : Web Dev and Data in a Graph Database

Benefits of OSM

Page 24: Demolitions and Dali : Web Dev and Data in a Graph Database

Google on OSM

• "Our maps represent what you or I need to do on a day-to-day basis in the developed part of the world"

• — Google Maps Geospatial Technologist (quoted in FastCompany)

Page 25: Demolitions and Dali : Web Dev and Data in a Graph Database

In Haiti and worldwide

Page 26: Demolitions and Dali : Web Dev and Data in a Graph Database

In Haiti and worldwide

Page 27: Demolitions and Dali : Web Dev and Data in a Graph Database

XML data

Page 28: Demolitions and Dali : Web Dev and Data in a Graph Database

XML data

• Nodes, ways, and relations

• Ways made up of multiple nodes

• Relations contain nodes and ways

• Practically:

• Multiple ways connect / combine

• Tags are a community construct

Page 29: Demolitions and Dali : Web Dev and Data in a Graph Database

Smart Renderer

• When is a <way> a line (cul-de-sac) or a polygon (river, lake, parking lot)?

• Has to support world’s fonts

• Tag for real life, not for the renderer
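
A rough sketch of the line-or-polygon decision, assuming the common OSM convention that a closed way (first node equals last node) is an area unless its tags mark it as a linear feature. The function name and the short list of "linear" tag keys are assumptions for illustration; real renderers use much longer rule sets.

```python
# Tag keys that usually mean "linear feature" even when the way is closed.
# This short list is an assumption for the sketch, not a complete rule set.
LINEAR_KEYS = ("highway", "barrier")

def way_geometry(node_ids, tags):
    """Guess whether a way should be drawn as a 'line' or a 'polygon'."""
    closed = len(node_ids) >= 4 and node_ids[0] == node_ids[-1]
    if not closed:
        return "line"
    # An explicit area tag overrides the other heuristics.
    if tags.get("area") == "yes":
        return "polygon"
    if tags.get("area") == "no":
        return "line"
    # A looped road (cul-de-sac, roundabout) is still a line.
    if any(key in tags for key in LINEAR_KEYS):
        return "line"
    return "polygon"

loop_road = way_geometry(["1", "2", "3", "1"], {"highway": "residential"})
parking = way_geometry(["1", "2", "3", "1"], {"amenity": "parking"})
```

The renderer has to carry this logic because mappers are asked to tag what exists on the ground rather than how it should be drawn.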

Page 30: Demolitions and Dali : Web Dev and Data in a Graph Database

Building graph data

• Script adds all roads to Neo4j

• Includes an array of node ids (can mix content types, similar to a document database)

• If two ways share a node with the same ID, link them both ways <—>
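
The steps above can be sketched with the standard library alone. This is not the script from the repository: the function names and the tiny inline OSM sample are hypothetical, and the sketch stops at computing the links rather than writing them to Neo4j.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import combinations

# Made-up OSM extract: two roads sharing node 12, plus a building to skip.
SAMPLE_OSM = """<osm>
  <way id="1">
    <nd ref="10"/><nd ref="11"/><nd ref="12"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="First St"/>
  </way>
  <way id="2">
    <nd ref="12"/><nd ref="13"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Second St"/>
  </way>
  <way id="3">
    <nd ref="20"/><nd ref="21"/><nd ref="20"/>
    <tag k="building" v="yes"/>
  </way>
</osm>"""

def read_roads(xml_text):
    """Collect every way tagged as a highway, with its array of node ids."""
    roads = {}
    for way in ET.fromstring(xml_text).iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        if "highway" not in tags:
            continue
        roads[way.get("id")] = {
            "name": tags.get("name"),
            "node_ids": [nd.get("ref") for nd in way.findall("nd")],
        }
    return roads

def shared_node_links(roads):
    """Return (way, way) pairs in both directions for ways sharing a node."""
    ways_by_node = defaultdict(set)
    for way_id, road in roads.items():
        for node_id in road["node_ids"]:
            ways_by_node[node_id].add(way_id)
    links = set()
    for way_ids in ways_by_node.values():
        for a, b in combinations(sorted(way_ids), 2):
            links.add((a, b))
            links.add((b, a))
    return links

roads = read_roads(SAMPLE_OSM)
links = shared_node_links(roads)
```

Indexing ways by node id first avoids comparing every pair of roads directly, which matters for a city-sized extract.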

Page 31: Demolitions and Dali : Web Dev and Data in a Graph Database

Cypher + OSM

* you can put an index on schema fields now

Page 32: Demolitions and Dali : Web Dev and Data in a Graph Database

Problem

Page 33: Demolitions and Dali : Web Dev and Data in a Graph Database

Google Prediction API

• Prediction based on a CSV

• Categorization or numerical

• Google generates a model and estimates accuracy

• Not allowed in Myanmar

Page 34: Demolitions and Dali : Web Dev and Data in a Graph Database

Predicting Houses

• Format 60,000+ rows of database export

• Choose categories to predict 2-3 years

• Competing models determine how important each column is

• Can it parse dates? Find patterns

• Edging up to ~74 percent accuracy

Page 35: Demolitions and Dali : Web Dev and Data in a Graph Database

Network effect

• Adding network of streets

• Now tokens include not just my street and neighbors, but shared streets
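
One way the street network could feed the prediction rows, sketched under assumptions: each house's text tokens are extended with the names of streets linked to its own street. The function name, the link format, and the street names are hypothetical; the slides do not show how the tokens were actually built.

```python
def street_tokens(street, links):
    """Return the house's own street followed by every street linked to it."""
    # links is a set of (from_street, to_street) pairs, stored in both directions.
    connected = sorted({b for a, b in links if a == street})
    return [street] + connected

# Made-up street links for illustration.
street_links = {
    ("First St", "Second St"), ("Second St", "First St"),
    ("Second St", "Third St"), ("Third St", "Second St"),
}
row_tokens = street_tokens("Second St", street_links)
```

A house on Second St would then share tokens with houses on First St and Third St, which gives the model a way to pick up patterns that spread along connected streets.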

Page 36: Demolitions and Dali : Web Dev and Data in a Graph Database

Network effect

• Most demolitions have one house on their street demolished (it’s them)

Page 37: Demolitions and Dali : Web Dev and Data in a Graph Database

Network effect

Page 38: Demolitions and Dali : Web Dev and Data in a Graph Database

Network effect

• Google Prediction API reported 81% accuracy

• But is it good?

• Early optimization studies moved fire stations and left neighborhoods vulnerable

• City can’t maintain it… hasn’t continued to open their data

Page 39: Demolitions and Dali : Web Dev and Data in a Graph Database

Looking forward

• Ideas for graph databases?

• Ways to release large graph data - as an API? As JSON files? As Neo4j dump?

• Ideas for statisticians / future research?

Page 40: Demolitions and Dali : Web Dev and Data in a Graph Database

Demolitions and Dalí in a Graph Database

Nick Doiron - @mapmeld