Pierre De Wilde 4 May 2012 Global Brain Institute VUB - ECCO Group A Walk in Graph Databases
A walk in graph databases v1.0

May 08, 2015

Page 1: A walk in graph databases v1.0

Pierre De Wilde
4 May 2012
Global Brain Institute
VUB - ECCO Group

A Walk in Graph Databases

Page 2: A walk in graph databases v1.0

The Law of the Hammer

If the only tool you have is a hammer, everything looks like a nail.

Abraham Maslow - The Psychology of Science - 1966

Page 3: A walk in graph databases v1.0

The Law of the Relational Database

If the only tool you have is a relational database, everything looks like a table.

A Walk in Graph Databases - 2012

Page 4: A walk in graph databases v1.0

One size doesn't fit all

Scalability issue
- Scale up
- Scale out

Index-intensive issue
- Find data
- Join data

Page 5: A walk in graph databases v1.0

NoSQL ?! No SQL ? Not only SQL !

Scalability solutions
- Key-value stores
- Column databases
- Document databases

Index-intensive solution
- Graph databases

Page 6: A walk in graph databases v1.0

Query language for relational databases

SQL

ISUD or CRUD

Page 7: A walk in graph databases v1.0

Query language for graph databases

Gremlin is a graph traversal language

Traversal graph

Page 8: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 9: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 10: A walk in graph databases v1.0

Graph

G = (V, E)

--. .-. .- .--. ....

Page 11: A walk in graph databases v1.0

One graph doesn't fit all

Marko A. Rodriguez and Peter Neubauer - Constructions from Dots and Lines - 2010

Page 12: A walk in graph databases v1.0

Property graph

A property graph is a directed, labeled, attributed multigraph.

Page 13: A walk in graph databases v1.0

Anatomy of a vertex

A vertex is composed of
- a unique identifier (id)
- a collection of properties
- a set of incoming edges (inE)
- a set of outgoing edges (outE)

Page 14: A walk in graph databases v1.0

Anatomy of an edge

An edge is composed of
- a unique identifier (id)
- an outgoing vertex (outV)
- a label
- an incoming vertex (inV)
- a collection of properties
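The two anatomies above can be sketched as plain data structures. This is a minimal illustration in Python rather than Gremlin; the class names, field names and the `add_edge` helper are assumptions made for the example, not part of any graph database API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(eq=False)  # compare by identity; vertices and edges reference each other
class Vertex:
    id: Any
    properties: Dict[str, Any] = field(default_factory=dict)
    in_edges: List["Edge"] = field(default_factory=list)    # inE
    out_edges: List["Edge"] = field(default_factory=list)   # outE


@dataclass(eq=False)
class Edge:
    id: Any
    out_v: Vertex    # outV: the vertex the edge leaves
    label: str
    in_v: Vertex     # inV: the vertex the edge enters
    properties: Dict[str, Any] = field(default_factory=dict)


def add_edge(edge_id, out_v, label, in_v, **properties):
    """Hypothetical helper: create an edge and register it on both endpoints."""
    edge = Edge(edge_id, out_v, label, in_v, dict(properties))
    out_v.out_edges.append(edge)
    in_v.in_edges.append(edge)
    return edge


# Values taken from the console sessions later in the deck.
marko = Vertex(1, {"name": "marko", "age": 29})
josh = Vertex(4, {"name": "josh", "age": 32})
knows = add_edge(8, marko, "knows", josh, weight=1.0)
```

Directed, labeled and attributed each map to one field of `Edge`; "multi" means nothing prevents a second edge between the same two vertices.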

Page 15: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 16: A walk in graph databases v1.0

Graph database

Page 17: A walk in graph databases v1.0

Key feature of a graph database

Index-free adjacency
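Index-free adjacency is usually taken to mean that every vertex holds direct references to its neighbours, so a hop costs a pointer dereference instead of a lookup in a global index. The sketch below is a plain Python illustration under that reading; the `Node` class is hypothetical and the edge list mirrors the example graph used later in the deck.

```python
class Node:
    """Hypothetical vertex that stores direct references to its neighbours."""

    def __init__(self, node_id):
        self.id = node_id
        self.out = []  # adjacent Node objects, not ids to be looked up


# The dictionary is only used to wire the graph together and to find a start.
nodes = {i: Node(i) for i in (1, 2, 3, 4, 5)}
for tail, head in [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3)]:
    nodes[tail].out.append(nodes[head])

start = nodes[1]  # one lookup to enter the graph
# Every further hop follows object references; no index is consulted.
two_hops = [w.id for v in start.out for w in v.out]  # [5, 3]
```

Under this reading, the cost of a hop depends on the number of edges attached to the current vertex, not on the total size of the graph.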

Page 18: A walk in graph databases v1.0

Some graph database vendors

Neo4j from Neo Technology
http://neo4j.org/

OrientDB from Orient Technologies
http://www.orientdb.org/

Dex from Sparsity-Technologies
http://www.sparsity-technologies.com/dex

InfiniteGraph from Objectivity, Inc.
http://www.infinitegraph.com/

Page 19: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 20: A walk in graph databases v1.0

TinkerPop

Open source project in the graph space

Page 21: A walk in graph databases v1.0

TinkerPop family

https://github.com/tinkerpop

Page 22: A walk in graph databases v1.0

Gremlin

Gremlin is a graph traversal language

$ gremlin.sh

         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
gremlin>

Page 23: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 24: A walk in graph databases v1.0

Connect to a graph database

gremlin> g = new TinkerGraph(name)
gremlin> g = new Neo4jGraph(name)
gremlin> g = new OrientGraph(name)
gremlin> g = new DexGraph(name)
gremlin> g = new IGGraph(name)

Page 25: A walk in graph databases v1.0

Add a vertex / an edge

gremlin> v1 = g.addVertex()
gremlin> v2 = g.addVertex()
...
gremlin> g.addEdge(v1, 'knows', v2)
...
gremlin> g.loadGraphML(url)

Page 26: A walk in graph databases v1.0

Update a vertex

gremlin> v = g.getVertex(1)
==>v[1]
gremlin> v.getPropertyKeys()
==>age
==>name
gremlin> v.getProperty('name')
==>marko
gremlin> v.getProperty('age')
==>29
gremlin> v.setProperty('age',32)
==>32
gremlin> v.age
==>32
gremlin> v.name
==>marko

Page 27: A walk in graph databases v1.0

Update an edge

gremlin> e = g.getEdge(8)
==>e[8][1-knows->4]
gremlin> e.getPropertyKeys()
==>weight
gremlin> e.getProperty('weight')
==>1.0
gremlin> e.setProperty('weigth',0.9)
==>0.9
gremlin> e.map()
==>weigth=0.9
==>weight=1.0
gremlin> e.removeProperty('weigth')
==>0.9

Page 28: A walk in graph databases v1.0

Remove a vertex

gremlin> v = g.getVertex(3)
==>v[3]
gremlin> g.removeVertex(v)
==>null

Page 29: A walk in graph databases v1.0

Remove an edge

gremlin> e = g.getEdge(10)
==>e[10][4-created->5]
gremlin> g.removeEdge(e)
==>null
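The add, update and remove operations shown on the last slides can be mimicked with a small in-memory store. This is a sketch in plain Python, not the Blueprints or Gremlin API: `TinyGraph` and its method names are hypothetical, and ids are simply handed out from a counter. It also makes one behaviour explicit that the slides leave implicit: removing a vertex has to remove its incident edges, otherwise they would dangle.

```python
class TinyGraph:
    """Hypothetical in-memory property graph used only for illustration."""

    def __init__(self):
        self.vertices = {}   # id -> property dict
        self.edges = {}      # id -> (out id, label, in id, property dict)
        self._next_id = 1

    def _new_id(self):
        new_id = self._next_id
        self._next_id += 1
        return new_id

    def add_vertex(self, **props):
        vid = self._new_id()
        self.vertices[vid] = dict(props)
        return vid

    def add_edge(self, out_id, label, in_id, **props):
        if out_id not in self.vertices or in_id not in self.vertices:
            raise KeyError("both endpoints must exist")
        eid = self._new_id()
        self.edges[eid] = (out_id, label, in_id, dict(props))
        return eid

    def remove_edge(self, eid):
        del self.edges[eid]

    def remove_vertex(self, vid):
        # Drop incident edges first so that no edge is left dangling.
        for eid, (out_id, _label, in_id, _props) in list(self.edges.items()):
            if vid in (out_id, in_id):
                del self.edges[eid]
        del self.vertices[vid]


g = TinyGraph()
v1 = g.add_vertex(name="marko", age=29)
v2 = g.add_vertex(name="vadas", age=27)
e = g.add_edge(v1, "knows", v2, weight=0.5)
g.vertices[v1]["age"] = 32   # update a property
g.remove_vertex(v2)          # also removes the 'knows' edge
```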

Page 30: A walk in graph databases v1.0

Disconnect from the graph database

gremlin> g.shutdown()

Page 31: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 32: A walk in graph databases v1.0

Graph traversal

Jump
- from vertex to edge
- from edge to vertex
- from vertex to vertex

Page 33: A walk in graph databases v1.0

Graph traversal: starting the traversal

gremlin> g.v(1)
==>v[1]

Page 34: A walk in graph databases v1.0

Graph traversal: outgoing edges

gremlin> g.v(1).outE
==>e[7][1-knows->2]
==>e[9][1-created->3]
==>e[8][1-knows->4]

Page 35: A walk in graph databases v1.0

Graph traversal: incoming vertices

gremlin> g.v(1).outE.inV
==>v[2]
==>v[4]
==>v[3]

Page 36: A walk in graph databases v1.0

Graph traversal: outgoing edges (cont.)

gremlin> g.v(1).outE.inV.outE
==>e[10][4-created->5]
==>e[11][4-created->3]

Page 37: A walk in graph databases v1.0

Graph traversal: incoming vertices (cont.)

gremlin> g.v(1).outE.inV.outE.inV
==>v[5]
==>v[3]

Page 38: A walk in graph databases v1.0

Graph traversal: ending the traversal

gremlin> g.v(1).outE.inV.outE.inV.outE
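The walk above alternates two jumps: vertex to edge (outE) and edge to vertex (inV). A rough Python equivalent, using only the five edges that appear in the console output, might look as follows. The function names `out_e` and `in_v` are made up for this sketch, and the result order follows the edge list rather than the database's iteration order.

```python
# Edges of the example graph as (id, out vertex, label, in vertex),
# copied from the console output on the slides.
EDGES = [
    (7, 1, "knows", 2),
    (8, 1, "knows", 4),
    (9, 1, "created", 3),
    (10, 4, "created", 5),
    (11, 4, "created", 3),
]


def out_e(vertex_ids):
    """Jump from vertices to their outgoing edges (the role of outE)."""
    return [e for v in vertex_ids for e in EDGES if e[1] == v]


def in_v(edges):
    """Jump from edges to the vertices they point at (the role of inV)."""
    return [e[3] for e in edges]


step1 = in_v(out_e([1]))     # [2, 4, 3]
step2 = in_v(out_e(step1))   # [5, 3]
step3 = out_e(step2)         # []: no outgoing edges left, so the walk ends
```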

Page 39: A walk in graph databases v1.0

Graph traversal: starting vertex

gremlin> g.v(1)
==>v[1]

Page 40: A walk in graph databases v1.0

Graph traversal: adjacent vertices

gremlin> g.v(1).out
==>v[2]
==>v[4]
==>v[3]

Page 41: A walk in graph databases v1.0

Graph traversal: adjacent vertices (cont.)

gremlin> g.v(1).out.out
==>v[5]
==>v[3]

Page 42: A walk in graph databases v1.0

Graph traversal: starting vertex

gremlin> g.v(1)
==>v[1]

Page 43: A walk in graph databases v1.0

Graph traversal: labeled outgoing edges

gremlin> g.v(1).outE('created')
==>e[9][1-created->3]

Page 44: A walk in graph databases v1.0

Graph traversal: labeled adjacent vertices

gremlin> g.v(1).outE('created').inV
==>v[3]
gremlin> g.v(1).out('created')
==>v[3]

Page 45: A walk in graph databases v1.0

Graph traversal: labeled adjacent (cont.)

gremlin> g.v(1).out('created').in('created')
==>v[1]
==>v[4]
==>v[6]
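The out-then-in pattern answers "who else created what vertex 1 created?". A Python sketch of the same two labeled jumps is shown below. The helper names are hypothetical; the edge from vertex 6 to vertex 3 is implied by the `v[6]` result above, and its id 12 is an assumption based on the standard TinkerPop example graph.

```python
# (id, out vertex, label, in vertex); id 12 is assumed, see the note above.
EDGES = [
    (7, 1, "knows", 2),
    (8, 1, "knows", 4),
    (9, 1, "created", 3),
    (10, 4, "created", 5),
    (11, 4, "created", 3),
    (12, 6, "created", 3),
]


def out(vertex_ids, label):
    """Follow outgoing edges with the given label to the adjacent vertices."""
    return [head for v in vertex_ids
            for (_id, tail, lbl, head) in EDGES
            if tail == v and lbl == label]


def in_(vertex_ids, label):
    """Follow incoming edges with the given label back to their sources."""
    return [tail for v in vertex_ids
            for (_id, tail, lbl, head) in EDGES
            if head == v and lbl == label]


projects = out([1], "created")        # [3]
creators = in_(projects, "created")   # [1, 4, 6]
```

The starting vertex shows up among the results because nothing filters it out; that matches the `v[1]` in the console output.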

Page 46: A walk in graph databases v1.0

Graph traversal and ...

- index
- transform
- filter
- compute
- manipulate
- loop
- path

Page 47: A walk in graph databases v1.0

Graph traversal and index

gremlin> g.idx('vertices')[[name:'marko']]
==>v[1]
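An index lookup like the one above is typically how a traversal finds its starting vertex when the id is not known; the hops after that follow adjacency. The sketch below shows the idea with a plain Python dictionary keyed by (property, value). The structure is an assumption for illustration and says nothing about how any particular graph database implements its indices.

```python
from collections import defaultdict

# Vertex properties as seen in the console sessions on the slides.
vertices = {
    1: {"name": "marko", "age": 29},
    2: {"name": "vadas", "age": 27},
    4: {"name": "josh", "age": 32},
}

# Build a (key, value) -> [vertex ids] index once.
index = defaultdict(list)
for vid, props in vertices.items():
    for key, value in props.items():
        index[(key, value)].append(vid)

start = index[("name", "marko")]   # [1]
```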

Page 48: A walk in graph databases v1.0

Graph traversal and transform

gremlin> g.v(1).outE.label.dedup
==>knows
==>created
gremlin> g.v(1).out('knows').name
==>vadas
==>josh

Page 49: A walk in graph databases v1.0

Graph traversal and filter

gremlin> g.v(1).out('knows').age
==>27
==>32
gremlin> g.v(1).out('knows').filter{it.age>30}.age
==>32

Page 50: A walk in graph databases v1.0

Graph traversal and compute

gremlin> g.v(1).outE.weight
==>0.5
==>1.0
==>0.4
gremlin> g.v(1).outE.weight.mean()
==>0.6333333353201548
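The filter and compute steps correspond to an ordinary list filter and an average. The Python sketch below also offers a possible explanation for the trailing digits of the mean on the slide: if the weights are stored as 32-bit floats (an assumption about the example graph), 0.4 is not represented exactly and the error surfaces once the values are averaged in double precision. The helper `as_float32` is hypothetical.

```python
import struct
from statistics import mean


def as_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# filter: keep only the acquaintances older than 30
known_ages = {"vadas": 27, "josh": 32}
over_30 = [age for age in known_ages.values() if age > 30]   # [32]

# compute: average the edge weights
weights = [0.5, 1.0, 0.4]
exact = mean(weights)                           # about 0.63333333333
single = mean(as_float32(w) for w in weights)   # about 0.63333333532, as on the slide
```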

Page 51: A walk in graph databases v1.0

Graph traversal and manipulate

gremlin> g.v(1).outE.sideEffect{it.weight+=0.1}.weight
==>0.6
==>1.1
==>0.5

Page 52: A walk in graph databases v1.0

Graph traversal and loop

gremlin> g.v(1).out.loop(1){it.loops<3}
==>v[5]
==>v[3]
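The result above is the same as for `g.v(1).out.out`, so the loop repeats the `out` step until the condition fails. The Python sketch below reproduces that behaviour on the example adjacency. It is a simplification: it advances the whole frontier at once instead of object by object, and it assumes the loop counter reads 2 the first time the condition is tested, a value chosen so that the sketch matches the slide's output.

```python
OUT = {1: [2, 4, 3], 4: [5, 3]}   # outgoing adjacency taken from the slides


def out(vertex_ids):
    """One 'out' step for a whole list of vertices."""
    return [w for v in vertex_ids for w in OUT.get(v, [])]


def out_loop(start_ids, while_condition):
    current = out(start_ids)   # the looped step runs once before the test
    loops = 2                  # assumption: counter value at the first test
    while current and while_condition(loops):
        current = out(current)
        loops += 1
    return current


result = out_loop([1], lambda loops: loops < 3)   # [5, 3]
```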

Page 53: A walk in graph databases v1.0

Graph traversal and path

gremlin> g.v(1).outE.inV.path
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
==>[v[1], e[9][1-created->3], v[3]]
gremlin> g.v(1).out.path
==>[v[1], v[2]]
==>[v[1], v[4]]
==>[v[1], v[3]]
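Recording a path amounts to carrying the history of each traverser along with its current position. A small Python sketch of the vertex-to-vertex case, with a hypothetical helper name and the adjacency from the slides:

```python
OUT = {1: [2, 4, 3], 4: [5, 3]}   # outgoing adjacency taken from the slides


def out_with_path(paths):
    """Extend every path by one 'out' step from its last vertex."""
    return [path + (w,) for path in paths for w in OUT.get(path[-1], [])]


one_hop = out_with_path([(1,)])      # [(1, 2), (1, 4), (1, 3)]
two_hops = out_with_path(one_hop)    # [(1, 4, 5), (1, 4, 3)]
```

Paths that reach a vertex without outgoing edges, such as (1, 2), simply drop out at the next step.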

Page 54: A walk in graph databases v1.0

Global traversal: in-degree distribution

gremlin> m=[:].withDefault{0}; g.V.sideEffect{m[it.in.count()]+=1}.iterate(); m.sort()
==>0=2
==>1=3
==>3=1
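The one-liner above counts, for every vertex, how many edges arrive at it, and then counts how many vertices share each in-degree. The same computation in plain Python is sketched below; the six vertices and the edge from 6 to 3 are inferred from the earlier console output rather than listed explicitly on a slide.

```python
from collections import Counter

VERTICES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]   # (out vertex, in vertex)

# In-degree per vertex, starting from zero so isolated targets are counted too.
in_degree = {v: 0 for v in VERTICES}
for _tail, head in EDGES:
    in_degree[head] += 1

# Distribution: in-degree -> number of vertices having it, sorted by in-degree.
distribution = dict(sorted(Counter(in_degree.values()).items()))
# {0: 2, 1: 3, 3: 1}
```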

Page 55: A walk in graph databases v1.0

Walk is ending

Gremlin is a flexible graph traversal language

Page 56: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 57: A walk in graph databases v1.0

Linked Data

http://www.w3.org/DesignIssues/LinkedData.html

Page 58: A walk in graph databases v1.0

Linked Data cloud

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 59: A walk in graph databases v1.0

Linked Data initiative

http://freeyourmetadata.org/

Page 60: A walk in graph databases v1.0

Linked Data and Gremlin

gremlin> g = new SparqlRepositorySailGraph("http://dbpedia.org/sparql")
gremlin> v = g.v('http://dbpedia.org/resource/Global_brain')
==>v[http://dbpedia.org/resource/Global_brain]
gremlin> v.out('http://www.w3.org/2000/01/rdf-schema#comment').has('lang','en').value
==>The Global Brain is a metaphor for the worldwide intelligent network...
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV
==>v[http://dbpedia.org/resource/Francis_Heylighen]
gremlin> v.inE('http://dbpedia.org/ontology/knownFor').outV.outE('http://dbpedia.org/ontology/knownFor').inV
==>v[http://dbpedia.org/resource/Self-organization]
==>v[http://dbpedia.org/resource/Memetics]
==>v[http://dbpedia.org/resource/Global_brain]

Page 61: A walk in graph databases v1.0

Map of walk

Warm up
- Property Graph
- Graph Database

Walk with Gremlin
- Graph Manipulation
- Graph Traversal

Stretching
- Linked Data
- Global Graph

Page 62: A walk in graph databases v1.0

Graph and Brain

Page 63: A walk in graph databases v1.0

Global Graph

Internet => net of computers
World Wide Web => web of documents
Giant Global Graph => graph of metadata

I called this graph the Semantic Web, but maybe it should have been Giant Global Graph.

Tim Berners-Lee - timbl's blog - 2007

Page 64: A walk in graph databases v1.0

Thank you

Page 65: A walk in graph databases v1.0

Logos created by Ketrina Yim for TinkerPop geeks
Images created by Flickr Creative Commons Artists
Graphs created by Memotive Concept Mapping tool

http://tinkerpop.com