Page 1: Understanding Graph Databases with Neo4j and Cypher

Understanding Graph Databases with Neo4j and Cypher

Group Members

S.S. Niranga MS-14901836

Nipuna Pannala MS-14902208

Ruhaim Izmeth MS-14901218

Page 2: Understanding Graph Databases with Neo4j and Cypher

Trends in Data

Data is getting bigger: “Every 2 days we create as much information as we did up to 2003” (Eric Schmidt, Google)

Page 3: Understanding Graph Databases with Neo4j and Cypher

The History of Graph Theory

● 1736: Leonhard Euler writes a paper on the “Seven Bridges of Königsberg”

● 1845: Gustav Kirchhoff publishes his electrical circuit laws

● 1852: Francis Guthrie poses the “Four Color Problem”

● 1878: Sylvester publishes an article in Nature magazine that describes graphs

● 1936: Dénes Kőnig publishes a textbook on Graph Theory

● 1941: Ramsey and Turán define Extremal Graph Theory

● 1959: De Bruijn publishes a paper summarizing Enumerative Graph Theory

● 1959: Erdős, Rényi and Gilbert define Random Graph Theory

● 1969: Heinrich Heesch publishes a computer-aided method for attacking the “Four Color” problem (the proof by Appel and Haken follows in 1976)

● 2003: Commercial Graph Database products start appearing on the market

Page 4: Understanding Graph Databases with Neo4j and Cypher

What is a Graph Database?

“A traditional relational database may tell you the average age of everyone in this room, but a graph database will tell you who is most likely to buy you a beer!”

Page 5: Understanding Graph Databases with Neo4j and Cypher
Page 6: Understanding Graph Databases with Neo4j and Cypher

What does a Graph database look like?

Page 7: Understanding Graph Databases with Neo4j and Cypher

What is a Graph Database?

● A database with an explicit graph structure

● Each node knows its adjacent nodes

● As the number of nodes increases, the cost of a local step (or hop) remains the same

● Plus an Index for lookups
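
The bullets above can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical in-memory `Node` class and a plain dictionary standing in for the lookup index; it is not Neo4j's storage format, and the names Alice, Bob and Carol are made up.

```python
class Node:
    """A node that holds direct references to its adjacent nodes."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references, so a hop needs no global scan

    def connect(self, other):
        # Store the link on both ends so each node knows its adjacent nodes.
        self.neighbors.append(other)
        other.neighbors.append(self)


# The index is used only to find a starting node by name.
index = {}


def add_node(name):
    node = Node(name)
    index[name] = node
    return node


alice, bob, carol = add_node("Alice"), add_node("Bob"), add_node("Carol")
alice.connect(bob)
bob.connect(carol)

# Adding many unrelated nodes does not change the work done by one hop from Bob.
for i in range(10_000):
    add_node(f"extra-{i}")

start = index["Bob"]                      # index lookup
hop = [n.name for n in start.neighbors]   # local step: touches only Bob's links
print(hop)  # ['Alice', 'Carol']
```

The hop reads only the two references stored on Bob, however many other nodes exist.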

Page 8: Understanding Graph Databases with Neo4j and Cypher

Compared to Relational Databases

Relational databases: optimized for aggregation

Graph databases: optimized for connections

Page 9: Understanding Graph Databases with Neo4j and Cypher

Complexity Vs Size

Page 10: Understanding Graph Databases with Neo4j and Cypher

What to Choose?

http://db-engines.com/en/ranking/graph+dbms

Page 11: Understanding Graph Databases with Neo4j and Cypher

What is Neo4j?

● Neo4j is an open-source graph database, implemented in Java.

● Neo4j version 1.0 was released in February 2010.

● Neo4j version 2.0 was released in December 2013.

● Neo4j was developed by Neo Technology, Inc.

● The Neo Technology board of directors consists of Rod Johnson (founder of the Spring Framework), Magnus Christerson (Vice President of Intentional Software Corp), Nikolaj Nyholm (CEO of Polar Rose), Sami Ahvenniemi (Partner at Conor Venture Partners) and Johan Svensson (CTO of Neo Technology).

Page 12: Understanding Graph Databases with Neo4j and Cypher

Entities in Graph DBs (Neo4j)

● Nodes

● Relationships

● Properties

● Labels

● Paths

● Traversal

● Schema (index and constraints)
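
How these entities fit together can be sketched with two small Python classes. This is a simplified model, assuming hypothetical `Node` and `Relationship` dataclasses; it is not Neo4j's Java API. The sample names and the purchase date are taken from the demo data later in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # e.g. {"Person", "Author"}
    properties: dict = field(default_factory=dict)   # key/value pairs


@dataclass
class Relationship:
    type: str                                        # e.g. "WROTE"
    start: Node                                      # relationships are directed
    end: Node
    properties: dict = field(default_factory=dict)


author = Node({"Person", "Author"}, {"name": "John Le Carre"})
book = Node({"Book"}, {"title": "Tinker, Tailor, Soldier, Spy"})
reader = Node({"Person"}, {"name": "Ian"})

wrote = Relationship("WROTE", author, book)
purchased = Relationship("PURCHASED", reader, book, {"date": "03-02-2011"})

# A path is an alternating sequence of nodes and relationships.
path = [author, wrote, book, purchased, reader]

# A traversal walks the path; here it collects the relationship types.
path_types = [step.type for step in path if isinstance(step, Relationship)]
print(path_types)  # ['WROTE', 'PURCHASED']
```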

Page 13: Understanding Graph Databases with Neo4j and Cypher

Neo4j Properties

Ex.

Page 14: Understanding Graph Databases with Neo4j and Cypher

Neo4j Labels

Ex.

Page 15: Understanding Graph Databases with Neo4j and Cypher

Neo4j Nodes

Ex.

Page 16: Understanding Graph Databases with Neo4j and Cypher

Neo4j Relationships

Ex.

Page 17: Understanding Graph Databases with Neo4j and Cypher

Neo4j Paths

Ex.

Page 18: Understanding Graph Databases with Neo4j and Cypher
Page 19: Understanding Graph Databases with Neo4j and Cypher

Introducing - Cypher

Query Language for Neo4j

Page 20: Understanding Graph Databases with Neo4j and Cypher
Page 21: Understanding Graph Databases with Neo4j and Cypher

Relational Schema

Person (p_id, p_name, p_type)
Wrote (p_id, b_id)
Book (b_id, b_title)
Purchased (p_id, b_id, pur_date)

Page 22: Understanding Graph Databases with Neo4j and Cypher

Cypher - Few Keywords

General Clauses
● Return
● Order by
● Limit

Writing Clauses
● Create
● Merge
● Set
● Delete
● Remove

Reading Clauses
● Match
● Optional Match
● Where
● Aggregation

Functions
● Predicates
● Scalar functions
● Collection functions
● Mathematical functions
● String functions

See Full list at Cypher RefCard
http://neo4j.com/docs/stable/cypher-refcard/

Page 23: Understanding Graph Databases with Neo4j and Cypher

Cypher Demo

http://console.neo4j.org/

or

if Neo4j is locally installed

http://localhost:7474

Page 24: Understanding Graph Databases with Neo4j and Cypher

Cypher

Creating nodes

CREATE (:Person)

CREATE (:Person { name:"John Le Carre" })

CREATE ({ name:"John Le Carre" })

CREATE (:Person:Author { name:"John Le Carre" })

CREATE (:Person:Author { name:"Graham Greene" }),

(:Book { title:"Tinker, Tailor, Soldier, Spy" }),

(:Book { title:"Our Man in Havana" }),

(:Person { name:"Ian" }),

(:Person { name:"Alan" })

Page 25: Understanding Graph Databases with Neo4j and Cypher

Cypher

Modifying nodes

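// Add or update properties, keeping the existing ones
MATCH (p:Person { namme:"Alan" })
SET p += { name2:"Alan2" }

// Set a single property
MATCH (p:Person { namme:"Alan" })
SET p.name = "Alan"

// Replace all properties of the node
MATCH (p:Person { namme:"Alan" })
SET p = { name:"Alan" }

// Create a node with the property key namme
CREATE (:Person { namme:"Alan" })

// Delete the node
MATCH (p:Person { name2:"Alan2" })
DELETE p

// Remove a single property
MATCH (p:Person { namme:"Alan" })
REMOVE p.namme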
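
The three forms of SET and the REMOVE clause differ in what happens to properties that are not mentioned. A rough Python analogy, treating a node's properties as a dictionary (the helper function names are hypothetical and exist only for this illustration):

```python
def set_merge(props, new):
    """SET p += {...}: add or overwrite the given keys, keep the rest."""
    return {**props, **new}


def set_property(props, key, value):
    """SET p.key = value: write a single property."""
    return {**props, key: value}


def set_replace(props, new):
    """SET p = {...}: discard all existing properties and use the new map."""
    return dict(new)


def remove_property(props, key):
    """REMOVE p.key: drop one property, keep the rest."""
    return {k: v for k, v in props.items() if k != key}


alan = {"namme": "Alan"}

merged = set_merge(alan, {"name2": "Alan2"})
single = set_property(alan, "name", "Alan")
replaced = set_replace(alan, {"name": "Alan"})
removed = remove_property(alan, "namme")

print(merged)    # {'namme': 'Alan', 'name2': 'Alan2'}
print(single)    # {'namme': 'Alan', 'name': 'Alan'}
print(replaced)  # {'name': 'Alan'}
print(removed)   # {}
```

Only `SET p = {...}` drops the misspelled `namme` key as a side effect; the other forms leave it in place until it is removed explicitly.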

Page 26: Understanding Graph Databases with Neo4j and Cypher

Cypher Relationships

Page 27: Understanding Graph Databases with Neo4j and Cypher

Cypher - Creating Relationships

CREATE (john:Person:Author { name:"John Le Carre" }),
       (b:Book { title:"Tinker, Tailor, Soldier, Spy" }),
       (john)-[:WROTE]->(b)

MATCH (p:Person { name:"Ian" }),
      (b:Book { title:"Our Man in Havana" })
MERGE (p)-[:PURCHASED { date:"09-09-2011" }]->(b)

MATCH (graham:Person:Author { name:"Graham Greene" }),
      (b:Book { title:"Our Man in Havana" })
MERGE (graham)-[:WROTE]->(b)

MATCH (t:Book { title:"Tinker, Tailor, Soldier, Spy" }),
      (i:Person { name:"Ian" }),
      (a:Person { name:"Alan" })
MERGE (i)-[:PURCHASED { date:"03-02-2011" }]->(t)<-[:PURCHASED { date:"05-07-2011" }]-(a)
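
The statements above use MERGE rather than CREATE for relationships between existing nodes. The difference can be sketched in Python as "match or create" versus "always create". This is a simplified model, assuming relationships are stored as tuples in a list; the real MERGE also covers nodes and longer patterns.

```python
relationships = []  # each entry: (start, type, end, properties)


def create(start, rel_type, end, props=None):
    """CREATE always adds a new relationship."""
    rel = (start, rel_type, end, dict(props or {}))
    relationships.append(rel)
    return rel


def merge(start, rel_type, end, props=None):
    """MERGE reuses a relationship matching the whole pattern, else creates it."""
    wanted = dict(props or {})
    for rel in relationships:
        s, t, e, existing = rel
        same_ends = (s, t, e) == (start, rel_type, end)
        if same_ends and all(existing.get(k) == v for k, v in wanted.items()):
            return rel
    return create(start, rel_type, end, wanted)


merge("Ian", "PURCHASED", "Our Man in Havana", {"date": "09-09-2011"})
merge("Ian", "PURCHASED", "Our Man in Havana", {"date": "09-09-2011"})
count_after_merge = len(relationships)

create("Ian", "PURCHASED", "Our Man in Havana", {"date": "09-09-2011"})
count_after_create = len(relationships)

print(count_after_merge, count_after_create)  # 1 2
```

Running the same MERGE twice leaves one relationship, while a repeated CREATE adds a duplicate.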

Page 28: Understanding Graph Databases with Neo4j and Cypher

Cypher - Modifying Relationships

MATCH (graham:Person { name:"Graham Greene" })-[r]->(b:Book { title:"Our Man in Havana" })
DELETE r

MATCH (p:Person { name:"Ian" })-[r]->(b:Book { title:"Our Man in Havana" })
SET r.date = "09-09-2012"

MATCH (graham:Person:Author { name:"Graham Greene" }),
      (b:Book { title:"Our Man in Havana" })
MERGE (graham)-[:WROTE]->(b)

Page 29: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find All Books

SQL

SELECT * FROM Book

Cypher Query

MATCH (b:Book)
RETURN b

Person (p_id, p_name, p_type)
Wrote (p_id, b_id)
Book (b_id, b_title)
Purchased (p_id, b_id, pur_date)

Cypher Result
+-----------------------------------------------+
| b                                             |
+-----------------------------------------------+
| Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
| Node[3]{title:"Our Man in Havana"}            |
+-----------------------------------------------+
2 rows
2 ms

Page 30: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find All Authors

SQL

SELECT * FROM Person WHERE p_type = 'Author'

Cypher Query

MATCH (a:Author)
RETURN a

Cypher Result
+-------------------------------+
| a                             |
+-------------------------------+
| Node[0]{name:"John Le Carre"} |
| Node[1]{name:"Graham Greene"} |
+-------------------------------+
2 rows
8 ms

Page 31: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find All Authors and the Books written by them

SQL

SELECT p.p_name, b.b_title
FROM Person p, Wrote w, Book b
WHERE p.p_type = 'Author'
  AND w.p_id = p.p_id
  AND w.b_id = b.b_id

Cypher Query

MATCH (a:Author)-[:WROTE]->(b:Book)
RETURN a,b

Cypher Result
+-------------------------------------------------------------------------------+
| a                             | b                                             |
+-------------------------------------------------------------------------------+
| Node[0]{name:"John Le Carre"} | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
| Node[1]{name:"Graham Greene"} | Node[3]{title:"Our Man in Havana"}            |
+-------------------------------------------------------------------------------+
2 rows
12 ms

Page 32: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find Books written by Graham Greene

SQL

SELECT b.b_title
FROM Person p, Wrote w, Book b
WHERE p.p_type = 'Author'
  AND w.p_id = p.p_id
  AND w.b_id = b.b_id
  AND p.p_name = 'Graham Greene'

Cypher Query

MATCH (a:Author)-[:WROTE]->(b:Book)
WHERE a.name = 'Graham Greene'
RETURN b

Cypher Result
+------------------------------------+
| b                                  |
+------------------------------------+
| Node[3]{title:"Our Man in Havana"} |
+------------------------------------+
1 row
13 ms
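
The SQL side of this comparison can be tried without a database server by using Python's built-in `sqlite3` module. The sketch below assumes the relational schema from the slides; the column types, the `'Reader'` value for `p_type`, and the reuse of the node ids as primary keys are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (p_id INTEGER PRIMARY KEY, p_name TEXT, p_type TEXT);
CREATE TABLE Book (b_id INTEGER PRIMARY KEY, b_title TEXT);
CREATE TABLE Wrote (p_id INTEGER, b_id INTEGER);
CREATE TABLE Purchased (p_id INTEGER, b_id INTEGER, pur_date TEXT);

INSERT INTO Person VALUES (0, 'John Le Carre', 'Author');
INSERT INTO Person VALUES (1, 'Graham Greene', 'Author');
INSERT INTO Person VALUES (4, 'Ian', 'Reader');
INSERT INTO Person VALUES (5, 'Alan', 'Reader');
INSERT INTO Book VALUES (2, 'Tinker, Tailor, Soldier, Spy');
INSERT INTO Book VALUES (3, 'Our Man in Havana');
INSERT INTO Wrote VALUES (0, 2);
INSERT INTO Wrote VALUES (1, 3);
INSERT INTO Purchased VALUES (4, 3, '09-09-2011');
INSERT INTO Purchased VALUES (4, 2, '03-02-2011');
INSERT INTO Purchased VALUES (5, 2, '05-07-2011');
""")

# Two joins are needed to get from the author to the book title.
rows = conn.execute("""
    SELECT b.b_title
    FROM Person p, Wrote w, Book b
    WHERE p.p_type = 'Author'
      AND w.p_id = p.p_id
      AND w.b_id = b.b_id
      AND p.p_name = 'Graham Greene'
""").fetchall()

print(rows)  # [('Our Man in Havana',)]
```

The join table `Wrote` plays the role that the `WROTE` relationship plays in the graph model.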

Page 33: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find names of all persons, the books they purchased and the date the purchase was made

SQL

SELECT p.p_name, pur.pur_date, b.b_title
FROM Person p, Book b, Purchased pur
WHERE pur.p_id = p.p_id
  AND b.b_id = pur.b_id

Cypher Query

MATCH (a)-[r:PURCHASED]->(b)
RETURN a,r.date,b

Cypher Result
+-------------------------------------------------------------------------------------+
| a                    | r.date       | b                                             |
+-------------------------------------------------------------------------------------+
| Node[4]{name:"Ian"}  | "09-09-2011" | Node[3]{title:"Our Man in Havana"}            |
| Node[4]{name:"Ian"}  | "03-02-2011" | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
| Node[5]{name:"Alan"} | "05-07-2011" | Node[2]{title:"Tinker, Tailor, Soldier, Spy"} |
+-------------------------------------------------------------------------------------+
3 rows

Page 34: Understanding Graph Databases with Neo4j and Cypher

Cypher - Querying DBs

Find how Graham Greene is related to Ian

SQL

I won’t attempt!!!

Cypher Query

MATCH (a:Author)-[r*]-(p:Person { name:'Ian' })
WHERE a.name = 'Graham Greene'
RETURN a,r,p

Cypher Result
+--------------------------------------------------------------------------------------------------------+
| a                             | r                                                | p                   |
+--------------------------------------------------------------------------------------------------------+
| Node[1]{name:"Graham Greene"} | [:WROTE[1] {},:PURCHASED[0] {date:"09-09-2011"}] | Node[4]{name:"Ian"} |
+--------------------------------------------------------------------------------------------------------+
1 row
38 ms
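
The `-[r*]-` pattern asks for paths of any length, in either direction. What such a search does can be sketched in Python as a depth-first walk over the demo relationships. This is an illustrative model, assuming the five relationships created earlier in the slides and a hypothetical `find_paths` helper; it is not how Neo4j executes the query.

```python
# Relationships from the demo graph: (start, type, end).
RELS = [
    ("John Le Carre", "WROTE", "Tinker, Tailor, Soldier, Spy"),
    ("Graham Greene", "WROTE", "Our Man in Havana"),
    ("Ian", "PURCHASED", "Our Man in Havana"),
    ("Ian", "PURCHASED", "Tinker, Tailor, Soldier, Spy"),
    ("Alan", "PURCHASED", "Tinker, Tailor, Soldier, Spy"),
]


def find_paths(start, goal, rels):
    """Return every path from start to goal as a list of relationship types.

    Direction is ignored and no relationship is used twice in one path.
    """
    paths = []

    def walk(node, used, types):
        if node == goal and types:
            paths.append(list(types))
        for i, (a, rel_type, b) in enumerate(rels):
            if i in used or node not in (a, b):
                continue
            next_node = b if node == a else a
            walk(next_node, used | {i}, types + [rel_type])

    walk(start, frozenset(), [])
    return paths


paths = find_paths("Graham Greene", "Ian", RELS)
print(paths)  # [['WROTE', 'PURCHASED']]
```

The single path found matches the one row in the result above: Graham Greene wrote a book that Ian purchased.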

Page 35: Understanding Graph Databases with Neo4j and Cypher

Support for Graph Algorithms

● shortestPath

● allSimplePaths

● allPaths

● dijkstra (optionally with cost_property and default_cost parameters)
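
To show what the `cost_property` and `default_cost` parameters mean, here is a minimal Dijkstra sketch in Python. The parameter names come from the slide; the function itself, the tuple-based graph and the sample road data are assumptions for illustration and are not Neo4j's implementation.

```python
import heapq


def dijkstra(rels, start, goal, cost_property="cost", default_cost=1.0):
    """Cheapest directed path from start to goal.

    rels is a list of (start, end, properties) tuples. A relationship that
    lacks cost_property is charged default_cost.
    Returns (total_cost, [nodes]) or None if goal is unreachable.
    """
    outgoing = {}
    for a, b, props in rels:
        step_cost = props.get(cost_property, default_cost)
        outgoing.setdefault(a, []).append((b, step_cost))

    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for next_node, step_cost in outgoing.get(node, []):
            if next_node not in settled:
                heapq.heappush(queue, (cost + step_cost, next_node, path + [next_node]))
    return None


# Hypothetical road graph, not taken from the slides.
roads = [
    ("A", "B", {"cost": 4.0}),
    ("A", "C", {"cost": 1.0}),
    ("C", "B", {"cost": 2.0}),
    ("B", "D", {}),  # no cost property: default_cost applies
]

result = dijkstra(roads, "A", "D")
print(result)  # (4.0, ['A', 'C', 'B', 'D'])
```

The route through C costs 1 + 2 to reach B, which beats the direct A to B cost of 4, and the last hop is charged the default cost of 1.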

Page 36: Understanding Graph Databases with Neo4j and Cypher

Neo4j - Default locking behavior for Concurrency

● When adding, changing or removing a property on a node or relationship, a write lock will be taken on the specific node or relationship.

● When creating or deleting a node, a write lock will be taken for the specific node.

● When creating or deleting a relationship, a write lock will be taken on the specific relationship and both its nodes.
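
The three rules above can be written down as a small Python function that returns which entities would be write-locked. The function name, operation names and id strings are hypothetical and serve only to restate the rules; this is not Neo4j's lock manager.

```python
NODE_OR_PROPERTY_OPS = {"set_property", "remove_property", "create_node", "delete_node"}
RELATIONSHIP_OPS = {"create_relationship", "delete_relationship"}


def write_locks(operation, target, endpoints=()):
    """Return the set of entity ids that get a write lock for one operation.

    target:    id of the node or relationship being changed
    endpoints: for relationship creation or deletion, the ids of its two nodes
    """
    if operation in RELATIONSHIP_OPS:
        # The relationship itself plus both of its nodes.
        return {target, *endpoints}
    if operation in NODE_OR_PROPERTY_OPS:
        # Only the node or relationship that is being changed.
        return {target}
    raise ValueError(f"unknown operation: {operation}")


locks = write_locks("create_relationship", "rel:7", ("node:1", "node:3"))
other = write_locks("set_property", "node:3")

# Two writers conflict when their lock sets overlap.
conflict = not locks.isdisjoint(other)

print(sorted(locks))  # ['node:1', 'node:3', 'rel:7']
print(conflict)       # True
```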

Page 37: Understanding Graph Databases with Neo4j and Cypher

Neo4j - Performance

● Because the JVM runs in a shared environment, the way the JVM is configured greatly affects performance.

● More optimized for querying than for CRUD operations; batch updates are recommended.

● Indexes can be set on nodes, relationships and their properties, which can improve query response times.

● Reports on query times and performance are mixed; upcoming releases aim to optimize this.

Page 38: Understanding Graph Databases with Neo4j and Cypher

Neo4j Capacity - Data size

In Neo4j, data size is mainly limited by the address space of the primary keys for Nodes, Relationships, Properties and Relationship types. Currently, the address space is as follows:

nodes: 2^35 (∼ 34 billion)
relationships: 2^35 (∼ 34 billion)
properties: 2^36 to 2^38 depending on property types (maximum ∼ 274 billion, always at least ∼ 68 billion)
relationship types: 2^15 (∼ 32 000)
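
The approximate figures follow directly from the powers of two, as this short Python check shows (the dictionary keys are just labels for this illustration):

```python
limits = {
    "nodes": 2 ** 35,
    "relationships": 2 ** 35,
    "properties (minimum)": 2 ** 36,
    "properties (maximum)": 2 ** 38,
    "relationship types": 2 ** 15,
}

for name, value in limits.items():
    print(f"{name}: {value:,}")

# nodes: 34,359,738,368
# relationships: 34,359,738,368
# properties (minimum): 68,719,476,736
# properties (maximum): 274,877,906,944
# relationship types: 32,768
```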

Page 39: Understanding Graph Databases with Neo4j and Cypher

Calling Neo4j

[Diagram: a JVM Server hosting the Neo4j DB; a Java Application connects through the Java API and a Web Application connects through the Web REST API]

Officially supported languages
● Java
● .NET
● JavaScript
● Python
● Ruby
● PHP

Page 40: Understanding Graph Databases with Neo4j and Cypher

Neo4j Editions

Enterprise

Enterprise Lock Manager

High Performance Cache

Clustering

Hot Backups

Advanced Monitoring

NOT FREE

Community

FREE

OPEN SOURCE

Page 41: Understanding Graph Databases with Neo4j and Cypher

If you’ve ever

● Joined more than 7 tables together

● Modeled a graph in a table

● Written a recursive CTE (Common Table Expression)

● Tried to write some crazy stored procedure with multiple recursive self and inner joins

You should use Neo4j

Page 42: Understanding Graph Databases with Neo4j and Cypher

Disadvantages

● The JVM should be configured properly to get optimal performance.

● A Neo4j DB cannot be distributed; it should be replicated instead.

● Inappropriate for transactional information like accounting and banking.

Page 43: Understanding Graph Databases with Neo4j and Cypher

Who uses Neo4j?

Page 44: Understanding Graph Databases with Neo4j and Cypher

Thank you !!!