Page 1: Relational to Big Graph

Relational to (Big) Graph: Harnessing the Power of the Graph

Michael Hunger, JAX Mainz 2015

Page 2: Relational to Big Graph

Agenda

• History of Neo4j
• Relational Pains – Graph Pleasure
• Relational to Graph
• Model -> Import -> Query -> Build -> Integrate
• Demo
• Q&A

Page 3: Relational to Big Graph

History of Neo4j

A Story of Relational Pain

Page 4: Relational to Big Graph

History of Neo4j - Problem

• Digital Asset Management System in 2000
• SaaS with many users in many countries
• Two hard use-cases
  • Multi-language keyword search, including synonyms / word hierarchies
  • Access Management to Assets for SaaS Scale

Page 5: Relational to Big Graph

History of Neo4j – Relational Attempt

• Tried with many relational DBs
• JOIN Performance Problems
  • Hierarchies, Networks, Graphs
• Modeling Problems
  • Data Model evolution
• No Success, even …
  • With expensive database consultants!

Page 6: Relational to Big Graph

History of Neo4j – First Working Implementation

• Graph Model & API sketched on a napkin
  • Nodes connected by Relationships
  • Just like your conceptual model
• Implemented network-database in memory
  • Java API, fast Traversals
• Worked well, but …
  • No persistence, No Transactions
  • Long import / export time from relational storage

Page 7: Relational to Big Graph

History of Neo4j - Solution

• Evolved to full-fledged database in Java
  • With persistence using files + memory mapping
  • Transactions with Transaction Log (WAL)
  • Lucene for fast Node search
• Founded Company in 2007
  • Neo4j (REST)-Server
  • Neo4j Clustering & HA
  • Cypher Query Language
• Today …

Page 8: Relational to Big Graph

Neo Technology Overview

Product
• Neo4j - World’s leading graph database
• 1M+ downloads, adding 50k+ per month
• 150+ enterprise subscription customers including over 50 of the Global 2000

Company
• Neo Technology, Creator of Neo4j
• 80 employees with HQ in Silicon Valley, London, Munich, Paris and Malmö
• $45M in funding from Fidelity, Sunstone, Conor, Creandum, Dawn Capital

Page 9: Relational to Big Graph

Neo4j Adoption by Selected Verticals

• Financial Services
• Communications
• Health & Life Sciences
• HR & Recruiting
• Media & Publishing
• Social Web
• Industry & Logistics
• Entertainment
• Consumer Retail
• Information Services
• Business Services

Page 10: Relational to Big Graph

How Customers Use Neo4j

• Network & Data Center
• Master Data Management
• Social
• Recommendations
• Identity & Access
• Search & Discovery
• GEO

Page 11: Relational to Big Graph

Neo4j Leads the Graph Database Revolution

“Forrester estimates that over 25% of enterprises will be using graph databases by 2017”

“Neo4j is the current market leader in graph databases.”

“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”

Sources:
• IT Market Clock for Database Management Systems, 2014: https://www.gartner.com/doc/2852717/it-market-clock-database-management
• TechRadar™: Enterprise DBMS, Q1 2014: http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801
• Graph Databases – and Their Potential to Transform How We Capture Interdependencies (Enterprise Management Associates): http://blogs.enterprisemanagement.com/dennisdrogseth/2013/11/06/graph-databasesand-potential-transform-capture-interdependencies/

Page 12: Relational to Big Graph

Largest Ecosystem of Graph Enthusiasts

• 1,000,000+ downloads
• 20,000+ educated developers
• 18,000+ Meetup members
• 100+ technology and service partners
• 150+ enterprise subscription customers including 50+ Global 2000 companies

Page 13: Relational to Big Graph

High Business Value in Data Relationships

Data is increasing in volume…
• New digital processes
• More online transactions
• New social networks
• More devices

… and is getting more connected
Customers, products, processes, devices interact and relate to each other

Using Data Relationships unlocks value
• Real-time recommendations
• Fraud detection
• Master data management
• Network and IT operations
• Identity and access management
• Graph-based search

Early adopters became industry leaders

Page 14: Relational to Big Graph

Relational Pains – Graph Pleasure

Page 15: Relational to Big Graph

Relational DBs Can’t Handle Relationships Well

• Cannot model or store data and relationships without complexity
• Performance degrades with number and levels of relationships, and database size
• Query complexity grows with need for JOINs
• Adding new types of data and relationships requires schema redesign, increasing time to market

… making traditional databases inappropriate when data relationships are valuable in real-time

• Slow development
• Poor performance
• Low scalability
• Hard to maintain

Page 16: Relational to Big Graph

Why Can’t Relational DBs Handle Relationships Well?

• Data model built for tabular forms, not JOINs: managing connections was bolted on, both in schema and in query
• Strict schema not suitable for the variably structured data that is generated and used by today’s applications
• Data volume and number of JOINs affect the cost of a query operation exponentially
• Variable hierarchies and networks are hard to store and query, so many “patterns” were developed

… often only denormalization makes complex relational queries fast, but it destroys the good normalized data model

• Built for Forms
• Joins are expensive
• Denormalize #FTW

Page 17: Relational to Big Graph

Unlocking Value from Your Data Relationships

• Model your data naturally as a graph of data and relationships
• Drive graph model from domain and use-cases
• Use relationship information in real-time to transform your business
• Add new relationships on the fly to adapt to your changing requirements

Page 18: Relational to Big Graph

High Query Performance with a Native Graph DB

• Relationships are first-class citizens
• No need for joins, just follow pre-materialized relationships of nodes
• Query & Data-locality – navigate out from your starting points
• Only load what’s needed
• Aggregate and project results as you go
• Optimized disk and memory model for graphs

Page 19: Relational to Big Graph

High Query Performance: Some Numbers

• Traverse 4M+ relationships per second and core
• Cost based query optimizer – complex queries return in milliseconds
• Import 100K-1M records per second transactionally
• Bulk import tens of billions of records in a few hours

Page 21: Relational to Big Graph

Modeling as a Graph

Page 22: Relational to Big Graph

The Whiteboard Model Is the Physical Model

Page 23: Relational to Big Graph

Property Graph Model Components

Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled

Relationships
• Relate nodes by type and direction
• Can have name-value properties

[Diagram: two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), connected by LOVES, LOVES and LIVES WITH relationships; one relationship carries the property since: Jan 10, 2011.]
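The components listed on this slide can be sketched in a few lines of Python. This is only an illustration of the property graph model, not Neo4j's storage format or API: the class names, the `outgoing` helper, the date formats and the placement of the `since` property on `LIVES_WITH` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node: a set of labels plus name-value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection from start to end with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Property values come from the slide; the structure around them is assumed.
dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    # Assumption: which relationship carries "since" is not recoverable here.
    Relationship(dan, "LIVES_WITH", ann, {"since": "2011-01-10"}),
]


def outgoing(node, rel_type):
    """Return the end nodes of node's outgoing relationships of one type."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "LOVES")])  # ['Ann']
```

Direction matters in this sketch: `outgoing(ann, "LIVES_WITH")` is empty because the relationship was stored from Dan to Ann.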

Page 24: Relational to Big Graph

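Relational Versus Graph Models

[Diagram: in the Relational Model, the tables Person and Friend are linked through the join table Person-Friend; in the Graph Model, the nodes ANDREAS, TOBIAS, MICA and DELIA are connected directly by KNOWS relationships.]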

Page 25: Relational to Big Graph

Let’s Model!

Customer, Supplier, and Product (Master Data)
Orders (Activity)

Page 26: Relational to Big Graph

The Domain Model

[Diagram: nodes Customer, Order, Product, Category, Supplier and Employee; relationship types PURCHASED, ORDERS, PART_OF, SUPPLIES, SOLD and REPORTS_TO (Employee to Employee).]

Page 27: Relational to Big Graph

Except…

Page 28: Relational to Big Graph

The Requisite Northwind Example!

NOT JUST ANY

Page 29: Relational to Big Graph

(Northwind)-[:TO]->(Graph)
Building the Graph Model

Page 30: Relational to Big Graph

Building Relationships in Graphs

[Diagram: an Employee node connected by SOLD relationships to Order nodes.]

Page 31: Relational to Big Graph

Locate Foreign Keys

Page 32: Relational to Big Graph

(FKs)-[:BECOME]->(Relationships)
Correct Directions

Page 33: Relational to Big Graph

Drop Foreign Keys

Page 34: Relational to Big Graph

Find the Join Tables

Page 35: Relational to Big Graph

Simple Join Tables Become Relationships

Page 36: Relational to Big Graph

Attributed Join Tables Become Relationships with Properties

Page 37: Relational to Big Graph

Working Subset (Today’s Exercise)

Page 38: Relational to Big Graph

Northwind Graph Model

[Diagram: nodes Customer, Order, Product, Category, Supplier and Employee; relationship types PURCHASED, ORDERS, PART_OF, SUPPLIES, SOLD and REPORTS_TO (Employee to Employee).]

Page 39: Relational to Big Graph

Recap - Rules

Model your graph first and import into that model.

Alternatively …

Page 40: Relational to Big Graph

Normalized ER-Models: Transformation Rules

• Tables become nodes
  • Table name as node-label
• Columns turn into properties
  • Convert values if needed
• Foreign Keys (1:1, 1:n, n:1) into relationships, column name into relationship-type (or better verb)
• JOIN-Tables represent relationships
  • Also other tables without domain identity (w/o PK) and two FKs
  • Columns turn into relationship properties
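The transformation rules above can be sketched as three small Python functions over in-memory rows. This is a rough illustration under stated assumptions, not the import tool shown later in the talk: the function names, the dict and tuple shapes, and the sample rows are all hypothetical.

```python
def table_to_nodes(label, rows, foreign_keys=()):
    """Rows become nodes: table name -> label, non-FK columns -> properties."""
    return [
        {"label": label,
         "properties": {k: v for k, v in row.items() if k not in foreign_keys}}
        for row in rows
    ]


def fk_to_relationships(rows, pk, fk, rel_type):
    """Each non-null foreign key becomes a (start_id, type, end_id) triple."""
    return [(row[pk], rel_type, row[fk]) for row in rows if row[fk] is not None]


def join_table_to_relationships(rows, start_fk, end_fk, rel_type):
    """Join-table rows become relationships; remaining columns become properties."""
    return [
        (row[start_fk], rel_type, row[end_fk],
         {k: v for k, v in row.items() if k not in (start_fk, end_fk)})
        for row in rows
    ]


# Hypothetical sample rows, not taken from the talk's dataset.
employees = [
    {"employeeID": 1, "firstName": "Andrew", "reportsTo": None},
    {"employeeID": 2, "firstName": "Steven", "reportsTo": 1},
]
order_details = [{"orderID": 10, "productID": 7, "quantity": 3}]

nodes = table_to_nodes("Employee", employees, foreign_keys=("reportsTo",))
reports = fk_to_relationships(employees, "employeeID", "reportsTo", "REPORTS_TO")
orders = join_table_to_relationships(order_details, "orderID", "productID", "ORDERS")

print(nodes[1])
print(reports)  # [(2, 'REPORTS_TO', 1)]
print(orders)   # [(10, 'ORDERS', 7, {'quantity': 3})]
```

Note that the relationship type is a verb (`REPORTS_TO`) rather than the raw column name, and that the foreign key column is dropped from the node's properties once it has become a relationship.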

Page 41: Relational to Big Graph

Normalized ER-Models: Cleanup Rules

• Remove technical IDs (auto-incrementing PKs)
• Keep domain IDs (e.g. ISBN)
  • Add constraints for those
• Add indexes for lookup fields
• Adjust names for Label, REL_TYPE and propertyName

Note: currently no composite constraints and indexes

Page 42: Relational to Big Graph

Importing Your Data

Page 43: Relational to Big Graph

Getting Data into Neo4j

Cypher-Based “LOAD CSV” Capability
• Transactional (ACID) writes
• Initial and incremental loads of up to 10 million nodes and relationships

Command-Line Bulk Loader neo4j-import
• For initial database population
• For loads up to 10B+ records
• Up to 1M records per second

4.58 million things and their relationships… load in 100 seconds!

[Diagram: both loaders read CSV input.]

Page 44: Relational to Big Graph

Getting Data into Neo4j

Custom Cypher-Based Loader
• Uses transactional Cypher http endpoint
• Parametrized, batched, concurrent Cypher statements
• Any programming/script language with driver or plain http

JVM Transactional Loader
• Use Neo4j’s Java-API
• From any JVM language
• Up to 1M records per second

[Diagram: programs load any data into Neo4j.]
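A custom Cypher-based loader of the kind described above mostly comes down to batching rows and sending them as parameters of one reusable statement. The Python sketch below only builds the request bodies and does not contact a server. It assumes the JSON body shape of the Neo4j 2.x transactional endpoint (`POST /db/data/transaction/commit`) and the 2.x `{rows}` parameter syntax; the helper names, the statement and the sample rows are hypothetical, so check the documentation for the version you run.

```python
import json


def batches(rows, size):
    """Split rows into consecutive lists of at most size items."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def build_payload(statement, rows):
    """Wrap one parametrized statement and one batch of rows in a request body."""
    return {"statements": [{"statement": statement, "parameters": {"rows": rows}}]}


# Assumed Neo4j 2.x parameter syntax ({rows}); the statement is illustrative.
STATEMENT = (
    "UNWIND {rows} AS row "
    "MERGE (p:Product {productID: row.productID}) "
    "SET p.productName = row.productName"
)

# Hypothetical sample rows.
products = [{"productID": i, "productName": "Product %d" % i} for i in range(1, 6)]
payloads = [build_payload(STATEMENT, batch) for batch in batches(products, 2)]

print(len(payloads))  # 3
print(json.dumps(payloads[-1]))
```

Each payload would be sent as one HTTP request, so the statement text stays constant and only the parameters change. Independent batches could be posted concurrently from several workers, which is the "concurrent" part of the bullet and is not shown here.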

Page 45: Relational to Big Graph

Data Import Demo

Page 46: Relational to Big Graph

Import Demo

Cypher-Based “LOAD CSV” Capability
• Use to import Northwind CSV dumps

Command-Line Bulk Loader neo4j-import
• Chicago Crimes Dataset

Relational Import Tool neo4j-rdbms-import
• Proof of Concept

[Diagram labels: CSV; JDBC + API.]

Page 47: Relational to Big Graph

RDBMS Import Tool Demo – Proof of Concept

• JDBC for vendor-independent database connection
• SchemaCrawler to extract DB-Meta-Data
• Use Rules to drive graph model import
• Optional means to override default behavior
• Scales writes with Parallel Batch Importer API
• Reads tables concurrently for nodes & relationships

Demo: MySQL - Employee Demo Database
Source: github.com/jexp/neo4j-rdbms-import

[Diagram: source databases Postgres, MySQL, Oracle.]

Page 48: Relational to Big Graph

Querying Your Data

Page 49: Relational to Big Graph

Basic Query: Who do people report to?

MATCH (:Employee {firstName: "Steven"})-[:REPORTS_TO]->(:Employee {firstName: "Andrew"})

[Diagram: the pattern annotated with its parts. Each parenthesized element is a NODE with a LABEL (:Employee) and a PROPERTY (firstName); REPORTS_TO is the relationship from Steven to Andrew.]

Page 50: Relational to Big Graph

Basic Query Comparison: Who do people report to?

SELECT *
FROM Employee AS e
JOIN Employee_Report AS er ON (e.id = er.manager_id)
JOIN Employee AS sub ON (er.sub_id = sub.id)

MATCH (e:Employee)<-[:REPORTS_TO]-(sub:Employee)
RETURN *

Page 51: Relational to Big Graph

Basic Query: Who do people report to?

Page 52: Relational to Big Graph

Basic Query: Who do people report to?

Page 53: Relational to Big Graph

Express Complex Queries Easily with Cypher

Find all direct reports and how many people they manage, each up to 3 levels down

Cypher Query

MATCH (sub)-[:REPORTS_TO*0..3]->(boss),
      (report)-[:REPORTS_TO*1..3]->(sub)
WHERE boss.firstName = 'Andrew'
RETURN sub.firstName AS Subordinate, count(report) AS Total;

SQL Query

[Shown on the slide as an image.]
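To make the variable-length patterns in the Cypher query concrete, here is a small in-memory Python sketch of the same logic: walk 0 to 3 `REPORTS_TO` hops up to the boss, then count reports 1 to 3 hops below each match. It is an illustration only, not how Neo4j executes the query. The reporting hierarchy and the helper names are hypothetical, and the sketch assumes a tree, so there is exactly one path between any two employees.

```python
# Hypothetical reporting data: employee -> manager (None for the top).
REPORTS_TO = {
    "Andrew": None,
    "Steven": "Andrew",
    "Nancy": "Andrew",
    "Robert": "Steven",
    "Michael": "Steven",
    "Anne": "Robert",
}


def hops_up(employee, ancestor):
    """Number of REPORTS_TO hops from employee up to ancestor, or None."""
    hops = 0
    current = employee
    while current is not None:
        if current == ancestor:
            return hops
        current = REPORTS_TO[current]
        hops += 1
    return None


def report_counts(boss, max_hops=3):
    """Subordinates within 0..max_hops of boss, each with the number of
    reports within 1..max_hops below them. Employees without reports are
    omitted, just as MATCH yields no row when a pattern has no match."""
    result = {}
    for sub in REPORTS_TO:
        up = hops_up(sub, boss)
        if up is None or up > max_hops:
            continue
        total = 0
        for report in REPORTS_TO:
            down = hops_up(report, sub)
            if down is not None and 1 <= down <= max_hops:
                total += 1
        if total:
            result[sub] = total
    return result


print(report_counts("Andrew"))  # {'Andrew': 5, 'Steven': 3, 'Robert': 1}
```

Because the first pattern starts at 0 hops, the boss appears in the result too. Changing the lower bound to 1 would restrict the output to people below the boss.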

Page 54: Relational to Big Graph

“We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10 to 100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”

Volker Pacher, Senior Developer

Page 55: Relational to Big Graph

Who is in Robert’s (direct, upwards) reporting chain?

MATCH path=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee)
WHERE sub.firstName = 'Robert'
RETURN path;

Page 56: Relational to Big Graph

Who is in Robert’s (direct, upwards) reporting chain?

Page 57: Relational to Big Graph

Who’s the Big Boss?

MATCH (e:Employee)
WHERE NOT (e)-[:REPORTS_TO]->()
RETURN e.firstName AS bigBoss;

Page 58: Relational to Big Graph

Who’s the Big Boss?

Page 59: Relational to Big Graph

Product Cross-Sell

MATCH (choc:Product {productName: 'Chocolade'})<-[:ORDERS]-(:Order)<-[:SOLD]-(employee),
      (employee)-[:SOLD]->(o2)-[:ORDERS]->(other:Product)
RETURN employee.firstName, other.productName, count(DISTINCT o2) AS count
ORDER BY count DESC
LIMIT 5;

Page 60: Relational to Big Graph

Product Cross-Sell

Page 61: Relational to Big Graph

Neo4j Query Planner

Cost based Query Planner since Neo4j 2.2
• Uses database stats to select best plan
• Currently for Read Operations
• Query Plan Visualizer, finds
  • Non optimal queries
  • Cartesian Product
  • Missing Indexes, Global Scans
  • Typos
  • Massive Fan-Out

Page 62: Relational to Big Graph

Query Planner

Slight change, add an :Employee label -> more stats available -> new plan with fewer database-hits

Page 63: Relational to Big Graph

Architecture & Integration

“Polyglot Persistence”

Page 64: Relational to Big Graph

Neo4j Clustering Architecture Optimized for Speed & Availability at Scale

Performance Benefits
• No network hops within queries
• Real-time operations with fast and consistent response times
• Cache sharding spreads cache across cluster for very large graphs

Clustering Features
• Master-slave replication with master re-election and failover
• Each instance has its own local cache
• Horizontal scaling & disaster recovery

[Diagram: a Load Balancer in front of three Neo4j instances.]

Page 65: Relational to Big Graph

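Three Ways to Migrate Data to Neo4j

[Diagram: three application setups.]
• MIGRATE ALL DATA: the application keeps all data in the graph database
• MIGRATE GRAPH DATA: non-graph data stays in the relational database, graph data moves to the graph database
• DUPLICATE GRAPH DATA: all data stays in the relational database, graph data is also held in the graph database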

Page 66: Relational to Big Graph

Neo4j Fits into Your Enterprise Environment

[Diagram: an Application (End User) on top of a Graph Database Cluster (Neo4j, Neo4j, Neo4j) for data storage and business rules execution; Ad Hoc Analysis by a Data Scientist; a Bulk Analytic Infrastructure (Graph Compute Engine, EDW, …) for data mining and aggregation; other databases: Relational, NoSQL, Hadoop.]

Page 67: Relational to Big Graph

User Voice

Page 68: Relational to Big Graph

Users Love Neo4j

Page 69: Relational to Big Graph

Learn the Way of the Graph Quickly and Easily

Page 70: Relational to Big Graph

Quick Start: Plan Your Project

Deploy your app in as little as 8 weeks

[Timeline, weeks 1 to 8, PROFESSIONAL SERVICES PLAN:]
• Learn Neo4j
• Decide on Architecture
• Import and Model Data
• Build Application
• Test Application

Page 71: Relational to Big Graph

There Are Lots of Ways to Easily Learn Neo4j

Page 72: Relational to Big Graph

GraphConnect Europe London • May 6-7, 2015

DATE
Wednesday, May 6 – Full Day Trainings (includes new Advanced Deployment class)
Thursday, May 7 – Main Conference

LOCATION
Etc Venues in London, UK
Training: 4 Norton Folgate
Conference: at 155 Bishopsgate Liverpool Street

ACTIVITIES
• Customers and community members such as adidas, Pitney Bowes, Orange, e-Spirit, KNMI and others, showcasing their Neo4j solutions
• Neo4j product training
• Free personal advice in Neo4j GraphClinics
• Opportunity to network with graph users from across the world
• Enjoy yourself!

TICKETS! JAX Discount Code: 50% off with JAX50GCE

www.graphconnect.com

Page 73: Relational to Big Graph

Relational to (Big) Graph: Harnessing the Power of the Graph

End of Presentation

