Agenda
• About Graphs
• About Graph Databases
• Why Graph Databases matter for Fraud Detection
  – Short demonstration
• Case Studies
• Q&A
So what is a graph database?
• OLTP database – “end-user” transactions
• Model, store, manage data as a graph
Contrast with Relational
Graphs are often referred to as “Whiteboard Friendly”. The data model reflects the way a domain expert would naturally
draw their data on a whiteboard. “The schema is the data”. Schema flexibility allows the system
to change in response to a changing environment.
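To make the “schema is the data” idea concrete, here is a minimal sketch of a property graph held in plain Python structures. It is not how Neo4j stores data; the node ids, labels, properties, and helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical property graph: node ids, labels and properties are invented.
nodes = {
    "p1": {"label": "Person", "name": "Alice"},
    "c1": {"label": "CreditCard", "number_suffix": "4242"},
}
relationships = [("p1", "HOLDS", "c1")]  # (start node, type, end node)


def add_node(node_id, label, **properties):
    """Add a node; each node carries only the properties it actually has."""
    nodes[node_id] = {"label": label, **properties}


def add_relationship(start, rel_type, end):
    """Connect two existing nodes with a typed, directed relationship."""
    if start not in nodes or end not in nodes:
        raise KeyError("both endpoints must exist before they are connected")
    relationships.append((start, rel_type, end))


def neighbours(node_id, rel_type=None):
    """Ids of nodes reached by outgoing relationships, optionally filtered by type."""
    return [
        end
        for start, rel, end in relationships
        if start == node_id and (rel_type is None or rel == rel_type)
    ]


# A new kind of entity and relationship shows up later: nothing is migrated,
# the graph simply gains more nodes and relationships.
add_node("d1", "Device", fingerprint="ab12", os="Android")
add_relationship("p1", "LOGGED_IN_FROM", "d1")
```

The point of the sketch is that adding the `Device` node and the `LOGGED_IN_FROM` relationship required no change to anything that already existed.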
Examples of complex queries: 1. Semi-structure in datasets
– Normalization introduces complexity
– Forces developers to write all kinds of application logic to deal with this variability
Examples of complex queries: 2. Connectedness in data
Lots of normalized relationships between the different entities force developers to do:
• Deep joins
• Recursive joins
• Pathfinding operations
• “Open-ended” queries
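As a rough illustration of why connected data is awkward to query with a fixed number of joins, the sketch below walks a graph to an arbitrary depth with a breadth-first traversal. The adjacency list and the function name `reachable_within` are assumptions made for this example, not part of any product API.

```python
from collections import deque

# Hypothetical data: an undirected adjacency list of accounts linked by transfers.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": [],
}


def reachable_within(graph, start, max_hops):
    """Breadth-first traversal returning {node: hop_count} for every node
    reachable from start in at most max_hops steps (start itself excluded)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[start]
    return hops
```

In a relational schema each extra hop would typically mean one more self-join; here the depth is just the `max_hops` parameter, which is what makes “open-ended” queries practical.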
Graphs in Fraud Detection Systems
• Real-time aspect
• Detection practices rely on Graph Algorithms
• Operational efficiency
Real-time fraud detection?
• Context is everything
  – You don’t want to be blocking credit cards for no reason… false positives are fatal…
• Complexity matters
  – Need to outsmart the “bad guys”
  – Assume that they can and will understand/beat the system
• Visualization matters
  – Manual intervention relies on fast understanding of context, and visualization helps there
Fraud Detection practices: Graph Algorithms
● Helpful for navigating complex networks
  ● Tell me how A and B are related
  ● The things on the path between A and B could very well be interesting
  ● ShortestPath, AllShortestPaths, Weighted ShortestPath (Dijkstra, A*)
● Helpful for understanding the important parts of a network
  ● Clusters
  ● Bridges
  ● Centrality
  ● Betweenness Centrality
  ● (Page)Ranking
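Two of the algorithm families listed above can be sketched in a few lines of plain Python: an unweighted shortest path (breadth-first search, not Dijkstra or A*) and a basic PageRank by power iteration. The edge list, node names, and function names are hypothetical, and this is a teaching sketch rather than the implementation any graph database uses.

```python
from collections import deque

# Hypothetical undirected network; node names are invented for illustration.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D"), ("C", "F")]


def build_adjacency(edges):
    """Turn an undirected edge list into {node: [neighbours]}."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, []).append(right)
        adjacency.setdefault(right, []).append(left)
    return adjacency


def shortest_path(adjacency, source, target):
    """Unweighted shortest path via BFS; returns a list of nodes, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank; assumes every node has at least one neighbour."""
    nodes = list(adjacency)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            share = damping * rank[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


adjacency = build_adjacency(edges)
path = shortest_path(adjacency, "A", "D")
ranks = pagerank(adjacency)
```

The nodes on `path` are the “things between A and B” worth inspecting, and the highest-ranked nodes in `ranks` are candidates for the important parts of the network.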
Operational Efficiency
• Graph data model removes the need for many “batch operations”
  – No need to precalculate – just feed it into the graph
• Complex pattern matching in milliseconds
• Graph Locality == Predictability & Speed, even over large datasets
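One pattern that could be matched this way is two account holders sharing the same identifier, such as a phone number or an address. The sketch below is an assumption about what such a pattern might look like; the sample records and the function `shared_identifier_pairs` are invented for illustration and do not come from the deck.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: (account holder, identifier type, identifier value).
has_identifier = [
    ("holder1", "phone", "555-0100"),
    ("holder2", "phone", "555-0100"),
    ("holder2", "address", "1 Main St"),
    ("holder3", "address", "1 Main St"),
    ("holder4", "phone", "555-0199"),
]


def shared_identifier_pairs(has_identifier):
    """Match the pattern (a)-[:HAS]->(identifier)<-[:HAS]-(b) with a != b."""
    holders_by_identifier = defaultdict(set)
    for holder, kind, value in has_identifier:
        holders_by_identifier[(kind, value)].add(holder)

    matches = []
    # Each identifier is examined locally: only its own neighbours are touched.
    for (kind, value), holders in sorted(holders_by_identifier.items()):
        for a, b in combinations(sorted(holders), 2):
            matches.append((a, b, kind, value))
    return matches


suspicious = shared_identifier_pairs(has_identifier)
```

Because each match only looks at the holders attached to one identifier, the work per match depends on the local neighbourhood rather than on the total size of the dataset, which is the intuition behind the graph locality bullet.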
Neo Technology, Inc Confidential
Neo4j License Overview
Developer Seats ($6K*/Developer/Year)
• Includes access by programmers to licensed test instances, and private instances on the programmer’s personal machine for the sole purpose of writing, debugging, or testing software designed to access Neo4j
Test Instances ($6K/Instance/Year)
• Instances whose purpose is to ensure that the software accessing Neo4j is meeting specification (e.g. System Test, Integration Test, UAT, Performance Test, Staging)
Production Instances (Bundle / Core Pricing)
• Instances that store and process data in a way that benefits and advances an organization’s goals
• May be accessed by applications and/or end users
*Or otherwise, depending on the Bundle, and negotiation
Neo4j versions / licenses
• Personal < Startup / Departmental < Enterprise deployment models
• Open source & Commercial license terms available
• Specific OEM models
Neo Technology www.neotechnology.com Neo4j www.neo4j.org [email protected] or +32 478 686800
Q&A, Conclusion, Next Steps