
DataStax | Network Analysis Adventure with DSE Graph, DataStax Studio, and TinkerPop (Bob Briody) | Cassandra Summit 2016

Jan 22, 2018

Transcript
Page 1

Bob Briody

Network Analysis Adventure

Page 2

Who is this guy?

© DataStax, All Rights Reserved. 2

Page 3

Who are these people?

Page 4

What is your role?

Page 5

TinkerPop / Gremlin

Page 6

Network Analysis

Page 7

Property Graph

Set of Vertices

• Set of outgoing edges

• Set of incoming edges

Set of Edges

• Single outgoing tail vertex

• Single incoming head vertex

Vertices & Edges

• Unique ID

• Collection of properties

• Label denoting type
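
The property graph model listed above can be sketched in a few lines of Python. This is only an illustration of the structure on the slide, not how DSE Graph or TinkerPop store data; the class names, the add_edge helper, and the sample "person" and "follows" data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(eq=False)
class Vertex:
    id: int                      # unique ID
    label: str                   # label denoting type
    properties: Dict[str, Any] = field(default_factory=dict)
    out_edges: List["Edge"] = field(default_factory=list)  # set of outgoing edges
    in_edges: List["Edge"] = field(default_factory=list)   # set of incoming edges


@dataclass(eq=False)
class Edge:
    id: int
    label: str
    out_vertex: Vertex           # tail: the vertex the edge leaves
    in_vertex: Vertex            # head: the vertex the edge points to
    properties: Dict[str, Any] = field(default_factory=dict)


def add_edge(edge_id, label, tail, head, **properties):
    """Create an edge and register it on both of its endpoints (hypothetical helper)."""
    edge = Edge(edge_id, label, tail, head, dict(properties))
    tail.out_edges.append(edge)
    head.in_edges.append(edge)
    return edge


alice = Vertex(1, "person", {"name": "alice"})
bob = Vertex(2, "person", {"name": "bob"})
follows = add_edge(10, "follows", alice, bob, since=2016)
```

Each edge has exactly one tail and one head, while each vertex keeps a collection of the edges leaving it and a collection of the edges arriving at it.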

Page 8

What is Network Analysis?

Page 9

Why should you care?

Page 10

Some Product Questions…

I want to understand our user/customer base in the aggregate.

What are the underlying communities among our users/customers?

I need to mediate a conflict between some groups of employees. Who should I talk to?

Page 11

The Graph Analysis Spectrum

[Diagram: a spectrum running from Domain Specific to General. Labels: Academic Domain, Graph Analysis, Computing, Methods, Solutions, Product Domain]

Page 12

The Product Domain

• Master Data Management

• Recommendation and Personalization

• IoT, Asset Management, and Networking

• Security Management and Fraud Detection

• Criminal Network Analysis

Page 13

The Academic Domain

Types of Network Analysis:

• Social

• Network (IT)

• Economic

• Supply Chain

• Literary

• Web

• Biological

Page 14

Terminology

Graph = Network

Vertex = Node

Edge = Link or Relationship

Page 15

A quick note on Edge Labels.

Page 16

Social Network Analysis

[Diagram: a spectrum running from Domain Specific to General. Labels: Social Network Analysis, Graph Analysis, Computing, Methods, Solutions, Product Domain]

It’s all about the people.

Page 17

Some Social Network Analysis Questions…

I want to understand our user/customer base in the aggregate.

Counts, Degree Distribution, Density

What are the underlying communities among our users/customers?

Community Detection, Modularity

I need to mediate a conflict between some groups of employees. Who should I talk to?

Bridges & Brokers -> Centrality, PageRank
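
Of the techniques named above, modularity is the one that scores how well a proposed split into communities fits the graph. A minimal sketch follows, assuming an undirected, unweighted edge list and the standard Newman formula; the function name and the two-triangle sample data are hypothetical and not taken from the talk.

```python
from collections import Counter


def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = Counter()    # edges with both endpoints in the same community
    degree_sum = Counter()  # total degree of the vertices in each community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by the single edge c - d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]
two_groups = {"a": "left", "b": "left", "c": "left",
              "d": "right", "e": "right", "f": "right"}
one_group = {v: "all" for v in two_groups}

split_score = modularity(edges, two_groups)
merged_score = modularity(edges, one_group)
```

Splitting the graph into its two triangles scores higher than lumping every vertex into one group, which is the signal a community detection method tries to maximize.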

Page 18

Centrality

Identify the most “important” vertices in the graph.

• Degree

• Betweenness

• Eigenvector, PageRank

• etc…

Page 19

Degree Centrality

Number of edges incident upon a vertex.
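
As a rough illustration, degree centrality can be computed with one pass over an edge list. The sketch below assumes an undirected graph given as pairs of vertex names; the function name and the sample data are made up for the example.

```python
from collections import Counter


def degree_centrality(edges):
    """Count the edges incident on each vertex of an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
degree = degree_centrality(edges)
# ann touches three edges, bob and cat two each, dan one.
```

For a directed graph the same idea would be split into separate in-degree and out-degree counts.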

Page 20

Betweenness Centrality

Number of times a vertex appears along the shortest path between two other vertices.
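
One common way to compute this is Brandes' algorithm, sketched below for an unweighted, undirected graph. It is a swapped-in textbook method, not something shown in the talk, and when several shortest paths tie it credits each intermediate vertex with a fraction rather than a whole count. The adjacency-dict input format and the sample path graph are assumptions for the example.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm; adjacency maps each vertex to the set of its neighbours."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from source.
        order = []
        predecessors = {v: [] for v in adjacency}
        paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Path graph: a - b - c - d
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = betweenness_centrality(adjacency)
```

In the path graph the two interior vertices each sit on two shortest paths between other vertices, while the endpoints sit on none.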

Page 21

Bridges & Brokers

Bridge: An individual whose weak ties fill a structural hole, providing the only link between two individuals or clusters.

Brokerage: Vertex lies between others.
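
One way to look for such individuals is to find vertices whose removal disconnects the graph (cut vertices, in graph theory terms). Treating a cut vertex as a stand-in for the slide's "bridge" is an assumption made here, as are the function names and the sample data. The brute-force check below is meant for small graphs only.

```python
def connected(adjacency, removed=None):
    """Return True if all vertices other than `removed` can reach each other."""
    vertices = [v for v in adjacency if v != removed]
    if not vertices:
        return True
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w != removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def bridging_vertices(adjacency):
    """Vertices whose removal disconnects an otherwise connected graph."""
    return {v for v in adjacency if not connected(adjacency, removed=v)}


# Two tight clusters (ann, bob, cat) and (fay, gus, hal) linked only through eve.
adjacency = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "eve"},
    "cat": {"ann", "bob", "eve"},
    "eve": {"bob", "cat", "fay", "gus"},
    "fay": {"eve", "gus", "hal"},
    "gus": {"eve", "fay", "hal"},
    "hal": {"fay", "gus"},
}
bridges = bridging_vertices(adjacency)
```

Here eve is the only vertex whose removal separates the two clusters, so eve is the person to talk to when the groups need a go-between.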

Page 22

PageRank

Based on the concept that connections to high-scoring vertices contribute more to the score of the vertex in question than connections to low-scoring vertices.
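
The idea can be sketched with plain power iteration. This is a minimal textbook version, not the TinkerPop or DSE Graph implementation; the damping factor of 0.85, the fixed iteration count, the handling of vertices without out-links, and the sample graph are all assumptions for the example.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of vertex -> list of successors."""
    vertices = list(out_links)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices with no out-links is spread evenly over all vertices.
        dangling = sum(rank[v] for v in vertices if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in vertices}
        for v in vertices:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# x and y each have exactly one incoming link:
# x is linked from the popular vertex h, y from the obscure vertex d.
out_links = {
    "a": ["h"], "b": ["h"], "c": ["h"],
    "d": ["y"],
    "h": ["x"],
    "x": [], "y": [],
}
rank = pagerank(out_links)
```

Although x and y have the same in-degree, x ends up with the higher score because its single link comes from a high-scoring vertex.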

Page 23

Homophily

“Birds of a feather flock together.”
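
A very simple way to put a number on this is the share of edges that connect two vertices with the same attribute value. The sketch below is only one crude measure among several; the function name, the "red" and "blue" groups, and the sample edges are invented for the example.

```python
def same_group_fraction(edges, group):
    """Fraction of edges whose two endpoints belong to the same group."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
         ("dan", "eve"), ("cat", "dan")]
group = {"ann": "red", "bob": "red", "cat": "red", "dan": "blue", "eve": "blue"}
fraction = same_group_fraction(edges, group)
```

A fraction well above what random mixing of the groups would produce suggests that similar vertices tend to connect to each other.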

Page 24

Graph Analysis

Graph

• Vertex & Edge Counts

• Degree Distribution

• Avg Degree

• Degree Density

Vertex

• Clustering, Community Detection, Modularity

• Centrality, PageRank

Path

• Traversals, Pattern Matching
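
The graph-level measures in the list above are cheap to compute from a vertex list and an edge list. The sketch below assumes an undirected simple graph with at least two vertices and reads "density" as the share of possible edges that actually exist; the function name and the sample data are hypothetical.

```python
from collections import Counter


def graph_summary(vertices, edges):
    """Counts, degree distribution, average degree, and density of an undirected graph."""
    degree = Counter({v: 0 for v in vertices})
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n, m = len(vertices), len(edges)
    return {
        "vertex_count": n,
        "edge_count": m,
        # Maps a degree value to the number of vertices that have it.
        "degree_distribution": dict(Counter(degree.values())),
        "average_degree": 2 * m / n,
        # Actual edges divided by the n * (n - 1) / 2 possible edges.
        "density": 2 * m / (n * (n - 1)),
    }


vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
summary = graph_summary(vertices, edges)
```

These few numbers are usually the first thing to look at when trying to understand a user base in the aggregate.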

Page 25

Graph Analysis

Oh and btw…

ALL STANDARD DATA ANALYSIS TECHNIQUES!!!

Page 26

Solutions

Gremlin is a functional, data-flow language that enables users to succinctly express complex traversals on (or queries of) their application's property graph.

Apache TinkerPop™ is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP).

A scale-out property graph database built on DataStax Enterprise, Apache Cassandra, and…

Apache Spark™ is a fast and general engine for large-scale data processing.

Page 27

Some Social Network Analysis Questions…

I want to understand our user/customer base in the aggregate.

Counts, Degree Distribution, Density

What are the underlying communities among our users/customers?

Community Detection, Modularity

I need to mediate a conflict between some groups of employees. Who should I talk to?

Bridges & Brokers -> Centrality, PageRank

Page 28

Further Learning

Gremlin Recipes

http://tinkerpop.apache.org/docs/current/recipes/

Lada Adamic

Computational Social Scientist @ Facebook

http://www.ladamic.com/

Stanford University - Social and Economic Networks: Models and Analysis

https://www.coursera.org/course/networksonline

Page 29

Try it yourself!!!

Twitter Exporter

https://github.com/rjbriody/twitter-exporter

Studio Notebook Gist

https://gist.github.com/rjbriody/1aa82bd8952dc4a46a6fa597716c1987

DSE Graph

https://docs.datastax.com/en/latest-dse/datastax_enterprise/graph/graphTOC.html

Studio

http://docs.datastax.com/en/latest-studio/

Page 30

Find Me

www.bobbriody.com

Twitter

@bobbriody

https://twitter.com/bobbriody

Github

rjbriody

https://github.com/rjbriody
