T : @markrittman HOW A TWEET WENT VIRAL Mark Rittman, Oracle ACE Director & Independent Analyst MJR Analytics ltd (http://www.mjr-analytics.com) BIWA SUMMIT 2017, SAN FRANCISCO

How a Tweet Went Viral - BIWA Summit 2017

Apr 13, 2017


Mark Rittman
Transcript
Page 1: How a Tweet Went Viral - BIWA Summit 2017

T : @markrittman

HOW A TWEET WENT VIRAL

Mark Rittman, Oracle ACE Director & Independent Analyst
MJR Analytics ltd (http://www.mjr-analytics.com)

BIWA SUMMIT 2017, SAN FRANCISCO

Page 2: How a Tweet Went Viral - BIWA Summit 2017

About The Presenter

•Oracle ACE Director, now Independent Analyst
•Past ODTUG Executive Board Member
•Author of two books on Oracle BI
•Co-founder & CTO of Rittman Mead
•15+ Years in Oracle BI, DW, ETL + now Big Data
•Now working in analytics product management + strategy
•Host of the Drill to Detail Podcast (www.drilltodetail.com)
•Based in Brighton & work in London, UK

2

Page 3: How a Tweet Went Viral - BIWA Summit 2017

3

HOME AUTOMATION

3

Page 4: How a Tweet Went Viral - BIWA Summit 2017

Home Automation and Smart ‘IoT’ Devices

•One of my personal interests is Home Automation
•Started with Nest thermostat and Philips Hue lights
•Extended the Nest system to include Nest Protect and Nest Cam
•Used Apple HomeKit, Apple TV for Siri voice control
•Added Samsung SmartThings hub for Z-Wave, Zigbee compatibility
•Linked SmartThings to HomeKit using the open-source Homebridge project to enable Siri control
•Added Logitech Harmony for TV, Console, Roku
•Then Amazon Echo (x2) and Echo Dots (x4) to extend voice control + add Alexa skills
•… and then Google Home + Chromecasts for hangouts, Google Assistant + Google Search

Diagram labels: Philips Hue Lighting; Nest Protect (x2), Thermostat, Cam; Withings Smart Scales; AirPlay Speakers; Homebridge HomeKit / SmartThings Connector; Samsung SmartThings Hub (Z-Wave, Zigbee); Door, Motion, Moisture, Presence Sensors; Apple HomeKit, Apple TV, Siri

4

Page 5: How a Tweet Went Viral - BIWA Summit 2017

Voice Control - When Home Automation Gets Real

•Position multiple units around the house for ubiquitous voice control and music playback
•Integration with smart home devices
•Use ML algorithms in the cloud, constantly improving and leveraging cloud-scale processing

“Alexa, turn on the kitchen lights” (Amazon Echo)
“Hey Google, turn up the heating” (Google Home)

Diagram labels: Philips Hue Lighting; Nest Protect (x2), Thermostat, Cam; Withings Smart Scales; AirPlay Speakers; Samsung SmartThings Hub (Z-Wave, Zigbee); Door, Motion, Moisture, Presence Sensors

5

Page 6: How a Tweet Went Viral - BIWA Summit 2017

ONE DAY BACK IN SEPTEMBER 2016 …

6

Page 7: How a Tweet Went Viral - BIWA Summit 2017

7

Page 8: How a Tweet Went Viral - BIWA Summit 2017

THE FOLLOWING MORNING…

8

Page 9: How a Tweet Went Viral - BIWA Summit 2017

9

Page 10: How a Tweet Went Viral - BIWA Summit 2017

10

Page 11: How a Tweet Went Viral - BIWA Summit 2017

11

Page 12: How a Tweet Went Viral - BIWA Summit 2017

12

Page 13: How a Tweet Went Viral - BIWA Summit 2017

13

Page 14: How a Tweet Went Viral - BIWA Summit 2017

14

Page 15: How a Tweet Went Viral - BIWA Summit 2017

15

Page 16: How a Tweet Went Viral - BIWA Summit 2017

BUT WAIT…

16

Page 17: How a Tweet Went Viral - BIWA Summit 2017

THIS COULD BE INTERESTING…

17

Page 18: How a Tweet Went Viral - BIWA Summit 2017

18

All Device Data at Home Logged to Hadoop Cluster

Page 19: How a Tweet Went Viral - BIWA Summit 2017

Other Personal Project : Home + Wearables Analytics

•Data extracted or transported to target platform using LogStash, CSV file batch loads
•Landed into HDFS as JSON documents, then exposed as Hive tables using Storage Handler
•Cataloged, visualised and analysed using Oracle Big Data Discovery + Python ML

Diagram labels (“Personal” Data Lake):
•Data sources: Health Data; Unstructured Comms Data; Smart Home Sensor Data
•Data Transfer: LogStash via HTTP; Manual CSV U/L; Data streams (CSV, IFTTT or API call)
•Data Factory: 6 Node Hadoop Cluster (CDH5.5)
•Raw JSON log files in HDFS: each document an event, daily record or comms message
•Hive Tables w/ Elastic Storage Handler: index data turned into tabular format
•Data Access: Jupyter Web Notebook; Oracle DV Desktop
•Discovery & Development Labs: Oracle Big Data Discovery 1.2; BDD Shell, Python, Spark ML
•Data sets and samples; Models and programs; Models

19
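
As a rough illustration of the "JSON documents turned into tabular format" step, the sketch below flattens one-event-per-line JSON into fixed-column rows in plain Python. It is not the Hive storage handler used in the talk. The column names (ts, device, event, value) and the sample records are invented for illustration, since the slides do not show the real event schema.

```python
import json

# Hypothetical column set; the real event schema is not shown in the slides.
COLUMNS = ["ts", "device", "event", "value"]

def to_rows(json_lines, columns=COLUMNS):
    """Turn one-JSON-document-per-line text into tabular rows (lists)."""
    rows = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        doc = json.loads(line)
        # Fields missing from a document become None, like NULLs in a table
        rows.append([doc.get(col) for col in columns])
    return rows

# Invented sample documents, one event per line
sample = [
    '{"ts": "2016-09-01T07:00:00Z", "device": "kitchen_motion", "event": "active"}',
    '{"ts": "2016-09-01T07:05:00Z", "device": "thermostat", "event": "temp", "value": 20.5}',
]

for row in to_rows(sample):
    print(row)
```

Keeping the column list explicit means every row has the same shape even when individual events omit a field, which is what a tabular view over semi-structured logs needs.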

Page 20: How a Tweet Went Viral - BIWA Summit 2017

20

Page 21: How a Tweet Went Viral - BIWA Summit 2017

21

Page 22: How a Tweet Went Viral - BIWA Summit 2017

22

Page 23: How a Tweet Went Viral - BIWA Summit 2017

THIS TIME LAST YEAR…

23

Page 24: How a Tweet Went Viral - BIWA Summit 2017

24

Page 25: How a Tweet Went Viral - BIWA Summit 2017

Oracle Big Data Spatial & Graph

•Graph, spatial and raster data processing for big data
•Primarily documented + tested against Oracle BDA
•Installable on commodity cluster using CDH
•Data stored in Apache HBase or Oracle NoSQL DB
•Complements Spatial & Graph in Oracle Database
•Designed for trillions of nodes, edges etc
•Out-of-the-box spatial enrichment services
•Over 35 of most popular graph analysis functions
•Graph traversal, recommendations
•Finding communities and influencers
•Pattern matching

25

Page 26: How a Tweet Went Viral - BIWA Summit 2017

CAN WE USE GRAPH ANALYSIS AND ORACLE BIG DATA TO FIND OUT…

26

Page 27: How a Tweet Went Viral - BIWA Summit 2017

HOW THIS TWEET WENT VIRAL?

27

Page 28: How a Tweet Went Viral - BIWA Summit 2017

28

Page 29: How a Tweet Went Viral - BIWA Summit 2017

AND AROUND THE WORLD IN 24 HOURS?

29

Page 30: How a Tweet Went Viral - BIWA Summit 2017

30

3,454 tweets, retweets and mentions over 48 hours

3,017 Twitter users commenting

30+ countries where ‘WiFi Kettle’ became a meme or news item

Page 31: How a Tweet Went Viral - BIWA Summit 2017

Loading Tweets (Edges) And Users (Vertices)

•Tweets in HDFS files processed and transformed into OBDGS file format for HBase load

Vertex File (.opv)
• Unique ID for the vertex (integer added via sequence in ODI)
• Property name (“name”, “followers”)
• Vertex Property datatype and value

Edge File (.ope)
• Unique ID for the edge
• Leading edge vertex ID
• Trailing edge vertex ID
• Edge Type (“tweet”)
• Edge Property (“timestamp” or “location”)
• Edge Property datatype and value
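
To make the two file layouts more concrete, here is a minimal Python sketch that writes records in the field order listed above. The comma-separated layout with separate string, number and date value slots, and the type codes (1 = string, 2 = integer), are assumptions based on Oracle's documented property graph flat-file format and should be checked against the version in use. The IDs, follower count and location value are invented, and the talk itself generated the files with ODI rather than Python.

```python
# Type codes assumed from Oracle's flat-file documentation (verify before use)
STRING, INTEGER = 1, 2

def _value_slots(type_code, value):
    """Return the (string, number) value slots; only one is filled per record."""
    if type_code == STRING:
        return str(value), ""
    return "", str(value)

def vertex_line(vid, key, type_code, value):
    """One .opv record: vertex ID, property name, datatype, then value slots."""
    text, number = _value_slots(type_code, value)
    return f"{vid},{key},{type_code},{text},{number},"

def edge_line(eid, src, dst, label, key, type_code, value):
    """One .ope record: edge ID, leading and trailing vertex IDs, edge type, property."""
    text, number = _value_slots(type_code, value)
    return f"{eid},{src},{dst},{label},{key},{type_code},{text},{number},"

# Invented sample values; assumes strings contain no commas or other special characters
print(vertex_line(1, "name", STRING, "markrittman"))
print(vertex_line(1, "followers", INTEGER, 100))
print(edge_line(1000, 2, 1, "tweet", "location", STRING, "UK"))
```

Each property of a vertex or edge gets its own line, so a user with both a name and a follower count appears twice in the vertex file under the same ID.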

Page 32: How a Tweet Went Viral - BIWA Summit 2017

Oracle Big Data Graph And Spatial Architecture

•Data loaded from files or through Java API into HBase
•In-Memory Analytics layer runs common graph and spatial algorithms on data
•Visualised using Cytoscape, R or in this example, Tom Sawyer Perspectives

Massively Scalable Graph Store
• Oracle NoSQL
• HBase

Lightning-Fast In-Memory Analytics
• YARN Container
• Standalone Server
• Embedded

32

Page 33: How a Tweet Went Viral - BIWA Summit 2017

Loading Edges And Vertices Into HBase

cfg = GraphConfigBuilder.forPropertyGraphHbase() \
  .setName("connectionsHBase") \
  .setZkQuorum("bigdatalite").setZkClientPort(2181) \
  .setZkSessionTimeout(120000).setInitialEdgeNumRegions(3) \
  .setInitialVertexNumRegions(3).setSplitsPerRegion(1) \
  .build();

opg = OraclePropertyGraph.getInstance(cfg);
opg.clearRepository();

vfile="../../data/kettle_nodes.opv"
efile="../../data/kettle_edges.ope"

opgdl=OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, vfile, efile, 2);

// read through the vertices
opg.getVertices();

// read through the edges
opg.getEdges();

Uses “Gremlin” Shell for HBase
• Creates connection to HBase
• Sets initial configuration for database
• Builds the database ready for load

• Defines location of Vertex and Edge files
• Creates instance of OraclePropertyGraphDataLoader
• Loads data from files

• Prepares the property graph for use
• Loads in Edges and Vertices
• Now ready for in-memory processing

33

Page 34: How a Tweet Went Viral - BIWA Summit 2017

Visualize And Analyze Using Cytoscape Plugin

•Plugin created by Oracle to add to open-source Cytoscape analysis tool
•Connects to HBase or NoSQL property graph
•Connects to PGX analytics engine
•Runs Page Rank and other analyses
•Visualizes property graph on-screen
•Searches for nodes and edges using Apache Solr search engine

34

Page 35: How a Tweet Went Viral - BIWA Summit 2017

Top 5 Influencers Based On Mentions, Retweets

35

Page 36: How a Tweet Went Viral - BIWA Summit 2017

36

@markrittman

@erinscafe

@internetofshit

Page 37: How a Tweet Went Viral - BIWA Summit 2017

Role Of Network Influencers In Meme Propagation

•The story was picked up by several influential Twitter users and online news sites
•ErinsCafe, BoingBoing, Internet of Sh*t
•Featured as a “Twitter Moment” on Day 1 PM
•Guardian Newspaper website Day 2 AM
•Influencers identified in two ways
•By number of followers in Twitter profile
•By number of connecting edges in tweets Property Graph using Page Rank algorithm

37
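
The second way of identifying influencers, ranking users by their connecting edges with Page Rank, can be sketched in a few lines of plain Python. This is a generic power-iteration version for illustration only, not the implementation in Oracle's PGX engine. The user names and edges are invented, and the damping factor of 0.85 is the commonly used default rather than a value from the talk.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Rank nodes of a directed graph given (source, target) edge pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

# Invented example: each edge points from a retweeting user to the user retweeted
edges = [("user_a", "hub"), ("user_b", "hub"), ("user_c", "hub")]
ranks = pagerank(edges)
for user in sorted(ranks, key=ranks.get, reverse=True):
    print(user, round(ranks[user], 3))
```

Unlike a raw follower count, this score depends only on who interacted with whom inside the conversation, so a user with a modest following can still rank highly if many participants retweeted or mentioned them.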

•But … how did they hear about the story?

Page 38: How a Tweet Went Viral - BIWA Summit 2017

Understanding How A User Joined Conversation

38

Page 39: How a Tweet Went Viral - BIWA Summit 2017

Visualising Potential Story Paths To Influencer Nodes

39

@philjoneswired

But did this tweet cause, or just comment on, the virality?

We need to see the timeline…

Page 40: How a Tweet Went Viral - BIWA Summit 2017

New Cytoscape Plugin Feature - Timeline Analysis

•Filters PGX analysis on timestamp edge or vertex property when present in property graph
•Select start date, optional end date for filter
•Supports two-sided timeline in directed graphs
•View property graph as it develops over time

40
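
The idea behind timeline filtering, keeping only edges inside a date range and then viewing the graph period by period, can be sketched in plain Python. This is an illustration of the concept and not the plugin's code. Only the "timestamp" property name comes from the slides; the edge records, dates and the three-hour period length used here are assumptions for the example.

```python
from collections import Counter
from datetime import datetime

def in_window(edges, start, end=None):
    """Keep edges on or after start and, if an end is given, before end."""
    return [
        edge for edge in edges
        if edge["timestamp"] >= start and (end is None or edge["timestamp"] < end)
    ]

def bucket(ts, hours=3):
    """Round a timestamp down to the start of its N-hour period."""
    return ts.replace(hour=ts.hour - ts.hour % hours, minute=0, second=0, microsecond=0)

# Invented edges for illustration
edges = [
    {"src": "user_a", "dst": "origin", "timestamp": datetime(2016, 9, 1, 10, 15)},
    {"src": "user_b", "dst": "origin", "timestamp": datetime(2016, 9, 1, 11, 40)},
    {"src": "user_c", "dst": "user_a", "timestamp": datetime(2016, 9, 1, 15, 5)},
    {"src": "user_d", "dst": "user_a", "timestamp": datetime(2016, 9, 2, 6, 30)},
]

# Filter to the first day, then count edges per three-hour period
day_one = in_window(edges, datetime(2016, 9, 1), datetime(2016, 9, 2))
counts = Counter(bucket(edge["timestamp"]) for edge in day_one)
for period in sorted(counts):
    print(period, counts[period])
```

Making the end date optional mirrors the filter described above: with only a start date, everything from that point onward stays in view.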

Page 41: How a Tweet Went Viral - BIWA Summit 2017

Tom Sawyer Perspectives For Social Network Analysis

•The Timeline Analysis plugin for Cytoscape is useful and helps us filter by date range
•Another option for visualising property graphs is Tom Sawyer Perspectives
•Timeline analysis down to the hour - 3hr periods are perfect for this analysis
•Map visualization, network visualization
•Prototype using subsets of tweets in CSV files, or connect to full HBase/NoSQL dataset

42

Page 42: How a Tweet Went Viral - BIWA Summit 2017

43

Day 1 : 10am GMT

Page 43: How a Tweet Went Viral - BIWA Summit 2017

44

Day 1 : 3pm GMT

Page 44: How a Tweet Went Viral - BIWA Summit 2017

45

Day 1 : 6pm GMT

Page 45: How a Tweet Went Viral - BIWA Summit 2017

46

Day 1 : 8pm GMT

Page 46: How a Tweet Went Viral - BIWA Summit 2017

47

Day 2 : 6am GMT

Page 47: How a Tweet Went Viral - BIWA Summit 2017

IT WAS @ERINSCAFE

48

Page 48: How a Tweet Went Viral - BIWA Summit 2017

IT WAS @ERINSCAFE

49

Page 49: How a Tweet Went Viral - BIWA Summit 2017

50

Page 50: How a Tweet Went Viral - BIWA Summit 2017

51

Page 51: How a Tweet Went Viral - BIWA Summit 2017

AND USING SPATIAL CO-ORDINATES IN THE TWEET METADATA…

52

Page 52: How a Tweet Went Viral - BIWA Summit 2017
Page 53: How a Tweet Went Viral - BIWA Summit 2017

Conclusions

•Tweet went viral because it was picked up by a very well-connected Twitter user
•And why did that happen? Probably because the story “had legs”…
•Some mentions in the Twitter-verse before this, but main viral explosion due to @erinscafe
•All subsequent activity, including mentions by @guardian, @internetofsh*t, followed that
•You can analyse Twitter and other meme breakouts using Oracle Big Data Spatial & Graph
•New Timeline Analysis feature in Cytoscape Plugin useful for time-slice analysis of data
•Tom Sawyer Perspectives provides even more visualisation incl. mapping analysis capabilities
•Thank you to Alan Wu, Juan Francisco & Hans Viehmann from Oracle, and Kevin Madden & Austris Krastiņš from Tom Sawyer for their help with the demos

54

Page 54: How a Tweet Went Viral - BIWA Summit 2017

T : @markrittman

HOW A TWEET WENT VIRAL

Mark Rittman, Oracle ACE Director & Independent Analyst
MJR Analytics ltd (http://www.mjr-analytics.com)

BIWA SUMMIT 2017, SAN FRANCISCO