Discovering Key Moments in Social Media Streams

Cody Buntain, Human-Computer Interaction Lab, University of Maryland
Jimmy Lin, University of Waterloo
Jennifer Golbeck, University of Maryland

CCNC’16, 11 January 2016, Las Vegas, NV

Jan 26, 2017

Cody Buntain
Page 1: Discovering Key Moments from Social Media Streams

Page 2: Discovering Key Moments from Social Media Streams

Introduction

Page 4: Discovering Key Moments from Social Media Streams

Most event detection systems track human-generated seed keywords

Figure: Tweets per second mentioning “gol copa, gool copa, goool, golaço” during the match on June 12th, 2014 [1]

Figure: Tweets per hour related to earthquakes [2]

Page 5: Discovering Key Moments from Social Media Streams

Typical Approach

Step 1: Identify Keywords (goal, score)

Step 2: Find Bursts

Page 6: Discovering Key Moments from Social Media Streams

Weaknesses

goal == gooooal?
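
The weakness asked about on this slide can be illustrated with a minimal sketch: an exact-match keyword filter never fires on an elongated spelling such as "gooooal". The function name `mentions_seed` and the sample messages are hypothetical, made up for illustration only.

```python
def mentions_seed(text, seeds):
    """Return True if any whitespace-separated token exactly equals a seed keyword."""
    tokens = text.lower().split()
    return any(token in seeds for token in tokens)

# Seed keywords from the "Typical Approach" slide.
seeds = {"goal", "score"}

# Hypothetical messages; only the first uses the exact seed spelling.
tweets = ["What a goal", "gooooal", "golaço"]

matched = [t for t in tweets if mentions_seed(t, seeds)]
print(matched)
```

Only the first message matches, so a burst made up mostly of variant spellings would be undercounted by a seed-keyword tracker.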

Page 7: Discovering Key Moments from Social Media Streams

Can we identify interesting moments without seed tokens?

Page 10: Discovering Key Moments from Social Media Streams

Step 1: Identify Keywords (goal, score)

Step 2: Find Bursts

LABurst Algorithm

goooal, 0-1, 0:1, 1-0, gollll, holandaaaa, penal, penalti, persie

Page 11: Discovering Key Moments from Social Media Streams

LABurst Algorithm

Discover Unanticipated Moments

Identify Keywords (suarez, bit, biting)

Page 12: Discovering Key Moments from Social Media Streams

Methods

Page 13: Discovering Key Moments from Social Media Streams

193 Key Moments

Page 14: Discovering Key Moments from Social Media Streams

Can we transfer these sports-trained models to more impactful domains?

Page 15: Discovering Key Moments from Social Media Streams

Event                                  Tweet Count

Training Data
2010 NFL Division Championship         109,809
2012 Premier League Soccer Games       1,064,040
2014 NHL Stanley Cup Playoffs          2,421,065
2014 NBA Playoffs                      500,170
2014 Kentucky Derby Horse Race         233,172
2014 Belmont Stakes Horse Race         226,160
2014 FIFA World Cup Stages A+B         5,867,783

Testing Data
2013 MLB World Series Game 5           1,052,852
2013 MLB World Series Game 6           1,026,848
2013 Honshu Earthquake                 444,018
2014 NFL Super Bowl                    1,024,367
2014 FIFA World Cup Third Place        809,426
2014 FIFA World Cup Final              1,166,767
2014 Iwaki Earthquake                  358,966

Total                                  16,305,443

Page 16: Discovering Key Moments from Social Media Streams

LABurst learns bursts from sporting event data

Page 17: Discovering Key Moments from Social Media Streams

How do we model these bursts?

Extract Tokens

Page 19: Discovering Key Moments from Social Media Streams

How do we model these bursts?

Token Feature Vector v: Freq. Regression, ΔAverage Freq., Inter-Arrival Time, Message Entropy, Network Density, TF-IDF, TF-PDF¹, BursT²
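
The slide names the features but does not define them, so the following is only a sketch of what three of them might look like for a single token over a sliding window. The definitions are assumptions, not the authors' formulas: frequency regression is taken as a least-squares slope, ΔAverage Freq. as the latest count minus the mean of the earlier counts, and inter-arrival time as the mean gap between mentions. All function names and data are hypothetical.

```python
from statistics import mean

def regression_slope(counts):
    """Least-squares slope of counts against their index (0, 1, 2, ...)."""
    n = len(counts)
    x_mean = (n - 1) / 2
    y_mean = mean(counts)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(counts))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

def delta_average(counts):
    """Latest count minus the mean of the earlier counts in the window."""
    return counts[-1] - mean(counts[:-1])

def mean_inter_arrival(timestamps):
    """Mean gap in seconds between consecutive mentions of the token."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(gaps)

# Hypothetical per-minute counts of one token over a five-minute window.
counts = [2, 2, 2, 2, 12]
# Hypothetical arrival times (seconds) of that token's mentions.
timestamps = [0, 10, 20, 30]

features = [
    regression_slope(counts),
    delta_average(counts),
    mean_inter_arrival(timestamps),
]
print(features)
```

A vector like this, extended with the remaining features, is what a bursty-or-not classifier would take as input for each token.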

Page 20: Discovering Key Moments from Social Media Streams

Bursty Classifier

Input: Token Feature Vector v

Classifiers: SVM, Random Forests, Ensemble

Output: Bursty or Not?

Page 21: Discovering Key Moments from Social Media Streams

The more tokens that experience bursts in a given minute, the more important the moment

Key moment!
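
The scoring idea on this slide can be sketched as a simple count-and-threshold step. This assumes a per-token classifier has already decided which tokens are bursting in each minute; the function name `key_moments`, the input layout, and the threshold value are hypothetical.

```python
def key_moments(bursty_tokens_by_minute, threshold):
    """Return the minutes whose number of bursting tokens meets the threshold."""
    return [
        minute
        for minute, tokens in sorted(bursty_tokens_by_minute.items())
        if len(tokens) >= threshold
    ]

# Hypothetical classifier output: minute -> set of tokens judged bursty.
bursty = {
    1: {"persie"},
    2: set(),
    3: {"goal", "gol", "penal", "persie", "1-0"},
}

print(key_moments(bursty, threshold=3))
```

Lowering the threshold flags more minutes, which is the knob swept when the method is compared against the baselines.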

Page 22: Discovering Key Moments from Social Media Streams

Evaluation

We evaluate LABurst by comparing it against two baseline methods

Page 23: Discovering Key Moments from Social Media Streams

Baseline 1: RawBurst

Find “bursts” in Twitter’s raw message frequency

Current Freq – Avg Freq ⩼ Threshold

If the difference exceeds the threshold: KEY MOMENT!
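
A minimal sketch of the RawBurst test follows. The slide does not say how the average frequency is computed, so a trailing window of the previous minutes is assumed here; the function name, window size, threshold, and counts are all hypothetical.

```python
def raw_burst(counts, window, threshold):
    """Flag indices whose count exceeds the mean of the previous `window` counts by more than `threshold`."""
    flagged = []
    for i in range(window, len(counts)):
        avg = sum(counts[i - window:i]) / window
        if counts[i] - avg > threshold:
            flagged.append(i)
    return flagged

# Hypothetical messages-per-minute series with one spike at index 4.
per_minute = [100, 110, 90, 100, 400, 120]

print(raw_burst(per_minute, window=3, threshold=150))
```

Only the spike at index 4 is flagged; the minute after it is not, because the spike itself has raised the trailing average.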

Page 24: Discovering Key Moments from Social Media Streams

Baseline 2: TokenBurst

Modify RawBurst to use frequency of pre-specified seed tokens

Current Freq – Avg Freq ⩼ Threshold

Sport          Seed Tokens
World Series   run, home, homerun
Super Bowl     score, touchdown, td, fieldgoal, points
World Cup      goal, gol, golazo, score, foul, penalty, card, red, yellow, points
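
TokenBurst can be sketched the same way, counting only messages that contain a seed token before applying the frequency test. The seed set is the World Cup row of the table; the trailing-window average, whitespace tokenization, function names, and sample stream are assumptions for illustration.

```python
def seed_counts(tweets_by_minute, seeds):
    """Count, per minute, the tweets containing at least one seed token."""
    return [
        sum(1 for text in tweets if seeds & set(text.lower().split()))
        for tweets in tweets_by_minute
    ]

def token_burst(counts, window, threshold):
    """Apply the current-minus-average test to the seed-token counts."""
    return [
        i
        for i in range(window, len(counts))
        if counts[i] - sum(counts[i - window:i]) / window > threshold
    ]

world_cup_seeds = {
    "goal", "gol", "golazo", "score", "foul",
    "penalty", "card", "red", "yellow", "points",
}

# Hypothetical stream: one list of tweets per minute.
stream = [
    ["kick off", "come on"],
    ["nice pass", "goal kick soon"],
    ["GOAL", "what a goal", "gol gol gol", "gooooal"],
]

counts = seed_counts(stream, world_cup_seeds)
print(counts, token_burst(counts, window=2, threshold=1))
```

The last minute is flagged, but "gooooal" contributes nothing to its count, which is the dependence on seed spelling that LABurst is meant to avoid.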

Page 25: Discovering Key Moments from Social Media Streams

Compare using ROC-AUC

LABurst Threshold: Number of tokens experiencing a burst in this minute

Baseline Thresholds: Difference between current frequency and average frequency
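
ROC-AUC summarizes performance across every possible threshold, so methods with different kinds of threshold can be compared on one scale. The sketch below uses the standard pairwise-ranking formulation (the fraction of positive/negative pairs ordered correctly, with ties counted as half); the scores and labels are hypothetical, and this is not the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-minute scores (e.g. number of bursting tokens)
# and ground truth (1 = a key moment occurred in that minute).
scores = [0, 1, 5, 2, 7, 1]
labels = [0, 0, 1, 0, 1, 1]

print(roc_auc(scores, labels))
```

The same function accepts the baselines' scores (current frequency minus average frequency) without modification, since only the ordering of scores matters.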

Page 26: Discovering Key Moments from Social Media Streams

Results

How well does our method perform?

10-Fold Cross Validation

Best scoring LABurst ensemble classifier: ROC-AUC of 89.84% for training data

Page 27: Discovering Key Moments from Social Media Streams

Which features are the most important?

Feature Sets                 ROC-AUC   Difference
AdaBoost, All Features       89.84%    –
Without Regression           87.79%    -2.05
Without Entropy              87.94%    -1.90
Without TF-IDF               88.85%    -0.99
Without TF-PDF               89.00%    -0.84
Without Density              89.07%    -0.77
Without InterArrival         89.46%    -0.38
Without BursT                89.52%    -0.31
Without Average Difference   90.56%    +0.72
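
The Difference column is each ablated score minus the all-features score. A short sketch recomputes it from the percentages on the slide; the variable names are hypothetical. Recomputed from these rounded values, the BursT row comes out at -0.32 rather than the -0.31 shown, which may simply reflect rounding of the underlying scores.

```python
from decimal import Decimal

baseline = Decimal("89.84")  # AdaBoost, All Features (from the slide)

# ROC-AUC with one feature removed, copied from the slide.
without = {
    "Regression": Decimal("87.79"),
    "Entropy": Decimal("87.94"),
    "TF-IDF": Decimal("88.85"),
    "TF-PDF": Decimal("89.00"),
    "Density": Decimal("89.07"),
    "InterArrival": Decimal("89.46"),
    "BursT": Decimal("89.52"),
    "Average Difference": Decimal("90.56"),
}

diffs = {name: score - baseline for name, score in without.items()}

# Most negative difference first: removing that feature hurts the most.
for name, diff in sorted(diffs.items(), key=lambda item: item[1]):
    print(f"Without {name}: {diff:+}")
```

Read this way, the regression and entropy features contribute the most, and removing the average-difference feature is the only ablation that raises the score.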

Page 32: Discovering Key Moments from Social Media Streams

Composite ROC-AUC

Competitive without seed keywords or prior domain knowledge

Page 33: Discovering Key Moments from Social Media Streams

Discussion

Why is the Super Bowl hard?

Training/Testing Data:

Other Impactful Moments:

Page 34: Discovering Key Moments from Social Media Streams

What was bursting at these moments?

Match: Brazil v. Netherlands, 12 July 2014
Event: Netherlands' Van Persie scores a goal on a penalty at 3', 1-0
Bursty Tokens: 0-1, 1-0, 1:0, 1x0, card, goaaaaaaal, goal, gol, goool, holandaaaa, kırmızı, pen, penal, penalti, pênalti, persie, red

Match: Brazil v. Netherlands, 12 July 2014
Event: Brazil's Oscar gets a yellow card at 68'
Bursty Tokens: dive, juiz, penalty, ref

Match: Germany v. Argentina, 13 July 2014
Event: Germany’s Götze scores a goal at 113’, 1-0
Bursty Tokens: goaaaaallllllll, goalllll, godammit, goetze, gollllll, gooooool, gotze, gotzeeee, götze, nooo, yessss

Page 36: Discovering Key Moments from Social Media Streams

What other moments did LABurst discover?

LABurst vs. TokenBurst at World Cup Final

Moment: "puyol", "gisele", and "bundchen"

Page 37: Discovering Key Moments from Social Media Streams

What other moments did LABurst discover?

LABurst vs. Baseline at World Cup Final

Moment: "pipita", "higuaín", "", "pipa", "choke"

Page 38: Discovering Key Moments from Social Media Streams

Can these models be useful in other domains?

Earthquake Detection

Honshu, Japan Earthquake - 25 October 2013

Iwaki, Japan Earthquake - 11 July 2014

Simultaneously detects spikes about the earthquake

Also detects an aftershock

Page 39: Discovering Key Moments from Social Media Streams

Conclusions

Can discover key moments from Twitter streams without seed tokens

Page 43: Discovering Key Moments from Social Media Streams

Thank you! Questions?

Cody Buntain
@codybuntain
Human-Computer Interaction Lab, University of Maryland

Page 44: Discovering Key Moments from Social Media Streams

Backup Slides

Page 45: Discovering Key Moments from Social Media Streams

How do we train these classifiers?

Examples of Bursty Tokens: saints, peterson, 7-0, 1-0, touchdown, score, goal, penalty, td, fumble, persie, messi, tonalist

Examples of Non-Bursty Tokens: Stop Words (the, i, me, my, myself, we, our, ours, ourselves, you, before, after, above, below, to, from, up, down, in, out, on)