Discovering Credible Events in Near Real Time from Social Media Streams WWW2015 Cody Buntain, @codybuntain, [email protected] Advised by Dr. Jen Golbeck

Discovering Credible Events in Near Real Time from Social Media Streams

Aug 14, 2015


Cody Buntain
Transcript
Page 1: Discovering Credible Events in Near Real Time from Social Media Streams

Discovering Credible Events in Near Real Time from Social Media Streams

WWW2015

Cody Buntain, @codybuntain, [email protected]

Advised by Dr. Jen Golbeck

Page 3: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

2 Carvin, A. "The 2008 Mumbai Attacks As They Happened On Twitter." Storify.com. https://storify.com/acarvin/the-2008-mumbai-attacks-as-they-happened-on-twitte

Page 4: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

3 B. Parr. “#IranElection Crisis: A Social Media Timeline.” 21 June 2009. http://mashable.com/2009/06/21/iran-election-timeline/

Page 5: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

4 Huffington Post. "Chile Earthquake PICTURES: Twitter Photos Record The Wreckage In Chile (PHOTOS)." http://www.huffingtonpost.com/2010/02/27/chile-earthquake-pictures_n_479535.html

Page 7: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

5 Alex Nunns, Nadia Idle. "Tweets from Tahrir: Egypt's Revolution as it Unfolded, in the Words of the People who Made it." OR Books. 2011.

Page 8: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

6

E. Guskin, P. Hitlin. "Hurricane Sandy and Twitter." November 6, 2012. http://www.journalism.org/2012/11/06/hurricane-sandy-and-twitter/

C. Ngak. "Social media a news source and tool during Superstorm Sandy." October 30, 2012. http://www.cbsnews.com/news/social-media-a-news-source-and-tool-during-superstorm-sandy/

Page 9: Discovering Credible Events in Near Real Time from Social Media Streams

Social networks are

powerful resources

during crises.

7 C. Kanalley. "Boston Marathon Bombing Timeline: The Week in 50 Tweets, 5 Videos." Huffington Post. 21 April 2013. http://www.huffingtonpost.com/craig-kanalley/boston-marathon-bombing-timeline_b_3125721.html

Page 11: Discovering Credible Events in Near Real Time from Social Media Streams

8 http://srogers.cartodb.com/viz/64f6c0f4-745d-11e4-b4e1-0e4fddd5de28/embed_map

Page 22: Discovering Credible Events in Near Real Time from Social Media Streams

Social media is not completely

trustworthy.

13

C. Reinwald. "What Twitter Got Wrong During the Week Following Last Year’s Boston Marathon." Boston.com. 18 April 2014. http://www.boston.com/news/local/massachusetts/2014/04/18/what-twitter-got-wrong-during-the-boston-marathon-bombing-week/ZOYLJpEydYgJ8UYNUT674H/story.html

Page 25: Discovering Credible Events in Near Real Time from Social Media Streams

Social media is not completely

trustworthy.

14 Lu Wang, Whitney Kisling and Eric Lam. "Fake Post Erasing $136 Billion Shows Markets Need Humans." Bloomberg.com. 23 April, 2013. http://www.bloomberg.com/news/2013-04-23/fake-report-erasing-136-billion-shows-market-s-fragility.html

Page 29: Discovering Credible Events in Near Real Time from Social Media Streams

How can we decide what is important and what to trust in these streams?

15

Page 32: Discovering Credible Events in Near Real Time from Social Media Streams

And can we make these decisions rapidly?

Page 33: Discovering Credible Events in Near Real Time from Social Media Streams

Thesis Statement

By integrating machine learning and high-volume streams across social networks, one can detect high-impact events, identify specific occurrences within those events, and evaluate credibility of those occurrences in near real time.

Page 39: Discovering Credible Events in Near Real Time from Social Media Streams

Two Research Areas

• Rapid Event Discovery

• Credibility Analysis

Page 40: Discovering Credible Events in Near Real Time from Social Media Streams

Completed Work

Discovering Interesting Moments with Burst Detection

* www.iflscience.com

Page 41: Discovering Credible Events in Near Real Time from Social Media Streams

Typical event detection systems track human-generated, predefined keywords.

Tweets per second mentioning “gol copa, gool copa, goool, golaço” during the match June 12th, 2014 [1]

Tweets per hour related to earthquakes [2]

[1] L. Cipriani, “Goal! Detecting the most important World Cup moments,” Twitter Developer’s Blog. 23 June 2014. https://blog.twitter.com/2014/goal-detecting-the-most-important-world-cup-moments

[2] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: real-time event detection by social sensors,” in Proceedings of the 19th international conference on World wide web, 2010, pp. 851–860.

Page 43: Discovering Credible Events in Near Real Time from Social Media Streams

Typical Approach

Step 1: Identify Keywords (goal, score)

Step 2: Find Bursts
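
The two steps of the typical approach can be sketched as follows. This is a minimal illustration and not the implementation of any system cited in these slides: the function name `keyword_bursts`, the windowed input format, and the mean-plus-k-standard-deviations threshold are all assumptions made here.

```python
from statistics import mean, pstdev

def keyword_bursts(windows, keywords, k=2.0):
    """Return indices of time windows where keyword mentions spike.

    windows  -- list of lists of message strings, one inner list per time window
    keywords -- predefined, human-chosen keywords (Step 1)
    k        -- hypothetical threshold: std-devs above the mean that count as a burst
    """
    kw = [w.lower() for w in keywords]
    # Count the messages in each window that mention at least one keyword.
    counts = [sum(any(w in msg.lower() for w in kw) for msg in msgs)
              for msgs in windows]
    mu, sigma = mean(counts), pstdev(counts)
    # Step 2: flag windows whose count exceeds the overall mean by k std-devs.
    return [i for i, c in enumerate(counts) if sigma > 0 and c > mu + k * sigma]
```

Because the counts depend entirely on the seed list, a moment that nobody anticipated, or one discussed in a language missing from the list, never produces a spike.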

Page 48: Discovering Credible Events in Near Real Time from Social Media Streams

Weaknesses

• Unable to detect unanticipated events

• Disregards languages not represented in the keyword list

• Often requires language models for text normalization

• Limited insight into the context of the event

25

Page 52: Discovering Credible Events in Near Real Time from Social Media Streams

LABurst Algorithm

Step 1: Identify Keywords (goal, score)

Step 2: Find Bursts

goooal, 0-1, 0:1, 1-0, gollll, holandaaaa, penal, penalti, persie
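
To illustrate how bursty tokens such as those above could surface without a seed list, here is a minimal sketch. It is not LABurst itself, whose details these slides do not give: the whitespace tokenization, the function name `bursty_tokens`, and the `min_count` and `ratio` thresholds are all assumptions introduced for illustration.

```python
from collections import Counter

def bursty_tokens(previous_msgs, current_msgs, min_count=3, ratio=3.0):
    """Return tokens whose frequency jumps in the current window.

    No seed keywords are used: every token in the stream is a candidate.
    min_count and ratio are hypothetical thresholds chosen for illustration.
    """
    prev = Counter(tok for m in previous_msgs for tok in m.lower().split())
    curr = Counter(tok for m in current_msgs for tok in m.lower().split())
    bursts = []
    for tok, n in curr.items():
        # Add-one smoothing so tokens unseen in the previous window do not divide by zero.
        if n >= min_count and n / (prev[tok] + 1) >= ratio:
            bursts.append(tok)
    return sorted(bursts)
```

Since every token is a candidate, spellings like "goooal" or names like "persie" can be flagged without anyone having listed them in advance.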

Page 55: Discovering Credible Events in Near Real Time from Social Media Streams

LABurst Algorithm

Discover Unexpected Moments

Identify Keywords: suarez, bit, biting

Page 56: Discovering Credible Events in Near Real Time from Social Media Streams

Performance vs. Baselines

28

Page 60: Discovering Credible Events in Near Real Time from Social Media Streams

Can LABurst be useful in

other domains?

29

Earthquake Detection

Honshu, Japan Earthquake - 25 October 2013

Iwaki, Japan Earthquake - 11 July 2014

Page 61: Discovering Credible Events in Near Real Time from Social Media Streams

Proposed Work

• Rapid Event Discovery

• Credibility Analysis

Page 63: Discovering Credible Events in Near Real Time from Social Media Streams

Rapid Event Discovery

Extending LABurst to New Domains and Scales

Page 69: Discovering Credible Events in Near Real Time from Social Media Streams

Is LABurst adaptable to more critical

events?

34

Page 72: Discovering Credible Events in Near Real Time from Social Media Streams

Events occur across

different temporal and geographical

scales.

35

Japanese Earthquake, Oct. 2013

Boston Marathon, 15 Apr. 2013

Page 74: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How well do sports models transfer to non-sports events?

36

Event                          Date
Argentine Cacerolazo Protest   8 Nov. 2012
Boston Marathon Bombing        15 Apr. 2013
Westgate Mall Attack           21 Sept. 2013
Ukrainian Revolution           18-23 Feb. 2014
Ferguson, MO Protests          9-19 Aug. 2014

Page 75: Discovering Credible Events in Near Real Time from Social Media Streams

vs.

Events Discovered by LABurst

Page 77: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we track related moments?

38

Similarity

Page 79: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we track related moments?

39

Per-Half-Hour Bursty Tokens/Topics and Per-Minute Bursty Tokens (figure): argentina, deutschland, germany, arg, ger, worldcup, goalkick, gol, goaaaalll, goalll, gooolll, goetze

Semantic Similarity
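
One simple proxy for how related two moments are is the overlap between their bursty-token sets. The slides do not say which similarity measure is intended, so the Jaccard coefficient below is an assumption, as is the function name.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard overlap between two collections of bursty tokens (0.0 to 1.0)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0  # two empty moments share nothing measurable
    return len(a & b) / len(a | b)
```

A score near 1.0 would suggest two moments belong to the same ongoing event, while a score near 0.0 would suggest unrelated activity. Pure token overlap misses spelling variants such as "gol" and "gooolll", which is one reason a richer semantic measure may be preferable.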

Page 80: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we track related moments?

40

Moment 1

Moment 2

Location Similarity
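
Location similarity could, for instance, be derived from the distance between representative coordinates of two moments. The sketch below assumes each moment has already been reduced to a single (latitude, longitude) pair, which the slides do not specify; the function name and the mean Earth radius of 6371 km are likewise illustrative choices.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * asin(sqrt(min(1.0, a)))
```

Two moments whose coordinates lie within a few kilometres of each other are more plausibly part of the same event than moments on different continents.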

Page 81: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we track related moments?

41

Moment 1

Moment 2

Network Similarity

Page 82: Discovering Credible Events in Near Real Time from Social Media Streams

Event Detection at Scale

42

Page 86: Discovering Credible Events in Near Real Time from Social Media Streams

46 M. Osborne, S. Moran, R. McCreadie, A. Von Lunen, M. Sykora, E. Cano, N. Ireson, C. Macdonald, I. Ounis, Y. He, and others, “Real-Time Detection, Tracking, and Monitoring of Automatically Discovered Events in Social Media,” Assoc. Comput. Linguist., 2014.

English “Security-Related” Keywords

Page 88: Discovering Credible Events in Near Real Time from Social Media Streams

48

LABurst at Scale

No Seed Keywords

Page 90: Discovering Credible Events in Near Real Time from Social Media Streams

Proposed Work

• Rapid Event Discovery

• Credibility Analysis

Page 93: Discovering Credible Events in Near Real Time from Social Media Streams

51

Credibility Analysis

Near Real-Time Credibility Evaluation of Discovered Events

Page 95: Discovering Credible Events in Near Real Time from Social Media Streams

One needs confidence in information

used to make critical

decisions.

52

“The Most Busted Name in News”

http://thedailyshow.cc.com/videos/9qx6fp/the-most-busted-name-in-news

Page 98: Discovering Credible Events in Near Real Time from Social Media Streams

Most credibility analyses focus on people rather than information.

[1] N. Diakopoulos, M. De Choudhury, and M. Naaman, “Finding and assessing social media information sources in the context of journalism,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2012, pp. 2451–2460.

53

Page 100: Discovering Credible Events in Near Real Time from Social Media Streams

Other approaches

trust a subset of curated,

“trustworthy” authors.

54

Page 103: Discovering Credible Events in Near Real Time from Social Media Streams

56

The “majority of content generated on Twitter during the [Mumbai terror attacks] comes from non authority users.” [1]

[1] A. Gupta and P. Kumaraguru, “Twitter explodes with activity in mumbai blasts! a lifeline or an unmonitored daemon in the lurking?,” 2012.

Page 104: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we evaluate the credibility of aggregate

event information?

57 http://davedecampblog.wordpress.com/shareyourstory/wordclouds/

Page 105: Discovering Credible Events in Near Real Time from Social Media Streams

58

[Figure: Credibility Ranking, a chart of Moment 1 through Moment 8 on an axis from 0 to 90]

Page 107: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we evaluate the credibility of aggregate

event information?

59

Sparse versus Dense Networks

More Credible

Less Credible

Moment 1 Moment 2
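
A sparse-versus-dense comparison needs some measure of how connected the users discussing a moment are. The sketch below computes ordinary undirected graph density from a list of user pairs. The edge-list input and the function name are assumptions, and the transcript does not preserve which of the two networks the slide labels as more credible, so the code only measures density and makes no credibility judgment.

```python
def network_density(edges):
    """Density of an undirected graph given as (user_a, user_b) pairs.

    Returns the fraction of possible user pairs that are actually connected.
    """
    nodes = {u for e in edges for u in e}
    # Ignore self-loops and count each unordered pair once.
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(unique) / (n * (n - 1))
```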

Page 109: Discovering Credible Events in Near Real Time from Social Media Streams

Research Question:

How can we evaluate the credibility of aggregate

event information?

“… false rumors tend to be questioned much more than confirmed truths…” [1]

[1] M. Mendoza, B. Poblete, and C. Castillo, “Twitter Under Crisis: Can We Trust What We RT?,” in Proceedings of the First Workshop on Social Media Analytics, 2010, pp. 71–79.

Discord

Moment 1 Moment 2

More Credible

Less Credible
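
Following the observation quoted from Mendoza et al., discord could be approximated as the share of messages about a moment that question it. The sketch below is only an illustration: the cue list `QUESTION_CUES` and the substring matching are hypothetical stand-ins, and a real system would likely need a trained classifier rather than a handful of English phrases.

```python
# Hypothetical cues that a message is questioning a claim; not taken from the slides.
QUESTION_CUES = ("?", "really", "fake", "is this true", "unconfirmed")

def discord_score(messages):
    """Fraction of messages containing at least one questioning cue (0.0 to 1.0)."""
    if not messages:
        return 0.0
    questioned = sum(any(cue in m.lower() for cue in QUESTION_CUES) for m in messages)
    return questioned / len(messages)
```

Under the quoted finding, a moment with a higher discord score would be a candidate for lower credibility, though the threshold at which that holds is an open question.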

Page 113: Discovering Credible Events in Near Real Time from Social Media Streams

Summary of Proposed Work

• Extending LABurst
  • Adapting sports models to crisis-level events
  • Tracking related moments
  • Detecting and tracking events from streaming sources at scale

• Credibility Analysis
  • Evaluating the credibility of aggregate event information in near real time

62

Page 114: Discovering Credible Events in Near Real Time from Social Media Streams

Thesis Statement

By integrating machine learning and high-volume streams across social networks, one can detect high-impact events, identify specific occurrences within those events, and evaluate credibility of those occurrences in near real time.

Page 115: Discovering Credible Events in Near Real Time from Social Media Streams

Thanks!

Questions?

64

Contact: @codybuntain

[email protected]

Page 117: Discovering Credible Events in Near Real Time from Social Media Streams

Keyword bursts across

language barriers

65

Match: Brazil v. Netherlands, 12 July 2014
Event: Netherlands' Van Persie scores a goal on a penalty at 3', 1-0
Bursty Tokens: 0-1, 1-0, 1:0, 1x0, card, goaaaaaaal, goal, gol, goool, holandaaaa, kırmızı, pen, penal, penalti, pênalti, persie, red

Match: Brazil v. Netherlands, 12 July 2014
Event: Brazil's Oscar gets a yellow card at 68'
Bursty Tokens: dive, juiz, penalty, ref

Match: Germany v. Argentina, 13 July 2014
Event: Germany’s Götze scores a goal at 113’, 1-0
Bursty Tokens: goaaaaallllllll, goalllll, godammit, goetze, gollllll, gooooool, gotze, gotzeeee, götze, nooo, yessss, ドイツ
