NSLab, RIIT, Tsinghua Univ
A New Approach to Bot Detection: Striking the Balance Between Precision and Recall
Fred Morstatter et al.
Presented by Jun Yang
2017.3.29
A New Approach to Bot Detection

Jun 01, 2022

Page 1: A New Approach to Bot Detection

NSLab, RIIT, Tsinghua Univ

A New Approach to Bot Detection: Striking the Balance Between Precision and Recall

Fred Morstatter et al.
Presented by Jun Yang

2017.3.29

Qing Lyu
Page 2: A New Approach to Bot Detection

What is a bot?

• Social media accounts that are controlled by software.
  - Self-declared bots.
  - Spambots.
  - Socialbots.

Page 3: A New Approach to Bot Detection

What is a bot?

• Innocuous.
  - Post up-to-date weather, news, historical events, etc.
• Nocuous.
  - Infiltration.
  - Influence trending topics.
  - Repost or follow specific users.

Page 4: A New Approach to Bot Detection

How many?

• Over half of the accounts on Twitter are not human.
• Bots, at 5-9% of accounts, produce 24% of the tweets on Twitter.
• 28% of the accounts created in 2008 and half of the accounts created in 2014 have been suspended by Twitter.

Page 5: A New Approach to Bot Detection

Influence

• Harvest users' private data.
• Sway discussion.
• Influence trending hashtags and user statistics.
• Degrade user experience and trust.
• Affect social media research.

Page 6: A New Approach to Bot Detection

Bot detection

• Classification tasks.

Page 7: A New Approach to Bot Detection

Bot detection

• Classification tasks.
• Content
  - Different from normal users.
  - URLs.
  - Sentiment.
  - Length.
  - Similarity.
  - Original tweet.

Page 8: A New Approach to Bot Detection

Bot detection

• Classification tasks.
• User profile
  - Automatically generated accounts with detectable patterns.
  - E-mail addresses.
  - Creation times.
  - Lifetime.
  - Screen name and verified name.
  - Human typing.

Page 9: A New Approach to Bot Detection

Bot detection

• Classification tasks.
• Activities
  - Request frequency.
  - IP addresses.
  - Multiple login locations.

Page 10: A New Approach to Bot Detection

Bot detection

• Classification tasks.
• Network structure and connections
  - Mass following and unfollowing behaviors.
  - Statistical and structural features.

Page 11: A New Approach to Bot Detection

Ground Truth Acquisition

• Manual annotation.
• Suspended user lists.
• Honeypots.

Page 12: A New Approach to Bot Detection

Precision vs Recall

• An undetected bot (a false negative, which lowers recall) vs an angry user (a human flagged as a bot, which lowers precision)?
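
For reference, the standard definitions behind this trade-off are given below. The symbols TP, FP, and FN (true positives, false positives, false negatives, with "bot" as the positive class) are notation assumed here and do not appear on the slide.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]
```

An undetected bot adds to FN, and a wrongly flagged human adds to FP.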

Page 13: A New Approach to Bot Detection

Contribution

• Two datasets labeled by different techniques.
• Textual features extracted with LDA.
• A modified approach for higher recall and F1.

Page 14: A New Approach to Bot Detection

Dataset

• Libya
  - Query keywords related to the Arab Spring.
  - Collect accounts from 2011.2 to 2013.2.
  - Check whether each account was suspended or removed as of 2015.2.
  - 7.5% of accounts labeled as bots.

Page 15: A New Approach to Bot Detection

Dataset

• Arabic Honeypot
  - Honeypot accounts randomly tweet or retweet Arabic phrases.
  - Measures to avoid suspension.
  - Collect human users that tweet the same phrases.
  - Balanced dataset.

Page 17: A New Approach to Bot Detection

Baselines

• Heuristics
  - Retweet fraction.
  - Average tweet length.
  - URL fraction.
  - Average time interval.
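
The four heuristics above could be computed per account roughly as follows. This is a minimal sketch under assumptions the slide does not state: the `Tweet` record and `heuristic_features` function are hypothetical names, length is measured in characters, a tweet counts as containing a URL if its text includes `http://` or `https://`, and the time interval is the gap between consecutive tweets in seconds.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical tweet record; field names are assumptions."""
    text: str
    timestamp: float  # seconds since epoch
    is_retweet: bool


def heuristic_features(tweets):
    """Compute the four baseline heuristics for one account's tweets."""
    if not tweets:
        raise ValueError("need at least one tweet")
    n = len(tweets)
    ordered = sorted(tweets, key=lambda t: t.timestamp)
    # Gaps between consecutive tweets, in seconds.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    has_url = ["http://" in t.text or "https://" in t.text for t in tweets]
    return {
        "retweet_fraction": sum(t.is_retweet for t in tweets) / n,
        "avg_tweet_length": sum(len(t.text) for t in tweets) / n,
        "url_fraction": sum(has_url) / n,
        "avg_time_interval": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

Each value can then be thresholded on its own or passed to a classifier as one feature.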

Page 18: A New Approach to Bot Detection

LDA (Latent Dirichlet Allocation)
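
The slide content did not survive extraction, so the following is only a minimal sketch of how LDA can turn text into features. It assumes each account's tweets are tokenized and pooled into one document, and that the resulting topic proportions serve as the feature vector. The function name `lda_topic_features` and the hyperparameters `alpha` and `beta` are illustrative choices, and the sampler is a plain collapsed Gibbs sampler rather than whatever implementation the authors used.

```python
import random


def lda_topic_features(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Return one topic-proportion vector per document (collapsed Gibbs sampling)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # total words per topic
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document; each row sums to 1.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
```

Calling `lda_topic_features(docs, n_topics=K)` on a list of token lists yields a K-dimensional vector per account, which is why the number of topics becomes a tunable parameter.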

Page 19: A New Approach to Bot Detection

AdaBoost

• Different weak classifiers focus on different bots.
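
A minimal sketch of standard AdaBoost with decision stumps shows how this focusing happens: after each round, the weights of misclassified examples grow, so the next weak classifier concentrates on them. The function names and the toy data are illustrative assumptions, not taken from the talk. Labels are +1 (bot) and -1 (human).

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                pred = [pol if x[j] >= thr else -pol for x in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def stump_predict(stump, x):
    _, j, thr, pol = stump
    return pol if x[j] >= thr else -pol


def adaboost(X, y, rounds=3):
    """Train a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, x))
             for wi, x, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Toy 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

On this toy data a single stump always misclassifies two points, while three boosted stumps, each covering a different region, classify all six correctly.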

Page 20: A New Approach to Bot Detection

BoostOR

• A modified AdaBoost algorithm to improve recall.
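
The slide does not describe the modification itself, so the following is not the BoostOR algorithm. It is only a toy illustration, under the assumption suggested by the name, of why combining weak classifiers with a logical OR favors recall over a majority vote: an account is flagged as soon as any one weak classifier flags it. All names and data are hypothetical.

```python
def or_vote(votes):
    """Flag as bot (1) if any weak classifier votes 1."""
    return int(any(votes))


def majority_vote(votes):
    """Flag as bot (1) only if more than half of the votes are 1."""
    return int(2 * sum(votes) > len(votes))


def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Three bots followed by three humans; each weak classifier catches a different bot.
y_true = [1, 1, 1, 0, 0, 0]
weak_outputs = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],  # also flags one human by mistake
    [0, 0, 1, 0, 0, 0],
]
per_account = list(zip(*weak_outputs))
or_pred = [or_vote(v) for v in per_account]
maj_pred = [majority_vote(v) for v in per_account]
```

With these toy outputs the OR rule catches every bot at the cost of one false positive, while the majority vote flags nothing, which is the precision-versus-recall trade-off the talk is about.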

Page 22: A New Approach to Bot Detection

Discussion of F1-score

• Balanced test set.

        Precision   Recall    F1
    C1  90.00%      70.00%    78.75%
    C2  70.00%      90.00%    78.75%

• Positive:Negative = 3:1

        Precision   Recall    F1
    C1  96.43%      70.00%    81.12%
    C2  87.50%      90.00%    88.73%

• Positive:Negative = 1:3

        Precision   Recall    F1
    C1  75.00%      70.00%    72.41%
    C2  43.75%      90.00%    58.88%
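
The slide does not say how these numbers were derived. They are consistent with one reading, assumed here: each classifier keeps the same true-positive rate (its recall) and the same false-positive rate, and only the class ratio of the test set changes. The sketch below recovers the false-positive rate from the balanced case and recomputes precision and F1 for the other ratios. The function names are hypothetical.

```python
def fpr_from_balanced(precision, recall):
    """False-positive rate implied by precision and recall on a balanced set.

    With equal class sizes, precision = tpr / (tpr + fpr), so
    fpr = tpr * (1 - precision) / precision.
    """
    return recall * (1 - precision) / precision


def metrics(tpr, fpr, n_pos, n_neg):
    """Precision, recall, and F1 for fixed rates under a given class ratio."""
    tp = tpr * n_pos
    fp = fpr * n_neg
    precision = tp / (tp + fp)
    recall = tpr
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (precision, recall) on the balanced test set, as given in the first table.
classifiers = {"C1": (0.90, 0.70), "C2": (0.70, 0.90)}

for n_pos, n_neg in [(1, 1), (3, 1), (1, 3)]:
    print(f"Positive:Negative = {n_pos}:{n_neg}")
    for name, (p_bal, r_bal) in classifiers.items():
        fpr = fpr_from_balanced(p_bal, r_bal)
        p, r, f1 = metrics(r_bal, fpr, n_pos, n_neg)
        print(f"  {name}  {p:7.2%}  {r:7.2%}  {f1:7.2%}")
```

Under this assumption the two classifiers tie on F1 only when the classes are balanced. The high-recall classifier C2 wins when positives dominate, and the high-precision classifier C1 wins when negatives dominate.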

Page 23: A New Approach to Bot Detection

Number of topics

Page 24: A New Approach to Bot Detection

Thank you!

Questions?
