Private traits and attributes are predictable from digital records of human behavior

Michal Kosinski, David Stillwell, and Thore Graepel

Computational Social Media

Karolos Antoniadis

Presentation

12th of March, 2020

Private traits and attributes:

• openness
• extraversion
• age
• sexual orientation
• gender
• ethnicity
• etc.

Problem

• Information about people might be predicted.

• For example, studies have shown that personal attributes can be predicted from browsing logs and from the language people use.

Question: Can basic digital records be used to automatically and accurately estimate personal attributes?

Contribution

With Facebook likes, we can accurately estimate a wide range of personal attributes (typically assumed private).

Approach - Data

Objects: quotes, web sites, press articles, books, images, etc.

Likes are shared with friends and serve to express support, bookmark an object, etc.

9 million unique objects liked by users.

The majority of the objects are associated with very few users.

Discard objects liked by fewer than 20 users and users with fewer than 2 likes.

What remains?

User-Like matrix: 58,466 users (rows) × 55,814 objects (columns).

Each entry is 0 or 1, indicating whether the user liked the object.
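
The filtering step and the 0/1 user-object matrix can be illustrated with a minimal sketch. The function names, the toy data, and the order of the two filters (rare objects first, then users, in a single pass) are assumptions made for illustration; the slides only give the thresholds.

```python
from collections import Counter

def filter_likes(user_likes, min_users_per_object=20, min_likes_per_user=2):
    """Drop rarely liked objects, then users left with too few likes.

    Assumption: one pass, objects first. The slides do not state the order.
    """
    # Count how many users liked each object.
    object_counts = Counter(obj for likes in user_likes.values() for obj in likes)
    kept_objects = {obj for obj, n in object_counts.items()
                    if n >= min_users_per_object}
    filtered = {}
    for user, likes in user_likes.items():
        remaining = likes & kept_objects
        if len(remaining) >= min_likes_per_user:
            filtered[user] = remaining
    return filtered

def to_binary_matrix(user_likes):
    """Return (users, objects, matrix) with matrix[i][j] = 1 if user i liked object j."""
    users = sorted(user_likes)
    objects = sorted({obj for likes in user_likes.values() for obj in likes})
    matrix = [[1 if obj in user_likes[user] else 0 for obj in objects]
              for user in users]
    return users, objects, matrix

# Hypothetical toy data; thresholds are lowered so the example stays small.
toy = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c", "e"},
    "u4": {"f"},
}
kept = filter_likes(toy, min_users_per_object=2, min_likes_per_user=2)
users, objects, matrix = to_binary_matrix(kept)
print(users)    # ['u1', 'u2', 'u3']
print(objects)  # ['a', 'b', 'c']
print(matrix)   # [[1, 1, 1], [1, 1, 0], [0, 1, 1]]
```

At the scale reported on the slides a sparse matrix representation would be the more practical choice; the nested lists here only make the 0/1 structure visible.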

Approach - Labels

Personality traits: measured with the International Personality Item Pool (IPIP) questionnaire.

Religion, political party, etc.: taken from users' Facebook profiles.

Ethnicity: assigned by inspecting users’ pictures.

Two types of variables: dichotomous (e.g., gender) and numeric (e.g., age).

Approach - Models

Reduce the dimensionality of the User-Like matrix with singular value decomposition (SVD).

Use the top 100 SVD components.

Build models that predict traits and attributes from these components.

For numeric variables: linear regression
For dichotomous variables: logistic regression
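
The modelling pipeline can be sketched on synthetic data as follows. Everything concrete here is an assumption for illustration: the random matrix, the invented targets, k = 5 instead of 100, the standardization step, and the hand-written gradient descent used in place of a library logistic-regression solver. The models are fitted and evaluated on the same toy data, so the sketch shows the mechanics only, not a proper evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary user-like matrix: 200 users x 50 objects (not real data).
n_users, n_objects, k = 200, 50, 5
X = (rng.random((n_users, n_objects)) < 0.2).astype(float)

# Truncated SVD: keep the top-k components (the slides use 100).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
components = U[:, :k] * s[:k]            # per-user scores, shape (n_users, k)

# Standardize the scores (an added convenience step) and add an intercept column.
Z = (components - components.mean(axis=0)) / components.std(axis=0)
A = np.column_stack([np.ones(n_users), Z])

# Numeric variable: linear regression via least squares on an invented target.
beta = np.array([2.0, -1.0, 0.5, 0.0, 1.0])
age = 30 + Z @ beta + rng.normal(scale=0.1, size=n_users)
coef, *_ = np.linalg.lstsq(A, age, rcond=None)
age_pred = A @ coef

# Dichotomous variable: logistic regression fitted by plain gradient descent.
label = (Z[:, 0] > np.median(Z[:, 0])).astype(float)
w = np.zeros(k + 1)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(A @ w)))
    w -= 0.1 * A.T @ (p - label) / n_users   # gradient of the mean log-loss
label_prob = 1.0 / (1.0 + np.exp(-(A @ w)))

print(components.shape)
print(np.corrcoef(age, age_pred)[0, 1])
print(((label_prob > 0.5) == (label == 1)).mean())
```

Working on the reduced component scores rather than on the raw matrix keeps the number of regression inputs small compared with the number of users.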

Approach - Overview

Results - Dichotomous Variables

Highest accuracy: gender & ethnicity

Lowest accuracy: divorced parents

Results - Numeric Variables
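
The slides do not say how accuracy is measured. The original paper reports the area under the ROC curve (AUC) for dichotomous variables and the Pearson correlation between predicted and actual values for numeric variables. A minimal sketch of both measures, with hypothetical function names and made-up toy values:

```python
from math import sqrt

def auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy values, invented for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_auc = auc(labels, scores)                      # 8 of 9 pairs ranked correctly
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # perfectly linear relation
print(toy_auc, toy_r)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of the two classes.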

Results - Predictive Likes

Results - Power of Likes

Even a single like results in non-negligible accuracy.

Results - Overview

Likes can accurately predict individual traits and attributes.

Few users were associated with explicitly revealing Likes.

Fewer than 5% of users labeled as gay liked explicitly gay objects.

Conclusion

Personal attributes, ranging from sexual orientation to intelligence, can be automatically and accurately inferred from users' Facebook likes.

PROS

• Improve products and services
• Improve recommendations
• New avenues in psychology

CONS

• Revealing without consent (danger)
• What do we reveal?
• Distrust in online services

Thank you!
