Personality Prediction from Network Structure Research Project by Marco Milano Supervised by Fabio Pianesi 22/01/2014 Rovereto
Aug 06, 2015
Main Contributions Acknowledgement
The SocioMetric Badges Corpus: A Multilevel Behavioral Dataset for Social Behavior in Complex Organizations, Lepri et al., SocialCom/PASSAT 2012
Friends don't lie: inferring personality traits from social network structure, Staiano et al., Proceedings of the 2012 ACM Conference on Ubiquitous Computing
Presentation Outline
Research Hypothesis and Goal
Task Description
Automatic Classification
Results
Discussion
Further Work
Research Hypothesis
Can psychological predispositions of individuals – like personality traits – add explanatory capacity (cit.) and predictive power to standard SNA tools?
Cit. from Kalish & Robins, 2005
Research Goal
To predict an individual's personality traits from their social network in the SocioMetric Badges Corpus
Based on previous study by Staiano et al., 2012
Benchmark study – Staiano et al., 2012
Goal: to predict an individual's personality traits from their social network
Network Data: mobile phone call records between MIT students (USA)
Personality Data: survey
Sociometric Badges Corpus
53 participants
6 weeks duration
Real work setting
1 wearable device
4 recording sensors: IR, BT, accelerometer, speech
Initial + final survey
Daily questionnaire
About personality (Allport, 1962)
People’s behaviour can be explained to some extent in terms of underlying personality constructs (cit.)
Trait characteristics are constructs that remain stable over time, whilst state characteristics appear in context and are temporarily induced by external circumstances
Cit. from Kalish & Robins, 2005
Quantify Personality – Big Five Framework
Image from: http://blog.lib.umn.edu/
Personality Data
Collected in the initial survey
The Big Five Marker Scale (BFMS) was used to assess personality traits
The BFMS is an adjective list composed of 50 items, specifically designed to optimize the simplicity of the big-five factor solution in the light of results of psycho-lexical studies on the Italian language
90.56% of the sample were native Italian speakers
Non-Italian speakers received a validated translation of the BFMS
Trait     Mean   Std   Max  Min
Extra     41.83  9.83  62   23
Agree     51.6   6.03  62   29
Consc     50.44  8.61  66   34
Emo. St.  41.02  6.92  54   23
Creat     46.11  3.74  55   38
IR and BT Recording
Infrared (IR) hits used as indicator of face-to-face interaction
IR signal coverage: cone of height h ≤ 1 m and radius r ≤ h tan θ, where θ = ±15°
Transmission rate set at 1 Hz
Bluetooth (BT) used as indicator of proximity
RSSI (radio signal strength indicator) values range from -128 to 127
Sampling every 5 seconds
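As a quick arithmetic check of the coverage formula above, the widest radius of the IR cone can be computed directly. This is only a sketch: the function name `ir_cone_radius` is illustrative and not part of the badge software.

```python
import math

def ir_cone_radius(height_m: float, half_angle_deg: float = 15.0) -> float:
    """Radius of the IR coverage cone at a given distance: r = h * tan(theta)."""
    return height_m * math.tan(math.radians(half_angle_deg))

# The cone height is capped at 1 metre, so this is the widest the cone gets.
max_radius = ir_cone_radius(1.0)
print(f"Maximum radius at h = 1 m: {max_radius:.3f} m")
```

Under these values the cone is roughly 0.27 m wide on each side at one metre, which is why an IR hit is read as two people facing each other.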
IR and BT Networks Summaries
Density: ratio of the number of edges to the number of possible edges
Avg Shortest Path Length: average length of the shortest paths between all pairs of vertices in the network
Avg Clustering: ratio of triangles to connected triples in the graph
Diameter: maximum shortest-path distance between any two vertices in the graph
Statistics                IR     BT
Number of Nodes           50     49
Number of Edges           334    981
Average Degree            13.36  40.04
Density                   0.27   0.83
Avg Shortest Path Length  1.83   1.16
Avg Clustering            0.52   0.90
Diameter                  3      3
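The average degree and density rows follow from the node and edge counts alone. The sketch below recomputes them, assuming both networks are undirected graphs without self-loops (the slides do not state this explicitly); the helper names are illustrative.

```python
def average_degree(n_nodes: int, n_edges: int) -> float:
    """Mean degree of an undirected graph: every edge touches two nodes."""
    return 2 * n_edges / n_nodes

def density(n_nodes: int, n_edges: int) -> float:
    """Edges present divided by the n * (n - 1) / 2 possible undirected edges."""
    return n_edges / (n_nodes * (n_nodes - 1) / 2)

# Node and edge counts as reported in the summary table.
networks = {"IR": (50, 334), "BT": (49, 981)}

for name, (n, m) in networks.items():
    print(f"{name}: average degree = {average_degree(n, m):.2f}, "
          f"density = {density(n, m):.2f}")
```

Under the undirected assumption the recomputed values agree with the table: 13.36 and 0.27 for IR, 40.04 and 0.83 for BT.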
Network Metrics
Whole-network level: transitivity, efficiency, centrality measures (degree, closeness, betweenness, eigenvector, information)
Ego-network level: transitivity, efficiency, triadic measures (Davis & Leinhardt (DL) triad census, Kalish & Robins (KR) triad census)
Centrality Measures
Degree(A): number of neighbours of a node
Closeness(B): the reciprocal of the sum of the shortest path distances from a node to all other n-1 nodes
Betweenness(C): the sum of the fraction of all-pairs shortest paths that pass through a node
Eigenvector(D): connections to high-scoring nodes contribute more to the score of the node
Information: importance of a node is related to the ability of the network to respond to the deactivation of the node from the network
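The degree and closeness definitions above can be sketched in a few lines of plain Python. The toy graph is hypothetical and not taken from the corpus, and closeness is left unnormalised here, whereas many libraries additionally multiply by n-1.

```python
from collections import deque

# Hypothetical toy graph (undirected adjacency lists), for illustration only.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b", "e"],
    "e": ["d"],
}

def degree(graph, node):
    """Number of neighbours of a node."""
    return len(graph[node])

def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist

def closeness(graph, node):
    """Reciprocal of the summed shortest-path distances to all other nodes.

    Assumes a connected graph, so every other node is reachable.
    """
    dist = shortest_path_lengths(graph, node)
    return 1.0 / sum(d for other, d in dist.items() if other != node)

for node in graph:
    print(node, degree(graph, node), round(closeness(graph, node), 3))
```

In this toy graph, node b has the highest degree (3) and the highest closeness (1/5), while the peripheral node e has the lowest closeness (1/9).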
Centrality Measures – Interpretation
Degree: number of connections of a node (popularity)
Closeness: how long it would take to spread information from a node to all others
Betweenness: quantifies the relevance of a node to information flow
Eigenvector: measures the influence of a node
Information: measures the efficiency of propagation of information from a node
Personality Traits / Centrality Measures: Pearson's Correlation

IR       Extra   Agree   Consc   Emo. St.  Creat
Degree    0.012  -0.086  -0.135   0.162    -0.113
Between   0.061  -0.124  -0.024   0.062    -0.213
Close    -0.001  -0.087  -0.101   0.166    -0.188
Eigen     0.048  -0.073  -0.186   0.182    -0.008
Inform    0.277  -0.147   0.019  -0.138     0.031

BT       Extra   Agree   Consc   Emo. St.  Creat
Degree   -0.161   0.077   0.188  -0.183     0.163
Between   0.160   0.097   0.110  -0.114     0.076
Close    -0.126   0.114   0.157  -0.148     0.133
Eigen    -0.179   0.072   0.187  -0.181     0.159
Inform   -0.031   0.168   0.042  -0.235     0.216
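Each cell in these tables is a Pearson correlation between one centrality measure and one trait score across participants. A minimal sketch of that computation follows; the two input lists are hypothetical values for five participants, not corpus data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five participants, for illustration only.
degree_centrality = [3, 7, 5, 10, 8]
extraversion_score = [38, 45, 41, 52, 40]

print(round(pearson_r(degree_centrality, extraversion_score), 3))
```

Values close to 0, like most of those reported above, indicate little linear association between a centrality measure and a trait.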
Classification Task
Python's scikit-learn library
Binary classification task
Feature space: centrality measures
Target: the score on each Big Five personality trait (0 for low, 1 for high)
Random Forest classifier with leave-one-out cross-validation and sample bootstrapping
Comparison of the Random Forest classifier against SVM and KNN
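The setup described above could look roughly like the following scikit-learn sketch. It is not the project's actual code: the feature matrix and trait scores are random stand-ins, the median split used to define low and high is an assumption (the slides do not give the cut-off), and the number of trees is arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in data: 50 participants x 5 centrality features.
# These are random numbers, NOT the corpus measurements.
X = rng.normal(size=(50, 5))
trait_scores = rng.normal(loc=45, scale=8, size=50)

# Assumption: low/high is a median split; the slides do not state the cut-off.
y = (trait_scores > np.median(trait_scores)).astype(int)

# bootstrap=True resamples the training cases for every tree.
clf = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)

# Leave-one-out: each participant is predicted by a model trained on the others.
y_pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())

print("F1 (low): ", f1_score(y, y_pred, pos_label=0))
print("F1 (high):", f1_score(y, y_pred, pos_label=1))
```

Because the stand-in features are pure noise, the scores printed here should hover around chance level. Swapping `RandomForestClassifier` for `sklearn.svm.SVC` or `sklearn.neighbors.KNeighborsClassifier` gives the other two classifiers in the comparison.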
Classification Algorithm(s)
Random Forest classifier (as used in Staiano et al., 2012): little parameter tuning
Support Vector Machine: performance highly dependent on parameter tuning
K-nearest Neighbors: no parameter tuning
Random Forest
Ensemble method: divide-and-conquer based approach
Combine weak learners to form a strong learner
Each weak learner is a decision tree
Random Forest – In detail
For some number of trees T:
1. Sample N cases at random with replacement to create a subset of the data. The subset should be about 66% of the total set.
2. At each node:
   1. For some number m, m predictor variables are selected at random from all the predictor variables.
   2. The predictor variable that provides the best split, according to some objective function, is used to do a binary split on that node.
   3. At the next node, choose another m variables at random from all predictor variables and do the same.
Credits: citizennet.com
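The two sources of randomness in the procedure above, bootstrap sampling of cases and random selection of m predictors per node, can be sketched with the standard library alone. The data set size, seed, and helper names are hypothetical. Under the assumption that exactly N cases are drawn from a set of N, the expected share of distinct cases is 1 - (1 - 1/N)^N, about 63% for large N, which is close to the roughly two-thirds figure quoted in step 1.

```python
import random

def bootstrap_sample(n_cases: int, rng: random.Random) -> list:
    """Draw n_cases indices uniformly at random, with replacement."""
    return [rng.randrange(n_cases) for _ in range(n_cases)]

def random_feature_subset(n_features: int, m: int, rng: random.Random) -> list:
    """Pick m distinct predictor indices to consider at one tree node."""
    return rng.sample(range(n_features), m)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_cases = 1000          # hypothetical data set size

sample = bootstrap_sample(n_cases, rng)
distinct_share = len(set(sample)) / n_cases
expected_share = 1 - (1 - 1 / n_cases) ** n_cases

print(f"Distinct cases in the bootstrap sample: {distinct_share:.2f}")
print(f"Expected share for N = {n_cases}: {expected_share:.2f}")
print("Candidate predictors at one node:", random_feature_subset(5, 2, rng))
```

The cases left out of a given bootstrap sample are what a random forest can use as out-of-bag data for that tree.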
Classification Results: F1 Score – Random Forest

IR         Extra  Agree  Consc  Emo. St.  Creat
Low        0.43   0.55   0.60   0.41      0.46
High       0.36   0.32   0.51   0.50      0.36
Avg/Total  0.40   0.43   0.56   0.46      0.42

BT         Extra  Agree  Consc  Emo. St.  Creat
Low        0.51   0.47   0.45   0.53      0.56
High       0.44   0.49   0.43   0.59      0.52
Avg/Total  0.48   0.48   0.44   0.56      0.54
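The Low and High rows report one F1 score per class. A minimal sketch of how such per-class scores are derived from predictions is shown below; the label vectors are hypothetical. The Avg/Total row presumably corresponds to a support-weighted mean of the two, as in scikit-learn's classification report, but that reading is an assumption.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no correct positive predictions: precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels (0 = low, 1 = high), for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]

f1_low = f1_for_class(y_true, y_pred, positive=0)
f1_high = f1_for_class(y_true, y_pred, positive=1)
print(f"Low: {f1_low:.2f}  High: {f1_high:.2f}")
```

For these toy labels the low class scores 4/7 (about 0.57) and the high class 2/3 (about 0.67).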
Classification Results: F1 Score – Random Forest, SVM, KNN

IR
Algorithm  Extra  Agree  Consc  Emo. St.  Creat
RF         0.40   0.43   0.56   0.46      0.42
SVM        0.46   0.50   0.31   0.74      0.06
KNN        0.50   0.37   0.40   0.53      0.44

BT
Algorithm  Extra  Agree  Consc  Emo. St.  Creat
RF         0.48   0.48   0.44   0.56      0.54
SVM        0.10   0.60   0.44   0.56      0.18
KNN        0.40   0.44   0.38   0.40      0.52
Classification Accuracy (%): Staiano et al., 2012

Bluetooth  Extra  Agree  Consc  Emo. St.  Creat
Baseline   60     58     52     60        54
RF         73.08  73.59  72.25  60.54     65.56

Calls      Extra  Agree  Consc  Emo. St.  Creat
Baseline   54.5   56.8   56.8   59.1      56.8
RF         59.45  68.82  63.83  73.74     68.39
Discussion
Random Forest classifier underperformed
Different classifiers better at predicting different personality traits
Personality traits as discrete vs continuous variables