User Interactions in Social Networks and their Implications Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, Ben Y. Zhao (UC Santa Barbara) EuroSys 2009 May 4, 2011 Hyewon Lim
Dec 13, 2015
2
Outline
– Introduction
– The Facebook Social Network
– Data Set and Collection Methodology
– Analysis of Social Graphs
– Analysis of Interaction Graphs
– Conclusion
3
Introduction
Social networks
– Popular infrastructures for communication, interaction, and information sharing on the Internet
Recent work
– Meaningful, interactive relationships with friends are critical to improving trust and reliability in the system
But real-world interpersonal association is not uniform
– Social links often connect acquaintances with no level of mutual trust or shared interest
4
Introduction
Are social links valid indicators of real user interaction?
If not, then what can we use to form a more accurate model for evaluating socially-enhanced applications?
5
Introduction
Contribution
– The first large-scale study of the Facebook social network
  Users tend to interact mostly with a small subset of friends
– Propose the interaction graph
– Examine the impact of using different graph models in evaluating socially-enhanced applications
6
The Facebook Social Network
– The largest social network in the world
– Number one photo sharing site on the Internet
– Over 150M active users (Feb 2009)
– Designed around the concept of networks that organize users into membership-based groups
  Educational institution, company/organization, geographic location
– Bidirectional social links by friending other users
– Wall, photo uploading, tag, Mini-Feed
7
Data Set and Collection Methodology
Data Collection Process
Crawled by regional network
– Regional networks are unauthenticated and open to all users
– Users belong to at least one regional network
– Most users do not modify their default privacy settings
To crawl
– Seed: 50 user IDs
– Breadth-first searches of social links on each network
Complete data set
– Approximately 500GB
– Includes full profiles of more than 10 million Facebook users
8
Data Set and Collection Methodology
Data Collection Process
Primary data set [March – May of 2008]
– Profile, Wall and photo data crawled from the 22 largest regional networks
Also performed daily crawls of the San Francisco regional network in Oct 2008 to gather data specifically on the Mini-Feed
9
Data Set and Collection Methodology
Completeness of Graph Coverage
The majority of user accounts in the social graph are part of a single large, weakly connected component (WCC)
Social links on Facebook are undirected
– Breadth-first crawling of social links should be able to generate complete coverage of the WCC, assuming that at least one of the initial seeds is linked to the WCC
Validation
– Performed five simultaneous crawls of the San Francisco regional network
  Starting with 50 seeds and going up to 5000 seeds
– Difference in the number of users was only 242 users out of approximately 169,000 total
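The coverage argument can be illustrated with a minimal sketch, which is not the authors' crawler. It assumes a hypothetical in-memory adjacency map `friends` in place of friend lists fetched from profile pages. On an undirected graph, breadth-first search from any seed inside a connected component reaches the whole component, so crawls started from different seeds should agree.

```python
from collections import deque

def bfs_crawl(friends, seeds):
    """Return the set of users reached by breadth-first search from the seeds.

    `friends` is a hypothetical adjacency map (user ID -> iterable of friend IDs)
    standing in for the friend lists a real crawler would download.
    """
    visited = {s for s in seeds if s in friends}
    queue = deque(visited)
    while queue:
        user = queue.popleft()
        for friend in friends.get(user, ()):
            if friend not in visited:
                visited.add(friend)
                queue.append(friend)
    return visited

# Toy undirected graph: one large component {1..5} and a small one {6, 7}.
toy_friends = {
    1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4],
    6: [7], 7: [6],
}
crawl_a = bfs_crawl(toy_friends, seeds=[1])
crawl_b = bfs_crawl(toy_friends, seeds=[5, 3])
print(crawl_a == crawl_b)  # different seeds, same component
```

In the toy graph, users 6 and 7 are never reached from seeds in the large component, which mirrors the assumption that at least one seed must be linked to the WCC.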
10
Data Set and Collection Methodology
Description of Collected Data
Collected the full user profile of each user visited during crawls
– Also collected full transcripts of Wall posts and photo comments
“Date Joined”
– Estimated by examining each user’s earliest Wall post
Performed crawls of Mini-Feed data from the San Francisco regional network
– To obtain interaction data on Facebook at a more fine-grained level
– Crawled daily to build up a complete record of each user’s actions on a day-to-day basis
  ~400K users
11
Analysis of Social Graphs
Analyze general properties of our Facebook population
– Including user connectivity in the social graph and growth characteristics over time
Different types of user interactions on Facebook
– Including how interactions vary across time, applications, and different segments of the user population
Analyze detailed user activities
– Through crawls of users’ Mini-Feeds from the San Francisco network
– Paying attention to social network growth and interactions over fine-grained time scales
12
Analysis of Social Graphs
Social Network Analysis
10M users from the 22 largest regional networks
– 56% of the total user population of those networks
Complete data set [table 1, slide 8]
– Over 940M social links
– 24M interaction events
13
Analysis of Social Graphs
Social Network Analysis
Social degree analysis
– Social degrees on Facebook follow a power-law distribution
14
Analysis of Social Graphs
Social Network Analysis
Social graph analysis
– Construct a social graph for each crawled regional network
  Limit social graphs to only include links for which users at both end-points were fully visible during crawls
Avg. path length ≤ 6
– Lends credence to the six-degrees of separation hypothesis (Milgram 1967)
Radius & diameter are similar to the values presented for other SNs
– Low when compared to other large network graphs, such as the WWW
15
Analysis of Social Graphs
Social Network Analysis
Clustering coefficient measurements
– A measure to determine whether social graphs conform to the small-world principle (Watts 1998)
– Defined on an undirected graph as the ratio of the number of links that exist between a node’s immediate neighbors to the maximum # of links that could exist
– For a node with N neighbors and E edges between those neighbors, CC = (2E) / (N(N-1))
– High CC
  Nodes tend to form tightly connected, localized cliques with their immediate neighbors
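As a worked example of CC = (2E) / (N(N-1)), the sketch below computes the coefficient per node on a toy undirected graph. The adjacency map `toy` and both function names are illustrative. Returning 0 for nodes with fewer than two neighbors is an assumed convention, since the formula is undefined there.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """CC = 2E / (N(N-1)) for one node of an undirected graph.

    `adj` maps each node to the set of its neighbors (illustrative structure).
    Nodes with fewer than two neighbors get 0.0 by assumed convention.
    """
    neighbors = adj[node]
    n = len(neighbors)
    if n < 2:
        return 0.0
    # E: number of links that exist among the node's neighbors.
    e = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * e / (n * (n - 1))

def average_clustering(adj):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Toy graph: triangle a-b-c plus a pendant node d attached to a.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(clustering_coefficient(toy, "a"))  # N = 3, E = 1, so 2*1 / (3*2)
print(average_clustering(toy))
```

Node a has three neighbors with one link among them (b-c), giving CC = 1/3, while b and c sit in a closed triangle and get CC = 1.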
16
Analysis of Social Graphs
Social Network Analysis
Clustering coefficient measurements
– Avg. CC of Facebook: 0.133 ~ 0.211 (avg. over all: 0.167, Orkut: 0.171)
  Higher levels of local clustering than either random graphs or random power-law graphs, which indicates a tightly clustered fringe that is characteristic of social networks (Mislove 2007)
– Users w/ lower social degrees have high CC
17
Analysis of Social Graphs
Social Network Analysis
Assortativity measurement
– Assortativity coefficient (AC)
  Measures the probability for nodes in a graph to link to other nodes of similar degree
  Calculated as the Pearson correlation coefficient of the degrees of node pairs for all edges in a graph
  Result range: -1 ≤ r ≤ 1
  – r > 0: nodes tend to connect with other nodes of similar degree
– AC for our Facebook graphs is uniformly positive
  Connections between high degree nodes in graphs are numerous
  – This well-connected core of high degree nodes forms the backbone of the small-world network
  AC values closely resemble those for other large SNs
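The Pearson-correlation definition can be sketched in a few lines. This is an illustrative computation on an edge list, not the authors' tooling. It counts each undirected edge in both directions so the two degree sequences are symmetric, which lets the correlation reduce to covariance divided by variance.

```python
def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees over all edges (undirected).

    `edges` is an illustrative list of (u, v) pairs with no duplicates.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so x and y share one distribution.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # equals the mean of ys by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined when all endpoint degrees are equal")
    return cov / var

# A star is maximally disassortative: the hub links only to degree-1 leaves.
star = [(0, 1), (0, 2), (0, 3)]
# A triangle plus a separate single edge: every link joins equal-degree nodes.
matched = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
print(degree_assortativity(star))
print(degree_assortativity(matched))
```

The two toy graphs sit at the extremes of the range: the star gives r = -1 and the equal-degree graph gives r = 1.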
18
Analysis of Social Graphs
Social Network Analysis
Growth of Facebook over time
– Historical growth of the user population in our sample set
  Increases exponentially from month 24
– > 80% of profiles are “young profiles”
19
Analysis of Social Graphs
User Interaction Analysis
Distribution of the users’ interaction among their friends
– The large majority of interactions occur only across a small subset of their social links
  Only a subset of social links actually represent interactive relationships
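One way to quantify this skew for a single user is to ask what fraction of the user's friends is needed to account for a given percentage of that user's interactions. The sketch below is an assumed formulation with hypothetical names and toy counts, not data or code from the study.

```python
def fraction_of_friends_needed(interactions_per_friend, num_friends, percent):
    """Smallest fraction of friends that accounts for `percent`% of interactions.

    `interactions_per_friend` maps friend ID -> interaction count (hypothetical
    input); friends with no interactions may be omitted. `num_friends` is the
    user's social degree. Integer arithmetic avoids float rounding at the cutoff.
    """
    counts = sorted(interactions_per_friend.values(), reverse=True)
    total = sum(counts)
    if total == 0 or num_friends == 0:
        return 0.0
    running = 0
    for k, count in enumerate(counts, start=1):
        running += count
        if running * 100 >= percent * total:
            return k / num_friends
    return len(counts) / num_friends

# Toy user with 10 friends, only 3 of whom ever interact with the user.
toy_counts = {"f1": 14, "f2": 4, "f3": 2}
print(fraction_of_friends_needed(toy_counts, num_friends=10, percent=70))
print(fraction_of_friends_needed(toy_counts, num_friends=10, percent=100))
```

For this toy user, a single friend out of ten accounts for 70% of interactions, and three out of ten account for all of them.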
20
Analysis of Social Graphs
User Interaction Analysis
Interaction distribution among friends
– Analyze the user interaction patterns
  Photo tags accurately capture real-life social situations
  – Even highly social users show significant skew towards interacting with, and sharing physical proximity with, a small subset of their friends
21
Analysis of Social Graphs
User Interaction Analysis
Distribution of total interactions
The bulk of all Facebook interactive events are generated by a small, highly active subset of users
Not all social links are equally useful when analyzing SNs
A correlation between social degree and interactivity does exist
22
Analysis of Social Graphs
User Interaction Analysis
Users’ avg. # of interactions at different points in their lifetime
– Two possible interpretations
  The oldest users were the original users who participated in Facebook’s growth
  – Self-selected as users highly interested in SNs
  Inactive users drop out over time, leaving only active Facebook users
23
Analysis of Social Graphs
Mini-feed Analysis
Two missing perspectives from Wall and photo comment data
– Do not tell us about the formation of new friend links
– Do not describe user interactions in other applications
– Social graph is growing at a faster rate than users are able to communicate with one another
  Average users do not interact with most of their “Facebook friends”
24
Analysis of Interaction Graphs
Interaction Graph
Subset of the social graph where, for each link, interactivity between the link’s endpoints is greater than the rate stipulated by n and t
– n: a minimum number of interaction events
– t: a window of time during which interactions must have occurred
  n & t delineate an interaction rate threshold
Interaction Degree
The number of friends who interact with the user at a rate greater than the parameterized minimum
Implicit assumption underlying our IG
– The majority of user interaction events occur across social links
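The definition can be made concrete with a minimal sketch under stated assumptions: events are hypothetical (user, user, timestamp) tuples, the window t is a half-open interval [t_start, t_end), a link is kept when it carries at least n events in the window (one reading of "minimum number"), and edges are undirected. None of these details is prescribed by the slides.

```python
from collections import defaultdict

def interaction_graph(social_links, events, n, t_start, t_end):
    """Return the social links carrying at least n events in [t_start, t_end).

    `social_links`: iterable of (user, user) pairs, treated as undirected.
    `events`: iterable of (user, user, timestamp) tuples (hypothetical layout).
    Assumes n >= 1; events between non-friends are ignored.
    """
    links = {frozenset(link) for link in social_links}
    counts = defaultdict(int)
    for a, b, ts in events:
        pair = frozenset((a, b))
        if pair in links and t_start <= ts < t_end:
            counts[pair] += 1
    return {pair for pair, count in counts.items() if count >= n}

def interaction_degree(graph, user):
    """Number of friends the user interacts with at or above the threshold."""
    return sum(1 for pair in graph if user in pair)

toy_links = [("u1", "u2"), ("u1", "u3"), ("u1", "u4")]
toy_events = [
    ("u1", "u2", 1), ("u2", "u1", 5),   # two events inside the window
    ("u1", "u3", 2), ("u1", "u3", 40),  # second event falls outside the window
    ("u1", "u5", 3), ("u1", "u5", 4),   # u5 is not a friend of u1
]
toy_graph = interaction_graph(toy_links, toy_events, n=2, t_start=0, t_end=30)
print(toy_graph)
print(interaction_degree(toy_graph, "u1"))
```

Here u1 has a social degree of 3 but an interaction degree of 1, since only the u1-u2 link meets the threshold.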
25
Analysis of Interaction Graphs
Interaction Graphs on Facebook
– Evaluating each user’s incoming and outgoing interactions is challenging
– Sample interactions that occur over social links that connect two users in our user population
– Acceptable to model interaction graphs on FB using undirected edges, since this model suits the interactivity patterns of the majority of users
26
Analysis of Interaction Graphs
Comparison of Social and Interaction Graphs
– Social vs. interaction degree
  Interaction degree does not scale equally with social degree
27
Analysis of Interaction Graphs
Comparison of Social and Interaction Graphs
28
Conclusion
We show…
– Interaction activity on Facebook is significantly skewed towards a small portion of each user’s social links
Interaction graph
– A more accurate representation of meaningful peer connectivity on SNs
Social-based applications should be designed with interaction graphs in mind
– Reflect real user activity rather than social linkage alone