Maximisation Václav Belák, Samantha Lam, Conor Hayes
Digital Enterprise Research Institute www.deri.ie
Enabling Networked Knowledge
Motivation
• Information cascades are of high interest in marketing, CRM, etc.
• A common approach is to maximise information diffusion by targeting influential actors
• In many online communities (e.g. discussion fora), information is shared with the community as a whole rather than with individual actors
[Figure: common case: targeting individuals; cross-community case: targeting communities]
Objectives
• Our main hypothesis is that it is possible to efficiently spread a message over the information flow network by targeting highly influential communities
• The main problem is then formulated as predicting the set of communities to target such that the message spreads over the network as much as possible
  • Spread over the actors, i.e. user activation fraction
  • Spread over the communities, i.e. community activation fraction
Methods: Definition of Impact
• We propose (Belák et al., ’12) to take two factors into account:
  1. degree of community membership of the users
  2. centrality of the users within each community
• Impact of community A on community B is defined as the average centrality of actors from A within B, weighted by their membership in A
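The impact definition can be illustrated with a minimal sketch. It assumes two hypothetical inputs that the slides do not specify in this form: a users-by-communities membership matrix and a users-by-communities centrality matrix. Normalising by the total membership of A (a weighted average) is also an assumption; the exact normalisation is given in Belák et al., ’12.

```python
import numpy as np

def impact_matrix(membership, centrality):
    """Impact[a, b]: membership-weighted average centrality of a's users within b.

    membership: (n_users, n_comms) non-negative degrees of membership (assumed layout).
    centrality: (n_users, n_comms) centrality of each user within each community.
    """
    membership = np.asarray(membership, dtype=float)
    centrality = np.asarray(centrality, dtype=float)
    weighted = membership.T @ centrality      # sum over users u of m[u, a] * c[u, b]
    totals = membership.sum(axis=0)           # sum over users u of m[u, a]
    # Communities without any members get zero impact instead of a division by zero.
    safe = np.where(totals > 0, totals, 1.0)
    return weighted / safe[:, None]
```

With crisp (0/1) membership this reduces to the plain mean centrality that the members of A have within B.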
Methods: Targeting Communities
• The level of dispersion (heterogeneity) of the total impact of community i can be measured as the entropy of the i-th row/column of the impact matrix
• We propose to target communities by means of the product of the total impact of community i and its entropy: impact focus (IF)
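A minimal sketch of the impact focus score follows. It assumes the impact matrix is a NumPy array, uses rows (the slide allows rows or columns), and takes the entropy in base 2; the function names are hypothetical and not taken from the original work.

```python
import numpy as np

def impact_focus(impact):
    """Total impact of each community multiplied by the entropy of its row."""
    impact = np.asarray(impact, dtype=float)
    total = impact.sum(axis=1)
    safe = np.where(total > 0, total, 1.0)
    p = impact / safe[:, None]                 # each row as a distribution
    logp = np.zeros_like(p)
    np.log2(p, out=logp, where=p > 0)          # treat 0 * log(0) as 0
    entropy = -(p * logp).sum(axis=1)
    return total * entropy

def top_communities(impact, q):
    """Indices of the q communities with the highest impact focus."""
    scores = impact_focus(impact)
    return [int(i) for i in np.argsort(-scores, kind="stable")[:q]]
```

Under this reading, a community whose impact is concentrated on a single target has zero entropy and therefore zero impact focus, however large its total impact.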
• We simulated the diffusion by extending the Independent Cascade (ICM) and Linear Threshold (LTM) models (Kempe et al., ’03):
  1. Take q target communities and sample s users from each of them
  2. Run the original models from the union of sampled users
• Information diffusion network derived from the reply-to network:
[Figure: the reply-to relation between users i and j (labelled r_ji) and the derived information-flow edge (labelled w_ij)]
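The community-seeded simulation can be sketched as follows for the Independent Cascade case. Several points are assumptions rather than details from the slides: information is taken to flow from the author of a post to the user who replied to it, edge weights are treated directly as activation probabilities in [0, 1], and all function names are hypothetical.

```python
import random

def flow_network(replies):
    """Build flow[author][replier] = weight from (replier, author, weight) triples.

    Assumption: a reply from `replier` to `author` means information
    flowed from `author` to `replier`.
    """
    flow = {}
    for replier, author, weight in replies:
        flow.setdefault(author, {})[replier] = weight
    return flow

def sample_seeds(members, targets, s, rng):
    """Sample up to s users from each targeted community and take the union."""
    seeds = set()
    for community in targets:
        pool = sorted(members[community])
        seeds.update(rng.sample(pool, min(s, len(pool))))
    return seeds

def independent_cascade(flow, seeds, rng):
    """Each newly activated user gets one chance to activate each neighbour."""
    active = set(seeds)
    frontier = sorted(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v, p in flow.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active
```

A run would pass a seeded `random.Random` instance so that repeated simulations can be averaged reproducibly; the Linear Threshold variant would replace only `independent_cascade`.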
Evaluation Strategy
• IF compared with random targeting (R) and group in-degree (GI) (Everett & Borgatti, ’99)
• The main aim was to investigate the robustness of our framework with respect to:
  • Character of the system
  • Diffusion models
  • User and community activation fractions
• Procedural outline
  1. Target q communities using one of the heuristics evaluated on the data from time-slice t
  2. Run the diffusion model on the network from time-slice t+1
  3. Compute the average user and community spreads over all pairs (t, t+1)
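The procedural outline can be sketched as a short loop. The slice representation (a dict with a "users" set) and the two callbacks `choose_targets` and `diffuse` are hypothetical placeholders for a targeting heuristic and a diffusion model. Only the user activation fraction is computed here, since the slides do not state when a community counts as activated.

```python
from statistics import mean

def evaluate(slices, choose_targets, diffuse, q):
    """Mean user activation fraction over all consecutive pairs (t, t+1).

    slices: list of dicts, each with a 'users' set plus whatever the
            callbacks need (assumed representation).
    choose_targets(slice, q) -> communities to target.
    diffuse(slice, targets)  -> set of activated users.
    """
    fractions = []
    for current, following in zip(slices, slices[1:]):
        targets = choose_targets(current, q)        # step 1: heuristic on slice t
        activated = diffuse(following, targets)     # step 2: diffusion on slice t+1
        users = following["users"]
        fractions.append(len(activated & users) / len(users))
    return mean(fractions)                          # step 3: average over pairs
```

Choosing targets on slice t and simulating on slice t+1 keeps the heuristic from seeing the network it is evaluated on.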
Evaluation Data-Sets
• Boards: 51 weeks of data from the largest Irish discussion board system
  • Segmented using a 1-week sliding window
• SAP: 5 years of data from the technical support fora of SAP
  • Used only for the diffusion experiments
  • Segmented using a 2-month sliding window
  • 2 months represent approx. 50% of cross-fora posting activity
• 33 communities, 2k users/snapshot (avg)
User Activation Fraction
One targeted community
[Figure: two panels, "q=1, Boards−LTM" and "q=1, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean user activation fraction (u); curves: IF, GI, R]
Community Activation Fraction
One targeted community
[Figure: two panels, "q=1, Boards−LTM" and "q=1, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean community activation fraction (c); curves: IF, GI, R]
Community Activation Fraction
Five targeted communities
[Figure: two panels, "q=5, Boards−LTM" and "q=5, SAP−LTM". x-axis: user sample size (s), 5 to 20; y-axis: mean community activation fraction (c); curves: IF, GI, R]
Results Highlights
• The diffusion process saturated at approximately 80% of users or communities in Boards, and 30% in SAP
• It was more efficient to target few communities
• Impact Focus outperformed the other two strategies with respect to both user and community activation fractions, particularly for a small number of targeted communities (i.e. [1, 2]) and seed users (i.e. [1, 20])
• Diminishing returns
• For a high number of targeted communities and seed users, the random strategy outperformed the other two with respect to community activation fractions in the SAP data-set
• The SAP network was fragmented into many small components, which made it hard to reach peripheral communities
Conclusion
• The evaluation demonstrated that the framework
  • is able to identify highly influential communities
  • can predict which communities to target such that the message spreads efficiently over both individual users and communities
• We aim to extend it with content analysis
  • E.g. what are the most influential communities with respect to a particular topic?
• We will also investigate empirically observed topic cascades and modify our models accordingly if needed
Questions?
References
• V. Belák, S. Lam, and C. Hayes. Cross-Community Influence in Discussion Fora. ICWSM. AAAI, 2012.
• M. Everett and S. Borgatti. The centrality of groups and classes. J. of Mathematical Sociology, 23(3):181–201, 1999.
• D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. SIGKDD. ACM, 2003.