ProfileRank: Finding Relevant Content and Influential Users based on Information Diffusion
@SNAKDD’13, Chicago, IL
Arlei Silva (1), Sara Guimarães (2), Wagner Meira Jr. (2), Mohammed Zaki (3)
(1) Computer Science Department – University of California, Santa Barbara, CA
(2) Computer Science Department – Universidade Federal de Minas Gerais, Brazil
(3) Computer Science Department – Rensselaer Polytechnic Institute, NY
Social Media in Numbers
Twitter: 500M users, 340M tweets/day
Tumblr: 100M users, 75M posts/day
Facebook: 1.15B users, 1B pieces of content shared/day
Instagram: 30M users, 5M photos shared/day
Influence and Relevance in Social Media: Questions
Who are the influentials?
- Influence: ability to popularize information
- Personalized influence
What is relevant?
- Relevance: capacity to satisfy a user’s information needs
- Personalized relevance
Why are these questions important?
- Information diffusion mechanisms
- Recommender systems
- Viral marketing
Information Diffusion Data
Content creation/propagation represented as tuples:
- <user, content, time>
[Figure (a) Twitter: @user_0 posts A and B; @user_1 posts "RT@user_0 A" and C; @user_2 posts "RT@user_0 B"; @user_3 posts "RT@user_1 C"]
[Figure (b) Diffusion data:]
<user 0, A, t0>
<user 0, B, t1>
<user 1, A, t2>
<user 1, C, t3>
<user 2, B, t4>
<user 3, C, t5>
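The tuple representation above can be sketched in a few lines of Python. This is only an illustration of the data format: the `Event` type, the integer timestamps standing in for t0..t5, and the rule that the earliest poster of a content item is its creator are assumptions made here, not definitions taken from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """One <user, content, time> tuple (hypothetical helper type)."""
    user: str
    content: str
    time: int


# The six tuples from the example; integers 0..5 stand in for t0..t5.
events = [
    Event("user_0", "A", 0),
    Event("user_0", "B", 1),
    Event("user_1", "A", 2),
    Event("user_1", "C", 3),
    Event("user_2", "B", 4),
    Event("user_3", "C", 5),
]


def creators(events):
    """Map each content item to its earliest poster (assumed to be the creator)."""
    first = {}
    for e in sorted(events, key=lambda e: e.time):
        first.setdefault(e.content, e.user)
    return first


def propagators(events):
    """Map each content item to the users who posted it after its creator."""
    created_by = creators(events)
    out = defaultdict(list)
    for e in sorted(events, key=lambda e: e.time):
        if created_by[e.content] != e.user:
            out[e.content].append(e.user)
    return dict(out)
```

Under these assumptions, `creators(events)` attributes A and B to user_0 and C to user_1, which agrees with the RT labels in the Twitter example.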
How can we measure influence and relevance?
ProfileRank: random walks over a content-user graph
Relevant content is created and propagated by influential users, and influential users create relevant content
Relies on content propagation instead of a social network
- In some scenarios, there is no social network available
- # of followers ≠ capacity to propagate content [Cha et al.’10]
[Figure (a) Diffusion data: <user 0, A, t0>, <user 0, B, t1>, <user 1, A, t2>, <user 1, C, t3>, <user 2, B, t4>, <user 3, C, t5>]
[Figure (b) Diffusion model]
ProfileRank: Formulation
Information diffusion data → information diffusion graph
- G(U, C, F, E)
G can be represented as two matrices:
1. M: User-content matrix
2. L: Content-user matrix
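As a rough illustration of the two matrices, the sketch below builds them from the six example tuples. The slides do not spell out the entries, so the construction is an assumption: M is taken as a row-normalized |U| x |C| matrix linking each user to the content they created or propagated, and L as a |C| x |U| matrix linking each content item to its creator (assumed to be its earliest poster). The function name `build_matrices` is hypothetical.

```python
def build_matrices(events):
    """Build (users, contents, M, L) from <user, content, time> tuples.

    Assumptions (not stated on the slides):
      M[u][c] = 1 / (#items user u posted) if u created or propagated c
      L[c][u] = 1 if u is the earliest poster (assumed creator) of c
    """
    users = sorted({u for u, _, _ in events})
    contents = sorted({c for _, c, _ in events})
    u_idx = {u: k for k, u in enumerate(users)}
    c_idx = {c: k for k, c in enumerate(contents)}

    # User-content matrix, one row per user, rows sum to 1.
    M = [[0.0] * len(contents) for _ in users]
    for u, c, _ in events:
        M[u_idx[u]][c_idx[c]] = 1.0
    for row in M:
        total = sum(row)
        if total:
            row[:] = [x / total for x in row]

    # Content-user matrix, one row per content item.
    L = [[0.0] * len(users) for _ in contents]
    seen = set()
    for u, c, _ in sorted(events, key=lambda e: e[2]):
        if c not in seen:  # first poster in time order
            seen.add(c)
            L[c_idx[c]][u_idx[u]] = 1.0
    return users, contents, M, L


events = [
    ("user_0", "A", 0), ("user_0", "B", 1), ("user_1", "A", 2),
    ("user_1", "C", 3), ("user_2", "B", 4), ("user_3", "C", 5),
]
users, contents, M, L = build_matrices(events)
```

With this construction both M and L are row-stochastic, so the products LM and ML describe random walks over content and over users respectively.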
Relevance r and influence i computed as:
$r = iM \qquad i = rL$
$r^{(k)} = r^{(k-1)} LM \qquad i^{(k)} = i^{(k-1)} ML$
$r = (1-d)\,u\,(I - dLM)^{-1} \qquad i = (1-d)\,u\,(I - dML)^{-1}$
These equations always have a unique solution
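The formulation can be checked numerically on the toy example. The sketch below assumes that d is a damping factor in (0, 1) and u a uniform restart distribution, as in PageRank (the slides do not define either), and it iterates x ← d·x·P + (1−d)·u, whose fixed point satisfies x = (1−d)·u·(I − dP)^(-1). The matrices M and L are the assumed row-stochastic ones for the six example tuples, and all function names are hypothetical.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[k] * A[k][j] for k in range(len(v))) for j in range(len(A[0]))]


def damped_walk(P, d=0.85, tol=1e-12, max_iter=1000):
    """Iterate x <- d*x*P + (1-d)*u with uniform u until the update is below tol."""
    n = len(P)
    u = [1.0 / n] * n
    x = u[:]
    for _ in range(max_iter):
        step = vecmat(x, P)
        new = [d * s + (1 - d) * t for s, t in zip(step, u)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


# Assumed toy matrices: users user_0..user_3 (rows of M), content A, B, C (rows of L).
M = [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
L = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

r = damped_walk(matmul(L, M))  # relevance over content A, B, C
i = damped_walk(matmul(M, L))  # influence over user_0..user_3
```

On this example user_0, whose content is propagated directly by user_1 and user_2, receives the highest influence score, and content A receives the highest relevance score. Because d < 1 and LM and ML are row-stochastic here, the iteration is a contraction, which is one way to see why the solution is unique in this setting.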
Related Work
Social influence and information diffusion [Gruhl et al.’04, Leskovec et al.’07, Tang et al.’09, Cha et al.’09, Cha et al.’10, Weng et al.’10, Goyal et al.’10, Romero et al.’11]
Content search and recommendation [Baluja et al.’08, Chen et al.’10, De Choudhury et al.’11, Kim and Shim’11]
Link prediction in social networks [Liben-Nowell and Kleinberg’03, Hannon et al.’10, Leroy et al.’10, Gomez Rodriguez et al.’10]
Relevance in hyperlinked environments [Kleinberg’98, Page et al.’99]
Evaluation
Problem: Absence of ground truth information
- Influential users
- Relevant content
Solution: Considering personalized assessments
- A user is influential to another user
- A content is relevant to a given user
ProfileRank can be personalized to provide recommendations
Assumption: Recommendation accuracy → model quality
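One way personalization could work is sketched below. The slides do not say how ProfileRank is personalized, so this borrows the restart idea from personalized PageRank as an assumption: the uniform vector u is replaced by a vector concentrated on the target user, and content the user has not yet posted is ranked by the resulting relevance scores. The names `personalized_relevance` and `recommend`, and the toy matrices, are hypothetical.

```python
def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[k] * A[k][j] for k in range(len(v))) for j in range(len(A[0]))]


def personalized_relevance(M, L, target_user, d=0.85, iters=200):
    """Content scores from a walk that restarts at target_user (assumed scheme)."""
    restart = [0.0] * len(M)
    restart[target_user] = 1.0
    i = restart[:]
    for _ in range(iters):
        r = vecmat(i, M)      # r = iM
        back = vecmat(r, L)   # i = rL
        i = [d * b + (1 - d) * s for b, s in zip(back, restart)]
    return vecmat(i, M)


def recommend(M, L, target_user, k=1):
    """Top-k content indices the target user has not created or propagated."""
    scores = personalized_relevance(M, L, target_user)
    seen = {j for j, w in enumerate(M[target_user]) if w > 0}
    unseen = [j for j in range(len(scores)) if j not in seen]
    return sorted(unseen, key=lambda j: -scores[j])[:k]


# Assumed toy matrices: users user_0..user_3, content A, B, C.
M = [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
L = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

top_for_user_2 = recommend(M, L, target_user=2, k=1)
```

For user_2, who only propagated B, the walk flows back to B's creator user_0 and from there to A, so A (index 0) is the top recommendation. Recommendation accuracy could then be measured by hiding some of a user's propagations and checking whether they appear in the top-k list.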
Evaluation: Datasets
Dataset content #users #pieces of content #propagations source