Competing for attention: branching-process models of meme popularity
James P. Gleeson
MACSI, Department of Mathematics and Statistics, University of Limerick, Ireland
#branching · www.ul.ie/gleeson · [email protected] · @gleesonj
NetSci14, Berkeley, 5 June 2014
Branching processes for meme popularity models: Overview
[Diagram: the three model ingredients: memory (Φ), network (p_k), competition]
Branching processes for meme popularity models: Part 1
Motivating examples from empirical work on Twitter
Twitter 15M one-year dataset: collaboration with R. Baños and Y. Moreno
[Figure: fraction of hashtags with popularity ≥ n at age a; plot annotation: α = 2]
Branching processes for meme popularity models: Part 2
Simon's model
• Simon, "On a class of skew distribution functions", Biometrika, 1955
• The basis of "cumulative advantage" and "preferential attachment" models; see Simkin and Roychowdhury, Phys. Rep., 2011
• During each time step, one word is added to an ordered sequence
• With probability μ, the added word is an innovation (a new word)
• With probability 1 − μ, a previously-used word is copied; the copied word is …
• Early-mover advantage; fixed-age distributions have exponential tails [Simkin and Roychowdhury, 2007]
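The update rule above can be sketched as a short simulation. This is a minimal illustration, not the author's code: it assumes the copied word is drawn uniformly at random from the sequence so far (as in Simon's original model, so a word is copied in proportion to its current count), and the names `simulate_simon`, `steps` and `seed` are hypothetical.

```python
import random
from collections import Counter


def simulate_simon(steps, mu, seed=0):
    """Simulate Simon's model and return word popularities in birth order.

    Assumption: a copied word is drawn uniformly at random from the
    sequence so far, so each word is copied in proportion to its count.
    """
    rng = random.Random(seed)
    sequence = [0]      # the first word is necessarily an innovation
    next_word = 1
    for _ in range(steps - 1):
        if rng.random() < mu:
            sequence.append(next_word)             # innovation: a new word
            next_word += 1
        else:
            sequence.append(rng.choice(sequence))  # copy an earlier word
    counts = Counter(sequence)
    # popularity[w] is the number of occurrences of the w-th word created
    return [counts[w] for w in range(next_word)]


popularity = simulate_simon(steps=20000, mu=0.02)
```

Because the returned list is indexed by birth order, comparing early and late entries gives a quick look at the early-mover advantage mentioned above.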
Simon's model
[Figure annotation: μ = 0.02]
Simon's model as a branching process
• During each time step, one word is added to an ordered sequence
• With probability μ, the added word is an innovation (a new word)
• With probability 1 − μ, a previously-used word is copied; the copied word is …
• PGFs are "transforms" of probability distributions: define the PGF f(x) of a distribution p_k by f(x) = Σ_k p_k x^k
• …but the "inverse transform" usually requires numerical methods, e.g. Fast Fourier Transforms [Cavers, 1978]
• Some properties:
  – The PGF for the sum of independent random variables is the product of the PGFs for each of the random variables
e.g., H. S. Wilf, generatingfunctionology, CRC Press, 2005
• In this case, popularity distributions depend only on the age of the seed; there is no early-mover advantage
[Figure annotations: μ = 0.02, α = 1.5]
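The two PGF facts quoted earlier on this slide (the product rule for sums of independent variables, and numerical inversion by a Fourier transform) can be checked on a toy example. The sketch below is an illustration under stated assumptions: it uses a plain inverse discrete Fourier transform in place of an FFT, takes the sum of two fair dice as its example, and the helper names `pgf` and `invert_pgf` are hypothetical.

```python
import cmath


def pgf(probs, x):
    """Evaluate f(x) = sum_k probs[k] * x**k."""
    return sum(p * x**k for k, p in enumerate(probs))


def invert_pgf(f, size):
    """Recover p_0..p_{size-1} from a PGF f via an inverse DFT.

    Exact up to rounding when the distribution is supported on 0..size-1.
    """
    # Sample the PGF at the size-th roots of unity on the unit circle.
    values = [f(cmath.exp(2j * cmath.pi * j / size)) for j in range(size)]
    probs = []
    for k in range(size):
        total = sum(v * cmath.exp(-2j * cmath.pi * j * k / size)
                    for j, v in enumerate(values))
        probs.append((total / size).real)
    return probs


die = [0.0] + [1 / 6] * 6   # fair six-sided die: P(k) = 1/6 for k = 1..6


def two_dice_pgf(x):
    # PGF of a sum of independent variables = product of the PGFs
    return pgf(die, x) * pgf(die, x)


two_dice = invert_pgf(two_dice_pgf, size=16)   # sums lie in 2..12 < 16
```

Here `two_dice[n]` approximates the probability that two dice sum to n. The direct transform costs O(size²) operations; an FFT would be the natural replacement for long distributions.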
Competition-induced criticality
Simon's original model and the copying-with-memory model both have the following features:
• One word is added in each time step
• Words "compete" for user attention in order to become popular
• The words have equal "fitness": a type of "neutral model" [Pinto and Muñoz 2011, Bentley et al. 2004]
• … except for the early-mover advantage in Simon's model …
but only the copying-with-memory model gives critical branching processes.
Branching processes for meme popularity models: Part 3
• Each node (of N) has a memory screen, which holds the meme of current interest to that node. Each screen has capacity for only one meme.
• During each time step (Δt = 1/N), one node is chosen at random.
• With probability μ, the selected node innovates, i.e., generates a brand-new meme that appears on its screen and is tweeted (broadcast) to all the node's followers.
• Otherwise (with probability 1 − μ), the selected node (re)tweets the meme currently on its screen (if there is one) to all its followers, and the screen is unchanged. If there is no meme on the node's screen, nothing happens.
• When a meme m is tweeted, the popularity n_m of meme m is incremented by 1 and the memes currently on the followers' screens are overwritten by meme m.
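The rules above translate almost line by line into a simulation. The following is a minimal sketch rather than the author's implementation: it assumes every node has exactly `out_degree` followers chosen uniformly at random among the other nodes, and the function and parameter names are hypothetical.

```python
import random


def simulate_twitter_model(num_nodes, out_degree, mu, steps, seed=0):
    """Simulate the screen-overwriting model; return {meme_id: popularity}.

    Assumption: each node has exactly `out_degree` followers, chosen
    uniformly at random among the other nodes.
    """
    rng = random.Random(seed)
    followers = [
        rng.sample([v for v in range(num_nodes) if v != u], out_degree)
        for u in range(num_nodes)
    ]
    screen = [None] * num_nodes   # one meme per screen, or None if empty
    popularity = {}
    next_meme = 0
    for _ in range(steps):
        node = rng.randrange(num_nodes)
        if rng.random() < mu:
            meme = next_meme          # innovation: a brand-new meme
            next_meme += 1
            popularity[meme] = 0
            screen[node] = meme
        else:
            meme = screen[node]       # retweet whatever is on the screen
            if meme is None:
                continue              # empty screen: nothing happens
        popularity[meme] += 1         # every tweet adds 1 to popularity
        for follower in followers[node]:
            screen[follower] = meme   # overwrite the followers' screens
    return popularity


popularity = simulate_twitter_model(
    num_nodes=200, out_degree=10, mu=0.01, steps=20000
)
```

Since each step advances time by Δt = 1/N, `steps = 20000` with 200 nodes corresponds to 100 time units. A histogram of `popularity.values()` gives an empirical popularity distribution.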
The Markovian Twitter model
• Network structure: a node has k followers (out-degree k) with probability p_k.
• The in-degree (number of followings) has a Poisson distribution.
• Mean degree z = Σ_k k p_k.
• A simplified version of the model of Weng, Flammini, Vespignani, Menczer, Scientific Reports 2, 335 (2012).
• Related to the random-copying "neutral" (Moran-type) models of Bentley et al. 2004 [Bentley et al., I'll Have What She's Having: Mapping Social Behavior, MIT Press, 2011], where the distribution of popularity increments can be obtained analytically [Evans and Plato, 2007].
• Our focus is on the distributions of popularity accumulated over long timescales: when a meme m is tweeted, the popularity n_m of meme m is incremented by 1.
The Markovian Twitter model
• When all screens are non-empty, memes compete for the limited resource of user attention
• Random fluctuations lead to some memes becoming very popular, while others languish in obscurity
• The popularity distributions depend on the structure of the network, through the out-degree distribution p_k
[Figure annotations: μ = 0, p_k = δ_{k,10}]
[Figure annotations: μ = 0.01, p_k ∝ k^{−γ}, γ = 2.5]
The Markovian Twitter model
[Diagram: a screen between times t and t + Δt; label: overwritten, z Δt]
Branching processes solution of Twitter model
Define G(a, x) as the PGF for the excess popularity distribution at age a of memes that originate from a single randomly-chosen screen (the root screen)
• If f''(1) < ∞ (finite second moment of p_k), q_n(∞) ∼ A e^{−n/κ} n^{−3/2} as n → ∞, with κ = (2/μ²) (f''(1) + 2z)/(z + 1)²
• If p_k ∝ k^{−γ} for large k with 2 < γ < 3, then as n → ∞: q_n(∞) ∼ B n^{−γ/(γ−1)} if μ = 0, and q_n(∞) ∼ C n^{−γ} if μ > 0
Large-age limit: ∂G/∂a = 0
cf. sandpile SOC on networks [Goh et al. 2003]
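The split between the two asymptotic regimes rests on whether the second moment of the out-degree distribution is finite. The sketch below illustrates that condition numerically. It assumes f is the PGF of p_k, so that f''(1) = Σ_k k(k − 1) p_k, and it imposes a hypothetical cutoff `k_max` on the power law purely so the sums are finite; the function names are also hypothetical.

```python
def degree_moments(p):
    """Return (z, f2) for an out-degree distribution given as {k: p_k}.

    z  = sum_k k * p_k            (mean degree)
    f2 = sum_k k * (k - 1) * p_k  (f''(1), assuming f(x) = sum_k p_k x^k)
    """
    z = sum(k * pk for k, pk in p.items())
    f2 = sum(k * (k - 1) * pk for k, pk in p.items())
    return z, f2


def truncated_power_law(gamma, k_max):
    """p_k proportional to k**(-gamma) on 1..k_max (hypothetical cutoff)."""
    weights = {k: k ** (-gamma) for k in range(1, k_max + 1)}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


# Homogeneous network, p_k = delta_{k,10}: both moments are finite.
z_delta, f2_delta = degree_moments({10: 1.0})

# Power law with gamma = 2.5: f''(1) keeps growing as the cutoff is raised.
for k_max in (10**2, 10**3, 10**4):
    z, f2 = degree_moments(truncated_power_law(2.5, k_max))
    print(k_max, z, f2)
```

For 2 < γ < 3 the mean degree settles as the cutoff grows while f''(1) does not, which is why that case is treated separately from the finite-second-moment case.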
Comparing branching process theory with simulations
[Figure annotations: p_k = δ_{k,10}; p_k ∝ k^{−γ}, γ = 2.5; μ = 0.01; μ = 0]
Branching processes for meme popularity models: Part 4
Twitter model with memory
• During each time step (with time increment Δt = 1/N), one node is chosen at random.
• The selected node may innovate (with probability μ), or it may retweet a meme from its memory using the memory distribution Φ(t − τ).
• Define G(a, x) as the PGF for the excess popularity distribution at age a of memes that originate from a single randomly-chosen seed (the root)
• The mean popularity m(a) of age-a memes has Laplace transform: …
Conclusions: Branching processes for meme popularity models
• Competition between memes for the limited resource of user attention induces criticality in this model in the μ → 0 limit
• Criticality gives power-law popularity distributions and epochs of linear-in-time popularity growth, even for (cf. Weng et al. 2012)
  – homogeneous out-degree distributions
  – homogeneous user activity levels
• Despite its simplicity, the model matches the empirical popularity distribution of real memes (hashtags on Twitter) remarkably well
• Generalizations of the model are possible, and remain analytically tractable
Conclusions: Competition-induced criticality
– a useful null model to understand how memory, network structure and competition affect popularity distributions
Collaborators, funding, references
Collaborators: Davide Cellai, UL; Mason Porter, Oxford; J-P Onnela, Harvard; Felix Reed-Tsochas, Oxford; Jonathan Ward, Leeds; Kevin O'Sullivan, UL; William Lee, UL; Yamir Moreno, Zaragoza; Raquel A Baños, Zaragoza; Kristina Lerman, USC
Funding: Science Foundation Ireland; FP7 FET Proactive PLEXMATH; SFI/HEA Irish Centre for High-End Computing (ICHEC)
References:
• "A simple generative model of collective online behaviour", arXiv:1305.7440v2
• Physical Review Letters, 112, 048701 (2014); arXiv:1305.4328