HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades
Xinran He 1, Theodoros Rekatsinas 2, James Foulds 3, Lise Getoor 3 and Yan Liu 1
07/08/2015
1 University of Southern California, 2 University of Maryland, College Park, 3 University of California, Santa Cruz
He et al. HawkesTopic ICML 2015
Introduction
• Diffusion is an important and fundamental phenomenon: viral marketing, detection of rumors, modeling news dynamics …
• Abundant text-based cascades in a variety of social platforms
[Figure: example cascade over nodes A-G with posting times t=0, 1, 1.5, 2, 3.5]
Traditional vs Text-based Cascades
[Figure: a cascade over nodes A-G with posting times t=0, 1, 1.5, 2, 3.5, shown as a traditional cascade and as a text-based cascade]
Traditional cascades:
- Temporal information
Text-based cascades:
- Temporal information
- Content information
Incorporate content information => better model of diffusion
Incorporate temporal information => better model of documents
Network Inference
Network Inference focuses on inferring a hidden diffusion network
Related work:
- NetInf, NetRate [Gomez et al. 11, 12], MMHP [Yang and Zha 13], KernelCascades [Du et al. 12]
- TopicCascades [Du et al. 13]
[Figure: observed cascade over nodes A-G (t=0 to t=3.5) with document text, the topics (Topic 1, Topic 2, Topic 3), and the inferred diffusion network with edge weights]
Topic Modeling
Topic modeling aims to discover the latent thematic topics
Related work:
- LDA [Blei et al. 03], CTM [Blei and Lafferty 06]
- Citation Influence model [Dietz et al. 07], TIR model [Foulds et al. 13]
[Figure: corpus of documents posted by nodes A-G (t=0 to t=3.5) and the discovered topics (Topic 1, Topic 2, Topic 3)]
Our Contribution
HawkesTopic: joint model for simultaneous Network Inference and Topic Modeling from text-based cascades
[Figure: two panels. Topic Modeling: documents assigned to Topic 1, Topic 2, Topic 3. Network Inference: cascade over nodes A-G (t=0 to t=3.5) and the inferred network with edge weights]
HawkesTopic: Intuition
[Figure: timelines of two nodes, v1 and v2, with the documents each one posts]
Mutually exciting nature: A posting event can trigger future events
Content cascades: The content of a document should be similar to the document that triggers its publication
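The slide does not say how this similarity is enforced. As a minimal sketch, one hypothetical way to encode it is to draw a child document's topic parameters from a Gaussian centered at its parent's parameters and map them to topic proportions with a softmax. The function names, the Gaussian choice, and `sigma` are all assumptions made for illustration.

```python
import math
import random

def softmax(params):
    """Map real-valued topic parameters to proportions that sum to 1."""
    m = max(params)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in params]
    total = sum(exps)
    return [e / total for e in exps]

def child_topic_params(parent_params, sigma, rng):
    """Assumed model: child parameters are a Gaussian perturbation of the parent's."""
    return [rng.gauss(p, sigma) for p in parent_params]

rng = random.Random(0)
parent = [2.0, 0.0, -1.0]                      # hypothetical parent document, 3 topics
child = child_topic_params(parent, 0.1, rng)   # small sigma keeps the child close to the parent
print(softmax(parent))
print(softmax(child))
```

A smaller `sigma` keeps triggered documents closer in topic to the document that triggered them.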
Modeling Posting Times
The mutually exciting nature is captured via a Multivariate Hawkes Process (MHP) [Liniger 09].
For an MHP, the intensity process of node $v$ takes the form:
$$\lambda_v(t) = \mu_v + \sum_{e:\, t_e < t} A_{v_e, v}\, f_\Delta(t - t_e)$$
Rate = Base intensity + Influence from previous events
$\mu_v$: base intensity of node $v$
$A_{u,v}$: influence strength from $u$ to $v$
$f_\Delta(\cdot)$: probability density function of the delay distribution
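To make "Rate = Base intensity + Influence from previous events" concrete, here is a minimal sketch that evaluates the intensity of one node at a given time. The exponential delay density, the names `mu`, `A`, `events`, and the numbers are assumptions for illustration, not values from the talk.

```python
import math

def exp_delay_pdf(dt, rate=1.0):
    """Exponential delay density; the kernel choice is an assumption."""
    return rate * math.exp(-rate * dt) if dt >= 0 else 0.0

def intensity(v, t, events, mu, A, delay_pdf=exp_delay_pdf):
    """Base intensity of node v plus the influence of all events strictly before t.

    events: list of (time, source_node) pairs
    A[u][v]: influence strength from node u to node v
    """
    influence = sum(
        A[src][v] * delay_pdf(t - t_e)
        for (t_e, src) in events
        if t_e < t
    )
    return mu[v] + influence

# Hypothetical two-node example.
mu = [0.2, 0.1]
A = [[0.0, 0.5],
     [0.3, 0.0]]
events = [(0.0, 0), (1.0, 1)]  # node 0 posts at t=0, node 1 posts at t=1

print(intensity(1, 0.5, events, mu, A))  # node 1 is excited by node 0's post
print(intensity(0, 2.0, events, mu, A))  # node 0 is excited by node 1's post
```

Each past event raises the rate of its neighbours by an amount that decays with the elapsed time, which is what makes the process mutually exciting.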
Generating Posting Times
[Figure: event timelines of nodes v1 and v2]
Generate events and their posting times in breadth-first order by interpreting the MHP as a clustered Poisson process [Simma 10]
This provides an explicit parent relationship for the evolution of the content information
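The breadth-first generation can be sketched as a branching simulation. Spontaneous events arrive at each node as a homogeneous Poisson process with that node's base intensity. Every event then spawns a Poisson number of children on each node, with mean equal to the influence strength and an exponentially distributed delay. The exponential delay, the time horizon, and all names here are assumptions for illustration.

```python
import math
import random
from collections import deque

def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small means used here."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate(mu, A, horizon, delay_rate=1.0, seed=0):
    """Generate events on [0, horizon] in breadth-first order, recording each parent."""
    rng = random.Random(seed)
    events = []
    queue = deque()
    # Generation 0: spontaneous events driven by the base intensities.
    for v, base in enumerate(mu):
        for _ in range(sample_poisson(rng, base * horizon)):
            ev = {"id": len(events), "node": v,
                  "time": rng.uniform(0.0, horizon), "parent": None}
            events.append(ev)
            queue.append(ev)
    # Later generations: each event triggers children on every node.
    while queue:
        parent = queue.popleft()
        for v in range(len(mu)):
            for _ in range(sample_poisson(rng, A[parent["node"]][v])):
                t = parent["time"] + rng.expovariate(delay_rate)
                if t <= horizon:  # discard children that fall outside the window
                    child = {"id": len(events), "node": v,
                             "time": t, "parent": parent["id"]}
                    events.append(child)
                    queue.append(child)
    return events

for ev in simulate([0.5, 0.3], [[0.0, 0.5], [0.3, 0.0]], horizon=10.0, seed=1):
    print(ev)
```

Because every triggered event stores its parent's id, a content model can condition a document on the document that triggered it.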