Large networks, clusters and Kronecker products Jure Leskovec ([email protected]) Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos Faloutsos (CMU), Michael Mahoney (Stanford), Kevin Lang (Yahoo), Anirban Dasgupta (Yahoo)
Rich data: Networks
Large on-line computing applications have detailed records of human activity:
On-line communities: Facebook (120 million)
Communication: Instant Messenger (~1 billion)
News and social media: Blogging (250 million)
We model the data as a network (an interaction graph)
Can observe and study phenomena at scales not possible before
[Figure: Communication network]
Small vs. Large networks
Community (cluster) structure of networks
[Figure: Collaborations in NetSci (N=380)]
[Figure: Tiny part of a large social network]
What is the structure of the network?
How can we model that?
Conductance (normalized cut):
How pronounced are communities? How community-like is a set of nodes?
Idea: Use approximation algorithms for NP-hard graph partitioning problems as experimental probes of network structure.
Small Φ(S) == more community-like sets of nodes
[Figure: a node set S and its complement S’]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
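The formula for Φ(S) did not survive in this transcript. The sketch below assumes the standard definition of conductance, Φ(S) = cut(S, S’) / min(vol(S), vol(S’)), where cut counts edges leaving S and vol is the sum of node degrees. The toy graph and the function name `conductance` are illustrative, not taken from the slides.

```python
def conductance(adj, S):
    """Conductance of node set S in an undirected graph.

    adj: dict mapping each node to the set of its neighbours.
    Assumed definition: cut(S, S') / min(vol(S), vol(S')).
    """
    S = set(S)
    # Number of edges with exactly one endpoint in S.
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    # Volume = sum of degrees of the nodes in the set.
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    denom = min(vol_S, vol_rest)
    if denom == 0:
        raise ValueError("conductance is undefined when one side has zero volume")
    return cut / denom


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

print(conductance(adj, {0, 1, 2}))  # one triangle: 1 cut edge over volume 7
print(conductance(adj, {0, 1}))     # 2 cut edges over volume 4
```

The triangle {0, 1, 2} scores lower than the pair {0, 1}, matching the reading that a small Φ(S) marks a more community-like set.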
Network Community Profile Plot
We define: Network community profile (NCP) plot
Plot the score of the best community of size k
[Figure: NCP plot of log Φ(k) against community size, log k; example values Φ(5)=0.25 at k=5 and Φ(7)=0.18 at k=7]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
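As a rough illustration of "the score of the best community of size k", the sketch below computes Φ(k) by exhaustive search on a tiny graph. This brute force is only feasible for a handful of nodes; the talk instead relies on approximation algorithms for graph partitioning. The conductance definition (cut over the smaller volume), the restriction to k ≤ n/2, the toy graph, and the names `conductance` and `ncp` are all assumptions made for this example.

```python
from itertools import combinations


def conductance(adj, S):
    """Assumed definition: cut(S, S') / min(vol(S), vol(S'))."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    return cut / min(vol_S, vol_rest)


def ncp(adj):
    """Return {k: minimum conductance over all node sets of size k}.

    Only sizes up to half the graph are scanned (a choice made here),
    and the graph is assumed to have no isolated nodes.
    """
    nodes = list(adj)
    profile = {}
    for k in range(1, len(nodes) // 2 + 1):
        profile[k] = min(conductance(adj, S) for S in combinations(nodes, k))
    return profile


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

for k, phi in ncp(adj).items():
    print(k, phi)
```

On this graph the profile falls as k grows to 3, where a whole triangle becomes the best set; plotting log Φ(k) against log k gives the NCP plot.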
NCP plot: Network Science
Collaborations between scientists in Networks [Newman, 2005]
[Figure: NCP plot; x-axis: community size, log k; y-axis: conductance, log Φ(k)]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
NCP plot: Large network
Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
More NCP plots of networks
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
NCP: LiveJournal (n=5m, e=42m)
[Figure: NCP plot; x-axis: k (community size); y-axis: Φ(k) (conductance). Annotations: "Better and better communities", "Best community has ~100 nodes", "Communities get worse and worse"]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
Community size is bounded!
Each dot is a different network
Practically constant!
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
Structure of large networks
Core-periphery (jellyfish, octopus)
Small good communities
Denser and denser core of the network
Core contains ~60% of the nodes and ~80% of the edges
So, what’s a good model?
Kronecker product: Definition
Kronecker product of an N x M matrix A and a K x L matrix B is the N*K x M*L block matrix C = A ⊗ B whose (i, j) block is a_ij * B
We define a Kronecker product of two graphs as a Kronecker product of their adjacency matrices
[w/ Chakrabarti-Kleinberg-Faloutsos, PKDD ’05]
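To make the dimensions concrete, here is a minimal pure-Python sketch of the Kronecker product of two matrices stored as lists of rows. The function name `kron` and the example matrices are illustrative choices, not part of the talk.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows.

    If A is N x M and B is K x L, the result is (N*K) x (M*L):
    entry (i*K + p, j*L + q) equals A[i][j] * B[p][q].
    """
    n, m = len(A), len(A[0])
    k, l = len(B), len(B[0])
    return [
        [A[i][j] * B[p][q] for j in range(m) for q in range(l)]
        for i in range(n) for p in range(k)
    ]


# Hypothetical inputs: A is 2 x 2, B is 1 x 2, so C is 2 x 4.
A = [[1, 2],
     [3, 4]]
B = [[0, 1]]
C = kron(A, B)

for row in C:
    print(row)
# [0, 1, 0, 2]
# [0, 3, 0, 4]
```

Each entry of A is replaced by a scaled copy of B, which is why the row counts and the column counts multiply.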
Kronecker graphs
Kronecker graph: a growing sequence of graphs obtained by iterating the Kronecker product
Each Kronecker multiplication multiplies the number of nodes by the size of the initiator, so the graph grows exponentially with the number of iterations
One can easily use multiple initiator matrices (G1’, G1’’, G1’’’) that can be of different sizes
[w/ Chakrabarti-Kleinberg-Faloutsos, PKDD ’05]
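The growth of the sequence can be sketched by repeatedly multiplying an initiator adjacency matrix with itself. The 3-node initiator below (a path with self-loops) and the names `kron` and `kronecker_power` are assumptions for illustration; only a single initiator is used here.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    n, m = len(A), len(A[0])
    k, l = len(B), len(B[0])
    return [
        [A[i][j] * B[p][q] for j in range(m) for q in range(l)]
        for i in range(n) for p in range(k)
    ]


def kronecker_power(G1, k):
    """k-th Kronecker power of the initiator adjacency matrix G1 (k >= 1)."""
    G = G1
    for _ in range(k - 1):
        G = kron(G, G1)
    return G


# Hypothetical initiator: 3-node path with self-loops (7 nonzero entries).
G1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]

for k in (1, 2, 3):
    G = kronecker_power(G1, k)
    n_nodes = len(G)
    n_ones = sum(map(sum, G))  # nonzero adjacency entries
    print(k, n_nodes, n_ones)
# 1 3 7
# 2 9 49
# 3 27 343
```

Both the node count (3^k) and the number of nonzero adjacency entries (7^k) grow exponentially in k, and the entries grow faster than the nodes.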
Kronecker graphs
Kronecker graphs mimic real networks:
Theorem: Power-law degree distribution, Densification,
References
Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations, by J. Leskovec, J. Kleinberg, C. Faloutsos, KDD 2005
Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication, by J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, PKDD 2005
Scalable Modeling of Real Graphs using Kronecker Multiplication, by J. Leskovec, C. Faloutsos, ICML 2007
Statistical Properties of Community Structure in Large Social and Information Networks, by J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney, WWW 2008
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters, by J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney, arXiv 2008