Trace Complexity of Network Inference

Bruno Abrahao (Cornell)
Flavio Chierichetti (Sapienza)
Robert Kleinberg (Cornell)
Alessandro Panconesi (Sapienza)

Cornell University / Sapienza University
Influence and diffusion on networks
• Network Inference: Find influencers, improve marketing, prevent disease outbreaks, and forecast crimes
The Network Inference Problem

• Learning each edge independently
  - [Adar, Adamic '2005]
• MLE-inspired approaches
  - [Gomez-Rodriguez, Leskovec, Krause '2010]
  - [Gomez-Rodriguez, Balduzzi, Scholkopf '2011]
  - [Myers, Leskovec '2011]
  - [Du et al. '2012]
• Information theoretic
  - [Netrapalli, Sanghavi '2012]
  - [Gripon, Rabbat '2013]

Our work: an information-theoretic view of how much data network inference requires
The Network Inference Problem
• The relationship between the amount of data and the performance of inference algorithms is not well understood
What can be inferred? What resources are required? How hard is the inference task?
Our goal
• Provide a rigorous foundation for network inference:
1. develop a measure that relates the amount of data to the performance of algorithms
2. give information-theoretic performance guarantees
3. develop more efficient algorithms
We assume an underlying cascade model

• A cascade starts at a source node s at time t = 0.0
• When a node becomes infected, each incident edge independently propagates the infection with probability p (a coin flip: Pr{H} = p)
• If an edge propagates the infection, the incubation time along it is drawn from an exponential distribution Exp(λ)
• The process records only the identity and infection time of each infected node, e.g.:

  Node s, t = 0.0
  Node c, t = 0.345
  Node a, t = 1.236
  Node b, t = 1.705

This record is called a trace.
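To make the model concrete, here is a minimal simulation sketch, assuming the coin-flip-then-Exp(λ) dynamics above on an undirected graph; the graph, function names, and parameters are illustrative, not taken from the paper's code.

import heapq
import random

def simulate_trace(adj, source, p=1.0, lam=1.0, rng=random.Random(0)):
    """Run one cascade from `source`; return the trace as a list of
    (node, infection_time) pairs sorted by infection time."""
    infection_time = {source: 0.0}
    events = [(0.0, source)]                    # (time, newly infected node)
    while events:
        t, u = heapq.heappop(events)
        if t > infection_time[u]:               # stale event: node was infected earlier
            continue
        for v in adj[u]:
            if rng.random() > p:                # coin flip: edge propagates w.p. p
                continue
            t_v = t + rng.expovariate(lam)      # incubation time ~ Exp(lambda)
            if v not in infection_time or t_v < infection_time[v]:
                infection_time[v] = t_v
                heapq.heappush(events, (t_v, v))
    return sorted(infection_time.items(), key=lambda kv: kv[1])

# Example on the 5-node graph sketched on the slide (the edge set is an assumption):
adj = {'s': ['a', 'c'], 'a': ['s', 'b', 'c'], 'b': ['a', 'd'],
       'c': ['s', 'a', 'e'], 'd': ['b'], 'e': ['c']}
print(simulate_trace(adj, 's'))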
Traces and cascades
• Each cascade generates one trace
• Random cascade: starts at a node chosen uniformly at random (assumption in some of our models)
• Traces do not directly reflect the underlying network over which the cascade propagates
  Node s, t = 0.0
  Node c, t = 0.345
  Node a, t = 1.236
  Node b, t = 1.705

How much structural information is contained in a trace?
Our Research Question I
How many traces do we need to reconstruct the underlying network?
We call this measure the trace complexity of the problem.
Our Research Question II

How does trace length play a role in inference?
As we keep scanning a trace, it becomes less and less informative.

  Node s, t = 0.0
  Node c, t = 0.345
  Node a, t = 1.236
  Node b, t = 1.705

The first two entries pin down the edge (s, c); for each later entry (a, then b, ...) more and more previously infected nodes could be the infector, so more and more candidate edges could explain the observation.
The head of the trace

• First-Edge Algorithm
  - Infers the edge corresponding to the first two nodes in each trace (and ignores the rest of the trace)
  - E.g., from the trace (Node s, t = 0.0; Node c, t = 0.345; Node a, t = 1.236; Node b, t = 1.705; Node d, t = 1.725), it infers only the edge (s, c)
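A minimal sketch of First-Edge, assuming each trace is a time-sorted list of (node, time) pairs as in the simulator above:

def first_edge(traces):
    """Infer one undirected edge per trace: its first two infected nodes."""
    edges = set()
    for trace in traces:
        if len(trace) >= 2:
            (u, _), (v, _) = trace[0], trace[1]
            edges.add(frozenset((u, v)))
    return edges

Because the first non-source infection always arrives over an edge from the source, every inferred edge is a true positive; the cost is that each trace yields at most one edge.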
Contributions

1. The head of traces
   • First-Edge is close to the best we can do for exact reconstruction: Ω(nΔ^(1−ε)) traces are necessary
2. The tail of traces
   • We give algorithms using exponentially fewer traces:
   - trees: O(log n)
   - bounded-degree graphs: O(poly(Δ) log n)
3. Inferring properties without reconstructing the network itself
   - degree distribution: O(n)
How many traces do we need for exact reconstruction of general graphs?
Lower bound for exact reconstruction of general graphs

Consider two graphs that differ in a single edge:

  G0 = K_n (the complete graph on nodes a, b, c, ..., e, f)
  G1 = K_n − {a, b} (the same graph with only the edge {a, b} removed)

1. We choose the unknown graph from {G0, G1}
2. Run random cascades on the chosen graph
Lower bound for exact reconstruction of general graphs

Given a set of ℓ random traces T1, ..., Tℓ, Bayes' rule tells us which of the two alternatives, G0 or G1, is the more likely.
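Spelled out, with the uniform prior over {G0, G1} implied by the construction and with independent traces (a standard application of Bayes' rule, added here for completeness; the 1/2 priors cancel):

\Pr\{G_0 \mid T_1,\dots,T_\ell\}
  = \frac{\prod_{i=1}^{\ell}\Pr\{T_i \mid G_0\}}
         {\prod_{i=1}^{\ell}\Pr\{T_i \mid G_0\} \;+\; \prod_{i=1}^{\ell}\Pr\{T_i \mid G_1\}}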
Lower bound for exact reconstruction of general graphs

Lemma. Let ℓ < n^(2−ε), for any small positive constant ε. Then, with probability 1 − o(1) over the random traces T1, ..., Tℓ, the posterior Pr{G0 | T1, ..., Tℓ} lies in [1/2 − o(1), 1/2 + o(1)].

In other words, even after seeing almost n^2 traces, the two alternatives remain essentially indistinguishable.
Lower bound for exact reconstruction of general graphs

Let Δ be the largest degree of a node in the network.

Corollary. If ℓ < n · Δ^(1−ε), any algorithm will fail to reconstruct the graph with high probability.

Ω(nΔ^(1−ε)) traces are necessary.
The head of the trace

First-Edge reconstructs the graph with O(nΔ log n) traces.

  First-Edge:  O(nΔ log n)
  Lower bound: Ω(nΔ^(1−ε))

First-Edge is close to the best we can do for exact reconstruction!
Can we reconstruct special families of graphs using fewer traces?
The tail of the trace

• The tail of a trace carries useful information for reconstructing special families of graphs
• We give algorithms for inference using exponentially fewer traces:
  - trees: O(log n)
  - bounded-degree graphs: O(poly(Δ) log n)
Maximum Likelihood Tree Estimation

We can perfectly reconstruct trees with high probability using O(log n) traces.

Take ℓ traces.
1. Set c(u, v) to the median of the observations |t(u) − t(v)| over all traces.

Why the median separates edges from non-edges: if {u, v} is an edge of the tree, it is the only route of infection between u and v, so the incubation time between u and v is a sample of Exp(λ). Hence:

• If (u, v) ∈ E, then c(u, v) < 1/λ with probability approaching 1 exponentially in ℓ.
• Otherwise*, c(u, v) > 1/λ with probability approaching 1 exponentially in ℓ. (*Step 3 omitted.)

The probability that all these events happen is ≥ 1 − 1/n^c when using ℓ ≥ c · log n traces.
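A minimal sketch of the median test, assuming p = 1 on a tree (so every trace covers all nodes), traces given as node → time dicts, and edges selected by thresholding the median at 1/λ; the omitted Step 3 from the talk is not reproduced here.

from itertools import combinations
from statistics import median

def reconstruct_tree(traces, nodes, lam):
    """Keep the pairs whose median infection-time gap falls below 1/lambda."""
    edges = set()
    for u, v in combinations(nodes, 2):
        c_uv = median(abs(tr[u] - tr[v]) for tr in traces)
        if c_uv < 1.0 / lam:    # edge gaps are Exp(lambda): median ln(2)/lambda < 1/lambda
            edges.add(frozenset((u, v)))
    return edges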
Local MLE for inferring bounded-degree graphs

• Think of the potential neighbor sets of u as "forecasters" predicting the infection time of u, given their own infection times
• Identify the most accurate forecaster using a proper scoring rule

Trace complexity: O(poly(Δ) log n)
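One way such a forecaster could be scored, sketched under the assumption of Exp(λ) incubation times and using the logarithmic scoring rule; the paper's actual estimator and scoring rule may differ in their details. With a candidate neighbor set S, node u is infected by the minimum of Exp(λ) clocks started at the infection times of the members of S:

import math
from itertools import combinations

def log_score(candidate_set, trace, u, lam):
    """Log-density of u's observed infection time if its neighborhood
    were exactly `candidate_set` (trace maps node -> infection time)."""
    t_u = trace[u]
    earlier = [trace[s] for s in candidate_set if s in trace and trace[s] < t_u]
    if not earlier:
        return float('-inf')    # no candidate could have infected u
    # survival of all running clocks up to t_u, times the total hazard at t_u
    return math.log(lam * len(earlier)) - lam * sum(t_u - t_s for t_s in earlier)

def best_neighbor_set(traces, nodes, u, lam, max_degree):
    """Pick the forecaster (candidate set of size <= max_degree) with the
    highest cumulative log score over the traces containing u."""
    candidates = [set(c) for k in range(1, max_degree + 1)
                  for c in combinations(sorted(set(nodes) - {u}), k)]
    return max(candidates,
               key=lambda S: sum(log_score(S, tr, u, lam)
                                 for tr in traces if u in tr))

Note the enumeration over candidate sets makes the running time exponential in Δ even though the trace complexity is polynomial in Δ, exactly the gap raised in the open questions at the end.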
Can we recover properties of a network without paying the full price of network reconstruction?
Obtaining network properties more cheaply

• Such properties are useful for reasoning about the behavior of processes that take place in the network:
  • Robustness [Cohen et al. '2000]
  • Network evolution [Leskovec, Kleinberg, Faloutsos '2005]
  • ...

We can infer the degree distribution with high probability using O(n) traces, versus the Ω(nΔ^(1−ε)) lower bound for reconstructing the whole network.
Reconstructing the degree distribution

To estimate the degree of a node s, use only the waiting time, in each trace starting at s, until s's first neighbor is infected:

  Trace 1: t1
  Trace 2: t2
  Trace 3: t3
  ...
  Trace ℓ: tℓ

Let d be the degree of s.
• Each ti is the minimum of d independent Exp(λ) incubation times, i.e., a sample of Exp(dλ)
• T = Σ_{i=1}^{ℓ} ti is therefore Erlang(ℓ, dλ)
• Output: d̂ = ℓ / (Tλ)

Using the Poisson tail bound

  Pr{Erlang(n, λ) < z} = Pr{Pois(z · λ) ≥ n},

we achieve a (1 + ε)-approximation with probability 1 − δ using Ω(ln(1/δ) / ε^2) traces.
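A minimal sketch of the resulting estimator, with an illustrative sanity check; the variable names and the synthetic check are assumptions, not from the paper:

import random

def estimate_degree(first_waits, lam):
    """first_waits: one waiting time per trace, each ~ Exp(d * lambda).
    Their sum T is Erlang(l, d * lambda), so d_hat = l / (T * lambda)."""
    T = sum(first_waits)
    return len(first_waits) / (T * lam)

# Sanity check: d = 12 neighbors, lambda = 2.0, l = 400 traces.
rng, d, lam, l = random.Random(1), 12, 2.0, 400
waits = [min(rng.expovariate(lam) for _ in range(d)) for _ in range(l)]
print(estimate_degree(waits, lam))   # concentrates near the true degree 12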
Reconstructing the degree distribution

• Using 10n traces

[Plots: degree-distribution reconstruction results on Barabasi-Albert (1024 nodes), Facebook-Rice Undergraduate (1220 nodes), and Facebook-Rice Graduate (503 nodes)]
Building on the First-Edge algorithm

• First-Edge is close to optimal, but
• naive and too conservative: it ignores most of the information in each trace
• its performance is capped: at most as many true-positive edges as the number of traces (and no false positives)
Could we discover more true positives if we are willing to take more (calculated) risks?
First-Edge+

• Idea: 1. Reconstruct the degree distribution. 2. Guess additional edges by exploiting the memoryless property of the exponential distribution.

Suppose the source s, of degree ds, is infected at time t0, and it infects node u, of degree du, at time t1. Then ds − 1 + du − 1 edges are waiting at time t1, and by memorylessness any of them is equally likely to be the first to finish. So when the next node v is infected, at time t2:

  s infected v with probability p(s,v) = (ds − 1) / (ds + du − 2)
  u infected v with probability p(u,v) = (du − 1) / (ds + du − 2)

Infer the edge (x, y) if p(x,y) ≥ 0.5.

Given a longer trace prefix u1, ..., uk (where u1 is the source), the same reasoning gives

  p(ui, uk+1) ≈ d(ui) / Σj d(uj)
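A minimal sketch of this guessing rule, assuming estimated degrees `deg` (e.g., from the Erlang estimator above) and traces given as time-sorted node lists; using the simplified d(ui) / Σj d(uj) approximation throughout, and the tie-breaking, are assumptions of this sketch:

def first_edge_plus(traces, deg, threshold=0.5):
    """Attribute each new infection to its most likely infector among the
    already-infected prefix; keep the edge if that probability clears the
    threshold."""
    edges = set()
    for trace in traces:
        prefix = []
        for node in trace:                         # nodes in infection order
            if prefix:
                total = sum(deg[u] for u in prefix)
                u_best = max(prefix, key=lambda u: deg[u])
                if deg[u_best] / total >= threshold:
                    edges.add(frozenset((u_best, node)))
            prefix.append(node)
    return edges

With a one-node prefix the probability is 1, so the first inferred edge per trace coincides with First-Edge's choice; the extra edges are the calculated risks.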
Experimental Inference Results

[Plots: performance of First-Edge, First-Edge+, and NetInf [Gomez-Rodriguez, Leskovec, Krause '2010] as the number of traces grows, on Barabasi-Albert (1024 nodes, Δ = 174), a power-law tree (1024 nodes, Δ = 94), and Facebook (1220 nodes, Δ = 287); Δ = max. degree]

• First-Edge+ exhibits competitive performance
• NetInf's performance flattens
• Our algorithm perfectly reconstructs trees with ~30 traces
• First-Edge+ is competitive while being extremely simple to implement, computationally efficient, and preemptive
Conclusions

• Our results have direct implications for the design of network inference algorithms
• We provide a rigorous analysis of the relationship between the amount of data and the performance of inference algorithms
• We give algorithms that are competitive with, while being simpler and more efficient than, existing approaches
Open questions and challenges

• Performance guarantees for approximate reconstruction
• Trace complexity under other distributions of incubation times
• Bounded-degree network inference has trace complexity polynomial in Δ but running time exponential in Δ
  - Can we optimize the algorithm?
• Other network properties that can be recovered without reconstructing the network
Trace Complexity of Network Inference

Bruno Abrahao (Cornell)
Flavio Chierichetti (Sapienza)
Robert Kleinberg (Cornell)
Alessandro Panconesi (Sapienza)

Cornell University / Sapienza University

Complete version, including all proofs:
www.arxiv.org/abs/1308.2954
or
http://www.cs.cornell.edu/~abrahao