Semi-blind Subgraph Reconstruction in Gaussian Graphical Models

†Tianpei Xie, ⋆Sijia Liu, ⋆Alfred O. Hero

†Transaction Risk Management Team @ Amazon
⋆University of Michigan, Ann Arbor

This work was completed while Tianpei Xie was a PhD student at the University of Michigan.
1 Problem Motivations
2 Model Formulation
3 Experiments
4 Conclusion
Background
• Learning a dependency graph from relational data is a key step in data visualization and analysis. Examples include
  1 recommendation systems
  2 social network analysis [Goyal et al., 2010]
  3 sensor network analysis [Joshi and Boyd, 2009, Liu et al., 2016]
• However, in many situations only a limited set of data is accessible, due to
  • limited budgets during data collection (e.g., labor, energy)
  • restricted access to data sources (e.g., data security, privacy)
• Semi-blind subgraph topology learning problem: we observe data only on a subgraph and are blind to the rest.
Semi-blind subgraph topology learning problem
Challenges
• Challenges:
  • External latent data influence the target network ⇒ bias in inference
  • In probabilistic models, marginalization over latent variables ⇒ false positives in edge detection

Figure: The red nodes are conditionally independent given the blue node. After marginalizing out the blue node, a false connection appears between them in the graph.

• Assumption: additional information from external sources ⇒ summary information about the latent data
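This marginalization effect can be checked numerically. Below is a hypothetical 3-variable example (not from the slides): the marginal precision of the observed pair is the Schur complement of the latent block, and a spurious off-diagonal entry appears after the latent variable is marginalized out.

```python
import numpy as np

# Hypothetical precision matrix over (x1, x2, z), with z latent.
# x1 and x2 are conditionally independent given z: Theta[0, 1] = 0.
theta = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.5],
                  [0.5, 0.5, 2.0]])

# Marginal precision of (x1, x2): Schur complement of the latent block,
# Theta_oo - Theta_oz Theta_zz^{-1} Theta_zo
marg = theta[:2, :2] - np.outer(theta[:2, 2], theta[2, :2]) / theta[2, 2]
print(marg[0, 1])  # -0.125: marginalization has created a false edge (x1, x2)
```

The nonzero entry −0.125 corresponds to the false connection between the two red nodes in the figure.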
Settings
• Random graph signal x ∈ R^n, x ~ N(0, Θ^{-1}), Markov w.r.t. G = (V, E), |V| = n ⇒ x_i ⊥⊥ x_j | x_{−{i,j}} ⇔ Θ_{i,j} = 0 ⇔ (i, j) ∉ E.
• Consider
  • a non-overlapping partition V = V1 ∪ V2, |V1| = n1, |V2| = n2, edge …
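A minimal illustration of this Markov property (a hypothetical 3-node chain, not from the slides): a zero in Θ encodes conditional independence, not marginal independence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Chain graph 0 - 1 - 2: Theta[0, 2] = 0, so x0 ⊥⊥ x2 | x1,
# even though x0 and x2 are marginally correlated.
theta = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.5, 0.5],
                  [0.0, 0.5, 1.0]])
x = rng.multivariate_normal(np.zeros(3), np.linalg.inv(theta), size=5000)

print(theta[0, 2])             # 0.0: no edge (0, 2) in G
print(np.corrcoef(x.T)[0, 2])  # ~0.2: nonzero marginal correlation
```

The population marginal correlation here is exactly 0.2, so support recovery must target the precision matrix Θ, not the sample correlation matrix.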
Experiments
• Compared algorithms:
  • DiLat-GGM
  • GLasso [Friedman et al., 2008]
  • LV-GGM [Chandrasekaran et al., 2012]
  • EM-GLasso [Yuan, 2012]
  • Generalized Laplacian learning (GenLap) [Pavez and Ortega, 2016]
• m i.i.d. realizations of x = [x1, . . . , xn]; m = 400.
• Three types of graphs:
  1 the complete binary tree (h := height)
  2 the grid (w := width, h := height)
  3 the Erdos-Renyi graph (n, p)
• The Jaccard distance error [Jaccard, 1901, Choi et al., 2010] for edge selection between two sets A, B:

  dist_J(A, B) = 1 − |A ∩ B| / |A ∪ B| ∈ [0, 1],

  where
  1 A := the non-zero support set of the estimated Θ1
  2 B := E1, the ground-truth edge set
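The error metric above is straightforward to compute; a minimal sketch (the tolerance `tol` and the toy matrix are my own choices for illustration):

```python
import numpy as np

def edge_set(theta, tol=1e-6):
    """Edges = off-diagonal non-zero support of a precision matrix."""
    n = theta.shape[0]
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(theta[i, j]) > tol}

def jaccard_distance(a, b):
    """dist_J(A, B) = 1 - |A ∩ B| / |A ∪ B|, in [0, 1]."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Toy example: estimated support {(0,1), (1,2)} vs. ground truth {(0,1)}.
theta_hat = np.array([[1.0, 0.3, 0.0],
                      [0.3, 1.0, 0.2],
                      [0.0, 0.2, 1.0]])
truth = {(0, 1)}
print(jaccard_distance(edge_set(theta_hat), truth))  # 0.5
```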
Comparison of mean edge selection error
Comparison of Learned Network
(a) Ground truth  (b) GLasso  (c) LV-GGM  (d) DiLat-GGM
Conclusion
• We propose DiLat-GGM as a generalization of LV-GGM
• The proposed model learns the network topology given internal data and a summary of latent factors from an external source
• An efficient algorithm based on CCP is proposed
• Future research directions: large-scale network learning, hierarchical models
Thank you!
References I
Venkat Chandrasekaran, Pablo A. Parrilo, and Alan S. Willsky. Latent variable graphical model selection via convex optimization. The Annals of Statistics, 40(4):1935–1967, 2012.

Seung-Seok Choi, Sung-Hyuk Cha, and Charles C. Tappert. A survey of binary similarity and distance measures. Journal of Systemics, Cybernetics and Informatics, 8(1):43–48, 2010.

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441, 2008.

Amit Goyal, Francesco Bonchi, and Laks V. S. Lakshmanan. Learning influence probabilities in social networks. In Proceedings of the Third ACM International Conference on Web Search and Data Mining, pages 241–250. ACM, 2010.

Cho-Jui Hsieh, Inderjit S. Dhillon, Pradeep K. Ravikumar, and Matyas A. Sustik. Sparse inverse covariance matrix estimation using quadratic approximation. In Advances in Neural Information Processing Systems, pages 2330–2338, 2011.
References II
Paul Jaccard. Etude comparative de la distribution florale dans une portion des Alpes et du Jura. Impr. Corbaz, 1901.

Siddharth Joshi and Stephen Boyd. Sensor selection via convex optimization. IEEE Transactions on Signal Processing, 57(2):451–462, 2009.

Thomas Lipp and Stephen Boyd. Variations and extension of the convex–concave procedure. Optimization and Engineering, 17(2):263–287, 2016.

Sijia Liu, Sundeep Prabhakar Chepuri, Makan Fardad, Engin Masazade, Geert Leus, and Pramod K. Varshney. Sensor selection for estimation with correlated measurement noise. IEEE Transactions on Signal Processing, 64(13):3509–3522, 2016.

Goran Marjanovic and Alfred O. Hero. ℓ0 sparse inverse covariance estimation. IEEE Transactions on Signal Processing, 63(12):3218–3231, 2015.

Eduardo Pavez and Antonio Ortega. Generalized Laplacian precision matrix estimation for graph signal processing. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6350–6354. IEEE, 2016.
References III
Ming Yuan. Discussion: Latent variable graphical model selection via convex optimization. The Annals of Statistics, 40(4):1968–1972, 2012.

Alan L. Yuille and Anand Rangarajan. The concave-convex procedure (CCCP). In Advances in Neural Information Processing Systems, volume 2, pages 1033–1040, 2002.
DiLat-GGM as Difference-of-Convex program
  min over C, B:   − log det(C − B Θ2 Bᵀ) + tr(Σ1 C)  −  tr(Σ1 B Θ2 Bᵀ)  +  regularizer
                   [ f(C, B), convex ]                   [ g(B), convex ]
  s.t.  C − B Θ2 Bᵀ ≻ 0,

• f(C, B) = − log det [ C  B ; Bᵀ  Θ2⁻¹ ] + tr(Σ1 C) is convex
• g(B) = vec(Bᵀ)ᵀ (Σ1 ⊗ Θ2) vec(Bᵀ) is convex
• The problem can be solved via the convex-concave procedure (CCP) [Yuille et al., 2002, Lipp and Boyd, 2016].
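The quadratic form for g(B) follows from the trace identity tr(Σ1 B Θ2 Bᵀ) = vec(Bᵀ)ᵀ (Σ1 ⊗ Θ2) vec(Bᵀ), which a quick numerical check confirms (the random matrices below are placeholders, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 4, 3
B = rng.standard_normal((n1, n2))
A1 = rng.standard_normal((n1, n1)); Sigma1 = A1 @ A1.T  # SPD stand-in
A2 = rng.standard_normal((n2, n2)); Theta2 = A2 @ A2.T  # SPD stand-in

v = B.T.ravel(order="F")  # column-stacking vec(B^T)
lhs = np.trace(Sigma1 @ B @ Theta2 @ B.T)
rhs = v @ np.kron(Sigma1, Theta2) @ v
print(np.isclose(lhs, rhs))  # True
```

Since Σ1 ⊗ Θ2 is positive semidefinite whenever Σ1 and Θ2 are, g is a convex quadratic in vec(Bᵀ), which is exactly what the DC decomposition above requires.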
The convex sub-problem
At iteration t,

  (C_{t+1}, B_{t+1}) = argmin over C, B:  … + tr(Σ1 (C − 2 B D_tᵀ))    (1)
                       s.t.  …

where ∇_B g(B_t) = 2 Σ1 B_t Θ2 and D_t := B_t Θ2.

• An SDP ⇒ convex
• CCP is a special case of the majorization-minimization (MM) algorithm
• Guaranteed to converge to a local stationary point (regardless of the choice of initial point)
• SDP time complexity O(n^6.5) ⇒ an efficient solver based on ADMM, O(n³)
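To make the CCP recursion concrete, here is a one-dimensional difference-of-convex toy of my own (unrelated to the DiLat-GGM objective): minimize h(x) = x² − 2|x| by repeatedly linearizing the concave part −2|x| at the current iterate and solving the convex surrogate in closed form.

```python
import numpy as np

def ccp_toy(x0, iters=20):
    """CCP for h(x) = x**2 - 2*abs(x): f(x) = x**2, g(x) = 2*abs(x)."""
    x = x0
    for _ in range(iters):
        slope = 2.0 * np.sign(x)  # subgradient of g at the current iterate
        x = slope / 2.0           # argmin_x of the surrogate x**2 - slope*x
    return x

print(ccp_toy(0.3))   # 1.0: a stationary point of h
print(ccp_toy(-4.0))  # -1.0: the limit depends on the initial point
```

Each surrogate majorizes h (since g is convex, its linearization lies below it), so the objective is monotonically non-increasing, mirroring the MM interpretation and the local-stationarity guarantee above.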
Solving the sub-problem using ADMM
• Define R := [ C  B ; Bᵀ  Θ2⁻¹ ],  P = [ P1  P21ᵀ ; P21  P2 ] := R,  and W := Θ2 P21.

We reformulate the convex sub-problem as

  min over R, P, W:  − log det R + tr(S_t R) + 1{R ≻ 0} + α_m ‖P1‖1 + β_m ‖W‖2,1    (2)
  s.t.  P2 = Θ2⁻¹,  R = P,  W = Θ2 P21

where 1{A} is the indicator function and S_t := [ Σ1  −Σ1 D_t ; −D_tᵀ Σ1  γ_t I ].

• ADMM solves three subproblems w.r.t. R, P, W iteratively
Sensitivity to α, β
[Figure: Jaccard distance error (y-axis, 0.1–1) as a function of one regularization parameter (x-axis, 0–0.4), with curves for the other parameter set to 0.01, 0.1, 0.5, 1, and 5; Erdos-Renyi graph with n = 30, p = 0.16.]
Sensitivity to Θ2
[Figure: Jaccard distance error (0–1) vs. SNR (dB, 10⁻¹–10³), with curves for parameter settings (0.2, 2.0), (0.1, 1.0), and (0.1, 2.0).]
• Θ2 = L2 + σ²G, where G = H Hᵀ / n2, H_{i,j} ~ N(0, 1), and L2 is the inverse covariance matrix of x2.
• The Signal-to-Noise Ratio (SNR) is defined as log(‖L2‖F² / σ²) (dB).
Sensitivity to Θ2 (cond. correlated latent var.)
[Figure: Jaccard distance error (roughly 0–0.20) of dilat-ggm vs. lv-ggm on an Erdos-Renyi graph.]
Sensitivity to Θ2 (cond. correlated latent var.)
[Figure: ground-truth and learned graph topologies on a 30-node network (node labels 0–29).]
Sensitivity to Θ2 (cond. indep. latent var.)
[Figure: Jaccard distance error (roughly 0–0.14) of dilat-ggm vs. lv-ggm for Θ2 = L2 and Θ2 = I; complete binary tree.]