1. Inferring Multiple Graph Structures
Julien Chiquet (1), Yves Grandvalet (2), Christophe Ambroise (1)
(1) Statistique et Génome, CNRS & Université d'Évry Val d'Essonne
(2) Heudiasyc, CNRS & Université de Technologie de Compiègne
NeMo, 21 June 2010

Chiquet, Grandvalet, Ambroise. Inferring multiple Gaussian graphical structures. arXiv preprint.
Chiquet, Grasseau, Charbonnier and Ambroise. R package SIMoNe.
http://stat.genopole.cnrs.fr/~jchiquet/fr/softwares/simone
2. Problem
Inference from few arrays and few examples, but lots of genes: which interactions? A (very) high-dimensional problem.
The main trouble is the low-sample-size, high-dimensional setting.
Our main hope is to benefit from sparsity: few genes interact.

3. Handling the scarcity of data
Merge several experimental conditions (experiment 1, experiment 2, experiment 3).

4. Handling the scarcity of data
Inferring each graph independently does not help: each sample (X_1^(t), ..., X_{n_t}^(t)), t = 1, 2, 3, yields its own inference.

5. Handling the scarcity of data
By pooling all the available data: a single inference on (X_1, ..., X_n), with n = n_1 + n_2 + n_3.

7. Handling the scarcity of data
By breaking the separability: one inference per experiment, but the three inferences are coupled.
9. Outline
Statistical model
Multi-task learning
Algorithms and methods
Model selection
Experiments

11. Gaussian graphical modeling
Let
- X = (X_1, ..., X_p) ~ N(0_p, Σ), and assume n i.i.d. copies of X,
- X be the n × p matrix whose kth row is X_k,
- Θ = (θ_ij)_{i,j ∈ P} = Σ^{-1} be the concentration matrix.

Graphical interpretation: since cor(X_i, X_j | X_{P∖{i,j}}) = -θ_ij / √(θ_ii θ_jj) for i ≠ j,
θ_ij = 0  ⟺  X_i ⊥ X_j | X_{P∖{i,j}}  ⟺  no edge (i, j) in the network.
The nonzeroes in Θ describe the graph structure.

12. The model likelihood
Let S = n^{-1} XᵀX be the empirical variance-covariance matrix: S is a sufficient statistic for X, so L(Θ; X) = L(Θ; S).
The log-likelihood is
L(Θ; S) = (n/2) log det(Θ) - (n/2) trace(SΘ) - (np/2) log(2π).
The MLE of Θ is S^{-1}:
- not defined for n < p,
- not sparse, hence a fully connected graph.
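The failure of the MLE for n < p can be seen directly: the empirical covariance has rank at most n, so it is singular and cannot be inverted. A minimal numerical check (my own illustration, not part of the slides):

```python
# Sketch: with n < p, S = X'X / n has rank at most n < p,
# so the MLE S^{-1} of the concentration matrix does not exist.
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 25                      # fewer samples than variables
X = rng.standard_normal((n, p))    # n i.i.d. rows
S = X.T @ X / n                    # p x p empirical covariance

print(np.linalg.matrix_rank(S))    # at most n = 10, hence < p: singular
```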
13. Penalized Approaches
Penalized likelihood (Banerjee et al., 2008):
max_{Θ ∈ S_+} L(Θ; S) - λ‖Θ‖_1
- well defined for n < p,
- sparse, hence a sensible graph,
- an SDP of size O(p²) (solved by Friedman et al., 2007).
Neighborhood selection (Meinshausen & Bühlmann, 2006):
β̂_j = argmin_{β ∈ R^{p-1}} (1/n) ‖X_j - X_{∖j} β‖² + λ‖β‖_1,
where X_j is the jth column of X and X_{∖j} is X deprived of X_j
- not symmetric, not positive-definite,
- p independent LASSO problems of size p - 1.

15. Neighborhood vs. Likelihood
Pseudo-likelihood (Besag, 1975):
P(X_1, ..., X_p) ≈ ∏_{j=1}^p P(X_j | {X_k}_{k≠j}),
which gives
L̃(Θ; S) = (n/2) log det(D) - (n/2) trace(S Θ D^{-1} Θ) - (np/2) log(2π),
to be compared with
L(Θ; S) = (n/2) log det(Θ) - (n/2) trace(SΘ) - (np/2) log(2π),
with D = diag(Θ).
Proposition (Ambroise, Chiquet, Matias, 2008): neighborhood selection leads to the graph maximizing the penalized pseudo-log-likelihood.
Proof: β̂_j^i = -θ̂_ij / θ̂_jj, where Θ̂ = argmax_Θ L̃(Θ; S) - λ‖Θ‖_1.
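The proof line rests on the classical identity linking regression coefficients to the concentration matrix: regressing X_j on the remaining variables gives population coefficients β_i = -θ_ij / θ_jj. A small numerical sketch of this identity (illustration only; variable names are mine):

```python
# Check: population regression coefficients of X_j on the other variables,
# Sigma_{\j,\j}^{-1} Sigma_{\j,j}, equal -Theta_{\j,j} / Theta_{jj}.
import numpy as np

rng = np.random.default_rng(1)
p = 5
A = rng.standard_normal((p, p))
Theta = A @ A.T + p * np.eye(p)    # a generic SPD concentration matrix
Sigma = np.linalg.inv(Theta)       # covariance

j = 2
mask = np.arange(p) != j
# Population regression coefficients of X_j on X_{\j}:
beta_reg = np.linalg.solve(Sigma[np.ix_(mask, mask)], Sigma[mask, j])
# Prediction from the concentration matrix:
beta_theta = -Theta[mask, j] / Theta[j, j]

print(np.allclose(beta_reg, beta_theta))   # True
```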
18. Multi-task learning
We have T samples (experimental conditions) of the same variables:
- X^(t) is the tth data matrix and S^(t) the corresponding empirical covariance matrix,
- examples are assumed to be drawn from N(0, Σ^(t)).
Ignoring the relationships between the tasks leads to separable objectives:
max_{Θ^(t) ∈ R^{p×p}} L(Θ^(t); S^(t)) - λ‖Θ^(t)‖_1,  t = 1, ..., T.
Multi-task learning means solving the T tasks jointly. We may couple the objectives
- through the fitting term,
- through the penalty term.

20. Coupling through the fitting term
Intertwined LASSO:
max_{Θ^(t), t=1,...,T} Σ_{t=1}^T [ L(Θ^(t); S̃^(t)) - λ‖Θ^(t)‖_1 ],
where S̄ = (1/n) Σ_{t=1}^T n_t S^(t) is the pooled-tasks covariance matrix and
S̃^(t) = α S^(t) + (1 - α) S̄
is a mixture between specific and pooled covariance matrices:
- α = 0 pools the data sets and infers a single graph,
- α = 1 separates the data sets and infers T graphs independently,
- α = 1/2 in all our experiments.
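The intertwined covariance targets can be sketched as follows (a hypothetical illustration with my own variable names, not the SIMoNe code):

```python
# Sketch of the intertwined targets:
# S_tilde[t] = alpha * S[t] + (1 - alpha) * S_bar,
# with S_bar the sample-size-weighted pooled covariance.
import numpy as np

rng = np.random.default_rng(2)
p, sizes = 4, [30, 50, 20]                        # T = 3 tasks
X = [rng.standard_normal((nt, p)) for nt in sizes]
S = [x.T @ x / nt for x, nt in zip(X, sizes)]     # per-task covariances

n = sum(sizes)
S_bar = sum(nt * St for nt, St in zip(sizes, S)) / n   # pooled covariance

alpha = 0.5                                       # as in the experiments
S_tilde = [alpha * St + (1 - alpha) * S_bar for St in S]
```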
21. Coupling through penalties: group-LASSO
We group parameters by sets of corresponding edges across graphs: edge (i, j) of graph 1, ..., edge (i, j) of graph T form one group.
Graphical group-LASSO:
max_{Θ^(t), t=1,...,T} Σ_{t=1}^T L(Θ^(t); S^(t)) - λ Σ_{i≠j} ( Σ_{t=1}^T (θ_ij^(t))² )^{1/2}
- the sparsity pattern is shared between graphs,
- hence identical graphs across tasks.
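The group penalty sums, over the edges, the l2 norm of each edge's coefficients across the T tasks. A minimal sketch (my own, hypothetical helper name):

```python
# Evaluate the group-lasso penalty on a T-stack of p x p matrices,
# where each off-diagonal edge (i, j) is one group across tasks.
import numpy as np

def group_lasso_penalty(Theta):
    """Sum over edges (i != j) of the l2 norm of the coefficients across tasks."""
    norms = np.sqrt((Theta ** 2).sum(axis=0))      # p x p matrix of group norms
    off = ~np.eye(Theta.shape[1], dtype=bool)      # keep off-diagonal entries
    return norms[off].sum()

rng = np.random.default_rng(3)
Theta = rng.standard_normal((3, 4, 4))             # T = 3 tasks, p = 4 nodes
print(group_lasso_penalty(Theta) >= 0)             # True
```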
26. Coupling through penalties: cooperative-LASSO
Same grouping, plus a bet that correlations are likely to be sign-consistent: gene interactions are either inhibitory or activating across assays.
Graphical cooperative-LASSO:
max_{Θ^(t), t=1,...,T} Σ_{t=1}^T L(Θ^(t); S^(t)) - λ Σ_{i≠j} [ ( Σ_{t=1}^T ((θ_ij^(t))_+)² )^{1/2} + ( Σ_{t=1}^T ((θ_ij^(t))_-)² )^{1/2} ],
where (u)_+ = max(0, u) and (u)_- = min(0, u).
- plausible in many other situations,
- the sparsity pattern is shared between graphs, which may nevertheless differ.
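The cooperative penalty can be contrasted with the plain group norm on a single group: a sign-consistent group is penalized exactly like the group-LASSO, while a mixed-sign group pays more. A sketch (my own helper, not from the package):

```python
# Coop-lasso contribution of one group of coefficients across tasks:
# l2 norm of the positive parts plus l2 norm of the negative parts.
import numpy as np

def coop_norm(v):
    """||v_+||_2 + ||v_-||_2 for one group."""
    pos = np.maximum(v, 0.0)
    neg = np.minimum(v, 0.0)
    return np.sqrt((pos ** 2).sum()) + np.sqrt((neg ** 2).sum())

consistent = np.array([0.5, 1.0, 0.8])    # same sign across the T tasks
mixed = np.array([0.5, -1.0, 0.8])        # sign flips between tasks

print(coop_norm(consistent) == np.linalg.norm(consistent))   # True
print(coop_norm(mixed) > np.linalg.norm(mixed))              # True
```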
34. A Geometric View of Sparsity: Constrained Optimization
max_{θ_1, θ_2} L(θ_1, θ_2) - λ Ω(θ_1, θ_2)
is equivalent to
max_{θ_1, θ_2} L(θ_1, θ_2)  s.t.  Ω(θ_1, θ_2) ≤ c.

37. A Geometric View of Sparsity: Supporting Hyperplane
A hyperplane supports a set iff
- the set is contained in one of the two half-spaces,
- the set has at least one point on the hyperplane.
There are supporting hyperplanes at all points of convex sets: they generalize tangents.

42. A Geometric View of Sparsity: Dual Cone
The dual cone generalizes normals.
The shape of the dual cones determines the sparsity pattern.
46. Group-LASSO balls
Admissible set for 2 tasks (T = 2) and 2 coefficients (p = 2); unit ball:
Σ_{i=1}^2 ( Σ_{t=1}^2 (θ_i^(t))² )^{1/2} ≤ 1.
Figure: cross-sections of the unit ball at θ_2^(2) = 0 and θ_2^(2) = 0.3.

50. Cooperative-LASSO balls
Admissible set for 2 tasks (T = 2) and 2 coefficients (p = 2); unit ball:
Σ_{j=1}^2 ( Σ_{t=1}^2 ((θ_j^(t))_+)² )^{1/2} + Σ_{j=1}^2 ( Σ_{t=1}^2 ((θ_j^(t))_-)² )^{1/2} ≤ 1.
Figure: cross-sections of the unit ball at θ_2^(2) = 0 and θ_2^(2) = 0.3.
56. Decomposition strategy
Estimate the jth neighborhood of the T graphs:
max_{K^(t), t=1,...,T} Σ_{t=1}^T L̃(K^(t); S^(t)) - λ Ω(K^(t))
decomposes into p convex optimization problems of size T(p - 1):
β̂_j = argmin_{β ∈ R^{T(p-1)}} f_j(β) + λ Ω(β),
where β̂_j is a minimizer iff 0 ∈ ∇f_j(β) + λ ∂Ω(β).
Group-LASSO: Ω(β) = Σ_{i=1}^{p-1} ‖β_i^{[1:T]}‖_2
Coop-LASSO: Ω(β) = Σ_{i=1}^{p-1} ( ‖(β_i^{[1:T]})_+‖_2 + ‖(β_i^{[1:T]})_-‖_2 )
where β_i^{[1:T]} is the vector corresponding to the edges (i, j) across graphs.
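For the group-LASSO penalty, solving 0 ∈ ∇f_j(β) + λ∂Ω(β) for one group under a simple quadratic f reduces to block soft-thresholding. A generic sketch (my own helper name, not the SIMoNe solver):

```python
# Proximal operator of lam * ||.||_2 for one group: either the whole
# group shrinks, or it is set exactly to zero (0 lies in the subgradient).
import numpy as np

def block_soft_threshold(u, lam):
    """prox of lam * ||.||_2: shrink the whole group, or zero it."""
    norm = np.linalg.norm(u)
    if norm <= lam:
        return np.zeros_like(u)        # 0 is in u + lam * dOmega(0)
    return (1.0 - lam / norm) * u

print(block_soft_threshold(np.array([3.0, 4.0]), 10.0))   # [0. 0.]
print(block_soft_threshold(np.array([3.0, 4.0]), 2.5))    # [1.5 2. ]
```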
59. Active set algorithm
// 0. INITIALIZATION
β ← 0, A ← ∅
while 0 ∉ ∇L(β) + λ ∂Ω(β) do
    // 1. MASTER PROBLEM: OPTIMIZATION WITH RESPECT TO A
    Find a solution h to the smooth problem
        ∇_h f(β_A + h) + λ ∇_h Ω(β_A + h) = 0, where h = {h_A}
    β_A ← β_A + h
    // 2. IDENTIFY NEWLY ZEROED VARIABLES
    while there is i ∈ A with β_i = 0 whose optimality condition holds at zero do
        A ← A ∖ {i}
    end
    // 3. IDENTIFY NEW NON-ZERO VARIABLES
    // Select the candidate i ∈ A^c such that an infinitesimal change of β_i
    // provides the highest reduction of -L, i.e. the strongest violation v_j
    // of the optimality condition
    i ← argmax_{j ∈ A^c} v_j
    if v_i ≠ 0 then
        A ← A ∪ {i}
    else
        Stop and return β, which is optimal
    end
end
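Step 3 ranks the inactive variables by how strongly they violate the optimality condition. For a plain l1 penalty this violation is |∇f(β)_j| - λ; a toy sketch of that test (my own notation, lasso case rather than the multi-task penalty):

```python
# At beta = 0, every coordinate is inactive; the candidate entering the
# active set is the one whose gradient most exceeds the penalty level.
import numpy as np

rng = np.random.default_rng(5)
n, p, lam = 40, 6, 0.3
X = rng.standard_normal((n, p))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.standard_normal(n)

beta = np.zeros(p)                   # start of the active-set loop
grad = -X.T @ (y - X @ beta) / n     # grad of f(beta) = ||y - X beta||^2 / (2n)
violation = np.abs(grad) - lam       # > 0 where beta_j wants to activate
j = int(np.argmax(violation))        # candidate entering the active set
```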
63. Tuning the penalty parameter
What does the literature say?
Theory-based penalty choices:
1. Optimal order of the penalty in the p ≫ n framework: λ of order √(log p / n) (Bunea et al. 2007, Bickel et al. 2009).
2. Control of the probability of connecting two distinct connectivity sets (Meinshausen et al. 2006, Banerjee et al. 2008, Ambroise et al. 2009).
Practically much too conservative.
Cross-validation: optimal in terms of prediction, not in terms of selection; problematic with small samples, since the sparsity constraint changes with the sample size.
64. Tuning the penalty parameter: BIC / AIC
Theorem (Zou et al. 2008): df(β̂^lasso) = ‖β̂^lasso‖_0.
Straightforward extensions to the graphical framework:
BIC(λ) = L(Θ̂_λ; X) - df(Θ̂_λ) (log n)/2
AIC(λ) = L(Θ̂_λ; X) - df(Θ̂_λ)
These rely on asymptotic approximations, but remain relevant for small data sets.
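With df counted as the number of nonzero estimated parameters, the criteria above can be computed directly along the λ path; a sketch (function names are mine):

```python
# BIC / AIC for a sparse estimate: penalize the log-likelihood by the
# number of nonzero parameters (df, following Zou et al. 2008).
import numpy as np

def bic(loglik, theta_hat, n):
    df = np.count_nonzero(theta_hat)
    return loglik - df * np.log(n) / 2.0

def aic(loglik, theta_hat):
    df = np.count_nonzero(theta_hat)
    return loglik - df

theta_hat = np.array([0.0, 1.2, 0.0, -0.4])   # toy sparse estimate: df = 2
print(bic(-10.0, theta_hat, n=100))           # -10 - 2*log(100)/2 ≈ -14.61
print(aic(-10.0, theta_hat))                  # -12.0
```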
66. Data Generation
We set the number of nodes p, the number of edges K, and the number of examples n.
Process:
1. Generate a random adjacency matrix with 2K off-diagonal terms.
2. Compute the normalized Laplacian L.
3. Generate a symmetric matrix of random signs R.
4. Compute the concentration matrix K, with K_ij = L_ij R_ij.
5. Compute Σ by pseudo-inversion of K.
6. Generate correlated Gaussian data ~ N(0, Σ).
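The six-step recipe can be sketched with numpy; the edge-sampling scheme and the choice to keep a positive diagonal in the sign matrix are my own assumptions, made so the resulting concentration matrix stays positive semi-definite:

```python
# Sketch of the generation process (assumptions: uniform edge sampling,
# sign matrix with unit diagonal so that K = L * R remains PSD).
import numpy as np

rng = np.random.default_rng(4)
p, K, n = 10, 15, 50

# 1. random adjacency matrix with 2K off-diagonal terms (K symmetric edges)
A = np.zeros((p, p))
iu = np.triu_indices(p, k=1)
pick = rng.choice(len(iu[0]), size=K, replace=False)
A[iu[0][pick], iu[1][pick]] = 1.0
A = A + A.T

# 2. normalized Laplacian
deg = A.sum(axis=1)
with np.errstate(divide="ignore"):
    d = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
L = np.eye(p) - d[:, None] * A * d[None, :]

# 3. symmetric matrix of random signs (diagonal kept at +1: my choice)
R = np.sign(rng.standard_normal((p, p))) + 0.0
R = np.triu(R, k=1) + np.triu(R, k=1).T
np.fill_diagonal(R, 1.0)

# 4./5. concentration matrix, then covariance by pseudo-inversion
Kmat = L * R
Kmat = (Kmat + Kmat.T) / 2
Sigma = np.linalg.pinv(Kmat)
Sigma = (Sigma + Sigma.T) / 2

# 6. correlated Gaussian data
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
print(X.shape)   # (50, 10)
```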
67. Simulating Related Tasks
Generate:
1. an ancestor with p = 20 nodes and K = 20 edges,
2. T = 4 children by adding and deleting edges,
3. T = 4 Gaussian samples.
Figure: ancestor and children with δ = 2 perturbations.
70. Simulation results
Precision/Recall curve: precision = TP/(TP+FP), recall = TP/P (power).
ROC curve: fallout = FP/N (type I error), recall = TP/P (power).
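These scores can be computed from an estimated versus a true adjacency matrix; a sketch (boolean inputs, my own helper):

```python
# Precision, recall (power) and fallout (type I error) over the
# off-diagonal entries of estimated vs. true adjacency matrices.
import numpy as np

def edge_scores(A_hat, A_true):
    off = ~np.eye(A_true.shape[0], dtype=bool)
    tp = np.sum(A_hat & A_true & off)
    fp = np.sum(A_hat & ~A_true & off)
    P = np.sum(A_true & off)            # positives: true edges
    N = np.sum(~A_true & off)           # negatives: true non-edges
    precision = tp / max(tp + fp, 1)
    recall = tp / max(P, 1)             # power
    fallout = fp / max(N, 1)            # type I error
    return precision, recall, fallout

A_true = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=bool)
A_hat = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=bool)
prec, rec, fal = edge_scores(A_hat, A_true)
print(prec, rec, fal)   # 0.5 1.0 0.5
```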
71. Simulation results: large sample size (n_t = 100)
Figures: precision/recall and ROC curves along the penalty path (λ: λ_max → 0) for CoopLasso, GroupLasso, Intertwined, Independent and Pooled, with δ = 1, 3, 5 perturbations.

74. Simulation results: medium sample size (n_t = 50)
Figures: same curves for δ = 1, 3, 5.

77. Simulation results: small sample size (n_t = 25)
Figures: same curves for δ = 1, 3, 5.
80. Breast Cancer
Prediction of the outcome of preoperative chemotherapy. Patient response can be classified as either
1. pathologic complete response (PCR), or
2. residual disease (not PCR).
Gene expression data: 133 patients (99 not PCR, 34 PCR); 26 identified genes (differential analysis).

81. Package Demo
Cancer data: Coop-Lasso.
82. Conclusions
To sum up:
- Clarified the links between neighborhood selection and the graphical LASSO
- Identified the relevance of multi-task learning in network inference
- First methods for inferring multiple Gaussian graphical models
- Consistent improvements upon the available baseline solutions
- Available in the R package SIMoNe
Perspectives:
- Explore model-selection capabilities
- Other applications of the cooperative-LASSO
- Theoretical analysis (uniqueness, selection consistency)