Impact of Regularization on Spectral Clustering
Antony Joseph* and Bin Yu#
* Walmart Research Lab in San Francisco (formerly UCB and LBNL)
# Departments of Statistics and EECS, UC Berkeley
Workshop on Spectral Algorithms, Simons Inst, Oct. , 2014
Monday, November 3, 2014
Overview
• Spectral clustering in graphs
• Berkeley Drosophila Genome Project (BDGP) (the fruit fly project)
Collaborators:
• Siqi Wu, UC Berkeley
• Erwin Frise, Lawrence Berkeley Lab
• Ann Hammonds, Lawrence Berkeley Lab
• Sue Celniker, Lawrence Berkeley Lab
A Graph
Context                  Nodes
Social network           people
The fruit fly project    pixels/points in the embryo template
...
The fruit fly project (Berkeley Drosophila Genome Project)
Drosophila (fruit fly)
Widely studied:
• genetic mechanism similar to humans
• easy to maintain in the lab
• short life cycle
• ...
Image dataset from the fruit fly project
[Image: embryo labeled with the gene name tailless]
• Over 100,000 stained embryo images (over 7000 genes)
Goals: contribute to the understanding of
• the interaction between different genes
• the genes required for development of various organs
‘Fate’ map in early embryos
Lohs-Schardin et al. (’70), Hartenstein et al. (’85)
Laser ablation experiments in embryos in early stages of development
Regions labeled on the map:
• hind gut
• anterior mid-gut
• dorsal epidermis
• ventral neurogenic region
• procephalic neurogenic region
• pharynx
• mesoderm
• esophagus
Do genes explain the ‘fate’ map?
... early-stage gene expression images
Discovery of fate map & communities on graphs
Nodes : pixels/points in the embryo
Edges: two nodes are joined if many genes are co-expressed at both
fate map ??
Edge between node i and node j
[Garbled in extraction: one vector of values for node i and one for node j, with an inequality (>) comparing a quantity computed from them against a 90-th percentile threshold]
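The edge rule can be illustrated with a minimal sketch. The slide's exact formula did not survive, so everything specific here is an assumption: the 0/1 encoding of gene expression, the use of the number of co-expressed genes as the similarity, and the names `coexpression_graph`, `X` and `pct` are all hypothetical. Only the 90-th percentile cutoff is taken from the slide.

```python
import numpy as np

def coexpression_graph(X, pct=90.0):
    """Build a binary graph from an (n_nodes, n_genes) 0/1 expression array.

    Assumed rule (not confirmed by the slides): join i and j when the number
    of genes expressed at both nodes exceeds the pct-th percentile of that
    count over all pairs of distinct nodes.
    """
    X = np.asarray(X, dtype=float)
    S = X @ X.T                        # S[i, j] = genes expressed at both i and j
    iu = np.triu_indices_from(S, k=1)  # index pairs with i < j
    cutoff = np.percentile(S[iu], pct)
    A = (S > cutoff).astype(int)
    np.fill_diagonal(A, 0)             # no self-loops
    return A

# Toy data: 30 nodes, 50 genes, random 0/1 expression indicators.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(30, 50))
A = coexpression_graph(X)
```

With a cutoff at the 90-th percentile, roughly one pair in ten ends up joined, which keeps the graph sparse.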
Comparing unregularized vs. regularized SC
Take K = 8
[Figure: two panels, one per value of τ (the τ values were lost in extraction), with labeled regions: dorsal epidermis, mesoderm, hind gut, ventral neurogenic region, procephalic neurogenic region, anterior mid-gut, pharynx, esophagus]
Communities
Nodes                      Communities
people                     like-minded people
pixels/points in embryo    area of future organs
...
Finding communities
Notion of (two) communities
Methods
• Spectral clustering (Fiedler (’73), Donath & Hoffman (’73), ...)
• Modularity (Newman & Girvan (’03))
• Latent space methods (Hoff et al. (’02))
• Profile-likelihood (Bickel & Chen (’09))
• Pseudo-likelihood (Amini et al. (’13))
Spectral Clustering
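Notation
Number of nodes: $n$
Adjacency matrix (symmetric, binary): $A \in \mathbb{R}^{n \times n}$
$A_{ij} = A_{ji} = \begin{cases} 1, & \text{if there is an edge between } (i, j) \\ 0, & \text{otherwise} \end{cases}$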
Each row/column of A associated with a node
Degree matrix (diagonal): $D \in \mathbb{R}^{n \times n}$
$D_{ii} = \sum_j A_{ij}$
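As a small illustration of this notation, the sketch below builds $A$ and $D$ for a toy graph. The edge list and the variable names are made up for the example; they are not from the talk.

```python
import numpy as np

# Hypothetical toy graph on n = 5 nodes, given as an undirected edge list.
n = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

# Symmetric binary adjacency matrix: A[i, j] = A[j, i] = 1 if (i, j) is an edge.
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1

# Diagonal degree matrix: D[i, i] = sum_j A[i, j].
D = np.diag(A.sum(axis=1))
```

Row `i` of `A` lists the neighbours of node `i`, and `D[i, i]` counts them.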
Spectral Clustering
Spectral clustering deals with the eigenvectors of the matrix
$L = D^{-1/2} A D^{-1/2}$ (normalized symmetric Laplacian matrix)
Other matrices used ...
• $D - A$ (unnormalized Laplacian)
• $A$ (adjacency matrix)
• $D^{-1} A$ (normalized random walk Laplacian)
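The matrices named on this slide can be computed side by side. This is a minimal sketch assuming a graph with no isolated nodes (so that $D$ is invertible); the function name `graph_matrices` and the toy graph are illustrative only.

```python
import numpy as np

def graph_matrices(A):
    """Return (D^{-1/2} A D^{-1/2}, D - A, D^{-1} A) for an adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                  # node degrees
    if np.any(d == 0):
        raise ValueError("isolated node: D is not invertible")
    s = 1.0 / np.sqrt(d)
    L_sym = s[:, None] * A * s[None, :]    # normalized symmetric Laplacian, as defined in the talk
    L_unnorm = np.diag(d) - A              # unnormalized Laplacian
    L_rw = A / d[:, None]                  # normalized random walk Laplacian
    return L_sym, L_unnorm, L_rw

# Toy graph: a path 0-1-2-3 plus the edge (0, 2).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
L_sym, L_unnorm, L_rw = graph_matrices(A)
```

Two sanity checks follow from the definitions: the rows of $D^{-1}A$ sum to one, and the largest eigenvalue of $D^{-1/2} A D^{-1/2}$ equals one.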
Illustration of SC
[Figure: an adjacency matrix A and its Laplacian L; scatter plot of the second eigenvector against the first eigenvector, with the clusters marked]
SC for finding K clusters (Shi and Malik (’00), Ng et al. (’02))
• Compute the $n \times K$ matrix $V$ of the top $K$ eigenvectors of $L$
• Run k-means with $K$ clusters on the rows of $V$
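The two steps above can be sketched as follows. This is an illustrative implementation, not the authors' code: "top" eigenvectors are taken to mean largest eigenvalue in magnitude, and the small k-means routine with farthest-point initialization is a stand-in chosen to keep the example deterministic and dependency-free.

```python
import numpy as np

def kmeans(X, K, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, K):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k] for k in range(K)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def spectral_clustering(A, K):
    """Cluster the rows of the top-K eigenvector matrix of L = D^{-1/2} A D^{-1/2}."""
    A = np.asarray(A, dtype=float)
    s = 1.0 / np.sqrt(A.sum(axis=1))       # assumes no isolated nodes
    L = s[:, None] * A * s[None, :]
    vals, vecs = np.linalg.eigh(L)
    top = np.argsort(-np.abs(vals))[:K]    # K largest eigenvalues in magnitude
    V = vecs[:, top]                       # n x K eigenvector matrix
    return kmeans(V, K)

# Toy graph: two 5-cliques joined by the single edge (4, 5).
A = np.zeros((10, 10), dtype=int)
A[:5, :5] = 1
A[5:, 5:] = 1
np.fill_diagonal(A, 0)
A[4, 5] = A[5, 4] = 1
labels = spectral_clustering(A, K=2)
```

On this toy graph the second eigenvector takes opposite signs on the two cliques, so k-means on the rows of `V` separates them.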
Popularity of spectral clustering
• Computational advantage:
  - requires only an eigenvector decomposition, which is very fast
• Theoretical backing:
  - relaxation of various cut-based measures (Hagen & Kahng (’92), Shi & Malik (’00), Ng et al. (’02))
  - Stochastic Block Model and its extensions (McSherry (’01), Rohe et al. (’11), Chaudhuri et al. (’12), Sussman (’12), Fishkind (’11))
Performance of spectral clustering improves greatly through regularization
Regularization proposed by Amini, Chen, Bickel and Levina (AoS, 2013)
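• Perturb the adjacency matrix $A$: $A_\tau = A + \frac{\tau}{n}\mathbf{1}\mathbf{1}^\top$, $\tau > 0$
• Form the Laplacian $L_\tau$ from $A_\tau$
• Cluster the rows of $V_\tau$ into $K$ clusters, where $V_\tau$ = top $K$ eigenvectors of $L_\tau$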
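A minimal sketch of the regularized Laplacian, using $L_\tau = D_\tau^{-1/2} A_\tau D_\tau^{-1/2}$ as in the recap slide. The function name and the toy graph are illustrative assumptions.

```python
import numpy as np

def regularized_laplacian(A, tau):
    """L_tau = D_tau^{-1/2} A_tau D_tau^{-1/2}, with A_tau = A + (tau / n) * 1 1^T."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    A_tau = A + tau / n              # adds tau/n to every entry
    d_tau = A_tau.sum(axis=1)        # regularized degrees: d_i + tau
    s = 1.0 / np.sqrt(d_tau)
    return s[:, None] * A_tau * s[None, :]

# Toy graph with an isolated node (node 3): the unregularized Laplacian is
# undefined here because D has a zero on its diagonal.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]])
L_tau = regularized_laplacian(A, tau=1.0)
```

Every regularized degree is at least $\tau > 0$, so $L_\tau$ is well defined even when some nodes have degree zero.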
Alternative forms of regularization proposed and analyzed in Chaudhuri et al. (’12), Qin & Rohe (’13)
Stochastic Block Model
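Stochastic Block Model (SBM) (Holland et al. (’83))
• $n$ nodes
• Edge between $(i, j)$ present independently with probability $P_{ij}$
SBM with two blocks: $P$ is the $n \times n$ matrix that is constant on blocks,
$P = \begin{pmatrix} p_1 & q \\ q & p_2 \end{pmatrix}$ (within-block probabilities $p_1, p_2$; between-block probability $q$)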
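Sampling from a two-block SBM can be sketched as below. The function name `sample_sbm`, the block sizes, and the assignment of the probabilities .4, .3 and .2 to $p_1$, $p_2$ and $q$ are assumptions made for the example.

```python
import numpy as np

def sample_sbm(sizes, B, rng):
    """Sample a symmetric 0/1 adjacency matrix with no self-loops.

    sizes: number of nodes in each block; B: K x K matrix of block probabilities.
    Returns (A, P, labels), where P[i, j] = B[labels[i], labels[j]].
    """
    B = np.asarray(B, dtype=float)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    P = B[labels][:, labels]                 # n x n matrix, constant on blocks
    n = len(labels)
    upper = np.triu(rng.random((n, n)) < P, k=1)   # independent draws for i < j
    A = (upper | upper.T).astype(int)        # symmetrize; the diagonal stays zero
    return A, P, labels

rng = np.random.default_rng(0)
# Assumed mapping: p1 = .4, p2 = .3, q = .2.
A, P, labels = sample_sbm([100, 100], [[0.4, 0.2], [0.2, 0.3]], rng)
```

Each pair $i < j$ is drawn once and mirrored, so `A` is symmetric by construction.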
[Figure: a sample drawn from a two-block SBM; the block probabilities shown are .4, .3, .2, .2]
Analysis of regularization for the SBM (focus on K = 2)
Comparing unregularized vs. regularized SC
[Figure: two scatter plots of the second sample eigenvector against the first sample eigenvector, one per value of τ (the τ values were lost in extraction); block probabilities shown: .003, .04, .0025, .0025. k-means success: 100% in one panel and 87% in the other.]
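Recap: Regularized spectral clustering
$A_\tau = A + \frac{\tau}{n}\mathbf{1}\mathbf{1}^\top$, $\tau > 0$
$L_\tau = D_\tau^{-1/2} A_\tau D_\tau^{-1/2}$
$V_\tau$ = matrix of top eigenvectors of $L_\tau$; cluster the rows of $V_\tau$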
Population level quantities
• $\mathbb{E}[A] = P$
• $P_\tau = P + \frac{\tau}{n}\mathbf{1}\mathbf{1}^\top$
• $L_\tau^{pop}$: the Laplacian formed from $P_\tau$ (population counterpart of $L_\tau$)
• Recall: $V_\tau$ is the $n \times 2$ eigenvector matrix of $L_\tau$; its population counterpart $V_\tau^{pop}$ comes from $L_\tau^{pop}$
• $\mu_{1,\tau}, \mu_{2,\tau}$: the two leading eigenvalues of $L_\tau^{pop}$
[Figure: scatter plot of the second sample eigenvector against the first sample eigenvector at a given τ, annotated in successive builds with quantities subscripted by τ. The final build defines a quantity at τ as a ratio: a maximum, over the points, of the norm of a difference of two τ-subscripted quantities, divided by the norm of another such difference. The symbols were lost in extraction.]
[Garbled in extraction: the quantity defined for a given $\tau$ is related to $L_\tau$ and $L_\tau^{pop}$, with the norm of a difference of two quantities subscripted $1,\tau$ and $2,\tau$ in the denominator.]
Implication of matrix perturbation theory (Davis-Kahan): up to a square-root factor whose argument was lost, the quantity is bounded by $\|L_\tau - L_\tau^{pop}\| / \mu_{2,\tau}$.
Implication of concentration of Laplacian (Oliveira (’10)): with high probability, $\|L_\tau - L_\tau^{pop}\|$ is bounded by a minimum of terms involving $\tau$, with a logarithmic factor; the exact expression is garbled.
Improvements using extension of techniques in Balakrishnan et al. (’11).
Let $d_n :=$ (definition lost in extraction). Set $\tau = d_n$.
Result (SBM with two blocks): the statement is garbled; the surviving symbols are $d_n$, √n log n and $\mu_{2,0}$.
Summary: (content lost in extraction)
Choice of regularization parameter
Recall: trade-offs dictated by $\|L_\tau - L_\tau^{pop}\| / \mu_{2,\tau}$
Choose $\tau$ using the estimate $\|L_\tau - \hat{L}_\tau^{pop}\| / \hat{\mu}_{2,\tau}$
Estimates based on estimated SBM (or degree corrected SBM)
Computing $\hat{L}_\tau^{pop}$, $\hat{\mu}_{2,\tau}$ for a given $\tau$
• Run regularized SC with parameter $\tau$ to get clusters $C_1, C_2$
• Estimate $p_1, p_2, q$ from $C_1, C_2$, e.g. $\hat{p}_1$ from the edges within $C_1$
• $\hat{P} = \begin{pmatrix} \hat{p}_1 & \hat{q} \\ \hat{q} & \hat{p}_2 \end{pmatrix}$ (block form)
• From $\hat{P}$, form $\hat{L}_\tau^{pop}$; $\hat{\mu}_{2,\tau}$ is the corresponding eigenvalue of $\hat{L}_\tau^{pop}$
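The estimation steps can be sketched end to end for K = 2. Several choices here are assumptions rather than the authors' procedure: block probabilities are estimated by plain block averages of $A$, the norm is the spectral norm, $\hat{\mu}_{2,\tau}$ is taken as the second-largest eigenvalue magnitude, and $\tau$ is picked by minimizing the ratio over a hypothetical grid. All function names are illustrative.

```python
import numpy as np

def laplacian(M):
    """D^{-1/2} M D^{-1/2} for a matrix M with positive row sums."""
    s = 1.0 / np.sqrt(M.sum(axis=1))
    return s[:, None] * M * s[None, :]

def top_eig(L, K):
    """K largest eigenvalue magnitudes (descending) and their eigenvectors."""
    vals, vecs = np.linalg.eigh(L)
    order = np.argsort(-np.abs(vals))[:K]
    return np.abs(vals[order]), vecs[:, order]

def two_means(X, n_iter=50):
    """2-means with deterministic farthest-point initialization."""
    centers = X[[0, np.argmax(((X - X[0]) ** 2).sum(axis=1))]]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        if labels.min() == labels.max():      # one cluster is empty: stop
            break
        centers = np.array([X[labels == k].mean(axis=0) for k in (0, 1)])
    return labels

def estimated_ratio(A, tau):
    """||L_tau - L_hat_pop_tau|| / mu_hat_{2,tau}, with P_hat from the clusters."""
    n = A.shape[0]
    L_tau = laplacian(A + tau / n)
    _, V = top_eig(L_tau, 2)
    labels = two_means(V)                     # clusters C1, C2
    B = np.zeros((2, 2))                      # estimated p1, p2, q
    for a in (0, 1):
        for b in (0, 1):
            block = A[np.ix_(labels == a, labels == b)]
            B[a, b] = block.mean() if block.size else 0.0
    P_hat = B[labels][:, labels]              # n x n, constant on blocks
    L_pop = laplacian(P_hat + tau / n)
    mu, _ = top_eig(L_pop, 2)
    return np.linalg.norm(L_tau - L_pop, 2) / mu[1]

# Toy two-block SBM sample (probabilities chosen for illustration only).
rng = np.random.default_rng(0)
z = np.repeat([0, 1], 100)
P = np.where(z[:, None] == z[None, :], 0.3, 0.1)
upper = np.triu(rng.random((200, 200)) < P, k=1)
A = (upper | upper.T).astype(float)

taus = [0.5, 1, 2, 5, 10, 20, 50]             # hypothetical grid
ratios = [estimated_ratio(A, t) for t in taus]
best_tau = taus[int(np.argmin(ratios))]
```

The within-block averages include the zero diagonal of `A`, which biases $\hat{p}_1, \hat{p}_2$ slightly downward; a more careful estimate would exclude it.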
Example
[Figure: two scatter plots of the second sample eigenvector against the first sample eigenvector, for τ = 0 and τ = 18; block probabilities shown: .003, .01, .0025, .0025. k-means success: 94% in one panel and 75% in the other.]
Political blog data
[Figure: histogram of degrees; horizontal axis from 0 to 400, vertical axis from 0 to 800]
Source: Adamic & Glance (’05)
Political blogs data set
Unregularized Spectral Clustering
[Figure: first eigenvector and second eigenvector, each plotted against nodes]
second eigenvector discriminates these from the remaining
[Figure: third eigenvector]
Regularized SC for political blogs dataset
[Figure: first eigenvector and second eigenvector, each plotted against nodes, for a value of τ lost in extraction]
13% of nodes misclassified with regularization, compared to 48% without
Comparing unregularized vs. regularized Spectral Clustering (SC)
Take K = 8
[Figure: two panels, one per value of τ (the τ values were lost in extraction), with labeled regions: dorsal epidermis, mesoderm, hind gut, ventral neurogenic region, procephalic neurogenic region, anterior mid-gut, pharynx, esophagus]
Summary
• Theoretical upper bound under SBM shows a “bias-variance”-like trade-off as the amount of regularization increases in SC
• Theoretical analysis motivates a practically useful scheme (using SBM or degree-corrected SBM) to select the regularization parameter in RSC
• Promising results in fruit fly image segmentation
Paper at (2014 rev): http://arxiv.org/pdf/1312.1733.pdf
Ongoing/future directions
Spectral Clustering (with Antony Joseph)
• Fast algorithm for computing the data-driven choice of regularization parameter
• Role of regularization in other scenarios, such as hierarchical clusters
• Regularization parameter choice for continuous data
The BDGP project (with Antony Joseph, Siqi Wu, Ann Hammonds, Sue Celniker, Erwin Frise)
• Analysis of gene interactions in different regions of early stage embryos
• Extension of analysis to later stage embryos