On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering Chris Ding, Xiaofeng He, Horst D. Simon Published on SDM 05’ Hongchang Gao
On the Equivalence of Nonnegative Matrix Factorization and …ranger.uta.edu/~heng/CSE6389_15_slides/On the Equivalence... · 2015. 2. 26. · the Equivalence of Nonnegative Matrix

Jan 21, 2021

Page 1

On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering

Chris Ding, Xiaofeng He, Horst D. Simon

Published at SDM '05

Hongchang Gao

Presenter
Presentation Notes
SIAM International Conference on Data Mining
Page 2

Outline

• NMF
• NMF<=>Kmeans
• NMF<=>Spectral Clustering
• NMF<=>Bipartite graph Kmeans

Page 4

NMF

• Paatero and Tapper (1994)
  – Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values, Environmetrics

• Lee and Seung (1999, 2000)
  – Learning the parts of objects by non-negative matrix factorization, Nature
  – Algorithms for non-negative matrix factorization, NIPS

Presenter
Presentation Notes
Paatero published his initial factorization algorithms in 1994. However, Paatero's work is rarely cited by subsequent authors. This is partly due to his unfortunate choice of the name "positive matrix factorization", which is misleading because his algorithms produce a nonnegative matrix factorization. Since then, a great deal of work has been devoted to NMF algorithms. Today, we will focus on the relationship between NMF, Kmeans, and spectral clustering.
Page 5

NMF

• Matrix factorization is widely used in machine learning, e.g. the SVD
  – interpretation of basis vectors is difficult due to mixed signs

$$X_{\text{nonneg}} = U_{\text{mixed}} \, \Sigma \, V_{\text{mixed}}^T$$

Presenter
Presentation Notes
The SVD yields basis vectors that are mixed in sign, and the negative elements make interpretation difficult, although the SVD has many strengths.
Page 6

NMF

• Nonnegative Matrix Factorization

$$X = F_{\text{nonneg}} \, G_{\text{nonneg}}^T$$

  – where $X \in R^{d \times n}$, $F \in R^{d \times k}$, $G \in R^{n \times k}$
  – columns of F are the underlying basis vectors
  – rows of G give the weights associated with each basis vector

Presenter
Presentation Notes
Each column of X can be built from the k columns of F.
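As a rough illustration of the factorization on this slide, the NumPy sketch below fits X ≈ F Gᵀ using the multiplicative updates of Lee and Seung (2000). The function name `nmf`, the iteration count, the random initialization, and the small `eps` guard against division by zero are illustrative assumptions, not something the slides specify.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate nonnegative X (d x n) as F @ G.T with F (d x k), G (n x k) >= 0."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    F = rng.random((d, k)) + eps
    G = rng.random((n, k)) + eps
    for _ in range(n_iter):
        # Multiplicative updates: every factor stays nonnegative by construction.
        F *= (X @ G) / (F @ (G.T @ G) + eps)
        G *= (X.T @ F) / (G @ (F.T @ F) + eps)
    return F, G

rng = np.random.default_rng(1)
X = rng.random((6, 3)) @ rng.random((3, 8))   # nonnegative data of rank at most 3
F, G = nmf(X, k=3)
rel_err = np.linalg.norm(X - F @ G.T) / np.linalg.norm(X)
```

Each column of `X` is approximated by a nonnegative combination of the columns of `F`, with the weights taken from the matching row of `G`.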
Page 7

Outline

• NMF
• NMF<=>Kmeans
• NMF<=>Spectral
• NMF<=>Bipartite graph Kmeans

Page 8

Kmeans

• Kmeans clustering is one of the most widely used clustering methods.

Page 9

Kmeans

• Reformulate Kmeans Clustering

• Cluster membership indicators:

Page 10

Kmeans

• Objective function

• Set $W = X^T X$, which is the standard inner-product linear kernel matrix

Presenter
Presentation Notes
Kernel K-means aims at maximizing within-cluster similarities. Its advantage is that it can describe data distributions more complicated than Gaussian distributions.
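The equations for the cluster membership indicators and the reformulated objective did not survive in this transcript. The sketch below assumes the usual normalized indicator (entry 1/sqrt(n_k) when a point belongs to cluster k, 0 otherwise) and checks numerically that the Kmeans objective then equals Tr(W) − Tr(Hᵀ W H) with W = Xᵀ X. The helper name `indicator_matrix` and the toy data are hypothetical.

```python
import numpy as np

def indicator_matrix(labels, k):
    """Normalized indicators: H[i, c] = 1/sqrt(n_c) if point i is in cluster c, else 0."""
    H = np.zeros((len(labels), k))
    H[np.arange(len(labels)), labels] = 1.0
    return H / np.sqrt(H.sum(axis=0))    # assumes no cluster is empty

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 10))             # columns are data points
labels = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
H = indicator_matrix(labels, k=3)
W = X.T @ X                              # linear kernel matrix

# Direct Kmeans objective: squared distances of points to their cluster centroid.
J_direct = sum(
    np.sum((X[:, labels == c] - X[:, labels == c].mean(axis=1, keepdims=True)) ** 2)
    for c in range(3)
)
# Trace form of the same objective.
J_trace = np.trace(W) - np.trace(H.T @ W @ H)
```

Because Tr(W) does not depend on the clustering, minimizing the Kmeans objective amounts to maximizing Tr(Hᵀ W H) over such indicator matrices.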
Page 11

Kernel Kmeans

• Map x to higher dimension space:

• Kernel Kmeans objective:

Page 12

NMF<=>Kmeans

• Orthogonal symmetric NMF is equivalent to Kernel Kmeans clustering

Page 13

Kernel Kmeans=>Symmetric NMF

• The factorization $W = HH^T$ is equivalent to Kernel K-means clustering with the strict orthogonality relaxed

$$
\begin{aligned}
H &= \arg\max_{H^T H = I,\; H \ge 0} \operatorname{Tr}(H^T W H) \\
  &= \arg\min_{H^T H = I,\; H \ge 0} -2 \operatorname{Tr}(H^T W H) \\
  &= \arg\min_{H^T H = I,\; H \ge 0} \|W\|^2 - 2 \operatorname{Tr}(H^T W H) + \|H^T H\|^2 \\
  &= \arg\min_{H^T H = I,\; H \ge 0} \|W - HH^T\|^2
\end{aligned}
$$

Relaxing the orthogonality $H^T H = I$ completes the proof
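The step from the trace objective to the factorization objective relies on two quantities being constant under the constraint: ‖W‖² does not depend on H, and ‖Hᵀ H‖² = k when Hᵀ H = I. A small numerical check of the expansion, using a random symmetric W and a normalized indicator H (both illustrative assumptions), might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3
A = rng.random((n, n))
W = A @ A.T                                   # symmetric similarity matrix

labels = np.array([0, 0, 0, 1, 1, 2, 2, 2])
H = np.zeros((n, k))
H[np.arange(n), labels] = 1.0
H /= np.sqrt(H.sum(axis=0))                   # nonnegative with H^T H = I

# ||W - H H^T||^2 expanded term by term (Frobenius norms throughout).
lhs = np.linalg.norm(W - H @ H.T) ** 2
rhs = np.linalg.norm(W) ** 2 - 2 * np.trace(H.T @ W @ H) + np.linalg.norm(H.T @ H) ** 2
gram_norm_sq = np.linalg.norm(H.T @ H) ** 2   # equals k under orthogonality
```

Since the first and last terms are fixed, maximizing Tr(Hᵀ W H) and minimizing ‖W − HHᵀ‖² select the same H on the constraint set.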

Page 14

Symmetric NMF=> Kernel Kmeans

• The factorization $W = HH^T$ retains the orthogonality of H approximately.
  – Proof. $\min \|W - HH^T\|^2$ is equivalent to

$$\max_{H \ge 0} \operatorname{Tr}(H^T W H), \qquad \min_{H \ge 0} \|H^T H\|^2$$

  – The first one recovers the Kernel Kmeans objective

Page 15

Symmetric NMF=> Kernel Kmeans

• The second one
  – Minimizing the first term, we get

  – Minimizing the second term

• We should make sure that H cannot be all zero
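The two terms referred to on this slide are not preserved in the transcript. A plausible reading, assuming they come from splitting ‖Hᵀ H‖² into its off-diagonal and diagonal parts (with h_k denoting the k-th column of H), is the identity:

```latex
\|H^T H\|^2 = \sum_{l \ne k} \left(h_l^T h_k\right)^2 + \sum_{k} \|h_k\|^4
```

Under this reading, minimizing the first sum drives $h_l^T h_k$ toward 0 for $l \ne k$, which is approximate orthogonality. Minimizing the second sum shrinks the column norms, which would explain why H has to be kept away from the all-zero solution.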

Page 16

Outline

• NMF
• NMF<=>Kmeans
• NMF<=>Spectral
• NMF<=>Bipartite graph Kmeans

Page 17

Spectral Clustering

• Spectral clustering objective functions

Page 18
Page 19

Spectral Clustering

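• Reformulate the objective based on Ncut

$$J = \frac{1}{K} \sum_{l=1}^{K} \frac{\operatorname{cut}(V_l, V - V_l)}{\operatorname{vol}(V_l)} = \frac{1}{K} \sum_{l=1}^{K} \frac{h_l^T (D - W) h_l}{h_l^T D h_l}$$

• Replace

$$z_l = \frac{D^{1/2} h_l}{\|D^{1/2} h_l\|}$$

• Then, with $\tilde{W} = D^{-1/2} W D^{-1/2}$,

$$J = \frac{1}{K} \sum_{l=1}^{K} z_l^T (I - \tilde{W}) z_l = \frac{1}{K} \left( \sum_{l=1}^{K} z_l^T z_l - \operatorname{Tr}(Z^T \tilde{W} Z) \right)$$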
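The change of variables on this slide can be checked on a toy graph. The sketch below assumes that D is the diagonal degree matrix, that h_l is the 0/1 indicator of cluster V_l, that the normalized affinity is D^{-1/2} W D^{-1/2}, and that the objective carries a 1/K factor as the slide appears to show. The random affinity matrix and the cluster labels are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 9, 3
A = rng.random((n, n))
W = (A + A.T) / 2                       # symmetric nonnegative affinities
np.fill_diagonal(W, 0.0)
d = W.sum(axis=1)                       # node degrees
D = np.diag(d)

labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
H = np.zeros((n, K))
H[np.arange(n), labels] = 1.0           # unnormalized indicators h_l

# Objective written as ratios h_l^T (D - W) h_l / (h_l^T D h_l).
J_ratio = sum(
    (H[:, l] @ (D - W) @ H[:, l]) / (H[:, l] @ D @ H[:, l]) for l in range(K)
) / K

# Same objective through z_l = D^{1/2} h_l / ||D^{1/2} h_l||.
Z = np.sqrt(d)[:, None] * H
Z /= np.linalg.norm(Z, axis=0)
W_tilde = W / np.sqrt(np.outer(d, d))   # D^{-1/2} W D^{-1/2}
J_trace = (np.trace(Z.T @ Z) - np.trace(Z.T @ W_tilde @ Z)) / K
```

Because Zᵀ Z = I here, the first trace is just K, so minimizing J is the same as maximizing Tr(Zᵀ W̃ Z).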

Page 20

NMF<=>Spectral Clustering

• The objective of spectral clustering

$$\max_{Z^T Z = I,\; Z \ge 0} \operatorname{Tr}(Z^T \tilde{W} Z)$$

• This is identical to the Kernel Kmeans clustering objective

• Spectral Clustering <=> Kernel Kmeans <=> NMF

Page 21

Outline

• NMF
• NMF<=>Kmeans
• NMF<=>Spectral
• NMF<=>Bipartite graph Kmeans

Page 22

Bipartite graph Kmeans

• Simultaneous clustering of rows and columns

Page 23

Bipartite graph Kmeans

• Simultaneously cluster the rows and columns of the data matrix $B = (x_1, x_2, \dots, x_n)$

• Row Clustering

$$\max_{F^T F = I,\; F \ge 0} \operatorname{Tr}(F^T B B^T F)$$

• Column Clustering

$$\max_{G^T G = I,\; G \ge 0} \operatorname{Tr}(G^T B^T B G)$$

Presenter
Presentation Notes
For example, in document clustering, documents are data points and words are features. Data points can be grouped based on their features, and features can be grouped based on the data points in which they occur.
Page 24

Bipartite graph Kmeans

• Equivalent problem:

• Solution

$$B g_k = \lambda_k f_k, \qquad B^T f_k = \lambda_k g_k$$

• Then,

$$B^T B g_k = \lambda_k^2 g_k, \qquad B B^T f_k = \lambda_k^2 f_k$$

Presenter
Presentation Notes
The solution of this quadratic optimization is given by the first K eigenvectors. This proves that Eq.(20) is the objective for simultaneous row and column K-means clustering.
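The relations between f_k, g_k and λ_k on this slide are those satisfied by singular vector pairs, so they can be verified with `numpy.linalg.svd`. This is a sketch on a random matrix and ignores the nonnegativity constraint, since singular vectors are generally mixed in sign. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.random((5, 7))

# Thin SVD: B = U @ diag(s) @ Vt; columns of U act as f_k, rows of Vt as g_k.
U, s, Vt = np.linalg.svd(B, full_matrices=False)
k = 0
f_k, g_k, lam = U[:, k], Vt[k], s[k]

ok_right = np.allclose(B @ g_k, lam * f_k)            # B g_k = lambda_k f_k
ok_left = np.allclose(B.T @ f_k, lam * g_k)           # B^T f_k = lambda_k g_k
ok_cols = np.allclose(B.T @ B @ g_k, lam**2 * g_k)    # eigenvector of B^T B
ok_rows = np.allclose(B @ B.T @ f_k, lam**2 * f_k)    # eigenvector of B B^T
```

The last two checks correspond to the relaxed column-clustering and row-clustering problems, respectively.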
Page 25

Bipartite graph Kmeans=>NMF

• The simultaneous row and column Kmeans clustering is equivalent to the following optimization problem

Page 26

Bipartite graph Kmeans=>NMF

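• Proof.

$$
\begin{aligned}
& \max_{F,G} \operatorname{Tr}(F^T B G) \\
\Rightarrow\; & \min_{F,G} -\operatorname{Tr}(F^T B G) \\
\Rightarrow\; & \min_{F,G} \|B\|^2 - 2 \operatorname{Tr}(F^T B G) + \operatorname{Tr}(F^T F G^T G) \\
\Rightarrow\; & \min_{F,G} \|B - FG^T\|^2
\end{aligned}
$$

• Therefore, NMF is equivalent to Kmeans clustering with relaxed orthogonality constraints.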
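The chain of implications works because, when F and G both have orthonormal columns, the term Tr(Fᵀ F Gᵀ G) is a constant (the number of clusters) and ‖B‖² does not depend on F or G. The sketch below checks the expansion with normalized row and column indicators. The helper `indicators` and the cluster assignments are hypothetical.

```python
import numpy as np

def indicators(labels, k):
    """Normalized cluster indicators with orthonormal, nonnegative columns."""
    H = np.zeros((len(labels), k))
    H[np.arange(len(labels)), labels] = 1.0
    return H / np.sqrt(H.sum(axis=0))

rng = np.random.default_rng(0)
B = rng.random((5, 7))
F = indicators(np.array([0, 0, 1, 1, 2]), 3)          # row clusters
G = indicators(np.array([0, 0, 0, 1, 1, 2, 2]), 3)    # column clusters

lhs = np.linalg.norm(B - F @ G.T) ** 2
cross = np.trace(F.T @ B @ G)
const = np.trace(F.T @ F @ G.T @ G)                   # equals 3 when F, G are orthonormal
rhs = np.linalg.norm(B) ** 2 - 2 * cross + const
```

With the first and last terms fixed, a larger `cross` term always means a smaller reconstruction error.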

Page 27

NMF=>Bipartite graph Kmeans

• In the previous slides, we assumed that both F and G are orthogonal. If only one of them is orthogonal, we can write $\|B - FG^T\|^2$ explicitly as a Kmeans clustering objective function.

• NMF with orthogonal G is identical to Kmeans clustering of the columns of B.

Page 28

NMF=>Bipartite graph Kmeans

• Proof.
  – First, normalize the rows of G, s.t.

– Then, for the objective function

– We have

Page 29

NMF=>Bipartite graph Kmeans

• The orthogonality condition of G implies that in each row of G, only one element is nonzero and $g_{ik} = 0, 1$

• Summing over i:

$$J = \sum_{k=1}^{K} \sum_{i \in C_k} \|x_i - f_k\|^2$$

  – which is the Kmeans clustering objective
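To make the last step concrete, the sketch below builds a 0/1 matrix G with exactly one nonzero per row, takes the columns of F to be the cluster centroids, and checks that the NMF objective ‖X − FGᵀ‖² coincides with the Kmeans sum of squared distances. The data, the labels, and the choice of centroids for F are illustrative assumptions. Here G has orthogonal columns but is not rescaled to unit norm.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((4, 9))                           # columns x_i are data points
labels = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2])
K = 3

G = np.zeros((X.shape[1], K))
G[np.arange(X.shape[1]), labels] = 1.0           # g_ik in {0, 1}, one nonzero per row
F = (X @ G) / G.sum(axis=0)                      # columns f_k are cluster centroids

# Column i of F @ G.T is the centroid of the cluster that x_i belongs to.
J_nmf = np.linalg.norm(X - F @ G.T) ** 2
J_kmeans = sum(np.sum((X[:, labels == k] - F[:, [k]]) ** 2) for k in range(K))
```

The equality holds for any F, not only for centroids; centroids are simply the choice of F that minimizes the objective for a fixed G.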

Page 30

Reference

• Ding, Chris HQ, Xiaofeng He, and Horst D. Simon. "On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering." SDM. Vol. 5. 2005.

• Li, Tao, and Chris Ding. "The relationships among various nonnegative matrix factorization methods for clustering." Data Mining, 2006. ICDM'06. Sixth International Conference on. IEEE, 2006.

• Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4 (2007): 395-416.

• Shi, Jianbo, and Jitendra Malik. "Normalized cuts and image segmentation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 22.8 (2000): 888-905.

Page 31

Thanks