(Semi-)Nonnegative Matrix Factorization and K-means Clustering

Chris Ding, Lawrence Berkeley National Laboratory
Xiaofeng He, Lawrence Berkeley Nat'l Lab
Horst Simon, Lawrence Berkeley Nat'l Lab
Tao Li, Florida Int'l Univ.
Michael Jordan, UC Berkeley
Haesun Park, Georgia Tech
Slide 2
Nonnegative Matrix Factorization (NMF)

Data matrix, n points in p dimensions: $X = (x_1, x_2, \ldots, x_n)$, where each $x_i$ is an image, document, webpage, etc.

Decomposition (low-rank approximation): $X \approx FG^T$, with $F = (f_1, f_2, \ldots, f_k)$ and $G = (g_1, g_2, \ldots, g_k)$.

Nonnegative matrices: $X_{ij} \ge 0$, $F_{ij} \ge 0$, $G_{ij} \ge 0$.
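The decomposition above can be sketched with scikit-learn's NMF (an assumed tool choice, on synthetic data; the slides do not specify an implementation). scikit-learn factors a matrix of shape (n_samples, n_features), so we pass $X^T$ and read off $G$ and $F$ from its two factors:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
p, n, k = 20, 100, 3
X = rng.random((p, n))  # synthetic nonnegative data: n points as columns

# scikit-learn factors A ≈ W H with A of shape (n_samples, n_features),
# so we pass X.T and identify G = W (n x k) and F = H.T (p x k),
# recovering the slide's X ≈ F G^T.
model = NMF(n_components=k, init="random", random_state=0, max_iter=500)
G = model.fit_transform(X.T)   # n x k
F = model.components_.T        # p x k

err = np.linalg.norm(X - F @ G.T)
print(F.shape, G.shape)
```

Both factors come out entrywise nonnegative, and the rank-k product $FG^T$ approximates $X$.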
Slide 3
Some historical notes
• Earlier work by statisticians (G. Golub)
• P. Paatero (1994), Environmetrics
• Lee and Seung (1999, 2000)
  – Parts-of-whole representation (no cancellation)
  – A multiplicative update algorithm
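Lee and Seung's multiplicative update algorithm (for the Frobenius-norm objective) can be sketched in NumPy; the small `eps` and the iteration count are illustrative choices, not values from the talk:

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for V ≈ W H under the Frobenius norm.

    Factors stay nonnegative because they start nonnegative and each
    update multiplies by a ratio of nonnegative quantities (no cancellation).
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.default_rng(1).random((30, 50))
W, H = nmf_multiplicative(V, k=5)
err = np.linalg.norm(V - W @ H)
print(round(float(err), 3))
```

The updates never subtract, which is exactly why the representation is additive, parts-of-whole.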
Slide 4
Pixel vector

[Figure: an image unrolled into a single column vector of nonnegative pixel intensities, shown as a bracketed column of values]
Slide 5
Lee and Seung (1999): Parts-based Perspective

$X \approx FG^T$, where $X = (x_1, x_2, \ldots, x_n)$, $F = (f_1, f_2, \ldots, f_k)$, $G = (g_1, g_2, \ldots, g_k)$.

[Figure: panel labeled "original" showing the images being factorized]
Slide 6
"Parts of Whole" Picture

$X \approx FG^T$, $F = (f_1, f_2, \ldots, f_k)$

Straightforward NMF doesn't give a parts-based picture. Several people explicitly sparsify $F$ to get a parts-based picture (Li et al., 2001; Hoyer, 2003). Donoho & Stodden (2003) study conditions for parts-of-whole.
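One common way to sparsify the basis factor is to fold an L1 penalty into the multiplicative update, in the spirit of Hoyer (2003). A sketch (the penalty weight `lam` is an illustrative assumption, not a value from the talk; `W` below plays the role of the slide's $F$):

```python
import numpy as np

def sparse_nmf(V, k, lam=0.0, n_iter=300, eps=1e-9, seed=0):
    """Multiplicative updates with an L1 penalty (weight lam) on the basis
    factor W, i.e. roughly min ||V - W H||^2 + lam * sum(W)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + lam + eps)  # lam shrinks W's entries
    return W, H

V = np.random.default_rng(2).random((40, 60))
W_plain, _ = sparse_nmf(V, k=5, lam=0.0)
W_sparse, _ = sparse_nmf(V, k=5, lam=0.5)
print(float(W_plain.sum()), float(W_sparse.sum()))  # compare total mass of the two factors
```

The penalty term in the denominator shrinks every entry of $W$ at each step, driving small entries toward zero and producing more parts-like basis vectors.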
Slide 7
Meanwhile … a number of studies empirically show the usefulness of NMF for pattern discovery/clustering: Xu et al. (SIGIR '03), Brunet et al. (PNAS '04), and many others.

We claim: NMF factors give holistic pictures of the data.
Slide 8
Our Experiments: NMF gives holistic pictures
Slide 9
Our Experiments: NMF gives holistic pictures
Slide 10
Task: Prove NMF is doing "Data Clustering"
NMF => K-means Clustering
Slide 11
NMF-Kmeans Theorem (Ding, He, Simon, SDM 2005)

$G$-orthogonal NMF is equivalent to relaxed K-means clustering:

$$\min_{F \ge 0,\; G \ge 0,\; G^T G = I} \| X - FG^T \|^2 \;=\; \min_{G \ge 0,\; G^T G = I} \operatorname{Tr}\!\left( X^T X - G^T X^T X G \right)$$
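The equality rests on the identity $\|X - FG^T\|^2 = \operatorname{Tr}(X^T X) - \operatorname{Tr}(G^T X^T X G)$ when $F = XG$ and $G^T G = I$. It can be checked numerically with a normalized cluster-indicator $G$ (synthetic data and made-up labels, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 5, 12, 3
X = rng.random((p, n))

# Normalized cluster indicator: G[i, c] = 1/sqrt(n_c) if point i is in cluster c,
# which makes G^T G = I (the orthogonality constraint in the theorem).
labels = rng.permutation(np.arange(n) % k)   # made-up labels, each cluster size 4
G = np.zeros((n, k))
for c in range(k):
    idx = np.where(labels == c)[0]
    G[idx, c] = 1.0 / np.sqrt(len(idx))

F = X @ G                                    # optimal F for a fixed orthogonal G
lhs = np.linalg.norm(X - F @ G.T) ** 2
rhs = np.trace(X.T @ X) - np.trace(G.T @ X.T @ X @ G)
print(np.isclose(lhs, rhs))                  # → True: the two objectives coincide
```

Since $\operatorname{Tr}(X^T X)$ is a constant of the data, minimizing the NMF residual over such $G$ is the same as maximizing $\operatorname{Tr}(G^T X^T X G)$, the relaxed K-means objective.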
Slide 12
K-means clustering

• Also called "isodata" or "vector quantization"
• Developed in the 1960s (Lloyd, MacQueen, Hartigan, etc.)
• Computationally efficient (order mN)
• Most widely used in practice

NMF: A new/rich paradigm for unsupervised learning

• Advantage: hard/soft clustering
• Convex-NMF enforces the notion of cluster centroids and is naturally sparse
Slide 38
References
• On the Equivalence of Nonnegative Matrix Factorization and K-means/Spectral Clustering. Chris Ding, Xiaofeng He, Horst Simon. SDM 2005.
• Convex and Semi-Nonnegative Matrix Factorization. Chris Ding, Tao Li, Michael Jordan. Submitted.
• Orthogonal Non-negative Matrix Tri-Factorization for Clustering. Chris Ding, Tao Li, Wei Peng, Haesun Park. KDD 2006.
• Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square and a Hybrid Algorithm. Chris Ding, Tao Li, Wei Peng. AAAI 2006.