Learning multiple nonredundant clusterings

Post on 22-Feb-2016

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

Learning multiple nonredundant clusterings

Presenter: Wei-Hao Huang
Authors: Ying Cui, Xiaoli Z. Fern, Jennifer G. Dy

TKDD, 2010

N.Y.U.S.T.

I. M.

Outline

· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Motivation

· Data can have multiple groupings that are reasonable and interesting from different perspectives.
· Traditional clustering is restricted to finding a single clustering.

Objectives

• To propose a new clustering paradigm for finding all non-redundant clustering solutions of the data.

Methodology

· Orthogonal clustering
─ Cluster space
· Clustering in orthogonal subspaces
─ Feature space
· Automatically finding the number of clusters
· Stopping criteria

Orthogonal Clustering Framework

[Figure: X (Face dataset)]

Orthogonal clustering

· Residue space
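A minimal sketch of a residue step in the cluster space, assuming each point's residue is what remains after removing its component along the centroid of the cluster it was assigned to. The slide's own formula did not survive extraction, so the function names (`residue`, `residue_space`) and the toy data here are illustrative assumptions, not the authors' code.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residue(x, mu):
    """Remove from x its component along mu: x - (mu.x / mu.mu) * mu."""
    denom = dot(mu, mu)
    if denom == 0.0:
        return list(x)  # a zero centroid defines no direction to remove
    scale = dot(mu, x) / denom
    return [xi - scale * mi for xi, mi in zip(x, mu)]


def residue_space(points, labels, centroids):
    """Project every point away from the centroid of its own cluster."""
    return [residue(x, centroids[k]) for x, k in zip(points, labels)]


# Hypothetical toy data: the first clustering separates left from right.
points = [[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]]
labels = [0, 0, 1, 1]
centroids = [[2.0, 0.0], [-2.0, 0.0]]

residues = residue_space(points, labels, centroids)
```

In this toy case the left/right structure is removed and only the up/down structure remains, which a second clustering pass could then pick up.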

Clustering in orthogonal subspaces

· Feature space
─ Linear discriminant analysis (LDA)
─ Singular value decomposition (SVD)
─ LDA vs. SVD
· Projection: Y = A^T X
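The projection Y = A^T X can be sketched as a plain matrix product. This assumes X stores one data point per column (d features by n points) and A stores one basis vector per column (d by q); that layout, the helper names, and the toy matrices are assumptions made for illustration.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Hypothetical toy data: d = 3 features, n = 2 points stored as columns of X.
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]

# Hypothetical basis A (d x q) whose q = 2 columns span the first two axes.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]

# Y is q x n: each column holds one point's coordinates in the subspace.
Y = matmul(transpose(A), X)
```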

Clustering in orthogonal subspaces

· Residue space
─ A(t) = eigenvectors of
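A hedged sketch of a residue step in the feature space: given basis vectors for the subspace that captured the current clustering, project every point onto the orthogonal complement. It assumes the standard orthogonal projector P = I - Q Q^T, where Q is an orthonormal basis for the span of A; the matrix whose eigenvectors define A is not recoverable from the slide, so A is simply taken as given. All names and data are illustrative.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(columns):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in columns:
        w = list(v)
        for q in basis:
            coeff = dot(q, w)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis


def project_out(x, basis):
    """Apply P = I - Q Q^T to x, where `basis` lists the orthonormal columns of Q."""
    r = list(x)
    for q in basis:
        coeff = dot(q, x)
        r = [ri - coeff * qi for ri, qi in zip(r, q)]
    return r


# Hypothetical columns of A (not assumed orthonormal) and one data point.
A_columns = [[1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
x = [3.0, 1.0, 5.0]

Q = orthonormalize(A_columns)
x_residue = project_out(x, Q)  # approximately [1.0, -1.0, 0.0]
```

Orthonormalizing first avoids an explicit matrix inverse; for a full-column-rank A the result matches I - A (A^T A)^{-1} A^T applied to x.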

Comparing Method 1 and Method 2

· Residue space
· Method 1
· Method 2
· Method 1 is a special case of Method 2.
─ A(t) = eigenvectors of
─ If M' = M, then P1 = P2.
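The formulas on this slide were lost in extraction. As an illustration only, and assuming the feature-space residue uses the standard orthogonal projector onto the complement of the span of A, choosing A to be a single vector \mu reduces it to projecting away from that one direction. Whether this corresponds exactly to the slide's P1 and P2 is an assumption.

```latex
P \;=\; I - A\,(A^{\top}A)^{-1}A^{\top},
\qquad
A = \mu \;\Longrightarrow\; P \;=\; I - \frac{\mu\,\mu^{\top}}{\mu^{\top}\mu}.
```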

Experiments

· PCA is used to reduce dimensionality
· Clustering
─ K-means clustering: smallest SSE
─ Gaussian mixture model (GMM) clustering: largest maximum likelihood
· Datasets
─ Synthetic
─ Real-world: Face, WebKB text, Vowel phoneme, Digit
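One plausible reading of "smallest SSE" is that k-means is run from several random initializations and the run with the lowest sum of squared errors is kept. The sketch below shows that selection under this assumption; the restart count, seed, function names, and toy data are all illustrative and not taken from the slides.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, rng, iters=50):
    """One run of Lloyd's algorithm; returns (sse, labels, centroids)."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return sse, labels, centroids


def best_of_restarts(points, k, restarts=10, seed=0):
    """Keep the run with the smallest SSE across random restarts."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[0])


# Hypothetical toy data: two well-separated groups of three points.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]

best_sse, best_labels, best_centroids = best_of_restarts(points, k=2)
```

The same pattern would apply to a GMM by keeping the run with the largest log-likelihood instead of the smallest SSE.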

Experiments

· Evaluation

Experiments

· Synthetic dataset

Experiments

· Face dataset

Experiments

· WebKB dataset
· Vowel phoneme dataset

Experiments

· Digit dataset

Experiments

· Finding the number of clusters
─ K-means: gap statistic
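A rough sketch of choosing the number of clusters for k-means with the gap statistic, in its usual form: compare log(W_k) on the data with its average over reference datasets drawn uniformly from the data's bounding box, and pick the smallest k with Gap(k) >= Gap(k+1) - s(k+1). The reference distribution, restart counts, seed, and all names are assumptions for illustration; the slides do not give these details.

```python
import math
import random


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_sse(points, k, rng, iters=30):
    """SSE of one Lloyd's run started from k randomly chosen points."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[j].append(p)
        for j, members in enumerate(groups):
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def log_w(points, k, rng, restarts=5):
    """log of the best within-cluster SSE over a few restarts."""
    best = min(kmeans_sse(points, k, rng) for _ in range(restarts))
    return math.log(max(best, 1e-12))  # guard against log(0)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (chosen k, list of Gap(k) for k = 1..k_max)."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform over the bounding box of the real data.
            ref = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
                   for _ in points]
            ref_logs.append(log_w(ref, k, rng))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w(points, k, rng))
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    for k in range(1, k_max):
        # gaps[k - 1] is Gap(k); gaps[k] and errs[k] belong to k + 1.
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps


# Hypothetical toy data: two small blobs.
points = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
          [5.0, 5.0], [5.2, 5.1], [5.1, 5.3], [5.3, 5.2], [5.2, 5.4], [5.4, 5.1]]

k_opt, gaps = gap_statistic(points, k_max=4)
```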

Experiments

· Finding the number of clusters
─ GMM: BIC
· Stopping criteria
─ SSE is less than 10% of its value at the first iteration
─ Kopt = 1
· If Kopt > Kmax, select Kmax
─ Gap statistic
─ BIC: maximize the value of BIC
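The selection and stopping rules on this slide can be sketched as three small functions. This assumes "less than 10%" means the current SSE relative to the SSE at the first iteration, and that the two stopping conditions are alternatives (either one stops the loop). The function names and the BIC scores below are made up for illustration.

```python
def select_k_by_bic(bic_by_k):
    """Return the k with the largest BIC (the slide's convention: maximize BIC)."""
    return max(bic_by_k, key=bic_by_k.get)


def clamp_k(k_opt, k_max):
    """If the selected k exceeds k_max, fall back to k_max."""
    return min(k_opt, k_max)


def should_stop(sse_current, sse_first, k_opt, ratio=0.10):
    """Stop when little structure is left or only one cluster is found."""
    return sse_current < ratio * sse_first or k_opt == 1


# Made-up BIC scores per k, for illustration only.
toy_bic = {1: -520.0, 2: -480.0, 3: -495.0}
k_opt = clamp_k(select_k_by_bic(toy_bic), k_max=10)
```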

Experiments

· Synthetic dataset

Experiments

· Face dataset

Experiments

· WebKB dataset

Conclusions

• The methods discover varied, interesting, and meaningful clustering solutions.
• Method 2 can be used with any clustering and dimensionality reduction algorithm.

Comments

· Advantages
─ Finds multiple non-redundant clustering solutions
· Applications
─ Data clustering
