Page 1

Unsupervised Learning

Dr Khurram Khurshid

Pattern Recognition

Page 2

CLUSTERING

Clustering is the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often according to some defined distance measure.

Clustering is unsupervised classification

Page 3

CLUSTERING

• There is no explicit teacher; the system forms clusters, or "natural groupings", that reflect structure in the input patterns

Page 4

CLUSTERING

• Data WITHOUT classes or labels

• Deals with finding structure in a collection of unlabeled data.

• The process of organizing objects into groups whose members are similar in some way

• A cluster is therefore a collection of objects that are "similar" to one another and "dissimilar" to the objects belonging to other clusters.

The data set is the collection of samples $x_1, x_2, x_3, \ldots, x_n$, with each $x_i \in \mathbb{R}^d$.

Page 5

CLUSTERING

In this case we easily identify the 4 clusters into which the data can be divided

The similarity criterion is distance: two or more objects belong to the same cluster if they are “close” according to a given distance
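To make the distance criterion concrete, here is a minimal Python sketch (my illustration, not from the slides); the threshold value is an arbitrary assumption:

```python
import math

def euclidean(p, q):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two objects are "close" (candidates for the same cluster) if their
# distance falls below a chosen threshold.
THRESHOLD = 1.5  # arbitrary illustrative value

a, b = (1.0, 1.0), (2.0, 1.0)
print(euclidean(a, b))              # 1.0
print(euclidean(a, b) < THRESHOLD)  # True: "close" under this criterion
```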

Page 6

Types of Clustering

• Hierarchical algorithms – These find successive clusters using previously established clusters.

1. Agglomerative ("bottom-up"): Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters.

2. Divisive ("top-down"): Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.

Page 7

Types of Clustering

• Partitional clustering
– Constructs a partition of the data set to produce several clusters at once
– The process is repeated iteratively until a termination condition is met
– Examples:

• K-means clustering

• Fuzzy c-means clustering

Page 8

K MEANS CLUSTERING

Pages 9–17

K MEANS – example 1

(Sequence of figures stepping through the K-means iterations; images not reproduced in the transcript.)

Page 18

K MEANS – Example 2

• Suppose we have 4 medicines, each with two attributes (weight and pH-index). Our goal is to group these objects into K = 2 groups of medicines.

Medicine   Weight   pH-Index
A          1        1
B          2        1
C          4        3
D          5        4

(Figure: the four points A, B, C, D plotted in the weight / pH-index plane.)

Page 19

K MEANS – Example 2

Step 1: Compute the similarity between all samples and K centroids

Initial centroids: $c_1 = A = (1, 1)$, $c_2 = B = (2, 1)$

Euclidean distance:

$d(D, c_1) = \sqrt{(5-1)^2 + (4-1)^2} = 5$

$d(D, c_2) = \sqrt{(5-2)^2 + (4-1)^2} = 4.24$

Assign each object to the cluster with the nearest seed point
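As a check on these numbers, a short NumPy sketch (my own illustration, not code from the slides) reproduces the Step 1 distances and the nearest-centroid assignment:

```python
import numpy as np

# Medicines A, B, C, D as (weight, pH-index) rows, from the table on page 18.
X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)

# Initial centroids: c1 = A, c2 = B.
centroids = X[:2]

# Entry (i, j) is the Euclidean distance from centroid i to object j.
D = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2)
print(np.round(D, 2))
# [[0.   1.   3.61 5.  ]
#  [1.   0.   2.83 4.24]]

# Assign each object to the nearest centroid (0 -> c1, 1 -> c2).
print(D.argmin(axis=0))  # [0 1 1 1], i.e. A -> c1; B, C, D -> c2
```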

Page 20

K MEANS – Example 2

Step 2 - Assign the sample to its closest cluster

Each element of the group matrix below is 1 if and only if the object is assigned to that group:

G^1 = | 1 0 0 0 |   (rows: clusters c1, c2; columns: objects A, B, C, D)
      | 0 1 1 1 |

Page 21

K MEANS – Example 2

Step 3: Re-calculate the K-centroids

Knowing the members of each cluster, now we compute the new centroid of each group based on these new memberships.

$c_1 = (1, 1)$

$c_2 = \left(\frac{2+4+5}{3}, \frac{1+3+4}{3}\right) = \left(\frac{11}{3}, \frac{8}{3}\right) \approx (3.67, 2.67)$

Page 22

K MEANS – Example 2

• Step 4 – Repeat the above steps

Compute the distance of all objects to the new centroids:

          A     B     C     D
d(·, c1)  0     1     3.61  5       with c1 = (1, 1)
d(·, c2)  3.14  2.36  0.47  1.89    with c2 = (11/3, 8/3)

Page 23

K MEANS – Example 2

• Step 4 – Repeat the above steps

Assign the membership to objects: A and B are now closest to c1, while C and D are closest to c2:

G^2 = | 1 1 0 0 |   (rows: clusters c1, c2; columns: objects A, B, C, D)
      | 0 0 1 1 |

Page 24

K MEANS – Example 2

• Step 4 – Repeat the above steps. Knowing the members of each cluster, we compute the new centroid of each group based on the new memberships:

$c_1 = \left(\frac{1+2}{2}, \frac{1+1}{2}\right) = (1.5, 1)$

$c_2 = \left(\frac{4+5}{2}, \frac{3+4}{2}\right) = (4.5, 3.5)$

Page 25

K MEANS – Example 2

Page 26

K MEANS – Example 2

Page 27

K MEANS – Example 2

• We obtain the same grouping as before. Comparing the grouping of the last iteration with this iteration reveals that no object moves to a different group.

• Thus, the K-means computation has reached stability and no more iterations are needed.
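Putting Steps 1–4 together, a minimal K-means sketch in Python (my illustration, not code from the slides) reproduces Example 2 and ends with the centroids c1 = (1.5, 1) and c2 = (4.5, 3.5):

```python
import numpy as np

def kmeans(X, init_centroids, max_iter=100):
    """Plain K-means: alternate assignment and centroid update until
    no object changes group."""
    centroids = init_centroids.astype(float)
    labels = None
    for _ in range(max_iter):
        # Steps 1-2: distances to each centroid, then nearest-centroid assignment.
        D = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2)
        new_labels = D.argmin(axis=0)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # no object moved: converged
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return centroids, labels

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)  # A, B, C, D
centroids, labels = kmeans(X, init_centroids=X[:2])
print(centroids)  # [[1.5 1. ] [4.5 3.5]]
print(labels)     # [0 0 1 1] -> {A, B} and {C, D}
```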

Page 28

http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html

Page 29

Kmeans - Examples

D. Comaniciu and P. Meer, Robust Analysis of Feature Spaces: Color Image Segmentation, 1997.

• Data points – RGB values of pixels

• Can be used for image segmentation (see the sketch below)
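A hedged sketch of the idea (my illustration; the file names are placeholders, and scikit-learn's KMeans is one convenient implementation rather than the method of the cited paper):

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# Treat every pixel's RGB value as a data point.
img = np.asarray(Image.open("input.jpg").convert("RGB"))  # placeholder file name
pixels = img.reshape(-1, 3).astype(float)

# Cluster the colors; each cluster center is a representative color.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pixels)

# Replace each pixel by its cluster center to obtain the segmented image.
segmented = km.cluster_centers_[km.labels_].reshape(img.shape).astype(np.uint8)
Image.fromarray(segmented).save("segmented_k5.png")  # placeholder file name
```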

Page 30

Kmeans - Examples

(Figures: original image, K-means segmentation with K=5, and with K=11.)

Page 31

Kmeans - Examples

• Quantization of colors

Page 32

Kmeans - Examples

• Extraction of text in degraded documents

(Figures: original image; K-means result with k=3.)

Page 33

Kmeans - Examples

TASK

Page 34

Hierarchical clustering

• Agglomerative and divisive clustering on the data set {a, b, c, d, e}

(Figure: agglomerative clustering proceeds from Step 0 to Step 4, merging {a}, {b}, {c}, {d}, {e} into {a, b} and {d, e}, then {c, d, e}, and finally {a, b, c, d, e}; divisive clustering traverses the same tree in the opposite direction, from Step 0 to Step 4.)

Page 35

Agglomerative clustering

1. Convert object attributes to a distance matrix
2. Set each object as a cluster (thus, if we have N objects, we start with N clusters)
3. Repeat until the number of clusters is one (or a known number of clusters), as sketched in the code below:
   a. Merge the two closest clusters
   b. Update the distance matrix

(Figure: d1 and d2 merge into {d1, d2}; d4 and d5 merge into {d4, d5}; d3 then joins to form {d3, d4, d5}.)
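A naive Python sketch of this loop (my illustration), using single-link merging and made-up points:

```python
import numpy as np

def agglomerative(X, n_clusters=1):
    """Single-link agglomerative clustering: start with one cluster per
    object and repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(X))]              # step 2: one cluster per object
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # step 1: distance matrix
    while len(clusters) > n_clusters:                    # step 3: repeat until done
        best = (0, 1, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single link: smallest pointwise distance between the clusters.
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a] += clusters.pop(b)                   # step 3a: merge closest pair
    return clusters

X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [9, 9]], dtype=float)
print(agglomerative(X, n_clusters=3))  # [[0, 1], [2, 3], [4]]
```

For simplicity this version recomputes the single-link distance from the pointwise matrix on every round instead of explicitly updating a cluster-level matrix (step 3b); the result is the same.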

Page 36

Starting Situation

• Start with clusters of individual points and a distance/proximity matrix

(Figure: five points p1–p5 and the corresponding distance matrix with rows and columns p1, p2, p3, p4, p5.)

Page 37

Intermediate situation

• After some merging steps, we have some clusters

(Figure: clusters C1–C5 and the current distance matrix with rows and columns C1–C5.)

Page 38

Intermediate situation

• How do we compare two clusters?

(Figure: clusters C1–C5.)

Page 39

Inter cluster distance measures

• Single link
• Average link
• Complete link
• Distance between centroids

(Figure: two clusters with a "Similarity?" arrow between them.)

Page 40

Intermediate situation

• We want to merge the two closest clusters (C2 and C5) and update the distance matrix.

(Figure: clusters C1–C5, with the two closest clusters C2 and C5 highlighted, and the distance matrix whose C2 and C5 entries will be merged.)

Page 41

Single link

• Smallest distance between an element in one cluster and an element in the other

$D(c_i, c_j) = \min_{x \in c_i,\, y \in c_j} D(x, y)$

Page 42

Complete link

• Largest distance between an element in one cluster and an element in the other

$D(c_i, c_j) = \max_{x \in c_i,\, y \in c_j} D(x, y)$

Page 43

Average Link

• Average distance between an element in one cluster and an element in the other

$D(c_i, c_j) = \operatorname{avg}_{x \in c_i,\, y \in c_j} D(x, y)$

Page 44

Distance between centroids

• Distance between the centroids of two clusters
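Summarizing pages 41–44, the four inter-cluster measures as one small Python sketch (mine, not the slides'):

```python
import numpy as np

def pairwise(ci, cj):
    """All pointwise distances D(x, y) for x in ci, y in cj."""
    return np.linalg.norm(ci[:, None] - cj[None, :], axis=2)

def single_link(ci, cj):    return pairwise(ci, cj).min()   # smallest pointwise distance
def complete_link(ci, cj):  return pairwise(ci, cj).max()   # largest pointwise distance
def average_link(ci, cj):   return pairwise(ci, cj).mean()  # average pointwise distance
def centroid_link(ci, cj):  return np.linalg.norm(ci.mean(axis=0) - cj.mean(axis=0))

ci = np.array([[0.0, 0.0], [0.0, 1.0]])
cj = np.array([[3.0, 0.0], [4.0, 0.0]])
print(single_link(ci, cj), complete_link(ci, cj))  # 3.0 4.123...
```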

Page 45

After Merging

• Update the distance matrix

(Figure: C2 and C5 are merged into C2 ∪ C5; the distance-matrix entries between C2 ∪ C5 and C1, C3, C4 must be recomputed, shown as "?" in the matrix.)

Page 46

Example – Single link clustering

Dendrogram

Clustering is obtained by cutting the dendrogram at a desired level: each connected component forms a cluster (see the SciPy sketch below).
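A hedged SciPy sketch of this cut (my illustration; the points and the cut height 0.9 are arbitrary choices):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Made-up 2-D points; single-link agglomerative clustering.
X = np.array([[0, 0], [0, 0.5], [5, 5], [5, 5.5], [9, 9]], dtype=float)
Z = linkage(X, method="single")  # encodes the full merge tree (the dendrogram)

# Cutting the dendrogram at height 0.9 keeps only merges cheaper than 0.9;
# each remaining connected component becomes one cluster.
labels = fcluster(Z, t=0.9, criterion="distance")
print(labels)  # e.g. [1 1 2 2 3]: three clusters
```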

Page 47

Single Link Clustering

Nested Clusters and Dendrogram

(Figure: six points grouped by single link, with the corresponding dendrogram; merge heights run from 0 to about 0.2.)

Page 48

Complete link Clustering

Nested Clusters and Dendrogram

(Figure: the same six points grouped by complete link, with the corresponding dendrogram; merge heights run from 0 to about 0.4.)

Page 49

Average link clustering

Nested Clusters and Dendrogram

(Figure: the same six points grouped by average link, with the corresponding dendrogram; merge heights run from 0 to about 0.25.)

Page 50

Comparison

(Figure: side-by-side comparison of the single link, complete link, and average link groupings of the same six points.)

Page 51

Agglomerative Clustering - Example

data matrix

Euclidean distance
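The first step, turning the data matrix into a Euclidean distance matrix, can be done with SciPy (illustrative data, not the values from the slide):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Rows of the data matrix are objects, columns are attributes.
data = np.array([[1.0, 1.0], [1.5, 2.0], [3.0, 4.0], [5.0, 7.0]])

# Condensed pairwise Euclidean distances, expanded to a full square matrix.
dist_matrix = squareform(pdist(data, metric="euclidean"))
print(np.round(dist_matrix, 2))
```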

Page 52

C-Means Clustering

1. Choose the number of clusters and randomly select the centroids of each cluster.

2. For each data point:
• Calculate the distance from the data point to each cluster.
• Instead of assigning the point completely to one cluster, use weights that depend on the distance of the point from each cluster: the closer the cluster, the higher the weight, and vice versa.
• Re-compute the centers of the clusters using these weighted distances, as in the sketch below.
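A minimal fuzzy c-means sketch in Python (my illustration; the fuzzifier m = 2 and the update rules are the standard FCM equations, which the slide describes only informally):

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, eps=1e-9):
    """Fuzzy c-means: each point gets a membership weight in every cluster;
    centroids are membership-weighted means."""
    rng = np.random.default_rng(0)
    centroids = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from each centroid to each point (eps avoids division by zero).
        D = np.linalg.norm(X[None, :, :] - centroids[:, None, :], axis=2) + eps
        # Membership weights: the closer the cluster, the higher the weight;
        # for every point, the weights over the clusters sum to 1.
        W = D ** (-2.0 / (m - 1.0))
        U = W / W.sum(axis=0)
        # Re-compute the cluster centers using the (fuzzified) weights.
        Um = U ** m
        centroids = (Um @ X) / Um.sum(axis=1, keepdims=True)
    return centroids, U

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)
centroids, U = fuzzy_c_means(X, c=2)
print(np.round(centroids, 2))  # two soft cluster centers
print(np.round(U, 2))          # membership of each point in each cluster
```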

Page 53

Mean Shift Algorithm

1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in Step 3.
5. Repeat Steps 3 and 4 until convergence.

The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:
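A minimal sketch of this loop in Python (my illustration; the flat circular window of radius 2 and the sample data are arbitrary choices):

```python
import numpy as np

def mean_shift_mode(X, start, radius=2.0, tol=1e-5, max_iter=100):
    """Mean shift for a single window: re-center the window on the mean
    (center of mass) of the points inside it until it stops moving."""
    center = np.asarray(start, dtype=float)  # step 2: initial window location
    for _ in range(max_iter):
        mask = np.linalg.norm(X - center, axis=1) <= radius  # points in window
        if not mask.any():
            return center                    # empty window: give up here
        new_center = X[mask].mean(axis=0)    # step 3: mean location in window
        if np.linalg.norm(new_center - center) < tol:
            return new_center                # step 5: converged at a mode
        center = new_center                  # step 4: re-center the window
    return center

# Two blobs; a window started near the first blob climbs to its dense center.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(6, 0.5, (50, 2))])
print(mean_shift_mode(X, start=[1.0, 1.0]))  # close to the blob center near (0, 0)
```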

Page 54


Mean Shift Algorithm

The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:

Pages 55–61

Intuitive Description

(Sequence of figures: a distribution of identical billiard balls. At each step, the region of interest is re-centered on the center of mass of the points inside it, moving along the mean shift vector until it settles on the densest region. Objective: find the densest region.)

Page 62

An example

The window tracks indicate the steepest-ascent directions.

Page 63

Clustering

Attraction basin: the region for which all trajectories lead to the same mode

Cluster: all the data points in the attraction basin of a mode

Page 64

Mean Shift Segmentation

Place a tiny mean shift window over each data point:
1. Grow the window and mean shift it.
2. Track windows that merge, along with the data they traversed.
3. Repeat until everything is merged into one cluster.

Extension: mean shift segmentation is sensitive to scale (the search window size). The solution is to use all scales.

Page 65

Mean Shift Segmentation

(Figures: segmentation with the best 4 clusters and with the best 2 clusters.)