Clustering

Clusteringpmuthuku/mlsp_page/lectures/... · 2012-09-27 · K–means 1. Initialize a set of centroids randomly 2. For each data point x, find the distance from the centroid for each

Aug 06, 2020


dariahiddleston

Clustering


Image Segmentation


Pink/White pixel: Apple blossom
Orange pixel: Orange
Green pixel: Leaf


Pixels as features


Principle of clustering:

Put things that are closer to each other (in feature space) into the same group


But what is a ‘good’ cluster?

Low in-group variability

High out-of-group variability


Compactness: Min(in group variability)

Need a measure that shows how ‘compact’ our clusters are

Distance based measures


Distance-based Measures

Total distance between each element in the cluster and every other element


Distance between farthest points in cluster


Total distance of every element in the cluster from the centroid of the cluster
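
The three distance-based compactness measures listed on these slides can be sketched as follows. This is a minimal illustration assuming Euclidean distance and points stored as tuples; the function names and the sample `cluster` are hypothetical and not taken from the lecture.

```python
import math
from itertools import combinations

def total_pairwise_distance(points):
    """Total distance between each element and every other element."""
    return sum(math.dist(a, b) for a, b in combinations(points, 2))

def diameter(points):
    """Distance between the two farthest points in the cluster."""
    return max((math.dist(a, b) for a, b in combinations(points, 2)), default=0.0)

def centroid(points):
    """Coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def total_distance_from_centroid(points):
    """Total distance of every element from the cluster centroid."""
    m = centroid(points)
    return sum(math.dist(p, m) for p in points)

# Hypothetical example: the four corners of a 2 x 2 square.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
pairwise = total_pairwise_distance(cluster)
widest = diameter(cluster)
scatter = total_distance_from_centroid(cluster)
```

All three shrink as the cluster gets tighter; the centroid-based measure is the one the K-means slides build on.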


Finding clusters: K-means


K-means algorithm

Minimizes scatter: Distance from centroid


What is a ‘Centroid’?

$$m_{cluster} = \frac{1}{n} \sum_{i \in cluster} x_i$$

Weighted centroid:

$$m_{cluster} = \frac{1}{\sum_{i \in cluster} w_i} \sum_{i \in cluster} w_i x_i$$
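
As a small sketch of the two centroid definitions (plain mean and weighted mean), assuming points are tuples of floats; the function names and sample values are illustrative only.

```python
def centroid(points):
    """Unweighted centroid: coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def weighted_centroid(points, weights):
    """Weighted centroid: sum of w_i * x_i divided by the sum of w_i."""
    total = sum(weights)
    dims = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dims)
    )

# Hypothetical data: three 2-D points, the last one counted twice as heavily.
points = [(0.0, 0.0), (4.0, 0.0), (2.0, 6.0)]
plain = centroid(points)
weighted = weighted_centroid(points, [1.0, 1.0, 2.0])
```

With all weights equal to 1 the weighted centroid reduces to the plain mean.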

K-means

1. Initialize a set of centroids randomly

2. For each data point x, find the distance from the centroid for each cluster

$$d_{cluster} = \text{distance}(x, m_{cluster})$$

3. Put data point in the cluster of the closest centroid

• Cluster for which d_cluster is minimum

4. When all data points are clustered, recompute cluster centroid

$$m_{cluster} = \frac{1}{N_{cluster}} \sum_{i \in cluster} x_i$$

5. If not converged, go back to 2
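
The five steps can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: it assumes Euclidean distance, picks the initial centroids by sampling k data points (one common reading of "randomly"), and treats "converged" as "no assignment changed". The name `kmeans` and its parameters are hypothetical.

```python
import math
import random

def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize centroids (assumption: sample k distinct data points).
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iters):
        # Steps 2-3: assign each point to the cluster with the closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 4: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

# Hypothetical data: two well-separated groups of three points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(data, k=2)
```

On data this cleanly separated the two groups are recovered; the later slides show that this is not true in general.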


K-means with weighted data points

Steps 1 to 3 and 5 are unchanged. In step 4, when all data points are clustered, recompute each centroid as a weighted mean:

$$m_{cluster} = \frac{1}{\sum_{i \in cluster} w_i} \sum_{i \in cluster} w_i x_i$$


Another example


Going back to our first example


4 clusters


6 clusters


Problems with K-means

Initial conditions important
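
A small sketch of why initial conditions matter: the same K-means loop, started from two different sets of initial centroids on the same data, can settle on different clusterings. Everything here is an illustrative assumption (Euclidean distance, scatter measured as the sum of squared distances to the assigned centroid, hypothetical names `kmeans_from` and `scatter`, and a hand-picked four-point data set).

```python
import math

def kmeans_from(points, centroids, max_iters=100):
    """K-means started from explicitly given initial centroids."""
    centroids = list(centroids)
    k = len(centroids)
    assignment = None
    for _ in range(max_iters):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

def scatter(points, centroids, assignment):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignment))

# Two tight pairs, far apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (9.0, 0.0), (9.0, 1.0)]

# One initial centroid in each pair: converges to the left/right split.
good_centroids, good_labels = kmeans_from(points, [(0.0, 0.0), (9.0, 0.0)])
# Both initial centroids in the same pair: stuck in a top/bottom split.
bad_centroids, bad_labels = kmeans_from(points, [(0.0, 0.0), (0.0, 1.0)])

good_scatter = scatter(points, good_centroids, good_labels)
bad_scatter = scatter(points, bad_centroids, bad_labels)
```

Both runs have converged in the sense of step 5, yet the second one has a much larger scatter. A common workaround is to run from several initializations and keep the result with the lowest scatter.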


What is K?


K=2


K=5


Is there an optimal clustering method?


Optimal method: Exhaustive Enumeration

Compute distances between every single pair of data points and cluster on that


Very, very computationally expensive

If M data points and we want N clusters:

Compute goodness for every possible combination

$$\frac{1}{N!} \sum_{i=0}^{N} (-1)^{i} \binom{N}{i} (N-i)^{M}$$
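
To get a feel for how fast this count grows, the expression can be evaluated directly with exact integer arithmetic. The sketch below assumes the formula counts the ways to split M points into N non-empty clusters (the Stirling number of the second kind); the function name `num_clusterings` is hypothetical.

```python
from math import comb, factorial

def num_clusterings(m, n):
    """Evaluate (1/N!) * sum_{i=0..N} (-1)^i * C(N, i) * (N - i)^M exactly."""
    total = sum((-1) ** i * comb(n, i) * (n - i) ** m for i in range(n + 1))
    # The alternating sum is always divisible by N!, so integer division is exact.
    return total // factorial(n)

small = num_clusterings(4, 2)     # 4 points into 2 clusters
medium = num_clusterings(10, 3)   # 10 points into 3 clusters
large = num_clusterings(100, 5)   # 100 points into 5 clusters
```

Even 100 points in 5 clusters gives a number with more than 60 digits, which is why exhaustive enumeration is impractical.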


K-means: Fast but greedy


Going back to our first example


Hierarchical clustering


Hierarchical clustering: Bottom up


Bottom up clustering


Initially, every point is its own cluster


Notes about bottom up clustering

Single Link: Nearest neighbor distance


Complete link: Farthest neighbor distance


Centroid: Distance between centroids
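
The three ways of measuring the distance between two clusters, and the bottom-up merge loop that uses them, can be sketched as below. This is an illustrative, unoptimized version assuming Euclidean distance; the names `single_link`, `complete_link`, `centroid_link` and `bottom_up` are hypothetical, and the slides do not specify a stopping rule, so the sketch simply stops at a requested number of clusters.

```python
import math

def single_link(a, b):
    """Nearest neighbor distance between clusters a and b."""
    return min(math.dist(p, q) for p in a for q in b)

def complete_link(a, b):
    """Farthest neighbor distance between clusters a and b."""
    return max(math.dist(p, q) for p in a for q in b)

def centroid_link(a, b):
    """Distance between the centroids of clusters a and b."""
    mean = lambda pts: tuple(sum(x) / len(pts) for x in zip(*pts))
    return math.dist(mean(a), mean(b))

def bottom_up(points, n_clusters, linkage=single_link):
    # Initially, every point is its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Merge cluster j into cluster i.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 1-D-like example.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (5.0, 0.0)]
merged = bottom_up([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)], 2)
```

Swapping the `linkage` argument changes which clusters are considered close, and therefore the shape of the resulting hierarchy.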


Hierarchical clustering: Top Down


Top down clustering


K-Means for Top-Down clustering

1. Start with one cluster

2. Split each cluster into two: Perturb centroid of cluster slightly (by < 5%) to generate two centroids

3. Initialize K-means with new set of centroids

4. Iterate K-means until convergence

5. If the desired number of clusters is not obtained, return to 2
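
One possible reading of this splitting procedure is sketched below. Several details are assumptions rather than anything stated on the slide: the perturbation is multiplicative (each centroid m becomes m·(1 + ε) and m·(1 − ε) with ε = 1%), the distance is Euclidean, and because every cluster is split on each pass the number of clusters doubles, so the target should be a power of two. The names `kmeans_from` and `top_down` are hypothetical.

```python
import math

def kmeans_from(points, centroids, max_iters=100):
    """Plain K-means started from the given centroids."""
    centroids = list(centroids)
    k = len(centroids)
    assignment = None
    for _ in range(max_iters):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

def top_down(points, n_clusters, perturb=0.01):
    # Step 1: one cluster whose centroid is the mean of all points.
    centroids = [tuple(sum(x) / len(points) for x in zip(*points))]
    assignment = [0] * len(points)
    while len(centroids) < n_clusters:
        # Step 2: split every centroid into two slightly perturbed copies.
        # (A centroid exactly at the origin would not move under this scheme.)
        split = []
        for m in centroids:
            split.append(tuple(x * (1 + perturb) for x in m))
            split.append(tuple(x * (1 - perturb) for x in m))
        # Steps 3-4: run K-means from the new centroids until convergence.
        centroids, assignment = kmeans_from(points, split)
    # Step 5 is the loop condition above.
    return centroids, assignment

# Hypothetical data: two separated groups.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = top_down(data, 2)
```

Because each round starts from centroids derived from the previous round, this avoids a purely random initialization.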


When is a data point in a cluster?


Distance from cluster

Euclidean distance from centroid


Distance from the closest point


Distance from the farthest point


Probability of data measured on cluster distribution
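
The four notions of "distance from a cluster" listed on the last few slides can be written down side by side. The sketch below is illustrative only: the function names are made up here, and the choice of a single Gaussian as the "cluster distribution" is an assumption, since the slides do not name a distribution.

```python
import numpy as np

def centroid_distance(x, cluster):
    """Euclidean distance from x to the cluster centroid."""
    return np.linalg.norm(x - cluster.mean(axis=0))

def closest_point_distance(x, cluster):
    """Distance from x to the nearest member of the cluster."""
    return np.linalg.norm(cluster - x, axis=1).min()

def farthest_point_distance(x, cluster):
    """Distance from x to the farthest member of the cluster."""
    return np.linalg.norm(cluster - x, axis=1).max()

def gaussian_log_likelihood(x, cluster):
    """Log-probability of x under a Gaussian fitted to the cluster (assumed model)."""
    mu = cluster.mean(axis=0)
    # Small ridge keeps the covariance invertible for tiny clusters.
    cov = np.cov(cluster, rowvar=False) + 1e-6 * np.eye(cluster.shape[1])
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + len(x) * np.log(2.0 * np.pi))
```

For the probabilistic variant a larger value means "closer", so a point would be assigned to the cluster with the highest log-likelihood rather than the smallest distance.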

Page 94:

A closer look at ‘Distance’

Page 95:

K–means

1. Initialize a set of centroids randomly

2. For each data point x, find the distance from the centroid for each cluster

   $d_{cluster} = \text{distance}(x, m_{cluster})$

3. Put data point in the cluster of the closest centroid

• Cluster for which $d_{cluster}$ is minimum

4. When all data points are clustered, recompute centroids

   $m_{cluster} = \frac{1}{\sum_{i \in cluster} w_i} \sum_{i \in cluster} w_i x_i$

5. If not converged, go back to 2
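
A minimal Python sketch of these steps, with per-point weights in the centroid update, might look like the following. The function name `weighted_kmeans`, the use of randomly chosen data points as initial centroids, and the L2 distance are assumptions made for the example; weights are assumed to be positive.

```python
import numpy as np

def weighted_kmeans(X, w, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick K distinct data points as the initial centroids.
    m = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Step 2: distance from every point to every centroid (L2 here).
        d = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
        # Step 3: assign each point to the closest centroid.
        labels = d.argmin(axis=1)
        # Step 4: recompute each centroid as the weighted mean of its members.
        new_m = m.copy()
        for k in range(K):
            in_k = labels == k
            if in_k.any():
                new_m[k] = (w[in_k, None] * X[in_k]).sum(axis=0) / w[in_k].sum()
        # Step 5: stop once the centroids no longer move.
        if np.allclose(new_m, m):
            break
        m = new_m
    return m, labels
```

With all weights equal to 1 the update reduces to the ordinary cluster mean.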

Page 96:

A closer look at ‘Distance’

Original algorithm uses L2 norm and weight = 1

$\text{distance}_{cluster}(x, m_{cluster}) = \|x - m_{cluster}\|_2$

$m_{cluster} = \frac{1}{N_{cluster}} \sum_{i \in cluster} x_i$

This is an instance of generalized EM

The algorithm is not guaranteed to converge for other distance metrics

Page 97:

Problems with Euclidean distance

Page 101:

Better way: Map it to different space

f([x, y]) -> [x, y, z]

x = x

y = y

$z = a(x^2 + y^2)$
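
To see why this mapping helps, consider two concentric rings. That data set is an assumption made here for illustration, since the figures from the preceding slides are not preserved in this transcript. Both rings share the same centroid in 2-D, so Euclidean distance to a centroid cannot tell them apart, but the added coordinate z separates them. The name `lift` and the radii are hypothetical.

```python
import numpy as np

def lift(points, a=1.0):
    """Map [x, y] -> [x, y, a * (x**2 + y**2)]."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([x, y, a * (x ** 2 + y ** 2)])

theta = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
inner = np.column_stack([np.cos(theta), np.sin(theta)])   # ring of radius 1
outer = 3.0 * inner                                       # ring of radius 3

# In the lifted space every point of a ring has the same z value,
# equal to a * radius**2, so the two rings sit at different heights.
z_inner = lift(inner)[:, 2]
z_outer = lift(outer)[:, 2]
```

After lifting, a plane at any height between the two z values separates the rings, which plain K-means can then exploit.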

Page 104:

The Kernel trick

Transform data to higher dimensional space (even infinite!)

$z = F(x)$

Compute distance in higher dimensional space

$d(x_1, x_2) = \|z_1 - z_2\|^2 = \|F(x_1) - F(x_2)\|^2$

Page 106:

The cool part

Distance in low dimensional space:

$\|x_1 - x_2\|^2 = (x_1 - x_2)^T (x_1 - x_2) = x_1 \cdot x_1 + x_2 \cdot x_2 - 2\, x_1 \cdot x_2$

Distance in high dimensional space:

$d(x_1, x_2) = \|F(x_1) - F(x_2)\|^2 = F(x_1) \cdot F(x_1) + F(x_2) \cdot F(x_2) - 2\, F(x_1) \cdot F(x_2)$

Note: Every term involves dot products!

Page 107:

Kernel function

Kernel function is just

$K(x_1, x_2) = F(x_1) \cdot F(x_2)$

Going back to our distance function in the high dimensional space:

$d(x_1, x_2) = \|F(x_1) - F(x_2)\|^2$

$= F(x_1) \cdot F(x_1) + F(x_2) \cdot F(x_2) - 2\, F(x_1) \cdot F(x_2)$

$= K(x_1, x_1) + K(x_2, x_2) - 2 K(x_1, x_2)$

Evaluating the kernel function directly is usually more efficient than computing F(x1) and F(x2) explicitly and taking their dot product in the high dimensional space
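
A small numerical check makes the identity concrete. The example below assumes one particular kernel, the degree-2 polynomial kernel K(x, y) = (x·y)^2 on 2-D inputs, whose explicit feature map is known in closed form. Both the kernel choice and the function names are picked for illustration only.

```python
import numpy as np

def F(x):
    """Explicit feature map for K(x, y) = (x . y)**2 on 2-D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def K(x, y):
    """Kernel evaluated directly in the low dimensional space."""
    return float(np.dot(x, y)) ** 2

def kernel_distance(x, y):
    """Squared distance in the high dimensional space, using only K."""
    return K(x, x) + K(y, y) - 2.0 * K(x, y)

a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

# Same quantity computed the long way, through the explicit map.
explicit = np.sum((F(a) - F(b)) ** 2)
```

Here `kernel_distance(a, b)` and `explicit` agree, yet the kernel version never builds the 3-D vectors. For higher polynomial degrees or higher input dimensions the explicit map grows quickly, while the cost of evaluating K stays that of one low dimensional dot product.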

Page 108:

Typical Kernel Functions

Linear: $K(x, y) = x^T y + c$

Polynomial: $K(x, y) = (a\, x^T y + c)^n$

Gaussian: $K(x, y) = \exp(-\|x - y\|^2 / \sigma^2)$

Exponential: $K(x, y) = \exp(-\|x - y\| / \lambda)$

Several others

Choosing the right Kernel with the right parameters for your problem is an art
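
The four kernels listed above translate directly into code. The sketch below follows the formulas on the slide. The function names, the default parameter values, and the names `sigma` and `lam` for the two width parameters are assumptions made for the example.

```python
import numpy as np

def linear_kernel(x, y, c=0.0):
    """K(x, y) = x^T y + c"""
    return x @ y + c

def polynomial_kernel(x, y, a=1.0, c=1.0, n=2):
    """K(x, y) = (a x^T y + c)^n"""
    return (a * (x @ y) + c) ** n

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / sigma^2)"""
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def exponential_kernel(x, y, lam=1.0):
    """K(x, y) = exp(-||x - y|| / lam)"""
    return np.exp(-np.linalg.norm(x - y) / lam)
```

The Gaussian and exponential kernels depend only on the distance between x and y, so K(x, x) = 1 for every point, and the width parameter controls how quickly similarity decays.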

Page 109:

Kernel K-means

Page 110:

Kernel K–means

1. Initialize a set of centroids randomly

2. For each data point x, find the distance from the centroid for each cluster

   $d_{cluster} = \text{distance}(x, m_{cluster})$

3. Put data point in the cluster of the closest centroid

• Cluster for which $d_{cluster}$ is minimum

4. When all data points are clustered, recompute the cluster centroid

   $m_{cluster} = \frac{1}{N_{cluster}} \sum_{i \in cluster} x_i$

5. If not converged, go back to 2

Page 119:

Kernel K–means

1. Initialize a set of centroids randomly

2. For each data point x, find the distance from the centroid for each cluster

   $d_{cluster} = \text{distance}(x, m_{cluster})$

3. Put data point in the cluster of the closest centroid

• Cluster for which $d_{cluster}$ is minimum

4. When all data points are clustered, recompute centroids

   $m_{cluster} = \frac{1}{\sum_{i \in cluster} w_i} \sum_{i \in cluster} w_i x_i$

5. If not converged, go back to 2

Page 124:

Distance metric

$d(x, cluster) = \|F(x) - m_{cluster}\|^2 = \left(F(x) - C \sum_{i \in cluster} w_i F(x_i)\right)^T \left(F(x) - C \sum_{i \in cluster} w_i F(x_i)\right)$

$= F(x)^T F(x) - 2C \sum_{i \in cluster} w_i F(x)^T F(x_i) + C^2 \sum_{i \in cluster} \sum_{j \in cluster} w_i w_j F(x_i)^T F(x_j)$

$= K(x, x) - 2C \sum_{i \in cluster} w_i K(x, x_i) + C^2 \sum_{i \in cluster} \sum_{j \in cluster} w_i w_j K(x_i, x_j)$

Here $C = 1 / \sum_{i \in cluster} w_i$, so that $m_{cluster} = C \sum_{i \in cluster} w_i F(x_i)$ is the weighted centroid in the high dimensional space.
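
Because the distance to a cluster needs only kernel values, kernel K-means can run on a precomputed Gram matrix G with G[i, j] = K(x_i, x_j), without ever forming the centroids. The following is a sketch under stated assumptions, not the lecture's implementation: the function names are invented here, the normalizer is taken as C = 1 / (sum of the cluster's weights), and the algorithm starts from a random assignment of points to clusters because the centroids are only implicit in the high dimensional space.

```python
import numpy as np

def kernel_cluster_distance(G, idx, w):
    """d(x, cluster) for every data point x, from kernel values only.

    G   : (N, N) Gram matrix, G[i, j] = K(x_i, x_j)
    idx : integer indices of the points currently in the cluster
    w   : (N,) positive per-point weights
    """
    wk = w[idx]
    C = 1.0 / wk.sum()
    cross = G[:, idx] @ wk                    # sum_i w_i K(x, x_i)
    within = wk @ G[np.ix_(idx, idx)] @ wk    # sum_i sum_j w_i w_j K(x_i, x_j)
    return np.diag(G) - 2.0 * C * cross + C ** 2 * within

def kernel_kmeans(G, K, w=None, n_iter=50, seed=0):
    N = G.shape[0]
    w = np.ones(N) if w is None else np.asarray(w, dtype=float)
    rng = np.random.default_rng(seed)
    labels = rng.integers(K, size=N)          # random initial assignment (assumption)
    for _ in range(n_iter):
        d = np.full((N, K), np.inf)           # empty clusters stay at infinite distance
        for k in range(K):
            idx = np.flatnonzero(labels == k)
            if idx.size > 0:
                d[:, k] = kernel_cluster_distance(G, idx, w)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

With the linear kernel, G = X X^T, the distance reduces to the squared Euclidean distance to the weighted centroid, which is a convenient sanity check for the expansion.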

Page 125:

Other clustering methods

Regression based clustering

Find a regression representing each cluster

Associate each point to the cluster with the best regression

Related to kernel methods

Page 126:

Questions?