Page 1: Clustering methods in image processing

Dec 2014

Amirkabir University of Technology

Maede Maftouni

Page 2: Clustering methods in image processing

Outline

About Clustering

o What is data clustering?

o Unsupervised learning

o Similarity measures

o Quality measures

SOM (Self-Organizing Map)

o Topology

o Learning algorithm

o Example

o Applications

FCM, PCM

o Hard C-Means

o Fuzzy C-Means Clustering (FCM)

o Possibilistic C-Means Clustering (PCM)

o Comparison of FCM, PCM

o Example and Results

Page 3: Clustering methods in image processing

What is data clustering?

Data clustering is the process of identifying natural groupings or clusters within unlabelled data based on some similarity measure [Jain et al., 1999].

Page 4: Clustering methods in image processing

Unsupervised learning

How do we know what constitutes "different" clusters?

• Green apple and banana example.

• Two features: shape and color.

Page 5: Clustering methods in image processing

Unsupervised learning

Page 6: Clustering methods in image processing

Unsupervised learning

A good clustering minimizes intra-cluster distances and maximizes inter-cluster distances.

Page 7: Clustering methods in image processing

Similarity Measures

Euclidean distance: $d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$

Manhattan distance (city block): $d(x, y) = \sum_{i} \lvert x_i - y_i \rvert$

Cosine similarity (vector dot product): $s(x, y) = \dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$

Mahalanobis distance: $d(x, y) = \sqrt{(x - y)^{\top} \Sigma^{-1} (x - y)}$, where $\Sigma$ is the covariance matrix of the data
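To make these measures concrete, here is a small Python sketch (not from the original slides); the sample array `X` and the use of the data covariance matrix in the Mahalanobis distance are illustrative assumptions.

```python
# Illustrative implementations of the four similarity/distance measures.
import numpy as np

def euclidean(x, y):
    return np.sqrt(np.sum((x - y) ** 2))

def manhattan(x, y):          # city-block distance
    return np.sum(np.abs(x - y))

def cosine_similarity(x, y):  # based on the vector dot product
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def mahalanobis(x, y, cov):   # accounts for correlations between features
    d = x - y
    return np.sqrt(d @ np.linalg.inv(cov) @ d)

# Hypothetical 2-D feature vectors (e.g., shape and color scores)
X = np.array([[1.0, 2.0], [2.0, 3.5], [0.5, 1.0], [3.0, 4.0]])
x, y = X[0], X[1]
print(euclidean(x, y), manhattan(x, y), cosine_similarity(x, y))
print(mahalanobis(x, y, np.cov(X.T)))  # covariance estimated from the data
```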

Page 11: Clustering methods in image processing

Types of Clustering Methods

Partitioning clustering

o SOM clustering

o K-means clustering

o K-medoids clustering

o Fuzzy c-means clustering

o Evolutionary-based clustering

Hierarchical clustering

o Agglomerative clustering

o Divisive clustering

Density-based clustering

Grid-based methods

Page 12: Clustering methods in image processing

Cluster Quality Measures

Compactness measure (within-cluster scatter):

$$\mathrm{Comp}(h) = \frac{1}{N_h} \sum_{j=1}^{N_h} \left\lVert x_j^{(h)} - c_h \right\rVert^2$$

where $N_h$ is the number of samples in cluster $h$, $x_j^{(h)}$ is the $j$-th sample in cluster $h$, and $c_h$ is the center of cluster $h$.

Separation measure (between-cluster scatter):

$$\mathrm{Sep}(h, h') = \left\lVert c_h - c_{h'} \right\rVert^2$$

where $c_{h'}$ is the center of cluster $h'$.

Combined measure: compactness and separation combined into a single index (e.g., their ratio); good clusterings are compact and well separated.
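The following Python sketch implements these measures for a hard partition; since the slides' exact combined form is not shown, the ratio below is an assumption (one common choice).

```python
# Compactness, separation, and a combined index for a hard partition.
# Assumes every cluster is non-empty; labels[j] is the cluster of sample j.
import numpy as np

def compactness(X, labels, centers):
    # Mean squared distance of samples to their own cluster center,
    # averaged over the clusters
    return np.mean([np.mean(np.sum((X[labels == h] - centers[h]) ** 2, axis=1))
                    for h in range(len(centers))])

def separation(centers):
    # Smallest squared distance between two distinct cluster centers
    c = len(centers)
    return min(np.sum((centers[h] - centers[k]) ** 2)
               for h in range(c) for k in range(c) if h != k)

def combined(X, labels, centers):
    # Small values = compact, well-separated clusters (assumed combination)
    return compactness(X, labels, centers) / separation(centers)
```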

Page 15: Clustering methods in image processing

Self Organizing Map (SOM)

Page 16: Clustering methods in image processing

Topology

The Kohonen Self-Organizing Network (KSON) belongs to the class of unsupervised learning networks.

Its nodes distribute themselves across the input space so as to recognize groups of similar input vectors.

Page 17: Clustering methods in image processing

Topology

Training is based on the competitive learning technique, also known as the winner-take-all strategy.

SOM is a technique that reduces the dimensionality of data through the use of self-organizing neural networks.

Page 18: Clustering methods in image processing

A Schematic Representation of a Typical KSOM

Page 19: Clustering methods in image processing

Neighborhood Structure (NC)

One Dimensional

2D Hexagonal

2D Rectangular

Page 22: Clustering methods in image processing

The Self-Organizing Map (SOM)

Page 23: Clustering methods in image processing

Steps of Learning Algorithm

Step 1: Initialize the weight matrix W, the neighborhood radius Nc, and the learning rate α(0).

Step 2: For each input vector x, do steps 3 – 5.

Step 3: Find the winning unit J whose weight vector is closest to the input (minimum Euclidean distance): $J = \arg\min_j \lVert x - w_j \rVert$.

Step 4: Update the weights of the winning unit and of the units in its neighborhood: $w_j(t+1) = w_j(t) + \alpha(t)\,[\,x - w_j(t)\,]$.

Step 5: After all patterns have been presented (one epoch), reduce the learning rate (and, if used, the neighborhood radius), then repeat from step 2 until the weight changes become steady or the error is within a tolerable range.

Page 26: Clustering methods in image processing

Example

A Kohonen self-organizing map is used to cluster four vectors given by:

(1,1,1,0)

(0,0,0,1)

(1,1,0,0)

(0,0,1,1)

The maximum number of clusters to be formed is m = 3.

Page 27: Clustering methods in image processing

Example

Suppose the learning rate (geometrically decreasing) is given by:

α(0) = 0.3

α(t+1) = 0.2 α(t)

With only three clusters available and the weights of only one cluster updated at each step (i.e., Nc = 0), find the weight matrix after one single epoch of training.

Page 28: Clustering methods in image processing

Step 1

Step 1: The initial weight matrix (each column is the weight vector of one cluster unit) is:

$$W = \begin{bmatrix} 0.2 & 0.4 & 0.1 \\ 0.3 & 0.2 & 0.2 \\ 0.5 & 0.3 & 0.5 \\ 0.1 & 0.1 & 0.1 \end{bmatrix}$$

Initial radius: Nc = 0

Initial learning rate: α(0) = 0.3

Page 29: Clustering methods in image processing

Step 2,3 (Pattern 1)

Step 2: For the first input vector (1,1,1,0), do steps 3 – 5

Step 3:

Squared distances to the three weight vectors: $\lVert x - w_1 \rVert^2 = 1.39$, $\lVert x - w_2 \rVert^2 = 1.50$, $\lVert x - w_3 \rVert^2 = 1.71$. The input vector is closest to output node 1, so node 1 is the winner and its weights are updated.

Page 30: Clustering methods in image processing

Step 4

Step 4: The weights on the winning unit are updated with $w_J(\text{new}) = w_J(\text{old}) + \alpha\,[\,x - w_J(\text{old})\,]$:

$$w_1 = (0.2, 0.3, 0.5, 0.1) + 0.3\,[(1, 1, 1, 0) - (0.2, 0.3, 0.5, 0.1)] = (0.44, 0.51, 0.65, 0.07)$$

Page 31: Clustering methods in image processing

Step 2,3 (Pattern 2)

Step 2: For the second input vector (0,0,0,1), do steps 3 – 5

Step 3:

The input vector is closest to output node 2. Thus node 2 is the winner. The weights for node 2 should be updated.

Page 32: Clustering methods in image processing

Step 4

Step 4: The weights on the winning unit are updated:

$$w_2 = (0.4, 0.2, 0.3, 0.1) + 0.3\,[(0, 0, 0, 1) - (0.4, 0.2, 0.3, 0.1)] = (0.28, 0.14, 0.21, 0.37)$$

Page 33: Clustering methods in image processing

Step 2,3 (Pattern 3)

Step 2: For the third input vector (1,1,0,0), do steps 3 – 5

Step 3:

The input vector is closest to output node 1. Thus node 1 is the winner. The weights for node 1 should be updated.

Page 34: Clustering methods in image processing

Step 4

Step 4: The weights on the winning unit are updated:

$$w_1 = (0.44, 0.51, 0.65, 0.07) + 0.3\,[(1, 1, 0, 0) - (0.44, 0.51, 0.65, 0.07)] = (0.608, 0.657, 0.455, 0.049)$$

Page 35: Clustering methods in image processing

Step 2,3 (Pattern 4)

Step 2: For the fourth input vector (0,0,1,1), do steps 3 – 5

Step 3:

The input vector is closest to output node 3. Thus node 3 is the winner. The weights for node 3 should be updated.

Page 36: Clustering methods in image processing

Step 4

Step 4: The weights on the winning unit are updated:

$$w_3 = (0.1, 0.2, 0.5, 0.1) + 0.3\,[(0, 0, 1, 1) - (0.1, 0.2, 0.5, 0.1)] = (0.07, 0.14, 0.65, 0.37)$$

Page 37: Clustering methods in image processing

Step 5

Epoch 1 is complete.

Reduce the learning rate:

α(t+1)=0.2α(t)=0.2(0.3)=0.06

Repeat from the start for new epochs until Δwj becomes steady for all input patterns or the error is within a tolerable range.
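As a cross-check, here is a minimal Python sketch (not part of the original slides) that reproduces this one-epoch, winner-take-all run; printing W at the end gives the columns (0.608, 0.657, 0.455, 0.049), (0.28, 0.14, 0.21, 0.37), and (0.07, 0.14, 0.65, 0.37) obtained above.

```python
# One epoch of winner-take-all Kohonen training (Nc = 0, alpha(0) = 0.3).
import numpy as np

# Columns of W are the weight vectors of the three cluster units
W = np.array([[0.2, 0.4, 0.1],
              [0.3, 0.2, 0.2],
              [0.5, 0.3, 0.5],
              [0.1, 0.1, 0.1]])

patterns = np.array([[1, 1, 1, 0],
                     [0, 0, 0, 1],
                     [1, 1, 0, 0],
                     [0, 0, 1, 1]], dtype=float)

alpha = 0.3
for x in patterns:
    d2 = np.sum((W - x[:, None]) ** 2, axis=0)  # squared distance to each unit
    j = np.argmin(d2)                           # winning unit
    W[:, j] += alpha * (x - W[:, j])            # update only the winner (Nc = 0)

alpha *= 0.2          # geometric decay after the epoch: 0.3 -> 0.06
print(np.round(W, 3))
```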

Page 38: Clustering methods in image processing

Applications

Based on the competitive learning rule, KSONs have been used extensively for clustering applications such as:

character recognition, speech recognition, vector coding, robotics applications, and texture segmentation.

Page 39: Clustering methods in image processing

Character Recognition (Example)

21 input patterns, 7 letters from 3 different fonts

25 cluster units are available, which means that a maximum of 25 clusters may be formed.

Page 40: Clustering methods in image processing

Example (cont.)

No topological structure

Only the winning unit is allowed to learn each presented pattern; the 21 patterns form 5 clusters:

UNIT PATTERNS

3 C1, C2, C3

13 B1, B3, D1, D3, E1, K1, K3

16 A1, A2, A3

18 J1, J2, J3

24 B2, D2, E2, K2

Page 41: Clustering methods in image processing

Example (cont.)

A linear structure (with R = 1)

The winning node J and its topological neighbors (J+1 and J-1) are allowed to learn on each iteration.

UNIT PATTERNS

6 K2

10 J1, J2, J3

14 E1, E3

16 K1, K3

18 B1, B3, D1, D3

20 C1, C2, C3

22 D2

23 B2, E2

25 A1, A2, A3

Page 42: Clustering methods in image processing

Example (cont.)

Diamond structure

Each cluster unit is indexed by two subscripts.

If unit Xij is the winning unit, the units Xi+1,j, Xi-1,j, Xi,j+1, and Xi,j-1 also learn.

i\j 1 2 3 4 5

1 J1, J2, J3 D2

2 C1, C2, C3 D1, D3 B2, E2

3 B1 K2

4 E1, E3, B3 A3

5 K1, K3 A1, A2

Page 43: Clustering methods in image processing

Hard C-Means Clustering

Memberships are crisp: $u_{ij} \in \{0, 1\}$.

The k-means (hard c-means) algorithm clusters n objects, based on their attributes, into k disjoint partitions, where k < n.

The objective function depends on the cluster centers $c_i$ and the assignment of data points to clusters $U = [u_{ij}]$:

$$J(U, C) = \sum_{i=1}^{k} \sum_{j=1}^{n} u_{ij}\, \lVert x_j - c_i \rVert^2$$

Constraint 1: ensures that each data point is assigned to exactly one cluster: $\sum_{i=1}^{k} u_{ij} = 1$ for all $j$.

Constraint 2: ensures that no cluster is left empty: $\sum_{j=1}^{n} u_{ij} > 0$ for all $i$.

Page 44: Clustering methods in image processing

Hard C-Means Clustering

The iterative optimization scheme works as follows. First, the number of clusters and the initial cluster centers are chosen. Then each data point is assigned to its closest cluster center:

$$u_{ij} = \begin{cases} 1 & \text{if } i = \arg\min_{k} \lVert x_j - c_k \rVert^2 \\ 0 & \text{otherwise} \end{cases}$$

Then the data partition U is held fixed and new cluster centers are computed as the mean of the assigned points:

$$c_i = \frac{\sum_{j} u_{ij}\, x_j}{\sum_{j} u_{ij}}$$

The last two steps are iterated until no change in C or U can be observed.
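A minimal Python sketch of this alternating scheme (assumed details: random initial centers drawn from the data, and a center kept unchanged if its cluster would otherwise become empty):

```python
# Hard c-means / k-means: alternate assignment and center updates.
import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # initial centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: each data point goes to its closest center
        d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(d2, axis=1)
        # Update step: each center becomes the mean of its assigned points
        new_centers = np.array([X[labels == i].mean(axis=0)
                                if np.any(labels == i) else centers[i]
                                for i in range(k)])
        if np.allclose(new_centers, centers):   # no change in C (and hence U)
            break
        centers = new_centers
    return centers, labels
```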

Page 45: Clustering methods in image processing

Hard C-Means Clustering

Page 46: Clustering methods in image processing

Fuzzy C-Means Clustering

In contrast with hard c-means, fuzzy clustering allows gradual memberships of data points to clusters: $u_{ij} \in [0, 1]$. The objective function is

$$J_f(U, C) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2}, \qquad d_{ij} = \lVert x_j - c_i \rVert$$

Constraint 1: guarantees that no cluster is empty: $\sum_{j=1}^{n} u_{ij} > 0$ for all $i$.

Constraint 2: ensures that the sum of the membership degrees for each datum equals 1: $\sum_{i=1}^{c} u_{ij} = 1$ for all $j$. This means that each datum receives the same weight in comparison to all other data.

Page 47: Clustering methods in image processing

Fuzzy C-Means Clustering

First the membership degrees are optimized for fixed cluster parameters, then the cluster prototypes are optimized for fixed membership degrees, according to the following update formulas:

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{\frac{2}{m-1}}}, \qquad c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}$$

The parameter m, m > 1, is called the fuzzifier or weighting exponent; it determines the 'fuzziness' of the classification.
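The following Python sketch (standard FCM, not the slides' own code) alternates the two updates above; the small constant added to the squared distances is an assumption to avoid division by zero when a datum coincides with a prototype.

```python
# Fuzzy c-means: alternate membership and prototype updates.
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                    # memberships of each datum sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)   # prototype update
        d2 = np.sum((X[None, :, :] - centers[:, None, :]) ** 2, axis=2) + 1e-12
        U_new = d2 ** (-1.0 / (m - 1))    # membership update ...
        U_new /= U_new.sum(axis=0)        # ... normalized over the clusters
        if np.max(np.abs(U_new - U)) < eps:
            break
        U = U_new
    return centers, U
```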

Page 48: Clustering methods in image processing

Fuzzy C-Means Clustering

Page 50: Clustering methods in image processing

Possibilistic C-Means Clustering

The normalization of memberships in FCM can lead to undesired effects in the presence of noise and outliers.

Because every datum carries the same fixed weight, noise points may receive high memberships to clusters.

By dropping the normalization constraint, possibilistic c-means clustering tries to achieve a more intuitive assignment of degrees of membership and to avoid such undesirable normalization effects.

Page 51: Clustering methods in image processing

Possibilistic C-Means Clustering

Constraint 1: guarantees that no cluster is empty.

The $u_{ij} \in [0, 1]$ are interpreted as the degree of representativity or typicality of the datum $x_j$ to cluster $\Gamma_i$.

$J_f$ is modified to:

$$J_p(U, C) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} (1 - u_{ij})^{m}$$

The second term suppresses the trivial solution $u_{ij} = 0$, since this sum rewards high memberships (close to 1).

Page 52: Clustering methods in image processing

Possibilistic C-Means Clustering

The formula for updating the membership degrees, derived from $J_p$ by setting its derivative to zero, is:

$$u_{ij} = \frac{1}{1 + \left( \frac{d_{ij}^{2}}{\eta_i} \right)^{\frac{1}{m-1}}}$$

For $m = 2$, substituting $d_{ij}^{2} = \eta_i$ yields $u_{ij} = 0.5$. Therefore $\eta_i$ is a parameter that determines the distance to cluster $i$ at which the membership degree equals 0.5.

Depending on the cluster's shape, the $\eta_i$ have different geometrical interpretations and can be set to the desired value.

Page 53: Clustering methods in image processing

Possibilistic C-Means Clustering

However, the information on the actual shape property described by $\eta_i$ is often unknown. In that case these parameters are estimated from the data, e.g. from a preceding probabilistic (FCM) run [19]:

$$\eta_i = K \cdot \frac{\sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2}}{\sum_{j=1}^{n} u_{ij}^{m}}, \qquad \text{typically } K = 1$$

The update equations for the cluster prototypes in the possibilistic algorithm are identical to their probabilistic counterparts, because the second, additional term in $J_p$ vanishes in the derivative for fixed (constant) memberships $u_{ij}$.
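A minimal Python sketch of the possibilistic updates just described (not the slides' own code); it assumes the prototypes and the eta values have been initialized beforehand, e.g. from an FCM run, as suggested above.

```python
# Possibilistic c-means: typicality update plus the (unchanged) prototype update.
import numpy as np

def pcm(X, centers, eta, m=2.0, n_iter=100):
    # centers: (c, p) initial prototypes; eta: (c,) distance scales
    for _ in range(n_iter):
        d2 = np.sum((X[None, :, :] - centers[:, None, :]) ** 2, axis=2)
        # Typicality update: no normalization over the clusters (unlike FCM)
        U = 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1)))
        Um = U ** m
        # Prototype update: identical to the probabilistic (FCM) counterpart
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
    return centers, U
```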

Page 54: Clustering methods in image processing

Possibilistic C-Means Clustering

The interpretation of m differs between FCM and PCM. In FCM, increasing values of m represent increased sharing of points among all clusters, whereas in PCM they represent an increased possibility of all points in the data set completely belonging to a given cluster. Thus, the value of m that gives satisfactory performance differs between the two algorithms.

A plot of the PCM membership function for various values of m.

Page 55: Clustering methods in image processing

Possibilistic C-Means Clustering - Examples

Original image and the resulting clusters (Cluster 1, Cluster 2, Cluster 3).

Page 56: Clustering methods in image processing

Possibilistic C-Means Clustering - Examples

Original image and the resulting clusters (Cluster 1, Cluster 2, Cluster 3).

Page 57: Clustering methods in image processing

Example and Results - using FCM

Original image and noisy image, each segmented into three clusters (Cluster 1, Cluster 2, Cluster 3) using FCM.

Page 58: Clustering methods in image processing

Example and Results - using PCM

Original image and noisy image, each segmented into three clusters (Cluster 1, Cluster 2, Cluster 3) using PCM.

Page 59: Clustering methods in image processing

References

[1] M. Zarinbal, "Designing a fuzzy expert system for diagnosing the brain tumors," Amirkabir University of Technology, 2009.

[2] M. Egmont-Petersen, D. de Ridder, and H. Handels, "Image processing with neural networks—a review," Pattern Recognit., vol. 35, no. 10, pp. 2279–2301, 2002.

[3] I. Bankman, Handbook of Medical Image Processing and Analysis. Academic Press, 2008.

[4] R. Archibald, K. Chen, A. Gelb, and R. Renaut, "Improving tissue segmentation of human brain MRI through preprocessing by the Gegenbauer reconstruction method," Neuroimage, vol. 20, no. 1, pp. 489–502, 2003.

[5] B. E. Chapman, J. O. Stapelton, and D. L. Parker, "Intracranial vessel segmentation from time-of-flight MRA using pre-processing of the MIP Z-buffer: accuracy of the ZBS algorithm," Med. Image Anal., vol. 8, no. 2, pp. 113–126, 2004.

[6] A. Candolfi, R. De Maesschalck, D. Jouan-Rimbaud, P. A. Hailey, and D. L. Massart, "The influence of data pre-processing in the pattern recognition of excipients near-infrared spectra," J. Pharm. Biomed. Anal., vol. 21, no. 1, pp. 115–132, 1999.

[7] N. J. Pizzi, "Fuzzy pre-processing of gold standards as applied to biomedical spectra classification," Artif. Intell. Med., vol. 16, no. 2, pp. 171–182, 1999.

[8] D. Van De Ville, M. Nachtegael, D. Van der Weken, E. E. Kerre, W. Philips, and I. Lemahieu, "Noise reduction by fuzzy image filtering," IEEE Trans. Fuzzy Syst., vol. 11, no. 4, pp. 429–436, 2003.

[9] F. Di Martino, "An image coding/decoding method based on direct and inverse fuzzy transforms," Int. J. Approx. Reason., pp. 110–131, 2008.

Page 60: Clustering methods in image processing

References (cont.)

[10] M. Sezgin and B. Sankur, "Survey over image thresholding techniques and quantitative performance evaluation," J. Electron. Imaging, vol. 13, no. 1, pp. 146–165, 2004.

[11] H.-D. Cheng and H. Xu, "A novel fuzzy logic approach to contrast enhancement," Pattern Recognit., vol. 33, no. 5, pp. 809–819, 2000.

[12] J. C. Bezdek, J. Keller, R. Krisnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, vol. 4. Springer, 2005.

[13] L. Cinque, G. Foresti, and L. Lombardi, "A clustering fuzzy approach for image segmentation," Pattern Recognit., vol. 37, no. 9, pp. 1797–1807, 2004.

[14] J. V. de Oliveira and W. Pedrycz, Advances in Fuzzy Clustering and Its Applications. Wiley, 2007.

[15] M. Halkidi, Y. Batistakis, and M. Vazirgiannis, "On clustering validation techniques," J. Intell. Inf. Syst., vol. 17, no. 2–3, pp. 107–145, 2001.

[16] N. Belacel, P. Hansen, and N. Mladenović, "Fuzzy J-Means: a new heuristic for fuzzy clustering," Pattern Recognit., vol. 35, pp. 2193–2200, 2002.

[17] K. P. Detroja, R. D. Gudi, and S. C. Patwardhan, "A possibilistic clustering approach to novel fault detection and isolation," J. Process Control, vol. 16, no. 10, pp. 1055–1073, 2006.

[18] A. Flores-Sintas, J. Cadenas, and F. Martin, "A local geometrical properties application to fuzzy clustering," Fuzzy Sets Syst., vol. 100, no. 1, pp. 245–256, 1998.

[19] R. Krishnapuram and J. M. Keller, "A possibilistic approach to clustering," IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 98–110, 1993.

[20] A. K. Jain, M. N. Murty, and P. J. Flynn, "Data clustering: a review," ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, 1999.

[21] M. H. F. Zarandi, M. Zarinbal, and I. B. Türksen, "Type-II fuzzy possibilistic C-mean clustering," in IFSA/EUSFLAT Conf., 2009, pp. 30–3

Page 61: Clustering methods in image processing