Topology-Based Hierarchical Clustering of Self-Organizing Maps
Authors: KADIM TAŞDEMIR, PAVEL MILENOV, AND BROOKE TAPSALL (2011, IEEE)
Presenter: WU, MIN-CONG
Feb 14, 2016
Intelligent Database Systems Lab
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Hierarchical clustering uses various distance-based similarity measures, which have some flaws:
• 1. They are sensitive to inhomogeneous within-cluster density distributions, noise, and outliers.
• 2. They depend on the cluster centroids and the dispersion around these centroids.
Objectives
• Employ average linkage for hierarchical clustering of prototypes based on CONN, so that at each agglomeration step the pair with the maximum average between-cluster connectivity is merged. This method is called CONN linkage.
• Add a new criterion, CONN_Index.
Methodology-SOMs
• Adapted BMU
• Updating BMU
• Receptive fields RFi and RFij
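The quantities named on this slide can be sketched in Python. This is a minimal illustration, not the authors' implementation. It assumes a standard online SOM update with a Gaussian neighborhood on a 1-D lattice, that RFi is the set of data vectors whose best matching unit (BMU) is prototype i, and that RFij is the subset of RFi whose second BMU is prototype j. The function names and the parameters `alpha` and `sigma` are hypothetical.

```python
import numpy as np

def find_bmus(x, prototypes):
    """Return the indices of the best and second-best matching prototypes for x."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    order = np.argsort(dists)
    return int(order[0]), int(order[1])

def som_update(x, prototypes, alpha=0.1, sigma=1.0):
    """One online SOM step on an assumed 1-D lattice; returns updated prototypes."""
    bmu, _ = find_bmus(x, prototypes)
    lattice = np.arange(len(prototypes))
    # Gaussian neighborhood centered on the BMU's lattice position (assumption).
    h = np.exp(-((lattice - bmu) ** 2) / (2.0 * sigma ** 2))
    return prototypes + alpha * h[:, None] * (x - prototypes)

def receptive_fields(data, prototypes):
    """rf[i]: data indices with BMU i; rf_pair[(i, j)]: those whose second BMU is j."""
    rf = {i: [] for i in range(len(prototypes))}
    rf_pair = {}
    for n, x in enumerate(data):
        i, j = find_bmus(x, prototypes)
        rf[i].append(n)
        rf_pair.setdefault((i, j), []).append(n)
    return rf, rf_pair

# Toy data, invented for illustration only.
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
data = np.array([[0.1, 0.0], [0.4, 0.0], [0.9, 0.0], [1.6, 0.0]])
rf, rf_pair = receptive_fields(data, prototypes)
updated = som_update(data[0], prototypes)
```

Under these assumptions, the size of `rf_pair[(i, j)]` is the count that a matrix such as CADJ would store at position (i, j).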
Methodology-Connectivity Matrix
CADJ
   P1 P2 P3
P1 0 1 3
P2 2 0 1
P3 2 1 0
CONN
   P1 P2 P3
P1 0 3 5
P2 3 0 2
P3 5 2 0
CONN(P1,P2) = CADJ(P1,P2) + CADJ(P2,P1) = 1 + 2 = 3
CONN(P1,P2) = CONN(P2,P1) = 3
CONN(P1,P3) = CONN(P3,P1) = 5
CONN(P2,P3) = CONN(P3,P2) = 2
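The relation on this slide, CONN(i,j) = CADJ(i,j) + CADJ(j,i), amounts to adding CADJ to its transpose. A short NumPy check using the slide's own CADJ values (the variable names are illustrative):

```python
import numpy as np

# CADJ values as shown on the slide (rows and columns are P1, P2, P3).
cadj = np.array([
    [0, 1, 3],
    [2, 0, 1],
    [2, 1, 0],
])

# CONN is the symmetric sum of CADJ and its transpose.
conn = cadj + cadj.T
print(conn)
# [[0 3 5]
#  [3 0 2]
#  [5 2 0]]
```

The result is symmetric by construction, which is why the slide lists each CONN value twice, once per ordering of the pair.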
Methodology-CONN Linkage
Similarity matrix
   S1 S2 S3 S4
S1 0 1 2 6
S2 1 0 3 4
S3 2 3 0 5
S4 6 4 5 0
Delete (the merged pair S1 and S4, which has the maximum similarity 6)
   S2 S3
S2 0 3
S3 3 0
Add (the merged cluster N1)
   S2 S3 N1
S2 0 3 5
S3 3 0 4
N1 5 4 0
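A merge loop of this kind can be sketched as follows. This is an illustrative reading, not the authors' code. It assumes the similarity between two clusters is the average of the pairwise similarities between their members (the average-linkage rule named in the Objectives) and that the pair with the maximum similarity is merged at each step. The function name `conn_linkage` and the stopping parameter `n_clusters` are hypothetical. Because the averaging convention is an assumption, the merged-cluster similarities it computes need not match the N1 values shown on the slide.

```python
import numpy as np

def conn_linkage(sim, n_clusters):
    """Merge the most similar pair of clusters until n_clusters remain.

    sim: symmetric (p, p) similarity matrix between prototypes.
    Returns (clusters, merges), where clusters is a list of index lists and
    merges records (cluster_a, cluster_b, similarity) for each step.
    """
    sim = np.asarray(sim, dtype=float)
    clusters = [[i] for i in range(len(sim))]
    merges = []
    while len(clusters) > n_clusters:
        best, best_pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average similarity over all cross-cluster member pairs.
                avg = sim[np.ix_(clusters[a], clusters[b])].mean()
                if avg > best:
                    best, best_pair = avg, (a, b)
        a, b = best_pair
        merges.append((clusters[a], clusters[b], best))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, merges

# Similarity matrix from the slide (indices 0..3 correspond to S1..S4).
sim = [
    [0, 1, 2, 6],
    [1, 0, 3, 4],
    [2, 3, 0, 5],
    [6, 4, 5, 0],
]
clusters, merges = conn_linkage(sim, n_clusters=3)
print(clusters)  # [[1], [2], [0, 3]]
```

The first merge joins S1 and S4, matching the slide's Delete step. This sketch recomputes averages from the original matrix at every step, which is simple but not the most efficient approach.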
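Methodology-Number of clusters
[Table of six cluster validity indices and their selection criteria; the index names were not recoverable. Criteria as listed: minimum is better (one index), maximum is better (four indices), nearest to 1 is better (one index).]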
Methodology-Applicability and Complexity of the Algorithm
• For CONN to represent the data topology (the Delaunay graph), the prototypes must be dense enough.
occasionally
CONN Linkage’s time complexity = O(p^2*d)
Average Linkage’s time complexity = O(p^3*d)
Experiment
Conclusions
• CONN linkage produces better partitionings than those obtained by distance-based linkages.
• CONN_Index, based on the CONN graph, provided better decisions than the other indices in the study reported in this paper.
Comments
• Advantages
– CONN_Index provides the best partitioning for the preset number of clusters.
– CONN linkage is mainly proposed for accurate clustering of remote sensing imagery.
• Applications
– Hierarchical clustering.