Self-Organizing Maps (SOM)
KV Spezielle Kapitel aus Informatik: Exploratory Data Analysis Self-Organizing Maps
Visualizing the SOM
- SOM Grid
- Music Description Map (MDM)
- Bar Plots and Chernoff Faces
- U-Matrix and Distance Matrix
- Smoothed Data Histogram (SDH)
- Component Planes
Growing Hierarchical SOM
Aligned SOM
Different Topologies / Grid Structures

Hexagonal grid:
+ tighter relationship between clusters
+ more connections
+ grid structure fits the Gaussian structure in the neighborhood kernel calculation (centroids of neighboring map units are equidistant)

Rectangular grid:
+ easier to implement
– diagonally neighboring map units do not perfectly fit the Gaussian neighborhood function
The Online SOM Training Algorithm (one possible variant)

Input:
• map of units ui with model vectors mi ("codebook")
• training instances X = {xi}
• a similarity measure sim(·,·) between data items (e.g., Euclidean distance)
• parameters: learning rate α(t) ∈ [0, 1] and a neighborhood kernel function with parameter r(t) ('neighborhood radius'), e.g., pseudo-Gaussian:
  uij(t) = exp(−dij² / r(t)²)   (dij = map distance between ui and uj)

On-line learning:
Initialize each unit (model vector) mi to represent a randomly selected data item.
Loop over time steps t, until convergence:
1. Randomly select an example x
2. Find the 'winning unit' (best matching unit) uc with mc = arg maxi sim(mi, x)
3. Adapt the model vectors of all units:
   mi(t+1) = mi(t) + α(t) · uic(t) · [x − mi(t)]
4. Update (decrease) the training parameters α(t), r(t)
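The steps above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the linear decay schedules for α(t) and r(t), the rectangular grid, and all default parameter values are assumptions. Since the similarity measure is Euclidean distance, the BMU is found by minimizing the distance.

```python
import numpy as np

def train_som_online(X, rows, cols, n_steps=1000, alpha0=0.5, r0=None, seed=0):
    """Sketch of the online SOM training loop (decay schedules are assumptions)."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    if r0 is None:
        r0 = max(rows, cols) / 2.0
    # initialize each model vector with a randomly selected data item
    M = X[rng.integers(0, n, size=rows * cols)].astype(float).copy()
    # map-space coordinates of the units (rectangular grid)
    coords = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    for t in range(n_steps):
        alpha = alpha0 * (1.0 - t / n_steps)      # decreasing learning rate
        r = max(r0 * (1.0 - t / n_steps), 0.5)    # decreasing neighborhood radius
        x = X[rng.integers(0, n)]                 # 1. random example
        # 2. BMU = unit with minimal Euclidean distance to x
        c = np.argmin(np.linalg.norm(M - x, axis=1))
        # pseudo-Gaussian kernel u_ic(t) = exp(-d_ic^2 / r(t)^2)
        d2 = np.sum((coords - coords[c]) ** 2, axis=1)
        u = np.exp(-d2 / r ** 2)
        # 3. adapt all model vectors towards x
        M += alpha * u[:, None] * (x - M)
    return M.reshape(rows, cols, dim)
```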
SOM: Illustration

Input:
• map of units ui with model vectors mi
• training instances X = {xi}
• a similarity measure between examples (e.g., Euclidean distance)
• a neighborhood kernel function with parameter r(t) ('neighborhood radius')

Properties of the trained map:
• each data point (example) x uniquely 'belongs to' a unit (the BMU of x)
• relationship between units: neighboring units cover similar data items
• non-uniform distances between model vectors, but uniform distances in the visualization
• "interpolation units" (units with no data associated) are possible
Initialization of the Model Vectors

• Random initialization:
  - random values in the same range as X (between min and max of each dimension), or
  - randomly select data items from X and assign them to the model vectors mi
  + fast
  – mapping not consistent across different runs

• Linear initialization:
  - perform an eigendecomposition of the autocorrelation matrix of X → PCA
  - the top 2 eigenvectors (those with the largest eigenvalues) span a 2-dimensional subspace
  - initialize the model vectors along these eigenvectors
  → a predefined linear mapping to start with
  + mapping consistent across different runs (up to rotation / mirroring)
  – computationally more complex
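A sketch of the linear initialization, with two assumptions not stated above: the autocorrelation matrix is computed as the covariance matrix of the mean-centered data, and the grid is scaled by the square roots of the top eigenvalues.

```python
import numpy as np

def linear_init(X, rows, cols):
    """PCA-based linear initialization of the SOM codebook (sketch)."""
    mean = X.mean(axis=0)
    # eigendecomposition of the covariance matrix of the centered data
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    e1, e2 = eigvecs[:, -1], eigvecs[:, -2]          # top-2 eigenvectors
    s1, s2 = np.sqrt(eigvals[-1]), np.sqrt(eigvals[-2])
    # spread the model vectors linearly along the spanned 2-D subspace
    a = np.linspace(-1, 1, rows)[:, None, None]
    b = np.linspace(-1, 1, cols)[None, :, None]
    return mean + a * s1 * e1 + b * s2 * e2          # shape (rows, cols, dim)
```

Because the eigenvectors are deterministic (up to sign), this start configuration is the same across runs, which is exactly the consistency advantage noted above.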
Visualizing the SOM

Visualizing attribute distributions on top of a learned SOM:
• Component planes: visualize the feature values of the model vectors associated with the map units (or feature values averaged over all instances covered by a unit)
• Bar charts or Chernoff faces: visualize all dimensions of the model vectors for each map unit in one plot

[Vesanto, 1999], [Vesanto, 2002]
Chernoff faces: a psychologically motivated visualization method (people can quickly grasp a face's expression).

Each attribute value (dimension in data space) is mapped to a specific property of the Chernoff face (e.g., the mouth's contour, the height/width of the face, the slope of the ears, …).
MDM (II): Connecting Similar Map Units

1. Sort all units by the G2 values of the terms they contain → U
2. Remove the highest-ranked unit u ∈ U and find similarly labeled units among u's neighbors: if the cosine similarity between the label vectors of map unit u and a neighbor i exceeds a threshold θ, aggregate u and i
3. Go to 2.
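A hypothetical sketch of the aggregation loop above. All data structures (`labels`, `g2_scores`, `neighbors`) are assumptions, and the aggregation condition is read as "label vectors at least θ-similar", since the stated goal is to join similarly labeled units.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two label vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def aggregate_units(labels, g2_scores, neighbors, theta=0.5):
    """Greedily aggregate similarly labeled neighboring map units (sketch).
    labels: one label vector per unit; neighbors: unit index -> grid neighbors."""
    # 1. sort all units by descending G2 score
    order = sorted(range(len(labels)), key=lambda u: -g2_scores[u])
    groups, assigned = [], set()
    for u in order:                          # 2./3. pop the highest-ranked unit
        if u in assigned:
            continue
        group = [u]
        assigned.add(u)
        for i in neighbors[u]:               # merge sufficiently similar neighbors
            if i not in assigned and cosine(labels[u], labels[i]) >= theta:
                group.append(i)
                assigned.add(i)
        groups.append(group)
    return groups
```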
Smoothed Data Histograms (SDH)

• display a smoothed density of the data items associated with areas of the map
• reveal clusters in the data
• many pieces associated with a unit → cluster center

Idea for smoothing / density estimation:
• a voting matrix whose size equals the size of the SOM
• data items "vote" for a number N of best-matching units
• the best-matching unit gets N points, the 2nd best gets N−1 points, …, the N-th best gets 1 point; all other units get 0 points (N is a parameter, the 'spread')
• the distribution of votes is visualized over the entire map, e.g., via a color map (interpolated voting matrix for smoothing)
[Pampalk et al., 2002]
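The voting scheme can be sketched directly; the function name and the representation of the codebook as a `(rows, cols, dim)` array are assumptions.

```python
import numpy as np

def sdh_votes(X, M, spread=3):
    """SDH voting matrix (sketch): each data item gives `spread` points to its
    best-matching unit, spread-1 to the 2nd best, ..., 1 to the spread-th best."""
    rows, cols, dim = M.shape
    flat = M.reshape(-1, dim)
    votes = np.zeros(rows * cols)
    for x in X:
        d = np.linalg.norm(flat - x, axis=1)
        best = np.argsort(d)[:spread]              # the N best-matching units
        votes[best] += np.arange(spread, 0, -1)    # N, N-1, ..., 1 points
    return votes.reshape(rows, cols)
```

Interpolating this matrix (e.g., when rendering it as a color map) yields the smoothed density display described above.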
• Input: a music collection (digital audio files)
• calculate audio features for each track, e.g.,
  - rhythmic [Pampalk, Islands of Music: Analysis, Organization, and Visualization of Music Archives, Diploma Thesis, 2001]
  - timbral [Mandel & Ellis, Song-Level Features and Support Vector Machines for Music Classification, ISMIR 2005]
• train a SOM on the audio features
• calculate an SDH on the SOM
• visualize the SDH in 3D, using the smoothed voting matrix of the SDH as height values
• build a game-like user interface to explore the user's (or someone else's) music collection
The GHSOM Algorithm

Start with 1 unit to expand (= mean of the data), level 0.
Loop until there are no more units to expand:
1. For each unit to expand, create a new 2x2 SOM (initialize its orientation)
2. Train the SOM on the data assigned to its 'parent unit'
3. Decision 1: insert a new row or column?
   If yes: insert the new row/column and go to 2
4. Decision 2: hierarchically expand units of the map?
   If yes: add those units to the expand list

Decision 1: insert a new row or column if the mean quantization error > threshold (i.e., the map does not represent the data well); insert the new row or column between the unit with the highest quantization error and its adjacent unit with the largest distance.

Decision 2: expand a unit if the quantization error of the unit > threshold (i.e., the unit does not represent its associated data items well).

Parameters: the same as for the SOM (except the number of units), plus 2 thresholds τ1, τ2.
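The two decisions can be sketched as follows. The representation of units as `(model_vector, assigned_data)` pairs and the shared reference error `mqe0` are assumptions for illustration, not the exact formulation of [Dittenbach et al., 2002].

```python
import numpy as np

def quantization_error(m, data):
    """Mean distance of the assigned data items to model vector m."""
    return float(np.mean(np.linalg.norm(data - m, axis=1))) if len(data) else 0.0

def ghsom_decisions(units, mqe0, tau1, tau2):
    """Sketch of GHSOM Decisions 1 and 2 relative to a reference error mqe0."""
    errors = [quantization_error(m, d) for m, d in units]
    # Decision 1: grow the map while its mean quantization error is too high
    grow_map = np.mean(errors) > tau1 * mqe0
    # Decision 2: expand every unit that still represents its data badly
    expand = [i for i, e in enumerate(errors) if e > tau2 * mqe0]
    return grow_map, expand
```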
The GHSOM Algorithm: Decisions 1 (enlarge map) and 2 (insert new map)

Decision 1: insert a new row or column if MMQE > τ1 · MQE0, where MQE0 is the MQE of a virtual unit m0 representing the mean of all instances covered by the parent unit:

  m0 = (1/n) · Σi xi   (n = number of instances covered by the parent unit)

(In contrast, MQE0* is the mean quantization error of the whole dataset with respect to a virtual unit located in the center of the whole dataset, whereas MQE0 refers to the data items in the respective sub-branch of the GHSOM.)

Generally, τ1 and τ2 are chosen such that 1 > τ1 >> τ2 > 0.
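A tiny worked example of this quantity: the virtual unit m0 is the mean of the covered instances, and MQE0 is their mean distance to m0. The data values below are purely illustrative.

```python
import numpy as np

# Four instances at the corners of a square with side length 2
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
m0 = X.mean(axis=0)                              # m0 = (1/n) * sum_i x_i
mqe0 = np.mean(np.linalg.norm(X - m0, axis=1))   # mean quantization error
```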
Problem:
• the maps of the descendants of a unit ui could have an arbitrary orientation
→ no visible relationship between different sub-branches (other than the common parent map)

Solution:
• enforce/encourage a specific orientation of the sub-layer SOMs via initialization
• initialize the model vectors of the 2x2 SOMs such that they correspond to the orientation of the parent map
• for example: initialize the 4 model vectors with the means of the parent vector and each of its 4 immediate neighbors
• for border units: extrapolate "virtual" units. Example: if ui is located on the left border and the unit to its right is ur, create a virtual left neighbor ul with ml = mi + (mi − mr)
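The border-unit extrapolation from the last bullet, as a two-line NumPy check (the numeric model-vector values are illustrative):

```python
import numpy as np

# Unit u_i on the left border, with right neighbor u_r.
m_i = np.array([1.0, 2.0])
m_r = np.array([1.5, 2.5])
# Mirror the right neighbor to create a virtual left neighbor u_l:
m_l = m_i + (m_i - m_r)
```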
Exercise: How could the initialization function for the codebook of a new sub-level SOM be expressed as a weighted combination of the parent unit and its neighbors?
Visualizing Effects of Changes in Data Definition: Aligned SOMs

• Goal: understand the relationship between different ways of representing the same data
• layers of mutually constrained SOMs (i.e., a stack of SOMs)
• each layer is trained on a slightly different data space / view of the data (i.e., different dimensions or distance definitions), but on the same data items
• trained so that all layers have the same orientation
• constraints between layers enforce smooth transitions between views

Use:
• exploratory analysis of alternative data representations
• visualize changes in the inherent structure of the data in response to changes in features, relative feature weights, different ways of normalizing features, different similarity functions, …
→ navigation through alternative data spaces

[Pampalk et al., 2003]
Aligned SOMs: The Basic Architecture

• parameter values define the different views of the data
• the distance between layers is set relative to the distance between units in the same layer, e.g.:
  - intra-SOM distance between neighboring units = 1
  - inter-SOM distance "between" the same map unit in adjacent layers = 1/5
Aligned SOM: Training (online version, simplified)

– Randomly select a training instance x and a layer l
– Find the best matching unit for x in l
– Adapt the neighborhood of the best matching unit (both the intra-layer neighborhood within the layer and the inter-layer neighborhood between layers)
Aligned SOM: On-line Learning

Input:
• map of units uli with model vectors mli ("codebook"), l … layer
• training instances X = {xi}
• a similarity measure sim(·,·) between data items (e.g., Euclidean distance)
• parameters: learning rate α(t) ∈ [0, 1] and a neighborhood kernel function with parameter r(t) ('neighborhood radius'), e.g., pseudo-Gaussian:
  uij(t) = exp(−dij² / r(t)²)   (dij = map distance between uli and ukj)

Initialize each unit (model vector) mli to represent a randomly selected data item (features weighted according to layer-specific weights, e.g., from 1:0 to 0:1).

Loop over time steps t, until convergence:
1. Randomly select an example x and a layer l; apply the weighting according to the view/data space of l → xl
2. Find the 'winning unit' (best matching unit) uc with mc = arg maxi sim(mli, xl)
3. Adapt the model vectors of all units in all layers:
   mli(t+1) = mli(t) + α(t) · uic(t) · [xl − mli(t)]
4. Update (decrease) the training parameters α(t), r(t)
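A NumPy sketch of this loop, under several assumptions: each layer's view is supplied as its own pre-weighted data matrix, the inter-layer distance (default 1/5, matching the example earlier) simply adds to the squared map distance inside the kernel, and each layer is adapted towards its own view of the selected data item. The decay schedules and defaults are also assumptions.

```python
import numpy as np

def train_aligned_soms(X_views, rows, cols, n_steps=1000, alpha0=0.5,
                       r0=None, layer_dist=0.2, seed=0):
    """Sketch of aligned-SOM online training over a stack of layers."""
    rng = np.random.default_rng(seed)
    L = len(X_views)
    n, dim = X_views[0].shape
    if r0 is None:
        r0 = max(rows, cols) / 2.0
    coords = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    # same initial data items in every layer, each weighted per view
    idx0 = rng.integers(0, n, size=rows * cols)
    M = [Xv[idx0].astype(float).copy() for Xv in X_views]
    for t in range(n_steps):
        alpha = alpha0 * (1.0 - t / n_steps)
        r = max(r0 * (1.0 - t / n_steps), 0.5)
        k = rng.integers(0, n)                    # 1. random instance ...
        l = rng.integers(0, L)                    #    ... and random layer
        x = X_views[l][k]
        # 2. BMU in layer l (Euclidean distance)
        c = np.argmin(np.linalg.norm(M[l] - x, axis=1))
        d2_map = np.sum((coords - coords[c]) ** 2, axis=1)
        # 3. adapt all layers, adding the inter-layer distance to the kernel
        for m in range(L):
            d2 = d2_map + ((m - l) * layer_dist) ** 2
            u = np.exp(-d2 / r ** 2)
            M[m] += alpha * u[:, None] * (X_views[m][k] - M[m])
    return [Mm.reshape(rows, cols, dim) for Mm in M]
```

Because neighboring layers receive almost the full update of the selected layer, the layers keep a common orientation while still reflecting their own views.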
Literature

SOM:
[Kohonen, 1982]: Kohonen, T. Self-Organizing Formation of Topologically Correct Feature Maps. Biological Cybernetics, 43:59–69.
[Kohonen, 2001]: Kohonen, T. Self-Organizing Maps, volume 30 of Springer Series in Information Sciences. Springer, Berlin, Germany, 3rd edition.
[Vesanto, 1999]: Vesanto, J. SOM-Based Data Visualization Methods. Intelligent Data Analysis, 3(2):111–126.
[Vesanto, 2002]: Vesanto, J. Data Exploration Process Based on the Self-Organizing Map. PhD thesis, Helsinki University of Technology, Espoo, Finland.
[Pampalk et al., 2002]: Pampalk, E., Rauber, A., and Merkl, D. Using Smoothed Data Histograms for Cluster Visualization in Self-Organizing Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 2002), Madrid, Spain. Springer.
[Knees et al., 2006]: Knees, P., Pohle, T., Schedl, M., and Widmer, G. Automatically Describing Music on a Map. In Proceedings of the 2nd Workshop on Learning the Semantics of Audio Signals (LSAS 2008), Paris, France, June 2008.
[Kaski et al., 1998]: Kaski, S., Honkela, T., Lagus, K., and Kohonen, T. WEBSOM – Self-Organizing Maps of Document Collections. Neurocomputing, 21, 1998.
[Lagus, Kaski, 1999]: Lagus, K. and Kaski, S. Keyword Selection Method for Characterising Text Document Maps. In Proceedings of the International Conference on Artificial Neural Networks (ICANN 1999), London, UK.
Literature (II)

GHSOM:
[Dittenbach et al., 2002]: Dittenbach, M., Rauber, A., and Merkl, D. Uncovering Hierarchical Structure in Data Using the Growing Hierarchical Self-Organizing Map. Neurocomputing, 48(1–4):199–216.

Aligned SOM:
[Pampalk et al., 2003]: Pampalk, E., Goebl, W., and Widmer, G. Visualizing Changes in the Structure of Data for Exploratory Feature Selection. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2003).