Page 1:

Motivation Introduction ICA Application ICA Dependencies Summary

Independent Component Analysis for Feature Extraction

Carmen Klaussner

LCT Language and Communication Technology, University of Groningen

April 25th, 2013

Page 2:

Independent Component Analysis (ICA)

Independent Component Analysis (ICA) is an unsupervised statistical technique used for:

- separating a multivariate signal into independent subcomponents (blind source separation, BSS)

- revealing underlying latent concepts in feature extraction

Page 3:

ICA and the Cocktail-Party Problem

- Imagine two different speakers in a room

- Two microphones placed at different locations about the room

- The microphones are recording mixtures of the various speech signals

Image source: http://storage.blogues.canoe.ca/davidakin/200811141920.jpg

Page 4:

ICA and the Cocktail-Party Problem cont’d

Figure: ICA Model

Image source: http://www.imodenergy.com/images/courses/imode201/slide03.jpg

Page 5:

ICA Model

The recordings yield the mixed signals x1(t), x2(t):

x1(t) = a11 s1(t) + a12 s2(t)

x2(t) = a21 s1(t) + a22 s2(t)

- where x1 and x2 are the amplitudes and t is the time index

- each recorded signal is a weighted sum of the original speech signals of the two speakers, denoted by s1(t) and s2(t)

- a11, a12, a21, and a22 are parameters that depend on the distances of the microphones from the speakers

- Assume that s1(t) and s2(t) are statistically independent at each time instant t

- Given only the mixed signals x1, x2 ⇒ retrieve the original speech signal of each speaker: s1(t), s2(t)
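The mixing step above can be sketched numerically; the two "speech" waveforms and the coefficients a11..a22 below are illustrative assumptions, not data from the talk:

```python
import numpy as np

# Hypothetical sources: two "speakers" sampled at 1000 time points.
t = np.linspace(0, 1, 1000)
s1 = np.sin(2 * np.pi * 5 * t)             # speaker 1: a sinusoid
s2 = np.sign(np.sin(2 * np.pi * 3 * t))    # speaker 2: a square wave

# Mixing parameters a11..a22 stand in for microphone-speaker distances.
A = np.array([[0.8, 0.3],
              [0.4, 0.7]])

# Each microphone records a weighted sum of both sources:
# x1(t) = a11*s1(t) + a12*s2(t),  x2(t) = a21*s1(t) + a22*s2(t)
x1 = A[0, 0] * s1 + A[0, 1] * s2
x2 = A[1, 0] * s1 + A[1, 1] * s2
```

ICA's task is then to recover s1 and s2 from x1 and x2 alone, without knowing A.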

Page 6:

ICA Model cont’d

x = As

with:

- x = (x1, x2, ..., xn)^T is a vector of observed random variables

- s = (s1, s2, ..., sn)^T is the vector of latent variables (the independent components)

- A is the unknown constant mixing matrix

- the number of components is arbitrary, at most equal to the number of samples

Aim of the algorithm: find W = A^(-1), so that we obtain the independent components by:

s = Wx
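A minimal sketch of this pipeline, using the FastICA implementation from scikit-learn (the sources and the mixing matrix below are made up for illustration; estimated components come back only up to sign, scale, and permutation):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
# Two non-Gaussian sources, 2000 samples each (rows = samples).
S = np.c_[np.sin(np.linspace(0, 8, 2000)),
          rng.laplace(size=2000)]
A = np.array([[1.0, 0.5],
              [0.6, 1.2]])         # mixing matrix

X = S @ A.T                        # x = As, applied sample-wise

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)       # estimated sources, s = Wx
W = ica.components_                # estimated unmixing matrix W

# Each estimated component should correlate strongly with one true
# source (up to sign and permutation).
C = np.abs(np.corrcoef(S.T, S_hat.T))[:2, 2:]
print(C.max(axis=1))               # both values close to 1
```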

Page 7:

ICA Feature Extraction on Text Documents

x =

             D1023.txt  D1392.txt  D1394.txt  D1400.txt  D1406.txt  ...
able              73         2         1        32         7       ...
about            684        10        32       319        40       ...
above             51         2         4        31         4       ...
abroad            13         1         0        10         0       ...
absence           14         0         0         6         0       ...
absolutely         6         0         0         7         1       ...
accept            23         0         1         5         2       ...
accepted          14         1         0         7         2       ...
accident          11         0         1         9         0       ...
...

ICA Interpretation

- documents are linear mixtures of concepts

- each term is a mixed signal/observation xi at a different time index t (here: time index = document)

- the source signals s are the latent concepts (independent components)

- aim is to find the latent concepts, i.e. a concept-based document representation

Page 8:

ICA on Text Documents

x = As becomes:

X[term × document] = A[term × concept] * S[concept × document]

s = Wx becomes:

S[concept × document] = W[concept × term] * X[term × document]

- S is a new data representation that combines terms into latent concepts

- A, the mixing matrix, assigns a weight to each term in each component

- the term-by-document matrix is unmixed to yield the 'original' concept-by-document mapping
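As a sketch of this decomposition, the counts shown earlier (truncated to a toy 6×5 matrix) can be fed to FastICA; the choice of two components here is an arbitrary assumption:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy term-by-document count matrix (first rows/columns of the example).
terms = ["able", "about", "above", "abroad", "absence", "absolutely"]
X = np.array([[ 73,  2,  1,  32,  7],
              [684, 10, 32, 319, 40],
              [ 51,  2,  4,  31,  4],
              [ 13,  1,  0,  10,  0],
              [ 14,  0,  0,   6,  0],
              [  6,  0,  0,   7,  1]], dtype=float)

# Terms are the observed signals and documents play the role of the
# time index, so scikit-learn sees documents as samples: pass X.T.
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
S = ica.fit_transform(X.T).T   # S: concept-by-document representation
A = ica.mixing_                # A: term-by-concept weight matrix
print(S.shape, A.shape)        # (2, 5) (6, 2)
```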

Page 9:

ICA Output Example

x =

             D1023.txt  D1392.txt  D1394.txt  D1400.txt  D1406.txt  ...
able              73         2         1        32         7       ...
about            684        10        32       319        40       ...
above             51         2         4        31         4       ...
abroad            13         1         0        10         0       ...
absence           14         0         0         6         0       ...
absolutely         6         0         0         7         1       ...
...

= A * s, with

A =

                   c1            c2            c3            c4            c5        ...
able          0.076106808  -0.014558451  -0.10733842   -0.091537869  -0.0712187592  ...
about         0.168884358  -0.013135861  -0.04944864   -0.045366695  -0.0675653686  ...
above        -0.087822012  -0.025989498   0.05227958   -0.002340966  -0.0181397638  ...
abroad       -0.141542609   0.020390763   0.07750117   -0.040127687   0.0002770738  ...
absence      -0.002402465  -0.134321250   0.04981664    0.140644925  -0.1017302731  ...
absolutely    0.002845907  -0.004149262  -0.01830506    0.047701236  -0.0910047210  ...
...

s =

       D1023.txt    D1392.txt    D1394.txt    D1400.txt    D1406.txt   ...
c1     1.000000    -1.000053     1.000000     1.000000     1.000000    ...
c2    -1.068787    -1.026944     0.9187293   -1.068788    -1.068790    ...
c3    -1.000675    -0.9531389    0.9558447    1.002504    -1.000675    ...
c4    -1.038625    -0.8975203    0.8958735   -1.151772     0.9906527   ...
c5    -0.9303368    0.9171785   -0.9577544    1.164191     1.081455    ...
...

Page 10:

My Master Thesis: Dickens’ Style Analysis

Find characteristic terms of Charles Dickens compared to his contemporary, the writer Wilkie Collins.

Dickens' keywords: lot, release, answering, ive, sunk, softened, beside, examined, seven, brothers, wear, eleven, correct, path, watched, sorrow, treated, sounds, masters, oclock, upon, lean, reality, song...

Collins' keywords: gentle, fate, sweet, contrast, forth, whom, changes, strong, art, disturb, ventured, sorrow, blessing, parties, faded, imagination, towards, moon, portrait, daily, guide, game, although, lot, building, learn, visits, pay, animal, humanity...

Page 11:

ICA Characteristic Terms Extraction

1. use ICA on the term-by-document matrix to extract term concepts

2. extract weights for each keyword in each document

3. select characteristic terms for each document set

4. test the generalisation ability of each term list

terms in documents (individual) ⇒ concepts in documents (global)
⇓
terms over authors
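Steps 1 to 3 can be sketched as follows; the vocabulary, the random toy corpus, and the "top 3 weights per concept" selection rule are all illustrative assumptions, not the thesis setup:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
terms = [f"term{i}" for i in range(20)]           # hypothetical vocabulary
X = rng.poisson(5, size=(20, 12)).astype(float)   # toy term-by-document counts

# 1. ICA on the term-by-document matrix: rows of X are observed
#    signals, documents play the role of the time index.
ica = FastICA(n_components=3, random_state=1, max_iter=1000)
ica.fit(X.T)
A = ica.mixing_                                    # term-by-concept weights

# 2.-3. For each concept, keep the terms with the largest absolute
#       weight as that concept's characteristic terms.
for c in range(A.shape[1]):
    top = np.argsort(-np.abs(A[:, c]))[:3]
    print(f"concept {c}:", [terms[i] for i in top])
```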

Page 12:

Principal Component Analysis (PCA)

- Principal component analysis (PCA) finds the directions of maximum variance in the data

- Reduction of the feature space by selecting those directions explaining most of the variance

- Decorrelation of features, so that the new data representation only varies within each feature

- Works best on Gaussian distributions
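A minimal PCA sketch on synthetic 2-D data, assuming the standard SVD route (the data and its mixing are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic correlated 2-D data: most variance lies along one direction.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0],
                                          [1.0, 0.5]])

# PCA via SVD of the centered data: the rows of Vt are the directions
# of maximum variance, ordered by the variance they explain.
Xc = X - X.mean(axis=0)
U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_var = sing**2 / (len(X) - 1)

# Projecting onto the principal directions decorrelates the features:
Z = Xc @ Vt.T
cov = np.cov(Z.T)    # off-diagonal entries are (numerically) zero
```

Dropping all but the leading columns of Z gives the reduced feature space the slide describes.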

Page 13:

ICA and PCA: a comparison

- ICA is computationally superior to PCA
  - may not generally be superior (depending on the application)

- PCA acts as a preprocessing method for ICA

Figure: PCA vs. ICA

Image source: http://www.sciencedirect.com/science/article/pii/S0957417406001308

Page 14:

ICA Ambiguity

- components are extracted "randomly", depending on the initial weights

- components are not ranked as in PCA

- ambiguity of the variance and sign of the independent components

- how many components to extract for an application?

Given only the mixed signals and the assumption of statistical independence of the estimated signals ⇒ ICA retrieves the original sources

Page 15:

Objective Function and Statistical Independence

Statistical independence of two random variables y1, y2:

p(y1, y2) = p(y1)p(y2).

Measures of Statistical Independence:

- Minimization of mutual information
  - Kullback-Leibler divergence and maximum entropy

- Maximization of non-Gaussianity
  - Kurtosis and negentropy
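Kurtosis, the simplest of these non-Gaussianity measures, can be sketched directly (the choice of distributions below is just for illustration):

```python
import numpy as np

def excess_kurtosis(y):
    """Excess kurtosis: 0 for a Gaussian, nonzero for non-Gaussian signals."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0

rng = np.random.default_rng(0)
print(excess_kurtosis(rng.normal(size=100_000)))   # ≈ 0    (Gaussian)
print(excess_kurtosis(rng.laplace(size=100_000)))  # ≈ 3    (super-Gaussian)
print(excess_kurtosis(rng.uniform(size=100_000)))  # ≈ -1.2 (sub-Gaussian)
```

FastICA exploits exactly this: it rotates the whitened data to make each component's non-Gaussianity as large as possible.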

Page 16:

What is Statistical Independence?

- Intuitively, statistical independence of two signals means that, at each time point, signal 1 does not give any information about the position of signal 2, and vice versa

⇒ consequently, permuting the values of one signal, and thus changing the mapping at each time point, should not have any effect
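This permutation intuition can be checked numerically with made-up signals; correlation is only a crude proxy for full statistical independence, but it illustrates the point:

```python
import numpy as np

rng = np.random.default_rng(0)
s1 = rng.normal(size=50_000)
s2_indep = rng.laplace(size=50_000)            # independent of s1
s2_dep = s1 + 0.5 * rng.normal(size=50_000)    # clearly depends on s1

# Permuting one signal destroys the time-point mapping. For the
# independent pair nothing measurable changes; for the dependent
# pair the association disappears.
perm = rng.permutation(50_000)
print(np.corrcoef(s1, s2_indep)[0, 1])         # ≈ 0
print(np.corrcoef(s1, s2_indep[perm])[0, 1])   # still ≈ 0
print(np.corrcoef(s1, s2_dep)[0, 1])           # ≈ 0.89
print(np.corrcoef(s1, s2_dep[perm])[0, 1])     # ≈ 0 after permuting
```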

Figure: Mapping of two independent signals

Page 17:

So... Independent Component Analysis...

- is a method for blind source separation and feature extraction

- given only the mixed signals and the statistical independence assumption, estimates the original sources or latent variables

- is computationally expensive, so it is best to try similar but simpler methods first

Page 18:

Where to find ICA

There are different implementations of ICA: Infomax, JADE, ..., FastICA

- Implementations of the FastICA algorithm:
  - For R: http://cran.r-project.org/web/packages/fastICA/index.html
  - For Matlab: http://research.ics.aalto.fi/ica/fastica/

Page 19:

Thank You!

Questions?

Page 20:

References I

Altangerel Chagnaa, Cheol-young Ock, Chang-beom Lee, and Purev Jaimai. Feature Extraction of Concepts by Independent Component Analysis, 2007.

Timo Honkela and Aapo Hyvärinen. Linguistic Feature Extraction using Independent Component Analysis. In Proceedings of IJCNN'04, pages 279-284, Budapest, Hungary, July 2004.

Aapo Hyvärinen and Erkki Oja. Independent component analysis: algorithms and applications. Neural Networks, 13:411-430, 2000.

T. Kolenda, L. K. Hansen, and S. Sigurdsson. Independent Components in Text, 2000.

Page 21:

References II

Jonathon Shlens. A Tutorial on Principal Component Analysis. Technical report, Systems Neurobiology Laboratory, Salk Institute for Biological Studies, December 2005.