Gaussian Mixture Example: Start After First Iteration.
Post on 21-Dec-2015
Transcript
[Figure slides: Gaussian Mixture Example, showing cluster assignments at the start and after iterations 1, 2, 3, 4, 5, 6, and 20.]
A Gaussian Mixture Model for Clustering

Assume that data are generated from a mixture of Gaussian distributions.
For each Gaussian distribution: center μ_i; variance σ² (ignore; treated as known).
For each data point: determine its membership z_ij, i.e., whether x_i belongs to the j-th cluster.
Learning Gaussian Mixture Model (with the known covariance)

\[
\Pr(X = x_i) = \sum_{\mu_j} \Pr(X = x_i, \mu = \mu_j)
= \sum_{\mu_j} \Pr(\mu = \mu_j)\,\Pr(X = x_i \mid \mu = \mu_j)
= \sum_{\mu_j} \Pr(\mu = \mu_j)\,(2\pi\sigma^2)^{-d/2} \exp\!\left(-\frac{\|x_i - \mu_j\|^2}{2\sigma^2}\right)
\]
Log-likelihood of Data

\[
\sum_i \log \Pr(X = x_i)
= \sum_i \log \left[ \sum_{\mu_j} \Pr(\mu = \mu_j)\,(2\pi\sigma^2)^{-d/2} \exp\!\left(-\frac{\|x_i - \mu_j\|^2}{2\sigma^2}\right) \right]
\]

Apply MLE to find the optimal parameters.
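As a minimal sketch, the log-likelihood above can be evaluated directly with NumPy. The function name `gmm_log_likelihood` and the toy data, centers, priors, and variance are illustrative assumptions, not values from the slides:

```python
import numpy as np

def gmm_log_likelihood(X, mu, p_mu, sigma2):
    """Sum over i of log sum_j p(mu_j) * N(x_i; mu_j, sigma2 * I)."""
    n, d = X.shape
    # Squared distances ||x_i - mu_j||^2, shape (n, k)
    sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    # Per-point, per-component joint density p(mu_j) * p(x_i | mu_j)
    dens = p_mu[None, :] * (2 * np.pi * sigma2) ** (-d / 2) * np.exp(-sq / (2 * sigma2))
    # Log-likelihood: sum_i log sum_j (joint density)
    return np.log(dens.sum(axis=1)).sum()

# Toy example: three 2-D points, two components with equal priors
X = np.array([[0.0, 0.0], [0.1, -0.1], [5.0, 5.0]])
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
p_mu = np.array([0.5, 0.5])
print(gmm_log_likelihood(X, mu, p_mu, 1.0))
```

In practice this should be computed in log space (a log-sum-exp) to avoid underflow when points are far from every center; the direct form above is enough to mirror the formula.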
Learning a Gaussian Mixture (with known covariance)

\[
p(x = x_i) = \sum_j p(x = x_i, \mu = \mu_j)
= \sum_j p(\mu_j)\, p(x = x_i \mid \mu = \mu_j)
= \sum_j p(\mu_j)\,\frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left(-\frac{\|x_i - \mu_j\|^2}{2\sigma^2}\right)
\]

E-Step:
\[
E[z_{ij}] = p(\mu = \mu_j \mid x = x_i)
= \frac{p(x = x_i \mid \mu = \mu_j)\, p(\mu_j)}{\sum_{n=1}^{k} p(x = x_i \mid \mu = \mu_n)\, p(\mu_n)}
= \frac{\exp\!\left(-\frac{1}{2\sigma^2}\|x_i - \mu_j\|^2\right) p(\mu_j)}{\sum_{n=1}^{k} \exp\!\left(-\frac{1}{2\sigma^2}\|x_i - \mu_n\|^2\right) p(\mu_n)}
\]
Learning Gaussian Mixture Model

M-Step:
\[
\mu_j = \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]},
\qquad
p(\mu_j) = \frac{1}{m} \sum_{i=1}^{m} E[z_{ij}]
\]
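The E-step and M-step above can be sketched as one NumPy function. This is a minimal illustration assuming a known, shared spherical covariance σ²I; the names (`em_step`, `Ez`, the synthetic two-blob data) are mine, not from the slides:

```python
import numpy as np

def em_step(X, mu, p_mu, sigma2):
    """One EM iteration for a Gaussian mixture with known covariance sigma2 * I."""
    m, d = X.shape
    # E-step: E[z_ij] proportional to p(mu_j) * exp(-||x_i - mu_j||^2 / (2 sigma2))
    sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    w = p_mu[None, :] * np.exp(-sq / (2 * sigma2))
    Ez = w / w.sum(axis=1, keepdims=True)          # shape (m, k)
    # M-step: mu_j = sum_i E[z_ij] x_i / sum_i E[z_ij];  p(mu_j) = (1/m) sum_i E[z_ij]
    Nj = Ez.sum(axis=0)                            # effective cluster sizes, shape (k,)
    mu_new = (Ez.T @ X) / Nj[:, None]
    p_new = Nj / m
    return mu_new, p_new

# Usage: two well-separated synthetic blobs (illustrative data only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])
mu = np.array([[1.0, 1.0], [6.0, 6.0]])            # rough initial guesses
p_mu = np.array([0.5, 0.5])
for _ in range(20):
    mu, p_mu = em_step(X, mu, p_mu, sigma2=1.0)
```

After the loop, the centers should sit near the two blob means, matching the behavior in the figure slides where assignments sharpen over iterations.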
Mixture Model for Document Clustering

A set of language models: θ_1, θ_2, ..., θ_K, where each θ_i = { p(w_1|θ_i), p(w_2|θ_i), ..., p(w_V|θ_i) }.

Probability:
\[
p(d = d_i) = \sum_j p(d = d_i, \theta = \theta_j)
= \sum_j p(\theta_j)\, p(d = d_i \mid \theta_j)
= \sum_j p(\theta_j) \prod_{k=1}^{V} p(w_k \mid \theta_j)^{tf(w_k, d_i)}
\]
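Because the product over the vocabulary underflows quickly, this document probability is normally computed in log space. A minimal sketch, with an assumed toy vocabulary and term-frequency vector (the function name `doc_log_prob` is mine):

```python
import numpy as np

def doc_log_prob(tf_d, log_p_w, log_p_theta):
    """log sum_j [ p(theta_j) * prod_k p(w_k|theta_j)^tf(w_k, d) ]."""
    # Per-component log-joint: log p(theta_j) + sum_k tf_k * log p(w_k | theta_j)
    log_joint = log_p_theta + log_p_w @ tf_d       # shape (K,)
    m = log_joint.max()                            # log-sum-exp trick for stability
    return m + np.log(np.exp(log_joint - m).sum())

# Two language models over a 3-word vocabulary (rows sum to 1); toy values
p_w = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.2, 0.7]])
p_theta = np.array([0.5, 0.5])
tf_d = np.array([5.0, 1.0, 0.0])   # term frequencies of one document
print(doc_log_prob(tf_d, np.log(p_w), np.log(p_theta)))
```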
Introduce hidden variable z_ij: document d_i is generated by the j-th language model θ_j.
Learning a Mixture Model

E-Step:
\[
E[z_{ij}] = p(\theta_j \mid d_i)
= \frac{p(d_i \mid \theta_j)\, p(\theta_j)}{\sum_{n=1}^{K} p(d_i \mid \theta_n)\, p(\theta_n)}
= \frac{\prod_{m=1}^{V} p(w_m \mid \theta_j)^{tf(w_m, d_i)}\, p(\theta_j)}{\sum_{n=1}^{K} \prod_{m=1}^{V} p(w_m \mid \theta_n)^{tf(w_m, d_i)}\, p(\theta_n)}
\]

K: number of language models
Learning a Mixture Model

M-Step:
\[
p(\theta_j) = \frac{1}{N} \sum_{i=1}^{N} E[z_{ij}],
\qquad
p(w_k \mid \theta_j) = \frac{\sum_{i=1}^{N} E[z_{ij}]\, tf(w_k, d_i)}{\sum_{i=1}^{N} E[z_{ij}]\, |d_i|}
\]

where |d_i| = Σ_k tf(w_k, d_i) is the length of document d_i.

N: number of documents
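The E/M steps for the document mixture can be sketched the same way. This is an illustrative implementation under my own assumptions: `TF` is an N x V term-frequency matrix, the function name is mine, and a small smoothing constant `eps` is added so no word probability collapses to zero (the slides' formulas have no smoothing):

```python
import numpy as np

def em_step_docs(TF, log_p_w, p_theta, eps=1e-6):
    """One EM iteration for the multinomial mixture; returns (log_p_w, p_theta)."""
    N, V = TF.shape
    # E-step: log of p(theta_j) * prod_m p(w_m|theta_j)^tf(w_m, d_i), shape (N, K)
    log_joint = TF @ log_p_w.T + np.log(p_theta)[None, :]
    log_joint -= log_joint.max(axis=1, keepdims=True)   # stabilize before exp
    Ez = np.exp(log_joint)
    Ez /= Ez.sum(axis=1, keepdims=True)                 # E[z_ij]
    # M-step
    p_theta_new = Ez.mean(axis=0)                       # (1/N) sum_i E[z_ij]
    counts = Ez.T @ TF + eps                            # sum_i E[z_ij] tf(w_k, d_i)
    p_w_new = counts / counts.sum(axis=1, keepdims=True)  # normalize over vocabulary
    return np.log(p_w_new), p_theta_new

# Toy corpus: 3 documents using words {0,1}, 3 using words {2,3}
TF = np.array([[5, 1, 0, 0], [4, 2, 0, 0], [6, 1, 1, 0],
               [0, 0, 5, 1], [0, 1, 4, 2], [0, 0, 6, 1]], dtype=float)
rng = np.random.default_rng(1)
log_p_w = np.log(rng.dirichlet(np.ones(4), size=2))     # random initial models
p_theta = np.array([0.5, 0.5])
for _ in range(50):
    log_p_w, p_theta = em_step_docs(TF, log_p_w, p_theta)
```

On this toy corpus the two language models should separate the two word groups, assigning the first three documents to one cluster and the last three to the other.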