Gaussian Mixture Model (GMM) using Expectation Maximization

Book: C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
The Gaussian Distribution
 Univariate Gaussian Distribution
\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)

where \mu is the mean and \sigma^2 is the variance.
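As a quick sanity check, the density above can be evaluated directly with NumPy (a minimal sketch; the function name gaussian_pdf is only illustrative):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# A standard normal evaluated at its mean should give 1/sqrt(2*pi) ≈ 0.3989
print(gaussian_pdf(0.0, mu=0.0, sigma2=1.0))
```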
 Multi-Variate Gaussian Distribution
\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} \, |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right)

where \mu is the mean vector, \Sigma is the covariance matrix, and D is the dimensionality of x.
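For the multivariate case, a minimal NumPy sketch of the same formula (SciPy's scipy.stats.multivariate_normal provides an equivalent, well-tested implementation):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate Gaussian density N(x | mu, Sigma), with x and mu of length D."""
    D = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

# 2-D standard normal at the origin: 1 / (2*pi) ≈ 0.1592
print(mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2)))
```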
We need to estimate these parameters of the distribution from data.
One method is Maximum Likelihood (ML) estimation.
ML Method for estimating parameters
 Consider the log of the Gaussian distribution:

\ln \mathcal{N}(x \mid \mu, \Sigma) = -\frac{D}{2}\ln(2\pi) - \frac{1}{2}\ln|\Sigma| - \frac{1}{2}(x - \mu)^{T}\Sigma^{-1}(x - \mu)
 Take the derivative and equate it to zero
\frac{\partial \ln p(X \mid \mu, \Sigma)}{\partial \mu} = 0 \;\Rightarrow\; \mu_{ML} = \frac{1}{N}\sum_{n=1}^{N} x_n

\frac{\partial \ln p(X \mid \mu, \Sigma)}{\partial \Sigma} = 0 \;\Rightarrow\; \Sigma_{ML} = \frac{1}{N}\sum_{n=1}^{N} (x_n - \mu_{ML})(x_n - \mu_{ML})^{T}
where N is the number of samples (data points).
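Since these ML estimates are just the sample mean and the (biased) sample covariance, they are one line each in NumPy; a sketch on synthetic data (all numbers below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[1.0, -2.0], scale=[0.5, 1.5], size=(1000, 2))  # N x D data matrix

mu_ml = X.mean(axis=0)                            # (1/N) sum_n x_n
Sigma_ml = (X - mu_ml).T @ (X - mu_ml) / len(X)   # (1/N) sum_n (x_n - mu)(x_n - mu)^T

print(mu_ml)     # close to [1.0, -2.0]
print(Sigma_ml)  # close to diag(0.25, 2.25)
```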
Gaussian Mixtures
 Linear superposition of Gaussians:

p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)

Here K is the number of Gaussians and \pi_k is the mixing coefficient, the weight given to each Gaussian distribution.
 Normalization and positivity require 0 \le \pi_k \le 1 and \sum_{k=1}^{K} \pi_k = 1.
 Consider the log likelihood

\ln p(X \mid \mu, \Sigma, \pi) = \sum_{n=1}^{N} \ln p(x_n) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}
ML does not work here because the sum inside the logarithm prevents a closed-form solution.
The parameters can instead be estimated with the Expectation Maximization (EM) technique.
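Although there is no closed-form maximizer, the log likelihood itself is easy to evaluate for any fixed parameter setting, which is what EM repeatedly exploits. A sketch (the helper name gmm_log_likelihood and the toy numbers are made up for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, pis, mus, Sigmas):
    """ln p(X) = sum_n ln( sum_k pi_k * N(x_n | mu_k, Sigma_k) )."""
    # densities[n, k] = N(x_n | mu_k, Sigma_k)
    densities = np.column_stack(
        [multivariate_normal.pdf(X, mean=mu, cov=S) for mu, S in zip(mus, Sigmas)]
    )
    return np.sum(np.log(densities @ np.asarray(pis)))

# Toy two-component mixture in 2-D
X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(gmm_log_likelihood(X, pis=[0.5, 0.5],
                         mus=[np.zeros(2), np.full(2, 5.0)],
                         Sigmas=[np.eye(2), np.eye(2)]))
```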
Example: Mixture of 3 Gaussians
[Figure: panels (a) and (b) illustrating a mixture of three Gaussians]
Latent variable: posterior probability
 We can think of the mixing coefficients as prior probabilities for the components.
 For a given value of x, we can evaluate the corresponding posterior probabilities, called responsibilities.
 From Bayes rule
\gamma_k(x) \equiv p(k \mid x) = \frac{p(k)\, p(x \mid k)}{p(x)} = \frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)}

The responsibilities \gamma_k(x) are the posterior probabilities of the latent variable. The mixing coefficients can be written as \pi_k = N_k / N, where N_k is interpreted as the effective number of points assigned to cluster k.
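A vectorized sketch of the responsibility computation (the function name responsibilities is illustrative; SciPy supplies the Gaussian densities):

```python
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pis, mus, Sigmas):
    """gamma[n, k] = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)."""
    weighted = np.column_stack(
        [pi * multivariate_normal.pdf(X, mean=mu, cov=S)
         for pi, mu, S in zip(pis, mus, Sigmas)]
    )
    return weighted / weighted.sum(axis=1, keepdims=True)

X = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
gamma = responsibilities(X, pis=[0.5, 0.5],
                         mus=[np.zeros(2), np.full(2, 4.0)],
                         Sigmas=[np.eye(2), np.eye(2)])
print(gamma)              # each row sums to 1
print(gamma.sum(axis=0))  # N_k: effective number of points per component
```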
Expectation Maximization
 The EM algorithm is an iterative optimization technique that operates locally.
[Figure: iterative path from an initial point to an optimal point]
 Expectation (E) step: for the given parameter values, compute the expected values of the latent variables.
 Maximization (M) step: update the model parameters based on the latent variables, using the ML method.
EM Algorithm for GMM
Given a Gaussian mixture model, the goal is to maximize the likelihood function with respect to the parameters (comprising the means and covariances of the components, and the mixing coefficients).
1. Initialize the means \mu_j, covariances \Sigma_j, and mixing coefficients \pi_j, and evaluate the initial value of the log likelihood.
2. E step. Evaluate the responsibilities using the current parameter values:

\gamma_j(x_n) = \frac{\pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}{\sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
3. M step. Re-estimate the parameters using the current responsibilities:

\mu_j = \frac{\sum_{n=1}^{N} \gamma_j(x_n)\, x_n}{\sum_{n=1}^{N} \gamma_j(x_n)}

\Sigma_j = \frac{\sum_{n=1}^{N} \gamma_j(x_n)\, (x_n - \mu_j)(x_n - \mu_j)^{T}}{\sum_{n=1}^{N} \gamma_j(x_n)}

\pi_j = \frac{1}{N} \sum_{n=1}^{N} \gamma_j(x_n)
4. Evaluate the log likelihood

\ln p(X \mid \mu, \Sigma, \pi) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\}

and check for convergence of either the parameters or the log likelihood. If the convergence criterion is not satisfied, return to step 2.
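Putting steps 1–4 together, a minimal NumPy/SciPy sketch of the EM loop for a GMM. The initialization (random data points as means, identity covariances, uniform weights) and the tiny covariance regularization are implementation choices for this sketch, not part of the algorithm as stated above:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gmm_em(X, K, n_iter=100, tol=1e-6, seed=0):
    """Fit a K-component Gaussian mixture to X (shape N x D) with EM."""
    rng = np.random.default_rng(seed)
    N, D = X.shape

    # 1. Initialization
    mus = X[rng.choice(N, size=K, replace=False)].copy()
    Sigmas = np.array([np.eye(D) for _ in range(K)])
    pis = np.full(K, 1.0 / K)
    prev_ll = -np.inf

    for _ in range(n_iter):
        # 2. E step: responsibilities gamma[n, k] from the current parameters
        dens = np.column_stack(
            [multivariate_normal.pdf(X, mean=mus[k], cov=Sigmas[k]) for k in range(K)]
        )
        weighted = dens * pis
        gamma = weighted / weighted.sum(axis=1, keepdims=True)

        # 3. M step: re-estimate means, covariances, and mixing coefficients
        Nk = gamma.sum(axis=0)                      # effective number of points per component
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
            Sigmas[k] += 1e-6 * np.eye(D)           # keep covariances well-conditioned
        pis = Nk / N

        # 4. Log likelihood under the updated parameters; stop on convergence
        dens = np.column_stack(
            [multivariate_normal.pdf(X, mean=mus[k], cov=Sigmas[k]) for k in range(K)]
        )
        ll = np.sum(np.log(dens @ pis))
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return pis, mus, Sigmas, ll
```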
EM Algorithm: Example
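As an example, the fit_gmm_em sketch above can be run on synthetic data drawn from three Gaussians (all parameter values here are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
# Three well-separated 2-D clusters
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], 0.1 * np.eye(2), size=300),
    rng.multivariate_normal([3.0, 3.0], 0.2 * np.eye(2), size=300),
    rng.multivariate_normal([0.0, 4.0], 0.1 * np.eye(2), size=400),
])

pis, mus, Sigmas, ll = fit_gmm_em(X, K=3)         # uses the sketch defined above
print("mixing coefficients:", np.round(pis, 3))   # expect weights near 0.3, 0.3, 0.4 (order is arbitrary)
print("means:\n", np.round(mus, 2))
print("final log likelihood:", round(ll, 2))
```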