In statistics, the [[Expectation–maximization algorithm|EM (expectation maximization)]] algorithm handles latent variables, while [[Mixture model#Gaussian mixture model|GMM]] is the Gaussian mixture model.
When the component assignments <math>z^{(i)}</math> are observed, maximum likelihood estimation gives the parameters in closed form:
: <math>\mu_j =\frac{\sum_{i=1}^m 1\{z^{(i)}=j\} x^{(i)}}{\sum_{i=1}^{m} 1\{z^{(i)}=j\}}</math>
: <math>\Sigma_j =\frac{\sum_{i=1}^m 1\{z^{(i)}=j\} (x^{(i)}-\mu_j)(x^{(i)}-\mu_j)^T}{\sum_{i=1}^m 1\{z^{(i)}=j\}}</math><ref name="Stanford CS229 Notes">{{cite web |last1=Ng |first1=Andrew |title=CS229 Lecture notes |url=}}</ref>
If <math>z_i</math> is known, estimating the parameters by [[maximum likelihood estimation]] is straightforward, as in the closed-form expressions above. But if <math>z_i</math> is unknown, the estimation becomes much more difficult.<ref name="Machine Learning —Expectation-Maximization Algorithm (EM)">{{cite web |last1=Hui |first1=Jonathan |title=Machine Learning —Expectation-Maximization Algorithm (EM) |url=https://medium.com/@jonathan_hui/machine-learning-expectation-maximization-algorithm-em-2e954cb76959 |website=Medium |language=en |date=13 October 2019}}</ref>
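These closed-form estimates are easy to compute once the labels are available. The following is a minimal NumPy sketch of them, assuming the observations <code>x</code> and labels <code>z</code> are arrays; the helper name <code>mle_known_labels</code> is only illustrative and does not come from the cited notes:

<syntaxhighlight lang="python">
import numpy as np

def mle_known_labels(x, z, k):
    """Illustrative sketch: closed-form ML estimates of the GMM parameters
    when the component label z[i] of every point x[i] is observed."""
    m, d = x.shape
    phi = np.zeros(k)            # mixing proportions phi_j
    mu = np.zeros((k, d))        # component means mu_j
    sigma = np.zeros((k, d, d))  # component covariances Sigma_j
    for j in range(k):
        mask = (z == j)                      # indicator 1{z[i] = j}
        phi[j] = mask.sum() / m
        mu[j] = x[mask].mean(axis=0)
        diff = x[mask] - mu[j]
        sigma[j] = diff.T @ diff / mask.sum()
    return phi, mu, sigma
</syntaxhighlight>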
In [[machine learning]], the latent variable <math>z</math> is treated as a hidden pattern underlying the data, which the observer cannot observe directly. <math>x_i</math> is the observed data, while <math>\phi, \mu, \Sigma</math> are the parameters of the model. With the EM algorithm, such an underlying pattern <math>z</math> in the data <math>x_i</math> can be recovered, together with estimates of the parameters. This situation arises widely in machine learning, which is what makes the EM algorithm so important.
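As a rough practical illustration of this situation (the library choice and toy data are assumptions for the sketch, not part of the derivation below), scikit-learn's EM-based <code>GaussianMixture</code> can recover both the latent assignments and the parameters from unlabelled data:

<syntaxhighlight lang="python">
from sklearn.mixture import GaussianMixture
import numpy as np

# Two artificial clusters whose labels are hidden from the model.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(3.0, 1.0, size=(100, 2))])

gmm = GaussianMixture(n_components=2, covariance_type="full").fit(x)
z_hat = gmm.predict(x)   # recovered latent pattern z
print(gmm.weights_)      # estimated phi
print(gmm.means_)        # estimated mu
print(gmm.covariances_)  # estimated Sigma
</syntaxhighlight>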
[[File:GMM Training on artificial data.gif|thumb|alt=Animation of updates to a GMM at each update to the distribution in the EM algorithm.|GMM Training on artificial data]]
== EM algorithm in GMM ==
1. (E-step) For each <math>i, j</math>, set
<math>w_{j}^{(i)}:=p\left(z^{(i)}=j | x^{(i)} ; \phi, \mu, \Sigma\right)</math>
<math>\Sigma_{j} :=\frac{\sum_{i=1}^{m} w_{j}^{(i)}\left(x^{(i)}-\mu_{j}\right)\left(x^{(i)}-\mu_{j}\right)^{T}}{\sum_{i=1}^{m} w_{j}^{(i)}}</math>
<ref name="Stanford CS229 Notes" />
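A minimal sketch of this M-step, assuming the responsibilities are held in a NumPy array <code>w</code> with <code>w[i, j]</code> equal to <math>w_j^{(i)}</math>; the helper name <code>m_step</code> is hypothetical, and the <math>\phi_j</math> and <math>\mu_j</math> updates shown are the standard weighted averages:

<syntaxhighlight lang="python">
import numpy as np

def m_step(x, w):
    """Illustrative sketch: re-estimate phi, mu_j and Sigma_j from the
    current responsibilities w[i, j]."""
    m, d = x.shape
    nk = w.sum(axis=0)               # effective number of points per component
    phi = nk / m                     # mixing proportions
    mu = (w.T @ x) / nk[:, None]     # weighted means
    sigma = np.zeros((w.shape[1], d, d))
    for j in range(w.shape[1]):
        diff = x - mu[j]
        sigma[j] = (w[:, j, None] * diff).T @ diff / nk[j]  # weighted covariances
    return phi, mu, sigma
</syntaxhighlight>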
With [[Bayes' Rule|Bayes' rule]], the E-step posterior can be written explicitly as
: <math>w_{j}^{(i)} = p\left(z^{(i)}=j | x^{(i)} ; \phi, \mu, \Sigma\right)=\frac{p\left(x^{(i)} | z^{(i)}=j ; \mu, \Sigma\right) p\left(z^{(i)}=j ; \phi\right)}{\sum_{l=1}^{k} p\left(x^{(i)} | z^{(i)}=l ; \mu, \Sigma\right) p\left(z^{(i)}=l ; \phi\right)}</math>
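The corresponding E-step can be sketched in the same style, using SciPy's <code>multivariate_normal</code> for the Gaussian densities (again, the helper name <code>e_step</code> is hypothetical):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import multivariate_normal

def e_step(x, phi, mu, sigma):
    """Illustrative sketch: responsibilities w[i, j] = p(z=j | x[i]),
    obtained from Bayes' rule as above."""
    m, k = x.shape[0], phi.shape[0]
    w = np.zeros((m, k))
    for j in range(k):
        # Numerator of Bayes' rule: p(x | z=j) * p(z=j)
        w[:, j] = phi[j] * multivariate_normal.pdf(x, mean=mu[j], cov=sigma[j])
    w /= w.sum(axis=1, keepdims=True)  # normalise over the components
    return w
</syntaxhighlight>

Alternating such an E-step and M-step until the log-likelihood stops improving gives the EM training loop animated in the figure above.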