:<math>Q(\mathbf{Z}) = \prod_{i=1}^M q_i(\mathbf{Z}_i\mid \mathbf{X})</math>
It can be shown using the [[calculus of variations]] (hence the name "variational Bayes") that the "best" distribution <math>q_j^{*}</math> for each of the factors <math>q_j</math> (in terms of the distribution minimizing the KL divergence, as described above) satisfies:<ref>{{cite web|last=Nguyen|first=Duy|title= AN IN DEPTH INTRODUCTION TO VARIATIONAL BAYES NOTE|date=15 August 2023 |ssrn=4541076 |url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4541076|access-date=15 August 2023}}</ref>
:<math>q_j^{*}(\mathbf{Z}_j\mid \mathbf{X}) = \frac{e^{\operatorname{E}_{q^*_{-j}} [\ln p(\mathbf{Z}, \mathbf{X})]}}{\int e^{\operatorname{E}_{q^*_{-j}} [\ln p(\mathbf{Z}, \mathbf{X})]}\, d\mathbf{Z}_j}</math>
where <math>\operatorname{E}_{q^*_{-j}} [\ln p(\mathbf{Z}, \mathbf{X})]</math> is the [[expected value|expectation]] of the logarithm of the [[joint probability]] of the data and latent variables, taken with respect to <math>q^*_{-j}</math>, i.e. over all variables not in the <math>j</math>-th partition; see Lemma 4.1 of Lee (2021)<ref name=Yoon2021>{{Cite journal |last=Lee |first=Se Yoon |title=Gibbs sampler and coordinate ascent variational inference: A set-theoretical review |journal=Communications in Statistics - Theory and Methods |year=2021 |volume=51 |issue=6 |pages=1–21 |doi=10.1080/03610926.2021.1921214 |arxiv=2008.01006 |s2cid=220935477}}</ref> for a derivation of the distribution <math>q_j^{*}(\mathbf{Z}_j\mid \mathbf{X})</math>.
In practice, we usually work in terms of logarithms, i.e.:
:<math>\ln q_j^{*}(\mathbf{Z}_j\mid \mathbf{X}) = \operatorname{E}_{q^*_{-j}} [\ln p(\mathbf{Z}, \mathbf{X})] + \text{constant}</math>
The constant absorbs the normalizer in the denominator above, which does not depend on <math>\mathbf{Z}_j</math>.
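As an illustration, consider applying this update to a simple conjugate model: <math>N</math> observations drawn from a univariate Gaussian with unknown mean <math>\mu</math> and precision <math>\tau</math>, a Gaussian prior on <math>\mu</math> (conditional on <math>\tau</math>) and a Gamma prior on <math>\tau</math>, with the factorization <math>q(\mu,\tau) = q(\mu)q(\tau)</math>. The Python sketch below is a minimal illustration; the model, hyperparameter values and variable names are assumptions of the sketch rather than part of the derivation above. It iterates the two closed-form updates that the general formula yields for this model.

<syntaxhighlight lang="python">
# Minimal sketch: coordinate-ascent updates obtained from q_j* ~ exp(E_{-j}[ln p(Z, X)])
# for the toy model  x_1..x_N ~ N(mu, tau^-1),  mu | tau ~ N(mu0, (lambda0*tau)^-1),
# tau ~ Gamma(a0, b0),  with the mean-field factorization q(mu, tau) = q(mu) q(tau).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=200)      # observed data (assumed for the sketch)
N, xbar = len(x), x.mean()

mu0, lambda0, a0, b0 = 0.0, 1.0, 1.0, 1.0         # hyperparameters (arbitrary values)
E_tau = a0 / b0                                   # initial guess for E[tau]

for _ in range(100):
    # Update q(mu) = N(mu_N, lambda_N^-1): take E_{q(tau)}[ln p(Z, X)] and read off
    # a Gaussian density in mu.
    mu_N = (lambda0 * mu0 + N * xbar) / (lambda0 + N)
    lambda_N = (lambda0 + N) * E_tau
    E_mu, E_mu2 = mu_N, 1.0 / lambda_N + mu_N ** 2

    # Update q(tau) = Gamma(a_N, b_N): take E_{q(mu)}[ln p(Z, X)] and read off
    # a Gamma density in tau.
    a_N = a0 + (N + 1) / 2
    b_N = b0 + 0.5 * (np.sum(x ** 2) - 2 * E_mu * np.sum(x) + N * E_mu2
                      + lambda0 * (E_mu2 - 2 * mu0 * E_mu + mu0 ** 2))
    E_tau = a_N / b_N

print("E[mu] approx", mu_N, "  E[tau] approx", E_tau)
</syntaxhighlight>

Each pass updates one factor while holding the moments of the other fixed, so the KL divergence to the true posterior is non-increasing and the iteration converges to a local optimum.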
==A more complex example==
[[File:bayesian-gaussian-mixture-vb.svg|right|300px|thumb|Bayesian Gaussian mixture model using [[plate notation]]. Smaller squares indicate fixed parameters; larger circles indicate random variables. Filled-in shapes indicate known values. The indication [K] means a vector of size ''K''; [''D'',''D''] means a matrix of size ''D''×''D''; ''K'' alone means a [[categorical variable]] with ''K'' outcomes. The squiggly line coming from ''z'' ending in a crossbar indicates a ''switch'' — the value of this variable selects, for the other incoming variables, which value to use out of the size-''K'' array of possible values.]]
Imagine a Bayesian [[Gaussian mixture model]] described as follows:<ref>{{cite web|last=Nguyen|first=Duy|title= AN IN DEPTH INTRODUCTION TO VARIATIONAL BAYES NOTE|date=15 August 2023 |ssrn=4541076 |url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4541076|access-date=15 August 2023}}</ref>
:<math>
\begin{align}
\mathbf{\pi} & \sim \operatorname{SymDir}(K, \alpha_0) \\
\mathbf{\Lambda}_{i=1 \dots K} & \sim \mathcal{W}(\mathbf{W}_0, \nu_0) \\
\mathbf{\mu}_{i=1 \dots K} & \sim \mathcal{N}(\mathbf{\mu}_0, (\beta_0 \mathbf{\Lambda}_i)^{-1}) \\
\mathbf{z}_{i=1 \dots N} & \sim \operatorname{Categorical}(\mathbf{\pi}) \\
\mathbf{x}_{i=1 \dots N} & \sim \mathcal{N}(\mathbf{\mu}_{z_i}, {\mathbf{\Lambda}_{z_i}}^{-1})
\end{align}
</math>
where <math>K</math> is the number of mixture components, <math>N</math> is the number of data points, <math>\mathbf{\pi}</math> are the mixing proportions, <math>\mathbf{\mu}_i</math> and <math>\mathbf{\Lambda}_i</math> are the mean and precision matrix of component <math>i</math>, and <math>z_i</math> indicates which component generated observation <math>\mathbf{x}_i</math>.
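The following Python sketch draws one synthetic data set from this generative process. It is a minimal illustration assuming the standard specification (symmetric Dirichlet prior on the mixing weights, Gaussian-Wishart priors on the component means and precisions); the values of <math>K</math>, <math>D</math>, <math>N</math> and the hyperparameters are arbitrary.

<syntaxhighlight lang="python">
# Minimal sketch: sample pi, Lambda, mu, z and x from the Bayesian Gaussian mixture prior.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)
K, D, N = 3, 2, 500                       # components, dimensionality, data points (arbitrary)
alpha0, beta0, nu0 = 1.0, 1.0, D + 2.0    # hyperparameters (arbitrary values)
mu0, W0 = np.zeros(D), np.eye(D)

pi = rng.dirichlet(alpha0 * np.ones(K))                            # mixing weights
Lambda = wishart(df=nu0, scale=W0).rvs(size=K, random_state=rng)   # precisions, shape (K, D, D)
mu = np.array([rng.multivariate_normal(mu0, np.linalg.inv(beta0 * Lambda[k]))
               for k in range(K)])                                 # component means
z = rng.choice(K, size=N, p=pi)                                    # latent component assignments
x = np.array([rng.multivariate_normal(mu[k], np.linalg.inv(Lambda[k]))
              for k in z])                                         # observations, shape (N, D)
</syntaxhighlight>

Variational inference for this model inverts the process sketched here: given only <math>\mathbf{x}</math>, it approximates the posterior over <math>\mathbf{z}</math>, <math>\mathbf{\pi}</math>, <math>\mathbf{\mu}</math> and <math>\mathbf{\Lambda}</math>.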
Assume that <math>q(\mathbf{Z},\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda}) = q(\mathbf{Z})q(\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda})</math>.
Then<ref>{{cite web|last=Nguyen|first=Duy|title= AN IN DEPTH INTRODUCTION TO VARIATIONAL BAYES NOTE|date=15 August 2023 |ssrn=4541076 |url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4541076|access-date=15 August 2023}}</ref>
:<math>\ln q^{*}(\mathbf{Z}) = \operatorname{E}_{\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda}}[\ln p(\mathbf{X},\mathbf{Z},\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda})] + \text{constant}</math>
which is the general result above applied to the factor <math>q(\mathbf{Z})</math>, with the expectation taken with respect to the current <math>q(\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda})</math>.
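Exponentiating and normalizing over the <math>K</math> possible values of each <math>z_i</math> shows that <math>q^{*}(\mathbf{Z})</math> factorizes over data points into categorical distributions, whose parameters are often called ''responsibilities''. The Python sketch below shows only that normalization step; it assumes the <math>N \times K</math> array of expected log joint values has already been computed under the current <math>q(\mathbf{\pi},\mathbf{\mu},\mathbf{\Lambda})</math>, and the function name is purely illustrative.

<syntaxhighlight lang="python">
import numpy as np

def responsibilities(log_rho):
    """Normalize an (N, K) array of expected log joint values row by row,
    giving r[n, k] = q*(z_n = k).  Subtracting the row maximum first keeps
    the exponentiation numerically stable."""
    log_rho = log_rho - log_rho.max(axis=1, keepdims=True)
    rho = np.exp(log_rho)
    return rho / rho.sum(axis=1, keepdims=True)

# Example with arbitrary values: each row of the result sums to one.
r = responsibilities(np.array([[-1.0, -2.0, -0.5],
                               [-0.3, -4.0, -1.0]]))
print(r)
</syntaxhighlight>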