Variational autoencoder

== Formulation ==
[[File:VAE Basic.png|thumb|425x425px|The basic scheme of a variational autoencoder. The model receives <math>\mathbf{x}</math> as input. The encoder compresses it into the latent space. The decoder receives a sample from the latent space as input and produces <math>\mathbf{x'}</math>, as similar as possible to <math>\mathbf{x}</math>.]]
From a formal perspective, given an input dataset <math>\mathbf{x}</math> characterized by an unknown probability distribution <math>P(\mathbf{x})</math>, the objective is to model or approximate the data's true distribution <math>P</math> using a parametrized distribution <math>p_\theta</math> having parameters <math>\theta</math>. Let <math>\mathbf{z}</math> be a random vector jointly distributed with <math>\mathbf{x}</math>. Conceptually, <math>\mathbf{z}</math> will represent a latent encoding of <math>\mathbf{x}</math>. [[Marginal distribution|Marginalizing]] over <math>\mathbf{z}</math> gives
 
: <math>p_\theta(\mathbf{x}) = \int_{\mathbf{z}}p_\theta(\mathbf{x,z}) \, d\mathbf{z}, </math>

where <math>p_\theta(\mathbf{x,z})</math> is the [[joint distribution]] under <math>p_\theta</math> of the observable data <math>\mathbf{x}</math> and its latent representation or encoding <math>\mathbf{z}</math>, and <math>p_\theta(\mathbf{x})</math> is the [[Model evidence|evidence]] of the model. According to the [[Chain rule (probability)|chain rule]], the equation can be rewritten as
 
: <math>p_\theta(\mathbf{x}) = \int_{\mathbf{z}}p_\theta(\mathbf{x}\mid\mathbf{z})p_\theta(\mathbf{z}) \, d\mathbf{z}.</math>
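
To make the marginalization concrete, the evidence <math>p_\theta(\mathbf{x})</math> can be estimated by naive Monte Carlo: sample <math>\mathbf{z}</math> from the prior and average the conditional densities <math>p_\theta(\mathbf{x}\mid\mathbf{z})</math>. The following sketch assumes a standard normal prior, a Gaussian decoder with fixed variance, and a hypothetical <code>decoder_mean</code> function standing in for the decoder network; in practice this estimator has high variance, which is one motivation for the variational approach.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def decoder_mean(z):
    # Hypothetical stand-in for the decoder network: a fixed linear map.
    W = np.array([[1.0, 0.5], [-0.5, 1.0]])
    return z @ W.T

def log_gaussian(x, mean, var):
    # Log-density of an isotropic Gaussian N(mean, var * I) evaluated at x.
    d = x.shape[-1]
    return -0.5 * (d * np.log(2 * np.pi * var)
                   + np.sum((x - mean) ** 2, axis=-1) / var)

def estimate_log_evidence(x, n_samples=100_000, decoder_var=0.1):
    # p_theta(x) = E_{z ~ p(z)}[p_theta(x | z)], estimated by sampling z
    # from the prior and averaging the conditional densities.
    z = rng.standard_normal((n_samples, 2))            # z ~ N(0, I)
    log_px_given_z = log_gaussian(x, decoder_mean(z), decoder_var)
    m = log_px_given_z.max()                           # log-mean-exp trick
    return m + np.log(np.mean(np.exp(log_px_given_z - m)))

x = np.array([0.3, -0.2])
print("estimated log p(x):", estimate_log_evidence(x))
</syntaxhighlight>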
 
In the vanilla variational autoencoder, <math>\mathbf{z}</math> is usually taken to be a finite-dimensional vector of real numbers, and <math>p_\theta(\mathbf{x}\mid\mathbf{z})</math> to be a [[Gaussian distribution]]. Then <math>p_\theta(\mathbf{x})</math> is a mixture of Gaussian distributions.
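
Under these assumptions, sampling from the model is ancestral: draw <math>\mathbf{z}</math> from the prior, then draw <math>\mathbf{x}</math> from the Gaussian <math>p_\theta(\mathbf{x}\mid\mathbf{z})</math>, so each <math>\mathbf{z}</math> indexes one Gaussian component of the mixture. A minimal sketch, reusing the hypothetical <code>decoder_mean</code> stand-in from above:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

def decoder_mean(z):
    # Same hypothetical linear stand-in for the decoder network as above.
    W = np.array([[1.0, 0.5], [-0.5, 1.0]])
    return z @ W.T

def sample_x(n_samples=5, decoder_var=0.1):
    # Ancestral sampling: z ~ p(z), then x ~ N(decoder_mean(z), decoder_var * I).
    z = rng.standard_normal((n_samples, 2))  # each z selects a mixture component
    mean = decoder_mean(z)
    return mean + np.sqrt(decoder_var) * rng.standard_normal(mean.shape)

print(sample_x())
</syntaxhighlight>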
 
It is now possible to define the set of relationships between the input data and its latent representation as