The decoder is the second neural network of this model. It is a function that maps from the latent space to the input space, for example as the means of the noise distribution. It is possible to use another neural network that maps to the variance, though this is often omitted for simplicity; in that case, the variance can be optimized with gradient descent.
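A minimal sketch of such a decoder, written here in PyTorch with illustrative layer sizes and names (not taken from a specific reference implementation); rather than a second network for the variance, a single learnable log-variance parameter is optimized jointly by gradient descent, as described above:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Maps a latent code z to the mean of the noise distribution over inputs."""

    def __init__(self, latent_dim=2, hidden_dim=128, data_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, data_dim),
        )
        # Instead of a second network mapping to the variance, a single
        # learnable log-variance is kept and optimized by gradient descent.
        self.log_var = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        mean = self.net(z)  # mean of the noise distribution p(x|z)
        return mean, self.log_var
</syntaxhighlight>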
To optimize this model, one needs to know two terms: the "reconstruction error" and the [[Kullback–Leibler divergence]] (KL-D). Both terms are derived from the free energy expression of the probabilistic model, and therefore differ depending on the noise distribution and the assumed prior of the data. For example, a standard VAE task such as [[ImageNet]] is typically assumed to have Gaussian-distributed noise.
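A minimal sketch of these two terms, assuming a Gaussian encoder with diagonal covariance, a standard normal prior on the latent space, and the Gaussian noise model with the learnable log-variance from the decoder sketch above (all names are illustrative):

<syntaxhighlight lang="python">
import torch


def vae_loss(x, x_mean, dec_log_var, z_mean, z_log_var):
    """Negative ELBO = reconstruction error + KL divergence, averaged over the batch."""
    # Reconstruction error: negative Gaussian log-likelihood of x under
    # N(x_mean, exp(dec_log_var)); the additive constant 0.5*log(2*pi) is omitted.
    recon = 0.5 * (((x - x_mean) ** 2) / dec_log_var.exp() + dec_log_var).sum(dim=-1)
    # KL divergence between the diagonal Gaussian q(z|x) = N(z_mean, exp(z_log_var))
    # and a standard normal prior, in closed form.
    kl = -0.5 * (1 + z_log_var - z_mean ** 2 - z_log_var.exp()).sum(dim=-1)
    return (recon + kl).mean()
</syntaxhighlight>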
== Formulation ==