== Statistical distance VAE variants==
After the initial work of Diederik P. Kingma and [[Max Welling]],<ref>{{Cite arXiv |
several approaches were proposed to formulate the operation of the VAE in a more abstract way. In these approaches the loss function is composed of two parts:
* the usual reconstruction error part, which seeks to ensure that the encoder-then-decoder mapping <math>x \mapsto D_\theta(E_\phi(x))</math> is as close to the identity map as possible; the sampling is done at run time from the empirical distribution <math>\mathbb{P}^{real}</math> of objects available (e.g., for MNIST or ImageNet this will be the empirical probability law of all images in the dataset). This gives the term: <math> \mathbb{E}_{x \sim \mathbb{P}^{real}} \left[ \|x - D_\theta(E_\phi(x))\|_2^2\right]</math>.
* a variational part, which ensures that the distribution of the encoded objects <math>E_\phi(x)</math>, for <math>x \sim \mathbb{P}^{real}</math>, is close to a prescribed target latent distribution; this closeness is quantified by a statistical distance <math>d</math>.
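The reconstruction term above can be estimated by Monte Carlo sampling from the dataset. The following is a minimal NumPy sketch in which the encoder <math>E_\phi</math> and decoder <math>D_\theta</math> are replaced by hypothetical linear stand-ins (real VAEs use learned neural networks):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset standing in for the empirical distribution P^real.
data = rng.normal(size=(1000, 4))      # 1000 objects with 4 features
W_enc = rng.normal(size=(4, 2)) * 0.5  # hypothetical encoder weights (E_phi)
W_dec = rng.normal(size=(2, 4)) * 0.5  # hypothetical decoder weights (D_theta)

def encode(x):
    return x @ W_enc

def decode(z):
    return z @ W_dec

# Monte Carlo estimate of E_{x ~ P^real} ||x - D_theta(E_phi(x))||_2^2
# over a mini-batch drawn from the dataset.
batch = data[rng.choice(len(data), size=64, replace=False)]
recon = decode(encode(batch))
reconstruction_loss = np.mean(np.sum((batch - recon) ** 2, axis=1))
```

In training, this estimate would be recomputed on each mini-batch and minimized jointly with the distance term by stochastic gradient descent.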
The statistical distance <math>d</math> requires special properties: for instance, it has to possess a formula expressed as an expectation, because the loss function will need to be optimized by [[Stochastic gradient descent|stochastic optimization algorithms]]. Several distances can be chosen, and this gave rise to several flavors of VAEs:
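For illustration, a distance with such an expectation formula can be estimated directly from samples. The sketch below (a generic plug-in estimator, not any specific published implementation) computes the squared energy distance <math>D^2 = 2\,\mathbb{E}\|X-Y\| - \mathbb{E}\|X-X'\| - \mathbb{E}\|Y-Y'\|</math> between the outputs of a hypothetical encoder and a sample from the target latent law:

```python
import numpy as np

def energy_distance(x, y):
    """Plug-in estimate of the squared energy distance between two samples.

    D^2 = 2 E||X - Y|| - E||X - X'|| - E||Y - Y'||, each expectation
    estimated by averaging pairwise Euclidean distances.
    """
    def mean_pairwise(a, b):
        diff = a[:, None, :] - b[None, :, :]       # all pairwise differences
        return np.sqrt((diff ** 2).sum(-1)).mean()

    return 2 * mean_pairwise(x, y) - mean_pairwise(x, x) - mean_pairwise(y, y)

rng = np.random.default_rng(0)
latent = rng.normal(size=(256, 2))        # sample from the target latent law
codes = rng.normal(size=(256, 2)) + 1.0   # hypothetical encoder outputs
d2 = energy_distance(codes, latent)       # larger when the two laws differ
```

Because each expectation is an average over samples, fresh mini-batches give unbiased stochastic gradients, which is the property the text refers to.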
* the sliced Wasserstein distance used by S. Kolouri et al. in their VAE<ref>{{Cite conference |
* the [[Energy distance|energy distance]] implemented in the Radon Sobolev Variational Auto-Encoder<ref>{{Cite journal |last=Turinici |first=Gabriel |year=2021 |title=Radon-Sobolev Variational Auto-Encoders |url=https://www.sciencedirect.com/science/article/pii/S0893608021001556 |journal=Neural Networks |volume=141 |pages=294–305 |arxiv=1911.13135 |doi=10.1016/j.neunet.2021.04.018 |issn=0893-6080 |pmid=33933889}}</ref>
* the [[Maximum Mean Discrepancy]] distance used in the MMD-VAE<ref>{{Cite
* the [[Wasserstein distance]] used in the WAEs<ref>{{Cite arXiv |
* kernel-based distances used in the Kernelized Variational Autoencoder (K-VAE)<ref>{{Cite arXiv |
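As an illustration of one such distance, the following is a minimal Monte Carlo sketch of a sliced Wasserstein estimator (assuming equal sample sizes; a hypothetical illustration, not the cited authors' code). It projects both samples onto random directions, where the one-dimensional Wasserstein distance reduces to comparing sorted samples:

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=100, rng=None):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance.

    Assumes x and y have the same number of points; in one dimension
    the W_2 distance is computed by matching sorted samples.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    dim = x.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=dim)
        theta /= np.linalg.norm(theta)      # uniform direction on the sphere
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean((px - py) ** 2)    # 1-D squared W_2 via sorting
    return np.sqrt(total / n_projections)

rng = np.random.default_rng(1)
a = rng.normal(size=(200, 3))
b = rng.normal(size=(200, 3)) + 2.0  # shifted sample, so the distance is large
```

Sorting makes each one-dimensional distance cheap to compute, which is the practical appeal of the sliced variant over the full multidimensional Wasserstein distance.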
== See also ==