We obtain the final formula for the loss:
<math display="block"> L_{\theta,\phi} = \mathbb{E}_{x \sim \mathbb{P}^{real}} \left[ \|x - D_\theta(E_\phi(x))\|_2^2\right] + \lambda \cdot d\left( \mu, E_\phi \sharp \mathbb{P}^{real} \right)</math>
where <math>\mu</math> is the target latent distribution and <math>E_\phi \sharp \mathbb{P}^{real}</math> is the distribution of the encoded data.
The statistical distance <math>d</math> requires special properties: for instance, it has to possess a formula as an expectation, because the loss function has to be optimized by [[Stochastic gradient descent|stochastic optimization algorithms]]. Several distances can be chosen, and this gave rise to several flavors of VAEs:
* the sliced Wasserstein distance used by S. Kolouri et al. in their VAE<ref>{{Cite conference |last1=Kolouri |first1=Soheil |last2=Pope |first2=Phillip E. |last3=Martin |first3=Charles E. |last4=Rohde |first4=Gustavo K. |date=2019 |title=Sliced Wasserstein Auto-Encoders |url=https://openreview.net/forum?id=H1xaJn05FQ |conference=International Conference on Learning Representations |publisher=ICPR |book-title=International Conference on Learning Representations}}</ref>
* the [[energy distance]] implemented in the Radon Sobolev Variational Auto-Encoder<ref>{{Cite journal |last=Turinici |first=Gabriel |year=2021 |title=Radon-Sobolev Variational Auto-Encoders |url=https://www.sciencedirect.com/science/article/pii/S0893608021001556 |journal=Neural Networks |volume=141 |pages=294–305 |arxiv=1911.13135 |doi=10.1016/j.neunet.2021.04.018 |issn=0893-6080 |pmid=33933889}}</ref>
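
As an illustration, the following is a minimal sketch of such a loss with the sliced Wasserstein distance playing the role of <math>d</math>. It assumes a PyTorch encoder/decoder pair, a standard normal target latent distribution <math>\mu</math>, and an illustrative weight <code>lam</code>; the function names are not taken from the cited papers.

<syntaxhighlight lang="python">
import torch

def sliced_wasserstein(z, z_target, n_projections=50):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
    between two sample clouds of shape (batch, latent_dim)."""
    latent_dim = z.shape[1]
    theta = torch.randn(n_projections, latent_dim, device=z.device)
    theta = theta / theta.norm(dim=1, keepdim=True)   # random unit directions
    proj_z = z @ theta.T                               # (batch, n_projections)
    proj_t = z_target @ theta.T
    # In one dimension, optimal transport matches sorted samples.
    proj_z, _ = torch.sort(proj_z, dim=0)
    proj_t, _ = torch.sort(proj_t, dim=0)
    return ((proj_z - proj_t) ** 2).mean()

def statistical_distance_vae_loss(encoder, decoder, x, lam=1.0):
    """Reconstruction error plus lam times d(target latent, encoded data)."""
    z = encoder(x)                                     # E_phi(x)
    x_hat = decoder(z)                                 # D_theta(E_phi(x))
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()       # E ||x - D(E(x))||_2^2
    z_target = torch.randn_like(z)                     # samples from the N(0, I) target mu
    return recon + lam * sliced_wasserstein(z, z_target)
</syntaxhighlight>

Both terms of this loss are written as expectations over samples, which is precisely the property required for optimization by stochastic gradient methods.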