Variational autoencoder: Difference between revisions

In this way, the overall problem can be translated into the autoencoder ___domain, in which the conditional likelihood distribution <math>p_\theta(x|z)</math> is parametrized by the ''probabilistic decoder'', while the approximate posterior distribution <math>q_\phi(z|x)</math> is computed by the ''probabilistic encoder''.
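As an illustrative sketch (assuming flattened inputs, a diagonal-Gaussian encoder, a Bernoulli decoder, and the PyTorch framework, none of which are prescribed by the model itself), the two networks can be written as:

<syntaxhighlight lang="python">
from torch import nn

class Encoder(nn.Module):
    """Probabilistic encoder: maps x to the mean and log-variance of q_phi(z|x)."""
    def __init__(self, x_dim=784, h_dim=256, z_dim=20):  # illustrative sizes
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mean = nn.Linear(h_dim, z_dim)
        self.log_var = nn.Linear(h_dim, z_dim)

    def forward(self, x):
        h = self.hidden(x)
        return self.mean(h), self.log_var(h)  # parameters of q_phi(z|x)

class Decoder(nn.Module):
    """Probabilistic decoder: maps z to the parameters of p_theta(x|z),
    here per-pixel Bernoulli probabilities."""
    def __init__(self, x_dim=784, h_dim=256, z_dim=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, z):
        return self.net(z)  # parameters of p_theta(x|z)
</syntaxhighlight>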
 
== Evidence lower bound (ELBO) ==
{{Main|Evidence lower bound}}
 
As in every [[deep learning]] problem, it is necessary to define a differentiable loss function in order to update the network weights through [[backpropagation]].
 
For variational autoencoders, the idea is to jointly optimize the generative model parameters <math>\theta</math> to reduce the reconstruction error between the input and the output, and <math>\phi</math> to make <math>q_\phi(z|x)</math> as close as possible to <math>p_\theta(z|x)</math>.
 
As reconstruction losses, [[mean squared error]] and [[cross entropy]] are often used.
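A minimal sketch of a per-minibatch loss combining the two objectives, assuming a Bernoulli decoder (binary cross-entropy reconstruction, replaceable by mean squared error for a Gaussian decoder) and a diagonal-Gaussian encoder with a standard-normal prior, for which the KL term appearing in the ELBO below has a closed form:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mean, log_var):
    """Negative ELBO estimate for one minibatch (lower is better)."""
    # Reconstruction term: -E_q[log p_theta(x|z)], estimated from one sample of z.
    reconstruction = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # Regularization term: KL(q_phi(z|x) || p(z)) in closed form for diagonal Gaussians.
    kl = -0.5 * torch.sum(1 + log_var - mean.pow(2) - log_var.exp())
    return reconstruction + kl  # minimizing this maximizes the ELBO
</syntaxhighlight>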
<math display="block">L_{\theta,\phi}(x) = E_{z \sim q_\phi(z|x)}\left(\log p_\theta(x|z)\right) - D_{KL}(q_\phi(z|x) \parallel p_\theta(z)) </math>This is named the [[evidence lower bound]] (ELBO). Maximizing the ELBO<math display="block">\theta^*,\phi^* = \underset{\theta,\phi}{\operatorname{arg max}} \, L_{\theta,\phi}(x) </math>is equivalent to simultaneously maximizing <math>p_\theta(x) </math> and minimizing <math> D_{KL}(q_\phi(z|x)\parallel p_\theta(z|x)) </math>. That is, maximizing the log-likelihood of the observed data, and minimizing the divergence of the approximate posterior <math>q_\phi(\cdot | x) </math> from the exact posterior <math>p_\theta(\cdot | x) </math>.
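This equivalence can be seen from the standard decomposition of the log-evidence into the ELBO and the divergence of the approximate posterior from the exact one:<math display="block">\log p_\theta(x) = L_{\theta,\phi}(x) + D_{KL}(q_\phi(z|x) \parallel p_\theta(z|x)) \geq L_{\theta,\phi}(x).</math>Since the KL term is non-negative, the ELBO lower-bounds <math>\log p_\theta(x)</math>, and increasing it must either raise the log-evidence or shrink the gap to the exact posterior.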
 
For a more detailed derivation and more interpretations of the ELBO and its maximization, see [[Evidence lower bound|its main page]].
 
== Reparameterization ==