Talk:Variational autoencoder

 
In particular, the introductory figure shows x being mapped to a Gaussian distribution and back to x'. It would be good to explicitly state how the encoder and decoder in this figure relate to the various distributions used throughout the article, but I'm not confident on how to do so. [[User:Yoderj|Yoderj]] ([[User talk:Yoderj|talk]]) 19:25, 15 March 2024 (UTC)
 
I will try to give a simple answer. "Encoder" is a confusing name: the encoder is really a Gaussian distribution whose mean and variance are produced by a neural network. That network is initialized randomly and is trained using gradients of the loss function. The decoder is likewise a Gaussian distribution whose mean and variance come from a second neural network.

Is the decoder random? It depends. During training you need a Monte Carlo estimate of the loss, which means drawing samples, so the decoder's output is random. At inference time, however, you can make the decoder deterministic: since its output distribution is Gaussian, you can take the maximum a posteriori output, which for a Gaussian is simply its mean. You might ask: if the decoder is Gaussian, don't you also have to account for its variance? That is a fair question, but in many applications the output variance is ignored and only the mean is used.
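To make the "random during training, deterministic at inference" point concrete, here is a minimal numpy sketch. The decoder mean and standard deviation below are made-up placeholder numbers, not values from any real model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical decoder output for some latent z: a Gaussian over x'.
MU = np.array([0.5, -1.0, 2.0])      # decoder mean (placeholder values)
SIGMA = np.array([0.1, 0.1, 0.1])    # decoder standard deviation (often ignored)

# Training-style: draw a sample, so the output is random.
x_sampled = MU + SIGMA * rng.normal(size=MU.shape)

# Inference-style: a Gaussian's mode is its mean, so the
# deterministic (maximum a posteriori) output is just MU.
x_map = MU
```

Note that `x_sampled` changes on every call, while `x_map` is always the same.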
 
I hope that this clears things up. There are four quantities: the mean and variance of the encoder, and the mean and variance of the decoder. Each can be multidimensional (a multivariate Gaussian), but conceptually they are still four quantities. Here are some equations to help you understand:
z = mu(x) + sigma(x)*epsilon    # reparameterization trick, epsilon ~ N(0, I)
x' = MU(z) + SIGMA(z)*epsilon'  # epsilon' is a fresh, independent N(0, I) sample
 
And here is the legend:
x: input
z: latent sample, i.e. a sample from the encoder distribution: mu(x) plus sigma(x) times Gaussian noise
mu, sigma: encoder neural networks (mean and standard deviation)
MU, SIGMA: decoder neural networks (mean and standard deviation)
x': output
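The two equations above can be sketched end-to-end in runnable Python. The "networks" here are fixed random linear maps and the dimensions are arbitrary, purely for illustration (an assumption, not anything from the article); in a real VAE they would be trained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions and toy linear "networks" (assumptions for illustration).
D_x, D_z = 4, 2
W_mu, W_sig = rng.normal(size=(D_z, D_x)), rng.normal(size=(D_z, D_x))
W_MU, W_SIG = rng.normal(size=(D_x, D_z)), rng.normal(size=(D_x, D_z))

def encoder(x):
    """Return mean and standard deviation of the Gaussian q(z|x)."""
    return W_mu @ x, np.exp(W_sig @ x)   # exp keeps the std positive

def decoder(z):
    """Return mean and standard deviation of the Gaussian p(x'|z)."""
    return W_MU @ z, np.exp(W_SIG @ z)

x = rng.normal(size=D_x)

# Reparameterization trick: z = mu(x) + sigma(x) * epsilon
mu, sigma = encoder(x)
eps = rng.normal(size=D_z)               # epsilon ~ N(0, I)
z = mu + sigma * eps                     # elementwise product

# Decoder side: sample during training ...
MU, SIGMA = decoder(z)
x_sample = MU + SIGMA * rng.normal(size=D_x)   # fresh, independent noise

# ... or, at inference, take the mean (the Gaussian's mode) deterministically.
x_map = MU
```

Because z is written as a deterministic function of (x, epsilon), gradients can flow through mu and sigma, which is the whole point of the reparameterization trick.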