(3 intermediate revisions by one other user not shown)
Line 57:
:I also found this incredibly confusing, as the prior on z is usually fixed and doesn't depend on any parameter. [[User:EitanPorat|EitanPorat]] ([[User talk:EitanPorat|talk]]) 00:16, 19 March 2023 (UTC)
::I see the confusion. p(z) is a probability distribution, but sometimes the same notation is used in conjunction with a parameter set to indicate that it is actually a parameterized function! The article should be cleaned up. The encoder should be called q_phi everywhere and the decoder should be called p_theta. The reason is that to optimize the
== The image shows just a normal autoencoder, not a variational autoencoder ==
Line 66:
I'm not sure whether that image should just be removed, or whether it makes sense in the section anyway. [[User:Volker Siegel|Volker Siegel]] ([[User talk:Volker Siegel|talk]]) 14:18, 24 January 2022 (UTC)
:Just to make this point clear: The reparameterization trick is for the gradients! The trick separates the source of randomness to another node in the DAG that does not have any parameters, so that we can propagate gradients through the rest of the DAG that is now a deterministic function. [[Special:Contributions/82.102.110.228|82.102.110.228]] ([[User talk:82.102.110.228|talk]]) 18:57, 27 December 2024 (UTC)
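To make the point about gradients concrete, here is a minimal sketch of the trick for a one-dimensional Gaussian, assuming a toy objective f(z) = z**2 that is not taken from the article. The function name `reparam_gradients` and its parameters are hypothetical. The noise `eps` is drawn from a node that has no parameters, so z becomes a deterministic function of (mu, sigma, eps) and the chain rule applies to each sample.

```python
import random


def reparam_gradients(mu, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of d/dmu and d/dsigma of E[f(z)], z ~ N(mu, sigma^2).

    Toy objective (an assumption for illustration): f(z) = z**2.
    """
    rng = random.Random(seed)
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        # Source of randomness: a node with no learnable parameters.
        eps = rng.gauss(0.0, 1.0)
        # Deterministic path from the parameters to the sample.
        z = mu + sigma * eps
        # Chain rule through the deterministic path: df/dz = 2z.
        df_dz = 2.0 * z
        grad_mu += df_dz * 1.0    # dz/dmu = 1
        grad_sigma += df_dz * eps  # dz/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples


if __name__ == "__main__":
    g_mu, g_sigma = reparam_gradients(mu=1.5, sigma=0.5)
    # Analytic check: E[z^2] = mu^2 + sigma^2, so the exact gradients
    # are 2*mu and 2*sigma.
    print(g_mu, g_sigma)
```

For this toy objective the estimates can be compared against the exact gradients 2*mu and 2*sigma. In an actual VAE the same idea is applied to the encoder outputs, with an automatic differentiation library computing the per-sample derivatives.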
== This is a highly technical topic ==
Line 74 ⟶ 76:
The architecture section is filled with unclear phrases and undefined terms. For example, "noise distribution", "q-distributions or variational posteriors", "p-distributions", "amortized approach", "which is usually intractable" (what is intractable?), "free energy expression". None of these are defined. It is unclear if this section of the article is useful to anyone who is not already familiar with how variational autoencoders work. [[User:Joshuame13|Joshuame13]] ([[User talk:Joshuame13|talk]]) 15:14, 31 January 2023 (UTC)
:I've fixed most of those. The free energy really needs its own section. It is a lower bound that is obtained by using Jensen's inequality on the log likelihood. However, I don't think that Jensen's inequality is within the scope of this article. [[Special:Contributions/46.199.5.20|46.199.5.20]] ([[User talk:46.199.5.20|talk]]) 19:50, 26 December 2024 (UTC)
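For reference, the bound mentioned above can be sketched in a few lines. This assumes the q_phi (encoder) and p_theta (decoder) naming proposed elsewhere on this page, and writes x for an observed data point, which is a symbol introduced here for illustration. The only inequality in the chain is Jensen's, applied to the concave logarithm.

```latex
\begin{align}
\log p_\theta(x)
  &= \log \int p_\theta(x, z)\, dz \\
  &= \log \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right] \\
  &\ge \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right]
  \qquad \text{(Jensen's inequality)}
\end{align}
```

The last line is the quantity usually called the evidence lower bound. Sign conventions vary between sources, and some texts use "free energy" for the negative of this expression.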
== The ELBO section needs more derivation ==
Line 82 ⟶ 86:
:I agree p_theta(z) doesn't make sense. [[User:EitanPorat|EitanPorat]] ([[User talk:EitanPorat|talk]]) 00:17, 19 March 2023 (UTC)
::Agreed. It should be p_phi(z) or even better q_phi(z). [[Special:Contributions/46.199.5.20|46.199.5.20]] ([[User talk:46.199.5.20|talk]]) 20:22, 26 December 2024 (UTC)
== Rating this article C-class ==