Autoencoder: Difference between revisions

Line 66:
Sparsity may be achieved by adding regularization terms to the [[loss function]] during training (for example, by penalizing the [[Kullback–Leibler divergence]] between the average activation of each hidden unit and a small target value),<ref>{{citation|title=sparse autoencoders|url=https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf}}</ref> or by manually zeroing all but the few strongest hidden unit activations (referred to as a ''k-sparse autoencoder'').<ref>{{citation|title=k-sparse autoencoder|arxiv=1312.5663}}</ref>
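
A minimal sketch of both sparsity mechanisms in Python is shown below. The function names, the NumPy implementation, and the target activation <code>rho</code> are illustrative assumptions, not details taken from the cited papers.

<syntaxhighlight lang="python">
import numpy as np

def kl_sparsity_penalty(h, rho=0.05):
    """KL-divergence sparsity term for sigmoid activations h in [0, 1]:
    penalizes the divergence between each hidden unit's mean activation
    over the batch and the small target value rho."""
    rho_hat = np.clip(h.mean(axis=0), 1e-8, 1 - 1e-8)  # avoid log(0)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def k_sparse(h, k):
    """k-sparse step: zero all but the k strongest activations per example."""
    mask = np.zeros_like(h)
    top_k = np.argsort(h, axis=1)[:, -k:]  # indices of the k largest units
    np.put_along_axis(mask, top_k, 1.0, axis=1)
    return h * mask
</syntaxhighlight>

The first term is added to the reconstruction loss during training, while the second is applied in the forward pass itself, so the decoder only ever sees <code>k</code> active units per example.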
 
====Variational autoencoder (VAE)====
 
Variational autoencoder models inherit the autoencoder architecture, but make strong assumptions concerning the distribution of latent variables. They use a [[Variational Bayesian methods|variational approach]] for latent representation learning, which results in an additional loss component and a specific training algorithm called ''Stochastic Gradient Variational Bayes (SGVB)''.<ref name="VAE" /> The model assumes that the data is generated by a directed [[graphical model]] <math>p_{\theta}(\mathbf{x}|\mathbf{z})</math> and that the encoder learns an approximation <math>q_{\phi}(\mathbf{z}|\mathbf{x})</math> to the [[Posterior probability|posterior distribution]] <math>p_{\theta}(\mathbf{z}|\mathbf{x})</math>, where <math>\mathbf{\phi}</math> and <math>\mathbf{\theta}</math> denote the parameters of the encoder (recognition model) and decoder (generative model) respectively. The objective of the variational autoencoder then has the following form:
:<math>\mathcal{L}(\mathbf{\phi},\mathbf{\theta},\mathbf{x})=D_{KL}(q_{\phi}(\mathbf{z}|\mathbf{x})||p_{\theta}(\mathbf{z}))-\mathbb{E}_{q_{\phi}(\mathbf{z}|\mathbf{x})}\big(\log p_{\theta}(\mathbf{x}|\mathbf{z})\big)</math>
Here, <math>D_{KL}</math> stands for the [[Kullback–Leibler divergence]]. The prior over the latent variables is usually set to be the centred isotropic multivariate Gaussian <math>p_{\theta}(\mathbf{z})=\mathcal{N}(\mathbf{0},\mathbf{I})</math>; however, alternative configurations have also recently been considered.<ref>Harris Partaourides and Sotirios P. Chatzis, "Asymmetric Deep Generative Models," Neurocomputing, vol. 241, pp. 90–96, June 2017. [http://www.sciencedirect.com/science/article/pii/S0925231217302989]</ref>
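
As a concrete illustration, the following is a minimal sketch in PyTorch of a VAE trained with SGVB under the standard Gaussian prior <math>\mathcal{N}(\mathbf{0},\mathbf{I})</math>. The layer sizes and the Bernoulli (binary cross-entropy) decoder likelihood are illustrative assumptions rather than part of the objective itself.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):  # sizes are illustrative
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)      # mean of q_phi(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)  # log-variance of q_phi(z|x)
        self.dec1 = nn.Linear(z_dim, h_dim)
        self.dec2 = nn.Linear(h_dim, x_dim)

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # which keeps the sample differentiable with respect to phi (SGVB).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_hat = torch.sigmoid(self.dec2(F.relu(self.dec1(z))))
        return x_hat, mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    """Negative ELBO: -E_q[log p_theta(x|z)] as Bernoulli cross-entropy,
    plus D_KL(q_phi(z|x) || N(0, I)) in closed form."""
    rec = F.binary_cross_entropy(x_hat, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl
</syntaxhighlight>

Minimizing <code>vae_loss</code> on mini-batches with stochastic gradient descent is the SGVB procedure: because the Gaussian prior and posterior admit a closed-form KL term, only the reconstruction expectation has to be estimated by sampling.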
 
==== Contractive autoencoder (CAE) ====