==Advantages of depth==
[[File:Autoencoder_structure.png|350x350px|Schematic structure of an autoencoder with 3 fully connected hidden layers. The code (z, or h for reference in the text) is the most internal layer.|thumb]]
Autoencoders are often trained with a single-layer encoder and a single-layer decoder, but using many-layered (deep) encoders and decoders offers many advantages.<ref name=":0" />
 
* Depth can exponentially reduce the computational cost of representing some functions.<ref name=":0" />
* Depth can exponentially decrease the amount of training data needed to learn some functions.<ref name=":0" />
* Experimentally, deep autoencoders yield better compression compared to shallow or linear autoencoders.<ref name=":7" />
 
=== Training ===
[[Geoffrey Hinton]] developed the [[deep belief network]] technique for training many-layered deep autoencoders. His method involves treating each neighboring set of two layers as a [[restricted Boltzmann machine]] so that pretraining approximates a good solution, then using backpropagation to fine-tune the results.<ref name=":7">{{cite journal|last1=Hinton|first1=G. E.|last2=Salakhutdinov|first2=R.R.|title=Reducing the Dimensionality of Data with Neural Networks|journal=Science|date=28 July 2006|volume=313|issue=5786|pages=504–507|doi=10.1126/science.1127647|pmid=16873662|bibcode=2006Sci...313..504H|s2cid=1658773}}</ref>
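The greedy layerwise idea can be sketched in a few lines of NumPy. This is an illustrative simplification, not Hinton's method: it substitutes small shallow autoencoders (trained by gradient descent on squared reconstruction error) for the restricted Boltzmann machines, but it shows the same structure of pretraining one layer at a time on the codes produced by the layer below. All function and variable names here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_shallow_ae(X, n_hidden, lr=0.05, epochs=200):
    """Train a one-hidden-layer autoencoder on X.

    Returns the encoder weights and the hidden codes, which become
    the training data for the next layer in the stack.
    """
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))   # encoder weights
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))   # decoder weights
    for _ in range(epochs):
        H = np.tanh(X @ W1)                # hidden codes
        R = H @ W2                         # linear reconstruction
        err = R - X                        # gradient of 0.5*||R - X||^2 w.r.t. R
        gW2 = H.T @ err
        gH = (err @ W2.T) * (1.0 - H**2)   # backprop through tanh
        gW1 = X.T @ gH
        W1 -= lr * gW1 / len(X)
        W2 -= lr * gW2 / len(X)
    return W1, np.tanh(X @ W1)

# Greedy layerwise pretraining: each layer is trained on the codes
# produced by the previously trained layers.
X = rng.normal(size=(256, 8))
codes = X
encoders = []
for width in (6, 4, 2):
    W, codes = train_shallow_ae(codes, width)
    encoders.append(W)
```

After this pretraining pass, the stacked encoder weights (and their transposed decoder counterparts) would initialize a deep autoencoder, which is then fine-tuned end to end with backpropagation.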
 
Researchers have debated whether joint training (i.e. training the whole architecture together with a single global reconstruction objective to optimize) would be better for deep autoencoders.<ref name=":9">{{cite arXiv |eprint=1405.1380|last1=Zhou|first1=Yingbo|last2=Arpit|first2=Devansh|last3=Nwogu|first3=Ifeoma|last4=Govindaraju|first4=Venu|title=Is Joint Training Better for Deep Auto-Encoders?|class=stat.ML|date=2014}}</ref> A 2015 study showed that joint training learns better data models along with more representative features for classification as compared to the layerwise method.<ref name=":9" /> However, their experiments showed that the success of joint training depends heavily on the regularization strategies adopted.<ref name=":9" /><ref>R. Salakhutdinov and G. E. Hinton, "Deep Boltzmann machines," in AISTATS, 2009, pp. 448–455.</ref>
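By contrast with the layerwise scheme, joint training optimizes every layer at once against one global reconstruction loss. The sketch below (again a toy NumPy illustration with invented names, not the cited authors' code) trains a small deep autoencoder end to end, with L2 weight decay standing in for the regularization that the cited study found so influential.

```python
import numpy as np

rng = np.random.default_rng(1)

def joint_train(X, widths=(6, 4, 6), lr=0.05, weight_decay=1e-4, epochs=300):
    """Train all layers of a deep autoencoder at once.

    A single squared-error reconstruction objective is backpropagated
    through the whole stack every step; weight_decay adds L2
    regularization to each layer's gradient.
    """
    dims = [X.shape[1], *widths, X.shape[1]]
    Ws = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
    for _ in range(epochs):
        acts = [X]
        for W in Ws:                                  # forward through every layer
            acts.append(np.tanh(acts[-1] @ W))
        # gradient of the single global loss w.r.t. the output pre-activation
        grad = (acts[-1] - X) * (1.0 - acts[-1] ** 2)
        for i in reversed(range(len(Ws))):
            gW = acts[i].T @ grad / len(X) + weight_decay * Ws[i]
            if i:                                     # propagate before updating Ws[i]
                grad = (grad @ Ws[i].T) * (1.0 - acts[i] ** 2)
            Ws[i] -= lr * gW
    return Ws

X = rng.normal(size=(128, 8))
Ws = joint_train(X)
```

Note how no layer ever sees its own local reconstruction target; the only training signal is the error at the final output, which is what distinguishes joint training from the greedy layerwise procedure above.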
 
== Applications ==