The idea is to add structures called capsules to a [[convolutional neural network]] (CNN), and to reuse output from several of those capsules to form more stable (with respect to various perturbations) representations for higher-order capsules.<ref>{{Cite journal|last=Hinton|first=Geoffrey E.|last2=Krizhevsky|first2=Alex|last3=Wang|first3=Sida D.|date=2011-06-14|title=Transforming Auto-Encoders|url=https://link.springer.com/chapter/10.1007/978-3-642-21735-7_6|journal=Artificial Neural Networks and Machine Learning – ICANN 2011|volume=6791|series=Lecture Notes in Computer Science|language=en|publisher=Springer, Berlin, Heidelberg|pages=44–51|doi=10.1007/978-3-642-21735-7_6|isbn=9783642217340}}</ref> The output is a vector consisting of the [[Realization (probability)|probability of an observation]] and a [[Pose (computer vision)|pose for that observation]]. This vector is similar to what is produced, for example, when performing ''classification with localization'' in CNNs.
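A minimal sketch of how a single capsule's output can encode both quantities in one vector, using the "squashing" nonlinearity introduced by Sabour, Frosst, and Hinton (2017): the vector's length is shrunk into [0, 1) so it can be read as the probability that an entity is present, while its direction (the pose) is preserved. The function and variable names here are illustrative, not part of any particular library.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Squashing nonlinearity for a capsule's raw output vector s.

    Maps the vector's length into [0, 1) -- interpretable as the
    probability that the entity the capsule detects is present --
    while leaving its direction (the pose) unchanged.
    """
    sq_norm = np.sum(s ** 2)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

# Hypothetical raw capsule output; direction encodes the pose.
raw = np.array([3.0, 4.0])          # length 5.0
v = squash(raw)
prob = np.linalg.norm(v)            # length of v: probability-like, < 1
```

For the example above, `prob` is 25/26 (about 0.96), since a raw length of 5 gives a squashed length of 5²/(1 + 5²); the direction of `v` is identical to that of `raw`.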
Among other benefits, capsnets address the "Picasso problem" in image recognition: images that have all the right parts but that are not in the correct spatial relationship (e.g., in a "face", the positions of the mouth and one eye are switched). For image recognition, capsnets exploit the fact that while viewpoint changes have nonlinear effects at the pixel level, they have linear effects at the part/object level.
{{TOC limit|3}}