In 2000, [[Geoffrey Hinton]] et al. described an imaging system that combined segmentation and recognition into a single inference process using [[Parse tree|parse trees]]. So-called credibility networks described the joint distribution over the latent variables and over the possible parse trees. That system proved useful on the [[MNIST database|MNIST]] handwritten digit database.<ref name=":0" />
Capsule networks were introduced by Hinton and his team in 2017. The approach was claimed to reduce error rates on [[MNIST database|MNIST]].
In Hinton's original idea, one minicolumn would represent and detect one multidimensional entity.<ref>{{Citation|last=Meher Vamsi|title=Geoffrey Hinton Capsule theory|date=2017-11-15|url=https://www.youtube.com/watch?v=6S1_WqE55UQ|accessdate=2017-12-06}}</ref><ref group="note" name=":0" />
== Alternatives ==
CapsNets are claimed to have four major conceptual advantages over [[Convolutional neural network|convolutional neural networks]] (CNNs):
* Viewpoint invariance: the use of pose matrices allows capsule networks to recognize objects regardless of the perspective from which they are viewed.
* Fewer parameters: because capsules group neurons, the connections between layers require fewer parameters.
* Better generalization to new viewpoints: a CNN trained on rotated examples often only memorizes that the same object can appear under several distinct rotations. Capsule networks generalize better to new viewpoints because a pose matrix can represent a viewpoint change as a linear transformation of the pose (see the worked equation after this list).
* Defense against white-box adversarial attacks: the Fast Gradient Sign Method (FGSM) is a typical method for attacking CNNs. It computes the gradient of the loss with respect to each pixel and shifts each pixel by at most epsilon (the maximum allowed perturbation) in the direction that increases the loss. Although this attack can drop the accuracy of CNNs dramatically (e.g. to below 20%), capsule networks have been reported to maintain accuracy above 70% (a minimal sketch of FGSM follows this list).
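
The viewpoint claim can be made concrete with a small worked equation. The notation below is illustrative: it follows the matrix-capsule convention of forming a vote as a pose matrix times a learned transformation matrix, and the specific symbols are not taken from the original papers. If a part capsule has pose matrix <math>M_i</math> and the learned, viewpoint-independent part–whole relationship is <math>W_{ij}</math>, its vote for the pose of whole <math>j</math> is

<math display="block">V_{ij} = M_i W_{ij}.</math>

A change of viewpoint that acts on every pose as <math>M_i \mapsto T M_i</math> turns every vote into <math>T M_i W_{ij} = T V_{ij}</math>, so votes that agreed before the viewpoint change still agree after it, while the learned matrices <math>W_{ij}</math> remain unchanged.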
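
Below is a minimal sketch of the FGSM attack described above, assuming a PyTorch-style classifier and cross-entropy loss; the function name, the model, and the value of epsilon are illustrative and not taken from the capsule networks literature.

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Shift every pixel of x by at most epsilon in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)    # loss of the network on the clean input
    loss.backward()                        # gradient of the loss with respect to each pixel
    x_adv = x + epsilon * x.grad.sign()    # move each pixel by +/- epsilon
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid image range
</syntaxhighlight>

The same perturbed images can then be fed to either a CNN or a capsule network to compare how much accuracy each model loses.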