Universal approximation theorem: Difference between revisions

rewrote a little to remove redundancy and to remove a 'we'
Moved sentence fragment in lead to Formal statement section
Line 4:
 
In 1991, Kurt Hornik showed<ref name=horn> Kurt Hornik (1991) "Approximation Capabilities of Multilayer Feedforward Networks", ''Neural Networks'', 4(2), 251–257 </ref> that it is not the specific choice of activation function, but rather the multilayer feedforward architecture itself, that gives neural networks the potential to be universal approximators. The output units are always assumed to be linear. For notational convenience, only the single-output case is shown; the general case can easily be deduced from it.
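
As an illustration of the architecture described above, the following minimal Python sketch evaluates a single-output network with one hidden layer and a linear output unit. It is not taken from the cited sources: the function name <code>hidden_layer_net</code>, its parameter names, and the numeric values are hypothetical and chosen only for this example. The activation is passed as an argument to reflect that the architecture, rather than one particular activation function, is the point of interest.

```python
import math

def hidden_layer_net(x, weights, biases, out_coeffs, activation=math.tanh):
    """Single-output network: a linear combination of hidden-unit activations."""
    # Each hidden unit applies the activation to an affine function of the input.
    hidden = [
        activation(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
        for w, b in zip(weights, biases)
    ]
    # The output unit is linear: no activation is applied to the weighted sum.
    return sum(v * h for v, h in zip(out_coeffs, hidden))

# Hypothetical parameters for a network with 2 inputs and 3 hidden units.
W = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
b = [0.0, 0.1, -0.3]
v = [0.7, -1.2, 0.4]

y = hidden_layer_net([0.2, 0.8], W, b, v)
```

Because the output unit is linear, scaling every output coefficient by a constant scales the network output by the same constant. A network with several outputs can be treated as several such single-output networks sharing the same hidden layer.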
 
== Formal statement ==
 
In mathematical terms, the theorem<ref name=cyb/><ref name=horn/><ref>Haykin, Simon (1998). ''Neural Networks: A Comprehensive Foundation'', Volume 2, Prentice Hall. ISBN 0-13-273350-1.</ref><ref>Hassoun, M. (1995) ''Fundamentals of Artificial Neural Networks'' MIT Press, p.&nbsp;48</ref> states:
 
<blockquote>