Universal approximation theorem
In the [[mathematics|mathematical]] theory of [[neural networks]], the '''universal approximation theorem''' states<ref>Balázs Csanád Csáji. Approximation with Artificial Neural Networks; Faculty of Sciences; Eötvös Loránd University, Hungary</ref> that a [[feedforward neural network|feed-forward]] network with a single hidden layer containing a finite number of [[neuron]]s, the simplest form of the [[multilayer perceptron]], is a universal approximator among [[continuous functions]] on [[Compact_space|compact subsets]] of [[Euclidean space|'''R'''<sup>n</sup>]], under mild assumptions on the activation function.
 
One of the first versions of the [[theorem]] was proved by [[George Cybenko]] in 1989 for [[sigmoid function|sigmoid]] activation functions.<ref name=cyb>Cybenko., G. (1989) [http://actcomm.dartmouth.edu/gvc/papers/approx_by_superposition.pdf "Approximations by superpositions of sigmoidal functions"], ''[[Mathematics of Control, Signals, and Systems]]'', 2 (4), 303-314</ref>
 
Kurt Hornik showed in 1991<ref name=horn>Kurt Hornik (1991) "Approximation Capabilities of Multilayer Feedforward Networks", ''Neural Networks'', 4(2), 251–257</ref> that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself, which gives neural networks the potential of being universal approximators. The output units are always assumed to be linear. For notational convenience, only the single output case will be shown. (The general case can easily be deduced from the single output case.)
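The claim can be illustrated numerically. In the sketch below (not part of the article; all names and parameter values are illustrative), a single hidden layer of sigmoid neurons with a linear output unit approximates the continuous function sin(''x'') on a compact interval. The hidden-layer weights and biases are drawn at random, and only the linear output weights are fitted by least squares:

```python
# Minimal sketch of the universal approximation property:
# one hidden layer of sigmoid neurons, one linear output unit,
# approximating the continuous function sin(x) on the compact set [-3, 3].
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_hidden = 100
x = np.linspace(-3.0, 3.0, 200)        # sample points in a compact subset of R
target = np.sin(x)                     # continuous function to approximate

w = rng.uniform(-4.0, 4.0, n_hidden)   # random hidden-layer weights (illustrative range)
b = rng.uniform(-4.0, 4.0, n_hidden)   # random hidden-layer biases
H = sigmoid(np.outer(x, w) + b)        # hidden activations, shape (200, n_hidden)

# Linear output unit: choose output weights v minimizing ||H v - target||.
v, *_ = np.linalg.lstsq(H, target, rcond=None)
approx = H @ v

max_error = np.max(np.abs(approx - target))
print(max_error)  # shrinks as n_hidden grows
```

The theorem guarantees only that such an approximating network exists for any prescribed accuracy; it says nothing about how the weights are found, so the random-feature fit above is just one convenient way to exhibit an approximant.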
 
The theorem<ref name=cyb/><ref name=horn/><ref>Haykin, Simon (1998). ''Neural Networks: A Comprehensive Foundation'', Volume 2, Prentice Hall. ISBN 0-13-273350-1.</ref><ref>Hassoun, M. (1995) ''Fundamentals of Artificial Neural Networks'' MIT Press, p.&nbsp;48</ref> in mathematical terms: