Universal approximation theorem

In the [[mathematics|mathematical]] theory of [[neural networks]], the '''universal approximation theorem''' states<ref>Balázs Csanád Csáji. Approximation with Artificial Neural Networks; Faculty of Sciences; Eötvös Loránd University, Hungary</ref> that a [[feedforward neural network|feed-forward]] network with a single hidden layer containing a finite number of [[neuron]]s (i.e., a [[multilayer perceptron]]) can approximate [[continuous functions]] on [[Compact_space|compact subsets]] of [[Euclidean space|'''R'''<sup>n</sup>]], under mild assumptions on the activation function. The theorem thus states that simple neural networks can ''represent'' a wide variety of interesting functions when given appropriate parameters; it does not touch upon the algorithmic [[Computational learning theory|learnability]] of those parameters.
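The statement can be illustrated numerically. The sketch below (an assumption-laden demonstration, not part of any proof) builds a single-hidden-layer sigmoid network for the continuous target sin(2πx) on the compact set [0, 1]: hidden weights and biases are fixed at randomly chosen values, and only the output-layer coefficients are fitted by least squares. The target function, the number of hidden units, and the weight scales are all illustrative choices.

```python
import numpy as np

# Illustrative sketch of the universal approximation theorem:
# approximate f(x) = sin(2*pi*x) on [0, 1] with a one-hidden-layer
# network  g(x) = sum_j c_j * sigma(w_j * x + b_j), sigma = logistic sigmoid.
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_hidden = 200                                # number of hidden neurons (assumed)
w = rng.normal(scale=10.0, size=n_hidden)     # fixed random hidden weights
b = rng.uniform(-10.0, 10.0, size=n_hidden)   # fixed random hidden biases

x = np.linspace(0.0, 1.0, 500)
target = np.sin(2 * np.pi * x)

# Matrix of hidden activations; fit output weights c by least squares.
H = sigmoid(np.outer(x, w) + b)               # shape (500, n_hidden)
c, *_ = np.linalg.lstsq(H, target, rcond=None)

max_err = np.max(np.abs(H @ c - target))
print(max_err)
```

With enough hidden units the maximum error on the grid becomes small, in line with the theorem; note that the least-squares fit here is a convenience, since the theorem itself guarantees only the existence of good parameters, not a procedure for finding them.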
 
One of the first versions of the [[theorem]] was proved by [[George Cybenko]] in 1989 for [[sigmoid function|sigmoid]] activation functions.<ref name=cyb>Cybenko, G. (1989) [http://deeplearning.cs.cmu.edu/pdfs/Cybenko.pdf "Approximation by superpositions of a sigmoidal function"], ''[[Mathematics of Control, Signals, and Systems]]'', 2 (4), 303–314</ref>