Universal approximation theorem
 
== Arbitrary-width case ==
A spate of papers in the 1980s–1990s, from [[George Cybenko]], {{ill|Kurt Hornik|de}}, and others, established several universal approximation theorems for arbitrary width and bounded depth.<ref>{{cite journal |last1=Funahashi |first1=Ken-Ichi |title=On the approximate realization of continuous mappings by neural networks |journal=Neural Networks |date=January 1989 |volume=2 |issue=3 |pages=183–192 |doi=10.1016/0893-6080(89)90003-8 }}</ref><ref name="cyb" /><ref name="MLP-UA">{{cite journal |last1=Hornik |first1=Kurt |last2=Stinchcombe |first2=Maxwell |last3=White |first3=Halbert |title=Multilayer feedforward networks are universal approximators |journal=Neural Networks |date=January 1989 |volume=2 |issue=5 |pages=359–366 |doi=10.1016/0893-6080(89)90020-8 }}</ref><ref name="horn" /> See<ref>Haykin, Simon (1998). ''Neural Networks: A Comprehensive Foundation'', Volume 2, Prentice Hall. {{isbn|0-13-273350-1}}.</ref><ref>Hassoun, M. (1995). ''Fundamentals of Artificial Neural Networks''. MIT Press. p.&nbsp;48.</ref><ref name="pinkus" /> for reviews. The following is the most often quoted:

{{math theorem
| name = Universal approximation theorem
| math_statement = Let <math>C(X, \mathbb{R}^m)</math> denote the set of [[continuous functions]] from a subset <math>X</math> of a Euclidean space <math>\mathbb{R}^n</math> to the Euclidean space <math>\mathbb{R}^m</math>. Let <math>\sigma \in C(\mathbb{R}, \mathbb{R})</math>. Note that <math>(\sigma \circ x)_i = \sigma(x_i)</math>, so <math>\sigma \circ x</math> denotes <math>\sigma</math> applied to each component of <math>x</math>.

Then <math>\sigma</math> is not [[polynomial]] if and only if for every <math>n \in \mathbb{N}</math>, <math>m \in \mathbb{N}</math>, compact <math>K \subseteq \mathbb{R}^n</math>, <math>f \in C(K, \mathbb{R}^m)</math>, <math>\varepsilon > 0</math> there exist <math>k \in \mathbb{N}</math>, <math>A \in \mathbb{R}^{k \times n}</math>, <math>b \in \mathbb{R}^k</math>, <math>C \in \mathbb{R}^{m \times k}</math> such that
<math display="block">\sup_{x \in K} \| f(x) - g(x) \| < \varepsilon</math>
where <math>g(x) = C \cdot ( \sigma \circ (A \cdot x + b) )</math>.
}}
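The theorem guarantees existence only: it does not say how to find <math>A</math>, <math>b</math>, and <math>C</math> for a given target. As a concrete illustration (a minimal sketch, not taken from the cited papers; the ReLU activation and the random-feature least-squares fit are assumptions made here for brevity), the following builds a network of exactly the form <math>g(x) = C \cdot (\sigma \circ (A \cdot x + b))</math> and fits only <math>C</math> to approximate a continuous target on a compact interval:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on the compact set K = [-2, 2] (n = m = 1).
f = lambda x: np.sin(3 * x) + 0.5 * x
K = np.linspace(-2.0, 2.0, 500).reshape(-1, 1)

# One hidden layer of width k, in the theorem's form g(x) = C.(sigma(A.x + b)).
k = 200
A = rng.normal(size=(k, 1)) * 3.0       # hidden weights, shape k x n
b = rng.normal(size=k) * 3.0            # hidden biases, shape k
sigma = lambda z: np.maximum(z, 0.0)    # ReLU: continuous and not a polynomial

# Fit only the output matrix C by least squares (a "random features" strategy).
H = sigma(K @ A.T + b)                  # hidden activations, one row per sample
C, *_ = np.linalg.lstsq(H, f(K[:, 0]), rcond=None)

g = lambda x: sigma(x @ A.T + b) @ C
print("sup error on K:", np.max(np.abs(f(K[:, 0]) - g(K))))
</syntaxhighlight>

Increasing the width <math>k</math> drives the reported supremum error toward zero, in line with the theorem's guarantee for non-polynomial activations.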
 
Notice also that the neural network is only required to approximate the target function within the compact set <math>K</math>; the proof does not describe how the approximation would extrapolate outside this region.
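A self-contained check of this caveat (again an illustrative sketch under the same assumptions as above, not a construction from the cited sources): the fitted network matches the target closely on <math>K = [-2, 2]</math> but can be far off on points beyond it.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.5 * x
sigma = lambda z: np.maximum(z, 0.0)

# Fit g on samples drawn only from the compact set K = [-2, 2].
K = np.linspace(-2.0, 2.0, 500).reshape(-1, 1)
k = 200
A = rng.normal(size=(k, 1)) * 3.0
b = rng.normal(size=k) * 3.0
C, *_ = np.linalg.lstsq(sigma(K @ A.T + b), f(K[:, 0]), rcond=None)
g = lambda x: sigma(x @ A.T + b) @ C

inside = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)   # points within K
outside = np.linspace(3.0, 5.0, 200).reshape(-1, 1)   # points beyond K
print("max error inside K :", np.max(np.abs(f(inside[:, 0]) - g(inside))))
print("max error outside K:", np.max(np.abs(f(outside[:, 0]) - g(outside))))
</syntaxhighlight>

The gap between the two printed errors shows the guarantee is local to <math>K</math>: beyond it, a ReLU network extrapolates piecewise-linearly while the target keeps oscillating.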
 
The restriction to non-polynomial activation functions can be removed by allowing the outputs of the hidden layers to be multiplied together (the "pi-sigma networks"), yielding the generalization:<ref name="MLP-UA" />
{{math theorem
| name = Universal approximation theorem for pi-sigma networks
| math_statement = With any nonconstant activation function, a one-hidden-layer pi-sigma network is a universal approximator.
}}
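The architecture this refers to can be sketched as follows (an illustrative forward pass only; the shapes, the <code>tanh</code> activation, and the name <code>pi_sigma</code> are assumptions made here, not details from the cited paper). Each unit multiplies together the activations of several affine maps of the input, and the output is a weighted sum of those products:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
sigma = np.tanh   # any nonconstant activation suffices per the theorem

def pi_sigma(x, A, B, beta):
    """One-hidden-layer network whose hidden activations are multiplied:
       out = sum_j beta[j] * prod_k sigma(A[j, k] . x + B[j, k])."""
    pre = A @ x + B                           # (units, d) affine pre-activations
    return beta @ np.prod(sigma(pre), axis=1) # multiply within, sum across units

# Illustrative shapes: 10 units, each multiplying d = 3 activations, input in R^2.
units, d, n = 10, 3, 2
A = rng.normal(size=(units, d, n))
B = rng.normal(size=(units, d))
beta = rng.normal(size=units)

print(pi_sigma(np.array([0.5, -1.0]), A, B, beta))
</syntaxhighlight>

Because products of activations appear, even a polynomial <math>\sigma</math> no longer confines the network to a fixed-degree family of polynomials, which is consistent with the theorem requiring only a nonconstant activation.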