Universal approximation theorem

In the [[mathematics|mathematical]] theory of [[artificial neural networks]], '''universal approximation theorems''' are theorems<ref name="MLP-UA">{{cite journal |last1=Hornik |first1=Kurt |last2=Stinchcombe |first2=Maxwell |last3=White |first3=Halbert |title=Multilayer feedforward networks are universal approximators |journal=Neural Networks |date=January 1989 |volume=2 |issue=5 |pages=359–366 |doi=10.1016/0893-6080(89)90020-8 }}</ref><ref>{{cite thesis |last=Csáji |first=Balázs Csanád |title=Approximation with Artificial Neural Networks |year=2001 |publisher=Faculty of Sciences, Eötvös Loránd University |location=Hungary}}</ref> of the following form: Given a family of neural networks, for each function <math>f</math> from a certain [[function space]], there exists a sequence of neural networks <math>\phi_1, \phi_2, \dots</math> from the family, such that <math>\phi_n \to f</math> according to some criterion. That is, the family of neural networks is [[Dense set|dense]] in the function space.
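For example, the convergence criterion is commonly [[uniform convergence]] on compact sets: for every [[compact space|compact]] subset <math>K</math> of the domain,
<math display="block">\sup_{x \in K} \|\phi_n(x) - f(x)\| \to 0 \quad \text{as } n \to \infty,</math>
which is convergence in the [[compact convergence]] topology discussed below.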
 
The most popular version states that [[feedforward neural network|feedforward networks]] with non-[[polynomial]] [[activation function]]s are dense in the space of [[continuous function]]s between two [[Euclidean space]]s, with respect to the [[compact convergence]] [[topology]].
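Concretely, for a continuous activation <math>\sigma</math> and scalar outputs, one common precise formulation concerns the family of one-hidden-layer networks
<math display="block">\phi(x) = \sum_{i=1}^{N} c_i \, \sigma\left(w_i^\mathsf{T} x + b_i\right), \qquad c_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n,</math>
and states that such sums are dense in <math>C(K)</math> for every compact <math>K \subseteq \mathbb{R}^n</math> precisely when <math>\sigma</math> is not a polynomial.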
 
Universal approximation theorems are existence theorems: they simply state that there ''exists'' such a sequence <math>\phi_1, \phi_2, \dots \to f</math>, and do not provide any way to actually find one. They also do not guarantee that any particular method, such as [[backpropagation]], will find such a sequence. Any method for searching the space of neural networks, including backpropagation, might or might not find a converging sequence (e.g., backpropagation might get stuck in a local optimum).
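The gap between existence and search can be seen in a minimal sketch like the following (assuming [[NumPy]]; the width <code>N = 20</code>, the learning rate, and the target <math>\sin x</math> are illustrative choices, not part of any theorem), which fits a one-hidden-layer <math>\tanh</math> network by plain gradient descent. Nothing guarantees that this run approaches the approximant whose existence the theorem asserts.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Target function to approximate on a compact interval.
x = np.linspace(-np.pi, np.pi, 200)
y = np.sin(x)

# One-hidden-layer network: phi(x) = sum_i c_i * tanh(w_i * x + b_i).
N = 20                        # hidden width (illustrative choice)
w = rng.normal(size=N)        # multiplicative input weights
b = rng.normal(size=N)        # biases
c = rng.normal(size=N) * 0.1  # output weights

lr = 0.05
for _ in range(5000):
    z = np.outer(w, x) + b[:, None]   # (N, m) pre-activations
    h = np.tanh(z)                    # hidden activations
    pred = c @ h                      # network outputs, shape (m,)
    err = pred - y

    # Gradients of the mean squared error, via the chain rule.
    dpred = 2 * err / x.size
    dc = h @ dpred
    dz = np.outer(c, dpred) * (1 - h ** 2)
    dw = dz @ x
    db = dz.sum(axis=1)

    # Plain gradient descent: may or may not converge to a
    # good approximant (e.g., it can stall in a local optimum).
    c -= lr * dc
    w -= lr * dw
    b -= lr * db

pred = c @ np.tanh(np.outer(w, x) + b[:, None])
print(f"max |phi(x) - sin(x)| on the grid: {np.max(np.abs(pred - y)):.4f}")
</syntaxhighlight>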
Line 11:
 
== Setup ==
[[Artificial neural networks]] combine many simple mathematical functions to implement more complicated functions mapping (typically) real-valued [[vector (mathematics and physics)|vectors]] to real-valued vectors. The spaces of multivariate functions that can be implemented by a network are determined by the structure of the network, the set of simple functions, and the network's multiplicative parameters. A great deal of theoretical work has gone into characterizing these function spaces.
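For example, a feedforward network with <math>L</math> layers and a fixed activation <math>\sigma</math> (applied coordinatewise) implements the composition
<math display="block">\phi(x) = A_L \circ \sigma \circ A_{L-1} \circ \cdots \circ \sigma \circ A_1(x),</math>
where each <math>A_\ell(x) = W_\ell x + b_\ell</math> is an [[affine map]]; the matrices <math>W_\ell</math> are the multiplicative parameters mentioned above.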
 
Most universal approximation theorems fall into one of two classes. The first quantifies the approximation capabilities of neural networks with an arbitrary number of artificial neurons ("''arbitrary width''" case), and the second focuses on the case with an arbitrary number of hidden layers, each containing a limited number of artificial neurons ("''arbitrary depth''" case). In addition to these two classes, there are also universal approximation theorems for neural networks with a bounded number of hidden layers and a bounded number of neurons in each layer ("''bounded depth and bounded width''" case).
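In the notation above, the ''arbitrary width'' results fix the depth (typically one hidden layer) and let the hidden dimension grow, the ''arbitrary depth'' results bound the dimensions of every <math>W_\ell</math> and let <math>L</math> grow, and the ''bounded depth and bounded width'' results constrain both.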