Universal approximation theorem

This is an old revision of this page, as edited by Albertzeyer (talk | contribs) at 14:37, 10 June 2010 (Formal statement: more cleanup). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In mathematics, the universal approximation theorem states[1] that a standard multilayer feed-forward network with a single hidden layer containing a finite number of hidden neurons is a universal approximator of continuous functions on compact subsets of ℝⁿ, under mild assumptions on the activation function.

The theorem was first proved by George Cybenko in 1989 for sigmoid activation functions; it is therefore also called the Cybenko theorem[2].

Kurt Hornik (1991) showed that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself, that gives neural networks the potential to be universal approximators. The output units are always assumed to be linear. For notational convenience, the result is stated here only for the case of a single output unit; the general case can easily be deduced from it.
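
The step from one output unit to several can be sketched as follows. The notation used here (a compact domain $K$, $k$ output units, component networks $F_j$ with $m_j$ hidden units) is introduced only for this illustration and is not taken from the cited sources. Apply the single-output result to each component of the target function, then place all the hidden units side by side in one hidden layer, letting output unit $j$ use nonzero weights only on the hidden units of $F_j$.

```latex
% Assumed setting: f = (f_1, ..., f_k) is continuous on a compact set K, and eps > 0.
% Step 1: the single-output theorem, applied to each component f_j separately.
\sup_{x \in K} \left| F_j(x) - f_j(x) \right| < \varepsilon
\qquad \text{for each } j = 1, \dots, k .

% Step 2: stack the k networks into one network with m_1 + ... + m_k hidden units
% and k linear output units; the bound then holds for every component at once.
F = (F_1, \dots, F_k), \qquad
\max_{1 \le j \le k} \; \sup_{x \in K} \left| F_j(x) - f_j(x) \right| < \varepsilon .
```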

The theorem[3][4][5][6] can be stated in mathematical terms as follows.

Formal statement

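Let $\varphi(\cdot)$ be a nonconstant, bounded, and monotonically increasing continuous function. Let $I_{m_0}$ denote the $m_0$-dimensional unit hypercube $[0,1]^{m_0}$. The space of continuous functions on $I_{m_0}$ is denoted by $C(I_{m_0})$. Then, given any function $f \in C(I_{m_0})$ and $\varepsilon > 0$, there exist an integer $m_1$ and real constants $\alpha_i, b_i \in \mathbb{R}$ and $w_i \in \mathbb{R}^{m_0}$, where $i = 1, \ldots, m_1$, such that we may define

$$F(x) = \sum_{i=1}^{m_1} \alpha_i \, \varphi\left(w_i^T x + b_i\right)$$

as an approximate realization of the function $f$; that is,

$$\left| F(x) - f(x) \right| < \varepsilon$$

for all $x \in I_{m_0}$.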
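
As a numerical illustration of the statement, the following minimal sketch builds a network of the form F(x) = Σ αi φ(wi x + bi) for the one-dimensional case (m0 = 1), with φ taken to be the logistic sigmoid. It uses a simple staircase construction: each hidden unit is a steep sigmoid that contributes one small jump of the target function. This construction, the helper names (`build_network`, `network`, `max_error`, `target`), the steepness factor, and the test function sin(2πx) are all assumptions made for the example; it is not the argument used in the cited proofs.

```python
import math


def sigmoid(z):
    """Logistic function, evaluated so that large |z| cannot overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_network(f, n, steepness_factor=20.0):
    """Return (alpha, w, b) for F(x) = sum_i alpha[i] * sigmoid(w[i] * x + b[i]).

    Illustrative staircase construction on [0, 1] with n + 1 hidden units.
    """
    s = steepness_factor * n  # steeper sigmoids give sharper steps
    # Unit 0 has zero input weight, so it outputs the constant sigmoid(0) = 0.5;
    # scaling it by 2 * f(0) supplies the starting value f(0).
    alpha, w, b = [2.0 * f(0.0)], [0.0], [0.0]
    for k in range(1, n + 1):
        left, right = (k - 1) / n, k / n
        t = 0.5 * (left + right)         # the step sits at the cell midpoint
        alpha.append(f(right) - f(left)) # jump height of this step
        w.append(s)
        b.append(-s * t)                 # sigmoid(s * (x - t))
    return alpha, w, b


def network(x, alpha, w, b):
    """Evaluate the single-hidden-layer network with a linear output unit."""
    return sum(a * sigmoid(wi * x + bi) for a, wi, bi in zip(alpha, w, b))


def max_error(f, params, samples=2001):
    """Largest |F(x) - f(x)| over an evenly spaced sample of [0, 1]."""
    xs = [j / (samples - 1) for j in range(samples)]
    return max(abs(network(x, *params) - f(x)) for x in xs)


def target(x):
    """Hypothetical continuous test function on [0, 1]."""
    return math.sin(2.0 * math.pi * x)


if __name__ == "__main__":
    for n in (10, 100):
        params = build_network(target, n)
        print(len(params[0]), "hidden units, max error:", max_error(target, params))
```

In this sketch the sampled error shrinks as the number of hidden units m1 grows, which is the behaviour the theorem guarantees to be achievable for any ε > 0. The theorem itself only asserts that suitable constants exist; it does not say how many units are needed or how to find them.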

References

  1. ^ Balázs Csanád Csáji. Approximation with Artificial Neural Networks; Faculty of Sciences; Eötvös Loránd University, Hungary
  2. ^ http://www.google.com/search?q=Cybenko+theorem
  3. ^ G. Cybenko. Approximation by Superpositions of a Sigmoidal Function. Mathematics of Control, Signals, and Systems, vol. 2, no. 4, pp. 303–314, 1989.
  4. ^ Kurt Hornik. Approximation Capabilities of Multilayer Feedforward Networks. Neural Networks, vol. 4, 1991.
  5. ^ Simon Haykin. Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall, 1998. ISBN 0132733501.
  6. ^ M. Hassoun. Fundamentals of Artificial Neural Networks. MIT Press, 1995, p. 48.