Also, certain non-continuous activation functions can be used to approximate a sigmoid function, so the above theorem applies to those functions as well. For example, the [[step function]] works. In particular, this shows that a [[perceptron]] network with a single arbitrarily wide hidden layer can approximate arbitrary continuous functions.
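For illustration, here is a sketch of one standard such construction on an interval <math>[a,b]</math> (the partition <math>a = x_0 < x_1 < \cdots < x_n = b</math> is assumed, by uniform continuity, to be fine enough that <math>f</math> varies by less than <math>\varepsilon</math> on each subinterval <math>[x_{i-1}, x_i]</math>):

:<math>f(x) \approx f(x_0) + \sum_{i=1}^{n} \bigl( f(x_i) - f(x_{i-1}) \bigr)\, H(x - x_i),</math>

where <math>H</math> is the Heaviside step function. Each summand is a single hidden unit with step activation; on <math>[x_k, x_{k+1})</math> the sum telescopes to <math>f(x_k)</math>, so the uniform error is less than <math>\varepsilon</math>.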
Such an <math>f</math> can also be approximated by a network of greater depth by using the same construction for the first layer and approximating the identity function with later layers.
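One way to see the identity step (a sketch, assuming the activation <math>\sigma</math> is differentiable at <math>0</math> with <math>\sigma'(0) \neq 0</math>): by first-order Taylor expansion,

:<math>\frac{\sigma(\varepsilon x) - \sigma(0)}{\varepsilon\, \sigma'(0)} \to x \quad \text{as } \varepsilon \to 0,</math>

uniformly for <math>x</math> in any compact set, so a later layer can pass its input through with arbitrarily small distortion.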
{{Math proof|title=Proof sketch|proof=It suffices to prove the case where <math>m = 1</math>, since uniform convergence in <math>\R^m</math> is just uniform convergence in each coordinate.
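Indeed, if each coordinate function <math>f_j</math> of <math>f = (f_1, \dots, f_m)</math> is approximated uniformly within <math>\varepsilon</math> by a scalar-valued network <math>F_j</math>, then the stacked network <math>F = (F_1, \dots, F_m)</math> satisfies

:<math>\sup_x \, \max_{1 \le j \le m} \bigl| F_j(x) - f_j(x) \bigr| \le \varepsilon.</math>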