Universal approximation theorem

Also, certain non-continuous activation functions can be used to approximate a sigmoid function, which then allows the above theorem to apply to those functions. For example, the [[step function]] works. In particular, this shows that a [[perceptron]] network with a single infinitely wide hidden layer can approximate arbitrary continuous functions.
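
A minimal NumPy sketch of this staircase construction (an illustration, not part of the article's statement; the names <code>step_network</code> and <code>n_neurons</code> are hypothetical, and the Heaviside step is assumed as the activation). Each hidden neuron switches on past one breakpoint, and the output weights are the successive jumps of the target, so the network matches <math>f</math> at every breakpoint:

<syntaxhighlight lang="python">
import numpy as np

def step(z):
    """Heaviside step activation: 1 where z >= 0, else 0."""
    return (z >= 0).astype(float)

def step_network(f, a, b, n_neurons):
    """One hidden layer of step neurons approximating f on [a, b]
    by a piecewise-constant (staircase) function:
    net(x) = bias + sum_k c_k * step(x - t_k)."""
    t = np.linspace(a, b, n_neurons + 1)   # breakpoints t_0, ..., t_n
    vals = f(t)
    c = np.diff(vals)                      # output weights: jumps f(t_k) - f(t_{k-1})
    bias = vals[0]                         # output bias: value at the left endpoint
    def net(x):
        x = np.asarray(x, dtype=float)
        h = step(x[..., None] - t[1:])     # hidden layer, one neuron per breakpoint
        return bias + h @ c
    return net

# usage: approximate sin on [0, 2*pi]; the error shrinks as the layer widens
net = step_network(np.sin, 0.0, 2 * np.pi, n_neurons=200)
xs = np.linspace(0.0, 2 * np.pi, 1000)
print("max |f - net| =", np.max(np.abs(np.sin(xs) - net(xs))))
</syntaxhighlight>

With <math>n</math> equally spaced breakpoints the uniform error is bounded by the modulus of continuity of <math>f</math> at step size <math>(b-a)/n</math>, so it vanishes as the hidden layer grows without bound.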
 
Such an <math>f</math> can also be approximated by a network of greater depth by using the same construction for the first layer and approximating the identity function with later layers.
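
One way to make the identity-approximation step concrete (an illustration under an added assumption, not part of the article's statement): if the activation <math>\sigma</math> is differentiable at <math>0</math> with <math>\sigma'(0) \neq 0</math>, as for the logistic sigmoid, a later layer can scale its input into the near-linear regime of <math>\sigma</math> and rescale the output,

:<math>x \approx \frac{\sigma(\varepsilon x) - \sigma(0)}{\varepsilon\,\sigma'(0)} \quad \text{for small } \varepsilon > 0,</math>

with convergence uniform on compact sets, so stacking such layers perturbs the approximation only slightly.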
 
{{Math proof|title=Proof sketch|proof=It suffices to prove the case where <math>m = 1</math>, since uniform convergence in <math>\R^m</math> is just uniform convergence in each coordinate.