Universal approximation theorem

Also, certain non-continuous activation functions can be used to approximate a sigmoid function, which then allows the above theorem to apply to those functions. For example, the [[step function]] works. In particular, this shows that a [[perceptron]] network with a single infinitely wide hidden layer can approximate arbitrary continuous functions.
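
A minimal NumPy sketch of this staircase construction (an illustration, not part of the article's statement; the names <code>step_network</code> and <code>n_neurons</code> are hypothetical, and the Heaviside step is assumed as the activation). Each hidden neuron switches on past one breakpoint, and the output weights are the successive jumps of the target, so the network matches <math>f</math> at every breakpoint:

<syntaxhighlight lang="python">
import numpy as np

def step(z):
    """Heaviside step activation: 1 where z >= 0, else 0."""
    return (z >= 0).astype(float)

def step_network(f, a, b, n_neurons):
    """One hidden layer of step neurons approximating f on [a, b]
    by a piecewise-constant (staircase) function:
    net(x) = bias + sum_k c_k * step(x - t_k)."""
    t = np.linspace(a, b, n_neurons + 1)   # breakpoints t_0, ..., t_n
    vals = f(t)
    c = np.diff(vals)                      # output weights: jumps f(t_k) - f(t_{k-1})
    bias = vals[0]                         # output bias: value at the left endpoint
    def net(x):
        x = np.asarray(x, dtype=float)
        h = step(x[..., None] - t[1:])     # hidden layer, one neuron per breakpoint
        return bias + h @ c
    return net

# usage: approximate sin on [0, 2*pi]; the error shrinks as the layer widens
net = step_network(np.sin, 0.0, 2 * np.pi, n_neurons=200)
xs = np.linspace(0.0, 2 * np.pi, 1000)
print("max |f - net| =", np.max(np.abs(np.sin(xs) - net(xs))))
</syntaxhighlight>

With <math>n</math> equally spaced breakpoints the uniform error is bounded by the modulus of continuity of <math>f</math> at step size <math>(b-a)/n</math>, so it vanishes as the hidden layer grows without bound.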
 
Such an <math>f</math> can also be approximated by a network of greater depth by using the same construction for the first layer and approximating the identity function with later layers.
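
One way to make the identity-approximation step concrete (an illustration under an added assumption, not part of the article's statement): if the activation <math>\sigma</math> is differentiable at <math>0</math> with <math>\sigma'(0) \neq 0</math>, as for the logistic sigmoid, a later layer can scale its input into the near-linear regime of <math>\sigma</math> and rescale the output,

:<math>x \approx \frac{\sigma(\varepsilon x) - \sigma(0)}{\varepsilon\,\sigma'(0)} \quad \text{for small } \varepsilon > 0,</math>

with convergence uniform on compact sets, so stacking such layers perturbs the approximation only slightly.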
 
{{Math proof|title=Proof sketch|proof=It suffices to prove the case where <math>m = 1</math>, since uniform convergence in <math>\R^m</math> is just uniform convergence in each coordinate.