Universal approximation theorem
The case where <math>\sigma</math> is a generic non-polynomial function is harder, and the reader is directed to the reference for a proof.<ref name="pinkus" />}}
 
The above proof has not specified how one might use a ramp function to approximate arbitrary functions in <math>C_0(\R^n, \R)</math>. A sketch of the proof is that one can first construct flat bump functions, intersect them to obtain spherical bump functions that approximate the [[Dirac delta function]], then use those to approximate arbitrary functions in <math>C_0(\R^n, \R)</math>.<ref>{{Cite journal |last=Nielsen |first=Michael A. |date=2015 |title=Neural Networks and Deep Learning |url=http://neuralnetworksanddeeplearning.com/ |language=en}}</ref> The original proofs, such as the one by Cybenko, use methods from functional analysis, including the [[Hahn–Banach theorem|Hahn–Banach]] and [[Riesz–Markov–Kakutani representation theorem|Riesz–Markov–Kakutani representation]] theorems.
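The first step of the sketch, building a flat bump function from ramp functions, can be illustrated numerically. The following is a minimal one-dimensional sketch (not taken from the cited sources); the names <code>ramp</code>, <code>flat_bump</code>, and the parameters <code>a</code>, <code>b</code>, <code>delta</code> are illustrative choices. The difference of two shifted and rescaled ramp units yields a function that is approximately 1 on an interval and 0 outside it:

```python
import numpy as np

def ramp(x):
    # Ramp activation: 0 for x <= 0, x on [0, 1], 1 for x >= 1.
    return np.clip(x, 0.0, 1.0)

def flat_bump(x, a, b, delta):
    # Difference of two shifted/rescaled ramps:
    # equals 1 on [a + delta, b] and 0 outside [a, b + delta],
    # with linear transitions of width delta at each edge.
    return ramp((x - a) / delta) - ramp((x - b) / delta)

# A flat bump supported near [-1, 1] with sharp (width-0.1) edges.
xs = np.linspace(-2.0, 2.0, 401)
ys = flat_bump(xs, a=-1.0, b=1.0, delta=0.1)
```

Shrinking <code>delta</code> sharpens the edges, and rescaled bumps of shrinking width play the role of approximations to the Dirac delta; in higher dimensions the sketch intersects such bumps along several directions to localize around a point.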
 
Notice also that the neural network is only required to approximate within a compact set <math>K</math>. The proof gives no guarantee about how the network behaves outside of that region.