Universal approximation theorem

The case where <math>\sigma</math> is a generic non-polynomial function is harder, and the reader is directed to the cited reference.<ref name="pinkus" />}}
 
The above proof does not specify how one might use a ramp function to approximate arbitrary functions in <math>C_0(\R^n, \R)</math>. A sketch of the argument is as follows: one first constructs flat bump functions, intersects them to obtain spherical bump functions that approximate the [[Dirac delta function]], and then uses those to approximate arbitrary functions in <math>C_0(\R^n, \R)</math>.<ref>{{Cite book |last=Nielsen |first=Michael A. |date=2015 |title=Neural Networks and Deep Learning |url=http://neuralnetworksanddeeplearning.com/ |language=en}}</ref> The original proofs, such as the one by Cybenko, use methods from functional analysis, including the [[Hahn–Banach theorem|Hahn–Banach]] and [[Riesz–Markov–Kakutani representation theorem|Riesz–Markov–Kakutani representation]] theorems.
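A minimal one-dimensional sketch of this idea is given below. It is illustrative only and is not the construction used in the cited proofs: each bump function is built from three shifted ramp units, and a weighted sum of such bumps reproduces the piecewise-linear interpolant of the target function on a compact interval, which converges uniformly as the grid is refined. The helper names (<code>ramp</code>, <code>tent</code>, <code>approximate</code>) and the choice of target function are assumptions made for the example.

<syntaxhighlight lang="python">
import numpy as np

def ramp(x):
    """Ramp (ReLU) activation: sigma(x) = max(0, x)."""
    return np.maximum(0.0, x)

def tent(x, center, width):
    """Triangular bump of height 1 built from three ramp units.

    tent(x) = (ramp(x - (c - w)) - 2*ramp(x - c) + ramp(x - (c + w))) / w
    equals 1 at x = c and 0 outside [c - w, c + w].
    """
    return (ramp(x - (center - width))
            - 2.0 * ramp(x - center)
            + ramp(x - (center + width))) / width

def approximate(f, x, grid):
    """One-hidden-layer ramp network: a weighted sum of bumps centred on
    `grid` reproduces the piecewise-linear interpolant of f, which converges
    uniformly to f on the compact interval as the grid is refined."""
    width = grid[1] - grid[0]
    return sum(f(c) * tent(x, c, width) for c in grid)

# Illustrative target: f(x) = sin(x) on the compact set K = [0, 2*pi].
K = (0.0, 2 * np.pi)
x = np.linspace(*K, 1000)
grid = np.linspace(*K, 50)          # 50 bump centres -> 150 hidden ramp units
error = np.max(np.abs(approximate(np.sin, x, grid) - np.sin(x)))
print(f"sup-norm error on K with 50 bumps: {error:.4f}")  # shrinks as the grid is refined
</syntaxhighlight>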
 
Notice also that the neural network is only required to approximate the target function within a compact set <math>K</math>. The proof does not describe how the approximation would behave outside this region.
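Continuing the illustrative sketch above (and reusing its definitions of <code>approximate</code> and <code>grid</code>), evaluating the bump-sum network outside the compact interval shows that the construction carries no information about the target function there:

<syntaxhighlight lang="python">
# Continuation of the sketch above; reuses `approximate` and `grid`.
# Every tent bump vanishes outside K, so the network output is (near) zero
# there while sin keeps oscillating -- the theorem gives no guarantee
# about behaviour off the compact set.
x_out = np.linspace(2 * np.pi, 4 * np.pi, 1000)
error_out = np.max(np.abs(approximate(np.sin, x_out, grid) - np.sin(x_out)))
print(f"sup-norm error outside K: {error_out:.4f}")  # approximately 1.0
</syntaxhighlight>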