Universal approximation theorem: Difference between revisions

 
=== Quantitative bounds ===
The question of the minimal possible width for universality was first studied in 2021, when Park et al. obtained the minimum width required for the universal approximation of ''[[Lp space|L<sup>p</sup>]]'' functions using feed-forward neural networks with [[Rectifier (neural networks)|ReLU]] as activation functions.<ref name="park">{{Cite conference |last1=Park |first1=Sejun |last2=Yun |first2=Chulhee |last3=Lee |first3=Jaeho |last4=Shin |first4=Jinwoo |date=2021 |title=Minimum Width for Universal Approximation |conference=International Conference on Learning Representations |arxiv=2006.08859}}</ref> Similar results that can be directly applied to [[residual neural network]]s were also obtained in the same year by Paulo Tabuada and Bahman Gharesifard using [[Control theory|control-theoretic]] arguments.<ref>{{Cite conference |last1=Tabuada |first1=Paulo |last2=Gharesifard |first2=Bahman |date=2021 |title=Universal approximation power of deep residual neural networks via nonlinear control theory |conference=International Conference on Learning Representations |arxiv=2007.06007}}</ref><ref>{{cite journal |last1=Tabuada |first1=Paulo |last2=Gharesifard |first2=Bahman |date=May 2023 |title=Universal Approximation Power of Deep Residual Neural Networks Through the Lens of Control |journal=IEEE Transactions on Automatic Control |volume=68 |issue=5 |pages=2715–2728 |doi=10.1109/TAC.2022.3190051 |s2cid=250512115}}{{Erratum|doi=10.1109/TAC.2024.3390099|checked=yes}}</ref> In 2023, Cai obtained the optimal minimum width bound for universal approximation.<ref name=":1">{{Cite journal |last=Cai |first=Yongqiang |date=2023-02-01 |title=Achieve the Minimum Width of Neural Networks for Universal Approximation |url=https://openreview.net/forum?id=hfUJ4ShyDEU |journal=ICLR |language=en |arxiv=2209.11395}}</ref>
 
For the arbitrary depth case, Léonie Papon and Anastasis Kratsios derived explicit depth estimates depending on the regularity of the target function and of the activation function.<ref name="jmlr.org">{{Cite journal |last1=Kratsios |first1=Anastasis |last2=Papon |first2=Léonie |date=2022 |title=Universal Approximation Theorems for Differentiable Geometric Deep Learning |url=http://jmlr.org/papers/v23/21-0716.html |journal=Journal of Machine Learning Research |volume=23 |issue=196 |pages=1–73 |arxiv=2101.05390}}</ref>
 
=== Kolmogorov network ===