Universal approximation theorem: Difference between revisions

OAbot (talk | contribs)
m Open access bot: arxiv updated in citation with #oabot.
Line 111:
Remark: If the activation is replaced by leaky-ReLU, and the input is restricted to a compact domain, then the exact minimum width is<ref name=":1" /> <math>d_m = \max\{n, m, 2\}</math>.
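
For illustration, the formula can be evaluated for two cases (the dimensions below are chosen here only as examples and are not taken from the cited source), with <math>n</math> the input dimension and <math>m</math> the output dimension:
:<math>n = 3,\ m = 1:\quad d_m = \max\{3, 1, 2\} = 3, \qquad n = m = 1:\quad d_m = \max\{1, 1, 2\} = 2.</math>
The constant 2 strictly exceeds both dimensions only when <math>n = m = 1</math>; in every other case the minimum width equals the larger of the input and output dimensions.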
 
''Quantitative refinement:'' In the case where <math>f:[0, 1]^n \rightarrow \mathbb{R}</math> (i.e. <math>m = 1</math>) and <math>\sigma</math> is the [[Rectifier (neural networks)|ReLU activation function]], the exact depth and width required for a ReLU network to achieve an approximation error of <math>\varepsilon</math> are also known.<ref>{{cite journal |last1=Shen |first1=Zuowei |last2=Yang |first2=Haizhao |last3=Zhang |first3=Shijun |title=Optimal approximation rate of ReLU networks in terms of width and depth |journal=Journal de Mathématiques Pures et Appliquées |date=January 2022 |volume=157 |pages=101–135 |doi=10.1016/j.matpur.2021.07.009 |arxiv=2103.00502 |s2cid = 232075797 }}</ref> If, moreover, the target function <math>f</math> is smooth, then the required number of layers and their width can be exponentially smaller.<ref>{{cite journal |last1=Lu |first1=Jianfeng |last2=Shen |first2=Zuowei |last3=Yang |first3=Haizhao |last4=Zhang |first4=Shijun |title=Deep Network Approximation for Smooth Functions |journal = SIAM Journal on Mathematical Analysis |date=January 2021 |volume=53 |issue=5 |pages=5465–5506 |doi=10.1137/20M134695X |arxiv=2001.03040 |s2cid=210116459 }}</ref> Even if <math>f</math> is not smooth, the curse of dimensionality can be broken if <math>f</math> admits additional "compositional structure".<ref>{{Cite journal |last1=Juditsky |first1=Anatoli B. |last2=Lepski |first2=Oleg V. |last3=Tsybakov |first3=Alexandre B.
|date=2009-06-01 |title=Nonparametric estimation of composite functions |journal=The Annals of Statistics |volume=37 |issue=3 |doi=10.1214/08-aos611 |s2cid=2471890 |issn=0090-5364|doi-access=free |arxiv=0906.0865 }}</ref><ref>{{Cite journal |last1=Poggio |first1=Tomaso |last2=Mhaskar |first2=Hrushikesh |last3=Rosasco |first3=Lorenzo |last4=Miranda |first4=Brando |last5=Liao |first5=Qianli |date=2017-03-14 |title=Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review |journal=International Journal of Automation and Computing |volume=14 |issue=5 |pages=503–519 |doi=10.1007/s11633-017-1054-2 |s2cid=15562587 |issn=1476-8186|doi-access=free |arxiv=1611.00740 }}</ref>
}}