Activation function

The '''activation function''' of a node in an [[artificial neural network]] is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is ''nonlinear''.<ref>{{Cite web|url=http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|title=Neural Networks, p. 7|last=Hinkelmann|first=Knut|website=University of Applied Sciences Northwestern Switzerland|access-date=2018-10-06|archive-date=2018-10-06|archive-url=https://web.archive.org/web/20181006235506/http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|url-status=dead}}</ref>
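
The following is a minimal illustrative sketch (not taken from the cited source) of how a single node computes its output: the inputs are combined as a weighted sum (commonly with a bias term), and the activation function is applied to that sum. The function and variable names, the choice of ReLU as the nonlinearity, and the example values are assumptions made purely for illustration.

<syntaxhighlight lang="python">
def relu(x):
    """Rectified linear unit, a common nonlinear activation."""
    return max(0.0, x)

def node_output(inputs, weights, bias, activation=relu):
    """Output of one node: the activation applied to the weighted sum of its inputs."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(pre_activation)

# Example: a node with two inputs
print(node_output([1.0, 2.0], [0.5, 0.25], bias=-0.75))  # relu(0.5*1.0 + 0.25*2.0 - 0.75) = 0.25
</syntaxhighlight>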
 
Modern activation functions include the logistic ([[Sigmoid function|sigmoid]]) function used in the 2012 [[speech recognition]] model developed by [[Geoffrey Hinton|Hinton]] et al.;<ref>{{Cite journal |last1=Hinton |first1=Geoffrey |last2=Deng |first2=Li |last3=Yu |first3=Dong |last4=Dahl |first4=George |last5=Mohamed |first5=Abdel-rahman |last6=Jaitly |first6=Navdeep |last7=Senior |first7=Andrew |last8=Vanhoucke |first8=Vincent |last9=Nguyen |first9=Patrick |last10=Sainath |first10=Tara |author10-link=Tara Sainath |last11=Kingsbury |first11=Brian |year=2012 |title=Deep Neural Networks for Acoustic Modeling in Speech Recognition |journal=IEEE Signal Processing Magazine |volume=29 |issue=6 |pages=82–97 |doi=10.1109/MSP.2012.2205597 |s2cid=206485943}}</ref> the [[ReLU]] used in the 2012 [[AlexNet]] computer vision model<ref>{{Cite journal |last1=Krizhevsky |first1=Alex |last2=Sutskever |first2=Ilya |last3=Hinton |first3=Geoffrey E. |date=2017-05-24 |title=ImageNet classification with deep convolutional neural networks |url=https://dl.acm.org/doi/10.1145/3065386 |journal=Communications of the ACM |language=en |volume=60 |issue=6 |pages=84–90 |doi=10.1145/3065386 |issn=0001-0782}}</ref><ref>{{Cite journal |last1=Al-johania |first1=Norah |last2=Elrefaei |first2=Lamiaa |date=2019-06-30 |title=Dorsal Hand Vein Recognition by Convolutional Neural Networks: Feature Learning and Transfer Learning Approaches |url=http://www.inass.org/2019/2019063019.pdf |journal=International Journal of Intelligent Engineering and Systems |volume=12 |issue=3 |pages=178–191 |doi=10.22266/ijies2019.0630.19}}</ref> and in the 2015 [[Residual neural network|ResNet]] model; and the smooth version of the ReLU, the [[ReLU#Gaussian-error linear unit (GELU)|GELU]], which was used in the 2018 [[BERT (language model)|BERT]] model.<ref name="ReferenceA">{{Cite arXiv |eprint=1606.08415 |title=Gaussian Error Linear Units (GELUs) |last1=Hendrycks |first1=Dan |last2=Gimpel |first2=Kevin |year=2016 |class=cs.LG}}</ref>
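
For reference, the functions named above are commonly defined as follows, where <math>\Phi</math> denotes the cumulative distribution function of the standard normal distribution:

<math>\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \operatorname{ReLU}(x) = \max(0, x), \qquad \operatorname{GELU}(x) = x\,\Phi(x).</math>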
 
==Comparison of activation functions==