{{Machine learning}}
[[File:Logistic-curve.svg|thumb|Logistic activation function]]
The '''activation function''' of a node in an [[artificial neural network]] is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is ''nonlinear''.<ref>{{Cite web|url=http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|title=Neural Networks, p. 7|last=Hinkelmann|first=Knut|website=University of Applied Sciences Northwestern Switzerland|access-date=2018-10-06|archive-date=2018-10-06|archive-url=https://web.archive.org/web/20181006235506/http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|url-status=dead}}</ref> Modern activation functions include the smooth version of the [[Rectifier (neural networks)|ReLU]], the GELU, which was used in the 2018 [[BERT (language model)|BERT]] model,<ref name="ReferenceA">{{Cite arXiv |eprint=1606.08415 |title=Gaussian Error Linear Units (GELUs) |last1=Hendrycks |first1=Dan |last2=Gimpel |first2=Kevin |year=2016 |class=cs.LG}}</ref> the logistic ([[Sigmoid function|sigmoid]]) function used in the 2012 speech recognition model developed by Hinton et al.,<ref>{{Cite journal |last1=Hinton |first1=Geoffrey |last2=Deng |first2=Li |last3=Deng |first3=Li |last4=Yu |first4=Dong |last5=Dahl |first5=George |last6=Mohamed |first6=Abdel-rahman |last7=Jaitly |first7=Navdeep |last8=Senior |first8=Andrew |last9=Vanhoucke |first9=Vincent |last10=Nguyen |first10=Patrick |last11=Sainath |first11=Tara|author11-link= Tara Sainath |last12=Kingsbury |first12=Brian |year=2012 |title=Deep Neural Networks for Acoustic Modeling in Speech Recognition |journal=IEEE Signal Processing Magazine |volume=29 |issue=6 |pages=82–97 |doi=10.1109/MSP.2012.2205597|s2cid=206485943 }}</ref> the [[ReLU]] used in the 2012 [[AlexNet]] computer vision model<ref>{{Cite journal |
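The following is a minimal illustrative sketch of three of these activation functions in Python, using NumPy and SciPy (the libraries, function names, and example inputs are chosen for illustration only and are not drawn from any of the cited models):

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erf

def sigmoid(x):
    """Logistic (sigmoid) function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

def gelu(x):
    """Gaussian error linear unit: x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))
print(relu(x))
print(gelu(x))
</syntaxhighlight>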
==Comparison of activation functions==
|title = [[2008 IEEE International Conference on Acoustics, Speech and Signal Processing]]
|date = 2008
|pages=3265–3268 |doi=10.1109/ICASSP.2008.4518347|isbn=978-1-4244-1483-3 |s2cid=9959057}}</ref> SiL,<ref>{{Cite journal |arxiv = 1702.03118|title = Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning|journal = Neural Networks |volume = 107|pages = 3–11|last1 = Elfwing|first1 = Stefan|last2 = Uchibe|first2 = Eiji|last3 = Doya|first3 = Kenji|year = 2018|pmid = 29395652|doi = 10.1016/j.neunet.2017.12.012|s2cid = 6940861}}</ref> or Swish-{{zwj}}1<ref>{{Cite arXiv |eprint = 1710.05941 |title = Searching for Activation Functions|last1 = Ramachandran|first1 = Prajit|last2 = Zoph|first2 = Barret|last3 = Le|first3 = Quoc V|year = 2017|class = cs.NE}}</ref>)
| [[File:Swish.svg|Swish Activation Function|120px]]
| <math>\frac{x}{1 + e^{-x}}</math>
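A minimal NumPy sketch of this function (the <code>beta</code> parameter is the general Swish-''β'' form, with <code>beta = 1</code> reducing to the SiLU/Swish-1 expression above; the parameter name and sample inputs are illustrative):

<syntaxhighlight lang="python">
import numpy as np

def swish(x, beta=1.0):
    """Swish-beta: x * sigmoid(beta * x); beta = 1 gives the SiLU / Swish-1."""
    return x / (1.0 + np.exp(-beta * x))

print(swish(np.array([-3.0, 0.0, 3.0])))  # approx. [-0.142, 0.0, 2.858]
</syntaxhighlight>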
===Quantum activation functions===
{{Main|Quantum function}}
In [[quantum neural networks]] programmed on gate-model [[quantum computers]] and based on quantum perceptrons rather than variational quantum circuits, the non-linearity of the activation function can be implemented without measuring the output of each [[perceptron]] at each layer. Quantum properties of the circuit, such as superposition, are preserved by evaluating the [[Taylor series]] of the argument computed by the perceptron itself, with suitable quantum circuits computing the powers up to the desired degree of approximation. Because such quantum circuits are flexible, they can be designed to approximate any classical activation function.<ref>{{cite journal|doi=10.1007/s11128-022-03466-0 |issn=1570-0755 |title=Quantum activation functions for quantum neural networks|year=2022|last1=Maronese |first1=Marco|last2=Destri |first2=Claudio|last3= Prati|first3=Enrico |journal= Quantum Information Processing |volume=21|issue=4|page=128 |arxiv=2201.03700|bibcode=2022QuIP...21..128M }}</ref>
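As a purely classical illustration of the truncation step described above, the sketch below builds a degree-7 Taylor (Maclaurin) polynomial of tanh with SymPy and measures its error against the exact function (the choice of SymPy, of tanh, and of the truncation degree are assumptions made for illustration, not details of the cited quantum construction):

<syntaxhighlight lang="python">
import numpy as np
import sympy as sp

# Truncated Taylor series of an activation function: only powers of the input
# are required, which is the classical analogue of the construction above.
x = sp.symbols('x')
degree = 7  # illustrative truncation degree
taylor = sp.series(sp.tanh(x), x, 0, degree + 1).removeO()
poly = sp.lambdify(x, taylor, 'numpy')

xs = np.linspace(-1.0, 1.0, 101)
print(taylor)                                # x - x**3/3 + 2*x**5/15 - 17*x**7/315
print(np.abs(np.tanh(xs) - poly(xs)).max())  # maximum approximation error on [-1, 1]
</syntaxhighlight>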
==See also==