{{Machine learning}}
[[File:Logistic-curve.svg|thumb|Logistic activation function]]
The '''activation function''' of a node in an [[artificial neural network]] is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is ''nonlinear''.<ref>{{Cite web|url=http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|title=Neural Networks, p. 7|last=Hinkelmann|first=Knut|website=University of Applied Sciences Northwestern Switzerland|access-date=2018-10-06|archive-date=2018-10-06|archive-url=https://web.archive.org/web/20181006235506/http://didattica.cs.unicam.it/lib/exe/fetch.php?media=didattica:magistrale:kebi:ay_1718:ke-11_neural_networks.pdf|url-status=dead}}</ref> Modern activation functions include the smooth version of the [[Rectifier (neural networks)|ReLU]], the GELU, which was used in the 2018 [[BERT (language model)|BERT]] model,<ref name="ReferenceA">{{Cite arXiv |eprint=1606.08415 |title=Gaussian Error Linear Units (GELUs) |last1=Hendrycks |first1=Dan |last2=Gimpel |first2=Kevin |year=2016 |class=cs.LG}}</ref> the logistic ([[Sigmoid function|sigmoid]]) function used in the 2012 speech recognition model developed by Hinton et al.,<ref>{{Cite journal |last1=Hinton |first1=Geoffrey |last2=Deng |first2=Li |last3=Deng |first3=Li |last4=Yu |first4=Dong |last5=Dahl |first5=George |last6=Mohamed |first6=Abdel-rahman |last7=Jaitly |first7=Navdeep |last8=Senior |first8=Andrew |last9=Vanhoucke |first9=Vincent |last10=Nguyen |first10=Patrick |last11=Sainath |first11=Tara|author11-link= Tara Sainath |last12=Kingsbury |first12=Brian |year=2012 |title=Deep Neural Networks for Acoustic Modeling in Speech Recognition |journal=IEEE Signal Processing Magazine |volume=29 |issue=6 |pages=82–97 |doi=10.1109/MSP.2012.2205597|s2cid=206485943 }}</ref> the [[ReLU]] used in the 2012 [[AlexNet]] computer vision model<ref>{{Cite journal |
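The following is a minimal illustrative sketch of three of these activation functions in Python, using NumPy and SciPy (the libraries, function names, and example inputs are chosen for illustration only and are not drawn from any of the cited models):

<syntaxhighlight lang="python">
import numpy as np
from scipy.special import erf

def sigmoid(x):
    """Logistic (sigmoid) function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

def gelu(x):
    """Gaussian error linear unit: x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))
print(relu(x))
print(gelu(x))
</syntaxhighlight>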
==Comparison of activation functions==
|title = [[2008 IEEE International Conference on Acoustics, Speech and Signal Processing]]
|date = 2008
|pages=3265–3268 |doi=10.1109/ICASSP.2008.4518347|isbn=978-1-4244-1483-3 |s2cid=9959057}}</ref> SiL,<ref>{{Cite journal |arxiv = 1702.03118|title = Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning|journal = Neural Networks |volume = 107|pages = 3–11|last1 = Elfwing|first1 = Stefan|last2 = Uchibe|first2 = Eiji|last3 = Doya|first3 = Kenji|year = 2018|pmid = 29395652|doi = 10.1016/j.neunet.2017.12.012|s2cid = 6940861}}</ref> or Swish-{{zwj}}1<ref>{{Cite arXiv |eprint = 1710.05941 |title = Searching for Activation Functions|last1 = Ramachandran|first1 = Prajit|last2 = Zoph|first2 = Barret|last3 = Le|first3 = Quoc V|year = 2017|class = cs.NE}}</ref>)
| [[File:Swish.svg|Swish Activation Function|120px]]
| <math>\frac{x}{1 + e^{-x}}</math>
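A minimal NumPy sketch of this function (the <code>beta</code> parameter is the general Swish-''β'' form, with <code>beta = 1</code> reducing to the SiLU/Swish-1 expression above; the parameter name and sample inputs are illustrative):

<syntaxhighlight lang="python">
import numpy as np

def swish(x, beta=1.0):
    """Swish-beta: x * sigmoid(beta * x); beta = 1 gives the SiLU / Swish-1."""
    return x / (1.0 + np.exp(-beta * x))

print(swish(np.array([-3.0, 0.0, 3.0])))  # approx. [-0.142, 0.0, 2.858]
</syntaxhighlight>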
===Quantum activation functions===
{{Main|Quantum function}}
In [[quantum neural networks]] programmed on gate-model [[quantum computers]] and based on quantum perceptrons rather than variational quantum circuits, the non-linearity of the activation function can be implemented without measuring the output of each [[perceptron]] at each layer. Quantum properties of the circuit, such as superposition, are preserved by evaluating the [[Taylor series]] of the argument computed by the perceptron itself, with suitable quantum circuits computing the powers up to the desired degree of approximation. Because such quantum circuits are flexible, they can be designed to approximate any classical activation function.<ref>{{cite journal|doi=10.1007/s11128-022-03466-0 |issn=1570-0755 |title=Quantum activation functions for quantum neural networks|year=2022|last1=Maronese |first1=Marco|last2=Destri |first2=Claudio|last3= Prati|first3=Enrico |journal= Quantum Information Processing |volume=21|issue=4|page=128 |arxiv=2201.03700|bibcode=2022QuIP...21..128M }}</ref>
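As a purely classical illustration of the truncation step described above, the sketch below builds a degree-7 Taylor (Maclaurin) polynomial of tanh with SymPy and measures its error against the exact function (the choice of SymPy, of tanh, and of the truncation degree are assumptions made for illustration, not details of the cited quantum construction):

<syntaxhighlight lang="python">
import numpy as np
import sympy as sp

# Truncated Taylor series of an activation function: only powers of the input
# are required, which is the classical analogue of the construction above.
x = sp.symbols('x')
degree = 7  # illustrative truncation degree
taylor = sp.series(sp.tanh(x), x, 0, degree + 1).removeO()
poly = sp.lambdify(x, taylor, 'numpy')

xs = np.linspace(-1.0, 1.0, 101)
print(taylor)                                # x - x**3/3 + 2*x**5/15 - 17*x**7/315
print(np.abs(np.tanh(xs) - poly(xs)).max())  # maximum approximation error on [-1, 1]
</syntaxhighlight>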
==See also==