Mathematics of neural networks in machine learning
* an ''activation function'' <math>f</math> that computes the new activation at a given time <math>t+1</math> from <math>a_j(t)</math>, <math>\theta_j</math> and the net input <math>p_j(t)</math> giving rise to the relation
 
: <math> a_j(t+1) = f(a_j(t), p_j(t), \theta_j), </math>
 
* and an ''output function'' <math>f_{out}</math> computing the output from the activation
 
: <math> o_j(t) = f_\text{out}(a_j(t)). </math>
 
Often the output function is simply the [[identity function]].
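These definitions can be sketched as a single neuron update in Python. The logistic choice of <math>f</math> (which here ignores the previous activation) and the identity output function are illustrative assumptions; the article's definitions allow any functions of the stated arguments:

```python
import math

def propagate(outputs, weights):
    # Net input: p_j(t) = sum_i o_i(t) * w_ij
    return sum(o * w for o, w in zip(outputs, weights))

def activate(a_prev, p, theta):
    # New activation: a_j(t+1) = f(a_j(t), p_j(t), theta_j).
    # Illustrative choice: a logistic function of (p - theta) that
    # does not depend on the previous activation a_prev.
    return 1.0 / (1.0 + math.exp(-(p - theta)))

def output(a):
    # Output: o_j(t) = f_out(a_j(t)); here the identity function.
    return a

# One update step for a neuron with three incoming connections.
outputs_prev = [0.5, -1.0, 0.25]   # o_i(t) from predecessor neurons
weights = [0.4, 0.1, 0.8]          # w_ij
p = propagate(outputs_prev, weights)
a = activate(0.0, p, theta=0.0)
o = output(a)
```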
The ''propagation function'' computes the ''input'' <math>p_j(t)</math> to the neuron <math>j</math> from the outputs <math>o_i(t)</math> and typically has the form<ref name="Zell1994ch5.22">{{Cite book|url=http://worldcat.org/oclc/249017987|title=Simulation neuronaler Netze|last=Zell|first=Andreas|date=2003|publisher=Addison-Wesley|isbn=978-3-89319-554-1|edition=1st|language=German|trans-title=Simulation of Neural Networks|chapter=chapter 5.2|oclc=249017987}}</ref>
 
: <math> p_j(t) = \sum_{i} o_i(t) w_{ij}. </math>
 
=== Bias ===
A bias term can be added, changing the form to the following:<ref name="DAWSON1998">{{cite journal|last1=DAWSON|first1=CHRISTIAN W|year=1998|title=An artificial neural network approach to rainfall-runoff modelling|journal=Hydrological Sciences Journal|volume=43|issue=1|pages=47–66|doi=10.1080/02626669809492102}}</ref>
 
: <math> p_j(t) = \sum_{i} o_i(t) w_{ij} + w_{0j}, </math> where <math>w_{0j}</math> is a bias.
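A short sketch of the biased propagation function follows; it also checks the standard observation that the bias is equivalent to an extra input fixed at 1 with weight <math>w_{0j}</math> (the function names are illustrative):

```python
def propagate_with_bias(outputs, weights, bias):
    # p_j(t) = sum_i o_i(t) * w_ij + w_0j
    return sum(o * w for o, w in zip(outputs, weights)) + bias

outputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.1

p = propagate_with_bias(outputs, weights, bias)

# Equivalent formulation: append a constant input 1 whose weight is the bias.
p_extra_input = propagate_with_bias(outputs + [1.0], weights + [bias], 0.0)
```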
 
== Neural networks as functions ==
 
* [[Gradient descent|steepest descent]] (with variable [[learning rate]] and [[Gradient descent#The momentum method|momentum]], [[Rprop|resilient backpropagation]]);
* quasi-Newton ([[Broyden–Fletcher–Goldfarb–Shanno algorithm|Broyden–Fletcher–Goldfarb–Shanno]], [[Secant method|one step secant]]);
* [[Levenberg–Marquardt algorithm|Levenberg–Marquardt]] and [[Conjugate gradient method|conjugate gradient]] (Fletcher–Reeves update, Polak–Ribière update, Powell–Beale restart, scaled conjugate gradient).<ref>{{cite conference|author1=M. Forouzanfar|author2=H. R. Dajani|author3=V. Z. Groza|author4=M. Bolic|author5=S. Rajan|last-author-amp=yes|date=July 2010|title=Comparison of Feed-Forward Neural Network Training Algorithms for Oscillometric Blood Pressure Estimation|url=https://www.researchgate.net/publication/224173336|conference=4th Int. Workshop Soft Computing Applications|___location=Arad, Romania|publisher=IEEE}}</ref>
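The first of these families, steepest descent with momentum, can be illustrated on a toy one-dimensional loss; the quadratic <math>L(w) = (w-3)^2</math>, the learning rate, and the momentum coefficient below are illustrative choices, not values prescribed by any particular training algorithm:

```python
# Steepest descent with momentum on L(w) = (w - 3)^2,
# whose minimum is at w = 3.
def gradient(w):
    return 2.0 * (w - 3.0)

w = 0.0          # initial parameter
velocity = 0.0   # accumulated momentum term
lr = 0.1         # learning rate (illustrative)
momentum = 0.9   # momentum coefficient (illustrative)

for _ in range(500):
    # Velocity mixes the previous step with the new negative gradient,
    # smoothing oscillations and accelerating along consistent directions.
    velocity = momentum * velocity - lr * gradient(w)
    w += velocity
```

With momentum set to 0 this reduces to plain steepest descent; the momentum term mainly helps when successive gradients point in a consistent direction.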
 
=== Algorithm ===