Mathematics of neural networks in machine learning

This makes <math>w_1</math> the minimizing weight found by gradient descent.
 
== Learning pseudocode ==
To implement the algorithm above, explicit formulas are required for the gradient of the function <math>w \mapsto E(f_N(w, x), y)</math>, where the error function is <math>E(y,y')= |y-y'|^2</math>.
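The squared-error loss and its derivative with respect to the prediction can be checked numerically with a central finite difference. This is a minimal sketch, not part of the article; the scalar case of <math>E</math> and the sample values are illustrative assumptions:

```python
# Scalar squared-error loss E(y, y') = |y - y'|^2, as in the text.
def E(y, y_pred):
    return (y - y_pred) ** 2

# Derivative with respect to the prediction:  d/dy' (y - y')^2 = 2(y' - y)
def dE_dy_pred(y, y_pred):
    return 2.0 * (y_pred - y)

# Central finite-difference check at hypothetical values y = 1.0, y' = 0.3
y, y_pred, h = 1.0, 0.3, 1e-6
numeric = (E(y, y_pred + h) - E(y, y_pred - h)) / (2 * h)
analytic = dE_dy_pred(y, y_pred)
```

For a quadratic loss the central difference agrees with the analytic derivative up to floating-point rounding, which makes this a convenient sanity check before deriving the full gradient in <math>w</math> by the chain rule.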
 
=== Pseudocode ===
[[Pseudocode]] for a [[stochastic gradient descent]] algorithm for training a three-layer network (one hidden layer):
 
 initialize network weights (often small random values)
 '''do'''
     '''for each''' training example named ex '''do'''
         prediction = <u>neural-net-output</u>(network, ex)  ''// forward pass''
         actual = <u>teacher-output</u>(ex)
         compute error (prediction - actual) at the output units
         {{nowrap|compute <math>\Delta w_h</math> for all weights from hidden layer to output layer}}  ''// backward pass''
         {{nowrap|compute <math>\Delta w_i</math> for all weights from input layer to hidden layer}}  ''// backward pass continued''
         update network weights  ''// input layer not modified by error estimate''
 '''until''' error rate becomes acceptably low
 '''return''' the network
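The loop above can be sketched in plain Python. This is a minimal illustration rather than the article's code: the network sizes, the sigmoid activation, the bias handling, and the XOR training set are all assumptions made for the example.

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed sizes: 2 inputs, 3 hidden units, 1 output.
n_in, n_hid, n_out = 2, 3, 1

# initialize network weights (often small random values);
# the last entry of each row acts as a bias weight.
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
w_ho = [[random.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]

def forward(x):
    # forward pass: input -> hidden -> output, appending a constant 1.0 bias input
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_ih]
    hb = hidden + [1.0]
    output = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_ho]
    return hidden, output

# Toy training set (XOR), chosen for the example.
examples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
eta = 0.5  # learning rate

def total_error():
    return sum((forward(x)[1][0] - y[0]) ** 2 for x, y in examples)

err_before = total_error()

for epoch in range(10000):
    for x, y in examples:
        hidden, output = forward(x)                       # forward pass
        # output-layer deltas: (prediction - actual) * sigmoid'
        delta_o = [(o - t) * o * (1 - o) for o, t in zip(output, y)]
        # hidden-layer deltas: backpropagate the output deltas through w_ho
        delta_h = [h * (1 - h) * sum(d * w_ho[k][j] for k, d in enumerate(delta_o))
                   for j, h in enumerate(hidden)]
        # gradient-descent weight updates
        hb = hidden + [1.0]
        xb = x + [1.0]
        for k in range(n_out):
            for j in range(n_hid + 1):
                w_ho[k][j] -= eta * delta_o[k] * hb[j]
        for j in range(n_hid):
            for i in range(n_in + 1):
                w_ih[j][i] -= eta * delta_h[j] * xb[i]
```

Each training example triggers one forward pass and one backward pass, matching the per-example update of stochastic gradient descent; batch gradient descent would instead accumulate the <math>\Delta w</math> terms over all examples before updating.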
 
The lines labeled "backward pass" can be implemented using the backpropagation algorithm, which calculates the gradient of the error of the network regarding the network's modifiable weights.<ref>Werbos, Paul J. (1994). ''The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting''. New York, NY: John Wiley & Sons, Inc.</ref>