This makes <math>w_1</math> the minimizing weight found by gradient descent.
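The update rule above can be sketched in a few lines. The one-dimensional loss and starting point below are illustrative assumptions, not the article's worked example; the point is only that repeated steps against the gradient converge to the minimizing weight.

```python
# Minimal 1-D gradient-descent sketch (illustrative loss, not the article's example).
def gradient_descent(grad, w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)  # step against the gradient
    return w

# E(w) = (w - 3)^2 has its minimum at w = 3, so dE/dw = 2 (w - 3)
w1 = gradient_descent(lambda w: 2.0 * (w - 3.0), w=0.0)
print(round(w1, 4))  # converges toward 3.0
```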
== Learning ==
To implement the algorithm above, an explicit formula is required for the gradient of the function <math>w \mapsto E(f_N(w, x), y)</math>, where <math>E</math> is the squared-error loss <math>E(y,y') = |y-y'|^2</math>.
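As a sanity check on such a formula, the analytic gradient can be compared against a finite-difference approximation. The sketch below assumes a single sigmoid neuron for <math>f_N</math>; the names and network are illustrative, not from the article.

```python
import numpy as np

# Hypothetical single-neuron example: f_N(w, x) = sigmoid(w . x),
# loss E(y, y') = |y - y'|^2. Names are illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f_N(w, x):
    return sigmoid(np.dot(w, x))

def loss(w, x, y):
    return (f_N(w, x) - y) ** 2

def grad_loss(w, x, y):
    # Chain rule: dE/dw = 2 (f - y) * f' * x, with sigmoid' = f (1 - f)
    f = f_N(w, x)
    return 2.0 * (f - y) * f * (1.0 - f) * x

w = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])
y = 1.0

# Central-difference approximation of the gradient, one coordinate at a time
eps = 1e-6
num = np.array([
    (loss(w + eps * np.eye(2)[i], x, y) - loss(w - eps * np.eye(2)[i], x, y)) / (2 * eps)
    for i in range(2)
])
print(np.allclose(grad_loss(w, x, y), num, atol=1e-6))  # True
```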
=== Pseudocode ===
[[Pseudocode]] for a [[stochastic gradient descent]] algorithm for training a three-layer network (one hidden layer):
prediction = neural-net-output(network, ex) ''// forward pass''
actual = teacher-output(ex)
{{nowrap|compute <math>\Delta w_i</math> for all weights from input layer to hidden layer}} ''// backward pass continued''
update network weights ''// input layer not modified by error estimate''
The lines labeled "backward pass" can be implemented using the backpropagation algorithm, which calculates the gradient of the error of the network with respect to the network's modifiable weights.<ref>Werbos, Paul J. (1994). ''The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting''. New York, NY: John Wiley & Sons, Inc.</ref>
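The pseudocode above can be sketched concretely in NumPy. This is a hedged illustration, not the article's implementation: the XOR training set, layer sizes, learning rate, and sigmoid activations are all assumed choices; only the loop structure (forward pass, backward pass, backward pass continued, weight update) mirrors the pseudocode.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, W2):
    h = sigmoid(X @ W1)          # hidden activations
    return h, sigmoid(h @ W2)    # network output

# Toy training set (XOR) -- an assumed example, not from the article
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

# initialize network weights (small random values)
W1 = rng.normal(0.0, 0.5, (2, 4))   # input  -> hidden
W2 = rng.normal(0.0, 0.5, (4, 1))   # hidden -> output

initial_mse = np.mean((forward(X, W1, W2)[1] - Y) ** 2)

lr = 0.5
for epoch in range(5000):
    for x, y in zip(X, Y):
        # forward pass
        h = sigmoid(x @ W1)
        pred = sigmoid(h @ W2)
        # backward pass: deltas for hidden -> output weights
        delta_out = (pred - y) * pred * (1.0 - pred)
        # backward pass continued: deltas for input -> hidden weights
        delta_h = (W2 @ delta_out) * h * (1.0 - h)
        # update network weights (the input layer itself is not modified)
        W2 -= lr * np.outer(h, delta_out)
        W1 -= lr * np.outer(x, delta_h)

final_mse = np.mean((forward(X, W1, W2)[1] - Y) ** 2)
print(final_mse < initial_mse)
```

Stochastic gradient descent updates the weights after every training example, as in the pseudocode, rather than accumulating the gradient over the whole set.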