The [[Herbert Robbins|Robbins]]–Monro algorithm, introduced in 1951<ref name="rm">A Stochastic Approximation Method, Herbert Robbins and Sutton Monro, ''Annals of Mathematical Statistics'' '''22''', #3 (September 1951), pp. 400–407.</ref>, presented a methodology for solving a root-finding problem in which the function is represented as an expected value. Assume that we have a function <math>M(x)</math> and a constant <math>\alpha</math> such that the equation <math>M(x) = \alpha</math> has a unique root at <math>x = x_0</math>. It is assumed that while we cannot directly observe the function <math>M(x)</math>, we can instead obtain measurements of a random variable <math>N(x)</math> such that <math>\operatorname E[N(x)] = M(x)</math>. The algorithm then constructs a sequence <math>x_1, x_2, \dots</math> of iterates of the form
::<math>x_{n+1}=x_n+a_n(\alpha-N(x_n))</math>.
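The following is a minimal Python sketch of this iteration. The function <code>noisy_measurement</code>, which stands in for drawing a sample of <math>N(x)</math>, the step-size schedule <math>a_n = 1/n</math>, and the concrete example at the end are illustrative assumptions made for this sketch, not part of the original formulation:

<syntaxhighlight lang="python">
import random

def robbins_monro(noisy_measurement, alpha, x_init, n_iter=10_000):
    """Run the iteration x_{n+1} = x_n + a_n * (alpha - N(x_n)).

    noisy_measurement(x) returns one sample of N(x), assumed to
    satisfy E[N(x)] = M(x). The step sizes a_n = 1/n are one standard
    choice satisfying the convergence conditions stated below.
    """
    x = x_init
    for n in range(1, n_iter + 1):
        a_n = 1.0 / n  # positive step sizes, decreasing to zero
        x = x + a_n * (alpha - noisy_measurement(x))
    return x

# Illustrative example (an assumption, not from the source):
# M(x) = 2x observed with additive Gaussian noise, alpha = 1,
# so the unique root is x_0 = 0.5.
estimate = robbins_monro(lambda x: 2 * x + random.gauss(0, 1),
                         alpha=1.0, x_init=0.0)
</syntaxhighlight>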
In the update rule above, <math>a_1, a_2, \dots</math> is a sequence of positive step sizes. Robbins and Monro proved<ref name="rm" /><sup>, Theorem 2</sup> that <math>x_n</math> [[convergence of random variables|converges]] in <math>L^2</math> (and hence also in probability) to <math>x_0</math> provided that: