Therefore, the sequence <math display="inline">a_n = n^{-\alpha}</math> with <math display="inline">0 < \alpha < 1</math> satisfies this restriction, but <math display="inline">\alpha = 1</math> does not, hence the longer steps. Under the assumptions outlined in the Robbins–Monro algorithm, this modification attains the same asymptotically optimal convergence rate <math display="inline">O(1/\sqrt{n})</math> with a more robust step size policy.<ref name="pj" /> Prior to this, the idea of using longer steps and averaging the iterates had already been proposed by Nemirovski and Yudin<ref name="NY">A. Nemirovski and D. Yudin, On Cesari's convergence of the steepest descent method for approximating saddle points of convex–concave functions, ''Dokl. Akad. Nauk SSSR'' '''239''' (1978, in Russian); ''Soviet Math. Dokl.'' '''19''' (1978, in English).</ref> for solving the stochastic optimization problem with continuous convex objectives and for convex–concave saddle point problems. These algorithms were observed to attain the nonasymptotic rate <math display="inline">O(1/\sqrt{n})</math>.
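As an illustrative sketch (not part of the cited sources), the scheme of longer steps <math display="inline">a_n = n^{-\alpha}</math> combined with averaging of the iterates can be written in Python for a one-dimensional problem. The regression function <math display="inline">M(\theta)=\theta-\theta^*</math>, the noise model, and all parameter values below are assumptions chosen for the example:

```python
import random

def robbins_monro_averaged(theta0, theta_star, alpha=0.7, n_steps=20000,
                           noise=1.0, seed=0):
    """Robbins-Monro iteration theta_{n+1} = theta_n - a_n * Y_n with the
    'longer' step sizes a_n = n^{-alpha}, 0 < alpha < 1, returning both the
    last iterate and the running average of the iterates (Polyak-Ruppert
    style averaging).  The observation model Y_n = (theta - theta*) + noise
    is an assumption made for this illustration."""
    rng = random.Random(seed)
    theta = theta0
    avg = 0.0
    for n in range(1, n_steps + 1):
        a_n = n ** (-alpha)                                  # slowly decaying step
        y_n = (theta - theta_star) + rng.gauss(0.0, noise)   # noisy observation of M(theta)
        theta -= a_n * y_n
        avg += (theta - avg) / n                             # incremental mean of iterates
    return theta, avg
```

In this sketch the averaged estimate is typically much closer to <math display="inline">\theta^*</math> than the last iterate, reflecting the <math display="inline">O(1/\sqrt{n})</math> behavior of the averaged sequence.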
A more general result is given in Chapter 11 of Kushner and Yin.<ref>{{Cite book|url=https://www.springer.com/us/book/9780387008943|title=Stochastic Approximation and Recursive Algorithms and Applications|last1=Kushner|first1=Harold J.|last2=Yin|first2=G. George|publisher=Springer|year=2003|isbn=9780387008943}}</ref> Define the interpolated time <math display="inline">t_n=\sum_{i=0}^{n-1}a_i</math>, the interpolated process <math display="inline">\theta^n(\cdot)</math> and the interpolated normalized process <math display="inline">U^n(\cdot)</math> as
<math display="block">\theta^n(t)=\theta_{n+i},\quad U^n(t)=(\theta_{n+i}-\theta^*)/\sqrt{a_{n+i}}\quad\mbox{for}\quad t\in[t_{n+i}-t_n,t_{n+i+1}-t_n),\; i\ge0.</math>

Let the iterate average be <math>\Theta_n=\frac{a_n}{t}\sum_{i=n}^{n+t/a_n-1}\theta_i</math> and the associated normalized error be <math>\hat{U}^n(t)=\frac{\sqrt{a_n}}{t}\sum_{i=n}^{n+t/a_n-1}(\theta_i-\theta^*)</math>.
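The iterate average <math display="inline">\Theta_n</math> is a windowed mean over the <math display="inline">t/a_n</math> iterates starting at index <math display="inline">n</math>. A minimal Python sketch, assuming for simplicity that the step size stays near <math display="inline">a_n</math> over the window (the function name is hypothetical):

```python
def window_average(thetas, a, n, t):
    """Iterate average Theta_n = (a_n / t) * sum_{i=n}^{n+t/a_n-1} theta_i:
    a mean over a window of interpolated-time length t, i.e. t/a_n iterates.
    thetas and a are sequences of iterates and step sizes; n is the window
    start index; t is the window length in interpolated time."""
    m = int(t / a[n])                      # number of iterates in the window
    return (a[n] / t) * sum(thetas[n:n + m])
```

For a constant sequence of iterates the windowed average simply recovers that constant, since the prefactor <math display="inline">a_n/t</math> is the reciprocal of the window size.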