Stochastic approximation: Difference between revisions

Line 78:
C3) <math>\sum_{n=0}^{\infty}\varepsilon_n^2 <\infty </math>
 
C4) <math>|X_n| \leq B, \text{ for a fixed bound }\; B. </math>
 
C5) <math>g(\theta)\; \text{ is strictly convex, i.e.} </math>
 
: <math display="block">\inf_{\delta\leq |\theta - \theta^*|\leq 1/\delta}\langle\theta-\theta^*, \nabla g(\theta)\rangle > 0,\text{ for every }\; 0< \delta < 1.
</math>
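A quick numerical check, sketched in Python (an illustration, not part of the article): the step-size choice <math>\varepsilon_n = 1/n</math> discussed in the text satisfies C3, because <math>\sum_n 1/n^2</math> converges (to <math>\pi^2/6</math>).

```python
import math

# Illustration (not from the article): partial sums of eps_n^2 with eps_n = 1/n
# stay bounded, so C3) holds for this choice of step sizes.
N = 100_000
partial_sum_sq = sum(1.0 / n**2 for n in range(1, N + 1))

print(partial_sum_sq)     # approaches pi^2 / 6 ~ 1.6449
print(math.pi**2 / 6)
```

By contrast, the partial sums of <math>1/n</math> itself grow without bound, which is why such harmonic-type step sizes can still move the iterates arbitrarily far if needed.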
 
Line 92:
</math> is too far away from <math>\theta^* </math>. As for C3), note that if <math>\theta_n </math> converges to <math>\theta^* </math> then
 
<math display="block">\theta_{n+1} - \theta_n = -\varepsilon_n H(\theta_n, X_{n+1}) \rightarrow 0 \text{ as }\; n\rightarrow \infty,</math> so we must have <math>\varepsilon_n \downarrow 0 </math>, and condition C3) ensures it. A natural choice is <math>\varepsilon_n = 1/n </math>. Condition C5) is a fairly stringent condition on the shape of <math>g(\theta)</math>; it ensures that <math>-\nabla g(\theta)</math> points back toward <math>\theta^* </math>, so the algorithm's search direction is well behaved.
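The recursion <math>\theta_{n+1} = \theta_n - \varepsilon_n H(\theta_n, X_{n+1})</math> with <math>\varepsilon_n = 1/n</math> can be sketched in a few lines of Python. The specific choice <math>H(\theta, x) = \theta - x</math> with <math>\operatorname E[X] = 1</math> (so that <math>g(\theta) = \theta - 1</math> and <math>\theta^* = 1</math>) is an illustrative assumption, not taken from the article:

```python
import random

random.seed(0)

# Illustrative sketch of the Robbins-Monro recursion with eps_n = 1/n.
# Assumed toy problem: H(theta, x) = theta - x, X ~ N(1, 2^2), so theta* = 1.
theta = 5.0                      # arbitrary starting point
for n in range(1, 200_001):
    x = random.gauss(1.0, 2.0)   # noisy observation with mean 1
    eps = 1.0 / n                # natural step-size choice; satisfies C3)
    theta -= eps * (theta - x)   # theta_{n+1} = theta_n - eps_n * H(theta_n, x)

print(theta)                     # close to theta* = 1
```

For this particular <math>H</math>, the iterates with <math>\varepsilon_n = 1/n</math> reduce to the running sample mean of the observations, which makes the convergence to <math>\theta^* = 1</math> transparent.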
 
==== Example (where the stochastic gradient method is appropriate)<ref name="jcsbook" /> ====