Content deleted Content added
Magicheader (talk | contribs) |
Magicheader (talk | contribs) |
||
Line 137:
In such settings, ISGD is simply implemented as follows. Let <math>f(\xi) = \eta q(x_i'w^{old} + \xi ||x_i||^2)</math>, where <math>\xi</math> is scalar.
Then, ISGD is equivalent to:
:<math>w^{new} = w^{old} + \xi^\ast x_i,~\
The scaling factor <math>\xi^\ast\in\mathbb{R}</math> can be found through the [[bisection method]] since
in most regular models, such as the aforementioned generalized linear models, function <math>q()</math> is decreasing,
and thus the search bounds for <math>\xi^\ast</math> are
|