Stochastic gradient descent

'''Stochastic gradient descent''' (often abbreviated '''SGD''') is an [[iterative method]] for [[Mathematical optimization|optimizing]] an [[objective function]] with suitable [[smoothness]] properties (e.g. [[Differentiable function|differentiable]] or [[Subgradient method|subdifferentiable]]). It can be regarded as a [[stochastic approximation]] of [[gradient descent]] optimization, since it replaces the actual gradient (calculated from the entire [[data set]]) with an estimate thereof (calculated from a randomly selected subset of the data). Especially in [[big data]] applications, this reduces the [[Computational complexity|computational burden]], achieving faster iterations in exchange for a lower convergence rate.<ref>{{cite book |first=Léon |last=Bottou |authorlink=Léon Bottou |first2=Olivier |last2=Bousquet |chapter=The Tradeoffs of Large Scale Learning |title=Optimization for Machine Learning |editor-first=Suvrit |editor-last=Sra |editor2-first=Sebastian |editor2-last=Nowozin |editor3-first=Stephen J. |editor3-last=Wright |___location=Cambridge |publisher=MIT Press |year=2012 |isbn=978-0-262-01646-9 |pages=351–368 |chapter-url=https://books.google.com/books?id=JPQx7s2L1A8C&pg=PA351 }}</ref>
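The idea above can be sketched in a few lines: instead of computing the gradient over all samples at each step, the update uses a gradient estimated from a small random subset. This is a minimal illustration for a least-squares objective, not an implementation from the cited sources; the function and parameter names (`sgd`, `lr`, `batch_size`) are chosen here for clarity.

```python
import numpy as np

def sgd(X, y, lr=0.1, batch_size=8, n_steps=500, seed=0):
    """Minimize f(w) = (1/2n) * ||Xw - y||^2 by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        # Gradient estimated from a random mini-batch instead of all n samples.
        idx = rng.choice(n, size=batch_size, replace=False)
        residual = X[idx] @ w - y[idx]
        grad = X[idx].T @ residual / batch_size
        w -= lr * grad
    return w

# Synthetic noiseless data with true weights [2.0, -1.0].
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0])
w = sgd(X, y)
```

Each iteration touches only `batch_size` rows of `X`, so its cost is independent of the data-set size `n`; the price is that each step follows a noisy estimate of the true gradient rather than the gradient itself.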
 
While the basic idea behind stochastic approximation can be traced back to the [[Robbins–Monro algorithm]] of the 1950s, stochastic gradient descent has become an important optimization method in [[machine learning]].<ref>{{Cite book
|last=Bottou
|first=Léon
|authorlink=Léon Bottou
|contribution=Online Algorithms and Stochastic Approximations
|year=1998
|title=Online Learning and Neural Networks
|publisher=Cambridge University Press
|url=https://archive.org/details/onlinelearningin0000unse
|isbn=978-0-521-65263-6
|url-access=registration
}}</ref><ref>{{cite journal
|last=Kiwiel
|first=Krzysztof C.
|title=Convergence and efficiency of subgradient methods for quasiconvex minimization
|journal=Mathematical Programming, Series A
|publisher=Springer|___location=Berlin, Heidelberg
|issn=0025-5610|pages=1–25|volume=90|issue=1
|doi=10.1007/PL00011414|year=2001 |mr=1819784}}</ref>
 
==Background==