Hyperparameter optimization: Difference between revisions

remove incorrect and uncited claim
Random search: add note about continuous and grid search
Line 38:
 
=== Random search ===
Random search replaces the exhaustive enumeration of all combinations with random sampling of them. It applies directly to the discrete setting described above and also generalizes to continuous and mixed spaces. For continuous hyperparameters, random search can explore many more distinct values than grid search can with the same number of evaluations. It can outperform grid search, especially when only a small number of hyperparameters affects the final performance of the machine learning algorithm.<ref name="bergstra" /> In this case, the optimization problem is said to have a low intrinsic dimensionality.<ref>{{Cite journal|last1=Wang|first1=Ziyu|last2=Hutter|first2=Frank|last3=Zoghi|first3=Masrour|last4=Matheson|first4=David|last5=de Freitas|first5=Nando|date=2016|title=Bayesian Optimization in a Billion Dimensions via Random Embeddings|journal=Journal of Artificial Intelligence Research|language=en|volume=55|pages=361–387|doi=10.1613/jair.4806|arxiv=1301.1942|s2cid=279236}}</ref> Random search is also [[embarrassingly parallel]] and allows prior knowledge to be included by specifying the distribution from which to sample. Despite its simplicity, random search remains an important baseline against which to compare the performance of new hyperparameter optimization methods.
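
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function <code>random_search</code>, the search space, and the toy objective are all hypothetical names chosen for the example. Each hyperparameter has its own sampler, which is where prior knowledge (for instance, a log-uniform distribution for a learning rate) would enter, and the toy objective depends on only one hyperparameter to mimic low intrinsic dimensionality.

```python
import math
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Evaluate n_trials independently sampled configurations and
    return the one with the lowest objective value."""
    rng = random.Random(seed)
    best_config, best_score = None, math.inf
    for _ in range(n_trials):
        # Each hyperparameter is drawn from its own sampler, so the
        # space may mix continuous and discrete dimensions.
        config = {name: sample(rng) for name, sample in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space; the samplers encode prior knowledge.
space = {
    # Log-uniform over [1e-5, 1e-1] (continuous).
    "learning_rate": lambda rng: 10 ** rng.uniform(-5, -1),
    # Uniform over the integers 1..4 (discrete).
    "num_layers": lambda rng: rng.randint(1, 4),
    # Uniform over a categorical choice.
    "activation": lambda rng: rng.choice(["relu", "tanh"]),
}


def toy_objective(config):
    # Stand-in for a validation loss. Only learning_rate matters here,
    # so the problem has low intrinsic dimensionality.
    return (math.log10(config["learning_rate"]) + 3) ** 2


best_config, best_score = random_search(toy_objective, space, n_trials=200, seed=0)
```

Because the trials do not depend on one another, the loop body could be distributed across workers without coordination, which is the sense in which the method is embarrassingly parallel. In practice the objective would train and validate a model rather than evaluate a closed-form expression.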
 
[[File:Hyperparameter Optimization using Tree-Structured Parzen Estimators.svg|thumb|Methods such as Bayesian optimization explore the space of hyperparameter choices adaptively, deciding which combination to evaluate next based on previous observations.]]