Bayesian optimization
==Strategy==
[[File:GpParBayesAnimationSmall.gif|thumb|440x330px|Bayesian optimization of a function (black) with Gaussian processes (purple). Three acquisition functions (blue) are shown at the bottom.<ref>{{Citation|last=Wilson|first=Samuel|title=ParBayesianOptimization R package|date=2019-11-22|url=https://github.com/AnotherSamWilson/ParBayesianOptimization|access-date=2019-12-12}}</ref>]]
Bayesian optimization is typically used on problems of the form <math display="inline">\max_{x \in A} f(x)</math>, where <math display="inline">A</math> is a set of points <math display="inline">x</math> in at most 20 [[dimension]]s (<math display="inline">\mathbb{R}^d</math> with <math display="inline">d \le 20</math>) whose membership can easily be evaluated. Bayesian optimization is particularly advantageous for problems where <math display="inline">f(x)</math> is difficult to evaluate due to its computational cost. The objective function <math display="inline">f</math> is continuous but of unknown structure, and is therefore referred to as a "black box": evaluating it yields only the value <math display="inline">f(x)</math>, and its [[derivative]]s are not observed.<ref name=":0">{{cite arXiv|last=Frazier|first=Peter I.|date=2018-07-08|title=A Tutorial on Bayesian Optimization|class=stat.ML|eprint=1807.02811}}</ref>
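To make the setting concrete, the following sketch shows the kind of black-box objective Bayesian optimization targets; the objective, its cost, and the bounds here are hypothetical and invented purely for illustration. Evaluating the function returns only the value <math display="inline">f(x)</math>, with no gradient information.

<syntaxhighlight lang="python">
import numpy as np

def expensive_black_box(x):
    """Hypothetical expensive objective: evaluating it yields only f(x);
    no derivatives are available. In practice this could wrap a physical
    simulation, a machine-learning training run, or a lab experiment."""
    # Stand-in for a computation that might take minutes or hours.
    return float(-np.sum((x - 0.3) ** 2) + 0.1 * np.sin(5.0 * np.sum(x)))

# The feasible set A is a low-dimensional box (here d = 2 <= 20).
bounds = np.array([[0.0, 1.0],
                   [0.0, 1.0]])
</syntaxhighlight>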
 
Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a [[Prior distribution|prior]] over it. The prior captures beliefs about the behavior of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the [[posterior distribution]] over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as an infill sampling criterion) that determines the next query point.
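As an illustration of this prior–posterior–acquisition loop, the sketch below fits a Gaussian-process surrogate to the evaluations gathered so far and selects the next query point by maximizing an expected-improvement acquisition function over random candidates. It is a minimal sketch under stated assumptions, not a reference implementation: the Matérn kernel, the random candidate search, and the exploration parameter <code>xi</code> are choices made here for brevity, and scikit-learn's <code>GaussianProcessRegressor</code> is used only for convenience.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X_cand, gp, f_best, xi=0.01):
    """Expected improvement over the best observed value (maximization)."""
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-12)  # guard against division by zero
    z = (mu - f_best - xi) / sigma
    return (mu - f_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def bayesian_optimize(f, bounds, n_init=5, n_iter=25, n_cand=2000, seed=0):
    rng = np.random.default_rng(seed)
    d = len(bounds)
    # Initial design: a handful of random evaluations of the black box.
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, d))
    y = np.array([f(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        # Condition the GP prior on the data to obtain the posterior.
        gp.fit(X, y)
        # Maximize the acquisition over random candidates (a simple
        # stand-in for a proper inner optimizer).
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_cand, d))
        ei = expected_improvement(cand, gp, y.max())
        x_next = cand[np.argmax(ei)]
        # Query the expensive objective at the chosen point.
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next))
    best = np.argmax(y)
    return X[best], y[best]
</syntaxhighlight>

With the hypothetical objective above, a call such as <code>bayesian_optimize(expensive_black_box, bounds)</code> returns the best point and value found once the evaluation budget is spent.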