Consensus based optimization: Difference between revisions

Content deleted Content added
ce
Citation bot (talk | contribs)
Altered template type. Add: class, eprint, s2cid, bibcode, pages, issue, volume, journal, date, authors 1-4. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Headbomb | #UCB_toolbar
Line 5:
'''Consensus-based optimization (CBO)'''<ref name=":0">{{Cite journal |last1=Pinnau |first1=René |last2=Totzeck |first2=Claudia |last3=Tse |first3=Oliver |last4=Martin |first4=Stephan |date=January 2017 |title=A consensus-based model for global optimization and its mean-field limit |url=https://www.worldscientific.com/doi/abs/10.1142/S0218202517400061 |journal=Mathematical Models and Methods in Applied Sciences |language=en |volume=27 |issue=1 |pages=183–204 |doi=10.1142/S0218202517400061 |arxiv=1604.05648 |s2cid=119296432 |issn=0218-2025}}</ref> is a multi-agent [[derivative-free optimization]] method, designed to obtain solutions for global optimization problems of the form <math display="block">\min_{x\in \cal{X}} f(x),</math>
[[File:CBORastrigin.gif|thumb|Behavior of CBO on the [[Rastrigin function]]. '''Blue:''' Particles, '''Pink:''' drift vectors and consensus point.]]
where <math>f:\mathcal{X}\to\R</math> denotes the objective function acting on the state space <math>\cal{X}</math>, which we assume to be a [[normed vector space]] in the following. The function <math>f</math> can potentially be nonconvex and nonsmooth. The algorithm employs particles or agents to explore the state space, which communicate with each other to update their positions. Their dynamics follows the paradigm of [[Metaheuristic|metaheuristics]], which blend exporation with exploitation. In this sense, CBO is comparable to [[Ant colony optimization algorithms|ant colony optimization]], wind driven optimization<ref>{{Cite webjournal |title=The Wind Driven Optimization Technique and its Application in Electromagnetics |date=2013 |url=https://ieeexplore.ieee.org/document/6407788 |access-date=2024-02-03 |website=ieeexplore.ieee.org |doi=10.1109/TAP.2013.2238654 |last1=Bayraktar |first1=Zikri |last2=Komurcu |first2=Muge |last3=Bossard |first3=Jeremy A. |last4=Werner |first4=Douglas H. |journal=IEEE Transactions on Antennas and Propagation |volume=61 |issue=5 |pages=2745–2757 |bibcode=2013ITAP...61.2745B |s2cid=38181295 }}</ref>, [[particle swarm optimization]] or [[Simulated annealing]].
 
== The algorithm ==
Line 15:
* '''Scaled noise:''' For each <math>t\geq 0</math> and <math>i=1,\dots,N</math>, we denote by <math>B^i_t</math> independent standard Brownian motions. The function <math>D:{\cal{X}}\to\R^s</math> incorporates the drift of the <math>i</math>th particle and determines the noise model. The most common choices are:
** ''Isotropic noise'', <math>D(\cdot) = \|\cdot \|</math>: In this case <math>s=1</math> and every component of the noise vector is scaled equally. This was used in the original version of the algorithm<ref name=":0" />.
** ''Anisotropic noise<ref>{{cite arxivarXiv |last1=Carrillo |first1=José A. |title=A consensus-based global optimization method for high dimensional machine learning problems |date=2020-03-04 |arxiveprint=1909.09249 |last2=Jin |first2=Shi |last3=Li |first3=Lei |last4=Zhu |first4=Yuhua|class=math.OC }}</ref>'', <math>D(\cdot) = |\cdot|</math>: In the special case, where <math>{\cal{X}}\subset \R^d</math>, this means that <math>s=d</math> and <math>D</math> applies the absolute value function component-wise. Here, every component of the noise vector is scaled, dependent on the corresponding entry of the drift vector.
* '''Hyperparameters:''' The parameter <math>\sigma \geq 0</math> scales the influence of the noise term. The parameter <math>\alpha \geq 0</math> determines the separation effect of the particles<ref name=":0" />:
** in the limit <math>\alpha\to 0</math> every particle is assigned the same weight and the consensus point is a regular mean.
Line 34:
 
=== Polarization ===
If the function <math> f</math> is multi-modal, i.e., has more than one global minimum, the standard CBO algorithm can only find one of these points. However, one can “polarize”<ref>{{cite arxivarXiv |last1=Bungert |first1=Leon |title=Polarized consensus-based dynamics for optimization and sampling |date=2023-10-09 |arxiveprint=2211.05238 |last2=Roith |first2=Tim |last3=Wacker |first3=Philipp|class=math.OC }}</ref> the consensus computation by introducing a kernel <math>k: \cal{X}\times\cal{X}\to[0,\infty)</math> that includes local information into the weighting. In this case, every particle has its own version of the consensus point, which is computed as
 
<math display="block">c_\alpha^j(x) = \frac{1}{\sum_{i=1}^N \omega_\alpha^j(x^i)} \sum_{i=1}^N x^i\ \omega_\alpha^j(x^i), \quad\text{ with }\quad \omega_\alpha^j(\,\cdot\,) = \mathrm{exp}(-\alpha f(\,\cdot\,))\, k(\cdot,x^j).