Consensus based optimization: Difference between revisions

Content deleted Content added
m top: clean up, removed: {{!}} IEEE Journals & Magazine {{!}} IEEE Xplore
Citation bot (talk | contribs)
Alter: url, issue. URLs might have been anonymized. Add: s2cid, arxiv, authors 1-1. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox | #UCB_webform_linked 28/35
Line 15:
* '''Scaled noise:''' For each <math>t\geq 0</math> and <math>i=1,\dots,N</math>, we denote by <math>B^i_t</math> independent standard Brownian motions. The function <math>D:{\cal{X}}\to\R^s</math> incorporates the drift of the <math>i</math>th particle and determines the noise model. The most common choices are:
** ''Isotropic noise'', <math>D(\cdot) = \|\cdot \|</math>: In this case <math>s=1</math> and every component of the noise vector is scaled equally. This was used in the original version of the algorithm<ref name=":0" />.
** ''Anisotropic noise<ref>{{Citation |lastlast1=Carrillo |firstfirst1=José A. |title=A consensus-based global optimization method for high dimensional machine learning problems |date=2020-03-04 |url=http://arxiv.org/abs/1909.09249 |access-date=2024-02-05 |doiarxiv=10.48550/arXiv.1909.09249 |last2=Jin |first2=Shi |last3=Li |first3=Lei |last4=Zhu |first4=Yuhua}}</ref>'', <math>D(\cdot) = |\cdot|</math>: In the special case, where <math>{\cal{X}}\subset \R^d</math>, this means that <math>s=d</math> and <math>D</math> applies the absolute value function component-wise. Here, every component of the noise vector is scaled, dependent on the corresponding entry of the drift vector.
* '''Hyperparameters:''' The parameter <math>\sigma \geq 0</math> scales the influence of the noise term. The parameter <math>\alpha \geq 0</math> determines the separation effect of the particles<ref name=":0" />:
** in the limit <math>\alpha\to 0</math> every particle is assigned the same weight and the consensus point is a regular mean.
Line 34:
 
=== Polarization ===
If the function <math> f</math> is multi-modal, i.e., has more than one global minimum, the standard CBO algorithm can only find one of these points. However, one can “polarize”<ref>{{Citation |lastlast1=Bungert |firstfirst1=Leon |title=Polarized consensus-based dynamics for optimization and sampling |date=2023-10-09 |url=http://arxiv.org/abs/2211.05238 |access-date=2024-02-05 |doiarxiv=10.48550/arXiv.2211.05238 |last2=Roith |first2=Tim |last3=Wacker |first3=Philipp}}</ref> the consensus computation by introducing a kernel <math>k: \cal{X}\times\cal{X}\to[0,\infty)</math> that includes local information into the weighting. In this case, every particle has its own version of the consensus point, which is computed as
 
<math display="block">c_\alpha^j(x) = \frac{1}{\sum_{i=1}^N \omega_\alpha^j(x^i)} \sum_{i=1}^N x^i\ \omega_\alpha^j(x^i), \quad\text{ with }\quad \omega_\alpha^j(\,\cdot\,) = \mathrm{exp}(-\alpha f(\,\cdot\,))\, k(\cdot,x^j).
Line 44:
</math> determines the communication radius of particles. This choice corresponds to a local convex regularization of the objective function <math>f
</math>.
* [[Mean-shift algorithm|'''Mean-shift algorithm''']]<ref>{{Cite journal |lastlast1=Fukunaga |firstfirst1=K. |last2=Hostetler |first2=L. |date=January 1975 |title=The estimation of the gradient of a density function, with applications in pattern recognition |url=httphttps://ieeexplore.ieee.org/document/1055330/ |journal=IEEE Transactions on Information Theory |language=en |volume=21 |issue=1 |pages=32–40 |doi=10.1109/TIT.1975.1055330 |issn=0018-9448}}</ref>: Employing polarized CBO for a constant objective function <math>f
</math>, together with no noise (i.e. <math>\sigma = 0
</math>) and an Euler–Maruyama discretization with step size <math>dt=1
</math>, corresponds to the mean-shift algorithm.
* '''Bounded confidence model''': When choosing a constant objective function, no noise model, but also the special kernel function <math>k(x,\tilde x) = 1_{\|x-\tilde x\| \leq \kappa}
</math>, the SDE in {{EquationNote|2=(1)}} transforms to a ODE known as the bounded confidence model<ref>{{Cite journal |lastlast1=Deffuant |firstfirst1=Guillaume |last2=Neau |first2=David |last3=Amblard |first3=Frederic |last4=Weisbuch |first4=Gérard |date=January 2000 |title=Mixing beliefs among interacting agents |url=https://www.worldscientific.com/doi/abs/10.1142/S0219525900000078 |journal=Advances in Complex Systems |language=en |volume=03 |issue=01n041n04 |pages=87–98 |doi=10.1142/S0219525900000078 |s2cid=15604530 |issn=0219-5259}}</ref>, which arises in opinion dynamics.
 
== See also ==