Consensus based optimization: Difference between revisions

Content deleted Content added
Dummy edit to reset G13 clock after undeletion (rfud-helper)
Added CBS
Line 5:
'''Consensus-based optimization (CBO)'''<ref name=":0">{{Cite journal |last1=Pinnau |first1=René |last2=Totzeck |first2=Claudia |last3=Tse |first3=Oliver |last4=Martin |first4=Stephan |date=January 2017 |title=A consensus-based model for global optimization and its mean-field limit |url=https://www.worldscientific.com/doi/abs/10.1142/S0218202517400061 |journal=Mathematical Models and Methods in Applied Sciences |language=en |volume=27 |issue=1 |pages=183–204 |doi=10.1142/S0218202517400061 |arxiv=1604.05648 |s2cid=119296432 |issn=0218-2025}}</ref> is a multi-agent [[derivative-free optimization]] method, designed to obtain solutions for global optimization problems of the form <math display="block">\min_{x\in \cal{X}} f(x),</math>
[[File:CBORastrigin.gif|thumb|Behavior of CBO on the [[Rastrigin function]]. '''Blue:''' Particles, '''Pink:''' drift vectors and consensus point.]]
where <math>f:\mathcal{X}\to\R</math> denotes the objective function acting on the state space <math>\cal{X}</math>, which weis assumeassumed to be a [[normed vector space]] in the following. The function <math>f</math> can potentially be nonconvex and nonsmooth. The algorithm employs particles or agents to explore the state space, which communicate with each other to update their positions. Their dynamics follows the paradigm of [[Metaheuristic|metaheuristics]], which blend exporation with exploitation. In this sense, CBO is comparable to [[Ant colony optimization algorithms|ant colony optimization]], wind driven optimization<ref>{{Cite journal |title=The Wind Driven Optimization Technique and its Application in Electromagnetics |date=2013 |url=https://ieeexplore.ieee.org/document/6407788 |access-date=2024-02-03 |doi=10.1109/TAP.2013.2238654 |last1=Bayraktar |first1=Zikri |last2=Komurcu |first2=Muge |last3=Bossard |first3=Jeremy A. |last4=Werner |first4=Douglas H. |journal=IEEE Transactions on Antennas and Propagation |volume=61 |issue=5 |pages=2745–2757 |bibcode=2013ITAP...61.2745B |s2cid=38181295 }}</ref>, [[particle swarm optimization]] or [[Simulated annealing]].
 
== The algorithm ==
Line 19:
** in the limit <math>\alpha\to 0</math> every particle is assigned the same weight and the consensus point is a regular mean.
** In the limit <math>\alpha\to\infty</math> the consensus point corresponds to the particle with the best objective value, completely ignoring the position of other points in the ensemble.
 
== Convergence ==
 
== Notes on Implementation ==
Line 30 ⟶ 28:
 
=== Sampling ===
Consensus-based optimization can be transformed into a sampling method<ref>{{Citation |last=Carrillo |first=J. A. |title=Consensus Based Sampling |date=2021-11-04 |url=https://arxiv.org/abs/2106.02519 |access-date=2024-08-17 |doi=10.48550/arXiv.2106.02519 |last2=Hoffmann |first2=F. |last3=Stuart |first3=A. M. |last4=Vaes |first4=U.}}</ref> by modifying the noise term and choosing appropriate hyperparameters. Namely, one considers the following SDE{{NumBlk|::|<math>dx^i_t = -(x^i_t-c_\alpha(x_t))\,dt + \sqrt{2 \tilde{\lambda}^{-1}\, C_\alpha(x_t)}\,dB^i_t,</math>|{{EquationRef|2}}}}where the weighted covariance matrix is defined as
 
<math>C_\alpha(x_t) := \frac{1}{\sum_{i=1}^N \omega_\alpha(x_t^i)}\sum_{i=1}^N (x_t^i - c(x_t)) \otimes (x_t^i - c(x_t)) \omega(x_t^i) </math>.
=== Polarization ===
If the function <math> f</math> is multi-modal, i.e., has more than one global minimum, the standard CBO algorithm can only find one of these points. However, one can “polarize”<ref>{{cite arXiv |last1=Bungert |first1=Leon |title=Polarized consensus-based dynamics for optimization and sampling |date=2023-10-09 |eprint=2211.05238 |last2=Roith |first2=Tim |last3=Wacker |first3=Philipp|class=math.OC }}</ref> the consensus computation by introducing a kernel <math>k: \cal{X}\times\cal{X}\to[0,\infty)</math> that includes local information into the weighting. In this case, every particle has its own version of the consensus point, which is computed as
 
If the parameters are chosen such that <math>\tilde{\lambda}^{-1} = (1 + \alpha)</math> the above scheme creates approximate samples of a probability distribution with a density, that is proportional to <math>\exp(-\alpha f)</math>.
<math display="block">c_\alpha^j(x) = \frac{1}{\sum_{i=1}^N \omega_\alpha^j(x^i)} \sum_{i=1}^N x^i\ \omega_\alpha^j(x^i), \quad\text{ with }\quad \omega_\alpha^j(\,\cdot\,) = \mathrm{exp}(-\alpha f(\,\cdot\,))\, k(\cdot,x^j).
 
=== Polarization ===
If the function <math> f</math> is multi-modal, i.e., has more than one global minimum, the standard CBO algorithm can only find one of these points. However, one can “polarize”<ref>{{cite arXiv |last1=Bungert |first1=Leon |title=Polarized consensus-based dynamics for optimization and sampling |date=2023-10-09 |eprint=2211.05238 |last2=Roith |first2=Tim |last3=Wacker |first3=Philipp|class=math.OC }}</ref> the consensus computation by introducing a kernel <math>k: \cal{X}\times\cal{X}\to[0,\infty)</math> that includes local information into the weighting. In this case, every particle has its own version of the consensus point, which is computed as<math display="block">c_\alpha^j(x) = \frac{1}{\sum_{i=1}^N \omega_\alpha^j(x^i)} \sum_{i=1}^N x^i\ \omega_\alpha^j(x^i), \quad\text{ with }\quad \omega_\alpha^j(\,\cdot\,) = \mathrm{exp}(-\alpha f(\,\cdot\,))\, k(\cdot,x^j).
</math>In this case, the drift is a vector field over the state space <math>\cal{X}
</math>. Intuitively, particles are now not only attracted to other particles based on their objective value, but also based on their spatial locality. For a constant kernel function, the polarized version corresponds to standard CBO and is therefore a generalization. We briefly give some examples of common configurations: