Cross-entropy method

{{Short description|Monte Carlo method for importance sampling and optimization}}
The '''cross-entropy''' ('''CE''') '''method''' is a [[Monte Carlo method|Monte Carlo]] method for [[importance sampling]] and [[Optimization (mathematics)|optimization]]. It is applicable to both [[Combinatorial optimization|combinatorial]] and [[Continuous optimization|continuous]] problems, with either a static or noisy objective.
 
The method approximates the optimal importance sampling estimator by repeating two phases:<ref>Rubinstein, R.Y. and Kroese, D.P. (2004), The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, Springer-Verlag, New York {{ISBN|978-0-387-21240-1}}.</ref>
 
#Draw a sample from a [[probability distribution]].
#Minimize the ''[[cross entropy|cross-entropy]]'' between this distribution and a target distribution to produce a better sample in the next iteration.
 
[[Reuven Rubinstein]] developed the method in the context of ''rare-event simulation'', where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the [[traveling salesman problem|traveling salesman]], [[quadratic assignment problem|quadratic assignment]], [[Sequence alignment|DNA sequence alignment]], [[Maxcut|max-cut]] and buffer allocation problems.
 
==Estimation via importance sampling==
<math> g^*(\mathbf{x}) = H(\mathbf{x}) f(\mathbf{x};\mathbf{u})/\ell</math>.
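The optimality of <math>g^*</math> can be checked numerically. The following sketch (an illustration, with a setting not taken from the article) chooses <math>H</math> as the indicator of the event <math>\{X \geq \gamma\}</math> and <math>f</math> as the standard normal density, so that <math>\ell = P(X \geq \gamma)</math> and <math>g^*</math> is the normal density truncated to <math>[\gamma, \infty)</math>; every importance-sampling weight <math>H(x)f(x)/g^*(x)</math> then equals <math>\ell</math> exactly, giving a zero-variance estimator:

```python
import math
import random

# Illustrative setting (not from the article): H(x) = 1{x >= gamma},
# f the standard normal density, so ell = P(X >= gamma).
gamma = 2.0
ell = 0.5 * math.erfc(gamma / math.sqrt(2))  # exact tail probability

def f(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def g_star(x):
    # Optimal IS density: standard normal truncated to [gamma, inf)
    return f(x) / ell if x >= gamma else 0.0

def sample_g_star(rng):
    # Naive rejection sampler for the truncated normal (adequate for gamma = 2)
    while True:
        x = rng.gauss(0.0, 1.0)
        if x >= gamma:
            return x

rng = random.Random(0)
xs = [sample_g_star(rng) for _ in range(5)]
# Each weight H(x) f(x) / g*(x) equals ell, hence zero variance.
weights = [f(x) / g_star(x) for x in xs]
```

Of course, this <math>g^*</math> cannot be used directly in practice, precisely because it requires the unknown <math>\ell</math>; this is what motivates the adaptive scheme below.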
 
This, however, depends on the unknown <math>\ell</math>. The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the [[Kullback–Leibler divergence|Kullback–Leibler]] sense) to the optimal PDF <math>g^*</math>. Several modifications that improve parameter setting, convergence, and the overall computational efficiency of the cross-entropy method for multi-objective optimization problems have been reported.<ref>{{cite journal |last1=Bekker |first1=J. |last2=Aldrich |first2=C. |title=The cross-entropy method in multi-objective optimisation: An assessment |journal=European Journal of Operational Research |date=2011 |volume=211 |issue=1 |pages=112–121 |doi=10.1016/j.ejor.2010.10.028}}</ref><ref>{{cite journal |last1=Giagkiozis |first1=I. |last2=Purshouse |first2=R.C. |last3=Fleming |first3=P.J. |title=Generalized decomposition and cross entropy methods for many-objective optimization |journal=Information Sciences |date=2014 |volume=282 |pages=363–387 |doi=10.1016/j.ins.2014.05.045}}</ref><ref>{{cite journal |last1=Haber |first1=R.E. |last2=Beruvides |first2=G. |last3=Quiza |first3=R. |last4=Hernandez |first4=A. |title=A simple multi-objective optimization based on the cross-entropy method |journal=IEEE Access |date=2017 |volume=5 |pages=22272–22281 |doi=10.1109/access.2017.2764047}}</ref>
 
==Generic CE algorithm==
# Choose initial parameter vector <math>\mathbf{v}^{(0)}</math>; set t = 1.
# Generate a random sample <math>\mathbf{X}_1,\dots,\mathbf{X}_N</math> from <math>f(\cdot;\mathbf{v}^{(t-1)})</math>
# Solve for <math>\mathbf{v}^{(t)}</math>, where<br><math>\mathbf{v}^{(t)} = \mathop{\textrm{argmax}}_{\mathbf{v}} \frac{1}{N} \sum_{i=1}^N H(\mathbf{X}_i)\frac{f(\mathbf{X}_i;\mathbf{u})}{f(\mathbf{X}_i;\mathbf{v}^{(t-1)})} \log f(\mathbf{X}_i;\mathbf{v})</math>
# If convergence is reached then '''stop'''; otherwise, increase t by 1 and reiterate from step 2.
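As a concrete instance of these four steps (a sketch under assumptions not in the article: the parametric family is taken to be <math>N(\mu,\sigma^2)</math> and <math>H</math> the indicator of the tail event <math>\{X \geq \gamma\}</math> under a standard normal nominal density), the maximization in step 3 has a closed form, namely a weighted sample mean and variance with weights <math>W_i = H(\mathbf{X}_i)\,f(\mathbf{X}_i;\mathbf{u})/f(\mathbf{X}_i;\mathbf{v}^{(t-1)})</math>:

```python
import math
import random

# Sketch of the generic CE algorithm for rare-event estimation (the concrete
# setting is an assumption chosen for illustration): estimate
# ell = P(X >= gamma) for X ~ N(0, 1), with sampling family N(mu, sigma^2).
def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def ce_rare_event(gamma=3.0, N=10_000, iters=5, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # v^(0) equals the nominal parameter u = (0, 1)
    for _ in range(iters):
        xs = [rng.gauss(mu, sigma) for _ in range(N)]
        # Weights W_i = H(X_i) f(X_i; u) / f(X_i; v^(t-1)), with H an indicator
        ws = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, mu, sigma) if x >= gamma else 0.0
              for x in xs]
        total = sum(ws)
        if total == 0.0:
            continue  # no sample reached the rare event; resample
        # Step 3 in closed form: weighted sample mean and variance
        mu = sum(w * x for w, x in zip(ws, xs)) / total
        sigma = max(math.sqrt(sum(w * (x - mu) ** 2 for w, x in zip(ws, xs)) / total), 1e-6)
    # Final importance-sampling estimate of ell under the tilted density
    xs = [rng.gauss(mu, sigma) for _ in range(N)]
    return sum(normal_pdf(x, 0.0, 1.0) / normal_pdf(x, mu, sigma)
               for x in xs if x >= gamma) / N
```

Here the exact value is <math>\ell = \tfrac{1}{2}\operatorname{erfc}(3/\sqrt{2}) \approx 1.35\times 10^{-3}</math>; after a few iterations the sampling density concentrates in the tail, so almost every sample contributes to the estimate instead of being wasted below <math>\gamma</math>.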
 
==Continuous optimization&mdash;example==
The same CE algorithm can be used for optimization, rather than estimation.
Suppose the problem is to maximize some function <math>S(x)</math>, for example,
<math>S(x) = \textrm{e}^{-(x-2)^2} + 0.8\,\textrm{e}^{-(x+2)^2}</math>.
To apply CE, one considers first the ''associated stochastic problem'' of estimating
parametric family are the sample mean and sample variance corresponding to the ''elite samples'', which are those samples that have objective function value <math>\geq\gamma</math>.
The worst of the elite samples is then used as the level parameter for the next iteration.
This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an [[estimation of distribution algorithm]]. Recent applications of the cross-entropy optimization method include the dynamic economic dispatch problem with a unit start-stop plan,<ref>{{cite journal |last1=Xie |first1=M. |last2=Du |first2=Y. |last3=Wei |first3=W. |last4=Liu |first4=M. |title=A cross-entropy-based hybrid membrane computing method for power system unit commitment problems |journal=Energies |date=2019 |volume=12 |issue=3 |doi=10.3390/en12030486}}</ref> parametrization of micromilling processes,<ref>{{cite journal |last1=La Fe |first1=I. |last2=Beruvides |first2=G. |last3=Quiza |first3=R. |last4=Haber |first4=R.E. |last5=Rivas |first5=M. |title=Automatic Selection of Optimal Parameters Based on Simple Soft-Computing Methods: A Case Study of Micromilling Processes |journal=IEEE Transactions on Industrial Informatics |date=2019 |volume=15 |issue=2 |pages=800–811 |doi=10.1109/tii.2018.2816971}}</ref> energy scheduling problems,<ref>{{cite journal |last1=Wang |first1=L. |last2=Li |first2=Q. |last3=Zhang |first3=B. |last4=Ding |first4=R. |last5=Sun |first5=M. |title=Robust multi-objective optimization for energy production scheduling in microgrids |journal=Engineering Optimization |date=2019 |volume=51 |issue=2 |pages=332–351 |doi=10.1080/0305215x.2018.1457655}}</ref> and robotic automated storage and retrieval systems.<ref>{{cite journal |last1=Foumani |first1=M. |last2=Moeini |first2=A. |last3=Haythorpe |first3=M. |last4=Smith-Miles |first4=K. |title=A cross-entropy method for optimising robotic automated storage and retrieval systems |journal=International Journal of Production Research |date=2019 |volume=56 |issue=19 |pages=6450–6472 |doi=10.1080/00207543.2018.1456692}}</ref>
 
===Pseudocode===
''// Initialize parameters''
&mu; := −6
&sigma;2 := 100
t := 0
maxits := 100
N := 100
Ne := 10
''// While maxits not exceeded and not converged''
'''while''' t < maxits '''and''' &sigma;2 > &epsilon; '''do'''
    ''// Obtain N samples from current sampling distribution''
    X := SampleGaussian(&mu;, &sigma;2, N)
    ''// Evaluate objective function at sampled points''
    S := exp(-(X - 2) ^ 2) + 0.8 exp(-(X + 2) ^ 2)
    ''// Sort X by objective function values in descending order''
    X := sort(X, S)
    ''// Update parameters of sampling distribution via elite samples''
    &mu; := mean(X(1:Ne))
    &sigma;2 := var(X(1:Ne))
    t := t + 1
''// Return mean of final sampling distribution as solution''
'''return''' &mu;
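A direct transcription of this pseudocode into Python (a sketch: <code>SampleGaussian</code> is replaced by the standard library's Gaussian sampler, and a concrete convergence threshold &epsilon; = 10<sup>&minus;8</sup> is assumed) converges to the global maximizer near <math>x = 2</math>:

```python
import math
import random

def ce_maximize(seed=0):
    # Transcription of the pseudocode above; random.gauss plays the role of
    # SampleGaussian, and eps is an assumed convergence threshold.
    rng = random.Random(seed)
    mu, sigma2 = -6.0, 100.0
    t, maxits, N, Ne, eps = 0, 100, 100, 10, 1e-8
    while t < maxits and sigma2 > eps:
        # Obtain N samples from the current sampling distribution
        xs = [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(N)]
        # Sort by the objective S(x) = exp(-(x-2)^2) + 0.8 exp(-(x+2)^2), descending
        xs.sort(key=lambda x: math.exp(-(x - 2) ** 2)
                + 0.8 * math.exp(-(x + 2) ** 2), reverse=True)
        elite = xs[:Ne]
        # Refit the sampling distribution to the elite samples
        mu = sum(elite) / Ne
        sigma2 = sum((x - mu) ** 2 for x in elite) / Ne
        t += 1
    return mu
```

With these parameters the sampling distribution first widens enough to cover both modes and then collapses onto the higher peak at <math>x \approx 2</math> rather than the local maximum at <math>x \approx -2</math>.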
 
==Related methods==
* [[Simulated annealing]]
* [[Genetic algorithms]]
* [[Harmony search]]
* [[Estimation of distribution algorithm]]
* [[Tabu search]]
* [[Natural Evolution Strategy]]
* [[Ant colony optimization algorithms]]
 
==See also==
* [[Cross entropy]]
* [[Kullback–Leibler divergence]]
* [[Randomized algorithm]]
* [[Importance sampling]]
 
==Journal papers==
* De Boer, P.-T., Kroese, D.P., Mannor, S. and Rubinstein, R.Y. (2005). A Tutorial on the Cross-Entropy Method. ''Annals of Operations Research'', '''134''' (1), 19–67. [http://www.maths.uq.edu.au/~kroese/ps/aortut.pdf]
* Rubinstein, R.Y. (1997). Optimization of Computer Simulation Models with Rare Events, ''European Journal of Operational Research'', '''99''', 89–112.
 
==Software implementations==
* [https://ceopt.org '''CEopt''' MATLAB package]
* [https://cran.r-project.org/web/packages/CEoptim/index.html '''CEoptim''' R package]
* [https://www.nuget.org/packages/Novacta.Analytics '''Novacta.Analytics''' .NET library]
 
==References==
{{Reflist}}