{{Short description|Family of stochastic optimization methods}}
[[Image:Eda mono-variant gauss iterations.svg|thumb|350px|Estimation of distribution algorithm. For each iteration ''i'', a population ''P'' is drawn at random from a distribution ''PDu''. The distribution parameters ''PDe'' are then estimated from the selected points ''PS''. The illustrated example optimizes a continuous objective function ''f(X)'' with a unique optimum ''O''. The sampling (following a normal distribution ''N'') concentrates around the optimum as the algorithm proceeds.]]
'''''Estimation of distribution algorithms''''' ('''EDAs'''), sometimes called '''''probabilistic model-building genetic algorithms''''' (PMBGAs),<ref>{{Citation|last=Pelikan|first=Martin
EDAs belong to the class of [[evolutionary algorithms]]. The main difference between EDAs and most conventional evolutionary algorithms is that the latter generate new candidate solutions using an ''implicit'' distribution defined by one or more variation operators, whereas EDAs use an ''explicit'' probability distribution encoded by a [[Bayesian network]], a [[multivariate normal distribution]], or another model class. Like other evolutionary algorithms, EDAs can be used to solve optimization problems defined over a range of representations, from vectors to [[LISP]]-style S-expressions, and the quality of candidate solutions is often evaluated using one or more objective functions.
The general procedure of an EDA is outlined below:
 ''t'' := 0
 initialize model M(0) to represent uniform distribution over admissible solutions
 '''while''' (termination criteria not met) '''do'''
     ''P'' := generate ''N'' > 0 candidate solutions by sampling M(''t'')
     ''F'' := evaluate all candidate solutions in ''P''
     M(''t'' + 1) := adjust_model(''P'', ''F'', M(''t''))
     ''t'' := ''t'' + 1
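The loop above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the model is taken to be a vector of independent Bernoulli marginals over bit strings, <code>adjust_model</code> is assumed to use truncation selection followed by frequency counting, the termination criterion is a fixed iteration budget, and the names <code>onemax</code>, <code>run_eda</code>, <code>n_selected</code> and <code>max_iters</code> are hypothetical.

```python
import random


def onemax(x):
    """Toy objective (an assumption for this sketch): number of ones."""
    return sum(x)


def sample(model, rng):
    """Draw one bit string from independent Bernoulli marginals."""
    return [1 if rng.random() < p else 0 for p in model]


def adjust_model(population, fitness, model, n_selected):
    """Re-estimate each marginal from the n_selected best solutions.

    The previous model is accepted to mirror the pseudocode but is unused
    here; incremental schemes would blend it with the new estimate.
    """
    ranked = sorted(zip(fitness, population), key=lambda fp: fp[0], reverse=True)
    selected = [x for _, x in ranked[:n_selected]]
    return [sum(x[i] for x in selected) / len(selected) for i in range(len(model))]


def run_eda(objective, n_vars, pop_size=100, n_selected=50, max_iters=50, seed=0):
    rng = random.Random(seed)
    model = [0.5] * n_vars  # M(0): uniform distribution over bit strings
    best, best_f = None, float("-inf")
    for _ in range(max_iters):  # termination criterion: iteration budget
        population = [sample(model, rng) for _ in range(pop_size)]  # P
        fitness = [objective(x) for x in population]  # F
        for x, f in zip(population, fitness):
            if f > best_f:
                best, best_f = x, f
        model = adjust_model(population, fitness, model, n_selected)  # M(t + 1)
    return best, best_f, model
```

Calling <code>run_eda(onemax, 20)</code> returns the best bit string found, its fitness, and the final marginals. Swapping in a richer model class only requires replacing <code>sample</code> and <code>adjust_model</code>.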
Using explicit probabilistic models in optimization has allowed EDAs to feasibly solve optimization problems that were notoriously difficult for most conventional evolutionary algorithms and traditional optimization techniques, such as problems with high levels of [[epistasis]].{{Citation needed|date=September 2017}} A further advantage of EDAs is that they provide an optimization practitioner with a series of probabilistic models that reveal a lot of information about the problem being solved. This information can in turn be used to design problem-specific neighborhood operators for local search, to bias future runs of EDAs on a similar problem, or to create an efficient computational model of the problem.
===Univariate marginal distribution algorithm (UMDA)===
The UMDA<ref>{{cite journal|last1=Mühlenbein|first1=Heinz|title=The Equation for Response to Selection and Its Use for Prediction|journal=Evol. Computation|date=1 September 1997|volume=5|issue=3|pages=303–346|doi=10.1162/evco.1997.5.3.303|pmid=10021762|s2cid=2593514 |url=http://dl.acm.org/citation.cfm?id=1326756|issn=1063-6560|url-access=subscription}}</ref> is a simple EDA that uses an operator <math>\alpha_{UMDA}</math> to estimate marginal probabilities from a selected population <math>S(P(t))</math>. Assuming that <math>S(P(t))</math> contains <math>\lambda</math> elements, <math>\alpha_{UMDA}</math> produces the probabilities:
<math>
</math>
where <math>\gamma\in(0,1]</math> is a parameter defining the [[learning rate]]; a small value means that the previous model <math>p_t(X_i)</math> is only slightly modified by the newly sampled solutions. PBIL can be described as
<math>
===Compact genetic algorithm (cGA)===
The cGA<ref>{{cite journal|last1=Harik|first1=G.R.|last2=Lobo|first2=F.G.|last3=Goldberg|first3=D.E.|title=The compact genetic algorithm|journal=IEEE Transactions on Evolutionary Computation|date=1999|volume=3|issue=4|pages=287–297|doi=10.1109/4235.797971}}</ref> also relies on the implicit populations defined by univariate distributions. At each generation <math>t</math>, two individuals <math>x,y</math> are sampled, <math>P(t)=\beta_2(D(t))</math>. The population <math>P(t)</math> is then
<math>
</math>
where <math>\gamma\in(0,1]</math> is a constant defining the [[learning rate]], usually set to <math>\gamma=1/N</math>. The cGA can be defined as
<math>
</math>
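The cGA description above (two sampled individuals per generation, learning rate <math>\gamma=1/N</math>) can be illustrated with the following Python sketch. It assumes the commonly described update in which each marginal moves by <math>1/N</math> toward the better of the two individuals wherever they disagree. The function name <code>cga</code>, the parameters <code>virtual_pop_size</code> and <code>max_iters</code>, and the integer-counter representation of the marginals are assumptions made for illustration.

```python
import random


def cga(objective, n_vars, virtual_pop_size=50, max_iters=5000, seed=0):
    """Sketch of a compact GA on bit strings (maximization assumed)."""
    rng = random.Random(seed)
    n = virtual_pop_size
    # Marginals are stored as integer counts k_i, with p_i = k_i / n,
    # so that a step of gamma = 1/n is exact.
    counts = [n // 2] * n_vars
    for _ in range(max_iters):
        x = [1 if rng.random() < k / n else 0 for k in counts]
        y = [1 if rng.random() < k / n else 0 for k in counts]
        winner, loser = (x, y) if objective(x) >= objective(y) else (y, x)
        for i in range(n_vars):
            if winner[i] != loser[i]:
                # Shift the marginal one step toward the winner's bit.
                if winner[i] == 1:
                    counts[i] = min(n, counts[i] + 1)
                else:
                    counts[i] = max(0, counts[i] - 1)
        if all(k in (0, n) for k in counts):
            break  # every marginal has converged to 0 or 1
    return [k / n for k in counts]
```

Only the probability vector is stored, which is why the population is described as implicit.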
Bivariate and multivariate distributions are usually represented as
===Mutual information maximizing input clustering (MIMIC)===
The MIMIC<ref>{{cite journal|last1=Bonet|first1=Jeremy S. De|last2=Isbell|first2=Charles L.|last3=Viola|first3=Paul|title=MIMIC: Finding Optima by Estimating Probability Densities|journal=Advances in Neural Information Processing Systems|date=1 January 1996|pages=424|
<math>
===Bivariate marginal distribution algorithm (BMDA)===
The BMDA<ref>{{cite
The resulting model is a forest with multiple trees rooted at nodes <math>\Upsilon_t</math>. Denoting by <math>I_t</math> the non-root variables, BMDA estimates a factorized distribution in which the root variables can be sampled independently, whereas all the others must be conditioned on the parent variable <math>\pi_i</math>.
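Sampling from such a forest can be sketched in Python as below. The sketch assumes binary variables and a hypothetical representation of the model: <code>parent[i]</code> is the parent index of variable <code>i</code> or <code>None</code> for a root, <code>p_root[i]</code> is the probability that a root equals 1, and <code>p_cond[i][v]</code> is the probability that a non-root equals 1 given that its parent takes value <code>v</code>. How BMDA learns the structure and the tables is not covered.

```python
import random


def sample_forest(parent, p_root, p_cond, rng):
    """Draw one solution: roots independently, others given their parent."""
    n = len(parent)
    x = [None] * n

    def assign(i):
        if x[i] is not None:
            return
        if parent[i] is None:
            # Root variable: sampled from its own marginal.
            x[i] = 1 if rng.random() < p_root[i] else 0
        else:
            # Non-root variable: the parent must be sampled first.
            assign(parent[i])
            x[i] = 1 if rng.random() < p_cond[i][x[parent[i]]] else 0

    for i in range(n):
        assign(i)
    return x
```

The recursion guarantees that each parent is fixed before its children, whatever the index order of the variables.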
===Extended compact genetic algorithm (eCGA)===
The ECGA<ref>{{cite
<math>
</math>
The CPC, on the other hand, quantifies the data compression in terms of the entropy of the [[marginal distribution]] over all partitions, where <math>\lambda</math> is the selected population size, <math>|\tau|</math> is the number of decision variables in the linkage set <math>\tau</math> and <math>H(\tau)</math> is the [[joint entropy]] of the variables in <math>\tau</math>:
<math>
===Bayesian optimization algorithm (BOA)===
The BOA<ref>{{cite journal|last1=Pelikan|first1=Martin|last2=Goldberg|first2=David E.|last3=Cantu-Paz|first3=Erick|title=BOA: The Bayesian Optimization Algorithm|date=1 January 1999|pages=525–532|
<math>
</math>
The Bayesian network structure, on the other hand, must be built iteratively (linkage-learning). It starts with a network without edges and, at each step, adds the edge that most improves some scoring metric (e.g. [[Bayesian information criterion]] (BIC) or Bayesian-Dirichlet metric with likelihood equivalence (BDe)).<ref>{{cite journal|last1=Larrañaga|first1=Pedro|last2=Karshenas|first2=Hossein|last3=Bielza|first3=Concha|last4=Santana|first4=Roberto|title=A review on probabilistic graphical models in evolutionary computation|journal=Journal of Heuristics|date=21 August 2012|volume=18|issue=5|pages=795–819|doi=10.1007/s10732-012-9208-4|s2cid=9734434 |url=http://oa.upm.es/15826/}}</ref> The scoring metric evaluates the network structure according to its accuracy in modeling the selected population. From the built network, BOA samples new promising solutions as follows: (1) it computes an ancestral ordering of the variables, in which each node is preceded by its parents; (2) each variable is sampled conditionally on its parents. Given this scenario, every BOA step can be defined as
<math>
===Linkage-tree genetic algorithm (LTGA)===
The LTGA<ref>{{cite
<math>
<math>x_i[\tau]</math>:= <math>x_j[\tau]</math>
'''if''' <math>f(x_i) \leq f_{x_i}</math> '''then'''
<math>x_i[\tau]:= x_j[\tau]</math>
'''return''' <math>P(t)</math>
{{algorithm-end}}
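A mixing step of this kind can be sketched in Python as follows. The sketch rests on assumptions that the fragment above does not confirm: fitness is maximized, each linkage set <math>\tau</math> is a list of variable indices, the donor <math>x_j</math> is drawn uniformly from the other individuals, and a donated block is kept only if fitness does not decrease and is reverted otherwise. The function and parameter names are hypothetical.

```python
import random


def gene_pool_optimal_mixing(population, linkage_sets, objective, rng):
    """Mix blocks of variables between individuals, in place."""
    for i, x_i in enumerate(population):
        f_current = objective(x_i)
        for tau in linkage_sets:
            # Choose a donor x_j different from x_i.
            j = rng.choice([k for k in range(len(population)) if k != i])
            donor = population[j]
            backup = [x_i[v] for v in tau]
            for v in tau:
                x_i[v] = donor[v]  # x_i[tau] := x_j[tau]
            f_new = objective(x_i)
            if f_new >= f_current:
                f_current = f_new  # keep the donated block
            else:
                for v, old in zip(tau, backup):
                    x_i[v] = old  # assumed revert step
    return population
```

Under these assumptions no individual's fitness can decrease during a mixing pass, since every change that lowers it is undone.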
==Other==
* Probability collectives (PC)<ref>{{cite journal|last1=WOLPERT|first1=DAVID H.|title=Advances in Distributed Optimization Using Probability Collectives|last2=STRAUSS|first2=CHARLIE E. M.|last3=RAJNARAYAN|first3=DEV
* Hill climbing with learning (HCwL)<ref>{{Cite journal|
* Estimation of multivariate normal algorithm (EMNA){{Citation needed|date=June 2018}}
* Estimation of Bayesian networks algorithm (EBNA){{Citation needed|date=June 2018}}
* Stochastic hill climbing with learning by vectors of normal distributions (SHCLVND)<ref>{{Cite journal|
* Real-coded PBIL{{Citation needed|date=June 2018}}
* Selfish Gene Algorithm (SG)<ref>{{Cite
* Compact Differential Evolution (cDE)<ref>{{Cite journal|
* Compact Particle Swarm Optimization (cPSO)<ref>{{Cite journal|
* Compact Bacterial Foraging Optimization (cBFO)<ref>{{Citation|
* Probabilistic incremental program evolution (PIPE)<ref>{{Cite journal|
* Estimation of Gaussian networks algorithm (EGNA){{Citation needed|date=June 2018}}
* Estimation multivariate normal algorithm with thresheld convergence<ref>{{Cite
* Dependency Structure Matrix Genetic Algorithm (DSMGA)<ref>{{Citation|last1=Yu|first1=Tian-Li|title=Genetic Algorithm Design Inspired by Organizational Theory: Pilot Study of a Dependency Structure Matrix Driven Genetic Algorithm|date=2003|work=Genetic and Evolutionary Computation — GECCO 2003|pages=1620–1621|publisher=Springer Berlin Heidelberg|language=en|doi=10.1007/3-540-45110-2_54|isbn=9783540406037|last2=Goldberg|first2=David E.|last3=Yassine|first3=Ali|last4=Chen|first4=Ying-Ping}}</ref><ref>{{Cite book|last1=Hsu|first1=Shih-Huan|last2=Yu|first2=Tian-Li|date=2015-07-11|title=Optimization by Pairwise Linkage Detection, Incremental Linkage Set, and Restricted / Back Mixing: DSMGA-II|publisher=ACM|pages=519–526|doi=10.1145/2739480.2754737|isbn=9781450334723|arxiv=1807.11669|s2cid=17031156 }}</ref>
==Related==
* [[CMA-ES]]
* [[Cross-entropy method]]
* [[Ant colony optimization algorithms]]
==References==