= Overview =
[[Image:Eda mono-variant gauss iterations.svg|thumb|350px|Estimation of distribution algorithm. At each iteration ''i'', a population ''P'' is drawn at random from the distribution ''PDu''. The distribution parameters ''PDe'' are then estimated from the selected points ''PS''. The illustrated example optimizes a continuous objective function ''f(X)'' with a unique optimum ''O''. The sampling (following a normal distribution ''N'') concentrates around the optimum as the algorithm proceeds.]]
The general procedure of an EDA is outlined in the following:
<source lang=python>
t = 0
initialize model M(0) to represent uniform distribution over admissible solutions
while (termination criteria not met) do
    P = generate N>0 candidate solutions by sampling M(t)
    F = evaluate all candidate solutions in P
    M(t + 1) = adjust_model(P, F, M(t))
    t = t + 1
</source>
Using explicit probabilistic models in optimization allowed EDAs to feasibly solve optimization problems that were notoriously difficult for most conventional evolutionary algorithms and traditional optimization techniques, such as problems with high levels of [[epistasis]]. In addition, EDAs provide the optimization practitioner with a series of probabilistic models that reveal a great deal of information about the problem being solved. This information can in turn be used to design problem-specific neighborhood operators for local search, to bias future runs of EDAs on a similar problem, or to create an efficient computational model of the problem.
For example, if the population is represented by bit strings of length 4, the EDA can represent the population of promising solutions using a single vector of four probabilities (p1, p2, p3, p4), where each component of p defines the probability of that position being a 1. Using this probability vector it is possible to create an arbitrary number of candidate solutions.
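To make this concrete, here is a minimal Python sketch of sampling candidate solutions from such a probability vector; the values in ''p'', the helper name ''sample_candidate'' and the population size are illustrative choices, not part of any particular EDA.
<source lang=python>
import random

# Hypothetical marginal probabilities for a length-4 bit string:
# p[i] is the probability that position i is a 1 in a promising solution.
p = [0.9, 0.2, 0.6, 0.5]

def sample_candidate(p):
    # Draw each bit independently according to its marginal probability.
    return [1 if random.random() < prob else 0 for prob in p]

# Any number of candidate solutions can be generated from the same vector.
population = [sample_candidate(p) for _ in range(10)]
</source>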
=Estimation of Distribution Algorithms (EDAs)=
This section describes the models built by some well-known EDAs of different levels of complexity. Throughout, we assume a population <math>P(t)</math> at generation <math>t</math>, a selection operator <math>S</math>, a model-building operator <math>\alpha</math> and a sampling operator <math>\beta</math>.
==Univariate Distributions==
The simplest EDAs assume that decision variables are independent, i.e. <math>p(X_1,X_2) = p(X_1)\cdot p(X_2)</math>. Therefore, univariate EDAs rely only on univariate statistics, and multivariate distributions must be factorized as the product of <math>N</math> univariate probability distributions,
<math>
D_\text{Univariate} := p(X_1,\dots,X_N) = \prod_{i=1}^N p(X_i).
</math>
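For instance, with <math>N=3</math> binary variables and marginals <math>p(X_i=1)=p_i</math>, such a factorized model assigns the string <math>(1,0,1)</math> the probability <math>p_1(1-p_2)p_3</math>.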
Such factorizations are used in many different EDAs; next we describe some of them.
===Univariate Marginal Distribution Algorithm (UMDA)===
The UMDA<ref>{{cite journal|last1=Mühlenbein|first1=Heinz|title=The Equation for Response to Selection and Its Use for Prediction|journal=Evol. Computation|date=1 September 1997|volume=5|issue=3|pages=303–346|doi=10.1162/evco.1997.5.3.303|url=http://dl.acm.org/citation.cfm?id=1326756|issn=1063-6560}}</ref> is a simple EDA that uses an operator <math>\alpha_{UMDA}</math> to estimate marginal probabilities from a selected population <math>S(P(t))</math>. Assuming <math>S(P(t))</math> contains <math>\lambda</math> elements, <math>\alpha_{UMDA}</math> produces the probabilities:
<math>
p_{t+1}(X_i) = \dfrac{1}{\lambda} \sum_{x\in S(P(t))} x_i,~\forall i\in 1,2,\dots,N.
</math>
Every UMDA step can be described as follows
<math>
D(t+1) = \alpha_\text{UMDA} \circ S \circ \beta_{\lambda}(D(t)).
</math>
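As an illustration only, the following Python sketch implements one UMDA generation under simple assumptions: binary variables, truncation selection of the best individuals, and a user-supplied objective function to be maximized. The function and parameter names are ours, not part of the original formulation.
<source lang=python>
import numpy as np

def umda_step(p, objective, mu=100, lam=50, rng=None):
    """One UMDA generation: sample, select, and re-estimate the marginals.

    p         -- current vector of marginal probabilities p_t(X_i)
    objective -- maps a 0/1 vector to a fitness value (to be maximized)
    mu        -- number of candidate solutions sampled from the model
    lam       -- number of solutions kept by truncation selection
    """
    if rng is None:
        rng = np.random.default_rng()
    # beta: sample candidate solutions from the univariate model.
    population = (rng.random((mu, len(p))) < p).astype(int)
    # S: keep the lam best candidates according to the objective.
    fitness = np.array([objective(x) for x in population])
    selected = population[np.argsort(fitness)[-lam:]]
    # alpha_UMDA: new marginals are the per-position means of the selected set.
    return selected.mean(axis=0)
</source>
Iterating this step from an initial vector of 0.5 probabilities on a simple objective such as the number of ones in the string typically drives the marginals toward 1.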
===Population-Based Incremental Learning (PBIL)===
The PBIL<ref>{{cite journal|last1=Baluja|first1=Shummet|title=Population-Based Incremental Learning: A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning|date=1 January 1994|url=http://dl.acm.org/citation.cfm?id=865123|publisher=Carnegie Mellon University}}</ref> represents the population implicitly by its model, from which it samples new solutions and updates the model. At each generation, <math>\mu</math> individuals are sampled and <math>\lambda\leq \mu</math> are selected. These individuals are then used to update the model as follows
<math>
p_{t+1}(X_i) = (1- \gamma) p_{t}(X_i) + (\gamma/\lambda) \sum_{x\in S(P(t))} x_i,~\forall i\in 1,2,\dots,N,
</math>
where <math>\gamma\in(0,1]</math> is a parameter defining the learning rate; a small value means that the previous model <math>p_t(X_i)</math> is only slightly modified by the newly sampled solutions. PBIL can be described as
<math>
D(t+1) = \alpha_\text{PBIL} \circ S \circ \beta_\mu(D(t)).
</math>
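The update rule itself amounts to very little code. The sketch below is an illustrative implementation of just the model update, not the full sample-and-select loop; the function name and argument layout are our own.
<source lang=python>
import numpy as np

def pbil_update(p, selected, gamma=0.1):
    """PBIL model update: blend the previous marginals with the selected solutions.

    p        -- current probability vector p_t (one entry per variable)
    selected -- array of shape (lambda, N) with the selected 0/1 solutions
    gamma    -- learning rate in (0, 1]; small values change the model slowly
    """
    selected = np.asarray(selected)
    # (gamma / lambda) * sum over selected solutions equals gamma times their mean.
    return (1.0 - gamma) * np.asarray(p) + gamma * selected.mean(axis=0)
</source>
Note that with <math>\gamma = 1</math> the previous model is discarded entirely and the update reduces to the UMDA estimate computed from the selected solutions.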
A broader survey of EDAs and the probabilistic models they build is given by Pelikan, Goldberg and Lobo.<ref>{{cite journal|last1=Pelikan|first1=Martin|last2=Goldberg|first2=David E.|last3=Lobo|first3=Fernando G.|title=A Survey of Optimization by Building and Using Probabilistic Models|journal=Computational Optimization and Applications|date=2002|volume=21|issue=1|pages=5–20|doi=10.1023/A:1013500812258}}</ref>