Constrained optimization: Difference between revisions

Content deleted Content added
SmackBot (talk | contribs)
Standard headings &/or gen fixes.
OAbot (talk | contribs)
m Open access bot: url-access updated in citation with #oabot.
 
(122 intermediate revisions by 70 users not shown)
Line 1:
{{Short description|Optimizing objective functions that have constrained variables}}
In [[constraint satisfaction]], '''constraint optimization''' seeks for a solution maximizing or minimizing a cost function.
In [[mathematical optimization]], '''constrained optimization''' (in some contexts called '''constraint optimization''') is the process of optimizing an objective function with respect to some [[variable (mathematics)|variables]] in the presence of [[Constraint (mathematics)|constraints]] on those variables. The objective function is either a [[Loss function|cost function]] or [[energy function]], which is to be [[Maxima and minima|minimized]], or a [[reward function]] or [[utility function]], which is to be [[maximize]]d. Constraints can be either '''hard constraints''', which set conditions for the variables that are required to be satisfied, or '''soft constraints''', which have some variable values that are penalized in the objective function if, and based on the extent that, the conditions on the variables are not satisfied.
 
== Relation to constraint-satisfaction problems ==
==Definition==
 
The constrained-optimization problem (COP) is a significant generalization of the classic [[constraint-satisfaction problem]] (CSP) model.<ref>{{Citation|last1=Rossi|first1=Francesca|title=Chapter 1 – Introduction|date=2006-01-01|url=http://www.sciencedirect.com/science/article/pii/S1574652606800052|work=Foundations of Artificial Intelligence|volume=2|pages=3–12|editor-last=Rossi|editor-first=Francesca|series=Handbook of Constraint Programming|publisher=Elsevier|doi=10.1016/s1574-6526(06)80005-2|access-date=2019-10-04|last2=van Beek|first2=Peter|last3=Walsh|first3=Toby|editor2-last=van Beek|editor2-first=Peter|editor3-last=Walsh|editor3-first=Toby|url-access=subscription}}</ref> COP is a CSP that includes an ''objective function'' to be optimized. Many algorithms are used to handle the optimization part.
A constraint optimization problem can be defined as a regular [[constraint satisfaction problem]] in which constraints are weighted and the goal is to find a solution maximizing the weight of satisfied constraints.
 
==General form==
Alternative, a constraint optimization problem can be defined as a regular constraint satisfaction problem augumented with a number of "local" cost functions. The aim of constraint optimization is to find a solution to the problem whose cost, evaluated as the sum of the cost functions, is maximized or minimized. The regular constraints are called ''hard constraints'', while the cost functions are called ''soft constraints''. These names illustrate that hard constraints are to be necessarily satisfied, while soft constraints only express a preference of some solutions (those having a high or low cost) over other ones (those having lower/higher cost).
 
A general constrained minimization problem may be written as follows:<ref name="edo2021">{{Cite book|url=https://www.researchgate.net/publication/352413464|title=Engineering Design Optimization|last1=Martins|first1=J. R. R. A.|last2=Ning|first2=A.|date=2021|publisher=Cambridge University Press|isbn=978-1108833417|language=en}}</ref>
==Branch and bound==
 
: <math>
Constraint optimization can be solved by [[branch and bound]] algorithms. These are backtracking algorithms storing the cost of the best solution found during execution and use it for avoiding part of the search. More precisely, whenever the algorithm encounters a partial solution that cannot be extended to form a solution of better cost than the stored best cost, the algorithm backtracks, instead of trying to extend this solution.
\begin{array}{rcll}
\min &~& f(\mathbf{x}) & \\
\mathrm{subject~to} &~& g_i(\mathbf{x}) = c_i &\text{for } i=1,\ldots,n \quad \text{Equality constraints} \\
&~& h_j(\mathbf{x}) \geq d_j &\text{for } j=1,\ldots,m \quad \text{Inequality constraints}
\end{array}
</math>
 
where <math> g_i(\mathbf{x}) = c_i ~\mathrm{for~} i=1,\ldots,n </math> and <math> h_j(\mathbf{x}) \ge d_j ~\mathrm{for~} j=1,\ldots,m </math> are constraints that are required to be satisfied (these are called [[Constraint (mathematics)#Hard and soft constraints|hard constraints]]), and <math>f(\mathbf{x})</math> is the objective function that needs to be optimized subject to the constraints.
Assuming that cost is to be maximized, the efficiency of these algorithms depends on how the cost that can be obtained from extending a partial solution is evaluated. Indeed, if the algorithm can backtrack from a partial solution, part of the search is skipped. The lower the estimated cost, the better the algorithm, as a lower estimated cost is more likely to be lower than the best cost of solution found so far.
 
In some problems, often called ''constraint optimization problems'', the objective function is actually the sum of cost functions, each of which penalizes the extent (if any) to which a [[Constraint (mathematics)#Hard and soft constraints|soft constraint]] (a constraint which is preferred but not required to be satisfied) is violated.
 
==Solution methods==
 
Many constrained optimization algorithms can be adapted to the unconstrained case, often via the use of a [[penalty method]]. However, search steps taken by the unconstrained method may be unacceptable for the constrained problem, leading to a lack of convergence. This is referred to as the Maratos effect.<ref>Wenyu Sun; Ya-Xiang Yuan (2010). ''Optimization Theory and Methods: Nonlinear Programming'', Springer, {{ISBN|978-1441937650}}. p. 541</ref>
 
===Equality constraints===
 
====Substitution method====
 
For very simple problems, say a function of two variables subject to a single equality constraint, it is most practical to apply the method of substitution.<ref>{{cite book |first=Mike |last=Prosser |title=Basic Mathematics for Economists |___location=New York |publisher=Routledge |year=1993 |isbn=0-415-08424-5 |chapter=Constrained Optimization by Substitution |pages=338–346 }}</ref> The idea is to substitute the constraint into the objective function to create a [[Function composition|composite function]] that incorporates the effect of the constraint. For example, assume the objective is to maximize <math>f(x,y) = x \cdot y</math> subject to <math>x + y = 10</math>. The constraint implies <math>y = 10 - x</math>, which can be substituted into the objective function to create <math>p(x) = x (10 - x) = 10x - x^{2}</math>. The first-order necessary condition gives <math>\frac{\partial p}{\partial x} = 10 - 2x = 0</math>, which can be solved for <math>x=5</math> and, consequently, <math>y = 10 - 5 = 5</math>.
 
====Lagrange multiplier====
{{main|Lagrange multipliers}}
If the constrained problem has only equality constraints, the method of [[Lagrange multipliers]] can be used to convert it into an unconstrained problem whose number of variables is the original number of variables plus the original number of equality constraints. Alternatively, if the constraints are all equality constraints and are all linear, they can be solved for some of the variables in terms of the others, and the former can be substituted out of the objective function, leaving an unconstrained problem in a smaller number of variables.
 
===Inequality constraints===
 
With inequality constraints, the problem can be characterized in terms of the [[geometric optimality conditions]], [[Fritz John conditions]] and [[Karush–Kuhn–Tucker conditions]], under which simple problems may be solvable.
 
====Linear programming====
 
If the objective function and all of the hard constraints are linear and some hard constraints are inequalities, then the problem is a [[linear programming]] problem. This can be solved by the [[simplex method]], which usually works in [[polynomial time]] in the problem size but is not guaranteed to, or by [[interior point method]]s which are guaranteed to work in polynomial time.
 
====Nonlinear programming====
 
If the objective function or some of the constraints are nonlinear, and some constraints are inequalities, then the problem is a [[nonlinear programming]] problem.
 
====Quadratic programming====
 
If all the hard constraints are linear and some are inequalities, but the objective function is quadratic, the problem is a [[quadratic programming]] problem. It is one type of nonlinear programming. It can still be solved in polynomial time by the [[ellipsoid method]] if the objective function is [[Convex function|convex]]; otherwise the problem may be [[NP hard]].
 
====KKT conditions====
 
Allowing inequality constraints, the [[Karush-Kuhn-Tucker conditions|KKT approach]] to nonlinear programming generalizes the method of Lagrange multipliers. It can be applied under differentiability and convexity.
 
====Branch and bound====
 
Constraint optimization can be solved by [[branch -and -bound]] algorithms. These are backtracking algorithms storing the cost of the best solution found during execution and useusing it forto avoidingavoid part of the search. More precisely, whenever the algorithm encounters a partial solution that cannot be extended to form a solution of better cost than the stored best cost, the algorithm backtracks, instead of trying to extend this solution.
 
Assuming that cost is to be maximizedminimized, the efficiency of these algorithms depends on how the cost that can be obtained from extending a partial solution is evaluated. Indeed, if the algorithm can backtrack from a partial solution, part of the search is skipped. The lower the estimated cost, the better the algorithm, as a lower estimated cost is more likely to be lower than the best cost of solution found so far.
 
On the other hand, this estimated cost cannot be lower than the effective cost that can be obtained by extending the solution, as otherwise the algorithm could backtrack while a solution better than the best found so far exists. As a result, the algorithm requires an upper bound on the cost that can be obtained from extending a partial solution, and this upper bound should be as small as possible.
 
A variation of this approach called Hansen's method uses [[Interval arithmetic#History|interval methods]].<ref>{{cite book |last=Leader|first=Jeffery J. | title=Numerical Analysis and Scientific Computation |year=2004|publisher=Addison Wesley |isbn= 0-201-73499-0 }}</ref> It inherently implements rectangular constraints.
===First-choice bounding functions===
 
====First-choice bounding functions====
 
One way for evaluating this upper bound for a partial solution is to consider each soft constraint separately. For each soft constraint, the maximal possible value for any assignment to the unassigned variables is assumed. The sum of these values is an upper bound because the soft constraints cannot assume a higher value. It is exact because the maximal values of soft constraints may derive from different evaluations: a soft constraint may be maximal for <math>x=a</math> while another constraint is maximal for <math>x=b</math>.
 
=====Russian doll search=====
 
This method<ref>Verfaillie, Gérard, Michel Lemaître, and Thomas Schiex. "[https://web.archive.org/web/20180616030142/https://pdfs.semanticscholar.org/c83b/19ca9cc73aefb1a9e7b4780ba161b2149a03.pdf Russian doll search for solving constraint optimization problems]." AAAI/IAAI, Vol. 1. 1996.</ref> runs a branch-and-bound algorithm on <math>n</math> problems, where <math>n</math> is the number of variables. Each such problem is the subproblem obtained by dropping a sequence of variables <math>x_1,\ldots,x_i</math> from the original problem, along with the constraints containing them. After the problem on variables <math>x_{i+1},\ldots,x_n</math> is solved, its optimal cost can be used as an upper bound while solving the other problems,
 
In particular, the cost estimate of a solution having <math>x_{i+1},\ldots,x_n</math> as unassigned variables is added to the cost that derives from the evaluated variables. Virtually, this corresponds on ignoring the evaluated variables and solving the problem on the unassigned ones, except that the latter problem has already been solved. More precisely, the cost of soft constraints containing both assigned and unassigned variables is estimated as above (or using an arbitrary other method); the cost of soft constraints containing only unassigned variables is instead estimated using the optimal solution of the corresponding problem, which is already known at this point.
 
There is similarity between the Russian Doll Search method and [[dynamic programming]]. Like dynamic programming, Russian Doll Search solves sub-problems in order to solve the whole problem. But, whereas Dynamic Programming
==Bucket elimination==
directly combines the results obtained on sub-problems to get the result of the whole problem, Russian Doll Search only uses them as bounds during its search.
 
====Bucket elimination====
The [[bucket elimination]] algorithm can be adapted for constraint optimization. A given variable can be indeed removed from the problem by replacing all soft constraints containing it with a new soft constraint. The cost of this new constraint is computed assuming a maximal value for every value of the removed variable. Formally, if <math>x</math> is the variable to be removed, <math>C_1,\ldots,C_n</math> are the soft constraints containing it, and <math>y_1,\ldots,y_m</math> are their variables except <math>x</math>, the new soft constraint is defined by:
 
The [[bucket elimination]] algorithm can be adapted for constraint optimization. A given variable can be indeed removed from the problem by replacing all soft constraints containing it with a new soft constraint. The cost of this new constraint is computed assuming a maximal value for every value of the removed variable. Formally, if <math>x</math> is the variable to be removed, <math>C_1,\ldots,C_n</math> are the soft constraints containing it, and <math>y_1,\ldots,y_m</math> are their variables except <math>x</math>, the new soft constraint is defined by:
<!-- not exactly the correct notation, but clear enough -->
:<math>C(y_1=a_1,\ldots,y_n=a_n) = \max_{a}max_a \sum_i C_i(x=a,y_1=a_1,\ldots,y_n=a_n).</math>
 
Bucket elimination works with an (arbitrary) ordering of the variables. Every variable is associated a bucket of constraints; the bucket of a variable contains all constraints having the variable has the highest in the order. Bucket elimination proceed from the last variable to the first. For each variable, all constraints of the bucket are replaced as above to remove the variable. The resulting constraint is then placed in the appropriate bucket.
 
==See also==
 
* [[Constrained least squares]]
* [[Distributed constraint optimization]]
*[[Category:Constraint satisfaction problem]] (CSP)
* [[Constraint programming]]
* [[Integer programming]]
* [[Metric projection]]
* [[Penalty method]]
* [[Superiorization]]
 
==References==
 
{{reflist}}
 
==Further reading==
 
*{{cite book |first=Dimitri P. |last=Bertsekas |author-link=Dimitri Bertsekas |title=Constrained Optimization and Lagrange Multiplier Methods |___location=New York |publisher=Academic Press |year=1982 |isbn=0-12-093480-9 }}
*{{cite book
| first=Rina
Line 44 ⟶ 110:
| title=Constraint Processing
| publisher=Morgan Kaufmann
| url=https://archive.org/details/constraintproces00rina
| url=http://www.ics.uci.edu/~dechter/books/index.html
| year=2003
| idisbn=ISBN 1-55860-890-7
| url-access=registration
}}
 
[[Category:{{Optimization]] algorithms}}
 
[[Category:Constraint satisfaction]]
[[Category:Mathematical optimization]]
[[Category:Constraint programming]]