{{short description|Subfield of mathematical optimization}}
{{multiple issues|
{{Technical|date=June 2013}}
{{More footnotes needed|date=February 2012}}
}}
 
'''Convex optimization''' is a subfield of [[mathematical optimization]] that studies the problem of minimizing [[convex function]]s over [[convex set]]s (or, equivalently, maximizing [[concave functions]] over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms,<ref name="Nesterov 1994">{{harvnb|Nesterov|Nemirovskii|1994}}</ref> whereas mathematical optimization is in general [[NP-hard]].<ref>
{{cite journal | last1 = Murty | first1 = Katta | last2 = Kabadi | first2 = Santosh | title = Some NP-complete problems in quadratic and nonlinear programming | journal = Mathematical Programming | volume = 39 | issue = 2 | pages = 117–129 | year = 1987 | doi = 10.1007/BF02592948| hdl = 2027.42/6740 | s2cid = 30500771 | hdl-access = free}}</ref><ref>{{cite journal | last = Sahni | first = S. | title = Computationally related problems | journal = SIAM Journal on Computing | volume = 3 | pages = 262–279 | year = 1974 }}</ref><ref>{{cite journal | url=https://link.springer.com/article/10.1007/BF00120662 | doi=10.1007/BF00120662 | title=Quadratic programming with one negative eigenvalue is NP-hard | date=1991 | last1=Pardalos | first1=Panos M. | last2=Vavasis | first2=Stephen A. | journal=Journal of Global Optimization | volume=1 | issue=1 | pages=15–22 | url-access=subscription }}</ref>
 
Convex optimization has applications in a wide range of disciplines, such as automatic [[control systems]], estimation and [[signal processing]], communications and networks, electronic [[circuit design]],<ref>{{harvnb|Boyd|Vandenberghe|2004|p=17}}</ref> data analysis and modeling, [[finance]], [[statistics]] ([[optimal design|optimal experimental design]]),<ref>Christensen/Klarbring, chpt. 4.</ref> and [[structural optimization]], where the approximation concept has proven to be efficient.<ref>{{harvnb|Boyd|Vandenberghe|2004}}</ref><ref>Schmit, L.A.; Fleury, C. 1980: ''Structural synthesis by combining approximation concepts and dual methods''. J. Amer. Inst. Aeronaut. Astronaut 18, 1252-1260</ref>
With recent advancements in computing and [[Mathematical optimization#Computational optimization techniques|optimization algorithms]], convex programming is nearly as straightforward as [[linear programming]].<ref>{{harvnb|Boyd|Vandenberghe|2004|p=8}}</ref>
 
==Definition==
 
=== Abstract form ===
A convex optimization problem is an [[optimization problem]] in which the objective function is a [[convex function]] and the [[Feasible region|feasible set]] is a [[convex set]]. A function <math>f</math> mapping some subset of <math>\mathbb{R}^n</math> into <math>\mathbb{R} \cup \{\pm \infty\}</math> is convex if its ___domain is convex and for all <math>\theta \in [0, 1]</math> and all <math>x, y</math> in its ___domain, the following condition holds: <math>f(\theta x + (1 - \theta)y) \leq \theta f(x) + (1 - \theta) f(y)</math>. A set S is convex if for all members <math>x, y \in S</math> and all <math>\theta \in [0, 1]</math>, we have that <math>\theta x + (1 - \theta) y \in S</math>.
A convex optimization problem is defined by two ingredients:<ref>{{cite book |last1=Hiriart-Urruty |first1=Jean-Baptiste |url=https://books.google.com/books?id=Gdl4Jc3RVjcC&q=lemarechal+convex+analysis+and+minimization |title=Convex analysis and minimization algorithms: Fundamentals |last2=Lemaréchal |first2=Claude |year=1996 |isbn=9783540568506 |page=291|publisher=Springer }}</ref><ref>{{cite book |last1=Ben-Tal |first1=Aharon |url=https://books.google.com/books?id=M3MqpEJ3jzQC&q=Lectures+on+Modern+Convex+Optimization:+Analysis,+Algorithms, |title=Lectures on modern convex optimization: analysis, algorithms, and engineering applications |last2=Nemirovskiĭ |first2=Arkadiĭ Semenovich |year=2001 |isbn=9780898714913 |pages=335–336}}</ref>
 
* The ''objective function'', which is a real-valued [[convex function]] of ''n'' variables, <math>f :\mathcal D \subseteq \mathbb{R}^n \to \mathbb{R}</math>;
* The ''feasible set'', which is a [[convex subset]] <math>C\subseteq \mathbb{R}^n</math>.
 
The goal of the problem is to find some <math>\mathbf{x^\ast} \in C</math> attaining
:<math>\inf \{ f(\mathbf{x}) : \mathbf{x} \in C \}</math>.
In general, there are three options regarding the existence of a solution:<ref name=":2">{{Cite book |last1=Boyd |first1=Stephen |url=https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf |title=Convex Optimization |last2=Vandenberghe |first2=Lieven |publisher=[[Cambridge University Press]] |year=2004 |isbn=978-0-521-83378-3 |access-date=12 Apr 2021}}</ref>{{Rp|___location=chpt.4}}
 
* If such a point ''x''* exists, it is referred to as an ''optimal point'' or ''solution''; the set of all optimal points is called the ''optimal set''; and the problem is called ''solvable''.
* If <math>f</math> is unbounded below over <math>C</math>, or the infimum is not attained, then the optimization problem is said to be ''unbounded''.
* Otherwise, if <math>C</math> is the empty set, then the problem is said to be ''infeasible''.
 
===Standard form===
A convex optimization problem is in ''standard form'' if it is written as

:<math>\begin{align}
&\underset{\mathbf{x}}{\operatorname{minimize}}& & f(\mathbf{x}) \\
&\operatorname{subject\ to}
& &g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\
&&&h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p,
\end{align}</math>
 
where:<ref name=":2" />{{Rp|___location=chpt.4}}
 
* <math>\mathbf{x} \in \mathbb{R}^n</math> is the vector of optimization variables;
* The objective function <math>f: \mathcal D \subseteq \mathbb{R}^n \to \mathbb{R}</math> is a [[convex function]];
* The inequality constraint functions <math>g_i : \mathbb{R}^n \to \mathbb{R}</math>, <math>i=1, \ldots, m</math>, are convex functions;
* The equality constraint functions <math>h_i : \mathbb{R}^n \to \mathbb{R}</math>, <math>i=1, \ldots, p</math>, are [[affine transformation]]s, that is, of the form: <math>h_i(\mathbf{x}) = \mathbf{a_i}\cdot \mathbf{x} - b_i</math>, where <math>\mathbf{a_i}</math> is a vector and <math>b_i</math> is a scalar.
 
This notation describes the problem of finding <math>\mathbf{x} \in \mathbb{R}^n</math> that minimizes <math>f(\mathbf{x})</math> among all <math>\mathbf{x}</math> satisfying <math>g_i(\mathbf{x}) \leq 0</math>, <math>i=1, \ldots, m</math> and <math>h_i(\mathbf{x}) = 0</math>, <math>i=1, \ldots, p</math>. The function <math>f</math> is the objective function of the problem, and the functions <math>g_i</math> and <math>h_i</math> are the constraint functions.

The feasible set <math>C</math> of the optimization problem consists of all points <math>\mathbf{x} \in \mathcal{D}</math> satisfying the inequality and the equality constraints. This set is convex because <math>\mathcal{D}</math> is convex, the [[sublevel set]]s of convex functions are convex, affine sets are convex, and the intersection of convex sets is convex.<ref name=":2" />{{Rp|___location=chpt.2}}
 
Many optimization problems can be equivalently formulated in this standard form. For example, the problem of maximizing a [[concave function]] <math>f</math> can be re-formulated equivalently as the problem of minimizing the convex function <math>-f</math>. The problem of maximizing a concave function over a convex set is commonly called a convex optimization problem.<ref>{{cite web | url=https://www.solver.com/convex-optimization | title=Optimization Problem Types - Convex Optimization | date=9 January 2011 }}</ref>
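
For illustration, the following is a minimal sketch of a problem in standard form, written with the CVXPY modeling tool described in the [[#Software|Software]] section below; the particular objective, constraints, and data are arbitrary examples rather than part of the definition:

<syntaxhighlight lang="python">
import cvxpy as cp
import numpy as np

# Arbitrary example data: minimize a convex quadratic over a polyhedron.
n = 3
x = cp.Variable(n)                                   # optimization variables x in R^n
objective = cp.Minimize(cp.sum_squares(x - np.array([1.0, 2.0, 3.0])))  # convex f(x)
constraints = [x >= 0,                               # inequality constraints g_i(x) <= 0
               cp.sum(x) == 2]                       # affine equality constraint h(x) = 0
problem = cp.Problem(objective, constraints)
problem.solve()                                      # hands the problem to an installed solver
print(problem.status, problem.value, x.value)
</syntaxhighlight>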
 
===Epigraph form (standard form with linear objective){{Anchor|linear}}===
In the standard form it is possible to assume, without loss of generality, that the objective function ''f'' is a [[linear function]]. This is because any program with a general objective can be transformed into a program with a linear objective by adding a single variable t and a single [[Constraint (mathematics)|constraint]], as follows:<ref name=":02">{{Cite book |last=Arkadi Nemirovsky |url=https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=8c3cb6395a35cb504019f87f447d65cb6cf1cdf0 |title=Interior point polynomial-time methods in convex programming |year=2004}}</ref>{{Rp|___location=1.4}}
 
:<math>\begin{align}
&\underset{\mathbf{x},t}{\operatorname{minimize}}& & t \\
&\operatorname{subject\ to}
& &f(\mathbf{x}) - t \leq 0 \\
&& &g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\
&&&h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p,
\end{align}</math>
 
=== Conic form ===
Every convex program can be presented in a ''conic form'', which means minimizing a linear objective over the intersection of an affine plane and a convex cone:<ref name=":02" />{{Rp|___location=5.1}}
:<math>\begin{align}
&\underset{\mathbf{x}}{\operatorname{minimize}}& & c^T x \\
&\operatorname{subject\ to}
& &x \in (b+L)\cap K
\end{align}</math>
where K is a closed [[Convex cone|pointed convex cone]], L is a [[linear subspace]] of R''<sup>n</sup>'', and b is a vector in R''<sup>n</sup>''. A linear program in standard form is the special case in which K is the nonnegative orthant of R''<sup>n</sup>''.
 
=== Eliminating linear equality constraints ===
It is possible to convert a convex program in standard form to a convex program with no equality constraints.<ref name=":2" />{{Rp|page=132}} Denote the equality constraints ''h<sub>i</sub>''(''x'')=0 as ''Ax''=''b'', where ''A'' has ''n'' columns. If ''Ax''=''b'' is infeasible, then of course the original problem is infeasible. Otherwise, it has some solution ''x''<sub>0</sub>, and the set of all solutions can be presented as ''Fz''+''x''<sub>0</sub>, where ''z'' is in ''R<sup>k</sup>'', ''k''=''n''-rank(''A''), and ''F'' is an ''n''-by-''k'' matrix whose columns span the [[null space]] of ''A''. Substituting ''x'' = ''Fz''+''x''<sub>0</sub> in the original problem gives: <blockquote><math>\begin{align}
&\underset{\mathbf{z}}{\operatorname{minimize}}& & f(F\mathbf{z} + \mathbf{x}_0) \\
&\operatorname{subject\ to}
& &g_i(F\mathbf{z} + \mathbf{x}_0) \leq 0, \quad i = 1, \dots, m \\
\end{align}</math></blockquote>where the variables are '''z'''. Note that there are rank(''A'') fewer variables. This means that, in principle, one can restrict attention to convex optimization problems without equality constraints. In practice, however, it is often preferable to retain the equality constraints, since they might make some algorithms more efficient, and also make the problem easier to understand and analyze.
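
A minimal numerical sketch of this elimination, assuming NumPy and SciPy are available (the matrix ''A'' and vector ''b'' below are hypothetical examples):

<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import null_space

# Hypothetical equality constraints Ax = b (2 constraints, 4 variables).
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution of Ax = b
F = null_space(A)                           # columns of F span the null space of A

# Every feasible point can be written as x = F z + x0, so the reduced problem
# is an optimization over z alone, with no equality constraints.
z = np.zeros(F.shape[1])                    # k = n - rank(A) reduced variables
x = F @ z + x0
assert np.allclose(A @ x, b)                # any such x satisfies Ax = b
</syntaxhighlight>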
 
==Special cases==
 
The following problem classes are all convex optimization problems, or can be reduced to convex optimization problems via simple transformations:<ref name=":2" />{{Rp|___location=chpt.4}}<ref name="rewriting">
{{cite journal |last1=Agrawal |first1=Akshay |last2=Verschueren |first2=Robin |last3=Diamond |first3=Steven |last4=Boyd |first4=Stephen |year=2018 |title=A rewriting system for convex optimization problems |url=https://web.stanford.edu/~boyd/papers/pdf/cvxpy_rewriting.pdf |journal=Control and Decision |volume=5 |issue=1 |pages=42–60 |arxiv=1709.04494 |doi=10.1080/23307706.2017.1397554 |s2cid=67856259}}</ref>
[[File:Hierarchy compact convex.svg|thumb|A hierarchy of convex optimization problems. (LP: [[linear programming]], QP: [[quadratic programming]], SOCP: [[Second-order cone programming|second-order cone program]], SDP: [[semidefinite programming]], CP: [[conic optimization]].)]]
 
*[[Linear programming]] problems are the simplest convex programs. In LP, the objective and constraint functions are all linear.
*[[Quadratic programming]] problems are the next-simplest. In QP, the constraints are all linear, but the objective may be a convex quadratic function.
*[[Second order cone programming]] problems are more general.
*[[Semidefinite programming]] problems are more general still.
*[[Conic optimization]] problems are the most general of these classes; see the figure to the right.
Other special cases include:
*[[Least squares]]
*[[Quadratically constrained quadratic programming|Quadratic minimization with convex quadratic constraints]]
*[[Geometric programming]]
*[[Entropy maximization]] with appropriate constraints.
 
==Properties==
The following are useful properties of convex optimization problems:<ref name="rockafellar93">{{cite journal | author = Rockafellar, R. Tyrrell | title = Lagrange multipliers and optimality | journal = SIAM Review | volume = 35 | issue = 2 | year = 1993 | pages = 183–238 |url = http://web.williams.edu/Mathematics/sjmiller/public_html/105Sp10/handouts/Rockafellar_LagrangeMultAndOptimality.pdf | doi=10.1137/1035044| citeseerx = 10.1.1.161.7209}}</ref><ref name=":2" />{{Rp|___location=chpt.4}}
 
* every point that is a [[local minimum]] is also a [[global minimum]];
* the optimal set is convex;
* if the objective function is ''strictly'' convex, then the problem has at most one optimal point.
These results are used by the theory of convex minimization along with geometric notions from [[functional analysis]] (in Hilbert spaces) such as the [[Hilbert projection theorem]], the [[separating hyperplane theorem]], and [[Farkas' lemma]].{{citation needed|date=July 2022}}
 
==Algorithms==
 
=== Unconstrained and equality-constrained problems ===
The convex programs easiest to solve are the ''unconstrained'' problems, or the problems with only equality constraints. As the equality constraints are all linear, they can be eliminated with [[linear algebra]] and integrated into the objective, thus converting an equality-constrained problem into an unconstrained one.
 
In the class of unconstrained (or equality-constrained) problems, the simplest ones are those in which the objective is [[Quadratic programming|quadratic]]. For these problems, the [[KKT conditions]] (which are necessary for optimality) are all linear, so they can be solved analytically.<ref name=":2" />{{Rp|___location=chpt.11}}
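
As an illustration, the following sketch solves a small equality-constrained quadratic program by solving its linear KKT system directly; the matrices below are hypothetical examples:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical problem: minimize (1/2) x'Px + q'x  subject to  Ax = b,
# with P symmetric positive definite, so the objective is strictly convex.
P = np.array([[4.0, 1.0],
              [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# The KKT conditions  Px + q + A'v = 0  and  Ax = b  are linear in (x, v),
# so the optimum is obtained from a single linear solve.
m, n = A.shape
KKT = np.block([[P, A.T],
                [A, np.zeros((m, m))]])
rhs = np.concatenate([-q, b])
sol = np.linalg.solve(KKT, rhs)
x_opt, v_opt = sol[:n], sol[n:]
print(x_opt, v_opt)
</syntaxhighlight>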
 
For unconstrained (or equality-constrained) problems with a general convex objective that is twice-differentiable, [[Newton's method in optimization|Newton's method]] can be used. It can be seen as reducing a general unconstrained convex problem to a sequence of quadratic problems.<ref name=":2" />{{Rp|___location=chpt.11}} Newton's method can be combined with [[line search]] for an appropriate step size, and it can be mathematically proven to converge quickly.
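
A minimal sketch of damped Newton's method with backtracking line search for a smooth convex objective (the example function below is arbitrary):

<syntaxhighlight lang="python">
import numpy as np

def newton_minimize(f, grad, hess, x, tol=1e-8, alpha=0.25, beta=0.5):
    """Damped Newton's method with backtracking line search (illustrative sketch)."""
    for _ in range(100):
        g, H = grad(x), hess(x)
        dx = np.linalg.solve(H, -g)           # Newton step: solve H dx = -g
        if -g @ dx / 2 <= tol:                # stop when the Newton decrement is small
            break
        t = 1.0
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):  # backtracking (Armijo) search
            t *= beta
        x = x + t * dx
    return x

# Arbitrary smooth convex example: f(x) = log(exp(x1) + exp(x2)) + ||x||^2.
f = lambda x: np.log(np.exp(x).sum()) + x @ x
grad = lambda x: np.exp(x) / np.exp(x).sum() + 2 * x
def hess(x):
    p = np.exp(x) / np.exp(x).sum()
    return np.diag(p) - np.outer(p, p) + 2 * np.eye(x.size)
print(newton_minimize(f, grad, hess, np.array([1.0, -1.0])))
</syntaxhighlight>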
 
Other efficient algorithms for unconstrained minimization include [[gradient descent]] (a special case of [[Method of steepest descent|steepest descent]]).
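
A correspondingly short sketch of gradient descent with a fixed step size (the objective is an arbitrary convex example):

<syntaxhighlight lang="python">
import numpy as np

def gradient_descent(grad, x, step=0.1, iters=500):
    """Plain gradient descent with a constant step size (illustrative sketch)."""
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Arbitrary convex example: f(x) = ||x - c||^2, with gradient 2 (x - c).
c = np.array([1.0, -2.0])
print(gradient_descent(lambda x: 2 * (x - c), np.zeros(2)))  # converges toward c
</syntaxhighlight>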
 
=== General problems ===
The more challenging problems are those with inequality constraints. A common way to solve them is to reduce them to unconstrained problems by adding a [[barrier function]], enforcing the inequality constraints, to the objective function. Such methods are called [[interior point methods]].<ref name=":2" />{{Rp|___location=chpt.11}} They have to be initialized by finding a feasible interior point using so-called ''phase I'' methods, which either find a feasible point or show that none exist. Phase I methods generally consist of reducing the search in question to a simpler convex optimization problem.<ref name=":2" />{{Rp|___location=chpt.11}}
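
The following is a simplified log-barrier sketch (real interior-point implementations use Newton steps on the barrier subproblems and [[self-concordant function|self-concordance]] analysis); the problem data are hypothetical, and the starting point is assumed to be strictly feasible so that no phase I step is needed:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize

# Hypothetical problem: minimize (x1-2)^2 + (x2-1)^2  s.t.  x1 + x2 <= 1,  x >= 0.
f = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2
gs = [lambda x: x[0] + x[1] - 1,             # constraints written as g_i(x) <= 0
      lambda x: -x[0],
      lambda x: -x[1]]

def barrier_method(x, t=1.0, mu=10.0, outer=8):
    """Log-barrier sketch: x must be strictly feasible (all g_i(x) < 0)."""
    for _ in range(outer):
        def phi(y, t=t):
            if any(g(y) >= 0 for g in gs):   # outside the interior: reject
                return np.inf
            return t * f(y) - sum(np.log(-g(y)) for g in gs)
        x = minimize(phi, x, method="Nelder-Mead").x  # inner unconstrained solve
        t *= mu                              # tighten the barrier
    return x

print(barrier_method(np.array([0.2, 0.2])))  # approaches the constrained optimum (1, 0)
</syntaxhighlight>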
 
Convex optimization problems can also be solved by the following contemporary methods:<ref>For methods for convex minimization, see the volumes by Hiriart-Urruty and Lemaréchal (bundle) and the textbooks by [[Andrzej Piotr Ruszczyński|Ruszczyński]], [[Dimitri Bertsekas|Bertsekas]], and
Boyd and Vandenberghe (interior point).
</ref>
* [[Subgradient method#Subgradient-projection & bundle methods|Bundle methods]] (Wolfe, Lemaréchal, Kiwiel)
* [[Subgradient method#Subgradient-projection & bundle methods|Subgradient projection]] methods (Polyak)
* [[Interior-point methods]],<ref name="Nesterov 1994" /> which make use of [[self-concordant function|self-concordant]] barrier functions <ref>{{cite book |title=Interior-Point Polynomial Algorithms in Convex Programming |last1= Nesterov |first1=Yurii |first2=Nemirovskii |last2=Arkadii |year=1995 |publisher=Society for Industrial and Applied Mathematics |isbn=978-0898715156 }}</ref> and self-regular barrier functions.<ref name="PengRoos2002">{{cite journal|last1=Peng|first1=Jiming|last2=Roos|first2=Cornelis|last3=Terlaky|first3=Tamás|title=Self-regular functions and new search directions for linear and semidefinite optimization|journal=Mathematical Programming|volume=93|issue=1|year=2002|pages=129–171|issn=0025-5610|doi=10.1007/s101070200296|s2cid=28882966}}</ref>
*[[Cutting-plane methods]]
*[[Ellipsoid method]]
*[[Subgradient method]]
*[[Drift plus penalty|Dual subgradients and the drift-plus-penalty method]]
Subgradient methods can be implemented simply and so are widely used.<ref>{{Cite journal |date=2006 |title=Numerical Optimization |url=https://link.springer.com/book/10.1007/978-0-387-40065-5 |journal=Springer Series in Operations Research and Financial Engineering |language=en |doi=10.1007/978-0-387-40065-5|isbn=978-0-387-30303-1 |url-access=subscription }}</ref> Dual subgradient methods are subgradient methods applied to a [[Duality (optimization)|dual problem]]. The [[Drift plus penalty|drift-plus-penalty]] method is similar to the dual subgradient method, but takes a time average of the primal variables.{{Citation needed|date=April 2021}}
 
== Lagrange multipliers ==
{{Main|Lagrange multiplier}}
{{Unreferenced section|date=April 2021}}
 
Consider a convex minimization problem given in standard form by a cost function <math>f(x)</math> and inequality constraints <math>g_i(x)\leq 0</math> for <math> 1 \leq i \leq m</math>. Then the ___domain <math>\mathcal{X}</math> is:
 
:<math>\mathcal{X} = \left\{x\in X \vert g_1(x), \ldots, g_m(x)\leq 0\right\}.</math>
 
The Lagrangian function for the problem is<ref>{{cite book |first1=Brian |last1=Beavis |first2=Ian M. |last2=Dobbs |chapter=Static Optimization |title=Optimization and Stability Theory for Economic Analysis |___location=New York |publisher=Cambridge University Press |year=1990 |isbn=0-521-33605-8 |page=40 |chapter-url=https://books.google.com/books?id=L7HMACFgnXMC&pg=PA40 }}</ref>
 
:<math>L(x,\lambda_{0},\lambda_1, \ldots ,\lambda_{m})=\lambda_{0} f(x) + \lambda_{1} g_{1} (x)+\cdots + \lambda_{m} g_{m} (x).</math>
Conversely, if some <math>x</math> in <math>X</math> satisfies (1)–(3) for [[scalar (mathematics)|scalar]]s <math>\lambda_{0},\ldots,\lambda_{m} </math> with <math>\lambda_{0}=1</math> then <math>x</math> is certain to minimize <math>f</math> over <math>X</math>.
 
== Software ==
There is a large software ecosystem for convex optimization. This ecosystem has two main categories: ''solvers'' on the one hand and ''modeling tools'' (or ''interfaces'') on the other hand.
 
Solvers implement the algorithms themselves and are usually written in C. They require users to specify optimization problems in very specific formats which may not be natural from a modeling perspective. Modeling tools are separate pieces of software that let the user specify an optimization problem in higher-level syntax. They manage all transformations to and from the user's high-level model and the solver's input/output format.
 
Below are two tables. The first shows modeling tools (such as CVXPY and JuMP.jl) and the second shows solvers (such as SCS and MOSEK). They are by no means exhaustive.
{| class="wikitable sortable"
|+
!Program
!Language
!Description
![[Free and open-source software|FOSS]]?
!Ref
|-
|CVX
|[[MATLAB]]
|Interfaces with SeDuMi and SDPT3 solvers; designed to only express convex optimization problems.
!{{Yes}}
|<ref name=":3">{{Cite web|last=Borchers|first=Brian|title=An Overview Of Software For Convex Optimization|url=http://infohost.nmt.edu/~borchers/presentation.pdf|url-status=dead|archive-url=https://web.archive.org/web/20170918180026/http://infohost.nmt.edu/~borchers/presentation.pdf|archive-date=2017-09-18|access-date=12 Apr 2021}}</ref>
|-
|CVXMOD
|[[Python (programming language)|Python]]
|Interfaces with the [https://cvxopt.org/ CVXOPT] solver.
!{{Yes}}
|<ref name=":3" />
|-
|CVXPY
|Python
|
!{{Yes}}
|<ref>{{Cite web|title=Welcome to CVXPY 1.1 — CVXPY 1.1.11 documentation|url=https://www.cvxpy.org/|access-date=2021-04-12|website=www.cvxpy.org}}</ref>
|-
|Convex.jl
|[[Julia (programming language)|Julia]]
|Disciplined convex programming, supports many solvers.
!{{Yes}}
|<ref>{{cite arXiv |last1=Udell |first1=Madeleine |last2=Mohan |first2=Karanveer |last3=Zeng |first3=David |last4=Hong |first4=Jenny |last5=Diamond |first5=Steven |last6=Boyd |first6=Stephen |date=2014-10-17 |title=Convex Optimization in Julia |class=math.OC |eprint=1410.4821 }}</ref>
|-
|CVXR
|[[R (programming language)|R]]
|
!{{Yes}}
|<ref>{{Cite web|title=Disciplined Convex Optimiation - CVXR|url=https://www.cvxgrp.org/CVXR/|access-date=2021-06-17|website=www.cvxgrp.org}}</ref>
|-
|YALMIP
|MATLAB, Octave
|Interfaces with CPLEX, GUROBI, MOSEK, SDPT3, SEDUMI, CSDP, SDPA, PENNON solvers; also supports integer and nonlinear optimization, and some nonconvex optimization. Can perform [[robust optimization]] with uncertainty in LP/SOCP/SDP constraints.
!{{Yes}}
|<ref name=":3" />
|-
|LMI lab
|MATLAB
|Expresses and solves semidefinite programming problems (called "linear matrix inequalities")
!{{No}}
|<ref name=":3" />
|-
|LMIlab translator
|
|Transforms LMI lab problems into SDP problems.
!{{Yes}}
|<ref name=":3" />
|-
|xLMI
|MATLAB
|Similar to LMI lab, but uses the SeDuMi solver.
!{{Yes}}
|<ref name=":3" />
|-
|GloptiPoly 3
|MATLAB, Octave
|Modeling system for polynomial optimization.
!{{Yes}}
|<ref name=":3" />
|-
|SOSTOOLS
|
|Modeling system for [[polynomial optimization]]. Uses SDPT3 and SeDuMi. Requires Symbolic Computation Toolbox.
!{{Yes}}
|<ref name=":3" />
|-
|SparsePOP
|
|Modeling system for polynomial optimization. Uses the SDPA or SeDuMi solvers.
!{{Yes}}
|<ref name=":3" />
|-
|JuMP.jl
|[[Julia (programming language)|Julia]]
|Supports many solvers. Also supports integer and nonlinear optimization, and some nonconvex optimization.
!{{Yes}}
|<ref>{{cite journal |last1=Lubin |first1=Miles |last2=Dowson |first2=Oscar |last3=Dias Garcia |first3=Joaquim |last4=Huchette |first4=Joey |last5=Legat |first5=Benoît |last6=Vielma |first6=Juan Pablo |date=2023 |title=JuMP 1.0: Recent improvements to a modeling language for mathematical optimization | journal = Mathematical Programming Computation | doi = 10.1007/s12532-023-00239-3 |eprint=2206.03866 }}</ref>
|-
|ROME
|
|Modeling system for robust optimization. Supports distributionally robust optimization and [[uncertainty set]]s.
!{{Yes}}
|<ref name=":3" />
|}
{| class="wikitable sortable"
|+
!Program
!Language
!Description
![[Free and open-source software|FOSS]]?
!Ref
|-
|AIMMS
|
|Can do robust optimization on linear programming (with MOSEK to solve second-order cone programming) and [[mixed integer linear programming]]. Modeling package for LP + SDP and robust versions.
!{{No}}
|<ref name=":3" />
|-
|CPLEX
|
|Supports primal-dual methods for LP + SOCP. Can solve LP, QP, SOCP, and mixed integer linear programming problems.
!{{No}}
|<ref name=":3" />
|-
|CSDP
|[[C (programming language)|C]]
|Supports primal-dual methods for LP + SDP. Interfaces available for MATLAB, [[R (programming language)|R]], and Python. Parallel version available. SDP solver.
!{{Yes}}
|<ref name=":3" />
|-
|CVXOPT
|Python
|Supports primal-dual methods for LP + SOCP + SDP. Uses Nesterov-Todd scaling. Interfaces to MOSEK and DSDP.
!{{Yes}}
|<ref name=":3" />
|-
|MOSEK
|
|Supports primal-dual methods for LP + SOCP.
!{{No}}
|<ref name=":3" />
|-
|SeDuMi
|MATLAB, Octave, [[MEX file|MEX]]
|Solves LP + SOCP + SDP. Supports primal-dual methods for LP + SOCP + SDP.
!{{Yes}}
|<ref name=":3" />
|-
|SDPA
|[[C++]]
|Solves LP + SDP. Supports primal-dual methods for LP + SDP. Parallelized and extended precision versions are available.
!{{Yes}}
|<ref name=":3" />
|-
|SDPT3
|MATLAB, Octave, MEX
|Solves LP + SOCP + SDP. Supports primal-dual methods for LP + SOCP + SDP.
!{{Yes}}
|<ref name=":3" />
|-
|ConicBundle
|
|Supports general-purpose codes for LP + SOCP + SDP. Uses a bundle method. Special support for SDP and SOCP constraints.
!{{Yes}}
|<ref name=":3" />
|-
|DSDP
|
|Supports general-purpose codes for LP + SDP. Uses a dual interior point method.
!{{Yes}}
|<ref name=":3" />
|-
|LOQO
|
|Supports general-purpose codes for SOCP, which it treats as a nonlinear programming problem.
!{{No}}
|<ref name=":3" />
|-
|PENNON
|
|Supports general-purpose codes. Uses an augmented Lagrangian method, especially for problems with SDP constraints.
!{{No}}
|<ref name=":3" />
|-
|SDPLR
|
|Supports general-purpose codes. Uses low-rank factorization with an augmented Lagrangian method.
!{{Yes}}
|<ref name=":3" />
|-
|GAMS
|
|Modeling system for linear, nonlinear, mixed integer linear/nonlinear, and second-order cone programming problems.
!{{No}}
|<ref name=":3" />
|-
|Optimization Services
|
|XML standard for encoding optimization problems and solutions.
|
|<ref name=":3" />
|}
 
== Applications ==
Convex optimization can be used to model problems in a wide range of disciplines, such as automatic [[control systems]], estimation and [[signal processing]], communications and networks, electronic [[circuit design]],<ref name=":2" />{{Rp|page=17}} data analysis and modeling, [[finance]], [[statistics]] ([[optimal design|optimal experimental design]]),<ref>Christensen/Klarbring, chpt. 4.</ref> and [[structural optimization]], where the approximation concept has proven to be efficient.<ref name=":2" /><ref>Schmit, L.A.; Fleury, C. 1980: ''Structural synthesis by combining approximation concepts and dual methods''. J. Amer. Inst. Aeronaut. Astronaut 18, 1252-1260</ref> Specific application areas include:
 
* [[Portfolio optimization]].<ref name=":0">{{Cite web |last1=Boyd |first1=Stephen |last2=Diamond |first2=Stephen |last3=Zhang |first3=Junzi |last4=Agrawal |first4=Akshay |title=Convex Optimization Applications |url=https://web.stanford.edu/~boyd/papers/pdf/cvx_applications.pdf |url-status=live |archive-url=https://web.archive.org/web/20151001185038/http://web.stanford.edu/~boyd/papers/pdf/cvx_applications.pdf |archive-date=2015-10-01 |access-date=12 Apr 2021}}</ref>
* Worst-case risk analysis.<ref name=":0" />
* Optimal advertising.<ref name=":0" />
* Variations of [[Regression analysis|statistical regression]] (including [[Regularization (mathematics)|regularization]] and [[quantile regression]]).<ref name=":0" />
* Model fitting<ref name=":0" /> (particularly [[multiclass classification]]<ref name=":1">{{Cite web |last=Malick |first=Jérôme |date=2011-09-28 |title=Convex optimization: applications, formulations, relaxations |url=https://www-ljk.imag.fr//membres/Jerome.Malick/Talks/11-INRIA.pdf |url-status=live |archive-url=https://web.archive.org/web/20210412044738/https://www-ljk.imag.fr//membres/Jerome.Malick/Talks/11-INRIA.pdf |archive-date=2021-04-12 |access-date=12 Apr 2021}}</ref>).
* [[Electricity generation]] optimization.<ref name=":1" />
* [[Combinatorial optimization]].<ref name=":1" />
* Non-probabilistic modelling of [[uncertainty]].<ref>Ben Haim Y. and Elishakoff I., Convex Models of Uncertainty in Applied Mechanics, Elsevier Science Publishers, Amsterdam, 1990</ref>
* Localization using wireless signals <ref>[[Ahmad Bazzi]], Dirk TM Slock, and Lisa Meilhac. "Online angle of arrival estimation in the presence of mutual coupling." 2016 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2016.</ref>
 
==Extensions==
==See also==
* [[Optimization problem]]
* [[Proximal gradient method]]
* [[Algorithmic problems on convex sets]]
 
==Notes==
| isbn = 978-1-886529-28-1
}}
*{{Cite book|last1=Borwein|first1=Jonathan|url=https://carma.newcastle.edu.au/resources/jon/Preprints/Books/CaNo2/cano2f.pdf|title=Convex Analysis and Nonlinear Optimization: Theory and Examples, Second Edition|last2=Lewis|first2=Adrian|publisher=Springer|year=2000|access-date=12 Apr 2021|url-status=live}}
* {{cite book
| author1 = Christensen, Peter W.