{{Short description|Subfield of convex optimization}}
'''Semidefinite programming''' ('''SDP''') is a subfield of [[mathematical programming]] concerned with the optimization of a linear [[objective function]] (a user-specified function that the user wants to minimize or maximize)
over the intersection of the [[Cone (linear algebra)|cone]] of [[Positive-definite matrix#Negative-definite, semidefinite and indefinite matrices|positive semidefinite]] [[Matrix (mathematics)|matrices]] with an [[affine space]], i.e., a [[spectrahedron]].<ref name=":0">{{Citation |last1=Gärtner |first1=Bernd |title=Semidefinite Programming |date=2012 |url=https://doi.org/10.1007/978-3-642-22015-9_2 |work=Approximation Algorithms and Semidefinite Programming |pages=15–25 |editor-last=Gärtner |editor-first=Bernd |access-date=2023-12-31 |place=Berlin, Heidelberg |publisher=Springer |language=en |doi=10.1007/978-3-642-22015-9_2 |isbn=978-3-642-22015-9 |last2=Matoušek |first2=Jiří |editor2-last=Matousek |editor2-first=Jiri|url-access=subscription }}</ref>
 
Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in [[operations research]] and [[combinatorial optimization]] can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDPs are used in the context of [[linear matrix inequality|linear matrix inequalities]]. SDPs are in fact a special case of [[conic optimization|cone programming]] and can be efficiently solved by [[interior point methods]].
Line 24:
An <math>n \times n</math> matrix <math>M</math> is said to be [[Positive-definite matrix#Positive-semidefinite|positive semidefinite]] if it is the [[Gram matrix]] of some vectors (i.e. if there exist vectors <math>x^1, \ldots, x^n</math> such that <math>m_{i,j}=x^i \cdot x^j</math> for all <math>i,j</math>). If this is the case, we denote this as <math>M \succeq 0</math>. Note that there are several other equivalent definitions of being positive semidefinite, for example, positive semidefinite matrices are [[self-adjoint]] matrices that have only non-negative [[Eigenvalues and eigenvectors|eigenvalues]].
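As an illustration, both characterizations are easy to check numerically. The following is a minimal sketch using the NumPy library (the matrix <code>M</code> is placeholder data built for the example, not taken from any application):

<syntaxhighlight lang="python">
import numpy as np

# A Gram matrix of arbitrary vectors x^1, ..., x^n is positive
# semidefinite by construction.
vectors = np.random.randn(3, 5)      # rows play the role of x^1, x^2, x^3
M = vectors @ vectors.T              # M[i, j] = x^i . x^j

# Equivalent characterization: a symmetric matrix is positive
# semidefinite iff all of its eigenvalues are non-negative.
eigenvalues = np.linalg.eigvalsh(M)
print(np.all(eigenvalues >= -1e-9))  # True, up to rounding error
</syntaxhighlight>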
 
Denote by <math>\mathbb{S}^n</math> the space of all <math>n \times n</math> real symmetric matrices. The space is equipped with the [[Inner product space|inner product]] (where <math>{\rm trace}</math> denotes the [[Trace (linear algebra)|trace]]):<blockquote><math>
\langle A,B\rangle := {\rm trace}(A^T B) = \sum_{i=1,j=1}^n
A_{ij}B_{ij}.
</math></blockquote>We can rewrite the mathematical program given in the previous section equivalently as
 
:<math>
\begin{array}{rl}
{\displaystyle\min_{X \in \mathbb{S}^n}} & \langle C, X \rangle \\
\text{subject to} & \langle A_k, X \rangle \leq b_k, \quad k = 1,\ldots,m \\
& X \succeq 0.
\end{array}
</math>
where entry <math>i,j</math> in <math>C</math> is given by <math>\frac{c_{i,j} + c_{j,i}}{2}</math> from the previous section and <math>A_k</math> is a symmetric <math>n \times n</math> matrix having <math>i,j</math>th entry <math>\frac{a_{i,j,k}+a_{j,i,k}}{2}</math> from the previous section. Thus, the matrices <math>C</math> and <math>A_k</math> are symmetric and the above inner products are well-defined.
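For illustration, this standard form can be written almost verbatim in a convex-optimization modeling language. The following sketch uses the CVXPY library; the data <code>C</code>, <code>A</code>, and <code>b</code> are placeholders, not taken from a particular application:

<syntaxhighlight lang="python">
import cvxpy as cp
import numpy as np

# Placeholder problem data.
n, m = 3, 2
C = np.eye(n)
A = [np.random.randn(n, n) for _ in range(m)]
A = [(Ak + Ak.T) / 2 for Ak in A]    # symmetrize, as in the text
b = np.ones(m)

# min <C, X>  subject to  <A_k, X> <= b_k  and  X positive semidefinite.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0]
constraints += [cp.trace(A[k] @ X) <= b[k] for k in range(m)]
problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
problem.solve()
print(problem.value)                 # here 0: X = 0 is feasible and optimal
</syntaxhighlight>

Since <math>C</math> and each <math>A_k</math> are symmetric, <code>cp.trace(A[k] @ X)</code> computes exactly the inner product <math>\langle A_k, X \rangle</math> defined above.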
 
Note that if we add [[slack variable]]s appropriately, this SDP can be converted to an ''equational form'':
 
:<math>
\begin{array}{rl}
{\displaystyle\min_{X \in \mathbb{S}^n}} & \langle C, X \rangle \\
\text{subject to} & \langle A_k, X \rangle = b_k, \quad k = 1,\ldots,m \\
& X \succeq 0.
\end{array}
</math>
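For instance, a single inequality constraint <math>\langle A_k, X \rangle \leq b_k</math> can be turned into an equality by appending one slack entry <math>s_k \geq 0</math> to the diagonal of the variable matrix; a sketch of one standard construction (with the off-diagonal entries of the new row and column constrained to zero, as discussed below) is

:<math>
\left\langle \begin{pmatrix} A_k & 0 \\ 0 & 1 \end{pmatrix},
\begin{pmatrix} X & 0 \\ 0 & s_k \end{pmatrix} \right\rangle
= \langle A_k, X \rangle + s_k = b_k,
</math>

and the enlarged block-diagonal matrix is positive semidefinite exactly when <math>X \succeq 0</math> and <math>s_k \geq 0</math>.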
 
For convenience, an SDP may be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative [[scalar (mathematics)|scalar]] variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix <math>X</math> as a diagonal entry (<math>X_{ii}</math> for some <math>i</math>). To ensure that <math>X_{ii} \geq 0</math>, constraints <math>X_{ij} = 0</math> can be added for all <math>j \neq i</math>. As another example, note that for any positive semidefinite matrix <math>X</math>, there exists a set of vectors <math>\{ v_i \}</math> such that the <math>i</math>, <math>j</math> entry of <math>X</math> is the [[dot product|scalar product]] <math>X_{ij} = (v_i, v_j)</math> of <math>v_i</math> and <math>v_j</math>. Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors <math>\{ v_i \}</math> can be recovered in <math>O(n^3)</math> time (e.g., by using an incomplete [[Cholesky decomposition]] of <math>X</math>).
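A minimal sketch of this recovery step, using NumPy on a placeholder positive semidefinite matrix <code>X</code> (an eigendecomposition stands in here for the incomplete Cholesky factorization mentioned above, since it also yields a Gram-matrix factorization in <math>O(n^3)</math> time):

<syntaxhighlight lang="python">
import numpy as np

# Placeholder positive semidefinite "solution" matrix.
B = np.random.randn(4, 4)
X = B @ B.T

# Factor X = V V^T; the rows of V are the vectors v_i.
w, Q = np.linalg.eigh(X)           # eigendecomposition of symmetric X
w = np.clip(w, 0.0, None)          # clear tiny negative rounding errors
V = Q * np.sqrt(w)                 # scale eigenvector columns

print(np.allclose(V @ V.T, X))     # True: X[i, j] = <v_i, v_j>
</syntaxhighlight>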
 
== Relations to other optimization problems ==
The space of semidefinite matrices is a [[convex cone]]. Therefore, SDP is a special case of [[conic optimization]], which is a special case of convex optimization.
 
When the matrix ''C'' is diagonal, the inner product <''C'',''X''> equals the dot product of the diagonal of ''C'' with the diagonal of ''X''. Analogously, when the matrices ''A<sub>k</sub>'' are diagonal, the corresponding inner products are dot products of diagonals. In these products, only the diagonal elements of ''X'' are used, so we can add constraints equating the non-diagonal elements of ''X'' to 0. The condition <math>X \succeq 0</math> is then equivalent to the condition that all diagonal elements of ''X'' are non-negative. Then, the resulting SDP becomes a [[linear program]] in which the variables are the diagonal elements of ''X''.
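Concretely, writing <math>C = \mathrm{diag}(c_1,\ldots,c_n)</math>, <math>A_k = \mathrm{diag}(a_{k,1},\ldots,a_{k,n})</math>, and <math>x_i := X_{ii}</math>, the SDP above reduces to the linear program

:<math>
\min_{x \in \mathbb{R}^n} \sum_{i=1}^n c_i x_i
\quad \text{subject to} \quad
\sum_{i=1}^n a_{k,i} x_i \leq b_k, \;\; k = 1,\ldots,m,
\qquad x_i \geq 0 \text{ for all } i.
</math>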
 
== Duality theory ==
The '''primal''' semidefinite program (P-SDP) is defined as
:<math>
\begin{array}{rl}
{\displaystyle\min_{X \in \mathbb{S}^n}} & \langle C, X \rangle \\
\text{subject to} & \langle A_i, X \rangle = b_i, \quad i = 1,\ldots,m \\
& X \succeq 0
\end{array}
</math>

The '''dual''' semidefinite program (D-SDP) is defined as
:<math>
\begin{array}{rl}
{\displaystyle\max_{y \in \mathbb{R}^m}} & b^T y \\
\text{subject to} & {\displaystyle\sum_{i=1}^m} y_i A_i \preceq C
\end{array}
</math>

The [[weak duality]] theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution to the dual SDP lower-bounds the primal SDP value, and conversely, any feasible solution to the primal SDP upper-bounds the dual SDP value. This is because
:<math>
\langle C, X \rangle - b^T y
= \langle C, X \rangle - \sum_{i=1}^m y_i b_i
= \langle C, X \rangle - \sum_{i=1}^m y_i \langle A_i, X \rangle
= \left\langle C - \sum_{i=1}^m y_i A_i, X \right\rangle
\geq 0,
</math>
where the last inequality holds because both <math>C - \sum_{i=1}^m y_i A_i</math> and <math>X</math> are positive semidefinite, and the inner product of two positive semidefinite matrices is non-negative.

Unlike in linear programming, strong duality need not hold for a general SDP; the following theorem gives sufficient conditions:

(i) Suppose the primal problem (P-SDP) is bounded below and strictly
feasible (i.e., there exists
<math>X_0\in\mathbb{S}^n, X_0\succ 0</math> such that <math>\langle
A_i,X_0\rangle = b_i</math>, <math>i=1,\ldots,m</math>). Then there is an optimal solution <math>y^*</math> to (D-SDP) and
:<math>\langle C,X^*\rangle = b^T y^*.</math>
 
(ii) Suppose the dual problem (D-SDP) is bounded above and strictly
feasible (i.e., there exists <math>y_0 \in \mathbb{R}^m</math> such that <math>\sum_{i=1}^m (y_0)_i A_i \prec C</math>). Then there is an optimal solution <math>X^*</math> to (P-SDP) and
the equality from (i) holds.
 
A sufficient condition for strong duality to hold for an SDP (and, in general, for any convex optimization problem) is [[Slater's condition]]. It is also possible to attain strong duality for SDPs without additional regularity conditions by using an extended dual problem proposed by Ramana.<ref name=":1">{{Cite journal |last=Ramana |first=Motakuri V. |date=1997 |title=An exact duality theory for semidefinite programming and its complexity implications |url=http://link.springer.com/10.1007/BF02614433 |journal=Mathematical Programming |language=en |volume=77 |issue=1 |pages=129–162 |doi=10.1007/BF02614433 |s2cid=12886462 |issn=0025-5610|url-access=subscription }}</ref><ref>{{Cite journal |last1=Vandenberghe |first1=Lieven |last2=Boyd |first2=Stephen |date=1996 |title=Semidefinite Programming |url=http://epubs.siam.org/doi/10.1137/1038003 |journal=SIAM Review |language=en |volume=38 |issue=1 |pages=49–95 |doi=10.1137/1038003 |issn=0036-1445|url-access=subscription }}</ref>
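Strong duality under Slater's condition can be observed numerically by solving a primal–dual pair side by side. The following is a hedged sketch using the CVXPY library with placeholder data (both problems below are strictly feasible, so the two optimal values agree):

<syntaxhighlight lang="python">
import cvxpy as cp
import numpy as np

n, m = 3, 2
C = np.eye(n)
A = [np.diag([1.0, 0.0, 0.0]), np.diag([0.0, 1.0, 1.0])]
b = np.array([1.0, 2.0])

# Primal (P-SDP): min <C, X>  s.t.  <A_i, X> = b_i,  X PSD.
X = cp.Variable((n, n), symmetric=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                    [cp.trace(A[i] @ X) == b[i] for i in range(m)] + [X >> 0])

# Dual (D-SDP): max b^T y  s.t.  sum_i y_i A_i <= C  (semidefinite order).
y = cp.Variable(m)
dual = cp.Problem(cp.Maximize(b @ y),
                  [sum(y[i] * A[i] for i in range(m)) << C])

primal.solve()
dual.solve()
print(primal.value, dual.value)    # both are (approximately) 3
</syntaxhighlight>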
 
== Examples ==
=== Example: the Goemans–Williamson MAX CUT approximation algorithm ===
In the [[maximum cut]] problem, we are given a graph <math>G = (V, E)</math> and must partition the vertices into two sets so that as many edges as possible cross from one set to the other. Assigning to each vertex a variable <math>x_{i} \in \{-1, 1\}</math> encoding its side of the partition, the problem can be written as the quadratic integer program

:<math>
\max \frac{1}{2} \sum_{(i,j) \in E} (1 - x_{i} x_{j})
\quad \text{subject to} \quad x_{i} \in \{-1, 1\} \text{ for all } i.
</math>

[[Michel Goemans|Goemans]] and [[David P. Williamson|Williamson]] relax each <math>x_{i}</math> to a unit vector <math>v_{i} \in \mathbf{R^n}</math>, which yields the vector program

:<math>
\max \frac{1}{2} \sum_{(i,j) \in E} (1 - \langle v_{i}, v_{j} \rangle)
\quad \text{subject to} \quad \|v_{i}\| = 1 \text{ for all } i.
</math>
This is an SDP because the objective function and constraints are all linear functions of vector inner products. Solving the SDP gives a set of unit vectors in <math>\mathbf{R^n}</math>; since the vectors are not required to be collinear, the value of this relaxed program can only be higher than the value of the original quadratic integer program. Finally, a rounding procedure is needed to obtain a partition. Goemans and Williamson simply choose a uniformly random hyperplane through the origin and divide the vertices according to which side of the hyperplane the corresponding vectors lie. Straightforward analysis shows that this procedure achieves an expected ''approximation ratio'' (performance guarantee) of 0.87856 - ε. (The expected value of the cut is the sum over edges of the probability that the edge is cut, which is proportional to the angle <math>\cos^{-1}\langle v_{i}, v_{j}\rangle</math> between the vectors at the endpoints of the edge over <math>\pi</math>. Comparing this probability to <math>(1-\langle v_{i}, v_{j}\rangle)/{2}</math>, in expectation the ratio is always at least 0.87856.) Assuming the [[unique games conjecture]], it can be shown that this approximation ratio is essentially optimal.
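A minimal sketch of the rounding step, using NumPy (the matrix <code>V</code>, whose rows are the unit vectors <math>v_i</math>, is placeholder data standing in for an actual SDP solution):

<syntaxhighlight lang="python">
import numpy as np

# Rows of V are unit vectors v_1, ..., v_n (placeholder data here).
V = np.random.randn(6, 6)
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Choose a uniformly random hyperplane through the origin, i.e. a
# random normal vector r, and split vertices by the sign of <v_i, r>.
r = np.random.randn(V.shape[1])
side = np.sign(V @ r)              # +1 / -1 labels the two parts

# An edge (i, j) is cut iff its endpoints receive opposite labels,
# which happens with probability arccos(<v_i, v_j>) / pi.
print(side)
</syntaxhighlight>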
 
Since the original paper of Goemans and Williamson, SDPs have been applied to develop numerous approximation algorithms. Subsequently, Prasad Raghavendra developed a general framework for constraint satisfaction problems based on the [[unique games conjecture]].<ref>{{Cite book|chapter-url=http://doi.acm.org/10.1145/1374376.1374414|doi=10.1145/1374376.1374414|chapter=Optimal algorithms and inapproximability results for every CSP?|title=Proceedings of the fortieth annual ACM symposium on Theory of computing|year=2008|last1=Raghavendra|first1=Prasad|pages=245–254|isbn=9781605580470|s2cid=15075197}}</ref>
 
=== Other applications ===
Semidefinite programming has been applied to find approximate solutions to combinatorial optimization problems, such as the solution of the [[max cut]] problem with an [[approximation ratio]] of 0.87856. SDPs are also used in geometry to determine tensegrity graphs, arise in control theory as [[Linear matrix inequality|LMIs]], and appear in inverse elliptic coefficient problems as convex, non-linear, semidefiniteness constraints.<ref>{{citation |last1=Harrach |first1=Bastian |title=Solving an inverse elliptic coefficient problem by convex non-linear semidefinite programming |date=2021 |journal=Optimization Letters |volume=16 |issue=5 |pages=1599–1609 |language=en |arxiv=2105.11440 |doi=10.1007/s11590-021-01802-4 |s2cid=235166806}}</ref> Semidefinite programming is also widely used in physics to constrain [[Conformal field theory|conformal field theories]] with the [[conformal bootstrap]].<ref>{{cite journal |last=Simmons-Duffin |first=David |date=2015-02-06 |title=A Semidefinite Program Solver for the Conformal Bootstrap |journal=Journal of High Energy Physics |volume=2015 |issue=6 |page=174 |arxiv=1502.02033 |bibcode=2015JHEP...06..174S |doi=10.1007/JHEP06(2015)174 |s2cid=256009551}}</ref>
 
== Run-time complexity ==
The '''semidefinite feasibility problem''' (SDF) is the following [[decision problem]]: given an SDP, decide whether it has at least one feasible solution. The exact run-time complexity of this problem is unknown (as of 1997). However, Ramana proved the following:<ref name=":1" />
 
* In the [[Turing machine]] model, SDF is in NP if and only if it is in co-NP. Therefore, SDF is not NP-complete unless NP = co-NP.
* In the [[Blum–Shub–Smale machine]] model, SDF is in the intersection of NP and co-NP.
 
== Algorithms for solving SDPs ==
There are several types of algorithms for solving SDPs. These algorithms output the value of the SDP up to an additive error <math>\epsilon</math> in time that is polynomial in the program description size and <math>\log (1/\epsilon)</math>.
 
=== Ellipsoid method ===
The [[ellipsoid method]] is a general method for convex programming, and can be used in particular to solve SDPs. In the context of SDPs, the ellipsoid method provides the following guarantee.<ref name=":0" />{{Rp|location=Thm.2.6.1}} Consider an SDP in the following equational form:<blockquote><math>
\begin{array}{rl}
{\displaystyle\max_{X \in \mathbb{S}^n}} & \langle C, X \rangle \\
\text{subject to} & \langle A_k, X \rangle = b_k, \quad k = 1,\ldots,m \\
& X \succeq 0.
\end{array}
</math></blockquote>Let ''L'' be the affine subspace of matrices in S<sup>n</sup> satisfying the ''m'' equational constraints; so the SDP can be written as: <math>
\max_{X\in L} \langle C,X\rangle \text{ subject to } X\succeq 0
</math>. Suppose all coefficients in the SDP are rational numbers. Let ''R'' be an explicitly given upper bound on the maximum [[Frobenius norm]] of a feasible solution, and ''ε'' > 0 a constant. A matrix ''X'' in S<sup>n</sup> is called ''ε-deep'' if every matrix ''Y'' in ''L'' with [[Frobenius distance]] at most ''ε'' from ''X'' satisfies the feasibility condition <math>
Y\succeq 0
</math>. Denote <math>
v_{deep} := \sup \{\langle C,X\rangle: X \text{ is } \epsilon\text{-deep} \}
</math>. The ellipsoid method returns one of the following outputs:
 
* A matrix X* in L (that is, satisfying all linear equality constraints exactly), such that the Frobenius distance between X* and some feasible solution is at most ''ε'' (that is, approximately satisfying the inequality constraint <math>
X\succeq 0
</math>), and <math>
\langle C,X^*\rangle \geq v_{deep} -\epsilon
</math> (that is, approximately optimal objective value).
* A certificate that the problem has no ''ε-deep'' solutions (that is, the problem is approximately infeasible).
 
The run-time is polynomial in the binary encodings of the inputs and in log(R/''ε''), in the [[Turing machine]] model.
 
Note that, in general, ''R'' may be doubly-exponential in ''n''. In that case, the run-time guarantee of the ellipsoid method is exponential in ''n''. But in most applications, ''R'' is not so huge. In these cases, the ellipsoid method is the only known method that guarantees polynomial runtime in the Turing machine model.<ref name=":0" />{{Rp|page=23}} In practice, however, its performance is poor.
 
=== Interior point methods ===
Most SDP solver codes are based on [[interior point methods]]. These are robust and efficient for general linear SDPs, but they are second-order methods that need to store and factorize a large (and often dense) matrix, which limits the problem sizes they can handle.
 
=== Approximate methods ===
Algorithms that solve SDPs approximately have been proposed as well. The main goal of such methods is to achieve lower complexity in applications where approximate solutions are sufficient and low complexity is essential. A prominent method that has been used for data detection in multiple-input multiple-output (MIMO) wireless systems is Triangular Approximate SEmidefinite Relaxation (TASER),<ref>{{Cite journal|last1=Castañeda|first1=O.|last2=Goldstein|first2=T.|last3=Studer|first3=C.|date=December 2016|title=Data Detection in Large Multi-Antenna Wireless Systems via Approximate Semidefinite Relaxation|journal=IEEE Transactions on Circuits and Systems I: Regular Papers|volume=63|issue=12|pages=2334–2346|doi=10.1109/TCSI.2016.2607198|arxiv=1609.01797|hdl=20.500.11850/448631|issn=1558-0806|doi-access=free}}</ref> which operates on the [[Cholesky decomposition]] factors of the semidefinite matrix instead of the semidefinite matrix itself. This method calculates approximate solutions for a max-cut-like problem that are often comparable to solutions from exact solvers, but in only 10–20 algorithm iterations. [[Elad Hazan|Hazan]]<ref>{{Cite book |last=Hazan |first=Elad |date=2008 |editor-last=Laber |editor-first=Eduardo Sany |editor2-last=Bornstein |editor2-first=Claudson |editor3-last=Nogueira |editor3-first=Loana Tito |editor4-last=Faria |editor4-first=Luerbio |chapter=Sparse Approximate Solutions to Semidefinite Programs |chapter-url=https://link.springer.com/chapter/10.1007/978-3-540-78773-0_27 |title=LATIN 2008: Theoretical Informatics |series=Lecture Notes in Computer Science |volume=4957 |language=en |location=Berlin, Heidelberg |publisher=Springer |pages=306–316 |doi=10.1007/978-3-540-78773-0_27 |isbn=978-3-540-78773-0}}</ref> has developed an approximate algorithm for solving SDPs with the additional constraint that the [[Trace (linear algebra)|trace]] of the variable matrix must be 1.
 
== Preprocessing algorithms ==
Facial reduction algorithms preprocess an SDP by inspecting its constraints. They can be used to:

* Detect lack of strict feasibility;
* Delete redundant rows and columns;
* Reduce the size of the variable matrix.<ref>{{citation |last1=Zhu |first1=Yuzixuan |title=Sieve-SDP: a simple facial reduction algorithm to preprocess semidefinite programs |date=2019 |url=http://link.springer.com/10.1007/s12532-019-00164-4 |journal=Mathematical Programming Computation |volume=11 |issue=3 |pages=503–586 |language=en |arxiv=1710.08954 |doi=10.1007/s12532-019-00164-4 |issn=1867-2949 |s2cid=53645581 |last2=Pataki |first2=Gábor |last3=Tran-Dinh |first3=Quoc}}</ref>
 
== See also ==
 
* [[Square-root sum problem]] – a special case of an SDP feasibility problem.
 
== References ==