Interior-point method: Difference between revisions

 
== Definitions ==
We are given a [[convex program]] of the form:<math display="block">
\begin{aligned}
 
\underset{x \in \mathbb{R}^n}{\text{minimize}}\quad & f(x) \\
\text{subject to}\quad & x \in G.
\end{aligned}
</math>where f is a [[convex function]] and G is a [[convex set]]. Without loss of generality, [[Convex optimization#linear|we can assume that the objective ''f'' is a linear function]]. Usually, the convex set ''G'' is represented by a set of convex inequalities and linear equalities; the linear equalities can be eliminated using linear algebra, so for simplicity we assume there are only convex inequalities, and the program can be described as follows, where the ''g<sub>i</sub>'' are convex functions:<math display="block">
\begin{aligned}
\underset{x \in \mathbb{R}^n}{\text{minimize}}\quad & f(x) \\
\text{subject to}\quad & g_i(x) \leq 0 \text{ for } i = 1, \dots, m.
\end{aligned}
</math>
where ''f'' and the ''g<sub>i</sub>'' are [[convex function]]<nowiki/>s. We assume that the constraint functions belong to some family (e.g. quadratic functions), so that the program can be represented by a finite ''vector of coefficients'' (e.g. the coefficients of the quadratic functions). The dimension of this coefficient vector is called the ''size'' of the program. A ''numerical solver'' for a given family of programs is an algorithm that, given the coefficient vector, generates a sequence of approximate solutions ''x<sub>t</sub>'' for ''t''=1,2,..., using finitely many arithmetic operations. A numerical solver is called ''convergent'' if, for any program from the family and any ''ε''>0, there is some ''T'' (which may depend on the program and on ''ε'') such that, for any ''t''>''T'', the approximate solution ''x<sub>t</sub>'' is ''ε-approximate'', that is:<blockquote>''f''(''x<sub>t</sub>'') - ''f''<sup>*</sup> ≤ ''ε''
 
''g<sub>i</sub>''(''x<sub>t</sub>'') ≤ ''ε'' for all ''i'' in 1,...,''m''.</blockquote>
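As a concrete restatement (a hypothetical helper, not part of the source), the ''ε''-approximate condition is just two sets of inequalities on the candidate point:

```python
def is_eps_approximate(x, f, gs, f_star, eps):
    """Check the two epsilon-approximate conditions:
    f(x) - f* <= eps, and g_i(x) <= eps for every constraint g_i."""
    return f(x) - f_star <= eps and all(g(x) <= eps for g in gs)
```

Note that the point is allowed to violate each constraint by up to ''ε''; it need not be exactly feasible.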
 
=== Details ===
We are given a convex optimization problem (P) in "standard form":<blockquote>'''minimize ''c''<sup>T</sup>''x'' s.t. ''x'' in ''G''''', </blockquote>where ''G'' is convex and closed. We can also assume that ''G'' is bounded (we can easily make it bounded by adding a constraint |''x''|≤''R'' for some sufficiently large ''R'').<ref name=":0" />{{Rp|___location=Sec.4}}
 
To use the interior-point method, we need a [[self-concordant barrier]] for ''G''. Let ''b'' be an ''M''-self-concordant barrier for ''G'', where ''M''≥1 is the self-concordance parameter. We assume that we can compute efficiently the value of ''b'', its gradient, and its [[Hessian matrix|Hessian]], for every point x in the interior of ''G''.
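For intuition (an illustrative sketch, not from the source), here is how the value, gradient, and Hessian of the standard logarithmic barrier can be computed when ''G'' is the polyhedron {''x'' : ''Ax'' ≤ ''b''}; this barrier is ''m''-self-concordant, where ''m'' is the number of inequalities:

```python
import numpy as np

def log_barrier(A, b, x):
    """Value, gradient and Hessian of b(x) = -sum_i log(b_i - a_i^T x),
    the standard logarithmic barrier for G = {x : Ax <= b}."""
    s = b - A @ x                        # slacks; positive iff x is interior
    if np.any(s <= 0):
        raise ValueError("x must lie in the interior of G")
    value = -np.sum(np.log(s))
    grad = A.T @ (1.0 / s)               # sum_i a_i / s_i
    hess = A.T @ ((1.0 / s**2)[:, None] * A)  # sum_i a_i a_i^T / s_i^2
    return value, grad, hess
```

Path-following methods only ever query the barrier, its gradient, and its Hessian at interior points, which is exactly what this routine provides.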
 
=== Convergence and complexity ===
 
The convergence rate of the method is given by the following formula, for every ''i'':<ref name=":0" />{{Rp|___location=Prop.4.4.1}}<blockquote><math>c^T x_i - c^* \leq \frac{2 M}{t_0} \mu^{-i} </math></blockquote>where <math>\mu = \left(1+r/\sqrt{M}\right) </math>. With this choice of <math>\mu</math>, the number of Newton steps required to go from ''x<sub>i</sub>'' to ''x''<sub>''i''+1</sub> is at most a fixed number, which depends only on ''r'' and ''L''. In particular, the total number of Newton steps required to find an ''ε''-approximate solution (i.e., finding ''x'' in ''G'' such that ''c''<sup>T</sup>''x'' - ''c''* ≤ ''ε'') is at most:<ref name=":0" />{{Rp|___location=Thm.4.4.1}}<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M}{t_0 \varepsilon} + 1\right) </math></blockquote>where the constant factor O(1) depends only on ''r'' and ''L''. The number of Newton steps required for the two-step initialization procedure is at most:<ref name=":0" />{{Rp|___location=Thm.4.5.1}}<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M}{1-\pi_{x^*_f}(\bar{x})} + 1\right)
+
O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M \text{Var}_G(c)}{\varepsilon} + 1\right)
</math>{{Clarify|reason=It is not clear what this "pi" function is|date=November 2023}}</blockquote>where the constant factor O(1) depends only on ''r'' and ''L'', and <math>\text{Var}_G(c) := \max_{x\in G} c^T x - \min_{x\in G} c^T x
</math>, and <math>\bar{x}</math> is some point in the interior of ''G''.
 
Overall, the Newton complexity of finding an ''ε''-approximate solution is at most<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{V}{\varepsilon} + 1\right) </math>, where ''V'' is a problem-dependent constant: <math>V = \frac{\text{Var}_G(c)}{1-\pi_{x^*_f}(\bar{x})}
</math>.</blockquote>Each Newton step takes O(''n''<sup>3</sup>) arithmetic operations.
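As a concrete (hypothetical) illustration of the path-following scheme and its ''M''/''t'' duality-gap bound, a short-step barrier method for a linear program over a polytope can be sketched as follows; the function name and the choice ''r'' = 0.1 are ours, and this is a didactic sketch rather than production code:

```python
import numpy as np

def short_step_barrier(c, A, b, x, t0=1.0, eps=1e-4, r=0.1):
    """Short-step path-following for: minimize c^T x  s.t.  Ax <= b.
    Uses the log-barrier -sum_i log(b_i - a_i^T x), an M-self-concordant
    barrier with M = number of constraints; x must be strictly feasible."""
    M = A.shape[0]
    mu = 1.0 + r / np.sqrt(M)            # penalty growth factor
    t = t0
    while M / t > eps:                   # on the path, c^T x - c* <= M/t
        t *= mu
        # one damped Newton step on F_t(x) = t * c^T x + barrier(x)
        s = b - A @ x                    # slacks, strictly positive
        grad = t * c + A.T @ (1.0 / s)
        hess = A.T @ ((1.0 / s**2)[:, None] * A)
        dx = np.linalg.solve(hess, -grad)
        lam = np.sqrt(dx @ hess @ dx)    # Newton decrement
        x = x + dx / (1.0 + lam)         # damped step keeps x interior
    return x
```

Each iteration multiplies ''t'' by 1 + ''r''/√''M'' and performs one damped Newton step, which is where the √''M'' factor in the complexity bounds above comes from.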
 
=== Initialization: phase-I methods ===
To initialize the path-following methods, we need a point in the relative interior of the feasible region ''G''. In other words: if ''G'' is defined by the inequalities ''g<sub>i</sub>''(''x'') ≤ 0, then we need some ''x'' for which ''g<sub>i</sub>''(''x'') < 0 for all ''i'' in 1,...,''m''. If we do not have such a point, we need to find one using a so-called '''phase I method'''.<ref name=":3" />{{Rp|___location=11.4}} A simple phase-I method is to solve the following convex program:<math display="block">
\begin{aligned}
\text{minimize}\quad & s \\
\text{subject to}\quad & g_i(x) \leq s \text{ for } i = 1, \dots, m
\end{aligned}
</math>Denote the optimal solution by ''x''*, ''s''*.
 
* If ''s''*<0, then we know that ''x''* is an interior point of the original problem and can go on to "phase II", which is solving the original problem.
* If ''s''*>0, then we know that the original program is infeasible – the feasible region is empty.
* If ''s''*=0 and it is attained by some solution ''x''*, then the problem is feasible but has no interior point; if it is not attained, then the problem is infeasible.
 
For this program it is easy to get an interior point: we can arbitrarily take ''x''=0, and take ''s'' to be any number larger than max(''g''<sub>1</sub>(0),...,''g<sub>m</sub>''(0)). Therefore, it can be solved using interior-point methods. However, the run-time is proportional to log(1/|''s''*|). As ''s''* comes near 0, it becomes harder and harder to find an exact solution to the phase-I problem, and thus harder to decide whether the original problem is feasible.
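When the constraints are linear, ''Ax'' ≤ ''b'', the phase-I program above is itself a linear program and can be handed to any LP solver. A sketch using SciPy (illustrative; the function name is ours, and the decision variables are the pair (''x'', ''s'')):

```python
import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    """Phase-I for Ax <= b: solve  minimize s  s.t.  Ax - s*1 <= b.
    If the optimum s* < 0, the returned x is strictly feasible for Ax <= b;
    if s* > 0, the system Ax <= b is infeasible."""
    m, n = A.shape
    c = np.zeros(n + 1)
    c[-1] = 1.0                                   # objective: minimize s
    A_ub = np.hstack([A, -np.ones((m, 1))])       # rows encode Ax - s <= b
    bounds = [(None, None)] * (n + 1)             # x and s are free variables
    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]
```

For example, for the unit square {''x'' : 0 ≤ ''x''<sub>1</sub>, ''x''<sub>2</sub> ≤ 1}, the optimum is ''s''* = −0.5, attained at the center (0.5, 0.5), the point of maximal slack.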
 
=== Practical considerations ===