Interior-point method: Difference between revisions

 
== Definitions ==
We are given a [[convex program]] of the form:<math display="block">
\begin{aligned}
 
\underset{x \in \mathbb{R}^n}{\text{minimize}}\quad & f(x) \\
\text{subject to}\quad & x \in G.
\end{aligned}
</math>where f is a [[convex function]] and G is a [[convex set]]. Without loss of generality, [[Convex optimization#linear|we can assume that the objective ''f'' is a linear function]]. Usually, the convex set ''G'' is represented by a set of convex inequalities and linear equalities; the linear equalities can be eliminated using linear algebra, so for simplicity we assume there are only convex inequalities, and the program can be described as follows, where the ''g<sub>i</sub>'' are convex functions:<math display="block">
\begin{aligned}
\underset{x \in \mathbb{R}^n}{\text{minimize}}\quad & f(x) \\
\text{subject to}\quad & g_i(x) \leq 0 \text{ for } i = 1, \dots, m.
\end{aligned}
</math>
where ''f'' and the ''g<sub>i</sub>'' are [[convex function]]<nowiki/>s. We assume that the constraint functions belong to some family (e.g. quadratic functions), so that the program can be represented by a finite ''vector of coefficients'' (e.g. the coefficients of the quadratic functions). The dimension of this coefficient vector is called the ''size'' of the program. A ''numerical solver'' for a given family of programs is an algorithm that, given the coefficient vector, generates a sequence of approximate solutions ''x<sub>t</sub>'' for ''t''=1,2,..., using finitely many arithmetic operations. A numerical solver is called ''convergent'' if, for any program from the family and any ''ε''>0, there is some ''T'' (which may depend on the program and on ''ε'') such that, for any ''t''>''T'', the approximate solution ''x<sub>t</sub>'' is ''ε-approximate'', that is:<blockquote>''f''(''x<sub>t</sub>'') - ''f''<sup>*</sup> ≤ ''ε''
 
''g<sub>i</sub>''(''x<sub>t</sub>'') ≤ ''ε'' for all ''i'' in 1,...,''m''.</blockquote>
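As a concrete restatement (a hypothetical helper, not part of the source), the ''ε''-approximate condition is just two sets of inequalities on the candidate point:

```python
def is_eps_approximate(x, f, gs, f_star, eps):
    """Check the two epsilon-approximate conditions:
    f(x) - f* <= eps, and g_i(x) <= eps for every constraint g_i."""
    return f(x) - f_star <= eps and all(g(x) <= eps for g in gs)
```

Note that the point is allowed to violate each constraint by up to ''ε''; it need not be exactly feasible.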
 
=== Details ===
We are given a convex optimization problem (P) in "standard form":<blockquote>'''minimize ''c''<sup>T</sup>''x'' s.t. ''x'' in ''G''''', </blockquote>where ''G'' is convex and closed. We can also assume that ''G'' is bounded (we can easily make it bounded by adding a constraint |''x''|≤''R'' for some sufficiently large ''R'').<ref name=":0" />{{Rp|___location=Sec.4}}
 
To use the interior-point method, we need a [[self-concordant barrier]] for ''G''. Let ''b'' be an ''M''-self-concordant barrier for ''G'', where ''M''≥1 is the self-concordance parameter. We assume that we can compute efficiently the value of ''b'', its gradient, and its [[Hessian matrix|Hessian]], for every point x in the interior of ''G''.
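For intuition (an illustrative sketch, not from the source), here is how the value, gradient, and Hessian of the standard logarithmic barrier can be computed when ''G'' is the polyhedron {''x'' : ''Ax'' ≤ ''b''}; this barrier is ''m''-self-concordant, where ''m'' is the number of inequalities:

```python
import numpy as np

def log_barrier(A, b, x):
    """Value, gradient and Hessian of b(x) = -sum_i log(b_i - a_i^T x),
    the standard logarithmic barrier for G = {x : Ax <= b}."""
    s = b - A @ x                        # slacks; positive iff x is interior
    if np.any(s <= 0):
        raise ValueError("x must lie in the interior of G")
    value = -np.sum(np.log(s))
    grad = A.T @ (1.0 / s)               # sum_i a_i / s_i
    hess = A.T @ ((1.0 / s**2)[:, None] * A)  # sum_i a_i a_i^T / s_i^2
    return value, grad, hess
```

Path-following methods only ever query the barrier, its gradient, and its Hessian at interior points, which is exactly what this routine provides.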
 
=== Convergence and complexity ===
 
The convergence rate of the method is given by the following formula, for every ''i'':<ref name=":0" />{{Rp|___location=Prop.4.4.1}}<blockquote><math>c^T x_i - c^* \leq \frac{2 M}{t_0} \mu^{-i} </math></blockquote>where <math>\mu = \left(1+r/\sqrt{M}\right) </math>. With this choice of <math>\mu</math>, the number of Newton steps required to go from ''x<sub>i</sub>'' to ''x''<sub>''i''+1</sub> is at most a fixed number, which depends only on ''r'' and ''L''. In particular, the total number of Newton steps required to find an ''ε''-approximate solution (i.e., finding ''x'' in ''G'' such that ''c''<sup>T</sup>''x'' - ''c''* ≤ ''ε'') is at most:<ref name=":0" />{{Rp|___location=Thm.4.4.1}}<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M}{t_0 \varepsilon} + 1\right) </math></blockquote>where the constant factor O(1) depends only on ''r'' and ''L''. The number of Newton steps required for the two-step initialization procedure is at most:<ref name=":0" />{{Rp|___location=Thm.4.5.1}}<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M}{1-\pi_{x^*_f}(\bar{x})} + 1\right)
+
O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{M \text{Var}_G(c)}{\varepsilon} + 1\right)
</math>{{Clarify|reason=It is not clear what this "pi" function is|date=November 2023}}</blockquote>where the constant factor O(1) depends only on ''r'' and ''L'', and <math>\text{Var}_G(c) := \max_{x\in G} c^T x - \min_{x\in G} c^T x
</math>, and <math>\bar{x}</math> is some point in the interior of ''G''.
 
Overall, the Newton complexity of finding an ''ε''-approximate solution is at most<blockquote><math>O(1) \cdot \sqrt{M} \cdot \ln\left(\frac{V}{\varepsilon} + 1\right) </math>, where ''V'' is a problem-dependent constant: <math>V = \frac{\text{Var}_G(c)}{1-\pi_{x^*_f}(\bar{x})}
</math>.</blockquote>Each Newton step takes O(''n''<sup>3</sup>) arithmetic operations.
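As a concrete (hypothetical) illustration of the path-following scheme and its ''M''/''t'' duality-gap bound, a short-step barrier method for a linear program over a polytope can be sketched as follows; the function name and the choice ''r'' = 0.1 are ours, and this is a didactic sketch rather than production code:

```python
import numpy as np

def short_step_barrier(c, A, b, x, t0=1.0, eps=1e-4, r=0.1):
    """Short-step path-following for: minimize c^T x  s.t.  Ax <= b.
    Uses the log-barrier -sum_i log(b_i - a_i^T x), an M-self-concordant
    barrier with M = number of constraints; x must be strictly feasible."""
    M = A.shape[0]
    mu = 1.0 + r / np.sqrt(M)            # penalty growth factor
    t = t0
    while M / t > eps:                   # on the path, c^T x - c* <= M/t
        t *= mu
        # one damped Newton step on F_t(x) = t * c^T x + barrier(x)
        s = b - A @ x                    # slacks, strictly positive
        grad = t * c + A.T @ (1.0 / s)
        hess = A.T @ ((1.0 / s**2)[:, None] * A)
        dx = np.linalg.solve(hess, -grad)
        lam = np.sqrt(dx @ hess @ dx)    # Newton decrement
        x = x + dx / (1.0 + lam)         # damped step keeps x interior
    return x
```

Each iteration multiplies ''t'' by 1 + ''r''/√''M'' and performs one damped Newton step, which is where the √''M'' factor in the complexity bounds above comes from.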
 
=== Initialization: phase-I methods ===
To initialize the path-following methods, we need a point in the relative interior of the feasible region ''G''. In other words: if ''G'' is defined by the inequalities ''g<sub>i</sub>''(''x'') ≤ 0, then we need some ''x'' for which ''g<sub>i</sub>''(''x'') < 0 for all ''i'' in 1,...,''m''. If we do not have such a point, we need to find one using a so-called '''phase I method'''.<ref name=":3" />{{Rp|___location=11.4}} A simple phase-I method is to solve the following convex program:<math display="block">
\begin{aligned}
\text{minimize}\quad & s \\
\text{subject to}\quad & g_i(x) \leq s \text{ for } i = 1, \dots, m
\end{aligned}
</math>Denote the optimal solution by ''x''*, ''s''*.
 
* If ''s''*<0, then we know that ''x''* is an interior point of the original problem and can go on to "phase II", which is solving the original problem.
* If ''s''*>0, then we know that the original program is infeasible – the feasible region is empty.
* If ''s''*=0 and it is attained by some solution ''x''*, then the problem is feasible but has no interior point; if it is not attained, then the problem is infeasible.
 
For this program it is easy to get an interior point: we can arbitrarily take ''x''=0, and take ''s'' to be any number larger than max(''g''<sub>1</sub>(0),...,''g<sub>m</sub>''(0)). Therefore, it can be solved using interior-point methods. However, the run-time is proportional to log(1/|''s''*|). As ''s''* comes near 0, it becomes harder and harder to find an exact solution to the phase-I problem, and thus harder to decide whether the original problem is feasible.
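When the constraints are linear, ''Ax'' ≤ ''b'', the phase-I program above is itself a linear program and can be handed to any LP solver. A sketch using SciPy (illustrative; the function name is ours, and the decision variables are the pair (''x'', ''s'')):

```python
import numpy as np
from scipy.optimize import linprog

def phase_one(A, b):
    """Phase-I for Ax <= b: solve  minimize s  s.t.  Ax - s*1 <= b.
    If the optimum s* < 0, the returned x is strictly feasible for Ax <= b;
    if s* > 0, the system Ax <= b is infeasible."""
    m, n = A.shape
    c = np.zeros(n + 1)
    c[-1] = 1.0                                   # objective: minimize s
    A_ub = np.hstack([A, -np.ones((m, 1))])       # rows encode Ax - s <= b
    bounds = [(None, None)] * (n + 1)             # x and s are free variables
    res = linprog(c, A_ub=A_ub, b_ub=b, bounds=bounds, method="highs")
    return res.x[:n], res.x[-1]
```

For example, for the unit square {''x'' : 0 ≤ ''x''<sub>1</sub>, ''x''<sub>2</sub> ≤ 1}, the optimum is ''s''* = −0.5, attained at the center (0.5, 0.5), the point of maximal slack.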
 
=== Practical considerations ===