Interior-point method: Difference between revisions

An interior point method was discovered by Soviet mathematician I. I. Dikin in 1967.<ref>{{Cite journal |last1=Dikin |first1=I.I. |year=1967 |title=Iterative solution of problems of linear and quadratic programming. |url=https://zbmath.org/?q=an:0189.19504 |journal=Dokl. Akad. Nauk SSSR |volume=174 |issue=1 |pages=747–748}}</ref> The method was reinvented in the U.S. in the mid-1980s. In 1984, [[Narendra Karmarkar]] developed a method for [[linear programming]] called [[Karmarkar's algorithm]],<ref>{{cite conference |last1=Karmarkar |first1=N. |year=1984 |title=Proceedings of the sixteenth annual ACM symposium on Theory of computing – STOC '84 |pages=302 |doi=10.1145/800057.808695 |isbn=0-89791-133-4 |archive-url=https://web.archive.org/web/20131228145520/http://retis.sssup.it/~bini/teaching/optim2010/karmarkar.pdf |archive-date=28 December 2013 |doi-access=free |chapter-url=http://retis.sssup.it/~bini/teaching/optim2010/karmarkar.pdf |chapter=A new polynomial-time algorithm for linear programming |url-status=dead}}</ref> which runs in provably polynomial time (<math>O(n^{3.5} L)</math> operations on ''L''-bit numbers, where ''n'' is the number of variables and ''L'' is the number of bits of input), and is also very efficient in practice. Karmarkar's paper created a surge of interest in interior point methods. Two years later, [[James Renegar]] invented the first ''path-following'' interior-point method, with run-time <math>O(n^{3} L)</math>. The method was later extended from linear to convex optimization problems, based on a [[self-concordant]] [[barrier function]] used to encode the [[convex set]].<ref name=":0">{{Cite book |last=Arkadi Nemirovsky |url=https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=8c3cb6395a35cb504019f87f447d65cb6cf1cdf0 |title=Interior point polynomial-time methods in convex programming |year=2004}}</ref>
 
Any convex optimization problem can be transformed into minimizing (or maximizing) a [[linear function]] over a convex set by converting to the [[Epigraph (mathematics)|epigraph]] form.<ref name=":3">{{cite book |last=Boyd |first=Stephen |last2=Vandenberghe |first2=Lieven |title=Convex Optimization |publisher=[[Cambridge University Press]] |___location=Cambridge |year=2004 |isbn=978-0-521-83378-3 |mr=2061575}}</ref>{{Rp|___location=143}} The idea of encoding the [[candidate solution|feasible set]] using a barrier and designing barrier methods was studied by Anthony V. Fiacco, Garth P. McCormick, and others in the early 1960s. These ideas were mainly developed for general [[nonlinear programming]], but they were later abandoned due to the presence of more competitive methods for this class of problems (e.g. [[sequential quadratic programming]]).
 
[[Yurii Nesterov]] and [[Arkadi Nemirovski]] came up with a special class of such barriers that can be used to encode any convex set. They guarantee that the number of [[iteration]]s of the algorithm is bounded by a polynomial in the dimension and accuracy of the solution.<ref>{{Cite journal |mr=2115066 |doi=10.1090/S0273-0979-04-01040-7 |title=The interior-point revolution in optimization: History, recent developments, and lasting consequences |year=2004 |last1=Wright |first1=Margaret H. |journal=Bulletin of the American Mathematical Society |volume=42 |pages=39–57|doi-access=free }}</ref><ref name=":0" />
* The constraints (and the objective) are linear functions;
* The barrier function is [[Logarithmic barrier function|logarithmic]]: <math>b(x) := -\sum_j \log(-g_j(x))</math>.
* The penalty parameter ''t'' is updated geometrically, that is, <math>t_{i+1} := \mu \cdot t_i</math>, where ''μ'' is a constant (they took <math>\mu = 1+0.001/\sqrt{m}</math>, where ''m'' is the number of inequality constraints);
* The solver is Newton's method, and a ''single'' step of Newton is done for each single step in ''t''.
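The recipe above can be sketched in code. The following is a minimal illustrative sketch (not from the source) of a short-step log-barrier method for a linear program <math>\min c^T x</math> s.t. <math>Ax \le b</math>: the penalized objective is <math>f_t(x) = t\,c^T x - \sum_j \log(b_j - a_j^T x)</math>, a single damped Newton step is taken per geometric update of ''t'', and the default ''μ'' follows the <math>1+0.001/\sqrt{m}</math> rule quoted above. Function and parameter names are illustrative.

```python
import numpy as np

def short_step_barrier_lp(c, A, b, x, mu=None, n_iters=1000, t0=1.0):
    """Short-step log-barrier sketch for  min c^T x  s.t.  A x <= b.

    Penalized objective: f_t(x) = t * c^T x - sum_j log(b_j - a_j^T x).
    One damped Newton step is taken per geometric update of t.
    `x` must be strictly feasible (A x < b).
    """
    m = A.shape[0]
    if mu is None:
        mu = 1.0 + 0.001 / np.sqrt(m)      # the historical update rate
    t = t0
    for _ in range(n_iters):
        s = b - A @ x                      # slacks; stay positive throughout
        grad = t * c + A.T @ (1.0 / s)     # gradient of f_t
        H = A.T @ np.diag(1.0 / s**2) @ A  # Hessian of f_t
        dx = np.linalg.solve(H, -grad)     # Newton direction
        lam = np.sqrt(dx @ H @ dx)         # Newton decrement
        x = x + dx / (1.0 + lam)           # damped step keeps x feasible
        t *= mu                            # geometric penalty update
    return x
```

For example, minimizing ''x''<sub>1</sub> over the unit box (with a larger ''μ'' so that ''t'' grows quickly) drives the first coordinate toward 0 while keeping every iterate strictly inside the box.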
 
 
=== Details ===
We are given a convex optimization problem (P) in "standard form":<blockquote>'''minimize ''c''<sup>T</sup>''x'' s.t. ''x'' in ''G''''', </blockquote>where ''G'' is convex and closed. We can also assume that ''G'' is bounded (otherwise, we can easily make it bounded by adding a constraint |''x''| ≤ ''R'' for some sufficiently large ''R'').<ref name=":0" />{{Rp|___location=Sec.4}}
 
To use the interior-point method, we need a [[self-concordant barrier]] for ''G''. Let ''b'' be an ''M''-self-concordant barrier for ''G'', where ''M''≥1 is the self-concordance parameter. We assume that we can compute efficiently the value of ''b'', its gradient, and its [[Hessian matrix|Hessian]], for every point x in the interior of ''G''.
 
For every ''t''>0, we define the ''penalized objective'' <math>f_t(x) := t \cdot c^T x + b(x)</math>. We define the path of minimizers by: '''x*(t) := arg min f<sub>t</sub>(x)'''. We approximate this path along an increasing sequence ''t<sub>i</sub>''. The sequence is initialized by a certain non-trivial two-phase initialization procedure. Then, it is updated according to the following rule (where ''r''>0 is a parameter called the ''penalty rate''):<blockquote><math>t_{i+1} := \left(1+r/\sqrt{M}\right)t_i</math>.</blockquote>For each ''t<sub>i</sub>'', we find an approximate minimum of <math>f_{t_i}</math>, denoted by ''x<sub>i</sub>''. The approximate minimum is chosen to satisfy the following "closeness condition" (where ''L'' is the ''path tolerance''):<blockquote><math>\sqrt{[\nabla_x f_t(x_i)]^T [\nabla_x^2 f_t(x_i)]^{-1} [\nabla_x f_t(x_i)]} \leq L</math>.</blockquote>To find ''x''<sub>''i''+1</sub>, we start at ''x<sub>i</sub>'' and apply the [[damped Newton method]]. We apply several steps of this method, until the above "closeness condition" is satisfied. The first point that satisfies it is denoted by ''x''<sub>''i''+1</sub>.<ref name=":0" />{{Rp|___location=Sec.4}}
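A minimal sketch of this outer/inner loop structure, under the assumption of a log-barrier for a box (so ''M'' equals the number of constraints): the outer loop raises ''t'' by the factor <math>1+r/\sqrt{M}</math>, and the inner loop runs damped Newton steps until the Newton decrement, which equals the left-hand side of the closeness condition above, drops below ''L''. The callables `grad_b` and `hess_b` (barrier gradient and Hessian) are illustrative names, not from the source, and the two-phase initialization is omitted: we simply assume a starting point near the path.

```python
import numpy as np

def path_following(c, grad_b, hess_b, x, M, r=0.25, L=0.25, t0=1.0, n_outer=200):
    """Path-following sketch for  min c^T x  over a set encoded by an
    M-self-concordant barrier b, given via grad_b(x) and hess_b(x).
    Assumes the initial x is close to the central path at t0."""
    t = t0
    for _ in range(n_outer):
        t *= 1.0 + r / np.sqrt(M)         # penalty-rate update of t
        while True:
            g = t * c + grad_b(x)         # gradient of f_t(x) = t*c^T x + b(x)
            H = hess_b(x)                 # Hessian of f_t (= Hessian of b)
            dx = np.linalg.solve(H, -g)   # Newton direction
            lam = np.sqrt(dx @ H @ dx)    # decrement = closeness measure
            if lam <= L:
                break                     # x_i+1 found: close enough to x*(t)
            x = x + dx / (1.0 + lam)      # damped Newton step
    return x
```

With the barrier <math>b(x) = -\sum_i \log x_i - \sum_i \log(1-x_i)</math> for the unit box (''M'' = 4 in two dimensions), this tracks the central path toward the minimizer of ''c''<sup>T</sup>''x''.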
 
=== Convergence and complexity ===
 
=== Practical considerations ===
The theoretic guarantees assume that the penalty parameter is increased at the rate <math>\mu = \left(1+r/\sqrt{M}\right)</math>, so the worst-case number of required Newton steps is <math>O(\sqrt{M})</math>. In theory, if ''μ'' is larger (e.g. 2 or more), then the worst-case number of required Newton steps is in <math>O(M)</math>. However, in practice, larger ''μ'' leads to much faster convergence. These methods are called ''long-step methods''.<ref name=":0" />{{Rp|___location=Sec.4.6}} In practice, if ''μ'' is between 3 and 100, then the program converges within 20–40 Newton steps, regardless of the number of constraints (though the runtime of each Newton step of course grows with the number of constraints). The exact value of ''μ'' within this range has little effect on the performance.<ref name=":3" />{{Rp|___location=chpt.11}}
 
== Potential-reduction methods ==