Interior-point method

 
== History ==
An interior point method was discovered by Soviet mathematician I. I. Dikin in 1967.<ref>{{Cite journal |last1=Dikin |first1=I.I. |year=1967 |title=Iterative solution of problems of linear and quadratic programming. |url=https://zbmath.org/?q=an:0189.19504 |journal=Dokl. Akad. Nauk SSSR |volume=174 |issue=1 |pages=747–748|zbl=0189.19504 }}</ref> The method was reinvented in the U.S. in the mid-1980s. In 1984, [[Narendra Karmarkar]] developed a method for [[linear programming]] called [[Karmarkar's algorithm]],<ref>{{cite conference |last1=Karmarkar |first1=N. |year=1984 |title=Proceedings of the sixteenth annual ACM symposium on Theory of computing – STOC '84 |pages=302 |doi=10.1145/800057.808695 |isbn=0-89791-133-4 |archive-url=https://web.archive.org/web/20131228145520/http://retis.sssup.it/~bini/teaching/optim2010/karmarkar.pdf |archive-date=28 December 2013 |doi-access=free |chapter-url=http://retis.sssup.it/~bini/teaching/optim2010/karmarkar.pdf |chapter=A new polynomial-time algorithm for linear programming |url-status=dead}}</ref> which runs in provably polynomial time (<math>O(n^{3.5} L)</math> operations on ''L''-bit numbers, where ''n'' is the number of variables and constants), and is also very efficient in practice. Karmarkar's paper created a surge of interest in interior point methods. Two years later, [[James Renegar]] invented the first ''path-following'' interior-point method, with run-time <math>O(n^{3} L)</math>. The method was later extended from linear to convex optimization problems, based on a [[self-concordant]] [[barrier function]] used to encode the [[convex set]].<ref name=":0">{{Cite book |last=Arkadi Nemirovsky |url=https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=8c3cb6395a35cb504019f87f447d65cb6cf1cdf0 |title=Interior point polynomial-time methods in convex programming |year=2004}}</ref>
 
Any convex optimization problem can be transformed into minimizing (or maximizing) a [[linear function]] over a convex set by converting to the [[Epigraph (mathematics)|epigraph]] form.<ref name=":3">{{cite book |last1=Boyd |first1=Stephen |title=Convex Optimization |last2=Vandenberghe |first2=Lieven |publisher=[[Cambridge University Press]] |year=2004 |isbn=978-0-521-83378-3 |___location=Cambridge |pages= |mr=2061575}}</ref>{{Rp|___location=143}} The idea of encoding the [[candidate solution|feasible set]] using a barrier and designing barrier methods was studied by Anthony V. Fiacco, Garth P. McCormick, and others in the early 1960s. These ideas were mainly developed for general [[nonlinear programming]], but they were later abandoned due to the presence of more competitive methods for this class of problems (e.g. [[sequential quadratic programming]]).
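Concretely, a general problem <math>\min_{x \in C} f(x)</math> with convex ''f'' and convex ''C'' is rewritten with one extra scalar variable ''s'' as

:<math>\min_{x,s}\ s \quad \text{subject to} \quad f(x) \le s,\ x \in C,</math>

so the objective is linear and the feasible set <math>\{(x,s) : f(x) \le s,\ x \in C\}</math> (the epigraph of ''f'' intersected with <math>C \times \mathbb{R}</math>) is again convex.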
 
[[Yurii Nesterov]] and [[Arkadi Nemirovski]] came up with a special class of such barriers that can be used to encode any convex set. They guarantee that the number of [[iteration]]s of the algorithm is bounded by a polynomial in the dimension and accuracy of the solution.<ref>{{Cite journal |mr=2115066 |doi=10.1090/S0273-0979-04-01040-7 |title=The interior-point revolution in optimization: History, recent developments, and lasting consequences |year=2004 |last1=Wright |first1=Margaret H. |journal=Bulletin of the American Mathematical Society |volume=42 |pages=39–57|doi-access=free }}</ref><ref name=":0" />
 
The class of primal-dual path-following interior-point methods is considered the most successful. [[Mehrotra predictor–corrector method|Mehrotra's predictor–corrector algorithm]] provides the basis for most implementations of this class of methods.<ref>{{cite journal |last=Potra |first=Florian A. |author2=Stephen J. Wright |title=Interior-point methods |journal=Journal of Computational and Applied Mathematics |volume=124 |year=2000 |issue=1–2 |pages=281–302 |doi=10.1016/S0377-0427(00)00433-7|doi-access=free |bibcode=2000JCoAM.124..281P }}</ref>
 
== Definitions ==
 
* '''Potential reduction methods''': [[Karmarkar algorithm|Karmarkar's algorithm]] was the first one.
* '''Path-following methods''': the algorithms of [[James Renegar]]<ref name=":1">{{Cite journal |last=Renegar |first=James |date=1988-01-01 |title=A polynomial-time algorithm, based on Newton's method, for linear programming |url=https://doi.org/10.1007/BF01580724 |journal=Mathematical Programming |language=en |volume=40 |issue=1 |pages=59–93 |doi=10.1007/BF01580724 |issn=1436-4646|url-access=subscription }}</ref> and Clovis Gonzaga<ref name=":2">{{Citation |last=Gonzaga |first=Clovis C. |title=An Algorithm for Solving Linear Programming Problems in O(n3L) Operations |date=1989 |url=https://doi.org/10.1007/978-1-4613-9617-8_1 |work=Progress in Mathematical Programming: Interior-Point and Related Methods |pages=1–28 |editor-last=Megiddo |editor-first=Nimrod |access-date=2023-11-22 |place=New York, NY |publisher=Springer |language=en |doi=10.1007/978-1-4613-9617-8_1 |isbn=978-1-4613-9617-8|url-access=subscription }}</ref> were the first ones.
* '''Primal-dual methods'''.
 
* The solver is Newton's method, and a ''single'' step of Newton is done for each single step in ''t''.
 
They proved that, in this case, the difference ''x<sub>i</sub>'' - ''x''*(''t<sub>i</sub>'') remains at most 0.01, and f(''x<sub>i</sub>'') - f* is at most 2''m''/''t<sub>i</sub>''. Thus, the solution accuracy is proportional to 1/''t<sub>i</sub>'', so to add a single accuracy digit, it is sufficient to multiply ''t<sub>i</sub>'' by 2 (or any other constant factor), which requires O(sqrt(''m'')) Newton steps. Since each Newton step takes O(''m n''<sup>2</sup>) operations, the total complexity is O(''m''<sup>3/2</sup>''n''<sup>2</sup>) operations per accuracy digit.
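The O(sqrt(''m'')) count can be seen from the update rate of the penalty parameter: with <math>\mu = 1 + r/\sqrt{m}</math> for a fixed small constant ''r'' > 0, and one Newton step per update of ''t'', the number ''k'' of steps needed to double ''t'' satisfies

:<math>\left(1+\frac{r}{\sqrt{m}}\right)^k = 2 \quad\Longrightarrow\quad k = \frac{\ln 2}{\ln\left(1+r/\sqrt{m}\right)} \approx \frac{\sqrt{m}\,\ln 2}{r} = O(\sqrt{m}).</math>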
 
[[Yurii Nesterov|Yuri Nesterov]] extended the idea from linear to non-linear programs. He noted that the main property of the logarithmic barrier, used in the above proofs, is that it is [[self-concordant]] with a finite barrier parameter. Therefore, many other classes of convex programs can be solved in polytime using a path-following method, if we can find a suitable self-concordant barrier function for their feasible region.<ref name=":0" />{{Rp|___location=Sec.1}}
To use the interior-point method, we need a [[self-concordant barrier]] for ''G''. Let ''b'' be an ''M''-self-concordant barrier for ''G'', where ''M''≥1 is the self-concordance parameter. We assume that we can compute efficiently the value of ''b'', its gradient, and its [[Hessian matrix|Hessian]], for every point x in the interior of ''G''.
 
For every ''t''>0, we define the ''penalized objective'' '''f<sub>t</sub>(x) := t''c''<sup>T</sup>''x +'' b(''x'')'''. We define the path of minimizers by: '''x*(t) := arg min f<sub>t</sub>(x)'''. We approximate this path along an increasing sequence ''t<sub>i</sub>''. The sequence is initialized by a certain non-trivial two-phase initialization procedure. Then, it is updated according to the following rule: <math>t_{i+1} := \mu \cdot t_i</math>.
 
For each ''t<sub>i</sub>'', we find an approximate minimum of ''f<sub>ti</sub>'', denoted by ''x<sub>i</sub>''. The approximate minimum is chosen to satisfy the following "closeness condition" (where ''L'' is the ''path tolerance''):<blockquote><math>\sqrt{[\nabla_x f_t(x_i)]^T [\nabla_x^2 f_t(x_i)]^{-1} [\nabla_x f_t(x_i)]} \leq L</math>.</blockquote>To find ''x<sub>i</sub>''<sub>+1</sub>, we start with ''x<sub>i</sub>'' and apply the [[damped Newton method]]. We apply several steps of this method, until the above "closeness relation" is satisfied. The first point that satisfies this relation is denoted by ''x<sub>i</sub>''<sub>+1</sub>.<ref name=":0" />{{Rp|___location=Sec.4}}
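For linear constraints ''Ax'' ≤ ''u'', the standard logarithmic barrier <math>b(x) = -\sum_i \ln(u_i - a_i^T x)</math> is self-concordant, and the scheme above can be sketched in a few lines. The following is an illustrative sketch only; the function name and the parameter defaults (''μ'' = 10, path tolerance 0.25, gap tolerance 10<sup>−6</sup>, and starting ''t'' = 1 in place of the two-phase initialization) are assumptions for the example, not part of the cited analysis:

```python
import numpy as np

def solve_lp_barrier(c, A, u, x0, mu=10.0, path_tol=0.25, gap_tol=1e-6):
    """Minimize c@x subject to A@x <= u by path-following with the
    log-barrier b(x) = -sum(log(u - A@x)).  Illustrative sketch, not a
    production solver; x0 must be strictly feasible (A@x0 < u)."""
    x, t = x0.astype(float), 1.0
    m = len(u)                              # number of constraints
    while True:
        # Damped Newton steps on f_t(x) = t*c@x + b(x) until the
        # Newton decrement satisfies the "closeness condition".
        while True:
            d = 1.0 / (u - A @ x)           # reciprocal slacks
            g = t * c + A.T @ d             # gradient of f_t
            H = A.T @ np.diag(d**2) @ A     # Hessian of f_t
            step = np.linalg.solve(H, g)
            lam = np.sqrt(g @ step)         # Newton decrement
            x = x - step / (1.0 + lam)      # damped step stays feasible
            if lam <= path_tol:
                break
        if m / t <= gap_tol:                # suboptimality bound ~ m/t
            return x
        t *= mu                             # increase penalty parameter

# Tiny example: minimize x+y over the unit box [0,1]^2 (optimum at the origin).
c = np.array([1.0, 1.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
u = np.array([0.0, 0.0, 1.0, 1.0])
x_opt = solve_lp_barrier(c, A, u, x0=np.array([0.5, 0.5]))
```

The damped step length 1/(1 + ''λ'') keeps the iterate strictly inside the feasible region, which is why no explicit line search is needed in this sketch.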
 
=== Practical considerations ===
The theoretic guarantees assume that the penalty parameter is increased at the rate <math>\mu = \left(1+r/\sqrt{M}\right)</math>, so the worst-case number of required Newton steps is <math>O(\sqrt{M})</math>. In theory, if ''μ'' is larger (e.g. 2 or more), then the worst-case number of required Newton steps is in <math>O(M)</math>. However, in practice, larger ''μ'' leads to much faster convergence. These methods are called ''long-step methods''.<ref name=":0" />{{Rp|___location=Sec.4.6}} In practice, if ''μ'' is between 3 and 100, then the program converges within 20–40 Newton steps, regardless of the number of constraints (though the runtime of each Newton step of course grows with the number of constraints). The exact value of ''μ'' within this range has little effect on the performance.<ref name=":3" />{{Rp|___location=chpt.11}}
 
== Potential-reduction methods ==
:<math>(x,\lambda) \to (x + \alpha p_x, \lambda + \alpha p_\lambda).</math>[[File:Interior_Point_Trajectory.webm|center|thumb|400x400px|Trajectory of the iterates of ''x'' by using the interior point method.]]
 
== Types of convex programs solvable via interior-point methods ==
Here are some special cases of convex programs that can be solved efficiently by interior-point methods.<ref name=":0" />{{Rp|___location=Sec.10}}