Mehrotra predictor–corrector method: Difference between revisions

m top: clean up, typo(s) fixed: Therefore → Therefore, using AWB
Link suggestions feature: 3 links added.
 
(11 intermediate revisions by 9 users not shown)
Line 1:
{{Short description|1989 Optimisation algorithm}}
'''Mehrotra's predictor–corrector method''' in [[Optimization (mathematics)|optimization]] is a specific [[interior point method]] for [[linear programming]]. It was proposed in 1989 by Sanjay Mehrotra.<ref>{{cite journal|last=Mehrotra|first=S.|title=On the implementation of a primal–dual interior point method|journal=SIAM Journal on Optimization|volume=2|year=1992|issue=4|pages=575–601|doi=10.1137/0802028}}</ref>
 
Line 9 ⟶ 10:
The complete search direction is the sum of the predictor direction and the corrector direction.
 
Although no theoretical complexity bound is known for it yet, Mehrotra's predictor–corrector method is widely used in practice.<ref>"In 1989, Mehrotra described a practical algorithm for linear programming that remains the basis of most current software; his work appeared in 1992."<p>{{cite journal|last=Potra|first=Florian A.|author2=Stephen J. Wright|title=Interior-point methods|journal=Journal of Computational and Applied Mathematics|volume=124|year=2000|issue=1–2|pages=281–302|doi=10.1016/S0377-0427(00)00433-7|doi-access=|bibcode=2000JCoAM.124..281P }}</ref> Its corrector step reuses the [[Cholesky decomposition]] computed during the predictor step, so each iteration is only marginally more expensive than in a standard interior point algorithm. The additional overhead per iteration is usually offset by a reduction in the number of iterations needed to reach an optimal solution. The method also appears to converge very fast when close to the optimum.
 
== Derivation ==
The derivation in this section follows the outline by Nocedal and Wright.<ref name=":0">{{Cite book|title=Numerical Optimization|last1=Nocedal|first1=Jorge|last2=Wright|first2=Stephen J.|publisher=Springer|year=2006|isbn=978-0387-30303-1|location=United States of America|pages=392–417, 448–496}}</ref>
 
=== Predictor step - Affine scaling direction ===
Line 34 ⟶ 35:
\end{align}</math>
 
where <math>X=\text{diag}(x)</math>, <math>S=\text{diag}(s)</math> and <math>e=(1,1,\dots,1)^T\in\mathbb{R}^{n \times 1}</math>.
 
These conditions can be reformulated as a mapping <math>F: \mathbb{R}^{2n+m}\rightarrow\mathbb{R}^{2n+m}</math> as follows
Line 43 ⟶ 44:
\end{align}</math>
 
The predictor–corrector method then works by using [[Newton's method]] to obtain the [[affine scaling]] direction. This is achieved by solving the following system of linear equations
 
<math>J(x,\lambda,s) \begin{bmatrix} \Delta x^\text{aff}\\\Delta\lambda^\text{aff} \\\Delta s^\text{aff}\end{bmatrix} = -F(x,\lambda,s)</math>
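The linear solve above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation: it assumes the standard-form linear program (minimize <math>c^Tx</math> subject to <math>Ax=b</math>, <math>x\geq 0</math>), it assumes that <math>F</math> stacks the dual residual, the primal residual and the complementarity products in that order, and it assembles the Jacobian as a dense block matrix purely for readability. The function name <code>affine_scaling_direction</code> is hypothetical.

```python
import numpy as np

def affine_scaling_direction(A, b, c, x, lam, s):
    """Solve J d = -F for the predictor (affine scaling) direction.

    Assumes a standard-form LP and strictly positive x and s.
    """
    m, n = A.shape
    # Blocks of F(x, lam, s) under the assumed ordering.
    r_dual = A.T @ lam + s - c   # dual feasibility residual
    r_prim = A @ x - b           # primal feasibility residual
    r_comp = x * s               # complementarity products X S e
    # Jacobian of F with respect to (x, lam, s), dense for clarity only.
    J = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    d = np.linalg.solve(J, -np.concatenate([r_dual, r_prim, r_comp]))
    return d[:n], d[n:n + m], d[n + m:]

# Tiny example: one equality constraint, two variables.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x = np.array([0.5, 0.5])
lam = np.array([0.0])
s = np.array([1.0, 1.0])
dx, dlam, ds = affine_scaling_direction(A, b, c, x, lam, s)
```

Practical solvers do not form this matrix explicitly; they eliminate <math>\Delta s</math> and <math>\Delta x</math> and factor a smaller symmetric system instead.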
Line 58 ⟶ 59:
 
=== Centering step ===
The average value of the products <math>x_is_i,\;i=1,2,\dots,n</math> constitutes an important measure of the desirability of a given pair <math>(x^k,s^k)</math> (the superscripts denote the iteration number, <math>k</math>, of the method). This is called the duality measure and is defined by
 
<math>\mu=\frac{1}{n}\sum_{i=1}^n x_is_i = \frac{x^Ts}{n}.</math>
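As a small worked example of this definition, the following plain-Python sketch computes <math>\mu</math> for a pair of vectors. The helper name <code>duality_measure</code> and the sample numbers are illustrative assumptions, not part of the method's description.

```python
def duality_measure(x, s):
    """Return mu = x^T s / n for two equal-length sequences."""
    if len(x) != len(s) or len(x) == 0:
        raise ValueError("x and s must be non-empty and of equal length")
    return sum(xi * si for xi, si in zip(x, s)) / len(x)

# The pairwise products are 4, 4 and 4, so their average is 4.
mu = duality_measure([1.0, 2.0, 4.0], [4.0, 2.0, 1.0])
```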
Line 69 ⟶ 70:
 
=== Corrector step ===
Considering the system used to compute the affine scaling direction defined above, one can note that taking a full step in the affine scaling direction results in the complementarity condition not being satisfied:
 
<math>\left(x_i+\Delta x_i^\text{aff}\right)\left(s_i+\Delta s_i^\text{aff}\right) = x_is_i + x_i\Delta s_i^\text{aff} + s_i\Delta x_i^\text{aff} + \Delta x_i^\text{aff}\Delta s_i^\text{aff} = \Delta x_i^\text{aff}\Delta s_i^\text{aff} \ne 0.</math>
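The cancellation in the last equality can be made explicit. Assuming that the third block row of the Newton system is the linearized complementarity condition <math>S\Delta x^\text{aff} + X\Delta s^\text{aff} = -XSe</math>, each component satisfies

<math>s_i\Delta x_i^\text{aff} + x_i\Delta s_i^\text{aff} = -x_is_i \quad\Longrightarrow\quad x_is_i + x_i\Delta s_i^\text{aff} + s_i\Delta x_i^\text{aff} = 0,\qquad i=1,2,\dots,n,</math>

so only the second-order term <math>\Delta x_i^\text{aff}\Delta s_i^\text{aff}</math> remains after a full step.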
Line 98 ⟶ 99:
 
<math>\begin{align}
\mu_\text{aff} &= (x+\alpha^\text{pri}_\text{aff}\Delta x^\text{aff})^T(s+\alpha^\text{dual}_\text{aff}\Delta s^\text{aff})/n,\\
\alpha^\text{pri}_\text{aff} &= \min\left(1, \underset{i:\Delta x_i^\text{aff}<0}{\min} -\frac{x_i}{\Delta x_i^\text{aff}}\right),\\
\alpha^\text{dual}_\text{aff} &= \min\left(1, \underset{i:\Delta s_i^\text{aff}<0}{\min} -\frac{s_i}{\Delta s_i^\text{aff}}\right),
Line 106 ⟶ 107:
 
== Step lengths ==
In practical implementations, a version of [[line search]] is performed to obtain the maximal step length that can be taken in the search direction without violating nonnegativity, <math>(x,s) \geq 0</math>.<ref name=":0" />
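The simplest form of such a step-length rule is a ratio test, the same computation used for the affine step lengths <math>\alpha^\text{pri}_\text{aff}</math> and <math>\alpha^\text{dual}_\text{aff}</math>. The sketch below is a minimal plain-Python illustration that assumes the current iterate is strictly positive; the helper name <code>max_step</code>, its <code>cap</code> parameter and the sample vectors are hypothetical.

```python
def max_step(v, dv, cap=1.0):
    """Largest alpha in [0, cap] such that v + alpha * dv >= 0 componentwise.

    Assumes every entry of v is strictly positive.
    """
    # Only components that decrease can hit the boundary.
    ratios = [-vi / dvi for vi, dvi in zip(v, dv) if dvi < 0]
    return min([cap] + ratios)

x, dx = [1.0, 2.0], [-2.0, 1.0]
s, ds = [1.0, 1.0], [0.5, -0.25]

alpha_pri = max_step(x, dx)    # limited by the first component of x
alpha_dual = max_step(s, ds)   # the cap of 1 is the binding limit here

# Duality measure after the trial step, as in the definition of mu_aff.
x_new = [xi + alpha_pri * dxi for xi, dxi in zip(x, dx)]
s_new = [si + alpha_dual * dsi for si, dsi in zip(s, ds)]
mu_aff = sum(a * b for a, b in zip(x_new, s_new)) / len(x)
```

Implementations usually stop slightly short of this boundary value so that the next iterate stays strictly positive.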
 
== Adaptation to Quadratic Programming ==
Line 117 ⟶ 118:
{{DEFAULTSORT:Mehrotra predictor-corrector method}}
[[Category:Optimization algorithms and methods]]
[[Category:Linear programming]]