Draft:Collinear gradients method

In mathematics, the Collinear gradients method (ColGM)[1] is an iterative method of directional search for a local extremum of a smooth multivariate function. It moves towards the extremum along a vector joining the current point to an auxiliary point at which the gradient is collinear with the gradient at the current point. It is a first-order method (it uses only first derivatives) with a quadratic convergence rate, and it can be applied to functions of high dimension with several local extrema. ColGM can be attributed to the truncated Newton family of methods.

[Figure: collinear gradient vectors and the direction of minimization for a convex function.]

The concept of the method


For a smooth function, in a relatively large vicinity of a given point there exists another point at which the gradients of the function at the two points are collinear vectors. The direction from the given point to this auxiliary point is the direction to the extremum. The vector joining them points towards a maximum or a minimum, depending on the position of the auxiliary point: it can lie in front of or behind the given point relative to the direction to the extremum (see the figure). In what follows, minimization is considered.

The next iterate of ColGM is obtained by step (1): the current point is shifted along this vector by an optimal step multiplier, which is found analytically under the assumption that the function is quadratic along this one-dimensional direction:

(2)

Angle brackets denote the inner product in Euclidean space. If the function is convex in the vicinity of the current point, the step multiplier comes out positive for a front point and negative for a back point; in either case, step (1) is applied.
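The step can be sketched in code. Everything concrete in the sketch below is an assumption made for illustration: the names x, y, grad, and the explicit expression for the step multiplier beta, which is taken here as the minimizer of a one-dimensional quadratic model of the function along s = y - x built from the two directional derivatives; the published formula (2) may be written differently.

```python
import numpy as np

def colgm_step(x, y, grad):
    """One ColGM iteration, a sketch of step (1) with a multiplier in the spirit of (2).

    x    -- current point (1-D NumPy array)
    y    -- point whose gradient is (approximately) collinear with grad(x)
    grad -- callable returning the gradient of the objective at a point
    """
    s = y - x                  # direction from the current point to the collinear-gradient point
    d_x = np.dot(grad(x), s)   # directional derivative along s at x
    d_y = np.dot(grad(y), s)   # directional derivative along s at y
    # One-dimensional quadratic model: phi'(beta) ~ d_x + beta * (d_y - d_x) = 0.
    beta = -d_x / (d_y - d_x)  # may come out negative for a "back" point
    return x + beta * s
```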

For a strictly convex quadratic function, the ColGM step is the Newton step (the step of a second-order method with a quadratic convergence rate), in which the gradient is multiplied by the inverse of the Hessian matrix. Such steps ensure the quadratic convergence rate of ColGM.
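This equivalence can be checked directly. For a quadratic f(x) = ½ xᵀA x − bᵀx (the notation A, b is chosen here, not taken from the source), the collinearity condition ∇f(y) = λ∇f(x) gives A(y − x) = (λ − 1)∇f(x), so the direction y − x is parallel to the Newton direction −A⁻¹∇f(x), and the one-dimensional multiplier of the sketch above turns step (1) into exactly the Newton step. A small numerical check of this claim, reusing the hypothetical colgm_step from the previous sketch:

```python
import numpy as np

# A 2-D strictly convex quadratic f(x) = 0.5 x^T A x - b^T x (illustrative values only).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A @ x - b
x = np.array([5.0, -3.0])
# Any point of the form x + t * A^{-1} grad(x) has a gradient collinear with grad(x).
y = x + 0.1 * np.linalg.solve(A, grad(x))
x_next = colgm_step(x, y, grad)
# The ColGM step lands on the minimizer A^{-1} b, i.e. it coincides with the Newton step.
assert np.allclose(x_next, np.linalg.solve(A, b))
```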

In general, if the function has variable convexity and saddle points are possible, the minimization direction should be checked via the angle between the gradient at the current point and the step vector. If this angle shows that the vector points in the direction of maximization, the step multiplier in (1) should be taken with the opposite sign.

Search for collinear gradients


Collinearity of the gradients is estimated by the residual of their directions, which has the form of a system of equations whose root is the sought collinear-gradient point: (3) where a sign factor allows co-directed and oppositely directed gradients to be rated as equally collinear.
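The exact expression of (3) is not reproduced here; one plausible sketch, zero exactly when the two gradients are collinear, compares unit gradients with a sign factor:

```python
import numpy as np

def collinearity_residual(x, y, grad):
    """Sketch of the residual system (3): difference of normalised gradients.

    The sign factor makes co-directed and oppositely directed gradients
    equivalent, as described in the text; the published form may differ."""
    u_x = grad(x)
    u_y = grad(y)
    u_x = u_x / np.linalg.norm(u_x)
    u_y = u_y / np.linalg.norm(u_y)
    sign = 1.0 if np.dot(u_x, u_y) >= 0.0 else -1.0
    return u_y - sign * u_x    # zero vector when the gradients are collinear
```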

System (3) is solved iteratively (by sub-iterations) with the conjugate gradient method, assuming that the system is linear within the vicinity of the current approximation:

(4)  

where the product of the Hessian matrix and the conjugate direction vector is found by numerical differentiation:

(5)  

where the differentiation step is a small positive number, chosen so that the perturbed argument still differs from the original one in machine arithmetic.
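Formula (5) is a forward-difference Hessian-vector product, a standard device in truncated Newton methods. A minimal sketch (the function name and the default value of delta are choices made here):

```python
import numpy as np

def hessian_vector_product(grad, x, v, delta=1e-6):
    """Approximate H(x) @ v by a forward difference of the gradient, as in (5):
    H v ~ (grad(x + delta * v) - grad(x)) / delta.
    delta must be small, yet large enough that x + delta * v still differs
    from x in floating-point arithmetic."""
    return (grad(x + delta * v) - grad(x)) / delta
```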

The initial approximation is set at 45° to all coordinate axes, at a distance equal to the vicinity radius:

(6)  
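A direct reading of (6), with r denoting the vicinity radius (a symbol chosen here), offsets the current point by a vector with equal components, one way to realise "the same angle to all coordinate axes", normalised to length r:

```python
import numpy as np

def initial_collinear_guess(x, r):
    """Sketch of (6): initial approximation for the collinear-gradient search,
    offset from x along the all-ones direction by the vicinity radius r."""
    n = x.size
    return x + (r / np.sqrt(n)) * np.ones(n)
```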

The initial radius defines the vicinity of the current point and is subsequently modified:

(7)  

The radius is required to stay above a small positive number, which is noticeably larger than the machine epsilon.

The sub-iterations terminate when at least one of the following conditions is met:

  1. the required accuracy of the residual has been achieved;
  2. convergence has stopped;
  3. the allowed number of sub-iterations has been exhausted.

Algorithm for choosing the minimization direction

  • Parameters:  .
  • Input data:  .
  1.  . If   then set   from (7).
  2. Find   from (6).
  3. Calculate  . Find   from (3) for  .
  4. If   or   or   or {  and  } then set  , return  ,  , stop.
  5. If   then set   else  .
  6. Find  .
  7. Searching for  :
    1. Memorize  ,  ,  ,  ,  ;
    2. Find  . Calculate   and  . Find   from (5) and assign  ;
    3. If   then   and return to step 7.2;
    4. Restore  ,  ,  ,  ,  ;
    5. Set  .
  8. Perform sub-iteration   from (4).
  9. Increase the sub-iteration counter and go to step 3.

For functions without saddle points, one set of parameter values is recommended; to "bypass" saddle points, a different set is recommended.

The described algorithm allows collinear gradients to be found approximately from the system of equations (3). The resulting direction for the ColGM step (1) is an approximate Newton direction, as in truncated Newton methods.

Demonstrations


In all the demonstrations, ColGM shows convergence no worse than, and sometimes (for functions of variable convexity) better than, Newton's method.

The "rotated ellipsoid" test function


A strictly convex quadratic function is minimized.

[Figure: ColGM minimization.]

In the drawing, three black starting points are set. The gray dots are sub-iterations of the collinear-point search (shown as a dotted line, inflated for demonstration). From every starting point, one iteration and no more than two sub-iterations were required.

In a higher-dimensional test, ColGM reached the minimum with an accuracy of 1% in 3 iterations and 754 evaluations of the function and its gradient. Other first-order methods: the quasi-Newton BFGS method (which works with matrices) required 66 iterations and 788 evaluations; the conjugate gradient method (Fletcher-Reeves), 274 iterations and 2236 evaluations; Newton's finite-difference method, 1 iteration and 1001 evaluations. The second-order Newton's method took 1 iteration.

As the dimension increases, computational errors in the implementation of the collinearity condition (3) may grow markedly. Because of this, in the considered example ColGM required more than one iteration, unlike Newton's method.

[Figure: ColGM minimization, 3 iterations and 16 evaluations of the function and its gradient.]

The Rosenbrock test function


The parameters are the same as before, with one exception. The descent trajectory of ColGM completely coincides with that of Newton's method. In the drawing, there is a blue starting point and a red one. Unit vectors of the gradient are drawn at each point of the trajectory.

[Figure: ColGM minimization, 7 iterations and 22 evaluations of the function and its gradient.]
[Figure: minimization by Newton's method, 9 iterations.]
[Figure: minimization by the conjugate gradient method (Fletcher-Reeves), 9 iterations and 62 evaluations of the function and its gradient.]
[Figure: minimization by quasi-Newton BFGS, 6 iterations and 55 evaluations of the function and its gradient; the red line (where the curvature condition is violated) corresponds to the steepest descent method.]

ColGM is very economical in terms of the number of function and gradient evaluations. Owing to formula (2), it does not require expensive computation of the step multiplier by a line search (for example, the golden-section search).

References

  1. Tolstykh, V. K. Collinear Gradients Method for Minimizing Smooth Functions. Operations Research Forum, vol. 4, no. 20, 2023. doi:10.1007/s43069-023-00193-9