Biconjugate gradient stabilized method: Difference between revisions

Content deleted Content added
top: could use with some additional material that is not simply mathematical derivations, such as: what is it used for, what's the historical context, etc.
m Reverted edit by 186.177.184.151 (talk) to last version by Weberjoh
 
(39 intermediate revisions by 27 users not shown)
Line 1:
{{Short description|Concept in mathematics}}
{{Technical|date=May 2015}}
 
In [[numerical linear algebra]], the '''biconjugate gradient stabilized method''', often abbreviated as '''BiCGSTAB''', is an [[iterative method]] developed by [[Henk van der Vorst|H. A. van der Vorst]] for the numerical solution of nonsymmetric [[System of linear equations|linear system]]s. It is a variant of the [[biconjugate gradient method]] (BiCG) and has faster and smoother convergence than the original BiCG as well as other variants such as the [[conjugate gradient squared method]] (CGS). It is a [[Krylov subspace]] method. Unlike the original BiCG method, it doesn't require multiplication by the transpose of the system matrix.
 
==Algorithmic steps==
===Unpreconditioned BiCGSTAB===
In the following sections, {{math|1=('''<var>x</var>''','''<var>y</var>''') = '''<var>x</var>'''<sup>T</sup> '''<var>y</var>'''}} denotes the [[dot product]] of vectors. To solve a linear system {{math|'''<var>Ax</var>''' {{=}} '''<var>b</var>'''}}, BiCGSTAB starts with an initial guess {{math|'''<var>x</var>'''<sub>0</sub>}} and proceeds as follows:
 
# {{math|'''<var>r</var>'''<sub>0</sub> {{=}} '''<var>b</var>''' − '''<var>Ax</var>'''<sub>0</sub>}}
# Choose an arbitrary vector {{math|'''<var>r̂</var>'''<sub>0</sub>}} such that {{math|('''<var>r̂</var>'''<sub>0</sub>, '''<var>r</var>'''<sub>0</sub>) ≠ 0}}, e.g., {{math|'''<var>r̂</var>'''<sub>0</sub> {{=}} '''<var>r</var>'''<sub>0</sub>}}
# {{math|<var>ρ</var><sub>0</sub> {{=}} ('''<var>α</var>'''<sub>0</sub>, {{=}} '''<var>ωr</var>'''<sub>0</sub>) {{=}} 1}}
# {{math|'''<var>vp</var>'''<sub>0</sub> {{=}} '''<var>pr</var>'''<sub>0</sub> {{=}} '''0'''}}
# For {{math|<var>i</var> {{=}} 1, 2, 3, …}}
## {{math|'''<var>ρ<sub>i</sub>v</var>''' {{=}} ('''<var>Ap</var>'''<sub>0</sub>, '''<var>ri</var>'''<sub><var>i<var>−1</sub>)}}
## {{math|<var>βα</var> {{=}} (<var>ρ<sub>i</sub></var>/<var>ρ</var><sub><var>i</var>−1</sub>)/('''<var>α</var>/'''<varsub>ω0</var><sub>, <var>i'''v'''</var>−1</sub>)}}
## {{math|<var>'''ph'''<sub>i</sub></var> {{=}} '''<var>rx</var>'''<sub><var>i</var>−1</sub> + <var>β</var>(α'''<var>p'''</var>'''<sub><var>i</var>−1</sub> }}
## {{math|<var>ω'''s'''</var><sub> {{=}} <var>i<var>−1</sub>'''<var>vr'''</var>'''<sub><var>i</var>−1</sub>) − <var>α'''v'''</var>}}
## If {{math|<var>'''vh'''<sub>i</sub></var> {{=}} is accurate enough, i.e., if <var>'''s'''</var>Ap is small enough, then set {{math|</var>'''x'''<sub>i<var/sub>i</var> {{=}} <var>'''h'''</subvar>}} and quit
## {{math|<var>α</var> {{=}} <var>ρ<sub>i</sub></var>/('''<var>r̂</var>'''<sub>0</sub>, <var>'''v'''<sub>i</sub></var>)}}
## {{math|'''<var>s</var>''' {{=}} '''<var>r</var>'''<sub><var>i<var>−1</sub> − <var>α'''v'''<sub>i</sub></var>}}
## {{math|'''<var>t</var>''' {{=}} '''<var>As</var>'''}}
## {{math|<var>ω<sub>i</sub></var> {{=}} (<var>'''t'''</var>t, </var>''', s'''</var>s<)/(<var>''')/(t'''</var>t, </var>''', t'''<var>t</var>''')}}
## {{math|<var>'''x'''<sub>i</sub></var> {{=}} '''<var>x</var>'''<sub><var>i<var>−1</sub> + <var>α'''ph'''<sub>i</sub></var> + <var>ω<sub>i</sub>'''s'''</var>}}
## If {{math|<var>'''xr'''<sub>i</sub></var> {{=}} is<var>'''s'''</var> accurate enough then quit<var>ω'''t'''</var>}}
## If {{math|<var>'''rx'''<sub>i</sub></var>}} is accurate enough, i.e., if {{=}} '''math|<var>s</var>''' − <var>ωr'''<sub>i</sub>'''t'''</var>}} is small enough, then quit
## {{math|<var>ρ<sub>i</sub></var> {{=}} ('''<var>r̂</var>'''<sub>0</sub>, '''<var>r</var>'''<sub><var>i</var></sub>)}}
## {{math|<var>αβ</var> {{=}} (<var>ρ<sub>i</sub></var>/('''<var>ρ</var>'''<sub>0</subvar>, i</var>'''v'''−1</sub>i)(<var>α</subvar>/<var>ω</var>)}}
## {{math|<var>'''p'''<sub>i</sub></var> {{=}} '''<var>r</var>'''<sub><var>i</var></sub> + <var>β</var>('''<var>p</var>'''<sub><var>i</var>−1</sub> − <var>ω</var>'''<var>v</var>''')}}
In some cases, choosing the vector {{math|'''<var>r̂</var>'''<sub>0</sub>}} randomly improves numerical stability.<ref>{{Cite journal |last=Schoutrop |first=Chris |last2=Boonkkamp |first2=Jan ten Thije |last3=Dijk |first3=Jan van |date=July 2022 |title=Reliability Investigation of BiCGStab and IDR Solvers for the Advection-Diffusion-Reaction Equation |url=https://doi.org/10.4208/cicp.OA-2021-0182 |journal=Communications in Computational Physics |language=en |volume=32 |issue=1 |pages=156–188 |doi=10.4208/cicp.oa-2021-0182 |issn=1815-2406}}</ref>
 
===Preconditioned BiCGSTAB===
Line 29 ⟶ 33:
# {{math|'''<var>r</var>'''<sub>0</sub> {{=}} '''<var>b</var>''' − '''<var>Ax</var>'''<sub>0</sub>}}
# Choose an arbitrary vector {{math|'''<var>r̂</var>'''<sub>0</sub>}} such that {{math|('''<var>r̂</var>'''<sub>0</sub>, '''<var>r</var>'''<sub>0</sub>) ≠ 0}}, e.g., {{math|'''<var>r̂</var>'''<sub>0</sub> {{=}} '''<var>r</var>'''<sub>0</sub>}}
# {{math|<var>ρ</var><sub>0</sub> {{=}} ('''<var>α</var>'''<sub>0</sub>, {{=}} '''<var>ωr</var>'''<sub>0</sub>) {{=}} 1}}
# {{math|'''<var>vp</var>'''<sub>0</sub> {{=}} '''<var>pr</var>'''<sub>0</sub> {{=}} '''0'''}}
# For {{math|<var>i</var> {{=}} 1, 2, 3, …}}
## {{math|'''<var>ρ<sub>i</sub>y</var>''' {{=}} ({{SubSup|'''<var>K</var>'''|2|−1}}{{SubSup|'''<subvar>0K</subvar>, '''|1|−1}}'''<var>rp</var>'''<sub><var>i</var>−1</sub>)}}
## {{math|<var>β'''v'''</var> {{=}} ('''<var>ρ<sub>i</sub>Ay</var>/<var>ρ</var><sub><var>i<var>−1</sub>)(<var>α</var>/<var>ω</var><sub><var>i<var>−1</sub>)'''}}
## {{math|<var>'''p'''<sub>i</sub>α</var> {{=}} '''<var>rρ</var>'''<sub><var>i</var>−1</sub> + <var>β</var>('''<var>p</var>'''<sub><var>i<var>−10</sub>, <var>ω</var><sub><var>i<var>−1</sub>'''<var>v</var>'''<sub></var>i<var>−1</sub>)}}
## {{math|<var>'''h'''<var>y</var>''' {{=}} '''<var>Kx</var>'''<supsub>−1</sup>'''<var>pi</var>'''−1</sub> + <var>iα'''y'''</var></sub> }}
## {{math|<var>'''v'''<sub>i</subvar>s</var>''' {{=}} '''<var>Ayr</var>'''<sub><var>i</var>−1</sub> − <var>α'''v'''</var>}}
## If {{math|<var>α'''h'''</var>}} is accurate enough then {{=}} math|<var>ρ<sub>i</sub></var>/('''<var>r̂</var>x'''<sub>0i</sub>,</var> {{=}} <var>'''vh'''<sub>i</sub></var>)}} and quit
## {{math|'''<var>sz</var>''' {{=}} {{SubSup|'''<var>rK</var>'''<sub>|2|−1}}{{SubSup|'''<var>i<var>−1K</sub> − <var>α'''v|1|−1}}'''<sub>i</subvar>s</var>'''}}
## {{math|'''<var>z</var>''' {{=}} '''<var>K</var>'''<sup>−1</sup>'''<var>s</var>'''}}
## {{math|'''<var>t</var>''' {{=}} '''<var>Az</var>'''}}
## {{math|<var>ω<sub>i</sub></var> {{=}} ({{SubSup|'''<var>K</var>'''|1|−1}}'''<var>t</var>''', {{SubSup|'''<var>K</var>'''|1|−1}}'''<var>s</var>''')/({{SubSup|'''<var>K</var>'''|1|−1}}'''<var>t</var>''', {{SubSup|'''<var>K</var>'''|1|−1}}'''<var>t</var>''')}}
## {{math|<var>'''x'''<sub>i</sub></var> {{=}} '''<var>x</var>'''<sub><var>i<var>−1</sub> + <var>α'''yh'''</var> + <var>ω<sub>i</sub>'''z'''</var>}}
## {{math|<var>'''r'''<sub>i</sub></var> {{=}} '''<var>s</var>''' − <var>ω'''t'''</var>}}
## If {{math|<var>'''x'''<sub>i</sub></var>}} is accurate enough then quit
## {{math|<var>'''r'''ρ<sub>i</sub></var> {{=}} ('''<var>s</var>''' − <var>ω<sub>i0</sub>, '''t<var>r</var>'''<sub><var>i</var></sub>)}}
## {{math|'''<var>sβ</var>''' {{=}} '''(<var>rρ<sub>i</sub></var>/<var>ρ</var>'''<sub><var>i</var>−1</sub>)(<var>α'''v'''<sub/var>i</sub<var>ω</var>)}}
## {{math|<var>'''p'''<sub>i</sub></var> {{=}} '''<var>r</var>'''<sub><var>i</var></sub> + <var>β</var>('''<var>p</var>'''<sub><var>i</var>−1</sub> − <var>ω</var>'''<var>v</var>''')}}
 
This formulation is equivalent to applying unpreconditioned BiCGSTAB to the explicitly preconditioned system
Line 79 ⟶ 85:
 
:{{math|<var>P<sub>i</sub></var>('''<var>A</var>''') {{=}} <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''') − <var>α<sub>i</sub>'''A'''T</var><sub><var>i</var>−1</sub>('''<var>A</var>''')}},
:{{math|<var>T<sub>i</sub></var>('''<var>A</var>''') {{=}} <var>P<sub>i</sub></var>('''<var>A</var>''') + <var>β</var><sub><var>i</var>+1</sub><var>T</var><sub><var>i</var>−1</sub>('''<var>A</var>''')}}.
 
===Derivation of BiCGSTAB from BiCG===
Line 86 ⟶ 92:
:{{math|'''<var>r̃</var>'''<sub><var>i</var></sub> {{=}} <var>Q<sub>i</sub></var>('''<var>A</var>''')<var>P<sub>i</sub></var>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>}}
 
where {{math|<var>Q<sub>i</sub></var>('''<var>A</var>''') {{=}} ('''<var>I</var>''' − <var>ω</var><sub>1</sub>'''<var>A</var>''')('''<var>I</var>''' − <var>ω</var><sub>2</sub>'''<var>A</var>''')⋯('''<var>I</var>''' − <var>ω<sub>i</sub>'''A'''</var>)}} with suitable constants {{math|<var>ω<sub>j</sub></var>}} instead of {{math|<var>'''r'''<sub>i</sub></var> {{=}} <var>P<sub>i</sub></var>('''<var>A</var>''')<var>'''r'''<sub>0</sub></var>}} in the hope that {{math|<var>Q<sub>i</sub></var>('''<var>A</var>''')}} will enable faster and smoother convergence in {{math|<var>'''r̃'''<sub>i</sub></var>}} than {{math|<var>'''r'''<sub>i</sub></var>}}.
 
It follows from the recurrence relations for {{math|<var>P<sub>i</sub></var>('''<var>A</var>''')}} and {{math|<var>T<sub>i</sub></var>('''<var>A</var>''')}} and the definition of {{math|<var>Q<sub>i</sub></var>('''<var>A</var>''')}} that
Line 94 ⟶ 100:
which entails the necessity of a recurrence relation for {{math|<var>Q<sub>i</sub></var>('''<var>A</var>''')<var>T<sub>i</sub></var>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>}}. This can also be derived from the BiCG relations:
 
:{{math|<var>Q<sub>i</sub></var>('''<var>A</var>''')<var>T<sub>i</sub></var>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub> {{=}} <var>Q<sub>i</sub></var>('''<var>A</var>''')<var>P<sub>i</sub></var>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub> + <var>β</var><sub><var>i</var>+1</sub>('''<var>I</var>''' − <var>ω<sub>i</sub>'''A'''</var>)<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>''')<var>PT</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>}}.
 
Similarly to defining {{math|<var>'''r̃'''<sub>i</sub></var>}}, BiCGSTAB defines
Line 128 ⟶ 134:
:{{math|<var>ρ̃</var><sub><var>i</var></sub> {{=}} (<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>) {{=}} ('''<var>r̂</var>'''<sub>0</sub>, <var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>''')<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>) {{=}} ('''<var>r̂</var>'''<sub>0</sub>, '''<var>r</var>'''<sub><var>i</var>−1</sub>)}}.
 
Due to biorthogonality, {{math|'''<var>r</var>'''<sub><var>i</var>−1</sub> {{=}} <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>}} is orthogonal to {{math|<var>U</var><sub><var>i</var>−2</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>}} where {{math|<var>U</var><sub><var>i</var>−2</sub>('''<var>A</var>'''<sup>T</sup>)}} is any polynomial of degree {{math|<var>i</var> − 2}} in {{math|'''<var>A</var>'''<sup>T</sup>}}. Hence, only the highest-order terms of {{math|<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} and {{math|<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} matter in the [[dot productsproduct]]s {{math|(<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>)}} and {{math| (<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>)}}. The leading coefficients of {{math|<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} and {{math|<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} are {{math|(−1)<sup><var>i</var>−1</sup><var>α</var><sub>1</sub><var>α</var><sub>2</sub>⋯<var>α</var><sub><var>i</var>−1</sub>}} and {{math|(−1)<sup><var>i</var>−1</sup><var>ω</var><sub>1</sub><var>ω</var><sub>2</sub>⋯<var>ω</var><sub><var>i</var>−1</sub>}}, respectively. It follows that
 
:{{math|<var>ρ<sub>i</sub></var> {{=}} (<var>α</var><sub>1</sub>/<var>ω</var><sub>1</sub>)(<var>α</var><sub>2</sub>/<var>ω</var><sub>2</sub>)⋯(<var>α</var><sub><var>i</var>−1</sub>/<var>ω</var><sub><var>i</var>−1</sub>)<var>ρ̃</var><sub><var>i</var></sub>}},
Line 140 ⟶ 146:
:{{math|<var>α<sub>i</sub></var> {{=}} <var>ρ<sub>i</sub></var>/('''<var>p̂</var>'''<sub><var>i</var></sub>, <var>'''Ap'''<sub>i</sub></var>) {{=}} (<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>)/(<var>T</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>'''A'''T</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>)}}.
 
Similarly to the case above, only the highest-order terms of {{math|<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} and {{math|<var>T</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} matter in the [[dot productsproduct]]s thanks to biorthogonality and biconjugacy. It happens that {{math|<var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} and {{math|<var>T</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} have the same leading coefficient. Thus, they can be replaced simultaneously with {{math|<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)}} in the formula, which leads to
 
:{{math|<var>α<sub>i</sub></var> {{=}} (<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>P</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>)/(<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>'''<sup>T</sup>)'''<var>r̂</var>'''<sub>0</sub>, <var>'''A'''T</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>) {{=}} <var>ρ̃</var><sub><var>i</var></sub>/('''<var>r̂</var>'''<sub>0</sub>, '''<var>A</var>'''<var>Q</var><sub><var>i</var>−1</sub>('''<var>A</var>''')<var>T</var><sub><var>i</var>−1</sub>('''<var>A</var>''')'''<var>r</var>'''<sub>0</sub>) {{=}} <var>ρ̃</var><sub><var>i</var></sub>/('''<var>r̂</var>'''<sub>0</sub>, '''<var>Ap̃</var>'''<sub><var>i</var></sub>)}}.
Line 153 ⟶ 159:
 
==Generalization==
BiCGSTAB can be viewed as a combination of BiCG and [[GMRES]] where each BiCG step is followed by a GMRES({{math|1}}) (i.e., GMRES restarted at each step) step to repair the irregular convergence behavior of CGS, as an improvement of which BiCGSTAB was developed. However, due to the use of degree-one minimum residual polynomials, such repair may not be effective if the matrix {{math|'''<var>A</var>'''}} has large complex eigenpairs. In such cases, BiCGSTAB is likely to stagnate, as confirmed by numerical experiments.
 
One may expect that higher-degree minimum residual polynomials may better handle this situation. This gives rise to algorithms including BiCGSTAB2{{ref|bicgstab2}} and the more general BiCGSTAB({{math|<var>l</var>}}){{ref|bicgstab(l)}}. In BiCGSTAB({{math|<var>l</var>}}), a GMRES({{math|<var>l</var>}}) step follows every {{math|<var>l</var>}} BiCG steps. BiCGSTAB2 is equivalent to BiCGSTAB({{math|<var>l</var>}}) with {{math|<var>l</var> {{=}} 2}}.
Line 163 ⟶ 169:
 
==References==
{{reflist}}
* {{cite doi|10.1137/0913035}}
* {{Cite journal | doi = 10.1137/0913035 | title = Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems | year = 1992 | last1 = Van der Vorst | first1 = H. A. | journal = [[SIAM Journal on Scientific Computing|SIAM J. Sci. Stat. Comput.]] | volume = 13 | issue = 2 | pages = 631–644 | hdl = 10338.dmlcz/104566 | hdl-access = free }}
* {{cite book
|last = Saad
|first = Y.
|title = Iterative Methods for Sparse Linear Systems
|chapter-url = https://archive.org/details/iterativemethods0000saad
|chapter-url-access = registration
|edition = 2nd
|chapter = §7.4.2 BICGSTAB
|pages = [https://archive.org/details/iterativemethods0000saad/page/231 231–234]
|year = 2003
|publisher = SIAM
|isbn = 978-0-89871-534-7
|doi = 10.2277/0898715342
}}
* {{note|bicgstab2}}{{Cite journal| doi = 10.1137/0914062| title = Variants of BICGSTAB for Matrices with Complex Spectrum| year = 1993| last1 = Gutknecht | first1 = M. H.| journal = [[SIAM Journal on Scientific Computing|SIAM J. Sci. Comput.]]| volume = 14| issue = 5| pages = 1020–1033 }}
* {{note|bicgstab2}}{{cite doi|10.1137/0914062}}
* {{note|bicgstab(l)}}{{cite journal
| last1 = Sleijpen
| first1 = G. L. G.
| last2 = Fokkema
| first2 = D. R.
| date = November 1993
| title = BiCGstab(''l'') for linear equations involving unsymmetric matrices with complex spectrum
| journal = [[Electronic Transactions on Numerical Analysis]]
| volume = 1
| pages = 11–32
Line 188 ⟶ 198:
| ___location = Kent, OH
| issn = 1068-9613
| url = http://www.emis.ams.orgde/journals/ETNA/vol.1.1993/pp11-32.dir/pp11-32.pdf
| format = PDF
}}