Talk:Lanczos algorithm

{{WikiProject banner shell|class=C|
{{WikiProject Mathematics|priority=Low }}
}}
== Old comments that did not have a section of their own ==
[http://en.wikipedia.org/wiki/Lanczos_algorithm] – an unfocused variety of Lanczos algorithm
*'''Seconded''' — the article spends a lot of ink on the Lanczos iteration (but could do a better job at explaining it) for producing a tridiagonal matrix, says various other algorithms can be used for calculating eigenvalues and eigenvectors of that tridiagonal matrix, but is almost silent on how the two are related. As far as I can tell, early steps of the iteration tend to put most of the weight on the extreme eigenvalues (largest and smallest both, ''regardless of their absolute values''), meaning those are fairly accurately reproduced in the tridiagonal matrix, and the algorithm proceeds towards less extreme eigenvalues the longer it is run; it's 'tend' because the initial weight distribution depends on the initial vector, which is chosen at random. What is not clear from mere thought experiments is how concentrated the distribution is … [[Special:Contributions/130.243.68.202|130.243.68.202]] ([[User talk:130.243.68.202|talk]]) 14:22, 2 May 2017 (UTC)
 
:Thinking some more about this, I find it desirable to modify the way the algorithm is stated — partly to address the case <math>\beta_{j+1}=0</math>, and partly to do something about the "changes to" remark at the end, which is confusing in that no variable assigned in the algorithm is ever changed. It turns out the remark is boilerplate text in the {{tl|algorithm-end}} template, and there is no option to omit it. Since [[Wikipedia:WikiProject_Computer_science/Manual_of_style#Algorithms_and_data_structures]] does not recommend using that template, and instead recommends using (numbered) lists with explicit '''input''' and '''output''' headings, a rewrite from scratch seems in order.
=== Rewrite of algorithm ===
:Said rewrite is now live in the article. It also addresses the gap of where to go when you've got the tridiagonal matrix. A remaining gap is the result that the eigenvalues of <math>T</math> approximate the eigenvalues of <math>A</math>. [[Special:Contributions/130.243.68.122|130.243.68.122]] ([[User talk:130.243.68.122|talk]]) 15:48, 26 May 2017 (UTC)
 
Following up on the previous comment, the section "Application to eigenproblem" misleadingly implies that the eigenvalues of the <math>m \times m</math> matrix <math>T</math> are exactly the eigenvalues of <math>A</math>, which is obviously false. (Indeed, the equation in parentheses at the end of the first paragraph is only valid if <math>m = n</math>.) The point is that these eigenvalues approximate the eigenvalues of <math>A</math>; more work needs to be done to understand why, and how good the approximation is.
:'''Input''' a [[Hermitian matrix]] <math>A</math> of size <math>n \times n</math>, and optionally a number of iterations <math>m</math> (as default let <math>m=n</math>)
:* Strictly speaking, the algorithm does not need access to the explicit matrix, but only a function <math>v \mapsto A v</math> that computes the product of the matrix with an arbitrary vector. This function is called at most <math>m</math> times.
:'''Output''' an <math>n \times m</math> matrix <math>V</math> with orthonormal columns and a tridiagonal real symmetric matrix <math>T = V^* A V</math> of size <math>m \times m</math>. If <math>m=n</math> then <math>V</math> is [[unitary matrix|unitary]] and <math> A = V T V^* </math>.
:'''Warning''' The Lanczos iteration is prone to numerical instability. When executed in non-exact arithmetic, additional measures (as outlined further down) should be taken to ensure validity of the results.
:# Let <math>v_1 \in \mathbb{C}^n</math> be an arbitrary vector with [[Euclidean norm]] <math>1</math>.
:# Abbreviated initial iteration step:
:## Let <math> w_1' = A v_1 </math>.
:## Let <math> \alpha_1 = w_1'^* v_1 </math>.
:## Let <math> w_1 = w_1' - \alpha_1 v_1 </math>.
:# For <math> j=2,\dots,m </math> do:
:## Let <math> \beta_j = \left\| w_{j-1} \right\| </math> (also [[Euclidean norm]]).
:## If <math> \beta_j \neq 0 </math> then let <math> v_j = w_{j-1} / \beta_j </math>,
:##: else pick as <math>v_j</math> an arbitrary vector with Euclidean norm <math>1</math> that is orthogonal to all of <math> v_1,\dots,v_{j-1} </math>.
:## Let <math> w_j' = A v_j </math>.
:## Let <math> \alpha_j = w_j'^* v_j </math>.
:## Let <math> w_j = w_j' - \alpha_j v_j - \beta_j v_{j-1} </math>.
:# Let <math>V</math> be the matrix with columns <math> v_1,\dots,v_m </math>. Let <math>T = \begin{pmatrix}
\alpha_1 & \beta_2 & & & & 0 \\
\beta_2 & \alpha_2 & \beta_3 & & & \\
& \beta_3 & \alpha_3 & \ddots & & \\
& & \ddots & \ddots & \beta_{m-1} & \\
& & & \beta_{m-1} & \alpha_{m-1} & \beta_m \\
0 & & & & \beta_m & \alpha_m \\
\end{pmatrix}</math>.
:'''Note''' <math> A v_j = w_j' = \beta_{j+1} v_{j+1} + \alpha_j v_j + \beta_j v_{j-1} </math> for <math> 1 < j < m </math>.
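
The steps above translate directly into code. The following is a minimal NumPy sketch of the iteration as stated (the function name <code>lanczos</code>, the complex working dtype, the random starting vector, and the convention of returning the two diagonals of <math>T</math> rather than the full matrix are choices made here, not anything fixed by the statement); it takes the matrix only through a matvec function <math>v \mapsto A v</math>, per the remark under '''Input''':

<syntaxhighlight lang="python">
import numpy as np

def lanczos(matvec, n, m=None, rng=None):
    """Lanczos iteration as stated above.

    matvec -- function v -> A @ v for a Hermitian n-by-n matrix A
    Returns V (n-by-m, orthonormal columns) and the diagonals
    (alpha, beta) of the real symmetric tridiagonal T = V* A V.
    """
    if m is None:
        m = n
    if rng is None:
        rng = np.random.default_rng()
    V = np.zeros((n, m), dtype=complex)
    alpha = np.zeros(m)            # diagonal of T: alpha_1 .. alpha_m
    beta = np.zeros(m)             # beta[j] holds beta_{j+1} in 1-based notation
    # Abbreviated initial iteration step
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v /= np.linalg.norm(v)         # arbitrary vector with Euclidean norm 1
    V[:, 0] = v
    w = matvec(v)                  # w'_1 = A v_1 (same storage reused for w' and w)
    alpha[0] = np.vdot(w, v).real  # alpha_1 = w'_1^* v_1, real since A is Hermitian
    w = w - alpha[0] * v           # w_1 = w'_1 - alpha_1 v_1
    for j in range(1, m):          # j = 2, ..., m in the 1-based statement
        b = np.linalg.norm(w)      # beta_j = ||w_{j-1}||
        if b != 0:
            v = w / b
        else:                      # unit vector orthogonal to v_1, ..., v_{j-1}
            v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
            v -= V[:, :j] @ (V[:, :j].conj().T @ v)
            v /= np.linalg.norm(v)
        V[:, j] = v
        beta[j] = b
        w = matvec(v)
        alpha[j] = np.vdot(w, v).real
        w = w - alpha[j] * v - b * V[:, j - 1]
    return V, (alpha, beta[1:])    # beta[1:] = (beta_2, ..., beta_m)
</syntaxhighlight>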
 
Not counting the matrix–vector multiplication, each iteration does <math>O(n)</math> arithmetical operations. If <math>d</math> is the average number of nonzero elements in a row of <math>A</math>, then the matrix–vector multiplication can be done in <math>O(dn)</math> arithmetical operations. The total complexity is thus <math>O(dmn)</math>, or <math>O(dn^2)</math> if <math>m=n</math>; the Lanczos algorithm can be very fast for sparse matrices. Schemes for improving numerical stability are typically judged against this high performance.
 
The vector <math> w_j' </math> is not used after <math> w_j </math> is computed, and the vector <math> w_j </math> is not used after <math> v_{j+1} </math> is computed. Hence one may use the same storage for all three. Likewise, if only the tridiagonal matrix <math>T</math> is sought, then the raw iteration does not need <math> v_{j-1} </math> after having computed <math> w_j </math>, although some schemes for improving the numerical stability would need it later on.
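
To make the sparsity point concrete, a hypothetical usage example of the <code>lanczos</code> sketch above: with <math>A</math> stored in a compressed sparse format, the matvec passed in costs <math>O(dn)</math>, and only <math>m</math> such products are taken.

<syntaxhighlight lang="python">
import scipy.sparse as sp

S = sp.random(10_000, 10_000, density=1e-3, format="csr", random_state=0)
A = S + S.T                       # real symmetric, hence Hermitian
V, (alpha, beta) = lanczos(A.dot, A.shape[0], m=50)  # only 50 sparse matvecs
</syntaxhighlight>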
[[Special:Contributions/130.243.68.122|130.243.68.122]] ([[User talk:130.243.68.122|talk]]) 12:39, 24 May 2017 (UTC)
 
=== Application to the eigenproblem ===
The Lanczos algorithm is most often brought up in the context of finding the [[eigenvalue]]s and [[eigenvector]]s of a matrix, but whereas an ordinary [[Matrix diagonalization|diagonalization of a matrix]] would make eigenvectors and eigenvalues apparent from inspection, the same is not true for the tridiagonalization performed by the Lanczos algorithm; nontrivial additional steps are needed to compute even a single eigenvalue or eigenvector. Nonetheless, applying the Lanczos algorithm is often a significant step forward in computing the eigendecomposition.
First observe that when <math>T</math> is <math>n \times n</math>, it is [[Matrix similarity|similar]] to <math>A</math>: if <math>\lambda</math> is an eigenvalue of <math>T</math> then it is also an eigenvalue of <math>A</math>, and if <math> T x = \lambda x </math> (<math>x</math> is an eigenvector of <math>T</math>) then <math> y = V x </math> is the corresponding eigenvector of <math>A</math> (since <math> A y = A V x = V T V^* V x = V T I x = V T x = V (\lambda x) = \lambda V x = \lambda y</math>). Thus the Lanczos algorithm transforms the eigendecomposition problem for <math>A</math> into the eigendecomposition problem for <math>T</math>.
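Concretely, a sketch of the mapping <math>y = V x</math>, assuming the hypothetical <code>lanczos</code> helper sketched in the algorithm section above and SciPy's solver for real symmetric tridiagonal eigenproblems:

<syntaxhighlight lang="python">
from scipy.linalg import eigh_tridiagonal

V, (alpha, beta) = lanczos(A.dot, A.shape[0], m=50)
theta, X = eigh_tridiagonal(alpha, beta)  # eigenvalues and eigenvectors of T
Y = V @ X                                 # columns y = V x approximate eigenvectors of A
</syntaxhighlight>

For <math>m = n</math> the pairs <math>(\theta, y)</math> are exact eigenpairs by the similarity argument above; for <math>m < n</math> they are the approximations discussed below.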
# For tridiagonal matrices, there exist a number of specialised algorithms, often with better computational complexity than general-purpose algorithms. For example if <math>T</math> is an <math>m \times m</math> tridiagonal symmetric matrix then:
:#* The [[Tridiagonal matrix#Determinant|continuant recursion]] allows computing the [[characteristic polynomial]] in <math>O(m^2)</math> operations, and evaluating it at a point in <math>O(m)</math> operations (a sketch follows this list).
#* The [[divide-and-conquer eigenvalue algorithm]] can be used to compute the entire eigendecomposition of <math>T</math> in <math>O(m^2)</math> operations.
#* The Fast Multipole Method<ref>{{cite journal|last1=Coakley|first1=Ed S.|last2=Rokhlin|first2=Vladimir|title=A fast divide-and-conquer algorithm for computing the spectra of real symmetric tridiagonal matrices|journal=Applied and Computational Harmonic Analysis|date=2013|volume=34|pages=379–414|doi=10.1016/j.acha.2012.06.003}}</ref> can compute all eigenvalues in just <math>O(m \log m)</math> operations.
# Some general eigendecomposition algorithms, notably the [[QR algorithm]], are known to converge faster for tridiagonal matrices than for general matrices. Asymptotic complexity is <math>O(m^2)</math> just as for the divide-and-conquer algorithm (though the constant factor may be different); since the eigenvectors together have <math>m^2</math> elements, this is asymptotically optimal.
# Even algorithms whose convergence rates are unaffected by unitary transformations, such as the [[power method]] and [[inverse iteration]], may enjoy low-level performance benefits from being applied to the tridiagonal matrix <math>T</math> rather than the original matrix <math>A</math>. Since <math>T</math> is very sparse with all nonzero elements in highly predictable positions, it permits compact storage with excellent performance vis-à-vis [[Cache (computing)|caching]]. Likewise, <math>T</math> is a [[real number|real]] matrix with all eigenvectors and eigenvalues real, whereas <math>A</math> in general may have complex elements and eigenvectors, so real arithmetic is sufficient for finding the eigenvectors and eigenvalues of <math>T</math>.
# If <math>n</math> is very large, then reducing <math>m</math> so that <math>T</math> is of a manageable size will still allow finding the more extreme eigenvalues and eigenvectors of <math>A</math>; in the <math>m \ll n</math> region, the Lanczos algorithm can be viewed as a [[lossy compression]] scheme for Hermitian matrices, that emphasises preserving the extreme eigenvalues.
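As promised in the first bullet, a minimal sketch of the continuant recursion (standard for symmetric tridiagonal matrices; the function name is a choice made here, not from the article): the determinants <math>p_j(\lambda)</math> of the leading <math>j \times j</math> principal minors of <math>T - \lambda I</math> satisfy <math>p_0 = 1</math>, <math>p_1 = \alpha_1 - \lambda</math>, and <math>p_j = (\alpha_j - \lambda)\, p_{j-1} - \beta_j^2\, p_{j-2}</math>, so evaluating <math>\det(T - \lambda I) = p_m(\lambda)</math> takes <math>O(m)</math> operations.

<syntaxhighlight lang="python">
def charpoly_at(alpha, beta, lam):
    """Evaluate det(T - lam*I) for a symmetric tridiagonal T.

    alpha -- diagonal (alpha_1 .. alpha_m)
    beta  -- off-diagonal (beta_2 .. beta_m)
    """
    p_prev, p = 1.0, alpha[0] - lam
    for a, b in zip(alpha[1:], beta):
        p_prev, p = p, (a - lam) * p - b**2 * p_prev
    return p
</syntaxhighlight>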
The combination of good performance for sparse matrices and the ability to compute several (without computing all) eigenvalues are the main reasons for choosing to use the Lanczos algorithm.
 
=== Application to tridiagonalization ===
Though the eigenproblem is often the motivation for applying the Lanczos algorithm, the operation the algorithm primarily performs is tridiagonalization of a matrix. For that task, the numerically stable [[Householder transformation]]s have been favoured ever since the 1950s, and during the 1960s the Lanczos algorithm was disregarded. Interest in it was rejuvenated by the Kaniel–Paige convergence theory and the development of methods to prevent numerical instability, but the Lanczos algorithm remains the alternative algorithm that one tries only if Householder is not satisfactory.<ref>{{cite book|last1=Golub|first1=Gene H.|last2=Van Loan|first2=Charles F.|title=Matrix computations|date=1996|publisher=Johns Hopkins Univ. Press|___location=Baltimore|isbn=0-8018-5413-X|edition=3rd}}</ref>
 
Aspects in which the two algorithms differ include:
* Lanczos takes advantage of <math>A</math> being a sparse matrix, whereas Householder does not, and will generate [[Sparse matrix#Reducing fill-in|fill-in]].
* Lanczos works throughout with the original matrix <math>A</math> (and has no problem with it being known only implicitly), whereas raw Householder rather wants to modify the matrix during the computation (although that can be avoided).
* Each iteration of the Lanczos algorithm produces another column of the final transformation matrix <math>V</math>, whereas an iteration of Householder rather produces another factor in a unitary factorisation <math> Q_1 Q_2 \dots Q_n</math> of <math>V</math>. Each factor is however determined by a single vector, so the storage requirements are the same for both algorithms, and <math>V = Q_1 Q_2 \dots Q_n</math> can be computed in <math>O(n^3)</math> time.
* Householder is numerically stable, whereas raw Lanczos is not.
[[Special:Contributions/130.243.68.122|130.243.68.122]] ([[User talk:130.243.68.122|talk]]) 14:46, 26 May 2017 (UTC)
 
== Define variables ==
 
: It's not stated explicitly at that point, but presumably <math>A</math> is already taken to be Hermitian (as it needs to be for the Lanczos algorithm to work), which means it ''has'' an eigendecomposition of the form stated. Using the singular value decomposition in this argument instead won't work, because the entire point is that <math>U^* U = I</math>, so that the product telescopes! Possibly it would be clearer to just use <math>A = U \operatorname{diag}(\sigma_i) U^{-1}</math>, i.e., hold off on requiring orthogonality — the reason being that the paragraph in question is about the plain power method, which applies in greater generality. [[Special:Contributions/130.243.68.202|130.243.68.202]] ([[User talk:130.243.68.202|talk]]) 13:01, 2 May 2017 (UTC)
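: For concreteness, the telescoping referred to is the standard identity
:: <math> A^k = \left(U \operatorname{diag}(\sigma_i)\, U^{-1}\right)^k = U \operatorname{diag}(\sigma_i) \left(U^{-1} U\right) \operatorname{diag}(\sigma_i) \left(U^{-1} U\right) \cdots U^{-1} = U \operatorname{diag}\!\left(\sigma_i^k\right) U^{-1}, </math>
: which is what makes the power method argument go through, and which holds for any invertible <math>U</math>; orthogonality is not needed at this step.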
 
== External links modified ==
 
Hello fellow Wikipedians,
 
I have just modified one external link on [[Lanczos algorithm]]. Please take a moment to review [[special:diff/815699167|my edit]]. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit [[User:Cyberpower678/FaQs#InternetArchiveBot|this simple FaQ]] for additional information. I made the following changes:
*Added archive https://web.archive.org/web/20110314171151/http://www.graphlab.ml.cmu.edu/pmf.html to http://www.graphlab.ml.cmu.edu/pmf.html
 
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
 
{{sourcecheck|checked=false|needhelp=}}
 
Cheers.—[[User:InternetArchiveBot|'''<span style="color:darkgrey;font-family:monospace">InternetArchiveBot</span>''']] <span style="color:green;font-family:Rockwell">([[User talk:InternetArchiveBot|Report bug]])</span> 14:26, 16 December 2017 (UTC)