Rayleigh–Ritz method

The '''Rayleigh–Ritz method''' is a direct numerical method for approximating [[eigenvalues and eigenvectors|eigenvalues]], originally developed in the context of solving physical [[Boundary value problem|boundary value problems]] and named after [[Lord Rayleigh]] and [[Walther Ritz]].
 
The name '''Rayleigh–Ritz''' is debated<ref name="Leissa">{{cite journal|last1=Leissa|first1=A.W.|title=The historical bases of the Rayleigh and Ritz methods|journal=Journal of Sound and Vibration|volume=287|issue=4-5|year=2005|pages=961–978|doi=10.1016/j.jsv.2004.12.021|bibcode=2005JSV...287..961L|url=https://www.sciencedirect.com/science/article/abs/pii/S0022460X05000362}}</ref><ref name="Ilanko">{{cite journal|last1=Ilanko|first1=Sinniah|title=Comments on the historical bases of the Rayleigh and Ritz methods|journal=Journal of Sound and Vibration|volume=319|issue=1-2|year=2009|pages=731–733|doi=10.1016/j.jsv.2008.06.001}}</ref> against the plain [[Ritz method]] name honoring [[Walther Ritz]] alone, since the numerical procedure was published by Ritz in 1908–1909. According to Leissa,<ref name="Leissa" /> [[Lord Rayleigh]] wrote a paper in 1911 congratulating Ritz on his work, but stating that he himself had used Ritz's method in many places in his book and in another publication. This statement, although later disputed, together with the fact that the method in the trivial case of a single vector reduces to the [[Rayleigh quotient]], makes the arguable misnomer persist. According to Ilanko,<ref name="Ilanko"/> citing [[Richard Courant]], both Lord Rayleigh and Walther Ritz independently conceived the idea of utilizing the equivalence between [[Boundary value problem|boundary value problems]] of [[partial differential equations]] on the one hand and problems of the [[calculus of variations]] on the other for the numerical calculation of solutions, by substituting for the variational problems simpler approximating extremum problems in which a finite number of parameters need to be determined; see the article [[Ritz method]] for details.
Ironically for the debate, the modern justification of the algorithm drops the [[calculus of variations]] in favor of the simpler and more general approach of [[orthogonal projection]], as in the [[Galerkin method]] named after [[Boris Galerkin]], leading also to the name [[Ritz-Galerkin method]].
 
It is used in all applications that involve approximating [[eigenvalues and eigenvectors]], often under different names. In [[quantum mechanics]], where a system of particles is described using a [[Hamiltonian (quantum mechanics)|Hamiltonian]], the [[Ritz method]] uses [[ansatz|trial wave functions]] to approximate the ground state eigenfunction with the lowest energy. In the [[finite element method]] context, mathematically the same algorithm is commonly called the [[Ritz-Galerkin method]]. The Rayleigh–Ritz method or [[Ritz method]] terminology is typical in mechanical and structural engineering to approximate the [[Normal mode|eigenmodes]] and [[Resonance|resonant frequencies]] of a structure.
 
== For matrix eigenvalue problems ==
In numerical linear algebra, the '''Rayleigh–Ritz method''' is commonly<ref name="TrefethenIII1997">{{cite book| last1=Trefethen| first1=Lloyd N. | last2= Bau, III|first2=David|title=Numerical Linear Algebra|url=https://books.google.com/books?id=JaPtxOytY7kC| year=1997| publisher=SIAM| isbn=978-0-89871-957-4|page=254}}</ref> applied to approximate an eigenvalue problem
:<math display="block"> A \mathbf{x} = \lambda \mathbf{x}</math>
for the matrix <math> A \in \mathbb{C}^{N \times N}</math> of size <math>N</math>, using a projected matrix of a smaller size <math>m < N</math> generated from a given matrix <math> V \in \mathbb{C}^{N \times m} </math> with [[orthonormal]] columns. The matrix version of the algorithm is the simplest:
 
# Compute the <math> m \times m </math> matrix <math> V^* A V </math>, where <math>V^*</math> denotes the complex-conjugate transpose of <math>V</math>.
# Solve the eigenvalue problem <math> V^* A V \mathbf{y}_i = \mu_i \mathbf{y}_i</math>.
# Compute the Ritz vectors <math>\tilde{\mathbf{x}}_i = V \mathbf{y}_i</math> and the Ritz values <math>\tilde{\lambda}_i=\mu_i</math>.
# Output approximations <math>(\tilde{\lambda}_i,\tilde{\mathbf{x}}_i)</math>, called the Ritz pairs, to [[eigenvalues and eigenvectors]] of the original matrix <math>A</math>.
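As an illustration (not part of the original article), the four steps can be sketched in Python with NumPy for a Hermitian matrix; the helper name <code>rayleigh_ritz</code> is a hypothetical choice:

```python
import numpy as np

def rayleigh_ritz(A, V):
    # Step 1: form the m-by-m projected matrix V* A V.
    B = V.conj().T @ A @ V
    # Step 2: solve the small eigenvalue problem
    # (eigh assumes A, and hence B, is Hermitian).
    mu, Y = np.linalg.eigh(B)
    # Step 3: Ritz vectors; the Ritz values are the eigenvalues mu of B.
    X = V @ Y
    # Step 4: return the Ritz pairs.
    return mu, X

A = np.diag([1.0, 2.0, 3.0])
V = np.eye(3)[:, :2]        # orthonormal basis of a 2-dimensional subspace
mu, X = rayleigh_ritz(A, V)
print(mu)                   # [1. 2.]
```

Here the subspace contains two exact eigenvectors, so the Ritz pairs are exact.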
 
If the subspace with the orthonormal basis given by the columns of the matrix <math> V \in \mathbb{C}^{N \times m} </math> contains <math> k \leq m </math> vectors that are close to eigenvectors of the matrix <math>A</math>, the '''Rayleigh–Ritz method''' above finds <math>k</math> Ritz vectors that well approximate these eigenvectors. The easily computable quantity <math> \| A \tilde{\mathbf{x}}_i - \tilde{\lambda}_i \tilde{\mathbf{x}}_i\|</math> determines the accuracy of such an approximation for every Ritz pair.
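This residual norm is cheap to evaluate in practice; a minimal sketch with assumed sample values (not from the article):

```python
import numpy as np

# Accuracy measure ||A x - lam x|| for a candidate Ritz pair (lam, x);
# the data below are hypothetical sample values.
A = np.diag([1.0, 2.0, 3.0])
x = np.array([1.0, 0.0, 0.0])   # exact eigenvector, so the residual is zero
lam = 1.0
residual = np.linalg.norm(A @ x - lam * x)
print(residual)                 # 0.0
```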
 
In the simplest case <math>m = 1</math>, the <math> N \times m </math> matrix <math>V</math> turns into a unit column-vector <math>v</math>, the <math> m \times m </math> matrix <math> V^* A V </math> is a scalar equal to the [[Rayleigh quotient]] <math>\rho(v) = v^*Av/v^*v</math>, the only <math>i = 1</math> solution to the eigenvalue problem is <math>y_i = 1</math> and <math>\mu_i = \rho(v)</math>, and the only Ritz vector is <math>v</math> itself. Thus, the Rayleigh–Ritz method reduces to computing the [[Rayleigh quotient]] if <math>m = 1</math>.
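A quick numerical check of the <math>m = 1</math> case, with illustrative values chosen here:

```python
import numpy as np

# With m = 1 the projected matrix reduces to the scalar Rayleigh quotient
# rho(v) = (v* A v) / (v* v).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 1.0])
rho = (v @ A @ v) / (v @ v)
print(rho)   # 3.0, an eigenvalue, since v happens to be an eigenvector
```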
 
Another useful connection to the [[Rayleigh quotient]] is that <math>\mu_i = \rho(\tilde{\mathbf{x}}_i)</math> for every Ritz pair <math>(\tilde{\lambda}_i, \tilde{\mathbf{x}}_i)</math>, allowing one to derive properties of the Ritz values <math>\mu_i</math> from the corresponding theory for the [[Rayleigh quotient]]. For example, if <math>A</math> is a [[Hermitian matrix]], its [[Rayleigh quotient]] (and thus every Ritz value) is real and lies within the closed interval between the smallest and largest eigenvalues of <math>A</math>.
 
=== Example ===
The matrix
:<math display="block">A = \begin{bmatrix}
2 & 0 & 0 \\
0 & 2 & 1 \\
0 & 1 & 2
\end{bmatrix}</math>
has eigenvalues <math>1, 2, 3</math> and the corresponding eigenvectors
:<math display="block">\mathbf x_{\lambda=1} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad
\mathbf x_{\lambda=2} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf x_{\lambda=3} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}.</math>
Let us take
:<math display="block">V = \begin{bmatrix}
0 & 0 \\
1 & 0 \\
0 & 1
\end{bmatrix},</math>
then
:<math display="block">V^* A V = \begin{bmatrix}
2 & 1 \\
1 & 2
\end{bmatrix}</math>
with eigenvalues <math>1, 3</math> and the corresponding eigenvectors
:<math display="block">\mathbf y_{\mu=1} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad
\mathbf y_{\mu=3} = \begin{bmatrix} 1 \\ 1 \end{bmatrix},</math>
so that the Ritz values are <math>1, 3</math> and the Ritz vectors are
:<math display="block">\mathbf \tilde{x}_{\tilde{\lambda}=1} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad
\mathbf \tilde{x}_{\tilde{\lambda}=3} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}.</math>
We observe that each of the Ritz vectors is exactly one of the eigenvectors of <math>A</math> for the given <math>V</math>, and the Ritz values give exactly two of the three eigenvalues of <math>A</math>. A mathematical explanation for the exact approximation is that the [[column space]] of the matrix <math>V</math> happens to be exactly the subspace spanned by the two eigenvectors <math>\mathbf x_{\lambda=1}</math> and <math>\mathbf x_{\lambda=3}</math> in this example.
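The worked example can be reproduced numerically, for instance with NumPy (an illustrative check, not part of the original exposition):

```python
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
V = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
B = V.T @ A @ V              # projected matrix [[2, 1], [1, 2]]
mu, Y = np.linalg.eigh(B)    # Ritz values
X = V @ Y                    # Ritz vectors (columns, up to scaling and sign)
print(mu)                    # [1. 3.]
```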
 
== For matrix singular value problems ==
[[Singular value decomposition#Truncated SVD|Truncated singular value decomposition (SVD)]] in numerical linear algebra can also use the '''Rayleigh–Ritz method''' to find approximations to left and right singular vectors of the matrix <math> M \in \mathbb{C}^{M \times N}</math> of size <math>M \times N</math> in given subspaces by turning the singular value problem into an eigenvalue problem.
 
=== Using the normal matrix ===
The definition of the singular value <math>\sigma</math> and the corresponding left and right singular vectors is <math>M v = \sigma u</math> and <math>M^* u = \sigma v</math>. Having found one set (left or right) of approximate singular vectors and singular values by applying the Rayleigh–Ritz method naively to the [[Hermitian matrix|Hermitian]] '''normal matrix''' <math> M^* M \in \mathbb{C}^{N \times N}</math> or <math> M M^* \in \mathbb{C}^{M \times M}</math>, whichever is smaller, one could determine the other set of singular vectors simply by dividing by the singular values, i.e., <math>u = Mv / \sigma</math> and <math>v = M^* u / \sigma</math>. However, the division is unstable or fails for small or zero singular values.
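A sketch of this recovery step, with hypothetical sample data and a right singular pair assumed known:

```python
import numpy as np

# Recover the left singular vector u = M v / sigma from a known right
# singular pair (sigma, v); this is unstable when sigma is small or zero.
M = np.vstack([np.diag([1.0, 2.0]), np.zeros((1, 2))])  # 3-by-2 sample matrix
sigma = 2.0
v = np.array([0.0, 1.0])
u = M @ v / sigma
print(u)   # [0. 1. 0.]
```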
 
An alternative approach, e.g., defining the normal matrix as <math> A = M^* M \in \mathbb{C}^{N \times N}</math> of size <math>N \times N</math>, takes advantage of the fact that for a given <math>N \times m</math> matrix <math> W \in \mathbb{C}^{N \times m} </math> with [[orthonormal]] columns the eigenvalue problem of the Rayleigh–Ritz method for the <math>m \times m</math> matrix
:<math display="block"> W^* A W = W^* M^* M W = (M W)^* M W</math>
can be interpreted as a singular value problem for the <math>M \times m</math> matrix <math>M W</math>. This interpretation allows simple simultaneous calculation of both left and right approximate singular vectors as follows.
 
# Compute the <math> M \times m </math> matrix <math> M W </math>.
# Compute the [[Singular value decomposition#Thin SVD|thin, or economy-sized, SVD]] <math> M W = \mathbf {U} \Sigma \mathbf {V}_h,</math> with the <math>M \times m</math> matrix <math>\mathbf {U}</math>, the <math>m \times m</math> diagonal matrix <math>\Sigma</math>, and the <math>m \times m</math> matrix <math>\mathbf {V}_h</math>.
# Compute the matrices of the Ritz left <math>U = \mathbf {U}</math> and right <math>V_h = \mathbf {V}_h W^*</math> singular vectors.
# Output approximations <math>U, \Sigma, V_h</math>, called the Ritz singular triplets, to selected singular values and the corresponding left and right singular vectors of the original matrix <math>M</math>, representing an approximate [[Singular value decomposition#Truncated SVD|truncated singular value decomposition (SVD)]] with right singular vectors restricted to the column-space of the matrix <math>W</math>.
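The four steps can be sketched in Python with NumPy (illustrative; <code>rayleigh_ritz_svd</code> is a hypothetical helper name):

```python
import numpy as np

def rayleigh_ritz_svd(M, W):
    # Step 1: the product M W.
    MW = M @ W
    # Step 2: thin (economy-sized) SVD of M W.
    U, S, Vh = np.linalg.svd(MW, full_matrices=False)
    # Step 3: Ritz right singular vectors, restricted to the column space of W.
    Vh = Vh @ W.conj().T
    # Step 4: approximate singular triplets of M.
    return U, S, Vh

M = np.vstack([np.diag([1.0, 2.0, 3.0, 4.0]), np.zeros((1, 4))])
W = np.eye(4)[:, :2]   # orthonormal columns spanning two right singular vectors
U, S, Vh = rayleigh_ritz_svd(M, W)
print(S)               # [2. 1.]
```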
 
The algorithm can be used as a post-processing step where the matrix <math>W</math> is an output of an eigenvalue solver, such as [[LOBPCG]], approximating numerically selected eigenvectors of the normal matrix <math> A = M^* M</math>.
 
==== Example ====
The matrix
: <math display="block">M = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 \\
0 & 0 & 3 & 0 \\
0 & 0 & 0 & 4 \\
0 & 0 & 0 & 0
\end{bmatrix}</math>
has its normal matrix
: <math display="block">A = M^* M = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 4 & 0 & 0 \\
0 & 0 & 9 & 0 \\
0 & 0 & 0 & 16 \\
\end{bmatrix},</math>
the singular values <math>1, 2, 3, 4</math>, and the corresponding [[Singular value decomposition#Thin SVD|thin SVD]]
:<math display="block">M =
\begin{bmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
4 & 0 & 0 & 0 \\
0 & 3 & 0 & 0 \\
0 & 0 & 2 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{bmatrix},</math>
where the columns of the first multiplier form the complete set of the left singular vectors of the matrix <math>M</math>, the diagonal entries of the middle term are the singular values, and the columns of the last multiplier transposed (although the transposition does not change it)
:<math display="block">
\begin{bmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{bmatrix}^*
\quad = \quad
\begin{bmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{bmatrix}
</math>
are the right singular vectors of the matrix <math>M</math>.
 
Let us take
:<math display="block">W = \begin{bmatrix}
1/\sqrt{2} & 1/\sqrt{2} \\
1/\sqrt{2} & -1/\sqrt{2} \\
0 & 0 \\
0 & 0
\end{bmatrix}</math>
with the column-space that is spanned by the two exact right singular vectors
:<math display="block">\begin{bmatrix}
1 & 0 \\
0 & 1 \\
0 & 0 \\
0 & 0
\end{bmatrix}</math>
corresponding to the singular values 1 and 2.
 
Following the algorithm step 1, we compute
:<math display="block">MW = \begin{bmatrix}
1/\sqrt{2} & 1/\sqrt{2} \\
\sqrt{2} & -\sqrt{2} \\
0 & 0 \\
0 & 0 \\
0 & 0
\end{bmatrix},</math>
and on step 2 its [[Singular value decomposition#Thin SVD|thin SVD]] <math> M W = \mathbf {U} \Sigma \mathbf {V}_h</math> with
:<math display="block"> \mathbf {U} = \begin{bmatrix}
0 & 1 \\
1 & 0 \\
0 & 0 \\
0 & 0 \\
0 & 0
\end{bmatrix},
\quad
\Sigma = \begin{bmatrix}
2 & 0 \\
0 & 1
\end{bmatrix},
\quad
\mathbf {V}_h = \begin{bmatrix}
1/\sqrt{2} & -1/\sqrt{2} \\
1/\sqrt{2} & 1/\sqrt{2}
\end{bmatrix}.
</math>
Thus we already obtain the singular values 2 and 1 from <math>\Sigma</math> and from <math>\mathbf {U}</math> the corresponding two left singular vectors <math>u</math> as <math>[0, 1, 0, 0, 0]^*</math> and <math>[1, 0, 0, 0, 0]^*</math>. The approximations are exact for the given <math>W</math> because its column-space is spanned by exact right singular vectors of <math>M</math>.
 
Finally, step 3 computes the matrix <math>V_h = \mathbf {V}_h W^*</math>
:<math display="block">
V_h = \begin{bmatrix}
1/\sqrt{2} & -1/\sqrt{2} \\
1/\sqrt{2} & 1/\sqrt{2}
\end{bmatrix}
\,
\begin{bmatrix}
1/\sqrt{2} & 1/\sqrt{2} & 0 & 0 \\
1/\sqrt{2} & -1/\sqrt{2} & 0 & 0
\end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{bmatrix},
</math>
recovering from its rows the two right singular vectors <math>v</math> as <math>[0, 1, 0, 0]^*</math> and <math>[1, 0, 0, 0]^*</math>.
We validate the first vector: <math>M v = \sigma u</math>
: <math display="block">
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 \\
0 & 0 & 3 & 0 \\
0 & 0 & 0 & 4 \\
0 & 0 & 0 & 0
\end{bmatrix}
\,
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
= \, 2 \,
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
</math>
and <math>M^* u = \sigma v</math>
: <math display="block">
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 \\
0 & 0 & 3 & 0 & 0 \\
0 & 0 & 0 & 4 & 0
\end{bmatrix}
\,
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
= \, 2 \,
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}.
</math>
Thus, for the given matrix <math>W</math> with its column-space that is spanned by two exact right singular vectors, we determine these right singular vectors, as well as the corresponding left singular vectors and the singular values, all exactly. For an arbitrary matrix <math>W</math>, we obtain approximate singular triplets which are optimal given <math>W</math> in the sense of optimality of the Rayleigh–Ritz method.
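The whole worked example can be checked numerically, for instance with NumPy (an illustrative verification, not part of the original text):

```python
import numpy as np

M = np.vstack([np.diag([1.0, 2.0, 3.0, 4.0]), np.zeros((1, 4))])
W = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [0.0, 0.0],
              [0.0, 0.0]]) / np.sqrt(2)
U, S, Vh = np.linalg.svd(M @ W, full_matrices=False)  # steps 1 and 2
Vh = Vh @ W.T                                          # step 3
print(S)                     # [2. 1.]
print(np.abs(Vh))            # rows are the right singular vectors, up to sign
```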
 
== Notes and references==
{{Reflist}}
 
==External links==