Rayleigh–Ritz method: Difference between revisions

m The sentence was not helped by the word 'one', when, in English the sentence is clearer by its omission.
 
(17 intermediate revisions by 9 users not shown)
Line 1:
{{Short description|Method for approximating eigenvalues}}
{{Additional citations|date=June 2024}}
 
The '''Rayleigh–Ritz method''' is a direct numerical method of approximating [[eigenvalues and eigenvectors|eigenvalues]], originated in the context of solving physical [[boundary value problem]]s and named after [[Lord Rayleigh]] and [[Walther Ritz]].
 
In this method, an infinite-dimensional [[linear operator]] is approximated by a finite-dimensional [[Dilation (operator theory)|compression]], on which we can use an [[eigenvalue algorithm]].
 
It is used in all applications that involve approximating [[eigenvalues and eigenvectors]], often under different names. In [[quantum mechanics]], where a system of particles is described using a [[Hamiltonian (quantum mechanics)|Hamiltonian]], the Ritz method uses [[ansatz|trial wave functions]] to approximate the ground state eigenfunction with the lowest energy. In the [[finite element method]] context, mathematically the same algorithm is commonly called the [[Ritz-Galerkin method]]. The Rayleigh–Ritz method or Ritz method terminology is typical in mechanical and structural engineering to approximate the [[Normal mode|eigenmodes]] and [[Resonance|resonant frequencies]] of a structure.
 
== Naming and attribution ==
 
The name of the method and its origin story have been debated by historians.<ref name="Leissa">{{cite journal|last1=Leissa|first1=A.W.|title=The historical bases of the Rayleigh and Ritz methods|journal=Journal of Sound and Vibration|volume=287|issue=4–5|year=2005|pages=961–978| doi=10.1016/j.jsv.2004.12.021| bibcode=2005JSV...287..961L| url=https://www.sciencedirect.com/science/article/abs/pii/S0022460X05000362 |url-access=subscription}}</ref><ref name="Ilanko">{{cite journal|last1=Ilanko|first1=Sinniah|title=Comments on the historical bases of the Rayleigh and Ritz methods|journal=Journal of Sound and Vibration|volume=319|issue=1–2|year=2009|pages=731–733 | doi=10.1016/j.jsv.2008.06.001|bibcode=2009JSV...319..731I }}</ref> It has been called the [[Ritz method]] after [[Walther Ritz]], since the numerical procedure was published by Ritz in 1908–1909. According to A. W. Leissa,<ref name="Leissa" /> [[Lord Rayleigh]] wrote a paper congratulating Ritz on his work in 1911, but stating that he himself had used Ritz's method in many places in his book and in another publication. This statement, although later disputed, and the fact that the method in the trivial case of a single vector results in the [[Rayleigh quotient]] make the case for the name ''Rayleigh–Ritz'' method. According to S. Ilanko,<ref name="Ilanko"/> citing [[Richard Courant]], both [[Lord Rayleigh]] and [[Walther Ritz]] independently conceived the idea of utilizing the equivalence between [[boundary value problem]]s of [[partial differential equations]] on the one hand and problems of the [[calculus of variations]] on the other hand for numerical calculation of the solutions, by substituting for the variational problems simpler approximating extremum problems in which a finite number of parameters need to be determined; see the article [[Ritz method]] for details.
Ironically for the debate, the modern justification of the algorithm drops the [[calculus of variations]] in favor of the simpler and more general approach of [[orthogonal projection]] as in the [[Galerkin method]] named after [[Boris Galerkin]], thus leading also to the [[Ritz-Galerkin method]] naming.{{cn|reason=need historian reference here|date=June 2024}}
 
== Method ==
Let <math>T</math> be a [[linear operator]] on a [[Hilbert space]] <math>\mathcal{H}</math>, with [[inner product]] <math>(\cdot, \cdot)</math>. Now consider a [[finite set]] of functions <math>\mathcal{L} = \{\varphi_1, ...,\varphi_n\}</math>. Depending on the application these functions may be:
 
* A subset of the [[orthonormal basis]] of the original operator;<ref name=daviesplum>{{cite journal|last1=Davies|first1=E. B.|last2=Plum|first2=M.|title=Spectral Pollution|journal=IMA Journal of Numerical Analysis|author-link1=E. Brian Davies|year=2003|arxiv=math/0302145 |bibcode=2003math......2145D }}</ref>
* A space of [[Spline (mathematics)|splines]] (as in the [[Galerkin method]]);<ref name=sulimayers>{{cite book|last1=Süli|first1=Endre|author-link1=Endre Süli|last2=Mayers|first2=David|title=An Introduction to Numerical Analysis|publisher=[[Cambridge University Press]]|isbn=0521007941|year=2003}}</ref>
* A set of functions which approximate the [[eigenfunctions]] of the operator.<ref name=levitinshargorodsky>{{cite journal|last1=Levitin|first1=Michael|last2=Shargorodsky|first2=Eugene|title=Spectral pollution and second order relative spectra for self-adjoint operators|journal=IMA Journal of Numerical Analysis|year=2004|volume=24 |issue=3 |pages=393–416 |doi=10.1093/imanum/24.3.393 |arxiv=math/0212087 }}</ref>
 
One could use the orthonormal basis generated from the eigenfunctions of the operator, which will produce [[diagonal matrix|diagonal]] approximating matrices, but in this case we would have already had to calculate the spectrum.
 
We now approximate <math>T</math> by <math>T_{\mathcal{L}}</math>, which is defined as the matrix with entries<ref name=daviesplum />
 
<math display="block">(T_{\mathcal{L}})_{i,j} = (T \varphi_i, \varphi_j)</math>
 
and solve the eigenvalue problem <math>T_{\mathcal{L}}u = \lambda u</math>. It can be shown that the matrix <math>T_{\mathcal{L}}</math> is the [[Dilation (operator theory)|compression]] of <math>T</math> to <math>\mathcal{L}</math>.<ref name=daviesplum />
 
For [[differential operators]] (such as [[Sturm-Liouville problem|Sturm-Liouville operators]]), the inner product <math>(\cdot, \cdot)</math> can be replaced by the [[weak formulation]] <math>\mathcal{A}(\cdot, \cdot)</math>.<ref name=sulimayers /><ref name=pryce>{{cite book|last1=Pryce|first1=John D.|title=Numerical Solution of Sturm-Liouville Problems|isbn=0198534159|publisher=Oxford University Press|year=1994}}</ref>
 
If a subset of the orthonormal basis was used to find the matrix, the eigenvectors of <math>T_{\mathcal{L}}</math> will be [[linear combinations]] of orthonormal basis functions, and as a result they will be approximations of the eigenvectors of <math>T</math>.<ref name=arfkenweber>{{cite book|last1=Arfken|first1 = George B.|author-link1=George B. Arfken|last2 = Weber| first2 = Hans J.|year = 2005|title= Mathematical Methods For Physicists|url= https://books.google.com/books?id=tNtijk2iBSMC&pg=PA83|edition= 6th|publisher=Academic Press| isbn=978-0-08-047069-6 }}</ref>
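
The construction of <math>T_{\mathcal{L}}</math> can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the cited sources: the operator <math>Tu = -u'' + xu</math> on <math>(0, 1)</math> with Dirichlet boundary conditions, the orthonormal sine basis <math>\varphi_k(x) = \sqrt{2}\sin(k\pi x)</math>, the Gauss–Legendre quadrature, and the helper name <code>ritz_matrix</code>.

```python
import numpy as np

def ritz_matrix(potential, n, quad_points=64):
    """Compression of T u = -u'' + potential(x) u on (0, 1), Dirichlet ends.

    Assumed basis: phi_k(x) = sqrt(2) sin(k pi x), k = 1..n, orthonormal in L2(0, 1).
    """
    # Gauss-Legendre nodes and weights, mapped from [-1, 1] to [0, 1].
    nodes, weights = np.polynomial.legendre.leggauss(quad_points)
    x = 0.5 * (nodes + 1.0)
    w = 0.5 * weights

    k = np.arange(1, n + 1)
    phi = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, x))  # shape (n, quad_points)

    # (T phi_i, phi_j) = (i pi)^2 delta_ij + integral of potential * phi_i * phi_j,
    # because -phi_i'' = (i pi)^2 phi_i for the sine basis.
    kinetic = np.diag((np.pi * k) ** 2)
    coupling = (phi * (w * potential(x))) @ phi.T
    return kinetic + coupling

T_L = ritz_matrix(lambda x: x, n=6)

# Eigenvalues of the compression approximate eigenvalues of T; the columns of
# ritz_coeffs hold the coefficients of the approximate eigenfunctions.
ritz_values, ritz_coeffs = np.linalg.eigh(T_L)
```

Because the assumed basis is orthonormal, an ordinary symmetric eigenvalue solver suffices here; a non-orthogonal basis would lead to a generalized eigenvalue problem instead.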
 
== Properties ==
=== Spectral pollution ===
It is possible for the Rayleigh–Ritz method to produce values which do not converge to actual values in the spectrum of the operator as the truncation gets large. These values are known as spectral pollution.<ref name=daviesplum /><ref name=levitinshargorodsky /><ref>{{cite magazine|url=https://ima.org.uk/16912/unscrambling-the-infinite-can-we-compute-spectra/|last1=Colbrook|first1=Matthew|title=Unscrambling the Infinite: Can we Compute Spectra?|magazine=Mathematics Today|publisher=Institute of Mathematics and its Applications}}</ref> In some cases (such as for the [[Schrödinger equation]]), there is no approximation which both includes all eigenvalues of the equation, and contains no pollution.<ref>{{cite journal|last1=Colbrook|first1=Matthew|last2=Roman|first2=Bogdan|last3=Hansen|first3=Anders|title=How to Compute Spectra with Error Control|url=https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.122.250201|journal=Physical Review Letters|year=2019|volume=122 |issue=25 |page=250201 |doi=10.1103/PhysRevLett.122.250201 |pmid=31347861 |bibcode=2019PhRvL.122y0201C }}</ref>
 
The spectrum of the compression (and thus pollution) is bounded by the [[numerical range]] of the operator; in many cases it is bounded by a subset of the numerical range known as the [[essential numerical range]].<ref>{{cite journal|last1=Pokrzywa|first1=Andrzej|title=Method of orthogonal projections and approximation of the spectrum of a bounded operator|year=1979|journal=Studia Mathematica|volume=65 |pages=21–29 |doi=10.4064/sm-65-1-21-29 }}</ref><ref>{{cite journal|last1=Bögli|first1=Sabine|author1-link=Sabine Bögli|last2=Marletta|first2=Marco|last3=Tretter|first3=Christiane|author3-link=Christiane Tretter|title=The essential numerical range for unbounded linear operators|journal=Journal of Functional Analysis|year=2020|volume=279 |doi=10.1016/j.jfa.2020.108509 |arxiv=1907.09599 }}</ref>
 
== For matrix eigenvalue problems ==
In [[numerical linear algebra]], the '''Rayleigh–Ritz method''' is commonly<ref name="TrefethenIII1997">{{cite book| last1=Trefethen| first1=Lloyd N. | last2= Bau, III|first2=David|title=Numerical Linear Algebra|url=https://books.google.com/books?id=JaPtxOytY7kC| year=1997| publisher=SIAM| isbn=978-0-89871-957-4|page=254}}</ref> applied to approximate an eigenvalue problem
<math display="block"> A \mathbf{x} = \lambda \mathbf{x}</math>
for the matrix <math> A \in \mathbb{C}^{N \times N}</math> of size <math>N</math> using a projected matrix of a smaller size <math>m < N</math>, generated from a given matrix <math> V \in \mathbb{C}^{N \times m} </math> with [[orthonormal]] columns. The matrix version of the algorithm is the simplest:
Line 22 ⟶ 49:
If the subspace with the orthonormal basis given by the columns of the matrix <math> V \in \mathbb{C}^{N \times m} </math> contains <math> k \leq m </math> vectors that are close to eigenvectors of the matrix <math>A</math>, the '''Rayleigh–Ritz method''' above finds <math>k</math> Ritz vectors that well approximate these eigenvectors. The easily computable quantity <math> \| A \tilde{\mathbf{x}}_i - \tilde{\lambda}_i \tilde{\mathbf{x}}_i\|</math> determines the accuracy of such an approximation for every Ritz pair.
 
In the simplest case <math>m = 1</math>, the <math> N \times m </math> matrix <math>V</math> reduces to a unit column vector <math>v</math>, and the <math> m \times m </math> matrix <math> V^* A V </math> is a scalar equal to the [[Rayleigh quotient]] <math>\rho(v) = v^*Av/v^*v</math>. The only solution (<math>i = 1</math>) of the eigenvalue problem is then <math>y_i = 1</math> and <math>\mu_i = \rho(v)</math>, and the only Ritz vector is <math>v</math> itself. Thus, for <math>m = 1</math> the Rayleigh–Ritz method reduces to computing the [[Rayleigh quotient]].
 
Another useful connection to the [[Rayleigh quotient]] is that <math>\mu_i = \rho(\tilde{\mathbf{x}}_i)</math> for every Ritz pair <math>(\tilde{\lambda}_i, \tilde{\mathbf{x}}_i)</math>, which allows some properties of the Ritz values <math>\mu_i</math> to be derived from the corresponding theory for the [[Rayleigh quotient]]. For example, if <math>A</math> is a [[Hermitian matrix]], its [[Rayleigh quotient]] (and thus every Ritz value) is real and lies in the closed interval between the smallest and largest eigenvalues of <math>A</math>.
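
The properties above can be checked numerically with a short sketch. It assumes the usual steps of the procedure (form <math>V^* A V</math>, solve the small eigenvalue problem, map the eigenvectors back with <math>V</math>); the function name <code>rayleigh_ritz</code>, the diagonal test matrix and the random subspace are hypothetical choices made only for illustration.

```python
import numpy as np

def rayleigh_ritz(A, V):
    """Ritz pairs of A from the subspace spanned by the orthonormal columns of V."""
    projected = V.conj().T @ A @ V                 # m x m matrix V* A V
    if np.allclose(projected, projected.conj().T):
        mu, Y = np.linalg.eigh(projected)          # Hermitian case: real Ritz values
    else:
        mu, Y = np.linalg.eig(projected)
    ritz_vectors = V @ Y                           # Ritz vectors V y_i
    # Residual norms ||A x_i - mu_i x_i|| measure the accuracy of each Ritz pair.
    residuals = np.linalg.norm(A @ ritz_vectors - ritz_vectors * mu, axis=0)
    return mu, ritz_vectors, residuals

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])             # hypothetical Hermitian matrix
V, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis, m = 2
mu, X, res = rayleigh_ritz(A, V)

# Case m = 1: the single Ritz value is the Rayleigh quotient of the unit vector v.
v = V[:, [0]]
rho = (v.T @ A @ v).item()
mu1, _, _ = rayleigh_ritz(A, v)
```

For this Hermitian example the Ritz values in <code>mu</code> lie between the smallest and largest eigenvalues of <code>A</code>, and <code>mu1[0]</code> agrees with <code>rho</code>.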
Line 170 ⟶ 197:
\end{bmatrix}.
</math>
Thus we already obtain the singular values 2 and 1 from <math>\Sigma</math> and from <math>\mathbf {U}</math> the corresponding two left singular vectors <math>u</math> as <math>[0, 1, 0, 0, 0]^*</math> and <math>[1, 0, 0, 0, 0]^*</math>, which span the column-space of the matrix <math>W</math>, explaining why the approximations are exact for the given <math>W</math>.
 
Finally, step 3 computes the matrix <math>V_h = \mathbf {V}_h W^*</math>
Line 216 ⟶ 243:
</math>
Thus, for the given matrix <math>W</math> with its column-space that is spanned by two exact right singular vectors, we determine these right singular vectors, as well as the corresponding left singular vectors and the singular values, all exactly. For an arbitrary matrix <math>W</math>, we obtain approximate singular triplets which are optimal given <math>W</math> in the sense of optimality of the Rayleigh–Ritz method.
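
A minimal numerical sketch of this variant follows. It assumes the steps are: form <math>MW</math>, take its thin singular value decomposition, and map the right factor back with <math>W^*</math>. The matrices <code>M</code> and <code>W</code> below are hypothetical data, chosen so that the column-space of <code>W</code> is spanned by two exact right singular vectors with singular values 2 and 1; the function name <code>rayleigh_ritz_svd</code> is likewise an assumption.

```python
import numpy as np

def rayleigh_ritz_svd(M, W):
    """Approximate singular triplets of M from the orthonormal columns of W."""
    # Thin SVD of the N x m matrix M W.
    U, sigma, Vh_small = np.linalg.svd(M @ W, full_matrices=False)
    # Rows of Vh are the Ritz right singular vectors, expressed in the original space.
    Vh = Vh_small @ W.conj().T
    return U, sigma, Vh

# Hypothetical 5 x 4 matrix with singular values 4, 3, 2, 1.
M = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 3.0, 0.0],
              [0.0, 0.0, 0.0, 4.0],
              [0.0, 0.0, 0.0, 0.0]])

# Hypothetical W: orthonormal columns spanning the first two coordinate directions.
s = 1.0 / np.sqrt(2.0)
W = np.array([[s, s],
              [s, -s],
              [0.0, 0.0],
              [0.0, 0.0]])

U, sigma, Vh = rayleigh_ritz_svd(M, W)
```

Since the subspace contains exact right singular vectors, <code>sigma</code> reproduces the singular values 2 and 1 up to rounding; for a generic <code>W</code> the triplets would only be approximations.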
 
== Applications and examples ==
 
=== In quantum physics ===
In quantum physics, where the spectrum of the [[Hamiltonian (quantum mechanics)|Hamiltonian]] is the set of discrete energy levels allowed by a quantum mechanical system, the Rayleigh–Ritz method is used to approximate the energy states and wavefunctions of a complicated atomic or nuclear system.<ref name=arfkenweber /> In fact, for any system more complicated than a single hydrogen atom, there is no known exact solution for the spectrum of the Hamiltonian.<ref name=pryce />
 
In this case, a [[ansatz|trial wave function]], <math>\Psi</math>, is tested on the system. This trial function is selected to meet boundary conditions (and any other physical constraints). The exact function is not known; the trial function contains one or more adjustable parameters, which are varied to find a lowest energy configuration.
 
It can be shown that the ground state energy, <math>E_0</math>, satisfies an inequality:
<math display="block"> E_0 \le \frac{\langle \Psi | \hat{H}| \Psi \rangle}{\langle \Psi | \Psi \rangle}. </math>
 
That is, the ground-state energy is less than or equal to this value: the trial wave function always gives an expectation value greater than or equal to the ground-state energy.
 
If the trial wave function is known to be [[orthogonality|orthogonal]] to the ground state, then it will provide an upper bound for the energy of some excited state.
 
The Ritz ansatz function is a linear combination of ''N'' known basis functions <math>\left\lbrace\Psi_i\right\rbrace</math>, parametrized by unknown coefficients:
<math display="block"> \Psi = \sum_{i=1}^N c_i \Psi_i. </math>
 
With a known Hamiltonian, we can write its expected value as
<math display="block"> \varepsilon = \frac{\left\langle \displaystyle\sum_{i=1}^N c_i\Psi_i \right| \hat{H} \left| \displaystyle\sum_{i=1}^Nc_i\Psi_i \right\rangle}{\left\langle \left. \displaystyle\sum_{i=1}^N c_i\Psi_i \right| \displaystyle\sum_{i=1}^Nc_i\Psi_i \right\rangle} = \frac{\displaystyle\sum_{i=1}^N\displaystyle\sum_{j=1}^Nc_i^*c_jH_{ij}}{\displaystyle\sum_{i=1}^N\displaystyle\sum_{j=1}^Nc_i^*c_jS_{ij}} \equiv \frac{A}{B}. </math>
 
The basis functions are usually not orthogonal, so that the [[overlap matrix]] '''''S''''' has nonzero off-diagonal elements. Either <math>\left\lbrace c_i \right\rbrace</math> or <math>\left\lbrace c_i^* \right\rbrace</math> (the conjugation of the first) can be used to minimize the expectation value. For instance, by setting the partial derivatives of <math>\varepsilon</math> with respect to <math>\left\lbrace c_i^* \right\rbrace</math> to zero, the following equality is obtained for every ''k'' = 1, 2, ..., ''N'':
<math display="block"> \frac{\partial\varepsilon}{\partial c_k^*} = \frac{\displaystyle\sum_{j=1}^Nc_j(H_{kj}-\varepsilon S_{kj})}{B} = 0, </math>
which leads to a set of ''N'' [[secular equation]]s:
<math display="block">\sum_{j=1}^N c_j \left( H_{kj} - \varepsilon S_{kj} \right) = 0 \quad \text{for} \quad k = 1,2,\dots,N. </math>
 
In the above equations, energy <math>\varepsilon</math> and the coefficients <math>\left\lbrace c_j \right\rbrace</math> are unknown. With respect to '''''c''''', this is a homogeneous set of linear equations, which has a solution when the [[determinant]] of the coefficients to these unknowns is zero:
<math display="block">\det \left( H - \varepsilon S \right) = 0, </math>
which in turn is true only for ''N'' values of <math>\varepsilon</math>. Furthermore, since the Hamiltonian is a [[hermitian operator]], the '''''H''''' matrix is also [[hermitian matrix|hermitian]] and the values of <math>\varepsilon_i</math> will be real. The lowest value among <math>\varepsilon_i</math> (''i'' = 1, 2, ..., ''N''), <math>\varepsilon_0</math>, will be the best approximation to the ground-state energy for the basis functions used. The remaining ''N'' − 1 energies are estimates of excited state energies. An approximation for the wave function of state ''i'' can be obtained by finding the coefficients <math>\left\lbrace c_j \right\rbrace</math> from the corresponding secular equation.
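
The secular equations can be solved numerically as a generalized eigenvalue problem. The sketch below is an illustration under stated assumptions, not a method taken from the cited sources: a particle in a box on <math>(0, 1)</math> with <math>\hat{H} = -\tfrac{1}{2}\,d^2/dx^2</math> (units with <math>\hbar = m = 1</math>), the non-orthogonal polynomial basis <math>\Psi_k = x^k(1 - x)</math>, and the hypothetical helper names <code>box_matrices</code> and <code>solve_secular</code>. The exact ground-state energy for this model is <math>\pi^2/2</math>.

```python
import numpy as np

def box_matrices(n):
    """H and S for H = -(1/2) d^2/dx^2 on (0, 1) in the basis psi_k = x^k (1 - x)."""
    i = np.arange(1, n + 1)[:, None]
    j = np.arange(1, n + 1)[None, :]
    # S_ij = integral of x^(i+j) (1 - x)^2 over (0, 1).
    S = 1.0 / (i + j + 1) - 2.0 / (i + j + 2) + 1.0 / (i + j + 3)
    # H_ij = (1/2) integral of psi_i' psi_j' (integration by parts; psi_k vanish at 0 and 1).
    H = 0.5 * (i * j / (i + j - 1)
               - (2 * i * j + i + j) / (i + j)
               + (i + 1) * (j + 1) / (i + j + 1))
    return H, S

def solve_secular(H, S):
    """Solve H c = eps S c for Hermitian H and positive-definite S."""
    L = np.linalg.cholesky(S)                                 # S = L L^H
    A = np.linalg.solve(L, np.linalg.solve(L, H).conj().T)    # L^-1 H L^-H
    eps, Y = np.linalg.eigh(A)                                # real, ascending energies
    C = np.linalg.solve(L.conj().T, Y)                        # columns: coefficients c
    return eps, C

H, S = box_matrices(3)
eps, C = solve_secular(H, S)
exact_ground = np.pi ** 2 / 2
```

With three basis functions, <code>eps[0]</code> lies slightly above <code>exact_ground</code>, consistent with the variational inequality. The Cholesky reduction is one common way to handle the overlap matrix; it is used here only to keep the sketch dependent on NumPy alone.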
 
=== In mechanical engineering ===
The Rayleigh–Ritz method is often used in [[mechanical engineering]] for finding the approximate real [[resonant frequency|resonant frequencies]] of multi [[degrees of freedom (physics and chemistry)|degree of freedom]] systems, such as [[spring mass system]]s or [[flywheel]]s on a shaft with varying [[cross section (geometry)|cross section]]. It is an extension of Rayleigh's method. It can also be used for finding buckling loads and post-buckling behaviour for columns.
 
Consider the case whereby we want to find the resonant frequency of oscillation of a system. First, write the oscillation in the form,
<math display="block">y(x,t) = Y(x) \cos\omega t</math>
with an unknown mode shape <math>Y(x)</math>. Next, find the total energy of the system, consisting of a kinetic energy term and a potential energy term. The kinetic energy term involves the square of the [[time derivative]] of <math>y(x,t)</math> and thus gains a factor of <math>\omega ^2</math>. Thus, we can calculate the total energy of the system and express it in the following form:
<math display="block">E = T + V \equiv A[Y(x)] \omega^2\sin^2 \omega t + B[Y(x)] \cos^2 \omega t</math>
 
By conservation of energy, the average kinetic energy must be equal to the average potential energy. Thus,
<math display="block">\omega^2 = \frac{B[Y(x)]}{A[Y(x)]} = R[Y(x)]</math>
which is also known as the [[Rayleigh quotient]]. Thus, if we knew the mode shape <math>Y(x)</math>, we would be able to calculate <math>A[Y(x)]</math> and <math>B[Y(x)]</math>, and in turn get the eigenfrequency. However, we do not yet know the mode shape. In order to find this, we can approximate <math>Y(x)</math> as a combination of a few approximating functions <math>Y_i(x)</math>
<math display="block">Y(x) = \sum_{i=1}^N c_i Y_i(x)</math>
where <math>c_1,c_2,\cdots,c_N</math> are constants to be determined. In general, an arbitrary set of <math>c_1,c_2,\cdots,c_N</math> describes a superposition of the actual eigenmodes of the system. However, if we seek <math>c_1,c_2,\cdots,c_N</math> such that <math>\omega^2</math> is minimised, then the mode described by this set of <math>c_1,c_2,\cdots,c_N</math> will be close to the lowest actual eigenmode of the system. Thus, this finds an approximation to the lowest eigenfrequency. If we find eigenmodes orthogonal to this approximated lowest eigenmode, we can approximately find the next few eigenfrequencies as well.
 
In general, we can express <math>A[Y(x)]</math> and <math>B[Y(x)]</math> as a collection of terms quadratic in the coefficients <math>c_i</math>:
<math display="block">B[Y(x)] = \sum_i \sum_j c_i c_j K_{ij} = \mathbf{c}^\mathsf{T} K \mathbf{c}</math>
<math display="block">A[Y(x)] = \sum_i \sum_j c_i c_j M_{ij} = \mathbf{c}^\mathsf{T} M \mathbf{c}</math>
where <math>K</math> and <math>M</math> are the stiffness matrix and mass matrix of a discrete system respectively.
 
The minimization of <math>\omega^2</math> becomes:
<math display="block">\frac{\partial \omega^2}{\partial c_i} = \frac{\partial}{\partial c_i} \frac{\mathbf{c}^\mathsf{T} K \mathbf{c}}{\mathbf{c}^\mathsf{T} M \mathbf{c}} = 0</math>
 
Solving this,
<math display="block">\mathbf{c}^\mathsf{T} M \mathbf{c}\frac{\partial \mathbf{c}^\mathsf{T} K \mathbf{c}}{\partial \mathbf{c}} - \mathbf{c}^\mathsf{T} K \mathbf{c} \frac{\partial \mathbf{c}^\mathsf{T} M \mathbf{c}}{\partial \mathbf{c}} = 0</math>
<math display="block">K \mathbf c - \frac{\mathbf{c}^\mathsf{T} K \mathbf{c}}{\mathbf{c}^\mathsf{T} M \mathbf{c}}M\mathbf{c} = \mathbf{0}</math>
<math display="block">K \mathbf{c} - \omega^2 M \mathbf{c} = \mathbf{0}</math>
 
For a non-trivial solution <math>\mathbf{c}</math>, the determinant of the coefficient matrix must be zero:
<math display="block">\det(K - \omega^2 M)=0</math>
 
This gives approximations to the first ''N'' eigenfrequencies and eigenmodes of the system, where ''N'' is the number of approximating functions.
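
As a hedged illustration of the procedure above, the following sketch treats a hypothetical continuous system: the axial vibration of a uniform bar fixed at <math>x = 0</math> and free at <math>x = 1</math>, with unit stiffness, unit mass per length and unit length. The approximating functions <math>Y_i(x) = x^i</math> and the helper name <code>bar_matrices</code> are assumptions made for the example; the exact lowest frequency of this model is <math>\pi/2</math>.

```python
import numpy as np

def bar_matrices(n):
    """K and M for a uniform fixed-free bar on (0, 1) with basis Y_i = x^i."""
    i = np.arange(1, n + 1)[:, None]
    j = np.arange(1, n + 1)[None, :]
    K = i * j / (i + j - 1)      # integral of Y_i' Y_j' (potential energy term)
    M = 1.0 / (i + j + 1)        # integral of Y_i Y_j   (kinetic energy term)
    return K, M

K, M = bar_matrices(3)

# K c = omega^2 M c, rewritten as an ordinary eigenvalue problem for M^-1 K.
omega_sq, modes = np.linalg.eig(np.linalg.solve(M, K))
order = np.argsort(omega_sq.real)
omega_sq = omega_sq.real[order]
modes = modes.real[:, order]
omega = np.sqrt(omega_sq)        # approximate eigenfrequencies, ascending
```

The lowest value <code>omega[0]</code> is an upper bound on the exact fundamental frequency and tightens as more approximating functions are added; the higher values are cruder estimates of the higher frequencies.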
 
=== Simple case of double spring-mass system ===
The following discussion uses the simplest case, where the system has two lumped springs and two lumped masses, and only two mode shapes are assumed. Hence {{math|1=''M'' = [''m''<sub>1</sub>, ''m''<sub>2</sub>]}} and {{math|1=''K'' = [''k''<sub>1</sub>, ''k''<sub>2</sub>]}}.
 
A [[mode shape]] is assumed for the system, with two terms, one of which is weighted by a factor&nbsp;''B'', e.g. ''Y'' =&nbsp;[1,&nbsp;1]&nbsp;+&nbsp;''B''[1,&nbsp;−1].
[[Simple harmonic motion]] theory says that the [[velocity]] at the instant of zero deflection equals the [[angular frequency]] <math>\omega</math> times the deflection (''y'') at the instant of maximum deflection. In this example the [[kinetic energy]] (KE) for each mass is <math display="inline">\frac{1}{2}\omega^2 Y_1^2 m_1</math> etc., and the [[potential energy]] (PE) for each [[Spring (device)|spring]] is <math display="inline">\frac{1}{2} k_1 Y_1^2</math> etc.
 
We also know that without damping, the maximal KE equals the maximal PE. Thus,
<math display="block">\sum_{i=1}^2 \left(\frac{1}{2} \omega^2 Y_i^2 M_i\right)=\sum_{i=1}^2 \left(\frac{1}{2} K_i Y_i^2\right)</math>
 
The overall amplitude of the mode shape cancels out from each side, always. That is, the actual size of the assumed deflection does not matter, just the mode ''shape''.
 
Algebraic manipulation then gives an expression for <math>\omega</math> in terms of ''B'', which can be [[derivative|differentiated]] with respect to ''B'' to find the minimum, i.e. where <math>d\omega/dB=0</math>. This gives the value of ''B'' for which <math>\omega</math> is lowest. Because the mode shape is ''assumed'', the result is an upper bound on the fundamental frequency of the system; minimising over ''B'' gives the lowest such bound available under our assumptions, since ''B'' sets the optimal 'mix' of the two assumed mode shape functions.
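
The minimisation over ''B'' can be sketched numerically. The code follows the energy balance exactly as written above, in which each spring's potential energy depends only on the deflection of its own mass; the numerical masses and stiffnesses are hypothetical, and a simple grid scan over ''B'' is used in place of differentiation.

```python
import numpy as np

def omega_squared(B, m, k):
    """omega^2 from max KE = max PE for the assumed shape Y = [1, 1] + B [1, -1]."""
    Y = np.array([1.0 + B, 1.0 - B])
    return np.sum(k * Y ** 2) / np.sum(m * Y ** 2)

m = np.array([2.0, 1.0])   # hypothetical masses m1, m2
k = np.array([2.0, 3.0])   # hypothetical stiffnesses k1, k2

# Scan the mixing factor B and keep the value giving the lowest frequency.
B_grid = np.linspace(-3.0, 3.0, 601)
values = np.array([omega_squared(B, m, k) for B in B_grid])
best = np.argmin(values)
B_best = B_grid[best]
omega_best = np.sqrt(values[best])
```

Scaling the assumed shape by any nonzero constant leaves <code>omega_squared</code> unchanged, which is the cancellation of the overall amplitude noted above.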
 
There are many tricks with this method; the most important is to choose realistic assumed mode shapes. For example, in the case of [[beam deflection]] problems it is wise to use a deformed shape that is analytically similar to the expected solution. A [[quartic function|quartic]] may fit most of the easy problems of simply linked beams even if the order of the deformed solution may be lower. The springs and masses do not have to be discrete; they can be continuous (or a mixture), and this method can easily be used in a [[spreadsheet]] to find the natural frequencies of quite complex distributed systems, provided the distributed KE and PE terms can be described easily or the continuous elements are broken up into discrete parts.
 
This method could be used iteratively, adding additional mode shapes to the previous best solution, or you can build up a long expression with many Bs and many mode shapes, and then differentiate them [[partial differentiation|partially]].
 
=== In dynamical systems ===
The [[Koopman operator]] allows a finite-dimensional [[nonlinear system]] to be encoded as an infinite-dimensional [[linear system]]. In general, both of these problems are difficult to solve, but for the latter we can use the Ritz-Galerkin method to approximate a solution.<ref>{{cite arXiv|last1=Servadio|first1=Simone|last2=Arnas|first2=David|last3=Linares|first3=Richard|title=A Koopman Operator Tutorial with Orthogonal Polynomials|date=2021 |class=math.NA |eprint=2111.07485 }}</ref>
 
== The relationship with the finite element method ==
 
In the language of the finite element method, the matrix <math>H_{kj}</math> is precisely the ''stiffness matrix'' of the Hamiltonian in the piecewise linear element space, and the matrix <math>S_{kj}</math> is the ''mass matrix''. In the language of linear algebra, the value <math>\varepsilon</math> is an eigenvalue of the discretized Hamiltonian, and the vector <math>c</math> is a discretized eigenvector.
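
A minimal sketch of this correspondence, under assumptions that are not in the text above: the operator is taken to be <math>-d^2/dx^2</math> on <math>(0, 1)</math> with Dirichlet boundary conditions, the basis consists of piecewise linear "hat" functions on a uniform mesh, and the helper name <code>p1_matrices</code> is hypothetical. The exact lowest eigenvalue of this model is <math>\pi^2</math>.

```python
import numpy as np

def p1_matrices(n_interior):
    """Stiffness and mass matrices of -d^2/dx^2 on (0, 1) for uniform hat functions."""
    h = 1.0 / (n_interior + 1)
    main = np.ones(n_interior)
    off = np.ones(n_interior - 1)
    # Stiffness: integrals of products of derivatives of neighbouring hat functions.
    stiffness = (2.0 * np.diag(main) - np.diag(off, 1) - np.diag(off, -1)) / h
    # Mass (overlap): integrals of products of the hat functions themselves.
    mass = (4.0 * np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    return stiffness, mass

H, S = p1_matrices(20)

# Generalized problem H c = eps S c, reduced with a Cholesky factor of S.
L = np.linalg.cholesky(S)
A = np.linalg.solve(L, np.linalg.solve(L, H).T)
eps = np.linalg.eigvalsh(A)      # discrete eigenvalues, ascending
```

Here <code>eps[0]</code> approximates <math>\pi^2</math> from above, as expected for a Rayleigh–Ritz approximation with a conforming basis.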
 
== See also ==
*[[Ritz method]]
*[[Rayleigh quotient]]
*[[Arnoldi iteration]]
*[[Sturm&ndash;Liouville theory]]
*[[Hilbert space]]
*[[Galerkin method]]
 
== Notes and references ==
* {{cite journal|last=Ritz|first=Walther|author-link=Walther Ritz|title=Über eine neue Methode zur Lösung gewisser Variationsprobleme der mathematischen Physik|journal=Journal für die Reine und Angewandte Mathematik|volume=135|pages=1–61|year=1909|doi=10.1515/crll.1909.135.1 |url=http://gdz.sub.uni-goettingen.de/no_cache/dms/load/img/?IDDOC=261182|url-access=subscription}}
* {{cite journal|last=MacDonald|first=J. K.|title=Successive Approximations by the Rayleigh-Ritz Variation Method|journal=Phys. Rev.|volume=43|year=1933|issue=10 |pages=830–833 |doi=10.1103/PhysRev.43.830 |bibcode=1933PhRv...43..830M |url=http://journals.aps.org/pr/abstract/10.1103/PhysRev.43.830|url-access=subscription}}
{{Reflist}}
 
==External links==
*[https://web.archive.org/web/20081010161336/http://www.math.nps.navy.mil/~bneta/4311.pdf Course on Calculus of Variations, has a section on Rayleigh–Ritz method].
* [https://encyclopediaofmath.org/wiki/Ritz_method Ritz method] in the ''[[Encyclopedia of Mathematics]]''
*{{cite journal | title=From Euler, Ritz, and Galerkin to Modern Computing | last1=Gander | first1=Martin J.| last2=Wanner | first2=Gerhard | journal=SIAM Review | year=2012 | volume=54 | issue=4 | pages=627–666 | doi=10.1137/100804036| url=https://archive-ouverte.unige.ch/unige:171273 | citeseerx=10.1.1.297.5697 }}
 
{{DEFAULTSORT:Rayleigh-Ritz Method}}