{{short description|Function that maps matrices to matrices}}
In [[mathematics]], every [[analytic function]] can be used for defining a '''matrix function''' that maps [[square matrix|square matrices]] with [[complex number|complex]] entries to square matrices of the same size.

This is used for defining the [[exponential of a matrix]], which is involved in the [[closed-form expression|closed-form]] solution of systems of [[linear differential equation]]s.
== Extending scalar function to matrix functions ==
=== Power series ===
If the analytic function {{mvar|f}} has the [[Taylor series|Taylor expansion]]
<math display="block">f(x) = c_0 + c_1 x + c_2 x^2 + \cdots ~,</math>
then a matrix function <math>A\mapsto f(A)</math> can be defined by substituting {{mvar|x}} by a square matrix: powers become [[matrix multiplication|matrix powers]], additions become matrix sums, and multiplications by coefficients become [[scalar multiplication]]s. If the scalar series converges for <math>|x| < r</math>, then the corresponding matrix series converges for matrices {{mvar|A}} such that <math>\|A\| < r</math> for some [[matrix norm]] satisfying <math>\|AB\|\leq\|A\|\,\|B\|</math>.
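As a numerical illustration of the power-series definition (a minimal sketch assuming NumPy and SciPy; the helper name <code>exp_series</code> is chosen here, not a library routine), the matrix exponential can be built by substituting a matrix into the Taylor series of <math>e^x</math> and compared against SciPy's reference implementation:

```python
import numpy as np
from scipy.linalg import expm

def exp_series(A, n_terms=30):
    """Matrix exponential from its power series: substitute the matrix
    into sum_k x^k / k!, taking powers as matrix powers."""
    result = np.zeros_like(A)
    term = np.eye(A.shape[0])
    for k in range(n_terms):
        result += term                # add the current A^k / k! term
        term = term @ A / (k + 1)     # next term: multiply by A / (k+1)
    return result

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
```

For the exponential the series converges for every matrix, since the scalar series has infinite radius of convergence; for a function with finite radius {{mvar|r}}, convergence requires <math>\|A\| < r</math> as stated above.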
===Diagonalizable matrices===
A square matrix {{mvar|A}} is [[diagonalizable matrix|diagonalizable]] if there is an [[invertible matrix]] {{mvar|P}} such that <math>D = P^{-1}\,A\,P</math> is a [[diagonal matrix]], that is, {{mvar|D}} has the shape
<math display="block">D = \begin{bmatrix}
d_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & d_n
\end{bmatrix}~,</math>
such that <math>A = P~ D~ P^{-1}</math>. It is then natural to set
<math display="block">f(A) = P \begin{bmatrix}
f(d_1) & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & f(d_n)
\end{bmatrix} P^{-1}~.</math>
It can be verified that the matrix {{math|''f''(''A'')}} does not depend on a particular choice of {{mvar|P}}.
For example, suppose one is seeking the [[gamma function]] <math>\Gamma(A)</math> for
<math display="block">A = \begin{bmatrix}
1&3\\
2&1
\end{bmatrix}~.</math>
One has
<math display="block">A = P \begin{bmatrix}
1-\sqrt{6} & 0 \\
0 & 1+ \sqrt{6}
\end{bmatrix} P^{-1}~, </math>
for
<math display="block">P = \begin{bmatrix}
1/2 & 1/2 \\
-\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}}
\end{bmatrix} ~. </math>
Application of the formula then simply yields
<math display="block">\Gamma(A) =
\begin{bmatrix}
1/2 & 1/2 \\
-\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}}
\end{bmatrix} \cdot
\begin{bmatrix}
\Gamma(1-\sqrt{6}) & 0 \\
0&\Gamma(1+\sqrt{6})
\end{bmatrix} \cdot
\begin{bmatrix}
1 & -\frac{\sqrt{6}}{2} \\
1 & \frac{\sqrt{6}}{2}
\end{bmatrix} =
\begin{bmatrix}
\frac{\Gamma(1-\sqrt{6})+\Gamma(1+\sqrt{6})}{2} & \frac{\sqrt{6}\left(\Gamma(1+\sqrt{6})-\Gamma(1-\sqrt{6})\right)}{4} \\
\frac{\Gamma(1+\sqrt{6})-\Gamma(1-\sqrt{6})}{\sqrt{6}} & \frac{\Gamma(1-\sqrt{6})+\Gamma(1+\sqrt{6})}{2}
\end{bmatrix} ~.
</math>
Likewise,
<math display="block">A^4 =
\begin{bmatrix}
1/2 & 1/2 \\
-\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}}
\end{bmatrix} \cdot
\begin{bmatrix}
(1-\sqrt{6})^4 & 0 \\
0&(1+\sqrt{6})^4
\end{bmatrix} \cdot
\begin{bmatrix}
1 & -\frac{\sqrt{6}}{2} \\
1 & \frac{\sqrt{6}}{2}
\end{bmatrix} =
\begin{bmatrix} 73 & 84\\
56 & 73
\end{bmatrix} ~.
</math>
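The diagonalization recipe above can be checked numerically. The sketch below (assuming NumPy and SciPy are available; the helper name <code>matrix_function</code> is chosen here for illustration) applies a scalar function to a diagonalizable matrix through its eigendecomposition:

```python
import numpy as np
from scipy.special import gamma

def matrix_function(A, f):
    """Apply the scalar function f to a diagonalizable matrix A
    through A = P D P^{-1}, giving f(A) = P f(D) P^{-1}."""
    d, P = np.linalg.eig(A)
    return P @ np.diag(f(d)) @ np.linalg.inv(P)

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])

# A^4 computed through the eigenvalues (1 +/- sqrt(6))^4
# agrees with direct matrix powering.
A4 = matrix_function(A, lambda x: x**4)

# Gamma(A) applies the gamma function to the eigenvalues 1 +/- sqrt(6).
GA = matrix_function(A, gamma)
```

The result does not depend on which eigenvector matrix <code>P</code> the solver returns, in line with the remark above that {{math|''f''(''A'')}} is independent of the choice of {{mvar|P}}.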
=== Jordan decomposition ===
{{main|Jordan normal form}}
All complex matrices, whether they are diagonalizable or not, have a [[Jordan normal form]] <math>A = P\,J\,P^{-1}</math>, where the matrix ''J'' consists of [[Jordan block]]s. Consider these blocks separately and apply the [[power series]] to a Jordan block:
<math display="block"> f \left( \begin{bmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \vdots & \vdots \\
0 & 0 & \ddots & \ddots & \vdots \\
\vdots & \vdots & \cdots & \lambda & 1 \\
0 & 0 & \cdots & 0 & \lambda
\end{bmatrix} \right)
=
\begin{bmatrix}
\frac{f(\lambda)}{0!} & \frac{f'(\lambda)}{1!} & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{(n-1)}(\lambda)}{(n-1)!} \\
0 & \frac{f(\lambda)}{0!} & \frac{f'(\lambda)}{1!} & \vdots & \frac{f^{(n-2)}(\lambda)}{(n-2)!} \\
0 & 0 & \ddots & \ddots & \vdots \\
\vdots & \cdots & \ddots & \frac{f(\lambda)}{0!} & \frac{f'(\lambda)}{1!} \\
0 & \cdots & \cdots & 0 & \frac{f(\lambda)}{0!}
\end{bmatrix}.</math>
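For a concrete check of the block formula, take {{math|1=''f'' = exp}} on a single 3×3 Jordan block: every derivative of the exponential at {{mvar|λ}} equals <math>e^\lambda</math>, so the formula predicts <math>e^\lambda</math> times an upper-triangular matrix of reciprocal factorials. A minimal sketch using SciPy's <code>expm</code> as the reference:

```python
import numpy as np
from scipy.linalg import expm

lam = 0.5
# A 3x3 Jordan block: lam on the diagonal, ones on the superdiagonal.
J = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])

# For f = exp, f^{(k)}(lam) = exp(lam), so the block formula gives
# exp(lam) * [[1/0!, 1/1!, 1/2!], [0, 1/0!, 1/1!], [0, 0, 1/0!]].
f_J = np.exp(lam) * np.array([[1.0, 1.0, 0.5],
                              [0.0, 1.0, 1.0],
                              [0.0, 0.0, 1.0]])
```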
This definition can be used to extend the ___domain of the matrix function
beyond the set of matrices with [[spectral radius]] smaller than the radius of convergence of the power series.
Note that there is also a connection to [[Divided difference#Polynomials and power series|divided differences]].
==== Hermitian matrices ====
A [[Hermitian matrix]] has all real eigenvalues and can always be diagonalized by a [[unitary matrix]] P, according to the [[spectral theorem]].
In this case, the Jordan definition is natural. Moreover, this definition allows standard inequalities for real functions to be extended:
If <math> f(a) \leq g(a)</math> for all eigenvalues of <math>A</math>, then <math>f(A) \preceq g(A)</math>.
(As a convention, <math>X \preceq Y \Leftrightarrow Y - X </math> is a [[positive-semidefinite matrix]].)
The proof follows directly from the definition.
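A quick numerical illustration of this ordering (a sketch assuming NumPy and SciPy): since <math>e^x \geq x</math> for every real {{mvar|x}}, taking {{math|1=''f''(''x'') = ''x''}} and {{math|1=''g''(''x'') = ''e''{{sup|''x''}}}} makes <math>e^A - A</math> positive semidefinite for any Hermitian {{mvar|A}}:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2            # a real symmetric (hence Hermitian) matrix

# e^x >= x for all real x, so exp(A) - A is positive semidefinite:
# exp(A) and A share eigenvectors (both are functions of A), and the
# eigenvalues of the difference are exp(a_i) - a_i > 0.
diff = expm(A) - A
eigs = np.linalg.eigvalsh(diff)
```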
=== Cauchy integral ===
[[Cauchy's integral formula]] from [[complex analysis]] can also be used to generalize scalar functions to matrix functions. Cauchy's integral formula states that for any [[analytic function]] {{mvar|f}} defined on a set {{math|''D'' ⊂ ℂ}}, one has
<math display="block">f(x) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z-x} \,\mathrm{d}z ~,</math>
where {{mvar|C}} is a closed simple curve inside the ___domain {{mvar|D}} enclosing {{mvar|x}}.

Now, replace {{mvar|x}} by a matrix {{mvar|A}} and consider a path {{mvar|C}} inside {{mvar|D}} that encloses all [[eigenvalue]]s of {{mvar|A}}. One possibility to achieve this is to let {{mvar|C}} be a circle around the [[origin (mathematics)|origin]] with [[radius]] larger than {{math|‖''A''‖}} for an arbitrary [[matrix norm]] {{math|‖·‖}}. Then, {{math|''f''(''A'')}} is definable by
<math display="block">f(A) = \frac{1}{2\pi i} \oint_C f(z) \left(z I - A\right)^{-1} \mathrm{d}z ~.</math>
This integral can readily be evaluated numerically using the [[trapezium rule]], which [[Convergent series|converges]] exponentially in this case. That means that the [[precision (arithmetic)|precision]] of the result doubles when the number of nodes is doubled. In routine cases, this is bypassed by [[Sylvester's formula]].
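The contour integral and its trapezium-rule discretization can be sketched as follows (NumPy assumed; the helper name <code>matrix_function_cauchy</code> and the node count are choices made here for illustration). On a circle <math>z = re^{i\theta}</math> one has <math>\mathrm{d}z = iz\,\mathrm{d}\theta</math>, so each quadrature node contributes <math>f(z)\,z\,(zI-A)^{-1}</math>:

```python
import numpy as np

def matrix_function_cauchy(A, f, radius, n_nodes=64):
    """Approximate f(A) = (1/2 pi i) \oint f(z)(zI - A)^{-1} dz on a
    circle enclosing all eigenvalues of A, via the trapezium rule."""
    n = A.shape[0]
    total = np.zeros((n, n), dtype=complex)
    for k in range(n_nodes):
        theta = 2 * np.pi * k / n_nodes
        z = radius * np.exp(1j * theta)
        # dz = i z dtheta, so the node contributes f(z) z (zI - A)^{-1}
        total += f(z) * z * np.linalg.inv(z * np.eye(n) - A)
    return total / n_nodes

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])
# Spectral radius is 1 + sqrt(6) ~ 3.45, so radius 5 encloses all eigenvalues.
F = matrix_function_cauchy(A, lambda z: z**4, radius=5.0)
```

For this analytic integrand the periodic trapezium rule converges exponentially in the number of nodes, as noted above; 64 nodes already reproduce <math>A^4</math> to machine-level accuracy.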
=== Matrix perturbations ===
The above Taylor power series allows the scalar <math>x</math> to be replaced by the matrix. This is not true in general when expanding in terms of <math>A(\eta) = A+\eta B</math> about <math>\eta = 0</math> unless <math>[A,B]=0</math>. A counterexample is <math>f(x) = x^{3}</math>, which has a finite-length Taylor series. We compute this in two ways:
* Distributive law: <math display="block">f(A + \eta B) = (A+\eta B)^{3} = A^{3} + \eta(A^{2}B + ABA + BA^{2}) + \eta^{2}(AB^{2} + BAB + B^{2}A) + \eta^{3}B^{3}</math>
* Using scalar Taylor expansion for <math>f(a+\eta b)</math> and replacing scalars with matrices at the end:
<math display="block">\begin{align}
f(a+\eta b) &= f(a) + f'(a)\frac{\eta b}{1!} + f''(a)\frac{(\eta b)^{2}}{2!} + f'''(a)\frac{(\eta b)^{3}}{3!} \\[.5em]
&= a^{3} + 3a^{2}(\eta b) + 3a(\eta b)^{2} + (\eta b)^{3} \\[.5em]
&\to A^{3} + 3A^{2}(\eta B) + 3A(\eta B)^{2} + (\eta B)^{3}
\end{align}</math>
The scalar expression assumes commutativity while the matrix expression does not, and thus they cannot be equated directly unless <math>[A,B]=0</math>. For some ''f''(''x'') this can be dealt with using the same method as scalar Taylor series. For example, <math display="inline">f(x) = \frac{1}{x}</math>. If <math>A^{-1}</math> exists then <math>f(A+\eta B) = f(\mathbb{I} + \eta A^{-1}B)f(A)</math>. The expansion of the first term then follows the power series given above,
<math display="block">f(\mathbb{I} + \eta A^{-1}B) = \mathbb{I} - \eta A^{-1}B + \left(-\eta A^{-1}B\right)^{2} + \cdots = \sum_{n=0}^{\infty} \left(-\eta A^{-1}B\right)^{n} ~.</math>
The convergence criteria of the power series then apply, requiring <math>\Vert \eta A^{-1}B \Vert</math> to be sufficiently small under the appropriate matrix norm. For more general problems, which cannot be rewritten in such a way that the two matrices commute, the ordering of matrix products produced by repeated application of the Leibniz rule must be tracked.
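A numerical sketch of this rearrangement for <math display="inline">f(x) = \frac{1}{x}</math> (NumPy assumed; the matrices, perturbation size, and truncation length are arbitrary choices made here): summing the Neumann series for <math>(\mathbb{I} + \eta A^{-1}B)^{-1}</math> and multiplying by <math>A^{-1}</math> reproduces the exact inverse when <math>\Vert \eta A^{-1}B \Vert < 1</math>:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # well-conditioned A
B = rng.standard_normal((3, 3))
eta = 0.01

# (A + eta B)^{-1} = (I + eta A^{-1} B)^{-1} A^{-1};
# expand the first factor as the Neumann series sum (-eta A^{-1} B)^n.
Ainv = np.linalg.inv(A)
C = eta * Ainv @ B
series = np.zeros_like(A)
term = np.eye(3)
for _ in range(30):          # sum (-C)^n for n = 0..29
    series += term
    term = term @ (-C)

approx = series @ Ainv
exact = np.linalg.inv(A + eta * B)
```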
=== Arbitrary function of a 2×2 matrix ===
An arbitrary function ''f''(''A'') of a 2×2 matrix ''A'' has its [[Sylvester's formula]] simplify to
<math display="block">f(A) = \frac{f(\lambda_+) + f(\lambda_-)}{2} I + \frac{A - \left (\frac{tr(A)}{2}\right )I}{\sqrt{\left (\frac{tr(A)}{2}\right )^2 - |A|}} \frac{f(\lambda_+) - f(\lambda_-)}{2} ~,</math>
where <math>\lambda_\pm</math> are the eigenvalues of its characteristic equation, {{math|1={{!}}''A'' − ''λI''{{!}} = 0}}, and are given by
<math display="block">\lambda_\pm = \frac{tr(A)}{2} \pm \sqrt{\left (\frac{tr(A)}{2}\right )^2 - |A|} .</math>
However, if there is degeneracy, the following formula is used, where ''f''&prime; is the derivative of ''f'':
<math display="block">f(A) = f \left( \frac{tr(A)}{2} \right) I + \mathrm{adj} \left( \frac{tr(A)}{2}I - A \right ) f' \left( \frac{tr(A)}{2} \right) .</math>
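The non-degenerate formula can be sketched numerically (NumPy/SciPy assumed; <code>f_2x2</code> is a name chosen here for illustration). It reproduces both the matrix exponential and the <math>A^4</math> value from the diagonalization example:

```python
import numpy as np
from scipy.linalg import expm

def f_2x2(A, f):
    """Sylvester's formula for a 2x2 matrix with distinct eigenvalues.
    (The degenerate case needs the derivative formula instead.)"""
    half_tr = np.trace(A) / 2
    # s = sqrt((tr A / 2)^2 - det A); +0j allows complex eigenvalues
    s = np.sqrt(half_tr**2 - np.linalg.det(A) + 0j)
    lam_p, lam_m = half_tr + s, half_tr - s
    I = np.eye(2)
    return ((f(lam_p) + f(lam_m)) / 2 * I
            + (A - half_tr * I) / s * (f(lam_p) - f(lam_m)) / 2)

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])
```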
=== Examples ===
* [[Matrix logarithm]]
* [[Matrix exponential]]
* [[Matrix sign function]]<ref>{{Cite web|last=Higham|first=Nick|date=2020-12-15|title=What Is the Matrix Sign Function?| url=https://nhigham.com/2020/12/15/what-is-the-matrix-sign-function/|access-date=2020-12-27|website=Nick Higham|language=en}}</ref>
== Classes of matrix functions ==
Using the [[Loewner order|semidefinite ordering]] (<math>X \preceq Y \Leftrightarrow Y - X</math> is [[positive-semidefinite matrix|positive-semidefinite]] and
<math> X \prec Y \Leftrightarrow Y - X </math> is [[positive-definite matrix|positive definite]]), some
of the classes of scalar functions can be extended to matrix functions of [[Hermitian matrix|Hermitian matrices]].<ref name="Bhatia">{{cite book | author=Bhatia, R. | title=Matrix Analysis |series=Graduate Texts in Mathematics | year = 1997 | publisher=Springer | volume=169}}</ref>
=== Operator concave/convex ===
A function {{mvar|f}} is called operator concave if and only if
<math display="block">\tau f(A) + (1-\tau) f(H) \preceq f \left( \tau A + (1-\tau)H \right)</math>
for all self-adjoint matrices {{math|''A'',''H''}} with spectra in the ___domain of {{mvar|f}} and <math>\tau \in [0,1]</math>. This definition is analogous to a [[concave function|concave scalar function]]. An operator convex function can be defined by switching <math>\preceq</math> to <math>\succeq</math> in the definition above.
===Examples===
The matrix logarithm is both operator monotone and operator concave.
==See also==
*[[Algebraic Riccati equation]]
*[[Sylvester's formula]]
*[[Loewner order]]
*[[Matrix calculus]]
*[[Trace inequalities]]
==References==
* {{cite book|last1=Higham|first1=Nicholas J.|title=Functions of Matrices: Theory and Computation|date=2008|publisher=Society for Industrial and Applied Mathematics|___location=Philadelphia|author-link=Nicholas Higham|isbn=9780898717778}}
{{Authority control}}
[[Category:Matrix theory]]