Definite matrix: Difference between revisions

 
With this in mind, the one-to-one change of variable <math>\mathbf{y} = P\mathbf{z}</math> shows that <math>\mathbf{z}^* M\mathbf{z}</math> is real and positive for any complex vector <math>\mathbf z</math> if and only if <math>\mathbf{y}^* D\mathbf{y}</math> is real and positive for any <math>\mathbf{y}</math>; in other words, if <math>D</math> is positive definite. For a diagonal matrix, this is true only if each element of the main diagonal—that is, every eigenvalue of <math>M</math>—is positive. Since the [[spectral theorem]] guarantees all eigenvalues of a Hermitian matrix to be real, the positivity of eigenvalues can be checked using [[Descartes' rule of signs|Descartes' rule of alternating signs]] when the [[characteristic polynomial]] of a real, symmetric matrix <math>M</math> is available.
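As a numerical sketch of this test (the matrix below is an assumed example, not from the text), one can form the characteristic polynomial and check that its coefficients strictly alternate in sign:

```python
import numpy as np

# Assumed example: a real symmetric (in fact positive definite) matrix.
M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

# Coefficients of the characteristic polynomial det(tI - M),
# highest degree first.
coeffs = np.poly(M)

# Since a symmetric matrix has only real eigenvalues, all of them are
# positive iff the coefficients strictly alternate in sign
# (Descartes' rule of signs: n sign changes give n positive roots).
signs = np.sign(coeffs)
alternating = all(signs[i] * signs[i + 1] < 0 for i in range(len(signs) - 1))
print(alternating)  # True
```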
 
==Which matrices are covariance matrices?==
From the identity just above, if <math>\mathbf{b}</math> is a <math>(p \times 1)</math> real-valued vector, then
 
:<math>\operatorname{var}(\mathbf{b}^{\rm T}\mathbf{X}) = \mathbf{b}^{\rm T} \operatorname{var}(\mathbf{X}) \mathbf{b},\,</math>
 
which must always be nonnegative, since it is the [[variance#Properties|variance]] of a real-valued random variable, so a covariance matrix is always a [[positive-semidefinite matrix]].
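This identity, and the nonnegativity of the resulting quadratic form, can be checked numerically; the sample data and the weight vector below are illustrative assumptions:

```python
import numpy as np

# Illustrative data: 3 random variables observed over 1000 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 1000))
S = np.cov(X)                     # sample covariance matrix (3 x 3)
b = np.array([1.0, -2.0, 0.5])    # assumed weight vector

lhs = np.var(b @ X, ddof=1)       # var(b^T X), a scalar variance
rhs = b @ S @ b                   # the quadratic form b^T var(X) b

print(np.isclose(lhs, rhs))                      # True
print(np.all(np.linalg.eigvalsh(S) >= -1e-12))   # S is positive semidefinite
```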
 
The above argument can be expanded as follows: for any <math>(p \times 1)</math> real vector <math>w</math>,<math display="block">
\begin{align}
& w^{\rm T} \operatorname{E} \left[(\mathbf{X} - \operatorname{E}[\mathbf{X}]) (\mathbf{X} - \operatorname{E}[\mathbf{X}])^{\rm T}\right] w
= \operatorname{E} \left[w^{\rm T}(\mathbf{X} - \operatorname{E}[\mathbf{X}]) (\mathbf{X} - \operatorname{E}[\mathbf{X}])^{\rm T}w\right] \\
&= \operatorname{E} \big[\big( w^{\rm T}(\mathbf{X} - \operatorname{E}[\mathbf{X}]) \big)^2 \big] \geq 0,
\end{align}
</math>where the last inequality follows from the observation that <math>w^{\rm T}(\mathbf{X} - \operatorname{E}[\mathbf{X}])</math> is a scalar.
 
Conversely, every symmetric positive semi-definite matrix is a covariance matrix. To see this, suppose <math>M</math> is a <math>p \times p</math> symmetric positive-semidefinite matrix. From the finite-dimensional case of the [[spectral theorem]], it follows that <math>M</math> has a nonnegative symmetric [[Square root of a matrix|square root]], which can be denoted by <math>M^\frac{1}{2}</math>. Let <math>\mathbf{X}</math> be any <math>p \times 1</math> column vector-valued random variable whose covariance matrix is the <math>p \times p</math> identity matrix. Then
 
:<math>\operatorname{var}(M^\frac{1}{2} \mathbf{X}) = M^\frac{1}{2} \, \operatorname{var}(\mathbf{X}) \, M^\frac{1}{2} = M.</math>
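A numeric sketch of this construction (the matrix <math>M</math> and the sample size are assumed for illustration): the square root is built from the spectral theorem, and the empirical covariance of the transformed samples approximates <math>M</math>.

```python
import numpy as np

# Assumed example: a symmetric positive definite target covariance.
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Non-negative square root via the spectral theorem: M = V diag(w) V^T.
w, V = np.linalg.eigh(M)
M_half = V @ np.diag(np.sqrt(w)) @ V.T

# X has (approximately) identity covariance; then Y = M^(1/2) X has
# covariance approximately M.
rng = np.random.default_rng(1)
X = rng.normal(size=(2, 100_000))
Y = M_half @ X

print(np.allclose(M_half @ M_half, M))   # True: it is a square root
print(np.cov(Y))                         # close to M
```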
 
 
==Decomposition==
<math>M</math> is positive definite if and only if such a decomposition exists with <math>B</math> [[Invertible matrix|invertible]].
More generally, <math>M</math> is positive semidefinite with rank <math>k</math> if and only if a decomposition exists with a <math>k \times n</math> matrix <math>B</math> of full row rank (i.e. of rank <math>k</math>).
Moreover, for any decomposition <math>M = B^* B</math>, <math>\operatorname{rank}(M) = \operatorname{rank}(B)</math>.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 440, Theorem 7.2.7</ref>
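The rank identity can be illustrated with a deliberately rank-deficient factor (the matrix below is an assumed example):

```python
import numpy as np

# B has rank 1 (its second row is twice the first), so M = B^T B
# is positive semidefinite of rank 1.
B = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
M = B.T @ B

print(np.linalg.matrix_rank(M), np.linalg.matrix_rank(B))  # 1 1
```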
 
In other words, a Hermitian matrix <math>M</math> is positive semidefinite if and only if it is the [[Gram matrix]] of some vectors <math>b_1,\dots,b_n</math>.
It is positive definite if and only if it is the Gram matrix of some [[linearly independent]] vectors.
In general, the rank of the Gram matrix of vectors <math>b_1,\dots,b_n</math> equals the dimension of the space [[Linear span|spanned]] by these vectors.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 441, Theorem 7.2.10</ref>
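As a numeric illustration (the three vectors are assumed examples): three vectors in <math>\mathbb{R}^2</math> are necessarily linearly dependent, so their Gram matrix is positive semidefinite with rank equal to the dimension of their span, here 2.

```python
import numpy as np

# Rows are the vectors b_1, b_2, b_3 in R^2; b_3 = b_1 + b_2.
b = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
G = b @ b.T   # Gram matrix: G[i, j] = <b_i, b_j>

print(np.linalg.matrix_rank(G))                  # 2, the dimension of the span
print(np.all(np.linalg.eigvalsh(G) >= -1e-12))   # positive semidefinite
```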
 
===Uniqueness up to unitary transformations===
The decomposition is not unique:
if <math>M = B^* B</math> for some <math>k \times n</math> matrix <math>B</math> and if <math>Q</math> is any [[unitary matrix|unitary]] <math>k \times k</math> matrix (meaning <math>Q^* Q = Q Q^* = I</math>),
then <math>M = B^* B = B^* Q^* Q B = A^* A</math> for <math>A = Q B</math>.
 
However, this is the only way in which two decompositions can differ: the decomposition is unique up to [[unitary transformation]]s.
More formally, if <math>A</math> is a <math>k \times n</math> matrix and <math>B</math> is an <math>\ell \times n</math> matrix such that <math>A^* A = B^* B</math>,
then there is an <math>\ell \times k</math> matrix <math>Q</math> with orthonormal columns (meaning <math>Q^* Q = I_{k \times k}</math>) such that <math>B = Q A</math>.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 452, Theorem 7.3.11</ref>
When <math>\ell = k</math> this means <math>Q</math> is [[unitary matrix|unitary]].
 
This statement has an intuitive geometric interpretation in the real case:
let the columns of <math>A</math> and <math>B</math> be the vectors <math>a_1,\dots,a_n</math> and <math>b_1,\dots,b_n</math> in <math>\mathbb{R}^k</math>.
A real unitary matrix is an [[orthogonal matrix]], which describes a [[rigid transformation]] (an isometry of Euclidean space <math>\mathbb{R}^k</math>) preserving the 0 point (i.e. [[Rotation matrix|rotations]] and [[Reflection matrix|reflections]], without translations).
Therefore, the dot products <math>a_i \cdot a_j</math> and <math>b_i \cdot b_j</math> are equal if and only if some rigid transformation of <math>\mathbb{R}^k</math> transforms the vectors <math>a_1,\dots,a_n</math> to <math>b_1,\dots,b_n</math> (and 0 to 0).
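The easy direction of this statement can be checked numerically (the rotation angle and the factor <math>B</math> below are arbitrary assumptions): multiplying <math>B</math> by an orthogonal <math>Q</math> leaves <math>B^* B</math>, and hence all pairwise dot products of the columns, unchanged.

```python
import numpy as np

# An arbitrary rotation of R^2 (a real unitary, i.e. orthogonal, matrix).
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# An arbitrary 2 x 3 factor; its columns are three vectors in R^2.
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
A = Q @ B   # rigidly rotated columns

# Same Gram matrix of dot products, hence A^T A = B^T B.
print(np.allclose(A.T @ A, B.T @ B))  # True
```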
 
===Square root===
{{main|Square root of a matrix}}
A Hermitian matrix <math>M</math> is positive semidefinite if and only if there is a positive semidefinite matrix <math>B</math> (in particular <math>B</math> is Hermitian, so <math>B^* = B</math>) satisfying <math>M = B B</math>. This matrix <math>B</math> is unique,<ref>{{harvtxt|Horn|Johnson|2013}}, p. 439, Theorem 7.2.6 with <math>k = 2</math></ref> is called the ''non-negative [[square root of a matrix|square root]]'' of <math>M</math>, and is denoted by <math>B = M^\frac{1}{2}</math>.
When <math>M</math> is positive definite, so is <math>M^\frac{1}{2}</math>, hence it is also called the ''positive square root'' of <math>M</math>.
 
The non-negative square root should not be confused with other decompositions <math>M = B^*B</math>.
Some authors use the name ''square root'' and <math>M^\frac{1}{2}</math> for any such decomposition, or specifically for the [[Cholesky decomposition]],
or any decomposition of the form <math>M = B B</math>;
others only use it for the non-negative square root.
 
If <math>M > N > 0</math> then <math>M^\frac{1}{2} > N^\frac{1}{2} > 0</math>; that is, the map <math>M \mapsto M^\frac{1}{2}</math> is [[Operator monotone function|operator monotone]].
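Both facts can be sketched numerically (the matrices <math>M</math> and <math>N</math> are assumed examples): the non-negative square root is computed from the spectral theorem, and the Loewner order <math>M > N</math> is checked via the eigenvalues of the difference.

```python
import numpy as np

def psd_sqrt(M):
    """Non-negative square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

# Assumed examples with M > N > 0 in the Loewner order.
M = np.array([[3.0, 1.0],
              [1.0, 3.0]])
N = np.eye(2)

print(np.allclose(psd_sqrt(M) @ psd_sqrt(M), M))    # True: a genuine square root
print(np.all(np.linalg.eigvalsh(M - N) > 0))        # True: M > N
# The order is preserved by the square root:
print(np.all(np.linalg.eigvalsh(psd_sqrt(M) - psd_sqrt(N)) > 0))  # True
```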
 
===Cholesky decomposition===
A Hermitian positive semidefinite matrix <math>M</math> can be written as <math>M = LL^*</math>, where <math>L</math> is lower triangular with non-negative diagonal (equivalently <math>M = B^*B</math> where <math>B=L^*</math> is upper triangular); this is the [[Cholesky decomposition]].
If <math>M</math> is positive definite, then the diagonal of <math>L</math> is positive and the Cholesky decomposition is unique. Conversely, if <math>L</math> is lower triangular with nonnegative diagonal, then <math>LL^*</math> is positive semidefinite.
The Cholesky decomposition is especially useful for efficient numerical calculations.
A closely related decomposition is the [[Cholesky decomposition#LDL decomposition|LDL decomposition]], <math>M = L D L^*</math>, where <math>D</math> is diagonal and <math>L</math> is [[Triangular matrix#Unitriangular matrix|lower unitriangular]].
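As a numeric sketch (the positive definite matrix below is an assumed example), NumPy's `numpy.linalg.cholesky` returns the lower-triangular factor <math>L</math> with <math>M = LL^*</math>:

```python
import numpy as np

# Assumed example: a symmetric positive definite matrix.
M = np.array([[4.0, 2.0],
              [2.0, 3.0]])

L = np.linalg.cholesky(M)   # lower triangular Cholesky factor

print(np.allclose(L @ L.T, M))   # True: M = L L^*
print(np.all(np.diag(L) > 0))    # True: positive diagonal, so the factor is unique
```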
 
== Other characterizations ==