Definite matrix: Difference between revisions

\top → \mathsf{T} (we should not use the top symbol to indicate transpose
 
{{use dmy dates|date=June 2024}}
 
In [[mathematics]], a symmetric matrix <math>M</math> with [[real number|real]] entries is '''positive-definite''' if the real number <math>\mathbf{x}^\mathsf{T} M \mathbf{x}</math> is positive for every nonzero real [[column vector]] <math>\mathbf{x},</math> where <math>\mathbf{x}^\mathsf{T}</math> is the [[row vector]] [[transpose]] of <math>\mathbf{x}.</math><ref>
{{cite book
|first = Adriaan |last = van den Bos
More generally, a [[Hermitian matrix]] (that is, a [[complex matrix]] equal to its [[conjugate transpose]]) is '''positive-definite''' if the real number <math>\mathbf{z}^* M \mathbf{z}</math> is positive for every nonzero complex column vector <math>\mathbf{z},</math> where <math>\mathbf{z}^*</math> denotes the conjugate transpose of <math>\mathbf{z}.</math>
 
'''Positive semi-definite''' matrices are defined similarly, except that the scalars <math>\mathbf{x}^\mathsf{T} M \mathbf{x}</math> and <math>\mathbf{z}^* M \mathbf{z}</math> are required to be positive ''or zero'' (that is, nonnegative). '''Negative-definite''' and '''negative semi-definite''' matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called ''indefinite''.
 
Some authors use more general definitions of definiteness, permitting the matrices to be non-symmetric or non-Hermitian. The properties of these generalized definite matrices are explored in {{alink|Extension for non-Hermitian square matrices}}, below, but are not the main focus of this article.
 
== Definitions ==
In the following definitions, <math>\mathbf{x}^\mathsf{T}</math> is the transpose of <math>\mathbf{x},</math> <math>\mathbf{z}^*</math> is the [[conjugate transpose]] of <math>\mathbf{z},</math> and <math>\mathbf{0}</math> denotes the {{nobr|{{mvar|n}} dimensional}} zero-vector.
 
=== Definitions for real matrices ===
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''positive-definite''' if <math>\mathbf{x}^\mathsf{T} M\mathbf{x} > 0</math> for all non-zero <math>\mathbf{x}</math> in <math>\mathbb{R}^n.</math> Formally,
 
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M\mathbf{x} > 0 \text{ for all } \mathbf{x} \in \R^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|background colour=var(--background-color-success-subtle,#d5fdf4)}}
 
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''positive-semidefinite''' or '''non-negative-definite''' if <math>\mathbf{x}^\mathsf{T} M\mathbf{x} \geq 0</math> for all <math>\mathbf{x}</math> in <math>\mathbb{R}^n .</math> Formally,
 
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive semi-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M\mathbf{x} \geq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n</math>
|cellpadding= 6
|border
|background colour=var(--background-color-success-subtle,#d5fdf4)}}
 
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''negative-definite''' if <math>\mathbf{x}^\mathsf{T} M\mathbf{x} < 0</math> for all non-zero <math>\mathbf{x}</math> in <math>\R^n.</math> Formally,
 
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M\mathbf{x} < 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|background colour=var(--background-color-success-subtle,#d5fdf4)}}
 
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''negative-semidefinite''' or '''non-positive-definite''' if <math>\mathbf{x}^\mathsf{T} M\mathbf{x} \leq 0</math> for all <math>\mathbf{x}</math> in <math>\mathbb{R}^n .</math> Formally,
 
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative semi-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M\mathbf{x} \leq 0 \text{ for all } \mathbf{x} \in \R^n</math>
|cellpadding= 6
|border
For complex matrices, the most common definition says that <math>M</math> is positive-definite if and only if <math>\mathbf{z}^* M\mathbf{z}</math> is real and positive for every non-zero complex column vector <math>\mathbf{z} .</math> This condition implies that <math>M</math> is Hermitian (i.e. its transpose is equal to its conjugate): since <math>\mathbf{z}^* M\mathbf{z}</math> is real, it equals its conjugate transpose <math>\mathbf{z}^*M^*\mathbf{z}</math> for every <math>\mathbf{z},</math> which implies <math>M = M^* .</math>
 
By this definition, a positive-definite ''real'' matrix <math>M</math> is Hermitian, hence symmetric; and <math>\mathbf{z}^\mathsf{T} M\mathbf{z}</math> is positive for all non-zero ''real'' column vectors <math>\mathbf{z} .</math> However, the last condition alone is not sufficient for <math>M</math> to be positive-definite. For example, if
<math display="block">M = \begin{bmatrix} 1 & 1 \\-1 & 1 \end{bmatrix},</math>
 
then for any real vector <math>\mathbf{z}</math> with entries <math>a</math> and <math>b</math> we have <math>\mathbf{z}^\mathsf{T} M\mathbf{z} = \left(a + b\right)a + \left(-a + b\right) b = a^2 + b^2,</math> which is always positive if <math>\mathbf{z}</math> is not zero. However, if <math>\mathbf{z}</math> is the complex vector with entries {{math|1}} and {{tmath| i }}, one gets
 
<math display="block">\mathbf{z}^* M\mathbf{z} = \begin{bmatrix} 1 & -i \end{bmatrix}M\begin{bmatrix} 1 \\i \end{bmatrix} = \begin{bmatrix} 1 + i & 1 - i \end{bmatrix}\begin{bmatrix} 1 \\i \end{bmatrix} = 2 + 2i .</math>
which is not real. Therefore, <math>M</math> is not positive-definite.
 
On the other hand, for a ''symmetric'' real matrix <math>M,</math> the condition "<math>\mathbf{z}^\mathsf{T} M\mathbf{z} > 0</math> for all nonzero real vectors <math>\mathbf{z}</math>" ''does'' imply that <math>M</math> is positive-definite in the complex sense.
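
The contrast between real and complex test vectors can be checked numerically. The following is a minimal sketch in plain Python; the helper name <code>hermitian_form</code> and the sample real vector are illustrative assumptions, not part of the article's notation.

```python
# Evaluate z* M z for a small matrix using Python's built-in complex numbers.
def hermitian_form(M, z):
    """Return z* M z, where z* is the conjugate transpose of z."""
    n = len(z)
    Mz = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
    return sum(z[i].conjugate() * Mz[i] for i in range(n))

M = [[1, 1], [-1, 1]]   # the non-symmetric example matrix from the text

# Real vector (a, b) = (3, -2): the form equals a^2 + b^2 = 13.
real_value = hermitian_form(M, [3.0, -2.0])

# Complex vector (1, i): the form is 2 + 2i, which is not real.
complex_value = hermitian_form(M, [1 + 0j, 1j])
```

Testing only real vectors would therefore wrongly suggest that this non-symmetric matrix is positive-definite.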
 
=== Notation ===
If a Hermitian matrix <math>M</math> is positive semi-definite, one sometimes writes <math>M \succeq 0</math> and if <math>M</math> is positive-definite one writes <math>M \succ 0.</math> To denote that <math>M</math> is negative semi-definite one writes <math>M \preceq 0</math> and to denote that <math>M</math> is negative-definite one writes <math>M \prec 0.</math>
 
{{unordered list
| The [[identity matrix]] <math>I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}</math> is positive-definite (and as such also positive semi-definite). It is a real symmetric matrix, and, for any non-zero column vector '''z''' with real entries ''a'' and ''b'', one has
<math display="block"> \mathbf{z}^\mathsf{T} I\mathbf{z} = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + b^2.</math>
Seen as a complex matrix, for any non-zero column vector '''z''' with complex entries ''a'' and ''b'' one has
<math display="block">\mathbf{z}^*I\mathbf{z} = \begin{bmatrix} \overline{a} & \overline{b} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b\end{bmatrix} = \overline{a}a + \overline{b}b = |a|^2 + |b|^2.</math>
is positive-definite since for any non-zero column vector '''z''' with entries ''a'', ''b'' and ''c'', we have
<math display="block">\begin{align}
\mathbf{z}^\mathsf{T} M \mathbf{z} = \left( \mathbf{z}^\mathsf{T} M \right) \mathbf{z}
&= \begin{bmatrix} (2a - b) & (-a + 2b - c) & (-b + 2c) \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \\
 
This result is a sum of squares, and therefore non-negative; and is zero only if <math>a = b = c = 0,</math> that is, when <math>\mathbf{z}</math> is the zero vector.
| For any real [[invertible matrix]] <math>A,</math> the product <math>A^\mathsf{T} A</math> is a positive definite matrix (if the means of the columns of A are 0, then this is also called the [[covariance matrix]]). A simple proof is that for any non-zero vector <math>\mathbf{z},</math> we have <math>\mathbf{z}^\mathsf{T} A^\mathsf{T} A\mathbf{z} = (A\mathbf{z})^\mathsf{T} (A\mathbf{z}) = \|A\mathbf{z}\|^2 > 0,</math> since the invertibility of <math>A</math> means that <math>A\mathbf{z} \neq 0.</math>
| The example <math>M</math> above shows that a matrix in which some elements are negative may still be positive definite. Conversely, a matrix whose entries are all positive is not necessarily positive definite, as for example
<math display="block">N = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},</math>
for which <math>\begin{bmatrix} -1 & 1 \end{bmatrix}N\begin{bmatrix} -1 & 1 \end{bmatrix}^\mathsf{T} = -2 < 0.</math>
}}
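
The last two examples can be reproduced with a few lines of plain Python. This is only an illustrative sketch: the helper names <code>quadratic_form</code> and <code>gram</code> and the invertible matrix <code>A</code> are assumptions chosen for the demonstration.

```python
def quadratic_form(M, x):
    """Return x^T M x for a real square matrix M given as a list of rows."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def gram(A):
    """Return A^T A for a real matrix A given as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[sum(A[k][i] * A[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]

# All entries of N are positive, yet the form is negative at x = (-1, 1).
N = [[1, 2], [2, 1]]
value_N = quadratic_form(N, [-1, 1])   # -2

# Hypothetical invertible matrix (determinant 1); A^T A is positive definite.
A = [[1, 2], [0, 1]]
G = gram(A)                            # [[1, 2], [2, 5]]
value_G = quadratic_form(G, [-1, 1])   # equals |A x|^2 = 2
```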
 
With this in mind, the one-to-one change of variable <math>\mathbf{y} = P\mathbf{z}</math> shows that <math>\mathbf{z}^* M\mathbf{z}</math> is real and positive for any nonzero complex vector <math>\mathbf{z}</math> if and only if <math>\mathbf{y}^* D \mathbf{y}</math> is real and positive for any nonzero <math>\mathbf{y};</math> in other words, if <math>D</math> is positive definite. For a diagonal matrix, this is true if and only if each element of the main diagonal – that is, every eigenvalue of <math>M</math> – is positive. Since the [[spectral theorem]] guarantees all eigenvalues of a Hermitian matrix to be real, the positivity of eigenvalues can be checked using [[Descartes' rule of signs|Descartes' rule of alternating signs]] when the [[characteristic polynomial]] of a real, symmetric matrix <math>M</math> is available.
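
As a small worked illustration of the sign-rule check (the two matrices are chosen here only as examples), the symmetric matrix <math>\left[\begin{smallmatrix} 1 & 2 \\ 2 & 1 \end{smallmatrix}\right]</math> has characteristic polynomial
<math display="block">\det\begin{bmatrix} \lambda - 1 & -2 \\ -2 & \lambda - 1 \end{bmatrix} = \lambda^2 - 2\lambda - 3 = (\lambda - 3)(\lambda + 1).</math>
Its coefficient signs <math>(+, -, -)</math> change only once, so exactly one eigenvalue is positive and the matrix is not positive definite. By contrast, <math>\left[\begin{smallmatrix} 2 & -1 \\ -1 & 2 \end{smallmatrix}\right]</math> has characteristic polynomial <math>\lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3),</math> whose signs <math>(+, -, +)</math> change twice; since all roots are real, both eigenvalues are positive and the matrix is positive definite.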
 
== Decomposition ==
{{See also|Gram matrix}}
Let <math>M</math> be an <math>n \times n</math> [[Hermitian matrix]].
of a matrix <math>B</math> with its [[conjugate transpose]].
 
When <math>M</math> is real, <math>B</math> can be real as well and the decomposition can be written as <math display="block">M = B^\mathsf{T} B.</math>
 
<math>M</math> is positive definite if and only if such a decomposition exists with <math>B</math> [[Invertible matrix|invertible]].
In general, the rank of the Gram matrix of vectors <math>b_1, \dots, b_n</math> equals the dimension of the space [[Linear span|spanned]] by these vectors.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 441, Theorem 7.2.10</ref>
 
=== Uniqueness up to unitary transformations ===
The decomposition is not unique:
if <math>M = B^* B</math> for some <math>k \times n</math> matrix <math>B</math> and if <math>Q</math> is any [[unitary matrix|unitary]] <math>k \times k</math> matrix (meaning <math>Q^* Q = Q Q^* = I</math>),
Therefore, the dot products <math>a_i \cdot a_j</math> and <math>b_i \cdot b_j</math> are equal if and only if some rigid transformation of <math>\mathbb{R}^k</math> transforms the vectors <math>a_1,\dots,a_n</math> to <math>b_1,\dots,b_n</math> (and 0 to 0).
 
=== Square root ===
{{main|Square root of a matrix}}
A Hermitian matrix <math>M</math> is positive semidefinite if and only if there is a positive semidefinite matrix <math>B</math> (in particular <math>B</math> is Hermitian, so <math>B^* = B</math>) satisfying <math>M = B B.</math> This matrix <math>B</math> is unique,<ref>{{harvtxt|Horn|Johnson|2013}}, p. 439, Theorem 7.2.6 with <math>k = 2</math></ref> is called the ''non-negative [[square root of a matrix|square root]]'' of <math>M,</math> and is denoted with <math>B = M^\frac{1}{2}.</math>
If <math>M \succ N \succ 0</math> then <math>M^\frac{1}{2} \succ N^\frac{1}{2} \succ 0.</math>
 
=== Cholesky decomposition ===
A Hermitian positive semidefinite matrix <math>M</math> can be written as <math>M = L L^*,</math> where <math>L</math> is lower triangular with non-negative diagonal (equivalently <math>M = B^*B</math> where <math>B = L^*</math> is upper triangular); this is the [[Cholesky decomposition]].
If <math>M</math> is positive definite, then the diagonal of <math>L</math> is positive and the Cholesky decomposition is unique. Conversely if <math>L</math> is lower triangular with nonnegative diagonal then <math>L L^*</math> is positive semidefinite.
A closely related decomposition is the [[Cholesky decomposition#LDL decomposition|LDL decomposition]], <math>M = L D L^*,</math> where <math>D</math> is diagonal and <math>L</math> is [[Triangular matrix#Unitriangular matrix|lower unitriangular]].
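
A minimal sketch of the Cholesky factorization for the real positive-definite case follows, written in plain Python. It assumes a real symmetric input and does not handle the semidefinite case, where a zero diagonal entry can occur; the function name and the test matrices are illustrative choices.

```python
import math

def cholesky(M):
    """Return lower-triangular L with M = L L^T, or None if a pivot is not positive.

    Assumes M is a real symmetric matrix given as a list of rows.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what is left of M[j][j] after the earlier columns.
        d = M[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            return None              # M is not positive definite
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (M[i][j] - s) / L[j][j]
    return L

M = [[4.0, 2.0], [2.0, 5.0]]                  # hypothetical example matrix
L = cholesky(M)                               # [[2.0, 0.0], [1.0, 2.0]]
not_pd = cholesky([[1.0, 2.0], [2.0, 1.0]])   # None: second pivot is 1 - 4 < 0
```

Because the factorization succeeds exactly when every diagonal entry computed along the way is positive, attempting it is a common practical test for positive definiteness.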
 
=== Williamson theorem ===
Any <math>2n\times 2n </math> positive definite Hermitian real matrix <math>M </math> can be diagonalized via symplectic (real) matrices. More precisely, [[Williamson theorem|Williamson's theorem]] ensures the existence of symplectic <math>S\in\mathbf{Sp}(2n,\mathbb{R}) </math> and diagonal real positive <math>D\in\mathbb{R}^{n\times n} </math> such that <math>SMS^T=D\oplus D </math>.
 
== Other characterizations ==
Let <math>M</math> be an <math>n \times n</math> [[Hermitian matrix|real symmetric matrix]], and let <math>B_1(M) \equiv \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{x}^\mathsf{T} M\mathbf{x} \leq 1\}</math> be the "unit ball" defined by <math>M.</math> Then we have the following:
* <math>B_1( \mathbf{v}\mathbf{v}^\mathsf{T} )</math> is a solid slab sandwiched between <math>\pm \{ \mathbf{w}: \langle \mathbf{w}, \mathbf{v}\rangle = 1 \}.</math>
* <math>M \succeq 0</math> if and only if <math>B_1(M)</math> is an ellipsoid, or an ellipsoidal cylinder.
* <math>M \succ 0</math> if and only if <math>B_1(M)</math> is bounded, that is, it is an ellipsoid.
* If <math>N \succ 0,</math> then <math>M \succeq N</math> if and only if <math>B_1(M) \subseteq B_1(N);</math> <math>M \succ N</math> if and only if <math>B_1(M) \subseteq \operatorname{int}\bigl(B_1(N)\bigr).</math>
* If <math>N \succ 0,</math> then <math>M \succeq \frac{ \mathbf{v}\mathbf{v}^\mathsf{T} }{\mathbf{v}^\mathsf{T} N\mathbf{v}}</math> for all <math>\mathbf{v} \neq \mathbf{0}</math> if and only if <math display="inline">B_1(M) \subset \bigcap_{ \mathbf{v}^\mathsf{T} N\mathbf{v} = 1 } B_1(\mathbf{v} \mathbf{v}^\mathsf{T}).</math> So, since the polar dual of an ellipsoid is also an ellipsoid with the same principal axes, with inverse lengths, we have <math display="block">B_1(N^{-1}) = \bigcap_{\mathbf{v}^\mathsf{T} N\mathbf{v} = 1} B_1(\mathbf{v}\mathbf{v}^\mathsf{T}) = \bigcap_{ \mathbf{v}^\mathsf{T} N\mathbf{v} = 1 } \{ \mathbf{w}: |\langle \mathbf{w}, \mathbf{v}\rangle| \leq 1 \}.</math> That is, if <math>N</math> is positive-definite, then <math>M \succeq \frac{ \mathbf{v} \mathbf{v}^\mathsf{T} }{\mathbf{v}^\mathsf{T} N\mathbf{v}}</math> for all <math>\mathbf{v} \neq \mathbf{0}</math> if and only if <math>M \succeq N^{-1} .</math>
 
Let <math>M</math> be an <math>n \times n</math> [[Hermitian matrix]]. The following properties are equivalent to <math>M</math> being positive definite:
; The associated sesquilinear form is an inner product : The [[sesquilinear form]] defined by <math>M</math> is the function <math>\langle \cdot, \cdot \rangle</math> from <math>\mathbb{C}^n \times \mathbb{C}^n</math> to <math>\mathbb{C}</math> such that <math>\langle \mathbf{x}, \mathbf{y} \rangle \equiv \mathbf{y}^* M\mathbf{x}</math> for all <math>\mathbf{x}</math> and <math>\mathbf{y}</math> in <math>\mathbb{C}^n,</math> where <math>\mathbf{y}^*</math> is the conjugate transpose of <math>\mathbf{y}.</math> For any complex matrix <math>M,</math> this form is linear in <math>\mathbf{x}</math> and semilinear in <math>\mathbf{y}.</math> Therefore, the form is an [[inner product]] on <math>\mathbb{C}^n</math> if and only if <math>\langle \mathbf{z}, \mathbf{z} \rangle</math> is real and positive for all nonzero <math>\mathbf{z};</math> that is, if and only if <math>M</math> is positive definite. (In fact, every inner product on <math>\mathbb{C}^n</math> arises in this fashion from a Hermitian positive definite matrix.)
; Its leading principal minors are all positive : The {{mvar|k}}th [[minor (linear algebra)|leading principal minor]] of a matrix <math>M</math> is the [[determinant]] of its upper-left <math>k \times k</math> sub-matrix. It turns out that a matrix is positive definite if and only if all these determinants are positive. This condition is known as [[Sylvester's criterion]], and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an [[upper triangular matrix]] by using [[elementary row operations]], as in the first part of the [[Gaussian elimination]] method, taking care to preserve the sign of its determinant during the [[pivot element|pivoting]] process. Since the {{mvar|k}}th leading principal minor of a triangular matrix is the product of its diagonal elements up to row <math>k,</math> Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row <math>k</math> of the triangular matrix is obtained.
 
A positive semidefinite matrix is positive definite if and only if it is [[invertible matrix|invertible]].<ref>{{harvtxt|Horn|Johnson|2013}}, p. 431, Corollary 7.1.7</ref>
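
The elimination-based form of Sylvester's criterion described above can be sketched as follows in plain Python. The sketch assumes a real symmetric input and performs no row exchanges, so each pivot equals the ratio of consecutive leading principal minors; the function name and the test matrices are illustrative.

```python
def is_positive_definite(M):
    """Test a real symmetric matrix (list of rows) by elimination without row swaps.

    The k-th pivot equals det(M_k) / det(M_{k-1}), so all leading principal
    minors are positive exactly when every pivot is positive.
    """
    n = len(M)
    U = [[float(entry) for entry in row] for row in M]   # working copy
    for k in range(n):
        pivot = U[k][k]
        if pivot <= 0.0:
            return False             # stop as soon as a pivot fails the test
        for i in range(k + 1, n):
            factor = U[i][k] / pivot
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return True

tridiagonal = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # hypothetical test matrix
indefinite = [[1, 2], [2, 1]]
singular = [[1, 1], [1, 1]]          # positive semidefinite but not invertible

results = [is_positive_definite(m) for m in (tridiagonal, indefinite, singular)]
```

The check can stop at the first non-positive pivot, which matches the remark that the condition can be tested each time a new row of the triangular matrix is obtained.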
== Quadratic forms ==
{{Main|Definite quadratic form}}
The (purely) [[quadratic form]] associated with a real <math>n \times n</math> matrix <math>M</math> is the function <math>Q : \mathbb{R}^n \to \mathbb{R}</math> such that <math>Q(\mathbf{x}) = \mathbf{x}^\mathsf{T} M \mathbf{x}</math> for all <math>\mathbf{x}.</math> <math>M</math> can be assumed symmetric by replacing it with <math>\tfrac{1}{2} \left(M + M^\mathsf{T} \right),</math> since any asymmetric part will be zeroed-out in the double-sided product.
 
A symmetric matrix <math>M</math> is positive definite if and only if its quadratic form is a [[strictly convex function]].
 
More generally, any [[quadratic function]] from <math>\mathbb{R}^n</math> to <math>\mathbb{R}</math> can be written as <math>\mathbf{x}^\mathsf{T} M \mathbf{x} + \mathbf{b}^\mathsf{T} \mathbf{x} + c</math> where <math>M</math> is a symmetric <math>n \times n</math> matrix, <math>\mathbf{b}</math> is a real {{nobr|{{mvar|n}} &nbsp;vector,}} and <math>c</math> a real constant. In the <math>n = 1</math> case, this is a parabola, and just like in the <math>n = 1</math> case, we have
 
'''Theorem:''' This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if <math>M</math> is positive definite.
 
'''Proof:''' If <math>M</math> is positive definite, then the function is strictly convex. Its gradient <math>2M\mathbf{x} + \mathbf{b}</math> is zero at the unique point <math>-\tfrac{1}{2} M^{-1} \mathbf{b},</math> which must be the global minimum since the function is strictly convex. If <math>M</math> is not positive definite, then there exists some nonzero vector <math>\mathbf{v}</math> such that <math>\mathbf{v}^\mathsf{T} M \mathbf{v} \leq 0,</math> so the function <math>f(t) \equiv ( t \mathbf{v} )^\mathsf{T} M ( t\mathbf{v} ) + \mathbf{b}^\mathsf{T} (t \mathbf{v}) + c</math> is a line or a downward parabola, thus not strictly convex and without a unique global minimum.
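
The location of the minimum can be made explicit by completing the square. The identity below is a sketch that assumes <math>M</math> is symmetric and invertible:
<math display="block">\mathbf{x}^\mathsf{T} M \mathbf{x} + \mathbf{b}^\mathsf{T} \mathbf{x} + c = \left(\mathbf{x} + \tfrac{1}{2} M^{-1}\mathbf{b}\right)^\mathsf{T} M \left(\mathbf{x} + \tfrac{1}{2} M^{-1}\mathbf{b}\right) + c - \tfrac{1}{4}\, \mathbf{b}^\mathsf{T} M^{-1} \mathbf{b}.</math>
When <math>M</math> is positive definite, the first term on the right is nonnegative and vanishes only at <math>\mathbf{x} = -\tfrac{1}{2} M^{-1}\mathbf{b},</math> so the minimum value is <math>c - \tfrac{1}{4}\, \mathbf{b}^\mathsf{T} M^{-1} \mathbf{b}.</math> For instance, with <math>n = 1,</math> <math>M = 1,</math> <math>b = -2</math> and <math>c = 0,</math> the function <math>x^2 - 2x = (x - 1)^2 - 1</math> attains its minimum <math>-1</math> at <math>x = 1.</math>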
 
For this reason, positive definite matrices play an important role in [[optimization (mathematics)|optimization]] problems.
One symmetric matrix and another matrix that is both symmetric and positive definite can be [[diagonalizable matrix#Simultaneous diagonalization|simultaneously diagonalized]]. This is so although simultaneous diagonalization is not necessarily performed with a [[Matrix similarity|similarity transformation]]. This result does not extend to the case of three or more matrices. In this section we write for the real case. Extension to the complex case is immediate.
 
Let <math>M</math> be a symmetric and <math>N</math> a symmetric and positive definite matrix. Write the generalized eigenvalue equation as <math>\left(M - \lambda N\right)\mathbf{x} = 0</math> where we impose that <math>\mathbf{x}</math> be normalized, i.e. <math>\mathbf{x}^\mathsf{T} N \mathbf{x} = 1.</math> Now we use [[Cholesky decomposition]] to write the inverse of <math>N</math> as <math>Q^\mathsf{T} Q.</math> Multiplying by <math>Q</math> and letting <math>\mathbf{x} = Q^\mathsf{T} \mathbf{y},</math> we get <math>Q \left(M - \lambda N\right) Q^\mathsf{T} \mathbf{y} = 0,</math> which can be rewritten as <math>\left(Q M Q^\mathsf{T} \right)\mathbf{y} = \lambda \mathbf{y}</math> where <math>\mathbf{y}^\mathsf{T} \mathbf{y} = 1.</math> Manipulation now yields <math>MX = NX\Lambda</math> where <math>X</math> is a matrix having as columns the generalized eigenvectors and <math>\Lambda</math> is a diagonal matrix of the generalized eigenvalues. Now premultiplication with <math>X^\mathsf{T}</math> gives the final result: <math>X^\mathsf{T} MX = \Lambda</math> and <math>X^\mathsf{T} N X = I,</math> but note that this is no longer an orthogonal diagonalization with respect to the inner product where <math>\mathbf{y}^\mathsf{T} \mathbf{y} = 1.</math> In fact, we diagonalized <math>M</math> with respect to the inner product induced by <math>N.</math><ref>{{harvtxt|Horn|Johnson|2013}}, p. 485, Theorem 7.6.1</ref>
 
Note that this result does not contradict what is said on simultaneous diagonalization in the article [[diagonalizable matrix#Simultaneous diagonalization|Diagonalizable matrix]], which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.
 
== Properties ==
=== Induced partial ordering ===
For arbitrary square matrices <math>M,</math> <math>N</math> we write <math>M \ge N</math> if <math>M - N \ge 0</math> i.e., <math>M - N</math> is positive semi-definite. This defines a [[partially ordered set|partial ordering]] on the set of all square matrices. One can similarly define a strict partial ordering <math>M > N.</math> The ordering is called the [[Loewner order]].
 
=== Inverse of positive definite matrix ===
Every positive definite matrix is [[invertible matrix|invertible]] and its inverse is also positive definite.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 438, Theorem 7.2.1</ref> If <math>M \geq N > 0</math> then <math>N^{-1} \geq M^{-1} > 0.</math><ref>{{harvtxt|Horn|Johnson|2013}}, p. 495, Corollary 7.7.4(a)</ref> Moreover, by the [[min-max theorem]], the {{mvar|k}}th largest eigenvalue of <math>M</math> is greater than or equal to the {{mvar|k}}th largest eigenvalue of <math>N.</math>
 
* If <math>M</math> is positive-definite and <math>N</math> is positive-semidefinite, then the sum <math>M + N</math> is also positive-definite.
 
=== Multiplication ===
* If <math>M</math> and <math>N</math> are positive definite, then the products <math>M N M</math> and <math>NMN</math> are also positive definite. If <math>M N = N M,</math> then <math>M N</math> is also positive definite.
* If <math>M</math> is positive semidefinite, then <math>A^* M A</math> is positive semidefinite for any (possibly rectangular) matrix <math>A .</math> If <math>M</math> is positive definite and <math>A</math> has full column rank, then <math>A^* M A</math> is positive definite.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 431, Observation 7.1.8</ref>
 
=== Trace ===
The diagonal entries <math>m_{ii}</math> of a positive-semidefinite matrix are real and non-negative. As a consequence the [[trace (linear algebra)|trace]], <math>\operatorname{tr}(M) \ge 0.</math> Furthermore,<ref>{{harvtxt|Horn|Johnson|2013}}, p. 430</ref> since every principal sub-matrix (in particular, 2-by-2) is positive semidefinite,
<math display="block">\left|m_{ij}\right| \leq \sqrt{m_{ii}m_{jj}} \quad \forall i, j</math>
 
and thus, when <math>n \ge 1,</math>
<math display="block"> \max_{i,j} \left|m_{ij}\right| \leq \max_i m_{ii}</math>
Another important result is that for any <math>M</math> and <math>N</math> positive-semidefinite matrices, <math>\operatorname{tr}(MN) \ge 0 .</math> This follows by writing <math>\operatorname{tr}(MN) = \operatorname{tr}(M^\frac{1}{2}N M^\frac{1}{2}).</math> The matrix <math>M^\frac{1}{2}N M^\frac{1}{2}</math> is positive-semidefinite and thus has non-negative eigenvalues, whose sum, the trace, is therefore also non-negative.
 
=== Hadamard product ===
If <math>M, N \geq 0,</math> although <math>M N</math> is not necessarily positive semidefinite, the [[Hadamard product (matrices)|Hadamard product]] is, <math>M \circ N \geq 0</math> (this result is often called the [[Schur product theorem]]).<ref>{{harvtxt|Horn|Johnson|2013}}, p. 479, Theorem 7.5.3</ref>
 
* <math>\det(M \circ N) \geq \det(M) \det(N).</math><ref name=styan1973>{{cite journal |last=Styan |first=G.P. |year=1973 |title=Hadamard products and multivariate statistical analysis |journal=[[Linear Algebra and Its Applications]] |volume=6 |pages=217–240 |doi=10.1016/0024-3795(73)90023-2 }}, Corollary 3.6, p. 227</ref>
 
=== Kronecker product ===
If <math>M, N \geq 0,</math> although <math>M N</math> is not necessarily positive semidefinite, the [[Kronecker product]] is: <math>M \otimes N \geq 0.</math>
 
=== Frobenius product ===
If <math>M, N \geq 0,</math> although <math>M N</math> is not necessarily positive semidefinite, the [[Frobenius inner product]] is nonnegative: <math>M : N \geq 0</math> (Lancaster–Tismenetsky, ''The Theory of Matrices'', p.&nbsp;218).
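
The different behaviour of these products can be seen on a small example. The sketch below uses plain Python; the two positive-definite matrices and the helper names are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hadamard(A, B):
    """Entrywise (Hadamard) product of two matrices of the same shape."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Two symmetric positive-definite matrices (leading minors 2, 1 and 1, 1).
M = [[2, 1], [1, 1]]
N = [[1, 1], [1, 2]]

MN = matmul(M, N)     # [[3, 4], [2, 3]]: not symmetric, so not positive semidefinite
tr_MN = trace(MN)     # 6, nonnegative as the trace result predicts
H = hadamard(M, N)    # [[2, 1], [1, 2]]: symmetric with leading minors 2 and 3
frobenius = sum(sum(row) for row in H)   # Frobenius inner product M : N = 6
```

For real symmetric matrices the Frobenius inner product coincides with <math>\operatorname{tr}(MN),</math> which is why both values agree here.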
 
=== Convexity ===
The set of positive semidefinite symmetric matrices is [[convex set|convex]]. That is, if <math>M</math> and <math>N</math> are positive semidefinite, then for any <math>\alpha</math> between {{math|0}} and {{math|1}}, <math>\alpha M + \left(1 - \alpha\right) N</math> is also positive semidefinite. For any vector <math>\mathbf{x}</math>:
<math display="block">\mathbf{x}^\mathsf{T} \left(\alpha M + \left(1 - \alpha\right)N\right)\mathbf{x} = \alpha \mathbf{x}^\mathsf{T} M\mathbf{x} + (1 - \alpha) \mathbf{x}^\mathsf{T} N\mathbf{x} \geq 0.</math>
 
Because of this convexity, any locally optimal solution of a [[semidefinite programming]] problem is also globally optimal.
The positive-definiteness of a matrix <math>A</math> expresses that the angle <math>\theta</math> between any vector <math>\mathbf{x}</math> and its image <math>A \mathbf{x}</math> is always <math>-\pi / 2 < \theta < +\pi / 2:</math>
 
<math display="block">\cos\theta = \frac{ \mathbf{x}^\mathsf{T} A\mathbf{x} }{\lVert \mathbf{x} \rVert \lVert A\mathbf{x} \rVert} = \frac{\langle \mathbf{x}, A\mathbf{x} \rangle}{\lVert \mathbf{x} \rVert \lVert A\mathbf{x} \rVert} , \theta = \theta(\mathbf{x}, A \mathbf{x}) \equiv \widehat{\left(\mathbf{x},A\mathbf{x}\right)} \equiv</math> the angle between <math>\mathbf{x}</math> and <math>A\mathbf{x}.</math>
 
=== Further properties ===
 
# If <math>M</math> is a symmetric [[Toeplitz matrix]], i.e. the entries <math>m_{ij}</math> are given as a function of their absolute index differences: <math>m_{ij} = h(|i-j|),</math> and the ''strict'' inequality <math display="inline">\sum_{j \neq 0} \left|h(j)\right| < h(0)</math> holds, then <math>M</math> is ''strictly'' positive definite.
# If <math>M_k</math> denotes the leading <math>k \times k</math> minor, <math>\det\left(M_k\right)/\det\left(M_{k-1}\right)</math> is the {{mvar|k}}th pivot during [[LU decomposition]].
# A matrix is negative definite if its {{mvar|k}}th order leading [[principal minor]] is negative when <math>k</math> is odd, and positive when <math>k</math> is even.
# If <math>M</math> is a real positive definite matrix, then there exists a positive real number <math>m</math> such that for every vector <math>\mathbf{v},</math> <math>\mathbf{v}^\mathsf{T} M\mathbf{v} \geq m\|\mathbf{v}\|_2^{2}.</math>
# A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is however not enough to consider the leading principal minors only, as is checked on the diagonal matrix with entries {{math|0}} and {{math|−1&nbsp;.}}
 
where each block is <math>n \times n.</math> By applying the positivity condition, it immediately follows that <math>A</math> and <math>D</math> are Hermitian, and <math>C = B^*.</math>
 
We have that <math>\mathbf{z}^* M\mathbf{z} \ge 0</math> for all complex <math>\mathbf{z},</math> and in particular for <math>\mathbf{z} = [\mathbf{v}, 0]^\mathsf{T} .</math> Then
<math display="block">\begin{bmatrix} \mathbf{v}^* & 0 \end{bmatrix} \begin{bmatrix} A & B \\ B^* & D \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ 0 \end{bmatrix} = \mathbf{v}^* A\mathbf{v} \ge 0.</math>
 
 
=== Local extrema ===
A general [[quadratic form]] <math>f(\mathbf{x})</math> on <math>n</math> real variables <math>x_1, \ldots, x_n</math> can always be written as <math>\mathbf{x}^\mathsf{T} M \mathbf{x}</math> where <math>\mathbf{x}</math> is the column vector with those variables, and <math>M</math> is a symmetric real matrix. Therefore, the matrix being positive definite means that <math>f</math> has a unique minimum (zero) when <math>\mathbf{x}</math> is zero, and is strictly positive for any other <math>\mathbf{x}.</math>
 
More generally, a twice-differentiable real function <math>f</math> on <math>n</math> real variables has a strict local minimum at arguments <math>x_1, \ldots, x_n</math> if its [[gradient]] is zero and its [[Hessian matrix|Hessian]] (the matrix of all second derivatives) is positive definite at that point. Conversely, at any local minimum the gradient is zero and the Hessian is positive semi-definite, but a positive semi-definite Hessian alone does not guarantee a minimum. Similar statements can be made for negative definite and semi-definite matrices.
 
=== Covariance ===
In [[statistics]], the [[covariance matrix]] of a [[multivariate probability distribution]] is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.
 
== Extension for non-Hermitian square matrices ==
The definition of positive definite can be generalized by designating any complex matrix <math>M</math> (e.g. real non-symmetric) as positive definite if <math>\mathcal{R_e} \left\{\mathbf{z}^* M \mathbf{z}\right\} > 0</math> for all non-zero complex vectors <math>\mathbf{z},</math> where <math>\mathcal{R_e}\{c\}</math> denotes the real part of a [[complex number]] <math>c.</math><ref name="mathw">{{cite web |last = Weisstein |first = Eric W. |url = http://mathworld.wolfram.com/PositiveDefiniteMatrix.html |title = Positive definite matrix |website = MathWorld |publisher = Wolfram Research |access-date= 2012-07-26 }}</ref> Only the Hermitian part <math display="inline">\frac{1}{2}\left(M + M^*\right)</math> determines whether the matrix is positive definite, and is assessed in the narrower sense above. Similarly, if <math>\mathbf{x}</math> and <math>M</math> are real, we have <math>\mathbf{x}^\mathsf{T} M \mathbf{x} > 0</math> for all real nonzero vectors <math>\mathbf{x}</math> if and only if the symmetric part <math display="inline">\frac{1}{2}\left(M + M^\mathsf{T} \right)</math> is positive definite in the narrower sense. It is immediately clear that <math display="inline">\mathbf{x}^\mathsf{T} M \mathbf{x} = \sum_{ij} x_i M_{ij} x_j</math> is insensitive to transposition of <math>M.</math>
 
A non-symmetric real matrix with only positive eigenvalues may have a symmetric part with negative eigenvalues, in which case it will not be positive (semi)definite. For example, the matrix <math display=inline>M = \left[\begin{smallmatrix} 4 & 9 \\ 1 & 4 \end{smallmatrix}\right]</math> has positive eigenvalues 1 and 7, yet <math>\mathbf{x}^\mathsf{T} M \mathbf{x} = -2 </math> with the choice <math>\mathbf{x} = \left[\begin{smallmatrix} -1 \\ 1 \end{smallmatrix}\right] </math>.
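
This example can be verified with a short plain-Python sketch. The helper names are illustrative, and <code>eigenvalues_2x2</code> assumes a 2 × 2 real matrix whose eigenvalues are real.

```python
import math

def quadratic_form(M, x):
    """Return x^T M x for a real square matrix M given as a list of rows."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

def symmetric_part(M):
    """Return (M + M^T) / 2."""
    n = len(M)
    return [[(M[i][j] + M[j][i]) / 2 for j in range(n)] for i in range(n)]

def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 real matrix, assuming they are real."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(tr * tr - 4 * det)
    return ((tr - root) / 2, (tr + root) / 2)

M = [[4, 9], [1, 4]]
x = [-1, 1]

eig_M = eigenvalues_2x2(M)        # (1.0, 7.0): both positive
value = quadratic_form(M, x)      # -2: the form is negative at x
S = symmetric_part(M)             # [[4.0, 5.0], [5.0, 4.0]]
eig_S = eigenvalues_2x2(S)        # (-1.0, 9.0): the symmetric part is indefinite
same = quadratic_form(S, x)       # -2.0: only the symmetric part matters
```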
 
In summary, the distinguishing feature between the real and complex case is that a [[Bounded operator|bounded]] positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint. The general claim can be argued using the [[polarization identity]]. That is no longer true in the real case.
== Applications ==
=== Heat conductivity matrix ===
Fourier's law of heat conduction, giving heat flux <math>\mathbf{q}</math> in terms of the temperature gradient <math>\mathbf{g} = \nabla T,</math> is written for anisotropic media as <math>\mathbf{q} = -K \mathbf{g},</math> in which <math>K</math> is the [[thermal conductivity]] matrix. The negative is inserted in Fourier's law to reflect the expectation that heat will always flow from hot to cold. In other words, since the temperature gradient <math>\mathbf{g}</math> always points from cold to hot, the heat flux <math>\mathbf{q}</math> is expected to have a negative inner product with <math>\mathbf{g}</math> so that <math>\mathbf{q}^\mathsf{T} \mathbf{g} < 0.</math> Substituting Fourier's law then gives this expectation as <math>\mathbf{g}^\mathsf{T} K\mathbf{g} > 0,</math> implying that the conductivity matrix should be positive definite. Ordinarily <math>K</math> should be symmetric; however, it becomes nonsymmetric in the presence of a magnetic field, as in a [[thermal Hall effect]].
 
More generally in thermodynamics, the flow of heat and particles is a fully coupled system as described by the [[Onsager reciprocal relations]], and the coupling matrix is required to be positive semi-definite (possibly non-symmetric) in order that entropy production be nonnegative.
 
== See also ==
* [[Covariance matrix]]
* [[M-matrix]]
* [[Positive-definite function]]
* [[Positive-definite kernel]]
* [[Schur complement]]
* [[Sylvester's criterion]]
* [[Numerical range]]
* [[Williamson theorem]]
 
== References ==
|isbn=978-0-521-54823-6
}}
 
* {{cite book
|first=Rajendra |last=Bhatia |author-link=Rajendra Bhatia
|isbn=978-0-691-12918-1
}}
 
* {{cite journal
|last1=Bernstein |first1=B.