{{Short description|Property of a mathematical matrix}}
{{use dmy dates|date=June 2024}}
{{hatnote|Not to be confused with [[Positive matrix]] and [[Totally positive matrix]].}}
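In [[mathematics]], a symmetric matrix <math>M</math> with real entries is '''positive-definite''' if the real number <math>\mathbf{x}^\mathsf{T} M \mathbf{x}</math> is positive for every nonzero real [[column vector]] <math>\mathbf{x},</math> where <math>\mathbf{x}^\mathsf{T}</math> is the [[row vector]] [[transpose]] of <math>\mathbf{x}.</math><ref>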
{{cite book
|first = Adriaan |last = van den Bos
|title = Parameter Estimation for Scientists and Engineers |edition=online
|publisher = John Wiley & Sons
|
|pages = 259–263
|doi = 10.1002/9780470173862 |doi-access=
}} Print ed. {{ISBN|9780470147818}}
</ref>
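More generally, a [[Hermitian matrix]] (that is, a [[complex matrix]] equal to its [[conjugate transpose]]) is '''positive-definite''' if the real number <math>\mathbf{z}^* M \mathbf{z}</math> is positive for every nonzero complex column vector <math>\mathbf{z},</math> where <math>\mathbf{z}^*</math> denotes the conjugate transpose of <math>\mathbf{z}.</math>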
'''Positive semi-definite''' matrices are defined similarly, except that the scalars <math>\ \mathbf{x}^\mathsf{T} M \mathbf{x}\ </math> and <math>\ \mathbf{z}^* M \mathbf{z}\ </math> are required to be positive ''or zero'' (that is, not negative). '''Negative-definite''' and '''negative semi-definite''' matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called ''indefinite''.
Some authors use more general definitions of definiteness, permitting the matrices to be non-symmetric or non-Hermitian. The properties of these generalized definite matrices are explored in {{alink|Extension for non-Hermitian square matrices}}, below, but are not the main focus of this article.
== Ramifications ==
Positive-definite and positive-semidefinite matrices can be characterized in many ways, which may explain the importance of the concept in various parts of mathematics. A matrix {{mvar|M}} is positive-definite if and only if it satisfies any of the following equivalent conditions.
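* <math>M</math> is [[matrix congruence|congruent]] with a [[diagonal matrix]] with positive real entries.
* <math>M</math> is symmetric or Hermitian, and all its [[eigenvalue]]s are real and positive.
* <math>M</math> is symmetric or Hermitian, and all its leading [[principal minor]]s are positive.
* There exists an [[invertible matrix]] <math>B</math> with conjugate transpose <math>B^*</math> such that <math>M = B^* B.</math>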
A matrix is positive semi-definite if it satisfies similar equivalent conditions where "positive" is replaced by "nonnegative", "invertible matrix" is replaced by "matrix", and the word "leading" is removed.
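Positive-definite and positive-semidefinite real matrices are at the basis of [[convex optimization]]: given a [[function of several real variables]] that is twice [[differentiable function|differentiable]], if its [[Hessian matrix]] (matrix of its second partial derivatives) is positive-definite at a point <math>p,</math> then the function is [[convex function|convex]] near <math>p,</math> and, conversely, if the function is convex near <math>p,</math> then the Hessian matrix is positive-semidefinite at <math>p.</math>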
The set of positive definite matrices is an [[Open set|open]] [[convex cone]], while the set of positive semi-definite matrices is a [[closed set|closed]] convex cone.<ref>
}}
</ref>
== Definitions ==
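In the following definitions, <math>\mathbf{x}^\mathsf{T}</math> is the transpose of <math>\mathbf{x},</math> <math>\mathbf{x}^*</math> is the [[conjugate transpose]] of <math>\mathbf{x},</math> and <math>\mathbf{0}</math> denotes the {{mvar|n}}-dimensional zero vector.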
=== Definitions for real matrices ===
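An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''positive-definite''' if <math>\mathbf{x}^\mathsf{T} M \mathbf{x} > 0</math> for all non-zero <math>\mathbf{x}</math> in <math>\mathbb{R}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M \mathbf{x} > 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''positive-semidefinite''' or '''non-negative-definite''' if <math>\mathbf{x}^\mathsf{T} M \mathbf{x} \geq 0</math> for all <math>\mathbf{x}</math> in <math>\mathbb{R}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive semi-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M \mathbf{x} \geq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''negative-definite''' if <math>\mathbf{x}^\mathsf{T} M \mathbf{x} < 0</math> for all non-zero <math>\mathbf{x}</math> in <math>\mathbb{R}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M \mathbf{x} < 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> symmetric real matrix <math>M</math> is said to be '''negative-semidefinite''' or '''non-positive-definite''' if <math>\mathbf{x}^\mathsf{T} M \mathbf{x} \leq 0</math> for all <math>\mathbf{x}</math> in <math>\mathbb{R}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative semi-definite} \quad \iff \quad \mathbf{x}^\mathsf{T} M \mathbf{x} \leq 0 \text{ for all } \mathbf{x} \in \mathbb{R}^n</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> symmetric real matrix which is neither positive semidefinite nor negative semidefinite is called '''indefinite'''.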
=== Definitions for complex matrices ===
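The following definitions all involve the term <math>\mathbf{z}^* M \mathbf{z}.</math> Notice that this is always a real number for any Hermitian square matrix <math>M.</math>
An <math>n \times n</math> Hermitian complex matrix <math>M</math> is said to be '''positive-definite''' if <math>\mathbf{z}^* M \mathbf{z} > 0</math> for all non-zero <math>\mathbf{z}</math> in <math>\mathbb{C}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} > 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> Hermitian complex matrix <math>M</math> is said to be '''positive semi-definite''' or '''non-negative-definite''' if <math>\mathbf{z}^* M \mathbf{z} \geq 0</math> for all <math>\mathbf{z}</math> in <math>\mathbb{C}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ positive semi-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} \geq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> Hermitian complex matrix <math>M</math> is said to be '''negative-definite''' if <math>\mathbf{z}^* M \mathbf{z} < 0</math> for all non-zero <math>\mathbf{z}</math> in <math>\mathbb{C}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} < 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n \setminus \{\mathbf{0}\}</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> Hermitian complex matrix <math>M</math> is said to be '''negative semi-definite''' or '''non-positive-definite''' if <math>\mathbf{z}^* M \mathbf{z} \leq 0</math> for all <math>\mathbf{z}</math> in <math>\mathbb{C}^n.</math> Formally,
{{Equation box 1
|indent =
|title=
|equation = <math>M \text{ negative semi-definite} \quad \iff \quad \mathbf{z}^* M \mathbf{z} \leq 0 \text{ for all } \mathbf{z} \in \mathbb{C}^n</math>
|cellpadding= 6
|border
|border colour = #0073CF
|background colour=var(--background-color-success-subtle)}}
An <math>n \times n</math> Hermitian complex matrix which is neither positive semidefinite nor negative semidefinite is called '''indefinite'''.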
=== Consistency between real and complex definitions ===
Since every real matrix is also a complex matrix, the definitions of "definiteness" for the two classes must agree.
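For complex matrices, the most common definition says that <math>M</math> is positive-definite if and only if <math>\mathbf{z}^* M \mathbf{z}</math> is real and positive for every non-zero complex column vector <math>\mathbf{z}.</math> This condition implies that <math>M</math> is Hermitian (i.e. its transpose is equal to its conjugate): since <math>\mathbf{z}^* M \mathbf{z}</math> is real, it equals its conjugate transpose <math>\mathbf{z}^* M^* \mathbf{z}</math> for every <math>\mathbf{z},</math> which implies <math>M = M^*.</math>
By this definition, a positive-definite ''real'' matrix <math>M</math> is Hermitian, hence symmetric; and <math>\mathbf{z}^\mathsf{T} M \mathbf{z}</math> is positive for all non-zero ''real'' column vectors <math>\mathbf{z}.</math> However, the last condition alone is not sufficient for <math>M</math> to be positive-definite. For example, if
<math display="block">M = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},</math>
then for any real vector <math>\mathbf{z}</math> with entries <math>a</math> and <math>b</math> we have <math>\mathbf{z}^\mathsf{T} M \mathbf{z} = (a + b)a + (-a + b)b = a^2 + b^2,</math> which is always positive if <math>\mathbf{z}</math> is not zero. However, if <math>\mathbf{z}</math> is the complex vector with entries <math>1</math> and <math>i,</math> one gets
<math display="block">\mathbf{z}^* M \mathbf{z} = \begin{bmatrix} 1 & -i \end{bmatrix} M \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} 1 + i & 1 - i \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = 2 + 2i,</math>
which is not real. Therefore, <math>M</math> is not positive-definite.
On the other hand, for a ''symmetric'' real matrix <math>M,</math> the condition "<math>\mathbf{z}^\mathsf{T} M \mathbf{z} > 0</math> for all nonzero real vectors <math>\mathbf{z}</math>" ''does'' imply that <math>M</math> is positive-definite in the complex sense.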
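The gap between the real and complex conditions can be checked numerically. The following is a minimal sketch, assuming the illustrative non-symmetric matrix with rows (1, 1) and (−1, 1); the helper name <code>form</code> is hypothetical and only evaluates <math>\mathbf{z}^* M \mathbf{z}</math> directly from the definition.

```python
# Assumed illustrative matrix: real but NOT symmetric.
M = [[1, 1],
     [-1, 1]]

def form(z, M):
    """Return z* M z, where z* is the conjugate transpose of z."""
    n = len(z)
    return sum(complex(z[i]).conjugate() * M[i][j] * z[j]
               for i in range(n) for j in range(n))

# For a real vector (a, b) the form reduces to a^2 + b^2.
real_value = form([3, -2], M)

# For the complex vector (1, i) the form is not a real number.
complex_value = form([1, 1j], M)
```

Here <code>real_value</code> equals 13 and <code>complex_value</code> equals 2 + 2i, so the matrix passes the real test but fails the complex one.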
=== Notation ===
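If a Hermitian matrix <math>M</math> is positive semi-definite, one sometimes writes <math>M \succeq 0,</math> and if <math>M</math> is positive-definite one writes <math>M \succ 0.</math> To denote that <math>M</math> is negative semi-definite one writes <math>M \preceq 0,</math> and to denote that <math>M</math> is negative-definite one writes <math>M \prec 0.</math>
The notion comes from [[functional analysis]] where positive semidefinite matrices define [[positive operator]]s. If two matrices <math>A</math> and <math>B</math> satisfy <math>B - A \succeq 0,</math> we can define a [[partial order|non-strict partial order]] <math>B \succeq A</math> that is reflexive, antisymmetric, and transitive. It is not a [[total order]], however, as <math>B - A</math> may in general be indefinite.
A common alternative notation is <math>M \geq 0,</math> <math>M > 0,</math> <math>M \leq 0,</math> and <math>M < 0</math> for positive semi-definite, positive-definite, negative semi-definite, and negative-definite matrices, respectively. This may be confusing, as sometimes [[nonnegative matrix|nonnegative matrices]] (respectively, nonpositive matrices) are also denoted in this way.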
== Examples ==
{{unordered list
| The [[identity matrix]] <math>I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}</math> is positive-definite (and as such also positive semi-definite). It is a real symmetric matrix, and, for any non-zero column vector '''z''' with real entries ''a'' and ''b'', one has
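<math display="block"> \mathbf{z}^\mathsf{T} I \mathbf{z} = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + b^2.</math>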
Seen as a complex matrix, for any non-zero column vector ''z'' with complex entries ''a'' and ''b'' one has
<math display="block">\mathbf{z}^*I\mathbf{z} = \begin{bmatrix} \overline{a} & \overline{b} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b\end{bmatrix} = \overline{a}a + \overline{b}b = |a|^2 + |b|^2.</math>
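| The real symmetric matrix
<math display="block">M = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}</math>
is positive-definite since for any non-zero column vector '''z''' with entries ''a'', ''b'' and ''c'', we have
<math display="block">\begin{align}
\mathbf{z}^\mathsf{T} M \mathbf{z} = \left(\mathbf{z}^\mathsf{T} M\right) \mathbf{z}
&= \begin{bmatrix} (2a - b) & (-a + 2b - c) & (-b + 2c) \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix} \\
&= (2a - b)a + (-a + 2b - c)b + (-b + 2c)c \\
&= 2a^2 - ba - ab + 2b^2 - cb - bc + 2c^2 \\
&= 2a^2 - 2ab + 2b^2 - 2bc + 2c^2 \\
&= a^2 + (a - b)^2 + (b - c)^2 + c^2
\end{align}</math>
This result is a sum of squares, and therefore non-negative; and is zero only if <math>a = b = c = 0,</math> that is, when <math>\mathbf{z}</math> is the zero vector.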
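| For any real [[invertible matrix]] <math>A,</math> the product <math>A^\mathsf{T} A</math> is a positive definite matrix. A simple proof is that for any non-zero vector <math>\mathbf{z},</math> <math>\mathbf{z}^\mathsf{T} A^\mathsf{T} A \mathbf{z} = (A\mathbf{z})^\mathsf{T} (A\mathbf{z}) = \|A\mathbf{z}\|^2 > 0,</math> since the invertibility of <math>A</math> means that <math>A\mathbf{z} \neq \mathbf{0}.</math>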
| The example <math>M</math> above shows that a matrix in which some elements are negative may still be positive definite. Conversely, a matrix whose entries are all positive is not necessarily positive definite, as for example
<math display="block">N = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},</math>
for which <math>\begin{bmatrix} -1 & 1 \end{bmatrix}N\begin{bmatrix} -1 & 1 \end{bmatrix}^\mathsf{T} = -2 < 0.</math>
}}
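The three-variable example can be spot-checked numerically. The sketch below assumes that <math>M</math> is the symmetric tridiagonal matrix consistent with the row vector <math>(2a - b,\ -a + 2b - c,\ -b + 2c)</math> in the expansion, and that the quadratic form can be regrouped as <math>a^2 + (a - b)^2 + (b - c)^2 + c^2.</math> The function names are hypothetical.

```python
import random

# Assumed matrix, read off from the expansion of z^T M in the example.
M = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]

def quadratic_form(z, M):
    """Return z^T M z for a real vector z."""
    n = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(n) for j in range(n))

def sum_of_squares(z):
    """One way to regroup the quadratic form as a sum of squares."""
    a, b, c = z
    return a**2 + (a - b)**2 + (b - c)**2 + c**2

random.seed(0)
samples = [[random.randint(-9, 9) for _ in range(3)] for _ in range(1000)]

# Integer arithmetic, so the comparison is exact.
identity_holds = all(quadratic_form(z, M) == sum_of_squares(z) for z in samples)
positive_off_zero = all(quadratic_form(z, M) > 0 for z in samples if any(z))
```

A random test of this kind does not prove positive definiteness; it only checks the algebraic identity on the sampled vectors.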
* <math>M</math> is positive definite if and only if all of its eigenvalues are positive.
* <math>M</math> is positive semi-definite if and only if all of its eigenvalues are non-negative.
* <math>M</math> is negative definite if and only if all of its eigenvalues are negative.
* <math>M</math> is negative semi-definite if and only if all of its eigenvalues are non-positive.
* <math>M</math> is indefinite if and only if it has both positive and negative eigenvalues.
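The eigenvalue criteria above can be sketched in code for the smallest non-trivial case. The example below is restricted to <math>2 \times 2</math> real symmetric matrices, whose eigenvalues have a closed form. The function names and the tolerance <code>tol</code> are assumptions made for illustration, not part of any library.

```python
import math

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def classify(a, b, d, tol=1e-12):
    """Classify [[a, b], [b, d]] by the signs of its eigenvalues."""
    lo, hi = eigenvalues_2x2_symmetric(a, b, d)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo < -tol and hi > tol:
        return "indefinite"
    if lo >= -tol:
        # Includes the zero matrix, which is also negative semi-definite.
        return "positive semi-definite"
    return "negative semi-definite"

labels = [
    classify(2, -1, 2),   # eigenvalues 1 and 3
    classify(1, 2, 1),    # eigenvalues -1 and 3
    classify(1, 1, 1),    # eigenvalues 0 and 2
    classify(-2, 1, -2),  # eigenvalues -3 and -1
    classify(-1, 1, -1),  # eigenvalues -2 and 0
]
```

In floating-point arithmetic an eigenvalue that is exactly zero may be computed as a tiny non-zero number, which is why the sketch compares against a tolerance rather than against zero.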
Let <math>
With this in mind, the one-to-one change of variable <math>
== Decomposition ==
{{See also|Gram matrix}}
Let <math>
<math>
<math display="block">
of a matrix <math>
When <math>
<math>M</math> is positive definite if and only if such a decomposition exists with <math>
More generally, <math>
Moreover, for any decomposition <math>
{{math proof | proof =
If <math>
If moreover <math>B</math> is invertible then the inequality is strict for <math>
If <math>B</math> is <math>k \times n</math> of rank <math>
In the other direction, suppose <math>
Since <math>
Since <math>
Then <math>
If moreover <math>M</math> is positive definite, then the eigenvalues are (strictly) positive, so <math>
If <math>
Cutting the zero rows gives a <math>
}}
The columns <math>
Then the entries of <math>M</math> are [[inner product]]s (that is [[dot product]]s, in the real case) of these vectors
<math display="block">
In other words, a Hermitian matrix <math>
It is positive definite if and only if it is the Gram matrix of some [[linearly independent]] vectors.
In general, the rank of the Gram matrix of vectors <math>
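The Gram-matrix characterization can be illustrated with a small real example. The sketch below builds Gram matrices from dot products and evaluates the quadratic form; the vectors and the function names are hypothetical choices for illustration.

```python
def gram(vectors):
    """Gram matrix of real vectors: entry (i, j) is the dot product of b_i and b_j."""
    return [[sum(p * q for p, q in zip(u, v)) for v in vectors] for u in vectors]

def quadratic_form(x, M):
    """Return x^T M x for a real vector x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))

independent = [[1, 0, 1], [0, 2, 1]]   # linearly independent vectors
dependent = [[1, 2], [2, 4]]           # second vector is twice the first

M_independent = gram(independent)      # [[2, 1], [1, 5]]
M_dependent = gram(dependent)          # [[5, 10], [10, 20]]

# x^T M x equals the squared length of x_1 b_1 + x_2 b_2.
x = [2, -1]
value_independent = quadratic_form(x, M_independent)  # 9, length^2 of (2, -2, 1)
value_dependent = quadratic_form(x, M_dependent)      # 0, since 2 b_1 - b_2 = 0
```

The second Gram matrix is positive semi-definite but not positive definite: the non-zero vector <code>x</code> gives a quadratic form of zero because the underlying vectors are linearly dependent.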
=== Uniqueness up to unitary transformations ===
The decomposition is not unique:
if <math>
then <math>
However, this is the only way in which two decompositions can differ: The decomposition is unique up to [[unitary transformation]]s.
More formally, if <math>
then there is a <math>
When <math>\ell = k</math> this means <math>Q</math> is [[unitary matrix|unitary]].
This statement has an intuitive geometric interpretation in the real case:
let the columns of <math>A</math> and <math>B</math> be the vectors <math>a_1,\dots,a_n</math> and <math>
A real unitary matrix is an [[orthogonal matrix]], which describes a [[rigid transformation]] (an isometry of Euclidean space <math>\mathbb{R}^k</math>) preserving the 0 point (i.e. [[Rotation matrix|rotations]] and [[Reflection matrix|reflections]], without translations).
Therefore, the dot products <math>a_i \cdot a_j</math> and <math>b_i \cdot b_j</math> are equal if and only if some rigid transformation of <math>\mathbb{R}^k</math> transforms the vectors <math>a_1,\dots,a_n</math> to <math>b_1,\dots,b_n</math> (and 0 to 0).
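The geometric statement can be checked numerically in the plane. The following sketch assumes an arbitrary rotation angle and three arbitrary vectors in <math>\mathbb{R}^2;</math> it rotates the vectors and compares the two Gram matrices. The helper names are hypothetical.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

theta = 0.7  # arbitrary rotation angle, assumed for illustration
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

# Columns of B are three vectors b_1, b_2, b_3 in R^2.
B = [[1.0, 2.0, 0.5],
     [0.0, 1.0, -1.0]]
A = matmul(Q, B)  # columns a_i = Q b_i

gram_A = matmul(transpose(A), A)
gram_B = matmul(transpose(B), B)

# Largest entrywise difference between the two Gram matrices.
max_diff = max(abs(x - y)
               for row_a, row_b in zip(gram_A, gram_B)
               for x, y in zip(row_a, row_b))
```

Up to rounding error, <code>max_diff</code> is zero: the rotated vectors have the same pairwise dot products, so both factorizations give the same matrix.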
=== Square root ===
{{main|Square root of a matrix}}
A Hermitian matrix <math>
When <math>M</math> is positive definite, so is <math>
The non-negative square root should not be confused with other decompositions <math>
Some authors use the name ''square root'' and <math>M^\frac{1}{2}</math> for any such decomposition, or specifically for the [[Cholesky decomposition]],
or any decomposition of the form <math>
others only use it for the non-negative square root.
If <math>
=== Cholesky decomposition ===
A Hermitian positive semidefinite matrix <math>
If <math>
The Cholesky decomposition is especially useful for efficient numerical calculations.
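One common numerical use is as a test of positive definiteness: the factorization is attempted, and a non-positive value under a square root signals failure. The following is a minimal textbook sketch for real symmetric input, not a production routine; the function name <code>cholesky</code> and the convention of returning <code>None</code> on failure are assumptions.

```python
import math

def cholesky(M):
    """Return lower-triangular L with M = L L^T, or None if a pivot is not positive.

    Assumes M is a real symmetric matrix given as a list of rows.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = M[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0:
            return None  # M is not positive definite
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (M[i][j] - s) / L[j][j]
    return L

M = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
L = cholesky(M)

# Rebuild L L^T to compare with M.
rebuilt = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)]
           for i in range(3)]

N = [[1, 2], [2, 1]]           # all entries positive, but indefinite
factor_of_N = cholesky(N)      # None
```

Because the test <code>d <= 0</code> is exact, matrices that are nearly singular may be accepted or rejected depending on rounding; practical libraries handle this case with more care.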
A closely related decomposition is the [[Cholesky decomposition#LDL decomposition|LDL decomposition]], <math>
=== Williamson theorem ===
Any <math>2n\times 2n </math> positive definite real symmetric matrix <math>M </math> can be diagonalized via symplectic (real) matrices. More precisely, [[Williamson theorem|Williamson's theorem]] ensures the existence of a symplectic <math>S\in\mathbf{Sp}(2n,\mathbb{R}) </math> and a diagonal real positive <math>D\in\mathbb{R}^{n\times n} </math> such that <math>SMS^\mathsf{T}=D\oplus D </math>.
== Other characterizations ==
Let <math>M</math> be an <math>\ n \times n\ </math> real symmetric matrix, and let <math>\ B_1(M) \equiv \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{x}^\mathsf{T} M\ \mathbf{x} \leq 1 \}\ </math> be the "unit ball" defined by <math>\ M ~.</math> Then we have the following:
* <math>\ B_1( \mathbf{v}\ \mathbf{v}^\mathsf{T} )</math> is a solid slab sandwiched between <math>\ \pm \{ \mathbf{w}: \langle \mathbf{w}, \mathbf{v}\rangle = 1 \} ~.</math>
* <math>\ M \succeq 0\ </math> if and only if <math>\ B_1(M)\ </math> is an ellipsoid, or an ellipsoidal cylinder.
* <math>\ M \succ 0\ </math> if and only if <math>\ B_1(M)\ </math> is bounded, that is, it is an ellipsoid.
* If <math>\ N \succ 0\ ,</math> then <math>\ M \succeq N\ </math> if and only if <math>\ B_1(M) \subseteq B_1(N)\ ;</math> <math>\ M \succ N\ </math> if and only if <math>\ B_1(M) \subseteq \operatorname{int}\!\bigl(\ B_1(N)\ \bigr) ~.</math>
* If <math>\ N \succ 0\ ,</math> then <math>\ M \succeq \frac{ \mathbf{v}\ \mathbf{v}^\mathsf{T} }{\ \mathbf{v}^\mathsf{T} N\ \mathbf{v}\ }\ </math> for all <math>\ \mathbf{v} \neq \mathbf{0}\ </math> if and only if <math display="inline">\ B_1(M) \subset \bigcap_{ \mathbf{v}^\mathsf{T} N\ \mathbf{v} = 1 } B_1(\mathbf{v} \mathbf{v}^\mathsf{T}) ~.</math> So, since the polar dual of an ellipsoid is also an ellipsoid with the same principal axes, with inverse lengths, we have <math display="block">\ B_1(N^{-1}) = \bigcap_{\mathbf{v}^\mathsf{T} N\ \mathbf{v} = 1} B_1(\mathbf{v}\ \mathbf{v}^\mathsf{T}) = \bigcap_{ \mathbf{v}^\mathsf{T} N\ \mathbf{v} = 1 } \{ \mathbf{w}: |\langle \mathbf{w}, \mathbf{v}\rangle| \leq 1 \} ~.</math> That is, if <math>\ N\ </math> is positive-definite, then <math>\ M \succeq \frac{ \mathbf{v} \mathbf{v}^\mathsf{T} }{\ \mathbf{v}^\mathsf{T} N\ \mathbf{v}\ }\ </math> for all <math>\ \mathbf{v} \neq \mathbf{0}\ </math> if and only if <math>\ M \succeq N^{-1} ~.</math>
Let <math>M</math> be an <math>\ n \times n\ </math> [[Hermitian matrix]]. The following properties are equivalent to <math>\ M\ </math> being positive definite:
; The associated sesquilinear form is an inner product : The [[sesquilinear form]] defined by <math>M</math> is the function <math>\ \langle \cdot, \cdot \rangle\ </math> from <math>\ \mathbb{C}^n \times \mathbb{C}^n\ </math> to <math>\ \mathbb{C}\ </math> such that <math>\ \langle \mathbf{x}, \mathbf{y} \rangle \equiv \mathbf{y}^* M\ \mathbf{x}\ </math> for all <math>\ \mathbf{x}\ </math> and <math>\ \mathbf{y}\ </math> in <math>\ \mathbb{C}^n\ ,</math> where <math>\ \mathbf{y}^*\ </math> is the conjugate transpose of <math>\ \mathbf{y} ~.</math> For any complex matrix <math>\ M\ ,</math> this form is linear in <math>\ \mathbf{x}\ </math> and semilinear in <math>\ \mathbf{y} ~.</math> Therefore, the form is an [[inner product]] on <math>\ \mathbb{C}^n\ </math> if and only if <math>\ \langle \mathbf{z}, \mathbf{z} \rangle\ </math> is real and positive for all nonzero <math>\ \mathbf{z}\ ;</math> that is, if and only if <math>\ M\ </math> is positive definite. (In fact, every inner product on <math>\ \mathbb{C}^n\ </math> arises in this fashion from a Hermitian positive definite matrix.)
; Its leading principal minors are all positive : The {{mvar|k}}th [[minor (linear algebra)|leading principal minor]] of a matrix <math>\ M\ </math> is the [[determinant]] of its upper-left <math>\ k \times k\ </math> sub-matrix. It turns out that a matrix is positive definite if and only if all these determinants are positive. This condition is known as [[Sylvester's criterion]], and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an [[upper triangular matrix]] by using [[elementary row operations]], as in the first part of the [[Gaussian elimination]] method, taking care to preserve the sign of its determinant during the [[pivot element|pivoting]] process. Since the {{mvar|k}}th leading principal minor of a triangular matrix is the product of its diagonal elements up to row <math>\ k\ ,</math> Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row <math>\ k\ </math> of the triangular matrix is obtained.
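The elimination-based test described for Sylvester's criterion can be sketched as follows. This is a minimal illustration, assuming a real symmetric matrix with integer or rational entries so that exact arithmetic with <code>Fraction</code> applies; no row swaps are performed, because a non-positive pivot already shows that the matrix is not positive definite. The function name is hypothetical.

```python
from fractions import Fraction

def sylvester_positive_definite(M):
    """Return (is_positive_definite, leading_minors_found_so_far).

    Reduces M to upper triangular form without row swaps. The k-th leading
    principal minor is the product of the first k pivots, so the test stops
    at the first pivot that is not positive.
    """
    n = len(M)
    U = [[Fraction(v) for v in row] for row in M]
    minors = []
    minor = Fraction(1)
    for k in range(n):
        pivot = U[k][k]
        minor *= pivot
        minors.append(minor)
        if pivot <= 0:
            return False, minors
        for i in range(k + 1, n):
            factor = U[i][k] / pivot
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return True, minors

result_M = sylvester_positive_definite([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])
result_N = sylvester_positive_definite([[1, 2], [2, 1]])
```

For the first matrix the leading principal minors are 2, 3 and 4, so the test succeeds; for the second they are 1 and −3, so it fails at the second row.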
A positive semidefinite matrix is positive definite if and only if it is [[invertible matrix|invertible]].<ref>{{harvtxt|Horn|Johnson|2013}}, p. 431, Corollary 7.1.7</ref>
A matrix <math>
== Quadratic forms ==
{{Main|Definite quadratic form}}
The (purely) [[quadratic form]] associated with a real <math>
A symmetric matrix <math>
More generally, any [[quadratic function]] from <math>
'''Theorem:''' This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if <math>
'''Proof:''' If <math>
For this reason, positive definite matrices play an important role in [[optimization (mathematics)|optimization]] problems.
One symmetric matrix and another matrix that is both symmetric and positive definite can be [[diagonalizable matrix#Simultaneous diagonalization|simultaneously diagonalized]]. This is so although simultaneous diagonalization is not necessarily performed with a [[Matrix similarity|similarity transformation]]. This result does not extend to the case of three or more matrices. In this section we write for the real case. Extension to the complex case is immediate.
Let <math>
Note that this result does not contradict what is said on simultaneous diagonalization in the article [[diagonalizable matrix#Simultaneous diagonalization|Diagonalizable matrix]], which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.
== Properties ==
=== Induced partial ordering ===
For arbitrary square matrices <math>
=== Inverse of positive definite matrix ===
Every positive definite matrix is [[invertible matrix|invertible]] and its inverse is also positive definite.<ref>{{harvtxt|Horn|Johnson|2013}}, p. 438, Theorem 7.2.1</ref> If <math>
=== Scaling ===
If <math>
=== Addition ===
* If <math>M</math> is positive-definite and <math>N</math> is positive-semidefinite, then the sum <math>M + N</math> is also positive-definite.
=== Multiplication ===
* If <math>
* If <math>
=== Trace ===
The diagonal entries <math>
<math display="block">
and thus, when <math>\ n \ge 1\ ,</math>
<math display="block"> \max_{i,j} \left|m_{ij}\right| \leq \max_i m_{ii}</math>
An <math>n \times n</math> Hermitian matrix <math>M</math> is positive definite if it satisfies the following trace inequalities:<ref>{{cite journal | title=Bounds for Eigenvalues using Traces |
<math display="block">
Another important result is that for any <math>
=== Hadamard product ===
If <math>
Regarding the Hadamard product of two positive semidefinite matrices <math>
* Oppenheim's inequality: <math>
* <math>
=== Kronecker product ===
If <math>
=== Frobenius product ===
If <math>
=== Convexity ===
The set of positive semidefinite symmetric matrices is [[convex set|convex]]. That is, if <math>
<math display="block">
This property guarantees that [[semidefinite programming]] problems converge to a globally optimal solution.
=== Relation with cosine ===
The positive-definiteness of a matrix <math>A</math> expresses that the angle <math>
<math display="block">
=== Further properties ===
# If <math>M</math> is a symmetric [[Toeplitz matrix]], i.e. the entries <math>m_{ij}</math> are given as a function of their absolute index differences: <math>
# Let <math>M > 0</math> and <math>N</math> Hermitian. If <math>MN + NM \ge 0</math> (resp., <math>MN + NM > 0</math>) then <math>N \ge 0</math> (resp., <math>N > 0</math>).<ref> {{Cite book
| title=Positive Definite Matrices
| pages=8
}}</ref>
# If <math>
# If <math>M_k</math> denotes the leading <math>
# A matrix is negative definite if its {{mvar|k}}th order leading [[principal minor]] is negative when <math>
# If <math>
# A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is however not enough to consider the leading principal minors only, as can be checked with the diagonal matrix with entries {{math|0}} and {{math|−1}}.
=== Block matrices and submatrices ===
A positive <math>
<math display="block">
where each block is <math>
We have that <math>
<math display="block">
A similar argument can be applied to <math>
Converse results can be proved with stronger conditions on the blocks, for instance, using the [[Schur complement#Conditions for positive definiteness and semi-definiteness|Schur complement]].
=== Local extrema ===
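A general [[quadratic form]] <math>f(\mathbf{x})</math> on <math>n</math> real variables <math>x_1, \ldots, x_n</math> can always be written as <math>\mathbf{x}^\mathsf{T} M \mathbf{x}</math> where <math>\mathbf{x}</math> is the column vector with those variables, and <math>M</math> is a symmetric real matrix. Therefore, the matrix being positive definite means that <math>f</math> has a unique minimum (zero) when <math>\mathbf{x}</math> is zero, and is strictly positive for any other <math>\mathbf{x}.</math>
More generally, a twice-differentiable real function <math>f</math> on <math>n</math> real variables has a strict local minimum at arguments <math>x_1, \ldots, x_n</math> if its [[gradient]] is zero and its [[Hessian matrix|Hessian]] (the matrix of all second derivatives) is positive definite at that point. A positive semi-definite Hessian is necessary for a local minimum but does not by itself guarantee one. Similar statements can be made for negative definite and semi-definite matrices.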
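The second-derivative test can be sketched for functions of two variables. The example below approximates the Hessian by central finite differences and applies the leading-minor test for strict positive definiteness. The step size <code>h</code>, the two test functions, and the helper names are assumptions made for illustration.

```python
def hessian_2d(f, x, y, h=1e-4):
    """Approximate the Hessian of f at (x, y) by central differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

def is_positive_definite_2x2(H):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return H[0][0] > 0 and H[0][0] * H[1][1] - H[0][1] ** 2 > 0

def bowl(x, y):
    return x**2 + x * y + y**2   # exact Hessian [[2, 1], [1, 2]]

def saddle(x, y):
    return x**2 - y**2           # exact Hessian [[2, 0], [0, -2]]

# Both functions have zero gradient at the origin.
bowl_has_minimum = is_positive_definite_2x2(hessian_2d(bowl, 0.0, 0.0))
saddle_has_minimum = is_positive_definite_2x2(hessian_2d(saddle, 0.0, 0.0))
```

The test reports a local minimum for the first function and not for the second, whose Hessian is indefinite.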
=== Covariance ===
In [[statistics]], the [[covariance matrix]] of a [[multivariate probability distribution]] is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.
== Extension for non-Hermitian square matrices ==
The definition of positive definite can be generalized by designating any complex matrix <math>
In summary, the distinguishing feature between the real and complex cases is that a [[Bounded operator|bounded]] positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint. The general claim can be argued using the [[polarization identity]]. This is no longer true in the real case.
== Applications ==
=== Heat conductivity matrix ===
Fourier's law of heat conduction, giving heat flux <math>
More generally in thermodynamics, the flow of heat and particles is a fully coupled system as described by the [[Onsager reciprocal relations]], and the coupling matrix is required to be positive semi-definite (possibly non-symmetric) in order that entropy production be nonnegative.
== See also ==
* [[Covariance matrix]]
* [[M-matrix]]
* [[Positive-definite function]]
* [[Positive-definite kernel]]
* [[Schur complement]]
* [[Sylvester's criterion]]
* [[Numerical range]]
* [[Williamson theorem]]
== References ==
|isbn=978-0-521-54823-6
}}
* {{cite book
|first=Rajendra |last=Bhatia |author-link=Rajendra Bhatia
|isbn=978-0-691-12918-1
}}
* {{cite journal
|
|last2=Toupin |first2=R.A.
|year=1962
{{DEFAULTSORT:Positive-Definite Matrix}}
[[Category:Matrices (mathematics)]]
[[de:Definitheit#Definitheit von Matrizen]]