{{Use American English|date = March 2019}}
{{Short description|Matrix whose only nonzero elements are on its main diagonal}}
{{More footnotes needed|date=June 2025}}
 
In [[linear algebra]], a '''diagonal matrix''' is a [[matrix (mathematics)|matrix]] in which the entries outside the [[main diagonal]] are all zero; the term usually refers to [[square matrices]]. Elements of the main diagonal can either be zero or nonzero. An example of a 2×2 diagonal matrix is <math>\left[\begin{smallmatrix}
3 & 0 \\
0 & 2 \end{smallmatrix}\right]</math>, while an example of a 3×3 diagonal matrix is <math>\left[\begin{smallmatrix}
6 & 0 & 0 \\
0 & 5 & 0 \\
0 & 0 & 4
\end{smallmatrix}\right]</math>. An [[identity matrix]] of any size, or any multiple of it, is a diagonal matrix called a ''[[#Scalar matrix|scalar matrix]]''; for example, <math>\left[\begin{smallmatrix}
0.5 & 0 \\
0 & 0.5 \end{smallmatrix}\right]</math>.

In [[geometry]], a diagonal matrix may be used as a ''[[scaling matrix]]'', since matrix multiplication with it results in changing scale (size) and possibly also [[shape]]; only a scalar matrix results in uniform change in scale.
 
==Definition==
 
As stated above, a diagonal matrix is a matrix in which all off-diagonal entries are zero. That is, the matrix {{math|1='''D''' = (''d''<sub>''i'',''j''</sub>)}} with {{mvar|n}} columns and {{mvar|n}} rows is diagonal if
<math display="block">\forall i,j \in \{1, 2, \ldots, n\}, i \ne j \implies d_{i,j} = 0.</math>
 
However, the main diagonal entries are unrestricted.
 
The term ''diagonal matrix'' may sometimes refer to a '''{{visible anchor|rectangular diagonal matrix}}''', which is an {{mvar|m}}-by-{{mvar|n}} matrix with all the entries not of the form {{math|''d''<sub>''i'',''i''</sub>}} being zero. For example:
<math display=block>\begin{bmatrix}
1 & 0 & 0\\
0 & 4 & 0\\
0 & 0 & -3\\
0 & 0 & 0\\
\end{bmatrix} \quad \text{or} \quad \begin{bmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 4 & 0 & 0 & 0\\
0 & 0 & -3 & 0 & 0
\end{bmatrix}</math>
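
For illustration, the defining condition can be checked directly in code; the following sketch uses the Python library [[NumPy]] and a hypothetical helper <code>is_diagonal</code> (both the library and the sample values are assumptions of the example, not part of the definition):
<syntaxhighlight lang="python">
import numpy as np

# Illustrative check of the definition: a matrix D is (rectangular) diagonal
# exactly when every entry d[i, j] with i != j is zero.
def is_diagonal(D: np.ndarray) -> bool:
    rows, cols = D.shape
    return all(D[i, j] == 0
               for i in range(rows)
               for j in range(cols)
               if i != j)

D = np.array([[1, 0, 0],
              [0, 4, 0],
              [0, 0, -3],
              [0, 0, 0]])   # the rectangular diagonal matrix from the example above
print(is_diagonal(D))        # True
</syntaxhighlight>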
 
==Vector-to-matrix diag operator==
 
A diagonal matrix {{math|'''D'''}} can be constructed from a vector <math>\mathbf{a} = \begin{bmatrix}a_1 & \dots & a_n\end{bmatrix}^\textsf{T}</math> using the <math>\operatorname{diag}</math> operator:
<math display="block">
\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n).
</math>
 
This may be written more compactly as <math>\mathbf{D} = \operatorname{diag}(\mathbf{a})</math>.
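
In numerical software this construction is typically available as a built-in routine; for example, a minimal sketch using NumPy's <code>numpy.diag</code> (the library and the sample values are chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

a = np.array([3.0, 2.0, 5.0])
D = np.diag(a)          # vector-to-matrix diag: builds diag(3, 2, 5)
print(D)
# [[3. 0. 0.]
#  [0. 2. 0.]
#  [0. 0. 5.]]
</syntaxhighlight>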
 
The same operator is also used to represent [[Block matrix#Block diagonal matrices|block diagonal matrices]] as <math> \mathbf{A} = \operatorname{diag}(\mathbf A_1, \dots, \mathbf A_n)</math> where each argument {{math|'''A'''{{sub|''i''}}}} is a matrix.
 
The {{math|diag}} operator may be written as:
<math display="block">
\operatorname{diag}(\mathbf{a}) = \left(\mathbf{a} \mathbf{1}^\textsf{T}\right) \circ \mathbf{I},
</math>
where <math>\circ</math> represents the [[Hadamard product (matrices)|Hadamard product]], and {{math|'''1'''}} is a constant vector with elements 1.
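
This identity can be verified numerically; a minimal NumPy sketch (sample values chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

a = np.array([3.0, 2.0, 5.0])
ones = np.ones(3)

# (a 1^T) has a_i in every entry of row i; multiplying entrywise by the
# identity keeps only the diagonal, reproducing diag(a).
D = np.outer(a, ones) * np.eye(3)
print(np.array_equal(D, np.diag(a)))   # True
</syntaxhighlight>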
 
==Matrix-to-vector diag operator==
 
The inverse matrix-to-vector {{math|diag}} operator is sometimes denoted by the identically named <math>\operatorname{diag}(\mathbf{D}) = \begin{bmatrix}a_1 & \dots & a_n\end{bmatrix}^\textsf{T},</math> where the argument is now a matrix, and the result is a vector of its diagonal entries.
 
The following property holds:
<math display="block">
\left[\operatorname{diag}(\mathbf{A}\mathbf{B})\right]_i = \sum_j \left(\mathbf{A} \circ \mathbf{B}^\textsf{T}\right)_{ij}, \qquad \text{that is,} \quad \operatorname{diag}(\mathbf{A}\mathbf{B}) = \left( \mathbf{A} \circ \mathbf{B}^\textsf{T} \right) \mathbf{1}.
</math>
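
This property can likewise be checked numerically; a minimal NumPy sketch with arbitrary random matrices (chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.diag(A @ B)                 # matrix-to-vector diag of the product
rhs = (A * B.T) @ np.ones(4)         # (A ∘ B^T) 1
print(np.allclose(lhs, rhs))         # True
</syntaxhighlight>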
 
== Scalar matrix ==
<!-- Linked from [[Scalar matrix]] and [[Scalar transformation]] -->
A diagonal matrix with equal diagonal entries is a '''scalar matrix'''; that is, a scalar multiple {{mvar|λ}} of the [[identity matrix]] {{math|'''I'''}}. Its effect on a [[vector (mathematics and physics)|vector]] is [[scalar multiplication]] by {{mvar|λ}}. For example, a 3×3 scalar matrix has the form:
<math display="block">
\begin{bmatrix}
\lambda & 0 & 0 \\
0 & \lambda & 0 \\
0 & 0 & \lambda
\end{bmatrix} = \lambda \mathbf{I}_3.
</math>
 
The scalar matrices are the [[center of an algebra|center]] of the algebra of matrices: that is, they are precisely the matrices that [[commute (mathematics)|commute]] with all other square matrices of the same size.{{efn|Proof: given the [[elementary matrix]] <math>e_{ij}</math>, <math>\mathbf{M}e_{ij}</math> is the matrix whose only nonzero column is the ''i''-th column of {{math|'''M'''}} (placed in column ''j''), and <math>e_{ij}\mathbf{M}</math> is the matrix whose only nonzero row is the ''j''-th row of {{math|'''M'''}} (placed in row ''i''); equating the two forces the off-diagonal entries of {{math|'''M'''}} to be zero and the ''i''-th diagonal entry to equal the ''j''-th diagonal entry.}} By contrast, over a [[field (mathematics)|field]] (like the real numbers), a diagonal matrix with all diagonal elements distinct only commutes with diagonal matrices (its [[centralizer]] is the set of diagonal matrices). That is because if a diagonal matrix <math>\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n)</math> has <math>a_i \neq a_j,</math> then given a matrix {{math|'''M'''}} with <math>m_{ij} \neq 0,</math> the {{math|(''i'', ''j'')}} terms of the products are <math>(\mathbf{DM})_{ij} = a_i m_{ij}</math> and <math>(\mathbf{MD})_{ij} = m_{ij}a_j,</math> and <math>a_i m_{ij} \neq m_{ij} a_j</math> (since one can divide by {{mvar|m{{sub|ij}}}}), so they do not commute unless the off-diagonal terms of {{math|'''M'''}} are zero.{{efn|Over more general rings, this does not hold, because one cannot always divide.}} Diagonal matrices whose diagonal entries are neither all equal nor all distinct have centralizers intermediate between the whole space and the diagonal matrices alone.<ref>{{cite web |url=https://math.stackexchange.com/q/1697991 |title=Do Diagonal Matrices Always Commute? |author=<!--Not stated--> |date=March 15, 2016 |publisher=Stack Exchange |access-date=August 4, 2018 }}</ref>
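
The contrast between scalar and general diagonal matrices can be illustrated numerically; a minimal NumPy sketch with an arbitrary test matrix (values chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))      # a generic matrix with nonzero off-diagonal entries

S = 2.0 * np.eye(3)                  # scalar matrix: commutes with every M
D = np.diag([1.0, 2.0, 3.0])         # diagonal matrix with distinct entries

print(np.allclose(S @ M, M @ S))     # True
print(np.allclose(D @ M, M @ D))     # False (generically)
</syntaxhighlight>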
 
For an abstract vector space {{mvar|V}} (rather than the concrete vector space {{math|''K''{{sup|''n''}}}}), the analog of a scalar matrix is a '''scalar transformation'''. This is true more generally for a [[module (ring theory)|module]] {{mvar|M}} over a [[ring (algebra)|ring]] {{mvar|R}}, with the [[endomorphism algebra]] {{math|End(''M'')}} (algebra of linear operators on {{mvar|M}}) replacing the algebra of matrices. Formally, scalar multiplication is a linear map, inducing a map <math>R \to \operatorname{End}(M)</math> (from a scalar {{mvar|λ}} to its corresponding scalar transformation, multiplication by {{mvar|λ}}) exhibiting {{math|End(''M'')}} as an {{mvar|R}}-[[Algebra (ring theory)|algebra]]. For vector spaces, the scalar transforms are exactly the [[center of a ring|center]] of the endomorphism algebra, and, similarly, scalar invertible transforms are the center of the [[general linear group]] {{math|GL(''V'')}}. The former is more generally true for [[free module]]s <math>M \cong R^n</math>, for which the endomorphism algebra is isomorphic to a matrix algebra.
 
== Vector operations ==
Multiplying a vector by a diagonal matrix multiplies each of the terms by the corresponding diagonal entry. Given a diagonal matrix <math>\mathbf{D} = \operatorname{diag}(a_1, \dots, a_n)</math> and a vector <math>\mathbf{v} = \begin{bmatrix} x_1 & \dots & x_n \end{bmatrix}^\textsf{T}</math>, the product is:
<math display="block">\mathbf{D}\mathbf{v} = \operatorname{diag}(a_1, \dots, a_n)\begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix} =
\begin{bmatrix}
a_1 \\
Line 94 ⟶ 102:
This can be expressed more compactly by using a vector instead of a diagonal matrix, <math>\mathbf{d} = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}^\textsf{T}</math>, and taking the [[Hadamard product (matrices)|Hadamard product]] of the vectors (entrywise product), denoted <math>\mathbf{d} \circ \mathbf{v}</math>:
 
<math display="block">\mathbf{D}\mathbf{v} = \mathbf{d} \circ \mathbf{v} =
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \circ \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} =
\begin{bmatrix} a_1 x_1 \\ \vdots \\ a_n x_n \end{bmatrix}.
</math>
 
This is mathematically equivalent, but avoids storing all the zero terms of this [[sparse matrix]]. This product is thus used in [[machine learning]], such as computing products of derivatives in [[backpropagation]] or multiplying IDF weights in [[TF-IDF]],<ref>{{cite book |last=Sahami |first=Mehran |date=2009-06-15 |title=Text Mining: Classification, Clustering, and Applications |url=https://books.google.com/books?id=BnvYaYhMl-MC&pg=PA14 |publisher=CRC Press |page=14 |isbn=9781420059458}}</ref> since some [[BLAS]] frameworks, which multiply matrices efficiently, do not include Hadamard product capability directly.<ref>{{cite web |url=https://stackoverflow.com/questions/7621520/element-wise-vector-vector-multiplication-in-blas |title=Element-wise vector-vector multiplication in BLAS? |author=<!--Not stated--> |date=2011-10-01 |website=stackoverflow.com |access-date=2020-08-30}}</ref>
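
A minimal NumPy sketch of this equivalence (sample values chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

d = np.array([2.0, 0.5, 3.0])        # diagonal entries, stored as a vector
v = np.array([1.0, 4.0, -2.0])

full = np.diag(d) @ v                # builds the full diagonal matrix first
compact = d * v                      # Hadamard (entrywise) product, no zeros stored
print(np.allclose(full, compact))    # True
</syntaxhighlight>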
 
== Matrix operations ==
The operations of matrix addition and [[matrix multiplication]] are especially simple for diagonal matrices. Write {{math|diag(''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>)}} for a diagonal matrix whose diagonal entries starting in the upper left corner are {{math|''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>}}. Then, for [[matrix addition|addition]], we have
 
<math display=block>
\operatorname{diag}(a_1,\, \ldots,\, a_n) + \operatorname{diag}(b_1,\, \ldots,\, b_n) = \operatorname{diag}(a_1 + b_1,\, \ldots,\, a_n + b_n)</math>
 
and for [[matrix multiplication]],
 
<math display=block>\operatorname{diag}(a_1,\, \ldots,\, a_n) \operatorname{diag}(b_1,\, \ldots,\, b_n) = \operatorname{diag}(a_1 b_1,\, \ldots,\, a_n b_n).</math>
 
The diagonal matrix {{math|diag(''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>)}} is [[invertible matrix|invertible]] [[if and only if]] the entries {{math|''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>}} are all nonzero. In this case, we have
 
<math display=block>\operatorname{diag}(a_1,\, \ldots,\, a_n)^{-1} = \operatorname{diag}(a_1^{-1},\, \ldots,\, a_n^{-1}).</math>
 
In particular, the diagonal matrices form a [[subring]] of the ring of all {{mvar|n}}-by-{{mvar|n}} matrices.
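
The correspondence between diagonal-matrix arithmetic and entrywise arithmetic on the diagonals can be checked numerically; a minimal NumPy sketch (sample values chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

a = np.array([2.0, 3.0, 5.0])
b = np.array([1.0, 4.0, 0.5])

# Addition and multiplication act entrywise on the diagonals.
print(np.allclose(np.diag(a) + np.diag(b), np.diag(a + b)))      # True
print(np.allclose(np.diag(a) @ np.diag(b), np.diag(a * b)))      # True

# Inversion (all a_i nonzero) simply inverts each diagonal entry.
print(np.allclose(np.linalg.inv(np.diag(a)), np.diag(1.0 / a)))  # True
</syntaxhighlight>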
 
Multiplying an {{mvar|n}}-by-{{mvar|n}} matrix {{math|'''A'''}} from the ''left'' with {{math|diag(''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>)}} amounts to multiplying the {{mvar|i}}-th ''row'' of {{math|'''A'''}} by {{math|''a''<sub>''i''</sub>}} for all {{mvar|i}}; multiplying the matrix {{math|'''A'''}} from the ''right'' with {{math|diag(''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>)}} amounts to multiplying the {{mvar|i}}-th ''column'' of {{math|'''A'''}} by {{math|''a''<sub>''i''</sub>}} for all {{mvar|i}}.
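
The row- and column-scaling interpretation can be expressed with array broadcasting; a minimal NumPy sketch (the library and sample values are assumptions of the example):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
d = np.array([2.0, 3.0, 5.0])
D = np.diag(d)

# Left multiplication scales the rows of A; right multiplication scales its columns.
print(np.allclose(D @ A, d[:, None] * A))   # row i multiplied by d_i
print(np.allclose(A @ D, A * d[None, :]))   # column i multiplied by d_i
</syntaxhighlight>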
 
== Operator matrix in eigenbasis ==
{{Main|Transformation matrix#Finding the matrix of a transformation|Eigenvalues and eigenvectors}}
 
As explained in [[transformation matrix#Finding the matrix of a transformation|determining coefficients of operator matrix]], there is a special basis, {{math|'''e'''<sub>1</sub>, ..., '''e'''<sub>''n''</sub>}}, for which the matrix {{math|'''A'''}} takes the diagonal form. Hence, in the defining equation <math display="inline">\mathbf{Ae}_j = \sum_i a_{i,j} \mathbf e_i</math>, all coefficients {{mvar|a{{sub|i, j}}}} with {{math|''i'' ≠ ''j''}} are zero, leaving only one term per sum. The surviving diagonal elements, {{mvar|a{{sub|i, i}}}}, are known as '''eigenvalues''' and designated with {{mvar|λ{{sub|i}}}} in the equation, which reduces to <math>\mathbf{Ae}_i = \lambda_i \mathbf e_i.</math> The resulting equation is known as the '''eigenvalue equation'''<ref>{{cite book |last=Nearing |first=James |year=2010 |title=Mathematical Tools for Physics |url=http://www.physics.miami.edu/nearing/mathmethods |chapter=Chapter 7.9: Eigenvalues and Eigenvectors |publisher=Dover Publications |chapter-url= http://www.physics.miami.edu/~nearing/mathmethods/operators.pdf |access-date=January 1, 2012|isbn=978-0486482125}}</ref> and used to derive the [[characteristic polynomial]] and, further, [[eigenvalues and eigenvectors]].
 
In other words, the [[eigenvalue]]s of {{math|diag(''λ''<sub>1</sub>, ..., ''λ''<sub>''n''</sub>)}} are {{math|''λ''<sub>1</sub>, ..., ''λ''<sub>''n''</sub>}} with associated [[eigenvector]]s {{math|'''e'''<sub>1</sub>, ..., '''e'''<sub>''n''</sub>}}.
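
A minimal NumPy sketch of this change to an eigenbasis, using <code>numpy.linalg.eig</code> on a small symmetric matrix (chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

# A symmetric matrix is guaranteed to have a basis of eigenvectors.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
X = eigenvectors                       # columns are the eigenvectors e_1, ..., e_n

# In the eigenbasis the operator is represented by diag(lambda_1, ..., lambda_n).
print(np.allclose(np.linalg.inv(X) @ A @ X, np.diag(eigenvalues)))   # True
</syntaxhighlight>
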
** A matrix is diagonal if and only if it is both [[triangular matrix|upper-]] and [[triangular matrix|lower-triangular]].
** A diagonal matrix is [[symmetric matrix|symmetric]].
* The [[identity matrix]] {{math|'''I'''<sub>''n''</sub>}} and [[zero matrix]] are diagonal.
* A 1×1 matrix is always diagonal.
* The square of a 2×2 matrix {{math|'''A'''}} with zero [[trace (linear algebra)|trace]] is always diagonal; indeed, by the [[Cayley–Hamilton theorem]] it equals {{math|−det('''A''')'''I'''<sub>2</sub>}}, a scalar matrix (see the numerical check below).
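
The last property can be checked with a concrete trace-zero matrix; a minimal NumPy sketch (sample values chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

# A 2-by-2 matrix with zero trace; by the Cayley–Hamilton theorem its square
# equals -det(A) times the identity, hence a (diagonal) scalar matrix.
A = np.array([[3.0,  2.0],
              [4.0, -3.0]])           # trace = 3 + (-3) = 0

print(A @ A)                           # [[17. 0.] [0. 17.]]  (= -det(A) * I, det(A) = -17)
</syntaxhighlight>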
 
== Applications ==
Diagonal matrices occur in many areas of linear algebra. Because of the simple description of the matrix operations and eigenvalues/eigenvectors given above, it is typically desirable to represent a given matrix or [[linear operator|linear map]] by a diagonal matrix.
 
In fact, a given {{mvar|n}}-by-{{mvar|n}} matrix {{math|'''A'''}} is [[similar matrix|similar]] to a diagonal matrix (meaning that there is a matrix {{math|'''X'''}} such that {{math|'''X'''<sup>−1</sup>'''AX'''}} is diagonal) if and only if it has {{mvar|n}} [[linearly independent]] eigenvectors. Such matrices are said to be [[diagonalizable matrix|diagonalizable]].
 
Over the [[field (mathematics)|field]] of [[real number|real]] or [[complex number|complex]] numbers, more is true. The [[spectral theorem]] says that every [[normal matrix]] is [[matrix similarity|unitarily similar]] to a diagonal matrix (if {{math|1='''AA'''<sup>∗</sup> = '''A'''<sup>∗</sup>'''A'''}} then there exists a [[unitary matrix]] {{math|'''U'''}} such that {{math|'''UAU'''<sup>∗</sup>}} is diagonal). Furthermore, the [[singular value decomposition]] implies that for any matrix {{math|'''A'''}}, there exist unitary matrices {{math|'''U'''}} and {{math|'''V'''}} such that {{math|'''U'''<sup>∗</sup>'''AV'''}} is diagonal with nonnegative entries.
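
A minimal NumPy sketch of the singular value decomposition producing a diagonal factor, using <code>numpy.linalg.svd</code> (random test matrix chosen only for illustration):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))

# Singular value decomposition: A = U diag(s) V^T with unitary U, V
# and nonnegative singular values s.
U, s, Vt = np.linalg.svd(A)
print(np.allclose(U.T @ A @ Vt.T, np.diag(s)))   # U* A V is diagonal
</syntaxhighlight>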
 
== Operator theory ==