Projection matrix: Difference between revisions

m Intuition: punctuation
 
(6 intermediate revisions by 5 users not shown)
Line 7:
If the vector of [[Response variable|response values]] is denoted by <math>\mathbf{y}</math> and the vector of fitted values by <math>\mathbf{\hat{y}}</math>,
:<math>\mathbf{\hat{y}} = \mathbf{P} \mathbf{y}.</math>
Because <math>\mathbf{\hat{y}}</math> is usually pronounced "y-hat", the projection matrix <math>\mathbf{P}</math> is also called the ''hat matrix'', as it "puts a [[circumflex|hat]] on <math>\mathbf{y}</math>".
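
The relation <math>\mathbf{\hat{y}} = \mathbf{P} \mathbf{y}</math> can be illustrated numerically. The following is a minimal sketch, assuming ordinary least squares with a full-column-rank design matrix, here called <code>X</code> (a name and data chosen only for illustration), for which <math>\mathbf{P} = \mathbf{X}\left(\mathbf{X}^\textsf{T}\mathbf{X}\right)^{-1}\mathbf{X}^\textsf{T}</math>:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Hypothetical design matrix: an intercept column plus one random regressor.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)  # hypothetical response values

# Hat matrix P = X (X^T X)^{-1} X^T, formed with a linear solve
# rather than an explicit matrix inverse.
P = X @ np.linalg.solve(X.T @ X, X.T)

# Fitted values obtained by "putting a hat on y".
y_hat = P @ y

# The same fitted values from a least-squares solver, for comparison.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat_lstsq = X @ beta
```

Under these assumptions <code>y_hat</code> and <code>y_hat_lstsq</code> agree up to rounding error, and <code>P</code> is symmetric and idempotent with trace equal to the number of columns of <code>X</code>.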
 
The element in the ''i''th row and ''j''th column of <math>\mathbf{P}</math> is equal to the [[covariance]] between the ''j''th response value and the ''i''th fitted value, divided by the [[variance]] of the former:<ref>Wood, Simon N. (2006). ''Generalized Additive Models: An Introduction with R''. Chapman and Hall/CRC.</ref>
:<math>p_{ij} = \frac{\operatorname{Cov}\left[ \hat{y}_i, y_j \right]}{\operatorname{Var}\left[y_j \right]}</math>
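This identity can be derived in one step. The sketch below assumes (an assumption not stated above) that <math>\mathbf{P}</math> is non-random and that the response values are mutually uncorrelated, so that <math>\operatorname{Cov}\left[y_k, y_j\right] = 0</math> for <math>k \neq j</math>. Since <math>\hat{y}_i = \sum_k p_{ik} y_k</math>,
:<math>\operatorname{Cov}\left[ \hat{y}_i, y_j \right] = \sum_k p_{ik} \operatorname{Cov}\left[ y_k, y_j \right] = p_{ij} \operatorname{Var}\left[ y_j \right],</math>
and dividing both sides by <math>\operatorname{Var}\left[y_j\right]</math> gives the expression for <math>p_{ij}</math>.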
 
==Application for residuals==
Line 33 ⟶ 30:
\Rightarrow && \mathbf{A}^\textsf{T}\mathbf{b} &= \mathbf{A}^\textsf{T}\mathbf{Ax} \\
\Rightarrow && \mathbf{x} &= \left(\mathbf{A}^\textsf{T}\mathbf{A}\right)^{-1}\mathbf{A}^\textsf{T}\mathbf{b}
\end{align}</math>.
 
Therefore, since <math>\mathbf{Ax}</math> lies in the column space of <math>\mathbf{A}</math>, the projection matrix, which maps <math>\mathbf{b}</math> onto <math>\mathbf{Ax}</math>, is <math>\mathbf{A}\left(\mathbf{A}^\textsf{T}\mathbf{A}\right)^{-1}\mathbf{A}^\textsf{T}</math>.
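
The normal-equations argument can be checked on a small example. This is an illustrative sketch only: the matrix <code>A</code> and vector <code>b</code> below are hypothetical values chosen so that the arithmetic is easy to follow, and <code>A</code> is assumed to have linearly independent columns so that <math>\mathbf{A}^\textsf{T}\mathbf{A}</math> is invertible.

```python
import numpy as np

# Hypothetical 3x2 matrix with linearly independent columns, and a vector b
# that does not lie in the column space of A.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Solve the normal equations A^T A x = A^T b for the coefficients x.
x = np.linalg.solve(A.T @ A, A.T @ b)      # approximately [5, -3]

# Projection matrix onto the column space of A.
P = A @ np.linalg.solve(A.T @ A, A.T)

# Projecting b gives the same vector as A x.
p = P @ b                                   # approximately [5, 2, -1]

# The residual is orthogonal to every column of A.
residual = b - p                            # approximately [1, -2, 1]
```

Here <code>A.T @ residual</code> is zero up to rounding error, which is the orthogonality condition that the derivation starts from.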
 
== Linear model ==
Line 98 ⟶ 95:
where, e.g., <math>\mathbf{P}[\mathbf{A}] = \mathbf{A} \left(\mathbf{A}^\textsf{T} \mathbf{A} \right)^{-1} \mathbf{A}^\textsf{T}</math> and <math>\mathbf{M}[\mathbf{A}] = \mathbf{I} - \mathbf{P}[\mathbf{A}]</math>.
This decomposition has several applications. In the classical application, <math>\mathbf{A}</math> is a column of all ones, which makes it possible to analyze the effect of adding an intercept term to a regression. Another use is in the [[fixed effects model]], where <math>\mathbf{A}</math> is a large [[sparse matrix]] of the dummy variables for the fixed effect terms. One can use this partition to compute the hat matrix of <math>\mathbf{X}</math> without explicitly forming the matrix <math>\mathbf{X}</math>, which might be too large to fit into computer memory.
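
The intercept case can be sketched numerically. The example assumes that the decomposition referred to above is the blockwise identity <math>\mathbf{P}[\mathbf{X}] = \mathbf{P}[\mathbf{A}] + \mathbf{P}\left[\mathbf{M}[\mathbf{A}]\mathbf{B}\right]</math> for <math>\mathbf{X} = \begin{bmatrix}\mathbf{A} & \mathbf{B}\end{bmatrix}</math>; the helper <code>proj</code>, the block <code>B</code> and the data are hypothetical names and values introduced only for illustration.

```python
import numpy as np

def proj(A):
    """Orthogonal projector onto the column space of A (full column rank assumed)."""
    return A @ np.linalg.solve(A.T @ A, A.T)

rng = np.random.default_rng(1)
n = 8

A = np.ones((n, 1))            # column of all ones (the intercept)
B = rng.normal(size=(n, 2))    # hypothetical remaining regressors
X = np.hstack([A, B])          # full design matrix, formed here only for comparison

# M[A] = I - P[A]; with A a column of ones this subtracts each column's mean.
M_A = np.eye(n) - proj(A)

# Hat matrix assembled from the two blocks.
P_partitioned = proj(A) + proj(M_A @ B)

# Hat matrix computed directly from X.
P_direct = proj(X)
```

With <code>A</code> a column of ones, <code>M_A @ B</code> is simply <code>B</code> with each column centred on its mean, and <code>P_partitioned</code> matches <code>P_direct</code> up to rounding error.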
==History==
The hat matrix was introduced by John Wilder Tukey in 1972. An article by Hoaglin and Welsch (1978) describes the properties of the matrix and gives many examples of its application.
 
== See also ==
Line 111 ⟶ 110:
 
[[Category:Regression analysis]]
[[Category:Matrices (mathematics)]]