Projection matrix: Difference between revisions

| first1= David C. | last1= Hoaglin |first2= Roy E. | last2=Welsch |journal= [[The American Statistician]] | volume=32 |date=February 1978| pages=17–22 | doi = 10.2307/2683469 |issue=1| jstor = 2683469 |url=http://dspace.mit.edu/bitstream/1721.1/1920/1/SWP-0901-02752210.pdf | hdl= 1721.1/1920 | hdl-access= free }}</ref><ref name = "Freedman09">{{cite book |author=David A. Freedman |author-link=David A. Freedman |year=2009|title=Statistical Models: Theory and Practice |publisher=[[Cambridge University Press]]}}</ref> The diagonal elements of the projection matrix are the [[leverage (statistics)|leverage]]s, which describe the influence each response value has on the fitted value for that same observation.
 
==Definition==
If the vector of [[Response variable|response values]] is denoted by <math>\mathbf{y}</math> and the vector of fitted values by <math>\mathbf{\hat{y}}</math>,
:<math>\mathbf{\hat{y}} = \mathbf{P} \mathbf{y}.</math>
As <math>\mathbf{\hat{y}}</math> is usually pronounced "y-hat", the projection matrix <math>\mathbf{P}</math> is also named ''hat matrix'' as it "puts a [[circumflex|hat]] on <math>\mathbf{y}</math>". The element in the ''i''th row and ''j''th column of <math>\mathbf{P}</math> is equal to the [[covariance]] between the ''j''th response value and the ''i''th fitted value, divided by the [[variance]] of the former:{{cn}}
:<math>p_{ij} = \frac{\operatorname{Cov}\left[ \hat{y}_i, y_j \right]}{\operatorname{Var}\left[y_j \right]}</math>
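
As an illustration, the relation <math>\mathbf{\hat{y}} = \mathbf{P} \mathbf{y}</math> can be checked numerically. The following is a minimal sketch, not a reference implementation. It assumes ordinary least squares with a full-rank design matrix <code>X</code>, for which the projection matrix takes the form <math>\mathbf{P} = \mathbf{X} \left( \mathbf{X}^\textsf{T} \mathbf{X} \right)^{-1} \mathbf{X}^\textsf{T}</math>. The toy data and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 5 observations, an intercept and one regressor.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])

# Assumed OLS form of the projection (hat) matrix: P = X (X^T X)^{-1} X^T.
# solve() is used instead of forming the inverse explicitly.
P = X @ np.linalg.solve(X.T @ X, X.T)

# Fitted values obtained by "putting a hat" on y.
y_hat = P @ y

# Cross-check against a direct least-squares fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(y_hat, X @ beta)

# P is symmetric and idempotent, as expected of an orthogonal projection.
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)

# The leverages are the diagonal elements of P.
leverages = np.diag(P)

# The fitted values are linear in y: raising y_j by one unit changes
# the fitted values by column j of P (here j = 2).
bump = np.zeros_like(y)
bump[2] = 1.0
assert np.allclose(P @ (y + bump) - y_hat, P[:, 2])
```

In this example the leverages are largest for the two observations whose regressor values lie farthest from the mean, and they sum to the number of columns of <code>X</code>.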
 
==Application for residuals==
The formula for the vector of [[errors and residuals in statistics|residual]]s <math>\mathbf{r}</math> can also be expressed compactly using the projection matrix:
:<math>\mathbf{r} = \mathbf{y} - \mathbf{\hat{y}} = \mathbf{y} - \mathbf{P} \mathbf{y} = \left( \mathbf{I} - \mathbf{P} \right) \mathbf{y},</math>
where <math>\mathbf{I}</math> is the [[identity matrix]]. The matrix <math>\mathbf{M} \equiv \mathbf{I} - \mathbf{P}</math> is sometimes referred to as the '''residual maker matrix'''.
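
The residual maker matrix can be illustrated with a short numerical sketch. As an assumption that goes beyond the text above, it uses the ordinary least squares form <math>\mathbf{P} = \mathbf{X} \left( \mathbf{X}^\textsf{T} \mathbf{X} \right)^{-1} \mathbf{X}^\textsf{T}</math> with a hypothetical full-rank design matrix <code>X</code> and made-up data.

```python
import numpy as np

# Hypothetical toy data: 5 observations, an intercept and one regressor.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([1.0, 2.9, 5.2, 7.1, 8.8])

# Assumed OLS form of the projection matrix.
P = X @ np.linalg.solve(X.T @ X, X.T)

# Residual maker matrix M = I - P.
M = np.eye(len(y)) - P

# Applying M to y yields the residual vector r = y - y_hat.
r = M @ y
assert np.allclose(r, y - P @ y)

# M annihilates the columns of X, so the residuals are orthogonal
# to every regressor (including the intercept column).
assert np.allclose(M @ X, 0.0)
assert np.allclose(X.T @ r, 0.0)

# Like P, the matrix M is symmetric and idempotent.
assert np.allclose(M, M.T)
assert np.allclose(M @ M, M)
```

Because the design matrix here contains an intercept column, the orthogonality check also implies that the residuals sum to zero.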
 
The [[covariance matrix]] of the residuals <math>\mathbf{r}</math>, by [[error propagation]], equals
:<math>\mathbf{\Sigma}_\mathbf{r} = \left( \mathbf{I} - \mathbf{P} \right)^\textsf{T} \mathbf{\Sigma} \left( \mathbf{I}-\mathbf{P} \right)</math>,
where <math>\mathbf{\Sigma}</math> is the [[covariance matrix]] of the error vector (and by extension, the response vector as well). For the case of linear models with [[independent and identically distributed]] errors in which <math>\mathbf{\Sigma} = \sigma^{2} \mathbf{I}</math>, this reduces to:<ref name="Hoaglin1977"/>