{{Short description|Statistical measure of the discrepancy between data and an estimation model}}
{{more citations needed|date=April 2013}}
In [[statistics]], the '''residual sum of squares''' ('''RSS'''), also known as the '''sum of squared residuals''' ('''SSR''') or the '''sum of squared estimate of errors''' ('''SSE'''), is the [[summation|sum]] of the [[square (arithmetic)|squares]] of [[errors and residuals in statistics|residuals]] (deviations predicted from actual empirical values of data). It is a measure of the discrepancy between the data and an estimation model, such as a [[linear regression]]. A small RSS indicates a tight fit of the model to the data. It is used as an [[optimality criterion]] in parameter selection and [[model selection]].
 
In general, [[total sum of squares]] = [[explained sum of squares]] + residual sum of squares. For a proof of this in the multivariate [[ordinary least squares]] (OLS) case, see [[Explained sum of squares#Partitioning in the general ordinary least squares model|partitioning in the general OLS model]].
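
This decomposition can be checked numerically. The following minimal sketch uses hypothetical data values and assumes [[NumPy]] is available; it fits an ordinary least squares line with an intercept and verifies that the total sum of squares equals the explained plus residual sums of squares.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data and an ordinary least squares fit of y on x (with intercept)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.3, 5.9, 8.2, 9.8])
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta_hat

tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
rss = np.sum((y - y_hat) ** 2)         # residual sum of squares

print(np.isclose(tss, ess + rss))      # True: TSS = ESS + RSS for an OLS fit with intercept
</syntaxhighlight>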
 
==One explanatory variable==
 
In a model with a single explanatory variable, RSS is given by:<ref>{{Cite book|title=Correlation and regression analysis : a historian's guide|last=Archdeacon, Thomas J.|date=1994|publisher=University of Wisconsin Press|isbn=0-299-13650-7|pages=161–162|oclc=27266095}}</ref>
 
<math display="block">\operatorname{RSS} = \sum_{i=1}^n \left(y_i - f(x_i)\right)^2. </math>
 
where ''y''<sub>''i''</sub> is the ''i''<sup>th</sup> value of the variable to be predicted, ''x''<sub>''i''</sub> is the ''i''<sup>th</sup> value of the explanatory variable, and <math>f(x_i)</math> is the predicted value of ''y''<sub>''i''</sub> (also termed <math>\hat{y_i}</math>).
In a standard linear simple [[regression model]], <math>y_i = \alpha + \beta x_i + \varepsilon_i\,</math>, where <math>\alpha</math> and <math>\beta</math> are [[coefficient]]s, ''y'' and ''x'' are the [[regressand]] and the [[regressor]], respectively, and <math>\varepsilon</math> is the [[errors and residuals in statistics|error term]]. The sum of squares of residuals is the sum of squares of the [[estimator|estimates]] <math>\widehat{\varepsilon\,}_i</math> of <math>\varepsilon_i</math>; that is
 
<math display="block">\operatorname{RSS} = \sum_{i=1}^n \left(\widehat{\varepsilon}_i\right)^2 = \sum_{i=1}^n \left(y_i - (\widehat{\alpha\,} + \widehat{\beta}\, x_i)\right)^2 </math>
 
where <math>\widehat{\alpha\,}</math> is the estimated value of the constant term <math>\alpha</math> and <math>\widehat{\beta\,}</math> is the estimated value of the slope coefficient <math>\beta</math>.
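
For example, the following minimal sketch, using hypothetical data values and assuming [[NumPy]] is available, computes the ordinary least squares estimates <math>\widehat{\alpha\,}</math> and <math>\widehat{\beta\,}</math> and the resulting residual sum of squares.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical observations of the regressor x and the regressand y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Ordinary least squares estimates of the slope and the intercept
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Estimated residuals and their sum of squares
residuals = y - (alpha_hat + beta_hat * x)
rss = np.sum(residuals ** 2)
print(rss)
</syntaxhighlight>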
 
==Matrix expression for the OLS residual sum of squares==
 
The general regression model with {{mvar|n}} observations and {{mvar|k}} explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is
 
<math display="block"> y = X \beta + e</math>
 
where {{mvar|y}} is an ''n'' × 1 vector of dependent variable observations, each column of the ''n'' × ''k'' matrix {{mvar|X}} is a vector of observations on one of the ''k'' explanators, <math>\beta</math> is a ''k'' × 1 vector of true coefficients, and {{mvar|e}} is an ''n'' × 1 vector of the true underlying errors. The [[ordinary least squares]] estimator for <math>\beta</math> is
 
<math display="block"> \begin{align}
&X \hat \beta = y \\[1ex]
\iff &
X^\operatorname{T} X \hat \beta = X^\operatorname{T} y \\[1ex]
\iff &
\hat \beta = \left(X^\operatorname{T} X\right)^{-1}X^\operatorname{T} y.
\end{align}</math>
 
The residual vector <math>\hat e = y - X \hat \beta = y - X (X^\operatorname{T} X)^{-1}X^\operatorname{T} y</math>; so the residual sum of squares is:
 
<math display="block">\operatorname{RSS} = \hat e ^\operatorname{T} \hat e = \left\| \hat e \right\|^2 ,</math>
 
{{anchor|Norm of residuals}}(equivalent to the square of the [[vector norm|norm]] of residuals). In full:
 
<math display="block">\begin{align}
\operatorname{RSS} &= y^\operatorname{T} y - y^\operatorname{T} X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} y \\[1ex]
&= y^\operatorname{T} \left[I - X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T}\right] y \\[1ex]
&= y^\operatorname{T} \left[I - H\right] y,
\end{align}</math>
 
where {{mvar|H}} is the [[hat matrix]], or the projection matrix in linear regression.
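
For example, the following minimal sketch, with a hypothetical design matrix and assuming [[NumPy]] is available, computes the RSS both from the residual vector and from the quadratic form <math>y^\operatorname{T}(I - H)y</math>; the two values agree up to rounding error.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design matrix: constant column plus two explanators (n = 50, k = 3)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Hat (projection) matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T

# RSS via the quadratic form y^T (I - H) y
rss_quadratic = y @ (np.eye(n) - H) @ y

# RSS via the residual vector e_hat = y - X beta_hat
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e_hat = y - X @ beta_hat
rss_residuals = e_hat @ e_hat

print(rss_quadratic, rss_residuals)  # the two values agree up to rounding
</syntaxhighlight>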
 
== Relation with Pearson's product-moment correlation ==
The [[Least squares|least-squares regression line]] is given by
 
<math display="block">y = ax + b,</math>
 
where <math>b=\bar{y}-a\bar{x}</math> and <math>a=\frac{S_{xy}}{S_{xx}}</math>, where <math>S_{xy}=\sum_{i=1}^n(\bar{x}-x_i)(\bar{y}-y_i)</math> and <math>S_{xx}=\sum_{i=1}^n(\bar{x}-x_i)^2.</math>
 
Therefore,
 
<math display="block">
\begin{align}
\operatorname{RSS} & = \sum_{i=1}^n \left(y_i - f(x_i)\right)^2
= \sum_{i=1}^n \left(y_i - (ax_i+b)\right)^2 \\[1ex]
&= \sum_{i=1}^n \left(y_i - ax_i-\bar{y} + a\bar{x}\right)^2
= \sum_{i=1}^n \left[a\left(\bar{x} - x_i\right) - \left(\bar{y} - y_i\right)\right]^2 \\[1ex]
&= a^2 S_{xx} - 2aS_{xy} + S_{yy}
= S_{yy} - aS_{xy} \\[1ex]
&=S_{yy} \left(1 - \frac{S_{xy}^2}{S_{xx} S_{yy}} \right)
\end{align}
</math>
 
where <math>S_{yy} = \sum_{i=1}^n (\bar{y} - y_i)^2.</math>
 
The [[Pearson correlation coefficient|Pearson product-moment correlation]] is given by <math>r=\frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}; </math> therefore, <math>\operatorname{RSS}=S_{yy}(1-r^2). </math>
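
For example, the following minimal sketch, with hypothetical data values and assuming [[NumPy]] is available, computes the RSS directly from the fitted least-squares line and from the identity <math>\operatorname{RSS} = S_{yy}(1 - r^2)</math>.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.9])

Sxx = np.sum((x - x.mean()) ** 2)
Syy = np.sum((y - y.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))

# Least-squares line y = a x + b
a = Sxy / Sxx
b = y.mean() - a * x.mean()

# RSS computed directly from the residuals
rss_direct = np.sum((y - (a * x + b)) ** 2)

# RSS from the correlation identity RSS = S_yy (1 - r^2)
r = Sxy / np.sqrt(Sxx * Syy)
rss_from_r = Syy * (1 - r ** 2)

print(rss_direct, rss_from_r)  # the two values agree up to rounding
</syntaxhighlight>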
 
==See also==
{{div col}}
*{{slink|Akaike information criterion#Comparison with least squares}}
*{{slink|Chi-squared distribution#Applications}}
*{{slink|Degrees of freedom (statistics)#Sum of squares and degrees of freedom}}
*[[Errors and residuals in statistics]]
*[[Lack-of-fit sum of squares]]
*[[Mean squared error]]
*[[Reduced chi-squared statistic]], RSS per degree of freedom
*[[Squared deviations]]
*[[Sum of squares (statistics)]]
{{div col end}}
 
==References==
{{Reflist}}
 
* {{cite book
|title = Applied Regression Analysis
|edition = 3rd
|last1= Draper |first1=N.R. |last2=Smith |first2=H.
|publisher = John Wiley
|year = 1998
|isbn = 0-471-17082-8}}
 
[[Category:Least squares]]
[[Category:Errors and residuals]]