{{Short description|Statistical measure of the discrepancy between data and an estimation model}}
{{more citations needed|date=April 2013}}
In [[statistics]], the '''residual sum of squares''' ('''RSS'''), also known as the '''sum of squared residuals''' ('''SSR''') or the '''sum of squared estimate of errors''' ('''SSE'''), is the [[summation|sum]] of the [[square (arithmetic)|squares]] of [[errors and residuals in statistics|residuals]] (deviations predicted from actual empirical values of data). It is a measure of the discrepancy between the data and an estimation model, such as a [[linear regression]]. A small RSS indicates a tight fit of the model to the data. It is used as an [[optimality criterion]] in parameter selection and [[model selection]].
 
In general, [[total sum of squares]] = [[explained sum of squares]] + residual sum of squares. For a proof of this in the multivariate [[ordinary least squares]] (OLS) case, see [[Explained sum of squares#Partitioning in the general ordinary least squares model|partitioning in the general OLS model]].
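
This decomposition can be checked numerically. The following minimal sketch uses hypothetical data values and assumes [[NumPy]] is available; it fits an ordinary least squares line with an intercept and verifies that the total sum of squares equals the explained plus residual sums of squares.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data and an ordinary least squares fit of y on x (with intercept)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.3, 5.9, 8.2, 9.8])
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta_hat

tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
rss = np.sum((y - y_hat) ** 2)         # residual sum of squares

print(np.isclose(tss, ess + rss))      # True: TSS = ESS + RSS for an OLS fit with intercept
</syntaxhighlight>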
 
==One explanatory variable==
 
In a model with a single explanatory variable, RSS is given by:<ref>{{Cite book|title=Correlation and regression analysis : a historian's guide|last=Archdeacon, Thomas J.|date=1994|publisher=University of Wisconsin Press|isbn=0-299-13650-7|pages=161–162|oclc=27266095}}</ref>
 
<math display="block">\operatorname{RSS} = \sum_{i=1}^n \left(y_i - f(x_i)\right)^2. </math>
 
where ''y''<sub>''i''</sub> is the ''i''<sup>th</sup> value of the variable to be predicted, ''x''<sub>''i''</sub> is the ''i''<sup>th</sup> value of the explanatory variable, and <math>f(x_i)</math> is the predicted value of ''y''<sub>''i''</sub> (also termed <math>\hat{y_i}</math>).
In a standard linear simple [[regression model]], <math>y_i = \alpha + \beta x_i + \varepsilon_i\,</math>, where <math>\alpha</math> and <math>\beta</math> are [[coefficient]]s, ''y'' and ''x'' are the [[regressand]] and the [[regressor]], respectively, and <math>\varepsilon</math> is the [[errors and residuals in statistics|error term]]. The sum of squares of residuals is the sum of squares of the [[estimator|estimates]] <math>\widehat{\varepsilon\,}_i</math> of <math>\varepsilon_i</math>; that is
 
<math display="block">\operatorname{RSS} = \sum_{i=1}^n \left(\widehat{\varepsilon}_i\right)^2 = \sum_{i=1}^n \left(y_i - (\widehat{\alpha\,} + \widehat{\beta}\, x_i)\right)^2 </math>
 
where <math>\widehat{\alpha\,}</math> is the estimated value of the constant term <math>\alpha</math> and <math>\widehat{\beta\,}</math> is the estimated value of the slope coefficient <math>\beta</math>.
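
For example, the following minimal sketch, using hypothetical data values and assuming [[NumPy]] is available, computes the ordinary least squares estimates <math>\widehat{\alpha\,}</math> and <math>\widehat{\beta\,}</math> and the resulting residual sum of squares.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical observations of the regressor x and the regressand y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Ordinary least squares estimates of the slope and the intercept
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Estimated residuals and their sum of squares
residuals = y - (alpha_hat + beta_hat * x)
rss = np.sum(residuals ** 2)
print(rss)
</syntaxhighlight>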
 
==Matrix expression for the OLS residual sum of squares==
 
The general regression model with {{mvar|n}} observations and {{mvar|k}} explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is
 
<math display="block"> y = X \beta + e</math>
 
where {{mvar|y}} is an ''n'' × 1 vector of dependent variable observations, each column of the ''n'' × ''k'' matrix {{mvar|X}} is a vector of observations on one of the ''k'' explanators, <math>\beta</math> is a ''k'' × 1 vector of true coefficients, and {{mvar|e}} is an ''n'' × 1 vector of the true underlying errors. The [[ordinary least squares]] estimator for <math>\beta</math> is
 
<math display="block"> \begin{align}
&X \hat \beta = y \\[1ex]
\iff &
X^\operatorname{T} X \hat \beta = X^\operatorname{T} y \\[1ex]
\iff &
\hat \beta = \left(X^\operatorname{T} X\right)^{-1}X^\operatorname{T} y.
\end{align}</math>
 
The residual vector <math>\hat e = y - X \hat \beta = y - X (X^\operatorname{T} X)^{-1}X^\operatorname{T} y</math>; so the residual sum of squares is:
 
<math display="block">\operatorname{RSS} = \hat e ^\operatorname{T} \hat e = \left\| \hat e \right\|^2 ,</math>
 
{{anchor|Norm of residuals}}(equivalent to the square of the [[vector norm|norm]] of residuals). In full:
 
<math display="block">\begin{align}
\operatorname{RSS} &= y^\operatorname{T} y - y^\operatorname{T} X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T} y \\[1ex]
&= y^\operatorname{T} \left[I - X \left(X^\operatorname{T} X\right)^{-1} X^\operatorname{T}\right] y \\[1ex]
&= y^\operatorname{T} \left[I - H\right] y,
\end{align}</math>
 
where {{mvar|H}} is the [[hat matrix]], or the projection matrix in linear regression.
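
For example, the following minimal sketch, with a hypothetical design matrix and assuming [[NumPy]] is available, computes the RSS both from the residual vector and from the quadratic form <math>y^\operatorname{T}(I - H)y</math>; the two values agree up to rounding error.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design matrix: constant column plus two explanators (n = 50, k = 3)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Hat (projection) matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T

# RSS via the quadratic form y^T (I - H) y
rss_quadratic = y @ (np.eye(n) - H) @ y

# RSS via the residual vector e_hat = y - X beta_hat
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e_hat = y - X @ beta_hat
rss_residuals = e_hat @ e_hat

print(rss_quadratic, rss_residuals)  # the two values agree up to rounding
</syntaxhighlight>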
 
== Relation with Pearson's product-moment correlation ==
The [[Least squares|least-squares regression line]] is given by
 
<math display="block">y = ax + b,</math>
 
where <math>b=\bar{y}-a\bar{x}</math> and <math>a=\frac{S_{xy}}{S_{xx}}</math>, where <math>S_{xy}=\sum_{i=1}^n(\bar{x}-x_i)(\bar{y}-y_i)</math> and <math>S_{xx}=\sum_{i=1}^n(\bar{x}-x_i)^2.</math>
 
Therefore,
 
<math display="block">
\begin{align}
\operatorname{RSS} & = \sum_{i=1}^n \left(y_i - f(x_i)\right)^2
= \sum_{i=1}^n \left(y_i - (ax_i+b)\right)^2 \\[1ex]
&= \sum_{i=1}^n \left(y_i - ax_i-\bar{y} + a\bar{x}\right)^2
= \sum_{i=1}^n \left[a\left(\bar{x} - x_i\right) - \left(\bar{y} - y_i\right)\right]^2 \\[1ex]
&= a^2 S_{xx} - 2aS_{xy} + S_{yy}
= S_{yy} - aS_{xy} \\[1ex]
&=S_{yy} \left(1 - \frac{S_{xy}^2}{S_{xx} S_{yy}} \right)
\end{align}
</math>
 
where <math>S_{yy} = \sum_{i=1}^n (\bar{y} - y_i)^2.</math>
 
The [[Pearson correlation coefficient|Pearson product-moment correlation]] is given by <math>r=\frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}; </math> therefore, <math>\operatorname{RSS}=S_{yy}(1-r^2). </math>
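
For example, the following minimal sketch, with hypothetical data values and assuming [[NumPy]] is available, computes the RSS directly from the fitted least-squares line and from the identity <math>\operatorname{RSS} = S_{yy}(1 - r^2)</math>.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.9])

Sxx = np.sum((x - x.mean()) ** 2)
Syy = np.sum((y - y.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))

# Least-squares line y = a x + b
a = Sxy / Sxx
b = y.mean() - a * x.mean()

# RSS computed directly from the residuals
rss_direct = np.sum((y - (a * x + b)) ** 2)

# RSS from the correlation identity RSS = S_yy (1 - r^2)
r = Sxy / np.sqrt(Sxx * Syy)
rss_from_r = Syy * (1 - r ** 2)

print(rss_direct, rss_from_r)  # the two values agree up to rounding
</syntaxhighlight>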
 
==See also==
{{div col}}
*{{slink|Akaike information criterion#Comparison with least squares}}
*{{slink|Chi-squared distribution#Applications}}
*{{slink|Degrees of freedom (statistics)#Sum of squares and degrees of freedom}}
*[[Errors and residuals in statistics]]
*[[Lack-of-fit sum of squares]]
*[[Mean squared error]]
*[[Reduced chi-squared statistic]], RSS per degree of freedom
*[[Squared deviations]]
*[[Sum of squares (statistics)]]
{{div col end}}
 
==References==
{{Reflist}}
 
* {{cite book
|title = Applied Regression Analysis
|edition = 3rd
|last1= Draper |first1=N.R. |last2=Smith |first2=H.
|publisher = John Wiley
|year = 1998
|isbn = 0-471-17082-8}}
 
[[Category:Least squares]]
[[Category:Errors and residuals]]