Coefficient of multiple correlation: Difference between revisions

Undid revision 1216542234 by 95.83.136.159 (talk)
{{Short description|Statistical concept}}
{{More footnotes|date=November 2010}}
 
In [[statistics]], the '''coefficient of multiple correlation''' is a measure of how well a given variable can be predicted using a [[linear function]] of a set of other variables. It is the [[Pearson correlation|correlation]] between the variable's values and the best predictions that can be computed [[linear equation|linearly]] from the predictive variables.<ref>[http://onlinestatbook.com/2/regression/multiple_regression.html Introduction to Multiple Regression]</ref>
 
The coefficient of multiple correlation takes values between 0 and 1. Higher values indicate higher predictability of the [[dependent and independent variables|dependent variable]] from the [[dependent and independent variables|independent variables]], with a value of 1 indicating that the predictions are exactly correct and a value of 0 indicating that no linear combination of the independent variables is a better predictor than is the fixed [[mean]] of the dependent variable.<ref>[http://mtweb.mtsu.edu/stats/regression/level3/multicorrel/multicorrcoef.htm Multiple correlation coefficient]</ref>
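{| class="wikitable"
|+ Conventional reading of an ordinary (pairwise) correlation coefficient ''r''; the coefficient of multiple correlation itself takes only values between 0 and 1
! Correlation coefficient (''r'')
! Direction and strength of correlation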
|-
|1
|Perfectly positive
|-
|0.8
|Strongly positive
|-
|0.5
|Moderately positive
|-
|0.2
|Weakly positive
|-
|0
|No association
|-
| -0.2
|Weakly negative
|-
| -0.5
|Moderately negative
|-
| -0.8
|Strongly negative
|-
| -1
|Perfectly negative
|}
The coefficient of multiple correlation equals the square root of the [[coefficient of determination]], but only under the particular assumptions that an intercept is included and that the best possible linear predictors are used. The coefficient of determination, by contrast, is defined for more general cases, including those of nonlinear prediction and those in which the predicted values have not been derived from a model-fitting procedure.
 
==Definition==
==Computation==
 
The square of the coefficient of multiple correlation can be computed using the [[Euclidean space|vector]] <math>\mathbf{c} = {(r_{x_1 y}, r_{x_2 y},\dots,r_{x_N y})}^\top</math> of cross-[[correlation]]s <math>r_{x_n y}</math> between the predictor variables <math>x_n</math> (independent variables) and the target variable <math>y</math> (dependent variable), and the [[correlation matrix]] <math>R_{xx}</math> of inter-correlations between predictor variables. It is given by
 
::<math>R^2 = \mathbf{c}^\top R_{xx}^{-1}\, \mathbf{c},</math>
 
where <math>\mathbf{c}^\top</math> is the [[transpose]] of <math>\mathbf{c}</math>, and <math>R_{xx}^{-1}</math> is the [[Matrix inversion|inverse]] of the matrix

::<math>R_{xx} = \left(\begin{array}{cccc}
r_{x_1 x_1} & r_{x_1 x_2} & \dots & r_{x_1 x_N} \\
r_{x_2 x_1} & \ddots & & \vdots \\
\vdots & & \ddots & \\
r_{x_N x_1} & \dots & & r_{x_N x_N}
\end{array}\right).</math>
 
If all the predictor variables are uncorrelated, the matrix <math>R_{xx}</math> is the identity matrix and <math>R^2</math> simply equals <math>\mathbf{c}^\top\, \mathbf{c}</math>, the sum of the squared cross-correlations with the dependent variable. If the predictor variables are correlated among themselves, the inverse of the correlation matrix <math>R_{xx}</math> accounts for this.
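
As an illustration only, the formula <math>R^2 = \mathbf{c}^\top R_{xx}^{-1}\, \mathbf{c}</math> can be evaluated numerically along the following lines. This Python sketch uses just the standard library. The helper names <code>pearson</code>, <code>solve</code> and <code>multiple_r_squared</code> and the sample data are assumptions made for this example, not part of any statistics package. Instead of forming the inverse explicitly, it solves <math>R_{xx}\,\mathbf{w} = \mathbf{c}</math> and returns <math>\mathbf{c}^\top \mathbf{w}</math>, which is the same quantity.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w

def multiple_r_squared(predictors, y):
    """R^2 = c^T Rxx^{-1} c, with c and Rxx built from sample correlations."""
    c = [pearson(x, y) for x in predictors]
    r_xx = [[pearson(xi, xj) for xj in predictors] for xi in predictors]
    w = solve(r_xx, c)  # w = Rxx^{-1} c, without forming the inverse
    return sum(ci * wi for ci, wi in zip(c, w))

# Illustrative data (made up for this sketch): two correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

r_squared = multiple_r_squared([x1, x2], y)
print(f"R^2 = {r_squared:.4f}, R = {math.sqrt(r_squared):.4f}")
```

When the predictors are uncorrelated, the solve step returns <math>\mathbf{c}</math> unchanged, so the function reduces to the sum of squared correlations.
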
The squared coefficient of multiple correlation can also be computed as the fraction of variance of the dependent variable that is explained by the independent variables, which in turn is 1 minus the unexplained fraction. The unexplained fraction can be computed as the [[sum of squares of residuals]] (that is, the sum of the squares of the prediction errors) divided by the [[Total sum of squares|sum of squares of deviations of the values of the dependent variable]] from its [[expected value]].
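
The residual-based computation can be sketched in the same spirit. The Python example below is a minimal illustration that assumes exactly two predictors and an intercept, so that the least-squares slopes follow from the centered 2×2 normal equations by Cramer's rule. The function names <code>fit_two_predictors</code> and <code>r_squared_from_residuals</code> and the data are hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit y ~ b0 + b1*x1 + b2*x2; returns the fitted values."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [a - m1 for a in x1]  # centered predictors and target
    d2 = [a - m2 for a in x2]
    dy = [a - my for a in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 * s12  # zero if the predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det  # Cramer's rule on the 2x2 system
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2  # intercept recovered from the means
    return [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]

def r_squared_from_residuals(y, fitted):
    """1 minus (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Illustrative data (made up for this sketch).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

fitted = fit_two_predictors(x1, x2, y)
r2 = r_squared_from_residuals(y, fitted)
print(f"R^2 = {r2:.4f}")
```

Because the fit includes an intercept and uses the least-squares coefficients, this value agrees with the one obtained from the correlation-matrix formula on the same data.
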
==Properties==
 
With more than two variables being related to each other, the value of the coefficient of multiple correlation depends on the choice of dependent variable: a regression of <math>y</math> on <math>x</math> and <math>z</math> will in general have a different <math>R</math> than will a regression of <math>z</math> on <math>x</math> and <math>y</math>. For example, suppose that in a particular sample the variable <math>z</math> is [[Correlation and dependence|uncorrelated]] with both <math>x</math> and <math>y</math>, while <math>x</math> and <math>y</math> are linearly related to each other. Then a regression of <math>z</math> on <math>y</math> and <math>x</math> will yield an <math>R</math> of zero, while a regression of <math>y</math> on <math>x</math> and <math>z</math> will yield a strictly positive <math>R</math>. This follows since the correlation of <math>y</math> with its best predictor based on <math>x</math> and <math>z</math> is in all cases at least as large as the correlation of <math>y</math> with its best predictor based on <math>x</math> alone, and in this case with <math>z</math> providing no explanatory power it will be exactly as large.
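
The dependence on the choice of dependent variable can be checked on a small constructed sample. The following Python sketch assumes two predictors, for which <math>\mathbf{c}^\top R_{xx}^{-1}\, \mathbf{c}</math> expands to <math>(r_1^2 + r_2^2 - 2 r_1 r_2 r_{12}) / (1 - r_{12}^2)</math>, where <math>r_1</math> and <math>r_2</math> are the correlations of the two predictors with the target and <math>r_{12}</math> is the correlation between the predictors. The function names and the data are made up for this illustration; the data are chosen so that <math>z</math> is exactly uncorrelated with both <math>x</math> and <math>y</math>.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def multiple_r_two_predictors(target, p1, p2):
    """R for target regressed on p1 and p2 (2x2 case of c^T Rxx^{-1} c)."""
    r1, r2, r12 = pearson(p1, target), pearson(p2, target), pearson(p1, p2)
    r_sq = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return math.sqrt(max(r_sq, 0.0))  # guard against tiny negative rounding

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
z = [2.0, -1.0, -2.0, -1.0, 2.0]  # uncorrelated with both x and y
y = [-3.0, -4.0, 0.0, 4.0, 3.0]   # linearly related to x, with some scatter

r_z_on_xy = multiple_r_two_predictors(z, x, y)  # z as dependent variable
r_y_on_xz = multiple_r_two_predictors(y, x, z)  # y as dependent variable
print(r_z_on_xy, r_y_on_xz, pearson(x, y))
```

With this sample the regression of <math>z</math> on <math>x</math> and <math>y</math> gives <math>R = 0</math>, while the regression of <math>y</math> on <math>x</math> and <math>z</math> gives an <math>R</math> equal to the simple correlation between <math>x</math> and <math>y</math> (about 0.894), since <math>z</math> adds no explanatory power.
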
{{inline|date=April 2013}}
 
==References==
{{Reflist}}
 
==Further reading==
* Allison, Paul D. (1998). ''Multiple Regression: A Primer''. London: Sage Publications. {{ISBN|9780761985334}}
* Cohen, Jacob, et al. (2002). ''Applied Multiple Regression: Correlation Analysis for the Behavioral Sciences''. {{ISBN|0805822232}}
* Crown, William H. (1998). ''Statistical Models for the Social and Behavioral Sciences: Multiple Regression and Limited-Dependent Variable Models''. {{ISBN|0275953165}}
* Edwards, Allen Louis (1985). ''Multiple Regression and the Analysis of Variance and Covariance''. {{ISBN|0716710811}}
* Keith, Timothy (2006). ''Multiple Regression and Beyond''. Boston: Pearson Education.
* Kerlinger, Fred N.; Pedhazur, Elazar J. (1973). ''Multiple Regression in Behavioral Research''. New York: Holt Rinehart Winston. {{ISBN|9780030862113}}
* Stanton, Jeffrey M. (2001). [http://www.amstat.org/publications/jse/v9n3/stanton.html "Galton, Pearson, and the Peas: A Brief History of Linear Regression for Statistics Instructors"], ''Journal of Statistics Education'', 9 (3).
 
{{DEFAULTSORT:Multiple Correlation}}
[[Category:Correlation indicators]]
[[Category:Regression analysis]]
[[Category:Covariance and correlation]]