Coefficient of multiple correlation

In [[statistics]], [[regression analysis]] is a method for explaining phenomena and predicting future events. In regression analysis, a [[correlation coefficient|coefficient of correlation]] ''r'' between variables ''X'' and ''Y'' is a quantitative index of co-movement between these two variables. Its squared form, the [[coefficient of determination]] ''r''<sup>&nbsp;2</sup>, indicates the fraction of [[variance]] in the criterion variable ''Y'' that is accounted for by variation in the predictor variable ''X''. In multiple regression analysis, the set of predictor variables (also called independent variables or explanatory variables) ''X''<sub>1</sub>, ''X''<sub>2</sub>, ... is used to explain variability of the criterion variable (also called the dependent variable) ''Y''. A multivariate counterpart of the coefficient of determination ''r''<sup>&nbsp;2</sup> is the '''coefficient of multiple determination''', ''R''<sup>&nbsp;2</sup>, which is frequently called simply the coefficient of determination. The [[square root]] of the coefficient of multiple determination is the '''coefficient of multiple correlation''',&nbsp;'''''R'''''. Since the coefficient of multiple determination is always between zero and one, its non-negative square root ''R'' also always lies between zero and one. For either coefficient, a larger value indicates a stronger relationship between the predictor variable(s) and the criterion variable.
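In the variance-accounted-for sense described above, and writing <math>\hat{y}_i</math> for the value of ''Y'' predicted by the regression at observation ''i'' and <math>\bar{y}</math> for the sample mean of ''Y'' (notation introduced here for illustration), the two coefficients are related by

:<math>R^2 \;=\; 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}, \qquad R \;=\; \sqrt{R^2}.</math>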
 
==Conceptualization of multiple correlation==
''R''<sup>&nbsp;2</sup> is simply the square of the sample correlation coefficient between the actual and predicted values of the criterion variable.
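As a minimal numerical sketch of this equivalence (the data, model, and variable names below are hypothetical), one can fit an ordinary least-squares regression and check that the squared correlation between the actual and predicted values of ''Y'' matches the variance-accounted-for formula:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data: two predictors and one criterion variable Y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                              # predictors X1, X2
y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(size=100)   # criterion Y

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(y)), X])
beta = np.linalg.lstsq(A, y, rcond=None)[0]
y_hat = A @ beta                                           # predicted values

# R^2 as the squared sample correlation between actual and predicted Y ...
r2_from_corr = np.corrcoef(y, y_hat)[0, 1] ** 2

# ... equals the variance-accounted-for form 1 - SS_res / SS_tot.
r2_from_ss = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(r2_from_corr, r2_from_ss)   # the two numbers agree
</syntaxhighlight>

This equivalence holds for least-squares fits that include an intercept term; it is what licenses reading ''R''<sup>&nbsp;2</sup> as both a squared correlation and a proportion of variance explained.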
 
An intuitive approach to multiple regression analysis would be to sum the squared correlations between each predictor variable and the criterion variable, taking the sum as an index of the strength of the overall relationship between the predictor variables and the criterion variable. However, such a sum is often greater than one, so it cannot be interpreted as a proportion of variance explained; simple summation of the squared coefficients of correlation is therefore not, in general, a correct procedure. In fact, the simple summation of squared coefficients of correlation between the predictor variables and the criterion variable ''is'' the correct procedure if and only if one has the special case in which the predictor variables are mutually uncorrelated. If the predictors are correlated, their inter-correlations must be removed so that only the unique contribution of each predictor toward explanation of the criterion remains.
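A sketch of both cases, again with hypothetical data: when the predictors are (essentially) uncorrelated, the sum of squared simple correlations approximates the multiple ''R''<sup>&nbsp;2</sup>; when they are highly correlated, the naive sum double-counts their shared variance and can exceed one:

<syntaxhighlight lang="python">
import numpy as np

def multiple_r2(X, y):
    """R^2 from an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    y_hat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def sum_sq_corrs(X, y):
    """Naive index: sum of squared simple correlations of each predictor with y."""
    return sum(np.corrcoef(X[:, j], y)[0, 1] ** 2 for j in range(X.shape[1]))

rng = np.random.default_rng(1)
n = 1000

# Case 1: independently generated (so essentially uncorrelated) predictors --
# the naive sum and the multiple R^2 agree up to sampling noise.
X1 = rng.normal(size=(n, 2))
y1 = X1[:, 0] + X1[:, 1] + rng.normal(size=n)
print(multiple_r2(X1, y1), sum_sq_corrs(X1, y1))

# Case 2: nearly collinear predictors -- each has a large simple correlation
# with y, so the naive sum overshoots the multiple R^2 and exceeds one.
X2 = np.column_stack([X1[:, 0], X1[:, 0] + 0.1 * rng.normal(size=n)])
y2 = X2[:, 0] + X2[:, 1] + rng.normal(size=n)
print(multiple_r2(X2, y2), sum_sq_corrs(X2, y2))
</syntaxhighlight>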