Simple linear regression
In this case, the slope of the fitted line is equal to the [[Pearson correlation coefficient|correlation]] between {{mvar|y}} and {{mvar|x}} corrected by the ratio of standard deviations of these variables. The intercept of the fitted line is such that the line passes through the center of mass {{math|({{overline|''x''}}, {{overline|''y''}})}} of the data points.
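This relationship can be checked numerically. The following sketch (not part of the article; it assumes NumPy and synthetic data) computes the least-squares slope directly and via the correlation coefficient scaled by the ratio of standard deviations, and confirms the fitted line passes through the point of means:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + 1.0 + rng.normal(size=50)

# Least-squares slope and intercept
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()

# Slope expressed through the correlation, corrected by the ratio of
# standard deviations (same ddof used for both, so the convention cancels)
r = np.corrcoef(x, y)[0, 1]
beta_from_r = r * y.std() / x.std()

assert np.isclose(beta, beta_from_r)
# The fitted line passes through the center of mass of the data
assert np.isclose(alpha + beta * x.mean(), y.mean())
```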
 
==Fitting the regression line==
Consider the [[mathematical model|model]] function
: <math> y = \alpha + \beta x,</math>
The [[coefficient of determination]] ("R squared") is equal to <math>r_{xy}^2</math> when the model is linear with a single independent variable. See [[Correlation#Pearson's product-moment coefficient|sample correlation coefficient]] for additional details.
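The identity between the coefficient of determination and the squared sample correlation can be verified numerically. A minimal sketch (not from the article; NumPy and the simulated data are assumptions) fits a simple linear regression and compares <math>R^2 = 1 - SS_{\rm res}/SS_{\rm tot}</math> with <math>r_{xy}^2</math>:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 3.0 - 0.5 * x + rng.normal(scale=0.8, size=40)

# Ordinary least-squares fit with a single independent variable
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()
y_hat = alpha + beta * x

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
r_xy = np.corrcoef(x, y)[0, 1]

assert np.isclose(r_squared, r_xy ** 2)
```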
 
== Interpretation ==
=== Interpretation about the slope ===
Multiplying each term of the summation in the numerator by <math>\frac{(x_i - \bar{x})}{(x_i - \bar{x})} = 1</math> (thereby not changing it):

: <math>\begin{align}
\widehat\beta &= \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \\
&= \frac{\sum_{i=1}^n (x_i - \bar{x})^2 \frac{(y_i - \bar{y})}{(x_i - \bar{x})}}{\sum_{i=1}^n (x_i - \bar{x})^2}
\end{align}</math>
We can see that the slope (tangent of angle) of the regression line is the weighted average of <math>\frac{(y_i - \bar{y})}{(x_i - \bar{x})}</math>, the slope (tangent of angle) of the line connecting the ''i''-th point to the average of all points, weighted by <math>(x_i - \bar{x})^2</math>. The further a point lies from <math>\bar{x}</math>, the more "important" it is, since small errors in its position affect the slope of the line connecting it to the center point less.
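This weighted-average view of the slope can be checked directly. The sketch below (illustrative only; NumPy and the simulated data are assumptions) compares the least-squares slope with the weighted average of the point-to-center slopes:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = 1.5 * x + rng.normal(size=30)

# Ordinary least-squares slope
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Slope of the line joining each point to the point of averages,
# weighted by the squared horizontal distance from the mean
pointwise = (y - y.mean()) / (x - x.mean())
weights = (x - x.mean()) ** 2
beta_weighted = np.sum(weights * pointwise) / np.sum(weights)

assert np.isclose(beta, beta_weighted)
```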
 
=== Interpretation about the intercept ===
 
: <math>\widehat\alpha = \bar{y} - \widehat\beta\,\bar{x}</math>
Given <math>\widehat\beta = \tan(\theta) = dy/dx</math>, where <math>\theta</math> is the angle the line makes with the positive ''x'' axis, and taking <math>dx = \bar{x}</math>,
we have <math>y_{\rm intersection} = \bar{y} - dx\times\widehat\beta = \bar{y} - dy</math>
 
=== Interpretation about the correlation ===
 
In the above formulation, notice that each <math>x_i</math> is a constant ("known upfront") value, while the <math>y_i</math> are random variables depending on <math>x_i</math> through the linear function and on the random term <math>\varepsilon_i</math>. This assumption is used when deriving the standard error of the slope and showing that it is [[Proofs_involving_ordinary_least_squares|unbiased]].
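Unbiasedness under this fixed-design assumption can be illustrated by simulation. A minimal sketch (not part of the article; the true coefficients, noise level, and use of NumPy are assumptions) holds the <math>x_i</math> fixed across replications, redraws only the <math>\varepsilon_i</math>, and checks that the slope estimates average out to the true slope:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 25)        # fixed design: the x_i are "known upfront"
alpha_true, beta_true = 1.0, 2.0

estimates = []
for _ in range(5000):
    # Only the random error term varies between replications
    y = alpha_true + beta_true * x + rng.normal(scale=0.5, size=x.size)
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b)

# The average of the slope estimates is close to the true slope
mean_b = np.mean(estimates)
assert abs(mean_b - beta_true) < 0.05
```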