Simple linear regression

{{expert-subject|Statistics}}
 
[[Image:Okuns_law_quarterly_differences.svg|300px|thumb|[[Okun's_law|Okun’s law]] in [[macroeconomics]] is an example of the simple linear regression. Here the dependent variable (GDP growth) is presumed to be in a linear relationship with the changes in the unemployment rate.]]
In [[statistics]], '''simple linear regression''' is the [[ordinary least squares|least squares]] estimator of a [[linear regression model]] with a single [[covariate|predictor variable]]. In other words, simple linear regression fits a straight line through a set of ''n'' points in such a way that the sum of squared ''residuals'' of the model (that is, the vertical distances between the points of the data set and the fitted line) is as small as possible.
 
The adjective ''simple'' refers to the fact that this regression is one of the simplest in statistics. The slope of the fitted line is equal to the [[Pearson product moment correlation coefficient|correlation]] between ''y'' and ''x'', scaled by the ratio of the standard deviations of these variables. The intercept of the fitted line is such that the line passes through the center of mass (<span style="text-decoration:overline">''x''</span>, <span style="text-decoration:overline">''y''</span>) of the data points.
Simple linear regression is used to evaluate the linear relationship between two variables. One example is the relationship between muscle strength and lean body mass. Put another way, simple linear regression is used to develop an equation by which the dependent variable can be predicted or estimated from the independent variable.
Given a sample <math> (Y_i, X_i), \, i = 1, \ldots, n </math>, the regression model is given by
 
: <math>Y_i = a + bX_i + \varepsilon_i </math>
 
where <math>Y_i</math> is the dependent variable, <math>a</math> is the ''y'' intercept, <math>b</math> is the gradient or slope of the line, <math>X_i</math> is the independent variable, and <math> \varepsilon_i </math> is a random error term associated with each observation.
The strength of the linear relationship between the two variables (i.e. dependent and independent) can be measured by a correlation coefficient such as the [[Pearson product moment correlation coefficient]].
 
== Estimating the regression line ==