==Model definition==
Local regression uses a [[data set]] consisting of observations <math>(x_1, Y_1), \ldots, (x_n, Y_n)</math> of one or more predictor variables and a response variable.
For ease of presentation, the development below assumes a single predictor variable; the extension to multiple predictors (when the <math>x_i</math> are vectors) is conceptually straightforward. A functional relationship between the predictor and response variables is assumed:
:<math>Y_i = \mu(x_i) + \epsilon_i,</math>
where <math>\mu(x)</math> is the unknown smooth regression function and the <math>\epsilon_i</math> are random errors.
Local regression then estimates the function <math>\mu(x)</math>, for one value of <math>x</math> at a time. Since the function is assumed to be smooth, the most informative data points are those whose <math>x_i</math> values are close to <math>x</math>. This is formalized with a bandwidth <math>h</math> and a [[kernel (statistics)|kernel]] or weight function <math>W(\cdot)</math>, with observations assigned weights
:<math>w_i(x) = W\left( \frac{x_i - x}{h} \right).</math>
A typical choice of <math>W</math>, used by Cleveland in LOWESS, is <math>W(u) = (1-|u|^3)^3</math> for <math>|u|<1</math>, although any similar function (peaked at <math>u=0</math> and small or 0 for large values of <math>u</math>) can be used. Questions of bandwidth selection and specification (how large should <math>h</math> be, and should it vary depending upon the fitting point <math>x</math>?) are deferred for now.
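The tricube weight function can be sketched numerically as follows; this is a minimal illustration using [[NumPy]] (the function names <code>tricube</code> and <code>weights</code> are not standard, merely illustrative):

```python
import numpy as np

def tricube(u):
    """Tricube kernel: W(u) = (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = np.abs(np.asarray(u, dtype=float))
    return np.where(u < 1.0, (1.0 - u**3)**3, 0.0)

def weights(x_i, x, h):
    """Weights of observations x_i relative to fitting point x with bandwidth h."""
    return tricube((np.asarray(x_i, dtype=float) - x) / h)
```

The kernel peaks at <math>u=0</math> (weight 1) and vanishes for <math>|u| \ge 1</math>, so only observations within one bandwidth of the fitting point influence the local fit.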
A local model (usually a low-order polynomial with degree <math>p \le 3</math>), expressed as
:<math>\mu(x_i) \approx \beta_0 + \beta_1(x_i - x) + \ldots + \beta_p(x_i - x)^p,</math>
is then fitted by [[weighted least squares]]: choose regression coefficients
<math>(\hat \beta_0,\ldots,\hat\beta_p)</math> to minimize
:<math>
\sum_{i=1}^n w_i(x) \left ( Y_i - \beta_0 - \beta_1(x_i-x) - \ldots - \beta_p(x_i-x)^p \right )^2.
</math>
The local regression estimate of <math>\mu(x)</math> is then simply the intercept estimate:
:<math>\hat\mu(x) = \hat\beta_0,</math>
while the remaining coefficients <math>\hat\beta_1, \ldots, \hat\beta_p</math> can be interpreted
(up to a factor of <math>p!</math>) as derivative estimates.
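The whole procedure, namely tricube weighting, a local polynomial in <math>(x_i - x)</math>, weighted least squares, and the intercept as the estimate of <math>\mu(x)</math>, can be sketched as follows. This is a minimal NumPy illustration, not a reference implementation; the function name <code>local_poly_fit</code> is illustrative:

```python
import numpy as np

def local_poly_fit(x0, x, y, h, p=1):
    """Estimate mu(x0) by a degree-p local polynomial fit with tricube weights."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (x - x0) / h
    # Tricube weights: zero outside one bandwidth of the fitting point x0.
    w = np.where(np.abs(u) < 1.0, (1.0 - np.abs(u)**3)**3, 0.0)
    # Design matrix with columns 1, (x_i - x0), ..., (x_i - x0)^p.
    X = np.vander(x - x0, N=p + 1, increasing=True)
    # Weighted least squares, via rescaling rows by sqrt(weights).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    # The intercept estimates mu(x0); beta[k] * k! estimates the k-th derivative.
    return beta[0]
```

Repeating the fit over a grid of values <math>x_0</math> traces out the full estimated curve. On data generated exactly by a straight line, a local linear fit (<math>p=1</math>) recovers the line without error.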