Line 20:
The [[Savitzky–Golay filter]], introduced by [[Abraham Savitzky]] and [[Marcel J. E. Golay]] (1964),<ref>{{citeQ|Q56769732}}</ref> significantly expanded the method. Like the earlier graduation work, the focus was on data with an equally spaced predictor variable, where (excluding boundary effects) local regression can be represented as a [[convolution]]. Savitzky and Golay published extensive sets of convolution coefficients for different orders of polynomial and smoothing window widths.
Local regression methods started to appear extensively in the statistics literature in the 1970s; for example, [[Charles Joel Stone|Charles J. Stone]] (1977),<ref>{{citeQ|Q56533608}}</ref> [[Vladimir Katkovnik]] (1979)<ref>{{cite journal |first=Vladimir |last=Katkovnik |year=1979 |title=Linear and nonlinear methods of nonparametric regression analysis |journal=Soviet Automatic Control |volume=5 |pages=25–34}}</ref> and [[William S. Cleveland]] (1979).<ref>{{citeQ|Q30052922}}</ref> Katkovnik (1985)<ref name="katbook">{{citeQ|Q132129931}}</ref> is the earliest book devoted primarily to local regression methods.
Extensive theoretical work continued to appear throughout the 1990s. Important contributions include [[Jianqing Fan]] and [[Irène Gijbels]] (1992),<ref>{{citeQ|Q132202273}}</ref> who studied efficiency properties, and [[David Ruppert]] and [[Matthew P. Wand]] (1994),<ref>{{citeQ|Q132202598}}</ref> who developed an asymptotic distribution theory for multivariate local regression.
Line 39:
Local regression then estimates the function <math>\mu(x)</math>, for one value of <math>x</math> at a time. Since the function is assumed to be smooth, the most informative data points are those whose <math>x_i</math> values are close to <math>x</math>. This is formalized with a bandwidth <math>h</math> and a [[kernel (statistics)|kernel]] or weight function <math>W(\cdot)</math>, with observations assigned weights
<math display="block">w_i(x) = W\left ( \frac{x_i-x}{h} \right ).</math>
A typical choice of <math>W</math>, used by Cleveland in LOWESS, is the tricube function <math>W(u) = (1-|u|^3)^3</math> for <math>|u|<1</math> and <math>W(u)=0</math> otherwise, although any similar function (peaked at <math>u=0</math> and small or zero for large values of <math>u</math>) can be used. Questions of bandwidth selection and specification (how large should <math>h</math> be, and should it vary depending upon the fitting point <math>x</math>?) are deferred for now.
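For illustration, the locally weighted fit can be computed directly. The following NumPy sketch (the function <code>local_poly_fit</code> and its arguments are illustrative, not from a published implementation) estimates <math>\mu(x)</math> at a single fitting point <math>x_0</math> using tricube weights and weighted least squares:

<syntaxhighlight lang="python">
import numpy as np

def local_poly_fit(x0, x, y, h, degree=1):
    """Estimate mu(x0) by locally weighted polynomial regression.

    Uses tricube kernel weights W(u) = (1 - |u|^3)^3 for |u| < 1, 0 otherwise.
    """
    u = np.abs((x - x0) / h)
    w = np.where(u < 1, (1 - u**3) ** 3, 0.0)           # weights w_i(x0)
    # Design matrix centered at x0: columns 1, (x - x0), ..., (x - x0)^degree.
    X = np.vander(x - x0, degree + 1, increasing=True)
    # Weighted least squares: scale rows by sqrt(w), then solve ordinary LS.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]                                      # beta_0 estimates mu(x0)
</syntaxhighlight>

Because the polynomial is expanded around the fitting point, the intercept <math>\beta_0</math> is the estimate of <math>\mu(x_0)</math>.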
Line 127:
<math display="block">w(x,z) = \exp\left(-\frac{\| x-z \|^2}{2\alpha^2}\right).</math>
===Choice of Fitting Criterion===
As described above, local regression uses a locally weighted least squares criterion to estimate the regression parameters. This inherits many of the advantages (ease of implementation and interpretation; good properties when errors are normally distributed) and disadvantages (sensitivity to extreme values and outliers; inefficiency when errors have unequal variance or are not normally distributed) usually associated with least squares regression.
To address the sensitivity to outliers, techniques from [[robust regression]] can be employed. In local [[M-estimator|M-estimation]], the local least-squares criterion is replaced by a criterion of the form
<math display="block">
\sum_{i=1}^n w_i(x) \rho \left (
\frac{Y_i-\beta_0 - \ldots - \beta_p(x_i-x)^p}{s}
\right )
</math>
where <math>\rho(\cdot)</math> is a robustness function and <math>s</math> is a scale parameter. Discussion of the merits of different choices of robustness function is best left to the [[robust regression]] literature. The scale parameter <math>s</math> must also be estimated. References for local M-estimation include Katkovnik (1985)<ref name="katbook" /> and [[Alexandre Tsybakov]] (1986).<ref>{{cite journal |first=Alexandre B. |last=Tsybakov |year=1986 |title=Robust reconstruction of functions by the local-approximation method |journal=Problems of Information Transmission |volume=22 |pages=133–146}}</ref>
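In practice, local M-estimates are often computed by iteratively reweighted least squares. The sketch below is illustrative only (it is not taken from the cited references) and uses a Huber <math>\psi</math> function with the median absolute residual as the scale estimate:

<syntaxhighlight lang="python">
import numpy as np

def huber_psi(t, c=1.345):
    """Huber psi function: identity near zero, clipped at +/- c."""
    return np.clip(t, -c, c)

def local_m_estimate(x0, x, y, h, psi=huber_psi, n_iter=10):
    """Local linear M-estimation at x0 via iteratively reweighted
    least squares; psi is the derivative of the robustness function rho."""
    u = np.abs((x - x0) / h)
    w = np.where(u < 1, (1 - u**3) ** 3, 0.0)        # kernel weights w_i(x0)
    X = np.vander(x - x0, 2, increasing=True)        # local linear design
    beta = np.zeros(2)
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) + 1e-12             # robust scale estimate
        t = r / s
        t_nz = np.where(t == 0.0, 1.0, t)            # avoid 0/0 below
        rw = np.where(t == 0.0, 1.0, psi(t_nz) / t_nz)   # IRLS weights psi(t)/t
        sw = np.sqrt(w * rw)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]
</syntaxhighlight>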
The robustness iterations in LOWESS and LOESS correspond to the robustness function defined by
<math display="block">
\rho'(u) = u \left(1 - \frac{u^2}{6}\right)^2, \qquad |u|<1,
</math>
and a robust global estimate of the scale parameter.
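Concretely, each robustness iteration in LOWESS computes bisquare weights from the current residuals, scaled by six times the median absolute residual (Cleveland 1979); a minimal sketch:

<syntaxhighlight lang="python">
import numpy as np

def lowess_robustness_weights(residuals):
    """Bisquare robustness weights: delta_i = (1 - (e_i / 6s)^2)^2
    for |e_i| < 6s and 0 otherwise, with s the median absolute residual."""
    s = np.median(np.abs(residuals)) + 1e-12   # guard against all-zero residuals
    u = residuals / (6.0 * s)
    return np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
</syntaxhighlight>

These weights multiply the kernel weights <math>w_i(x)</math> in the next smoothing pass, down-weighting observations with large residuals.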
If <math>\rho(u)=|u|</math>, the local <math>L_1</math> criterion
<math display="block">
\sum_{i=1}^n w_i(x) \left | Y_i - \beta_0 - \ldots - \beta_p(x_i-x)^p \right |
</math>
results; this does not require a scale parameter. Local <math>L_1</math> regression has been studied by [[Keming Yu]] and [[M.C. Jones]] (1998).<ref>{{cite journal |first1=Keming |last1=Yu |first2=M. C. |last2=Jones |year=1998 |title=Local Linear Quantile Regression |journal=Journal of the American Statistical Association |volume=93 |pages=228–237}}</ref>
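Because the weighted <math>L_1</math> criterion has no closed-form minimizer, a general-purpose optimizer can be used. The following sketch (illustrative only) fits a local linear <math>L_1</math> estimate at a single point:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize

def local_l1_fit(x0, x, y, h):
    """Local linear L1 regression at x0: minimize
    sum_i w_i(x0) * |y_i - b0 - b1*(x_i - x0)| over (b0, b1)."""
    u = np.abs((x - x0) / h)
    w = np.where(u < 1, (1 - u**3) ** 3, 0.0)      # tricube weights w_i(x0)
    def obj(b):
        return np.sum(w * np.abs(y - b[0] - b[1] * (x - x0)))
    init = np.polyfit(x - x0, y, 1)[::-1]          # least-squares starting point
    # Nelder-Mead copes with the nondifferentiable objective.
    res = minimize(obj, init, method="Nelder-Mead")
    return res.x[0]
</syntaxhighlight>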
==Advantages==