Local regression: Difference between revisions

Zaqrfv (talk | contribs)
m Model definition: fix typos, formatting.
Zaqrfv (talk | contribs)
Degree of local polynomials: rewritten section.
Line 83:
 
===Degree of local polynomials===
Most sources, in both theoretical and computational work, use low-order polynomials as the local model, with polynomial degree ranging from 0 to 3.
The local polynomials fit to each subset of the data are almost always of first or second degree; that is, either locally linear (in the straight line sense) or locally quadratic. Using a zero degree polynomial turns LOESS into a weighted [[moving average]]. Higher-degree polynomials would work in theory, but yield models that are not really in the spirit of LOESS. LOESS is based on the ideas that any function can be well approximated in a small neighborhood by a low-order polynomial and that simple models can be fit to data easily. High-degree polynomials would tend to overfit the data in each subset and are numerically unstable, making accurate computations difficult.
 
The degree 0 (local constant) model is equivalent to a [[kernel smoother]], usually credited to [[Èlizbar Nadaraya]] (1964)<ref>{{citeQ|Q29303512}}</ref> and [[G. S. Watson]] (1964).<ref>{{cite|last=Watson|first=G. S.|title=Smooth regression analysis|journal=Sankhya Series A|volume=26|pages=359–372}}</ref> This is the simplest model to use, but it can suffer from bias when fitting near the boundaries of the dataset.
 
Local linear (degree 1) fitting can substantially reduce the boundary bias.
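
The boundary effect can be illustrated with a minimal sketch. It assumes a tricube weight function, a bandwidth of 0.3, and noise-free data from the line <math>\mu(x) = 2x</math> on an evenly spaced grid; these choices, and the function names <code>tricube</code>, <code>local_constant</code> and <code>local_linear</code>, are illustrative only and are not taken from any particular software package.

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, zero otherwise."""
    a = abs(u)
    return (1.0 - a ** 3) ** 3 if a < 1.0 else 0.0


def local_constant(x0, xs, ys, h):
    """Degree 0 fit at x0: a kernel-weighted average of the responses."""
    w = [tricube((x - x0) / h) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def local_linear(x0, xs, ys, h):
    """Degree 1 fit at x0: weighted least-squares line, evaluated at x0."""
    w = [tricube((x - x0) / h) for x in xs]
    d = [x - x0 for x in xs]  # centre the predictor at x0
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))
    # Intercept of the weighted line; in centred coordinates this is the fit at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Assumed example data: an exact straight line, so any error is pure bias.
xs = [i / 20 for i in range(21)]
ys = [2.0 * x for x in xs]
h = 0.3

const_at_0 = local_constant(0.0, xs, ys, h)   # left boundary, true value 0
linear_at_0 = local_linear(0.0, xs, ys, h)    # left boundary, true value 0
const_mid = local_constant(0.5, xs, ys, h)    # interior point, true value 1

print(const_at_0, linear_at_0, const_mid)
```

At the left boundary every neighbouring observation lies to the right of the fitting point, so the local constant estimate is pulled upward, while the local linear fit reproduces the line up to rounding error. At the interior point the neighbours are symmetric and the local constant estimate is essentially unbiased for this example.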
 
Local quadratic (degree 2) and local cubic (degree 3) can result in improved fits, particularly when the underlying mean function <math>\mu(x)</math> has substantial curvature, or equivalently a large second derivative.
 
In theory, higher orders of polynomial can lead to faster convergence of the estimate <math>\hat\mu(x)</math> to the true mean <math>\mu(x)</math>, ''provided that <math>\mu(x)</math> has a sufficient number of derivatives''. See C. J. Stone (1980).<ref>{{citeQ|Q132272803}}</ref> Generally, it takes a large sample size for this faster convergence to be realized. There are also computational and stability issues that arise, particularly for multivariate smoothing. It is generally not recommended to use local polynomials with degree greater than 3.
 
As with bandwidth selection, methods such as cross-validation can be used to compare the fits obtained with different degrees of polynomial.
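
A comparison of this kind can be sketched with leave-one-out cross-validation. The example below is a minimal illustration rather than a reference implementation: the tricube weight, the fixed bandwidth of 0.3, the simulated sine data, and the helper names <code>solve</code>, <code>local_poly</code> and <code>loo_cv</code> are all assumptions made for the sake of the example.

```python
import math
import random


def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, zero otherwise."""
    a = abs(u)
    return (1.0 - a ** 3) ** 3 if a < 1.0 else 0.0


def solve(A, b):
    """Solve the small linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z


def local_poly(x0, xs, ys, h, degree):
    """Fit a weighted least-squares polynomial in (x - x0); return its value at x0."""
    m = degree + 1
    A = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for x, y in zip(xs, ys):
        w = tricube((x - x0) / h)
        if w == 0.0:
            continue
        d = x - x0
        powers = [d ** j for j in range(2 * m - 1)]
        for j in range(m):
            b[j] += w * powers[j] * y
            for k in range(m):
                A[j][k] += w * powers[j + k]
    # The intercept of the centred polynomial is the fitted value at x0.
    return solve(A, b)[0]


def loo_cv(xs, ys, h, degree):
    """Mean squared leave-one-out prediction error for a given degree."""
    total = 0.0
    for i in range(len(xs)):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        total += (ys[i] - local_poly(xs[i], x_rest, y_rest, h, degree)) ** 2
    return total / len(xs)


# Assumed simulated data: a sine curve with Gaussian noise.
random.seed(0)
xs = [i / 60 for i in range(61)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.1) for x in xs]

scores = {deg: loo_cv(xs, ys, h=0.3, degree=deg) for deg in range(4)}
best_degree = min(scores, key=scores.get)
print(scores, best_degree)
```

The degree with the smallest cross-validation score depends on the data and on the bandwidth, so in practice the two are usually compared jointly rather than with the bandwidth held fixed as it is here.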
 
===Weight function===