==Localized subsets of data==
The '''subsets''' of data used for each weighted least squares fit in LOESS are determined by a nearest neighbors algorithm. A user-specified input to the procedure, called the "bandwidth" or "smoothing parameter", determines how much of the data is used to fit each local polynomial. The smoothing parameter, <math>\alpha</math>, is a number between <math>\left(\lambda+1\right)/n</math> and 1, where <math>\lambda</math> is the degree of the local polynomial and <math>n</math> is the number of data points; the lower bound ensures that each subset contains at least the <math>\lambda+1</math> points needed to determine a polynomial of degree <math>\lambda</math>. The value of <math>\alpha</math> is the proportion of the data used in each fit. The subset used in each weighted least squares fit consists of the <math>n\alpha</math> points (rounded up to the next integer) whose explanatory variable values are closest to the point at which the response is being estimated.
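The subset selection described above can be sketched in a few lines of Python. This is only an illustration for a single explanatory variable: the function name <code>local_subset</code>, its parameters, and the sample data are hypothetical, and ties in distance are simply broken by position in the input.

```python
import math

def local_subset(x, x0, alpha, degree=1):
    """Return the indices of the ceil(n * alpha) values in x closest to x0.

    Hypothetical helper for illustration; `degree` plays the role of lambda.
    """
    n = len(x)
    lower = (degree + 1) / n  # smallest admissible smoothing parameter
    if not (lower <= alpha <= 1):
        raise ValueError(f"alpha must lie between {lower} and 1")
    q = math.ceil(n * alpha)  # n * alpha, rounded up to an integer
    # Order all indices by distance to x0; sorted() is stable, so ties
    # are resolved in favor of the earlier index.
    by_distance = sorted(range(n), key=lambda i: abs(x[i] - x0))
    return sorted(by_distance[:q])

# Made-up sample data: 10 points, so alpha = 0.35 selects ceil(3.5) = 4 points.
x = [0.5, 1.0, 2.0, 3.5, 4.0, 6.0, 7.5, 8.0, 9.0, 10.0]
subset = local_subset(x, x0=4.2, alpha=0.35)
print(subset)               # indices of the selected points
print([x[i] for i in subset])  # their explanatory variable values
```

With these values, the four points nearest to 4.2 are 2.0, 3.5, 4.0 and 6.0, so the weighted least squares fit at that location would use only those observations.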
The parameter <math>\alpha</math> is called the smoothing parameter because it controls the flexibility of the LOESS regression function. Large values of <math>\alpha</math> produce the smoothest functions, which wiggle the least in response to fluctuations in the data. The smaller <math>\alpha</math> is, the more closely the regression function conforms to the data. Too small a value of the smoothing parameter is undesirable, however, because the regression function will eventually start to capture the random error in the data. For most LOESS applications, useful values of the smoothing parameter typically lie in the range 0.25 to 0.5.
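The trade-off between smoothness and fidelity to the data can be illustrated with a minimal sketch. It makes several assumptions that are not part of the text above: a local linear fit (<math>\lambda = 1</math>), tricube weights, synthetic noisy sine data, and two ad hoc summaries (the residual sum of squares, and the sum of squared second differences as a measure of "wiggliness"). All function names are hypothetical.

```python
import math
import random

def loess_fit(x, y, alpha):
    """Local linear fit evaluated at every x[i] (illustrative sketch only)."""
    n = len(x)
    q = math.ceil(n * alpha)  # number of points in each local subset
    fitted = []
    for x0 in x:
        nearest = sorted(range(n), key=lambda i: abs(x[i] - x0))[:q]
        h = max(abs(x[i] - x0) for i in nearest)  # distance to farthest neighbor
        # Tricube weights: an assumption of this sketch.
        w = [(1 - (abs(x[i] - x0) / h) ** 3) ** 3 for i in nearest]
        sw = sum(w)
        mx = sum(wi * x[i] for wi, i in zip(w, nearest)) / sw
        my = sum(wi * y[i] for wi, i in zip(w, nearest)) / sw
        sxx = sum(wi * (x[i] - mx) ** 2 for wi, i in zip(w, nearest))
        sxy = sum(wi * (x[i] - mx) * (y[i] - my) for wi, i in zip(w, nearest))
        # Weighted least squares line, evaluated at x0.
        fitted.append(my + (sxy / sxx) * (x0 - mx))
    return fitted

def residual_ss(y, fitted):
    """Sum of squared residuals: small values mean the fit hugs the data."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def roughness(fitted):
    """Sum of squared second differences: small values mean a smooth curve."""
    return sum((fitted[i - 1] - 2 * fitted[i] + fitted[i + 1]) ** 2
               for i in range(1, len(fitted) - 1))

# Synthetic data: a sine curve plus Gaussian noise (fixed seed for repeatability).
n = 60
rng = random.Random(0)
x = [2 * math.pi * i / (n - 1) for i in range(n)]
y = [math.sin(xi) + rng.gauss(0, 0.3) for xi in x]

for alpha in (0.2, 0.8):
    fit = loess_fit(x, y, alpha)
    print(alpha, residual_ss(y, fit), roughness(fit))
```

On data of this kind, the smaller value of <math>\alpha</math> should give a lower residual sum of squares but a rougher curve, while the larger value should give a smoother curve that follows the data less closely.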