Local regression: Difference between revisions

Revision as of 13:22, 25 June 2025 by Tdb36 (26 edits) m Fix spelling mistake Maculay -> Macaulay
Latest revision as of 07:26, 12 July 2025 by Citation bot (Bots, 5,869,731 edits) Altered doi-broken-date. #UCB_CommandLine
(One intermediate revision by one other user not shown)
Line 9:
LOESS and LOWESS thus build on [[classical statistics|"classical" methods]], such as linear and nonlinear [[least squares regression]]. They address situations in which the classical procedures do not perform well or cannot be effectively applied without undue labor. LOESS combines much of the simplicity of linear least squares regression with the flexibility of [[Non-linear regression|nonlinear regression]]. It does this by fitting simple models to localized subsets of the data to build up a function that describes the deterministic part of the variation in the data, point by point. In fact, one of the chief attractions of this method is that the data analyst is not required to specify a global function of any form to fit a model to the data, only to fit segments of the data. The trade-off for these features is increased computation. Because it is so computationally intensive, LOESS would have been practically impossible to use in the era when least squares regression was being developed. Most other modern methods for process ~~modeling~~modelling are similar to LOESS in this respect. These methods have been consciously designed to use our current computational ability to the fullest possible advantage to achieve goals not easily achieved by traditional approaches. A smooth curve through a set of data points obtained with this statistical technique is called a ''loess curve'', particularly when each smoothed value is given by a weighted quadratic least squares regression over the span of values of the ''y''-axis [[scattergram]] criterion variable.
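The point-by-point local fitting described above can be illustrated with a minimal sketch. It is not taken from the article or from any particular library: the function names `local_linear_fit` and `lowess_curve`, the span fraction `alpha`, and the tricube weight function are assumptions chosen for illustration, and the sketch fits a local line (the lowess-style case) rather than a local quadratic.

```python
import math
import numpy as np

def local_linear_fit(x, y, x0, alpha=0.5):
    """Smoothed value at x0 from a weighted linear fit to the nearest points.

    Hypothetical helper: alpha is the fraction of the n data points used in
    each local fit (an assumed parameterization).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Number of neighbours: n * alpha rounded up, kept within [2, n].
    q = min(n, max(2, math.ceil(alpha * n)))
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:q]          # indices of the q nearest points
    h = dist[idx].max()                 # local bandwidth
    if h == 0.0:
        return float(y[idx].mean())     # all neighbours coincide with x0
    # Tricube weights (an assumed choice): 1 at x0, falling to 0 at distance h.
    w = (1.0 - (dist[idx] / h) ** 3) ** 3
    # Weighted least squares for y ~ b0 + b1 * (x - x0); b0 is the fit at x0.
    X = np.column_stack([np.ones(q), x[idx] - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
    return float(beta[0])

def lowess_curve(x, y, alpha=0.5):
    """Evaluate the local linear fit at every observed x value."""
    return np.array([local_linear_fit(x, y, x0, alpha) for x0 in x])
```

Each evaluation solves its own small least squares problem, which is where the extra computation mentioned above comes from. Degenerate neighbourhoods (for example, many tied distances) are not treated specially in this sketch.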
When each smoothed value is given by a weighted linear least squares regression over the span, this is known as a ''lowess curve.''; ~~however~~However, some authorities treat ''lowess'' and loess as synonyms.<ref>Kristen Pavlik, US Environmental Protection Agency, ''[https://19january2021snapshot.epa.gov/sites/static/files/2016-07/documents/loess-lowess.pdf Loess (or Lowess)]'', ''Nutrient Steps'', July 2016.</ref><ref name="NIST"/>

==History==

Line 115:
One question not addressed above is, how should the bandwidth depend upon the fitting point <math>x</math>? Often a constant bandwidth is used, while LOWESS and LOESS prefer a nearest-neighbor bandwidth, meaning ''h'' is smaller in regions with many data points. Formally, the smoothing parameter, <math>\alpha</math>, is the fraction of the total number ''n'' of data points that are used in each local fit. The subset of data used in each weighted least squares fit thus comprises the <math>n\alpha</math> points (rounded to the next largest integer) whose explanatory variables' values are closest to the point at which the response is being estimated.<ref name="NIST">NIST, [http://www.itl.nist.gov/div898/handbook/pmd/section1/pmd144.htm "LOESS (aka LOWESS)"], section 4.1.4.4, ''NIST/SEMATECH e-Handbook of Statistical Methods,'' (accessed 14 April 2017)</ref> More sophisticated methods attempt to choose the bandwidth ''adaptively''; that is, choose a bandwidth at each fitting point <math>x</math> by applying criteria such as cross-validation locally within the smoothing window. An early example of this is [[Jerome H. Friedman]]'s<ref>{{citation|first=Jerome H.|last=Friedman|title=A Variable Span Smoother|date=October 1984|publisher=Technical report, Laboratory for Computational Statistics LCS 5; SLAC PUB-3466|doi=10.2171/1447470|doi-broken-date=201 ~~March~~July 2025 |url=http://www.slac.stanford.edu/cgi-wrap/getdoc/slac-pub-3477.pdf}}</ref> "supersmoother", which uses cross-validation to choose among local linear fits at different bandwidths.

===Degree of local polynomials===