Mr Butterbur (talk | contribs) m minor grammar edits |
→History: more details of Henderson's formulation |
Line 15:
==History==
Local regression and closely related procedures have a long and rich history, having been discovered and rediscovered in different fields on multiple occasions. An early work by [[Robert Henderson (mathematician)|Robert Henderson]]<ref>Henderson, R. Note on Graduation by Adjusted Average. Actuarial Society of America Transactions 17, 43--48. [https://archive.org/details/transactions17actuuoft archive.org]</ref> on the problem of graduation (a term for smoothing used in the actuarial literature) introduced local regression using cubic polynomials.
Specifically, let <math>Y_j</math> denote an ungraduated sequence of observations. Following Henderson, suppose that only the terms from <math>Y_{-h}</math> to <math>Y_h</math> are to be taken into account when computing the graduated value of <math>Y_0</math>, and <math>W_j</math> is the weight to be assigned to <math>Y_j</math>. Henderson then uses a local polynomial approximation <math>a + b j + c j^2 + d j^3</math>, and sets up the following four equations for the coefficients:
:<math>
\begin{align}
\sum_{j=-h}^h ( a + b j + c j^2 + d j^3) W_j &= \sum_{j=-h}^h W_j Y_j \\
\sum_{j=-h}^h ( aj + b j^2 + c j^3 + d j^4) W_j &= \sum_{j=-h}^h j W_j Y_j \\
\sum_{j=-h}^h ( aj^2 + b j^3 + c j^4 + d j^5) W_j &= \sum_{j=-h}^h j^2 W_j Y_j \\
\sum_{j=-h}^h ( aj^3 + b j^4 + c j^5 + d j^6) W_j &= \sum_{j=-h}^h j^3 W_j Y_j
\end{align}
</math>
Solving these equations for the polynomial coefficients yields the graduated value, <math>\hat Y_0 = a</math>.
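As an illustration only, the four equations can be assembled and solved numerically. The following Python sketch is not taken from Henderson's paper; the function names <code>solve_linear</code> and <code>local_cubic_graduate</code> are hypothetical, and exact rational arithmetic is used simply to avoid rounding. It assumes at least four of the weights are nonzero, so that the system has a unique solution.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic.

    Illustrative helper; assumes A is square and nonsingular.
    """
    n = len(b)
    # Augmented matrix [A | b] with Fraction entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot equals 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def local_cubic_graduate(y, w):
    """Return the graduated value a from the four weighted equations.

    y and w hold Y_j and W_j for j = -h, ..., h (odd length 2h + 1).
    """
    h = (len(y) - 1) // 2
    js = range(-h, h + 1)
    # S[k] = sum_j j^k W_j  and  T[k] = sum_j j^k W_j Y_j
    S = [sum(Fraction(j) ** k * wj for j, wj in zip(js, w)) for k in range(7)]
    T = [sum(Fraction(j) ** k * wj * yj for j, wj, yj in zip(js, w, y))
         for k in range(4)]
    # Equation k: a S[k] + b S[k+1] + c S[k+2] + d S[k+3] = T[k]
    A = [[S[k + m] for m in range(4)] for k in range(4)]
    a, b, c, d = solve_linear(A, T)
    return a


# Example with made-up data: values lying exactly on a cubic, arbitrary weights.
y_example = [2 + j - j**2 + 3 * j**3 for j in range(-3, 4)]
w_example = [1, 2, 3, 4, 3, 2, 1]
a_example = local_cubic_graduate(y_example, w_example)  # equals 2, the cubic at j = 0
```

Because the example data lie exactly on a cubic, the fitted constant term coincides with the value of that cubic at <math>j = 0</math>, whatever positive weights are chosen.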
Henderson went further. In preceding years, many 'summation formula' methods of graduation had been developed, which derived graduation rules based on summation formulae (convolution of the series of observations with a chosen set of weights). Two such rules are the 15-point and 21-point rules of [[John Spencer (Actuary)|Spencer]] (1904).<ref>{{citeQ|Q127775139}}</ref> These graduation rules were carefully designed to have a quadratic-reproducing property: if the ungraduated values happen to follow a quadratic formula exactly, then the graduated values equal the ungraduated values. This is an important property: a simple moving average, by contrast, cannot adequately model peaks and troughs in the data. Henderson's insight was to show that ''any'' such graduation rule can be represented as a local cubic (or quadratic) fit for an appropriate choice of weights.
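The quadratic-reproducing property can be checked directly. The sketch below is illustrative: the weights are those commonly quoted for Spencer's 15-point rule (they are not stated in the text above, so treat them as an assumption), and the helper name <code>graduate</code> and the test quadratic are hypothetical.

```python
from fractions import Fraction

# Spencer's 15-point weights as commonly quoted (assumption: not given in the
# surrounding text), for j = -7, ..., 7, each divided by 320.
SPENCER_15 = [Fraction(c, 320) for c in
              (-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3)]


def graduate(y, weights):
    """Apply a symmetric summation formula to the interior points of y.

    Points closer than h to either end are skipped (no boundary handling).
    """
    h = (len(weights) - 1) // 2
    return [sum(wj * y[i + j] for j, wj in zip(range(-h, h + 1), weights))
            for i in range(h, len(y) - h)]


def quadratic(x):
    # Arbitrary quadratic used only as test data.
    return 5 - 2 * x + 3 * x * x


y = [quadratic(x) for x in range(30)]

# Interior graduated values coincide with the ungraduated ones.
reproduces_quadratic = graduate(y, SPENCER_15) == y[7:-7]

# A simple 15-point moving average does not share this property.
moving_average = [Fraction(1, 15)] * 15
average_reproduces = graduate(y, moving_average) == y[7:-7]
```

Here <code>reproduces_quadratic</code> is true and <code>average_reproduces</code> is false: the Spencer weights sum to one and have vanishing first and second moments, whereas the moving average has a nonzero second moment and so flattens curvature.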
Further discussions of the historical work on graduation and local polynomial fitting can be found in [[Frederick Macaulay|Macaulay]] (1931),<ref>{{citeQ|Q134465853}}</ref> [[William S. Cleveland|Cleveland]] and [[Catherine Loader|Loader]] (1995),<ref>{{cite Q|Q132138257}}</ref> and [[Lori Murray|Murray]] and [[David Bellhouse (statistician)|Bellhouse]] (2019).<ref>{{cite Q|Q127772934}}</ref>
The [[Savitzky-Golay filter]], introduced by [[Abraham Savitzky]] and [[Marcel J. E. Golay]] (1964),<ref>{{cite Q|Q56769732}}</ref> significantly expanded the method. Like the earlier graduation work, their focus was data with an equally spaced predictor variable, where (excluding boundary effects) local regression can be represented as a [[convolution]]. Savitzky and Golay published extensive sets of convolution coefficients for different orders of polynomial and smoothing window widths.
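To illustrate why equally spaced data lead to a fixed set of convolution coefficients, the sketch below derives the smoothing coefficients for an unweighted local polynomial fit over a window of <math>2h+1</math> points. It is a minimal illustration rather than the procedure used by Savitzky and Golay; the function names are hypothetical and exact rational arithmetic is an arbitrary choice.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact rationals).

    Illustrative helper; assumes A is square and nonsingular.
    """
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def smoothing_coefficients(h, degree):
    """Coefficients c_j, j = -h..h, such that the fitted value at the window
    centre is sum_j c_j Y_j for an unweighted polynomial fit of this degree.
    """
    js = range(-h, h + 1)
    # Moments of the equally spaced design: S[k] = sum_j j^k.
    S = [sum(Fraction(j) ** k for j in js) for k in range(2 * degree + 1)]
    A = [[S[r + c] for c in range(degree + 1)] for r in range(degree + 1)]
    # The fitted constant term is the first component of A^{-1} X^T Y, so
    # solve A u = e_0 and read off c_j = sum_k u_k j^k.
    u = solve_linear(A, [1] + [0] * degree)
    return [sum(uk * Fraction(j) ** k for k, uk in enumerate(u)) for j in js]


def smooth_interior(y, coeffs):
    """Slide the coefficients along y; for symmetric coefficients this is a
    convolution. Points within h of either end are skipped."""
    h = (len(coeffs) - 1) // 2
    return [sum(c * y[i + j] for j, c in zip(range(-h, h + 1), coeffs))
            for i in range(h, len(y) - h)]


# Five-point quadratic smoothing coefficients: (-3, 12, 17, 12, -3) / 35.
coeffs_5_quadratic = smoothing_coefficients(2, 2)
```

The coefficients depend only on the window half-width and the polynomial degree, not on the data, which is why they can be tabulated once and reused. The sketch does not treat the boundary, where the window is incomplete.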