[[File:Spacings.svg|thumb|right|260px|The maximum spacing method tries to find a distribution function such that the spacings, ''D''<sub>(''i'')</sub>, are all approximately of the same length. This is done by maximizing their [[geometric mean]].]]
 
In [[statistics]], '''maximum spacing estimation''' ('''MSE''' or '''MSP'''), or '''maximum product of spacing estimation (MPS)''', is a method for estimating the parameters of a univariate [[parametric model|statistical model]].<ref name="CA83">{{harvtxt|Cheng|Amin|1983}}</ref> The method requires maximization of the [[geometric mean]] of ''spacings'' in the data, which are the differences between the values of the [[cumulative distribution function]] at neighbouring data points.
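As a minimal numerical sketch of the method (the exponential model, the toy data, and the use of SciPy's optimizer are assumptions of the example, not taken from the cited sources), the estimate can be found by maximizing the mean log-spacing:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import expon

# Toy data for the sketch, sorted in ascending order.
x = np.sort(np.array([0.22, 0.51, 0.93, 1.40, 2.05, 3.17]))

def neg_mean_log_spacing(scale):
    """Negative mean log-spacing for an exponential model.

    The spacings are D_i = F(x_(i)) - F(x_(i-1)), with F(x_(0)) = 0 and
    F(x_(n+1)) = 1; maximizing their geometric mean is equivalent to
    maximizing the mean of their logarithms."""
    F = expon.cdf(x, scale=scale)
    D = np.diff(np.concatenate(([0.0], F, [1.0])))  # the n+1 spacings
    return -np.mean(np.log(D))

# Minimize the negative objective over the scale parameter.
res = minimize_scalar(neg_mean_log_spacing, bounds=(1e-6, 100.0),
                      method='bounded')
print("MSP estimate of the exponential scale:", res.x)
</syntaxhighlight>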
If two or more observations are tied, the corresponding spacing vanishes, <math>D_i = F(x_i;\hat\theta) - F(x_{i-1};\hat\theta) = 0</math>, since <math>x_{i} = x_{i-1}</math>, and its logarithm is undefined.
 
When ties are due to rounding error, {{harvtxt|Cheng|Stephens|1989}} suggest another method to remove the effects.{{NoteTag|There appear to be some minor typographical errors in the paper. For example, in section 4.2, equation (4.1), the rounding replacement for <math>D_j</math> should not have the log term. In section 1, equation (1.2), <math>D_j</math> is defined to be the spacing itself, and <math>M(\theta)</math> is the negative sum of the logs of <math>D_j</math>. If <math>D_j</math> is logged at this step, the result is always&nbsp;≤&nbsp;0, as the difference between two adjacent points on a cumulative distribution is always&nbsp;≤&nbsp;1, and strictly&nbsp;<&nbsp;1 unless there are only two points at the bookends. Also, in section 4.3, on page 392, calculation shows that it is the variance <math>\textstyle\tilde{\sigma}^2</math> which has an MPS estimate of 6.87, not the standard deviation <math>\textstyle\tilde{\sigma}</math>. – ''Editor''}}
Given ''r'' tied observations from ''x''<sub>''i''</sub> to ''x''<sub>''i''+''r''−1</sub>, let ''δ'' represent the [[round-off error]]. All of the true values should then fall in the range <math>x \pm \delta</math>. The corresponding points on the distribution should now fall between <math>y_L = F(x-\delta, \hat\theta)</math> and <math>y_U = F(x+\delta, \hat\theta)</math>. Cheng and Stephens suggest assuming that the rounded values are [[Uniform distribution (continuous)|uniformly spaced]] in this interval, by defining
: <math>D_j = \frac{y_U - y_L}{r}, \qquad j = i, \ldots, i+r-1.</math>
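A rough illustration of this adjustment follows (the exponential model and the value of the round-off error ''δ'' are assumptions of the sketch, not taken from {{harvtxt|Cheng|Stephens|1989}}); summing these log-spacings and maximizing over the parameter then proceeds as in the untied case:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import expon

def log_spacings_with_rounding(x_sorted, delta, scale):
    """n+1 log-spacings in which the r spacings belonging to a run of
    tied (rounded) observations are each replaced by (y_U - y_L) / r,
    with y_L = F(x - delta) and y_U = F(x + delta), as described above."""
    n = len(x_sorted)
    F = expon.cdf(x_sorted, scale=scale)
    D = np.diff(np.concatenate(([0.0], F, [1.0])))  # ordinary spacings
    i = 0
    while i < n:
        j = i
        while j + 1 < n and x_sorted[j + 1] == x_sorted[i]:
            j += 1  # extend the run of tied values
        r = j - i + 1
        if r > 1:
            y_lo = expon.cdf(x_sorted[i] - delta, scale=scale)
            y_hi = expon.cdf(x_sorted[i] + delta, scale=scale)
            D[i:j + 1] = (y_hi - y_lo) / r  # uniform sub-spacings
        i = j + 1
    return np.log(D)
</syntaxhighlight>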
 
==Moran test==
The statistic ''S<sub>n</sub>''(''θ'') is also a form of [[Pat Moran (statistician)|Moran]] or Moran-Darling statistic, ''M''(''θ''), which can be used to test [[goodness of fit]].<ref group="note">{{NoteTag|The literature refers to related statistics as Moran or Moran-Darling statistics. For example, {{harvtxt|Cheng|Stephens|1989}} analyze the form <math>\scriptstyle M(\theta)= -\sum_{j=1}^{n+1}\log{D_i(\theta)}</math> where <math>\scriptstyle D_i(\theta)</math> is defined as above. {{harvtxt|Wong|Li|2006}} use the same form as well. However, {{harvtxt|Beirlant|al.|2001}} uses the form <math>\scriptstyle M_n= -\sum_{j=0}^{n}\ln{((n + 1)(X_{n,i+1} - X_{n,i}))}</math>, with the additional factor of <math>(n+1)</math> inside the logged summation. The extra factors will make a difference in terms of the expected mean and variance of the statistic. For consistency, this article will continue to use the Cheng & Amin/Wong & Li form. -- ''Editor''</ref>}}
It has been shown that the statistic, when defined as
: <math>M(\theta) = -\sum_{i=1}^{n+1} \log{D_i(\theta)},</math>
is asymptotically normally distributed, with mean and variance approximately
: <math>\begin{align}
\mu_M & \approx (n+1)\left( \log(n+1) + \gamma \right) - \frac{1}{2} - \frac{1}{12(n+1)}, \\
\sigma^2_M & \approx (n+1)\left( \frac{\pi^2}{6} - 1 \right) - \frac{1}{2} - \frac{1}{6(n+1)},
\end{align}</math>
where ''γ'' is the [[Euler–Mascheroni constant]] which is approximately 0.57722.<ref group="note">{{NoteTag|{{harvtxt|Wong|Li|2006}} leave out the [[Euler–Mascheroni constant]] from their description. -- ''Editor''</ref>}}
 
The distribution can also be approximated by that of <math>A</math>, where
: <math>A = C_1 + C_2 \chi^2_n,</math>
in which <math>C_1 = \mu_M - \sqrt{\sigma^2_M\, n/2}</math> and <math>C_2 = \sqrt{\sigma^2_M / (2n)}</math>, and where <math>\chi^2_n</math> follows a [[chi-squared distribution]] with ''n'' degrees of freedom. (The constants are obtained by matching the mean and variance of <math>A</math> to <math>\mu_M</math> and <math>\sigma^2_M</math> above.)
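A minimal numerical sketch of the test using the normal approximation above (the exponential model, the sample size, and the simulated data are assumptions of the example):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import expon, norm

def moran_statistic(x, scale):
    """Moran statistic M(theta) = -sum(log D_i), in the
    Cheng & Amin / Wong & Li form used in this article."""
    F = expon.cdf(np.sort(x), scale=scale)
    D = np.diff(np.concatenate(([0.0], F, [1.0])))
    return -np.sum(np.log(D))

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=50)  # simulated data, true scale known
n = len(x)

# Normal approximation to the null distribution, from the formulas above.
gamma = 0.57722  # Euler–Mascheroni constant
mu_M = (n + 1) * (np.log(n + 1) + gamma) - 0.5 - 1.0 / (12 * (n + 1))
var_M = (n + 1) * (np.pi ** 2 / 6 - 1) - 0.5 - 1.0 / (6 * (n + 1))

M = moran_statistic(x, scale=2.0)  # evaluated at the true parameter
z = (M - mu_M) / np.sqrt(var_M)
print("one-sided approximate p-value:", 1.0 - norm.cdf(z))
</syntaxhighlight>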
{{harvtxt|Ranneby|al.|2005}} discuss extensions of the maximum spacing method to the [[Joint probability distribution|multivariate]] case. As there is no natural order on <math>\mathbb{R}^k</math> for <math>k>1</math>, they discuss two alternative approaches: a geometric approach based on [[Dirichlet cell]]s and a probabilistic approach based on a “nearest neighbor ball” metric.
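The details of both constructions are in the cited paper; as a loose illustration only, the following sketch scores a candidate multivariate normal model by the estimated probability content of each observation's nearest-neighbour ball (the Monte Carlo estimator and all numerical choices here are assumptions of the sketch, not Ranneby et al.'s exact construction):

<syntaxhighlight lang="python">
import numpy as np

def nn_ball_score(data, mean, cov, n_mc=20000, seed=0):
    """Mean log of the model probability assigned to each observation's
    nearest-neighbour ball, estimated by Monte Carlo sampling from the
    candidate multivariate normal model."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # Squared distance matrix and nearest-neighbour radius per point.
    d2 = ((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    radii2 = d2.min(axis=1)
    # Monte Carlo sample from the candidate model.
    sample = rng.multivariate_normal(mean, cov, size=n_mc)
    log_probs = np.empty(n)
    for i in range(n):
        inside = ((sample - data[i]) ** 2).sum(axis=-1) <= radii2[i]
        log_probs[i] = np.log(max(inside.mean(), 1.0 / n_mc))  # avoid log 0
    return log_probs.mean()
</syntaxhighlight>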
 
== See also ==
* [[Kullback–Leibler divergence]]
* [[Maximum likelihood]]
{{NoteFoot}}
 
== References ==
=== Citations ===
{{Reflist|30em}}
 
=== Works cited ===
{{refbegin}}
* {{cite journal
| isbn = 978-0-940600-68-3
}}
 
{{refend}}
 
{{-}}
{{Statistics}}
 
{{good article}}
 
{{DEFAULTSORT:Maximum Spacing Estimation}}
[[Category:Estimation methods]]
[[Category:Probability distribution fitting]]
 