{{Short description|Method of estimating a statistical model's parameters}}
[[File:Spacings.svg|thumb|260px|The maximum spacing method tries to find a distribution function such that the spacings, ''D''<sub>(''i'')</sub>, are all approximately of the same length. This is done by maximizing their [[geometric mean]].]]
In [[statistics]], '''maximum spacing estimation''' ('''MSE''' or '''MSP'''), or '''maximum product of spacing estimation (MPS)''', is a method for estimating the parameters of a univariate [[parametric model|statistical model]].<ref name="CA83">{{harvtxt|Cheng|Amin|1983}}</ref> The method requires maximization of the [[geometric mean]] of ''spacings'' in the data, which are the differences between the values of the [[cumulative distribution function]] at neighbouring data points.
The concept underlying the method is based on the [[probability integral transform]], in that a set of independent random samples derived from any random variable should on average be uniformly distributed with respect to the cumulative distribution function of the random variable. The MPS method chooses the parameter values that make the observed data as uniform as possible, according to a specific quantitative measure of uniformity.
One of the most common methods for estimating the parameters of a distribution from data, the method of [[maximum likelihood]] (MLE), can break down in various cases, such as involving certain mixtures of continuous distributions.<ref name
Apart from its use in pure mathematics and statistics, trial applications of the method have been reported using data from fields such as [[hydrology]],<ref>{{harvtxt|Hall|al.|2004}}</ref> [[econometrics]],<ref>{{harvtxt|Anatolyev|Kosenok|2004}}</ref> [[magnetic resonance imaging]],<ref>{{harvtxt|Pieciak|2014}}</ref> and others.<ref>{{harvtxt|Wong|Li|2006}}</ref>
== History and usage ==
The MSE method was derived independently by Russell Cheng and Nik Amin at the [[Cardiff University|University of Wales Institute of Science and Technology]], and Bo Ranneby at the [[Swedish University of Agricultural Sciences]].<ref name
There are certain distributions, especially those with three or more parameters, whose [[Likelihood#
The distributions that tend to have likelihood issues are often those used to model physical phenomena. {{harvtxt|Hall|al.|2004}} seek to analyze flood alleviation methods, which requires accurate models of river flood effects. The distributions that better model these effects are all three-parameter models, which suffer from the infinite likelihood issue described above, leading to Hall's investigation of the maximum spacing procedure. {{harvtxt|Wong|Li|2006}}, when comparing the method to maximum likelihood, use various data sets ranging from a set on the oldest ages at death in Sweden between 1905 and 1958 to a set containing annual maximum wind speeds.
== Definition ==
Given an [[iid]] [[random sample]] {''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>} of size ''n'' from a [[univariate distribution]] with continuous cumulative distribution function ''F''(''x'';''θ''<sub>0</sub>), where ''θ''<sub>0</sub> ∈ Θ is an unknown parameter to be [[estimation|estimated]], let {''x''<sub>(1)</sub>, ..., ''x''<sub>(''n'')</sub>} be the corresponding [[order statistic|ordered]] sample, that is, the result of sorting all observations from smallest to largest. For convenience also denote ''x''<sub>(0)</sub> = −∞ and ''x''<sub>(''n''+1)</sub> = +∞.
Define the ''spacings'' as the “gaps” between the values of the distribution function at adjacent ordered points:<ref name
<math display="block">
D_i(\theta) = F(x_{(i)};\,\theta) - F(x_{(i-1)};\,\theta), \quad i=1,\ldots,n+1.
</math>
Then the '''maximum spacing estimator''' of ''θ''<sub>0</sub> is defined as a value that maximizes the [[natural logarithm|logarithm]] of the [[geometric mean]] of sample spacings:
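<math display="block">
\hat{\theta} = \underset{\theta\in\Theta}{\operatorname{arg\,max}} \; S_n(\theta),
\quad\text{where }\
S_n(\theta) = \ln \sqrt[n+1]{D_1(\theta)D_2(\theta)\cdots D_{n+1}(\theta)} = \frac{1}{n+1}\sum_{i=1}^{n+1}\ln D_i(\theta).
</math>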
Note that some authors define the function ''S''<sub>''n''</sub>(''θ'') somewhat differently. In particular, {{harvtxt|Ranneby|1984}} multiplies each ''D''<sub>''i''</sub> by a factor of (''n''+1), whereas {{harvtxt|Cheng|Stephens|1989}} omit the {{frac|''n''+1}} factor in front of the sum and add the “−” sign in order to turn the maximization into minimization. As these are constants with respect to ''θ'', the modifications do not alter the location of the maximum of the function ''S''<sub>''n''</sub>.
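The definition above can be illustrated with a minimal Python sketch. The function names <code>log_spacing_mean</code> and <code>exponential_cdf</code> are hypothetical helpers introduced here for illustration, and the exponential model is only an example choice of ''F''; any continuous CDF could be passed in.

```python
import math

def log_spacing_mean(sample, cdf):
    """Return S_n: the mean of the log-spacings of `sample` under `cdf`."""
    xs = sorted(sample)
    # By convention F(x_(0)) = F(-inf) = 0 and F(x_(n+1)) = F(+inf) = 1.
    values = [0.0] + [cdf(x) for x in xs] + [1.0]
    spacings = [hi - lo for lo, hi in zip(values, values[1:])]
    if min(spacings) <= 0.0:
        # Ties (or a CDF that is flat between points) give a zero spacing,
        # for which the logarithm is undefined.
        return float("-inf")
    return sum(math.log(d) for d in spacings) / len(spacings)

def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0
    return cdf

# Illustrative data (made up for this sketch, not taken from any source).
data = [0.4, 1.1, 2.7, 0.9]
s_half = log_spacing_mean(data, exponential_cdf(0.5))
s_one = log_spacing_mean(data, exponential_cdf(1.0))
```

A maximum spacing estimate would be the parameter value with the largest such score. Because the spacings sum to one, the score can never exceed −ln(''n''+1), which is attained only when all spacings are equal.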
== Examples ==
This section presents two examples of calculating the maximum spacing estimator.
=== Example 1 ===
Suppose two values ''x''<sub>(1)</sub> = 2, ''x''<sub>(2)</sub> = 4 were sampled from the [[exponential distribution]] ''F''(''x'';''λ'') = 1 − e<sup>−''xλ''</sup>, ''x'' ≥ 0 with unknown parameter ''λ'' > 0. In order to construct the MSE we have to first find the spacings:
{| class="wikitable" style="margin:1em auto;"
! ''i'' !! ''F''(''x''<sub>(''i'')</sub>) !! ''F''(''x''<sub>(''i''−1)</sub>) !! ''D''<sub>''i''</sub> = ''F''(''x''<sub>(''i'')</sub>) − ''F''(''x''<sub>(''i''−1)</sub>)
|-
| 1 || 1 − e<sup>−2''λ''</sup> || 0 || 1 − e<sup>−2''λ''</sup>
|-
| 2 || 1 − e<sup>−4''λ''</sup> || 1 − e<sup>−2''λ''</sup> || e<sup>−2''λ''</sup> − e<sup>−4''λ''</sup>
|-
| 3 || 1 || 1 − e<sup>−4''λ''</sup> || e<sup>−4''λ''</sup>
|}
The process continues by finding the ''λ'' that maximizes the geometric mean of the “difference” column. Using the convention that ignores taking the (''n''+1)st root, this turns into the maximization of the following product: (1 − e<sup>−2''λ''</sup>) · (e<sup>−2''λ''</sup> − e<sup>−4''λ''</sup>) · (e<sup>−4''λ''</sup>). Letting ''μ'' = e<sup>−2''λ''</sup>, the problem becomes finding the maximum of ''μ''<sup>5</sup>−2''μ''<sup>4</sup>+''μ''<sup>3</sup>. Differentiating, the ''μ'' has to satisfy 5''μ''<sup>4</sup>−8''μ''<sup>3</sup>+3''μ''<sup>2</sup> = 0. This equation has roots 0, 0.6, and 1. As ''μ'' is actually e<sup>−2''λ''</sup>, it has to be greater than zero but less than one. Therefore, the only acceptable solution is
<math display="block">
\mu=0.6 \quad \Rightarrow \quad \lambda_{\text{MSE}} = \frac{\ln 0.6}{-2} \approx 0.255,
</math>
which corresponds to an exponential distribution with a mean of {{frac|''λ''}} ≈ 3.915. For comparison, the maximum likelihood estimate of λ is the inverse of the sample mean, 3, so ''λ''<sub>MLE</sub> = ⅓ ≈ 0.333.
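As a rough numerical check of this example, the following Python sketch maximizes the log of the product of the three spacings with a simple golden-section search and compares the result with the closed form −ln(0.6)/2. The helper names and the search interval [0.01, 5] are assumptions made for illustration; the search relies on the objective being unimodal in ''λ'', which holds here because ''μ''<sup>3</sup>(1 − ''μ'')<sup>2</sup> has a single interior maximum.

```python
import math

def log_spacing_product(rate, x1=2.0, x2=4.0):
    """Log of D_1 * D_2 * D_3 for the two-point exponential example."""
    d1 = 1.0 - math.exp(-rate * x1)
    d2 = math.exp(-rate * x1) - math.exp(-rate * x2)
    d3 = math.exp(-rate * x2)
    return math.log(d1) + math.log(d2) + math.log(d3)

def golden_section_max(f, lo, hi, tol=1e-9):
    """Locate the maximizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0

rate_hat = golden_section_max(log_spacing_product, 0.01, 5.0)
closed_form = -math.log(0.6) / 2.0  # from mu = exp(-2 * lambda) = 0.6
```

Both values should agree to several decimal places, at roughly 0.255.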
=== Example 2 ===
Suppose {''x''<sub>(1)</sub>, ..., ''x''<sub>(''n'')</sub>} is the ordered sample from a [[Uniform distribution (continuous)|uniform distribution]] ''U''(''a'',''b'') with unknown endpoints ''a'' and ''b''. The cumulative distribution function is ''F''(''x'';''a'',''b'') = (''x''−''a'')/(''b''−''a'') when ''x''∈[''a'',''b'']. Therefore, individual spacings are given by
<math display="block">
D_1 = \frac{x_{(1)}-a}{b-a}, \ \
D_i = \frac{x_{(i)}-x_{(i-1)}}{b-a}\ \text{for } i = 2, \ldots, n, \ \
D_{n+1} = \frac{b-x_{(n)}}{b-a}.
</math>
Calculating the geometric mean and then taking the logarithm, statistic ''S''<sub>''n''</sub> will be equal to
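<math display="block">
S_n(a,b) = \tfrac{\ln(x_{(1)}-a)}{n+1} + \tfrac{\sum_{i=2}^n \ln(x_{(i)}-x_{(i-1)})}{n+1} + \tfrac{\ln(b-x_{(n)})}{n+1} - \ln(b-a).
</math>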
Here only three terms depend on the parameters ''a'' and ''b''. Differentiating with respect to those parameters and solving the resulting linear system, the maximum spacing estimates will be
<math display="block">
\hat{a} = \frac{nx_{(1)} - x_{(n)}}{n-1},\ \ \hat{b} = \frac{nx_{(n)}-x_{(1)}}{n-1}.
</math>
These are known to be the [[uniformly minimum variance unbiased]] (UMVU) estimators for the continuous uniform distribution.<ref name="CA83" /> In comparison, the maximum likelihood estimates for this problem <math alt="ML estimate of a is the smallest of x’s">\scriptstyle\hat{a}=x_{(1)}</math> and <math alt="ML estimate of b is the largest of x’s">\scriptstyle\hat{b}=x_{(n)}</math> are biased and have higher [[mean-squared error]].
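The closed-form estimates for the uniform case can be sketched in a few lines of Python. The function name <code>uniform_msp_estimates</code> and the sample values are hypothetical, chosen only to illustrate the formulas above.

```python
def uniform_msp_estimates(sample):
    """Maximum spacing estimates (a_hat, b_hat) for U(a, b)."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    x_min, x_max = min(sample), max(sample)
    a_hat = (n * x_min - x_max) / (n - 1)
    b_hat = (n * x_max - x_min) / (n - 1)
    return a_hat, b_hat

# Illustrative data (made up for this sketch).
sample = [2.0, 3.0, 5.0, 6.0]
a_hat, b_hat = uniform_msp_estimates(sample)
```

Unlike the maximum likelihood estimates, which sit exactly on the sample minimum and maximum, these estimates extend the interval beyond the observed range by the same amount at each end, namely (''b̂'' − ''â'')/(''n''+1).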
== Properties ==
=== Consistency and efficiency ===
The maximum spacing estimator is a [[consistent estimator]] in that it [[convergence in probability|converges in probability]] to the true value of the parameter, ''θ''<sub>0</sub>, as the sample size increases to infinity.<ref name
Maximum spacing estimators are also at least as [[Efficiency (statistics)#Asymptotic efficiency|asymptotically efficient]] as maximum likelihood estimators, where the latter exist. However, MSEs may exist in cases where MLEs do not.<ref name
=== Sensitivity ===
Maximum spacing estimators are sensitive to closely spaced observations, and especially ties.<ref name
<math display="block">
X_{i+k} = X_{i+k-1}=\cdots=X_i, \,
</math>
we get
<math display="block">
D_{i+k}(\theta) = D_{i+k-1}(\theta) = \cdots = D_{i+1}(\theta) = 0. \,
</math>
When the ties are due to multiple observations, the repeated spacings (those that would otherwise be zero) should be replaced by the corresponding likelihood.<ref name
<math display="block">
\lim_{x_i \to x_{i-1}}\frac{\int_{x_{i-1}}^{x_i}f(t;\theta)\,dt}{x_i-x_{i-1}} = f(x_{i-1};\theta) = f(x_{i};\theta),
</math>
since <math>x_{i} = x_{i-1}</math>.
When ties are due to rounding error, {{harvtxt|Cheng|Stephens|1989}} suggest another method to remove the effects.{{NoteTag|There appear to be some minor typographical errors in the paper. For example, in section 4.2, equation (4.1), the rounding replacement for <math>D_j</math>, should not have the log term. In section 1, equation (1.2), <math>D_j</math> is defined to be the spacing itself, and <math>M(\theta)</math> is the negative sum of the logs of <math>D_j</math>. If <math>D_j</math> is logged at this step, the result is always ≤ 0, as the difference between two adjacent points on a cumulative distribution is always
Given ''r'' tied observations from ''x''<sub>''i''</sub> to ''x''<sub>''i''+''r''−1</sub>, let ''δ'' represent the [[round-off error]]. All of the true values should then fall in the range <math>x \pm \delta</math>. The corresponding points on the distribution should now fall between <math>y_L = F(x-\delta, \hat\theta)</math> and <math>y_U = F(x+\delta, \hat\theta)</math>. Cheng and Stephens suggest assuming that the rounded values are [[Uniform distribution (continuous)|uniformly spaced]] in this interval, by defining
<math display="block">
D_j = \frac{y_U-y_L}{r-1} \quad (j=i+1,\ldots,i+r-1).
</math>
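The rounding replacement can be sketched as follows. This is one reading of the formula above, not code from Cheng and Stephens; the function name <code>rounded_tie_spacings</code>, the exponential CDF and the numbers used are all assumptions made for illustration.

```python
import math

def rounded_tie_spacings(x, r, delta, cdf):
    """Replacement spacings for r observations tied at x, rounded to +/- delta.

    Returns the r - 1 spacings D_j = (y_U - y_L) / (r - 1) that stand in
    for the zero spacings between the tied observations.
    """
    if r < 2:
        raise ValueError("a tie needs at least two observations")
    y_lower = cdf(x - delta)
    y_upper = cdf(x + delta)
    return [(y_upper - y_lower) / (r - 1)] * (r - 1)

def unit_exponential_cdf(t):
    """CDF of an exponential distribution with rate 1 (example model)."""
    return 1.0 - math.exp(-t) if t >= 0 else 0.0

# Three observations recorded as 1.0, each rounded to the nearest 0.1.
replacement = rounded_tie_spacings(1.0, 3, 0.05, unit_exponential_cdf)
```

The replacement spacings are equal and sum to ''y''<sub>''U''</sub> − ''y''<sub>''L''</sub>, so their logarithms are finite where the original tied spacings would have been zero.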
The MSE method is also sensitive to secondary clustering.<ref name
== Moran test ==
The statistic ''S<sub>n</sub>''(''θ'') is also a form of [[Pat Moran (statistician)|Moran]] or Moran-Darling statistic, ''M''(''θ''), which can be used to test [[goodness of fit]].{{NoteTag|The literature refers to related statistics as Moran or Moran-Darling statistics. For example, {{harvtxt|Cheng|Stephens|1989}} analyze the form <math>\scriptstyle M(\theta)= -\sum_{j=1}^{n+1}\log{D_j(\theta)}</math> where <math>\scriptstyle D_j(\theta)</math> is defined as above. {{harvtxt|Wong|Li|2006}} use the same form as well. However, {{harvtxt|Beirlant|al.|2001}} use the form <math>\scriptstyle M_n= -\sum_{j=0}^{n}\ln{((n + 1)(X_{n,j+1} - X_{n,j}))}</math>, with the additional factor of <math>(n+1)</math> inside the logged summation. The extra factors will make a difference in terms of the expected mean and variance of the statistic. For consistency, this article will continue to use the Cheng & Amin/Wong & Li form. -- ''Editor''}}
It has been shown that the statistic, when defined as
<math display="block">
S_n(\theta) = M_n(\theta)= -\sum_{j=1}^{n+1}\ln{D_j(\theta)},
</math>
is [[Estimator#Asymptotic normality|asymptotically normal]], and that a chi-squared approximation exists for small samples.<ref name
<math display="block">\begin{align}
\mu_M & \approx (n+1)(\ln(n+1)+\gamma)-\frac{1}{2}-\frac{1}{12(n+1)},\\
\sigma^2_M & \approx (n+1)\left ( \frac{\pi^2}{6} -1 \right ) -\frac{1}{2}-\frac{1}{6(n+1)},
\end{align}</math>
The distribution can also be approximated by that of <math>A</math>, where
<math display="block"> A = C_1 + C_2\chi^2_n \,, </math>
in which
<math display="block">\begin{align}
C_1 &= \mu_M - \sqrt{\frac{\sigma^2_Mn}{2}},\\
C_2 &= \sqrt{\frac{\sigma^2_M}{2n}},
\end{align}</math>
and where <math>\chi^2_n</math> follows a [[chi-squared distribution]] with <math>n</math> [[Degrees of freedom (statistics)|degrees of freedom]]. Therefore, to test the hypothesis <math>H_0</math> that a random sample of <math>n</math> values comes from the distribution <math>F(x,\theta)</math>, the statistic <math>T(\theta)= \frac{M(\theta)-C_1}{C_2}</math> can be calculated. Then <math>H_0</math> should be rejected with [[Statistical significance|significance]] <math>\alpha</math> if the value is greater than the [[critical value (statistics)|critical value]] of the appropriate chi-squared distribution.<ref name
Where ''θ''<sub>0</sub> is being estimated by <math>\hat\theta</math>, {{harvtxt|Cheng|Stephens|1989}} showed that <math>S_n(\hat\theta) = M_n(\hat\theta)</math> has the same asymptotic mean and variance as in the known case. However, the test statistic to be used requires the addition of a bias correction term and is:
<math display="block">
T(\hat\theta) = \frac{M(\hat\theta)+\frac{k}{2}-C_1}{C_2},
</math>
where <math>k</math> is the number of parameters in the estimate.
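The test statistic can be assembled from the approximations above in a short Python sketch. The function names and the argument <code>n_estimated</code> (standing for ''k'') are hypothetical, and the Euler–Mascheroni constant ''γ'' is hard-coded because the standard <code>math</code> module does not provide it. Obtaining the chi-squared critical value is left to a statistical table or library.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def moran_statistic(sample, cdf):
    """M = -sum of log-spacings of `sample` under the hypothesized `cdf`."""
    xs = sorted(sample)
    values = [0.0] + [cdf(x) for x in xs] + [1.0]
    # math.log raises ValueError on a zero spacing, i.e. on tied values.
    return -sum(math.log(hi - lo) for lo, hi in zip(values, values[1:]))

def moran_test_statistic(sample, cdf, n_estimated=0):
    """T = (M + k/2 - C_1) / C_2, with k = n_estimated fitted parameters."""
    n = len(sample)
    m = n + 1
    mean = m * (math.log(m) + EULER_GAMMA) - 0.5 - 1.0 / (12.0 * m)
    var = m * (math.pi ** 2 / 6.0 - 1.0) - 0.5 - 1.0 / (6.0 * m)
    c1 = mean - math.sqrt(var * n / 2.0)
    c2 = math.sqrt(var / (2.0 * n))
    return (moran_statistic(sample, cdf) + n_estimated / 2.0 - c1) / c2

# Illustrative check: nine points at the deciles of U(0, 1) give equal
# spacings, so M takes its smallest possible value, (n + 1) * ln(n + 1).
deciles = [i / 10 for i in range(1, 10)]
m_value = moran_statistic(deciles, lambda x: x)
t_value = moran_test_statistic(deciles, lambda x: x)
```

The hypothesis would be rejected at level ''α'' when the returned value exceeds the upper ''α'' critical value of the chi-squared distribution with ''n'' degrees of freedom.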
== Generalized maximum spacing ==
=== Alternate measures and spacings ===
{{harvtxt|Ranneby|Ekström|1997}} generalized the MSE method to approximate other [[F-divergence|measures]] besides the
=== Multivariate distributions ===
{{harvtxt|Ranneby|al.|2005}} discuss extensions of the maximum spacing method to the [[Joint probability distribution|multivariate]] case. As there is no natural order on <math>\mathbb{R}^k</math> for <math>k>1</math>, they discuss two alternative approaches: a geometric approach based on [[Dirichlet cell]]s and a probabilistic approach based on a “nearest neighbor ball” metric.
{{refbegin}}
* {{cite journal
| last1 = Anatolyev
| first1 = Stanislav | last2 = Kosenok
| first2 = Grigory | year
| title = An alternative to maximum likelihood based on spacings
| journal = Econometric Theory
| volume = 21
| issue = 2 | pages = 472–476 | doi = 10.1017/S0266466605050255
| url = http://fir.nes.ru/~gkosenok/MPS.pdf
|
| ref = CITEREFAnatolyevKosenok2004
| citeseerx = 10.1.1.494.7340
| s2cid = 123004317
| archive-date = 2011-08-16
| archive-url = https://web.archive.org/web/20110816101736/http://fir.nes.ru/~gkosenok/MPS.pdf
| url-status = dead
}}
* {{cite journal
|
|
|
|first2
|last3
|first3 = L.
|last4 = van der Meulen
|first4
|year = 1997
|title = Nonparametric entropy estimation: an overview
|journal = International Journal of Mathematical and Statistical Sciences
|volume = 6
|issue = 1
|pages = 17–40
|issn = 1055-7490
|url = http://www.menem.com/ilya/digital_library/entropy/beirlant_etal_97.pdf
|access-date = 2008-12-31
|ref = CITEREFBeirlantal.2001
|archive-url = https://web.archive.org/web/20050505044534/http://www.menem.com/ilya/digital_library/entropy/beirlant_etal_97.pdf
|archive-date = May 5, 2005
}} <small>''Note: linked paper is an updated 2001 version.''</small>
* {{cite journal
| last1 = Cheng | first1 = R.C.H.
| issn = 0035-9246
| jstor = 2345411
| doi = 10.1111/j.2517-6161.1983.tb01268.x
}}
| volume = 76 | issue = 2 | pages = 386–392
| doi = 10.1093/biomet/76.2.385
}}
* {{cite journal
| last
| first = Magnus
| year
| title = Generalized maximum spacing estimates
| journal = University of Umeå, Department of Mathematics
| issn = 0345-3928
| url = http://www.matstat.umu.se/varia/reports/rep9706.ps.gz
|
| archive-url = https://web.archive.org/web/20070214143052/http://www.matstat.umu.se/varia/reports/rep9706.ps.gz
|
}}
* {{cite journal
|last1 = Hall
|first1 = M.J.
|last2 = van den Boogaard
|first2 = H.F.P.
|last3 = Fernando
|first3 = R.C.
|last4 = Mynett
|first4 = A.E.
|year = 2004
|title = The construction of confidence intervals for frequency analysis using resampling techniques
|journal = Hydrology and Earth System Sciences
|volume = 8
|issue = 2
|pages = 235–246
|issn = 1027-5606
|ref = CITEREFHallal.2004
|doi = 10.5194/hess-8-235-2004
|url = https://hal.archives-ouvertes.fr/hal-00304907/document
|doi-access = free
}}
* {{cite conference
| last1 = Pieciak
| first1 = Tomasz | year
| title = The maximum spacing noise estimation in single-coil background MRI data
| conference = IEEE International Conference on Image Processing
| pages = 1743–1747
 | location = Paris
| doi = 10.1109/icip.2014.7025349
| url = https://scholar.archive.org/work/e2l3rb6s3va7pd3kf6oioymgza
}}
* {{cite journal
| issn = 0035-9246
| jstor = 2345793
| issue = 3
| doi = 10.1111/j.2517-6161.1965.tb00602.x
| issn = 0303-6898
| jstor = 4615946
}}
* {{cite journal
|
| |
| |
|
|
|
|
|
|access-date
|archive-url = https://web.archive.org/web/20070214143042/http://www.matstat.umu.se/varia/reports/rep9705.ps.gz
|archive-date = February 14, 2007
}}
* {{cite journal
|
|first1
|last2 = Jammalamadaka
|first2
|last3 = Teterukovskiy
|first3 = Alex
|year
|title = The maximum spacing estimation for multivariate observations
|journal = Journal of Statistical Planning and Inference
|volume = 129
|issue = 1–2
|pages = 427–446
|doi = 10.1016/j.jspi.2004.06.059
|url = http://www.pstat.ucsb.edu/faculty/jammalam/html/research%20publication_files/MSP2.pdf
|access-date = 2008-12-31
|ref = CITEREFRannebyal.2005
}}
* {{cite book
| last1 = Wong | first1 = T.S.T
| pages = 272–283
| doi = 10.1214/074921706000001102
| arxiv = math/0702830v1
| series = Institute of Mathematical Statistics Lecture Notes – Monograph Series
| isbn = 978-0-940600-68-3
| s2cid = 88516426
}}
{{refend}}
{{Statistics}}
[[Category:Estimation methods]]
[[Category:Probability distribution fitting]]