Calliopejen1 (talk | contribs) m Calliopejen1 moved page Draft:Quantile-parameterized distribution to Quantile-parameterized distribution: Publishing accepted Articles for creation submission (AFCH 0.9.1)
m Open access bot: url-access=subscription updated in citation with #oabot.
(31 intermediate revisions by 12 users not shown)
Line 1:
== History ==
The development of quantile-parameterized distributions was inspired by the practical need for flexible continuous probability distributions that are easy to fit to data. Historically, the [[Pearson distribution|Pearson]]<ref>Johnson NL, Kotz S, Balakrishnan N. Continuous univariate distributions, Vol 1, Second Edition, John Wiley & Sons, Ltd, 1994, pp.
For example, the [[beta distribution]] is a flexible Pearson distribution that is frequently used to model percentages of a population. However, if the characteristics of this population are such that the desired [[cumulative distribution function]] (CDF) should run through certain specific CDF points, there may be no beta distribution that meets this need. Because the beta distribution has only two shape parameters, it cannot, in general, match even three specified CDF points. Moreover, the beta parameters that best fit such data can be found only by nonlinear iterative methods.
Practitioners of [[decision analysis]], needing distributions easily parameterized by three or more CDF points (e.g., because such points were specified as the result of an [[Expert elicitation|expert-elicitation process]]), originally invented quantile-parameterized distributions for this purpose. Keelin and Powley (2011)<ref name="KeelinPowley">
== Definition ==
Line 14:
F^{-1} (y)= \left\{
\begin{array}{cl}
L_0 & \mbox{for } y=0 \\
\sum_{i=1}^n a_i g_i(y) & \mbox{for } 0<y<1 \\
L_1 & \mbox{for } y=1
\end{array}\right.
Line 29:
</math>
and the functions <math>g_i(y)</math> are continuously differentiable and linearly independent basis functions. Here, essentially, <math>L_0</math> and <math>L_1</math> are the lower and upper bounds (if they exist) of a random variable with quantile function <math>F^{-1}(y)</math>. These distributions are called quantile-parameterized because for a given set of quantile pairs <math>\{(x_i, y_i)
: <math>\mu(y)=a_1+a_4 y</math>
Line 41:
QPDs that meet Keelin and Powley’s definition have the following properties.
=== Probability density function ===
Differentiating <math>x=F^{-1} (y)=\sum_{i=1}^n a_i g_i (y)</math> with respect to <math>y</math> yields <math>dx/dy</math>. The reciprocal of this quantity, <math>dy/dx</math>, is the [[probability density function]] (PDF)
Line 47:
: <math>f(y) = \left( \sum_{i=1}^n a_i {{d g_i(y)}\over{dy}} \right)^{-1}</math>
where <math>0<y<1</math>. Note that this PDF is expressed as a function of cumulative probability <math>y</math> rather than <math>x</math>. To plot it, as shown in the
=== Feasibility ===
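A function of the form of <math>F^{-1} (y)</math> is a feasible probability distribution if and only if <math>f(y)>0</math> for all <math>y \in (0,1)</math>.<ref name="KeelinPowley" /> Because <math>f(y)</math> is the reciprocal of <math>dx/dy</math>, this is equivalent to the following constraint on the coefficients <math>\boldsymbol a</math>: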
: <math>\sum_{i=1}^n a_i {{d g_i(y)}\over{dy}} >0</math> for all <math>y \in (0,1)</math>
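The feasibility condition can be checked numerically. The following is a minimal sketch in Python, assuming a hypothetical two-term QPD with basis functions <math>g_1(y)=1</math> and <math>g_2(y)=\ln(y/(1-y))</math> (the form of the logistic quantile function). The function names and the grid size are illustrative assumptions, and a check on a finite grid can only suggest feasibility, not prove it.

```python
import math

# Hypothetical basis for illustration: g_1(y) = 1, g_2(y) = ln(y / (1 - y)).
# Their derivatives with respect to y are 0 and 1 / (y * (1 - y)).
BASIS_DERIVATIVES = [
    lambda y: 0.0,
    lambda y: 1.0 / (y * (1.0 - y)),
]

def dx_dy(a, y):
    """Return sum_i a_i * dg_i(y)/dy, the derivative of the quantile function."""
    return sum(a_i * dg(y) for a_i, dg in zip(a, BASIS_DERIVATIVES))

def pdf(a, y):
    """PDF expressed as a function of cumulative probability y in (0, 1)."""
    return 1.0 / dx_dy(a, y)

def looks_feasible(a, grid_size=999):
    """Check dx/dy > 0 on an evenly spaced grid strictly inside (0, 1)."""
    ys = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return all(dx_dy(a, y) > 0.0 for y in ys)

feasible = looks_feasible([10.0, 2.0])      # positive coefficient on g_2
infeasible = looks_feasible([10.0, -2.0])   # negative coefficient on g_2
density_at_median = pdf([10.0, 2.0], 0.5)   # 1 / (2 / 0.25) = 0.125
```

With this assumed basis, the sign of <math>a_2</math> alone decides feasibility, because <math>dg_2/dy</math> is positive everywhere on <math>(0,1)</math>.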
Line 57:
=== Convexity ===
A QPD’s set of feasible coefficients <math>S_\boldsymbol a=\{\boldsymbol a\in\R^n
=== Fitting to data ===
The coefficients <math>\boldsymbol a</math> can be determined from data by [[linear least squares]]. Given <math>m</math> data points <math>(x_i,y_i)</math> that are intended to characterize the CDF of a QPD, let <math>\boldsymbol Y</math> be the <math>m \times n</math> matrix whose elements are <math>g_j (y_i)</math>. Then, so long as <math>m\geq n</math> and <math>\boldsymbol Y^T \boldsymbol Y</math> is invertible, the column vector of coefficients is <math>\boldsymbol a=(\boldsymbol Y^T \boldsymbol Y)^{-1} \boldsymbol Y^T \boldsymbol x</math>, where <math>\boldsymbol x=(x_1,\ldots,x_m)^T</math> is the column vector of data values.
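The least-squares fit can be sketched in a few lines of Python. The example below assumes the same hypothetical two-term basis, <math>g_1(y)=1</math> and <math>g_2(y)=\ln(y/(1-y))</math>, and solves the normal equations <math>\boldsymbol Y^T \boldsymbol Y \boldsymbol a = \boldsymbol Y^T \boldsymbol x</math> by Gaussian elimination instead of forming the inverse explicitly. The helper names <code>solve_linear_system</code> and <code>fit_qpd</code> are illustrative, not part of any library.

```python
import math

# Hypothetical basis functions for illustration (not prescribed by the text).
BASIS = [
    lambda y: 1.0,
    lambda y: math.log(y / (1.0 - y)),
]

def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("Y^T Y is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit_qpd(xs, ys, basis=BASIS):
    """Return a = (Y^T Y)^{-1} Y^T x for data points (x_i, y_i)."""
    Y = [[g(y) for g in basis] for y in ys]  # m-by-n matrix of g_j(y_i)
    n = len(basis)
    YtY = [[sum(row[j] * row[k] for row in Y) for k in range(n)] for j in range(n)]
    Ytx = [sum(row[j] * x for row, x in zip(Y, xs)) for j in range(n)]
    return solve_linear_system(YtY, Ytx)

# Data generated from x = 10 + 2 ln(y / (1 - y)), so the fit should recover
# coefficients close to [10, 2].
ys = [0.1, 0.25, 0.5, 0.75, 0.9]
xs = [10.0 + 2.0 * math.log(y / (1.0 - y)) for y in ys]
a = fit_qpd(xs, ys)
```

No iteration or starting guess is needed, because the quantile function is linear in the coefficients. For larger or poorly conditioned problems, a library least-squares routine would likely be preferable to solving the normal equations directly.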
=== Shape ===
A QPD with <math>n</math> terms, where <math>n\ge 2</math>, has <math>n-2</math> shape parameters. Thus, QPDs can be far more flexible than the [[Pearson distribution
=== Transformations ===
QPD transformations are governed by a general property of quantile functions: for any [[quantile function]] <math>x=Q(y)</math> and increasing function <math>t(x)</math>, <math>x=t^{-1} (Q(y))</math> is also a quantile function.<ref>Gilchrist, W., 2000. Statistical modelling with quantile functions. CRC Press.</ref>
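As an illustration of this property, the sketch below takes the logistic quantile function as <math>Q(y)</math> and assumes the increasing function <math>t(x)=\ln(x-b)</math> with a hypothetical lower bound <math>b=5</math>, so that <math>t^{-1}(z)=b+e^z</math>. It then checks on a grid that the transformed function is still increasing and stays above the bound. The function names and the value of <math>b</math> are chosen only for the example.

```python
import math

def logistic_quantile(y, mu=0.0, s=1.0):
    """Quantile function of the logistic distribution: mu + s ln(y / (1 - y))."""
    return mu + s * math.log(y / (1.0 - y))

def shifted_exp_quantile(y, lower_bound=5.0):
    """Apply t^{-1}(z) = lower_bound + exp(z), the inverse of t(x) = ln(x - lower_bound)."""
    return lower_bound + math.exp(logistic_quantile(y))

ys = [k / 100 for k in range(1, 100)]
xs = [shifted_exp_quantile(y) for y in ys]

is_increasing = all(x0 < x1 for x0, x1 in zip(xs, xs[1:]))
stays_above_bound = all(x > 5.0 for x in xs)
median = shifted_exp_quantile(0.5)  # 5 + exp(0) = 6
```

The transformed quantile function is no longer linear in <math>\mu</math> and <math>s</math>, but the transformed variable <math>t(x)</math> is, so a linear fit can still be carried out on <math>\ln(x-b)</math>.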
=== Moments ===
The <math>k^{th}</math> moment of a QPD is:<ref name="KeelinPowley" />
: <math>E[x^k] = \int_0^1 \left( \sum_{i=1}^n a_i g_i(y) \right)^k dy</math>
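When a closed form is not at hand, the integral can be approximated numerically. The sketch below applies a simple midpoint rule to the logistic quantile function with assumed parameters <math>\mu=3</math> and <math>s=2</math>, for which the mean <math>\mu</math> and variance <math>s^2\pi^2/3</math> are known and serve as a check. The function <code>qpd_moment</code> and the number of cells are illustrative choices, not a recommended integration scheme.

```python
import math

def qpd_moment(quantile, k, n_cells=200_000):
    """Approximate E[x^k], the integral over (0, 1) of quantile(y)**k dy, by the midpoint rule."""
    h = 1.0 / n_cells
    return h * sum(quantile((i + 0.5) * h) ** k for i in range(n_cells))

def logistic_quantile(y, mu=3.0, s=2.0):
    """Quantile function of the logistic distribution: mu + s ln(y / (1 - y))."""
    return mu + s * math.log(y / (1.0 - y))

mean = qpd_moment(logistic_quantile, 1)
second_moment = qpd_moment(logistic_quantile, 2)
variance = second_moment - mean ** 2

# Known values for the logistic distribution, used only as a sanity check.
exact_mean = 3.0
exact_variance = (2.0 ** 2) * math.pi ** 2 / 3.0
```

The midpoint rule never evaluates the quantile function at <math>y=0</math> or <math>y=1</math>, which matters for unbounded distributions whose quantile function diverges at the endpoints.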
Whether such moments exist in closed form depends on the choice of QPD basis functions <math>g_i (y)</math>. The unbounded [[metalog distribution
=== Simulation ===
Line 85 ⟶ 86:
* The quantile function of the [[Cauchy distribution]], <math>x=x_0+\gamma \tan[\pi(y-0.5)]</math>.
* The quantile function of the [[logistic distribution]], <math>x=\mu+s \ln(y/(1-y) )</math>.
* The unbounded [[metalog
* The [[Metalog distribution#Unbounded,_semibounded,_and_bounded_metalog_distributions|semi-bounded and bounded metalog distributions
* The [[Metalog distribution#SPT_metalog_distributions|SPT (symmetric-percentile triplet) unbounded, semi-bounded, and bounded metalog distributions
* The Simple Q-Normal distribution<ref>
* The metadistributions, including the meta-normal<ref>
* Quantile functions expressed as [[polynomial]] functions of cumulative probability <math>y</math>, including [[Chebyshev polynomial]] functions.
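The Cauchy and logistic entries above can be written in the form <math>\sum_{i=1}^n a_i g_i(y)</math> with two terms each. The following sketch makes that explicit, taking <math>g_1(y)=1</math> in both cases and <math>g_2(y)=\tan[\pi(y-0.5)]</math> or <math>g_2(y)=\ln(y/(1-y))</math> respectively. It also draws samples by feeding uniform random numbers into the quantile function. The helper <code>qpd_quantile</code>, the parameter values, and the seed are assumptions made for the example.

```python
import math
import random

def qpd_quantile(a, basis, y):
    """Evaluate sum_i a_i * g_i(y) for 0 < y < 1."""
    return sum(a_i * g(y) for a_i, g in zip(a, basis))

CAUCHY_BASIS = [lambda y: 1.0, lambda y: math.tan(math.pi * (y - 0.5))]
LOGISTIC_BASIS = [lambda y: 1.0, lambda y: math.log(y / (1.0 - y))]

# Cauchy with x_0 = 1 and gamma = 2: the quartiles are x_0 - gamma and x_0 + gamma.
cauchy_q1 = qpd_quantile([1.0, 2.0], CAUCHY_BASIS, 0.25)
cauchy_median = qpd_quantile([1.0, 2.0], CAUCHY_BASIS, 0.5)
cauchy_q3 = qpd_quantile([1.0, 2.0], CAUCHY_BASIS, 0.75)

# Logistic with mu = 0 and s = 1: the median is mu.
logistic_median = qpd_quantile([0.0, 1.0], LOGISTIC_BASIS, 0.5)

# Inverse-transform sampling: uniform draws on (0, 1) go through the quantile function.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
samples = [
    qpd_quantile([0.0, 1.0], LOGISTIC_BASIS, rng.uniform(1e-12, 1.0 - 1e-12))
    for _ in range(1000)
]
```

In both cases the location and scale parameters play the role of the coefficients <math>a_1</math> and <math>a_2</math>.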
Like the SPT metalog distributions
== Applications ==
The original applications of QPDs were by decision analysts wishing to conveniently convert expert-assessed quantiles (e.g., 10th, 50th, and 90th quantiles) into smooth continuous probability distributions. QPDs have also been used to fit output data from simulations in order to represent those outputs (both CDFs and PDFs) as closed-form continuous distributions.<ref>[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), Section 6.2.2, pp.
== External links ==
Line 105 ⟶ 107:
{{reflist}}
[[
[[Category:Systems of probability distributions]]