Quantile-parameterized distribution: Difference between revisions

Line 29:
</math>
 
and the functions <math>g_i(y)</math> are continuously differentiable and linearly independent basis functions. Here, essentially, <math>L_0</math> and <math>L_1</math> are the lower and upper bounds (if they exist) of a random variable with quantile function <math>F^{-1}(y)</math>. These distributions are called quantile-parameterized because, for a given set of quantile pairs <math>\{(x_i, y_i) \mid i=1,\ldots,n\}</math>, where <math>x_i=F^{-1}(y_i)</math>, and a set of <math>n</math> basis functions <math>g_i(y)</math>, the coefficients <math>a_i</math> can be determined by solving a set of linear equations<ref name="KeelinPowley" />. If one desires to use more quantile pairs than basis functions, then the coefficients <math>a_i</math> can be chosen to minimize the sum of squared errors between the stated quantiles <math>x_i</math> and <math>F^{-1}(y_i)</math>. Keelin and Powley<ref name="KeelinPowley" /> illustrate this concept for a specific choice of basis functions that is a generalization of the quantile function of the [[normal distribution]], <math>x=\mu+\sigma \varphi^{-1} (y)</math>, for which the mean <math>\mu</math> and standard deviation <math>\sigma</math> are linear functions of cumulative probability <math>y</math>:
 
: <math>\mu(y)=a_1+a_4 y</math>
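The linear-equation step described above can be sketched numerically. This is a minimal illustration, not the authors' implementation. It assumes the four basis functions <math>1,\ \varphi^{-1}(y),\ y\,\varphi^{-1}(y),\ y</math>, which follow from <math>\mu(y)=a_1+a_4 y</math> together with an assumed <math>\sigma(y)=a_2+a_3 y</math>; the function names are hypothetical. It handles only the case of exactly as many quantile pairs as basis functions, not the least-squares case.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its inverse CDF


def basis(y):
    """Assumed basis functions g_1..g_4 evaluated at cumulative probability y."""
    z = _STD.inv_cdf(y)
    return [1.0, z, y * z, y]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_coefficients(pairs):
    """Return a_1..a_4 such that the quantile function passes through each (x_i, y_i)."""
    A = [basis(y) for _, y in pairs]
    b = [x for x, _ in pairs]
    return solve_linear(A, b)


def quantile(a, y):
    """Evaluate the sum of a_i * g_i(y)."""
    return sum(ai * gi for ai, gi in zip(a, basis(y)))
```

Calling <code>fit_coefficients</code> with four <math>(x_i, y_i)</math> pairs returns coefficients whose quantile function reproduces those pairs. Whether the result is a valid (strictly increasing) quantile function must be checked separately.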
Line 57:
 
=== Convexity ===
A QPD’s set of feasible coefficients <math>S_\boldsymbol a=\{\boldsymbol a\in\R^n \mid \sum_{i=1}^n a_i d g_i (y)/dy > 0</math> for all <math>y\in (0,1)\}</math> is [[Convex set|convex]]. Because [[convex optimization]] problems require convex feasible sets, this property simplifies optimization problems involving QPDs.
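The feasibility condition and its convexity can be illustrated with a small numerical check. This is a sketch under stated assumptions. It uses the assumed four-term basis <math>1,\ \varphi^{-1}(y),\ y\,\varphi^{-1}(y),\ y</math>, and it tests the derivative only on a finite grid of <math>y</math> values, so it can suggest feasibility but cannot prove it. The function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()


def basis_derivatives(y):
    """d g_i / dy for the assumed basis [1, z, y*z, y], where z = inverse normal CDF of y."""
    z = _STD.inv_cdf(y)
    dz = 1.0 / _STD.pdf(z)  # derivative of the inverse CDF with respect to y
    return [0.0, dz, z + y * dz, 1.0]


def is_feasible(a, grid_size=999):
    """Grid check that sum(a_i * dg_i/dy) > 0 at y = k/(grid_size+1), k = 1..grid_size."""
    for k in range(1, grid_size + 1):
        y = k / (grid_size + 1)
        slope = sum(ai * dgi for ai, dgi in zip(a, basis_derivatives(y)))
        if slope <= 0:
            return False
    return True


def convex_combination(a, b, w):
    """Return w*a + (1-w)*b for a weight w in [0, 1]."""
    return [w * ai + (1 - w) * bi for ai, bi in zip(a, b)]


a = [0.0, 1.0, 0.0, 0.0]   # standard normal quantile function
b = [0.0, 1.0, 0.5, 0.0]   # a skewed variant
c = convex_combination(a, b, 0.3)
```

Because the derivative is linear in the coefficients, any convex combination of two coefficient vectors that pass the check also passes it, as <code>c</code> does here. A vector such as <code>[0.0, 1.0, 0.0, -5.0]</code> fails because its slope is negative near <math>y=0.5</math>.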
 
=== Fitting to data ===
Line 63:
 
=== Shape flexibility ===
A QPD with <math>n</math> terms, where <math>n\ge 2</math>, has <math>n-2</math> shape parameters. Thus, QPDs can be far more flexible than the [[Pearson distribution|Pearson distributions]], which have at most two shape parameters. For example, ten-term [http://www.metalogs.org metalog] distributions parameterized by 105 CDF points from 30 traditional source distributions (including normal, Student's t, lognormal, gamma, beta, and extreme value) have been shown to approximate each such source distribution within a [[Kolmogorov–Smirnov test|K–S]] distance of 0.001 or less<ref>[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), Table 8]]</ref>.
 
=== Transformations ===
QPD transformations are governed by a general property of quantile functions: for any [[quantile function]] <math>x=Q(y)</math> and increasing function <math>t(x)</math>, <math>x=t^{-1} (Q(y))</math> is a [[quantile function]]<ref>Gilchrist, W., 2000. Statistical modelling with quantile functions. CRC Press.</ref>. For example, the [[quantile function]] of the [[normal distribution]], <math>x=\mu+\sigma \varphi^{-1} (y)</math>, is a QPD by the Keelin and Powley definition. The natural logarithm, <math>t(x)=\ln(x-b_l)</math>, is an increasing function, so <math>x=b_l+e^{\mu+\sigma \varphi^{-1} (y)}</math> is the [[quantile function]] of the [[Log-normal distribution|lognormal distribution]] with lower bound <math>b_l</math>. Importantly, this transformation converts an unbounded QPD into a semi-bounded QPD. Similarly, applying this log transformation to the unbounded metalog distribution<ref name="UnboundedMetalog">[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), Section 3, pp. 249–257.]]</ref> yields the semi-bounded (log) metalog distribution<ref name="KeelinSec4">[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), Section 4.]]</ref>; likewise, applying the logit transformation, <math>t(x)=\ln((x-b_l)/(b_u-x))</math>, yields the bounded (logit) metalog distribution<ref name="KeelinSec4" /> with lower and upper bounds <math>b_l</math> and <math>b_u</math>, respectively. Moreover, if <math>t(x)</math> is taken to be distributed according to <math>F^{-1} (y)</math>, where <math>F^{-1} (y)</math> is any QPD that meets Keelin and Powley’s definition, the transformed variable maintains the above properties of feasibility, convexity, and fitting to data. Such transformed QPDs have greater shape flexibility than the underlying <math>F^{-1} (y)</math>, which has <math>n-2</math> shape parameters: the log-transformed QPD has <math>n-1</math> shape parameters, and the logit-transformed QPD has <math>n</math> shape parameters.
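The log and logit transformations above can be sketched by applying <math>t^{-1}</math> to the normal quantile function. This is an illustrative example only. The function names and default parameter values are hypothetical, and the normal quantile function stands in for any QPD.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def normal_quantile(y, mu=0.0, sigma=1.0):
    """Unbounded quantile function x = mu + sigma * (inverse standard normal CDF of y)."""
    return mu + sigma * _STD.inv_cdf(y)


def log_quantile(y, b_l, mu=0.0, sigma=1.0):
    """Inverse of t(x) = ln(x - b_l): semi-bounded, with lower bound b_l."""
    return b_l + math.exp(normal_quantile(y, mu, sigma))


def logit_quantile(y, b_l, b_u, mu=0.0, sigma=1.0):
    """Inverse of t(x) = ln((x - b_l) / (b_u - x)): bounded between b_l and b_u."""
    e = math.exp(normal_quantile(y, mu, sigma))
    return (b_l + b_u * e) / (1.0 + e)
```

With <math>\mu=0</math>, the median of the log-transformed quantile function is <math>b_l+1</math> and the median of the logit-transformed one is the midpoint of <math>b_l</math> and <math>b_u</math>. Both remain increasing in <math>y</math> because <math>t^{-1}</math> is increasing.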
 
=== Moments ===