Quantile-parameterized distribution: Difference between revisions

m Calliopejen1 moved page Draft:Quantile-parameterized distribution to Quantile-parameterized distribution: Publishing accepted Articles for creation submission (AFCH 0.9.1)
Cleaning up accepted Articles for creation submission (AFCH 0.9.1)
Line 1:
{{AFC submission|t||ts=20201109224936|u=Riskanal|ns=118|demo=}}<!-- Important, do not remove this line before article has been created. -->Quantile-parameterized distributions (QPDs) are probability distributions that are directly parameterized by data. They were motivated by the need for easy-to-use continuous probability distributions flexible enough to represent a wide range of uncertainties, such as those commonly encountered in business, engineering, and science. Because QPDs are directly parameterized by data, they have the practical advantage of avoiding the intermediate step of [[Estimation theory|parameter estimation]], a time-consuming process that typically requires non-linear iterative methods to estimate probability-distribution parameters from data. Some QPDs have virtually unlimited shape flexibility and closed-form moments as well.
 
== History ==
Line 60:
 
=== Fitting to Data ===
The coefficients <math>\boldsymbol a</math> can be determined from data by [[linear least squares]]. Given <math>m</math> data points <math>(x_i,y_i)</math> that are intended to characterize the CDF of a QPD, let <math>\boldsymbol Y</math> be the <math>m \times n</math> matrix whose elements are <math>g_j (y_i)</math>, with <math>m\geq n</math>, and let <math>\boldsymbol x=(x_1,\ldots,x_m)^T</math> be the column vector of data values. Then, so long as <math>\boldsymbol Y^T \boldsymbol Y</math> is invertible, the column vector of coefficients is <math>\boldsymbol a=(\boldsymbol Y^T \boldsymbol Y)^{-1} \boldsymbol Y^T \boldsymbol x</math>. If <math>m=n</math>, this equation reduces to <math>\boldsymbol a=\boldsymbol Y^{-1} \boldsymbol x</math>, and the resulting CDF runs through all data points exactly. An alternative method, implemented as a linear program, determines the coefficients by minimizing the sum of absolute distances between the CDF and the data, subject to feasibility constraints.<ref name="Faber">[https://searchworks.stanford.edu/view/13257318 Faber, I.J. (2019). Cyber Risk Management: AI-generated Warnings of Threats (Doctoral dissertation, Stanford University).]</ref>
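The least-squares step can be illustrated with a minimal Python sketch. The basis functions <math>g_j(y)</math> used here (a constant, the logit of <math>y</math>, and <math>(y-0.5)</math> times the logit) are an assumption chosen for illustration, as are the three data points and the helper names <code>basis</code>, <code>solve</code>, <code>fit_qpd</code>, and <code>quantile</code>. The sketch forms the normal equations <math>\boldsymbol Y^T \boldsymbol Y \boldsymbol a = \boldsymbol Y^T \boldsymbol x</math> and solves them by Gaussian elimination rather than inverting the matrix explicitly.

```python
import math

def basis(y):
    """Illustrative basis functions g_j(y); assumed here, not prescribed by the text."""
    logit = math.log(y / (1.0 - y))
    return [1.0, logit, (y - 0.5) * logit]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("Y^T Y is not invertible")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / M[i][i]
    return z

def fit_qpd(xs, ys):
    """Return the coefficients a solving (Y^T Y) a = Y^T x for points (x_i, y_i)."""
    Y = [basis(y) for y in ys]  # m x n matrix with elements g_j(y_i)
    n = len(Y[0])
    YtY = [[sum(row[j] * row[k] for row in Y) for k in range(n)] for j in range(n)]
    Ytx = [sum(row[j] * x for row, x in zip(Y, xs)) for j in range(n)]
    return solve(YtY, Ytx)

def quantile(a, y):
    """Evaluate the fitted quantile function x = sum_j a_j g_j(y)."""
    return sum(aj * gj for aj, gj in zip(a, basis(y)))

# Hypothetical data: three CDF points (x_i, y_i), so m = n = 3.
xs = [17.0, 24.0, 35.0]
ys = [0.1, 0.5, 0.9]
a = fit_qpd(xs, ys)
```

Because <math>m=n</math> in this example, the fitted quantile function reproduces each data point up to rounding error. With <math>m>n</math> the same code returns the least-squares coefficients. The sketch does not check feasibility, that is, whether the fitted quantile function is strictly increasing in <math>y</math>.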
 
=== Shape Flexibility ===
Line 88:
* The semi-bounded and bounded metalog distributions,<ref name="KeelinSec4" /> which are the log and logit transforms, respectively, of the unbounded metalog distribution.
* The SPT (symmetric-percentile triplet) unbounded, semi-bounded, and bounded metalog distributions,<ref name="SPT">[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), pp. 269-271.]]</ref> which are parameterized by three CDF points and optional upper and lower bounds.
* The Simple Q-Normal distribution.<ref>[[doi:10.1287/deca.1110.0213|Keelin, T.W., and Powley, B.W. (2011), pp. 208-210]]</ref>
* The metadistributions, including the meta-normal.<ref>[[doi:10.1287/deca.2016.0338|Keelin, T.W. (2016), p. 253.]]</ref>
* Quantile functions expressed as [[polynomial]] functions of cumulative probability <math>y</math>, including [[Chebyshev polynomial]] functions.
Line 105:
{{reflist}}
 
[[:Category:Continuous distributions]]
 
== Quantile-parameterized distribution ==
 
{{AFC submission|||ts=20201109225152|u=Riskanal|ns=118}}