Content deleted Content added
Link suggestions feature: 2 links added. |
m link to inverse function |
||
(3 intermediate revisions by 3 users not shown) | |||
Line 24:
We are randomly choosing a proportion of the area under the curve and returning the number in the ___domain such that exactly this proportion of the area occurs to the left of that number. Intuitively, we are unlikely to choose a number in the far end of tails because there is very little area in them which would require choosing a number very close to zero or one.
Computationally, this method involves computing the [[quantile function]] of the distribution — in other words, computing the [[cumulative distribution function]] (CDF) of the distribution (which maps a number in the ___domain to a probability between 0 and 1) and then [[inverse function|inverting]] that function. This is the source of the term "inverse" or "inversion" in most of the names for this method. Note that for a [[discrete distribution]], computing the CDF is not in general too difficult: we simply add up the individual probabilities for the various points of the distribution. For a [[continuous distribution]], however, we need to integrate the [[probability density function]] (PDF) of the distribution, which is impossible to do analytically for most distributions (including the [[normal distribution]]). As a result, this method may be computationally inefficient for many distributions and other methods are preferred; however, it is a useful method for building more generally applicable samplers such as those based on [[rejection sampling]].
For the [[normal distribution]], the lack of an [[Closed-form expression|analytical expression]] for the corresponding quantile function means that other methods (e.g. the [[Box–Muller transform]]) may be preferred computationally. It is often the case that, even for simple distributions, the inverse transform sampling method can be improved on:<ref>{{cite book |author=Luc Devroye |url=http://www.eirene.de/Devroye.pdf |title=Non-Uniform Random Variate Generation |publisher=Springer-Verlag |place=New York |year=1986 |access-date=2012-04-12 |archive-date=2014-08-18 |archive-url=https://web.archive.org/web/20140818200854/http://www.eirene.de/Devroye.pdf |url-status=dead }}</ref> see, for example, the [[ziggurat algorithm]] and [[rejection sampling]]. On the other hand, it is possible to approximate the quantile function of the normal distribution extremely accurately using moderate-degree polynomials, and in fact the method of doing this is fast enough that inversion sampling is now the default method for sampling from a normal distribution in the statistical package [[R (programming language)|R]].<ref>{{Cite web|url=https://stat.ethz.ch/R-manual/R-devel/library/base/html/Random.html|title = R: Random Number Generation}}</ref>
==Formal statement==
Line 37:
== Intuition ==
From <math>U \sim \mathrm{Unif}[0,1]</math>, we want to generate <math>X</math> with [[Cumulative distribution function|CDF]] <math>F_X(x).</math> We assume <math>F_X(x)</math> to be a continuous, strictly [[Monotonic function|increasing function]], which provides good intuition.
We want to see if we can find some strictly monotone transformation <math>T:[0,1]\mapsto \mathbb{R}</math>, such that <math>T(U)\overset{d}{=}X</math>. We will have
Line 64:
Expressed differently, given a cumulative distribution function <math>F_X</math> and a uniform variable <math>U\in[0,1]</math>, the random variable <math>X = F_X^{-1}(U)</math> has the distribution <math>F_X</math>.<ref name="mcneil2005" />
In the continuous case, a treatment of such inverse functions as objects satisfying differential equations can be given.<ref>{{cite journal | last1 = Steinbrecher | first1 = György | last2 = Shaw | first2 = William T. | title = Quantile mechanics | journal = European Journal of Applied Mathematics | date = 19 March 2008 | volume = 19 | issue = 2 | pages = 87–112 | doi = 10.1017/S0956792508007341| s2cid = 6899308 }}</ref> Some such differential equations admit explicit [[power series]] solutions, despite their non-linearity.<ref>{{Cite journal |last1=Arridge |first1=Simon |last2=Maass |first2=Peter |last3=Öktem |first3=Ozan |last4=Schönlieb |first4=Carola-Bibiane |title=Solving inverse problems using data-driven models |journal=Acta Numerica |year=2019 |language=en |volume=28 |pages=1–174 |doi=10.1017/S0962492919000059 |s2cid=197480023 |issn=0962-4929|doi-access=free }}</ref>
== Examples ==
Line 94:
==Proof of correctness==
Let <math>F</math> be a [[cumulative distribution function]], and let <math>F^{-1}</math> be its [[cumulative distribution function#Inverse_distribution_function_(quantile_function)|generalized inverse function]] (using the [[infimum]] because CDFs are weakly monotonic and [[Càdlàg|right-continuous]]):<ref>{{cite book |author=Luc Devroye |title=Non-Uniform Random Variate Generation |publisher=Springer-Verlag |place=New York |year=1986 |chapter=Section 2.2. Inversion by numerical solution of ''F''(''X'') = ''U'' |chapter-url=
:<math>F^{-1}(u) = \inf\;\{x \mid F(x)\geq u\} \qquad (0<u<1).</math>
|