Inverse transform sampling: Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
Alter: title, template type. Add: pages, chapter, date, title, url. Changed bare reference to CS1/2. Removed parameters. | Use this bot. Report bugs. | Suggested by Abductive | Category:Monte Carlo methods | #UCB_Category 4/66
 
(6 intermediate revisions by 6 users not shown)
Line 1:
{{Short description|Basic method for pseudo-random number sampling}}
'''Inverse transform sampling''' (also known as '''inversion sampling''', the '''inverse probability integral transform''', the '''inverse transformation method''', or the '''[[Nikolai Smirnov (mathematician)|Smirnov]] transform''', or the '''golden rule'''<ref name=aalto>Aalto University, N. Hyvönen, Computational methods in inverse problems. Twelfth lecture https://noppa.tkk.fi/noppa/kurssi/mat-1.3626/luennot/Mat-1_3626_lecture12.pdf{{dead link|date=November 2017 |bot=InternetArchiveBot |fix-attempted=yes }}</ref>) is a basic method for [[pseudo-random number sampling]], i.e., for generating sample numbers at [[random]] from any [[probability distribution]] given its [[cumulative distribution function]].
 
Inverse transformation sampling takes [[Continuous uniform distribution|uniform samples]] of a number <math>u</math> between 0 and 1, interpreted as a probability, and then returns the smallest number <math>x\in\mathbb R</math> such that <math>F(x)\ge u</math> for the [[cumulative distribution function]] <math>F</math> of a random variable. For example, imagine that <math>F</math> is the standard [[normal distribution]] with mean zero and [[standard deviation]] one. The table below shows samples taken from the uniform distribution and their representation on the standard normal distribution.
 
{| class="wikitable floatright"
Line 24:
We are randomly choosing a proportion of the area under the curve and returning the number in the ___domain such that exactly this proportion of the area occurs to the left of that number. Intuitively, we are unlikely to choose a number in the far end of tails because there is very little area in them which would require choosing a number very close to zero or one.
 
Computationally, this method involves computing the [[quantile function]] of the distribution — in other words, computing the [[cumulative distribution function]] (CDF) of the distribution (which maps a number in the ___domain to a probability between 0 and 1) and then [[inverse function|inverting]] that function. This is the source of the term "inverse" or "inversion" in most of the names for this method. Note that for a [[discrete distribution]], computing the CDF is not in general too difficult: we simply add up the individual probabilities for the various points of the distribution. For a [[continuous distribution]], however, we need to integrate the [[probability density function]] (PDF) of the distribution, which is impossible to do analytically for most distributions (including the [[normal distribution]]). As a result, this method may be computationally inefficient for many distributions and other methods are preferred; however, it is a useful method for building more generally applicable samplers such as those based on [[rejection sampling]].
 
For the [[normal distribution]], the lack of an [[Closed-form expression|analytical expression]] for the corresponding quantile function means that other methods (e.g. the [[Box–Muller transform]]) may be preferred computationally. It is often the case that, even for simple distributions, the inverse transform sampling method can be improved on:<ref>{{cite book |author=Luc Devroye |url=http://www.eirene.de/Devroye.pdf |title=Non-Uniform Random Variate Generation |publisher=Springer-Verlag |place=New York |year=1986 |access-date=2012-04-12 |archive-date=2014-08-18 |archive-url=https://web.archive.org/web/20140818200854/http://www.eirene.de/Devroye.pdf |url-status=dead }}</ref> see, for example, the [[ziggurat algorithm]] and [[rejection sampling]]. On the other hand, it is possible to approximate the quantile function of the normal distribution extremely accurately using moderate-degree polynomials, and in fact the method of doing this is fast enough that inversion sampling is now the default method for sampling from a normal distribution in the statistical package [[R (programming language)|R]].<ref>{{Cite web|url=https://stat.ethz.ch/R-manual/R-devel/library/base/html/Random.html|title = R: Random Number Generation}}</ref>
 
==Formal statement==
Line 32:
For any [[random variable]] <math>X\in\mathbb R</math>, the random variable <math>F_X^{-1}(U)</math> has the same distribution as <math>X</math>, where <math>F_X^{-1}</math> is the [[Cumulative distribution function#Inverse_distribution_function_(quantile_function)|generalized inverse]] of the [[cumulative distribution function]] <math>F_X</math> of <math>X</math> and <math>U</math> is uniform on <math>[0,1]</math>.<ref name="mcneil2005">{{cite book | last1 = McNeil | first1 = Alexander J. | last2 = Frey | first2 = Rüdiger | last3 = Embrechts | first3 = Paul | title = Quantitative risk management | date=2005 | series=Princeton Series in Finance | publisher=Princeton University Press, Princeton, NJ | page=186 | isbn=0-691-12255-5}}</ref>
 
For [[Random_variable#Continuous_random_variable|continuous random variables]], the [[inverse probability]] integral transform is indeed the inverse of the [[probability integral transform]], which states that for a [[continuous random variable]] <math>X</math> with [[cumulative distribution function]] <math>F_X</math>, the random variable <math>U=F_X(X)</math> is [[uniform distribution (continuous)|uniform]] on <math>[0,1]</math>.
 
[[File:InverseFunc.png|thumb|360px|Graph of the inversion technique from <math>x</math> to <math>F(x)</math>. On the bottom right we see the regular function and in the top left its inversion.]]
 
== Intuition ==
From <math>U \sim \mathrm{Unif}[0,1]</math>, we want to generate <math>X</math> with [[Cumulative distribution function|CDF]] <math>F_X(x).</math> We assume <math>F_X(x)</math> to be a continuous, strictly [[Monotonic function|increasing function]], which provides good intuition.
 
We want to see if we can find some strictly monotone transformation <math>T:[0,1]\mapsto \mathbb{R}</math>, such that <math>T(U)\overset{d}{=}X</math>. We will have
Line 64:
Expressed differently, given a cumulative distribution function <math>F_X</math> and a uniform variable <math>U\in[0,1]</math>, the random variable <math>X = F_X^{-1}(U)</math> has the distribution <math>F_X</math>.<ref name="mcneil2005" />
 
In the continuous case, a treatment of such inverse functions as objects satisfying differential equations can be given.<ref>{{cite journal | last1 = Steinbrecher | first1 = György | last2 = Shaw | first2 = William T. | title = Quantile mechanics | journal = European Journal of Applied Mathematics | date = 19 March 2008 | volume = 19 | issue = 2 | pages = 87–112 | doi = 10.1017/S0956792508007341| s2cid = 6899308 }}</ref> Some such differential equations admit explicit [[power series]] solutions, despite their non-linearity.<ref>{{Cite journal |last1=Arridge |first1=Simon |last2=Maass |first2=Peter |last3=Öktem |first3=Ozan |last4=Schönlieb |first4=Carola-Bibiane |title=Solving inverse problems using data-driven models |journal=Acta Numerica |year=2019 |language=en |volume=28 |pages=1–174 |doi=10.1017/S0962492919000059 |s2cid=197480023 |issn=0962-4929|doi-access=free }}</ref>
 
== Examples ==
Line 94:
 
==Proof of correctness==
Let <math>F</math> be a [[cumulative distribution function]], and let <math>F^{-1}</math> be its [[cumulative distribution function#Inverse_distribution_function_(quantile_function)|generalized inverse function]] (using the [[infimum]] because CDFs are weakly monotonic and [[Càdlàg|right-continuous]]):<ref>{{cite book |author=Luc Devroye |title=Non-Uniform Random Variate Generation |publisher=Springer-Verlag |place=New York |year=1986 |chapter=Section 2.2. Inversion by numerical solution of ''F''(''X'')&nbsp;=&nbsp;''U'' |chapter-url=httphttps://luc.devroye.org/chapter_two.pdf}}</ref>
 
:<math>F^{-1}(u) = \inf\;\{x \mid F(x)\geq u\} \qquad (0<u<1).</math>
Line 131:
* [[Copula (statistics)|Copula]], defined by means of probability integral transform.
* [[Quantile function]], for the explicit construction of inverse CDFs.
* [[Cumulative distribution function#InverseInverse_distribution_function_(quantile_function)|Inverse distribution function]] for a precise mathematical definition for distributions with discrete components.
* [[Rejection sampling]] is another common technique to generate random variates that does not rely on inversion of the CDF.