Importance sampling
{{short description|Distribution estimation technique}}
'''Importance sampling''' is a [[Monte Carlo method]] for evaluating properties of a particular [[probability distribution|distribution]], while only having samples generated from a different distribution than the distribution of interest. Its introduction in statistics is generally attributed to a paper by [[Teun Kloek]] and [[Herman K. van Dijk]] in 1978,<ref>{{cite journal |first1=T. |last1=Kloek |first2=H. K. |last2=van Dijk |title=Bayesian Estimates of Equation System Parameters: An Application of Integration by Monte Carlo |journal=[[Econometrica]] |volume=46 |issue=1 |year=1978 |pages=1–19 |doi=10.2307/1913641 |jstor=1913641 |url=https://ageconsearch.umn.edu/record/272139/files/erasmus076.pdf }}</ref> but its precursors can be found in [[Monte Carlo method in statistical physics|statistical physics]] as early as 1949.<ref>{{cite journal |first=G. |last=Goertzel |authorlink=Gerald Goertzel |title=Quota Sampling and Importance Functions in Stochastic Solution of Particle Problems |journal=Technical Report ORNL-434, Oak Ridge National Laboratory |series=Aecd; 2793 |year=1949 |hdl=2027/mdp.39015086443671 }}</ref><ref>{{cite journal |last1=Kahn |first1=H. |authorlink=Herman Kahn |last2=Harris |first2=T. E. |authorlink2=Theodore E. Harris |year=1949 |title=Estimation of Particle Transmission by Random Sampling |journal=Monte Carlo Method |volume=12 |series=Applied Mathematics Series |pages=27–30 |publisher=National Bureau of Standards }}</ref> Importance sampling is also related to [[umbrella sampling]] in [[computational physics]]. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
 
== Basic theory ==
 
Let <math>X\colon \Omega\to \mathbb{R}</math> be a [[random variable]] in some [[probability space]] <math>(\Omega,\mathcal{F},\mathbb{P})</math>. We wish to estimate the [[expected value]] of <math>X</math> under <math>\mathbb{P}</math>, denoted <math>\mathbb{E}_{\mathbb{P}}[X]</math>. If we have statistically independent random samples <math>X_1, \ldots, X_n</math>, generated according to <math>\mathbb{P}</math>, then an empirical estimate of <math>\mathbb{E}_{\mathbb{P}}[X]</math> is just
 
: <math>
\widehat{\mathbb{E}}_{n;\mathbb{P}}[X] = \frac{1}{n} \sum_{i=1}^n X_i \quad \mathrm{where}\; X_i \sim \mathbb{P}(X)
</math>
 
and the precision of this estimate depends on the variance of <math>X</math>:
 
: <math>
\operatorname{var}_{\mathbb{P}}\big[\widehat{\mathbb{E}}_{n;\mathbb{P}}[X]\big] = \frac{\operatorname{var}_{\mathbb{P}}[X]}{n}.
</math>
 
The basic idea of importance sampling is to sample the states from a different distribution to lower the variance of the estimation of <math>\mathbb{E}_{\mathbb{P}}[X]</math>, or when sampling directly from <math>\mathbb{P}</math> is difficult.
 
This is accomplished by first choosing a random variable <math>Y\geq 0</math> such that <math>\mathbb{E}_{\mathbb{P}}[Y] = 1</math> and that <math>\mathbb{P}</math>-[[almost everywhere]] <math>Y(\omega)\neq 0</math>.
With the variable <math>Y</math> we define a probability <math>\mathbb{Q}</math> that satisfies
: <math>
\mathbb{E}_{\mathbb{P}}[X] = \mathbb{E}_{\mathbb{Q}}\left[\frac{X}{Y}\right].
</math>
 
The variable <math>X/Y</math> will thus be sampled under <math>\mathbb{Q}</math> to estimate <math>\mathbb{E}_{\mathbb{P}}[X]</math> as above, and this estimation is improved when
: <math>
\operatorname{var}_{\mathbb{Q}}\left[\frac{X}{Y}\right] < \operatorname{var}_{\mathbb{P}}[X].
</math>
 
When <math>X</math> is of constant sign over <math>\Omega</math>, the best variable <math>Y</math> would clearly be <math>Y^*=\frac{X}{\mathbb{E}_{\mathbb{P}}[X]}\geq 0</math>, so that <math>X/Y^*</math> is the searched constant <math>\mathbb{E}_{\mathbb{P}}[X]</math> and a single sample under <math>\mathbb{Q}^*</math> suffices to give its value. Unfortunately we cannot take that choice, because <math>\mathbb{E}_{\mathbb{P}}[X]</math> is precisely the value we are looking for! However, this theoretical best case <math>Y^*</math> gives us an insight into what importance sampling does: for all <math>x \in \mathbb{R}</math>, the density of <math>\mathbb{Q}^*</math> at <math>X=x</math> can be written as
: <math>\begin{align}
\mathbb{Q}^*\big(X\in[x;x+dx]\big) &= \int_{\omega\in\{X\in[x;x+dx]\}} \frac{X(\omega)}{\mathbb{E}_{\mathbb{P}}[X]} \, d\mathbb{P}(\omega) \\[6pt]
&= \frac{1}{\mathbb{E}_{\mathbb{P}}[X]}\; x\,\mathbb{P}(X\in[x;x+dx]) .
\end{align}</math>
To the right, <math>x\,\mathbb{P}(X\in[x;x+dx])</math> is one of the infinitesimal elements that sum up to <math>\mathbb{E}_{\mathbb{P}}[X]</math>:
 
: <math>\mathbb{E}_{\mathbb{P}}[X] = \int_{x=-\infty}^{+\infty} x\,\mathbb{P}(X\in[x;x+dx]), </math>
therefore, a good probability change <math>\mathbb{Q}</math> in importance sampling will redistribute the law of <math>X</math> so that its samples' frequencies are sorted directly according to their contributions in <math>\mathbb{E}_{\mathbb{P}}[X]</math>, as opposed to <math>\mathbb{E}_{\mathbb{P}}[1]</math>. Hence the name "importance sampling."
 
Importance sampling is often used as a [[Monte Carlo integration|Monte Carlo integrator]].
When <math>\mathbb{P}</math> is the uniform distribution over an interval <math>\Omega \subset \mathbb{R}</math>, the expectation <math>\mathbb{E}_{\mathbb{P}}[X]</math> is proportional to the integral of the real function <math>X\colon \Omega\to\mathbb{R}</math>.
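
As a numerical illustration of this change of measure, the following is a minimal sketch that estimates <math>\mathbb{E}_{\mathbb{P}}[X]</math> for <math>X = 1_{\{Z > 2\}}</math>, with <math>Z</math> standard normal under <math>\mathbb{P}</math> and a shifted normal law playing the role of <math>\mathbb{Q}</math>; the particular choice of <math>\mathbb{Q}</math> and all names in the code are illustrative only.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 100_000

# Quantity of interest: E_P[X] with X = 1{Z > 2}, Z standard normal under P.
z_p = rng.standard_normal(n)
naive = np.mean(z_p > 2.0)                   # plain Monte Carlo under P

# Change of measure: sample Z under Q = N(2, 1) and multiply by the
# likelihood ratio dP/dQ evaluated at each sample (i.e. divide by Y = dQ/dP).
z_q = rng.normal(loc=2.0, size=n)
lr = norm.pdf(z_q) / norm.pdf(z_q, loc=2.0)  # dP/dQ
is_est = np.mean((z_q > 2.0) * lr)

print(naive, is_est, norm.sf(2.0))           # both approximate P(Z > 2) ≈ 0.02275
</syntaxhighlight>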
 
== Application to probabilistic inference ==
 
Such methods are frequently used to estimate posterior densities or expectations in state and/or parameter estimation problems in probabilistic models that are too hard to treat analytically. Examples include [[Bayesian network]]s and importance weighted [[variational autoencoder]]s.<ref>{{Cite journal |last1=Burda |first1=Yuri |last2=Grosse |first2=Roger |last3=Salakhutdinov |first3=Ruslan |title=Importance Weighted Autoencoders |journal=Proceedings of the 4th International Conference on Learning Representations (ICLR) |arxiv=1509.00519 |publication-date=2016 }}</ref>
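
For illustration, posterior expectations can be estimated by self-normalized importance sampling when only an unnormalized posterior density is available. The following sketch assumes a toy conjugate normal model; the model, proposal, and all variable names are illustrative only.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Unnormalized posterior for one observation y ~ N(theta, 1) with prior theta ~ N(0, 1).
y = 1.5
def unnormalized_posterior(theta):
    return norm.pdf(theta) * norm.pdf(y, loc=theta)

# Self-normalized importance sampling estimate of the posterior mean E[theta | y],
# using a deliberately broad normal proposal.
theta = rng.normal(loc=0.0, scale=3.0, size=50_000)
w = unnormalized_posterior(theta) / norm.pdf(theta, loc=0.0, scale=3.0)
posterior_mean = np.sum(w * theta) / np.sum(w)

print(posterior_mean)  # the exact posterior is N(y/2, 1/2), so this should be near 0.75
</syntaxhighlight>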
 
== Application to simulation ==
'''Importance sampling''' is a [[variance reduction]] technique that can be used in the [[Monte Carlo method]]. The idea behind importance sampling is that certain values of the input [[random variables]] in a [[simulation]] have more impact on the parameter being estimated than others. If these "important" values are emphasized by sampling more frequently, then the [[estimator]] variance can be reduced. Hence, the basic methodology in importance sampling is to choose a distribution which "encourages" the important values. This use of "biased" distributions will result in a biased estimator if it is applied directly in the simulation. However, the simulation outputs are weighted to correct for the use of the biased distribution, and this ensures that the new importance sampling estimator is unbiased. The weight is given by the [[Likelihood-ratio test|likelihood ratio]], that is, the [[Radon–Nikodym derivative]] of the true underlying distribution with respect to the biased simulation distribution.
 
The fundamental issue in implementing importance sampling simulation is the choice of the biased distribution which encourages the important regions of the input variables. Choosing or designing a good biased distribution is the "art" of importance sampling. The rewards for a good distribution can be huge run-time savings; the penalty for a bad distribution can be longer run times than for a general Monte Carlo simulation without importance sampling.
=== Mathematical approach ===
 
Consider estimating by simulation the probability <math>p_t\,</math> of an event <math>X \ge t</math>, where <math>X</math> is a random variable with [[cumulative distribution function]] <math>F(x)</math> and [[probability density function]] <math>f(x)= F'(x)\,</math>, where prime denotes [[derivative]]. A <math>K</math>-length [[independent and identically distributed]] (i.i.d.) sequence <math>X_i\,</math> is generated from the distribution <math>F</math>, and the number <math>k_t</math> of random variables that lie above the threshold <math>t</math> is counted. The random variable <math>k_t</math> is characterized by the [[binomial distribution]]
 
:<math>P(k_t = k)={K\choose k}p_t^k(1-p_t)^{K-k},\,\quad \quad k=0,1,\dots,K.</math>
 
One can show that <math>\mathbb{E} [k_t/K] = p_t</math>, and <math>\operatorname{var} [k_t/K] = p_t(1-p_t)/K</math>, so in the limit <math>K \to \infty</math> we are able to obtain <math>p_t</math>. Note that the variance is low if <math>p_t \approx 1</math>. Importance sampling is concerned with the determination and use of an alternate density function <math>f_*\,</math> (for <math>X</math>), usually referred to as a biasing density, for the simulation experiment. This density allows the event <math>\{X \ge t\}</math> to occur more frequently, so the sequence lengths <math>K</math> get smaller for a given [[estimator]] variance. Alternatively, for a given <math>K</math>, use of the biasing density results in a variance smaller than that of the conventional Monte Carlo estimate. From the definition of <math>p_t\,</math>, we can introduce <math>f_*\,</math> as below.
 
:<math>
\begin{align}
p_t & = \mathbb{E} [1_{\{X \ge t\}}] \\[6pt]
& = \int 1_{\{x \ge t\}} \frac{f(x)}{f_*(x)} f_*(x) \,dx \\[6pt]
& = \mathbb{E}_* [1_{\{X \ge t\}} W(X)]
\end{align}
</math>
where

:<math> W(x) \equiv \frac{f(x)}{f_*(x)} </math>
is a likelihood ratio and is referred to as the weighting function. The last equality in the above equation motivates the estimator
 
:<math> \hat p_t = \frac{1}{K}\,\sum_{i=1}^K 1_{\{X_i \ge t\}} W(X_i),\qquad X_i \sim f_*</math>
 
This is the importance sampling estimator of <math>p_t\,</math> and is unbiased. That is, the estimation procedure is to generate i.i.d. samples from <math>f_*\,</math> and for each sample which exceeds <math>t\,</math>, the estimate is incremented by the weight <math>W\,</math> evaluated at the sample value. The results are averaged over <math>K\,</math> trials. The variance of the importance sampling estimator is easily shown to be
:<math>
\begin{align}
\operatorname{var}_*\widehat p_t & = \frac{1}{K}\operatorname{var}_* [1_{\{X \ge t\}}W(X)] \\[5pt]
& = \frac{1}{K}\left\{\mathbb{E}_*[1_{\{X \ge t\}}^2 W^2(X)] - p_t^2\right\} \\[5pt]
& = \frac{1}{K}\left\{\mathbb{E}[1_{\{X \ge t\}} W(X)] - p_t^2\right\}
\end{align}
</math>
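
As a numerical illustration of the estimator and its variance, the following sketch assumes a standard normal density <math>f</math>, threshold <math>t=4</math>, and, purely for the example, a biasing density <math>f_*</math> equal to <math>f</math> shifted to be centered at <math>t</math>:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
K, t = 100_000, 4.0

# Biasing density f_* = N(t, 1): the event {X >= t} now occurs about half the time.
x = rng.normal(loc=t, size=K)
w = norm.pdf(x) / norm.pdf(x, loc=t)   # W(x) = f(x) / f_*(x)
s = (x >= t) * w                       # 1{X_i >= t} W(X_i)

p_hat = s.mean()                       # importance sampling estimate of p_t
p = norm.sf(t)                         # exact tail probability, about 3.17e-5
print(p_hat, p)

print(s.var(ddof=1) / K)               # empirical variance of the IS estimator
print(p * (1 - p) / K)                 # variance of naive counting: far larger here
</syntaxhighlight>

With this (illustrative) choice of biasing density, the weighted estimator reaches a variance several orders of magnitude below that of naive counting.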
=== Evaluation of importance sampling ===
 
In order to identify successful importance sampling techniques, it is useful to be able to quantify the run-time savings due to the use of the importance sampling approach. The performance measure commonly used is <math>\sigma^2_{MC} / \sigma^2_{IS} \,</math>, and this can be interpreted as the speed-up factor by which the importance sampling estimator achieves the same precision as the MC estimator. This usually has to be computed empirically, since the estimator variances are rarely available in closed form when the mean itself is intractable. Other useful concepts in quantifying an importance sampling estimator are the variance bounds and the notion of asymptotic efficiency. One related measure is the so-called '''effective sample size (ESS)'''.<ref>{{Cite journal |last1=Martino |first1=Luca |last2=Elvira |first2=Víctor |last3=Louzada |first3=Francisco |title=Effective sample size for importance sampling based on discrepancy measures |journal=Signal Processing |volume=131 |pages=386–401 |doi=10.1016/j.sigpro.2016.08.025 |arxiv=1602.03572 |year=2017 |bibcode=2017SigPr.131..386M |s2cid=26317735 }}</ref>
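
One widely used rule of thumb (one of several ESS variants discussed in the literature) computes the ESS from the importance weights as <math>\left(\sum_i w_i\right)^2 / \sum_i w_i^2</math>. A short illustrative sketch:

<syntaxhighlight lang="python">
import numpy as np

def effective_sample_size(weights):
    """Rule-of-thumb ESS: (sum w)^2 / sum(w^2), between 1 and len(weights)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

print(effective_sample_size(np.ones(1000)))          # uniform weights: ESS = 1000
print(effective_sample_size([1.0] + [1e-12] * 999))  # one dominant weight: ESS ≈ 1
</syntaxhighlight>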
 
=== Variance cost function ===
 
=== Multiple and adaptive importance sampling ===
When different proposal distributions, <math>g_i(x)</math>, <math>i=1,\ldots,n,</math> are jointly used for drawing the samples <math>x_1, \ldots, x_n,</math> different proper weighting functions can be employed (e.g., see<ref>{{Cite book |first1=Eric |last1=Veach |first2=Leonidas J. |last2=Guibas |title=Proceedings of the 22nd annual conference on Computer graphics and interactive techniques - SIGGRAPH '95 |chapter=Optimally combining sampling techniques for Monte Carlo rendering |publisher=ACM |date=1995-01-01 |___location=New York, NY, USA |isbn=978-0-89791-701-8 |pages=[https://archive.org/details/computergraphics00sigg/page/419 419–428] |series=SIGGRAPH '95 |doi=10.1145/218380.218498 |citeseerx=10.1.1.127.8105 |s2cid=207194026 |chapter-url=https://archive.org/details/computergraphics00sigg/page/419 }}</ref><ref>{{Cite journal |title=Safe and Effective Importance Sampling |journal=Journal of the American Statistical Association |date=2000-03-01 |issn=0162-1459 |pages=135–143 |volume=95 |issue=449 |doi=10.1080/01621459.2000.10473909 |first1=Art |last1=Owen |first2=Yi |last2=Zhou |citeseerx=10.1.1.36.4536 |s2cid=119761472 }}</ref><ref>{{Cite journal |title=Efficient Multiple Importance Sampling Estimators |journal=IEEE Signal Processing Letters |date=2015-10-01 |issn=1070-9908 |pages=1757–1761 |volume=22 |issue=10 |doi=10.1109/LSP.2015.2432078 |first1=V. |last1=Elvira |first2=L. |last2=Martino |first3=D. |last3=Luengo |first4=M.F. |last4=Bugallo |arxiv=1505.05391 |bibcode=2015ISPL...22.1757E |s2cid=14504598 }}</ref><ref>{{Cite journal |last1=Elvira |first1=Víctor |last2=Martino |first2=Luca |last3=Luengo |first3=David |last4=Bugallo |first4=Mónica F. |title=Improving population Monte Carlo: Alternative weighting and resampling schemes |journal=Signal Processing |volume=131 |pages=77–91 |doi=10.1016/j.sigpro.2016.07.012 |arxiv=1607.02758 |year=2017 |bibcode=2017SigPr.131...77E |s2cid=205171823 }}</ref>). In an adaptive setting, the proposal distributions, <math>g_{i,t}(x)</math>, <math>i=1,\ldots,n,</math> and <math>t=1,\ldots,T,</math> are updated at each iteration <math>t</math> of the adaptive importance sampling algorithm. Hence, since a population of proposal densities is used, several suitable combinations of sampling and weighting schemes can be employed.<ref>{{Cite journal |title=Population Monte Carlo |journal=Journal of Computational and Graphical Statistics |date=2004-12-01 |issn=1061-8600 |pages=907–929 |volume=13 |issue=4 |doi=10.1198/106186004X12803 |first1=O. |last1=Cappé |first2=A. |last2=Guillin |first3=J. M. |last3=Marin |first4=C. P. |last4=Robert |s2cid=119690181 }}</ref><ref>{{Cite journal |last1=Martino |first1=L. |last2=Elvira |first2=V. |last3=Luengo |first3=D. |last4=Corander |first4=J. |date=2017-05-01 |title=Layered adaptive importance sampling |journal=Statistics and Computing |language=en |volume=27 |issue=3 |pages=599–623 |doi=10.1007/s11222-016-9642-5 |issn=0960-3174 |arxiv=1505.04732 |s2cid=2508031 }}</ref><ref>{{Cite journal |title=Adaptive importance sampling in general mixture classes |journal=Statistics and Computing |date=2008-04-25 |issn=0960-3174 |pages=447–459 |volume=18 |issue=4 |doi=10.1007/s11222-008-9059-x |first1=Olivier |last1=Cappé |first2=Randal |last2=Douc |first3=Arnaud |last3=Guillin |first4=Jean-Michel |last4=Marin |first5=Christian P. |last5=Robert |arxiv=0710.4242 |s2cid=483916 }}</ref><ref>{{Cite journal |title=Adaptive Multiple Importance Sampling |journal=Scandinavian Journal of Statistics |date=2012-12-01 |issn=1467-9469 |pages=798–812 |volume=39 |issue=4 |doi=10.1111/j.1467-9469.2011.00756.x |first1=Jean-Marie |last1=Cornuet |first2=Jean-Michel |last2=Marin |first3=Antonietta |last3=Mira |author3-link=Antonietta Mira |first4=Christian P. |last4=Robert |arxiv=0907.1254 |s2cid=17191248 }}</ref><ref>{{Cite journal |title=An Adaptive Population Importance Sampler: Learning From Uncertainty |journal=IEEE Transactions on Signal Processing |date=2015-08-01 |issn=1053-587X |pages=4422–4437 |volume=63 |issue=16 |doi=10.1109/TSP.2015.2440215 |first1=L. |last1=Martino |first2=V. |last2=Elvira |first3=D. |last3=Luengo |first4=J. |last4=Corander |bibcode=2015ITSP...63.4422M |citeseerx=10.1.1.464.9395 |s2cid=17017431 }}</ref><ref>{{Cite journal |title=Adaptive importance sampling in signal processing |journal=Digital Signal Processing |date=2015-12-01 |pages=36–49 |volume=47 |series=Special Issue in Honour of William J. (Bill) Fitzgerald |doi=10.1016/j.dsp.2015.05.014 |first1=Mónica F. |last1=Bugallo |first2=Luca |last2=Martino |first3=Jukka |last3=Corander |bibcode=2015DSP....47...36B |doi-access=free }}</ref><ref>{{Cite journal |last1=Bugallo |first1=M. F. |last2=Elvira |first2=V. |last3=Martino |first3=L. |last4=Luengo |first4=D. |last5=Miguez |first5=J. |last6=Djuric |first6=P. M. |date=July 2017 |title=Adaptive Importance Sampling: The past, the present, and the future |journal=IEEE Signal Processing Magazine |volume=34 |issue=4 |pages=60–79 |doi=10.1109/msp.2017.2699226 |issn=1053-5888 |bibcode=2017ISPM...34...60B |s2cid=5619054 }}</ref>
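
As an illustration of one such proper weighting scheme, the following sketch applies deterministic-mixture weights (each sample is weighted by the target density over the equal-weight mixture of all proposals) with two arbitrarily chosen normal proposals, to estimate the mean of a standard normal target; the proposals and sample sizes are illustrative assumptions, not prescriptions.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
proposals = [(-1.0, 2.0), (1.0, 2.0)]   # (mean, std) of each proposal g_i
m = 10_000                              # samples drawn per proposal

total = 0.0
for loc, scale in proposals:
    x = rng.normal(loc, scale, size=m)
    # Deterministic-mixture weight: target density over the mixture of proposals.
    mixture = np.mean([norm.pdf(x, l, s) for l, s in proposals], axis=0)
    total += np.sum(norm.pdf(x) / mixture * x)

estimate = total / (m * len(proposals))
print(estimate)  # unbiased for the target mean, here 0.0
</syntaxhighlight>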
 
== See also ==
 
== References ==
*{{cite journal |last=Arouna |first=Bouhari |title=Adaptative Monte Carlo Method, A Variance Reduction Technique |journal=Monte Carlo Methods and Their Applications |year=2004 |volume=10 |issue=1 |pages=1–24 |doi=10.1515/156939604323091180 |s2cid=21949573 }}
*{{cite book |title=Introduction to Rare Event Simulation |first=James Antonio |last=Bucklew |publisher=Springer-Verlag |___location=New York |year=2004 }}
*{{cite book |title=Sequential Monte Carlo Methods in Practice |first1=A. |last1=Doucet |first2=N. |last2=de Freitas |first3=N. |last3=Gordon |publisher=Springer |year=2001 |isbn=978-0-387-95146-1 }}
*{{cite book |first1=M. |last1=Ferrari |first2=S. |last2=Bellini |title=ICC 2001. IEEE International Conference on Communications. Conference Record (Cat. No.01CH37240) |chapter=Importance sampling simulation of turbo product codes |volume=9 |pages=2773–2777 |year=2001 |doi=10.1109/ICC.2001.936655 |isbn=978-0-7803-7097-5 |s2cid=5158473 }}
*{{cite journal |first=Oleg |last=Mazonka |title=Easy as Pi: The Importance Sampling Method |journal=Journal of Reference |volume=16 |year=2016 |url=https://www.researchgate.net/publication/303859967 }}
*{{cite book |first=Tommy |last=Oberg |title=Modulation, Detection, and Coding |publisher=John Wiley & Sons |___location=New York |year=2001 }}
*{{Cite book | last1=Press | first1=WH | last2=Teukolsky | first2=SA | last3=Vetterling | first3=WT | last4=Flannery | first4=BP | year=2007 | title=Numerical Recipes: The Art of Scientific Computing | edition=3rd | publisher=Cambridge University Press | ___location=New York | isbn=978-0-521-88068-8 | chapter=Section 7.9.1 Importance Sampling | chapter-url=http://apps.nrbook.com/empanel/index.html#pg=411 | access-date=2011-08-12 | archive-date=2011-08-11 | archive-url=https://web.archive.org/web/20110811154417/http://apps.nrbook.com/empanel/index.html#pg=411 | url-status=dead }}
*{{cite book |first=B. D. |last=Ripley |title=Stochastic Simulation |year=1987 |publisher=Wiley & Sons }}
*{{cite journal |first1=P. J. |last1=Smith |first2=M. |last2=Shafi |first3=H. |last3=Gao |title=Quick simulation: A review of importance sampling techniques in communication systems |journal=IEEE Journal on Selected Areas in Communications |volume=15 |issue=4 |pages=597–613 |year=1997 |doi=10.1109/49.585771 }}
*{{cite book |first=R. |last=Srinivasan |title=Importance sampling – Applications in communications and detection |publisher=Springer-Verlag |___location=Berlin |year=2002 }}
 
== External links ==
* [http://www-sigproc.eng.cam.ac.uk/smc/ Sequential Monte Carlo Methods (Particle Filtering)] homepage on University of Cambridge
* [http://www.iop.org/EJ/abstract/0143-0807/22/4/315 Introduction to importance sampling in rare-event simulations] European Journal of Physics. PDF document.
* [http://portal.acm.org/citation.cfm?id=1030470 Adaptive Monte Carlo methods for rare event simulations] Winter Simulation Conference
 
[[Category:Monte Carlo methods]]