Metropolis–Hastings algorithm

In [[statistics]] and [[statistical physics]], the '''Metropolis–Hastings algorithm''' is a [[Markov chain Monte Carlo]] (MCMC) method for obtaining a sequence of [[pseudo-random number sampling|random samples]] from a [[probability distribution]] from which direct sampling is difficult. This sequence can be used to approximate the distribution (e.g. to generate a [[histogram]]) or to [[Monte Carlo integration|compute an integral]] (e.g. an [[expected value]]). Metropolis–Hastings and other MCMC algorithms are generally used for sampling from multi-dimensional distributions, especially when the number of dimensions is high. For single-dimensional distributions, there are usually other methods (e.g. [[adaptive rejection sampling]]) that can directly return independent samples from the distribution, and these are free from the problem of [[autocorrelation|autocorrelated]] samples that is inherent in MCMC methods.
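As an illustration, the following is a minimal sketch of a random-walk Metropolis sampler in Python (the symmetric-proposal special case of the algorithm; the standard normal target, proposal scale, and chain length are arbitrary choices made for this example):

<syntaxhighlight lang="python">
import numpy as np

def metropolis(log_target, x0, sigma, n_samples, rng=None):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal.

    log_target: log of the (possibly unnormalized) target density
    x0: initial state of the chain
    sigma: standard deviation of the Gaussian proposal step
    """
    rng = rng or np.random.default_rng()
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = x + sigma * rng.standard_normal()
        # Accept with probability min(1, P(proposal)/P(x)),
        # evaluated in log space for numerical stability.
        if np.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x  # on rejection the chain stays at x
    return samples

# Example: draw 10,000 (autocorrelated) samples from a standard normal.
samples = metropolis(lambda x: -0.5 * x**2, x0=0.0, sigma=1.0, n_samples=10_000)
</syntaxhighlight>

Because the Gaussian proposal here is symmetric, the Hastings correction factor cancels and the acceptance probability reduces to the original Metropolis ratio.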
 
==History==
The algorithm was named after [[Nicholas Metropolis]], who authored the 1953 article ''[[Equation of State Calculations by Fast Computing Machines]]'' together with [[Arianna W. Rosenbluth]], [[Marshall Rosenbluth]], [[Augusta H. Teller]] and [[Edward Teller]]. This article proposed the algorithm for the case of symmetrical proposal distributions, and [[W. K. Hastings]] extended it to the more general case in 1970.<ref name=Hastings/>
 
The desired acceptance rate depends on the target distribution; however, it has been shown theoretically that the ideal acceptance rate for a one-dimensional Gaussian distribution is about 50%, decreasing to about 23% for an <math>N</math>-dimensional Gaussian target distribution.<ref name=Roberts/> These guidelines can work well when sampling from sufficiently regular Bayesian posteriors, which often approximate a multivariate normal distribution, as can be established using the [[Bernstein–von Mises theorem]].<ref>{{Cite journal |last=Schmon |first=Sebastian M. |last2=Gagnon |first2=Philippe |date=2022-04-15 |title=Optimal scaling of random walk Metropolis algorithms using Bayesian large-sample asymptotics |journal=Statistics and Computing |language=en |volume=32 |issue=2 |pages=28 |doi=10.1007/s11222-022-10080-8 |issn=0960-3174 |pmc=8924149 |pmid=35310543}}</ref>
 
If <math>\sigma^2</math> is too small, the chain will ''mix slowly'' (i.e., the acceptance rate will be high, but successive samples will move around the space slowly, and the chain will converge only slowly to <math>P(x)</math>). On the other hand, if <math>\sigma^2</math> is too large, the acceptance rate will be very low because the proposals are likely to land in regions of much lower probability density, so <math>a_1</math> will be very small, and again the chain will converge very slowly. One typically tunes the proposal distribution so that the algorithm accepts on the order of 30% of all samples – in line with the theoretical estimates mentioned in the previous paragraph.
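The following is a minimal sketch of such a tuning loop in Python, reusing the <code>metropolis</code> sampler sketched above (the 30% target, the pilot-chain length, and the multiplicative adjustment rule are all illustrative choices, not part of the algorithm itself):

<syntaxhighlight lang="python">
import numpy as np

def tune_sigma(log_target, x0, sigma=1.0, target_rate=0.3, rounds=20, n=1_000):
    """Crudely adapt the proposal scale toward a target acceptance rate.

    Runs short pilot chains, then scales sigma up when acceptance is too
    high (steps too timid) and down when it is too low (steps too bold).
    """
    rng = np.random.default_rng()
    for _ in range(rounds):
        samples = metropolis(log_target, x0, sigma, n, rng)
        # A rejected proposal leaves the chain in place, so the acceptance
        # rate can be estimated as the fraction of moves that changed x.
        rate = np.mean(np.diff(samples) != 0.0)
        sigma *= np.exp(rate - target_rate)  # multiplicative correction
        x0 = samples[-1]  # continue the next pilot run from the last state
    return sigma

# Example: tune the proposal scale for a standard normal target.
sigma = tune_sigma(lambda x: -0.5 * x**2, x0=0.0)
</syntaxhighlight>

Adaptation of this kind is normally confined to a burn-in phase; continuing to adapt the proposal while collecting the samples that are actually kept can break the stationarity guarantees of the chain.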