'''Metropolis algorithm (symmetric proposal distribution)'''
Let <math>f(x)</math> be a function that is proportional to the desired probability density function <math>P(x)</math> (a.k.a. a target distribution).
# Initialization: Choose an arbitrary point <math>x_0</math> to be the first sample, and choose an arbitrary probability density <math>g(x\mid y)</math> (sometimes written <math>Q(x\mid y)</math>) that suggests a candidate for the next sample value <math>x</math>, given the previous sample value <math>y</math>. In this section, <math>g</math> is assumed to be symmetric; in other words, it must satisfy <math>g(x\mid y) = g(y\mid x)</math>. A usual choice is to let <math>g(x\mid y)</math> be a [[Gaussian distribution]] centered at <math>y</math>, so that points closer to <math>y</math> are more likely to be visited next, making the sequence of samples into a [[random walk]] {{efn|In the original paper, however, <math>g(x\mid y)</math> was actually the [[Boltzmann distribution]], as the algorithm was applied to physical systems in the context of [[statistical mechanics]].}}. The function <math>g</math> is referred to as the ''proposal density'' or ''jumping distribution''.
# For each iteration ''t'':
#* ''Generate'' a candidate <math>x'</math> for the next sample by picking from the distribution <math>g(x'\mid x_t)</math>.
#* ''Calculate'' the acceptance ratio <math>\alpha = f(x')/f(x_t)</math>, which will be used to decide whether to accept or reject the candidate. Because <math>f</math> is proportional to the density of <math>P(x)</math>, we have that <math>\alpha = f(x')/f(x_t) = P(x')/P(x_t)</math>.
#* ''Accept or reject'':
#** Generate a uniform random number <math>u \in [0, 1]</math>.
#** If <math>u \le \alpha</math>, then ''accept'' the candidate by setting <math>x_{t+1} = x'</math>.
#** If <math>u > \alpha</math>, then ''reject'' the candidate and set <math>x_{t+1} = x_t</math> instead.
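The procedure above can be expressed in a few lines of code. The following is a minimal sketch in Python; the function name <code>metropolis_sample</code>, the Gaussian proposal width <code>step</code>, and the example target (an unnormalized standard normal) are illustrative choices, not part of the algorithm itself.

<syntaxhighlight lang="python">
import math
import random

def metropolis_sample(f, x0, n_samples, step=1.0):
    """Draw n_samples with the Metropolis algorithm, using the
    symmetric Gaussian proposal g(x' | x) = Normal(x, step**2)."""
    samples = []
    x = x0
    for _ in range(n_samples):
        # Generate a candidate from the symmetric proposal density.
        x_new = random.gauss(x, step)
        # Acceptance ratio alpha = f(x') / f(x_t); f need only be
        # proportional to the target density P(x).
        alpha = f(x_new) / f(x)
        # Accept with probability min(1, alpha): draw u ~ U[0, 1]
        # and accept the candidate when u <= alpha.
        if random.random() <= alpha:
            x = x_new
        # On rejection, the current point is recorded again.
        samples.append(x)
    return samples

# Example: sample from an unnormalized standard normal density.
draws = metropolis_sample(lambda x: math.exp(-x * x / 2),
                          x0=0.0, n_samples=10_000)
</syntaxhighlight>

Note that a rejected candidate still contributes a sample (the current point is repeated); this repetition is essential for the chain to have the correct stationary distribution.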
This algorithm proceeds by randomly attempting to move about the sample space, sometimes accepting the moves and sometimes remaining in place. Note that the acceptance ratio <math>\alpha</math> indicates how probable the new proposed sample is with respect to the current sample, according to the distribution whose density is <math>P(x)</math>. If we attempt to move to a point that is more probable than the existing point (i.e. a point in a higher-density region of <math>P(x)</math> corresponding to an <math>\alpha > 1 \ge u</math>), we will always accept the move. However, if we attempt to move to a less probable point, we will sometimes reject the move, and the larger the relative drop in probability, the more likely we are to reject the new point. Thus, we will tend to stay in (and return large numbers of samples from) high-density regions of <math>P(x)</math>, while only occasionally visiting low-density regions. Intuitively, this is why this algorithm works and returns samples that follow the desired distribution with density <math>P(x)</math>.
Compared with an algorithm like [[adaptive rejection sampling]]<ref name=":0">{{Cite journal |title = Adaptive Rejection Sampling for Gibbs Sampling |jstor = 2347565 |journal = Journal of the Royal Statistical Society. Series C (Applied Statistics) |date = 1992-01-01 |pages = 337–348 |volume = 41 |issue = 2 |doi = 10.2307/2347565 |first1 = W. R. |last1 = Gilks |first2 = P. |last2 = Wild}}</ref> that directly generates independent samples from a distribution, Metropolis–Hastings and other MCMC algorithms have a number of disadvantages:
* The samples are [[autocorrelation|autocorrelated]]: even though over the long term they correctly follow <math>P(x)</math>, nearby samples are correlated with each other, so the effective sample size can be significantly lower than the number of samples actually drawn.
* Although the Markov chain eventually converges to the desired distribution, the initial samples may follow a very different distribution, especially if the starting point lies in a region of low density. As a result, a ''burn-in'' period is typically necessary, during which an initial number of samples are discarded.