Metropolis–Hastings algorithm

This is an old revision of this page, as edited by Cburnett (talk | contribs) at 18:18, 25 March 2005 (How was this not in Category:Algorithms; copyeditting; changed pdf to lowercase; etc.).

In mathematics and physics, the Metropolis–Hastings algorithm is a method for generating a sequence of samples from the joint distribution of two or more variables. The purpose of such a sequence is to approximate the joint distribution (as with a histogram), or to compute an integral (such as an expected value). The algorithm is an example of a Markov chain Monte Carlo method. It is a generalization, suggested by Hastings (citation below), of the Metropolis algorithm. The Gibbs sampling algorithm is a special case of the Metropolis–Hastings algorithm.

[Figure: The sampling (proposal) distribution determines the next point to move to in the random walk.]

The Metropolis–Hastings algorithm can draw samples from any probability distribution P(x), requiring only that the density can be calculated at x. The algorithm generates a set of states x^t which is a Markov chain because each state x^t depends only on the previous state x^(t-1). The algorithm depends on the creation of a proposal density Q(x'; x^t), which depends on the current state x^t and which can generate a new proposed sample x'. For example, the proposal density could be a Gaussian function centred on the current state x^t,

    Q(x'; x^t) ~ N(x^t, σ²I),

reading Q(x'; x^t) as the probability of generating x' given the previous value x^t.

This proposal density would generate samples centred around the current state x^t with variance σ²I. So we draw a new proposal state x' with probability Q(x'; x^t) and then calculate a value

    a = a1 a2

where

    a1 = P(x') / P(x^t)

is the likelihood ratio between the proposed sample x' and the previous sample x^t, and

    a2 = Q(x^t; x') / Q(x'; x^t)

is the ratio of the proposal density in two directions (from x^t to x' and vice versa). This is equal to 1 if the proposal density is symmetric. Then the new state x^(t+1) is chosen with the rule

    if a ≥ 1:    x^(t+1) = x'
    if a < 1:    x^(t+1) = x' with probability a, and x^(t+1) = x^t otherwise.

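The sampling procedure above can be sketched in Python with a symmetric Gaussian proposal, for which a2 = 1 and a reduces to the likelihood ratio a1. The target density and parameter values here are illustrative assumptions, not part of the original article:

```python
import math
import random

def metropolis_hastings(p, x0, sigma, n_samples, seed=0):
    """Draw n_samples states from the (possibly unnormalised) density p,
    using a symmetric Gaussian proposal Q(x'; x^t) = N(x^t, sigma^2)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        x_new = rng.gauss(x, sigma)       # draw x' from the proposal density
        a = p(x_new) / p(x)               # a = a1; a2 = 1 (symmetric proposal)
        if a >= 1 or rng.random() < a:    # accept with probability min(1, a)
            x = x_new
        samples.append(x)                 # on rejection, the old state repeats

    return samples

# Illustrative target: an unnormalised standard normal density.
target = lambda x: math.exp(-0.5 * x * x)
chain = metropolis_hastings(target, x0=0.0, sigma=1.0, n_samples=20000)
kept = chain[2000:]   # discard the first samples as burn-in
```

Note that rejected proposals still produce a sample (the repeated old state); dropping them instead would bias the chain away from high-density regions.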
The Markov chain is started from a random initial value x^0 and the algorithm is run for a few thousand iterations so that this initial state is "forgotten". These early samples, which are discarded, are known as burn-in. The algorithm works best if the proposal density matches the shape of the target distribution P(x), that is Q(x'; x^t) ≈ P(x'), but in most cases this is unknown. If a Gaussian proposal density is used, the variance parameter σ² has to be tuned during the burn-in period. This is usually done by calculating the acceptance rate, which is the fraction of proposed samples that was accepted in a window of the last N samples. The acceptance rate is usually set to be around 60%. If the proposal steps are too small, the chain will mix slowly (i.e., it will move around the space slowly and converge slowly to P(x)). If the proposal steps are too large, the acceptance rate will be very low because the proposals are likely to land in regions of much lower probability density, so a1 will be very small.
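The tuning described above can be sketched as a burn-in loop that tracks the acceptance rate over a sliding window and nudges σ toward the ~60% target. This is a minimal illustrative sketch, assuming a standard-normal target and simple multiplicative adjustments; the function name, window size, and adjustment factors are all assumptions:

```python
import math
import random

def tune_sigma(p, x0, sigma, n_burn=5000, window=200, target_rate=0.6, seed=1):
    """Hypothetical burn-in loop: run Metropolis-Hastings steps, measure the
    acceptance rate over the last `window` proposals, and adjust sigma."""
    rng = random.Random(seed)
    x = x0
    accepts = []
    for _ in range(n_burn):
        x_new = rng.gauss(x, sigma)
        a = p(x_new) / p(x)
        accepted = a >= 1 or rng.random() < a
        if accepted:
            x = x_new
        accepts.append(accepted)
        if len(accepts) >= window:
            rate = sum(accepts[-window:]) / window
            # Too many acceptances: steps are too timid, so widen them.
            # Too few: steps overshoot into low-density regions, so shrink them.
            sigma *= 1.01 if rate > target_rate else 0.99
    return x, sigma

target = lambda x: math.exp(-0.5 * x * x)
x_start, sigma_tuned = tune_sigma(target, x0=0.0, sigma=10.0)
```

Starting from a deliberately oversized σ = 10, the low acceptance rate drives σ down until the rate hovers near the target; sampling proper would then begin from x_start with sigma_tuned held fixed.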

References

  • Chib, Siddhartha and Edward Greenberg. "Understanding the Metropolis–Hastings Algorithm". American Statistician, 49:327–335, 1995.
  • W. K. Hastings. "Monte Carlo Sampling Methods Using Markov Chains and Their Applications". Biometrika, 57:97–109, 1970.
  • N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. "Equations of State Calculations by Fast Computing Machines". Journal of Chemical Physics, 21:1087–1091, 1953.