Compared with an algorithm like [[adaptive rejection sampling]]<ref name=":0">{{Cite journal |title = Adaptive Rejection Sampling for Gibbs Sampling |jstor = 2347565 |journal = Journal of the Royal Statistical Society. Series C (Applied Statistics) |date = 1992-01-01 |pages = 337–348 |volume = 41 |issue = 2 |doi = 10.2307/2347565 |first1 = W. R. |last1 = Gilks |first2 = P. |last2 = Wild}}</ref> that directly generates independent samples from a distribution, Metropolis–Hastings and other MCMC algorithms have a number of disadvantages:
* The samples are correlated. Even though over the long term they do correctly follow <math>P(x)</math>, a set of nearby samples will be correlated with each other and not correctly reflect the distribution. This means that effective sample sizes can be significantly lower than the number of samples actually taken, leading to large statistical errors. If a set of independent samples is desired, the majority of samples must be thrown away, keeping only every ''n''th sample for some value of ''n'' (typically determined by examining the [[autocorrelation]] between adjacent samples). Autocorrelation can be reduced by increasing the ''jumping width'' (the average size of a jump, which is related to the variance of the jumping distribution), but this will also increase the likelihood of rejection of the proposed jump. Too large or too small a jumping size leads to a ''slow-mixing'' Markov chain, i.e. a highly correlated set of samples, so that a very large number of samples will be needed to obtain a reasonable estimate of any desired property of the distribution (see the sketch after this list).
* Although the Markov chain eventually converges to the desired distribution, the initial samples may follow a very different distribution, especially if the starting point is in a region of low density. As a result, a ''burn-in'' period is typically necessary,<ref>{{Cite book |title=Bayesian data analysis |date=2004 |publisher=Chapman & Hall / CRC |others=Gelman, Andrew |isbn=978-1584883883 |edition=2nd |___location=Boca Raton, Fla. |oclc=51991499}}</ref> in which an initial number of samples (e.g. the first 1000 or so) is thrown away.
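
Both of these issues can be seen in a minimal random-walk Metropolis sampler. The following sketch (which assumes, purely for illustration, a standard normal target, a starting point deliberately placed in a low-density region, and arbitrary choices of jumping width, burn-in length and thinning interval ''n'') discards a burn-in period, thins the remaining chain, and compares the lag-1 autocorrelation before and after thinning:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized log-density of the target P(x); a standard normal,
    # chosen purely for illustration.
    return -0.5 * x**2

def random_walk_metropolis(n_samples, x0=10.0, jump_width=1.0):
    # x0 is deliberately placed in a low-density region so that the
    # initial samples follow a very different distribution (burn-in).
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        proposal = x + jump_width * rng.normal()  # symmetric jumping distribution
        # Metropolis acceptance rule for a symmetric proposal:
        # accept with probability min(1, P(proposal) / P(x))
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x
    return samples

chain = random_walk_metropolis(50_000)

burn_in = 1_000   # discard the initial transient
n = 10            # keep every n-th sample to reduce autocorrelation
usable = chain[burn_in::n]

def lag1_autocorr(x):
    # Sample lag-1 autocorrelation, used here to choose and validate n.
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

print(lag1_autocorr(chain[burn_in:]))  # high: adjacent samples are correlated
print(lag1_autocorr(usable))           # much lower after thinning
</syntaxhighlight>

Increasing <code>jump_width</code> in this sketch lowers the autocorrelation but raises the rejection rate, which is exactly the trade-off described above.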
 
On the other hand, most simple [[rejection sampling]] methods suffer from the "[[curse of dimensionality]]": the probability of rejection increases exponentially as a function of the number of dimensions. Metropolis–Hastings, along with other MCMC methods, does not suffer from this problem to the same degree, and is thus often the only solution available when the number of dimensions of the distribution to be sampled is high. As a result, MCMC methods are often the methods of choice for producing samples from [[hierarchical Bayesian model]]s and other high-dimensional statistical models now used in many disciplines.
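
A small numerical experiment makes this contrast concrete. The sketch below (a toy example in which the target is the uniform distribution on the unit ball and proposals are drawn uniformly from the enclosing hypercube) estimates the acceptance rate of naive rejection sampling as the dimension grows:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def acceptance_rate(dim, n_trials=100_000):
    # Rejection sampling of the uniform distribution on the unit ball,
    # proposing uniformly from the enclosing hypercube [-1, 1]^dim.
    points = rng.uniform(-1.0, 1.0, size=(n_trials, dim))
    inside = (points ** 2).sum(axis=1) <= 1.0
    return inside.mean()

for dim in (1, 2, 5, 10, 20):
    print(dim, acceptance_rate(dim))
# The acceptance rate equals the volume ratio of the ball to the cube,
# which decays faster than exponentially in the dimension; by dim = 20
# essentially every proposal is rejected.
</syntaxhighlight>

A random-walk MCMC sampler, by contrast, makes local moves whose acceptance probability does not collapse with dimension in this way, although its mixing time still grows with the difficulty of the problem.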