Revision as of 10:21, 26 June 2009 by Jheald (edit summary: →See also: + Empirical Bayes); revision as of 10:25, 26 June 2009 by Jheald (edit summary: c/e)
ABC methods originated in population and evolutionary genetics<ref name=Pritchard1999>{{cite journal|last = Pritchard|first = J. K.|coauthors = Seielstad, M. T., Perez-Lezaun, A., and Feldman, M. W.|title = Population Growth of Human Y Chromosomes: A Study of Y Chromosome Microsatellites|journal = Mol. Biol. Evol.|volume = 16|date = 1999|pages = 1791–1798}}</ref><ref name=Beaumont>{{cite journal|last = Beaumont|first = M. A.|coauthors = Zhang, W. and [[David Balding|Balding, D. J.]]|title = Approximate Bayesian Computation in Population Genetics|journal = Genetics|volume = 162|date = Dec 2002|pages = 2025–2035|url = http://www.genetics.org/cgi/content/abstract/162/4/2025|pmid = 12524368|issue = 4|month = Dec|day = 01}}</ref> but have recently also been introduced to the analysis of complex and stochastic [[dynamical systems]].<ref name=Toni2009>{{cite journal |author = Toni, T.; Welch, D.; Strelkowa, N.; Ipsen, A.; Stumpf, M.P.H. |year = 2009 |title = Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems | journal = Journal of the Royal Society Interface |volume = 6 |issue = 31 |pages = 187–202 |doi = 10.1098/rsif.2008.0172 |url=http://rsif.royalsocietypublishing.org/content/6/31/187.abstract}}</ref>

==Overview==
In standard Bayesian inference the [[posterior distribution]] is given by

:<math>P(\theta|D)\propto P(D|\theta) \pi(\theta)</math>

where <math>\theta</math> are the parameters of a probability model, <math>D</math> are the observed data, and <math>\pi(\theta)</math> is the [[prior distribution]] of the parameters <math>\theta</math>. <math>P(D|\theta)</math> is the [[likelihood]] of <math>\theta</math>, that is, the probability of observing the data <math>D</math> given the model with parameter <math>\theta</math>.
The explicit evaluation of the likelihood <math>P(D|\theta)</math> is avoided in ABC approaches by considering distances between the observed data and data simulated from a model with parameter <math>\theta</math>. For sufficiently complex models and large data sets, the probability that a simulation run yields precisely the same dataset as the one observed will be very small, often unacceptably so. So rather than comparing the data directly, we consider a summary statistic of the data, <math>S(D)</math>, and use a distance <math>\Delta(S(D),S(X))</math> between the summary statistics of the real data <math>D</math> and the simulated data <math>X</math>.

The generic ABC approach to infer the posterior distribution of a parameter <math>\theta</math> is as follows:

:# Sample a candidate parameter vector <math>\theta^\ast</math> from the prior <math>\pi(\theta)</math>, which serves as the proposal distribution.
:# Simulate a dataset <math>X</math> from the model with parameter <math>\theta^\ast</math>.
:# If <math>\Delta(S(D),S(X))<\epsilon</math> then accept <math>\theta^\ast</math> as a sample from the posterior.

For <math>\epsilon</math> sufficiently small the ABC procedure should deliver a good approximation to the true posterior, in particular if the summary statistic <math>S</math> is a [[sufficient statistic]] of the probability model. If sufficient statistics do not exist or are hard to come by, setting up a satisfactory and efficient ABC approach can be challenging.
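The three-step procedure above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption chosen for illustration and does not come from the cited papers: the toy model (normally distributed data with unknown mean and unit standard deviation), the uniform prior on the interval from -5 to 5, the sample mean as summary statistic <math>S</math> (which is sufficient for this toy model), the absolute difference as distance <math>\Delta</math>, the tolerance <math>\epsilon = 0.1</math>, and the helper name <code>abc_rejection</code>.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, distance,
                  epsilon, n_accept, rng, max_draws=1_000_000):
    """Rejection ABC: keep prior draws whose simulated summary statistic
    lies within epsilon of the observed summary statistic."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(max_draws):
        theta = prior_sampler(rng)                   # step 1: sample a candidate
        x = simulator(theta, len(observed), rng)     # step 2: simulate a dataset
        if distance(s_obs, summary(x)) < epsilon:    # step 3: accept or reject
            accepted.append(theta)
            if len(accepted) == n_accept:
                break
    return accepted


# Hypothetical toy problem: infer the mean of a unit-variance normal model.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),              # assumed prior
    simulator=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,                                  # S: sample mean
    distance=lambda a, b: abs(a - b),                          # Delta
    epsilon=0.1,
    n_accept=200,
    rng=rng,
)

print("accepted samples:", len(posterior))
print("approximate posterior mean:", statistics.fmean(posterior))
```

Lowering <code>epsilon</code> tightens the approximation but also lowers the acceptance rate, so more simulations are needed to collect the same number of accepted samples.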

Approximate Bayesian computation: Difference between revisions