Approximate Bayesian computation: Difference between revisions

Revision as of 19:21, 16 October 2024 by JCW-CleanerBot (m task, Tag: AWB). Latest revision as of 17:20, 9 August 2025 by Bender the Bot (m →Software: HTTP to HTTPS for SourceForge, Tag: AWB).
(4 intermediate revisions by 4 users not shown)
Line 19:
Approximate Bayesian computation can be understood as a Bayesian version of [[indirect inference]].<ref>{{cite arXiv \| eprint=1803.01999 \| author1=Christopher C Drovandi \| title=ABC and Indirect Inference \| date=2018 \| class=stat.CO }}</ref><ref name="Peters 2009">{{Cite journal \|last=Peters \|first=Gareth \|date=2009 \|title=Advances in Approximate Bayesian Computation and Trans-Dimensional Sampling Methodology \|url=https://www.ssrn.com/abstract=3785580 \|journal=SSRN Electronic Journal \|language=en \|doi=10.2139/ssrn.3785580 \|issn=1556-5068\|hdl=1959.4/50086 \|hdl-access=free }}</ref> Several efficient Monte Carlo-based approaches have been developed to sample from the ABC posterior distribution for estimation and prediction. A popular choice is the SMC Samplers algorithm<ref>{{Cite journal \|last1=Del Moral \|first1=Pierre \|last2=Doucet \|first2=Arnaud \|last3=Jasra \|first3=Ajay \|date=2006 \|title=Sequential Monte Carlo Samplers \|url=https://www.jstor.org/stable/3879283 \|journal=Journal of the Royal Statistical Society. 
Series B (Statistical Methodology) \|volume=68 \|issue=3 \|pages=411–436 \|doi=10.1111/j.1467-9868.2006.00553.x \|jstor=3879283 \|issn=1369-7412\|arxiv=cond-mat/0212648 }}</ref><ref>{{Cite journal \|last1=Del Moral \|first1=Pierre \|last2=Doucet \|first2=Arnaud \|last3=Peters \|first3=Gareth \|date=2004 \|title=Sequential Monte Carlo Samplers CUED Technical Report \|url=https://www.ssrn.com/abstract=3841065 \|journal=SSRN Electronic Journal \|language=en \|doi=10.2139/ssrn.3841065 \|issn=1556-5068\|url-access=subscription }}</ref><ref>{{Cite journal \|last=Peters \|first=Gareth \|date=2005 \|title=Topics in Sequential Monte Carlo Samplers \|url=https://www.ssrn.com/abstract=3785582 \|journal=SSRN Electronic Journal \|language=en \|doi=10.2139/ssrn.3785582 \|issn=1556-5068\|url-access=subscription }}</ref> adapted to the ABC context as the SMC-ABC method.<ref>{{Cite journal \|last1=Sisson \|first1=S. A. \|last2=Fan \|first2=Y. \|last3=Tanaka \|first3=Mark M. \|date=2007-02-06 \|title=Sequential Monte Carlo without likelihoods \|journal=Proceedings of the National Academy of Sciences \|language=en \|volume=104 \|issue=6 \|pages=1760–1765 \|doi=10.1073/pnas.0607208104 \|doi-access=free \|issn=0027-8424 \|pmc=1794282 \|pmid=17264216\|bibcode=2007PNAS..104.1760S }}</ref><ref name="Peters 2009"/><ref>{{Cite journal \|last1=Peters \|first1=G. W. \|last2=Sisson \|first2=S. A. \|last3=Fan \|first3=Y. \|date=2012-11-01 \|title=Likelihood-free Bayesian inference for α-stable models \|url=https://www.sciencedirect.com/science/article/pii/S0167947310003786 \|journal=Computational Statistics & Data Analysis \|series=1st issue of the Annals of Computational and Financial Econometrics \|volume=56 \|issue=11 \|pages=3743–3756 \|doi=10.1016/j.csda.2010.10.004 \|issn=0167-9473\|url-access=subscription }}</ref><ref>{{Cite journal \|last1=Peters \|first1=Gareth W. \|last2=Wüthrich \|first2=Mario V. \|last3=Shevchenko \|first3=Pavel V. 
\|date=2010-08-01 \|title=Chain ladder method: Bayesian bootstrap versus classical bootstrap \|url=https://www.sciencedirect.com/science/article/pii/S0167668710000351 \|journal=Insurance: Mathematics and Economics \|volume=47 \|issue=1 \|pages=36–51 \|doi=10.1016/j.insmatheco.2010.03.007 \|arxiv=1004.2548 \|issn=0167-6687}}</ref>

==Method==

Line 30:
where <math>p(\theta|D)</math> denotes the posterior, <math>p(D|\theta)</math> the likelihood, <math>p(\theta)</math> the prior, and <math>p(D)</math> the evidence (also referred to as the [[marginal likelihood]] or the prior predictive probability of the data). Note that the denominator <math>p(D)</math> normalizes the total probability of the posterior density <math>p(\theta|D)</math> to one and can be calculated that way.

The prior represents beliefs or knowledge (such as physical constraints) about <math>\theta</math> before <math>D</math> is available. Since the prior narrows down uncertainty, the posterior estimates have less variance, but might be biased. For convenience, the prior is often specified by choosing a particular distribution among a set of well-known and tractable families of distributions, such that both the evaluation of prior probabilities and random generation of values of <math>\theta</math> are relatively straightforward. For certain kinds of models, it is more pragmatic to specify the prior <math>p(\theta)</math> using a factorization of the joint distribution of all the elements of <math>\theta</math> in terms of a sequence of their conditional distributions. If one is only interested in the relative posterior plausibilities of different values of <math>\theta</math>, the evidence <math>p(D)</math> can be ignored, as it constitutes a [[Normalizing constant|normalising constant]], which cancels for any ratio of posterior probabilities. It remains, however, necessary to evaluate the likelihood <math>p(D|\theta)</math> and the prior <math>p(\theta)</math>.
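
To make the cancellation explicit, consider two parameter values, written here as <math>\theta_1</math> and <math>\theta_2</math> (labels introduced only for this illustration). Applying Bayes' rule to each and taking the ratio gives:

:<math>\frac{p(\theta_1|D)}{p(\theta_2|D)} = \frac{p(D|\theta_1)\,p(\theta_1)/p(D)}{p(D|\theta_2)\,p(\theta_2)/p(D)} = \frac{p(D|\theta_1)\,p(\theta_1)}{p(D|\theta_2)\,p(\theta_2)}</math>

The evidence <math>p(D)</math> drops out of the ratio, but the likelihood still has to be evaluated at both parameter values.
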
For numerous applications, it is [[computationally expensive]], or even completely infeasible, to evaluate the likelihood,<ref name="Busetto2009a" /> which motivates the use of ABC to circumvent this issue.

===The ABC rejection algorithm===

All ABC-based methods approximate the likelihood function by simulations, the outcomes of which are compared with the observed data.<ref>{{Cite journal \|last=Hunter \|first=Dawn \|date=2006-12-08 \|title=Bayesian inference, Monte Carlo sampling and operational risk \|url=https://www.risk.net/journal-of-operational-risk/2160915/bayesian-inference-monte-carlo-sampling-and-operational-risk \|journal=Journal of Operational Risk \|volume=1 \|issue=3 \|pages=27–50 \|language=en \|doi=10.21314/jop.2006.014\|url-access=subscription }}</ref><ref name="Peters 2009"/><ref name="Beaumont2010" /><ref name="Bertorelle" /><ref name="Csillery" /> More specifically, in the ABC rejection algorithm, the most basic form of ABC, a set of parameter points is first sampled from the prior distribution. Given a sampled parameter point <math>\hat{\theta}</math>, a data set <math>\hat{D}</math> is then simulated under the statistical model <math>M</math> specified by <math>\hat{\theta}</math>. If the generated <math>\hat{D}</math> is too different from the observed data <math>D</math>, the sampled parameter value is discarded. In precise terms, <math>\hat{D}</math> is accepted with tolerance <math>\epsilon \ge 0</math> if:

:<math>\rho (\hat{D},D)\le\epsilon</math>,

Line 296:
| <ref name="Wegmann2010" />
|-
| [https://msbayes.sourceforge.net/ msBayes]
| Open source software package consisting of several C and R programs that are run with a Perl "front-end". Hierarchical coalescent models. Population genetic data from multiple co-distributed species.
| <ref name="Hickerson07" />

Line 338:
<ref name="Bharti">{{cite journal \| last1 = Bharti \| first1 = A \| last2 = Briol \| first2 = F.-X. 
\| last3 = Pedersen \| first3 = T \| year = 2021 \| title = A General Method for Calibrating Stochastic Radio Channel Models with Kernels \| journal = IEEE Transactions on Antennas and Propagation \| volume = 70 \| issue = 6 \| pages = 3986–4001 \| doi=10.1109/TAP.2021.3083761\| arxiv = 2012.09612 \| s2cid = 233880538 }}</ref> <ref name="Bertorelle">{{cite journal \| last1 = Bertorelle \| first1 = G \| last2 = Benazzo \| first2 = A \| last3 = Mona \| first3 = S \| year = 2010 \| title = ABC as a flexible framework to estimate demography over space and time: some cons, many pros \| journal = Molecular Ecology \| volume = 19 \| issue = 13\| pages = 2609–2625 \| doi=10.1111/j.1365-294x.2010.04690.x\| pmid = 20561199 \| bibcode = 2010MolEc..19.2609B \| s2cid = 12129604 \| doi-access = free }}</ref> <ref name="Csillery">{{cite journal \| last1 = Csilléry \| first1 = K \| last2 = Blum \| first2 = MGB \| last3 = Gaggiotti \| first3 = OE \| last4 = François \| first4 = O \| year = 2010 \| title = Approximate Bayesian Computation (ABC) in practice \| journal = Trends in Ecology & Evolution \| volume = 25 \| issue = 7\| pages = 410–418 \| doi=10.1016/j.tree.2010.04.001\| pmid = 20488578 \| bibcode = 2010TEcoE..25..410C \| s2cid = 13957079 }}</ref> <ref name="Rubin">{{cite journal \| last1 = Rubin \| first1 = DB \| year = 1984 \| title = Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician \| journal = The Annals of Statistics \| volume = 12 \| issue = 4\| pages = 1151–1172 \| doi=10.1214/aos/1176346785\| doi-access = free }}</ref> <ref name="Marjoram">{{cite journal \| last1 = Marjoram \| first1 = P \| last2 = Molitor \| first2 = J \| last3 = Plagnol \| first3 = V \| last4 = Tavare \| first4 = S \| year = 2003 \| title = Markov chain Monte Carlo without likelihoods \| journal = Proc Natl Acad Sci U S A \| volume = 100 \| issue = 26\| pages = 15324–15328 \| doi=10.1073/pnas.0306899100\| pmid = 14663152 \| pmc = 307566 \| bibcode = 
2003PNAS..10015324M \| doi-access = free }}</ref>

Line 368:
<ref name="Templeton2010">{{cite journal \| last1 = Templeton \| first1 = AR \| year = 2010 \| title = Coherent and incoherent inference in phylogeography and human evolution \| journal = Proceedings of the National Academy of Sciences of the United States of America \| volume = 107 \| issue = 14\| pages = 6376–6381 \| doi=10.1073/pnas.0910647107\| pmid = 20308555 \| pmc = 2851988 \| bibcode = 2010PNAS..107.6376T\| doi-access = free }}</ref> <!--<ref name="Fagundes">{{cite journal \| last1 = Fagundes \| first1 = NJR \| last2 = Ray \| first2 = N \| last3 = Beaumont \| first3 = M \| last4 = Neuenschwander \| first4 = S \| last5 = Salzano \| first5 = FM \|display-authors=et al \| year = 2007 \| title = Statistical evaluation of alternative models of human evolution \| journal = Proceedings of the National Academy of Sciences of the United States of America \| volume = 104 \| pages = 17614–17619 \| doi=10.1073/pnas.0708280104 \| pmid=17978179 \| pmc=2077041}}</ref>--> <!-- <ref name="Gelfand">{{cite journal \| last1 = Gelfand \| first1 = AE \| last2 = Dey \| first2 = DK \| year = 1994 \| title = Bayesian model choice: Asymptotics and exact calculations \| journal = J R Stat Soc Ser B \| volume = 56 \| pages = 501–514 }}</ref> --> <!-- <ref name="Bernardo">Bernardo JM, Smith AFM (1994) Bayesian Theory: John Wiley.</ref> --> <!-- <ref name="Box">Box G, Draper NR (1987) Empirical Model-Building and Response Surfaces: John Wiley and Sons, Oxford.</ref> -->

Line 382:
<ref name="Dean">{{cite arXiv \| eprint=1103.5399 \| last1=Dean \| first1=Thomas A. \| last2=Singh \| first2=Sumeetpal S. \| last3=Jasra \| first3=Ajay \| last4=Peters \| first4=Gareth W. 
\| title=Parameter Estimation for Hidden Markov Models with Intractable Likelihoods \| date=2011 \| class=math.ST }}</ref> <ref name="Fearnhead">{{cite arXiv \| eprint=1004.1112 \| last1=Fearnhead \| first1=Paul \| last2=Prangle \| first2=Dennis \| title=Constructing Summary Statistics for Approximate Bayesian Computation: Semi-automatic ABC \| date=2010 \| class=stat.ME }}</ref> <ref name="Wilkinson">{{cite journal \| arxiv=0811.3355 \| doi=10.1515/sagmb-2013-0010 \| title=Approximate Bayesian computation (ABC) gives exact results under the assumption of model error \| date=2013 \| last1=Wilkinson \| first1=Richard David \| journal=Statistical Applications in Genetics and Molecular Biology \| volume=12 \| issue=2 \| pmid=23652634 }}</ref> <ref name="Nunes">{{cite journal \| last1 = Nunes \| first1 = MA \| last2 = Balding \| first2 = DJ \| year = 2010 \| title = On optimal selection of summary statistics for approximate Bayesian computation \| journal = Stat Appl Genet Mol Biol \| volume = 9 \| page = Article 34 \| doi=10.2202/1544-6115.1576\| pmid = 20887273 \| s2cid = 207319754 }}</ref> <ref name="Joyce">{{cite journal \| last1 = Joyce \| first1 = P \| last2 = Marjoram \| first2 = P \| year = 2008 \| title = Approximately sufficient statistics and Bayesian computation \| journal = Stat Appl Genet Mol Biol \| volume = 7 \| issue = 1\| page = Article 26 \| doi=10.2202/1544-6115.1389\| pmid = 18764775 \| s2cid = 38232110 }}</ref>

Line 389:
<ref name="Toni">{{cite journal \| last1 = Toni \| first1 = T \| last2 = Welch \| first2 = D \| last3 = Strelkowa \| first3 = N \| last4 = Ipsen \| first4 = A \| last5 = Stumpf \| first5 = M \| year = 2009 \| title = Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems \| journal = J R Soc Interface \| volume = 6 \| issue = 31\| pages = 187–202 \| pmid = 19205079 \| pmc = 2658655 \| doi = 10.1098/rsif.2008.0172 }}</ref> <ref name="Tavare">{{cite journal \| last1 = Tavaré \| 
first1 = S \| last2 = Balding \| first2 = DJ \| last3 = Griffiths \| first3 = RC \| last4 = Donnelly \| first4 = P \| year = 1997 \| title = Inferring Coalescence Times from DNA Sequence Data \| journal = Genetics \| volume = 145 \| issue = 2 \| pages = 505–518 \| doi = 10.1093/genetics/145.2.505 \| pmc = 1207814 \| pmid=9071603}}</ref> <ref name="Toni2010">{{cite journal \| doi=10.1093/bioinformatics/btp619 \| title=Simulation-based model selection for dynamical systems in systems and population biology \| date=2010 \| last1=Toni \| first1=Tina \| last2=Stumpf \| first2=Michael P. H. \| journal=Bioinformatics \| volume=26 \| issue=1 \| pages=104–110 \| pmid=19880371 \| pmc=2796821 \| arxiv=0911.1705 }}</ref> <ref name="Pritchard1999">{{cite journal \| last1 = Pritchard \| first1 = JK \| last2 = Seielstad \| first2 = MT \| last3 = Perez-Lezaun \| first3 = A \|display-authors=et al \| year = 1999 \| title = Population Growth of Human Y Chromosomes: A Study of Y Chromosome Microsatellites \| journal = Molecular Biology and Evolution \| volume = 16 \| issue = 12\| pages = 1791–1798 \| doi=10.1093/oxfordjournals.molbev.a026091\| pmid = 10605120 \| doi-access = free }}</ref> <ref name="Diggle">{{cite journal \| last1 = Diggle \| first1 = PJ \| year = 1984 \| title = Monte Carlo Methods of Inference for Implicit Statistical Models \| journal = Journal of the Royal Statistical Society, Series B \| volume = 46 \| issue = 2 \| pages = 193–227 \| doi = 10.1111/j.2517-6161.1984.tb01290.x }}</ref>

Line 413:
<ref name="Klinger2017">Klinger, E.; Rickert, D.; Hasenauer, J. (2017). pyABC: distributed, likelihood-free inference.</ref> <ref name="Salvatier2016">{{cite journal \| doi=10.7717/peerj-cs.55 \| doi-access=free \| title=Probabilistic programming in Python using PyMC3 \| date=2016 \| last1=Salvatier \| first1=John \| last2=Wiecki \| first2=Thomas V. 
\| last3=Fonnesbeck \| first3=Christopher \| journal=PeerJ Computer Science \| volume=2 \| pages=e55 \| arxiv=1507.08050 }}</ref> <ref name="Prangle">{{cite journal \| doi=10.1515/sagmb-2013-0012 \| title=Semi-automatic selection of summary statistics for ABC model choice\| date=2014 \| last1=Prangle \| first1=Dennis \| last2=Fearnhead \| first2=Paul \| last3=Cox \| first3=Murray P. \| last4=Biggs \| first4=Patrick J. \| last5=French \| first5=Nigel P. \| journal=Stat Appl Genet Mol Biol \| volume=13\| issue=1\| pages=67–82 \| pmid=24323893\| arxiv=1302.5624 }}</ref> }}
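
The ABC rejection algorithm described under "The ABC rejection algorithm" (sample a parameter point from the prior, simulate a data set, accept the point if the distance to the observed data is at most the tolerance) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the Gaussian toy model, the uniform prior on [-5, 5], the use of the sample mean inside the distance, the tolerance value, and all function names are hypothetical choices made only for this example.

```python
import random
import statistics


def abc_rejection(observed, prior_sample, simulate, distance, epsilon, n_draws, rng):
    """Keep the prior draws whose simulated data lie within epsilon of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta_hat = prior_sample(rng)             # sample a parameter point from the prior
        d_hat = simulate(theta_hat, rng)          # simulate a data set under the model
        if distance(d_hat, observed) <= epsilon:  # accept if rho(D_hat, D) <= epsilon
            accepted.append(theta_hat)
    return accepted


# Toy model (assumption): Gaussian data with unknown mean and known sd = 1.
def prior_sample(rng):
    return rng.uniform(-5.0, 5.0)


def simulate(theta, rng, n=50):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


# Distance (assumption): absolute difference between the two sample means.
def distance(d_hat, d):
    return abs(statistics.fmean(d_hat) - statistics.fmean(d))


rng = random.Random(0)            # fixed seed so the run is reproducible
observed = simulate(2.0, rng)     # stand-in for real observed data
accepted = abc_rejection(observed, prior_sample, simulate, distance,
                         epsilon=0.1, n_draws=5000, rng=rng)
print(len(accepted), statistics.fmean(accepted))
```

The accepted values form a sample from the approximate posterior. Shrinking `epsilon` tightens the approximation but lowers the acceptance rate, so more prior draws are needed for the same number of accepted points.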