A common problem in [[statistical inference]] is to use data to decide which of two or more competing models best accounts for the observations. Frequentist statistics addresses this with [[hypothesis test]]s. There are several [[Bayesian]] approaches; one is through [[Bayes factor]]s.
The posterior probability of a model given data, Pr(''H''|''D''), is given by [[Bayes' theorem]]:
:<math>\Pr(H|D) = \frac{\Pr(D|H)\Pr(H)}{\Pr(D)}</math>
The key data-dependent term Pr(''D''|''H'') is a [[likelihood function|likelihood]], sometimes called the evidence or [[marginal likelihood]] for model ''H''; evaluating it correctly is the key to Bayesian model comparison.
The evidence is usually the [[normalizing constant]] or [[partition function]] of another inference, namely the inference of the parameters of model ''H'' given the data ''D''.
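As a minimal sketch of this idea (assuming a Bernoulli model with a uniform prior on the coin's bias, chosen purely for illustration and not drawn from the article), the evidence can be approximated by direct numerical integration over the parameter:

```python
import math

def evidence_bernoulli(heads, tails, grid=100000):
    """Evidence Pr(D|H) = integral of Pr(theta|H) * Pr(D|theta, H) d(theta)
    for a Bernoulli model with a Uniform(0, 1) prior on the bias theta,
    approximated with the midpoint rule."""
    total = 0.0
    for i in range(grid):
        theta = (i + 0.5) / grid                         # midpoint of each sub-interval
        total += theta ** heads * (1 - theta) ** tails   # uniform prior density = 1
    return total / grid

# For this model the integral has the closed form
# heads! * tails! / (n + 1)!  (a Beta function), so the quadrature can be checked:
approx = evidence_bernoulli(3, 7)
exact = math.factorial(3) * math.factorial(7) / math.factorial(11)
```

The quadrature agrees with the Beta-function closed form to high accuracy, which illustrates the evidence's role as the normalising integral of the within-model parameter inference.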
The relative plausibility of two different models ''H''<sub>1</sub> and ''H''<sub>2</sub>, parametrised by model parameter vectors <math>\theta_1</math> and <math>\theta_2</math>, is assessed by the [[Bayes factor]], given by
:<math> \frac{\Pr(D|H_2)}{\Pr(D|H_1)}
= \frac{\int \Pr(\theta_2|H_2)\Pr(D|\theta_2,H_2)\,d\theta_2}
{\int \Pr(\theta_1|H_1)\Pr(D|\theta_1,H_1)\,d\theta_1
}.
</math>
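To make the formula concrete, consider a hypothetical comparison between ''H''<sub>1</sub>, a fair coin with no free parameters (so its evidence is simply 2<sup>&minus;''n''</sup>), and ''H''<sub>2</sub>, a coin of unknown bias with a uniform prior, for which the integral reduces to a Beta function. A sketch under these assumed models:

```python
from math import factorial

def bayes_factor(heads, tails):
    """Bayes factor Pr(D|H2) / Pr(D|H1) for
    H1: fair coin, theta fixed at 1/2 -> evidence = 0.5 ** n
    H2: unknown bias, theta ~ Uniform(0, 1)
        -> evidence = heads! * tails! / (n + 1)!  (the Beta integral)."""
    n = heads + tails
    ev_h1 = 0.5 ** n
    ev_h2 = factorial(heads) * factorial(tails) / factorial(n + 1)
    return ev_h2 / ev_h1

bf_balanced = bayes_factor(5, 5)   # balanced data: factor < 1, favours H1
bf_skewed = bayes_factor(10, 0)    # one-sided data: factor > 1, favours H2
```

With five heads in ten tosses the factor falls below 1, so the simpler fair-coin model is preferred; ten heads in a row push it well above 1 in favour of the biased-coin model.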
Thus Bayesian model comparison does not depend on any single choice of parameter values: each model's evidence integrates over all possible parameter values, weighted by the prior. Alternatively, the [[maximum likelihood estimate]] could be substituted for each parameter, yielding a likelihood-ratio comparison instead.
An advantage of [[Bayes factors]] is that they automatically, and quite naturally, include a penalty for including too much model structure: a model that spreads its prior over many parameter values that fit the data poorly receives lower evidence. This guards against [[overfitting]].
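One way to see this penalty (discussed, for example, in the MacKay reference below) is to divide the evidence by the maximised likelihood, giving an "Occam factor" smaller than 1. A sketch, again assuming the illustrative Bernoulli model with a uniform prior:

```python
from math import factorial

def occam_factor(heads, tails):
    """Evidence divided by the maximised likelihood: a number below 1
    measuring how much the model is penalised for its parameter freedom."""
    n = heads + tails
    evidence = factorial(heads) * factorial(tails) / factorial(n + 1)
    theta_hat = heads / n                                  # maximum likelihood estimate
    best_fit = theta_hat ** heads * (1 - theta_hat) ** tails
    return evidence / best_fit

small_data = occam_factor(3, 2)    # 5 tosses
more_data = occam_factor(30, 20)   # 50 tosses, same proportion of heads
```

The factor lies below 1 and shrinks as the data grow, so a model with adjustable parameters must earn its flexibility through a genuinely better fit before its evidence can exceed that of a simpler rival.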
Another approach is to treat model comparison as a [[Decision theory#Choice under uncertainty|decision problem]], computing the expected value or cost of each model choice.
Another approach is to use [[Minimum Message Length]] (MML).
== See also ==
*[[Akaike information criterion]]
*Schwarz's [[Bayesian information criterion]]
*[[Conditional predictive ordinate]]
*[[Chris Wallace (computer scientist)|Wallace]]'s [[Minimum Message Length]] (MML)
*[[Model selection]]
== References ==
* Gelman, A., Carlin, J., Stern, H. and Rubin, D. (1995) ''Bayesian Data Analysis''. Chapman and Hall/CRC.
* Bernardo, J. and Smith, A.F.M. (1994) ''Bayesian Theory''. John Wiley.
* Lee, P.M. (1989) ''Bayesian Statistics''. Arnold.
* Denison, D.G.T., Holmes, C.C., Mallick, B.K. and Smith, A.F.M. (2002) ''Bayesian Methods for Nonlinear Classification and Regression''. John Wiley.
* Richard O. Duda, Peter E. Hart and David G. Stork (2000) ''Pattern Classification'' (2nd edition), Section 9.6.5, pp. 487&ndash;489, Wiley, ISBN 0-471-05669-3
* Chapter 24 in [http://omega.math.albany.edu:8008/JaynesBook.html ''Probability Theory: The Logic of Science''] by [[Edwin Thompson Jaynes|E. T. Jaynes]], 1994.
* [[David J.C. MacKay]] (2003) ''Information Theory, Inference and Learning Algorithms'', CUP, ISBN 0-521-64298-1 (also [http://www.inference.phy.cam.ac.uk/mackay/itila/book.html available online])
== External links ==
* [http://www.inference.phy.cam.ac.uk/mackay/itila/ The on-line textbook: Information Theory, Inference, and Learning Algorithms], by [[David J.C. MacKay]], discusses Bayesian model comparison in Chapter 28, p. 343.
[[Category:Bayesian statistics]]
[[Category:Probability theory]]