{{short description|Probabilistic model}}
{{about|the representation of probability distributions using graphs|the computer graphics journal|Graphical Models{{!}}''Graphical Models''}}
{{Machine learning|Structured prediction}}
{{More footnotes|date=May 2017}}
A '''graphical model''', '''probabilistic graphical model''' ('''PGM'''), or '''structured probabilistic model''' is a [[probabilistic model]] for which a [[Graph (discrete mathematics)|graph]] expresses the [[conditional dependence]] structure between [[random variable]]s. Graphical models are commonly used in [[probability theory]], [[statistics]]—particularly [[Bayesian statistics]]—and [[machine learning]].
 
==Types of graphical models==
Generally, probabilistic graphical models use a graph-based representation as the foundation for encoding a distribution over a multi-dimensional space; the graph serves as a compact or [[Factor graph|factorized]] representation of the set of independences that hold in the specific distribution. Two branches of graphical representations of distributions are commonly used, namely, [[Bayesian network]]s and [[Markov random field]]s. Both families encompass the properties of factorization and independences, but they differ in the set of independences they can encode and the factorization of the distribution that they induce.<ref name=koller09>{{cite book
|author=Koller, D.
|author2=Friedman, N.
|author-link=Daphne Koller
|author2-link=Nir Friedman
|title=Probabilistic Graphical Models: Principles and Techniques
|publisher=MIT Press
|year=2009
|pages=1208
|isbn=978-0-262-01319-2
}}</ref>
 
===Undirected graphical model===
 
[[File:Examples of an Undirected Graph.svg|thumb|alt=An undirected graph with four vertices.|An undirected graph with four vertices]]
 
The undirected graph shown may have one of several interpretations; the common feature is that the presence of an edge implies some sort of dependence between the corresponding random variables. From this graph, we might deduce that B, C, and D are all [[Conditional independence|conditionally independent]] given A. This means that if the value of A is known, then the values of B, C, and D provide no further information about each other. Equivalently (in this case), the joint probability distribution can be factorized as:
 
:<math>P[A,B,C,D] = f_{AB}[A,B] \cdot f_{AC}[A,C] \cdot f_{AD}[A,D]</math>
 
for some non-negative functions <math>f_{AB}, f_{AC}, f_{AD}</math>.
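Indeed, conditioning on <math>A</math> splits this factorization into independent pieces:

:<math>P[B,C,D \mid A] = \frac{f_{AB}[A,B]\,f_{AC}[A,C]\,f_{AD}[A,D]}{\left(\sum_b f_{AB}[A,b]\right)\left(\sum_c f_{AC}[A,c]\right)\left(\sum_d f_{AD}[A,d]\right)},</math>

which, for each fixed value of <math>A</math>, is a product of a function of <math>B</math> alone, a function of <math>C</math> alone, and a function of <math>D</math> alone; this is exactly the statement that <math>B</math>, <math>C</math>, and <math>D</math> are conditionally independent given <math>A</math>.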
 
===Bayesian network===
{{main|Bayesian network}}
 
[[File:Example of a Directed Graph.svg|thumb|alt=Example of a directed acyclic graph on four vertices.|Example of a directed acyclic graph on four vertices]]
 
 
If the network structure of the model is a [[directed acyclic graph]], the model represents a factorization of the joint [[probability]] of all random variables. More precisely, if the random variables are <math>X_1,\ldots,X_n</math>, then the joint probability satisfies
:<math>P[X_1,\ldots,X_n]=\prod_{i=1}^nP[X_i|\text{pa}(X_i)]</math>
 
where <math>\text{pa}(X_i)</math> is the set of parents of node <math>X_i</math> (nodes with edges directed towards <math>X_i</math>). In other words, the [[joint distribution]] factors into a product of conditional distributions. For example, for the directed acyclic graph shown in the Figure this factorization would be

:<math>P[A,B,C,D] = P[A]\cdot P[B | A]\cdot P[C | A] \cdot P[D|A,C]</math>.
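As a concrete illustration, the following minimal sketch (plain Python; the conditional-probability values are hypothetical, chosen only for the example) evaluates this factorization for four binary variables:

<syntaxhighlight lang="python">
from itertools import product

def bern(p_true, x):
    """Probability that a binary variable with P[X=1] = p_true takes value x."""
    return p_true if x else 1.0 - p_true

# Hypothetical conditional probability tables for the DAG A->B, A->C, A->D, C->D.
p_a = 0.3                                   # P[A=1]
p_b_given_a = {1: 0.8, 0: 0.3}              # P[B=1 | A=a]
p_c_given_a = {1: 0.6, 0: 0.1}              # P[C=1 | A=a]
p_d_given_ac = {(1, 1): 0.9, (1, 0): 0.5,   # P[D=1 | A=a, C=c]
                (0, 1): 0.4, (0, 0): 0.05}

def joint(a, b, c, d):
    """P[A=a, B=b, C=c, D=d] as the product of the four conditionals."""
    return (bern(p_a, a)
            * bern(p_b_given_a[a], b)
            * bern(p_c_given_a[a], c)
            * bern(p_d_given_ac[(a, c)], d))

# Because every factor is a normalized conditional distribution, the
# product sums to one over all 16 joint assignments.
assert abs(sum(joint(*v) for v in product((0, 1), repeat=4)) - 1.0) < 1e-12
</syntaxhighlight>

Storing the four small tables requires <math>1+2+2+4 = 9</math> numbers, whereas an explicit joint table over four binary variables needs <math>2^4 - 1 = 15</math>; the gap grows exponentially with the number of variables, which is the practical point of the factorization.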
 
Any two nodes are [[Conditional independence|conditionally independent]] given the values of their parents. In general, any two sets of nodes are conditionally independent given a third set if a criterion called [[d-separation|''d''-separation]] holds in the graph. For example, in the directed acyclic graph of the Figure, <math>B</math> and <math>C</math> are ''d''-separated by <math>A</math>, so <math>B</math> and <math>C</math> are conditionally independent given <math>A</math>. Local independences and global independences are equivalent in Bayesian networks.
 
This type of graphical model is known as a directed graphical model, [[Bayesian network]], or belief network. Classic machine learning models such as [[hidden Markov models]] and [[Artificial neural network|neural networks]], and newer models such as [[variable-order Markov model]]s, can be considered special cases of Bayesian networks.
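For example, a [[hidden Markov model]] with hidden states <math>Z_1,\ldots,Z_T</math> and observations <math>X_1,\ldots,X_T</math> is the Bayesian network whose joint distribution factors as

:<math>P[Z_{1:T}, X_{1:T}] = P[Z_1]\prod_{t=2}^T P[Z_t \mid Z_{t-1}] \prod_{t=1}^T P[X_t \mid Z_t],</math>

a chain in which each state has its predecessor as its only parent and each observation has its state as its only parent.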
 
One of the simplest Bayesian networks is the [[naive Bayes classifier]].
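With class variable <math>Y</math> as the single root and features <math>X_1,\ldots,X_n</math> as its children, the general factorization above reduces to

:<math>P[Y, X_1, \ldots, X_n] = P[Y]\prod_{i=1}^n P[X_i \mid Y],</math>

since each feature's only parent is the class.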
 
===Cyclic directed graphical models===
 
[[File:Graph model.svg|thumb|right|alt=An example of a directed graphical model.| An example of a directed, cyclic graphical model. Each arrow indicates a dependency. In this example: D depends on A, B, and C; and C depends on B and D; whereas A and B are each independent.]]
 
The figure depicts a graphical model with a cycle. This may be interpreted in terms of each variable 'depending' on the values of its parents in some manner. The particular graph shown suggests a joint probability density that factors as
:<math>P[A,B,C,D] = P[A]\cdot P[B]\cdot P[C,D|A,B]</math>,
but other interpretations are possible.<ref name=richardson95causal>{{cite book
|first=Thomas |last=Richardson
|chapter=A discovery algorithm for directed cyclic graphs
|title=Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence
|year=1996
|publisher=Morgan Kaufmann Pub.
|isbn=978-1-55860-412-4
}}
</ref>
 
===Other types===
*[[Naive Bayes classifier]], which uses a tree with a single root
*[[Dependency network (graphical model)|Dependency network]], in which cycles are allowed
*Tree-augmented classifier or '''TAN model'''
[[File:Tan corral.png|thumb| TAN model for "corral dataset"]]
*Targeted Bayesian network learning (TBNL) [[File:Tbnl corral.jpg|thumb|TBNL model for "corral dataset"]]
*A [[factor graph]] is an undirected [[bipartite graph]] connecting variables and factors. Each factor represents a function over the variables it is connected to. This is a helpful representation for understanding and implementing [[belief propagation]] (a minimal sketch follows this list).
* A [[clique tree]] or junction tree is a [[tree (graph theory)|tree]] of [[clique (graph theory)|cliques]], used in the [[junction tree algorithm]].
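To make the factor-graph picture concrete, here is a minimal sketch of sum-product belief propagation on a chain-shaped factor graph (plain Python; the factor tables and variable names are hypothetical, chosen only for illustration). On a tree-structured graph such messages give exact marginals:

<syntaxhighlight lang="python">
from itertools import product

# A chain-shaped factor graph X1 -- f_a -- X2 -- f_b -- X3 with binary
# variables. The factor values below are hypothetical toy numbers.
VALS = (0, 1)
f_a = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}  # factor on (X1, X2)
f_b = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.5}  # factor on (X2, X3)

def message_to_right(factor):
    """Sum-product message from a pairwise factor to its right variable."""
    return {xr: sum(factor[(xl, xr)] for xl in VALS) for xr in VALS}

def message_to_left(factor):
    """Sum-product message from a pairwise factor to its left variable."""
    return {xl: sum(factor[(xl, xr)] for xr in VALS) for xl in VALS}

# Belief at X2: the product of its two incoming factor messages, normalized.
m_a = message_to_right(f_a)   # message f_a -> X2
m_b = message_to_left(f_b)    # message f_b -> X2
belief = {x: m_a[x] * m_b[x] for x in VALS}
z = sum(belief.values())
marginal_bp = {x: belief[x] / z for x in VALS}

# Sanity check against brute-force marginalization of the full joint.
joint = {(x1, x2, x3): f_a[(x1, x2)] * f_b[(x2, x3)]
         for x1, x2, x3 in product(VALS, repeat=3)}
z_joint = sum(joint.values())
marginal_bf = {x: sum(p for (x1, x2, x3), p in joint.items() if x2 == x) / z_joint
               for x in VALS}
assert all(abs(marginal_bp[x] - marginal_bf[x]) < 1e-12 for x in VALS)
print(marginal_bp)  # e.g. {0: 0.5454..., 1: 0.4545...}
</syntaxhighlight>

The final assertion compares the message-passing result with brute-force marginalization of the full joint; on graphs with cycles the same message updates can be iterated ("loopy" belief propagation), but the result is then only approximate.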
===Books and book chapters===
* {{Cite book
| last = Bishop
| first = Christopher M.
| author-link = Christopher Bishop
| title = Pattern Recognition and Machine Learning
| publisher = Springer
| isbn=978-0-387-31073-2
| chapter= Chapter 8. Graphical Models
| chapter-url=https://www.microsoft.com/en-us/research/wp-content/uploads/2016/05/Bishop-PRML-sample.pdf
| pages=359–422
| mr=2247587
}}
* {{cite book
|author=Cowell, Robert G.
|author2=Dawid, A. Philip |author3=Lauritzen, Steffen L. |author4-link=David Spiegelhalter |author4=Spiegelhalter, David J.
|title=Probabilistic networks and expert systems |publisher=Springer |___location=Berlin |year=1999 |isbn=978-0-387-98767-5 |mr=1697175 |ref=cowell |author2-link=Philip Dawid }} A more advanced and statistically oriented book.
* {{cite book |author=Jensen, Finn |title=An introduction to Bayesian networks |publisher=Springer |___location=Berlin |year=1996 |isbn=978-0-387-91502-9}}
* {{Cite book
|first=Judea |last=Pearl |author-link = Judea Pearl
| year = 1988
| title = Probabilistic Reasoning in Intelligent Systems
|url=https://archive.org/details/probabilisticrea00pear |url-access=registration | edition = 2nd revised
| ___location = San Mateo, CA
| publisher = [[Morgan Kaufmann]]
}}
===Journal articles===
* {{Cite journal
| author = Edoardo M. Airoldi |author-link1=Edoardo Airoldi
| title = Getting Started in Probabilistic Graphical Models
| journal = [[PLOS Computational Biology]]
| volume = 3
| issue = 12
| pmid = 18069887
| pmc = 2134967
|arxiv=0706.2040
|bibcode=2007PLSCB...3..252A
|doi-access=free
}}
*{{Cite journal | last1 = Jordan | first1 = M. I. | author-link=Michael I. Jordan | doi = 10.1214/088342304000000026 | title = Graphical Models | journal = Statistical Science | volume = 19 | pages = 140–155 | year = 2004 | doi-access=free }}
*{{Cite journal |last=Ghahramani |first=Zoubin |date=May 2015 |title=Probabilistic machine learning and artificial intelligence |journal=Nature |language=en |volume=521 |issue=7553 |pages=452–459 |doi=10.1038/nature14541 |pmid=26017444 |bibcode=2015Natur.521..452G |s2cid=216356 |url=https://www.repository.cam.ac.uk/handle/1810/248538 |doi-access=free}}
 
===Other===