Graphical model

{{short description|Probabilistic model}}
{{about|the representation of probability distributions using graphs|the computer graphics journal|Graphical Models{{!}}''Graphical Models''}}
{{Machine learning|Structured prediction}}
{{More footnotes|date=May 2017}}
A '''graphical model''' or '''probabilistic graphical model''' ('''PGM''') or '''structured probabilistic model''' is a [[probabilistic model]] for which a [[Graph (discrete mathematics)|graph]] expresses the [[conditional dependence]] structure between [[random variable]]s. Graphical models are commonly used in [[probability theory]], [[statistics]]—particularly [[Bayesian statistics]]—and [[machine learning]].
 
==Types of graphical models==
Generally, probabilistic graphical models use a graph-based representation as the foundation for encoding a distribution over a multi-dimensional space. The graph can be regarded as a compact or [[Factor graph|factorized]] representation of the set of independences that hold in the specific distribution. Two branches of graphical representations of distributions are commonly used, namely, [[Bayesian network]]s and [[Markov random field]]s. Both families encompass the properties of factorization and independences, but they differ in the set of independences they can encode and in the factorization of the distribution that they induce.<ref name=koller09>{{cite book
|author1=Koller, D.
|author2=Friedman, N.
|title=Probabilistic Graphical Models: Principles and Techniques
|publisher=MIT Press
|year=2009
|isbn=978-0-262-01319-2
}}</ref>
 
===Undirected graphical model===
 
[[File:Examples of an Undirected Graph.svg|thumb|alt=An undirected graph with four vertices.|An undirected graph with four vertices]]
 
The undirected graph shown may have one of several interpretations; the common feature is that the presence of an edge implies some sort of dependence between the corresponding random variables. From this graph, we might deduce that B, C, and D are all [[Conditional independence|conditionally independent]] given A. This means that if the value of A is known, then the values of B, C, and D provide no further information about each other. Equivalently (in this case), the joint probability distribution can be factorized as:
 
:<math>P[A,B,C,D] = f_{AB}[A,B] \cdot f_{AC}[A,C] \cdot f_{AD}[A,D]</math>
 
for some non-negative functions <math>f_{AB}, f_{AC}, f_{AD}</math>.
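
For concreteness, the following minimal Python sketch evaluates such a factorized distribution over four binary variables. The factor tables are made up purely for illustration, and the product is normalized by brute force:

<syntaxhighlight lang="python">
from itertools import product

# Illustrative non-negative factor tables over binary variables;
# the numbers are arbitrary and carry no special meaning.
f_AB = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}
f_AC = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 0.2, (1, 1): 3.0}
f_AD = {(0, 0): 2.0, (0, 1): 0.1, (1, 0): 1.0, (1, 1): 1.0}

def unnormalized(a, b, c, d):
    # Product of the three pairwise factors, each involving A.
    return f_AB[(a, b)] * f_AC[(a, c)] * f_AD[(a, d)]

# Normalizing constant, computed by brute-force summation, so that the
# factorization defines a probability distribution.
Z = sum(unnormalized(*x) for x in product([0, 1], repeat=4))

def P(a, b, c, d):
    return unnormalized(a, b, c, d) / Z

print(P(1, 0, 1, 0))
</syntaxhighlight>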
 
===Bayesian network===
{{main|Bayesian network}}
 
[[File:Example of a Directed Graph.svg|thumb|alt=Example of a directed acyclic graph on four vertices.|Example of a directed acyclic graph on four vertices]]
 
 
If the network structure of the model is a [[directed acyclic graph]], the model represents a factorization of the joint [[probability]] of all random variables. More precisely, if the events are <math>X_1,\ldots,X_n</math> then the joint probability satisfies
:<math>P[X_1,\ldots,X_n]=\prod_{i=1}^nP[X_i|\text{pa}(X_i)]</math>
 
where <math>\text{pa}(X_i)</math> is the set of parents of node <math>X_i</math> (nodes with edges directed towards <math>X_i</math>). In other words, the [[joint distribution]] factors into a product of conditional distributions. For example, for the directed acyclic graph shown in the Figure this factorization would be
:<math>P[A,B,C,D] = P[A]\cdot P[B | A]\cdot P[C | A] \cdot P[D|A,C]</math>.
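
As a minimal sketch of this factorization, the following Python snippet evaluates the joint distribution of the four-variable network above; the conditional probability tables are made up and chosen only for illustration:

<syntaxhighlight lang="python">
from itertools import product

# Made-up conditional probability tables (CPTs) for binary A, B, C, D,
# matching the structure P[A] P[B|A] P[C|A] P[D|A,C].
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3},
               1: {0: 0.2, 1: 0.8}}
P_C_given_A = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.5, 1: 0.5}}
P_D_given_AC = {(0, 0): {0: 0.8, 1: 0.2},
                (0, 1): {0: 0.4, 1: 0.6},
                (1, 0): {0: 0.3, 1: 0.7},
                (1, 1): {0: 0.1, 1: 0.9}}

def joint(a, b, c, d):
    # P[A,B,C,D] = P[A] * P[B|A] * P[C|A] * P[D|A,C]
    return P_A[a] * P_B_given_A[a][b] * P_C_given_A[a][c] * P_D_given_AC[(a, c)][d]

# Because every CPT is locally normalized, the joint sums to one.
assert abs(sum(joint(*x) for x in product([0, 1], repeat=4)) - 1.0) < 1e-12
</syntaxhighlight>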
 
Any two nodes are [[Conditional independence|conditionally independent]] given the values of their parents. In general, any two sets of nodes are conditionally independent given a third set if a criterion called [[d-separation|''d''-separation]] holds in the graph. Local independences and global independences are equivalent in Bayesian networks.
 
This type of graphical model is known as a directed graphical model, [[Bayesian network]], or belief network. Classic machine learning models like [[hidden Markov models]], [[Artificial neural network|neural networks]] and newer models such as [[variable-order Markov model]]s can be considered special cases of Bayesian networks.
 
One of the simplest Bayesian networks is the [[Naive Bayes classifier]].
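
In a naive Bayes model the class variable is the single root and the features are conditionally independent given the class, so the joint factors as <math>P[C, X_1, \ldots, X_n] = P[C]\prod_{i=1}^n P[X_i|C]</math>. The following sketch uses made-up numbers for a toy two-class problem:

<syntaxhighlight lang="python">
import math

# Made-up prior and per-feature conditional probabilities.
P_C = {"spam": 0.3, "ham": 0.7}
P_word_given_C = {
    "spam": {"offer": 0.8, "meeting": 0.1},
    "ham":  {"offer": 0.1, "meeting": 0.6},
}

def posterior(words):
    # Log of the joint P[C] * prod_i P[X_i|C] for each class, then
    # normalization to obtain the class posterior.
    scores = {c: math.log(P_C[c]) + sum(math.log(P_word_given_C[c][w]) for w in words)
              for c in P_C}
    m = max(scores.values())
    exp_scores = {c: math.exp(s - m) for c, s in scores.items()}
    z = sum(exp_scores.values())
    return {c: v / z for c, v in exp_scores.items()}

print(posterior(["offer"]))  # "spam" is the more probable class here
</syntaxhighlight>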
 
===Cyclic directed graphical models===
 
[[File:Graph model.svg|thumb|right|alt=An example of a directed graphical model.| An example of a directed, cyclic graphical model. Each arrow indicates a dependency. In this example: D depends on A, B, and C; and C depends on B and D; whereas A and B are each independent.]]
 
The figure depicts a graphical model with a cycle. This may be interpreted in terms of each variable 'depending' on the values of its parents in some manner.
The particular graph shown suggests a joint probability density that factors as
:<math>P[A,B,C,D] = P[A]\cdot P[B]\cdot P[C,D|A,B]</math>,
but other interpretations are possible.<ref name=richardson95causal>{{cite book
|first=Thomas |last=Richardson
|chapter=A discovery algorithm for directed cyclic graphs
|title=Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence
|year=1996
|publisher=Morgan Kaufmann Pub.
|isbn=978-1-55860-412-4
}}
</ref>
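
A minimal sketch of the suggested factorization, assuming binary variables and a made-up joint conditional table for <math>(C, D)</math> given <math>(A, B)</math>:

<syntaxhighlight lang="python">
from itertools import product

# Made-up tables: A and B are independent roots, while C and D are
# modelled jointly given (A, B), as the factorization above suggests.
P_A = {0: 0.5, 1: 0.5}
P_B = {0: 0.4, 1: 0.6}
# P[C,D | A,B]: outer key (a, b), inner key (c, d); each inner table sums to 1.
P_CD_given_AB = {
    (a, b): {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
    for a, b in product([0, 1], repeat=2)
}

def joint(a, b, c, d):
    # P[A,B,C,D] = P[A] * P[B] * P[C,D|A,B]
    return P_A[a] * P_B[b] * P_CD_given_AB[(a, b)][(c, d)]

assert abs(sum(joint(*x) for x in product([0, 1], repeat=4)) - 1.0) < 1e-12
</syntaxhighlight>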
 
===Other types===
*[[Naive Bayes classifier]], which uses a tree with a single root
*[[Dependency network (graphical model)|Dependency network]] where cycles are allowed
*Tree-augmented classifier or '''TAN model'''
[[File:Tan corral.png|thumb| TAN model for "corral dataset"]]
*Targeted Bayesian network learning (TBNL) [[File:Tbnl corral.jpg|thumb|TBNL model for "corral dataset"]]
*A [[factor graph]] is an undirected [[bipartite graph]] connecting variables and factors. Each factor represents a function over the variables it is connected to. This is a helpful representation for understanding and implementing [[belief propagation]]; a minimal sketch follows this list.
* A [[clique tree]] or junction tree is a [[tree (graph theory)|tree]] of [[clique (graph theory)|cliques]], used in the [[junction tree algorithm]].
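
As a minimal sketch of belief propagation on a factor graph, the following snippet runs sum-product messages on a hypothetical three-variable chain with binary factor tables f1(A, B) and f2(B, C); the numbers are made up, and the result is checked against brute-force marginalization:

<syntaxhighlight lang="python">
import numpy as np

# Made-up factor tables for the chain A -- f1 -- B -- f2 -- C.
f1 = np.array([[1.0, 0.5],   # f1[a, b]
               [0.5, 2.0]])
f2 = np.array([[2.0, 1.0],   # f2[b, c]
               [1.0, 3.0]])

# Sum-product messages into B: leaf variables send uniform messages,
# so each factor just sums out its other argument.
m_f1_to_B = f1.sum(axis=0)   # sum over A
m_f2_to_B = f2.sum(axis=1)   # sum over C

# The belief at B is the product of incoming messages, normalized.
belief_B = m_f1_to_B * m_f2_to_B
marginal_B = belief_B / belief_B.sum()

# Brute-force check against direct summation of the factorized joint.
joint = np.einsum("ab,bc->abc", f1, f2)
assert np.allclose(marginal_B, joint.sum(axis=(0, 2)) / joint.sum())
print(marginal_B)
</syntaxhighlight>
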
*{{Cite journal |last1=Airoldi |first1=E. M. |title=Getting Started in Probabilistic Graphical Models |journal=PLoS Computational Biology |volume=3 |issue=12 |pages=e252 |year=2007 |doi=10.1371/journal.pcbi.0030252 |pmid=18069887 |pmc=2134967 |arxiv=0706.2040 |bibcode=2007PLSCB...3..252A |doi-access=free }}
*{{Cite journal | last1 = Jordan | first1 = M. I. | author-link=Michael I. Jordan| doi = 10.1214/088342304000000026 | title = Graphical Models | journal = Statistical Science | volume = 19 | pages = 140–155| year = 2004 | doi-access = free }}
*{{Cite journal|last=Ghahramani|first=Zoubin|date=May 2015|title=Probabilistic machine learning and artificial intelligence|journal=Nature|language=en|volume=521|issue=7553|pages= 452–459|doi=10.1038/nature14541|pmid=26017444|bibcode=2015Natur.521..452G|s2cid=216356|url=https://www.repository.cam.ac.uk/handle/1810/248538|doi-access=free}}
 
===Other===