{{short description|Probabilistic model}}
{{about|the representation of probability distributions using graphs|the computer graphics journal|Graphical Models{{!}}''Graphical Models''}}
{{Machine learning|Structured prediction}}
{{More footnotes|date=May 2017}}
A '''graphical model''' or '''probabilistic graphical model''' ('''PGM''') or '''structured probabilistic model''' is a [[probabilistic model]] for which a [[Graph (discrete mathematics)|graph]] expresses the [[conditional dependence]] structure between [[random variable]]s.
==Types==
Generally, probabilistic graphical models use a graph-based representation as the foundation for encoding a distribution over a multi-dimensional space; the graph serves as a compact or [[Factor graph|factorized]] representation of the set of independences that hold in the specific distribution. Two branches of graphical representations of distributions are commonly used, namely, [[Bayesian network]]s and [[Markov random field]]s. Both families encompass the properties of factorization and independences, but they differ in the set of independences they can encode and the factorization of the distribution that they induce.<ref name=koller09>{{cite book
|author=Koller, D.
|author2=Friedman, N.
|title=Probabilistic Graphical Models: Principles and Techniques
|url=http://pgm.stanford.edu/
|publisher=MIT Press
|___location=Massachusetts
|year=2009
|isbn=978-0-262-01319-2
|author2-link=Nir Friedman
|author-link=Daphne Koller
|archive-url=https://web.archive.org/web/20140427083249/http://pgm.stanford.edu/
|archive-date=2014-04-27
|url-status=dead
}}</ref>
===Undirected graphical model===
[[File:Examples of an Undirected Graph.svg|thumb|alt=An undirected graph with four vertices.|An undirected graph with four vertices]]
The undirected graph shown may have one of several interpretations; the common feature is that the presence of an edge implies some sort of dependence between the corresponding random variables. From this graph, we might deduce that B, C, and D are all [[Conditional independence|conditionally independent]] given A. This means that if the value of A is known, then the values of B, C, and D provide no further information about each other. Equivalently (in this case), the joint probability distribution can be factorized as
:<math>P[A,B,C,D] = f_{AB}[A,B] \cdot f_{AC}[A,C] \cdot f_{AD}[A,D]</math>
for some non-negative functions <math>f_{AB}, f_{AC}, f_{AD}</math>.
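A minimal numerical sketch of such a pairwise factorization (the binary state space and the particular factor values below are illustrative assumptions, not taken from the cited sources):

<syntaxhighlight lang="python">
import itertools

# Illustrative non-negative pairwise factors over binary variables;
# the particular numbers are arbitrary assumptions for this sketch.
f_AB = {(a, b): [[1.0, 2.0], [0.5, 1.5]][a][b] for a in (0, 1) for b in (0, 1)}
f_AC = {(a, c): [[2.0, 1.0], [1.0, 3.0]][a][c] for a in (0, 1) for c in (0, 1)}
f_AD = {(a, d): [[1.0, 1.0], [2.0, 0.5]][a][d] for a in (0, 1) for d in (0, 1)}

def unnormalized(a, b, c, d):
    # The unnormalized joint is the product of the pairwise factors.
    return f_AB[a, b] * f_AC[a, c] * f_AD[a, d]

# The normalizing constant Z turns the factor product into a distribution.
Z = sum(unnormalized(*x) for x in itertools.product((0, 1), repeat=4))

def P(a, b, c, d):
    return unnormalized(a, b, c, d) / Z

# Since every factor couples A with exactly one other variable,
# B, C, and D are conditionally independent given A; spot-check at A=1:
pA = sum(P(1, b, c, d) for b, c, d in itertools.product((0, 1), repeat=3))
pB = sum(P(1, 1, c, d) for c, d in itertools.product((0, 1), repeat=2)) / pA
pC = sum(P(1, b, 1, d) for b, d in itertools.product((0, 1), repeat=2)) / pA
pBC = sum(P(1, 1, 1, d) for d in (0, 1)) / pA
assert abs(pBC - pB * pC) < 1e-12
</syntaxhighlight>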
===Bayesian network===
{{main|Bayesian network}}
[[File:Example of a Directed Graph.svg|thumb|alt=Example of a directed acyclic graph on four vertices.|Example of a directed acyclic graph on four vertices]]
If the network structure of the model is a [[directed acyclic graph]], the model represents a factorization of the joint [[probability]] of all random variables. More precisely, if the random variables are <math>X_1,\ldots,X_n</math>, then the joint probability satisfies
:<math>P[X_1,\ldots,X_n]=\prod_{i=1}^nP[X_i|\text{pa}(X_i)]</math>
where <math>\text{pa}(X_i)</math> is the set of parents of node <math>X_i</math> (nodes with edges directed towards <math>X_i</math>). In other words, the [[joint distribution]] factors into a product of conditional distributions. For example,
:<math>P[A,B,C,D] = P[A]\cdot P[B | A]\cdot P[C | A] \cdot P[D|A,C]</math>.
Any two nodes are [[Conditional independence|conditionally independent]] given the values of their parents. In general, any two sets of nodes are conditionally independent given a third set if a criterion called [[d-separation|''d''-separation]] holds in the graph. Local independences and global independences are equivalent in Bayesian networks.
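A minimal numerical sketch of this factorization (the binary variables and the particular conditional probability tables below are illustrative assumptions, not from the cited sources) builds the joint distribution of the example above from its conditional distributions and spot-checks the independence of B and C given A implied by ''d''-separation:

<syntaxhighlight lang="python">
import itertools

# Illustrative CPTs for binary variables, matching the factorization
# P[A,B,C,D] = P[A] * P[B|A] * P[C|A] * P[D|A,C]; the numbers are
# arbitrary assumptions for this sketch.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P_C_given_A = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}
P_D_given_AC = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.6, 1: 0.4},
                (1, 0): {0: 0.4, 1: 0.6}, (1, 1): {0: 0.05, 1: 0.95}}

def joint(a, b, c, d):
    # Each node contributes its conditional distribution given its parents.
    return P_A[a] * P_B_given_A[a][b] * P_C_given_A[a][c] * P_D_given_AC[(a, c)][d]

# The joint sums to one because each CPT does.
assert abs(sum(joint(*x) for x in itertools.product((0, 1), repeat=4)) - 1) < 1e-12

# d-separation implies B and C are independent given A: the path B-A-C is a
# fork blocked by observing A, and the path through D is blocked at the
# unobserved collider D. Spot-check at A=0:
pA = sum(joint(0, b, c, d) for b, c, d in itertools.product((0, 1), repeat=3))
pB = sum(joint(0, 1, c, d) for c, d in itertools.product((0, 1), repeat=2)) / pA
pC = sum(joint(0, b, 1, d) for b, d in itertools.product((0, 1), repeat=2)) / pA
pBC = sum(joint(0, 1, 1, d) for d in (0, 1)) / pA
assert abs(pBC - pB * pC) < 1e-9
</syntaxhighlight>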
This type of graphical model is known as a directed graphical model, [[Bayesian network]], or belief network. Classic machine learning models like [[hidden Markov models]], [[Artificial neural network|neural networks]] and newer models such as [[variable-order Markov model]]s can be considered special cases of Bayesian networks.
One of the simplest Bayesian networks is the [[Naive Bayes classifier]].
===Cyclic directed graphical models===
[[File:Graph model.svg|thumb|right|alt=An example of a directed graphical model.| An example of a directed, cyclic graphical model. Each arrow indicates a dependency. In this example: D depends on A, B, and C; and C depends on B and D; whereas A and B are each independent.]]
The figure depicts a graphical model with a cycle. This may be interpreted in terms of each variable 'depending' on the values of its parents in some manner. The particular graph shown suggests a joint probability density that factors as
:<math>P[A,B,C,D] = P[A]\cdot P[B]\cdot P[C,D|A,B]</math>,
but other interpretations are possible.<ref name=richardson95causal>{{cite book
|first=Thomas |last=Richardson
|chapter=A discovery algorithm for directed cyclic graphs
|title=Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence
|year=1996
|publisher=Morgan Kaufmann Pub.
|isbn=978-1-55860-412-4
}}
</ref>
===Other types===
*[[Dependency network (graphical model)|Dependency network]] where cycles are allowed
*Tree-augmented classifier or '''TAN model'''
[[File:Tan corral.png|thumb| TAN model for "corral dataset"]]
*Targeted Bayesian network learning (TBNL) [[File:Tbnl corral.jpg|thumb|TBNL model for "corral dataset"]]
*A [[factor graph]] is an undirected [[bipartite graph]] connecting variables and factors. Each factor represents a function over the variables it is connected to. This is a helpful representation for understanding and implementing [[belief propagation]]; a minimal sum-product sketch follows this list.
* A [[clique tree]] or junction tree is a [[tree (graph theory)|tree]] of [[clique (graph theory)|cliques]], used in the [[junction tree algorithm]].
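As an illustration of the factor-graph representation mentioned above, the following sketch runs the sum-product ([[belief propagation]]) update on a small chain-structured factor graph; the chain topology, the binary state space, and the factor tables are assumptions made for the example, not taken from the references:

<syntaxhighlight lang="python">
import numpy as np

# A tiny chain-structured factor graph over binary variables:
#   X1 -- f12 -- X2 -- f23 -- X3
# Factor values are arbitrary non-negative numbers for this sketch.
f12 = np.array([[1.0, 3.0], [2.0, 1.0]])   # f12[x1, x2]
f23 = np.array([[2.0, 1.0], [1.0, 4.0]])   # f23[x2, x3]

# Sum-product on a tree: each message is a table over one variable.
# The message from f12 to X2 sums out X1 (leaf variables send uniform messages).
m_f12_to_x2 = f12.sum(axis=0)                      # shape (2,)
# X2 forwards that message into f23, which sums out X2.
m_f23_to_x3 = (m_f12_to_x2[:, None] * f23).sum(axis=0)

# On a tree, the normalized incoming message at X3 is its exact marginal.
marginal_x3 = m_f23_to_x3 / m_f23_to_x3.sum()

# Cross-check against brute-force enumeration of the factorized joint.
joint = f12[:, :, None] * f23[None, :, :]          # joint[x1, x2, x3], unnormalized
brute = joint.sum(axis=(0, 1)) / joint.sum()
assert np.allclose(marginal_x3, brute)
</syntaxhighlight>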
===Books and book chapters===
* {{Cite book
| last = Bishop
| first = Christopher M.
| title = Pattern Recognition and Machine Learning
| publisher = Springer
| year = 2006
| isbn=978-0-387-31073-2
| chapter= Chapter 8. Graphical Models
| pages=359–422
| mr=2247587
}}
* {{cite book
|author=Cowell, Robert G.
|author2=Dawid, A. Philip |author3=Lauritzen, Steffen L. |author4-link=David Spiegelhalter |author4=Spiegelhalter, David J.
|title=Probabilistic networks and expert systems |publisher=Springer |___location=Berlin |year=1999}}
* {{cite book |author=Jensen, Finn |title=An introduction to Bayesian networks |publisher=Springer |___location=Berlin |year=1996}}
* {{Cite book
|first=Judea |last=Pearl |author-link=Judea Pearl
| year = 1988
| title = Probabilistic Reasoning in Intelligent Systems
|url=https://archive.org/details/probabilisticrea00pear |url-access=registration | edition = 2nd revised
| ___location = San Mateo, CA
| publisher = [[Morgan Kaufmann]]
}}
===Journal articles===
* {{Cite journal
| author = Edoardo M. Airoldi
| title = Getting Started in Probabilistic Graphical Models
| journal = [[PLOS Computational Biology]]
| volume = 3
| issue = 12
| pages = e252
| year = 2007
| doi = 10.1371/journal.pcbi.0030252
| doi-access = free
| pmid = 18069887
| pmc = 2134967
| arxiv = 0706.2040
| bibcode = 2007PLSCB...3..252A
}}
*{{Cite journal | last1 = Jordan | first1 = M. I. | author-link = Michael I. Jordan | doi = 10.1214/088342304000000026 | title = Graphical Models | journal = Statistical Science | volume = 19 | pages = 140–155 | year = 2004 }}
*{{Cite journal |last=Ghahramani |first=Zoubin |date=May 2015 |title=Probabilistic machine learning and artificial intelligence |journal=Nature |language=English |volume=521 |issue=7553 |pages=452–459 |doi=10.1038/nature14541 |pmid=26017444}}
===Other===