Markov kernel: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 18:16, 23 March 2020 edit Cosmia Nebula (talk \| contribs) Extended confirmed users 11,336 edits →Measurable functions. Tag: 2017 wikitext editor ← Previous edit		Latest revision as of 14:25, 11 September 2024 edit undo RMGunton (talk \| contribs) 189 edits m →Formal definition: Add hyperlink to sigma-algebra page
(30 intermediate revisions by 21 users not shown)
Line 1: {{Short description\|Concept in probability theory}} In [[probability theory]], a '''Markov kernel''' (also known as a '''stochastic kernel''' or '''probability kernel''') is a map that in the general theory of [[Markov process]]es, plays the role that the [[Stochastic matrix\|transition matrix]] does in the theory of Markov processes with a [[finite set\|finite]] [[state space]].<ref>{{Cite ~~journal~~book \| last1 = Reiss \| first1 = R. D. \| title = A Course on Point Processes \| doi = 10.1007/978-1-4613-9308-5 \| series = Springer Series in Statistics \| year = 1993 \| isbn = 978-1-4613-9310-8 ~~\| pmid = \| pmc =~~ }}</ref> == Formal definition == Let <math>(X,\mathcal A)</math> and <math>(Y,\mathcal B)</math> be [[measurable space]]s. A ''Markov kernel'' with source <math>~~\kappa:~~ (X ,\tomathcal YA)</math> ~~with~~and ~~source~~target <math>(XY,\mathcal AB)</math>, ~~and~~sometimes ~~target~~written as <math>\kappa:(X,\mathcal{A})\to(Y,\mathcal {B})</math>, is a ~~map~~function <math>\kappa : \mathcal B \times X \to [0,1]</math> with the following properties: # For every (fixed) <math>BB_0 \in \mathcal B</math>, the map <math> x \mapsto \kappa(BB_0, x)</math> is <math>\mathcal A</math>-[[measurable function\|measurable]] # For every (fixed) <math> xx_0 \in X</math>, the map <math> B \mapsto \kappa(B, xx_0)</math> is a [[probability measure]] on <math>(Y, \mathcal B)</math> In other words it associates to each point <math>x \in X</math> a [[probability measure]] <math>\kappa(dy\|x): B \mapsto \kappa(B, x)</math> on <math>(Y,\mathcal B)</math> such that, for every measurable set <math>B\in\mathcal B</math>, the map <math>x\mapsto \kappa(B, x)</math> is measurable with respect to the [[Σ-algebra\|<math>\sigma</math>-algebra <math>\mathcal A.</math>]].<ref>{{cite book \|last1=Klenke \|first1=Achim \|title=Probability Theory: A Comprehensive Course\|series=Universitext \|year=2014 \|publisher=Springer\|page=180\|edition=2\|doi=10.1007/978-1-4471-5361-0\|isbn=978-1-4471-5360-3 }}</ref>. == Examples == ===[[Simple random walk]] on the integers === Take <math>X=Y=\Z</math>, and <math> \mathcal A = \mathcal B = \mathcal P(\Z)</math> (the [[power set]] of <math>\Z</math>). Then a Markov kernel is fully determined by the probability it assigns to ~~a singleton set~~singletons <math>\{m\}~~</math>~~,\, ~~with <math>~~m \in Y = \Z</math> for each <math>n \in X = \Z</math>: :<math>\kappa(B\|n )=\sum_{m \in B}\kappa(\{m\}\|n), \qquad \forall n \in \mathbb{Z}, \, \forall B \in \mathcal B</math>. Now the random walk <math>\kappa</math> that goes to the right with probability <math>p</math> and to the left with probability <math>1 - p</math> is defined by Line 35 ⟶ 36: then <math> \kappa(dy \|x) = k(y, x)\nu(dy) </math> i.e. the mapping :<math>\begin{cases} \kappa:\mathcal B \times X \to [0,1] \\ \kappa(B\|x)=\int_{B}k(y, x)\nu(\mathrm{d} y) \end{cases}</math> defines a Markov kernel.<ref>{{cite book\|last1=Erhan\|first1=Cinlar\|title=Probability and Stochastics\|date=2011\|publisher=Springer\|___location=New York\|isbn=978-0-387-87858-4\|pages=37–38}}</ref>. This example generalises the countable Markov process example where <math>\nu</math> was the [[counting measure]]. Moreover it encompasses other important examples such as the convolution kernels, in particular the Markov kernels defined by the heat equation. The latter example includes the [[Gaussian kernel]] on <math>X = Y = \mathbb R</math> with <math>\nu(dx) = dx </math> standard Lebesgue measure and :<math>k_t(y, x) = \frac{1}{\sqrt{2\pi}t}e^{-(y - x)^2/(2t^2)}</math>. === Measurable functions. === Take <math>(X, \mathcal{A})</math> and <math>(Y, \mathcal{B})</math> arbitrary measurable spaces, and let <math>f:X \to Y</math> be a measurable function. Now define <math>\kappa(dy\|x) = \delta_{f(x)}(dy)</math> i.e. :<math> \kappa(B\|x) = \mathbf{1}_B(f(x)) = \mathbf{1}_{f^{-1}(B)}(x) = \begin{cases}1 & \text{if } f(x) \in B\\ 0 & \text{otherwise}\end{cases}</math> for all <math>B \in \mathcal{B}</math>. Note that the indicator function <math>\mathbf{1}_{f^{-1}(B)}</math> is <math>\mathcal{A}</math>-measurable for all <math>B \in \mathcal{B}</math> iff <math>f</math> is measurable. This example allows us to think of a Markov kernel as a generalised function with a (in general) random rather than certain value. That is, it is a [[multivalued function]] where the values are not equally weighted. ===[[Galton–Watson process]]=== As a less obvious example, take <math>X = \N, \mathcal A = \mathcal P(\N)</math>, and <math>(Y, \mathcal B)</math> the real numbers <math>\R</math> with the standard sigma algebra of [[Borel set]]s. Then :<math>\kappa(B\|n)=\begin{cases} \mathbf{1}_B(0) & n=0\\ \Pr(\xi_1 + \cdots + \xi_x \in B) & n \neq 0 \\ \end{cases} </math> ~~with~~where <math> x </math> is the number of element at the state <math> n </math>, <math>\xi_i</math> are [[Independent and identically distributed random variables\|i.i.d.]] [[random variable]]s ~~<math>\xi_i</math>~~ (usually with mean 0) and where <math>\mathbf{1}_B</math> is the indicator function. For the simple case of [[Bernoulli distribution\|coin flips]] this models the different levels of a [[Galton board]]. == Composition of Markov Kernels ~~and the Markov Category~~== Given measurable spaces <math>(X, \mathcal A)</math>, <math>(Y, \mathcal B) </math> ~~and~~we consider a Markov kernel <math>\kappa: \mathcal B \times X \to [0,1]</math> as a morphism <math>\kappa: X \to Y</math>. Intuitively, rather than assigning to each <math>x \in X</math> a sharply defined point <math> y \in Y</math> the kernel assigns a "fuzzy" point in <math>Y</math> which is only known with some level of uncertainty, much like actual physical measurements. If we have a third measurable space <math>(Z, \mathcal C)</math>, and probability kernels <math>\kappa: X \to Y</math> and <math>\lambda: Y \to Z</math>, we can define a composition <math>\lambda \circ \kappa : X \to Z</math> by the [[Chapman-Kolmogorov equation]] :<math>(\lambda \circ \kappa) (dz\|x) = \int_Y \lambda(dz \| y)\kappa(dy\|x)</math>. The composition is associative by ~~[[Fubini's~~the ~~theorem#Tonelli's_theorem_for_non-negative_measurable_function\|Tonelli's~~Monotone ~~theorem]]~~Convergence Theorem and the identity function considered as a Markov kernel (i.e. the delta measure <math> \kappa_{1}(dx'\|x) = \delta_x(dx')</math>) is the unit for this composition. This composition defines the structure of a [[category (mathematics)\|category]] on the measurable spaces with Markov kernels as morphisms, first defined by Lawvere,<ref>{{cite web\|author = F. W. Lawvere\|title = The Category of Probabilistic Mappings\|date = 1962\|url = https://ncatlab.org/nlab/files/lawvereprobability1962.pdf}}</ref>. ~~The~~the [[category ~~has the empty set as initial object and the one point set <math></math> as the~~of ~~terminal~~Markov ~~object~~kernels]]. == Probability Space defined by Probability Distribution and a Markov Kernel== A ~~probability measure~~composition onof a ~~measurable~~probability space <math>(X, \mathcal A, P_X)</math> isand ~~the~~a ~~same~~probability ~~thing~~kernel as<math>\kappa: (X, \mathcal A) \to (Y, \mathcal B) </math> defines a ~~morphism~~probability space <math>(Y, \tomathcal XB, P_Y = \kappa \circ P_X)</math>, where the probability measure is given by :<math> P_Y(B) = \~~int_B~~int_X \~~int_X~~int_B \kappa(dy\|x) P_X(dx) = \int_X \kappa(B\|x)P_X(dx) = \mathbb{E}_{P_X}\kappa(B\|\cdot) .</math>▼ in the Markov category also denoted by <math>P</math>. By composition, a probability space <math>(X, \mathcal A, P_X)</math> and a probability kernel <math>\kappa: (X, \mathcal A) \to (Y, \mathcal B) </math> defines a probability space <math>(Y, \mathcal B, P_Y = \kappa \circ P_X)</math>. It is concretely defined by ▲:<math> P_Y(B) = \int_B \int_X \kappa(dy\|x) P_X(dx) = \int_X \kappa(B\|x)P_X(dx) = \mathbb{E}_{P_X}\kappa(B\|\cdot) </math> == Properties == Line 75: Let <math>(S,Y)</math> be a [[Borel set#Standard Borel spaces and Kuratowski theorems\|Borel space]], <math>X</math> a <math>(S,Y)</math>-valued random variable on the measure space <math>(\Omega, \mathcal{F}, P)</math> and <math>\mathcal G \subseteq \mathcal F</math> a sub-<math>\sigma</math>-algebra. Then there exists a Markov kernel <math>\kappa</math> from <math>(\Omega, \mathcal G)</math> to <math>(S,Y)</math>, such that <math>\kappa(\cdot,B)</math> is a version of the [[conditional expectation]] <math>\mathbb{E}[\mathbf 1_{\{X \in B\}} \mid \mathcal G]</math> for every <math>B \in Y</math>, i.e. :<math>P(X \in B\mid\mathcal G)=\mathbb{E} \left [\mathbf 1_{\{X \in B\}}\mid\mathcal G \right ] = \kappa(\~~omega~~cdot,B), \qquad P\text{-a.s.}\,\, \forall B \in \mathcal G.</math> It is called regular conditional distribution of <math>X</math> given <math>\mathcal G</math> and is not uniquely defined. Line 84: can be any type of (non negative) measure, not necessarily a probability measure. == External links == * [https://ncatlab.org/nlab/show/Markov+kernel Markov kernel] in [https://ncatlab.org/ nLab]. == References == Line 89 ⟶ 93: * {{citation\|first1=Heinz\|last1=Bauer\|title=Probability Theory\|publisher=de Gruyter\|year=1996\|isbn=3-11-013935-9}} :: §36. Kernels and semigroups of kernels == See also == * [[Category of Markov kernels]] [[Category:Markov processes]]