Markov kernel

In probability theory, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that plays the role, in the general theory of Markov processes, that the transition matrix does in the theory of Markov processes with a finite state space.[1]

Formal definition

Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces. A Markov kernel with source $(X, \mathcal{A})$ and target $(Y, \mathcal{B})$ is a map $\kappa : \mathcal{B} \times X \to [0, 1]$ with the following properties:

  1. For every (fixed) $B \in \mathcal{B}$, the map $x \mapsto \kappa(B, x)$ is $\mathcal{A}$-measurable
  2. For every (fixed) $x \in X$, the map $B \mapsto \kappa(B, x)$ is a probability measure on $(Y, \mathcal{B})$

In other words it associates to each point $x \in X$ a probability measure $\kappa(\mathrm{d}y|x) : B \mapsto \kappa(B, x)$ on $(Y, \mathcal{B})$ such that, for every measurable set $B \in \mathcal{B}$, the map $x \mapsto \kappa(B|x)$ is measurable with respect to the $\sigma$-algebra $\mathcal{A}$.[2]
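
For instance, on the two-point spaces $X = Y = \{0, 1\}$ with their power set $\sigma$-algebras, the map

  $\kappa(B|x) = \tfrac{2}{3}\,\mathbf{1}_B(x) + \tfrac{1}{3}\,\mathbf{1}_B(1 - x), \qquad B \subseteq \{0, 1\},\ x \in \{0, 1\},$

is a Markov kernel: for each fixed $x$ it is the probability measure that keeps the value $x$ with probability $2/3$ and flips it with probability $1/3$, and for each fixed $B$ the map $x \mapsto \kappa(B|x)$ is trivially measurable since $X$ is finite.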

Examples

Simple random walk on the integers

Take $X = Y = \mathbb{Z}$ and $\mathcal{A} = \mathcal{B} = \mathcal{P}(\mathbb{Z})$ (the power set of $\mathbb{Z}$). Then a Markov kernel is fully determined by the probability it assigns to the singleton sets $\{m\}$, $m \in Y = \mathbb{Z}$, for each $n \in X = \mathbb{Z}$:

  $\kappa(B|n) = \sum_{m \in B} \kappa(\{m\}|n), \qquad \forall n \in \mathbb{Z},\ \forall B \in \mathcal{B}.$

Now the random walk $\kappa$ that goes to the right with probability $p$ and to the left with probability $1 - p$ is defined by

  $\kappa(\{m\}|n) = p\,\delta_{m, n+1} + (1 - p)\,\delta_{m, n-1}, \qquad \forall n, m \in \mathbb{Z},$

where $\delta$ is the Kronecker delta. The transition probabilities $P(m|n) = \kappa(\{m\}|n)$ for the random walk are equivalent to the Markov kernel.
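
For each fixed $n$ this is indeed a probability measure, since

  $\kappa(\mathbb{Z}|n) = \sum_{m \in \mathbb{Z}} \kappa(\{m\}|n) = p + (1 - p) = 1;$

taking $p = \tfrac{1}{2}$ gives the simple symmetric random walk.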

General Markov processes with countable state space

More generally take $X$ and $Y$ both countable and $\mathcal{A} = \mathcal{P}(X)$, $\mathcal{B} = \mathcal{P}(Y)$. Again a Markov kernel is defined by the probability it assigns to singleton sets for each $i \in X$,

  $\kappa(B|i) = \sum_{j \in B} \kappa(\{j\}|i), \qquad \forall i \in X,\ \forall B \in \mathcal{B}.$

We define a Markov process by defining a transition probability $P(j|i) = K_{ji}$, where the numbers $K_{ji}$ define a (countable) stochastic matrix $(K_{ji})$, i.e.

  $K_{ji} \geq 0 \ \ \forall (j, i) \in Y \times X, \qquad \sum_{j \in Y} K_{ji} = 1 \ \ \forall i \in X.$

We then define

  $\kappa(B|i) = \sum_{j \in B} K_{ji} = \sum_{j \in B} P(j|i), \qquad \forall i \in X,\ \forall B \in \mathcal{B}.$

Again the transition probability, the stochastic matrix and the Markov kernel are equivalent reformulations.
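
For instance, on the two-element state space $X = Y = \{1, 2\}$ any such kernel is given by a column-stochastic matrix of the form

  $K = \begin{pmatrix} 1 - \alpha & \beta \\ \alpha & 1 - \beta \end{pmatrix}, \qquad \alpha, \beta \in [0, 1],$

and, e.g., $\kappa(\{2\}|1) = K_{21} = \alpha$ is the probability of jumping from state $1$ to state $2$.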

Markov kernel defined by a kernel function and a measure

If $\nu$ is a measure on $(Y, \mathcal{B})$ and $k : Y \times X \to [0, \infty]$ is a function that is measurable with respect to the product $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B}$ and such that

  $\int_Y k(y, x)\, \nu(\mathrm{d}y) = 1, \qquad \forall x \in X,$

then $\kappa(\mathrm{d}y|x) = k(y, x)\, \nu(\mathrm{d}y)$, i.e. the mapping

  $\kappa(B|x) = \int_B k(y, x)\, \nu(\mathrm{d}y), \qquad \forall B \in \mathcal{B},\ \forall x \in X,$

defines a Markov kernel.[3] This example generalises the countable Markov process example, where $\nu$ was the counting measure. Other important examples are the convolution kernels, e.g. the Gaussian kernel defined by the heat equation on $X = Y = \mathbb{R}$, with $\nu(\mathrm{d}y) = \mathrm{d}y$ the standard Lebesgue measure and

  $k_t(y, x) = \frac{1}{\sqrt{4\pi t}}\, e^{-(y - x)^2/(4t)}.$
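
Indeed, for every $x \in \mathbb{R}$ and $t > 0$ the Gaussian integral gives

  $\int_{\mathbb{R}} k_t(y, x)\, \mathrm{d}y = \frac{1}{\sqrt{4\pi t}} \int_{\mathbb{R}} e^{-(y - x)^2/(4t)}\, \mathrm{d}y = 1,$

so $\kappa_t(\mathrm{d}y|x) = k_t(y, x)\, \mathrm{d}y$ is the normal distribution with mean $x$ and variance $2t$.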

Measurable functions

Take $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ arbitrary measurable spaces, and let $f : X \to Y$ be a measurable function. Now define $\kappa(\mathrm{d}y|x) = \delta_{f(x)}(\mathrm{d}y)$, i.e.

  $\kappa(B|x) = \mathbf{1}_B(f(x)) = \mathbf{1}_{f^{-1}(B)}(x) = \begin{cases} 1 & \text{if } f(x) \in B \\ 0 & \text{otherwise} \end{cases}$ for all $B \in \mathcal{B}$, $x \in X$.

Note that the indicator function $\mathbf{1}_{f^{-1}(B)}$ is $\mathcal{A}$-measurable for all $B \in \mathcal{B}$ if and only if $f$ is measurable.

This example allows us to think of a Markov kernel as a generalised function whose value at a point is, in general, random rather than certain.
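
For example, with $X = Y = \mathbb{R}$ and $f(x) = x^2$, the associated kernel is $\kappa(B|x) = \mathbf{1}_B(x^2)$: starting from $x$, all of the mass sits on the single point $x^2$, so the value is in fact deterministic.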

As a less obvious example, take $X = \mathbb{N}$, $\mathcal{A} = \mathcal{P}(\mathbb{N})$, and $(Y, \mathcal{B})$ the real numbers $\mathbb{R}$ with the standard $\sigma$-algebra of Borel sets. Then

  $\kappa(B|n) = \begin{cases} \mathbf{1}_B(0) & \text{if } n = 0 \\ \Pr(\xi_1 + \cdots + \xi_n \in B) & \text{if } n \neq 0 \end{cases}$

with i.i.d. random variables $\xi_i$ (usually with mean $0$) and where $\mathbf{1}_B$ is the indicator function. For the simple case of coin flips this models the different levels of a Galton board.
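
Concretely, for fair coin flips $\xi_i = \pm 1$ with probability $\tfrac{1}{2}$ each, the measure $\kappa(\cdot|n)$ is supported on $\{-n, -n + 2, \dots, n\}$ with

  $\kappa(\{n - 2k\}|n) = \binom{n}{k} 2^{-n}, \qquad k = 0, 1, \dots, n,$

the distribution of the horizontal displacement after $n$ rows of the board.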

Composition of Markov kernels and the Markov category

Given measurable spaces $(X, \mathcal{A})$, $(Y, \mathcal{B})$ and $(Z, \mathcal{C})$, and probability kernels $\kappa : X \to Y$ and $\lambda : Y \to Z$, we can define a composition $\lambda \circ \kappa : X \to Z$ by

  $(\lambda \circ \kappa)(\mathrm{d}z|x) = \int_Y \lambda(\mathrm{d}z|y)\, \kappa(\mathrm{d}y|x).$

The composition is associative by Tonelli's theorem, and the identity function considered as a Markov kernel (i.e. the delta measure $\kappa_{\mathrm{id}}(\mathrm{d}x'|x) = \delta_x(\mathrm{d}x')$) is the unit for this composition.
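
When the spaces are countable and the kernels are given by stochastic matrices as above, say $K$ for $\kappa$ and $L$ for $\lambda$, the composition is simply matrix multiplication:

  $(\lambda \circ \kappa)(\{k\}|i) = \sum_{j \in Y} L_{kj} K_{ji} = (LK)_{ki};$

in particular, for $X = Y = Z$ composing a kernel with itself $n$ times gives the $n$-step transition probabilities of the corresponding Markov chain.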

This composition defines the structure of a category on the measurable spaces with Markov kernels as morphisms, first defined by Lawvere.[4] The category has the empty set as initial object and the one-point set $*$ as the terminal object. A probability measure on a measurable space $(X, \mathcal{A})$ is the same thing as a morphism $* \to (X, \mathcal{A})$ in this category. By composition, a probability space $(X, \mathcal{A}, P_X)$ and a probability kernel $\kappa : (X, \mathcal{A}) \to (Y, \mathcal{B})$ define a probability space $(Y, \mathcal{B}, P_Y = \kappa \circ P_X)$.
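
Concretely, the measure $P_Y = \kappa \circ P_X$ obtained by composition is

  $P_Y(B) = \int_X \kappa(B|x)\, P_X(\mathrm{d}x), \qquad B \in \mathcal{B},$

i.e. the distribution obtained by drawing a point from $P_X$ and then moving it according to $\kappa$.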

Properties

Semidirect product

Let $(X, \mathcal{A}, P)$ be a probability space and $\kappa$ a Markov kernel from $(X, \mathcal{A})$ to some $(Y, \mathcal{B})$. Then there exists a unique measure $Q$ on $(X \times Y, \mathcal{A} \otimes \mathcal{B})$ such that:

  $Q(A \times B) = \int_A \kappa(B|x)\, P(\mathrm{d}x), \qquad \forall A \in \mathcal{A},\ \forall B \in \mathcal{B}.$
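
In particular, taking $B = Y$ shows that the first marginal of $Q$ is $P$ itself, $Q(A \times Y) = P(A)$, while taking $A = X$ recovers the composite measure of the previous section, $Q(X \times B) = (\kappa \circ P)(B)$.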

Regular conditional distribution

Let $(X, \mathcal{B})$ be a Borel space, $X$ a $(X, \mathcal{B})$-valued random variable on the measure space $(\Omega, \mathcal{F}, P)$, and $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra. Then there exists a Markov kernel $\kappa$ from $(\Omega, \mathcal{G})$ to $(X, \mathcal{B})$ such that $\kappa(\cdot, B)$ is a version of the conditional expectation $\mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}]$ for every $B \in \mathcal{B}$, i.e.

  $P(X \in B \mid \mathcal{G}) = \mathbb{E}[\mathbf{1}_{\{X \in B\}} \mid \mathcal{G}] = \kappa(\cdot, B) \qquad P\text{-almost surely}.$

It is called the regular conditional distribution of $X$ given $\mathcal{G}$ and is not uniquely defined.
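
For instance, if $\mathcal{G} = \{\emptyset, \Omega\}$ is the trivial sub-$\sigma$-algebra, then conditional expectations given $\mathcal{G}$ are constants and the kernel reduces to the (unconditional) distribution of $X$:

  $\kappa(\omega, B) = P(X \in B), \qquad \forall \omega \in \Omega,\ \forall B \in \mathcal{B}.$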

Generalizations

Transition kernels generalize Markov kernels in the sense that, for all $x \in X$, the map

  $B \mapsto \kappa(B|x)$

is not necessarily a probability measure but can be any type of (non-negative) measure.
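
For example, the map $\kappa(B|x) = 2\,\mathbf{1}_B(x)$, i.e. twice the Dirac measure at $x$, is a transition kernel but not a Markov kernel, since each $\kappa(\cdot|x)$ has total mass $2$; transition kernels with total mass at most $1$ (sub-probability kernels) arise, for example, for Markov processes that may be killed.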

References

  1. ^ Reiss, R. D. (1993). A Course on Point Processes. Springer Series in Statistics. Springer. doi:10.1007/978-1-4613-9308-5. ISBN 978-1-4613-9310-8.
  2. ^ Klenke, Achim. Probability Theory: A Comprehensive Course (2 ed.). Springer. p. 180. doi:10.1007/978-1-4471-5361-0.
  3. ^ Erhan, Cinlar (2011). Probability and Stochastics. New York: Springer. pp. 37–38. ISBN 978-0-387-87858-4.
  4. ^ F. W. Lawvere (1962). "The Category of Probabilistic Mappings" (PDF).