Discrete-time Markov chain

{{Short description|Probability concept}}
{{About|the mathematical properties of discrete-time Markov chains|the general field of study|Markov chain}}
[[File:Markovkate 01.svg|thumb|A Markov chain with two states, ''A'' and ''E''.]]
In [[probability]], a ('''discrete-time''') '''Markov chain''' ('''DTMC''') is a sequence of random variables, known as a [[stochastic process]], in which the value of the next variable depends only on the value of the current variable, and not on any variables in the past. For instance, a machine may have two states, ''A'' and ''E''. When it is in state ''A'', there is a 40% chance of it moving to state ''E'' and a 60% chance of it remaining in state ''A''. When it is in state ''E'', there is a 70% chance of it moving to ''A'' and a 30% chance of it staying in ''E''. The sequence of states of the machine is a Markov chain. If we denote the chain by <math>X_0, X_1, X_2, ...</math> then <math>X_0</math> is the state which the machine starts in and <math>X_{10}</math> is the [[random variable]] describing its state after 10 transitions. The process continues forever, indexed by the [[natural numbers]].
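The two-state machine above can be simulated directly: at each step the next state is drawn from the transition probabilities of the current state. The following is a minimal Python sketch; the dictionary encoding of the transition probabilities and the fixed random seed are illustrative choices, not part of the definition.

<syntaxhighlight lang="python">
import random

# Transition probabilities of the two-state machine described above:
# from A: stay in A with probability 0.6, move to E with probability 0.4
# from E: move to A with probability 0.7, stay in E with probability 0.3
transitions = {
    "A": [("A", 0.6), ("E", 0.4)],
    "E": [("A", 0.7), ("E", 0.3)],
}

def simulate(start, steps, seed=0):
    """Return the sequence X_0, X_1, ..., X_steps of visited states."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(steps):
        states, probs = zip(*transitions[state])
        state = rng.choices(states, weights=probs)[0]
        path.append(state)
    return path

print(simulate("A", 10))  # e.g. ['A', 'A', 'E', 'A', ...]
</syntaxhighlight>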
 
An example of a stochastic process which is not a Markov chain is the model of a machine which has states ''A'' and ''E'' and moves to ''A'' from either state with 50% chance if it has ever visited ''A'' before, and 20% chance if it has never visited ''A'' before (leaving a 50% or 80% chance that the machine moves to ''E''). This is because the behavior of the machine depends on the whole history—if the machine is in ''E'', it may have a 50% or 20% chance of moving to ''A'', depending on its past values. Hence, it does not have the [[Markov property]].
==Definition==
A discrete-time Markov chain is a sequence of [[random variable]]s <math>X_1, X_2, X_3, \ldots</math> with the [[Markov property]], namely that the probability of moving to the next state depends only on the present state and not on the previous states:
:<math>\Pr(X_{n+1}=x\mid X_1=x_1, X_2=x_2, \ldots, X_n=x_n) = \Pr(X_{n+1}=x\mid X_n=x_n),</math> if both [[conditional probability|conditional probabilities]] are well defined, that is, if <math>\Pr(X_1=x_1,\ldots,X_n=x_n)>0.</math>
 
The possible values of ''X''<sub>''i''</sub> form a [[countable set]] ''S'' called the state space of the chain.<ref name="PRS">{{cite book | title=Probability and Random Processes | first2=D. R. | last2=Stirzaker | first1=G. R. | last1=Grimmett | author-link=Geoffrey Grimmett | year=1992 |edition=second | publisher=Oxford University Press|isbn=0198572220 |chapter=6}}</ref>
 
Markov chains are often described by a sequence of [[directed graph]]s, where the edges of graph ''n'' are labeled by the probabilities of going from one state at time ''n'' to the other states at time ''n''&nbsp;+&nbsp;1, <math>\Pr(X_{n+1}=x\mid X_n=x_n).</math> The same information is represented by the transition matrix from time ''n'' to time ''n''&nbsp;+&nbsp;1. However, Markov chains are frequently assumed to be time-homogeneous (see variations below), in which case the graph and matrix are independent of ''n'' and are thus not presented as sequences.
 
These descriptions highlight the structure of the Markov chain that is independent of the initial distribution <math>\Pr(X_1=x_1).</math> When time-homogeneous, the chain can be interpreted as a [[Finite-state machine|state machine]] assigning a probability of hopping from each vertex or state to an adjacent one. The probability <math>\Pr(X_n=x\mid X_1=x_1)</math> of the machine's state can be analyzed as the statistical behavior of the machine with an element <math>x_1</math> of the state space as input, or as the behavior of the machine with the initial distribution <math>\Pr(X_1=y)=[x_1=y]</math> of states as input, where <math>[P]</math> is the [[Iverson bracket]].{{cn|date=August 2020}}
 
The fact that some sequences of states might have zero probability of occurring corresponds to a graph with multiple [[connected component (graph theory)|connected components]], where we omit edges that would carry a zero transition probability. For example, if ''a'' has a nonzero probability of going to ''b'', but ''a'' and ''x'' lie in different connected components of the graph, then <math>\Pr(X_{n+1}=b\mid X_n=a)</math> is defined, while <math>\Pr(X_{n+1}=b\mid X_1=x, \ldots, X_n=a)</math> is not.
 
===Variations===
*{{Anchor|homogeneous}}Time-homogeneous Markov chains (or stationary Markov chains) are processes where
::<math>\Pr(X_{n+1}=x\mid X_n=y) = \Pr(X_n=x\mid X_{n-1}=y)</math>
:for all ''n''. The probability of the transition is independent of ''n''.<ref name="PRS"/>
*A Markov chain with memory (or a Markov chain of order ''m'')
:where ''m'' is finite, is a process satisfying
::<math>
\begin{align}
{} &\Pr(X_n=x_n\mid X_{n-1}=x_{n-1}, X_{n-2}=x_{n-2}, \dots , X_1=x_1) \\
= {} &\Pr(X_n=x_n\mid X_{n-1}=x_{n-1}, X_{n-2}=x_{n-2}, \dots, X_{n-m}=x_{n-m})
\text{ for }n > m
\end{align}
</math>
 
:In other words, the future state depends on the past ''m'' states. It is possible to construct a chain <math>(Y_n)</math> from <math>(X_n)</math> which has the 'classical' Markov property by taking as state space the ordered ''m''-tuples of ''X'' values, i.e., <math>Y_n= \left( X_n,X_{n-1},\ldots,X_{n-m+1} \right)</math> (illustrated in the sketch below).{{cn|date=August 2020}}
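This reduction to an ordinary (first-order) chain can be illustrated with a short Python sketch. The window length <code>m</code> and the example sequence are illustrative assumptions; the function simply re-indexes an observed sequence of ''X'' values into the corresponding ''m''-tuples <math>Y_n</math>.

<syntaxhighlight lang="python">
def to_first_order(xs, m):
    """Map X_1, X_2, ... to Y_n = (X_n, X_{n-1}, ..., X_{n-m+1}).

    The tuples only exist once m values have been observed, so the
    first m-1 indices are skipped.
    """
    return [tuple(reversed(xs[i - m + 1:i + 1])) for i in range(m - 1, len(xs))]

# Example: a binary sequence viewed as a chain of order m = 2.
xs = [0, 1, 1, 0, 1]
print(to_first_order(xs, 2))
# [(1, 0), (1, 1), (0, 1), (1, 0)]
</syntaxhighlight>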
 
==''n''-step transitions==
The probability of going from state ''i'' to state ''j'' in ''n'' time steps is
 
:<math>p_{ij}^{(n)} = \Pr(X_n=j\mid X_0=i) </math>
 
and the single-step transition is
 
:<math>p_{ij} = \Pr(X_1=j\mid X_0=i). </math>
 
For a time-homogeneous Markov chain:
 
:<math>p_{ij}^{(n)} = \Pr(X_{k+n}=j \mid X_{k}=i) </math>
 
and
 
:<math>p_{ij} = \Pr(X_{k+1}=j \mid X_k=i). </math>
 
The ''n''-step transition probabilities satisfy the [[Chapman–Kolmogorov equation]], that for any ''k'' such that 0&nbsp;<&nbsp;''k''&nbsp;<&nbsp;''n'',
 
:<math>p_{ij}^{(n)} = \sum_{r \in S} p_{ir}^{(k)} p_{rj}^{(n-k)}</math>
 
where ''S'' is the state space of the Markov chain.<ref name="PRS"/>
 
The [[marginal distribution]] Pr(''X''<sub>''n''</sub>&nbsp;=&nbsp;''x'') is the distribution over states at time ''n''. The initial distribution is Pr(''X''<sub>0</sub>&nbsp;=&nbsp;''x''). The evolution of the process through one time step is described by
 
:<math> \Pr(X_n=j) = \sum_{r \in S} p_{rj} \Pr(X_{n-1}=r) = \sum_{r \in S} p_{rj}^{(n)} \Pr(X_0=r).</math>
 
(The superscript (''n'') is an [[indexed family|index]], and not an [[exponent]]).
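For a finite, time-homogeneous chain these relations amount to matrix multiplication: the ''n''-step transition matrix is the ''n''-th power of the single-step matrix. A minimal NumPy sketch follows; the 2×2 matrix is the two-state example from the lead, used here as an illustrative assumption.

<syntaxhighlight lang="python">
import numpy as np

# Single-step transition matrix of the two-state example (rows: from A, from E).
P = np.array([[0.6, 0.4],
              [0.7, 0.3]])

n, k = 5, 2
P_n = np.linalg.matrix_power(P, n)          # n-step transition probabilities
chapman = np.linalg.matrix_power(P, k) @ np.linalg.matrix_power(P, n - k)

print(np.allclose(P_n, chapman))            # True: Chapman–Kolmogorov equation

# Evolving a distribution: a row vector times P advances one time step.
mu0 = np.array([1.0, 0.0])                  # start in state A with certainty
print(mu0 @ P_n)                            # distribution over states at time n
</syntaxhighlight>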
 
==Communicating classes and properties==
A state ''j'' is said to be accessible from a state ''i'' (written ''i''&nbsp;→&nbsp;''j'') if a system started in state ''i'' has a non-zero probability of transitioning into state ''j'' at some point. Formally, state ''j'' is accessible from state ''i'' if there exists an integer ''n<sub>ij</sub>''&nbsp;≥&nbsp;0 such that
 
:<math> \Pr(X_{n_{ij}}=j \mid X_0=i) = p_{ij}^{(n_{ij})} > 0.</math>
 
This integer is allowed to be different for each pair of states, hence the subscripts in ''n<sub>ij</sub>''. Allowing ''n'' to be zero means that every state is accessible from itself by definition. The accessibility relation is reflexive and transitive, but not necessarily symmetric.

A state ''i'' is said to communicate with state ''j'' (written ''i''&nbsp;↔&nbsp;''j'') if both ''i''&nbsp;→&nbsp;''j'' and ''j''&nbsp;→&nbsp;''i''. A communicating class is a maximal set of states ''C'' such that every pair of states in ''C'' communicates with each other. Communication is an [[equivalence relation]], and communicating classes are the [[equivalence class]]es of this relation.<ref name="PRS"/>
 
A communicating class is closed if the probability of leaving the class is zero, namely if ''i'' is in ''C'' but ''j'' is not, then ''j'' is not accessible from&nbsp;''i''.<ref name="PRS"/> The set of communicating classes forms a [[directed acyclic graph|directed, acyclic graph]] by inheriting the arrows from the original state space. A communicating class is closed if and only if it has no outgoing arrows in this graph.
 
A state ''i'' is said to be essential or final if for all ''j'' such that ''i''&nbsp;→&nbsp;''j'' it is also true that ''j''&nbsp;→&nbsp;''i''. A state ''i'' is inessential if it is not essential.<ref>{{cite book|last=Asher Levin|first=David|title=Markov chains and mixing times|page=[https://archive.org/details/markovchainsmixi00levi_364/page/n31 16]|title-link= Markov Chains and Mixing Times |isbn=978-0-8218-4739-8|year=2009}}</ref> A state is final if and only if its communicating class is closed.
 
A Markov chain is said to be irreducible if its state space is a single communicating class; in other words, if it is possible to get to any state from any state.<ref name="PRS"/><ref name="Lawler">{{Cite book |last=Lawler |first=Gregory F. |author-link=Greg Lawler |title=Introduction to Stochastic Processes |publisher=CRC Press |year=2006 |isbn=1-58488-651-X |edition=2nd |language=en}}</ref>{{rp|20}}
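For a finite chain these notions can be computed mechanically: state ''j'' is accessible from ''i'' exactly when some power of the transition matrix has a positive (''i'',&nbsp;''j'') entry, and the communicating classes are the strongly connected components of the transition graph. The following Python sketch uses a simple reachability computation (repeated thresholded matrix products, one straightforward choice among several); the example matrix is an illustrative assumption.

<syntaxhighlight lang="python">
import numpy as np

def communicating_classes(P):
    """Group the states of a finite chain into communicating classes.

    State j is accessible from i if some n-step transition probability
    p_ij^(n) is positive; i and j communicate if each is accessible
    from the other.
    """
    P = np.asarray(P, dtype=float)
    n = len(P)
    # adjacency with self-loops (n = 0 is allowed in the definition)
    adj = ((P > 0) | np.eye(n, dtype=bool)).astype(int)
    reach = adj.copy()
    for _ in range(n):                        # enough iterations for n states
        reach = ((reach @ adj) > 0).astype(int)
    communicate = (reach > 0) & (reach.T > 0)
    classes, seen = [], set()
    for i in range(n):
        if i not in seen:
            cls = {j for j in range(n) if communicate[i, j]}
            classes.append(sorted(cls))
            seen.update(cls)
    return classes

P = [[0.5, 0.5, 0.0],
     [0.4, 0.6, 0.0],
     [0.2, 0.3, 0.5]]
print(communicating_classes(P))              # [[0, 1], [2]]
print(len(communicating_classes(P)) == 1)    # irreducible? False
</syntaxhighlight>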
 
===Periodicity===
A state <math>i</math> has period <math>k</math> if any return to state <math>i</math> must occur in multiples of <math>k</math> time steps. Formally, the period of a state is defined as
:<math> k = \gcd\{ n > 0: \Pr(X_n = i \mid X_0 = i) > 0\}</math>
 
(where <math>\gcd</math> is the [[greatest common divisor]]) provided that this set is not empty. Otherwise the period is not defined.<ref name="PRS"/> Note that even though a state has period <math>k</math>, it may not be possible to reach the state in <math>k</math> steps. For example, suppose it is possible to return to the state in <math>\{6,~8,~10,~12,\dots\}</math> time steps; <math>k</math> would be <math>2</math>, even though <math>2</math> does not appear in this list.
 
If <math>k = 1</math>, then the state is said to be aperiodic. Otherwise (<math> k > 1</math>), the state is said to be periodic with period&nbsp;<math>k</math>. Periodicity is a class property—that is, if a state has period <math>k</math> then every state in its communicating class has period <math>k</math>.<ref name="PRS"/> A Markov chain is aperiodic if every state is aperiodic; since periodicity is a class property, an irreducible Markov chain needs only one aperiodic state for all of its states to be aperiodic.
 
Every state of a Markov chain whose transition graph is [[bipartite graph|bipartite]] has an even period.
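The period of a state in a finite chain can be obtained as the [[greatest common divisor]] of the observed return lengths. The following Python sketch checks which powers of the transition matrix have a positive diagonal entry; the cutoff <code>max_n</code> is an assumption that has to be large enough for the chain at hand, so this is a heuristic illustration rather than a general algorithm.

<syntaxhighlight lang="python">
import numpy as np
from functools import reduce
from math import gcd

def period(P, i, max_n=None):
    """gcd of all n <= max_n with p_ii^(n) > 0 (0 if no return is seen)."""
    P = np.asarray(P, dtype=float)
    if max_n is None:
        max_n = 2 * len(P)            # heuristic cutoff, assumed sufficient
    returns = []
    Pn = np.eye(len(P))
    for n in range(1, max_n + 1):
        Pn = Pn @ P
        if Pn[i, i] > 0:
            returns.append(n)
    return reduce(gcd, returns, 0)

# Two-state example that only returns in an even number of steps.
P = [[0.0, 1.0],
     [1.0, 0.0]]
print(period(P, 0))   # 2 -> state 0 is periodic with period 2
</syntaxhighlight>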
 
==={{anchor|Transience}}{{anchor|Recurrence}}Transience and recurrence===
A state ''i'' is said to be transient if, given that we start in state ''i'', there is a non-zero probability that we will never return to ''i''. Formally, let the [[random variable]] ''T<sub>i</sub>'' be the first return time to state ''i'' (the "[[hitting time]]"):
 
:<math> T_i = \inf \{ n\ge1: X_n = i\}.</math>
The number

:<math>f_{ii}^{(n)} = \Pr(T_i = n\mid X_0=i)</math>

is the probability that we return to state ''i'' for the first time after ''n'' steps. Therefore, state ''i'' is transient if
:<math> \Pr(T_i < \infty \mid X_0 = i) = \sum_{n=1}^\infty f_{ii}^{(n)} < 1. </math>
 
State ''i'' is recurrent (or persistent) if it is not transient. Recurrent states are guaranteed (with probability 1) to have a finite hitting time. Recurrence and transience are class properties, that is, they either hold or do not hold equally for all members of a communicating class.<ref name="PRS"/>
 
A state ''i'' is recurrent [[if and only if]] the expected number of visits to ''i'' is infinite:<ref name="PRS"/>

:<math>\sum_{n=0}^\infty p_{ii}^{(n)} = \infty.</math>

====Positive recurrence====
Even if the hitting time is finite with probability 1, it need not have a finite [[expected value|expectation]]<!-- Should provide example later for this and reference it here -->. The mean recurrence time at state ''i'' is the expected return time ''M<sub>i</sub>'':

:<math> M_i = E[T_i]=\sum_{n=1}^\infty n\cdot f_{ii}^{(n)}.</math>

State ''i'' is positive recurrent (or non-null persistent) if ''M<sub>i</sub>'' is finite; otherwise, state ''i'' is null recurrent (or null persistent). Positive and null recurrence are class properties.<ref name="PRS"/>
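For a concrete chain the mean recurrence time can be estimated by simulation, averaging the observed first return times. A minimal Python sketch follows, reusing the two-state machine from the lead; the number of runs and the random seed are illustrative assumptions, and the result is only a Monte Carlo estimate of <math>M_i</math>.

<syntaxhighlight lang="python">
import random

def estimate_mean_recurrence(transitions, i, runs=10_000, seed=0):
    """Monte Carlo estimate of M_i = E[T_i], the expected first return time."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        state, steps = i, 0
        while True:
            states, probs = zip(*transitions[state])
            state = rng.choices(states, weights=probs)[0]
            steps += 1
            if state == i:
                break
        total += steps
    return total / runs

# Two-state example from the lead; both states are positive recurrent.
transitions = {
    "A": [("A", 0.6), ("E", 0.4)],
    "E": [("A", 0.7), ("E", 0.3)],
}
print(estimate_mean_recurrence(transitions, "A"))   # close to 11/7 ≈ 1.571
</syntaxhighlight>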
 
====Absorbing states====
A state ''i'' is called absorbing if it is impossible to leave this state. Therefore, the state ''i'' is absorbing if and only if
:<math> p_{ii} = 1\text{ and }p_{ij} = 0\text{ for }i \not= j.</math>
 
If every state can reach an absorbing state, then the Markov chain is an [[absorbing Markov chain]].<ref name=":0">{{cite book|title=Markov Chains: From Theory to Implementation and Experimentation|last=Gagniuc|first=Paul A.|publisher=John Wiley & Sons|year=2017|isbn=978-1-119-38755-8|___location=USA, NJ|pages=1–235}}</ref><ref name=Grin>{{cite book
| first = Charles M.
| last = Grinstead
| first2 = J. Laurie
| last2 = Snell
| author-link2 = J. Laurie Snell
| title = Introduction to Probability
|date=July 1997
| publisher = American Mathematical Society
| isbn = 978-0-8218-0749-1
| chapter = Ch. 11: Markov Chains
| chapter-url = https://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf}}</ref><ref name=Kem>
{{cite book
| first = John G.
| last = Kemeny
| author-link = John G. Kemeny
| first2 = J. Laurie
| last2 = Snell
| author-link2 = J. Laurie Snell
| editor-first = F. W.
| editor-last = Gehring
| editor2-first = P. R.
| editor2-last = Halmos
| title = Finite Markov Chains
| url = https://archive.org/details/finitemarkovchai00keme_792
| url-access = limited
| edition = Second
| orig-year = 1960
|date=July 1976
| publisher = Springer-Verlag
| ___location = New York Berlin Heidelberg Tokyo
| isbn = 978-0-387-90192-3
| pages = [https://archive.org/details/finitemarkovchai00keme_792/page/n235 224]
| chapter = Ch. 3: Absorbing Markov Chains
}}</ref>
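For a finite absorbing chain, quantities such as the expected number of steps until absorption can be computed with the fundamental matrix <math>N=(I-Q)^{-1}</math>, where <math>Q</math> is the transition matrix restricted to the transient states, as described in the references above. A minimal NumPy sketch; the 4-state example matrix is an illustrative assumption.

<syntaxhighlight lang="python">
import numpy as np

# A 4-state chain: states 0 and 1 are transient, states 2 and 3 are absorbing.
P = np.array([[0.2, 0.3, 0.5, 0.0],
              [0.4, 0.1, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

transient, absorbing = [0, 1], [2, 3]
Q = P[np.ix_(transient, transient)]     # transitions among transient states
R = P[np.ix_(transient, absorbing)]     # transitions into absorbing states

N = np.linalg.inv(np.eye(len(Q)) - Q)   # fundamental matrix
expected_steps = N @ np.ones(len(Q))    # expected time until absorption
absorption_probs = N @ R                # probability of ending in each absorbing state

print(expected_steps)      # one entry per starting transient state
print(absorption_probs)    # rows sum to 1
</syntaxhighlight>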
 
===Reversible Markov chain{{Anchor|detailed balance}}===
A Markov chain is said to be reversible if there is a probability distribution '''{{pi}}''' over its states such that
:<math>\pi_i \Pr(X_{n+1} = j \mid X_{n} = i) = \pi_j \Pr(X_{n+1} = i \mid X_{n} = j)</math>
 
for all times ''n'' and all states ''i'' and ''j''. This condition is known as the '''[[detailed balance]]''' condition (some books call it the local [[balance equation]]).
 
Considering a fixed arbitrary time ''n'' and using the shorthand
:<math>p_{ij} = \Pr(X_{n+1} = j \mid X_n = i)\,,</math>
the detailed balance equation can be written more compactly as
 
:<math>\pi_i p_{ij} = \pi_j p_{ji}\,.</math><ref name="PRS"/>
 
The single time-step from ''n'' to ''n''&nbsp;+&nbsp;1 can be thought of as each person ''i'' having {{pi}}<sub>''i''</sub> dollars initially and paying each person ''j'' a fraction ''p<sub>ij</sub>'' of it. The detailed balance condition states that upon each payment, the other person pays exactly the same amount of money back.<ref name="Durrett2012">{{cite book|url=https://books.google.com/books?id=i_Ovy6OvI54C&pg=PA37|title=Essentials of Stochastic Processes|author=Richard Durrett|date=19 May 2012|publisher=Springer Science & Business Media|isbn=978-1-4614-3615-7|page=37|archive-url=https://web.archive.org/web/20170206031627/https://books.google.com/books?id=i_Ovy6OvI54C&pg=PA37|archive-date=6 February 2017|url-status=live}}</ref> Clearly the total amount of money '''{{pi}}''' each person has remains the same after the time-step, since every dollar spent is balanced by a corresponding dollar received. This can be shown more formally by the equality
 
:<math>\sum_i \pi_i p_{ij} = \sum_i \pi_j p_{ji} = \pi_j \sum_i p_{ji} = \pi_j\,,</math>
As ''n'' was arbitrary, this reasoning holds for any ''n'', and therefore for reversible Markov chains '''{{pi}}''' is always a steady-state distribution of Pr(''X''<sub>''n''+1</sub>&nbsp;=&nbsp;''j''&nbsp;|&nbsp;''X''<sub>''n''</sub>&nbsp;=&nbsp;''i'') for every&nbsp;''n''.
 
If the Markov chain begins in the steady-state distribution, that is, if <math>\Pr(X_0=i)=\pi_i</math>, then <math>\Pr(X_n=i)=\pi_i</math> for all <math>n</math> and the detailed balance equation can be written as
 
:<math>\Pr(X_{n} = i, X_{n+1} = j) = \Pr(X_{n+1} = i, X_{n} = j)\,.</math>
[[Kolmogorov's criterion]] gives a necessary and sufficient condition for a Markov chain to be reversible directly from the transition matrix probabilities. The criterion requires that the products of probabilities around every closed loop are the same in both directions around the loop.
 
Reversible Markov chains are common in [[Markov chain Monte Carlo]] (MCMC) approaches because the detailed balance equation for a desired distribution '''{{pi}}''' necessarily implies that the Markov chain has been constructed so that '''{{pi}}''' is a steady-state distribution. Even with time-inhomogeneous Markov chains, where multiple transition matrices are used, if each such transition matrix exhibits detailed balance with the desired '''{{pi}}''' distribution, this necessarily implies that '''{{pi}}''' is a steady-state distribution of the Markov chain.
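Detailed balance is straightforward to verify numerically for a finite chain. A minimal NumPy sketch follows (the symmetric random walk on a triangle and the uniform distribution are illustrative assumptions); it also checks the consequence derived above, namely that a distribution satisfying detailed balance is stationary.

<syntaxhighlight lang="python">
import numpy as np

def is_reversible(P, pi, tol=1e-12):
    """Check the detailed balance condition pi_i p_ij = pi_j p_ji for all i, j."""
    P, pi = np.asarray(P, dtype=float), np.asarray(pi, dtype=float)
    flow = pi[:, None] * P              # flow[i, j] = pi_i * p_ij
    return np.allclose(flow, flow.T, atol=tol)

# Symmetric random walk on a triangle: reversible with the uniform distribution.
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
pi = np.array([1/3, 1/3, 1/3])

print(is_reversible(P, pi))     # True
print(np.allclose(pi @ P, pi))  # detailed balance implies pi is stationary
</syntaxhighlight>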
 
==== Closest reversible Markov chain ====
For any time-homogeneous Markov chain given by a transition matrix <math>P \in \mathbb{R}^{n \times n}</math>, any norm <math>||\cdot ||</math> on <math> \mathbb{R}^{n \times n}</math> which is induced by a [[Inner product space|scalar product]], and any probability vector <math>\pi</math>, there exists a unique transition matrix <math>P^*</math> which is reversible according to <math>\pi</math> and which is closest to <math>P</math> according to the norm <math>||\cdot ||.</math> The matrix <math>P^*</math> can be computed by solving a quadratic-convex [[optimization problem]].<ref>A. Nielsen, M. Weber. Online publication in Numerical Linear Algebra with Applications, DOI:10.1002/nla.1967, 2015.</ref>

For example, consider the following Markov chain:
[[File:Markov chain extremly simple1.png|frameless|center|Simple Markov chain.]]
This Markov chain is not reversible. According to the [[Matrix norm#Frobenius norm|Frobenius norm]], the closest reversible Markov chain according to <math>\pi = \left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)</math> can be computed as
[[File:Mchain simple corrected C1.png|frameless|center]]
If we choose the [[probability vector]] randomly as <math>\pi=\left( \frac{1}{4}, \frac{1}{4}, \frac{1}{2} \right)</math>, then the closest reversible Markov chain according to the Frobenius norm is approximately given by
[[File:Mvchain approx C2.png|400px|frameless|center]]
 
==Stationary distributions==
A distribution <math>\pi</math> is a stationary distribution of the Markov chain with stochastic matrix <math>P</math> if and only if <math>\pi P = \pi</math>. This can be written as:<ref name="PRS"/>
:<math>\forall j\in \mathbb{S}: \sum_{i\in \mathbb{S}} \pi_i p_{ij} = \pi_j</math>.

This condition implies that <math>\pi P^n=\pi</math> and hence if the Markov chain <math>(X_n, n\in \mathbb{N})</math> has initial distribution <math>X_0 = \pi</math> then <math>X_n = \pi</math> (in distribution) for any <math>n\in\mathbb{N}</math>.<ref name="PRS"/>

If a Markov chain is irreducible then it has a stationary distribution if and only if it is positive recurrent,<ref>{{citation
|last= Serfozo
|first= Richard
|doi= 10.1007/978-3-540-89332-5
|isbn= 978-3-540-89331-8
|mr= 2484222
|page= 35
|title= Basics of Applied Stochastic Processes
|url= https://books.google.com/books?id=JBBRiuxTN0QC&pg=PA35
|year= 2009
|journal= Probability and Its Applications
|url-status= live
|archive-url= https://web.archive.org/web/20150319050311/https://books.google.com/books?id=JBBRiuxTN0QC&pg=PA35
|archive-date= 2015-03-19
|url-access= subscription
}}</ref> in which case the unique such distribution is given by <math>\pi_i=\frac{1}{M_i}</math> where <math>M_i=\mathbb{E}(T_i)</math> is the mean recurrence time of ''i''.<ref name="PRS"/>
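For a finite irreducible chain the stationary distribution can be found by solving the linear system <math>\pi P = \pi</math> together with the normalization <math>\textstyle\sum_i \pi_i = 1</math>. A minimal NumPy sketch (solving by least squares is one convenient choice, and the 2×2 example is the chain from the lead); the last line illustrates the relation <math>\pi_i = 1/M_i</math> stated above.

<syntaxhighlight lang="python">
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi together with sum(pi) = 1 for a finite chain.

    For a finite irreducible chain the solution is unique.
    """
    P = np.asarray(P, dtype=float)
    n = len(P)
    # (P^T - I) pi = 0, stacked with the normalization row sum(pi) = 1
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

P = [[0.6, 0.4],
     [0.7, 0.3]]
pi = stationary_distribution(P)
print(pi)            # approximately [0.636, 0.364] = (7/11, 4/11)
print(1 / pi)        # mean recurrence times M_i = 1/pi_i, about [1.571, 2.75]
</syntaxhighlight>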
 
If a chain has more than one closed communicating class, its stationary distributions will not be unique (consider any [[#Communicating classes and properties|closed communicating class]] <math>C_i</math> in the chain; each one will have its own unique stationary distribution <math>\pi_i</math>. Extending these distributions to the overall chain, setting all values to zero outside the communication class, yields that the set of invariant measures of the original chain is the set of all convex combinations of the <math>\pi_i</math>'s). However, if a state ''j'' is aperiodic, then
<math>\lim\nolimits_{n \rightarrow \infty} p_{jj}^{(n)} = \tfrac{C}{M_j}</math>
and for any other state ''i'', let ''f<sub>ij</sub>'' be the probability that the chain ever visits state ''j'' if it starts at&nbsp;''i'',
<math>\lim\nolimits_{n \rightarrow\infty} p_{ij}^{(n)} = C \tfrac{f_{ij}}{M_j}.</math>

If a state ''i'' is periodic with period ''k''&nbsp;>&nbsp;1 then the limit
<math>\lim\nolimits_{n \rightarrow \infty} p_{ii}^{(n)}</math>
does not exist, although the limit
<math>\lim\nolimits_{n \rightarrow \infty} p_{ii}^{(kn+r)}</math>
does exist for every integer&nbsp;''r''.
 
===Steady-state analysis and the time-inhomogeneous Markov chain===
A Markov chain need not necessarily be time-homogeneous to have an equilibrium distribution. If there is a probability distribution over states <math>\boldsymbol{\pi}</math> such that

:<math>\pi_j = \sum_{i \in S} \pi_i \, \Pr(X_{n+1} = j \mid X_n = i)</math>

for every state ''j'' and every time ''n'' then <math>\boldsymbol{\pi}</math> is an equilibrium distribution of the Markov chain. This can occur in Markov chain Monte Carlo (MCMC) methods in situations where a number of different transition matrices are used, because each is efficient for a particular kind of mixing, but each matrix respects a shared equilibrium distribution.

==Hitting times==
{{Main|Phase-type distribution}}The hitting time is the time, starting from a given set of states, until the chain arrives in a given state or set of states. The distribution of such a time period has a phase-type distribution. The simplest such distribution is that of a single exponentially distributed transition.{{cn|date=August 2020}}
 
===Expected hitting times===
For a subset of states ''A''&nbsp;⊆&nbsp;''S'', the vector ''k''<sup>''A''</sup> of hitting times (where element <math> k_i^A </math> represents the [[expected value]] of the time, starting in state ''i'', until the chain enters one of the states in the set ''A'') is the minimal non-negative solution to<ref name="norris2">{{cite book|title=Markov Chains|year=1997|isbn=9780511810633|pages=108–127|chapter=Continuous-time Markov chains II|doi=10.1017/CBO9780511810633.005|last1=Norris|first1=J. R.|author-link1=James R. Norris}}</ref>

:<math>\begin{align}
k_i^A = 0 & \text{ for } i \in A\\
-\sum_{j \in S} q_{ij} k_j^A = 1&\text{ for } i \notin A.
\end{align}</math>
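For a finite discrete-time chain the analogous system is <math>k_i^A = 0</math> for <math>i \in A</math> and <math>k_i^A = 1 + \sum_{j} p_{ij} k_j^A</math> for <math>i \notin A</math>, which is a finite linear system. A minimal NumPy sketch, assuming the target set ''A'' is reachable from every state; the example random walk is an illustrative assumption.

<syntaxhighlight lang="python">
import numpy as np

def expected_hitting_times(P, A):
    """Expected number of steps to reach the set A from each state.

    Solves k_i = 0 for i in A and k_i = 1 + sum_j p_ij k_j otherwise;
    assumes A is reachable from every state (otherwise some k_i are infinite).
    """
    P = np.asarray(P, dtype=float)
    n = len(P)
    M = np.eye(n) - P
    b = np.ones(n)
    for i in A:                  # boundary condition k_i = 0 on A
        M[i] = 0.0
        M[i, i] = 1.0
        b[i] = 0.0
    return np.linalg.solve(M, b)

# Simple random walk on {0, 1, 2, 3} with reflecting ends, hitting state 3.
P = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
print(expected_hitting_times(P, A=[3]))   # [9. 8. 5. 0.]
</syntaxhighlight>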
 
==Ergodic theorem==
An instance of [[ergodic theory]], the ergodic theorem states that for an irreducible aperiodic Markov chain, for any two states ''i'' and ''j'',<ref name="PRS"/>
:<math>p_{i,j}^{(n)}\rightarrow \frac{1}{M_j}</math> as <math>n\rightarrow \infty</math>.
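Numerically, this convergence shows up as the rows of the matrix powers <math>P^n</math> becoming identical. A minimal NumPy sketch, again using the two-state chain from the lead as an illustrative example:

<syntaxhighlight lang="python">
import numpy as np

# Irreducible, aperiodic two-state chain from the lead example.
P = np.array([[0.6, 0.4],
              [0.7, 0.3]])

Pn = np.linalg.matrix_power(P, 50)
print(Pn)          # every row is approximately (7/11, 4/11)

# The limits agree with 1/M_j, the reciprocals of the mean recurrence times.
pi = Pn[0]
print(1 / pi)      # M_j is approximately (11/7, 11/4)
</syntaxhighlight>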
 
==Notes==
{{reflist}}

==References==
{{refbegin}}
* A. A. Markov (1971). "Extension of the limit theorems of probability theory to a sum of variables connected in a chain". reprinted in Appendix B of: R. Howard. ''Dynamic Probabilistic Systems, volume 1: Markov Chains''. John Wiley and Sons.
* {{cite journal |last = Markov |first = A. A. |year = 2006 |title = An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains |translator-first = David |translator-last = Link |journal = Science in Context |volume = 19 |issue = 4 |pages = 591–600 |doi = 10.1017/s0269889706001074 |s2cid = 144854176 }}
* Leo Breiman (1992) [1968] ''Probability''. Original edition published by Addison-Wesley; reprinted by [[Society for Industrial and Applied Mathematics]] {{ISBN|0-89871-296-3}}. (See Chapter 7)
* [[J. L. Doob]] (1953) ''Stochastic Processes''. New York: John Wiley and Sons {{ISBN|0-471-52369-0}}.
* S. P. Meyn and R. L. Tweedie (1993) ''Markov Chains and Stochastic Stability''. London: Springer-Verlag {{ISBN|0-387-19832-6}}. online: [https://web.archive.org/web/20100619010320/https://netfiles.uiuc.edu/meyn/www/spm_files/book.html MCSS] . Second edition to appear, Cambridge University Press, 2009.
* {{cite book |title=Finite Mathematical Structures |url=https://archive.org/details/finitemathematic0000keme_h5g0 |url-access=registration |last=Kemeny |first=John G. |publisher=Prentice-Hall, Inc. |year=1959 |isbn= |edition=1st |___location=Englewood Cliffs, NJ |pages = |id = Library of Congress Card Catalog Number 59-12841 |author2=Hazleton Mirkil |author3=J. Laurie Snell |author4=Gerald L. Thompson }} Classical text. cf Chapter 6 ''Finite Markov Chains'' pp.&nbsp;384ff.
* [[John G. Kemeny]] & [[J. Laurie Snell]] (1960) ''Finite Markov Chains'', D. van Nostrand Company {{ISBN|0-442-04328-7}}
* E. Nummelin. "General irreducible Markov chains and non-negative operators". Cambridge University Press, 1984, 2004. {{ISBN|0-521-60494-X}}
* [[Eugene Seneta|Seneta, E.]] ''Non-negative matrices and Markov chains''. 2nd rev. ed., 1981, XVI, 288 p., Softcover Springer Series in Statistics. (Originally published by Allen & Unwin Ltd., London, 1973) {{ISBN|978-0-387-29765-1}}
{{refend}}
[[zh-yue:離散時間馬可夫鏈]]
[[Category:Markov processes]]