Content deleted Content added
No edit summary |
No edit summary |
||
Line 3:
'''Variable order Markov (VOM)''' Models are an important class of models that extend the well known [[Markov Chain]] Models. In contrast to the [[Markov Chain]] Models, where each random variable in a sequence with a [[Markov property]] depends on a fixed number of [[random variable]]s, in VOM models this number of conditioning [[random variable]]s may vary based on the specific observed realization.
This realization sequence is often called the ''context'' and thus the VOM models are also called ''Context Trees'' [1]. The flexibility in the number of conditioning [[random variable]]s turns out to be of real advantage for many applications, such as statistical analysis, classification and prediction [2,3,4].
==Definition==▼
Let <math>A</math> be a state space (finite [[alphabet]]) of size <nowiki>|A|</nowiki>. ▼
Consider a sequence with the [[Markov chain|Markov property]] <math>x_1^{n}=x_1x_2...x_n</math> of <math>n</math> realizations of [[random variable]]s, where <math> x_i\in A</math> is the state (symbol) at position <math>i</math> 1≤<math>i</math>≤<math>n</math>, and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_ix_{i+1}</math>.▼
Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models [2,3,4] learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.▼
Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x_i|s)</math> for a symbol <math>x_i \in A</math> given a context <math>s\in A^*</math>, where the * sign represents a sequence of states of any length, including the empty context. ▼
VOM models attempt to estimate [[conditional distribution]]s of the form <math>P(x_i|s)</math> where the context length |<math>s</math>|≤<math>D</math> varies depending on the available statistics. ▼
In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution]]s by assuming a fixed contexts' length |<math>s</math>|=<math>D</math> and, hence, can be considered as special cases of the VOM models. ▼
Effectively, for a given training sequence, the VOM models are found to obtain better model parameterization than the fixed-order [[Markov chain|Markov Models]] that leads to a better [[variance]]-bias tradeoff of the learned models [2,3,4].▼
==Example==
Line 39 ⟶ 25:
The Variable Order Markov model assumes that in realistic settings, there are certain realizations of states (represented by contexts) in which some past states are independent from the future states, accordingly, ''a great reduction in the number of model parameters can be achieved''.
▲==Definition==
▲Let <math>A</math> be a state space (finite [[alphabet]]) of size <nowiki>|A|</nowiki>.
▲Consider a sequence with the [[Markov chain|Markov property]] <math>x_1^{n}=x_1x_2...x_n</math> of <math>n</math> realizations of [[random variable]]s, where <math> x_i\in A</math> is the state (symbol) at position <math>i</math> 1≤<math>i</math>≤<math>n</math>, and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_ix_{i+1}</math>.
▲Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models [2,3,4] learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.
▲Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x_i|s)</math> for a symbol <math>x_i \in A</math> given a context <math>s\in A^*</math>, where the * sign represents a sequence of states of any length, including the empty context.
▲VOM models attempt to estimate [[conditional distribution]]s of the form <math>P(x_i|s)</math> where the context length |<math>s</math>|≤<math>D</math> varies depending on the available statistics.
▲In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution]]s by assuming a fixed contexts' length |<math>s</math>|=<math>D</math> and, hence, can be considered as special cases of the VOM models.
▲Effectively, for a given training sequence, the VOM models are found to obtain better model parameterization than the fixed-order [[Markov chain|Markov Models]] that leads to a better [[variance]]-bias tradeoff of the learned models [2,3,4].
==Application Areas==
|