==Definition==
Let <math>A</math> be a state space (finite [[Alphabet (formal languages)|alphabet]]) of size <math>|A|</math>.

Consider a sequence with the [[Markov property]] <math>x_1^{n}=x_1x_2\dots x_n</math> of <math>n</math> realizations of [[random variable]]s, where <math>x_i \in A</math> is the state (symbol) at position <math>i</math> (<math>1\le i\le n</math>), and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_i x_{i+1}</math>.

Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models<ref name="Shmilovici"/><ref name="Begleiter"/><ref name="Ben-Gal"/> learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.
Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x_i|s)</math> for a symbol <math>x_i \in A</math> given a context <math>s\in A^*</math>, where the * sign represents a sequence of states of any length, including the empty context.
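For illustration, the following is a minimal sketch in Python of how such conditional distributions can be estimated by counting context–symbol occurrences up to a maximum context length. The function names (<code>learn_vom_counts</code>, <code>conditional_prob</code>) are hypothetical and not taken from the cited references, and practical VOM learners additionally apply context pruning and smoothing:

<syntaxhighlight lang="python">
from collections import defaultdict

def learn_vom_counts(sequence, max_context_len):
    """Count (context, symbol) occurrences for every context length
    0..max_context_len observed in the training sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(sequence):
        # Each position contributes one count per admissible context
        # length, including the empty context (d = 0).
        for d in range(min(i, max_context_len) + 1):
            context = tuple(sequence[i - d:i])
            counts[context][symbol] += 1
    return counts

def conditional_prob(counts, context, symbol):
    """Maximum-likelihood estimate of P(symbol | context); None if the
    context was never observed in training."""
    ctx_counts = counts.get(tuple(context))
    if not ctx_counts:
        return None
    return ctx_counts[symbol] / sum(ctx_counts.values())
</syntaxhighlight>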
VOM models attempt to estimate [[conditional distribution]]s of the form <math>P(x_i|s)</math> where the context length <math>|s|\le D</math> varies depending on the available statistics.
In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution]]s by assuming a fixed context length <math>|s|=D</math> and, hence, can be considered as special cases of the VOM models.
Effectively, for a given training sequence, VOM models are found to obtain better model parameterization than fixed-order [[Markov chain|Markov models]], which leads to a better [[Bias–variance tradeoff|variance–bias tradeoff]] of the learned models.<ref name="Shmilovici"/><ref name="Begleiter"/><ref name="Ben-Gal"/>
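Continuing the counting sketch above, the hypothetical <code>predict_vom</code> function below illustrates the contrast: a fixed-order model with <math>|s|=D</math> has no estimate (or needs smoothing) whenever the exact length-<math>D</math> context never occurred in training, whereas the variable-order model falls back to the longest observed suffix of the history:

<syntaxhighlight lang="python">
def predict_vom(counts, history, symbol, max_context_len):
    """Estimate P(symbol | history) using the longest suffix of the
    history seen during training (the variable-order behaviour)."""
    for d in range(min(len(history), max_context_len), -1, -1):
        context = tuple(history[len(history) - d:])  # d = 0 gives ()
        ctx_counts = counts.get(context)
        if ctx_counts:  # fall back to a shorter context if unseen
            return ctx_counts[symbol] / sum(ctx_counts.values())
    return 0.0  # only reached if even the empty context is untrained

# Example: after training on "abracadabra", long unseen contexts
# are backed off to shorter ones automatically.
seq = list("abracadabra")
counts = learn_vom_counts(seq, max_context_len=3)
print(predict_vom(counts, list("dab"), "r", 3))  # context ('d','a','b') -> 1.0
</syntaxhighlight>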