Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.
Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x|s)</math> for a symbol <math>x \in A</math> given a context <math>s\in A^*</math>, where the * sign denotes a sequence of states of any length, including the empty context.
VOM models attempt to estimate [[conditional distribution|conditional distributions]] of this form in which the length of the conditioning context is allowed to vary depending on the available statistics.
In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution|conditional distributions]] under the assumption of a fixed context length and can therefore be viewed as special cases of VOM models.
Effectively, for a given training sequence, VOM models are found to obtain better model parameterization than fixed-order [[Markov chain|Markov models]], which leads to a better [[variance]]-bias tradeoff of the learned models [2,3,4].
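The idea of estimating <math>P(x|s)</math> while letting the context length vary can be sketched as a simple frequency-counting learner. This is an illustrative sketch only: the function names (`learn_vom`, `predict`) and the `max_order` bound are assumptions for the example, and the actual construction algorithms cited in [2,3,4] use more sophisticated context pruning and smoothing.

```python
from collections import defaultdict

def learn_vom(sequence, max_order=3):
    """Count every context of length 0..max_order together with the
    symbol that follows it, so P(x|s) can be estimated by relative
    frequency.  (Sketch, not a full VOM construction algorithm.)"""
    counts = defaultdict(lambda: defaultdict(int))
    for i, x in enumerate(sequence):
        for d in range(min(i, max_order) + 1):
            context = sequence[i - d:i]  # the d symbols preceding x
            counts[context][x] += 1
    return counts

def predict(counts, context, symbol, max_order=3):
    """Estimate P(symbol | context) from the longest suffix of
    `context` seen in training -- the 'variable memory' step."""
    for d in range(min(len(context), max_order), -1, -1):
        s = context[len(context) - d:]
        if s in counts:
            total = sum(counts[s].values())
            return counts[s][symbol] / total
    return 0.0  # empty context is always present after training

model = learn_vom("aaabcaaabc", max_order=2)
p = predict(model, "aa", "a", max_order=2)  # -> 0.5: "aa" is followed by "a" twice and "b" twice
```

Falling back to ever-shorter suffixes is what lets the model spend long contexts only where the training data supports them, which is the source of the variance-bias advantage described above.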
==Example==