Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.
Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x|s)</math> for a symbol <math>x \in A</math> given a context <math>s\in A^*</math>, where the * sign denotes a sequence of states of any length, including the empty context.
VOM models attempt to estimate [[conditional distribution|conditional distributions]] of this form in which the length of the conditioning context is allowed to vary depending on the available statistics.
In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution|conditional distributions]] under the assumption of a fixed context length and can therefore be viewed as special cases of VOM models.
Effectively, for a given training sequence, VOM models are found to obtain better model parameterization than fixed-order [[Markov chain|Markov models]], which leads to a better [[variance]]-bias tradeoff of the learned models [2,3,4].
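The idea of estimating <math>P(x|s)</math> while letting the context length vary can be sketched as a simple frequency-counting learner. This is an illustrative sketch only: the function names (`learn_vom`, `predict`) and the `max_order` bound are assumptions for the example, and the actual construction algorithms cited in [2,3,4] use more sophisticated context pruning and smoothing.

```python
from collections import defaultdict

def learn_vom(sequence, max_order=3):
    """Count every context of length 0..max_order together with the
    symbol that follows it, so P(x|s) can be estimated by relative
    frequency.  (Sketch, not a full VOM construction algorithm.)"""
    counts = defaultdict(lambda: defaultdict(int))
    for i, x in enumerate(sequence):
        for d in range(min(i, max_order) + 1):
            context = sequence[i - d:i]  # the d symbols preceding x
            counts[context][x] += 1
    return counts

def predict(counts, context, symbol, max_order=3):
    """Estimate P(symbol | context) from the longest suffix of
    `context` seen in training -- the 'variable memory' step."""
    for d in range(min(len(context), max_order), -1, -1):
        s = context[len(context) - d:]
        if s in counts:
            total = sum(counts[s].values())
            return counts[s][symbol] / total
    return 0.0  # empty context is always present after training

model = learn_vom("aaabcaaabc", max_order=2)
p = predict(model, "aa", "a", max_order=2)  # -> 0.5: "aa" is followed by "a" twice and "b" twice
```

Falling back to ever-shorter suffixes is what lets the model spend long contexts only where the training data supports them, which is the source of the variance-bias advantage described above.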
==Example==