Variable-order Markov model
Consider for example a sequence of [[random variable]]s, each of which takes a value from the ternary [[alphabet]] {''a'', ''b'', ''c''}. Specifically, consider the string ''aaabcaaabcaaabcaaabc...aaabc'' constructed from infinite concatenations of the sub-string ''aaabc''.
 
The VOM model of maximal order 2 can approximate the above string using ''only'' the following five [[conditional probability]] components: {Pr(''a'' | ''aa'') = 0.5, Pr(''b'' | ''aa'') = 0.5, Pr(''c'' | ''b'') = 1.0, Pr(''a'' | ''c'')= 1.0, Pr(''a'' | ''ca'') = 1.0}.
 
In this example, Pr(''c''|''ab'') = Pr(''c''|''b'') = 1.0; therefore, the shorter context ''b'' is sufficient to determine the next character. Similarly, the VOM model of maximal order 3 can generate the string exactly using only five conditional probability components, which are all equal to 1.0.
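These probabilities can be checked empirically. The sketch below (plain Python, using a long finite repetition of ''aaabc'' as a stand-in for the infinite string) tallies next-symbol counts for each context of a given length:

```python
from collections import Counter, defaultdict

def conditional_counts(s, order):
    """Count next-symbol occurrences for every context of the given length."""
    counts = defaultdict(Counter)
    for i in range(order, len(s)):
        counts[s[i - order:i]][s[i]] += 1
    return counts

s = "aaabc" * 1000  # finite stand-in for the infinite concatenation

c1 = conditional_counts(s, 1)
c2 = conditional_counts(s, 2)

# Pr(c | b) = 1.0: the length-1 context b already determines the next symbol
print(c1["b"])   # Counter({'c': 1000})
# Pr(a | aa) = Pr(b | aa) = 0.5: context aa is genuinely ambiguous
print(c2["aa"])
```

The counts confirm that after ''b'' the next character is always ''c'', while after ''aa'' the continuations ''a'' and ''b'' each occur half the time.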
To construct the [[Markov chain]] of order 1 for the next character in that string, one must estimate the following 9 conditional probability components: {Pr(''a'' | ''a''), Pr(''a'' | ''b''), Pr(''a'' | ''c''), Pr(''b'' | ''a''), Pr(''b'' | ''b''), Pr(''b'' | ''c''), Pr(''c'' | ''a''), Pr(''c'' | ''b''), Pr(''c'' | ''c'')}. To construct the Markov chain of order 2 for the next character, one must estimate 27 conditional probability components: {Pr(''a'' | ''aa''), Pr(''a'' | ''ab''), ..., Pr(''c'' | ''cc'')}. To construct the Markov chain of order 3 for the next character, one must estimate 81 conditional probability components: {Pr(''a'' | ''aaa''), Pr(''a'' | ''aab''), ..., Pr(''c'' | ''ccc'')}.
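The component counts above follow directly from the alphabet size: an order-''d'' chain over a ''k''-letter alphabet has <math>k^d</math> contexts, each needing ''k'' next-symbol probabilities.

```python
# Components needed for a fixed-order Markov chain over a k-letter
# alphabet: k**order contexts, each with k next-symbol probabilities.
k = 3  # alphabet {a, b, c}
for order in (1, 2, 3):
    print(order, k ** order * k)  # 1 9, 2 27, 3 81
```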
 
In practical settings there is seldom sufficient data to accurately estimate the [[exponential growth|exponentially increasing]] number of conditional probability components as the order of the Markov chain increases.
Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models<ref name="Shmilovici"/><ref name="Begleiter"/><ref name="Ben-Gal"/> learns a model {{mvar|P}} that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states.
Specifically, the learner generates a [[conditional distribution|conditional probability distribution]] <math>P(x_i \mid s)</math> for a symbol <math>x_i \in A</math> given a context <math>s \in A^*</math>, where the * sign represents a sequence of states of any length, including the empty context.
 
VOM models attempt to estimate [[conditional distribution]]s of the form <math>P(x_i \mid s)</math> where the context length <math>|s| \le D</math> varies depending on the available statistics.
In contrast, conventional [[Markov chain|Markov models]] attempt to estimate these [[conditional distribution]]s by assuming a fixed context length <math>|s| = D</math> and hence can be considered special cases of the VOM models.
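As an illustration only (a toy sketch, not the pruning-based construction algorithms cited above), a variable-order learner can store next-symbol counts for every context up to length ''D'' and, when predicting, fall back from the longest matching context to shorter suffixes; all names here are hypothetical:

```python
from collections import Counter, defaultdict

class SimpleVOM:
    """Toy variable-order model: counts for every context of length <= D,
    prediction backs off to the longest context suffix seen in training."""

    def __init__(self, max_order):
        self.D = max_order
        self.counts = defaultdict(Counter)  # context -> next-symbol counts

    def fit(self, s):
        for i in range(len(s)):
            for d in range(self.D + 1):       # contexts of length 0..D
                if i - d >= 0:
                    self.counts[s[i - d:i]][s[i]] += 1

    def prob(self, symbol, context):
        """Estimate P(symbol | context), backing off to shorter suffixes
        until one was observed during training."""
        for start in range(max(0, len(context) - self.D), len(context) + 1):
            suffix = context[start:]
            if suffix in self.counts:
                c = self.counts[suffix]
                return c[symbol] / sum(c.values())
        return 0.0  # unfitted model

model = SimpleVOM(max_order=2)
model.fit("aaabc" * 1000)
print(model.prob("c", "ab"))  # 1.0
```

A fixed-order model would always condition on exactly ''D'' symbols; here the effective context length adapts to whatever suffix the training data supports.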