Revision as of 05:20, 25 July 2014 edit Monkbot (talk \| contribs) Bots 3,695,952 edits m Task 5: Fix CS1 deprecated coauthor parameter errors ← Previous edit		Revision as of 08:11, 27 July 2014 edit undo Yobot (talk \| contribs) Bots 4,733,870 edits m WP:CHECKWIKI error fixes using AWB (10331) Next edit →
Line 1: In [[stochastic processes]], a topic in [[mathematics]], '''Variable-order Markov (VOM) models''' are an important class of models that extend the well known [[Markov chain]] models. In contrast to the Markov chain models, where each [[random variable]] in a sequence with a [[Markov property]] depends on a fixed number of random variables, in VOM models this number of conditioning random variables may vary based on the specific observed realization. This realization sequence is often called the ''context''; therefore the VOM models are also called ''context trees''.<ref name="Rissanen">{{cite journal\|last = Rissanen\|first = J.\|title = A Universal Data Compression System\|journal = IEEE Transactions on Information Theory\|volume = 29\|issue = 5\|date = Sep 1983\|pages = 656–664\|url = http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=22734&arnumber=1056741\|doi = 10.1109/TIT.1983.1056741}}</ref> The flexibility in the number of conditioning random variables turns out to be of real advantage for many applications, such as [[statistical analysis]], [[Statistical classification\|classification]] and [[prediction]].<ref name="Shmilovici">{{cite journal\|last = Shmilovici\|first = A.\|author2=Ben-Gal, I. \|title = Using a VOM Model for Reconstructing Potential Coding Regions in EST Sequences\|journal = Computational Statistics\|volume = 22\|issue = 1\|year = 2007\|pages = 49–69\|url=http://www.springerlink.com/content/a447865604519210/\|doi = 10.1007/s00180-007-0021-8}}</ref><ref name="Begleiter">{{cite journal\|last = Begleiter\|first = R.\|coauthors = El-Yaniv, R. and Yona, G.\|title = On Prediction Using Variable Order Markov models\|journal = Journal of Artificial Intelligence Research\|volume = 22\|year = 2004\|pages = 385–421\|url = http://www.jair.org/media/1491/live-1491-2335-jair.pdf}}</ref><ref name="Ben-Gal">{{cite journal\|last = Ben-Gal\|first = I.\|coauthors = Morag, G. and Shmilovici, A.\|title = CSPC: A Monitoring Procedure for State Dependent Processes\|journal = Technometrics\|volume = 45\|issue = 4\|year = 2003\|pages = 293–311\|url = http://www.eng.tau.ac.il/~bengal/Technometrics_final.pdf\|doi = 10.1198/004017003000000122}}</ref> Line 7: Consider for example a sequence of [[random variable]]s, each of which takes a value from the ternary [[alphabet]] {''a'', ''b'', ''c''}. Specifically, consider the string ''aaabcaaabcaaabcaaabc...aaabc'' constructed from infinite concatenations of the sub-string ''aaabc''. The VOM model of maximal order 2 can approximate the above string using ''only'' the following five [[conditional probability]] components: {Pr(''a''\|''aa'') = 0.5, Pr(''b''\|''aa'') = 0.5, Pr(''c''\|''b'') = 1.0, Pr(''a''\|''c'')= 1.0, Pr(''a''\|''ca'')= 1.0}. In this example, Pr(''c''\|''ab'') = Pr(''c''\|''b'') = 1.0; therefore, the shorter context ''b'' is sufficient to determine the next character. Similarly, the VOM model of maximal order 3 can generate the string exactly using only five conditional probability components, which are all equal to 1.0. Line 18: ==Definition== Let <math>A</math> be a state space (finite [[alphabet]]) of size <nowiki>\|A\|</nowiki>. Consider a sequence with the [[Markov property]] <math>x_1^{n}=x_1x_2\dots x_n</math> of <math>n</math> realizations of [[random variable]]s, where <math> x_i\in A</math> is the state (symbol) at position <math>i</math> 1≤<math>i</math>≤<math>n</math>, and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_ix_{i+1}</math>. Line 24: Given a training set of observed states, <math>x_1^{n}</math>, the construction algorithm of the VOM models<ref name="Shmilovici"/><ref name="Begleiter"/><ref name="Ben-Gal"/> learns a model <math>P</math> that provides a [[probability]] assignment for each state in the sequence given its past (previously observed symbols) or future states. Specifically, the learner generates a [[conditional distribution\|conditional probability distribution]] <math>P(x_i\|s)</math> for a symbol <math>x_i \in A</math> given a context <math>s\in A^</math>, where the sign represents a sequence of states of any length, including the empty context. VOM models attempt to estimate [[conditional distribution]]s of the form <math>P(x_i\|s)</math> where the context length \|<math>s</math>\|≤<math>D</math> varies depending on the available statistics. In contrast, conventional [[Markov chain\|Markov models]] attempt to estimate these [[conditional distribution]]s by assuming a fixed contexts' length \|<math>s</math>\|=<math>D</math> and, hence, can be considered as special cases of the VOM models. Effectively, for a given training sequence, the VOM models are found to obtain better model parameterization than the fixed-order [[Markov chain\|Markov models]] that leads to a better [[variance]]-bias tradeoff of the learned models.<ref name="Shmilovici"/><ref name="Begleiter"/><ref name="Ben-Gal"/> Line 34: Various efficient algorithms have been devised for estimating the parameters of the VOM model.<ref name="Begleiter"/> VOM models have been successfully applied to areas such as [[machine learning]], [[information theory]] and [[bioinformatics]], including specific applications such as [[code\|coding]] and [[data compression]],<ref name="Rissanen"/> document compression,<ref name="Begleiter"/> classification and identification of [[DNA]] and [[protein\|protein sequences]],<ref>{{cite paper \|url= http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf\|title= VOMBAT: Prediction of Transcription Factor Binding Sites using Variable Order Bayesian Trees, \|author= Grau J., Ben-Gal I., Posch S., Grosse I.\|publisher= Nucleic Acids Research, vol. 34, issue W529–W533.\|year=2006}}</ref> [http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf]<ref name="Shmilovici"/> [[statistical process control]],<ref name="Ben-Gal"/> [[spam filtering]],<ref name="Bratko">{{cite journal\|last = Bratko\|first = A.\|coauthors = Cormack, G. V., Filipic, B., Lynam, T. and Zupan, B.\|title = Spam Filtering Using Statistical Data Compression Models\|journal = Journal of Machine Learning Research\|volume = 7\|year = 2006\|pages = 2673–2698\|url = http://www.jmlr.org/papers/volume7/bratko06a/bratko06a.pdf}}</ref>, [[haplotyping]]<ref>Browning, Sharon R. "Multilocus association mapping using variable-length Markov chains." The American Journal of Human Genetics 78.6 (2006): 903-913.</ref> and others. ==See also==

Variable-order Markov model: Difference between revisions