Variable-order Markov model: Difference between revisions

Browse history interactively

← Previous edit Next edit →

Content deleted Content added

VisualWikitext

Revision as of 14:46, 25 November 2021 edit Yensaa (talk \| contribs) 185 edits m completed a ref Tag: Visual edit ← Previous edit		Revision as of 09:34, 25 July 2025 edit undo InternetArchiveBot (talk \| contribs) Bots, Pending changes reviewers 5,674,743 edits Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5 Next edit →
(8 intermediate revisions by 4 users not shown)
Line 2: In the mathematical theory of [[stochastic processes]], '''variable-order Markov (VOM) models''' are an important class of models that extend the well known [[Markov chain]] models. In contrast to the Markov chain models, where each [[random variable]] in a sequence with a [[Markov property]] depends on a fixed number of random variables, in VOM models this number of conditioning random variables may vary based on the specific observed realization. This realization sequence is often called the ''context''; therefore the VOM models are also called ''context trees''.<ref name="Rissanen">{{cite journal\|last = Rissanen\|first = J.\|title = A Universal Data Compression System\|journal = IEEE Transactions on Information Theory\|volume = 29\|issue = 5\|date = Sep 1983\|pages = 656–664\|doi = 10.1109/TIT.1983.1056741}}</ref> VOM models are nicely rendered by colorized probabilistic suffix trees (PST).<ref name=":0">{{Cite journal\|~~last~~last1=Gabadinho\|~~first~~first1=Alexis\|last2=Ritschard\|first2=Gilbert\|date=2016\|title=Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package\|url=http://www.jstatsoft.org/v72/i03/\|journal=Journal of Statistical Software\|language=en\|volume=72\|issue=3\|doi=10.18637/jss.v072.i03\|s2cid=63681202 \|issn=1548-7660\|doi-access=free}}</ref> The flexibility in the number of conditioning random variables turns out to be of real advantage for many applications, such as [[statistical analysis]], [[Statistical classification\|classification]] and [[prediction]].<ref name="Shmilovici">{{cite journal\|last = Shmilovici\|first = A.\|author2=Ben-Gal, I. \|title = Using a VOM Model for Reconstructing Potential Coding Regions in EST Sequences\|journal = Computational Statistics\|volume = 22\|issue = 1\|year = 2007\|pages = 49–69\|doi = 10.1007/s00180-007-0021-8\| s2cid=2737235 }}</ref><ref name="Begleiter">{{cite journal\|last = Begleiter\|first = R.\|author2 = El-Yaniv, R.\|author3 = Yona, G.\|title = On Prediction Using Variable Order Markov models\|journal = Journal of Artificial Intelligence Research\|volume = 22\|year = 2004\|pages = 385–421~~\|url = http://www.jair.org/media/1491/live-1491-2335-jair.pdf~~\|doi = 10.1613/jair.1491\|doi-access~~-date~~ = ~~2007-04-22~~free\|~~archive-url~~arxiv = ~~https://web~~1107.~~archive.org/web/20070928175244/http://www.jair.org/media/1491/live-1491-2335-jair.pdf\|archive-date = 2007-09-28\|url-status = dead\|doi-access = free~~0051}}</ref><ref name="Ben-Gal">{{cite journal\|last=Ben-Gal\|first=I.\|author2=Morag, G.\|author3=Shmilovici, A.\|year=2003\|title=Context-Based Statistical Process Control: A Monitoring Procedure for State-Dependent Processes\|url=http://www.eng.tau.ac.il/~bengal/Technometrics_final.pdf\|journal=Technometrics\|volume=45\|issue=4\|pages=293–311\|doi=10.1198/004017003000000122\|s2cid=5227793 \|issn=0040-1706}}</ref> ==Example== Consider for example a sequence of [[random variable]]s, each of which takes a value from the ternary [[alphabet]] {{math\|{{mset\|''a'',~~ ~~ ''b'',~~ ~~ ''c''}}}}. Specifically, consider the string ~~''{{not a typo\|aaabcaaabcaaabcaaabc...aaabc}}''~~ constructed from infinite concatenations of the sub-string ''{{~~not a typo~~math\|''aaabc''}}: {{math\|''aaabcaaabcaaabcaaabc…aaabc''}}. The VOM model of maximal order 2 can approximate the above string using ''only'' the following five [[conditional probability]] components: {{math\|Pr(''a'' \|{{!}} ''aa'') {{=}} 0.5}}, {{math\|Pr(''b'' \|{{!}} ''aa'') {{=}} 0.5}}, {{math\|Pr(''c'' \|{{!}} ''b'') {{=}} 1.0}}, {{math\|Pr(''a'' \|{{!}} ''c''){{=}} 1.0}}, {{math\|Pr(''a'' \|{{!}} ''ca'') {{=}} 1.0}}. In this example, {{math\|Pr(''c''\| {{!}} ''ab'') {{=}} Pr(''c''\| {{!}} ''b'') {{=}} 1.0}}; therefore, the shorter context {{math\|''b''}} is sufficient to determine the next character. Similarly, the VOM model of maximal order 3 can generate the string exactly using only five conditional probability components, which are all equal to 1.0. To construct the [[Markov chain]] of order 1 for the next character in that string, one must estimate the following 9 conditional probability components: {{math\|Pr(''a'' \|{{!}} ''a'')}}, {{math\|Pr(''a'' \|{{!}} ''b'')}}, {{math\|Pr(''a'' \|{{!}} ''c'')}}, {{math\|Pr(''b'' \|{{!}} ''a'')}}, {{math\|Pr(''b'' \|{{!}} ''b'')}}, {{math\|Pr(''b'' \|{{!}} ''c'')}}, {{math\|Pr(''c'' \|{{!}} ''a'')}}, {{math\|Pr(''c'' \|{{!}} ''b'')}}, {{math\|Pr(''c'' \|{{!}} ''c'')}}. To construct the Markov chain of order 2 for the next character, one must estimate 27 conditional probability components: {{math\|Pr(''a'' \|{{!}} ''aa'')}}, {{math\|Pr(''a'' \|{{!}} ''ab'')}}, ~~...~~{{math\|…}}, {{math\|Pr(''c'' \|{{!}} ''cc'')}}. And to construct the Markov chain of order three for the next character one must estimate the following 81 conditional probability components: {{math\|Pr(''a'' \|{{!}} ''aaa'')}}, {{math\|Pr(''a'' \|{{!}} ''aab'')}}, ~~...~~{{math\|…}}, {{math\|Pr(''c'' \|{{!}} ''ccc'')}}. In practical settings there is seldom sufficient data to accurately estimate the [[exponential growth\|exponentially increasing]] number of conditional probability components as the order of the Markov chain increases. Line 19: ==Definition== Let {{mvar\|A}} be a state space (finite [[Alphabet (formal languages)\|alphabet]]) of size <math>\|A\|</math>. Consider a sequence with the [[Markov property]] <math>x_1^{n}=x_1x_2\dots x_n</math> of {{mvar\|n}} realizations of [[random variable]]s, where <math> x_i\in A</math> is the state (symbol) at position {{mvar\|i}} <math>\scriptstyle (1 \le i \le n)</math>, and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_ix_{i+1}</math>. Line 35: Various efficient algorithms have been devised for estimating the parameters of the VOM model.<ref name="Begleiter"/> VOM models have been successfully applied to areas such as [[machine learning]], [[information theory]] and [[bioinformatics]], including specific applications such as [[code\|coding]] and [[data compression]],<ref name="Rissanen"/> document compression,<ref name="Begleiter"/> classification and identification of [[DNA]] and [[protein\|protein sequences]],<ref>{{cite journal \|url= http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf \|title= VOMBAT: Prediction of Transcription Factor Binding Sites using Variable Order Bayesian Trees \|author1= Grau J. \|author2= Ben-Gal I. \|author3= Posch S. \|author4= Grosse I. \|journal= Nucleic Acids Research \|publisher= Nucleic Acids Research, vol. 34, issue W529–W533. \|year= 2006 \|volume= 34 \|issue= Web Server issue \|pages= W529-33 \|doi= 10.1093/nar/gkl212 \|pmid= 16845064 \|pmc= 1538886 \|archive-date= 2018-09-30 \|access-date= 2014-01-10 \|archive-url= https://web.archive.org/web/20180930084306/http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf \|url-status= dead }}</ref> [http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf]<ref name="Shmilovici"/> [[statistical process control]],<ref name="Ben-Gal"/> [[spam filtering]],<ref name="Bratko">{{cite journal\|last = Bratko\|first = A. \|author2=Cormack, G. V. \|author3=Filipic, B. \|author4=Lynam, T. \|author5=Zupan, B.\|title = Spam Filtering Using Statistical Data Compression Models\|journal = Journal of Machine Learning Research\|volume = 7\|year = 2006\|pages = 2673–2698\|url = http://www.jmlr.org/papers/volume7/bratko06a/bratko06a.pdf}}</ref> [[haplotyping]],<ref>[[Sharon R. Browning\|Browning, Sharon R.]] "Multilocus association mapping using variable-length Markov chains." The American Journal of Human Genetics 78.6 (2006): 903–913.</ref> speech recognition,<ref>{{Cite ~~journal~~book\|~~last~~last1=Smith\|~~first~~first1=A.\|last2=Denenberg\|first2=J.\|last3=Slack\|first3=T.\|last4=Tan\|first4=C.\|last5=Wohlford\|first5=R.\|~~date~~title=~~1985~~ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing \|~~title~~chapter=Application of a sequential pattern learning system to connected speech recognition \|date=1985\|chapter-url=~~http~~https://ieeexplore.ieee.org/document/1168282~~/\|journal=ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing~~\|___location=Tampa, FL, USA\|publisher=Institute of Electrical and Electronics Engineers\|volume=10\|pages=1201–1204\|doi=10.1109/ICASSP.1985.1168282\|s2cid=60991068 }}</ref> [[sequence analysis in social sciences]],<ref name=":0" /> and others. ==See also==