Variable-order Markov model: Difference between revisions

Content deleted Content added
Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5
(14 intermediate revisions by 6 users not shown)
Line 2:
In the mathematical theory of [[stochastic processes]], '''variable-order Markov (VOM) models''' are an important class of models that extend the well known [[Markov chain]] models. In contrast to the Markov chain models, where each [[random variable]] in a sequence with a [[Markov property]] depends on a fixed number of random variables, in VOM models this number of conditioning random variables may vary based on the specific observed realization.
 
This realization sequence is often called the ''context''; therefore the VOM models are also called ''context trees''.<ref name="Rissanen">{{cite journal|last = Rissanen|first = J.|title = A Universal Data Compression System|journal = IEEE Transactions on Information Theory|volume = 29|issue = 5|date = Sep 1983|pages = 656–664|doi = 10.1109/TIT.1983.1056741}}</ref> VOM models are nicely rendered by colorized probabilistic suffix trees (PST).<ref name=":0">{{Cite journal|last1=Gabadinho|first1=Alexis|last2=Ritschard|first2=Gilbert|date=2016|title=Analyzing State Sequences with Probabilistic Suffix Trees: The PST R Package|url=http://www.jstatsoft.org/v72/i03/|journal=Journal of Statistical Software|language=en|volume=72|issue=3|doi=10.18637/jss.v072.i03|s2cid=63681202 |issn=1548-7660|doi-access=free}}</ref> The flexibility in the number of conditioning random variables turns out to be of real advantage for many applications, such as [[statistical analysis]], [[Statistical classification|classification]] and [[prediction]].<ref name="Shmilovici">{{cite journal|last = Shmilovici|first = A.|author2=Ben-Gal, I. |title = Using a VOM Model for Reconstructing Potential Coding Regions in EST Sequences|journal = Computational Statistics|volume = 22|issue = 1|year = 2007|pages = 49–69|doi = 10.1007/s00180-007-0021-8| s2cid=2737235 }}</ref><ref name="Begleiter">{{cite journal|last = Begleiter|first = R.|author2 = El-Yaniv, R.|author3 = Yona, G.|title = On Prediction Using Variable Order Markov models|journal = Journal of Artificial Intelligence Research|volume = 22|year = 2004|pages = 385–421|url = http://www.jair.org/media/1491/live-1491-2335-jair.pdf|doi = 10.1613/jair.1491|accessdoi-dateaccess = 2007-04-22free|archive-urlarxiv = https://web1107.archive.org/web/20070928175244/http://www.jair.org/media/1491/live-1491-2335-jair.pdf|archive-date = 2007-09-28|url-status = dead0051}}</ref><ref name="Ben-Gal">{{cite journal|last = Ben-Gal|first = I. |author2=Morag, G. |author3=Shmilovici, A.|year=2003|title =Context-Based CSPCStatistical Process Control: A Monitoring Procedure for State -Dependent Processes|journal = Technometrics|volume = 45|issue = 4|year = 2003|pages = 293–311|url = http://www.eng.tau.ac.il/~bengal/Technometrics_final.pdf|journal=Technometrics|volume=45|issue=4|pages=293–311|doi = 10.1198/004017003000000122|s2cid=5227793 |issn=0040-1706}}</ref>
 
==Example==
 
Consider for example a sequence of [[random variable]]s, each of which takes a value from the ternary [[alphabet]] {{math|{{mset|''a'',&nbsp; ''b'',&nbsp; ''c''}}}}. Specifically, consider the string ''{{not a typo|aaabcaaabcaaabcaaabc...aaabc}}'' constructed from infinite concatenations of the sub-string ''{{not a typomath|''aaabc''}}: {{math|''aaabcaaabcaaabcaaabc…aaabc''}}.
 
The VOM model of maximal order 2 can approximate the above string using ''only'' the following five [[conditional probability]] components: {{math|Pr(''a'' |{{!}} ''aa'') {{=}} 0.5}}, {{math|Pr(''b'' |{{!}} ''aa'') {{=}} 0.5}}, {{math|Pr(''c'' |{{!}} ''b'') {{=}} 1.0}}, {{math|Pr(''a'' |{{!}} ''c''){{=}} 1.0}}, {{math|Pr(''a'' |{{!}} ''ca'') {{=}} 1.0}}.
 
In this example, {{math|Pr(''c''| {{!}} ''ab'') {{=}} Pr(''c''| {{!}} ''b'') {{=}} 1.0}}; therefore, the shorter context {{math|''b''}} is sufficient to determine the next character. Similarly, the VOM model of maximal order 3 can generate the string exactly using only five conditional probability components, which are all equal to 1.0.
To construct the [[Markov chain]] of order 1 for the next character in that string, one must estimate the following 9 conditional probability components: {{math|Pr(''a'' |{{!}} ''a'')}}, {{math|Pr(''a'' |{{!}} ''b'')}}, {{math|Pr(''a'' |{{!}} ''c'')}}, {{math|Pr(''b'' |{{!}} ''a'')}}, {{math|Pr(''b'' |{{!}} ''b'')}}, {{math|Pr(''b'' |{{!}} ''c'')}}, {{math|Pr(''c'' |{{!}} ''a'')}}, {{math|Pr(''c'' |{{!}} ''b'')}}, {{math|Pr(''c'' |{{!}} ''c'')}}. To construct the Markov chain of order 2 for the next character, one must estimate 27 conditional probability components: {{math|Pr(''a'' |{{!}} ''aa'')}}, {{math|Pr(''a'' |{{!}} ''ab'')}}, ...{{math|…}}, {{math|Pr(''c'' |{{!}} ''cc'')}}. And to construct the Markov chain of order three for the next character one must estimate the following 81 conditional probability components: {{math|Pr(''a'' |{{!}} ''aaa'')}}, {{math|Pr(''a'' |{{!}} ''aab'')}}, ...{{math|…}}, {{math|Pr(''c'' |{{!}} ''ccc'')}}.
 
In practical settings there is seldom sufficient data to accurately estimate the [[exponential growth|exponentially increasing]] number of conditional probability components as the order of the Markov chain increases.
Line 19:
 
==Definition==
Let {{mvar|A}} be a state space (finite [[Alphabet (formal languages)|alphabet]]) of size <math>|A|</math>.
 
Consider a sequence with the [[Markov property]] <math>x_1^{n}=x_1x_2\dots x_n</math> of {{mvar|n}} realizations of [[random variable]]s, where <math> x_i\in A</math> is the state (symbol) at position {{mvar|i}} <math>\scriptstyle (1 \le i \le n)</math>, and the concatenation of states <math>x_i</math> and <math>x_{i+1}</math> is denoted by <math>x_ix_{i+1}</math>.
Line 35:
Various efficient algorithms have been devised for estimating the parameters of the VOM model.<ref name="Begleiter"/>
 
VOM models have been successfully applied to areas such as [[machine learning]], [[information theory]] and [[bioinformatics]], including specific applications such as [[code|coding]] and [[data compression]],<ref name="Rissanen"/> document compression,<ref name="Begleiter"/> classification and identification of [[DNA]] and [[protein|protein sequences]],<ref>{{cite journal |url= http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf |title= VOMBAT: Prediction of Transcription Factor Binding Sites using Variable Order Bayesian Trees |author1= Grau J. |author2= Ben-Gal I. |author3= Posch S. |author4= Grosse I. |journal= Nucleic Acids Research |publisher= Nucleic Acids Research, vol. 34, issue W529–W533. |year= 2006 |volume= 34 |issue= Web Server issue |pages= W529-33 |doi= 10.1093/nar/gkl212 |pmid= 16845064 |pmc= 1538886 |archive-date= 2018-09-30 |access-date= 2014-01-10 |archive-url= https://web.archive.org/web/20180930084306/http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf |url-status= dead }}</ref> [http://www.eng.tau.ac.il/~bengal/VOMBAT.pdf]<ref name="Shmilovici"/> [[statistical process control]],<ref name="Ben-Gal"/> [[spam filtering]],<ref name="Bratko">{{cite journal|last = Bratko|first = A. |author2=Cormack, G. V. |author3=Filipic, B. |author4=Lynam, T. |author5=Zupan, B.|title = Spam Filtering Using Statistical Data Compression Models|journal = Journal of Machine Learning Research|volume = 7|year = 2006|pages = 2673–2698|url = http://www.jmlr.org/papers/volume7/bratko06a/bratko06a.pdf}}</ref> [[haplotyping]],<ref>[[Sharon R. Browning|Browning, Sharon R.]] "Multilocus association mapping using variable-length Markov chains." The American Journal of Human Genetics 78.6 (2006): 903–913.</ref> speech recognition,<ref>{{Cite book|last1=Smith|first1=A.|last2=Denenberg|first2=J.|last3=Slack|first3=T.|last4=Tan|first4=C.|last5=Wohlford|first5=R.|title=ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter=Application of a sequential pattern learning system to connected speech recognition |date=1985|chapter-url=https://ieeexplore.ieee.org/document/1168282|___location=Tampa, FL, USA|publisher=Institute of Electrical and Electronics Engineers|volume=10|pages=1201–1204|doi=10.1109/ICASSP.1985.1168282|s2cid=60991068 }}</ref> [[sequence analysis in social sciences]],<ref name=":0" /> and others.
 
==See also==
Line 49:
==References==
{{reflist}}
 
<ref>{{cite journal |last1=Smith |first1=A. R. |last2=Denenberg |first2=J. N. |last3=Slack |first3=T. B. |last4=Tan |first4=C. C. |last5=Wohlford |first5=R. E |title=Application of a Sequential Pattern Learning System to Connected Speech Recognition |journal=Proceedings of the IEEE 1985 International Conference on Acoustics, Speech, and Signal Processing |date=August 1985 |pages=1201–1204 |url=http://doctord.dyndns.org/Pubs/ApplicatioOfaSequentialPatternLearningSystem.pdf}}</ref>
[[Category:Markov models]]