The parameter learning task in HMMs is to find, given an output sequence or a set of such sequences, the best set of state transition and emission probabilities. The task is usually to derive the [[maximum likelihood]] estimate of the parameters of the HMM given the set of output sequences. No tractable algorithm is known for solving this problem exactly, but a local maximum likelihood can be derived efficiently using the [[Baum–Welch algorithm]] or the Baldi–Chauvin algorithm. The [[Baum–Welch algorithm]] is a special case of the [[expectation-maximization algorithm]].
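As an illustration of the re-estimation procedure described above (not drawn from the cited literature), the Baum–Welch updates for a single observation sequence can be sketched in Python with NumPy; the toy transition matrix, emission matrix, and initial distribution in the usage below are assumed placeholders, and the scaled forward–backward recursions follow the standard formulation:

```python
import numpy as np

def forward(obs, A, B, pi):
    """Scaled forward pass: alpha[t, i] proportional to P(o_1..o_t, state_t = i)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    scale = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    return alpha, scale  # log-likelihood of the sequence = np.log(scale).sum()

def backward(obs, A, B, scale):
    """Scaled backward pass, reusing the forward pass's per-step scaling factors."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return beta

def baum_welch(obs, A, B, pi, n_iter=20):
    """EM re-estimation of (A, B, pi); converges to a local maximum likelihood."""
    obs = np.asarray(obs)
    A, B, pi = A.copy(), B.copy(), pi.copy()
    N, M = B.shape
    for _ in range(n_iter):
        alpha, scale = forward(obs, A, B, pi)
        beta = backward(obs, A, B, scale)
        gamma = alpha * beta                      # state posteriors (up to a constant)
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((N, N))                     # expected transition counts
        for t in range(len(obs) - 1):
            x = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
            xi += x / x.sum()
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        for k in range(M):                        # expected emission counts per symbol
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= B.sum(axis=1, keepdims=True)
    return A, B, pi
```

Each iteration is one EM step, so the sequence likelihood is non-decreasing across iterations; the per-timestep scaling avoids the numerical underflow that the unscaled recursions suffer on long sequences.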
If HMMs are used for time-series prediction, more sophisticated Bayesian inference methods, such as [[Markov chain Monte Carlo]] (MCMC) sampling, have proven favorable over finding a single maximum-likelihood model, both in terms of accuracy and stability.<ref>Sipos, I. Róbert. ''Parallel stratified MCMC sampling of AR-HMMs for stochastic time series prediction''. In: Proceedings, 4th Stochastic Modeling Techniques and Data Analysis International Conference with Demographics Workshop (SMTDA2016), pp. 295–306. Valletta, 2016. [http://1drv.ms/b/s!ApL_0Av0YGDLglwEOv1aYAGbmQeL PDF]</ref> Since MCMC imposes a significant computational burden, in cases where computational scalability is also of interest, one may alternatively resort to variational approximations of Bayesian inference.<ref>{{cite journal |url=http://users.iit.demokritos.gr/~dkosmo/downloads/patrec10/vbb10.pdf |doi=10.1016/j.patcog.2010.09.001 |volume=44 |issue=2 |title=A variational Bayesian methodology for hidden Markov models utilizing Student's-t mixtures |year=2011 |journal=Pattern Recognition |pages=295–306 |last1=Chatzis |first1=Sotirios P. |last2=Kosmopoulos |first2=Dimitrios I. |bibcode=2011PatRe..44..295C |citeseerx=10.1.1.629.6275 |access-date=2018-03-11 |archive-date=2011-04-01 |archive-url=https://web.archive.org/web/20110401184517/http://users.iit.demokritos.gr/~dkosmo/downloads/patrec10/vbb10.pdf |url-status=dead }}</ref> Indeed, approximate variational inference offers computational efficiency comparable to expectation-maximization, while yielding an accuracy profile only slightly inferior to exact MCMC-type Bayesian inference.
== Applications ==
== History ==
Hidden Markov models were described in a series of statistical papers by [[Leonard E. Baum]] and other authors in the second half of the 1960s.<ref>{{cite journal |last=Baum |first=L. E. |author2=Petrie, T. |title=Statistical Inference for Probabilistic Functions of Finite State Markov Chains |journal=The Annals of Mathematical Statistics |year=1966 |volume=37 |issue=6 |pages=1554–1563 |doi=10.1214/aoms/1177699147|doi-access=free }}</ref><ref>{{Cite journal |last1=Baum |first1=L. E. |last2=Eagon |first2=J. A. |doi=10.1090/S0002-9904-1967-11751-8 |title=An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology |journal=[[Bulletin of the American Mathematical Society]] |volume=73 |issue=3 |pages=360 |year=1967 |zbl=0157.11101 |url=http://projecteuclid.org/euclid.bams/1183528841 |doi-access=free }}</ref><ref>{{cite journal |last=Baum |first=L. E. |author2=Sell, G. R. |title=Growth transformations for functions on manifolds |journal=Pacific Journal of Mathematics |year=1968 |volume=27 |issue=2 |pages=211–227 |url=https://www.scribd.com/doc/6369908/Growth-Functions-for-Transformations-on-Manifolds |access-date=28 November 2011 |doi=10.2140/pjm.1968.27.211 |doi-access=free |archive-date=18 March 2014 |archive-url=https://web.archive.org/web/20140318223209/http://www.scribd.com/doc/6369908/Growth-Functions-for-Transformations-on-Manifolds |url-status=dead }}</ref><ref>{{Cite journal |last1=Baum |first1=L. E. |author-link1=Leonard E. Baum |last2=Petrie |first2=T. |last3=Soules |first3=G. |last4=Weiss |first4=N. |title=A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains |doi=10.1214/aoms/1177697196 |journal=[[The Annals of Mathematical Statistics]] |volume=41 |issue=1 |pages=164–171 |year=1970 |jstor=2239727 |zbl=0188.49603 |mr=287613 |doi-access=free }}</ref><ref>{{cite journal |last=Baum |first=L.E. 
|title=An Inequality and Associated Maximization Technique in Statistical Estimation of Probabilistic Functions of a Markov Process |journal=Inequalities |year=1972 |volume=3 |pages=1–8}}</ref> One of the first applications of HMMs was [[speech recognition]], starting in the mid-1970s.<ref>{{Cite journal |last1=Baker |first1=J. |author-link1=James K. Baker |doi=10.1109/TASSP.1975.1162650 |title=The DRAGON system—An overview |journal=IEEE Transactions on Acoustics, Speech, and Signal Processing |volume=23 |pages=24–29 |year=1975 }}</ref><ref>{{Cite journal |last1=Jelinek |first1=F. |last2=Bahl |first2=L. |last3=Mercer |first3=R. |doi=10.1109/TIT.1975.1055384 |title=Design of a linguistic statistical decoder for the recognition of continuous speech |journal=[[IEEE Transactions on Information Theory]] |volume=21 |issue=3 |pages=250 |year=1975 }}</ref><ref>{{cite book |title=Hidden Markov Models for Speech Recognition |publisher=Edinburgh University Press |year=1990 |isbn=978-0-7486-0162-2 |author1=Xuedong Huang |author2=M. Jack |author3=Y. Ariki |author-link1=Xuedong Huang}}</ref><ref>{{cite book |title=Spoken Language Processing |publisher=Prentice Hall |year=2001 |isbn=978-0-13-022616-7 |author1=Xuedong Huang |author2=Alex Acero |author3=Hsiao-Wuen Hon |author-link1=Xuedong Huang}}</ref>
In the second half of the 1980s, HMMs began to be applied to the analysis of biological sequences,<ref>{{cite journal |doi=10.1016/0022-2836(86)90289-5 |author=M. Bishop and E. Thompson |title=Maximum Likelihood Alignment of DNA Sequences |journal=[[Journal of Molecular Biology]] |volume=190 |issue=2 |pages=159–165 |year=1986 |pmid=3641921}} {{subscription required}} {{closed access}}</ref> in particular [[DNA]]. Since then, they have become ubiquitous in the field of [[bioinformatics]].<ref name=durbin>{{Durbin 1998|mode=cs1}}</ref>