The parameter learning task in HMMs is to find, given an output sequence or a set of such sequences, the best set of state transition and emission probabilities. The task is usually to derive the [[maximum likelihood]] estimate of the parameters of the HMM given the set of output sequences. No tractable algorithm is known for solving this problem exactly, but a local maximum likelihood can be derived efficiently using the [[Baum–Welch algorithm]] or the Baldi–Chauvin algorithm. The [[Baum–Welch algorithm]] is a special case of the [[expectation-maximization algorithm]].
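As an illustration of the re-estimation procedure described above (not drawn from the cited literature), the Baum–Welch updates for a single observation sequence can be sketched in Python with NumPy; the toy transition matrix, emission matrix, and initial distribution in the usage below are assumed placeholders, and the scaled forward–backward recursions follow the standard formulation:

```python
import numpy as np

def forward(obs, A, B, pi):
    """Scaled forward pass: alpha[t, i] proportional to P(o_1..o_t, state_t = i)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    scale = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    return alpha, scale  # log-likelihood of the sequence = np.log(scale).sum()

def backward(obs, A, B, scale):
    """Scaled backward pass, reusing the forward pass's per-step scaling factors."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return beta

def baum_welch(obs, A, B, pi, n_iter=20):
    """EM re-estimation of (A, B, pi); converges to a local maximum likelihood."""
    obs = np.asarray(obs)
    A, B, pi = A.copy(), B.copy(), pi.copy()
    N, M = B.shape
    for _ in range(n_iter):
        alpha, scale = forward(obs, A, B, pi)
        beta = backward(obs, A, B, scale)
        gamma = alpha * beta                      # state posteriors (up to a constant)
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((N, N))                     # expected transition counts
        for t in range(len(obs) - 1):
            x = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
            xi += x / x.sum()
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        for k in range(M):                        # expected emission counts per symbol
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= B.sum(axis=1, keepdims=True)
    return A, B, pi
```

Each iteration is one EM step, so the sequence likelihood is non-decreasing across iterations; the per-timestep scaling avoids the numerical underflow that the unscaled recursions suffer on long sequences.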
If HMMs are used for time-series prediction, more sophisticated Bayesian inference methods, such as [[Markov chain Monte Carlo]] (MCMC) sampling, have proven favorable over finding a single maximum-likelihood model, both in terms of accuracy and stability.<ref>Sipos, I. Róbert. ''Parallel stratified MCMC sampling of AR-HMMs for stochastic time series prediction''. In: Proceedings, 4th Stochastic Modeling Techniques and Data Analysis International Conference with Demographics Workshop (SMTDA2016), pp. 295–306. Valletta, 2016. [http://1drv.ms/b/s!ApL_0Av0YGDLglwEOv1aYAGbmQeL PDF]</ref> Since MCMC imposes a significant computational burden, in cases where computational scalability is also of interest, one may alternatively resort to variational approximations of Bayesian inference.<ref>{{cite journal |url=http://users.iit.demokritos.gr/~dkosmo/downloads/patrec10/vbb10.pdf |doi=10.1016/j.patcog.2010.09.001 |volume=44 |issue=2 |title=A variational Bayesian methodology for hidden Markov models utilizing Student's-t mixtures |year=2011 |journal=Pattern Recognition |pages=295–306 |last1=Chatzis |first1=Sotirios P. |last2=Kosmopoulos |first2=Dimitrios I. |bibcode=2011PatRe..44..295C |citeseerx=10.1.1.629.6275 |access-date=2018-03-11 |archive-date=2011-04-01 |archive-url=https://web.archive.org/web/20110401184517/http://users.iit.demokritos.gr/~dkosmo/downloads/patrec10/vbb10.pdf |url-status=dead }}</ref> Indeed, approximate variational inference offers computational efficiency comparable to expectation-maximization, while yielding an accuracy profile only slightly inferior to exact MCMC-type Bayesian inference.
== Applications ==
== History ==
Hidden Markov models were described in a series of statistical papers by [[Leonard E. Baum]] and other authors in the second half of the 1960s.<ref>{{cite journal |last=Baum |first=L. E. |author2=Petrie, T. |title=Statistical Inference for Probabilistic Functions of Finite State Markov Chains |journal=The Annals of Mathematical Statistics |year=1966 |volume=37 |issue=6 |pages=1554–1563 |doi=10.1214/aoms/1177699147|doi-access=free }}</ref><ref>{{Cite journal |last1=Baum |first1=L. E. |last2=Eagon |first2=J. A. |doi=10.1090/S0002-9904-1967-11751-8 |title=An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology |journal=[[Bulletin of the American Mathematical Society]] |volume=73 |issue=3 |pages=360 |year=1967 |zbl=0157.11101 |url=http://projecteuclid.org/euclid.bams/1183528841 |doi-access=free }}</ref><ref>{{cite journal |last=Baum |first=L. E. |author2=Sell, G. R. |title=Growth transformations for functions on manifolds |journal=Pacific Journal of Mathematics |year=1968 |volume=27 |issue=2 |pages=211–227 |url=https://www.scribd.com/doc/6369908/Growth-Functions-for-Transformations-on-Manifolds |access-date=28 November 2011 |doi=10.2140/pjm.1968.27.211 |doi-access=free |archive-date=18 March 2014 |archive-url=https://web.archive.org/web/20140318223209/http://www.scribd.com/doc/6369908/Growth-Functions-for-Transformations-on-Manifolds |url-status=dead }}</ref><ref>{{Cite journal |last1=Baum |first1=L. E. |author-link1=Leonard E. Baum |last2=Petrie |first2=T. |last3=Soules |first3=G. |last4=Weiss |first4=N. |title=A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains |doi=10.1214/aoms/1177697196 |journal=[[The Annals of Mathematical Statistics]] |volume=41 |issue=1 |pages=164–171 |year=1970 |jstor=2239727 |zbl=0188.49603 |mr=287613 |doi-access=free }}</ref><ref>{{cite journal |last=Baum |first=L.E. 
|title=An Inequality and Associated Maximization Technique in Statistical Estimation of Probabilistic Functions of a Markov Process |journal=Inequalities |year=1972 |volume=3 |pages=1–8}}</ref> One of the first applications of HMMs was [[speech recognition]], starting in the mid-1970s.<ref>{{Cite journal |last1=Baker |first1=J. |author-link1=James K. Baker |doi=10.1109/TASSP.1975.1162650 |title=The DRAGON system—An overview |journal=IEEE Transactions on Acoustics, Speech, and Signal Processing |volume=23 |pages=24–29 |year=1975 }}</ref><ref>{{Cite journal |last1=Jelinek |first1=F. |last2=Bahl |first2=L. |last3=Mercer |first3=R. |doi=10.1109/TIT.1975.1055384 |title=Design of a linguistic statistical decoder for the recognition of continuous speech |journal=[[IEEE Transactions on Information Theory]] |volume=21 |issue=3 |pages=250 |year=1975 }}</ref><ref>{{cite book |title=Hidden Markov Models for Speech Recognition |publisher=Edinburgh University Press |year=1990 |isbn=978-0-7486-0162-2 |author1=Xuedong Huang |author2=M. Jack |author3=Y. Ariki |author-link1=Xuedong Huang}}</ref><ref>{{cite book |title=Spoken Language Processing |publisher=Prentice Hall |year=2001 |isbn=978-0-13-022616-7 |author1=Xuedong Huang |author2=Alex Acero |author3=Hsiao-Wuen Hon |author-link1=Xuedong Huang}}</ref>
In the second half of the 1980s, HMMs began to be applied to the analysis of biological sequences,<ref>{{cite journal |doi=10.1016/0022-2836(86)90289-5 |author=M. Bishop and E. Thompson |title=Maximum Likelihood Alignment of DNA Sequences |journal=[[Journal of Molecular Biology]] |volume=190 |issue=2 |pages=159–165 |year=1986 |pmid=3641921}} {{subscription required}} {{closed access}}</ref> in particular [[DNA]]. Since then, they have become ubiquitous in the field of [[bioinformatics]].<ref name=durbin>{{Durbin 1998|mode=cs1}}</ref>