Markov model: Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
m Add: doi, pages. Removed URL that duplicated unique identifier. Removed parameters. Formatted dashes. | You can use this bot yourself. Report bugs here.| Activated by User:Nemo bis | via #UCB_webform
OAbot (talk | contribs)
m Open access bot: arxiv, url-access=subscription updated in citation with #oabot.
 
(30 intermediate revisions by 23 users not shown)
Line 1:
{{Short description|Statistical tool to model changing systems}}
{{more citations needed|date=July 2017}}
In [[probability theory]], a '''Markov model''' is a [[stochastic model]] used to [[Mathematical model|model]] pseudo-randomly changing systems.<ref name=":0">{{Cite book|title=Markov Chains: From Theory to Implementation and Experimentation|last=Gagniuc|first=Paul A.|publisher=John Wiley & Sons|year=2017|isbn=978-1-119-38755-8|___location=USA, NJ|pages=1–256}}</ref> It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the [[Markov property]]). Generally, this assumption enables reasoning and computation with the model that would otherwise be [[Intractability (complexity)|intractable]]. For this reason, in the fields of [[predictive modelling]] and [[probabilistic forecasting]], it is desirable for a given model to exhibit the Markov property.
 
==Introduction==
Andrey Andreyevich Markov (14 June 1856 – 20 July 1922) was a Russian mathematician best known for his work on stochastic processes. A primary subject of his research later became known as the Markov chain.
There are four common Markov models used in different situations, depending on whether every sequential state is observable or not, and whether the system is to be adjusted on the basis of observations made:
 
Line 19 ⟶ 21:
{{main|Markov chain}}
 
The simplest Markov model is the [[Markov chain]]. It models the state of a system with a [[random variable]] that changes through time.<ref name=":0" /> In this context, the Markov property suggestsindicates that the distribution for this variable depends only on the distribution of a previous state. An example use of a Markov chain is [[Markov chain Monte Carlo]], which uses the Markov property to prove that a particular method for performing a [[random walk]] will sample from the [[joint distribution]].
 
==Hidden Markov model==
{{main|Hidden Markov model}}
 
A [[hidden Markov model]] is a Markov chain for which the state is only partially observable or noisily observable. In other words, observations are related to the state of the system, but they are typically insufficient to precisely determine the state. Several well-known algorithms for hidden Markov models exist. For example, given a sequence of observations, the [[Viterbi algorithm]] will compute the most-likely corresponding sequence of states, the [[forward algorithm]] will compute the probability of the sequence of observations, and the [[Baum–Welch algorithm]] will estimate the starting probabilities, the transition function, and the observation function of a hidden Markov model.
 
One common use is for [[speech recognition]], where the observed data is the [[Speech coding|speech audio]] [[waveform]] and the hidden state is the spoken text. In this example, the Viterbi algorithm finds the most likely sequence of spoken words given the speech audio.
Line 31 ⟶ 33:
{{main|Markov decision process}}
 
A [[Markov decision process]] is a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards. It is closely related to [[reinforcement learning]], and can be solved with [[value iteration]] and related methods.
 
==Partially observable Markov decision process==
Line 46 ⟶ 48:
| issn = 0004-3702
| doi = 10.1016/S0004-3702(98)00023-X
| doi-access= free
| citeseerx= 10.1.1.390.8474
}}</ref>
 
Line 53 ⟶ 57:
 
==Hierarchical Markov models==
Hierarchical Markov models can be applied to categorize human behavior at various levels of abstraction. For example, a series of simple observations, such as a person's ___location in a room, can be interpreted to determine more complex information, such as in what task or activity the person is performing. Two kinds of Hierarchical Markov Models are the [[Hierarchical hidden Markov model]]<ref name="HHMM">{{cite journal |first1=S. |last1=Fine |first2=Y. |last2=Singer |title=The hierarchical hidden markov model: Analysis and applications |journal=Machine Learning |volume=32 |issue=1 |pages=41–62 |year=1998 |doi=10.1023/A:1007469218079|doi-access=free }}</ref> and the Abstract Hidden Markov Model.<ref name="AHMM">{{cite journal |first1=H. H. |last1=Bui |first2=S. |last2=Venkatesh |first3=G. |last3=West |url=https://www.jair.org/index.php/jair/article/view/10316 |title=Policy recognition in the abstract hidden markov model |journal=Journal of Artificial Intelligence Research |volume=17 |pages=451–499 |year=2002 |doi=10.1613/jair.839|doi-access=free |hdl=10536/DRO/DU:30044252 |hdl-access=free |arxiv=1106.0672 }}</ref> Both have been used for behavior recognition.<ref name="HierarchicalLearningAndPlanningInPOMDPs">{{cite thesis |first=G. |last=Theocharous |url=http://dl.acm.org/citation.cfm?id=936140 |title=Hierarchical Learning and Planning in Partially Observable Markov Decision Processes |type=PhD |publisher=Michigan State University |year=2002}}</ref> and certain conditional independence properties between different levels of abstraction in the model allow for faster learning and inference.<ref name="AHMM" /><ref name="RecognitionOfHumanActivityThroughHierarchicalStochasticLearning">{{cite book |first1=S. |last1=Luhr |first2=H. H. |last2=Bui |first3=S. |last3=Venkatesh |first4=G. A. W. |last4=West |chapter-url=http://dl.acm.org/citation.cfm?id=826390 |chapter=Recognition of Human Activity through Hierarchical Stochastic Learning |title=PERCOM '03 Proceedings of the First IEEE International Conference on Pervasive Computing and Communications |pages=416–422 |year=2003 |doi=10.1109/PERCOM.2003.1192766|isbn=978-0-7695-1893-0 |citeseerx=10.1.1.323.928 |s2cid=13938580 }}</ref>
 
==Tolerant Markov model==
A Tolerant Markov model (TMM) is a probabilistic-algorithmic Markov chain model.<ref name="TMMs">{{cite book |first1=D. |last1=Pratas |first2=M. |last2=Hosseini |first3=A. J. |last3=Pinho |chapter=Substitutional tolerant Markov models for relative compression of DNA sequences |title=PACBB 2017 – 11th International Conference on Practical Applications of Computational Biology & Bioinformatics, Porto, Portugal |pages=265–272 |year=2017 |doi=10.1007/978-3-319-60816-7_32 |isbn=978-3-319-60815-0}}</ref> It assigns the probabilities according to a conditioning context that considers the last symbol, from the sequence to occur, as the most probable instead of the true occurring symbol. A TMM can model three different natures: substitutions, additions or deletions. Successful applications have been efficiently implemented in DNA sequences compression.<ref name="TMMs" /><ref name="GECO">{{cite book |first1=D. |last1=Pratas |first2=A. J. |last2=Pinho |first3=P. J. S. G. |last3=Ferreira |chapter=Efficient compression of genomic sequences |title=Data Compression Conference (DCC), 2016 |pages=231–240 |publisher=IEEE |year=2016 |doi=10.1109/DCC.2016.60|isbn=978-1-5090-1853-6 |s2cid=14230416 }}</ref>
 
==Markov-chain forecasting models==
Markov-chains have been used as a forecasting methods for several topics, for example price trends,<ref name="SLS">{{cite journal |first1=E.G. |last1=de Souza e Silva |first2=L.F.L. |last2=Legey |first3=E.A. |last3=de Souza e Silva |url=https://www.sciencedirect.com/science/article/pii/S0140988310001271 |title=Forecasting oil price trends using wavelets and hidden Markov models |journal=Energy Economics |volume=32 |year=2010|issue=6 |page=1507 |doi=10.1016/j.eneco.2010.08.006 |bibcode=2010EneEc..32.1507D |url-access=subscription }}</ref>, wind power<ref name="CGLT">{{cite journal |first1=A |last1=Carpinone |first2=M |last2=Giorgio |first3=R. |last3=Langella |first4=A. |last4=Testa |title=Markov chain modeling for very-short-term wind power forecasting |journal=Electric Power Systems Research |volume=122 |pages=152–158 |year=2015|doi=10.1016/j.epsr.2014.12.025 |doi-access=free |bibcode=2015EPSR..122..152C }}</ref> and [[solar irradiance]].<ref name="MMW">{{cite journal |first1=J. |last1=Munkhammar |first2=D.W. |last2=van der Meer |first3=J. |last3=Widén |title=Probabilistic forecasting of high-resolution clear-sky index time-series using a Markov-chain mixture distribution model |journal= Solar Energy |volume=184 |pages=688–695 |year=2019|doi=10.1016/j.solener.2019.04.014 |bibcode=2019SoEn..184..688M |s2cid=146076100 }}</ref>. The Markov-chain forecasting models utilize a variety of different settings, from discretizing the time-series<ref name="CGLT" /> to hidden Markov-models combined with wavelets<ref name="SLS" /> and the Markov-chain mixture distribution model (MCM).<ref name="MMW" />.
 
== See also ==
Line 71 ⟶ 75:
 
{{Authority control}}
[[zh-yue:馬可夫鏈]]
 
{{DEFAULTSORT:Markov Model}}
[[Category:Markov models| ]]