A hidden Markov model (HMM) is a statistical model in which the system being modelled is assumed to be a Markov process with unknown parameters; the challenge is to determine the hidden parameters from the observable outputs under this assumption. The extracted model parameters can then be used to perform further analysis, for example in pattern recognition applications.
The notions of observable and hidden are similar to Plato's notions of shadows and forms in the allegory of the cave. The allegory claims that perceived reality is but the shadow thrown into the world of experience of a true reality which is inaccessible to direct sensory experience. "Forms" in the true reality contain the essence of a class of object which can be experienced only incompletely in perceived reality. This analogy is particularly strong when modelling parts of speech and sentences, and other entities which have a strongly defined semantic meaning independent of the myriad of possible representations in the observable sequence.
In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. A hidden Markov model adds outputs: each state has a probability distribution over the possible output tokens. Therefore, looking at a sequence of tokens generated by an HMM does not directly indicate the sequence of states.
State transitions in a hidden Markov model
- x — States of the Markov model
- a — Transition probabilities
- b — Output probabilities
- y — Observable outputs
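To make the legend concrete, the following is a minimal Python sketch of such a model. Every name and number here (the weather states, the activities, and all probability values) is an illustrative assumption for this sketch rather than something from the article; the later sketches reuse these tables.

```python
# A minimal sketch of an HMM specification, mirroring the legend above.
# States, outputs, and all probability values are illustrative assumptions.

states = ["Rainy", "Sunny"]               # x: states of the Markov model
observations = ["walk", "shop", "clean"]  # y: observable outputs

# a: transition probabilities, a[i][j] = P(x(t+1) = j | x(t) = i)
a = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}

# b: output probabilities, b[i][k] = P(y(t) = k | x(t) = i)
b = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

# Initial state distribution, needed to start the chain.
pi = {"Rainy": 0.6, "Sunny": 0.4}
```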
Evolution of a Markov model
The preceding diagram emphasizes the state transitions of a hidden Markov model. It is also useful to explicitly represent the evolution of the model over time, with the states at different times t1 and t2 represented by different variables, x(t1) and x(t2).
In this diagram, it is understood that the time slices (x(t), y(t)) extend to previous and following times as needed. Typically the earliest slice is at time t=0 or time t=1.
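As a small illustration of this evolution, the following sketch (assuming the illustrative tables defined above) samples successive slices (x(t), y(t)) one at a time, starting from the earliest slice at t = 0; `simulate` and its parameters are hypothetical names chosen for this sketch.

```python
import random

def simulate(T, seed=0):
    """Sample state slices x(0..T-1) and output slices y(0..T-1)
    from the illustrative HMM defined above."""
    rng = random.Random(seed)

    def draw(dist):
        # Draw one outcome from a {outcome: probability} mapping.
        r, cumulative = rng.random(), 0.0
        for outcome, p in dist.items():
            cumulative += p
            if r < cumulative:
                return outcome
        return outcome  # guard against floating-point rounding

    x = [draw(pi)]                        # slice t = 0
    for t in range(1, T):
        x.append(draw(a[x[t - 1]]))       # x(t) depends only on x(t - 1)
    y = [draw(b[state]) for state in x]   # y(t) depends only on x(t)
    return x, y

hidden, observed = simulate(T=5)
```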
Using hidden Markov models
There are three canonical problems to solve with HMMs:
- Given the model parameters, compute the probability of a particular output sequence. This is solved by the forward algorithm (a sketch follows this list).
- Given the model parameters, find the most likely sequence of hidden states that could have generated a given output sequence. This is solved by the Viterbi algorithm (sketched after this list).
- Given an output sequence, find the most likely set of state transition and output probabilities. This is solved by the Baum-Welch algorithm (a single re-estimation step is sketched after this list).
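For concreteness, here is a minimal sketch of the forward algorithm in Python, reusing the illustrative `states`, `pi`, `a`, and `b` tables defined earlier; `forward_probability` is a hypothetical name for this sketch.

```python
def forward_probability(obs_seq):
    """Probability of obs_seq under the illustrative HMM above
    (the forward algorithm)."""
    # alpha[i] = P(y(0..t), x(t) = i), updated one time slice at a time.
    alpha = {i: pi[i] * b[i][obs_seq[0]] for i in states}
    for obs in obs_seq[1:]:
        alpha = {
            j: sum(alpha[i] * a[i][j] for i in states) * b[j][obs]
            for j in states
        }
    return sum(alpha.values())

print(forward_probability(["walk", "shop", "clean"]))  # ~0.0336 for these tables
```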
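A corresponding sketch of the Viterbi algorithm, under the same assumptions; for each state it tracks the probability of the single best path ending there, together with that path.

```python
def viterbi(obs_seq):
    """Most likely hidden state sequence for obs_seq, with its probability
    (the Viterbi algorithm), under the illustrative HMM above."""
    # delta[i]: probability of the best state path ending in state i;
    # paths[i]: that best path.
    delta = {i: pi[i] * b[i][obs_seq[0]] for i in states}
    paths = {i: [i] for i in states}
    for obs in obs_seq[1:]:
        new_delta, new_paths = {}, {}
        for j in states:
            # Best predecessor state when moving into j at this time slice.
            best = max(states, key=lambda i: delta[i] * a[i][j])
            new_delta[j] = delta[best] * a[best][j] * b[j][obs]
            new_paths[j] = paths[best] + [j]
        delta, paths = new_delta, new_paths
    last = max(states, key=lambda i: delta[i])
    return paths[last], delta[last]

best_path, prob = viterbi(["walk", "shop", "clean"])
```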
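Finally, a sketch of one Baum-Welch re-estimation step, following the standard forward-backward (EM) formulation; the full algorithm repeats this update until the estimates converge. As above, all tables are the illustrative assumptions defined earlier, and the initial distribution `pi` is held fixed to keep the sketch short.

```python
def baum_welch_step(obs_seq):
    """One Baum-Welch (EM) re-estimation step for the illustrative HMM above.
    Returns updated transition (a) and output (b) tables."""
    T = len(obs_seq)

    # Forward variables: f[t][i] = P(y(0..t), x(t) = i).
    f = [{i: pi[i] * b[i][obs_seq[0]] for i in states}]
    for t in range(1, T):
        f.append({j: sum(f[t - 1][i] * a[i][j] for i in states) * b[j][obs_seq[t]]
                  for j in states})

    # Backward variables: r[t][i] = P(y(t+1..T-1) | x(t) = i).
    r = [dict.fromkeys(states, 1.0) for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in states:
            r[t][i] = sum(a[i][j] * b[j][obs_seq[t + 1]] * r[t + 1][j]
                          for j in states)

    evidence = sum(f[T - 1][i] for i in states)  # P(y(0..T-1))

    # gamma[t][i] = P(x(t) = i | y): posterior state occupancy.
    gamma = [{i: f[t][i] * r[t][i] / evidence for i in states} for t in range(T)]

    new_a, new_b = {}, {}
    for i in states:
        out_of_i = sum(gamma[t][i] for t in range(T - 1))
        # Expected i -> j transition counts, normalized by time spent in i.
        new_a[i] = {j: sum(f[t][i] * a[i][j] * b[j][obs_seq[t + 1]] * r[t + 1][j]
                           for t in range(T - 1)) / evidence / out_of_i
                    for j in states}
        in_i = sum(gamma[t][i] for t in range(T))
        # Expected counts of emitting k while in state i, normalized likewise.
        new_b[i] = {k: sum(gamma[t][i] for t in range(T) if obs_seq[t] == k) / in_i
                    for k in observations}
    return new_a, new_b
```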
Applications of hidden Markov models
- speech recognition or optical character recognition
- natural language processing
- bioinformatics and genomics, including:
  - prediction of protein-coding regions in genome sequences
  - modelling families of related DNA or protein sequences
  - prediction of secondary structure elements from protein primary sequences
- and many more...
External links
- Hidden Markov Model (HMM) Toolbox for Matlab (by Kevin Murphy)
- Hidden Markov Models (an exposition using basic mathematics)
- GHMM Library (home page of the GHMM Library project)