m Task 70: Update syntaxhighlight tags - remove use of deprecated <source> tags
m →External links: HTTP to HTTPS for Brown University
(20 intermediate revisions by 13 users not shown)
Line 1:
{{Short description|Inference algorithm for hidden Markov models}}
{{Inline|date=April 2018}}
The '''forward–backward algorithm''' is an [[Statistical inference|inference]] [[algorithm]] for [[hidden Markov model]]s that computes the [[posterior probability|posterior]] [[marginal probability|marginals]] of all hidden state variables given a sequence of observations/emissions <math>o_{1:T}:= o_1,\dots,o_T</math>; that is, for every hidden state variable <math>X_t \in \{X_1, \dots, X_T\}</math> it computes the distribution <math>P(X_t\ |\ o_{1:T})</math>. This inference task is usually called '''smoothing'''. The algorithm uses [[dynamic programming]] to efficiently compute, in two passes, the values required to obtain the posterior marginal distributions. The first pass goes forward in time while the second goes backward in time; hence the name ''forward–backward algorithm''.
The term ''forward–backward algorithm'' is also used to refer to any algorithm belonging to the general class of algorithms that operate on sequence models in a forward–backward manner. In this sense, the descriptions in the remainder of this article refer
==Overview ==
Line 23 ⟶ 24:
==Forward probabilities==
The following description will use matrices of probability values
We transform the probability distributions related to a given [[hidden Markov model]] into matrix notation as follows.
Line 34 ⟶ 35:
</math>
In a typical Markov model, we would multiply a state vector by this matrix to obtain the probabilities for the subsequent state. In a hidden Markov model the state is unknown, and we instead observe events associated with the possible states. An event matrix of the form:
:<math>\mathbf{B} = \begin{pmatrix}
Line 46 ⟶ 47:
:<math>\mathbf{P}(O = j)=\sum_{i} \pi_i B_{i,j}</math>
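As a rough illustration of the sum above, the following sketch computes the probability of each observable event from a state distribution and an event matrix. The numeric values of <code>pi</code> and <code>B</code> and the helper name <code>event_probabilities</code> are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state model with two observable events (illustrative values).
pi = [0.5, 0.5]        # state distribution: pi[i] = P(state i)
B = [[0.9, 0.1],       # event matrix: B[i][j] = P(event j | state i)
     [0.2, 0.8]]


def event_probabilities(pi, B):
    """Return P(O = j) = sum_i pi[i] * B[i][j] for every event j."""
    n_states = len(pi)
    n_events = len(B[0])
    return [sum(pi[i] * B[i][j] for i in range(n_states))
            for j in range(n_events)]


p = event_probabilities(pi, B)
print(p)
```

Because each row of <code>B</code> sums to one and <code>pi</code> is a probability distribution, the resulting event probabilities also sum to one.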
:<math>\mathbf{O_1} = \begin{pmatrix}
Line 60 ⟶ 61:
</math>
We can now make this general procedure specific to our series of observations. Assuming an initial state vector <math>\mathbf{\pi}_0</math>, (which can be optimized as a parameter through repetitions of the forward-
:<math>
\mathbf{f_{0:1}} = \mathbf{\pi}_0 \mathbf{T} \mathbf{O_{
</math>
Line 69 ⟶ 70:
:<math>
\mathbf{f_{0:t}} = \mathbf{f_{0:t-1}} \mathbf{T} \mathbf{O_{
</math>
This value is the unnormalized forward [[probability vector]]. The ''i''th entry of this vector provides:
:<math>
Line 81 ⟶ 82:
:<math>
\mathbf{\hat{f}_{0:t}} = c_t^{-1}\ \mathbf{\hat{f}_{0:t-1}} \mathbf{T} \mathbf{O_{
</math>
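The scaled recursion can be sketched in plain Python as follows. This is a minimal sketch, not a reference implementation: it assumes that the observation matrix for an observed event is the diagonal matrix built from that event's column of <math>\mathbf{B}</math>, and the model values and the function name <code>forward_scaled</code> are hypothetical.

```python
def forward_scaled(pi0, T, B, observations):
    """Scaled forward pass using row vectors.

    Each step computes f_hat_t = c_t^{-1} * f_hat_{t-1} T O_t, where O_t is
    assumed to be the diagonal matrix built from column o_t of B.
    Returns the normalized forward vectors and the scaling factors c_t.
    """
    n = len(pi0)
    f_hat, scales = [], []
    prev = pi0
    for obs in observations:
        # Row vector times T: predicted[j] = sum_i prev[i] * T[i][j]
        predicted = [sum(prev[i] * T[i][j] for i in range(n)) for j in range(n)]
        # Multiplying by the diagonal O_t weights entry j by B[j][obs]
        unnorm = [predicted[j] * B[j][obs] for j in range(n)]
        c = sum(unnorm)  # scaling factor c_t
        prev = [x / c for x in unnorm]
        f_hat.append(prev)
        scales.append(c)
    return f_hat, scales


# Hypothetical model values, for illustration only.
pi0 = [0.5, 0.5]
T = [[0.7, 0.3], [0.3, 0.7]]
B = [[0.9, 0.1], [0.2, 0.8]]
f_hat, scales = forward_scaled(pi0, T, B, [0, 0, 1])
```

Under these assumptions, every returned vector sums to one, and the product of the scaling factors equals the probability of the whole observation sequence, which is why scaling avoids numerical underflow on long sequences without losing that quantity.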
Line 114 ⟶ 115:
</math>
Notice that we are now using a [[Row and column vectors|column vector]] while the forward probabilities used row vectors. We can then work backwards using:
:<math>
Line 190 ⟶ 191:
</math>
Notice that the [[transformation matrix]] is also transposed, but in our example the transpose is equal to the original matrix. Performing these calculations and normalizing the results provides:
:<math>
Line 227 ⟶ 228:
</math>
For the backward probabilities, we start with:
:<math>
Line 255 ⟶ 256:
</math>
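The backward pass can be sketched in the same style. The sketch assumes the usual initialization with a column vector of ones and the recursion <math>\mathbf{b_{t-1:T}} = \mathbf{T}\,\mathbf{O_t}\,\mathbf{b_{t:T}}</math>, normalized at every step; the model values and the function name <code>backward_scaled</code> are hypothetical.

```python
def backward_scaled(T, B, observations):
    """Backward pass using column vectors, normalized at every step.

    Starts from a vector of ones and applies b_{t-1} = T O_t b_t, where O_t is
    assumed to be the diagonal matrix built from column o_t of B.
    Returns a list with one vector per time index 0..len(observations).
    """
    n = len(T)
    current = [1.0] * n
    result = [current]
    for obs in reversed(observations):
        # (T O_t b)[i] = sum_j T[i][j] * B[j][obs] * b[j]
        unnorm = [sum(T[i][j] * B[j][obs] * current[j] for j in range(n))
                  for i in range(n)]
        total = sum(unnorm)
        current = [x / total for x in unnorm]
        result.insert(0, current)
    return result


# Hypothetical model values, for illustration only.
T = [[0.7, 0.3], [0.3, 0.7]]
B = [[0.9, 0.1], [0.2, 0.8]]
b = backward_scaled(T, B, [0, 0, 1, 0, 0])
```

The last entry of <code>b</code> is the initial vector of ones; every earlier entry is rescaled to sum to one, which changes the vectors only by a constant factor and therefore leaves the normalized smoothed values unaffected.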
Finally, we will compute the smoothed probability values. These
:<math>
Line 286 ⟶ 287:
==Performance ==
The
An enhancement to the general forward-backward algorithm, called the [[Island algorithm]], trades smaller memory usage for longer running time, taking <math> O(
In addition, algorithms have been developed to compute <math>\mathbf{f_{0:t+1}}</math> efficiently through online smoothing such as the fixed-lag smoothing (FLS) algorithm
==Pseudocode==
Line 318 ⟶ 319:
Given an HMM (the same one used in the [[Viterbi algorithm]] article), represented in the [[Python programming language]]:
<syntaxhighlight lang="python">
states = (
end_state =
observations = (
start_probability = {
transition_probability = {
emission_probability = {
</syntaxhighlight>
Line 342 ⟶ 343:
    # Forward part of the algorithm
    fwd = []
    for i, observation_i in enumerate(observations):
        f_curr = {}
Line 350:
                prev_f_sum = start_prob[st]
            else:
                prev_f_sum = sum(f_prev[k] * trans_prob[k][st] for k in states)
            f_curr[st] = emm_prob[st][observation_i] * prev_f_sum
Line 361:
    # Backward part of the algorithm
    bkw = []
    for i, observation_i_plus in enumerate(reversed(observations[1:] + (None,))):
        b_curr = {}
        for st in states:
Line 383 ⟶ 382:
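    # Both passes compute the same sequence probability; compare with a
    # relative tolerance because the floating-point sums differ in order.
    assert abs(p_fwd - p_bkw) <= 1e-9 * max(p_fwd, p_bkw)
    return fwd, bkw, posterior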
</syntaxhighlight>
Line 398 ⟶ 396:
<syntaxhighlight lang="python">
def example():
return fwd_bkw(
observations, )
</syntaxhighlight>
<syntaxhighlight lang="pycon">
Line 421:
== References==
{{reflist}}
* [[Lawrence Rabiner|Lawrence R. Rabiner]], A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. ''Proceedings of the [[IEEE]]'', 77 (2), p. 257–286, February 1989. [https://dx.doi.org/10.1109/5.18626 10.1109/5.18626]
* {{cite journal |author=Lawrence R. Rabiner, B. H. Juang|title=An introduction to hidden Markov models|journal=IEEE ASSP Magazine |date=January 1986 |pages=4–15}}
* {{cite book | author = Eugene Charniak|title = Statistical Language Learning|publisher = MIT Press| location=Cambridge, Massachusetts|year = 1993|isbn=978-0-262-53141-2}}
* <cite id = RussellNorvig10>{{cite book | author = Stuart Russell and Peter Norvig|title = Artificial Intelligence A Modern Approach 3rd Edition|publisher = Pearson Education/Prentice-Hall|location = Upper Saddle River, New Jersey|year = 2010|isbn=978-0-13-604259-4}}</cite>
==External links ==
* [http://www.cs.jhu.edu/~jason/papers/#eisner-2002-tnlp An interactive spreadsheet for teaching the forward–backward algorithm] (spreadsheet and article with step-by-step walk-through)
* [
* [http://code.google.com/p/aima-java/ Collection of AI algorithms implemented in Java] (including HMM and the forward–backward algorithm)
{{DEFAULTSORT:Forward-backward algorithm}}
[[Category:Articles with example Python (programming language) code]]
[[Category:Dynamic programming]]
[[Category:Error detection and correction]]