{{Mergefrom|BCJR algorithm|discuss=Talk:Forward–backward algorithm#Merger proposal|date=June 2009}}
 
The '''forward–backward algorithm''' is an [[inference]] [[algorithm]] for [[hidden Markov models]] which computes the [[posterior probability|posterior]] [[marginal probability|marginals]] of all hidden state variables given a sequence of observations/emissions <math>o_{1:t}:= o_1,\dots,o_t</math>, i.e. it computes, for all hidden state variables <math>X_k \in \{X_1, \dots, X_t\}</math>, the distribution <math>P(X_k\ |\ o_{1:t})</math>. This inference task is usually called ''smoothing''. The algorithm makes use of the principle of [[dynamic programming]] to efficiently compute the values that are required to obtain the posterior marginal distributions in two passes. The first pass goes forward in time while the second goes backward in time; hence the name ''forward–backward algorithm''.
 
The term ''forward–backward algorithm'' is also used to refer to any algorithm belonging to the general class of algorithms that operate on sequence models in a forward–backward manner. In this sense, the descriptions in the remainder of this article refer to only one specific instance of this class.
 
==Overview==
In the first pass, the forward–backward algorithm computes a set of forward probabilities which provide, for all <math>k \in \{1, \dots, t\}</math>, the probability of ending up in any particular state given the first <math>k</math> observations in the sequence, i.e. <math>P(X_k\ |\ o_{1:k})</math>. In the second pass, the algorithm computes a set of backward probabilities which provide the probability of observing the remaining observations given any starting point <math>k</math>, i.e. <math>P(o_{k+1:t}\ |\ X_k)</math>. These two sets of probability distributions can then be combined to obtain the distribution over states at any specific point in time given the entire observation sequence:
 
:<math>P(X_k\ |\ o_{1:t}) = P(X_k\ |\ o_{1:k}, o_{k+1:t}) \propto P(o_{k+1:t}\ |\ X_k) P(X_k\ |\ o_{1:k})</math>
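
The combination above can be illustrated with a short sketch. It is not taken from the cited sources; the two-state transition matrix <code>A</code>, emission matrix <code>B</code> and initial distribution <code>pi</code> are placeholder values chosen only for illustration. The sketch computes rescaled forward probabilities, rescaled backward probabilities, and then multiplies and renormalises them to obtain the smoothed posteriors:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative two-state HMM (placeholder parameters, not from the cited sources)
A  = np.array([[0.7, 0.3],          # transition probabilities P(X_{k+1} | X_k)
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],          # emission probabilities P(o_k | X_k); columns index observation symbols
               [0.2, 0.8]])
pi = np.array([0.5, 0.5])           # initial state distribution P(X_1)

def forward_backward(obs):
    """Return the smoothed posteriors P(X_k | o_{1:t}) for every time step k."""
    T, N = len(obs), len(pi)

    # Forward pass: alpha[k] is proportional to P(X_k, o_{1:k})
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()                    # rescale to avoid numerical underflow
    for k in range(1, T):
        alpha[k] = (alpha[k-1] @ A) * B[:, obs[k]]
        alpha[k] /= alpha[k].sum()

    # Backward pass: beta[k] is proportional to P(o_{k+1:t} | X_k)
    beta = np.ones((T, N))
    for k in range(T - 2, -1, -1):
        beta[k] = A @ (B[:, obs[k+1]] * beta[k+1])
        beta[k] /= beta[k].sum()

    # Combine and renormalise: P(X_k | o_{1:t}) is proportional to alpha[k] * beta[k]
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

print(forward_backward([0, 1, 0]))                # one posterior distribution per time step
</syntaxhighlight>

Each row of the returned array sums to one and gives <math>P(X_k\ |\ o_{1:t})</math> for the corresponding time step. The per-step rescaling of <code>alpha</code> and <code>beta</code> does not affect the result, because the product is renormalised row by row; it only keeps the values in a numerically safe range on long sequences.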
The forward and backward steps are often called the "forward message pass" and "backward message pass". The wording originates from the way the algorithm processes the given observation sequence: it first moves forward, starting with the first observation in the sequence and going to the last, and then returns to the first. At each observation, the probabilities to be used in the calculations at the next observation are computed. During the backward pass the algorithm simultaneously performs the smoothing step, which allows the estimate at each point in time to take into account the observations that come later in the sequence, yielding more accurate results.
 
The forward–backward algorithm can be used to find the most likely state for any point in time. It cannot, however, be used to find the most likely ''sequence'' of states, because the sequence of individually most likely states need not itself be a likely (or even possible) sequence (see the [[Viterbi algorithm]]).
 
==Forward probabilities==
 
==Performance==
The brute-force procedure for the solution of this problem is the generation of all possible <math>N^T</math> state sequences and calculating the joint probability of each state sequence with the observed series of events. This approach has [[time complexity]] <math> O(T \cdot N^T) </math>, where <math>T</math> is the length of the sequence and <math>N</math> is the number of symbols in the state alphabet. This is intractable for realistic problems, as the number of possible hidden node sequences typically is extremely high. However, the forward–backward algorithm has time complexity <math> O(N^2 T)\, </math>.
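
For example, with <math>N = 5</math> hidden states and a sequence of <math>T = 100</math> observations, exhaustive enumeration would have to score <math>N^T = 5^{100} \approx 7.9 \times 10^{69}</math> state sequences, whereas the forward–backward recursions require only on the order of <math>N^2 T = 2500</math> arithmetic operations.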
 
Several enhancements to the general forward–backward algorithm are known which allow the computation to take place in constant space. In addition, algorithms have been developed to compute <math>\mathbf{f_{0:t+1}}</math> efficiently as <math>t</math> grows through online smoothing, such as the fixed-lag smoothing (FLS) algorithm ([[#RusselNorvig03|Russell & Norvig 2003, p. 552]]).
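
The idea behind fixed-lag smoothing can be sketched as follows; the model parameters are again illustrative placeholders, and this naive version simply recomputes the backward message over the lag window at every step, whereas the FLS algorithm of Russell & Norvig updates the required quantities incrementally rather than recomputing them:

<syntaxhighlight lang="python">
import numpy as np
from collections import deque

# Same illustrative two-state model as above (placeholder parameters)
A  = np.array([[0.7, 0.3], [0.4, 0.6]])   # transition probabilities
B  = np.array([[0.9, 0.1], [0.2, 0.8]])   # emission probabilities
pi = np.array([0.5, 0.5])                 # initial state distribution

def fixed_lag_smoother(obs_stream, lag):
    """Yield P(X_{t-lag} | o_{1:t}) after each new observation, once
    `lag` observations of future evidence have arrived."""
    window = deque()              # the most recent lag+1 observations
    alpha = pi.copy()             # predicted distribution for the oldest time in the window
    for o in obs_stream:
        window.append(o)
        if len(window) <= lag:
            continue              # not enough future evidence yet
        # Naive backward pass over the lag window, recomputed at every step
        beta = np.ones(len(pi))
        for future_obs in reversed(list(window)[1:]):
            beta = A @ (B[:, future_obs] * beta)
        gamma = alpha * B[:, window[0]] * beta
        yield gamma / gamma.sum()             # smoothed posterior, delayed by `lag` steps
        # Advance the filtered estimate by one step and slide the window forward
        alpha = (alpha * B[:, window[0]]) @ A
        alpha /= alpha.sum()
        window.popleft()

for posterior in fixed_lag_smoother([0, 1, 1, 0, 1], lag=2):
    print(posterior)
</syntaxhighlight>

Because each output waits only for <code>lag</code> further observations, such a smoother can run online on a stream, trading a small fixed delay for estimates that incorporate some future evidence.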
 
==Pseudocode==
 
==References==
* [[Lawrence Rabiner|Lawrence R. Rabiner]], A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. ''Proceedings of the [[IEEE]]'', 77 (2), pp.&nbsp;257–286, February 1989.
*{{cite journal |author=Lawrence R. Rabiner, B. H. Juang |title=An introduction to hidden Markov models |journal=IEEE ASSP Magazine |date=January 1986 |pages=4–15}}
*{{cite book | author = Eugene Charniak|title = Statistical Language Learning|publisher = MIT Press|address = Cambridge, Massachusetts|year = 1993|isbn=978-0-262-53141-2}}
<cite id=RusselNorvig03>
*{{cite book | author = Stuart Russell and Peter Norvig | title = Artificial Intelligence: A Modern Approach | edition = 2nd | publisher = Pearson Education | address = Upper Saddle River, New Jersey | year = 2003 | isbn = 0-13-080302-2}}
</cite>
 
==External links==
* [http://www.cs.jhu.edu/~jason/papers/#tnlp02 An interactive spreadsheet for teaching the forward–backward algorithm] (spreadsheet and article with step-by-step walk-through)
* [http://www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/HiddenMarkovModels.html Tutorial of hidden Markov models including the forward–backward algorithm]
* [http://code.google.com/p/aima-java/ Collection of AI algorithms implemented in Java] (including HMM and the forward–backward algorithm)
 
[[Category:Dynamic programming]]