Forward–backward algorithm

{{Inline|date=April 2018}}
 
The '''forward–backward algorithm''' is an [[statistical inference|inference]] [[algorithm]] for [[hidden Markov model]]s which computes the [[posterior probability|posterior]] [[marginal probability|marginals]] of all hidden state variables given a sequence of observations/emissions <math>o_{1:T} := o_1,\dots,o_T</math>, i.e. it computes, for all hidden state variables <math>X_t \in \{X_1, \dots, X_T\}</math>, the distribution <math>P(X_t\ |\ o_{1:T})</math>. This inference task is usually called ''smoothing''. The algorithm makes use of the principle of [[dynamic programming]] to efficiently compute the values that are required to obtain the posterior marginal distributions in two passes. The first pass goes forward in time while the second goes backward in time; hence the name ''forward–backward algorithm''.
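
To make the objects involved concrete, the following minimal Python sketch sets up a toy two-state hidden Markov model; the variable names <code>A</code> (transition matrix), <code>B</code> (emission matrix) and <code>pi</code> (initial distribution), as well as all the numbers, are illustrative assumptions for this example and are not taken from the article.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical two-state HMM; all names and numbers here are
# illustrative assumptions, not part of the article's definitions.
A = np.array([[0.7, 0.3],    # A[i, j] = P(X_{t+1} = j | X_t = i)
              [0.3, 0.7]])
B = np.array([[0.9, 0.1],    # B[i, k] = P(o_t = k | X_t = i)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial distribution P(X_1)

# Observation sequence o_{1:T}, encoded as emission indices (T = 5).
observations = [0, 0, 1, 0, 0]
</syntaxhighlight>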
 
The term ''forward–backward algorithm'' is also used to refer to any algorithm belonging to the general class of algorithms that operate on sequence models in a forward–backward manner. In this sense, the descriptions in the remainder of this article refer only to one specific instance of this class.
 
==Overview==
In the first pass, the forward–backward algorithm computes a set of forward probabilities which provide, for all <math>t \in \{1, \dots, T\}</math>, the probability of ending up in any particular state given the first <math>t</math> observations in the sequence, i.e. <math>P(X_t\ |\ o_{1:t})</math>. In the second pass, the algorithm computes a set of backward probabilities which provide the probability of observing the remaining observations given any starting point <math>t</math>, i.e. <math>P(o_{t+1:T}\ |\ X_t)</math>. These two sets of probability distributions can then be combined to obtain the distribution over states at any specific point in time given the entire observation sequence:
 
:<math>P(X_t\ |\ o_{1:T}) = P(X_t\ |\ o_{1:t}, o_{t+1:T}) \propto P(o_{t+1:T}\ |\ X_t) P(X_t\ |\ o_{1:t})</math>
 
The last step follows from an application of [[Bayes' rule]] and the [[conditional independence]] of <math>o_{t+1:T}</math> and <math>o_{1:t}</math> given <math>X_t</math>.
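
As a concrete (hypothetical) illustration of the two passes and of the combination step above, the following Python sketch, continuing the toy model defined earlier, computes normalized forward probabilities <math>P(X_t\ |\ o_{1:t})</math>, backward probabilities proportional to <math>P(o_{t+1:T}\ |\ X_t)</math>, and their normalized product. The function name and the per-step rescaling are assumptions of this sketch, not notation from the article.

<syntaxhighlight lang="python">
import numpy as np

def forward_backward(A, B, pi, observations):
    """Posterior marginals P(X_t | o_{1:T}) for t = 1, ..., T.

    A minimal, unoptimized sketch. Normalizing each forward vector
    keeps the recursion numerically stable and makes alpha[t] equal
    to P(X_t | o_{1:t}) rather than the joint P(X_t, o_{1:t}).
    """
    T, n = len(observations), len(pi)

    # Forward pass: alpha[t] = P(X_t | o_{1:t}) after normalization.
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, observations[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, observations[t]]
        alpha[t] /= alpha[t].sum()

    # Backward pass: beta[t] proportional to P(o_{t+1:T} | X_t),
    # with base case beta[T-1] = 1.
    beta = np.ones((T, n))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, observations[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()  # rescaling for numerical stability only

    # Combine: P(X_t | o_{1:T}) is proportional to
    # P(o_{t+1:T} | X_t) * P(X_t | o_{1:t}), then normalize.
    posterior = alpha * beta
    posterior /= posterior.sum(axis=1, keepdims=True)
    return posterior
</syntaxhighlight>

Each row <code>posterior[t]</code> of the result is then the smoothed distribution <math>P(X_t\ |\ o_{1:T})</math> over the hidden states.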
 
As outlined above, the algorithm involves three steps: