Viterbi algorithm: Difference between revisions

Content deleted Content added
Example: Function usage is obvious; terminal call not needed
Pseudocode: We don't need two versions of pseudocode
Line 45:
;Output
* The most likely hidden state sequence <math> X=(x_1,x_2,\ldots,x_T) </math>
'''function''' ''VITERBIviterbi''<math>(O,S,\Pi,Y,A,B):X</math>
'''for''' each state <math>i=1,2,\ldots,K</math> '''do'''
<math>T_1[i,1]\leftarrow\pi_i\cdot B_{iy_1}</math>
Line 64:
'''return''' <math>X</math>
'''end function'''
 
Restated in a succinct near-[[Python (programming language)|Python]]:
'''function''' ''viterbi''<math>(O, S, \Pi, Tm, Em): best\_path</math> Tm: transition matrix Em: emission matrix
<math>trellis \leftarrow matrix(length(S), length(O))</math> To hold probability of each state given each observation
<math>pointers \leftarrow matrix(length(S), length(O))</math> To hold backpointer to best prior state
'''for''' s '''in''' <math>range(length(S))</math>: Determine each hidden state's probability at time 0…
<math>trellis[s, 0] \leftarrow \Pi[s] \cdot Em[s, O[0]]</math>
'''for''' o '''in''' <math>range(1, length(O))</math>: …and after, tracking each state's most likely prior state, k
'''for''' s '''in''' <math>range(length(S))</math>:
<math>k \leftarrow \arg\max(trellis[k, o-1] \cdot Tm[k, s] \cdot Em[s, o]\ \mathsf{for}\ k\ \mathsf{in}\ range(length(S)))</math>
<math>trellis[s, o] \leftarrow trellis[k, o-1] \cdot Tm[k, s] \cdot Em[s, o]</math>
<math>pointers[s, o] \leftarrow k</math>
<math>best\_path \leftarrow list()</math>
<math>k \leftarrow \arg\max(trellis[k, length(O)-1]\ \mathsf{for}\ k\ \mathsf{in}\ range(length(S)))</math> Find k of best final state
'''for''' o '''in''' <math>range(length(O)-1, -1, -1)</math>: Backtrack from last observation
<math>best\_path.insert(0, S[k])</math> Insert previous state on most likely path
<math>k \leftarrow pointers[k, o]</math> Use backpointer to find best previous state
'''return''' <math>best\_path</math>
 
;Explanation:
Line 98 ⟶ 80:
\begin{align}
x_T &= \arg\max_{x \in S} (V_{T,x}), \\
x_{t-1} &= \mathrm{Ptr}(x_t,t).
\end{align}
</math>
 
where [[arg max]] has its usual definition.
Here we're using the standard definition of [[arg max]].
 
The complexity of this implementation is <math>O(T\times\left|{S}\right|^2)</math>. A better estimationestimate exists if the maximum in the internal loop is instead found by iterating only over states that directly link to the current state (i.e. there is an edge from <math>k</math> to <math>j</math>). Then using [[amortized analysis]] one can show that the complexity is <math>O(T\times(\left|{S}\right| + \left|{E}\right|))</math>, where <math>E</math> is the number of edges in the graph.
 
== Example ==