;Output
* The most likely hidden state sequence <math> X=(x_1,x_2,\ldots,x_T) </math>
'''function''' ''VITERBI''<math>(O,S,\Pi,Y,A,B):X</math>
    '''for''' each state <math>i=1,2,\ldots,K</math> '''do''' <math>T_1[i,1]\leftarrow\pi_i\cdot B_{iy_1}</math>;  <math>T_2[i,1]\leftarrow 0</math> '''end for'''
    '''for''' each observation <math>j=2,3,\ldots,T</math> and each state <math>i=1,2,\ldots,K</math> '''do'''
        <math>T_1[i,j]\leftarrow\max_{k}\left(T_1[k,j-1]\cdot A_{ki}\cdot B_{iy_j}\right)</math>
        <math>T_2[i,j]\leftarrow\arg\max_{k}\left(T_1[k,j-1]\cdot A_{ki}\cdot B_{iy_j}\right)</math>
    '''end for'''
    <math>z_T\leftarrow\arg\max_{k}\left(T_1[k,T]\right)</math>;  <math>x_T\leftarrow s_{z_T}</math>
    '''for''' <math>j=T,T-1,\ldots,2</math> '''do''' <math>z_{j-1}\leftarrow T_2[z_j,j]</math>;  <math>x_{j-1}\leftarrow s_{z_{j-1}}</math> '''end for'''
    '''return''' <math>X</math>
'''end function'''
Restated in a succinct near-[[Python (programming language)|Python]]:
'''function''' ''viterbi''<math>(O, S, \Pi, Tm, Em): best\_path</math>  Tm: transition matrix;  Em: emission matrix
<math>trellis \leftarrow matrix(length(S), length(O))</math> To hold probability of each state given each observation
<math>pointers \leftarrow matrix(length(S), length(O))</math> To hold backpointer to best prior state
'''for''' s '''in''' <math>range(length(S))</math>: Determine each hidden state's probability at time 0…
<math>trellis[s, 0] \leftarrow \Pi[s] \cdot Em[s, O[0]]</math>
'''for''' o '''in''' <math>range(1, length(O))</math>: …and after, tracking each state's most likely prior state, k
'''for''' s '''in''' <math>range(length(S))</math>:
<math>k \leftarrow \arg\max(trellis[k, o-1] \cdot Tm[k, s] \cdot Em[s, O[o]]\ \mathsf{for}\ k\ \mathsf{in}\ range(length(S)))</math>
<math>trellis[s, o] \leftarrow trellis[k, o-1] \cdot Tm[k, s] \cdot Em[s, O[o]]</math>
<math>pointers[s, o] \leftarrow k</math>
<math>best\_path \leftarrow list()</math>
<math>k \leftarrow \arg\max(trellis[k, length(O)-1]\ \mathsf{for}\ k\ \mathsf{in}\ range(length(S)))</math> Find k of best final state
'''for''' o '''in''' <math>range(length(O)-1, -1, -1)</math>: Backtrack from last observation
<math>best\_path.insert(0, S[k])</math> Insert previous state on most likely path
<math>k \leftarrow pointers[k, o]</math> Use backpointer to find best previous state
'''return''' <math>best\_path</math>
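The near-Python above is pseudocode; a minimal runnable sketch in [[Python (programming language)|Python]] with [[NumPy]] might look as follows. The function name, the NumPy dependency, the encoding of observations as integer indices, and the toy two-state model in the usage lines are illustrative choices, not part of the algorithm itself.
<syntaxhighlight lang="python">
import numpy as np

def viterbi(obs, states, pi, Tm, Em):
    """Most likely state sequence for the observation indices in `obs`.

    pi -- initial probabilities, shape (K,)
    Tm -- transition matrix, Tm[k, s] = P(next state s | current state k)
    Em -- emission matrix, Em[s, o] = P(observation o | state s)
    """
    K, T = len(states), len(obs)
    trellis = np.zeros((K, T))               # probability of the best path ending in each state
    pointers = np.zeros((K, T), dtype=int)   # back-pointer to the best prior state

    trellis[:, 0] = pi * Em[:, obs[0]]       # probabilities at time 0
    for t in range(1, T):
        for s in range(K):
            scores = trellis[:, t - 1] * Tm[:, s] * Em[s, obs[t]]
            pointers[s, t] = np.argmax(scores)
            trellis[s, t] = scores[pointers[s, t]]

    best_path = []                           # backtrack from the most likely final state
    k = int(np.argmax(trellis[:, T - 1]))
    for t in range(T - 1, -1, -1):
        best_path.insert(0, states[k])
        k = pointers[k, t]
    return best_path

# Hypothetical two-state weather model; numbers invented for illustration only.
states = ["Sunny", "Rainy"]
pi = np.array([0.4, 0.6])
Tm = np.array([[0.6, 0.4],     # Sunny -> Sunny/Rainy
               [0.3, 0.7]])    # Rainy -> Sunny/Rainy
Em = np.array([[0.6, 0.3, 0.1],   # P(walk/shop/clean | Sunny)
               [0.1, 0.4, 0.5]])  # P(walk/shop/clean | Rainy)
print(viterbi([0, 1, 2], states, pi, Tm, Em))  # -> ['Sunny', 'Rainy', 'Rainy']
</syntaxhighlight>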
;Explanation
Here <math>V_{t,k}</math> is the probability of the most probable state sequence accounting for the first <math>t</math> observations that ends in state <math>k</math> (the quantity held in <math>T_1</math> above), and <math>\mathrm{Ptr}(k,t)</math> is the back pointer recording which prior state attains that maximum (the role of <math>T_2</math>). The most likely sequence is then recovered backwards:
:<math>
\begin{align}
x_T &= \arg\max_{x \in S} (V_{T,x}), \\
x_{t-1} &= \mathrm{Ptr}(x_t,t),
\end{align}
</math>
where [[arg max]] has its usual definition.
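As a brief numerical illustration of the forward recursion and the backtracking rule (a hypothetical model whose numbers are chosen only for this sketch), take two states with <math>\pi=(0.5, 0.5)</math>, transitions <math>A_{11}=0.8</math>, <math>A_{12}=0.2</math>, <math>A_{21}=0.3</math>, <math>A_{22}=0.7</math>, and emission probabilities <math>B_{1y_1}=0.6</math>, <math>B_{2y_1}=0.2</math>, <math>B_{1y_2}=0.1</math>, <math>B_{2y_2}=0.5</math> for a two-observation sequence. Then
:<math>
\begin{align}
V_{1,1} &= 0.5 \cdot 0.6 = 0.30, & V_{1,2} &= 0.5 \cdot 0.2 = 0.10, \\
V_{2,1} &= \max(0.30 \cdot 0.8,\ 0.10 \cdot 0.3) \cdot 0.1 = 0.024, & V_{2,2} &= \max(0.30 \cdot 0.2,\ 0.10 \cdot 0.7) \cdot 0.5 = 0.035,
\end{align}
</math>
so <math>x_2 = \arg\max(V_{2,1}, V_{2,2}) = 2</math> and <math>x_1 = \mathrm{Ptr}(2,2) = 2</math>: although state 1 is more likely at time 1 in isolation, the most probable complete path stays in state 2.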
The complexity of this implementation is <math>O(T\times\left|{S}\right|^2)</math>. A better estimate exists if the maximum in the inner loop is instead found by iterating only over states that directly link to the current state (i.e., there is an edge from <math>k</math> to <math>s</math>). Then using [[amortized analysis]] one can show that the complexity is <math>O(T\times(\left|{S}\right| + \left|{E}\right|))</math>, where <math>\left|{E}\right|</math> is the number of edges in the graph.
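A sketch of that optimization, under the assumption that the nonzero transitions are supplied as a predecessor list (the names <code>viterbi_sparse</code> and <code>in_edges</code> are invented here for illustration):
<syntaxhighlight lang="python">
import numpy as np

def viterbi_sparse(obs, states, pi, in_edges, Em):
    """Viterbi over a sparse transition graph.

    in_edges[s] -- list of (k, p) pairs with p = P(s | k) > 0, i.e. the edges k -> s.
    The inner maximization touches only those edges, so the whole run costs
    O(T * (|S| + |E|)) instead of O(T * |S|^2).
    """
    K, T = len(states), len(obs)
    trellis = np.zeros((K, T))
    pointers = np.zeros((K, T), dtype=int)
    trellis[:, 0] = pi * Em[:, obs[0]]

    for t in range(1, T):
        for s in range(K):
            best_k, best_p = 0, 0.0
            for k, p in in_edges[s]:          # only states with an edge k -> s
                cand = trellis[k, t - 1] * p
                if cand > best_p:
                    best_k, best_p = k, cand
            pointers[s, t] = best_k
            trellis[s, t] = best_p * Em[s, obs[t]]

    best_path = []                            # backtrack as in the dense version
    k = int(np.argmax(trellis[:, T - 1]))
    for t in range(T - 1, -1, -1):
        best_path.insert(0, states[k])
        k = pointers[k, t]
    return best_path
</syntaxhighlight>
For the dense matrix <code>Tm</code> used earlier, <code>in_edges[s]</code> would be <code>[(k, Tm[k, s]) for k in range(len(states)) if Tm[k, s] > 0]</code>.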
== Example ==