This article discusses common methods in [[communication theory]] for decoding [[code|codewords]] sent over a [[noisy channel]] (such as a [[binary symmetric channel]]).
==Notation==
Henceforth <math>C \subset \mathbb{F}_2^n</math> shall be a (not necessarily [[linear code|linear]]) [[code]] of length <math>n</math>; <math>x,y</math> shall be elements of <math>\mathbb{F}_2^n</math>; and <math>d(x,y)</math> shall represent the [[Hamming distance]] between <math>x,y</math>.
==Ideal observer decoding==
Given a received vector <math>x \in \mathbb{F}_2^n</math>, '''ideal observer decoding''' picks a codeword <math>y \in C</math> to maximise:
:<math>\mathbb{P}(y \mbox{ sent} \mid x \mbox{ received})</math>
that is, the codeword (or a codeword) <math>y \in C</math> that is most likely to have been sent, given that <math>x</math> was received.
Where this decoding result is non-unique, a convention must be agreed. Popular such conventions are:
:# Request that the codeword be resent;
:# Choose any one of the possible decodings at random.
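The posterior-maximising rule above can be sketched in a few lines. This is an illustrative toy example, not a standard implementation: the two-codeword code, the non-uniform prior, and the crossover probability are all assumed values chosen for the demonstration, and the channel is taken to be a binary symmetric channel.

```python
# Hypothetical toy code with a non-uniform prior P(y sent);
# all numbers here are illustrative assumptions.
code = ["000", "111"]
prior = {"000": 0.7, "111": 0.3}
p = 0.1  # assumed crossover probability of a binary symmetric channel

def likelihood(x, y, p):
    """P(x received | y sent) on a binary symmetric channel:
    each of the d differing bits flips with probability p."""
    d = sum(a != b for a, b in zip(x, y))
    return (p ** d) * ((1 - p) ** (len(x) - d))

def ideal_observer_decode(x):
    """Pick y maximising P(y sent | x received), which by Bayes' rule
    is proportional to P(x received | y sent) * P(y sent)."""
    return max(code, key=lambda y: likelihood(x, y, p) * prior[y])

print(ideal_observer_decode("001"))  # -> "000"
```

Note that the prior matters: a received vector equidistant-ish from both codewords can be pulled towards the more frequently transmitted one.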
==Maximum likelihood decoding==
Given a received vector <math>x \in \mathbb{F}_2^n</math>, '''[[maximum likelihood]] decoding''' picks a codeword <math>y \in C</math> to [[maximization|maximise]]:
:<math>\mathbb{P}(x \mbox{ received} \mid y \mbox{ sent})</math>
that is, the codeword <math>y</math> under which the received vector <math>x</math> is most probable. Note that if all codewords are equally likely to be sent during ordinary use, then this scheme is equivalent to ''ideal observer decoding'', since by [[Bayes' theorem|Bayes' rule]]
:<math>
\begin{align}
\mathbb{P}(x \mbox{ received} \mid y \mbox{ sent}) & {} = \frac{\mathbb{P}(x \mbox{ received}, \, y \mbox{ sent})}{\mathbb{P}(y \mbox{ sent})} \\
& {} = \mathbb{P}(y \mbox{ sent} \mid x \mbox{ received}) \cdot \frac{\mathbb{P}(x \mbox{ received})}{\mathbb{P}(y \mbox{ sent})},
\end{align}
</math>
and when <math>\mathbb{P}(y \mbox{ sent}) = 1/|C|</math> for every codeword, the factor multiplying <math>\mathbb{P}(y \mbox{ sent} \mid x \mbox{ received})</math> does not depend on <math>y</math>, so both quantities are maximised by the same codewords.
As for ''ideal observer decoding'', a convention must be agreed for non-unique decoding. Again, popular such conventions are:
:# Request that the codeword be resent;
:# Choose any one of the possible decodings at random.
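On a binary symmetric channel the likelihood <math>\mathbb{P}(x \mbox{ received} \mid y \mbox{ sent})</math> depends only on the Hamming distance between <math>x</math> and <math>y</math>, so maximum likelihood decoding can be sketched directly. The five-bit repetition code and crossover probability below are assumed for illustration only.

```python
def bsc_likelihood(x, y, p):
    """P(x received | y sent) on a binary symmetric channel:
    p^d * (1-p)^(n-d), where d is the Hamming distance."""
    d = sum(a != b for a, b in zip(x, y))
    return p ** d * (1 - p) ** (len(x) - d)

def ml_decode(x, code, p):
    """Pick a codeword y maximising P(x received | y sent)."""
    return max(code, key=lambda y: bsc_likelihood(x, y, p))

# Hypothetical length-5 repetition code, assumed crossover p = 0.1.
code = ["00000", "11111"]
print(ml_decode("01001", code, 0.1))  # -> "00000" (only two bits set)
```

A brute-force scan over the code like this costs <math>|C|</math> likelihood evaluations per received vector, which motivates the faster methods below.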
==Minimum distance decoding==
Given a received vector <math>x \in \mathbb{F}_2^n</math>, '''minimum distance decoding''' picks a codeword <math>y \in C</math> to minimise the [[Hamming distance]]:
:<math>d(x,y) = \# \{i : x_i \not = y_i \}</math>
that is, a codeword <math>y \in C</math> that is as close as possible to <math>x \in \mathbb{F}_2^n</math>.
Notice that if the probability of error on a [[discrete memoryless channel]] <math>p</math> is strictly less than one half, then ''minimum distance decoding'' is equivalent to ''maximum likelihood decoding'', since if
:<math>d(x,y) = d,</math>
then:
:<math>\mathbb{P}(x \mbox{ received} \mid y \mbox{ sent}) = (1-p)^{n-d} \cdot p^d = (1-p)^n \cdot \left( \frac{p}{1-p}\right)^d</math>
which (since ''p'' is less than one half) is maximised by minimising ''d''.
As for other decoding methods, a convention is agreed for non-unique decoding. Popular such conventions are:
:# Request that the codeword be resent;
:# Choose any one of the possible decodings at random.
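A direct sketch of minimum distance decoding, again over a small hypothetical code chosen only for illustration. The final loop checks the equivalence claimed above: for a binary symmetric channel with <math>p < 1/2</math>, the nearest codeword is also the most likely one.

```python
from itertools import product

def hamming(x, y):
    """d(x, y) = #{i : x_i != y_i}."""
    return sum(a != b for a, b in zip(x, y))

def min_distance_decode(x, code):
    """Scan the whole code (a table of size |C|) for a nearest codeword;
    ties are broken by the order in which the code is listed."""
    return min(code, key=lambda y: hamming(x, y))

def ml_decode(x, code, p):
    """Maximum likelihood decoding over a binary symmetric channel."""
    n = len(x)
    return max(code, key=lambda y: p ** hamming(x, y) * (1 - p) ** (n - hamming(x, y)))

# Hypothetical (non-linear, arbitrary) code used only for this check.
code = ["00000", "10110", "01011", "11101"]
for bits in product("01", repeat=5):
    x = "".join(bits)
    assert min_distance_decode(x, code) == ml_decode(x, code, 0.2)
```

Since the likelihood <math>p^d (1-p)^{n-d}</math> is strictly decreasing in <math>d</math> when <math>p < 1/2</math>, both decoders rank the codewords identically, so the assertion holds for every received vector.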
==Syndrome decoding==
'''Syndrome decoding''' is a highly efficient method of decoding a [[linear code]] over a ''noisy channel'', i.e. one on which errors are made. In essence, syndrome decoding is ''minimum distance decoding'' using a reduced lookup table; it is the linearity of the code which allows the lookup table to be reduced in size.
Suppose that <math>C\subset \mathbb{F}_2^n</math> is a linear code of length <math>n</math> and minimum distance <math>d</math> with [[parity-check matrix]] <math>H</math>. Then <math>C</math> is capable of correcting up to
:<math>t = \left\lfloor\frac{d-1}{2}\right\rfloor</math>
errors made by the channel (since if no more than <math>t</math> errors are made, then minimum distance decoding will still correctly decode the corrupted codeword).
Now suppose that a codeword <math>x \in C</math> is sent over the channel and the error pattern <math>e \in \mathbb{F}_2^n</math> occurs, so that <math>z=x+e</math> is received. Ordinary minimum distance decoding would look up the vector <math>z</math> in a table of size <math>|C|</math> for the nearest match, i.e. an element (not necessarily unique) <math>c \in C</math> with
:<math>d(c,z) \leq d(y,z)</math>
for all <math>y \in C</math>. Syndrome decoding takes advantage of the property of the parity-check matrix that:
:<math>Hx = 0</math>
for all <math>x \in C</math>. The ''syndrome'' of the received vector <math>z=x+e</math> is defined to be:
:<math>Hz = H(x+e) =Hx + He = 0 + He = He</math>
Under the assumption that no more than <math>t</math> errors were made during transmission, the receiver looks up the value <math>He</math> in a table of size
:<math>\sum_{i=0}^t \binom{n}{i}</math>
(for a binary code) of pre-computed values of <math>He</math> for every error pattern <math>e \in \mathbb{F}_2^n</math> of weight at most <math>t</math>. Knowing what <math>e</math> is, it is then trivial to decode <math>x</math> as:
:<math>x = z - e </math>
Notice that this will always give a unique (but not necessarily accurate) decoding result, since distinct error patterns of weight at most <math>t</math> have distinct syndromes: if
: <math>He_1 = He_2</math>
then <math>H(e_1 + e_2) = 0</math>, so <math>e_1 + e_2 \in C</math>; but <math>e_1 + e_2</math> has weight at most <math>2t < d</math>, and the only codeword of weight less than <math>d</math> is <math>0</math>, hence <math>e_1 = e_2</math>.
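The whole procedure can be sketched concretely with the [7,4] [[Hamming code]], which has minimum distance 3 and hence corrects <math>t = 1</math> error. The parity-check matrix below (columns are the binary representations of 1 through 7) and the chosen codeword are a standard illustrative setup, not tied to any particular implementation.

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code (d = 3, so t = 1);
# column i is the binary representation of i+1.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])
n, t = 7, 1

# Pre-compute the syndrome table: He for every error pattern e of
# weight at most t -- here 1 + 7 = 8 entries, versus |C| = 16 codewords.
table = {tuple(np.zeros(3, dtype=int)): np.zeros(n, dtype=int)}
for i in range(n):
    e = np.zeros(n, dtype=int)
    e[i] = 1
    table[tuple(H @ e % 2)] = e

def syndrome_decode(z):
    """Look up the syndrome Hz and subtract the matching error pattern.
    Assumes no more than t errors occurred during transmission."""
    s = tuple(H @ z % 2)
    e = table[s]
    return (z - e) % 2  # over F_2, subtraction is the same as addition

x = np.array([1, 1, 1, 0, 0, 0, 0])  # a codeword: Hx = 0
z = x.copy()
z[4] ^= 1                            # one bit flipped by the channel
print(syndrome_decode(z))            # recovers x
```

The table has <math>\sum_{i=0}^{1} \binom{7}{i} = 8</math> entries, and each decode costs one matrix-vector product and one lookup, rather than a scan over all 16 codewords.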