In [[coding theory]], '''generalized minimum-distance (GMD) decoding''' provides an efficient [[algorithm]] for decoding [[concatenated code]]s, which is based on using an [[error]]s-and-[[Erasure code|erasures]] decoder for the [[outer code]].
 
A [[Concatenated error correction code#Decoding concatenated codes|naive decoding algorithm]] for concatenated codes is not an optimal way of decoding because it does not take into account the information that [[maximum likelihood decoding]] (MLD) gives. In other words, in the naive algorithm, inner received [[Code word (communication)|codeword]]s are treated the same regardless of the difference between their [[Hamming distance]]s. Intuitively, the outer decoder should place higher confidence in symbols whose inner [[code|encodings]] are close to the received word. [[David Forney]] devised a better algorithm in 1966, called generalized minimum distance (GMD) decoding, which makes better use of this information. It works by measuring the confidence of each received codeword and erasing symbols whose confidence is below a desired value. The GMD decoding algorithm was one of the first examples of [[soft-decision decoder]]s. We will present three versions of the GMD decoding algorithm. The first two will be [[randomized algorithm]]s while the last one will be a [[deterministic algorithm]].
 
==Setup==
* [[Hamming distance]] : Given two [[Euclidean vector|vector]]s <math>u, v\in\Sigma^n</math> the Hamming distance between <math>u</math> and <math>v</math>, denoted by <math>\Delta(u, v)</math>, is defined to be the number of positions in which <math>u</math> and <math>v</math> differ.
* Minimum distance : Let <math>C\subseteq\Sigma^n</math> be a [[code]]. The minimum distance of code <math>C</math> is defined to be <math>d = \min{\Delta(c_1, c_2)}</math> where <math>c_1 \ne c_2 \in C</math>.
* Code concatenation : Given <math>m = (m_1, \cdots, m_K) \in [Q]^K</math>, consider two codes which we call outer code and inner code
::<math>C_\text{out} : [Q]^K \to [Q]^N, \qquad C_\text{in} : [q]^k \to [q]^n,</math>
:and their distances are <math>D</math> and <math>d</math>, respectively. A concatenated code can be achieved by <math>C_\text{out} \circ C_\text{in} (m) = (C_\text{in} (C_\text{out} (m)_1), \ldots, C_\text{in} (C_\text{out} (m)_N ))</math> where <math>C_\text{out}(m) = (C_\text{out} (m)_1, \ldots, C_\text{out} (m)_N ).</math> Finally we will take <math>C_\text{out}</math> to be an [[Reed Solomon|RS code]], which has an errors and erasure decoder, and <math>k = O(\log N)</math>, which in turn implies that MLD on the inner code will be polynomial in <math>N</math> time.
* Maximum likelihood decoding (MLD) : MLD is a decoding method for error correcting codes, which outputs the codeword closest to the received word in Hamming distance. The MLD function denoted by <math>D_{MLD} : \Sigma^n \to C</math> is defined as follows. For every <math>y \in \Sigma^n</math>, <math>D_{MLD}(y) = \arg \min_{c \in C}\Delta(c, y)</math>. (A short illustrative code sketch of these definitions is given after this list.)
* [[Probability density function]] : A [[probability distribution]] <math>\Pr[\bullet]</math> on a sample space <math>S</math> is a mapping from events of <math>S</math> to [[real number]]s such that <math>\Pr[A] \ge 0</math> for any event <math>A</math>, <math>\Pr[S] = 1</math>, and <math>\Pr[A \cup B] = \Pr[A] + \Pr[B]</math> for any two mutually exclusive events <math>A</math> and <math>B</math>.
* [[Expected value]] : The expected value of a [[discrete random variable]] <math>X</math> is
::<math>\mathbb{E}[X] = \sum_x x\Pr[X = x].</math>
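
The definitions above can be made concrete with a small amount of code. The following is a minimal, illustrative sketch in Python; the helper names <code>hamming_distance</code>, <code>minimum_distance</code> and <code>mld</code> are placeholders rather than standard library functions, and the brute-force search only makes sense for codes small enough to enumerate explicitly.

<syntaxhighlight lang="python">
def hamming_distance(u, v):
    """Delta(u, v): the number of positions in which u and v differ."""
    assert len(u) == len(v)
    return sum(1 for a, b in zip(u, v) if a != b)

def minimum_distance(codewords):
    """d = min Delta(c1, c2) over all pairs of distinct codewords."""
    return min(hamming_distance(c1, c2)
               for i, c1 in enumerate(codewords)
               for c2 in codewords[i + 1:])

def mld(y, codewords):
    """D_MLD(y): the codeword closest to the received word y in Hamming distance."""
    return min(codewords, key=lambda c: hamming_distance(c, y))
</syntaxhighlight>

For example, for the binary repetition code <code>['000', '111']</code>, <code>minimum_distance(['000', '111'])</code> returns <code>3</code> and <code>mld('010', ['000', '111'])</code> returns <code>'000'</code>.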
 
==Randomized algorithm==
Consider the received word <math>\mathbf{y} = (y_1,\ldots,y_N) \in [q^n]^N</math>, which was corrupted by a [[noisy channel]]. The following is the algorithm description for the general case. In this algorithm, we can decode <math>\mathbf{y}</math> by just declaring an erasure at every bad position and running the errors and erasure decoding algorithm for <math>C_\text{out}</math> on the resulting vector.
 
'''Randomized_Decoder'''
<br />'''Given : '''<math>\mathbf{y} = (y_1,\ldots,y_N) \in [q^n]^N</math>.
# For every <math>1 \le i \le N</math>, compute <math>y_i' = MLD_{C_\text{in}}(y_i)</math>.
# Set <math>\omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2})</math>.
# For every <math>1 \le i \le N</math>, repeat : With probability <math>\tfrac{2\omega_i}{d}</math>, set <math>y_i'' \leftarrow ?</math>, otherwise set <math>y_i'' = y_i'</math>.
# Run errors and erasure algorithm for <math>C_\text{out}</math> on <math>\mathbf{y}'' = (y_1'', \ldots, y_N'')</math>.
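
The following is a minimal sketch of '''Randomized_Decoder''' in Python. It is illustrative only: <code>mld_inner</code>, <code>inner_encode</code> and <code>outer_errors_and_erasures_decode</code> are hypothetical stand-ins for an inner-code maximum likelihood decoder, the inner encoder <math>C_\text{in}</math>, and an outer errors-and-erasures decoder (for example, one for a Reed–Solomon code), and <code>'?'</code> marks an erasure.

<syntaxhighlight lang="python">
import random

def hamming_distance(u, v):
    # Delta(u, v), as in the Setup sketch above.
    return sum(1 for a, b in zip(u, v) if a != b)

def randomized_gmd_decode(y, d, mld_inner, inner_encode, outer_errors_and_erasures_decode):
    """Sketch of Randomized_Decoder.

    y: received word, a list of N inner blocks y_1, ..., y_N.
    d: minimum distance of the inner code C_in.
    """
    y_pp = []                                             # y'' built block by block
    for y_i in y:
        y_i_prime = mld_inner(y_i)                        # step 1: inner MLD
        w_i = min(hamming_distance(inner_encode(y_i_prime), y_i), d / 2)  # step 2
        if random.random() < 2 * w_i / d:                 # step 3: erase with probability 2*w_i/d
            y_pp.append('?')
        else:
            y_pp.append(y_i_prime)
    return outer_errors_and_erasures_decode(y_pp)         # step 4: outer decoding
</syntaxhighlight>

Passing the inner and outer decoders in as parameters keeps the sketch independent of any particular code implementation.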
 
'''Theorem 1.''' ''Let'' <math>\mathbf{y}</math> ''be a received word such that there exists a [[Code word (communication)|codeword]]'' <math>\mathbf{c} = (c_1,\cdots, c_N) \in C_\text{out}\circ{C_\text{in}} \subseteq [q^n]^N</math> ''such that'' <math>\Delta(\mathbf{c}, \mathbf{y}) < \tfrac{Dd}{2}</math>. ''Then the deterministic GMD algorithm outputs'' <math>\mathbf{c}</math>.
 
Note that a [[Concatenated codes|naive decoding algorithm for concatenated codes]] can correct up to <math>\tfrac{Dd}{4}</math> errors.
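
For example, with purely illustrative parameters <math>D = 11</math> and <math>d = 6</math>, Theorem 1 guarantees recovery of <math>\mathbf{c}</math> whenever <math>\Delta(\mathbf{c}, \mathbf{y}) < \tfrac{Dd}{2} = 33</math>, i.e. for up to 32 errors, whereas the naive bound <math>\tfrac{Dd}{4} = 16.5</math> only guarantees correction of up to 16 errors.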
 
:'''Lemma 1.''' ''Let the assumption in Theorem 1 hold. If'' <math>\mathbf{y}''</math> ''has'' <math>e'</math> ''errors and'' <math>s'</math> ''erasures (when compared with'' <math>\mathbf{c}</math>'') after'' '''Step 1''', ''then'' <math>\mathbb{E}[2e' + s'] < D</math>.
 
''Remark.'' If <math>2e' + s' < D</math>, then the algorithm in '''Step 2''' will output <math>\mathbf{c}</math>. The lemma above says that in expectation, this is indeed the case. Note that this is not enough to prove '''Theorem 1''', but can be crucial in developing future variations of the algorithm.
 
'''Proof of lemma 1.''' For every <math>1 \le i \le N</math>, define <math>e_i = \Delta(y_i, c_i)</math>. This implies that

<math display="block">\sum_{i=1}^N e_i < \frac{Dd}{2} \qquad\qquad (1)</math>
Next for every <math>1 \le i \le N</math>, we define two [[indicator variable]]s:
 
<math display="block">\begin{align}
<math>\sum_{i=1}^Ne_i</math> < <math>Dd\over2</math>............................................................. (1)
X{_i^?} = 1 &\Leftrightarrow y_i'' = ? \\
X{_i^e} = 1 &\Leftrightarrow C_\text{in}(y_i'') \ne c_i \ \text{and} \ y_i'' \neq ?
\end{align}</math>
We claim that we are done if we can show that for every <math>1 \le i \le N</math>:
 
<math display="block">\mathbb{E} \left [2X{_i^e + X{_i^?}} \right ] \leqslant {2e_i \over d}\qquad\qquad (2)</math>
Next for every <math>1 \le i \le N</math>, we define two [http://en.wikipedia.org/wiki/Indicator_variable indicator variables]:
Clearly, by definition
 
<math display="block">e' = \sum_i X_i^e \quad \text{and} \quad s' = \sum_i X_i^?.</math>
<math>X{_i^?} = 1</math> iff <math>y_i^{\prime\prime}</math> = ?, and <math>X{_i^e} = 1</math> iff <math>C_{in}(y_i^{\prime\prime}) \ne c_i</math> and <math>y_i^{\prime\prime} \ne ?</math>.
Further, by the [[linear]]ity of expectation, we get
 
<math display="block">\mathbb{E}[2e' + s'] \leqslant \frac{2}{d}\sum_ie_i < D.</math>
We claim that we are done if we can show that for every <math>1 \le i \le N</math>:
To prove (2) we consider two cases: the <math>i</math>-th block is correctly decoded ('''Case 1'''), or the <math>i</math>-th block is incorrectly decoded ('''Case 2''').
 
'''Case 1:''' <math>(c_i = C_\text{in}(y_i'))</math>
Note that if <math>y_i'' = ?</math> then <math>X_i^e = 0</math>, and <math>\Pr[y_i'' = ?] = \tfrac{2\omega_i}{d}</math> implies <math>\mathbb{E}[X_i^?] = \Pr[X_i^? = 1] = \tfrac{2\omega_i}{d}</math> and <math>\mathbb{E}[X_i^e] = \Pr[X_i^e = 1] = 0</math>.
 
Further, by definition we have
 
<math display="block">\omega_i = \min \left (\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2} \overright 2}) \leleqslant \Delta(C_\text{in}(y_i'), y_i) = \Delta(c_i, y_i) = e_i</math>
'''Case 2:''' <math>(c_i \ne C_\text{in}(y_i'))</math>
 
In this case, <math>\mathbb{E}[X_i^?] = \Pr[X_i^? = 1] = \tfrac{2\omega_i}{d}</math> and <math>\mathbb{E}[X_i^e] = \Pr[X_i^e = 1] = 1 - \tfrac{2\omega_i}{d}.</math>
 
Since <math>c_i \ne C_\text{in}(y_i')</math>, we have <math>e_i + \omega_i \geqslant d</math>. This follows from a further case analysis<ref>{{cite web|url=https://cse.buffalo.edu/faculty/atri/courses/coding-theory/lectures/lect28.pdf |title=Lecture 28: Generalized Minimum Distance Decoding |date=November 5, 2007 |archive-url=https://web.archive.org/web/20110606191851/http://www.cse.buffalo.edu/~atri/courses/coding-theory/lectures/lect28.pdf |archive-date=2011-06-06 |url-status=live}}</ref> of whether <math>\omega_i = \Delta(C_\text{in}(y_i'), y_i) < \tfrac{d}{2}</math> or not.
 
Finally, this implies
 
<math display="block">\mathbb{E}[2X_i^e + X_i^?] = 2 - {2\omega_i \over d} \le {2e_i \over d}.</math>.
In the following sections, we will finally show that the deterministic version of the algorithm above can do unique decoding of <math>C_\text{out} \circ C_\text{in}</math> up to half its design distance.
 
 
==Modified randomized algorithm==
Note that, in the previous version of the GMD algorithm in step 3, we do not really need to use "fresh" [[randomness]] for each <math>i</math>. Now we come up with another randomized version of the GMD algorithm that uses the ''same'' randomness for every <math>i</math>, as in the algorithm below.
 
'''Modified_Randomized_Decoder'''
<br />'''Given : '''<math>\mathbf{y} = (y_1, \ldots,y_N) \in [q^n]^N</math>, pick <math>\theta \in [0, 1]</math> at random. Then for every <math>1 \le i \le N</math>:
# Set <math>y_i' = MLD_{C_\text{in}}(y_i)</math>.
# Compute <math>\omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2})</math>.
# If <math>\theta < \tfrac{2\omega_i}{d}</math>, set <math>y_i'' \leftarrow ?</math>, otherwise set <math>y_i'' = y_i'</math>.
# Run errors and erasure algorithm for <math>C_\text{out}</math> on <math>\mathbf{y}'' = (y_1'', \ldots, y_N'')</math>.
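
Compared with the sketch of '''Randomized_Decoder''' above, the only change is that a single threshold <math>\theta</math> is drawn once and reused for every block. A minimal illustration, using the same hypothetical helpers as before (including <code>hamming_distance</code> from the Setup sketch), is:

<syntaxhighlight lang="python">
import random

def modified_randomized_gmd_decode(y, d, mld_inner, inner_encode,
                                   outer_errors_and_erasures_decode):
    """Sketch of Modified_Randomized_Decoder: one shared theta for all blocks."""
    theta = random.random()                               # single theta, uniform in [0, 1]
    y_pp = []
    for y_i in y:
        y_i_prime = mld_inner(y_i)                        # step 1
        w_i = min(hamming_distance(inner_encode(y_i_prime), y_i), d / 2)  # step 2
        y_pp.append('?' if theta < 2 * w_i / d else y_i_prime)            # step 3
    return outer_errors_and_erasures_decode(y_pp)         # step 4
</syntaxhighlight>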
 
For the proof of '''[[Lemma (mathematics)|Lemma 1]]''', we only use the randomness to show that
 
<math display="block">\Pr[y_i'' = ?] = \frac{2\omega_i}{d}.</math>
In this version of the GMD algorithm, we note that
 
<math display="block">\Pr[y_i^{\prime\prime}'' = ?] = \Pr \left [\theta \in \left [0, \tfrac{2\omega_i}{d} \overright d}] \right ] = \tfrac{2\omega_i \over }{d}.</math>.
The second [[Equality (mathematics)|equality]] above follows from the choice of <math>\theta</math>. The proof of '''Lemma 1''' can also be used to show <math>\mathbb{E}[2e' + s'] < D</math> for this second version of the GMD algorithm. In the next section, we will see how to get a deterministic version of the GMD algorithm by choosing <math>\theta</math> from a polynomially sized set as opposed to the current infinite set <math>[0, 1]</math>.
 
 
==Deterministic algorithm==
Let <math>Q = \{0,1\} \cup \{{2\omega_1 \over d}, \ldots,{2\omega_N \over d}\}</math>. Since for each <math>i, \omega_i = \min(\Delta(\mathbf{y_i'}, \mathbf{y_i}), {d \over 2})</math>, we have
 
<math display="block">Q = \{0, 1\} \cup \{q_1, \ldots,q_m\}</math>
where <math>q_1 < q_2 < \cdots < q_m</math> for some <math>m \le \left \lfloor \frac{d}{2} \right \rfloor</math>. Note that for every <math>\theta \in [q_i, q_{i+1}]</math>, the second version of the randomized algorithm outputs the same <math>\mathbf{y}''</math>. Thus, we need to consider all possible values of <math>\theta \in Q</math>. This gives the deterministic algorithm below.
 
'''Deterministic_Decoder'''
<br />''' Given : '''<math>\mathbf{y} = (y_1,\ldots,y_N) \in [q^n]^N</math>, for every <math>\theta \in Q</math>, repeat the following.
# Compute <math>y_i' = MLD_{C_\text{in}}(y_i)</math> for <math>1 \le i \le N</math>.
# Set <math>\omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2})</math> for every <math>1 \le i \le N</math>.
# If <math>\theta < \tfrac{2\omega_i}{d}</math>, set <math>y_i'' \leftarrow ?</math>, otherwise set <math>y_i'' = y_i'</math>.
# Run the errors-and-erasures algorithm for <math>C_\text{out}</math> on <math>\mathbf{y}'' = (y_1'', \ldots, y_N'')</math>. Let <math>c_\theta</math> be the codeword in <math>C_\text{out} \circ C_\text{in}</math> corresponding to the output of the algorithm, if any.
# Among all the <math>c_\theta</math> output in step 4, output the one closest to <math>\mathbf{y}</math>.
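
A sketch of '''Deterministic_Decoder''' in the same style follows. As before, the helpers are hypothetical (<code>hamming_distance</code> as in the Setup sketch, <code>concat_encode</code> an encoder for <math>C_\text{out} \circ C_\text{in}</math>, and an outer decoder assumed to return <code>None</code> when it fails); the point is only to show the search over the threshold set <math>Q</math> and the final closest-codeword selection.

<syntaxhighlight lang="python">
def deterministic_gmd_decode(y, d, mld_inner, inner_encode,
                             outer_errors_and_erasures_decode, concat_encode):
    """Sketch of Deterministic_Decoder: try every threshold in Q, keep the best candidate."""
    # Steps 1 and 2 do not depend on theta, so compute y_i' and w_i once.
    decoded = [mld_inner(y_i) for y_i in y]
    weights = [min(hamming_distance(inner_encode(m), y_i), d / 2)
               for m, y_i in zip(decoded, y)]
    Q = {0.0, 1.0} | {2 * w / d for w in weights}

    best, best_dist = None, None
    for theta in Q:
        # Step 3: erase block i whenever theta < 2 * w_i / d.
        y_pp = ['?' if theta < 2 * w / d else m for m, w in zip(decoded, weights)]
        candidate = outer_errors_and_erasures_decode(y_pp)   # step 4
        if candidate is None:
            continue
        # Step 5: keep the candidate whose concatenated encoding is closest to y.
        dist = sum(hamming_distance(block, y_i)
                   for block, y_i in zip(concat_encode(candidate), y))
        if best is None or dist < best_dist:
            best, best_dist = candidate, dist
    return best
</syntaxhighlight>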
 
Since every loop of steps 1–4 can be run in [[polynomial time]], the algorithm above can also be computed in polynomial time. Specifically, each call to an errors and erasures decoder of <math>< dD/2</math> errors takes <math>O(d)</math> time. Finally, the runtime of the algorithm above is <math>O(NQn^{O(1)} + NT_\text{out})</math> where <math>T_\text{out}</math> is the running time of the outer errors and erasures decoder.
 
==See also==
* [[Concatenated code]]s
* [[Reed Solomon|Reed Solomon error correction]]
* [[Berlekamp–Welch algorithm|Welch Berlekamp algorithm]]
 
==References==
{{Reflist}}
* [https://cse.buffalo.edu/faculty/atri/courses/coding-theory/lectures University at Buffalo Lecture Notes on Coding Theory – Atri Rudra]
* [http://people.csail.mit.edu/madhu/FT01 MIT Lecture Notes on Essential Coding Theory – Madhu Sudan]
* [http://www.cs.washington.edu/education/courses/cse533/06au University of Washington – Venkatesan Guruswami]
* G. David Forney. Generalized Minimum Distance decoding. ''IEEE Transactions on Information Theory'', 12:125–131, 1966
 
{{DEFAULTSORT:Generalized minimum distance decoding}}
[[Category:Error detection and correction]]
[[Category:Coding theory]]
[[Category:Finite fields]]
[[Category:Information theory]]