{{distinguish|text=[[Huffman coding|Huffman code]]}}
{{short description|Family of linear error-correcting codes}}
{{More footnotes needed|date=March 2013}}
}}
 
In [[computer science]] and [[telecommunications]], '''Hamming codes''' are a family of [[linear code|linear error-correcting codes]]. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the simple [[parity bit|parity code]] cannot correct errors, and can detect only an odd number of bits in error. Hamming codes are [[perfect code]]s, that is, they achieve the highest possible [[Block code#The rate R|rate]] for codes with their [[block code#The block length n|block length]] and [[Block code#The distance d|minimum distance]] of three.<ref>[https://www.cs.cmu.edu/~venkatg/teaching/codingtheory/notes/notes1.pdf See Lemma 12 of these lecture notes]</ref>
[[Richard Hamming|Richard W. Hamming]] invented Hamming codes in 1950 as a way of automatically correcting errors introduced by [[punched card]] readers. In his original paper, Hamming elaborated his general idea, but specifically focused on the [[Hamming(7,4)]] code which adds three parity bits to four bits of data.{{Sfnp|Hamming|1950|pp=153&ndash;154}}
 
In [[mathematics|mathematical]] terms, Hamming codes are a class of binary linear code. For each integer {{Math|''r'' ≥ 2}} there is a code with [[Block code#The block length n|block length]] {{Math|''n'' {{=}} 2<sup>''r''</sup> − 1}} and [[Block code#The message length k|message length]] {{Math|''k'' {{=}} 2<sup>''r''</sup> − ''r'' − 1}}. Hence the rate of Hamming codes is {{Math|''R'' {{=}} ''k'' / ''n'' {{=}} 1 − ''r'' / (2<sup>''r''</sup> − 1)}}, which is the highest possible for codes with minimum distance of three (i.e., the minimal number of bit changes needed to go from any code word to any other code word is three) and block length {{Math|2<sup>''r''</sup> − 1}}. The [[parity-check matrix]] of a Hamming code is constructed by listing all columns of length {{Math|''r''}} that are non-zero, which means that the [[dual code]] of the Hamming code is the [[Hadamard code|shortened Hadamard code]], also known as a Simplex code. The parity-check matrix has the property that any two columns are pairwise [[Linear Independence|linearly independent]].
 
Due to the limited redundancy that Hamming codes add to the data, they can only detect and correct errors when the error rate is low. This is the case in computer memory (usually RAM), where bit errors are extremely rare and Hamming codes are widely used; RAM with this correction system is known as ECC RAM ([[ECC memory]]). In this context, an extended Hamming code having one extra parity bit is often used. Extended Hamming codes achieve a Hamming distance of four, which allows the decoder to distinguish between the case in which at most one bit error has occurred and the case in which two bit errors have occurred. In this sense, extended Hamming codes are single-error correcting and double-error detecting, abbreviated as '''SECDED'''.
 
== History ==
[[Richard Hamming]], the inventor of Hamming codes, worked at [[Bell Labs]] in the late 1940s on the Bell [[Model V]] computer, an [[electromechanical]] relay-based machine with cycle times in seconds. Input was fed in on [[Punched tape|punched paper tape]], seven-eighths of an inch wide, which had up to six holes per row. During weekdays, when errors in the relays were detected, the machine would stop and flash lights so that the operators could correct the problem. During after-hours periods and on weekends, when there were no operators, the machine simply moved on to the next job.
 
Hamming worked on weekends, and grew increasingly frustrated with having to restart his programs from scratch due to detected errors. In a taped interview, Hamming said, "And so I said, 'Damn it, if the machine can detect an error, why can't it locate the position of the error and correct it?'".<ref>{{citation|last=Thompson|first=Thomas M.|title=From Error-Correcting Codes through Sphere Packings to Simple Groups|url=https://books.google.com/books?id=ggqxuG31B3cC&q=%22From%20Error-Correcting%20Codes%20through%20Sphere%20Packings%20to%20Simple%20Groups%22&pg=PA16|pages=16–17|year=1983|series=The Carus Mathematical Monographs (#21)|publisher=Mathematical Association of America|isbn=0-88385-023-0}}</ref> Over the next few years, he worked on the problem of error-correction, developing an increasingly powerful array of algorithms. In 1950, he published what is now known as Hamming code, which remains in use today in applications such as [[ECC memory]].
{{main|Parity bit}}
 
Parity adds a single [[binary digit|bit]] that indicates whether the number of ones (bit-positions with values of one) in the preceding data was [[even number|even]] or [[odd number|odd]]. If an odd number of bits is changed in transmission, the message will change parity and the error can be detected at this point; however, the bit that changed may have been the parity bit itself. The most common convention is that a parity value of one indicates that there is an odd number of ones in the data, and a parity value of zero indicates that there is an even number of ones. If the number of bits changed is even, the check bit will be valid and the error will not be detected.
 
Moreover, parity does not indicate which bit contained the error, even when it can detect it. The data must be discarded entirely and re-transmitted from scratch. On a noisy transmission medium, a successful transmission could take a long time or may never occur. However, while the quality of parity checking is poor, since it uses only a single bit, this method results in the least overhead.
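The behavior described above can be made concrete with a short sketch. The following Python fragment (all names here are illustrative, not from any standard library) uses the convention above, in which the parity bit is one exactly when the data contains an odd number of ones; it shows that a single flipped bit is detected while two flipped bits go unnoticed:

```python
def parity_bit(data):
    # 1 when the data contains an odd number of ones, 0 when even
    # (the common convention described above)
    return sum(data) % 2

message = [1, 0, 1, 1]
sent = message + [parity_bit(message)]

# A single flipped bit changes the parity, so the error is detected ...
received = [sent[0] ^ 1] + sent[1:]
assert parity_bit(received[:-1]) != received[-1]

# ... but two flipped bits restore the parity, and the error goes unnoticed.
received = [sent[0] ^ 1, sent[1] ^ 1] + sent[2:]
assert parity_bit(received[:-1]) == received[-1]
```

Note that neither case tells the receiver ''which'' bit is wrong, which is exactly the limitation Hamming codes address.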
 
===General algorithm===
The following general algorithm generates a single-error correcting (SEC) code for any number of bits. The main idea is to choose the error-correcting bits such that the index-XOR (the [[Exclusive or|XOR]] of all the bit positions containing a 1) is 0. We use positions 1, 10, 100, etc. (in binary) as the error-correcting bits, which guarantees it is possible to set the error-correcting bits so that the index-XOR of the whole message is 0. If the receiver receives a string with index-XOR 0, they can conclude there were no corruptions, and otherwise, the index-XOR indicates the index of the corrupted bit.
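The index-XOR construction can be sketched in a few lines of Python. This is a minimal illustration of the idea, not a reference implementation; the function name and layout are chosen for exposition only:

```python
def hamming_encode(data):
    """Single-error-correcting encoder: data bits go in the non-power-of-two
    positions (1-indexed); parity bits at positions 1, 2, 4, ... are then set
    so that the index-XOR of all positions holding a 1 is zero."""
    m = 0
    while (1 << m) - m - 1 < len(data):   # parity bits needed for this much data
        m += 1
    n = (1 << m) - 1
    code = [0] * (n + 1)                  # slot 0 unused; positions 1..n
    bits = iter(data)
    for pos in range(1, n + 1):
        if pos & (pos - 1):               # not a power of two: a data position
            code[pos] = next(bits, 0)     # pad short messages with zeros
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    for i in range(m):                    # write each syndrome bit into its parity slot
        code[1 << i] = (syndrome >> i) & 1
    return code[1:]                       # drop the unused slot 0
```

For example, <code>hamming_encode([1, 0, 1, 1])</code> returns the [7,4] codeword <code>[0, 1, 1, 0, 0, 1, 1]</code>, whose index-XOR is zero; flipping any single bit makes the index-XOR equal to that bit's position.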
 
An algorithm can be deduced from the following description:
|}
 
Only 20 encoded bits (5 parity, 15 data) are shown, but the pattern continues indefinitely. The key property of Hamming codes, visible from inspection, is that any given bit is included in a unique set of parity bits. To check for errors, check all of the parity bits. The pattern of errors, called the [[Syndrome decoding|error syndrome]], identifies the bit in error. If all parity bits are correct, there is no error. Otherwise, the sum of the positions of the erroneous parity bits identifies the erroneous bit. For example, if the parity bits in positions 1, 2 and 8 indicate an error, then bit 1+2+8=11 is in error. If only one parity bit indicates an error, the parity bit itself is in error.
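The coverage rule behind this table can be checked directly: the parity bit at position 2<sup>''i''</sup> covers exactly those positions whose binary index has bit ''i'' set, so the positions of the failing checks sum to the position of the erroneous bit. A small Python sketch of the worked example above (the function name is illustrative):

```python
def covered_positions(parity_pos, n):
    # The parity bit at a power-of-two position checks every position
    # whose binary index includes that power of two.
    return [p for p in range(1, n + 1) if p & parity_pos]

# Worked example from the text: if the checks at positions 1, 2 and 8 fail,
# the erroneous bit is 1 + 2 + 8 = 11 (binary 1011).
for p in (1, 2, 8):
    assert 11 in covered_positions(p, 15)
assert 11 not in covered_positions(4, 15)
```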
 
With {{mvar|m}} parity bits, bits from 1 up to <math>2^m-1</math> can be covered. After discounting the parity bits, <math>2^m-m-1</math> bits remain for use as data. As {{mvar|m}} varies, we get all the possible Hamming codes:
If the decoder does not attempt to correct errors, it can reliably detect triple bit errors. If the decoder does correct errors, some triple errors will be mistaken for single errors and "corrected" to the wrong value. Error correction is therefore a trade-off between certainty (the ability to reliably detect triple bit errors) and resiliency (the ability to keep functioning in the face of single bit errors).
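The decoder's choice described above can be made explicit by combining the ordinary syndrome with the extra overall-parity bit. The following Python sketch illustrates that decision rule under the assumption of at most two bit errors; the function name and return convention are invented here and do not describe any particular hardware design:

```python
def secded_classify(word):
    """Classify a received extended Hamming codeword whose last bit is the
    overall parity bit. Assumes at most two bits were flipped in transit."""
    syndrome = 0
    for pos, bit in enumerate(word[:-1], start=1):
        if bit:
            syndrome ^= pos
    overall_ok = sum(word) % 2 == 0        # every valid word has even total weight
    if syndrome == 0 and overall_ok:
        return ("ok", None)
    if not overall_ok:                     # odd number of flips: a single error
        # syndrome 0 means the extra parity bit itself was flipped
        return ("single", syndrome if syndrome else len(word))
    return ("double", None)                # even number of flips, nonzero syndrome

# A valid extended [8,4] codeword: the [7,4] word 0110011 plus overall parity 0.
codeword = [0, 1, 1, 0, 0, 1, 1, 0]
```

Flipping one bit of <code>codeword</code> yields a "single" verdict with the flipped position; flipping two bits flips the syndrome but preserves overall parity, yielding "double".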
 
This extended Hamming code was popular in computer memory systems, starting with the [[IBM 7030 Stretch]] in 1961,{{sfn|Kythe|Kythe|2017|p=115}} where it is known as ''SECDED'' (or SEC-DED, abbreviated from ''single error correction, double error detection'').{{sfn|Kythe|Kythe|2017|p=95}} Server computers in the 21st century, while typically keeping the SECDED level of protection, no longer use Hamming's method, relying instead on designs with longer codewords (128 to 256 bits of data) and modified balanced parity-check trees.{{sfn|Kythe|Kythe|2017|p=115}} The (72,64) Hamming code is still popular in some hardware designs, including [[Xilinx]] [[FPGA]] families.{{sfn|Kythe|Kythe|2017|p=115}}
 
==[7,4] Hamming code==
{{main|Hamming(7,4)}}
[[File:Hamming(7,4).svg|thumb|300px|Graphical depiction of the four data bits and three parity bits and which parity bits apply to which data bits]]
 
 
In 1950, Hamming introduced the [7,4] Hamming code. It encodes four data bits into seven bits by adding three parity bits. As explained earlier, it can either detect and correct single-bit errors or it can detect (but not correct) both single and double-bit errors.
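Under the positional convention used in this article (parity bits at positions 1, 2 and 4, data bits at positions 3, 5, 6 and 7), the [7,4] encoder reduces to three XOR equations. A minimal sketch, with an illustrative function name:

```python
def encode_7_4(d1, d2, d3, d4):
    # Each parity bit covers the data positions whose binary index
    # contains its power of two.
    p1 = d1 ^ d2 ^ d4   # position 1 covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # position 2 covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # position 4 covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]
```

For example, <code>encode_7_4(1, 0, 1, 1)</code> returns <code>[0, 1, 1, 0, 0, 1, 1]</code>, a codeword whose index-XOR is zero.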
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 0\\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}_{4,8}
.</math>
* [http://www.toolmenow.com/34/Hamming(7,4)-Code-Calculator Tool for calculating Hamming code]
 
{{DEFAULTSORT:Hamming Code}}
[[Category:American inventions]]