Error detection and correction: Difference between revisions

Tcs4flt (talk | contribs)
m fix wrong citation
m Reverting possible vandalism by 50.175.213.202 to version by Bender the Bot. Report False Positive? Thanks, ClueBot NG. (4410618) (Bot)
 
(7 intermediate revisions by 7 users not shown)
Line 4:
{{More citations needed|article|date=August 2008}}
 
[[File:Reed–Solomon error correction Mona Lisa LroLrLasercomFig4.jpg|thumb|To clean up transmission errors introduced by Earth's atmosphere (left), Goddard scientists applied [[Reed–Solomon error correction]] (right), which is commonly used in CDs and DVDs. Typical errors include missing pixels (white) and false signals (black). The white stripe indicates a brief period when transmission was interrupted.]]
 
In [[information theory]] and [[coding theory]] with applications in [[computer science]] and [[telecommunications]], '''error detection and correction''' ('''EDAC''') or '''error control''' are techniques that enable [[reliable delivery]] of [[digital data]] over unreliable [[communication channel]]s. Many communication channels are subject to [[channel noise]], and thus errors may be introduced during transmission from the source to a receiver. Error detection techniques allow detecting such errors, while error correction enables reconstruction of the original data in many cases.
Line 37:
Three types of ARQ protocols are [[Stop-and-wait ARQ]], [[Go-Back-N ARQ]], and [[Selective Repeat ARQ]].
 
ARQ is appropriate if the communication channel has varying or unknown [[channel capacity|capacity]], as is the case on the Internet. However, ARQ requires the availability of a [[Backward channel|back channel]], may increase [[Network latency|latency]] because of retransmissions, and requires the maintenance of buffers and timers for retransmissions, which in the case of [[network congestion]] can put a strain on the server and overall network capacity.<ref name="reliable-erasure-code">A. J. McAuley, ''Reliable Broadband Communication Using a Burst Erasure Correcting Code'', ACM SIGCOMM, 1990.</ref>
 
For example, ARQ is used on shortwave radio data links in the form of [[ARQ-E]], or combined with multiplexing as [[ARQ-M]].
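
The retransmission behaviour of stop-and-wait ARQ can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a description of any particular protocol: the frame layout (one sequence byte, payload, CRC-32 trailer), the function names, and the simulated channel are all hypothetical, and acknowledgements are assumed to arrive without error.

```python
import random
import zlib


def make_frame(seq: int, payload: bytes) -> bytes:
    """Build a frame: 1-byte sequence number, payload, 4-byte CRC-32 trailer."""
    body = bytes([seq]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def parse_frame(frame: bytes):
    """Return (seq, payload), or None if the CRC check fails."""
    if len(frame) < 5:
        return None
    body, trailer = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != trailer:
        return None
    return body[0], body[1:]


def noisy_channel(frame: bytes, rng: random.Random, error_rate: float) -> bytes:
    """With probability error_rate, flip one bit somewhere in the frame."""
    if rng.random() < error_rate:
        damaged = bytearray(frame)
        damaged[rng.randrange(len(damaged))] ^= 0x01
        return bytes(damaged)
    return frame


def stop_and_wait(messages, error_rate=0.3, seed=0, max_tries=50):
    """Deliver messages one at a time, resending until the receiver accepts."""
    rng = random.Random(seed)
    delivered = []
    transmissions = 0
    expected = 0  # sequence bit the receiver is waiting for
    for index, payload in enumerate(messages):
        seq = index % 2  # alternating one-bit sequence number
        for _ in range(max_tries):
            transmissions += 1
            received = parse_frame(
                noisy_channel(make_frame(seq, payload), rng, error_rate)
            )
            if received is None:
                # Receiver discards the frame and sends no ACK;
                # the sender's timer expires and it retransmits.
                continue
            rseq, rpayload = received
            if rseq == expected:
                delivered.append(rpayload)
                expected ^= 1
            break  # ACK assumed to reach the sender reliably
        else:
            raise RuntimeError("gave up after max_tries retransmissions")
    return delivered, transmissions
```

Calling <code>stop_and_wait([b"alpha", b"beta"])</code> returns the delivered payloads together with the number of transmissions used; the difference between that count and the number of messages is the retransmission overhead that contributes to the added latency.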
Line 142:
Applications that use ARQ must have a [[return channel]]; applications having no return channel cannot use ARQ. Applications that require extremely low error rates (such as digital money transfers) must use ARQ due to the possibility of uncorrectable errors with FEC.
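
Why FEC alone can leave errors uncorrected can be seen with a very simple code. The following sketch uses a three-fold repetition code with majority voting, chosen only for illustration (practical systems use far stronger codes); the function names are hypothetical.

```python
def repetition_encode(bits):
    """Send every bit three times."""
    return [bit for bit in bits for _ in range(3)]


def repetition_decode(coded):
    """Majority vote over each group of three received bits."""
    return [1 if sum(coded[i:i + 3]) >= 2 else 0
            for i in range(0, len(coded), 3)]


message = [1, 0, 1]

# One flipped bit per group is corrected.
one_error = repetition_encode(message)
one_error[4] ^= 1
corrected = repetition_decode(one_error)

# Two flipped bits in the same group outvote the correct copy,
# so the decoder silently returns the wrong bit.
two_errors = repetition_encode(message)
two_errors[3] ^= 1
two_errors[4] ^= 1
miscorrected = repetition_decode(two_errors)
```

Here <code>corrected</code> equals the original message while <code>miscorrected</code> does not, and the decoder has no way of noticing. An ARQ scheme with a sufficiently strong error-detecting code would instead request a retransmission.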
 
Reliability and inspection engineering also make use of the theory of error-correcting codes,<ref>{{cite journal |url=http://www.eng.tau.ac.il/~bengal/SCI_paper.pdf|journal=IIE Transactions |title=Self-correcting inspection procedure under inspection errors |author1=Ben-Gal I. |author2=Herer Y. |author3=Raz T. |publisher=IIE Transactions on Quality and Reliability, 34(6), pp. 529-540. |year=2003 |access-date=2014-01-10 |archive-url=https://web.archive.org/web/20131013171945/http://www.eng.tau.ac.il/~bengal/SCI_paper.pdf |archive-date=2013-10-13 |url-status=dead }}</ref> as well as natural language.<ref name="DOI10.7275/bjvb-2n37">
{{cite journal
| author = Yvo Meeres, Tommi A. Pirinen
Line 155:
| pmid =
}}
</ref>
 
=== Internet ===
In a typical [[TCP/IP]] stack, error control is performed at multiple levels:
* Each [[Ethernet frame]] uses [[Cyclic redundancy check|CRC-32]] error detection. Frames with detected errors are discarded by the receiver hardware.
* The [[IPv4]] header contains a [[IPv4 header checksum|checksum]] protecting the contents of the header. [[Network packet|Packets]] with incorrect checksums are dropped within the network or at the receiver.
* The checksum was omitted from the [[IPv6]] header in order to minimize processing costs in [[network routing]] and because current [[link layer]] technology is assumed to provide sufficient error detection (see also RFC 3819).
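
The IPv4 header checksum mentioned above is a 16-bit ones' complement sum, as described in RFC 1071. The following is a minimal sketch; the function name and the sample header bytes are illustrative assumptions, not taken from any captured traffic.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the low 16 bits back into the sum.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
checksum = internet_checksum(header)  # 0xB861 for this header

# A receiver recomputes the sum over the header including the stored
# checksum; the result is zero when the header is intact.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
header_is_valid = internet_checksum(filled) == 0
```

Because the sum is a simple addition, it is cheap to compute, but it is weaker than a CRC: for example, it cannot detect two 16-bit words being swapped.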
Line 178:
 
=== Data storage ===
Error detection and correction codes are often used to improve the reliability of data storage media.<ref>{{Cite book|last1=Kurtas|first1=Erozan M.|url=https://books.google.com/books?id=Vx_NBQAAQBAJ&q=Error+detection+and+correction+codes+are+often+used+to+improve+the+reliability+of+data+storage+media&pg=PR5|title=Advanced Error Control Techniques for Data Storage Systems|last2=Vasic|first2=Bane|date=2018-10-03|publisher=CRC Press|isbn=978-1-4200-3649-7|language=en}}{{Dead link|date=March 2020 |bot=InternetArchiveBot |fix-attempted=yes }}</ref> A parity track capable of detecting single-bit errors was present on the first [[magnetic tape data storage]] in 1951. The [[optimal rectangular code]] used in [[group coded recording]] tapes not only detects but also corrects single-bit errors. Some [[file format]]s, particularly [[archive formats]], include a checksum (most often [[CRC-32]]) to detect corruption and truncation and can employ redundancy or [[parity file]]s to recover portions of corrupted data. [[Cross-interleaved Reed–Solomon coding|Reed–Solomon codes]] are used in [[compact disc]]s to correct errors caused by scratches.
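
The step from detecting to correcting a single-bit error can be illustrated with a generic two-dimensional (row and column) parity scheme. This is a simplified sketch of the general idea only; it is not the actual code used in group coded recording, and the function names are hypothetical.

```python
def parity_encode(rows):
    """Return (row_parity, col_parity) for a grid of bits."""
    row_parity = [sum(row) % 2 for row in rows]
    col_parity = [sum(col) % 2 for col in zip(*rows)]
    return row_parity, col_parity


def parity_correct(rows, row_parity, col_parity):
    """Return a corrected copy of rows, assuming at most one flipped data bit."""
    fixed = [list(row) for row in rows]
    bad_rows = [i for i, row in enumerate(fixed)
                if sum(row) % 2 != row_parity[i]]
    bad_cols = [j for j, col in enumerate(zip(*fixed))
                if sum(col) % 2 != col_parity[j]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        # The failing row and column intersect at the damaged bit.
        fixed[bad_rows[0]][bad_cols[0]] ^= 1
    elif bad_rows or bad_cols:
        raise ValueError("error pattern is not a single flipped data bit")
    return fixed


data = [[1, 0, 1, 1],
        [0, 1, 1, 0],
        [1, 1, 0, 0]]
row_parity, col_parity = parity_encode(data)

damaged = [list(row) for row in data]
damaged[1][2] ^= 1  # simulate a single-bit read error
repaired = parity_correct(damaged, row_parity, col_parity)
```

A single parity bit per row would only reveal that some bit in the row is wrong; adding the column parities pinpoints which one, which is what makes correction possible.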
 
Modern hard drives use Reed–Solomon codes to detect and correct minor errors in sector reads, and to recover corrupted data from failing sectors and store that data in the spare sectors.<ref>{{cite web |archive-url=https://web.archive.org/web/20080202143103/http://www.myharddrivedied.com/presentations_whitepaper.html |archive-date=2008-02-02 |url=http://www.myharddrivedied.com/presentations_whitepaper.html |title=My Hard Drive Died |author=Scott A. Moulton}}</ref> [[RAID]] systems use a variety of error correction techniques to recover data when a hard drive completely fails. Filesystems such as [[ZFS]] or [[Btrfs]], as well as some [[RAID]] implementations, support [[data scrubbing]] and resilvering, which allow bad blocks to be detected and, where possible, recovered before they are used.<ref>{{Cite book|last1=Qiao|first1=Zhi|last2=Fu|first2=Song|last3=Chen|first3=Hsing-Bung|last4=Settlemyer|first4=Bradley|title=2019 IEEE International Conference on Cluster Computing (CLUSTER) |chapter=Building Reliable High-Performance Storage Systems: An Empirical and Analytical Study |date=2019|pages=1–10|doi=10.1109/CLUSTER.2019.8891006|isbn=978-1-7281-4734-5|s2cid=207951690}}</ref> The recovered data may be rewritten to exactly the same physical location, to spare blocks elsewhere on the same piece of hardware, or onto replacement hardware.
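
One of the simpler techniques a RAID system can use to survive the loss of a whole drive is single XOR parity. The following is a minimal sketch assuming equal-sized blocks and exactly one failed drive; the function names are hypothetical, and real RAID levels differ in how data and parity are striped across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the one missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Three data drives plus one parity drive.
drives = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(drives)

# Drive 1 fails; its contents are rebuilt from the other drives and the parity.
recovered = rebuild_missing([drives[0], drives[2]], parity)
```

The rebuild works because XOR-ing a value with itself cancels out: combining the parity with every surviving block leaves only the missing block. The same property means a single parity block cannot recover from two simultaneous drive failures.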
Line 203:
| author = Jeff Layton | magazine = [[Linux Magazine]]
}}</ref><ref>{{cite web
| url = https://bluesmoke.sourceforge.net/
| title = EDAC Project
| access-date = 2014-08-12