{{short description|Limit on data transfer rate}}
 
{{Information theory}}
 
{{redirect|Shannon's theorem|text=Shannon's name is also associated with the [[sampling theorem]]}}
 
In [[information theory]], the '''noisy-channel coding theorem''' (sometimes '''Shannon's theorem''' or '''Shannon's limit''') establishes that for any given degree of [[Noisy channel model|noise contamination of a communication channel]], it is possible (in theory) to communicate discrete data (digital [[information]]) nearly error-free up to a computable maximum rate through the channel. This result was presented by [[Claude Shannon]] in 1948 and was based in part on earlier work and ideas of [[Harry Nyquist]] and [[Ralph Hartley]].
 
The '''Shannon limit''' or '''Shannon capacity''' of a communication channel refers to the maximum [[Code rate|rate]] of error-free data that can theoretically be transferred over the channel if the link is subject to random data transmission errors, for a particular noise level. It was first described by Shannon (1948), and shortly after published in a book by Shannon and [[Warren Weaver]] entitled ''[[The Mathematical Theory of Communication]]'' (1949). This founded the modern discipline of [[information theory]].

== Overview ==
 
Stated by [[Claude Shannon]] in 1948, the theorem describes the maximum possible efficiency of [[error-correcting code|error-correcting methods]] versus levels of noise interference and data corruption. Shannon's theorem has wide-ranging applications in both communications and [[data storage device|data storage]]. This theorem is of foundational importance to the modern field of [[information theory]]. Shannon only gave an outline of the proof. The first rigorous proof for the discrete case is due to [[Amiel Feinstein]]<ref>{{cite journal |last=Feinstein |first=Amiel |title=A new basic theorem of information theory |journal=Transactions of the IRE Professional Group on Information Theory |date=September 1954 |volume=4 |issue=4 |pages=2–22 |doi=10.1109/TIT.1954.1057459 |hdl=1721.1/4798}}</ref> in 1954.
 
The Shannon theorem states that given a noisy channel with [[channel capacity]] ''C'' and information transmitted at a rate ''R'', if <math>R < C</math> there exist [[code]]s that allow the [[probability of error]] at the receiver to be made arbitrarily small. This means that, theoretically, it is possible to transmit information nearly without error at any rate below a limiting rate, ''C''.
The channel capacity <math>C</math> can be calculated from the physical properties of a channel; for a band-limited channel with Gaussian noise, using the [[Shannon–Hartley theorem]].
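For a band-limited channel with [[additive white Gaussian noise]], the [[Shannon–Hartley theorem]] gives this capacity explicitly as <math>C = B \log_2(1 + S/N)</math>. A minimal Python sketch of this calculation (the bandwidth and signal-to-noise values below are arbitrary illustrative choices, not taken from the article):

<syntaxhighlight lang="python">
import math

def shannon_hartley_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Capacity in bits per second of a band-limited AWGN channel,
    per the Shannon-Hartley theorem: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical example: a 3 kHz channel at 30 dB signal-to-noise ratio.
snr_db = 30
snr_linear = 10 ** (snr_db / 10)                   # 30 dB -> 1000
print(shannon_hartley_capacity(3000, snr_linear))  # about 29,900 bit/s
</syntaxhighlight>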
 
Simple schemes such as "send the message 3 times and use a best 2 out of 3 voting scheme if the copies differ" are inefficient error-correction methods, unable to asymptotically guarantee that a block of data can be communicated free of error. Advanced techniques such as [[Reed–Solomon code]]s and, more recently, [[low-density parity-check code|low-density parity-check]] (LDPC) codes and [[turbo code]]s come much closer to reaching the theoretical Shannon limit, but at a cost of high computational complexity. Using these highly efficient codes and with the computing power in today's [[digital signal processors]], it is now possible to reach very close to the Shannon limit. In fact, it was shown that LDPC codes can reach within 0.0045&nbsp;dB of the Shannon limit (for binary [[additive white Gaussian noise]] (AWGN) channels, with very long block lengths).<ref>{{cite journal |author1-link=Sae-Young Chung |last1=Chung |first1=Sae-Young |author2-link=G. David Forney |last2=Forney |first2=G. D. |author3-link=Thomas J. Richardson |last3=Richardson |first3=T. J. |author4-link=Rüdiger Urbanke |last4=Urbanke |first4=R. |title=On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit |journal=IEEE Communications Letters |volume=5 |issue=2 |pages=58–60 |date=February 2001 |doi=10.1109/4234.905935 |s2cid=7381972 |url=http://www.josephboutros.org/ldpc_vs_turbo/ldpc_Chung_CLfeb01.pdf}}</ref>
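The weakness of the repeat-three scheme can be seen with a short calculation: assuming a binary symmetric channel with an illustrative crossover probability ''p'' (a value chosen only for this sketch), per-bit majority voting still fails with probability <math>3p^2(1-p) + p^3</math>, which is smaller than ''p'' but bounded away from zero, while the rate is fixed at 1/3. A minimal Python sketch:

<syntaxhighlight lang="python">
# Residual error of the "send three times, take the majority" scheme on a
# binary symmetric channel with crossover probability p (illustrative value).
p = 0.1

# The vote is wrong whenever 2 or 3 of the 3 copies of a bit are flipped.
p_majority = 3 * p**2 * (1 - p) + p**3

print(f"raw bit error probability: {p:.4f}")
print(f"after 2-out-of-3 voting:   {p_majority:.4f}")   # ~0.028, still nonzero
print("rate of the scheme:        1/3")
</syntaxhighlight>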
 
== Mathematical statement ==
 
[[Image:Noisy-channel coding theorem — channel capacity graph.png|thumb|right|300px|Graph showing the proportion of a channel’s capacity (''y''-axis) that can be used for payload based on how noisy the channel is (probability of bit flips; ''x''-axis)]]
 
The basic mathematical model for a communication system is the following:
'''Theorem''' (Shannon, 1948):
 
: 1. For every discrete memoryless channel, the [[channel capacity]], defined in terms of the [[mutual information]] <math>I(X; Y)</math>, is
 
:: <math>\ C = \sup_{p_X} I(X;Y)</math><ref>For a description of the "sup" function, see [[Supremum]]</ref>
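As a worked illustration of this supremum, the sketch below (assuming a binary symmetric channel with an arbitrary illustrative crossover probability, not a value from the article) evaluates <math>I(X;Y)</math> over a grid of input distributions; for this channel the maximum is attained at the uniform input and equals <math>1 - H_b(p)</math>, where <math>H_b</math> is the binary entropy function.

<syntaxhighlight lang="python">
import numpy as np

def mutual_information(p_x1: float, p_flip: float) -> float:
    """I(X;Y) in bits for a binary symmetric channel with crossover
    probability p_flip and input distribution P(X=1) = p_x1."""
    p_x = np.array([1 - p_x1, p_x1])
    p_y_given_x = np.array([[1 - p_flip, p_flip],     # channel transition matrix P(Y|X)
                            [p_flip, 1 - p_flip]])
    p_xy = p_x[:, None] * p_y_given_x                 # joint P(X,Y)
    p_y = p_xy.sum(axis=0)                            # marginal P(Y)
    return float(np.sum(p_xy * np.log2(p_xy / (p_x[:, None] * p_y[None, :]))))

p_flip = 0.11                                         # illustrative crossover probability
grid = np.linspace(0.001, 0.999, 999)                 # candidate input distributions P(X=1)
capacity = max(mutual_information(q, p_flip) for q in grid)
print(capacity)   # ~0.50 bits per channel use, matching C = 1 - H_b(0.11)
</syntaxhighlight>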
# The message W is sent across the channel.
# The receiver receives a sequence according to <math>P(y^n|x^n(w))= \prod_{i = 1}^np(y_i|x_i(w))</math>
# Sending these codewords across the channel, we receive <math>Y_1^n</math>, and decode to some source sequence if there exists exactly one codeword that is jointly typical with Y. If there are no jointly typical codewords, or if there is more than one, an error is declared. An error also occurs if a decoded codeword does not match the original codeword. This is called ''typical set decoding''.
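A minimal Monte Carlo sketch of this random-coding scheme, assuming a binary symmetric channel and a simplified joint-typicality test based on the empirical flip fraction (the parameters and the test are illustrative choices, not taken from the article; the residual errors it reports shrink as the blocklength grows):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the article): a binary symmetric channel with
# crossover probability p, blocklength n, and rate R below C = 1 - H_b(p) ~ 0.53.
p, n, R, eps = 0.1, 100, 0.1, 0.05
num_codewords = int(2 ** (n * R))                 # 2^{nR} = 1024 codewords

# Random codebook: every symbol drawn i.i.d. uniformly from {0, 1}.
codebook = rng.integers(0, 2, size=(num_codewords, n), dtype=np.uint8)

errors, trials = 0, 200
for _ in range(trials):
    w = rng.integers(num_codewords)               # message index, drawn uniformly
    y = codebook[w] ^ (rng.random(n) < p)         # pass the codeword through the BSC

    # Simplified joint-typicality test for a BSC with uniform inputs: a codeword is
    # jointly typical with y when its empirical flip fraction is within eps of p.
    flip_frac = np.mean(codebook != y, axis=1)
    typical = np.flatnonzero(np.abs(flip_frac - p) < eps)

    # Typical-set decoding succeeds only if exactly one codeword is jointly
    # typical with y and it is the transmitted one.
    if len(typical) != 1 or typical[0] != w:
        errors += 1

print(f"empirical block error rate: {errors / trials:.3f}")
</syntaxhighlight>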
 
The probability of error of this scheme is divided into two parts:
Suppose a code with <math>2^{nR}</math> codewords. Let W be drawn uniformly over this set as an index. Let <math>X^n</math> and <math>Y^n</math> be the transmitted codewords and received codewords, respectively.
 
# <math>nR = H(W) = H(W|Y^n) + I(W;Y^n)</math> using identities involving entropy and [[mutual information]]
# <math>\le H(W|Y^n) + I(X^n(W);Y^{n})</math> since X is a function of W
# <math>\le 1 + P_e^{(n)}nR + I(X^n(W);Y^n)</math> by the use of [[Fano's Inequality]]
# <math>\le 1 + P_e^{(n)}nR + nC</math> by the fact that capacity is the maximized mutual information.
 
The result of these steps is that <math> P_e^{(n)} \ge 1 - \frac{1}{nR} - \frac{C}{R} </math>. As the block length <math>n</math> goes to infinity, <math> P_e^{(n)}</math> remains bounded away from 0 if R is greater than C; we can achieve arbitrarily low probabilities of error only if R is less than C.
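To make the bound concrete, a small numerical sketch with illustrative values of <math>C</math>, <math>R</math> and <math>n</math> (not taken from the article):

<syntaxhighlight lang="python">
# Converse bound P_e >= 1 - 1/(nR) - C/R, evaluated for an illustrative channel
# with capacity C and a rate R above it.
C, R = 0.5, 0.75

for n in (10, 100, 1000, 10_000):
    lower_bound = 1 - 1 / (n * R) - C / R
    print(f"n = {n:>6}:  P_e >= {lower_bound:.4f}")

# The bound tends to 1 - C/R = 1/3 > 0, so the block error probability cannot be
# driven to zero at any rate above capacity.
</syntaxhighlight>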
 
=== Strong converse for discrete memoryless channels ===
 
A strong converse theorem, proven by Wolfowitz in 1957,<ref>{{cite book |first=Robert |last=Gallager |title=Information Theory and Reliable Communication |location=New York |publisher=Wiley & Sons |date=1968 |isbn=0-471-29048-3}}</ref> states that,
 
: <math>
 
The technicality of [[lim inf]] comes into play when <math>\frac{1}{n}\sum_{i=1}^n C_i</math> does not converge.
 
== Channel coding rate in the finite blocklength regime ==
 
[[Finite Blocklength Information Theory]] investigates the maximum channel coding rate achievable at a given blocklength and error probability in the finite-blocklength (non-asymptotic) regime.<ref>{{Cite conference |last1=Mary |first1=Philippe |last2=Gorce |first2=Jean-Marie |last3=Unsal |first3=Ayse |last4=Poor |first4=H. Vincent |date=2016 |title=Finite Blocklength Information Theory: What Is the Practical Impact on Wireless Communications? |url=https://ieeexplore.ieee.org/document/7848909 |conference=IEEE Globecom Workshops |pages=1–6 |doi=10.1109/GLOCOMW.2016.7848909}}</ref><ref>{{Cite journal |last1=Polyanskiy |first1=Yury |last2=Poor |first2=H. Vincent |last3=Verdu |first3=Sergio |date=2010-05-01 |title=Channel Coding Rate in the Finite Blocklength Regime |url=https://ieeexplore.ieee.org/abstract/document/5452208 |journal=IEEE Transactions on Information Theory |volume=56 |issue=5 |pages=2307–2359 |doi=10.1109/TIT.2010.2043769 |issn=1557-9654}}</ref><ref>{{Cite conference |last1=Wijerathna Basnayaka |first1=Chathuranga M. |last2=Jayakody |first2=Dushantha Nalin K. |last3=Ponnimbaduge Perera |first3=Tharindu D. |last4=Vidal Ribeiro |first4=Moisés |date=2021 |title=Age of Information in an URLLC-enabled Decode-and-Forward Wireless Communication System |url=https://ieeexplore.ieee.org/document/9449007 |conference=2021 IEEE 93rd Vehicular Technology Conference (VTC2021-Spring) |pages=1–6 |doi=10.1109/VTC2021-Spring51267.2021.9449007}}</ref> The maximal achievable channel coding rate <math>\bar{R}</math> for a given block error probability <math>\epsilon</math> and blocklength <math>n</math> (for binary [[additive white Gaussian noise]] (AWGN) channels with short block lengths), as closely approximated by Polyanskiy, [[Vincent Poor|Poor]] and [[Sergio Verdú|Verdú]] (PPV) in 2010, is given by
 
:<math> \bar{R} =C-\sqrt{\frac{V}{n}}Q^{-1}\left ( \epsilon \right )</math>
where <math> Q^{-1}</math> is the inverse of the complementary [[Normal distribution|Gaussian]] [[Cumulative distribution function|cumulative distribution function]], <math> C</math> is the channel capacity and <math> V </math> is a characteristic of the channel referred to as channel dispersion.
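A short Python sketch of this approximation, assuming a binary symmetric channel with an illustrative crossover probability (the PPV result covers general channels, including AWGN; the capacity and dispersion expressions below are the standard ones for the binary symmetric channel, not values from the article):

<syntaxhighlight lang="python">
from math import log2, sqrt
from statistics import NormalDist

def ppv_normal_approximation(p: float, n: int, eps: float):
    """Normal approximation R ~= C - sqrt(V/n) * Q^{-1}(eps) for a binary
    symmetric channel with crossover probability p."""
    C = 1 - (-p * log2(p) - (1 - p) * log2(1 - p))   # capacity in bits per channel use
    V = p * (1 - p) * log2((1 - p) / p) ** 2         # channel dispersion
    q_inv = NormalDist().inv_cdf(1 - eps)            # Q^{-1}(eps)
    return C - sqrt(V / n) * q_inv, C

for n in (100, 1000, 10_000):
    rate, capacity = ppv_normal_approximation(p=0.11, n=n, eps=1e-3)
    print(f"n = {n:>6}:  max rate ~ {rate:.3f}  (capacity {capacity:.3f})")
</syntaxhighlight>

The approximation approaches the capacity as the blocklength grows; at short blocklengths the achievable rate falls noticeably below ''C''.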
 
== See also ==
* [[Turbo code]]
 
== References ==
{{reflist}}
*{{cite web |first=B. |last=Aazhang |title=Shannon's Noisy Channel Coding Theorem |date=2004 |work=Connections |url=https://www.cse.iitd.ac.in/~vinay/courses/CSL858/reading/m10180.pdf}}
*{{cite book |author1-link=Thomas M. Cover |last1=Cover |first1=T. M. |last2=Thomas |first2=J. A. |title=Elements of Information Theory |publisher=Wiley & Sons |date=1991 |isbn=0-471-06259-6}}
*{{cite book |author-link=Robert Fano |first=R. M. |last=Fano |title=Transmission of information; a statistical theory of communications |publisher=MIT Press |date=1961 |isbn=0-262-06001-9}}
*{{cite journal |last=Feinstein |first=Amiel |title=A new basic theorem of information theory |journal=Transactions of the IRE Professional Group on Information Theory |date=September 1954 |volume=4 |issue=4 |pages=2–22 |doi=10.1109/TIT.1954.1057459 |hdl=1721.1/4798 |bibcode=1955PhDT........12F |hdl-access=free}}
*{{cite journal |first=Lars |last=Lundheim |title=On Shannon and Shannon's Formula |journal=Telektronik |volume=98 |issue=1 |pages=20–29 |date=2002 |url=http://www.cs.miami.edu/home/burt/learning/Csc524.142/LarsTelektronikk02.pdf}}
*{{cite book |author-link=David J.C. MacKay |first=David J. C. |last=MacKay |title=Information Theory, Inference, and Learning Algorithms |publisher=Cambridge University Press |date=2003 |isbn=0-521-64298-1 |url=http://www.inference.phy.cam.ac.uk/mackay/itila/book.html}} [free online]
*{{cite journal |author-link=Claude E. Shannon |last1=Shannon |first1=C. E. |title=A Mathematical Theory of Communication |journal=Bell System Technical Journal |year=1948 |volume=27 |issue=3 |pages=379–423 |doi=10.1002/j.1538-7305.1948.tb01338.x}}
*{{cite book |author-link=Claude E. Shannon |first=C. E. |last=Shannon |title=A Mathematical Theory of Communication |publisher=University of Illinois Press |orig-year=1948 |date=1998 |url=http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html}}
*{{cite journal |first=J. |last=Wolfowitz |title=The coding of messages subject to chance errors |journal=Illinois J. Math. |volume=1 |issue=4 |pages=591–606 |date=1957 |doi=10.1215/ijm/1255380682 |doi-access=free |url=https://projecteuclid.org/download/pdf_1/euclid.ijm/1255380682}}
 
{{DEFAULTSORT:Noisy-Channel Coding Theorem}}