{{short description|Limit on data transfer rate}}
 
{{Information theory}}
 
{{redirect|Shannon's theorem|text=Shannon's name is also associated with the [[sampling theorem]]}}
 
In [[information theory]], the '''noisy-channel coding theorem''' (sometimes '''Shannon's theorem''' or '''Shannon's limit''') establishes that for any given degree of [[Noisy channel model|noise contamination of a communication channel]], it is possible (in theory) to communicate discrete data (digital [[information]]) nearly error-free up to a computable maximum rate through the channel. This result was presented by [[Claude Shannon]] in 1948 and was based in part on earlier work and ideas of [[Harry Nyquist]] and [[Ralph Hartley]].
 
The '''Shannon limit''' or '''Shannon capacity''' of a communication channel refers to the maximum [[Code rate|rate]] of error-free data that can theoretically be transferred over the channel if the link is subject to random data transmission errors, for a particular noise level. It was first described by Shannon (1948), and shortly after published in a book by Shannon and [[Warren Weaver]] entitled ''[[The Mathematical Theory of Communication]]'' (1949). This founded the modern discipline of [[information theory]].

== Overview ==
 
Stated by [[Claude Shannon]] in 1948, the theorem describes the maximum possible efficiency of [[error-correcting code|error-correcting methods]] versus levels of noise interference and data corruption. Shannon's theorem has wide-ranging applications in both communications and [[data storage device|data storage]]. This theorem is of foundational importance to the modern field of [[information theory]]. Shannon only gave an outline of the proof. The first rigorous proof for the discrete case is due to [[Amiel Feinstein]]<ref>{{cite journal |last=Feinstein |first=Amiel |title=A new basic theorem of information theory |journal=Transactions of the IRE Professional Group on Information Theory |date=September 1954 |volume=4 |issue=4 |pages=2–22 |doi=10.1109/TIT.1954.1057459 |hdl=1721.1/4798}}</ref> in 1954.
 
The Shannon theorem states that given a noisy channel with [[channel capacity]] ''C'' and information transmitted at a rate ''R'', if <math>R < C</math> there exist [[code]]s that allow the [[probability of error]] at the receiver to be made arbitrarily small. This means that, theoretically, it is possible to transmit information nearly without error at any rate below a limiting rate, ''C''.
The channel capacity <math>C</math> can be calculated from the physical properties of a channel; for a band-limited channel with Gaussian noise, using the [[Shannon–Hartley theorem]].
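For a band-limited channel with [[additive white Gaussian noise]], the [[Shannon–Hartley theorem]] gives this capacity explicitly as <math>C = B \log_2(1 + S/N)</math>. A minimal Python sketch of this calculation (the bandwidth and signal-to-noise values below are arbitrary illustrative choices, not taken from the article):

<syntaxhighlight lang="python">
import math

def shannon_hartley_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Capacity in bits per second of a band-limited AWGN channel,
    per the Shannon-Hartley theorem: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical example: a 3 kHz channel at 30 dB signal-to-noise ratio.
snr_db = 30
snr_linear = 10 ** (snr_db / 10)                   # 30 dB -> 1000
print(shannon_hartley_capacity(3000, snr_linear))  # about 29,900 bit/s
</syntaxhighlight>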
 
Simple schemes such as "send the message 3 times and use a best 2 out of 3 voting scheme if the copies differ" are inefficient error-correction methods, unable to asymptotically guarantee that a block of data can be communicated free of error. Advanced techniques such as [[Reed–Solomon code]]s and, more recently, [[low-density parity-check code|low-density parity-check]] (LDPC) codes and [[turbo code]]s come much closer to reaching the theoretical Shannon limit, but at a cost of high computational complexity. Using these highly efficient codes and with the computing power in today's [[digital signal processors]], it is now possible to reach very close to the Shannon limit. In fact, it was shown that LDPC codes can reach within 0.0045&nbsp;dB of the Shannon limit (for binary [[additive white Gaussian noise]] (AWGN) channels, with very long block lengths).<ref>{{cite journal |author1-link=Sae-Young Chung |last1=Chung |first1=Sae-Young |author2-link=G. David Forney |last2=Forney |first2=G. D. |author3-link=Thomas J. Richardson |last3=Richardson |first3=T. J. |author4-link=Rüdiger Urbanke |last4=Urbanke |first4=R. |title=On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit |journal=IEEE Communications Letters |volume=5 |issue=2 |pages=58–60 |date=February 2001 |doi=10.1109/4234.905935 |s2cid=7381972 |url=http://www.josephboutros.org/ldpc_vs_turbo/ldpc_Chung_CLfeb01.pdf}}</ref>
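The weakness of the repeat-three scheme can be seen with a short calculation: assuming a binary symmetric channel with an illustrative crossover probability ''p'' (a value chosen only for this sketch), per-bit majority voting still fails with probability <math>3p^2(1-p) + p^3</math>, which is smaller than ''p'' but bounded away from zero, while the rate is fixed at 1/3. A minimal Python sketch:

<syntaxhighlight lang="python">
# Residual error of the "send three times, take the majority" scheme on a
# binary symmetric channel with crossover probability p (illustrative value).
p = 0.1

# The vote is wrong whenever 2 or 3 of the 3 copies of a bit are flipped.
p_majority = 3 * p**2 * (1 - p) + p**3

print(f"raw bit error probability: {p:.4f}")
print(f"after 2-out-of-3 voting:   {p_majority:.4f}")   # ~0.028, still nonzero
print("rate of the scheme:        1/3")
</syntaxhighlight>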
 
== Mathematical statement ==
 
[[Image:Noisy-channel coding theorem — channel capacity graph.png|thumb|right|300px|Graph showing the proportion of a channel’s capacity (''y''-axis) that can be used for payload based on how noisy the channel is (probability of bit flips; ''x''-axis)]]
 
The basic mathematical model for a communication system is the following:
'''Theorem''' (Shannon, 1948):
 
: 1. For every discrete memoryless channel, the [[channel capacity]], defined in terms of the [[mutual information]] <math>I(X; Y)</math>, is
 
:: <math>\ C = \sup_{p_X} I(X;Y)</math><ref>For a description of the "sup" function, see [[Supremum]]</ref>
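As a worked illustration of this supremum, the sketch below (assuming a binary symmetric channel with an arbitrary illustrative crossover probability, not a value from the article) evaluates <math>I(X;Y)</math> over a grid of input distributions; for this channel the maximum is attained at the uniform input and equals <math>1 - H_b(p)</math>, where <math>H_b</math> is the binary entropy function.

<syntaxhighlight lang="python">
import numpy as np

def mutual_information(p_x1: float, p_flip: float) -> float:
    """I(X;Y) in bits for a binary symmetric channel with crossover
    probability p_flip and input distribution P(X=1) = p_x1."""
    p_x = np.array([1 - p_x1, p_x1])
    p_y_given_x = np.array([[1 - p_flip, p_flip],     # channel transition matrix P(Y|X)
                            [p_flip, 1 - p_flip]])
    p_xy = p_x[:, None] * p_y_given_x                 # joint P(X,Y)
    p_y = p_xy.sum(axis=0)                            # marginal P(Y)
    return float(np.sum(p_xy * np.log2(p_xy / (p_x[:, None] * p_y[None, :]))))

p_flip = 0.11                                         # illustrative crossover probability
grid = np.linspace(0.001, 0.999, 999)                 # candidate input distributions P(X=1)
capacity = max(mutual_information(q, p_flip) for q in grid)
print(capacity)   # ~0.50 bits per channel use, matching C = 1 - H_b(0.11)
</syntaxhighlight>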
# The message W is sent across the channel.
# The receiver receives a sequence according to <math>P(y^n|x^n(w))= \prod_{i = 1}^np(y_i|x_i(w))</math>
# Sending these codewords across the channel, we receive <math>Y_1^n</math>, and decode to some source sequence if there exists exactly one codeword that is jointly typical with Y. If there are no jointly typical codewords, or if there is more than one, an error is declared. An error also occurs if a decoded codeword does not match the original codeword. This is called ''typical set decoding''.
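A minimal Monte Carlo sketch of this random-coding scheme, assuming a binary symmetric channel and a simplified joint-typicality test based on the empirical flip fraction (the parameters and the test are illustrative choices, not taken from the article; the residual errors it reports shrink as the blocklength grows):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the article): a binary symmetric channel with
# crossover probability p, blocklength n, and rate R below C = 1 - H_b(p) ~ 0.53.
p, n, R, eps = 0.1, 100, 0.1, 0.05
num_codewords = int(2 ** (n * R))                 # 2^{nR} = 1024 codewords

# Random codebook: every symbol drawn i.i.d. uniformly from {0, 1}.
codebook = rng.integers(0, 2, size=(num_codewords, n), dtype=np.uint8)

errors, trials = 0, 200
for _ in range(trials):
    w = rng.integers(num_codewords)               # message index, drawn uniformly
    y = codebook[w] ^ (rng.random(n) < p)         # pass the codeword through the BSC

    # Simplified joint-typicality test for a BSC with uniform inputs: a codeword is
    # jointly typical with y when its empirical flip fraction is within eps of p.
    flip_frac = np.mean(codebook != y, axis=1)
    typical = np.flatnonzero(np.abs(flip_frac - p) < eps)

    # Typical-set decoding succeeds only if exactly one codeword is jointly
    # typical with y and it is the transmitted one.
    if len(typical) != 1 or typical[0] != w:
        errors += 1

print(f"empirical block error rate: {errors / trials:.3f}")
</syntaxhighlight>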
 
The probability of error of this scheme is divided into two parts:
Suppose a code with <math>2^{nR}</math> codewords. Let W be drawn uniformly over this set as an index. Let <math>X^n</math> and <math>Y^n</math> be the transmitted codewords and received codewords, respectively.
 
# <math>nR = H(W) = H(W|Y^n) + I(W;Y^n)</math> using identities involving entropy and [[mutual information]]
# <math>\le H(W|Y^n) + I(X^n(W);Y^{n})</math> since X is a function of W
# <math>\le 1 + P_e^{(n)}nR + I(X^n(W);Y^n)</math> by the use of [[Fano's Inequality]]
# <math>\le 1 + P_e^{(n)}nR + nC</math> by the fact that capacity is the maximized mutual information.
 
The result of these steps is that <math> P_e^{(n)} \ge 1 - \frac{1}{nR} - \frac{C}{R} </math>. As the block length <math>n</math> goes to infinity, <math> P_e^{(n)}</math> remains bounded away from 0 if R is greater than C; we can achieve arbitrarily low probabilities of error only if R is less than C.
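To make the bound concrete, a small numerical sketch with illustrative values of <math>C</math>, <math>R</math> and <math>n</math> (not taken from the article):

<syntaxhighlight lang="python">
# Converse bound P_e >= 1 - 1/(nR) - C/R, evaluated for an illustrative channel
# with capacity C and a rate R above it.
C, R = 0.5, 0.75

for n in (10, 100, 1000, 10_000):
    lower_bound = 1 - 1 / (n * R) - C / R
    print(f"n = {n:>6}:  P_e >= {lower_bound:.4f}")

# The bound tends to 1 - C/R = 1/3 > 0, so the block error probability cannot be
# driven to zero at any rate above capacity.
</syntaxhighlight>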
 
=== Strong converse for discrete memoryless channels ===
 
A strong converse theorem, proven by Wolfowitz in 1957,<ref>{{cite book |first=Robert |last=Gallager |title=Information Theory and Reliable Communication |location=New York |publisher=Wiley & Sons |date=1968 |isbn=0-471-29048-3}}</ref> states that,
 
: <math>
 
The technicality of [[lim inf]] comes into play when <math>\frac{1}{n}\sum_{i=1}^n C_i</math> does not converge.
 
== Channel coding rate in the finite blocklength regime ==
 
[[Finite Blocklength Information Theory]] investigates the maximum channel coding rate achievable at a given blocklength and error probability in the finite-blocklength (non-asymptotic) regime.<ref>{{Cite conference |last1=Mary |first1=Philippe |last2=Gorce |first2=Jean-Marie |last3=Unsal |first3=Ayse |last4=Poor |first4=H. Vincent |date=2016 |title=Finite Blocklength Information Theory: What Is the Practical Impact on Wireless Communications? |url=https://ieeexplore.ieee.org/document/7848909 |conference=IEEE Globecom Workshops |pages=1–6 |doi=10.1109/GLOCOMW.2016.7848909}}</ref><ref>{{Cite journal |last1=Polyanskiy |first1=Yury |last2=Poor |first2=H. Vincent |last3=Verdu |first3=Sergio |date=2010-05-01 |title=Channel Coding Rate in the Finite Blocklength Regime |url=https://ieeexplore.ieee.org/abstract/document/5452208 |journal=IEEE Transactions on Information Theory |volume=56 |issue=5 |pages=2307–2359 |doi=10.1109/TIT.2010.2043769 |issn=1557-9654}}</ref><ref>{{Cite conference |last1=Wijerathna Basnayaka |first1=Chathuranga M. |last2=Jayakody |first2=Dushantha Nalin K. |last3=Ponnimbaduge Perera |first3=Tharindu D. |last4=Vidal Ribeiro |first4=Moisés |date=2021 |title=Age of Information in an URLLC-enabled Decode-and-Forward Wireless Communication System |url=https://ieeexplore.ieee.org/document/9449007 |conference=2021 IEEE 93rd Vehicular Technology Conference (VTC2021-Spring) |pages=1–6 |doi=10.1109/VTC2021-Spring51267.2021.9449007}}</ref> The maximal achievable channel coding rate <math>\bar{R}</math> for a given block error probability <math>\epsilon</math> and blocklength <math>n</math> (for binary [[additive white Gaussian noise]] (AWGN) channels with short block lengths), as closely approximated by Polyanskiy, [[Vincent Poor|Poor]] and [[Sergio Verdú|Verdú]] (PPV) in 2010, is given by
 
:<math> \bar{R} =C-\sqrt{\frac{V}{n}}Q^{-1}\left ( \epsilon \right )</math>
where <math> Q^{-1}</math> is the inverse of the complementary [[Normal distribution|Gaussian]] [[Cumulative distribution function|cumulative distribution function]], <math> C</math> is the channel capacity and <math> V </math> is a characteristic of the channel referred to as channel dispersion.
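A short Python sketch of this approximation, assuming a binary symmetric channel with an illustrative crossover probability (the PPV result covers general channels, including AWGN; the capacity and dispersion expressions below are the standard ones for the binary symmetric channel, not values from the article):

<syntaxhighlight lang="python">
from math import log2, sqrt
from statistics import NormalDist

def ppv_normal_approximation(p: float, n: int, eps: float):
    """Normal approximation R ~= C - sqrt(V/n) * Q^{-1}(eps) for a binary
    symmetric channel with crossover probability p."""
    C = 1 - (-p * log2(p) - (1 - p) * log2(1 - p))   # capacity in bits per channel use
    V = p * (1 - p) * log2((1 - p) / p) ** 2         # channel dispersion
    q_inv = NormalDist().inv_cdf(1 - eps)            # Q^{-1}(eps)
    return C - sqrt(V / n) * q_inv, C

for n in (100, 1000, 10_000):
    rate, capacity = ppv_normal_approximation(p=0.11, n=n, eps=1e-3)
    print(f"n = {n:>6}:  max rate ~ {rate:.3f}  (capacity {capacity:.3f})")
</syntaxhighlight>

The approximation approaches the capacity as the blocklength grows; at short blocklengths the achievable rate falls noticeably below ''C''.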
 
== See also ==
* [[Turbo code]]
 
== References ==
{{reflist}}
*{{cite web |first=B. |last=Aazhang |title=Shannon's Noisy Channel Coding Theorem |date=2004 |work=Connections |url=https://www.cse.iitd.ac.in/~vinay/courses/CSL858/reading/m10180.pdf}}
*{{cite book |author1-link=Thomas M. Cover |last1=Cover |first1=T. M. |last2=Thomas |first2=J. A. |title=Elements of Information Theory |publisher=Wiley & Sons |date=1991 |isbn=0-471-06259-6}}
*{{cite book |author-link=Robert Fano |first=R. M. |last=Fano |title=Transmission of information; a statistical theory of communications |publisher=MIT Press |date=1961 |isbn=0-262-06001-9}}
*{{cite journal |last=Feinstein |first=Amiel |title=A new basic theorem of information theory |journal=Transactions of the IRE Professional Group on Information Theory |date=September 1954 |volume=4 |issue=4 |pages=2–22 |doi=10.1109/TIT.1954.1057459 |hdl=1721.1/4798 |bibcode=1955PhDT........12F |hdl-access=free}}
*{{cite journal |first=Lars |last=Lundheim |title=On Shannon and Shannon's Formula |journal=Telektronik |volume=98 |issue=1 |pages=20–29 |date=2002 |url=http://www.cs.miami.edu/home/burt/learning/Csc524.142/LarsTelektronikk02.pdf}}
*{{cite book |author-link=David J.C. MacKay |first=David J. C. |last=MacKay |title=Information Theory, Inference, and Learning Algorithms |publisher=Cambridge University Press |date=2003 |isbn=0-521-64298-1 |url=http://www.inference.phy.cam.ac.uk/mackay/itila/book.html}} [free online]
*{{cite journal |author-link=Claude E. Shannon |last1=Shannon |first1=C. E. |title=A Mathematical Theory of Communication |journal=Bell System Technical Journal |year=1948 |volume=27 |issue=3 |pages=379–423 |doi=10.1002/j.1538-7305.1948.tb01338.x}}
*{{cite book |author-link=Claude E. Shannon |first=C. E. |last=Shannon |title=A Mathematical Theory of Communication |publisher=University of Illinois Press |orig-year=1948 |date=1998 |url=http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html}}
*{{cite journal |first=J. |last=Wolfowitz |title=The coding of messages subject to chance errors |journal=Illinois J. Math. |volume=1 |issue=4 |pages=591–606 |date=1957 |doi=10.1215/ijm/1255380682 |doi-access=free |url=https://projecteuclid.org/download/pdf_1/euclid.ijm/1255380682}}
 
{{DEFAULTSORT:Noisy-Channel Coding Theorem}}