Noisy-channel coding theorem

This is an old revision of this page, as edited by Iseetho at 20:05, 17 May 2006 (added a short description of one style of proof of the coding theorem).

In information theory, the noisy-channel coding theorem establishes that however contaminated with noise a communication channel may be, it is possible to communicate digital data (information) nearly error-free up to a computable maximum rate through the channel. This surprising result, sometimes called the fundamental theorem of information theory, or just Shannon's theorem, was first presented by Claude Shannon in 1948.

The Shannon limit or Shannon capacity of a communications channel is the theoretical maximum information transfer rate of the channel, for a particular noise level.
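
For the common case of a bandwidth-limited channel with additive white Gaussian noise, the Shannon limit is given by the Shannon–Hartley formula C = B log2(1 + S/N). A minimal sketch of that computation (the function name and the example channel parameters are illustrative, not from the article):

```python
import math

def shannon_hartley_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley capacity of an AWGN channel, in bits per second:
    C = B * log2(1 + S/N), with S/N as a linear (not dB) power ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# A telephone-grade channel: 3000 Hz of bandwidth at 30 dB SNR (1000 linear)
# supports roughly 29.9 kbit/s of error-free information transfer.
c = shannon_hartley_capacity(3000, 1000)
```

Note that the capacity grows only logarithmically with signal-to-noise ratio: doubling the transmit power adds at most one bit per channel use.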

Overview

Proved by Claude Shannon in 1948, the theorem describes the maximum possible efficiency of error-correcting methods versus levels of noise interference and data corruption. The theorem does not describe how to construct the error-correcting method; it only tells us how good the best possible method can be. Shannon's theorem has wide-ranging applications in both communications and data storage. This theorem is of foundational importance to the modern field of information theory.

The Shannon theorem states that given a noisy channel with information capacity C and information transmitted at a rate R, then if

R < C

there exists a coding technique which allows the probability of error at the receiver to be made arbitrarily small. This means that, theoretically, it is possible to transmit information nearly without error at any rate below a limiting rate, C.

The converse is also important. If

R > C

the probability of error at the receiver increases without bound as the rate is increased. So no useful information can be transmitted beyond the channel capacity. The theorem does not address the rare situation in which rate and capacity are equal.
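
For a concrete instance of the threshold at C, consider the binary symmetric channel, whose capacity is 1 − H2(p) bits per channel use, where p is the crossover probability and H2 is the binary entropy function. A minimal sketch (function names are illustrative):

```python
import math

def binary_entropy(p):
    """H2(p) = -p*log2(p) - (1-p)*log2(1-p), with H2(0) = H2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1 - binary_entropy(p)

# With p = 0.1, C is about 0.531 bits per channel use: any rate below
# 0.531 admits codes with arbitrarily small error probability, and no
# rate above it does.
c = bsc_capacity(0.1)
```

At p = 0.5 the output is independent of the input and the capacity is zero; at p = 0 the channel is noiseless and the capacity is one bit per use.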

Simple schemes such as "send the message three times and use a best-two-out-of-three voting scheme if the copies differ" are inefficient error-correction methods, unable to asymptotically guarantee that a block of data can be communicated free of error. Advanced techniques such as Reed-Solomon codes and, more recently, Turbo codes come much closer to reaching the theoretical Shannon limit, but at a cost of high computational complexity. With Turbo codes and the computing power in today's digital signal processors, it is now possible to reach within 1/10 of one decibel of the Shannon limit.
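
The inefficiency of the repetition scheme can be seen directly. With three copies and majority voting on a channel that flips each bit with probability p, the decoded bit is wrong only when two or three copies are flipped (a sketch; the function name is illustrative):

```python
def repetition3_bit_error(p):
    """Residual bit-error probability of a 3x repetition code with
    majority voting: wrong when 2 or 3 of the 3 copies are flipped."""
    return 3 * p**2 * (1 - p) + p**3

# At p = 0.1 the bit-error probability drops from 0.1 to 0.028, but the
# rate drops to 1/3. Repeating more times drives the error down only by
# pushing the rate toward zero, never achieving vanishing error at a
# fixed positive rate, which is what Shannon's theorem guarantees is
# possible for any rate below capacity.
residual = repetition3_bit_error(0.1)
```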

Mathematical statement

Theorem (Shannon, 1948):

1. For every discrete memoryless channel, the channel capacity

C = max over input distributions P_X of I(X; Y)

has the following property. For any ε > 0 and R < C, for large enough N, there exists a code of length N and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε.

2. If a probability of bit error p_b is acceptable, rates up to R(p_b) are achievable, where

R(p_b) = C / (1 − H_2(p_b))

and H_2(p_b) is the binary entropy function

H_2(p_b) = p_b log_2(1/p_b) + (1 − p_b) log_2(1/(1 − p_b))

3. For any p_b, rates greater than R(p_b) are not achievable.

(MacKay (2003), p. 162; cf Gallager (1968), ch.5; Cover and Thomas (1991), p. 198; Shannon (1948) thm. 11)
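
Part 2 of the theorem can be evaluated numerically. A minimal sketch of the rate–distortion trade-off R(p_b) = C / (1 − H_2(p_b)) (function names are illustrative):

```python
import math

def binary_entropy(p):
    """The binary entropy function H_2(p) from the theorem statement."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def max_rate(capacity, p_b):
    """Part 2 of the theorem: rates up to C / (1 - H_2(p_b)) are
    achievable when a residual bit-error probability p_b is tolerated."""
    return capacity / (1 - binary_entropy(p_b))

# Tolerating no bit errors (p_b = 0) recovers the capacity itself;
# tolerating some errors allows rates above capacity.
r = max_rate(0.5, 0.1)
```

As p_b approaches 1/2, H_2(p_b) approaches 1 and R(p_b) diverges: if the receiver is allowed to be wrong half the time, any rate is "achievable", since a coin flip already meets the target.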

Outline of Proof

As with several other major results in information theory, the proof of the noisy-channel coding theorem includes an achievability result and a matching converse result. These two components bound the set of rates at which one can communicate over a noisy channel, and the fact that they match shows that the bound is tight.


The following outlines present only one of the many different proof styles available for study in information theory texts.

Achievability for discrete memoryless channels

This style of proof follows Shannon's random-coding argument. Fix an input distribution p(x) achieving the capacity C, and generate 2^(nR) codewords of length n independently at random according to p(x). The receiver decodes the received sequence to the unique codeword with which it is jointly typical, declaring an error if no such codeword, or more than one, exists. By the asymptotic equipartition property and the union bound, the probability of error averaged over all random codes tends to 0 as n goes to infinity whenever R < C, so at least one code of rate R with arbitrarily small probability of error must exist.

Converse for discrete memoryless channels

Suppose a code of 2^(nR) codewords. Let W be drawn uniformly over this set as an index. Let X^n and Y^n be the transmitted codewords and received codewords, respectively.

1) nR = H(W) = H(W|Y^n) + I(W; Y^n), using identities involving entropy and mutual information
2) ≤ H(W|Y^n) + I(X^n(W); Y^n), since X^n is a function of W
3) ≤ 1 + P_e^(n) nR + I(X^n(W); Y^n), by the use of Fano's inequality
4) ≤ 1 + P_e^(n) nR + nC, by the fact that capacity is the maximized mutual information.

The result of these steps is that P_e^(n) ≥ 1 − 1/(nR) − C/R. As the block length n goes to infinity, P_e^(n) is bounded away from 0 if R is greater than C; we can only get arbitrarily low probabilities of error if R is less than C.
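
The final bound from the converse is easy to check numerically. A minimal sketch (the function name is illustrative):

```python
def converse_error_bound(capacity, rate, n):
    """Fano-based lower bound on block-error probability from the
    converse: P_e >= 1 - 1/(n*rate) - capacity/rate. The bound is
    informative (positive in the limit) only when rate > capacity."""
    return 1 - 1 / (n * rate) - capacity / rate

# For C = 0.5 and R = 0.6, the bound approaches 1 - C/R = 1/6 as n
# grows, so the error probability cannot be driven to zero at any
# rate above capacity, no matter how long the blocks are.
b = converse_error_bound(0.5, 0.6, 10**9)
```

For rates below capacity the bound is negative and therefore vacuous, consistent with the achievability half of the theorem.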

References

See also