There are two classes of codes:
# Source codes
# Channel codes
The first is source encoding, which attempts to compress the data from a source in order to transmit it more efficiently. We see this in practice every day on the Internet, where the common "Zip" data compression is used to reduce the network load and make files smaller. The second is channel encoding. This technique adds extra data bits, commonly called parity bits, to make the transmission of data more robust to disturbances present on the transmission channel. Many applications that the ordinary user is not aware of use channel encoding. A typical music CD uses powerful Reed–Solomon block codes (a non-binary class of BCH codes) to correct for scratches and dust. In this application the transmission channel is the CD itself. Cell phones also use powerful coding techniques to correct for the fading and noise of high-frequency radio transmission. Data modems, telephone transmission and, of course, NASA all employ powerful channel coding to get the bits through.
== Source Encoding ==
The aim of source encoding is to take the source data and make it smaller. FAX transmission, which has been around for many years, uses a simple run-length code. The principle is to recognize that most documents are white space with brief interruptions for the black typing. So FAX compresses a document by sending a repeat count that runs up to the next transition between white and black. It may tell the receiver, for example, that the next 100 pixels are white. Another common encoding technique is string compression, which is used for data files. The encoder has a dictionary of strings. It matches the incoming text to the strings in the dictionary and, when a match is found, sends the receiver a single number, which is the index of the string. All known implementations of string compression are adaptive in nature: the encoder can create new strings and convey them to the decoder so that the two dictionaries remain the same.
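The two techniques described above can be sketched in a few lines of Python. This is an illustration only, not the actual FAX or Zip format: the run-length coder works on a list of 0/1 pixels, and the string compressor is an LZW-style coder (an assumption; the text does not name a specific algorithm) limited to characters with codes below 256. In this variant the decoder rebuilds each new dictionary entry from the codes it receives rather than being sent the string explicitly. All function names are hypothetical.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse a scan line into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the scan line."""
    return [value for value, count in runs for _ in range(count)]

def lzw_encode(text):
    """Replace strings by dictionary indices, growing the dictionary as we go."""
    dictionary = {chr(i): i for i in range(256)}
    current, codes = "", []
    for ch in text:
        if current + ch in dictionary:
            current += ch                      # keep extending the match
        else:
            codes.append(dictionary[current])  # send index of longest match
            dictionary[current + ch] = len(dictionary)
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decode(codes):
    """Rebuild the same dictionary on the receiving side."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The encoder used an entry it created on the previous step.
            entry = previous + previous[0]
        pieces.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(pieces)

# One scan line: 0 = white, 1 = black (made-up data).
line = [0] * 100 + [1] * 3 + [0] * 20
print(rle_encode(line))  # [(0, 100), (1, 3), (0, 20)]

text = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(text)
print(len(text), "characters ->", len(codes), "codes")
```

Both coders are lossless: decoding the output of either encoder returns the original input.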
== Channel Encoding ==
The aim of channel encoding theory is to find codes which transmit quickly, contain many valid [[code word]]s and can correct or at least [[error detection|detect]] many errors. These aims conflict, however, so different codes are optimal for different applications. The properties needed of a code depend mainly on the probability of errors happening during transmission. In a typical CD, the impairment is mainly dust or scratches, so codes are used in an interleaved manner: the data is spread out over the disc.

Although not a very good code, a simple repeat code can serve as an understandable example. Suppose we take a block of data bits (representing sound) and send it three times. At the receiver we examine the three repetitions bit by bit and take a majority vote. The twist is that we don't merely send the bits in order; we interleave them. The block of data bits is first divided into 4 smaller blocks. Then we cycle through the blocks and send one bit from the first, then one from the second, and so on. This is done three times to spread the data out over the surface of the disc. In the context of the simple repeat code, this may not appear effective. However, more powerful codes are known which are very effective at correcting the "burst" error of a scratch or a dust spot when this interleaving technique is used.
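The repeat-and-interleave scheme just described can be sketched as follows. This is a toy model, not the coding actually used on a CD: the 16-bit block, the burst position and all function names are made up for illustration, and the block length is assumed to be a multiple of 4.

```python
def interleave(bits, depth=4):
    """Split bits into `depth` equal sub-blocks and emit one bit from each in turn."""
    assert len(bits) % depth == 0
    size = len(bits) // depth
    blocks = [bits[i * size:(i + 1) * size] for i in range(depth)]
    return [blocks[b][i] for i in range(size) for b in range(depth)]

def deinterleave(bits, depth=4):
    """Undo interleave(): every depth-th bit belongs to the same sub-block."""
    return [bit for b in range(depth) for bit in bits[b::depth]]

def majority_decode(received, copies=3, depth=4):
    """De-interleave each copy, then take a bit-by-bit majority vote."""
    n = len(received) // copies
    votes = [deinterleave(received[c * n:(c + 1) * n], depth) for c in range(copies)]
    return [1 if 2 * sum(column) > copies else 0 for column in zip(*votes)]

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
sent = interleave(data) * 3          # the interleaved block, sent three times

received = sent.copy()
for i in range(5, 13):               # a "scratch": burst of 8 flipped bits
    received[i] ^= 1

print(majority_decode(received) == data)  # True
```

Here the burst falls inside a single copy, so each data bit is wrong in at most one of its three repetitions and the vote recovers it. After de-interleaving, the 8 damaged bits are scattered across all four sub-blocks instead of sitting side by side, which is the property that more powerful codes exploit.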
Algebraic coding theory is divided into two major types of codes:
# Linear block codes
# Convolutional codes
It analyzes the following three properties of a code:
* the minimum [[Hamming distance]] between two valid code words
Linear block codes have the property of [[linearity]], i.e. the sum of any two code words is also a code word, and they are applied to the source bits in blocks; hence the name linear block codes. Although linearity is not a requirement, it is difficult to prove that a code is a good one without this property.
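Linearity and the minimum Hamming distance can both be checked by brute force for a small code. The sketch below uses the binary (7,4) Hamming code in one common systematic form; the generator matrix `G` and the helper names are choices made for this example, not something fixed by the text above.

```python
from itertools import product

# Generator matrix of a (7,4) Hamming code: 4 message bits, then 3 parity bits.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(message, generator=G):
    """Multiply a k-bit message by the generator matrix over GF(2)."""
    n = len(generator[0])
    return tuple(
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    )

def hamming_distance(a, b):
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

def add(a, b):
    """Bitwise sum of two words over GF(2), i.e. XOR."""
    return tuple(x ^ y for x, y in zip(a, b))

codewords = {encode(m) for m in product([0, 1], repeat=4)}

# Linearity: the sum of any two code words is again a code word.
is_linear = all(add(a, b) in codewords for a in codewords for b in codewords)

# Minimum distance by exhaustive comparison of all pairs.
d_min = min(hamming_distance(a, b) for a in codewords for b in codewords if a != b)

# For a linear code this equals the smallest weight of a nonzero code word.
min_weight = min(sum(c) for c in codewords if any(c))

print(len(codewords), is_linear, d_min, min_weight)  # 16 True 3 3
```

The last comparison shows why linearity is convenient: instead of comparing every pair of code words, it is enough to find the lightest nonzero code word.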
=== Linear Block Codes ===
Any linear block code is represented as <math>(n,k,d_{min})</math>, where
There are many types of linear block codes, such as:
# Cyclic
# [[Repetition
# [[Parity
# Reed–Solomon
# BCH
# Reed–Muller
# Perfect codes
Block codes are tied to the "penny packing" problem, which has received some attention over the years. In two dimensions, it is easy to visualize. Take a bunch of pennies flat on the table and push them together. The result is a hexagonal pattern like a bee's nest. But block codes rely on more dimensions, which cannot easily be visualized. The powerful Golay code used in deep space communications uses 24 dimensions. If used as a binary code (which it usually is), the dimensions refer to the length of the codeword as defined above.
The theory of coding uses the ''N''-dimensional sphere model. For example, how many pennies can be packed into a circle on a tabletop or, in 3 dimensions, how many marbles can be packed into a globe? Other considerations enter the choice of a code. For example, hexagonal packing within the constraint of a rectangular box will leave empty space at the corners. As the dimensions get larger, the percentage of empty space grows smaller. But at certain dimensions, the packing uses all the space, and these codes are the so-called perfect codes. There are very few of these codes.
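For binary codes, "the packing uses all the space" has a precise meaning: the spheres of radius <math>t = \lfloor (d_{min}-1)/2 \rfloor</math> around the <math>2^k</math> code words exactly fill the set of all <math>2^n</math> words. A small Python check of this condition, with hypothetical function names, might look like this:

```python
from math import comb

def sphere_volume(n, t):
    """Number of binary words of length n within Hamming distance t of a given word."""
    return sum(comb(n, i) for i in range(t + 1))

def is_perfect(n, k, d_min):
    """True if the radius-t spheres around all 2**k code words fill the whole space."""
    t = (d_min - 1) // 2
    return 2**k * sphere_volume(n, t) == 2**n

print(is_perfect(7, 4, 3))    # True  (Hamming code)
print(is_perfect(23, 12, 7))  # True  (binary Golay code)
print(is_perfect(24, 12, 8))  # False (extended Golay code)
```

The 24-dimensional extended Golay code is therefore not itself perfect; it is the 23-dimensional code, with one parity bit fewer, that fills the space exactly.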
Another item which is often overlooked is the number of neighbors a single codeword may have. Again, let's use pennies as an example. First we pack the pennies in a rectangular grid. Each penny will have 4 near neighbors (and 4 at the corners which are farther away). In a hexagon, each penny will have 6 near neighbors. When we increase the dimensions, the number of near neighbors increases very rapidly.
The result is that the number of ways for noise to make the receiver choose
a neighbor (hence an error) grows as well. This is a fundamental limitation
Line 83 ⟶ 60:
total error probability actually suffers.
=== Convolution
Convolutional codes are used in voiceband modems (V.32, V.17, V.34) and in GSM mobile phones. Additionally, they are widely used in satellite and military communications.
Here the idea is to make every codeword symbol the weighted sum of the various input message symbols. This is like the [[convolution]] used in [[linear time invariant|LTI]] systems to find the output of a system when you know the input and the impulse response.
The output of the convolutional encoder is thus the convolution of the input bits with the states of the encoder's registers.
Fundamentally, convolutional codes do not offer more protection against noise than an equivalent block code. In many cases, they offer greater simplicity of implementation than a block code of equal power. The encoder is usually a simple circuit which has state memory and some feedback logic, normally XOR gates. The decoder can be implemented in software or firmware.
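A minimal software model of such an encoder is sketched below. It assumes a rate-1/2 code with constraint length 3 and generator polynomials 7 and 5 (octal), a common textbook example rather than the code of any particular modem or phone standard, and it is a feedforward design with no feedback path. The function name and parameters are hypothetical.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate-1/len(generators) feedforward convolutional encoder.

    The shift register holds the newest input bit plus the previous
    constraint_length - 1 bits; each output bit is the XOR (parity) of the
    register positions selected by one generator polynomial.
    """
    mask = (1 << constraint_length) - 1
    register = 0                      # encoder starts in the all-zero state
    out = []
    for bit in bits:
        register = ((register << 1) | bit) & mask
        for g in generators:
            out.append(bin(register & g).count("1") % 2)
    return out

print(conv_encode([1, 0, 1, 1]))  # [1, 1, 1, 0, 0, 0, 0, 1]
```

Each input bit produces two output bits, and each output depends on the current bit and the two before it, which is the "weighted sum of input symbols" described above carried out modulo 2.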
The [[Viterbi algorithm]] is the optimum algorithm used to decode convolutional codes. There are simplifications that reduce the computational load by searching only the most likely paths. Although not optimum, they have generally been found to give good results in lower-noise environments. Modern microprocessors are capable of implementing these reduced-search algorithms at a rate greater than 4000 codewords/s.
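As an illustration, here is a compact hard-decision Viterbi decoder for a small rate-1/2 code with generators 7 and 5 (octal). The code choice, the function names and the convention that every message ends with two zero "tail" bits (so the encoder finishes in state 0) are assumptions of this sketch. It performs the full search over all four states, not one of the reduced searches mentioned above.

```python
def branch(state, bit):
    """One encoder step: return (next_state, two output bits).

    `state` holds the two previous input bits; the generators are 7 and 5 octal.
    """
    register = (bit << 2) | state
    output = (
        bin(register & 0b111).count("1") % 2,
        bin(register & 0b101).count("1") % 2,
    )
    return register >> 1, output

def conv_encode(bits):
    """Encode from the all-zero state."""
    state, out = 0, []
    for bit in bits:
        state, pair = branch(state, bit)
        out.extend(pair)
    return out

def viterbi_decode(received):
    """Return the message whose code word is closest (Hamming) to `received`.

    Assumes the message ended with two zero tail bits, so the survivor
    ending in state 0 is chosen.
    """
    infinity = float("inf")
    metrics = [0, infinity, infinity, infinity]   # start in state 0
    paths = [[], [], [], []]
    for i in range(0, len(received), 2):
        pair = received[i:i + 2]
        new_metrics = [infinity] * 4
        new_paths = [None] * 4
        for state in range(4):
            if metrics[state] == infinity:
                continue
            for bit in (0, 1):
                nxt, out = branch(state, bit)
                cost = metrics[state] + (out[0] != pair[0]) + (out[1] != pair[1])
                if cost < new_metrics[nxt]:       # keep the better survivor
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    return paths[0]

message = [1, 0, 1, 1, 0, 0]          # last two bits are the zero tail
received = conv_encode(message)
received[3] ^= 1                      # two channel errors
received[8] ^= 1
print(viterbi_decode(received) == message)  # True
```

Only one survivor path per state is kept at each step, so the work grows linearly with the message length instead of doubling with every bit.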
== Applications of
Another concern of coding theory is designing codes that help [[synchronization]]. A code may be designed so that a [[phase shift]] can be easily detected and corrected and so that multiple signals can be sent on the same channel. There is an interesting class of codes we see every day on our cell phones. These are the Code Division Multiple Access (CDMA) codes. The details are beyond the scope of this discussion but, briefly, each phone is assigned a code word from a special class (algebraic field). When transmitting, the code word is used to scramble the bits representing the voice message. At the receiver, a descrambling process is done to decipher the message. The properties of this class of code words allow many users (with different codes) to use the same radio channel at the same time. The receiver, using the descrambling, will only "hear" other callers as low-level "noise".
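The scrambling and descrambling idea can be illustrated with a toy model. The sketch below assumes perfectly synchronized users and length-4 Walsh codes, which are mutually orthogonal; these are simplifications chosen for the example, and the names `spread`, `despread` and `WALSH` are hypothetical. Real systems use much longer codes and have to cope with timing offsets.

```python
# Four mutually orthogonal Walsh codes of length 4 (chips are +1 or -1).
WALSH = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

def spread(bits, code):
    """Map each bit to +1/-1 and multiply it by every chip of the user's code."""
    return [(1 if b else -1) * chip for b in bits for chip in code]

def despread(signal, code):
    """Correlate each code-length slice of the channel with the user's code."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        correlation = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if correlation > 0 else 0)
    return bits

user_a = [1, 0, 1]
user_b = [0, 0, 1]

# Both users transmit at the same time; the channel simply adds the signals.
channel = [a + b for a, b in zip(spread(user_a, WALSH[1]), spread(user_b, WALSH[2]))]

print(despread(channel, WALSH[1]))  # [1, 0, 1]
print(despread(channel, WALSH[2]))  # [0, 0, 1]
```

With exactly orthogonal codes the other user contributes nothing to the correlation. With codes that are only nearly orthogonal, the other users show up as the low-level noise described above.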
Another popular class of codes are the Automatic Repeat reQuest (ARQ) codes. In this general class, the transmitter adds parity check bits to a longer message. The receiver checks the parity bits against the message and, if there is not a match, asks the transmitter to retransmit the message. Almost all wide area networks ([[WAN]]) and protocols, except for the very simple ones, use ARQ retransmission. Common protocols include SDLC (IBM), TCP (Internet), X.25 (International) and many others.

There is an extensive field of research on this topic because of the problem of matching a rejected packet against a new packet. Is it a new one or is it a retransmission? Typically numbering schemes have been used, although in some networks the packet may have another identifier, or it may be left to higher layers to request retransmission. TCP/IP is a good example of a protocol that supports both techniques. In a connected scenario, TCP/IP leaves the retransmission to the network, and thus it uses ARQ coding. In a connectionless network, ARQ is not used. Instead, it is up to the application to examine the packet and request retransmission as needed. This may go as high up as requiring the user to hit the "refresh" button on a browser. But even this is still in the class of ARQ research; the user just has to become involved.
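A stop-and-wait version of this exchange, including the "new packet or retransmission?" question, can be simulated in a few lines. The sketch is not modeled on SDLC, TCP or X.25: it assumes a single even-parity bit as the check (which only catches an odd number of bit errors), a 1-bit sequence number, and a scripted `faults` table standing in for the channel. All names are hypothetical.

```python
def add_parity(bits):
    """Append one even-parity bit to the frame."""
    return bits + [sum(bits) % 2]

def parity_ok(frame):
    """A frame is accepted if it contains an even number of ones."""
    return sum(frame) % 2 == 0

class Receiver:
    def __init__(self):
        self.expected = 0        # sequence bit of the next new packet
        self.delivered = []

    def receive(self, frame):
        """Return True to acknowledge, False to request retransmission."""
        if not parity_ok(frame):
            return False
        seq, payload = frame[0], frame[1:-1]
        if seq == self.expected:             # a new packet
            self.delivered.append(payload)
            self.expected ^= 1
        # Otherwise it is a retransmission of a packet already accepted:
        # acknowledge it again, but do not deliver it twice.
        return True

def send_all(messages, faults, max_tries=5):
    """Send each message until the sender sees an acknowledgement.

    `faults` maps (message index, attempt) to "corrupt" or "ack_lost".
    """
    receiver = Receiver()
    transmissions = 0
    for index, payload in enumerate(messages):
        for attempt in range(max_tries):
            frame = add_parity([index % 2] + payload)
            fault = faults.get((index, attempt))
            if fault == "corrupt":
                frame[1] ^= 1                # single bit error on the channel
            transmissions += 1
            acknowledged = receiver.receive(frame)
            if acknowledged and fault != "ack_lost":
                break                        # sender saw the ACK, move on
        else:
            raise RuntimeError("gave up after max_tries attempts")
    return receiver.delivered, transmissions

messages = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]
faults = {(0, 0): "corrupt", (1, 0): "ack_lost"}
delivered, transmissions = send_all(messages, faults)
print(delivered == messages, transmissions)  # True 5
```

The second message is received correctly the first time, but its acknowledgement is lost, so the sender repeats it. The sequence bit is what lets the receiver recognize the repeat and avoid delivering the same packet twice.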