Prefix code: Difference between revisions

Content deleted Content added
No edit summary
Line 12:
 
==Techniques==
If every word in the code has the same length, the code is called a '''fixed-length code''', or a '''block code''' (though the term [[block code]] is also used for fixed-size [[error-correcting code]]s in [[channel coding]]). For example, [[ISO 8859-15]] letters are always 8 bits long. [[UTF-32/UCS-4]] letters are always 32 bits long. [[Asynchronous Transfer Mode|ATM packets]] are always 424 bits long. A blockfixed-length code of fixed length ''k'' bits can encode up to <math>2^{k}</math> source symbols.
 
PrefixesA cannotfixed-length existcode inis necessarily a prefix code. It is possible to turn any code into a fixed-length code withoutby padding fixed codessymbols to the shorter prefixes in order to meet the length of the longest prefixes. (howeverAlternately, such padding codes may be selectedemployed to introduce redundancy that allows autocorrection and/or synchronisation). However, fixed length encodings are inefficient in situations where some words are much more likely to be transmitted than others (in which case some or all of the redundancy may be eliminated for data compression).
 
[[Truncated binary encoding]] is a straightforward generalization of blockfixed-length codes to deal with cases where the number of symbols ''n'' is not a power of two. Source symbols are assigned codewords of length ''k'' and ''k''+1. where <math>2^{k} < n < 2^{k+1}</math>.
 
[[Huffman coding]] is a more sophisticated technique for constructing variable-length prefix codes. The Huffman coding algorithm takes as input the frequencies that the code words should have, and constructs a prefix code that minimizes the weighted average of the code word lengths. This is a form of [[lossless data compression]] based on [[entropy encoding]].