Huffman coding
For a set of symbols whose cardinality is a power of two and whose probability distribution is uniform, Huffman coding is equivalent to simple binary block encoding.
 
== Basic technique ==
 
The technique works by creating a [[binary tree]] of symbols (a code sketch follows the list below):
#Start with as many trees as there are symbols.
#While there is more than one tree:
##Find the two trees with the smallest total weight.
##Combine the trees into one, setting one as the left child and the other as the right.
#Now the tree contains all the symbols. A '0' represents following the left child; a '1' represents following the right child.
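To make these steps concrete, here is a minimal sketch in Python; the function name <code>huffman_code</code> and its interface are illustrative, not part of any standard library:

<syntaxhighlight lang="python">
import heapq
from itertools import count

def huffman_code(weights):
    """Return a {symbol: bitstring} prefix code for a {symbol: weight} dict.

    A minimal sketch of the algorithm above; assumes symbols themselves
    are not tuples, since tuples are used here for internal tree nodes.
    """
    tie = count()  # tie-breaker so the heap never compares trees directly
    # Step 1: start with as many single-leaf trees as there are symbols.
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    # Step 2: while more than one tree remains, merge the two lightest.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    # Step 3: read codes off the tree: '0' = left child, '1' = right child.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                              # leaf node: a symbol
            codes[node] = prefix or "0"    # lone-symbol edge case
    walk(heap[0][2], "")
    return codes
</syntaxhighlight>

For example, <code>huffman_code({'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125})</code> assigns codes of lengths 1, 2, 3 and 3, which is optimal for these power-of-two probabilities.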
 
== Variations ==
Many variations of Huffman coding exist.
The frequencies used can be generic ones for the application ___domain, based on average experience, or they can be the actual frequencies found in the text being compressed.
(The latter variation requires that a [[frequency table]] or some other hint about the encoding be stored with the compressed text; implementations employ various tricks to store these tables efficiently.)
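Counting the actual frequencies from the text itself might look like this, reusing the illustrative <code>huffman_code</code> function sketched above:

<syntaxhighlight lang="python">
from collections import Counter

text = "this is an example of a huffman tree"
weights = Counter(text)                      # actual per-symbol frequencies
codes = huffman_code(weights)                # sketch from the section above
encoded = "".join(codes[ch] for ch in text)
# The decoder needs the code table (or the frequency table it was built
# from), which is why it must be stored alongside the compressed text.
</syntaxhighlight>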
A variation called "adaptive Huffman coding" calculates the frequencies dynamically, based on recent actual frequencies in the source string; this is somewhat related to the [[LZ77|LZ]] family of algorithms.
 
Most often, the weights used in implementations of Huffman coding represent numeric probabilities, but the algorithm given above does not require this; it requires only a way to order weights and to add them.
The Huffman template algorithm allows the use of non-numerical weights (costs, frequencies).
For example, see http://alexvn.freeservers.com/s1/huffman_template_algorithm.html
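Since the sketch above only compares weights (via the heap) and adds them with <code>+</code>, it already acts as a template over the weight type; a cost-like weight, invented here purely for illustration, might look like this:

<syntaxhighlight lang="python">
from dataclasses import dataclass, field

@dataclass(order=True)
class Cost:
    """Illustrative weight: ordered and added by time; the label tags along."""
    time: float
    label: str = field(default="", compare=False)

    def __add__(self, other):
        return Cost(self.time + other.time, f"({self.label}+{other.label})")

# Works unchanged with the huffman_code sketch, which needs only < and +.
codes = huffman_code({'a': Cost(4, 'a'), 'b': Cost(2, 'b'), 'c': Cost(1, 'c')})
</syntaxhighlight>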
 
The ''n''-ary Huffman algorithm uses the {0, 1, ..., ''n''−1} alphabet to encode the message, and the tree it builds is ''n''-ary: at each step the ''n'' lowest-weight trees are merged (zero-weight placeholder symbols may be added first so that the final tree is full).
 
Extreme cases of Huffman codes are connected with [[Fibonacci number]]s: when the weights are consecutive Fibonacci numbers, each merge sum is itself the next Fibonacci number, so every merge absorbs the tree built so far, producing the deepest possible code tree. For example, see http://mathforum.org/discuss/sci.math/t/207334.
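This can be checked with the <code>huffman_code</code> sketch above:

<syntaxhighlight lang="python">
# Consecutive Fibonacci weights give a maximally skewed (deepest) tree.
fib_weights = dict(zip("abcdef", [1, 1, 2, 3, 5, 8]))
codes = huffman_code(fib_weights)
print(sorted(len(c) for c in codes.values()))   # [1, 2, 3, 4, 5, 5]
</syntaxhighlight>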
 
Huffman coding is optimal when the probability of each input symbol is a power of two.
Prefix-free codes tend to have a slight inefficiency on small alphabets, where probabilities often fall between powers of two.
[[Arithmetic coding]] produces slight gains over Huffman coding, but in practice these gains have not been large enough to offset arithmetic coding's higher computational complexity and patent royalties ([[As of 2001|as of November 2001]], IBM owns patents on the core concepts of arithmetic coding in several jurisdictions).
In addition, alternating between Huffman codes for symbols in even and odd positions can, in effect, allocate fractional numbers of bits per symbol, improving results on small alphabets.
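To see how large the gap can be, compare the [[information entropy|entropy]] of a skewed two-symbol source with the one bit per symbol a Huffman code must spend; a small sketch, again using the illustrative <code>huffman_code</code> function:

<syntaxhighlight lang="python">
import math

weights = {'a': 0.99, 'b': 0.01}
entropy = -sum(p * math.log2(p) for p in weights.values())
codes = huffman_code(weights)                    # both codes have length 1
avg_len = sum(p * len(codes[s]) for s, p in weights.items())
print(f"entropy = {entropy:.3f} bits/symbol")    # about 0.081
print(f"Huffman = {avg_len:.3f} bits/symbol")    # exactly 1.000
</syntaxhighlight>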
 
== Applications ==
Huffman coding today is often used as a "back-end" to some other compression method.
[[DEFLATE]] ([[PKZIP]]'s algorithm) and multimedia [[codec]]s such as [[JPEG]] and [[MP3]] use a front-end model and [[quantization]] step followed by Huffman coding.
 