#Now the tree contains all the symbols. A '0' represents following the left child; a '1' represents following the right child.
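The construction just described can be sketched in Python. This is a minimal illustration, not a production encoder; the function name and the tie-breaking counter are my own, not part of the algorithm's definition:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman tree and read off codes: '0' = left, '1' = right."""
    freq = Counter(text)
    # Heap entries are (weight, tiebreak, node); a leaf node is a symbol,
    # an internal node is a (left, right) pair.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lightest subtrees.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        tie += 1
        heapq.heappush(heap, (w1 + w2, tie, (left, right)))
    # Now the tree contains all the symbols; walk it to collect the codes.
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # following the left child appends '0'
            walk(node[1], prefix + "1")  # following the right child appends '1'
        else:
            codes[node] = prefix or "0"  # lone-symbol input still gets one bit
    walk(heap[0][2], "")
    return codes
```

For the input `"aaaabbc"` this assigns the shortest code to the most frequent symbol, `a`.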
== Main properties ==
The frequencies used can be generic ones for the application ___domain that are based on average experience, or they can be the actual frequencies found in the text being compressed.
(This variation requires that a [[frequency table]] or some other hint about the encoding be stored with the compressed text; implementations employ various tricks to store these tables efficiently.)
Huffman coding is optimal when the probability of each input symbol is an integral power of 1/2 (1/2, 1/4, 1/8, ...), since each codeword must occupy a whole number of bits. Prefix-free codes tend to have slight inefficiency on small alphabets, where probabilities often fall between powers of 1/2. Expanding the alphabet size by coalescing multiple symbols into "words" before Huffman coding can reduce this inefficiency.
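This condition can be checked numerically. The sketch below (helper names are illustrative) relies on the fact that a Huffman code's expected length equals the sum of the weights of all merged internal nodes:

```python
import heapq
import math

def huffman_cost(probs):
    """Expected bits/symbol of a binary Huffman code over these
    probabilities: the sum of the weights of every merged node."""
    heap = list(probs)
    heapq.heapify(heap)
    cost = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        cost += a + b          # each merge adds one bit to every leaf below it
        heapq.heappush(heap, a + b)
    return cost

def entropy(probs):
    """Shannon entropy: the theoretical lower bound in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs)

dyadic = [0.5, 0.25, 0.25]          # powers of 1/2: Huffman meets the bound
skewed = [0.9, 0.1]                 # non-dyadic: a gap remains
paired = [0.81, 0.09, 0.09, 0.01]   # the same source coalesced into 2-symbol "words"
```

Coding the skewed source costs 1 bit/symbol against an entropy of about 0.47 bits; coding pairs of symbols instead brings the rate down to about 0.645 bits/symbol, illustrating the "words" trick.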
Extreme cases of Huffman codes are connected with [[Fibonacci number]]s.
[[Arithmetic coding]] produces slight gains over Huffman coding, but in practice these gains have not been large enough to offset arithmetic coding's higher computational complexity and [[patent]] [[royalty|royalties]]. ([[As of November 2001]], [[IBM]] owns patents on the core concepts of arithmetic coding in several jurisdictions.)
== Variations ==
=== Adaptive Huffman coding ===
A variation called '''adaptive Huffman coding''' updates the symbol frequencies dynamically, based on the recent contents of the source string, so the code adapts as the input's statistics change. This is somewhat related to the [[LZ77|LZ]] family of algorithms.
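The published adaptive coders (the FGK algorithm, Vitter's algorithm) update the tree incrementally. Purely as a sketch of the idea, the toy encoder below instead rebuilds the code from scratch after every symbol, with both sides starting from uniform counts so that no frequency table needs to be transmitted; the names and structure are illustrative, not the published algorithms:

```python
import heapq

def build_codes(freq):
    """One-shot Huffman code construction from a frequency dict."""
    heap = [(w, i, s) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        tie += 1
        heapq.heappush(heap, (w1 + w2, tie, (a, b)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

def adaptive_encode(text, alphabet):
    """Naive adaptive scheme: rebuild the code after every symbol.
    (Real adaptive Huffman coders update the tree incrementally.)"""
    counts = {s: 1 for s in alphabet}   # uniform prior, known to both sides
    out = []
    for sym in text:
        out.append(build_codes(counts)[sym])
        counts[sym] += 1                # the decoder makes the same update
    return "".join(out)
```

After the first `a` of `"aaaa"` over the alphabet `"abc"`, the code adapts and every later `a` costs only one bit.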
=== Huffman Template algorithm ===
Most often, the weights used in implementations of Huffman coding represent numeric probabilities, but the algorithm given above does not require this; it requires only a way to order weights and to add them. The '''Huffman Template algorithm''' exploits this to allow non-numerical weights (costs, frequencies).
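A small sketch of that generality (the `Weight` class is a made-up example, not from any published source): any type serves as a weight as long as it supports `<` for ordering and `+` for combining, and the merging loop never touches the weights in any other way:

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Weight:
    """Illustrative non-numerical weight: compared lexicographically
    (order=True), added componentwise via __add__."""
    freq: int
    cost: int   # e.g. some secondary cost of a subtree (hypothetical)

    def __add__(self, other):
        return Weight(self.freq + other.freq, self.cost + other.cost)

def huffman_merge_order(weights):
    """Run the Huffman merging loop; only '<' (inside the heap) and '+'
    are ever applied to the weights. Returns the merged weights in order."""
    heap = [(w, i) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    tie = len(heap)
    merges = []
    while len(heap) > 1:
        (w1, _), (w2, _) = heapq.heappop(heap), heapq.heappop(heap)
        merged = w1 + w2
        merges.append(merged)
        tie += 1
        heapq.heappush(heap, (merged, tie))
    return merges
```

Swapping in a different weight type changes what the resulting code optimizes without touching the algorithm itself.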
=== n-ary Huffman coding ===
The '''n-ary Huffman''' algorithm uses the {0, 1, ..., n-1} alphabet to encode messages and build an n-ary tree.
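A sketch of the n-ary construction, assuming the standard device of padding with zero-weight dummy symbols so that (number of symbols − 1) is divisible by (n − 1), letting every merge take exactly n subtrees (the function name is illustrative):

```python
import heapq

def nary_huffman_lengths(freqs, n):
    """Code lengths for an n-ary Huffman code over digits {0, ..., n-1}."""
    # Heap entries: (weight, tiebreak, list of (symbol, depth) leaves).
    heap = [(w, i, [(s, 0)]) for i, (s, w) in enumerate(freqs.items())]
    tie = len(heap)
    # Pad with zero-weight dummies until (#symbols - 1) % (n - 1) == 0.
    while (len(heap) - 1) % (n - 1) != 0:
        tie += 1
        heap.append((0, tie, []))   # dummy symbol carries no real leaves
    heapq.heapify(heap)
    while len(heap) > 1:
        total, leaves = 0, []
        for _ in range(n):          # merge the n lightest subtrees
            w, _, ls = heapq.heappop(heap)
            total += w
            leaves += [(s, d + 1) for s, d in ls]   # one level deeper
        tie += 1
        heapq.heappush(heap, (total, tie, leaves))
    return dict(heap[0][2])
```

For three equiprobable symbols and n = 3 no padding is needed and every symbol gets a one-digit code; with four symbols one dummy is added first.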
== Applications ==