Huffman coding

It was developed by [[David A. Huffman]] and published in [[1952]] in the paper ''A Method for the Construction of Minimum-Redundancy Codes''.
 
The basic idea is borrowed from an older and slightly less efficient method called [[Shannon-Fano coding]].
 
The text to be compressed is treated as a [[string]] of symbols.
 
Huffman coding always produces an optimal [[prefix code]] for the given symbol frequencies; it matches the theoretical entropy limit exactly when the probabilities of the input symbols are negative powers of two (1/2, 1/4, 1/8, and so on).
[[Data compression/Arithmetic coding|Arithmetic coding]] produces slight gains over Huffman coding, but in practice these gains have not been large enough to offset arithmetic coding's higher computational complexity and patent royalties (as of November 2001, IBM owns patents on the core concepts of arithmetic coding in several jurisdictions).
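
For example, a source with four symbols of probabilities 1/2, 1/4, 1/8 and 1/8 receives code words of lengths 1, 2, 3 and 3, for an average of (1/2)·1 + (1/4)·2 + (1/8)·3 + (1/8)·3 = 1.75 bits per symbol, which is exactly the [[information entropy|entropy]] of that source.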
 
Huffman coding works by creating a [[binary tree]] of symbols: the two least frequent symbols (or subtrees) are repeatedly merged into a new subtree whose frequency is the sum of theirs, until a single tree remains; each symbol's code word is then read off as the sequence of left/right branches on the path from the root down to its leaf.
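
A minimal sketch of the construction in Python, assuming the symbol frequencies have already been counted (the helper name <code>huffman_code</code> is chosen here for illustration, not taken from any particular library); a priority queue gives the usual O(''n'' log ''n'') running time:

<pre>
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a Huffman code from a {symbol: frequency} mapping.

    Sketch only; assumes at least two distinct symbols.
    Returns a dict mapping each symbol to its bit string.
    """
    tiebreak = count()  # unique counter so the heap never compares trees
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node: branch on 0/1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf: record the code word
            codes[node] = prefix
    walk(root, "")
    return codes

# "abracadabra" has frequencies a:5, b:2, r:2, c:1, d:1
print(huffman_code({"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}))
</pre>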
A variation called "adaptive Huffman coding" updates the code dynamically, basing the symbol frequencies on what has actually appeared recently in the source string. This is somewhat related to the [[LZ77|LZ]] family of algorithms.
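
Real adaptive Huffman coders (the FGK and Vitter algorithms) adjust the code tree incrementally after every symbol so that encoder and decoder stay synchronized. As a rough illustration only, the following Python sketch approximates the idea by tracking a sliding window of recent symbols and emitting a fresh frequency table at block boundaries, from which the caller would rebuild the code; the window and block sizes here are arbitrary assumptions:

<pre>
from collections import Counter, deque

def block_adaptive_model(symbols, window=4096, block=256):
    """Crude stand-in for adaptive Huffman coding: yield a table of
    recent symbol frequencies every `block` symbols.  A true adaptive
    coder (FGK/Vitter) instead updates the tree after every symbol."""
    recent = deque(maxlen=window)       # the deque drops old symbols itself
    for i, s in enumerate(symbols, 1):
        recent.append(s)
        if i % block == 0:
            yield Counter(recent)       # snapshot of recent frequencies
</pre>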
 
Extreme cases of Huffman codes are connected with [[Fibonacci number]]s: when the symbol frequencies are consecutive Fibonacci numbers, the tree degenerates into the deepest possible, maximally unbalanced shape. For example, see http://mathforum.org/discuss/sci.math/t/207334.
 
Huffman coding today is often used as a "back-end" to some other compression method.
[[DEFLATE]] ([[PKZIP]]'s algorithm) and multimedia codecs such as [[JPEG]] and [[MP3]] have a front-end model and quantization followed by Huffman coding.
 
The ''n''-ary Huffman algorithm uses the alphabet {0, 1, ..., ''n'' − 1} to encode the message; the tree it builds is accordingly an ''n''-ary tree.
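
A Python sketch of the ''n''-ary construction (the function name is illustrative, and it returns code-word lengths rather than full code words for brevity): dummy zero-frequency symbols are first added so that the number of items, minus one, is divisible by ''n'' − 1, which guarantees that every merge, including the final one at the root, combines exactly ''n'' subtrees:

<pre>
import heapq
from itertools import count

def nary_huffman_lengths(freqs, n):
    """Code-word lengths of an n-ary Huffman code over {0, ..., n-1}."""
    tie = count()                       # unique counter for heap tie-breaks
    heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
    while (len(heap) - 1) % (n - 1) != 0:
        heap.append((0, next(tie), {}))  # pad with dummy zero-frequency leaves
    heapq.heapify(heap)
    while len(heap) > 1:
        merged_f, merged = 0, {}
        for _ in range(n):              # merge the n least frequent subtrees
            f, _, depths = heapq.heappop(heap)
            merged_f += f
            for sym, d in depths.items():
                merged[sym] = d + 1     # each merged symbol moves one level deeper
        heapq.heappush(heap, (merged_f, next(tie), merged))
    return heap[0][2]

# Ternary (n = 3) example: five symbols, no padding needed since (5-1) % 2 == 0
print(nary_huffman_lengths({"a": 10, "b": 7, "c": 5, "d": 2, "e": 1}, 3))
</pre>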