Shannon–Fano coding

[[File:Fano-en.svg|thumb|300x300px|Simple example of encoding 6 symbols]]
In the field of [[data compression]], '''Shannon–Fano coding''', named after [[Claude Shannon]] and [[Robert Fano]], is a technique for constructing a [[prefix code]] based on a set of symbols and their probabilities (estimated or measured). It is [[Optimization (mathematics)|suboptimal]] in the sense that it does not achieve the lowest possible expected code word length, as [[Huffman coding]] does; however, unlike Huffman coding, it does guarantee that all code word lengths are within one bit of their theoretical ideal <math>-\log_2 P(x)</math>.{{cn|date=October 2016}}
 
The technique was proposed in Shannon's "[[A Mathematical Theory of Communication]]", his 1948 article introducing the field of [[information theory]]. The method was attributed to Fano, who later published it as a [[technical report]].<ref>{{harvnb|Fano|1949}}</ref>
Shannon–Fano coding should not be confused with [[Shannon coding]], the coding method used to prove [[Shannon's noiseless coding theorem]], or with [[Shannon–Fano–Elias coding]] (also known as Elias coding), the precursor to [[arithmetic coding]].
 
The algorithm produces fairly efficient variable-length encodings; when the two smaller sets produced by a partitioning are in fact of equal probability, the one bit of information used to distinguish them is used most efficiently. Unfortunately, Shannon–Fano does not always produce optimal prefix codes; the set of probabilities {0.35, 0.17, 0.17, 0.16, 0.15} is an example of one that will be assigned non-optimal codes by Shannon–Fano coding.
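The top-down partitioning described above can be sketched as follows (a minimal illustrative implementation in Python; the function and symbol names are illustrative and not taken from any reference implementation):

<syntaxhighlight lang="python">
def shannon_fano(symbols):
    """Assign Shannon–Fano codes to a list of (symbol, probability) pairs."""
    # Sort once by decreasing probability; slices of the sorted list stay sorted.
    symbols = sorted(symbols, key=lambda sp: sp[1], reverse=True)
    codes = {sym: "" for sym, _ in symbols}

    def split(group):
        # A group of one symbol needs no further bits.
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        running = 0.0
        best_i, best_diff = 1, float("inf")
        # Choose the split point that makes the two parts' total
        # probabilities as nearly equal as possible.
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_i, best_diff = i, diff
        left, right = group[:best_i], group[best_i:]
        for sym, _ in left:
            codes[sym] += "0"   # upper part gets a 0 bit
        for sym, _ in right:
            codes[sym] += "1"   # lower part gets a 1 bit
        split(left)
        split(right)

    split(symbols)
    return codes


# The probability set mentioned above: Shannon–Fano assigns code lengths
# {2, 2, 2, 3, 3} (expected length 2.31 bits), whereas a Huffman code
# achieves {1, 3, 3, 3, 3} (expected length 2.30 bits).
example = [("A", 0.35), ("B", 0.17), ("C", 0.17), ("D", 0.16), ("E", 0.15)]
print(shannon_fano(example))  # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
</syntaxhighlight>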
 
For this reason, Shannon–Fano is almost never used; [[Huffman coding]] is almost as computationally simple and produces prefix codes that always achieve the lowest expected code word length, under the constraint that each symbol is represented by a code formed of an integral number of bits. This constraint is often unneeded, since the codes will be packed end-to-end in long sequences. If we consider groups of codes at a time, symbol-by-symbol Huffman coding is only optimal if the probabilities of the symbols are [[Statistical independence|independent]] and are some power of a half, i.e., <math>\textstyle \frac{1}{2^n}</math>. In most situations, [[arithmetic coding]] can produce greater overall compression than either Huffman or Shannon–Fano, since it can encode in fractional numbers of bits, which more closely approximate the actual information content of the symbol. However, arithmetic coding has not superseded Huffman coding the way that Huffman coding supersedes Shannon–Fano, both because arithmetic coding is more computationally expensive and because it is covered by multiple patents.{{cn|date=March 2014}}
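For the example probability set above, the gap can be made concrete (a worked calculation, not a sourced result): the [[Entropy (information theory)|entropy]] of the source is
:<math>H = -\sum_i p_i \log_2 p_i \approx 2.23</math> bits per symbol,
whereas the Huffman code reaches an expected length of 2.30 bits and the Shannon–Fano code 2.31 bits; an arithmetic coder, not being limited to an integral number of bits per symbol, can approach the 2.23-bit figure.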
 
Shannon–Fano coding is used in the <tt>IMPLODE</tt> compression method, which is part of the [[Zip (file format)|<tt>ZIP</tt> file format]].<ref name="appnote">{{cite web
[[Category:Lossless compression algorithms]]
[[Category:Claude Shannon]]
 