→Compression vs. Delta-Compression.: new section |
GreenC bot (talk | contribs) Add {{reflist-talk}} to #Compression vs. Delta-Compression. (via reftalk bot) |
It is the total compression that matters when several compression algorithms are chained, that is, the combined effect of Compression1 + Compression2, not Compression2 alone. Comparing the compression gain of BOCU/SCSU with that of UTF-8/UTF-16 is unfair, because an SCSU/BOCU stream is already compressed.<ref>Technical Note 14, Unicode: A survey of Unicode compression, January 30, 2004. Using bzip2, a compressor which employs the Burrows-Wheeler algorithm, even absurdly inefficient formats—such as representing each character by its full Unicode name (e.g. LATIN CAPITAL LETTER A WITH CIRCUMFLEX)—could be reduced to almost the same size as more compact formats. Atkin and Stansifer demonstrate that block-sorting compression techniques eliminate most of the redundancy of the encoding format. The supporting data compares the compressibility of each encoding format with bzip2 and declares a “winner,” which is somewhat misleading since the paper attempts to show that all formats are about equally compressible. (In some cases, the “winner” was only 0.01% smaller than another format!) In contrast, the gzip compression tool, which uses LZ77, generally performed 15% to 25% better on natural-language, small-alphabet text encoded in SCSU or BOCU-1 than on the same text encoded in UTF-16. The authors claim that these differences are not significant, but they can hardly be considered negligible.</ref> Please correct if wrong. Thanks. [[Special:Contributions/175.157.246.242|175.157.246.242]] ([[User talk:175.157.246.242|talk]]) 18:48, 21 October 2015 (UTC)
{{reflist-talk}}
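One rough way to check the point about total compression is to measure the end-to-end size (encoding followed by a general-purpose compressor) rather than the second stage's ratio alone. The sketch below is only an illustration under stated assumptions: Python's standard library has no SCSU or BOCU-1 codec, so UTF-8 and UTF-16 stand in as the first stage, and the function name `total_sizes` and the sample sentence are hypothetical.

```python
import bz2
import zlib


def total_sizes(text, encodings=("utf-8", "utf-16-le")):
    """Return raw and compressed byte counts for each encoding of `text`.

    The number to compare across encodings is the final compressed size,
    not the ratio achieved by the second-stage compressor alone.
    """
    results = {}
    for enc in encodings:
        raw = text.encode(enc)  # stage 1: the encoding form
        results[enc] = {
            "raw": len(raw),
            "zlib": len(zlib.compress(raw, 9)),  # stage 2: DEFLATE (LZ77 + Huffman)
            "bz2": len(bz2.compress(raw, 9)),    # stage 2: Burrows-Wheeler based
        }
    return results


if __name__ == "__main__":
    # Hypothetical small-alphabet sample; real measurements need real corpora.
    sample = "Съешь же ещё этих мягких французских булок, да выпей чаю. " * 20
    for enc, sizes in total_sizes(sample).items():
        print(enc, sizes)
```

A format that looks poorly compressible in the second stage may still give the smaller final size, because its first stage already removed redundancy. Repeated sample text like the one above exaggerates the compressor's gain, so the printed figures are not evidence either way.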