Revision as of 14:10, 16 January 2006 by Suruena (m – for ranges)
Revision as of 19:46, 16 January 2006 by Kbolino (m →Summary of size issues: reword -- rm "eight-bit-clean environments" as I've never heard of such a thing (commented out))
Line 3:
==Summary of size issues==
UTF-32 requires four bytes to encode any character. Since characters outside the [[basic multilingual plane]] are rare, a document encoded in UTF-32 will usually be nearly twice as large as its UTF-16–encoded equivalent. On the other hand, UTF-8 uses between one and four bytes to encode a character: it needs fewer bytes than UTF-16 for U+0000 to U+007F, the same number for U+0080 to U+07FF and for characters outside the basic multilingual plane, and three bytes against two for U+0800 to U+FFFF. For seven-bit environments, UTF-7 clearly wins over the combination of other Unicode encodings with [[quoted printable]] or [[base64]]. <!--For eight-bit-clean environments things vary considerably depending on what code points are in the text.-->
==Considerations other than size==
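The per-character sizes discussed in the size summary above can be checked with a short script. This is a minimal sketch, not part of the article: the sample code points and the helper name `encoded_sizes` are illustrative assumptions, and the explicit big-endian codec names are chosen only so that no byte order mark is counted.

```python
# Sample code points, one per UTF-8 length class (illustrative choices).
SAMPLES = {
    "U+0041 (ASCII)": "\u0041",
    "U+00E9 (two-byte UTF-8 range)": "\u00e9",
    "U+20AC (rest of the BMP)": "\u20ac",
    "U+1D11E (outside the BMP)": "\U0001d11e",
}

# Explicit-endian codec names keep a byte order mark out of the output.
ENCODINGS = ("utf-8", "utf-16-be", "utf-32-be")


def encoded_sizes(ch):
    """Return {encoding name: byte count} for a single character."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


for label, ch in SAMPLES.items():
    print(label, encoded_sizes(ch))
```

With these samples, UTF-32 always takes four bytes, UTF-16 takes two bytes inside the basic multilingual plane and four outside it, and UTF-8 ranges from one to four bytes. U+20AC takes three bytes in UTF-8 and two in UTF-16, so which of those two is smaller depends on the code points in the text.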

Comparison of Unicode encodings: Difference between revisions