Revision as of 12:51, 27 January 2006 by Plugwash (→Seven-bit environments); revision as of 09:35, 16 February 2006 by 192.16.134.66 (→Summary of size issues)
==Summary of size issues==
[[UTF-32]] requires four bytes to encode any character. Since characters outside the [[basic multilingual plane]] are rare, a document encoded in UTF-32 will usually be nearly twice as large as its [[UTF-16]]-encoded equivalent. [[UTF-8]], on the other hand, uses between one and four bytes per character, so it may use fewer, the same number of, or more bytes than UTF-16 for the same character. [[UTF-EBCDIC]] never uses fewer bytes than [[UTF-8]] for printable characters, and often uses more, because of a decision to allow the C1 control codes to be encoded as single bytes. For seven-bit environments, UTF-7 is generally more compact than any other Unicode encoding combined with [[quoted printable]] or [[base64]]. <!--For eight-bit-clean environments things vary considerably depending on what code points are in the text.-->
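The size relationships above can be checked with a short script. The following is only an illustrative sketch in Python, not part of any standard: the helper names `encoded_sizes` and `seven_bit_sizes` and the sample strings are assumptions made for this example, and UTF-EBCDIC is omitted because the Python standard library has no codec for it.

```python
import base64
import quopri


def encoded_sizes(text):
    """Byte length of `text` in each encoding, without a byte order mark."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-le")),  # "-le" suppresses the BOM
        "UTF-32": len(text.encode("utf-32-le")),
    }


def seven_bit_sizes(text):
    """Byte length of `text` in forms that survive a seven-bit channel."""
    utf8 = text.encode("utf-8")
    return {
        "UTF-7": len(text.encode("utf-7")),
        "UTF-8 + quoted-printable": len(quopri.encodestring(utf8)),
        "UTF-8 + base64": len(base64.b64encode(utf8)),
    }


# Hypothetical sample texts chosen to cover the cases in the paragraph.
samples = {
    "ASCII": "hello",            # UTF-8: 5, UTF-16: 10, UTF-32: 20
    "Greek": "αβγδε",            # UTF-8: 10, UTF-16: 10, UTF-32: 20
    "CJK": "日本語",              # UTF-8: 9, UTF-16: 6, UTF-32: 12
    "non-BMP": "\U0001D11E",     # UTF-8: 4, UTF-16: 4, UTF-32: 4
}

for name, text in samples.items():
    print(name, encoded_sizes(text))

# Mixed ASCII and CJK text, as might appear in a mail body.
print(seven_bit_sizes("Hello, 日本語"))
```

For the mixed sample, UTF-7 comes out smallest of the three seven-bit forms. The margin depends heavily on the text: for a string consisting only of non-ASCII characters, base64 applied to UTF-16 can be slightly smaller than UTF-7.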

Comparison of Unicode encodings: Difference between revisions