Comparison of Unicode encodings: Difference between revisions

==Summary of size issues==
[[UTF-32]] requires four bytes to encode any character. Since characters outside the [[basic multilingual plane]] are rare, a document encoded in UTF-32 will usually be nearly twice as large as its [[UTF-16]]–encoded equivalent. On the other hand, [[UTF-8]] uses anywhere between one and four bytes to encode a character; it may use fewer, the same, or more bytes than UTF-16 to encode the same character.
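The per-character sizes can be checked with a short sketch. It assumes Python 3 and its built-in codecs; the sample characters and the helper name `encoded_sizes` are illustrative choices, not part of any standard. The little-endian codec variants are used only so that no byte order mark is counted.

```python
# Bytes needed to encode a single character in each encoding form.
# "utf-16-le" and "utf-32-le" are used so that no byte order mark is added.
SAMPLES = ["A", "é", "€", "😀"]  # U+0041, U+00E9, U+20AC, U+1F600 (illustrative)


def encoded_sizes(ch):
    """Return (UTF-8, UTF-16, UTF-32) byte counts for one character."""
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return tuple(len(ch.encode(codec)) for codec in codecs)


for ch in SAMPLES:
    utf8, utf16, utf32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8} UTF-16={utf16} UTF-32={utf32}")
```

For these samples, UTF-8 needs fewer bytes than UTF-16 for U+0041, the same number for U+00E9 and U+1F600, and more for U+20AC, while UTF-32 always needs four.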
 
For seven-bit environments, UTF-7 is clearly more space-efficient than combining any of the other Unicode encodings with [[quoted printable]] or [[base64]]. <!--For eight-bit-clean environments things vary considerably depending on what code points are in the text.-->
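The seven-bit comparison can be tried on a sample string with the following sketch. It assumes Python 3 with its standard `base64` and `quopri` modules and its built-in `utf-7` codec; the helper name `seven_bit_sizes` and the sample text are hypothetical, and a single sample illustrates the claim rather than proving it.

```python
import base64
import quopri


def seven_bit_sizes(text):
    """Return the size in bytes of several seven-bit-safe encodings of text."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-be")  # big-endian, no byte order mark
    return {
        "UTF-7": len(text.encode("utf-7")),
        "UTF-8 + quoted-printable": len(quopri.encodestring(utf8)),
        "UTF-8 + base64": len(base64.b64encode(utf8)),
        "UTF-16 + base64": len(base64.b64encode(utf16)),
    }


# Mixed ASCII and non-ASCII sample text (an arbitrary choice).
for name, size in seven_bit_sizes("Hello, 世界").items():
    print(f"{name}: {size} bytes")
```

The outcome depends on the text: the more of it falls outside ASCII, the smaller the gap between UTF-7 and UTF-16 with base64 becomes.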