Comparison of Unicode encodings: Difference between revisions

7 bit environments: 7F needs to be escaped in quoted-printable
Line 21:
 
===7 bit environments===
This table may not cover every special case and so should be used for estimation and comparison only. To accurately determine the size of text in an encoding, please see the actual specifications.
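As a rough cross-check of individual cells, the sizes can also be measured directly. The following is a minimal sketch using Python's standard `base64` and `quopri` modules; the helper name `encoded_sizes` is hypothetical and not part of any specification. It assumes short inputs: `quopri` inserts soft line breaks on long lines, and a real MIME encoder would also wrap base64 output, so the figures for long texts will differ slightly.

```python
import base64
import quopri


def encoded_sizes(text: str, codec: str = "utf-8") -> dict:
    """Return byte counts for `text` in `codec`: raw, base64 and quoted-printable.

    Hypothetical helper for estimation only; it does not model MIME line wrapping.
    """
    raw = text.encode(codec)
    return {
        "raw": len(raw),
        "base64": len(base64.b64encode(raw)),
        "quoted-printable": len(quopri.encodestring(raw)),
    }


if __name__ == "__main__":
    # One sample per table row: printable ASCII, DEL (U+007F),
    # a two-byte UTF-8 character and a three-byte UTF-8 character.
    for sample in ("A", "\x7f", "\u00e9", "\u4e2d"):
        print(repr(sample), encoded_sizes(sample))
```

Base64 figures for a single character include padding, so the per-character averages in the table are better approximated by measuring a run of several characters and dividing by its length. Passing a codec such as `"utf-16-be"` or `"utf-32-be"` avoids counting a byte order mark.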
{| {{prettytable}}
|code range (hexadecimal)||[[UTF-7]]||[[UTF-8]] [[quoted printable]]||UTF-8 [[base64]]||[[UTF-16]] quoted printable||UTF-16 base64||[[UTF-32]] quoted printable||UTF-32 base64||[[GB18030]] quoted printable||[[GB18030]] base64
Line 26 ⟶ 27:
|000000 - 000032||same as 000080-00FFFF||3||1⅓||6||2⅔||12||5⅓||3||1⅓
|-
|000033 - 00007E||1 for "direct characters" and possibly "optional direct characters" (depending on the encoder setting), 2 for +, otherwise same as 000080-00FFFF||1||1⅓||4||2⅔||10||5⅓||1||1⅓
|-
|00007F||rowspan=3|5 for an isolated case inside a run of single-byte characters. For runs, 2⅔ per character plus padding to make it a whole number of bytes, plus two to start and finish the run||3||1⅓||6||2⅔||12||5⅓||3||1⅓
|-
|000080 - 0007FF||6||2⅔||rowspan=2|2-6 depending on whether the byte values need to be escaped||2⅔||rowspan=3|8-12 depending on whether the final two byte values need to be escaped||5⅓||rowspan=2|4-6 for characters inherited from [[GB2312]]/[[GBK]] (e.g.<br>most Chinese characters), 6-10 for everything else||rowspan=2|2⅔ for characters inherited from [[GB2312]]/[[GBK]] (e.g.<br>most Chinese characters), 5⅓ for everything else
|-
|000800 - 00FFFF||9||4||2⅔||5⅓