m Comparison of unicode encodings moved to Comparison of Unicode encodings |
Clarified comparing size only. Expanded abbreviations. Caps and spelling. |
Line 1:
This page compares
==In summary==
If space were the only consideration, UTF-32
==In detail==
The tables below list the number of bytes per code point for different
===
{| {{prettytable}}
|
|-
|000000 - 00007F||1||2||4||1
Line 20:
|}
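
The per-code-point byte counts in a table like the one above can be checked empirically. The following is a minimal sketch, not the method used to build the table: it assumes the four numeric columns of the 000000 - 00007F row are UTF-8, UTF-16, UTF-32 and GB18030 (the header row is not visible here), and the helper name <code>bytes_per_code_point</code> is hypothetical. It relies only on Python's built-in codecs.

```python
# Sketch: count the bytes one code point occupies in several encodings.
# Assumption: the encodings compared are UTF-8, UTF-16, UTF-32 and GB18030.
# The "-le" codec variants are used so that no byte order mark is added.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le", "gb18030")


def bytes_per_code_point(cp):
    """Return {encoding name: encoded length in bytes} for code point cp.

    Surrogate code points (D800-DFFF) cannot be encoded on their own and
    will raise UnicodeEncodeError.
    """
    ch = chr(cp)
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    # One sample code point from each of a few ranges.
    for cp in (0x41, 0x4E2D, 0x1F600):
        print(f"{cp:06X}", bytes_per_code_point(cp))
```

For an ASCII code point such as 000041 this gives 1, 2, 4 and 1 bytes, which agrees with the 000000 - 00007F row.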
===
This table may not cover every special case and so should be used for estimation and
{| {{prettytable}}
|code range (hexadecimal)||[[UTF-7]]||[[UTF-8]] [[quoted printable]]||UTF-8 [[base64]]||[[UTF-16]] quoted printable||UTF-16 base64||[[UTF-32]] quoted printable||UTF-32 base64||[[GB18030]] quoted printable||[[GB18030]] base64
Line 27:
|000000 - 000032||same as 000080-00FFFFFF||3||1⅓||6||2⅔||12||5⅓||3||1⅓
|-
|000033 - 00003C||rowspan=3|1 for "direct characters" and
|-
|00003D (equals sign)||3||1⅓||6||2⅔||12||5⅓||3||1⅓
Line 33:
|00003E - 00007E||1||1⅓||4||2⅔||10||5⅓||1||1⅓
|-
|00007F||rowspan=3|5 for an
|-
|000080 - 0007FF||6||2⅔||rowspan=2|2-6 depending on whether the byte values need to be escaped||2⅔||rowspan=3|8-12 depending on whether the final two byte values need to be escaped||5⅓||rowspan=2|4-6 for characters inherited from [[GB2312]]/[[GBK]] (e.g.<br>most Chinese characters), 6-10 for everything else.||rowspan=2|2⅔ for characters inherited from [[GB2312]]/[[GBK]] (e.g.<br>most Chinese characters), 5⅓ for everything else.
|