Comparison of Unicode encodings: Difference between revisions

m – for ranges
m Summary of size issues: reword -- rm "eight-bit-clean environments" as I've never heard of such a thing (commented out)
Line 3:
 
==Summary of size issues==
UTF-32 requires four bytes to encode any character. Since characters outside the [[basic multilingual plane]] are very rare, a document encoded in UTF-32 will usually be very nearly twice as large as its UTF-16-encoded equivalent. UTF-8, on the other hand, uses between one and four bytes per character: one byte for U+0000 to U+007F (where UTF-16 uses two), two bytes for U+0080 to U+07FF, three bytes for the rest of the basic multilingual plane, U+0800 to U+FFFF (where UTF-16 still uses two), and four bytes for characters outside it (the same as UTF-16). Whether UTF-8 or UTF-16 is smaller therefore depends on which code points the text contains.
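The per-character sizes can be checked directly. The following is a minimal sketch in Python that relies only on the built-in <code>str.encode</code>; the helper name <code>encoded_sizes</code> and the sample characters are illustrative assumptions, not part of any standard.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in UTF-8, UTF-16 and UTF-32."""
    # The "-le" codec variants are used so that no byte order mark is
    # prepended, which would otherwise inflate the counts.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# One sample character from each UTF-8 length class (hypothetical test data).
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

With these samples, U+0041 takes 1, 2 and 4 bytes in UTF-8, UTF-16 and UTF-32 respectively; U+00E9 takes 2, 2 and 4; U+20AC takes 3, 2 and 4; and U+1D11E, which lies outside the basic multilingual plane, takes 4 bytes in all three.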
 
In seven-bit environments, UTF-7 is generally more space-efficient than combining another Unicode encoding with [[quoted printable]] or [[base64]]. <!--For eight-bit-clean environments things vary considerably depending on what code points are in the text.-->
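The seven-bit comparison can be explored in the same way. This is a rough sketch, assuming Python's built-in <code>utf-7</code> codec and the standard <code>base64</code> and <code>quopri</code> modules, and assuming UTF-8 as the "other" Unicode encoding being wrapped; the helper name <code>seven_bit_sizes</code> is hypothetical. The result depends heavily on the text, so the function only measures and makes no general claim.

```python
import base64
import quopri


def seven_bit_sizes(text):
    """Return byte counts for three seven-bit-safe representations of `text`."""
    utf8 = text.encode("utf-8")  # assumption: UTF-8 is the wrapped encoding
    return {
        "utf-7": len(text.encode("utf-7")),
        "utf-8 + base64": len(base64.b64encode(utf8)),
        "utf-8 + quoted-printable": len(quopri.encodestring(utf8)),
    }


sizes = seven_bit_sizes("Hello")
print(sizes)
```

For plain ASCII letters such as <code>"Hello"</code>, UTF-7 leaves the five bytes unchanged, while base64 expands them to eight, since it emits four output bytes for every three input bytes.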
 
==Considerations other than size==