Comparison of Unicode encodings

Fixed-size characters can be helpful, but even with a fixed width per code point (as in UTF-32), there is no fixed width per displayed character, because of [[combining character]]s. If you work heavily with a particular [[application programming interface|API]] that has standardised on a particular Unicode encoding, it is generally a good idea to use the same encoding, which avoids converting before every call to the API. Similarly, when writing server-side software, it may simplify matters to process data in the same format in which it is communicated.
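
The gap between code points and displayed characters can be illustrated with a short Python sketch. The variable names are illustrative only, and skipping code points with a nonzero combining class is only a rough approximation of how displayed characters are delimited, not full grapheme segmentation.

```python
import unicodedata

# "é" written as a base letter followed by a combining acute accent (U+0301).
text = "e\u0301"

# UTF-32 uses exactly four bytes per code point.
utf32_bytes = text.encode("utf-32-be")
code_points = len(utf32_bytes) // 4

# Approximate the number of displayed characters by skipping code points
# that have a nonzero canonical combining class (an assumption that holds
# for this example but not for every script).
displayed = sum(1 for ch in text if unicodedata.combining(ch) == 0)

print(code_points, displayed)
```

Here the UTF-32 data holds two fixed-width code points, yet a reader sees a single accented letter.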
 
UTF-16 is popular because many APIs date from the time when Unicode was a 16-bit fixed-width encoding. Unfortunately, UTF-16 makes characters outside the BMP a special case, which increases the risk of oversights in their handling. A further consideration is that the standard defines three different UTF-16 encoding schemes (one that identifies the [[endianness]] used and two that do not), which creates the potential for endianness problems.
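
Both points can be seen in a minimal Python sketch, assuming Python's codec names <code>utf-16-be</code>, <code>utf-16-le</code> and <code>utf-16</code> stand in for the three encoding schemes; the sample character and variable names are chosen for illustration only.

```python
import codecs

# U+1D11E (musical symbol G clef) lies outside the BMP.
clef = "\U0001D11E"

# The two schemes that do not identify their endianness.
be = clef.encode("utf-16-be")  # surrogate pair D834 DD1E, big-endian
le = clef.encode("utf-16-le")  # same pair with each 16-bit unit byte-swapped

# The "utf-16" codec prepends a byte order mark (BOM) in native byte order.
with_bom = clef.encode("utf-16")
has_bom = with_bom[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# One code point occupies two 16-bit code units (four bytes).
code_units = len(be) // 2

# Decoding with the wrong byte order raises no error for this input;
# it silently yields different text.
misread = be.decode("utf-16-le")

print(be.hex(), le.hex(), code_units, has_bom, misread == clef)
```

Code that assumes one 16-bit unit per character would count this single character as two, and data decoded with the wrong byte order may be corrupted without any error being reported.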
 
===For communication===