Unicode in Microsoft Windows: Difference between revisions

Content deleted Content added
UTF-8: Some rewording to make it clear the problem is on Windows, not "other operating systems"
Removed COMPLETELY IRRELEVANT bloat
Line 8:
=== Windows NT based systems ===
Current Windows versions and all back to [[Windows XP]] and prior [[Windows NT]] (3.x, 4.0) are shipped with [[Windows API|system libraries]] that support string [[character encoding|encoding]] of two types: 16-bit "Unicode" ([[UTF-16]] since [[Windows 2000]]) and a (sometimes multibyte) encoding called the "[[Windows code page|code page]]" (or incorrectly referred to as ''[[American National Standards Institute|ANSI]] code page''). 16-bit functions have names suffixed with 'W' (from [[wide character|"wide"]]) such as <code>SetWindowTextW</code>. Code page oriented functions use the suffix 'A' for "ANSI" such as <code>SetWindowTextA</code> (some other conventions were used for APIs that were copied from other systems, such as <code>_wfopen/fopen</code> or <code>wcslen/strlen</code>). This split was necessary because many languages, including [[C (programming language)|C]], did not provide a clean way to pass both 8-bit and 16-bit strings to the same function.
 
Most 'A' functions are implemented as [[wrapper function|wrappers]] that translate the text from the current "ANSI" code page to UTF-16 and then call the corresponding 'W' functions.{{citation needed|date=March 2023}} 'A' functions that return strings do the opposite conversion, turning characters that don't exist in the current locale into '?'.{{citation needed|date=March 2023}}
 
[[Microsoft]] attempted to support Unicode "portably" by providing a "UNICODE" switch to the compiler, that switches unsuffixed "generic" calls from the 'A' to the 'W' interface and converts all string constants to "wide" UTF-16 versions.<ref>{{cite web|title=Unicode in the Windows API|url=https://msdn.microsoft.com/en-us/library/dd374089.aspx|access-date=7 May 2018}}</ref><ref>{{cite web|title=Conventions for Function Prototypes (Windows)|url=https://msdn.microsoft.com/en-us/library/dd317766.aspx|website=MSDN|access-date=7 May 2018|language=en}}</ref> This does not actually work because it does not translate UTF-8 outside of string constants, resulting in code that attempts to open files just not compiling.{{citation needed|date=October 2019}}