Unicode in Microsoft Windows: Difference between revisions

Windows NT based systems: You can't accidentally call the "A" API when it's #define'd to "W".
Modern Windows versions such as [[Windows XP]] and [[Windows Server 2003]], and before them [[Windows NT]] (3.x, 4.0) and Windows 2000, ship with [[Windows API|system libraries]] that support two kinds of string [[character encoding|encoding]]: UTF-16 (often called "Unicode" in Windows documentation) and a local (sometimes multibyte) encoding called the "[[Windows code page|code page]]" (sometimes incorrectly referred to as the ''ANSI code page''). Functions that take UTF-16 strings have names suffixed with -W (from [[wide character|"wide"]]); code page oriented functions use the suffix -A for "ANSI". This split was necessary because many languages, including C, did not provide a clean way to pass both 8-bit and 16-bit strings to the same function. Most 'A' functions are implemented as a [[Wrapper function|wrapper]] that translates the text from the code page to UTF-16 and calls the corresponding 'W' function.
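The wrapper arrangement described above can be illustrated with a small, portable sketch. It does not use the Windows headers: <code>DemoLengthW</code> and <code>DemoLengthA</code> are hypothetical stand-ins for a 'W'/'A' function pair, <code>uint16_t</code> stands in for a UTF-16 code unit, and the code page is assumed to be ISO 8859-1 so that the conversion is a plain byte-to-code-unit copy. Real code pages need lookup tables or multibyte decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 'W' function: does the real work on UTF-16 code units.
   Here the "work" is just counting code units up to the terminator. */
size_t DemoLengthW(const uint16_t *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}

/* Converts a NUL-terminated 8-bit string to UTF-16.
   Assumption: the code page is ISO 8859-1, where every byte value equals
   its Unicode code point. Input longer than cap - 1 is truncated. */
static size_t latin1_to_utf16(const char *src, uint16_t *dst, size_t cap)
{
    size_t i = 0;
    while (src[i] != '\0' && i + 1 < cap) {
        dst[i] = (uint16_t)(unsigned char)src[i];
        i++;
    }
    dst[i] = 0;
    return i;
}

/* Hypothetical 'A' function: a thin wrapper that converts its argument
   from the code page to UTF-16 and forwards to the 'W' function. */
size_t DemoLengthA(const char *s)
{
    uint16_t wide[256];
    assert(s != NULL);
    latin1_to_utf16(s, wide, sizeof wide / sizeof wide[0]);
    return DemoLengthW(wide);
}
```

Calling <code>DemoLengthA("caf\xE9")</code> converts four bytes to four UTF-16 code units and returns the result computed by <code>DemoLengthW</code>. The fixed-size buffer keeps the sketch short; it is not a claim about how the system libraries allocate memory.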
 
Microsoft attempted to support Unicode "portably" by providing a "UNICODE" switch to the compiler, which switches unsuffixed "generic" calls from the 'A' to the 'W' interface and converts string constants marked with the <code>TEXT</code> macro to "wide" UTF-16 versions.<ref>{{cite web|title=Unicode in the Windows API|url=https://msdn.microsoft.com/en-us/library/windows/desktop/dd374089%28v=vs.85%29.aspx|accessdate=7 May 2018}}</ref><ref>{{cite web|title=Conventions for Function Prototypes (Windows)|url=https://msdn.microsoft.com/en-us/library/windows/desktop/dd317766(v=vs.85).aspx|website=MSDN|accessdate=7 May 2018|language=en}}</ref> This does not actually work because it does not translate UTF-8 outside of string constants, so code that attempts to open files using an 8-bit string simply fails to compile.
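The switching mechanism can be sketched in portable C without the Windows headers. All names here (<code>SampleLenA</code>, <code>SampleLenW</code>, <code>SampleLen</code>, <code>DEMO_TEXT</code>, <code>DEMO_TCHAR</code>) are hypothetical stand-ins for the suffixed functions, the generic name, and the text macros; the <code>#define UNICODE</code> line stands in for the compiler switch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#define UNICODE /* stand-in for the compiler switch; remove to select 'A' */

/* Hypothetical 'A' and 'W' variants of the same function. */
size_t SampleLenA(const char *s)    { assert(s != NULL); return strlen(s); }
size_t SampleLenW(const wchar_t *s) { assert(s != NULL); return wcslen(s); }

#ifdef UNICODE
typedef wchar_t DEMO_TCHAR;       /* generic character type is wide   */
#define DEMO_TEXT(s) L##s         /* "x" becomes the wide literal L"x" */
#define SampleLen SampleLenW      /* generic name maps to 'W'         */
#else
typedef char DEMO_TCHAR;          /* generic character type is 8-bit  */
#define DEMO_TEXT(s) s            /* literal is left unchanged        */
#define SampleLen SampleLenA      /* generic name maps to 'A'         */
#endif

/* "Generic" code: compiles under either setting because the literal,
   the character type, and the function name all switch together. */
size_t title_length(void)
{
    const DEMO_TCHAR *title = DEMO_TEXT("Unicode");
    return SampleLen(title);
}

/* By contrast, a run-time 8-bit string does not switch:
       const char *path = get_path_from_somewhere();
       SampleLen(path);
   With UNICODE defined this passes a char * where a wchar_t * is
   expected, which the compiler diagnoses as a pointer type mismatch. */
```

Only the literal wrapped in <code>DEMO_TEXT</code> changes width; strings that arrive at run time as 8-bit data are untouched, which is the limitation the paragraph above describes.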
 
Earlier than the "UNICODE" switch, and independently of it, Windows also provided the "MBCS" API switch.<ref>{{cite web|title=Support for Multibyte Character Sets (MBCSs)|url=https://msdn.microsoft.com/en-us/library/5z097dxa.aspx|language=en}}</ref> This switch turns on some C functions prefixed with <code>_mbs</code>, and selects the 'A' functions for the current locale.<ref>{{cite web|title=Double-byte Character Sets|url=https://msdn.microsoft.com/en-us/library/windows/desktop/dd317794(v=vs.85).aspx|website=MSDN|accessdate=7 May 2018|quote=our applications use DBCS Windows code pages with the "A" versions of Windows functions.}}</ref>
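What separates multibyte-aware string functions from plain byte-oriented ones is lead-byte handling, which can be sketched as follows. This is an illustration only, not the C runtime's implementation: <code>demo_mbs_length</code> and <code>is_lead_byte_932</code> are hypothetical names, and the code page is assumed to be 932 (Shift JIS), whose lead bytes fall in the ranges 0x81–0x9F and 0xE0–0xFC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: code page 932 (Shift JIS). A byte in these ranges starts
   a two-byte character; any other byte is a complete character. */
static int is_lead_byte_932(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Counts characters rather than bytes in a NUL-terminated DBCS string.
   A lead byte with no following byte is counted as one character. */
size_t demo_mbs_length(const unsigned char *s)
{
    size_t count = 0;
    assert(s != NULL);
    while (*s != 0) {
        if (is_lead_byte_932(*s) && s[1] != 0)
            s += 2; /* lead byte plus trail byte form one character */
        else
            s += 1; /* single-byte character */
        count++;
    }
    return count;
}
```

For the byte sequence 0x41 0x82 0xA0 0x42 ("A", the hiragana "あ", "B") the function counts three characters, whereas a byte-oriented <code>strlen</code> would report four.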