Revision as of 16:28, 10 March 2023 by 93.230.217.137 (talk): IRRELEVANT derailment about "other" OS removed: this article is about WINDOWS! (Tag: Reverted)
Revision as of 16:48, 10 March 2023 by 93.230.217.137 (talk): Misconception about A functions
Line 9: Current Windows versions, and all earlier ones back to [[Windows XP]] and the preceding [[Windows NT]] releases (3.x, 4.0), ship with [[Windows API\|system libraries]] that support two types of string [[character encoding\|encoding]]: 16-bit "Unicode" ([[UTF-16]] since [[Windows 2000]]) and a (sometimes multibyte) encoding called the "[[Windows code page\|code page]]" (sometimes incorrectly referred to as the ''[[American National Standards Institute\|ANSI]] code page''). 16-bit functions have names suffixed with 'W' (from [[wide character\|"wide"]]), such as <code>SetWindowTextW</code>. Code-page-oriented functions use the suffix 'A' for "ANSI", such as <code>SetWindowTextA</code> (other conventions were used for APIs that were copied from other systems, such as <code>_wfopen/fopen</code> or <code>wcslen/strlen</code>). This split was necessary because many languages, including [[C (programming language)\|C]], did not provide a clean way to pass both 8-bit and 16-bit strings to the same function. ~~Most~~ 'A' functions ~~are~~ may be implemented as [[wrapper function\|wrappers]] that translate the text from the current code page to UTF-16 and then call the corresponding 'W' functions.{{citation needed\|date=June 2020}}{{citation needed\|date=March 2023}} Notable exceptions are the 'A' functions of Windows' [[National Language Support]].<ref>{{cite web\|url=https://skanthak.homepage.t-online.de/quirks.html#quirk31}}</ref> 'A' functions that return strings do the opposite conversion, turning characters that do not exist in the current locale into '?'.{{citation needed\|date=March 2023}} [[Microsoft]] attempted to support Unicode "portably" by providing a "UNICODE" switch to the compiler that switches unsuffixed "generic" calls from the 'A' to the 'W' interface and converts all string constants to "wide" UTF-16 versions.<ref>{{cite web\|title=Unicode in the Windows API\|url=https://msdn.microsoft.com/en-us/library~~/windows/desktop~~/dd374089~~%28v=vs.85%29~~.aspx\|access-date=7 May 2018}}</ref><ref>{{cite web\|title=Conventions for Function Prototypes (Windows)\|url=https://msdn.microsoft.com/en-us/library~~/windows/desktop~~/dd317766~~(v=vs.85)~~.aspx\|website=MSDN\|access-date=7 May 2018\|language=en}}</ref> This does not actually work, because it does not translate UTF-8 outside of string constants, so code that attempts to open files simply does not compile.{{citation needed\|date=October 2019}} Earlier, and independently of the "UNICODE" switch, Windows also provided the Multibyte Character Sets (MBCS) API switch.<ref>{{cite web\|title=Support for Multibyte Character Sets (MBCSs)\|url=https://docs.microsoft.com/en-us/cpp/text/support-for-multibyte-character-sets-mbcss?view=vs-2019\|access-date=2020-06-15\|language=en}}</ref> This replaces some functions that do not work with MBCS, such as <code>strrev</code>, with MBCS-aware ones, such as <code>_mbsrev</code>.<ref>{{cite web\|title=Double-byte Character Sets\|url=https://docs.microsoft.com/en-us/windows/win32/intl/double-byte-character-sets\|website=MSDN\|access-date=2020-06-15\|date=2018-05-31\|quote=our applications use DBCS Windows code pages with the "A" versions of Windows functions.}}</ref><ref>[https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strrev-wcsrev-mbsrev-mbsrev-l _strrev, _wcsrev, _mbsrev, _mbsrev_l] Microsoft Docs</ref>
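The wrapper pattern and the "UNICODE" switch described above can be illustrated with a small, portable C sketch. It is only an illustration under stated assumptions, not the way Windows itself is implemented: every <code>Demo*</code> and <code>DEMO_*</code> name is hypothetical, and the "code page" is assumed to be Latin-1 so that the conversion fits in a few lines. It shows an 'A' setter that widens its argument and forwards to the 'W' setter, an 'A' getter that replaces unrepresentable characters with '?', and a generic name that is mapped to one of the two variants at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Every Demo* / DEMO_* name is a hypothetical stand-in, not a Windows API. */
#define DEMO_MAX 64

/* The "window title", stored internally as wide text. */
static wchar_t demo_title[DEMO_MAX];

/* 'W' variant: does the actual work on wide strings. */
void DemoSetTitleW(const wchar_t *text)
{
    assert(text != NULL);
    wcsncpy(demo_title, text, DEMO_MAX - 1);
    demo_title[DEMO_MAX - 1] = L'\0';
}

/* 'A' variant: widens the text, then forwards to the 'W' variant.
   Assumption: the "code page" is Latin-1, so each byte maps to the
   code point with the same value. */
void DemoSetTitleA(const char *text)
{
    wchar_t wide[DEMO_MAX];
    size_t i = 0;

    assert(text != NULL);
    for (; i < DEMO_MAX - 1 && text[i] != '\0'; ++i)
        wide[i] = (wchar_t)(unsigned char)text[i];
    wide[i] = L'\0';
    DemoSetTitleW(wide);
}

/* 'A' getter: narrows the stored wide text back to the assumed code page.
   Characters with no Latin-1 equivalent are replaced by '?'.
   Returns the number of characters written, excluding the terminator. */
size_t DemoGetTitleA(char *out, size_t cap)
{
    size_t i = 0;

    assert(out != NULL && cap > 0);
    for (; i < cap - 1 && demo_title[i] != L'\0'; ++i) {
        unsigned long c = (unsigned long)demo_title[i];
        out[i] = (c <= 0xFFul) ? (char)c : '?';
    }
    out[i] = '\0';
    return i;
}

/* Generic-text mapping, selected at compile time in the spirit of the
   "UNICODE" switch: the unsuffixed name resolves to one of the variants. */
#ifdef UNICODE
typedef wchar_t DemoChar;
#define DEMO_TEXT(s) L##s
#define DemoSetTitle DemoSetTitleW
#else
typedef char DemoChar;
#define DEMO_TEXT(s) s
#define DemoSetTitle DemoSetTitleA
#endif
```

In this sketch, a call such as <code>DemoSetTitle(DEMO_TEXT("hi"))</code> compiles in either mode, while a title containing a character outside the assumed code page comes back from <code>DemoGetTitleA</code> with that character replaced by '?'. Only literals wrapped in <code>DEMO_TEXT</code> change type with the switch; any other <code>char</code> string passed to <code>DemoSetTitle</code> stops compiling once <code>UNICODE</code> is defined.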

Unicode in Microsoft Windows: Difference between revisions