Unicode in Microsoft Windows: Difference between revisions

UTF-8: Move reference that another editor placed elsewhere to the relevant text.
UTF-8: Remove irrelevant portion of quote from reference
In April 2018 (or possibly November 2017<ref>{{cite web|title=Windows10 Insider Preview Build 17035 Supports UTF-8 as ANSI|url=https://news.ycombinator.com/item?id=15710685|website=Hacker News|access-date=7 May 2018}}</ref>), with insider build 17035 (nominal build 17134) for Windows 10, a "Beta: Use Unicode UTF-8 for worldwide language support" checkbox appeared for setting the locale code page to UTF-8.{{efn|1=Found under Control Panel, "Region" entry, "Administrative" tab, "Change system locale" button.}} This allows "narrow" functions, including <code>fopen</code> and <code>SetWindowTextA</code>, to be called with UTF-8 strings. However, this is a system-wide setting, so a program cannot assume it is enabled.
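
Because the setting cannot be assumed, a program may want to test it at run time. The following is a minimal sketch, assuming the Win32 function <code>GetACP</code> is used to read the active code page; the helper names <code>code_page_is_utf8</code> and <code>narrow_api_uses_utf8</code> are hypothetical and chosen only for illustration. On non-Windows builds the sketch conservatively reports false.

```c
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Numeric identifier of the UTF-8 code page (the value of CP_UTF8). */
#define UTF8_CODE_PAGE 65001u

/* Hypothetical helper: true if the given code page number is UTF-8. */
bool code_page_is_utf8(unsigned int code_page)
{
    return code_page == UTF8_CODE_PAGE;
}

/* Hypothetical helper: true if "narrow" functions currently interpret
 * char strings as UTF-8. Outside Windows there is no active code page,
 * so this sketch simply returns false. */
bool narrow_api_uses_utf8(void)
{
#ifdef _WIN32
    return code_page_is_utf8(GetACP());
#else
    return false;
#endif
}
```

A program could call <code>narrow_api_uses_utf8()</code> once at startup and fall back to the wide-character functions when it returns false.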
 
In May 2019, Microsoft added the ability for a program to set the code page to UTF-8 itself,<ref name="Microsoft-UTF-8">{{Cite web|title=Use the Windows UTF-8 code page - UWP applications|url=https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-code-page|access-date=2020-06-06|quote=As of Windows Version 1903 (May 2019 Update), you can use the ActiveCodePage property in the appxmanifest for packaged apps, or the fusion manifest for unpackaged apps, to force a process to use UTF-8 as the process code page. [..] <code>CP_ACP</code> equates to <code>CP_UTF8</code> only if running on Windows Version 1903 (May 2019 Update) or above and the ActiveCodePage property described above is set to UTF-8. Otherwise, it honors the legacy system code page. We recommend using <code>CP_UTF8</code> explicitly.|website=docs.microsoft.com|language=en-us}}</ref><ref>{{cite web|url=https://skanthak.homepage.t-online.de/quirks.html#quirk31|title=Windows 10 1903 and later versions finally support UTF-8 with the A forms of the Win32 functions}}</ref> allowing programs written to use UTF-8 to be run by non-expert users.
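
The quoted recommendation to use <code>CP_UTF8</code> explicitly can be illustrated with a short sketch. It assumes a program that wants to open a file by a UTF-8 path regardless of the active code page: on Windows it converts the path with <code>MultiByteToWideChar</code> and calls <code>_wfopen</code>, and elsewhere it falls back to plain <code>fopen</code>. The function name <code>fopen_utf8</code> is hypothetical, and this is one possible approach rather than a method described by the cited sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: open a file whose path is encoded as UTF-8,
 * independent of the process or system code page. Returns NULL on failure. */
FILE *fopen_utf8(const char *utf8_path, const char *mode)
{
    assert(utf8_path != NULL && mode != NULL);
#ifdef _WIN32
    /* Passing -1 as the input length makes the result include the
     * terminating NUL; a NULL output buffer asks only for the size. */
    int path_len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                       utf8_path, -1, NULL, 0);
    int mode_len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                       mode, -1, NULL, 0);
    if (path_len <= 0 || mode_len <= 0)
        return NULL; /* invalid UTF-8 or another conversion failure */

    wchar_t *wide_path = malloc((size_t)path_len * sizeof *wide_path);
    wchar_t *wide_mode = malloc((size_t)mode_len * sizeof *wide_mode);
    FILE *file = NULL;
    if (wide_path != NULL && wide_mode != NULL &&
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            utf8_path, -1, wide_path, path_len) > 0 &&
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            mode, -1, wide_mode, mode_len) > 0) {
        file = _wfopen(wide_path, wide_mode);
    }
    free(wide_path);
    free(wide_mode);
    return file;
#else
    /* Assumption: on other platforms, paths are passed through as bytes. */
    return fopen(utf8_path, mode);
#endif
}
```

With the <code>ActiveCodePage</code> manifest property set to UTF-8 on Windows Version 1903 or later, a plain <code>fopen</code> call would be expected to accept the same UTF-8 path; the explicit conversion is useful when that cannot be relied on.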
 
In [[Windows 11]], some system files are required to use UTF-8 and do not require a byte order mark.<ref>{{Cite web|last=themar-msft|title=Customize the Windows 11 Start menu|url=https://docs.microsoft.com/en-us/windows-hardware/customize/desktop/customize-the-windows-11-start-menu|access-date=2021-06-29|website=docs.microsoft.com|language=en-us|quote=Make sure your LayoutModification.json uses UTF-8 encoding.}}</ref> Notepad can now recognize UTF-8 without a byte order mark, and can be told to write UTF-8 without one.{{cn|date=November 2022}} Some other Microsoft products use UTF-8 internally, including Visual Studio{{cn|date=November 2022}} and [[SQL Server 2019]], for which Microsoft claims a 35% speed increase from the use of UTF-8 and a "nearly 50% reduction in storage requirements."<ref>{{Cite web|date=2019-07-02|title=Introducing UTF-8 support for SQL Server|url=https://techcommunity.microsoft.com/t5/sql-server/introducing-utf-8-support-for-sql-server/ba-p/734928|quote=For example, changing an existing column data type from NCHAR(10) to CHAR(10) using an UTF-8 enabled collation, translates into nearly 50% reduction in storage requirements. [..] In the ASCII range, when doing intensive read/write I/O on UTF-8<!-- " " in quote, but ok to strip-->, we measured an average 35% performance improvement over UTF-16 using clustered tables with a non-clustered index on the string column, and an average 11% performance improvement over UTF-16 using a heap. |access-date=2021-08-24|website=techcommunity.microsoft.com|language=en}}</ref>
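
Software that reads UTF-8 files with or without a byte order mark typically checks for the three-byte sequence EF BB BF at the start of the data and skips it when present. The following is a minimal sketch of that check; the function name <code>utf8_bom_length</code> is hypothetical, and the sketch is not taken from Notepad or any other Microsoft product.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the number of leading bytes occupied by a
 * UTF-8 byte order mark (3 if present, 0 otherwise). */
size_t utf8_bom_length(const unsigned char *data, size_t size)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    assert(data != NULL || size == 0);
    if (size >= sizeof bom && memcmp(data, bom, sizeof bom) == 0)
        return sizeof bom;
    return 0;
}
```

A caller would begin decoding at <code>data + utf8_bom_length(data, size)</code>, so that files with and without the mark are handled the same way.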