Revision as of 20:25, 25 August 2022 by JMF (→ANSI code page{{anchor|ANSI}}: {{subst:anchor|ANSI}})
Revision as of 16:56, 23 October 2022 by Comp.arch (→History)
Initially, computer systems and system programming languages did not make a distinction between [[character (computing)|character]]s and [[byte]]s: for the [[List of writing systems#Segmental script|segmental scripts]] used in most of Africa, the Americas, southern and south-east Asia, the Middle East and Europe, a character needs just one byte, but two or more bytes are needed for the [[ideographic]] sets used in the rest of the world. This subsequently led to much confusion. Microsoft software and systems prior to the [[Windows NT]] line are examples of this, because they use the OEM and ANSI code pages, which do not make the distinction. Since the late 1990s, software and systems have adopted [[Unicode]] as their preferred storage format; this trend has been reinforced by the widespread adoption of [[XML]], which defaults to [[UTF-8]] but also provides a mechanism for labelling the encoding used.<ref>{{cite web | url = http://www.w3.org/TR/xml11/#charencoding | title = Extensible Markup Language (XML) 1.1 (Second Edition): Character encodings | publisher = [[W3C]] | date = 29 September 2006 | access-date = 5 October 2020 | archive-date = 19 April 2021 | archive-url = https://web.archive.org/web/20210419133700/https://www.w3.org/TR/xml11/#charencoding | url-status = live }}</ref> All current Microsoft products and [[application program interfaces]] use Unicode internally,{{cn|date=October 2020}} but some applications continue to use the default encoding of the computer's 'locale' when reading and writing text data to files or standard output.{{cn|date=October 2020}} Therefore, files may still be encountered that are legible and intelligible in one part of the world but unintelligible [[mojibake]] in another.
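The mojibake effect described above can be reproduced in a few lines. The following is a minimal sketch, not taken from the article: it assumes a file written under the Windows-1251 (Cyrillic) "ANSI" code page and read back under Windows-1252, using Python's built-in <code>cp1251</code>, <code>cp1252</code> and <code>cp932</code> codecs; the sample strings are illustrative only.

```python
# Text as typed on a machine whose "ANSI" code page is Windows-1251 (Cyrillic).
original = "Привет"
raw = original.encode("cp1251")      # one byte per character in this code page

# The same bytes interpreted on a machine whose "ANSI" code page is Windows-1252.
garbled = raw.decode("cp1252")       # every byte is valid, but means something else

# Decoding with the code page that was used for writing recovers the text.
recovered = raw.decode("cp1251")

# A double-byte code page (Windows-932, Japanese) needs two bytes per kanji.
kanji_bytes = "日本".encode("cp932")

print(len(raw), garbled, recovered, len(kanji_bytes))
```

The bytes themselves never change; only the code page assumed by the reader differs, which is why such files carry no indication that anything is wrong.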
=== UTF-8, UTF-16 ===
Microsoft adopted a Unicode encoding, namely [[UTF-16]] (initially the now-obsolete [[UCS-2]], which was then Unicode's only encoding), for all its [[operating system]]s from Windows NT onwards, but now additionally [[Unicode in Microsoft Windows|supports and recommends]] using [[UTF-8]] (aka <code>CP_UTF8</code>). UTF-16 uniquely encodes all Unicode characters in the [[Basic Multilingual Plane]] (BMP) using 16 bits, but the remaining Unicode characters (e.g. [[emoji]]s) are encoded with a 32-bit (four-byte) code made up of two 16-bit units{{snd}}while the rest of the industry ([[Unix-like]] systems and the web), and now Microsoft as well, chose [[UTF-8]] (which uses one byte for the 7-bit [[ASCII]] character set, two or three bytes for other characters in the BMP, and four bytes for the remainder). Since [[Windows 10 version history#Version 1803 (April 2018 Update)|Windows 10 version 1803]], Windows machines can be configured to allow UTF-8 as the "ANSI" and OEM codepage.<ref>{{cite web|url=https://srad.jp/story/17/11/14/0640253/|title=Windows 10のInsider PreviewでシステムロケールをUTF-8にするオプションが追加される|trans-title=The option to make UTF-8 the system locale added in Windows 10 Insider Preview|author=hylom|website=スラド|language=ja|date=2017-11-14|access-date=2018-05-10|archive-date=2018-05-11|archive-url=https://web.archive.org/web/20180511012606/https://srad.jp/story/17/11/14/0640253/|url-status=live}}</ref>
== List ==
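The byte counts given in the UTF-8, UTF-16 section above can be checked directly. The following is a small sketch using Python's built-in <code>utf-8</code> and <code>utf-16-le</code> codecs; the helper names <code>encoded_sizes</code> and <code>utf16_units</code> and the sample characters are illustrative assumptions, not part of any Windows API.

```python
def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


def utf16_units(ch):
    """Return the 16-bit code units that UTF-16 uses for a single character."""
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


# ASCII, two BMP characters, and one character outside the BMP (an emoji).
for ch in ["A", "é", "€", "😀"]:
    utf8_len, utf16_len = encoded_sizes(ch)
    units = " ".join(f"{u:04X}" for u in utf16_units(ch))
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes ({units})")
```

The emoji is the only sample that needs two 16-bit units in UTF-16 (a surrogate pair); in UTF-8 the length grows from one to four bytes across the four samples.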

Windows code page: Difference between revisions