Windows code page: Difference between revisions

Content deleted Content added
Microsoft DON'T recommend UTF-8, they still recommend UTF-16 (MANY -W functions of the Windows API, especially those introduced since Vista, have no -A form); they only recommend to use CP_UTF8 over CP_ANSI
History: Minor edit for readability.
Line 39:
 
== History ==
Initially, computer systems and system programming languages did not make a distinction between [[character (computing)|character]]s and [[byte]]s: for the [[List of writing systems#Segmental script|segmental scripts]] used in most of Africa, the Americas, southern and south-east Asia, the Middle East and Europe, a character needs just one byte, but two or more bytes are needed for the [[ideographic]] sets used in the rest of the world. This subsequently led to much confusion. Microsoft software and systems prior to the [[Windows NT]] line are examples of this, because they use the OEM and ANSI code pages that do not make the distinction.
 
Since the late 1990s, software and systems have adopted [[Unicode]] as their preferred storage format; this trend has been reinforced by the widespread adoption of [[XML]], which defaults to [[UTF-8]] but also provides a mechanism for labelling the encoding used.<ref>{{cite web | url = http://www.w3.org/TR/xml11/#charencoding | title = Extensible Markup Language (XML) 1.1 (Second Edition): Character encodings | publisher = [[W3C]] | date = 29 September 2006 | access-date = 5 October 2020 | archive-date = 19 April 2021 | archive-url = https://web.archive.org/web/20210419133700/https://www.w3.org/TR/xml11/#charencoding | url-status = live }}</ref> All current Microsoft products and [[application program interfaces]] use Unicode internally,{{cn|date=October 2020}} but some applications continue to use the default encoding of the computer's 'locale' when reading and writing text data to files or standard output.{{cn|date=October 2020}} Therefore, files may still be encountered that are legible and intelligible in one part of the world but appear as unintelligible [[mojibake]] in another.