Revision as of 04:42, 11 October 2024, 14.203.10.202 (talk) →History: Fix plurality
Revision as of 09:59, 11 October 2024, JMF (talk | contribs) →History: Unicode is not a storage mechanism. Also, AFAIK, it is agnostic about storage (UCS2, UTF8, UTF16). This section needs a rewrite by someone who understands these distinctions.
Line 61:
Initially, computer systems and system programming languages did not distinguish between [[character (computing)|character]]s and [[byte]]s: for the [[List of writing systems#Segmental script|segmental scripts]] used in most of Africa, the Americas, southern and south-east Asia, the Middle East and Europe, a character needs just one byte, but two or more bytes are needed for the [[ideographic]] sets used in the rest of the world. This subsequently led to much confusion. Microsoft software and systems prior to the [[Windows NT]] line are examples of this, because they use the OEM and ANSI code pages, which do not make the distinction. Since the late 1990s, software and systems have adopted [[Unicode]] as their preferred ~~storage~~character encoding format; this trend has been reinforced by the widespread adoption of [[XML]], which defaults to [[UTF-8]] but also provides a mechanism for labelling the encoding used.<ref>{{cite web |url=http://www.w3.org/TR/xml11/#charencoding |title=Extensible Markup Language (XML) 1.1 (Second Edition): Character encodings
Line 69:
|archive-date=19 April 2021 |archive-url=https://web.archive.org/web/20210419133700/https://www.w3.org/TR/xml11/#charencoding |url-status=live}}</ref> All current Microsoft products and [[application program interfaces]] use Unicode internally,{{cn|date=October 2020}} but some applications continue to use the default encoding{{clarify}} of the computer's 'locale' when reading and writing text data to files or standard output.{{cn|date=October 2020}} Therefore, files may still be encountered that are legible and intelligible in one part of the world but unintelligible [[mojibake]] in another.

=== UTF-8, UTF-16 ===
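
The mojibake effect described in the History paragraph can be reproduced in a few lines. The following is a minimal sketch, not part of the article: the sample string and the choice of code pages (Windows-1252, Windows-1251 and OEM code page 437) are assumptions picked for illustration, and the codec names are those used by Python's standard library.

```python
# Illustrative sketch; the sample text and code pages are assumptions.
text = "café"

# Bytes as a program writing UTF-8 would store them: 63 61 66 C3 A9.
utf8_bytes = text.encode("utf-8")

# A reader that assumes the Western European ANSI code page (Windows-1252)
# interprets each byte as one character, so the two-byte "é" becomes two symbols.
wrong = utf8_bytes.decode("cp1252")

# A single byte also has no fixed meaning: it depends on the code page in use.
byte = b"\xe9"
western = byte.decode("cp1252")   # Western European ANSI code page
cyrillic = byte.decode("cp1251")  # Cyrillic ANSI code page
oem = byte.decode("cp437")        # original IBM PC OEM code page

print(wrong, western, cyrillic, oem)
```

Decoding with the encoding that was actually used to write the bytes (here `utf8_bytes.decode("utf-8")`) recovers the original text, which is why labelling the encoding, as XML allows, avoids the problem.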

Windows code page: Difference between revisions