Early computer systems had limited storage and restricted the number of [[bit]]s available to encode a [[character (computing)|character]]. Although earlier proprietary encodings had fewer, the [[ASCII|American Standard Code for Information Interchange]] (ASCII) settled on seven bits: this was sufficient to encode a minimal subset of the characters used in the US. As eight-bit [[byte]]s came to predominate, Microsoft (and others) expanded their repertoire to 224 characters, to handle a variety of other uses such as box-drawing symbols. The need to provide [[precomposed character]]s for the Western European and South American markets required a different character set: Microsoft established the principle of code pages, one for each alphabet. For the [[List of writing systems#Segmental script|segmental scripts]] used in most of Africa, the Americas, southern and south-east Asia, the Middle East and Europe, a character needs just one byte, but two or more bytes are needed for the [[ideographic]] sets used in the rest of the world. The code-page model was unable to handle this challenge.
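The dependence of a byte's meaning on the code page in use can be illustrated with a minimal Python sketch. It is an illustration only: the helper name <code>decode_under</code> and the choice of code pages (437, Windows-1252, Windows-1251 and Shift JIS) are assumptions made for the example, not part of any Microsoft interface.

```python
def decode_under(raw: bytes, codecs: list[str]) -> dict[str, str]:
    """Decode the same bytes under each named codec (hypothetical helper)."""
    return {name: raw.decode(name) for name in codecs}


# One byte value, read under three single-byte code pages.
single_byte = bytes([0xE9])
decoded = decode_under(single_byte, ["cp437", "cp1252", "cp1251"])
for name, char in decoded.items():
    print(f"{name}: U+{ord(char):04X} {char}")

# A Latin letter fits in one byte, while a Han character
# needs two bytes in Shift JIS and three in UTF-8.
latin_len = len("é".encode("cp1252"))
han_sjis_len = len("漢".encode("shift_jis"))
han_utf8_len = len("漢".encode("utf-8"))
print(latin_len, han_sjis_len, han_utf8_len)
```

The same byte yields a Greek, a Latin or a Cyrillic letter depending on which code page the reader assumes, which is why text exchanged without an agreed code page could not be interpreted reliably.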
Since the late 1990s, software and systems have adopted [[Unicode]] as their preferred character encoding format: Unicode is designed to handle more than a million characters. This trend has been reinforced by the widespread adoption of [[XML]], which defaults to [[UTF-8]] but also provides a mechanism for labelling the encoding used.<ref>{{cite web
|url=http://www.w3.org/TR/xml11/#charencoding
|title=Extensible Markup Language (XML) 1.1 (Second Edition): Character encodings
|url-status=live}}</ref> All current Microsoft products and [[application program interfaces]] use Unicode internally,{{cn|date=October 2020}} but some applications continue to use the default encoding{{clarify|date=October 2024}} of the computer's 'locale' when reading and writing text data to files or standard output.{{cn|date=October 2020}} Therefore, files may still be encountered that are legible and intelligible in one part of the world but unintelligible [[mojibake]] in another.
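How a mismatch between the encoding used to write text and the encoding assumed when reading it produces mojibake can be shown with a short Python sketch. The sample word and the pairing of UTF-8 with Windows-1252 are assumptions chosen for illustration; any two incompatible encodings behave similarly.

```python
text = "café"

# Written as UTF-8: the "é" becomes the two bytes 0xC3 0xA9.
data = text.encode("utf-8")

# Read back assuming Windows-1252, as a locale-dependent application might:
# each of the two bytes is shown as a separate character.
garbled = data.decode("cp1252")
print(garbled)

# The damage is reversible here because every byte happened to be
# a valid Windows-1252 character; this is not guaranteed in general.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

Labelling the encoding explicitly, as XML allows, avoids the guess that causes the garbling in the first place.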