Unicode: Difference between revisions

Calga170 (talk | contribs)
Make the Japan text blued
Tag: Reverted
Reverted 2 edits by Calga170 (talk): Emoji is an acceptable plural form and country names should not be linked per MOS:OVERLINK
Line 23:
The Unicode [[character repertoire]] is synchronized with [[Universal Coded Character Set|ISO/IEC 10646]], the two being code-for-code identical. However, ''The Unicode Standard'' is more than just a repertoire within which characters are assigned. To aid developers and designers, the standard also provides charts and reference data, as well as annexes that explain concepts germane to various scripts and give guidance for their implementation. Topics covered by these annexes include [[Unicode equivalence#Normalization|character normalization]], [[Combining character|character composition]] and decomposition, [[Unicode collation algorithm|collation]], and [[Bidirectional text#Unicode bidi support|directionality]].<ref>{{Cite web |title=The Unicode Standard: A Technical Introduction |url=https://www.unicode.org/standard/principles.html |date=22 August 2019 |access-date=11 September 2024}}</ref>
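
As an illustration of the normalization, composition and decomposition topics mentioned above, the following minimal sketch uses Python's standard <code>unicodedata</code> module. The sample strings and the helper name <code>equivalent</code> are assumptions chosen for this example and are not taken from the standard or its annexes.

```python
import unicodedata

# Two ways to write "é": one precomposed code point, or a base letter
# followed by a combining mark. These sample strings are illustrative only.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings differ code point by code point...
print(composed == decomposed)
# ...but compare equal once both are brought to the same normalization form.
print(equivalent(composed, decomposed))

# NFC composes where possible; NFD decomposes.
print(len(unicodedata.normalize("NFC", decomposed)))
print(len(unicodedata.normalize("NFD", composed)))
```

Normalizing both operands before comparison, rather than only one of them, keeps the helper correct regardless of which form the inputs arrive in.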
 
Unicode encodes 3,790 [[emoji|emojis]], whose continued development is conducted by the Consortium as part of the standard.<ref>{{Cite web |title=Emoji Counts, v16.0 |url=https://www.unicode.org/emoji/charts-16.0/emoji-counts.html |access-date=10 September 2024 |publisher=The Unicode Consortium}}</ref> The widespread adoption of Unicode was in large part responsible for the initial popularization of emoji outside of [[Japan]].{{citation needed|date=June 2025}}
 
Unicode text is processed and stored as binary data [[comparison of Unicode encodings|using one of several encodings]], which define how to translate the standard's abstract character codes (code points) into sequences of bytes. ''The Unicode Standard'' itself defines three encodings: [[UTF-8]], [[UTF-16]],{{efn|A large amount of documentation for Windows incorrectly uses the term "Unicode" to mean ''only'' the UTF-16 encoding.}} and [[UTF-32]], though several others exist. UTF-8 is the most widely used by a large margin, in part due to its backward compatibility with [[ASCII]].
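
The translation from code points to bytes described above can be sketched with Python's built-in codecs. The sample characters and the helper name <code>encoded_sizes</code> are assumptions made for this illustration; the big-endian codec variants are used only so that no byte order mark is added to the output.

```python
# Sample characters chosen for illustration: an ASCII letter, an accented
# Latin letter, the euro sign, and an emoji outside the Basic Multilingual Plane.
SAMPLES = ["a", "\u00e9", "\u20ac", "\U0001F600"]


def encoded_sizes(ch: str) -> dict:
    """Hypothetical helper: byte length of one character in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # "-be" avoids a byte order mark
        "utf-32": len(ch.encode("utf-32-be")),
    }


for ch in SAMPLES:
    # Show the code point next to its byte sequence in each encoding.
    print(
        f"U+{ord(ch):04X}",
        ch.encode("utf-8").hex(" "),
        ch.encode("utf-16-be").hex(" "),
        ch.encode("utf-32-be").hex(" "),
        sep=" | ",
    )

# ASCII compatibility: pure-ASCII text has the same bytes in ASCII and UTF-8.
ascii_text = "Unicode"
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))
```

UTF-8 and UTF-16 are variable-width, so the byte count depends on the code point, whereas UTF-32 always uses four bytes per code point.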