Unicode character property: Difference between revisions

Revision as of 18:38, 8 January 2025 by 76.137.105.116 (section: Unicode 1.0 names; tag: Reverted)
Latest revision as of 22:28, 11 June 2025 by Drmccreedy (minor edit: switch to a proper ref using https instead of ftp)
(7 intermediate revisions by 5 users not shown)
Line 26:

The following Unicode categories do not have a Name value assigned: Controls (General Category: Cc), Private use (Co), Surrogate (Cs), Non-characters (Cn) and Reserved (Cn). They may be referenced, informally, by a generic or specific meta-name, called "Code Point Labels": {{not a typo|<control>, <control-0088>, <reserved>, <noncharacter-''hhhh''>, <private-use-''hhhh''>, or <surrogate>}}. Since these labels contain "<" and ">", they can never appear in a Name, which prevents confusion.

==={{anchor|Version 1.0 names}}Unicode 1.0 names===

In version 2.0 of Unicode, many names were changed. From then on the rule "a name will never change" came into effect, including the strict (normative) use of alias names. Disused Unicode 1.0 names were moved to the property Alias, to provide backward compatibility. For example, {{Unichar|264}} has the Unicode 1.0 name "LATIN SMALL LETTER BABY GAMMA".

===Character name alias===

Line 59 ⟶ 64:

===Casing===

The Case value is normative in Unicode. It pertains to those scripts with uppercase and ~~the~~ lowercase letters. Case-difference occurs in Adlam, Armenian, Cherokee, Coptic, Cyrillic, Deseret, Garay, Glagolitic, Greek, Khutsuri and Mkhedruli Georgian, Latin, Medefaidrin, Old Hungarian, Osage, Vithkuqi and Warang Citi scripts.

<!--(upper, lower, title, folding, both simple and full)-->

Line 71 ⟶ 76:

In Greek, the letter sigma has different lowercase forms depending on where it is in a word. {{Unichar|03a3}} converts to {{Unichar|03c3}} if it is at the start or middle of a word, and converts to {{Unichar|03c2}} if it is at the end of a word. In Lithuanian, the dot in lowercase i and j is preserved when followed by accents.
For example: Í in lowercase is i̇́.<ref>~~[http://ftp.unicode.org/Public/UNIDATA/SpecialCasing.txt]~~{{Cite web|url=https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt|title=Unicode Character Database: Special Casing Data|date=2024-05-10}}</ref> Despite the existence of {{Unichar|1E9E}}, {{Unichar|00DF}} corresponds to "SS".
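
The special-casing behaviour described above can be observed with a short Python sketch. It is an illustration only, not part of the article: the function names are hypothetical, and it assumes CPython 3, whose built-in `str.lower()` and `str.upper()` apply the language-independent full case mappings but no locale-specific tailoring. Under that assumption the Greek final-sigma rule and the "SS" mapping are visible, while the Lithuanian rule is not applied.

```python
# Illustrative sketch (assumes CPython 3 built-in case mappings).
# Function names are hypothetical and chosen for this example only.

def lowercase_examples():
    """Lowercase a few strings that exercise special-casing rules."""
    return {
        # Capital sigma at the start of a word becomes U+03C3.
        "sigma_start": "\u03a3\u0391".lower(),
        # Capital sigma at the end of a word becomes final sigma U+03C2.
        "sigma_end": "\u0391\u03a3".lower(),
        # Capital sharp s U+1E9E lowercases to U+00DF.
        "capital_sharp_s": "\u1e9e".lower(),
        # No Lithuanian tailoring here: U+00CD becomes plain U+00ED,
        # not the dot-preserving sequence i + U+0307 + U+0301.
        "i_acute": "\u00cd".lower(),
    }

def uppercase_sharp_s():
    """Uppercase U+00DF; the full mapping expands it to two letters."""
    return "\u00df".upper()

if __name__ == "__main__":
    for label, value in lowercase_examples().items():
        print(label, [f"U+{ord(ch):04X}" for ch in value])
    print("sharp_s_upper", uppercase_sharp_s())
```

Applying the Lithuanian rule would need a locale-aware library; none is assumed here.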