Unicode character property: Difference between revisions

Tag: Reverted
m switch to a proper ref using https instead of ftp
 
(7 intermediate revisions by 5 users not shown)
Line 26:
 
The following Unicode categories do not have a Name value assigned: Controls (General Category: Cc), Private use (Co), Surrogate (Cs), Non-characters (Cn) and Reserved (Cn). They may be referenced, informally, by a generic or specific meta-name, called "Code Point Labels": {{not a typo|<control>, <control-0088>, <reserved>, <noncharacter-''hhhh''>, <private-use-''hhhh''>, or <surrogate>}}. Since these labels contain "<" and ">", they can never appear in a Name, which prevents confusion.
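The labelling scheme above can be sketched in Python with the standard <code>unicodedata</code> module. This is an illustrative sketch only: the helper name <code>code_point_label</code> is hypothetical, the noncharacter test (U+FDD0..U+FDEF plus every code point ending in FFFE or FFFF) is written out by hand, and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# General Category values that never carry a Name, mapped to their label kind.
LABEL_KINDS = {"Cc": "control", "Co": "private-use", "Cs": "surrogate"}


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for code points ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def code_point_label(cp):
    """Return the Name if one exists, else a label such as <control-0088>.

    Hypothetical helper: returns None for a code point that has no Name in
    this Python build but is not in one of the unnamed categories.
    """
    ch = chr(cp)
    name = unicodedata.name(ch, None)
    if name is not None:
        return name
    category = unicodedata.category(ch)
    if category in LABEL_KINDS:
        kind = LABEL_KINDS[category]
    elif category == "Cn":
        kind = "noncharacter" if is_noncharacter(cp) else "reserved"
    else:
        return None
    return f"<{kind}-{cp:04X}>"
```

For instance, <code>code_point_label(0x88)</code> yields <code><control-0088></code>, while <code>code_point_label(0x41)</code> yields the ordinary Name <code>LATIN CAPITAL LETTER A</code>.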
 
==={{anchor|Version 1.0 names}}Unicode 1.0 names===
In version 2.0 of Unicode, many names were changed. From then on the rule "a name will never change" came into effect, including the strict (normative) use of alias names. Disused Unicode 1.0 names were moved to the property Alias, to provide backward compatibility.
 
For example, {{Unichar|264}} has the Unicode 1.0 name "LATIN SMALL LETTER BABY GAMMA".
 
===Character name alias===
Line 59 ⟶ 64:
 
===Casing===
The Case value is normative in Unicode. It pertains to scripts that have both uppercase and lowercase letters. Case distinctions occur in the Adlam, Armenian, Cherokee, Coptic, Cyrillic, Deseret, Garay, Glagolitic, Greek, Khutsuri and Mkhedruli Georgian, Latin, Medefaidrin, Old Hungarian, Osage, Vithkuqi and Warang Citi scripts.
 
<!--(upper, lower, title, folding—both simple and full)-->
Line 71 ⟶ 76:
In Greek, the letter sigma has different lowercase forms depending on where it is in a word. {{Unichar|03a3}} converts to {{Unichar|03c3}} if it is at the start or middle of a word, and converts to {{Unichar|03c2}} if it is at the end of a word.
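As an illustration of the sigma rule, Python's built-in <code>str.lower()</code> applies this context-sensitive mapping. The variable names below are chosen only for this sketch.

```python
# Context-sensitive lowercasing of GREEK CAPITAL LETTER SIGMA (U+03A3).
word = "\u039f\u0394\u039f\u03a3"      # "ΟΔΟΣ": sigma at the end of the word
lowered_word = word.lower()            # final sigma becomes U+03C2

initial = "\u03a3\u0391\u03a3"         # "ΣΑΣ": sigma at the start and at the end
lowered_initial = initial.lower()      # U+03C3 at the start, U+03C2 at the end

isolated = "\u03a3".lower()            # a lone sigma is not word-final: U+03C3
```

Here <code>lowered_word</code> ends in ς (U+03C2), whereas the leading sigma of <code>lowered_initial</code> and the isolated sigma both become σ (U+03C3).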
 
In Lithuanian, the dot in lowercase i and j is preserved when followed by accents. For example: Í in lowercase is i̇́.<ref>{{Cite web|url=https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt|title=Unicode Character Database: Special Casing Data|date=2024-05-10}}</ref>
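Python's <code>str.lower()</code> is not locale-sensitive, so it does not apply the Lithuanian rule by itself. The following is a simplified sketch of the idea, not a full implementation of the special-casing data: the function name <code>lower_lithuanian</code> is hypothetical, only decomposed capital I and J are handled, every other character is lowercased one at a time without context, and the result is left in decomposed (NFD) form.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE


def lower_lithuanian(text):
    """Lowercase text, keeping the dot of i/j when an accent above follows.

    Simplified sketch: inserts U+0307 before the first combining mark of
    class 230 ("above") that follows a capital I or J.
    """
    out = []
    pending = False  # True while scanning the marks attached to a capital I/J
    for ch in unicodedata.normalize("NFD", text):
        ccc = unicodedata.combining(ch)
        if pending and ccc == 230:
            out.append(DOT_ABOVE)   # an accent above follows: keep the dot
            pending = False
        elif ccc == 0:
            pending = False         # a new base character ends the sequence
        if ch in ("I", "J"):
            pending = True
        out.append(ch.lower())      # per-character lowercasing, no context
    return "".join(out)
```

With this sketch, <code>lower_lithuanian("\u00cd")</code> gives the sequence i, U+0307, U+0301, matching the example above, while the default <code>"\u00cd".lower()</code> gives the single character í (U+00ED) without the extra dot.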
 
Despite the existence of {{Unichar|1E9E}}, the uppercase mapping of {{Unichar|00DF}} is "SS".
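The asymmetry can be observed with Python's built-in string methods, which follow the Unicode case mappings. The variable names are chosen only for this sketch.

```python
sharp_s = "\u00df"           # LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"   # LATIN CAPITAL LETTER SHARP S

upper_of_small = sharp_s.upper()            # "SS", not U+1E9E
lower_of_capital = capital_sharp_s.lower()  # U+00DF
folded = sharp_s.casefold()                 # "ss", used for caseless matching
street = "Stra\u00dfe".upper()              # "STRASSE"
```

Uppercasing and then lowercasing therefore does not round-trip: <code>sharp_s.upper().lower()</code> is "ss" rather than the original letter.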