Tag: Reverted |
Drmccreedy (talk | contribs) m switch to a proper ref using https instead of ftp |
(7 intermediate revisions by 5 users not shown) | |||
Line 26:
The following Unicode categories do not have a Name value assigned: Controls (General Category: Cc), Private use (Co), Surrogate (Cs), Non-characters (Cn) and Reserved (Cn). They may be referenced, informally, by a generic or specific meta-name, called "Code Point Labels": {{not a typo|<control>, <control-0088>, <reserved>, <noncharacter-''hhhh''>, <private-use-''hhhh''>, or <surrogate>}}. Since these labels contain "<" and ">", they can never appear in a Name, which prevents confusion.
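As a rough illustration of how such labels could be derived, the following Python sketch builds a label from a code point's General Category using the standard <code>unicodedata</code> module. The function name <code>code_point_label</code> is hypothetical, the hexadecimal suffix follows the <code><control-0088></code> pattern quoted above, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata
from typing import Optional


def code_point_label(cp: int) -> Optional[str]:
    """Return a code point label such as '<control-0088>'.

    Hypothetical helper: returns None when the code point has a
    proper Name value and therefore needs no label.
    """
    ch = chr(cp)
    if unicodedata.name(ch, None) is not None:
        return None  # a real Name exists

    category = unicodedata.category(ch)
    if category == "Cc":
        kind = "control"
    elif category == "Co":
        kind = "private-use"
    elif category == "Cs":
        kind = "surrogate"
    elif category == "Cn":
        # Noncharacters are U+FDD0..U+FDEF plus the last two
        # code points of every plane (U+xxFFFE and U+xxFFFF).
        is_nonchar = 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE
        kind = "noncharacter" if is_nonchar else "reserved"
    else:
        return None  # not one of the unnamed categories

    return f"<{kind}-{cp:04X}>"


print(code_point_label(0x0088))  # a C1 control
print(code_point_label(0xE000))  # first private-use code point
print(code_point_label(0x0041))  # LATIN CAPITAL LETTER A has a Name
```

Because every label starts with "<", a caller can distinguish a label from a genuine Name by inspecting the first character.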
==={{anchor|Version 1.0 names}}Unicode 1.0 names===
Many character names were changed in version 2.0 of Unicode. Since then, the rule "a name will never change" has been in effect, together with the strict (normative) use of alias names. Disused Unicode 1.0 names were moved to the property Alias to provide backward compatibility.
For example, {{Unichar|264}} has the Unicode 1.0 name "LATIN SMALL LETTER BABY GAMMA".
===Character name alias===
Line 59 ⟶ 64:
===Casing===
The Case value is normative in Unicode. It pertains to those scripts with uppercase and
<!--(upper, lower, title, folding—both simple and full)-->
Line 71 ⟶ 76:
In Greek, the letter sigma has two lowercase forms, chosen by its position in a word. {{Unichar|03a3}} lowercases to {{Unichar|03c3}} at the start or in the middle of a word, and to {{Unichar|03c2}} at the end of a word.
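The context-sensitive sigma rule can be observed with Python's built-in <code>str.lower()</code>, which applies the final-sigma condition in CPython 3.3 and later. This is a small sketch; the sample words and variable names are chosen only for illustration.

```python
# Capital sigma (U+03A3) at the end of a word becomes final sigma (U+03C2).
final = "ΟΔΟΣ".lower()

# Capital sigma at the start of a word becomes the ordinary form (U+03C3).
initial = "ΣΟΦΙΑ".lower()

# A lone capital sigma is not preceded by a cased letter, so the
# final-sigma condition does not apply.
isolated = "Σ".lower()

for label, text in [("final", final), ("initial", initial), ("isolated", isolated)]:
    print(label, text, [f"U+{ord(c):04X}" for c in text])
```

A simple per-character mapping table cannot express this behavior, because the result for U+03A3 depends on the neighboring characters.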
In Lithuanian, the dot in lowercase i and j is preserved when followed by accents. For example: Í in lowercase is i̇́.<ref>
Despite the existence of {{Unichar|1E9E}}, {{Unichar|00DF}} is converted to "SS" in uppercase.
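The sharp s mapping can be checked with Python's built-in string methods, which follow the default (locale-independent) Unicode case mappings. This is a minimal sketch; the variable names and the sample word are illustrative only.

```python
sharp_s = "\u00df"          # LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S

# Uppercasing expands one character into two.
upper = sharp_s.upper()

# The capital form still lowercases to the small sharp s.
lower = capital_sharp_s.lower()

# The conversion does not round-trip: "SS" lowercases to "ss".
round_trip = sharp_s.upper().lower()

word_upper = "Straße".upper()

print(upper, lower, round_trip, word_upper)
```

Since the uppercase result is longer than the input, code that assumes case conversion preserves string length will mishandle this character.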