Unicode character property

 
=={{anchor|Name}}Name and alias==
A Unicode character is assigned a unique '''Name''' (na).<ref name="Chapter4"/> The name is composed of uppercase letters A–Z, digits 0–9, hyphen-minus (-) and space ( ). Some sequences are excluded: a name may not begin or end with a space or hyphen, contain repeated spaces or hyphens, or have a space after a hyphen. The name is guaranteed to be unique within Unicode, and can be used to identify a code point and its character. Ideographic characters, of which there are tens of thousands, are named in the pattern "{{Smallcaps|{{lc:CJK UNIFIED IDEOGRAPH}}}}-''hhhh''", where ''hhhh'' is the code point in hexadecimal. For example, {{unichar|4E00|CJK UNIFIED IDEOGRAPH-4E00}}. Formatting characters are named too: {{unichar|00A0|NO-BREAK SPACE}}.
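
The naming rules above can be illustrated with a short Python sketch. The helper <code>is_well_formed_name</code> is a hypothetical name introduced here for illustration; it checks only the syntactic constraints as stated in this section and does not tell whether a name is actually assigned. The standard-library <code>unicodedata</code> module is used to show the mapping between a Name and its code point.

```python
import re
import unicodedata

# Characters permitted in a Name, as stated in this section:
# uppercase A-Z, digits 0-9, hyphen-minus and space.
_ALLOWED = re.compile(r"[A-Z0-9 -]+")


def is_well_formed_name(name: str) -> bool:
    """Hypothetical helper: check the syntactic rules described above."""
    if not _ALLOWED.fullmatch(name):
        return False  # empty, or contains a character outside the repertoire
    if name[0] in " -" or name[-1] in " -":
        return False  # must not begin or end with a space or hyphen
    # No repeated spaces, no repeated hyphens, no space after a hyphen.
    return not any(seq in name for seq in ("  ", "--", "- "))


# A Name identifies a code point, and a code point yields its Name.
print(unicodedata.name("\u4e00"))                         # CJK UNIFIED IDEOGRAPH-4E00
print(unicodedata.name("\u00a0"))                         # NO-BREAK SPACE
print(unicodedata.lookup("NO-BREAK SPACE") == "\u00a0")   # True
print(is_well_formed_name("NO-BREAK SPACE"))              # True
print(is_well_formed_name("NO- BREAK SPACE"))             # False
```

Because the check is purely syntactic, a string such as "FOO BAR" passes it even though no character carries that name; a lookup is still needed to confirm that a name is assigned.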
 
The following classes of code point do not have a Name (na=""): Controls (General Category: Cc), Private use (Co), Surrogate (Cs), Non-characters (Cn) and Reserved (Cn). They may be referenced, informally, by a generic or specific meta-name, called "Code Point Labels": {{not a typo|<control>, <control-0088>, <reserved>, <noncharacter-''hhhh''>, <private-use-''hhhh''>, or <surrogate>}}. Since these labels contain <>-brackets, they can never appear as a Name, which prevents confusion.
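
As a minimal sketch of how such labels could be derived, the following Python function maps a code point to a label from its General Category. The function names <code>code_point_label</code> and <code>is_noncharacter</code> are hypothetical, chosen for this illustration. The sketch assumes the specific form of each label, with the code point appended in hexadecimal, and assumes the fixed set of 66 noncharacters (U+FDD0–U+FDEF and the last two code points of every plane). Which code points count as reserved depends on the version of the Unicode database bundled with the Python interpreter.

```python
import unicodedata

# General Category -> label prefix for the categories that map one-to-one.
_PREFIX = {"Cc": "control", "Co": "private-use", "Cs": "surrogate"}


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def code_point_label(cp: int):
    """Hypothetical helper: return a label such as '<control-0088>',
    or None if the code point has a real Name instead."""
    category = unicodedata.category(chr(cp))
    if category in _PREFIX:
        kind = _PREFIX[category]
    elif category == "Cn":
        # Cn covers both noncharacters and reserved (unassigned) code points.
        kind = "noncharacter" if is_noncharacter(cp) else "reserved"
    else:
        return None
    return f"<{kind}-{cp:04X}>"


print(code_point_label(0x0088))   # <control-0088>
print(code_point_label(0xFFFF))   # <noncharacter-FFFF>
print(code_point_label(0xE000))   # <private-use-E000>
print(code_point_label(0x0041))   # None (LATIN CAPITAL LETTER A has a Name)
```

For the same code points, <code>unicodedata.name(chr(cp), "")</code> returns the empty default, matching the empty Name (na="") described above.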