{{Short description|Named range of Unicode code points}}
{{for|the specific group of square characters in the Unicode typeset|Block Elements}}
A '''Unicode block''' is one of several contiguous ranges of numeric character codes ([[code point]]s) of the [[Unicode]] character set that are defined by the [[Unicode Consortium]] for administrative and documentation purposes. Typically, proposals such as the addition of new glyphs are discussed and evaluated by considering the relevant block or blocks as a whole.
== Design and implementation ==
Unicode blocks are identified by unique names, which use only ASCII characters and usually describe, in [[English language|English]], the nature of the symbols, such as "Tibetan" or "Supplemental Arrows-A". (When comparing block names, one is supposed to equate uppercase with lowercase letters and to ignore any whitespace, hyphens, and underscores; so the last name is equivalent to "
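The comparison rule for block names can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated in this section (fold case; drop whitespace, hyphens, and underscores), not the normative matching algorithm; the function names <code>normalize_block_name</code> and <code>same_block_name</code> are hypothetical.

```python
import re


def normalize_block_name(name: str) -> str:
    """Fold case and drop whitespace, hyphens, and underscores.

    Illustrative only: it follows the rule as described in the text,
    not any official reference implementation.
    """
    return re.sub(r"[\s\-_]+", "", name).lower()


def same_block_name(a: str, b: str) -> bool:
    """Return True if two spellings name the same block under that rule."""
    return normalize_block_name(a) == normalize_block_name(b)


if __name__ == "__main__":
    print(same_block_name("Supplemental Arrows-A", "supplemental_arrows_a"))
    print(same_block_name("Tibetan", "TIBETAN"))
```

Normalizing both names to a canonical key, rather than comparing them character by character, also lets the key be used directly as a dictionary index.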
Blocks are [[intersection (set theory)|pairwise disjoint]]; that is, they do not overlap. The starting code point and the size (number of code points) of each block are always multiples of 16; therefore, in the [[hexadecimal
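The alignment and disjointness constraints can be checked mechanically. The sketch below assumes blocks are represented as inclusive <code>(start, end)</code> pairs; the helper names are hypothetical, and the two ranges shown (Tibetan, U+0F00..U+0FFF, and Supplemental Arrows-A, U+27F0..U+27FF) are used only as examples.

```python
def is_block_aligned(start: int, end: int) -> bool:
    """True if both the start and the size of the range are multiples of 16."""
    size = end - start + 1
    return start % 16 == 0 and size % 16 == 0


def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two inclusive (start, end) ranges share any code point."""
    return a[0] <= b[1] and b[0] <= a[1]


TIBETAN = (0x0F00, 0x0FFF)
SUPPLEMENTAL_ARROWS_A = (0x27F0, 0x27FF)

if __name__ == "__main__":
    for start, end in (TIBETAN, SUPPLEMENTAL_ARROWS_A):
        # An aligned block starts at a hex value ending in 0
        # and ends at a hex value ending in F.
        print(f"{start:04X}..{end:04X}", is_block_aligned(start, end))
    print(overlaps(TIBETAN, SUPPLEMENTAL_ARROWS_A))
```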
Every assigned code point has a glyph property called "Block", whose value is a character string naming the unique block that owns that point.<ref>{{Cite web |title=Glossary |url=https://www.unicode.org/glossary/#B |access-date=2022-08-07 |website=www.unicode.org}}</ref> However, a block may also contain unassigned code points, usually reserved for future additions of characters that "logically" should belong to that block. Code points not belonging to any of the named blocks, e.g. in the unassigned [[Plane (Unicode)|planes]] 4–13, have the value block="
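Because blocks are disjoint ranges, looking up the block of a code point reduces to a binary search over the sorted block starts. The following is a minimal sketch, assuming a tiny hand-written sample table rather than the full published block list; <code>SAMPLE_BLOCKS</code> and <code>block_of</code> are hypothetical names, and returning <code>None</code> for code points outside the sample is a stand-in chosen for this example, not the value the standard defines.

```python
from bisect import bisect_right
from typing import Optional

# Small sample of (start, end, name) entries, sorted by start.
# An assumption for illustration; the real list is far longer.
SAMPLE_BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0F00, 0x0FFF, "Tibetan"),
    (0x27F0, 0x27FF, "Supplemental Arrows-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
]
_STARTS = [start for start, _, _ in SAMPLE_BLOCKS]


def block_of(cp: int) -> Optional[str]:
    """Return the sample block containing cp, or None if there is none."""
    i = bisect_right(_STARTS, cp) - 1  # last block starting at or before cp
    if i >= 0:
        start, end, name = SAMPLE_BLOCKS[i]
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for cp in (0x41, 0x0F40, 0xFDD0, 0x40000):
        print(f"U+{cp:04X}", block_of(cp))
```

Note that the lookup depends only on the range, so it returns a block name for unassigned code points inside a block as well.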
Simply belonging to a particular Unicode block does not guarantee that a character has any particular property. The identity of any character is determined by its properties as stated in the Unicode Character Database. For example, the contiguous range of 32 noncharacter code points U+FDD0..U+FDEF shares none of the properties common to the other characters in the [[Arabic Presentation Forms-A]] block: they are certainly not Arabic script characters or "right-to-left noncharacters", and were assigned there as filler, given that it had been agreed that no further Arabic compatibility characters would be encoded.<ref>{{Cite web |title=Private-Use Characters, Noncharacters & Sentinels FAQ |url=https://www.unicode.org/faq/private_use.html |access-date=2023-07-24 |website=www.unicode.org}}</ref>
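The difference between block membership and character properties can be observed with Python's standard <code>unicodedata</code> module. This is a small sketch: the constant and helper names are hypothetical, and U+FB50 is chosen here simply as one assigned letter from the same block for comparison.

```python
import unicodedata

# The 32 noncharacter code points that sit inside the
# Arabic Presentation Forms-A block (U+FB50..U+FDFF).
ARABIC_BLOCK_NONCHARACTERS = range(0xFDD0, 0xFDEF + 1)


def general_category(cp: int) -> str:
    """Return the General_Category of a code point, e.g. 'Lo' or 'Cn'."""
    return unicodedata.category(chr(cp))


if __name__ == "__main__":
    # An assigned Arabic presentation form is a letter ...
    print(f"U+FB50 {general_category(0xFB50)}")
    # ... while the noncharacters in the same block are not assigned at all.
    print(f"U+FDD0 {general_category(0xFDD0)}")
    print(len(ARABIC_BLOCK_NONCHARACTERS))
```

Code that needs to know whether a code point is an Arabic letter should therefore test its properties rather than its block.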
== Other classifications ==
== List of blocks ==
Unicode
*
*
*
* 2 in plane 3, the Tertiary Ideographic Plane ({{slink||TIP}})
* 2 in plane
* One each in the planes 15 (F<sub>hex</sub>) and 16 (10<sub>hex</sub>), called Supplementary Private Use Area-A and -B ({{slink||PUA-A}})
{{Unicode blocks|state=uncollapsed}}
== {{anchor|Deleted blocks}}Moved blocks ==
The Unicode Stability Policy requires that a character, once assigned, may not be moved or removed, although it may be deprecated. This applies to Unicode 2.0 and all subsequent versions.
Prior to this, the following former blocks were
{|class="wikitable collapsible" style="width:100%; margin:0;"
|+Former Unicode blocks from before Unicode 2.0
|