→Bidirectional writing: Corrected markup per MOS:BOLD and MOS:WAW, other tweaks |
→Numeric values and types: Corrected markup per MOS:BOLD and MOS:WAW |
Line 127:
===Decimal===
Characters are classified with a
The characters that do have a numeric value are divided into three groups: Decimal (De), Digit (Di) and Numeric (Nu, i.e. all others). "Decimal" means the character is a straight decimal digit. Only characters that are part of a contiguous encoded range 0..9 have numeric type Decimal. Other digits, like superscripts, have numeric type Digit. All other numeric characters, such as fractions and Roman numerals, end up with the type "Numeric". The intended effect is that a simple parser can use these decimal numeric values without being distracted by, say, a numeric superscript or a fraction. Eighty-three CJK Ideographs that represent a number, including those used for accounting, are typed Numeric.
On the other hand, characters that could have a numeric value as a second meaning are still marked Numeric type
{{Numeric Type (Unicode)}}
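As an illustration of the three numeric types, the following is a minimal sketch using Python's standard <code>unicodedata</code> module. The helper name <code>numeric_type</code> is hypothetical, and the result reflects the Unicode version bundled with the Python interpreter in use, not necessarily the latest version of the standard.

```python
import unicodedata


def numeric_type(ch):
    """Return 'Decimal', 'Digit', 'Numeric' or 'None' for one character.

    Hypothetical helper: it infers the type from which of the three
    unicodedata lookups succeeds, testing the narrowest type first.
    """
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"   # member of a contiguous 0..9 range
    if unicodedata.digit(ch, None) is not None:
        return "Digit"     # a digit, but not usable in plain decimal parsing
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"   # fractions, Roman numerals, and so on
    return "None"          # no numeric value at all


for sample in ("7", "\u0663", "\u00b2", "\u00bd", "\u2167", "A"):
    print(f"U+{ord(sample):04X}", numeric_type(sample))
```

A simple parser that only accepts characters for which the helper returns "Decimal" would skip the superscript two (U+00B2) and the fraction one half (U+00BD), which is the effect described above.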
===Hexadecimal digits===
[[Hexadecimal]] characters are those in the series with hexadecimal values 0...9ABCDEF (sixteen characters, decimal value 0–15). The character property
{{Hexadecimal digit (Unicode)}}
Forty-four characters are marked as ''Hex_Digit''. The ones in the Basic Latin block are also marked as
Unicode has no separate characters for hexadecimal values. As a consequence, when regular characters are used, it is not possible to determine whether a hexadecimal value is intended, or even whether a value is intended at all. That should be determined at a higher level, e.g. by prepending
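Python's <code>unicodedata</code> module does not expose the ''Hex_Digit'' property, so the sketch below builds the set by hand. It assumes the property covers the Basic Latin digits and letters A–F and a–f together with their fullwidth counterparts (U+FF10–FF19, U+FF21–FF26, U+FF41–FF46), which gives the forty-four characters mentioned above. The names <code>HEX_DIGITS</code> and <code>hex_value</code> are hypothetical.

```python
def _build_hex_digits():
    """Map each assumed Hex_Digit character to its value 0..15."""
    table = {}
    # (code point of zero, of capital A, of small a) for each block:
    # Basic Latin first, then Halfwidth and Fullwidth Forms.
    for zero, upper_a, lower_a in ((0x30, 0x41, 0x61),
                                   (0xFF10, 0xFF21, 0xFF41)):
        for v in range(10):
            table[chr(zero + v)] = v
        for v in range(6):
            table[chr(upper_a + v)] = 10 + v
            table[chr(lower_a + v)] = 10 + v
    return table


HEX_DIGITS = _build_hex_digits()


def hex_value(ch):
    """Return the hexadecimal value of ch, or None if it is not a hex digit."""
    return HEX_DIGITS.get(ch)


print(len(HEX_DIGITS))        # number of characters in the table
print(hex_value("F"))         # Basic Latin capital F
print(hex_value("\uff26"))    # fullwidth capital F
print(hex_value("g"))         # not a hexadecimal digit
```

The table only says that a character ''can'' serve as a hexadecimal digit. Whether "F" in a given text is meant as the value fifteen or as a letter still has to be decided by the surrounding syntax.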
==Block==