Unicode compatibility characters: Difference between revisions

Glyph substitution and composition: there is no such thing as “layoutting”
Glyph substitution and composition: Precomposed characters are in the Number Forms block.
Line 26:
Some compatibility characters are completely dispensable for text processing and display software that conforms to the Unicode standard. These include:
;[[typographic ligature|Ligatures]]: Ligatures such as 'ffi' in the Latin script were often encoded as separate characters in legacy character sets. Unicode treats ligatures as a matter of rich text and, where ligatures are enabled, handles them through glyph substitution.
;Precomposed Roman numerals: For example, Roman numeral twelve ('Ⅻ': U+216B) can be decomposed into a Roman numeral ten ('Ⅹ': U+2169) and two Roman numeral ones ('Ⅰ': U+2160). Precomposed characters are in the [[Number Forms]] block.
;Precomposed [[vulgar fraction|fractions]]: These decompositions carry the keyword &lt;fraction&gt;. A fully conforming text handler should<ref>{{cite web|author=The Unicode Consortium|authorlink=Unicode Consortium|year=2010|title=The Unicode Standard, Version 6.0.0|publisher=Addison-Wesley Professional|isbn=978-0321480910|pages=212|url=https://www.unicode.org/versions/Unicode6.0.0/ch06.pdf#G12861}}</ref> display the vulgar fraction ¼ (U+00BC) identically to the composed fraction 1⁄4 (digit 1, fraction slash U+2044, and digit 4). Most precomposed fractions are in the [[Number Forms]] block; ¼, ½ and ¾ are in the Latin-1 Supplement block.
;Contextual glyphs or forms: These arise primarily in the Arabic script. Using fonts with glyph substitution capabilities such as [[OpenType]] and [[Apple Advanced Typography|TrueTypeGX]], Unicode-conforming software can substitute the proper glyph for the same character depending on whether that character appears at the beginning, middle, or end of a word, or in isolation. Such glyph substitution is also necessary for vertical (top to bottom) text layout for some East Asian languages. In this case glyphs must be substituted or synthesized for wide, narrow, small and square glyph forms. Non-conforming software, or software using other character sets, instead uses multiple separate characters for the same letter depending on its position, further complicating text processing.
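
The categories above can be inspected programmatically. The following is a minimal sketch, assuming Python's standard <code>unicodedata</code> module; the helper name <code>describe</code> and the choice of sample characters are illustrative assumptions, not part of the Unicode standard. It prints each character's decomposition keyword and its compatibility (NFKD) decomposition. Note that the formal compatibility decomposition of Roman numeral twelve maps to the Latin letters X, I, I rather than to other Roman numeral characters.

```python
import unicodedata

# One illustrative sample per category (the selection is an assumption).
SAMPLES = {
    "ligature": "\uFB03",         # LATIN SMALL LIGATURE FFI
    "roman numeral": "\u216B",    # ROMAN NUMERAL TWELVE
    "fraction": "\u00BC",         # VULGAR FRACTION ONE QUARTER
    "contextual form": "\uFE8F",  # ARABIC LETTER BEH ISOLATED FORM
}


def describe(ch):
    """Return (decomposition keyword, NFKD string) for one character.

    Hypothetical helper: the keyword is the leading "<...>" tag of the
    character's decomposition mapping, or "" when there is no tag.
    """
    raw = unicodedata.decomposition(ch)
    tag = raw.split()[0] if raw.startswith("<") else ""
    return tag, unicodedata.normalize("NFKD", ch)


for label, ch in SAMPLES.items():
    tag, decomposed = describe(ch)
    codepoints = " ".join(f"U+{ord(c):04X}" for c in decomposed)
    print(f"{label}: U+{ord(ch):04X} {tag} -> {codepoints}")
```

Compatibility normalization only replaces the characters. Whether the resulting sequence is displayed as a ligature, a built-up fraction, or a positional Arabic form still depends on the font and on the glyph substitution performed by the rendering software.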