Indian Script Code for Information Interchange: Difference between revisions

== Special code points ==
 
; {{anchor|INV}}INV character—code point D9 (217): The INV (invisible consonant) character is used as a pseudo-consonant to display combining elements in isolation. For example, क (ka) + ् (halant) + INV = क्‍ (half ka). The Unicode equivalent is {{unichar|200D|ZERO WIDTH JOINER}} ({{ctrl|ZWJ}}). However, as noted [[#virama|below]], the ISCII halant character can also be doubled or combined with the ISCII nukta to achieve {{ctrl|ZWNJ}} or ZWJ effects. For this reason, [[Apple Inc|Apple]] maps the ISCII INV character to the Unicode {{ctrl|LRM|left-to-right mark}}, so as to guarantee [[round-trip format conversion|round-tripping]].<ref>{{cite web |url=https://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/DEVANAGA.TXT |title=Map (external version) from Mac OS Devanagari encoding to Unicode 2.1 and later. |author=Apple |author-link=Apple Inc |publisher=[[Unicode Consortium]] |date=2005-04-05 |orig-year=1998-02-05}}</ref>
; {{anchor|ATR}}ATR character—code point EF (239): The ATR (attribute) character followed by a byte code is used to switch to a different font attribute (such as bold) or to a different ISCII script or [[PASCII]] locale (such as Bengali), up to the next ATR sequence or the end of the line. This has no direct Unicode equivalent, as font attributes are not part of Unicode, and each script has a distinct set of code points.
{| class="wikitable"
|+ Presentational attributes
|-
!ATR + byte!!Mnemonic!!Formatting option
|-
|0x30||BLD||[[Boldface|Bold]]
|-
|0x31||ITA||[[Italics]]
|-
|0x32||UL||[[Underlining]]
|-
|0x33||EXP||Expanded
|-
|0x34||HLT||Highlight
|-
|0x35||OTL||Outline
|-
|0x36||SHD||Shadow
|-
|0x37||TOP||Top half of character (used with LOW to create double-height characters)
|-
|0x38||LOW||Bottom half of character (used with TOP to create double-height characters)
|-
|0x39||DBL||Entire row double-width and double-height
|}
{| class="wikitable"
|+ Shifts to ISCII scripts
|-
!ATR + byte!!Mnemonic!!ISCII script
|-
|0x40||DEF||Reset to default script
|-
|0x41||RMN||Romanised [[transliteration]]
|-
|0x42||DEV||[[Devanagari]]
|-
|0x43||BNG||[[Bengali script]]
|-
|0x44||TML||[[Tamil script]]
|-
|0x45||TLG||[[Telugu script]]
|-
|0x46||ASM||[[Assamese script]]
|-
|0x47||ORI||[[Oriya script]]
|-
|0x48||KND||[[Kannada script]]
|-
|0x49||MLM||[[Malayalam script]]
|-
|0x4A||GJR||[[Gujarati script]]
|-
|0x4B||PNJ||[[Punjabi script]]
|}
{| class="wikitable"
|+ Shifts to [[PASCII]]
|-
!ATR + byte!!Mnemonic!!PASCII locale
|-
|0x71||ARB||[[Arabic alphabet]]
|-
|0x72||PES||[[Persian alphabet]]
|-
|0x73||URD||[[Urdu alphabet]]
|-
|0x74||SND||[[Sindhi alphabet]]
|-
|0x75||KSM||[[Kashmiri alphabet]]
|-
|0x76||PST||[[Pashto alphabet]]
|}
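As a rough illustration of how the ATR sequences tabulated above might be picked out of a byte stream, the following Python sketch splits one line of ISCII bytes into runs, each labelled with the mnemonic of the ATR sequence that precedes it. The function name <code>split_atr_runs</code> and the handling of unlisted attribute bytes are assumptions made for this example only and are not part of the ISCII standard. The sketch does not model how attributes combine or toggle.

```python
ATR = 0xEF  # ISCII attribute introducer, per the section above

# Mnemonics copied from the three tables above, keyed by the byte after ATR.
ATR_MNEMONICS = {
    0x30: "BLD", 0x31: "ITA", 0x32: "UL", 0x33: "EXP", 0x34: "HLT",
    0x35: "OTL", 0x36: "SHD", 0x37: "TOP", 0x38: "LOW", 0x39: "DBL",
    0x40: "DEF", 0x41: "RMN", 0x42: "DEV", 0x43: "BNG", 0x44: "TML",
    0x45: "TLG", 0x46: "ASM", 0x47: "ORI", 0x48: "KND", 0x49: "MLM",
    0x4A: "GJR", 0x4B: "PNJ",
    0x71: "ARB", 0x72: "PES", 0x73: "URD", 0x74: "SND", 0x75: "KSM",
    0x76: "PST",
}


def split_atr_runs(line: bytes):
    """Split one line of ISCII bytes into (mnemonic, payload) runs.

    The run before any ATR sequence has mnemonic None. Attribute bytes that
    are not in the tables are reported as "?XX" (an illustrative choice).
    A lone ATR byte at the very end of the line is kept as payload.
    """
    runs = []
    current = None          # mnemonic of the most recent ATR sequence
    payload = bytearray()   # bytes collected since that sequence
    i = 0
    while i < len(line):
        if line[i] == ATR and i + 1 < len(line):
            # Close the run in progress before starting a new one.
            if payload or current is not None:
                runs.append((current, bytes(payload)))
            code = line[i + 1]
            current = ATR_MNEMONICS.get(code, f"?{code:02X}")
            payload = bytearray()
            i += 2
        else:
            payload.append(line[i])
            i += 1
    if payload or current is not None:
        runs.append((current, bytes(payload)))
    return runs
```

For instance, the bytes <code>B3 EF 43 B3 B4 EF 30 B5</code> would be split into an unlabelled run <code>B3</code>, a <code>BNG</code> run <code>B3 B4</code>, and a <code>BLD</code> run <code>B5</code>.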
; {{anchor|EXT}}EXT character—code point F0 (240): The EXT (extensions for Vedic) character followed by a byte code indicates a Vedic accent. This has no direct Unicode equivalent, as Vedic accents are assigned to distinct code points.
; {{anchor|virama}}Halant character ्—code point E8 (232): The halant character removes the implicit vowel from a consonant and is used between consonants to represent conjunct consonants. For example, क (ka) + ् (halant) + त (ta) = क्त (kta). The sequence ् (halant) + ् (halant) displays a conjunct with an explicit halant, for example क (ka) + ् (halant) + ् (halant) + त (ta) = क्‌त. The sequence ् (halant) + ़ (nukta) displays a conjunct with half consonants, if available, for example क (ka) + ् (halant) + ़ (nukta) + त (ta) = क्‍त.
{| class="wikitable Unicode"
!colspan=2| ISCII !!colspan=2| Unicode
|-
| halant + nukta || <code>E8 E9</code> || halant + [[zero-width joiner|ZWJ]] || <code>094D 200D</code>
|}
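The halant sequences described above can be sketched as a small converter. The Python example below maps halant + nukta (<code>E8 E9</code>) to <code>094D 200D</code> as in the table row above. The mapping of a doubled halant (<code>E8 E8</code>) to <code>094D 200C</code> is inferred from the prose description of the explicit-halant form and should be treated as an assumption. The function name <code>convert_halant_sequences</code> and the two-entry <code>DEMO_MAP</code> (ISCII Devanagari <code>B3</code> for क and <code>C2</code> for त) are also assumptions for illustration and are not taken from this section.

```python
HALANT = 0xE8  # ISCII halant, per the section above
NUKTA = 0xE9   # ISCII nukta, per the section above

# Illustrative two-entry mapping for ISCII Devanagari (an assumption, not
# taken from the section above): B3 -> ka, C2 -> ta.
DEMO_MAP = {0xB3: "\u0915", 0xC2: "\u0924"}


def convert_halant_sequences(data: bytes, base_map: dict) -> str:
    """Convert ISCII bytes to Unicode, handling halant sequences specially.

    Bytes other than halant sequences are looked up in base_map; an
    unmapped byte raises KeyError.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else None
        if b == HALANT and nxt == NUKTA:
            out.append("\u094D\u200D")  # halant + ZWJ (table row above)
            i += 2
        elif b == HALANT and nxt == HALANT:
            out.append("\u094D\u200C")  # halant + ZWNJ (inferred from prose)
            i += 2
        elif b == HALANT:
            out.append("\u094D")        # plain halant
            i += 1
        else:
            out.append(base_map[b])
            i += 1
    return "".join(out)
```

Under these assumptions, <code>B3 E8 C2</code> yields the plain conjunct sequence, <code>B3 E8 E8 C2</code> yields the explicit-halant form, and <code>B3 E8 E9 C2</code> yields the half-consonant form.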
; {{anchor|nuqta}}Nukta character ़—code point E9 (233): The [[nukta]] character, placed after another ISCII character, is used for a number of rarer characters that do not exist in the main ISCII set. For example, क (ka) + ़ (nukta) = क़ (qa). These characters have precomposed forms in Unicode, as shown in the following table.
{| class="wikitable Unicode" style="font-size:120%;"
! ISCII<br>code point !! Original<br>character !! Character<br>with nukta !! Unicode<br>code point