Move external link to citation |
Add citations |
Line 4:
'''Indian Standard Code for Information Interchange''' ('''ISCII''') is a coding scheme for representing various writing systems of [[India]]. It encodes the main [[Indic script]]s and a Roman transliteration. The supported scripts are: [[Eastern Nagari|Bengali–Assamese]], [[Devanagari]], [[Gujarāti script|Gujarati]], [[Gurmukhi]], [[Kannada script|Kannada]], [[Malayalam script|Malayalam]], [[Oriya script|Oriya]], [[Tamil script|Tamil]], and [[Telugu script|Telugu]]. ISCII does not encode the writing systems of India that are based on [[Persian language|Persian]], but its writing system switching codes nonetheless provide for [[Kashmiri language|Kashmiri]], [[Sindhi language|Sindhi]], [[Urdu]], [[Persian language|Persian]], [[Pashto language|Pashto]] and [[Arabic]]. The Persian-based writing systems were subsequently encoded in the [[Perso-Arabic Script Code for Information Interchange|PASCII]] encoding.
ISCII has not been widely used outside certain government institutions, although a variant without the {{ctrl|ATR|internal=yes}} mechanism was used on [[classic Mac OS]] as [[Mac OS Devanagari encoding|Mac OS Devanagari]].<ref name="appledevanagari"/> It has now been rendered largely obsolete by [[Unicode]]. Unicode uses a separate block for each Indic writing system, and largely preserves the ISCII layout within each block.<ref name="unicode">{{cite book |title=The Unicode Standard v15.0 Chapter 12 |publisher=The Unicode Consortium |url=https://www.unicode.org/versions/Unicode15.0.0/ch12.pdf |access-date=13 August 2024}}</ref>{{rp|p=462}}
==Background==
The Brahmi-derived writing systems have a similar structure.<ref name="unicode"/>{{rp|p=462}} ISCII therefore encodes letters with the same phonetic value at the same code point, overlaying the various scripts. For example, the ISCII codes 0xB3 0xDB represent [ki]. This is rendered as കി in [[Malayalam]], कि in Devanagari, ਕਿ in Gurmukhi, and கி in Tamil. The writing system can be selected in rich text by markup or in plain text by means of the {{ctrl|ATR|internal=yes}} code described below.
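
The overlay can be illustrated with a minimal Python sketch. It assumes a hypothetical two-entry lookup table (<code>ISCII_TO_OFFSET</code>) covering only the two bytes of the [ki] example, and uses the starting code points of the corresponding Unicode blocks. It is not a complete converter: a real one needs the full mapping table and handling for sequences that do not map one-to-one.

```python
# Start of each script's Unicode block (taken from the Unicode code charts).
BLOCK_BASE = {
    "Devanagari": 0x0900,
    "Gurmukhi": 0x0A00,
    "Tamil": 0x0B80,
    "Malayalam": 0x0D00,
}

# Hypothetical, deliberately tiny table: ISCII byte -> offset inside a
# Unicode block. Only the two bytes from the [ki] example are covered.
ISCII_TO_OFFSET = {
    0xB3: 0x15,  # consonant "ka"
    0xDB: 0x3F,  # vowel sign "i"
}


def iscii_to_unicode(data: bytes, script: str) -> str:
    """Map ISCII bytes to Unicode, using the block of the chosen script."""
    base = BLOCK_BASE[script]
    return "".join(chr(base + ISCII_TO_OFFSET[b]) for b in data)


# The same two bytes yield a different rendering for each script.
for script in BLOCK_BASE:
    print(script, iscii_to_unicode(b"\xb3\xdb", script))
```

The byte sequence never changes; only the block base chosen by the caller does, which is the sense in which ISCII overlays the scripts.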
One motivation for using a single encoding is the idea that it will allow easy [[transliteration]] from one writing system to another.<ref name="unicode"/>{{rp|p=462}} However, there are enough incompatibilities that this is not practical.
ISCII is an 8-bit encoding.<ref name="std"/>{{rp|p=4}} The lower 128 code points are plain [[American Standard Code for Information Interchange|ASCII]]; the upper 128 code points are ISCII-specific. In addition to the code points representing characters, ISCII makes use of a code point with mnemonic {{ctrl|ATR|internal=yes}}, which indicates that the following byte contains one of two kinds of information. One set of values changes the writing system until the next writing system indicator or end-of-line. Another set of values selects display modes such as bold and italic. ISCII does not provide a means of indicating the default writing system.
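
The switching behaviour described above can be sketched in Python as follows. The value used for the {{ctrl|ATR|internal=yes}} byte (<code>0xEF</code>) and the script codes in <code>SCRIPT_CODES</code> are assumptions made for illustration and should be checked against the ISCII standard; the function name <code>split_runs</code> is hypothetical. Because ISCII cannot indicate the default writing system, the caller has to supply it.

```python
ATR = 0xEF  # assumed value of the ATR code point; verify against the standard

# Assumed, partial table of script-selection values that may follow ATR.
SCRIPT_CODES = {0x42: "Devanagari", 0x44: "Tamil", 0x49: "Malayalam"}


def split_runs(data, default_script, atr=ATR, script_codes=SCRIPT_CODES):
    """Split an ISCII byte string into (script, bytes) runs.

    A script selected through ATR stays active until the next script
    selector or the end of the line, after which the default applies again.
    """
    runs = []
    script = default_script
    current = bytearray()

    def flush():
        nonlocal current
        if current:
            runs.append((script, bytes(current)))
            current = bytearray()

    i = 0
    while i < len(data):
        b = data[i]
        if b == atr and i + 1 < len(data):
            arg = data[i + 1]
            if arg in script_codes:
                flush()                     # close the run in the old script
                script = script_codes[arg]
            # Other values (display modes such as bold) are skipped here.
            i += 2
            continue
        current.append(b)
        if b == 0x0A:                       # end-of-line resets the script
            flush()
            script = default_script
        i += 1
    flush()
    return runs


sample = b"\xb3\xdb" + bytes([ATR, 0x44]) + b"\xb3\xdb\n" + b"\xb3"
print(split_runs(sample, "Devanagari"))
```

In this sketch, bytes below 0x80 pass through unchanged as ASCII, and only the newline is given special treatment because it ends the scope of a script selector.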
== Codepage layout ==
The following table shows the character set for [[Devanagari]]. The code sets for Assamese, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu are similar, with each Devanagari form replaced by the [[Brahmic family of scripts|equivalent form in each writing system]]{{r|name=unicode|p=462}}. Each character is shown with its decimal code and its [[Unicode]] equivalent.
{|{{chset-table-header1|ISCII Devanagari<ref name="std">{{Cite
|-
| {{chset-left1|0x}}
Line 453:
{| class="wikitable collapsible collapsed Unicode" border="1" style="text-align:center; font-size:100%;"
|+ Code set for all abugidas using ISCII{{refn|{{multiref|This table can be derived from the correspondence between Tables 2 and 3 in the ISCII standard<ref name="std"/> and the [[Unicode Standard]] code charts.}}}}
! Hex !! Official<br>Listing !! [[ISO 15919]]!! colspan="2"| [[Devanagari]]!! colspan="2"| [[Bengali alphabet|Bengali]]
! colspan="2" |Assamese!! colspan="2" | [[Gurmukhi script|Gurmukhi]]!! colspan="2"| [[Gujarati script|Gujarati]]!! colspan="2"| [[Oriya script|Oriya]]!! colspan="2"| [[Tamil script|Tamil]]!! colspan="2"| [[Telugu script|Telugu]]!! colspan="2"| [[Kannada script|Kannada]]!! colspan="2"| [[Malayalam script|Malayalam]]
|