Indian Script Code for Information Interchange: Difference between revisions

'''Indian Script Code for Information Interchange''' ('''ISCII''') is a coding scheme for representing various writing systems of [[India]]. It encodes the main [[Indic script]]s and a Roman transliteration. The supported scripts are: [[Eastern Nagari|Bengali–Assamese]], [[Devanagari]], [[Gujarāti script|Gujarati]], [[Gurmukhi]], [[Kannada script|Kannada]], [[Malayalam script|Malayalam]], [[Oriya script|Oriya]], [[Tamil script|Tamil]], and [[Telugu script|Telugu]]. ISCII does not encode the writing systems of India that are based on [[Persian language|Persian]], but its writing system switching codes nonetheless provide for [[Kashmiri language|Kashmiri]], [[Sindhi language|Sindhi]], [[Urdu]], [[Persian language|Persian]], [[Pashto language|Pashto]] and [[Arabic]]. The Persian-based writing systems were subsequently encoded in the [[Perso-Arabic Script Code for Information Interchange|PASCII]] encoding.
 
ISCII has not been widely used outside certain government institutions, although a variant without the {{ctrl|ATR|internal=yes}} mechanism, [[Mac OS Devanagari encoding|Mac OS Devanagari]], was used on [[classic Mac OS]].<ref name="appledevanagari"/> It has now been rendered largely obsolete by [[Unicode]]. Unicode uses a separate block for each Indic writing system, and largely preserves the ISCII layout within each block.<ref name="unicode">{{cite book |title=The Unicode Standard v15.0 Chapter 12 |publisher=The Unicode Consortium |url=https://www.unicode.org/versions/Unicode15.0.0/ch12.pdf |access-date=13 August 2024}}</ref>{{rp|p=462}}
 
==Background==
 
The Brahmi-derived writing systems have a similar structure.<ref name="unicode"/>{{rp|p=462}} ISCII therefore encodes letters with the same phonetic value at the same code point, overlaying the various scripts. For example, the ISCII codes 0xB3 0xDB represent [ki]. This is rendered as കി in [[Malayalam]], कि in Devanagari, ਕਿ in Gurmukhi, and கி in Tamil. The writing system can be selected in rich text by markup or in plain text by means of the {{ctrl|ATR|internal=yes}} code described below.
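
The overlay can be illustrated with a short Python sketch. It is not part of any ISCII library: the function name <code>render</code> and both tables are hypothetical, and the table covers only the two bytes from the example above. It relies on the Unicode blocks for these scripts placing the two letters at the same offset from the start of each block.

```python
# Start of the Unicode block for each script (from the Unicode code charts).
BLOCK_BASE = {
    "Devanagari": 0x0900,
    "Gurmukhi": 0x0A00,
    "Tamil": 0x0B80,
    "Malayalam": 0x0D00,
}

# Illustrative table for the two ISCII bytes in the example only.
# Values are offsets from the start of the script's Unicode block.
ISCII_TO_OFFSET = {
    0xB3: 0x15,  # consonant KA
    0xDB: 0x3F,  # vowel sign I
}


def render(iscii_bytes, script):
    """Map ISCII bytes to Unicode text in the chosen script."""
    base = BLOCK_BASE[script]
    return "".join(chr(base + ISCII_TO_OFFSET[b]) for b in iscii_bytes)


if __name__ == "__main__":
    ki = bytes([0xB3, 0xDB])
    for script in BLOCK_BASE:
        print(script, render(ki, script))
```

The same two bytes yield a different glyph sequence for each script, which is what the shared code points are designed to allow. A full converter would need a complete table and script-specific exceptions.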
 
One motivation for the use of a single encoding is the idea that it will allow easy [[transliteration]] from one writing system to another.<ref name="unicode"/>{{rp|p=462}} However, there are enough incompatibilities between the writing systems that this is not practical.
 
ISCII is an 8-bit encoding.<ref name="std"/>{{rp|p=4}} The lower 128 code points are plain [[American Standard Code for Information Interchange|ASCII]]; the upper 128 code points are ISCII-specific. In addition to the code points representing characters, ISCII makes use of a code point with mnemonic {{ctrl|ATR|internal=yes}} that indicates that the following byte contains one of two kinds of information. One set of values changes the writing system until the next writing system indicator or end-of-line. Another set of values selects display modes such as bold and italic. ISCII does not provide a means of indicating the default writing system.
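
The byte-level structure described above can be sketched as a small Python tokenizer. This is an illustration only: the function name <code>tokenize</code> and the token labels are hypothetical, and the value 0xEF used for the {{ctrl|ATR|internal=yes}} code point is an assumption that is not stated in the text above. The sketch does not try to distinguish writing-system values from display-mode values.

```python
ATR = 0xEF  # assumed value of the ATR code point; not stated in the text above


def tokenize(data, atr=ATR):
    """Split an ISCII byte string into (kind, value) tokens.

    kind is "ascii" for the lower 128 code points, "iscii" for the
    upper 128, and "attribute" for an ATR byte plus the byte after it.
    """
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == atr:
            if i + 1 < len(data):
                # ATR consumes the following byte as its argument.
                tokens.append(("attribute", data[i + 1]))
                i += 2
            else:
                # Dangling ATR at the end of the input.
                tokens.append(("attribute", None))
                i += 1
        elif b < 0x80:
            tokens.append(("ascii", b))
            i += 1
        else:
            tokens.append(("iscii", b))
            i += 1
    return tokens


if __name__ == "__main__":
    print(tokenize(bytes([0x41, ATR, 0x42, 0xB3, 0xDB])))
```

Because the argument byte after {{ctrl|ATR|internal=yes}} may fall in the ASCII range, a decoder has to consume it together with the {{ctrl|ATR|internal=yes}} byte rather than treat it as text.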
 
== Codepage layout ==