Many '''[[Unicode]] control characters''' are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. For example, the [[null character]] ({{unichar|0000|NULL|nlink=control characters}}) is used in C programming environments to mark the end of a string of characters. Such programs therefore need only a starting memory address for a string (rather than a starting address and a length), because the string ends at the first null character the program reads.
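The null-terminated convention described above can be illustrated with a minimal sketch. The function name <code>read_c_string</code> and the sample buffer are hypothetical, chosen only for this illustration; the sketch models memory as a Python <code>bytes</code> object rather than real C memory.

```python
def read_c_string(memory: bytes, start: int) -> bytes:
    """Return the bytes from `start` up to, but not including, the first NUL byte.

    Only a starting offset is needed: the terminator marks the end of the string.
    """
    # bytes.index raises ValueError if no NUL byte is found at or after `start`.
    end = memory.index(0, start)
    return memory[start:end]


# Hypothetical buffer holding two NUL-terminated strings back to back.
memory = b"hello\x00world\x00"

first = read_c_string(memory, 0)   # starts at offset 0
second = read_c_string(memory, 6)  # starts just past the first terminator
```

No length is stored anywhere in the buffer; each string is recovered from its starting offset alone.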
In the narrowest sense, a ''control
== Category "Cc" control codes (C0 and C1) ==
The control code ranges 0x00–0x1F ("C0") and 0x7F originate from the 1967 edition of [[US-ASCII]]. The standard [[ISO/IEC 2022]] (ECMA-35) defines extension methods for ASCII, including a secondary "C1" range of 8-bit control codes from 0x80 to 0x9F, equivalent to 7-bit sequences of {{ctrl|ESC}} with the bytes 0x40 through 0x5F. Collectively, codes in these ranges are known as the [[C0 and C1 control codes]]. Although ISO/IEC 2022 allows for the existence of multiple control code sets specifying differing interpretations of these control codes, their most common interpretation is specified in [[ISO/IEC 6429]] (ECMA-48).
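The equivalence between 8-bit C1 codes and 7-bit {{ctrl|ESC}} sequences can be sketched as follows. The function names <code>c1_to_escape</code> and <code>escape_to_c1</code> are hypothetical, and the sketch assumes the usual pairing in which each C1 byte corresponds to {{ctrl|ESC}} followed by the byte 0x40 lower.

```python
ESC = 0x1B  # the C0 escape control code


def c1_to_escape(code: int) -> bytes:
    """Map an 8-bit C1 code (0x80-0x9F) to its 7-bit two-byte ESC sequence."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control code")
    # 0x80-0x9F maps onto 0x40-0x5F by subtracting 0x40.
    return bytes([ESC, code - 0x40])


def escape_to_c1(seq: bytes) -> int:
    """Map a two-byte ESC sequence (ESC + 0x40-0x5F) back to its C1 code."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit representation of a C1 code")
    return seq[1] + 0x40


# Example: 0x9B corresponds to ESC followed by "[" (0x5B).
csi_7bit = c1_to_escape(0x9B)
```

Under this assumption the two functions are inverses of each other over the whole C1 range.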
The [[ISO/IEC 8859]] series of encodings conforms to [[ISO/IEC 4873]] (ECMA-43) level 1, a subset of ISO/IEC 2022 designed for 8-bit character encodings, and therefore designates the range 0x80–0x9F for use by a C1 control code set such as ISO/IEC 6429. Unicode inherits its [[Basic Latin (Unicode block)|first]] and [[Latin-1 Supplement (Unicode block)|second]] blocks (comprising U+0000 through U+00FF) from ASCII and [[ISO/IEC 8859-1]], thus incorporating the C0 and C1 control code ranges (U+0000–U+001F, U+007F–U+009F). It does not assign normative names to these control codes, though it does assign them normative aliases.<ref name="aliases" />
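The ranges above can be checked against a Unicode character database. The following is a minimal sketch using Python's standard <code>unicodedata</code> module; the variable names are illustrative, and the results reflect whichever Unicode version the Python installation bundles.

```python
import unicodedata

# Collect every code point whose general category is "Cc".
cc_code_points = [
    cp for cp in range(0x110000)
    if unicodedata.category(chr(cp)) == "Cc"
]

# The ranges stated in the text: U+0000-U+001F and U+007F-U+009F.
expected = list(range(0x00, 0x20)) + list(range(0x7F, 0xA0))

# unicodedata.name() returns the supplied default (None here) for
# characters that have no character name in the database.
names = [unicodedata.name(chr(cp), None) for cp in cc_code_points]
```

Here <code>cc_code_points</code> matches <code>expected</code>, and every entry of <code>names</code> is <code>None</code>, consistent with these code points having no normative character names.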
Category "Cc" control codes can serve a variety of purposes, not limited to format effectors: for example, the default ASCII C0 set includes six format effectors ({{ctrl|BS}}, {{ctrl|HT}}, {{ctrl|LF}}, {{ctrl|VT}}, {{ctrl|FF}} and {{ctrl|CR}}), ten transmission controls, four device controls, four information separators and eight other control codes.<ref name="ir001">{{citation|mode=cs1 |author=ISO/TC 97/SC 2 |author-link=ISO/IEC JTC 1/SC 2#History |title=The set of control characters of the ISO 646 |date=1975 |publisher=ITSCJ/[[Information Processing Society of Japan|IPSJ]] |id=ISO-IR-1 |url=https://www.itscj.ipsj.or.jp/iso-ir/001.pdf}}</ref> Most of these characters play no explicit role in Unicode text handling, and are used only by higher-level protocols such as those used by [[terminal emulator]]s. Certain characters are commonly used for formatting or [[sentinel value|sentinel]] purposes:
* {{unichar|0000||note=NUL: NULL}} (used in [[null-terminated string]]s)
* {{unichar|0009||note=HT: HORIZONTAL TABULATION}} (inserted by the [[tab key]])