→top: unichar template is not working for control characters |
→Category "Cc" control codes (C0 and C1): unichar template is broken for control characters |
Line 11:
Category "Cc" control codes can serve a variety of purposes, not limited to format effectors: for example, the default ASCII C0 set includes six format effectors ({{ctrl|BS}}, {{ctrl|HT}}, {{ctrl|LF}}, {{ctrl|VT}}, {{ctrl|FF}} and {{ctrl|CR}}), ten transmission controls, four device controls, four information separators and eight other control codes.<ref name="ir001">{{citation|mode=cs1 |author=ISO/TC 97/SC 2 |author-link=ISO/IEC JTC 1/SC 2#History |title=The set of control characters of the ISO 646 |date=1975 |publisher=ITSCJ/[[Information Processing Society of Japan|IPSJ]] |id=ISO-IR-1 |url=https://www.itscj.ipsj.or.jp/iso-ir/001.pdf}}</ref> Most of these characters play no explicit role in Unicode text handling, and are used only by higher-level protocols such as those used by [[terminal emulator]]s. Certain characters are commonly used for formatting or [[sentinel value|sentinel]] purposes:
* {{
* {{
* {{
* {{
* {{
* {{
Unicode only specifies semantics for {{tt|U+0009–U+000D}}, {{tt|U+001C–U+001F}}, and {{tt|U+0085}} (the ASCII format effectors except for {{ctrl|BS}}, plus the ASCII information separators and the C1 {{ctrl|NEL}}). The rest of the "Cc" control codes are transparent to Unicode and their meanings are left to higher-level protocols, although interpretation as defined in ISO/IEC 6429 is suggested as a default.<ref name="unicode-23-1">{{cite book |url=https://www.unicode.org/versions/Unicode12.0.0/ch23.pdf#page=3 |title=23.1: Control Codes |work=The Unicode Standard |edition=12.0.0 |date=2019 |author=Unicode Consortium |author-link=Unicode Consortium |isbn=978-1-936213-22-1 |pages=868–870}}</ref> Furthermore, certain specialised higher-level protocols, such as transcoded [[Teletext]], may include a [[Teletext character set#Control characters|different interpretation]] of the entire C0 control code range.<ref>{{cite web |url=https://corp.unicode.org/pipermail/unicode/2020-October/009120.html |title=Teletext separated mosaic graphics |work=Unicode Mailing List Archive |last=Ewell |first=Doug |date=2020-10-16 |publisher=[[Unicode Consortium]] |quotation=I reiterate that it was UTC {{bracket|[[Unicode Technical Committee]]}} and Script Ad Hoc who provided the guidance to the group writing the [[Symbols for Legacy Computing]] proposal (and there is a second on the way) that 0x00 through 0x1F in the original teletext set should map to U+0000 through U+001F when converting to Unicode.}}</ref>
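As an illustration only, the split between "Cc" control codes with Unicode-specified semantics and those left to higher-level protocols can be sketched in Python. The ranges are copied from the paragraph above; the names <code>UNICODE_SPECIFIED</code>, <code>is_control_code</code> and <code>has_unicode_semantics</code> are hypothetical helpers introduced for this sketch and are not part of any standard API. Only <code>unicodedata.category</code> comes from the Python standard library.

```python
import unicodedata

# Inclusive code point ranges for which Unicode specifies semantics,
# as listed in the paragraph above (an assumption of this sketch).
UNICODE_SPECIFIED = (
    (0x0009, 0x000D),  # HT, LF, VT, FF, CR (format effectors except BS)
    (0x001C, 0x001F),  # FS, GS, RS, US (information separators)
    (0x0085, 0x0085),  # NEL (C1)
)


def is_control_code(ch: str) -> bool:
    """Return True if ch has general category "Cc"."""
    return unicodedata.category(ch) == "Cc"


def has_unicode_semantics(ch: str) -> bool:
    """Return True if ch falls in one of the ranges listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNICODE_SPECIFIED)


# All "Cc" characters lie in U+0000..U+009F, so scanning U+0000..U+00FF suffices.
cc = [chr(cp) for cp in range(0x100) if is_control_code(chr(cp))]
specified = [c for c in cc if has_unicode_semantics(c)]
transparent = [c for c in cc if not has_unicode_semantics(c)]

print(len(cc), len(specified), len(transparent))  # 65 10 55
```

Under these assumptions, 10 of the 65 "Cc" code points have semantics specified by Unicode, and the remaining 55 (including {{ctrl|BS}}) are left to higher-level protocols.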
== Unicode introduced separators ==