{{Short description|Computer encoding of characters}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}
A '''six-bit character code''' is a [[character encoding]] designed for use on computers whose [[word length]] is a multiple of 6. Six bits can encode only 64 distinct characters, so these codes generally include only the upper-case letters, the numerals, some punctuation characters, and sometimes control characters.
==Types of six-bit codes==
An early six-bit binary code was used for [[Braille]], the reading system for the blind that was developed in the 1820s.
The earliest computers dealt with numeric data only, and made no provision for character data. [[Six-bit BCD]], with several variants, was used by [[IBM]] on early computers such as the [[IBM 702]] in 1953 and the [[IBM 704]] in 1954.<ref>{{cite book |author=IBM Corporation |title=704 electronic data-processing machine: manual of operation |date=1954 |url=http://www.bitsavers.org/pdf/ibm/704/24-6661-2_704_Manual_1955.pdf}}</ref>{{rp|p.35}}
Six-bit character codes generally succeeded the five-bit [[Baudot code]] and preceded seven-bit [[ASCII]].
Six-bit codes could encode more than 64 characters by the use of [[Shift Out and Shift In characters]], essentially incorporating two distinct 62-character sets and switching between them. For example, the popular [[IBM 2741]] communications terminal supported a variety of character sets of up to 88 printing characters plus control characters.
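The shift mechanism can be sketched as a small stateful decoder. This is only an illustration: the code positions chosen for the two shift characters and the contents of both tables are hypothetical and do not reproduce the IBM 2741 or any other real code.

```python
import string

# Hypothetical layout: positions 62 and 63 are reserved for the shift codes,
# leaving 62 positions for printing characters in each set.
SO = 62  # Shift Out: switch to the alternate set
SI = 63  # Shift In: return to the primary set

PUNCT = " .,;:!?'\"()-+*/=<>[]{}@#$%"  # 26 characters, chosen arbitrarily
PRIMARY = string.ascii_uppercase + string.digits + PUNCT
ALTERNATE = string.ascii_lowercase + string.digits + PUNCT
assert len(PRIMARY) == len(ALTERNATE) == 62


def decode(codes):
    """Decode a sequence of six-bit values, tracking the current shift state."""
    table = PRIMARY
    out = []
    for code in codes:
        if code == SO:
            table = ALTERNATE
        elif code == SI:
            table = PRIMARY
        else:
            out.append(table[code])
    return "".join(out)


print(decode([7, SO, 4, 11, 11, 14, SI, 27]))  # Hello1
```

The decoder has to remember which set is active, so a lost or corrupted shift code garbles every character up to the next shift.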
===Teletypesetter code===
{{main|Teleprinter#Teletypesetter|Telegraph code#TeleTypeSetter}}
A special 6-level extension of the 5-level [[International Telegraph Alphabet]] was used to remotely control [[Linotype machine]]s beginning around 1930. By 1950 it was widely used by [[wire service]]s to send preformatted news stories to participating newspapers. It supported the 90 [[printable character]]s of a Linotype machine, plus [[whitespace character]]s.
The TTS code had two pairs of shift codes allowing a total of four shift states. The first operated much like a keyboard's shift key and selected between a lower-case and digits repertoire, and an upper-case and symbols one. A second pair of Linotype-specific "lower rail" and "upper rail" shift codes would select an alternate (usually italic) font.
===BCD six-bit codes===
Six-bit [[BCD (character encoding)|BCD]] codes were adaptations of the [[punched card code]] to [[binary code]]. [[IBM]] applied the terms ''binary-coded decimal'' and ''BCD'' to the variations of BCD ''alphamerics'' used in most early IBM computers, including the [[IBM 1620]], [[IBM 1400 series]], and non-[[IBM 700/7000 series#Decimal architecture (7070/7072/7074)|decimal architecture]] members of the [[IBM 700/7000 series]].
===COBOL databases six-bit code===
A six-bit code, with added odd [[parity bit]], is used on Track 1 of [[magnetic stripe card]]s, as specified in [[ISO/IEC 7811]]-2.
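As a rough sketch of what "added odd parity" means, the function below appends a parity bit so that every seven-bit result contains an odd number of 1 bits. Placing the parity bit above the six data bits is an assumption made for illustration; the text above does not state the bit order used on the card.

```python
def with_odd_parity(code):
    """Return a seven-bit value: six data bits plus one odd parity bit.

    Assumption: the parity bit is stored as the bit above the six data bits.
    """
    if not 0 <= code < 64:
        raise ValueError("not a six-bit value")
    ones = bin(code).count("1")
    parity = 0 if ones % 2 == 1 else 1  # make the total number of 1 bits odd
    return (parity << 6) | code


print(format(with_odd_parity(0b000011), "07b"))  # 1000011
```

A receiver can detect any single-bit error by checking that the count of 1 bits in each seven-bit unit is still odd.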
===
A popular six-bit code was [[Digital Equipment Corporation|DEC]] SIXBIT. This is simply the ASCII character codes from 32 to 95 coded as 0 to 63 by subtracting 32 (i.e., columns 2, 3, 4, and 5 of the ASCII table (16 characters to a column), shifted to columns 0 through 3, by subtracting 2 from the high bits); it includes the space, punctuation characters, numbers, and capital letters, but no control characters. Since it included no control characters, not even end-of-line, it was not used for general text processing. However, six-character names such as [[filename]]s and [[assembly language|assembler]] [[identifier|symbol]]s could be stored in a single [[36-bit]] word of the [[PDP-10]], and three characters fit in each word of the [[PDP-1]] and two characters fit in each word of the [[PDP-8]].
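The subtract-32 mapping and the six-characters-per-word packing can be sketched as follows. The function names are hypothetical, and the layout (first character in the high-order bits, short names padded with spaces) is an assumption made for this sketch rather than something stated above.

```python
def sixbit_encode(ch):
    """Map an ASCII character in the range 32-95 to its six-bit code (0-63)."""
    code = ord(ch)
    if not 32 <= code <= 95:
        raise ValueError(f"{ch!r} has no SIXBIT equivalent")
    return code - 32


def pack_word(name):
    """Pack up to six characters into one 36-bit integer.

    Assumed layout: first character in the high-order bits, space padding.
    """
    if len(name) > 6:
        raise ValueError("at most six characters fit in a 36-bit word")
    word = 0
    for ch in name.ljust(6):
        word = (word << 6) | sixbit_encode(ch)
    return word


def unpack_word(word):
    """Recover the six characters stored in a 36-bit integer."""
    return "".join(chr(((word >> shift) & 0o77) + 32)
                   for shift in range(30, -1, -6))


word = pack_word("FOO")
print(oct(word))          # 0o465757000000
print(unpack_word(word))  # FOO followed by three padding spaces
```

Each octal digit pair in the printed word is one character, which is one reason octal notation suited 36-bit machines.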
Another, less common variant is obtained by simply stripping the high bit of an ASCII code in the 32–95 range (codes 32–63 keep their values, while codes 64–95 have 64 subtracted from them). This variant was sometimes used on DEC's [[PDP-8]] (1965).
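A minimal sketch of this bit-stripping variant, with hypothetical function names, shows that a single mask performs the encoding and that the original character can be recovered from the six-bit value alone:

```python
def strip_high_bit(ch):
    """Encode an ASCII character in the range 32-95 by masking to six bits."""
    code = ord(ch)
    if not 32 <= code <= 95:
        raise ValueError(f"{ch!r} is outside the 32-95 range")
    return code & 0o77  # 32-63 are unchanged, 64-95 become 0-31


def restore(code):
    """Invert strip_high_bit: values below 32 came from ASCII 64-95."""
    if not 0 <= code < 64:
        raise ValueError("not a six-bit value")
    return chr(code + 64) if code < 32 else chr(code)


print(strip_high_bit("A"), strip_high_bit("?"))  # 1 63
```

Unlike the subtract-32 scheme, this mapping places the letters at codes 1 to 26 and leaves the digits and punctuation at their ASCII values.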
{|{{chset-table-header1|DEC SIXBIT}}
|-
|{{chset-ctrl1|U+0020 SPACE| [[Space character|SP]] }}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-cell1|U+0030 DIGIT ZERO|[[0]]}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+003A COLON|[[colon (punctuation)|:]]}}
|{{chset-
|{{chset-
|{{chset-cell1|U+003D EQUALS SIGN|[[equals sign|{{=}}]]}}
|{{chset-cell1|U+003E GREATER-THAN SIGN|[[greater-than sign|>]]}}
|{{chset-cell1|U+003F QUESTION MARK|[[question mark|?]]}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+005D RIGHT SQUARE BRACKET|[[Square brackets|]]]}}
|{{chset-
|{{chset-cell1|U+005F LOW LINE|[[Underscore|_]]}}
|}
==={{anchor|ECMA-1}}ECMA and ISO six-bit code===
A six-bit code similar to DEC's, but replacing a few punctuation characters with the most useful control characters (including [[Shift Out and Shift In characters|SO/SI]], allowing code extension), was specified as [[Ecma International|ECMA]]-[https://ecma-international.org/publications-and-standards/standards/ecma-1/ 1] in 1963. Four years later, ISO Recommendation R 646-1967 (which later evolved into [[ISO/IEC 646|ISO Standard 646]]) included an almost identical six-bit code, differing only in some of the alternative options permitted for a few characters. ECMA-1 was eventually withdrawn, and ISO 646-1973 explicitly removed the six-bit code, standardizing only its 7-bit code.
{|{{chset-
|-
|{{chset-ctrl1|U+0020 SPACE| [[Space character|SP]] }}
|{{chset-ctrl1 | U+0009: CHARACTER TABULATION | [[Horizontal tabulation|HT]] }}
|{{chset-ctrl1 | U+000A: LINE FEED | [[Line feed|LF]] |fn={{efn|name=crlf}}}}
|{{chset-ctrl1 | U+000B: LINE TABULATION | [[Vertical tabulation|VT]] }}
|{{chset-
|{{chset-ctrl1 | U+000D: CARRIAGE RETURN | [[Carriage return|CR]] |fn={{efn|name=crlf}}}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+002F SOLIDUS|[[Slash (punctuation)|/]]}}
|-
|{{chset-cell1|U+0030 DIGIT ZERO|[[0]]}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+003A COLON|[[colon (punctuation)|:]]}}
|{{chset-
|{{chset-cell1|U+003C LESS-THAN SIGN / U+0024 DOLLAR SIGN|[[less-than sign|<]]/[[Dollar sign|$]]}}
|{{chset-cell1|U+003D EQUALS SIGN / U+0025 PERCENT|[[equals sign|{{=}}]]/[[percent sign|%]]}}
|{{chset-cell1|U+003E GREATER-THAN SIGN / U+0026 AMPERSAND|[[greater-than sign|>]]/[[ampersand|&]]}}
|{{chset-cell1|U+003F QUESTION MARK/ U+0027 APOSTROPHE|[[question mark|?]]/[[apostrophe|']]|fn={{efn|name=iso646q}}}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+005B LEFT SQUARE BRACKET|[[Square brackets|[]]|fn={{efn|name=natopt}}}}
|{{chset-
|{{chset-cell1|U+005D RIGHT SQUARE BRACKET|[[Square brackets|]]]|fn={{efn|name=natopt}}}}
|{{chset-ctrl1 | U+001B: ESCAPE | [[Escape character|ESC]] }}
|{{chset-ctrl1 | U+007F: DELETE | [[Delete character|DEL]] }}
|}
{{notelist|refs=
{{efn|name=crlf|In systems where LF both advances to the next line and returns the carriage to the start position, CR may instead be used as a "spare control" according to ECMA-1, and as BS "backspace" according to ISO/R 646. LF then has the designation NL "new line".}}
{{efn|name=natopt|These character positions are intended for national use. Where the local alphabet contains letters additional to the basic Latin alphabet, they should be assigned to these positions. The default assignments, according to ECMA-1, are listed here.}}
{{efn|name=iso646q|ECMA-1 permits either the question mark or the apostrophe in this position. ISO/R 646 permits the apostrophe only, making it an invariant.}}
}}
===ICT/ICL 6-bit character set===
The [[ICT_1900_series#Character_sets|ICT (later ICL) 1900-series]] of mainframes<!--any others??--> used a six-bit code derived from an early 1963 version of [[ASCII]] for internal storage and processing, referred to as the "ECMA character set" in its documentation.
{|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
===AIS SixBit ASCII===
The [[automatic identification system]] (AIS) uses this code.<ref name='Raymond'>{{cite web |url=https://gpsd.gitlab.io/gpsd/AIVDM.html#_ais_payload_data_types |title=AIVDM/AIVDO protocol decoding |at=AIS Payload Data Types |access-date=2024-03-14 |author-last=Raymond |author-first=Eric S. |date=2023-06-24}}</ref>
{|
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|U+005A LATIN CAPITAL LETTER Z|[[Z]]}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-cell1|U+0021 EXCLAMATION MARK|[[Exclamation mark|!]]}}
|{{chset-cell1|U+0022 QUOTATION MARK|[[Quotation mark|"]]}}
|{{chset-cell1|U+0029 RIGHT PARENTHESIS|[[Parenthesis|)]]}}
|{{chset-cell1|U+002D HYPHEN-MINUS|[[Hyphen-minus|-]]}}
|-
|{{chset-cell1|U+003A COLON|[[colon (punctuation)|:]]}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
===FIELDATA six-bit code===
[[FIELDATA]] was a seven-bit code (with optional parity) of which only 64 code positions (occupying six bits) were formally defined.<ref name="Mackenzie_1980">{{cite book |url=https://textfiles.meulie.net/bitsaved/Books/Mackenzie_CodedCharSets.pdf |title=Coded Character Sets, History and Development |series=The Systems Programming Series |author-last=Mackenzie |author-first=Charles E. |date=1980 |edition=1 |publisher=[[Addison-Wesley Publishing Company, Inc.]] |isbn=978-0-201-14460-4 |lccn=77-90165 |access-date=2019-08-25 |archive-url=https://web.archive.org/web/20160526172151/https://textfiles.meulie.net/bitsaved/Books/Mackenzie_CodedCharSets.pdf |archive-date=May 26, 2016 |url-status=live |df=mdy-all }}</ref> A variant was used by [[UNIVAC]]'s 1100-series computers.<ref name="Walker_1996">{{cite web |title=UNIVAC 1100 Series FIELDATA Code |work=UNIVAC Memories |author-first=John |author-last=Walker |date=1996-08-06 |url=https://www.fourmilab.ch/documents/univac/fieldata.html |access-date=2016-05-22 |url-status=live |archive-url=https://web.archive.org/web/20160522120813/https://www.fourmilab.ch/documents/univac/fieldata.html |archive-date=2016-05-22}}</ref> Treating the code as a six-bit code, these systems used a 36-bit word (capable of storing six such reduced FIELDATA characters).<ref name="Jennings_2016">{{cite web |title=An annotated history of some character codes or ASCII: American Standard Code for Information Infiltration |at=FIELDATA |author-first=Thomas Daniel |author-last=Jennings |author-link=Tom Jennings |website=sensitive research (SR-IX) |date=2016-04-20 |orig-year=1999 |url=https://www.sr-ix.com/Archive/CharCodeHist/index.html#FIELDATA |access-date=2022-06-01}}</ref>
===Braille six-bit code===
[[Braille]] characters are represented using six dot positions, arranged in a rectangle. Each position may contain a raised dot or not, so Braille can be considered to be a six-bit binary code. Some more modern Braille systems add an extra two dots, making these systems an eight-bit code instead.
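Treating each dot position as one bit can be sketched with the Unicode Braille Patterns block, which starts at U+2800 and assigns dot ''n'' to bit ''n'' − 1 of the offset. The function name below is hypothetical.

```python
def braille_cell(dots):
    """Return the Unicode Braille pattern for a collection of dot numbers 1-6."""
    bits = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError("six-dot Braille uses dot numbers 1 to 6")
        bits |= 1 << (dot - 1)  # dot 1 is the lowest bit, dot 6 the highest
    return chr(0x2800 + bits)


print(braille_cell({1}))        # ⠁ (dot 1 only)
print(braille_cell({1, 2}))     # ⠃ (dots 1 and 2)
print(braille_cell(range(1, 7)))  # ⠿ (all six dots)
```

The six bits give 64 possible cells, including the blank cell with no dots raised.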
==Six-bit codes for binary-to-text encoding==
{{See also|Binary-to-text encoding}}
Transmission of binary data over systems which are designed for text only can sometimes introduce problems. For example, [[email]] historically supported only 7-bit ASCII codes and would strip the 8th bit, thus corrupting binary data sent directly through any troublesome mail server. Other systems can cause issues by improperly interpreting control characters during storage or transmission.
A number of schemes exist to pack 8-bit data into text-only representations which can pass through text mail systems, to be decoded at the destination. Examples of 6-bit character subsets used for packing binary data include [[Uuencode]] and [[Base64]]. These sets contain no control characters (only printable numbers, letters, some punctuation, and maybe space) and allow data to be transmitted over any medium which is also able to transmit human-readable text.
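The packing step can be sketched for Base64: the input bytes are regrouped into six-bit values, and each value selects one printable character. The helper names are hypothetical, and the sketch omits the line wrapping that mail-oriented encoders add; its output can be checked against Python's standard <code>base64</code> module.

```python
import base64
import string

# The standard Base64 alphabet: one printable character per six-bit value.
B64_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
                + string.digits + "+/")


def sixbit_groups(data):
    """Split bytes into six-bit values, zero-padding the final group."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)
    return [int(bits[i:i + 6], 2) for i in range(0, len(bits), 6)]


def to_base64(data):
    """Encode bytes as Base64 text, padding with '=' to a multiple of four."""
    text = "".join(B64_ALPHABET[value] for value in sixbit_groups(data))
    return text + "=" * (-len(text) % 4)


print(sixbit_groups(b"Man"))  # [19, 22, 5, 46]
print(to_base64(b"Man"))      # TWFu
assert to_base64(b"Man") == base64.b64encode(b"Man").decode("ascii")
```

Three input bytes always become four output characters, so the encoded text is about one third larger than the original data.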
=={{anchor|BCD-variants}}Examples of BCD six-bit codes==
IBM, which dominated commercial data processing, used a variety of six-bit codes tied to the character set used on [[punched card]]s; ''see'' [[BCD (character encoding)]].
Other vendor character codes are shown below, with their [[Unicode]] equivalents.
{|{{Chset-table-header1|CDC 1604: Magnetic tape BCD codes}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-ctrl1|TAPE MARK|[[End-of-file|TAPE<br/>MARK]]|style=line-height:1}}
|-
|{{chset-ctrl1|U+0020 SPACE| {{Control code link|SP}} }}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-ctrl1|RECORD MARK|[[BCD (character encoding)#Recordmark character|REC<br/>MARK]]|style=line-height:1}}
|{{chset-
|{{chset-
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-
|-
|{{chset-
|{{chset-cell1|U+002D HYPHEN-MINUS U+0030 DIGIT ZERO|[[-0]]}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
{|{{Chset-table-header1|CDC 1604: [[Punched card]] codes}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-cell1|||style=background:#DDD}}
|-
|{{chset-
|{{chset-cell1|U+002D HYPHEN-MINUS U+0030 DIGIT ZERO|[[-0]]}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|{{chset-cell1|||style=background:#DDD}}
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
{|{{Chset-table-header1|CDC 1612: Printer codes (business applications)}}
|-
|{{chset-
|{{chset-cell1|U+2264 LESS-THAN OR EQUAL TO|[[Inequality (mathematics)|≤]]}}
|{{chset-cell1|U+0021 EXCLAMATION MARK|[[Exclamation mark|!]]}}
|{{chset-cell1|U+005B LEFT SQUARE BRACKET|[[Square brackets|[]]}}
|-
|{{chset-ctrl1|U+0020 SPACE| {{Control code link|SP}} }}
|{{chset-cell1|U+005D RIGHT SQUARE BRACKET|[[Square brackets|]]]}}
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
{{See also|GOST 10859#6-bit code: with only Cyrillic upper case letters}}
{|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|{{chset-
|}
==Example of six-bit Braille codes==
The following table shows the arrangement of characters, with the hex value, corresponding ASCII character, Braille 6-bit codes (dot combinations), Braille [[Unicode]] glyph, and general meaning (the actual meaning may change depending on context).<ref name='DotlessBraille'>{{cite web |url=
{|
* [[CDC display code]]
* [[DEC RADIX 50]] / [[DEC MOD40|MOD40]]
* [[SQUOZE#Identifier name character encoding|IBM SQUOZE]]
* [[IBM Transcode]]
* [[ASCII]]
* [[UTF-8]]
* [[UTF-16]]
* [[Teletypesetter]] code (TTS)
==References==
==External links==
* {{cite web |url=http://
* {{cite web |url=
*
{{Character encoding}}