link to related articles |
clarify whether "output character" refers to encoded or decoded character; add original research :-) |
Line 10:
==Encoding plain text==
Although this{{clarifyme}} encoding method is useful for transmitting non-textual data through text-based systems, it is also used as a mechanism for encoding [[plain text]].
Some systems can handle only a more limited character set.
Other systems make minor [[in-band signaling]] additions to the beginning or end of the text -- perhaps the most famous case was "[[The world wonders]]".
By using a binary-to-text encoding on messages that are already plain text, then decoding on the other end, one can make such systems appear to be completely [[Transparency (telecommunication)|transparent]].
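The round trip described above can be sketched in Python with the standard <code>base64</code> module. This is only an illustration: the helper names <code>wrap_for_transport</code> and <code>unwrap_from_transport</code> are hypothetical, and the choice of base64 and UTF-8 is an assumption rather than something any particular system requires.

```python
import base64

# Characters that standard base64 output may contain.
BASE64_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
)

def wrap_for_transport(text: str) -> str:
    """Encode plain text so that only base64 characters reach the channel."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def unwrap_from_transport(encoded: str) -> str:
    """Reverse wrap_for_transport on the receiving end."""
    return base64.b64decode(encoded).decode("utf-8")

# A message whose trailing spaces and newline a text channel might alter.
message = "From here on, every character matters.  \n"
wire = wrap_for_transport(message)

assert set(wire) <= BASE64_CHARS               # nothing unusual on the wire
assert unwrap_from_transport(wire) == message  # receiver gets the exact text
```

Because the channel only ever sees letters, digits, and three symbols, anything it adds to or strips from the ends of the text can be detected or discarded before decoding.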
Line 30:
* [[Radix-64]]
Some older formats that are uncommon today include BOO, BTOA, and USR encoding. A newer, unstandardized encoding method is [http://base91.sourceforge.net/ basE91], which produces the shortest plain ASCII output for compressed 8-bit binary input.
Most of these encodings generate text that does not use all [[ASCII]] printable characters: for example, the [[base64]] encoding generates text that contains only upper-case and lower-case letters (A–Z, a–z), numerals (0–9), and the "+", "/", and "=" symbols.
Some of these encodings (quoted-printable and percent encoding) are based on a set of allowed characters and a single [[escape character]]. The allowed characters are left unchanged, while all other characters are converted into a string starting with the escape character. This kind of conversion leaves the resulting text almost readable, because letters and digits are among the allowed characters and therefore appear unchanged in the encoded text.
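A minimal sketch of the escape-character approach follows. It is not the exact rule set of quoted-printable or percent encoding: the allowed set (ASCII letters and digits only) and the function names <code>escape_encode</code> and <code>escape_decode</code> are assumptions chosen for illustration. Each disallowed byte becomes the escape character followed by two hexadecimal digits.

```python
# Bytes that pass through unchanged: ASCII letters and digits (an assumption;
# real formats allow somewhat larger sets).
ALLOWED = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789")

def escape_encode(data: bytes, escape: str = "%") -> str:
    """Leave allowed bytes as they are; write every other byte as escape + hex."""
    out = []
    for byte in data:
        if byte in ALLOWED:
            out.append(chr(byte))
        else:
            out.append(f"{escape}{byte:02X}")
    return "".join(out)

def escape_decode(text: str, escape: str = "%") -> bytes:
    """Reverse escape_encode: an escape character consumes the next two hex digits."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == escape:
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)

encoded = escape_encode(b"Hello, world!")
print(encoded)  # Hello%2C%20world%21
assert escape_decode(encoded) == b"Hello, world!"
```

The letters survive unchanged, so the encoded text stays largely legible. The escape character itself is not in the allowed set, so it is always escaped and decoding is unambiguous.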
These encodings produce the shortest plain ASCII output for input that is mostly printable ASCII.
Some other encodings ([[base64]], [[uuencoding]]) are based on mapping all possible sequences of six [[bit]]s into different printable characters. Since there are more than 2<sup>6</sup> = 64 printable characters, this is possible. A given sequence of bytes is translated by viewing it as a stream of bits, breaking this stream into chunks of six bits, and generating the sequence of corresponding characters. The encodings differ in the mapping between sequences of bits and characters and in how the resulting text is formatted.
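The six-bit chunking can be sketched as follows, using the base64 alphabet and "=" padding as the concrete mapping (uuencoding follows the same idea with a different character table and line format). The function name <code>six_bit_encode</code> is hypothetical, and the bit-string approach is chosen for clarity rather than speed.

```python
import base64

# The standard base64 alphabet: index 0-63 maps to one printable character.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def six_bit_encode(data: bytes) -> str:
    """Encode bytes by slicing their bit stream into six-bit chunks."""
    # View the input as one long stream of bits.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the stream divides evenly into six-bit chunks.
    bits += "0" * (-len(bits) % 6)
    # Map each six-bit chunk to its character.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # base64 pads the output with "=" to a multiple of four characters.
    return "".join(chars) + "=" * (-len(chars) % 4)

# The sketch agrees with the standard library on these inputs.
for sample in (b"", b"M", b"Ma", b"Man", b"\x00\xff\x10"):
    assert six_bit_encode(sample) == base64.b64encode(sample).decode("ascii")
```

Three input bytes (24 bits) always yield four output characters, which is why these encodings expand the data by about one third.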
Some encodings (the original version of BinHex and the recommended encoding for [[CipherSaber]]) use four bits instead of six. Using 4 [[Category:Binary-to-text encoding formats|*]]