Binary-to-text encoding

This is an old revision of this page, as edited by Plugwash (talk | contribs) at 15:54, 30 November 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A binary-to-text encoding is an encoding of data in plain text. More precisely, it is an encoding of binary data as a sequence of printable ASCII characters. These encodings are needed to transmit data when the channel or the protocol allows only printable ASCII characters.

Binary-to-text encoding is common in email and USENET communication. Widely used binary-to-text encodings include base64, uuencoding, quoted-printable, and percent encoding.

Most of these encodings generate text that does not use every printable ASCII character: for example, the base64 encoding generates text that contains only upper-case and lower-case letters (A–Z, a–z), numerals (0–9), and the "+", "/", and "=" symbols.
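The restricted output alphabet can be checked with a short sketch. It relies on Python's standard `base64` module; the helper name `used_characters` and the constant `BASE64_CHARS` are illustrative assumptions, not part of any standard.

```python
import base64
import string

# The characters base64 output may contain: letters, digits, "+", "/",
# and "=" (used only as padding at the end).
BASE64_CHARS = set(string.ascii_letters + string.digits + "+/=")


def used_characters(data: bytes) -> set:
    """Return the set of characters that appear in the base64 form of data."""
    return set(base64.b64encode(data).decode("ascii"))


# Encode every possible byte value once and confirm that nothing outside
# the expected alphabet shows up in the result.
every_byte = bytes(range(256))
assert used_characters(every_byte) <= BASE64_CHARS
```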

Some of these encodings (quoted-printable and percent encoding) are based on a set of allowed characters and a single escape character. Allowed characters are copied unchanged, while every other character is converted into a string starting with the escape character. The resulting text is almost readable, because letters and digits are among the allowed characters and therefore appear unchanged in the encoded text.
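The escape-character approach can be sketched as follows. This is a simplified illustration, not an implementation of either real scheme: it assumes the allowed set is only ASCII letters and digits, and it writes each other byte as the escape character followed by two hexadecimal digits. Percent encoding uses "%" as its escape character and quoted-printable uses "="; the real formats allow more characters and, in the case of quoted-printable, add line-length rules that are omitted here. The names `escape_encode`, `escape_decode`, and `ALLOWED` are hypothetical.

```python
import string

# Assumption for this sketch: only ASCII letters and digits pass through.
ALLOWED = (string.ascii_letters + string.digits).encode("ascii")


def escape_encode(data: bytes, escape: str = "%") -> str:
    """Copy allowed bytes as-is; write every other byte as escape + 2 hex digits."""
    out = []
    for byte in data:
        if byte in ALLOWED:
            out.append(chr(byte))
        else:
            out.append(f"{escape}{byte:02X}")
    return "".join(out)


def escape_decode(text: str, escape: str = "%") -> bytes:
    """Reverse escape_encode: read two hex digits after each escape character."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == escape:
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


print(escape_encode(b"Hello, world!"))  # Hello%2C%20world%21
```

The letters of the input survive unchanged, which is what makes the encoded text almost readable; only the comma, the space, and the exclamation mark are replaced by escape sequences.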

Some other encodings (base64, uuencoding) are based on mapping every possible sequence of six (or sometimes just four) bits to a different printable character. This is possible because there are more than 2^6 = 64 printable ASCII characters. The data is translated by viewing it as a stream of bits, breaking this stream into chunks of 6 bits, and generating the sequence of corresponding characters. The different encodings differ in the mapping between six-bit sequences and characters and in how the resulting text is formatted.
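The six-bit chunking can be sketched in a few lines. The example below uses the base64 character mapping and its "=" padding, and cross-checks the result against Python's standard `base64` module; the function name `six_bit_encode` is an assumption made for illustration, and the bit-string approach is chosen for clarity rather than speed. Uuencoding applies the same chunking with a different mapping and line format, which this sketch does not cover.

```python
import base64
import string

# The base64 mapping: one printable character for each 6-bit value 0..63.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def six_bit_encode(data: bytes) -> str:
    """Encode data by splitting its bit stream into 6-bit chunks."""
    # View the input as one long string of bits.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Map each 6-bit chunk to its character.
    encoded = "".join(
        ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)
    )
    # base64 pads the output with "=" to a multiple of 4 characters.
    return encoded + "=" * (-len(encoded) % 4)


# "Man" is 24 bits: 010011 010110 000101 101110, that is 19, 22, 5, 46.
print(six_bit_encode(b"Man"))  # TWFu
assert six_bit_encode(b"Man") == base64.b64encode(b"Man").decode("ascii")
```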

See also