Error-detecting codes can be optimised to detect ''burst errors'' or ''random errors''.
 
== Examples ==
=== Codes in communication used for brevity ===
A cable code replaces words (e.g. ''ship'' or ''invoice'') with shorter words, allowing the same information to be sent with fewer [[character (computing)|characters]], more quickly, and less expensively.
 
Codes can be used for brevity. When [[Telegraphy|telegraph]] messages were the state of the art in rapid long-distance communication, elaborate systems of [[commercial code (communications)|commercial codes]] that encoded complete phrases into single words (commonly five-letter groups) were developed, so that telegraphers became conversant with such "words" as ''BYOXO'' ("Are you trying to weasel out of our deal?"), ''LIOUY'' ("Why do you not answer my question?"), ''BMULD'' ("You're a skunk!"), or ''AYYLU'' ("Not clearly coded, repeat more clearly."). [[Code word]]s were chosen for various reasons: [[length]], [[pronounceability]], etc. Meanings were chosen to fit perceived needs: commercial negotiations, military terms for military codes, diplomatic terms for diplomatic codes, any and all of the preceding for espionage codes. Codebooks and codebook publishers proliferated, including one operated as a front for the American [[Black Chamber]], which [[Herbert Yardley]] ran between the First and Second World Wars. The purpose of most of these codes was to save on cable costs. The use of data coding for [[data compression]] predates the computer era; an early example is the telegraph [[Morse code]], in which more frequently used characters have shorter representations. Techniques such as [[Huffman coding]] are now used by computer-based [[algorithm]]s to compress large data files into a more compact form for storage or transmission.
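The idea that frequent symbols should receive shorter representations can be illustrated with a minimal sketch of Huffman coding. The function names (<code>huffman_code</code>, <code>encode</code>, <code>decode</code>) and the sample text are illustrative assumptions, not part of any standard library, and the sketch favours readability over efficiency.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, partial_code_table).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the bit strings of each symbol."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Read bits left to right; a prefix code makes each match unambiguous."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


sample = "this is an example of a huffman tree"  # hypothetical input
table = huffman_code(sample)
bits = encode(sample, table)
```

In this sketch the space character, which is the most frequent symbol in the sample, receives a code no longer than that of the rare letter "x", and the encoded bit string is shorter than the 8 bits per character a single-byte encoding would use.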
 
=== Character encodings ===
{{Main|Character encoding}}
Character encodings are representations of textual data. A given character encoding may be associated with a specific character set (the collection of characters which it can represent), though some character sets have multiple character encodings and vice versa. Character encodings may be broadly grouped according to the number of bytes required to represent a single character: there are single-byte encodings, [[Wide character|multibyte]] (also called wide) encodings, and [[Variable-width encoding|variable-width]] (also called variable-length) encodings. The earliest character encodings were single-byte, the best-known example of which is [[ASCII]]. ASCII remains in use today, for example in [[HTTP headers]]. However, single-byte encodings cannot model character sets with more than 256 characters. Scripts that require large character sets such as [[CJK|Chinese, Japanese and Korean]] must be represented with multibyte encodings. Early multibyte encodings were fixed-length, meaning that although each character was represented by more than one byte, all characters used the same number of bytes ("word length"), making them suitable for decoding with a lookup table. The final group, variable-width encodings, is a subset of multibyte encodings. These use more complex encoding and decoding logic to efficiently represent large character sets while keeping the representations of more commonly used characters shorter or maintaining backward compatibility properties. This group includes [[UTF-8]], an encoding of the [[Unicode]] character set; UTF-8 is the most common encoding of text media on the Internet.
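The three groups described above can be compared with a short sketch that uses Python's built-in codecs. The helper name <code>byte_lengths</code> and the choice of sample characters are assumptions made for illustration; Latin-1 stands in for a single-byte encoding and UTF-32 for a fixed-length multibyte encoding.

```python
def byte_lengths(chars, encoding):
    """Return the encoded length in bytes of each character, or None if
    the character cannot be represented in the given encoding."""
    lengths = {}
    for ch in chars:
        try:
            lengths[ch] = len(ch.encode(encoding))
        except UnicodeEncodeError:
            lengths[ch] = None
    return lengths


# Sample characters written as escapes: Latin capital A, e with acute
# accent, the CJK ideograph U+4E2D, and the emoji U+1F600.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

single_byte = byte_lengths(samples, "latin-1")     # single-byte encoding
fixed_width = byte_lengths(samples, "utf-32-be")   # fixed-length multibyte
variable_width = byte_lengths(samples, "utf-8")    # variable-width

# UTF-8 is backward compatible with ASCII for ASCII characters.
ascii_compatible = "A".encode("ascii") == "A".encode("utf-8")
```

Under these assumptions, the single-byte encoding cannot represent the CJK character or the emoji, the fixed-length encoding spends four bytes on every character, and UTF-8 uses one to four bytes depending on the character while encoding ASCII characters identically to ASCII.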