Prefix code: Difference between revisions

m Disambiguating links to Code word (link changed to Code word (communication)) using DisamAssist.
top: now prefix property redirects to here
 
(7 intermediate revisions by 4 users not shown)
Line 1:
{{Short description|Type of code system}}
A '''prefix code''' is a type of [[code]] system distinguished by its possession of the '''prefix property''', which requires that there is no whole [[Code word (communication)|code word]] in the system that is a [[prefix (computer science)|prefix]] (initial segment) of any other code word in the system. It is trivially true for fixed-length codes, so it is only a point of consideration for [[variable-length code|variable-length codes]].
 
For example, a code with code words {9, 55} has the prefix property; a code consisting of {9, 5, 59, 55} does not, because "5" is a prefix of "59" and also of "55". A prefix code is a [[uniquely decodable code]]: given a complete and accurate sequence, a receiver can identify each word without requiring a special marker between words. However, there are uniquely decodable codes that are not prefix codes; for instance, the reverse of a prefix code is still uniquely decodable (it is a suffix code), but it is not necessarily a prefix code.
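
The two claims in the example above can be checked mechanically. The following is a minimal sketch, not a reference implementation: the function names <code>is_prefix_free</code> and <code>decode</code> are hypothetical, and code words are assumed to be given as strings of symbols.

```python
def is_prefix_free(code_words):
    """Return True if no code word is a prefix of another code word."""
    words = sorted(code_words)
    # In lexicographic order, a word that is a prefix of any other word
    # is also a prefix of its immediate successor, so checking adjacent
    # pairs is sufficient.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(stream, code_words):
    """Split a stream into code words without any separator symbol.

    Assumes code_words is prefix-free, so the first match is the only
    possible match and can be emitted immediately.
    """
    words = set(code_words)
    decoded, current = [], ""
    for symbol in stream:
        current += symbol
        if current in words:
            decoded.append(current)
            current = ""
    if current:
        raise ValueError("stream ends in the middle of a code word")
    return decoded


print(is_prefix_free({"9", "55"}))             # True
print(is_prefix_free({"9", "5", "59", "55"}))  # False
print(decode("9559", {"9", "55"}))             # ['9', '55', '9']
```

Because each word is emitted as soon as it is recognized, the decoder never needs to look ahead, which is why prefix codes are also called instantaneous codes.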
 
Prefix codes are also known as '''prefix-free codes''', '''prefix condition codes''' and '''instantaneous codes'''. Although [[Huffman coding]] is just one of many algorithms for deriving prefix codes, prefix codes are also widely referred to as "Huffman codes", even when the code was not produced by a Huffman algorithm. The term '''comma-free code''' is sometimes also applied as a synonym for prefix-free codes<ref>US [[Federal Standard 1037C]]</ref><ref>{{citation|title=ATIS Telecom Glossary 2007|url=http://www.atis.org/glossary/definition.aspx?id=6416|access-date=December 4, 2010|archive-date=July 8, 2010|archive-url=https://web.archive.org/web/20100708083829/http://www.atis.org/glossary/definition.aspx?id=6416|url-status=dead}}</ref> but in most mathematical books and articles (e.g.<ref>{{citation|last1=Berstel|first1=Jean|last2=Perrin|first2=Dominique|title=Theory of Codes|publisher=Academic Press|year=1985}}</ref><ref>{{citation|doi=10.4153/CJM-1958-023-9|last1=Golomb|first1=S. W.|author1-link=Solomon W. Golomb|last2=Gordon|first2=Basil|author2-link=Basil Gordon|last3=Welch|first3=L. R.|title=Comma-Free Codes|journal=Canadian Journal of Mathematics|volume=10|issue=2|pages=202–209|year=1958|s2cid=124092269 |url=https://books.google.com/books?id=oRgtS14oa-sC&pg=PA202|doi-access=free}}</ref>) a comma-free code is used to mean a [[self-synchronizing code]], a subclass of prefix codes.
Line 22 ⟶ 23:
[[Huffman coding]] is a more sophisticated technique for constructing variable-length prefix codes. The Huffman coding algorithm takes as input the frequencies that the code words should have, and constructs a prefix code that minimizes the weighted average of the code word lengths. (This is closely related to minimizing the entropy.) This is a form of [[lossless data compression]] based on [[entropy encoding]].
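
The construction described above can be sketched as follows, assuming the input is a mapping from symbols to frequencies and the output alphabet is binary. The function name <code>huffman_code</code> and the use of a counter to break ties are illustrative choices, not part of the algorithm's definition.

```python
import heapq
from itertools import count


def huffman_code(frequencies):
    """Build a binary prefix code from a {symbol: frequency} mapping."""
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        # A single symbol still needs a non-empty code word.
        return {symbol: "0" for symbol in frequencies}

    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(freq, next(tiebreak), {symbol: ""})
            for symbol, freq in frequencies.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees, prepending one bit to
        # every code word in each of them.
        f0, _, left = heapq.heappop(heap)
        f1, _, right = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in left.items()}
        merged.update({s: "1" + w for s, w in right.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))

    return heap[0][2]


code = huffman_code({"a": 5, "b": 2, "c": 1, "d": 1})
print(code)
```

Frequent symbols end up near the root of the merge tree and so receive short code words. Since every code word corresponds to a leaf, no code word can be a prefix of another.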
 
Some codes mark the end of a code word with a special "comma" symbol (also called a [[sentinel value]]), different from normal data.<ref>{{cite web |url=http://www.imperial.ac.uk/research/hep/group/theses/JJones.pdf |title=Development of Trigger and Control Systems for CMS |first1=J. A. |last1=Jones |page=70 |publisher=High Energy Physics, Blackett Laboratory, Imperial College, London |url-status=dead |archive-url=https://web.archive.org/web/20110613183447/http://www.imperial.ac.uk/research/hep/group/theses/JJones.pdf |archive-date=Jun 13, 2011 }}</ref> This is somewhat analogous to the spaces between words in a sentence; they mark where one word ends and another begins. If every code word ends in a comma, and the comma does not appear elsewhere in a code word, the code is automatically prefix-free. However, reserving an entire symbol only for use as a comma can be inefficient, especially for languages with a small number of symbols. [[Morse code]] is an everyday example of a variable-length code with a comma. The long pauses between letters, and the even longer pauses between words, help people recognize where one letter (or word) ends, and the next begins. Similarly, [[Fibonacci coding]] uses a "11" to mark the end of every code word.
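
As an illustration of a "11" terminator, the sketch below encodes positive integers with Fibonacci coding and splits a concatenated bit stream back into values. It assumes the usual convention of Fibonacci weights 1, 2, 3, 5, 8, ... written least significant first; the function names <code>fib_encode</code> and <code>fib_decode</code> are hypothetical.

```python
def fib_encode(n):
    """Return the Fibonacci code word for a positive integer n."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for n >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number that exceeds n

    # Greedy decomposition from the largest weight down never selects
    # two adjacent Fibonacci numbers, so "11" cannot occur inside a word.
    bits, remainder = [], n
    for f in reversed(fibs):
        if f <= remainder:
            bits.append("1")
            remainder -= f
        else:
            bits.append("0")
    # Emit least significant first; the extra "1" forms the "11" marker.
    return "".join(reversed(bits)) + "1"


def fib_decode(bits):
    """Split a bit string into code words at each "11" and decode them."""
    values, word = [], ""
    for bit in bits:
        word += bit
        if word.endswith("11"):
            a, b, total = 1, 2, 0
            for digit in word[:-1]:  # skip the terminating "1"
                if digit == "1":
                    total += a
                a, b = b, a + b
            values.append(total)
            word = ""
    if word:
        raise ValueError("stream ends in the middle of a code word")
    return values


stream = "".join(fib_encode(n) for n in [1, 2, 3, 4, 11])
print(stream)              # 1101100111011001011
print(fib_decode(stream))  # [1, 2, 3, 4, 11]
```

Unlike a dedicated comma symbol, the terminator here is built from the same two symbols as the data, at the cost of forbidding "11" inside a code word.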
 
[[Self-synchronizing code]]s are prefix codes that allow [[frame synchronization]].
Line 74 ⟶ 75:
{{Compression methods}}
[[Category:Coding theory]]
[[Category:Prefixes|code]]
[[Category:Data compression]]
[[Category:Lossless compression algorithms]] <!-- do I really need both categories? -->