Unary coding: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 17:14, 31 May 2025 edit Em3rgent0rdr (talk \| contribs) Extended confirmed users 7,995 edits try again to introduce the "Alternative". Tag: Visual edit ← Previous edit		Latest revision as of 18:11, 31 May 2025 edit undo Em3rgent0rdr (talk \| contribs) Extended confirmed users 7,995 edits shorten wording on thermometer code. Tag: Visual edit
(4 intermediate revisions by the same user not shown)
Line 1: {{Short description\|Entropy encoding}} '''Unary coding''',<ref group="nb" name="NB1"/> or the [[unary numeral system]], is an [[entropy encoding]] that represents a [[natural number]], ''n'', with ''n'' ones followed by a zero (if the term ''natural number'' is understood as ''non-negative integer'') or with ''n'' − 1 ones followed by a zero (if the term ''natural number'' is understood as ''strictly positive integer''). A unary number's code length would thus be ''n'' + 1 with that first definition, or ''n'' with that second definition. ~~A unary code is sometimes called a '''thermometer code''', because a unary~~Unary code ~~written~~when ~~vertically~~vertical behaves like mercury in a [[thermometer]] that gets taller or shorter as ''n'' gets bigger or smaller, and so is sometimes called '''thermometer code'''.<ref>{{Cite web \|title=University of Alberta Dictionary of Cognitive Science: Thermometer Code \|url=http://www.bcp.psych.ualberta.ca/~mike/Pearl_Street/Dictionary/contents/T/thermcode.html \|access-date=2025-05-31 \|website=www.bcp.psych.ualberta.ca}}</ref> An alternative representation uses ''n'' or ''n'' − 1 zeros followed by a one, effectively swapping the ones and zeros, [[without loss of generality]]. For example, the first ten unary codes are: {\| class="wikitable" ! Unary code !! Alternative Line 47: \|} Unary coding is an ''optimally efficient''{{Clarify\|reason=This term needs some introduction.\|date=May 2025}} encoding for the following discrete [[probability distribution]]{{Citation needed\|date=May 2025\|reason=Citation needed for this strong claim and to explain this term "optimally efficient".}} :<math>\operatorname{P}(n) = 2^{-n}\,</math> Line 144: ==Symmetric unary codes== ~~Following~~The ~~set~~following ofsymmetric unary codes ~~are symmetric and~~ can be read ~~in any direction. It is also~~and instantaneously ~~decodable~~decoded in either direction.: {\| class="wikitable" style="text-align: center;" ~~! n (strictly positive) !~~! Unary code !Alternative !n (non-negative) !n (strictly positive) \|- ~~\| 1 \|~~\| 1 \|0 \|0 \|1 \|- ~~\| 2 \|~~\| 00 \|11 \|1 \|2 \|- ~~\| 3 \|~~\| 010 \|101 \|2 \|3 \|- ~~\| 4 \|~~\| 0110 \|1001 \|3 \|4 \|- ~~\| 5 \|~~\| 01110 \|10001 \|4 \|5 \|- ~~\| 6 \|~~\| 011110 \|100001 \|5 \|6 \|- ~~\| 7 \|~~\| 0111110 \|1000001 \|6 \|7 \|- ~~\| 8 \|~~\| 01111110 \|10000001 \|7 \|8 \|- ~~\| 9 \|~~\| 011111110 \|100000001 \|8 \|9 \|- ~~\| 10 \|~~\| 0111111110 \|1000000001 \|9 \|10 \|- \| colspan="4" \| ... \|} == Canonical unary codes == {{See also\|Canonical Huffman code}} For unary values where the maximum is known, one can use canonical unary codes that are of a somewhat numerical nature and different from character based codes. It involves starting with numerical '0' or '-1' ( <math>\operatorname2^{n} - 1\,</math>) and the maximum number of digits then for each step reducing the number of digits by one and increasing/decreasing the result by numerical '1'. For unary values where the maximum is known, one can use canonical unary codes that are of a somewhat numerical nature and different from character based codes. The largest ''n'' numerical '0' or '-1' ( <math>\operatorname2^{n} - 1\,</math>) and the maximum number of digits then for each step reducing the number of digits by one and increasing/decreasing the result by numerical '1'.{{Clarify\|date=May 2025\|reason=(1) Poor grammar. (2) Are canonical codes only for the positive natural number convention? (3) Shouldn't this say that we are starting with the largest n for this 0 or 2^(n-1) code? (4) Why use red to write the final code (in this example corresponding to n=10)? Does it belong or not belong to the canonical format? (6) Why is there an extra row after the n=10 row? (Shouldn't we delete that empty row to indicate that there are no more codes after the maximum?) (7) Is the maximum code corresponding to n=10 or n=9, which this table both uses 9 digits to express. (We should delete this n=10 row if is beyond the maximum...the algorithm part about "reducing the number of digits by one" would only make sense if 9 is the maximum.)}} {\| class="wikitable" ! n !! Unary code Line 231 ⟶ 243: \| colspan="3" \| \|} Canonical codes can [http://www.cs.ucf.edu/courses/cap5015/Huff.pdf require less processing time to decode]{{Clarification\|reason=Citation deals with [[Canonical Huffman code]]. Is the statement relevant only for when dealing with [[Huffman encoding]], or is this a general statement about Canonical unary code?\|date=May 2025}} when they are processed as numbers not a string. If the number of codes required per symbol length is different to 1, i.e. there are more non-unary codes of some length required, those would be achieved by increasing/decreasing the values numerically without reducing the length in that case. ==Generalized unary coding==