Unary coding: Difference between revisions

Content deleted Content added
Importing Wikidata short description: "Entropy encoding" (Shortdesc helper)
Line 1:
{{Short description|Entropy encoding}}
'''Unary coding''',<ref group="nb" name="NB1"/> or the [[unary numeral system]] and also sometimes called '''thermometer code''', is an [[entropy encoding]] that represents a [[natural number]], ''n'', with a code of length ''n''&nbsp;+&nbsp;1 ( or ''n'' ), usually ''n'' ones followed by a zero (if ''natural number'' is understood as ''non-negative integer'') or with ''n''&nbsp;&minus;&nbsp;1 ones followed by a zero (if ''natural number'' is understood as ''strictly positive integer''). For example 5 is represented as 111110 or 11110. Some representations use ''n'' or ''n''&nbsp;&minus;&nbsp;1 zeros followed by a one. The ones and zeros are interchangeable [[without loss of generality]]. Unary coding is both a [[prefix-free code]] and a [[self-synchronizing code]].
{| class="wikitable"
! n (non-negative) !! n (strictly positive) !! Unary code !! Alternative
Line 51:
 
==Standard run-length unary codes==
All binary data is defined by the ability to represent unary numbers in alternating run-lengths of 1s and 0s. This conforms to the standard definition of unary codes ofie N digits of the same number 1 or 0. All run-lengths by definition have at least one digit and thus represent ''strictly positive integers''.
{| class="wikitable"
! n !! RL code !! Next code
Line 104:
| colspan="2" | ...
|}
==Canonical unary codes==
For unary values where the maximum is known, one can use canonical unary codes that are of a somewhat numerical nature and different from character based codes. It involves starting with numerical '0' or '1' and the maximum number of digits then for each step reducing the number of digits by one and increasing/decreasing the result by numerical '1'.
{| class="wikitable"
! n !! Unary code
!Alternative
|-
| 1 || 1
|0
|-
| 2 || 01
|10
|-
| 3 || 001
|110
|-
| 4 || 0001
|1110
|-
| 5 || 00001
|11110
|-
| 6 || 000001
|111110
|-
| 7 || 0000001
|1111110
|-
| 8 || 00000001
|11111110
|-
| 9 || 000000001
|111111110
|-
| 10 || 0000000000
|1111111111
|-
| colspan="2" |
|
|}
Canonical codes can [http://www.cs.ucf.edu/courses/cap5015/Huff.pdf require less processing time to decode] when they are processed as numbers not a string. If the number of codes required per symbol length is different to 1, ie there are more non-unary codes of some length required, those would be achieved by increasing/decreasing the values numerically without reducing the length in that case.
 
==Generalized unary coding==
A generalized version of unary coding was presented by [[Subhash Kak]] to represent numbers much more efficiently than standard unary coding.<ref name="Kak_2015"/> Here's an example of generalized unary coding for integers from 10 through 15 that requires only 7 bits (where three bits are arbitrarily chosen in place of a single one in standard unary to show the number). Note that the representation is cyclic where one uses markers to represent higher integers in higher cycles.
{| class="wikitable"
! n !! Unary code !! Generalized unary