Unary coding: Difference between revisions

Content deleted Content added
unnecessarily confusing by squeezing too much into the first sentence, so split off code length to become second sentence, and make thermometer coding be its own sentence later with a bit of explanation. To hard to give example of 5 because have two definitions, so just introduced the table a little better. Move more technical stump on prefix-free code and a self-synchronizing code to be its own paragraph at end.
Line 1:
{{Short description|Entropy encoding}}
'''Unary coding''',<ref group="nb" name="NB1"/> or the [[unary numeral system]] and also sometimes called '''thermometer code''', is an [[entropy encoding]] that represents a [[natural number]], ''n'', with a code of length ''n''&nbsp;+&nbsp;1 ( or ''n'' ), usually ''n'' ones followed by a zero (if the term ''natural number'' is understood as ''non-negative integer'') or with ''n''&nbsp;&minus;&nbsp;1 ones followed by a zero (if the term ''natural number'' is understood as ''strictly positive integer''). A Forunary examplenumber's 5code islength representedwould asthus 111110be ''n''&nbsp;+&nbsp;1 with that first definition, or 11110''n'' with that second definition. Some representations use ''n'' or ''n''&nbsp;&minus;&nbsp;1 zeros followed by a one. TheA onesunary andcode zerosis aresometimes interchangeablecalled [[withouta loss'''thermometer ofcode''', generality]].because Unarya codingunary iscode written vertically behaves like mercury bothin a [[prefix-free codethermometer]] that gets taller or shorter as ''n'' gets bigger or smaller.<ref>{{Cite web |title=University of Alberta Dictionary of Cognitive Science: Thermometer Code |url=http://www.bcp.psych.ualberta.ca/~mike/Pearl_Street/Dictionary/contents/T/thermcode.html |access-date=2025-05-31 |website=www.bcp.psych.ualberta.ca}}</ref> An alternative format swaps the ones and azeros, [[self-synchronizingwithout codeloss of generality]]. For example, the first ten unary codes are:
{| class="wikitable"
! n (non-negative) !! n (strictly positive) !! Unary code !! Alternative
!n (non-negative)
!n (strictly positive)
|-
| 0 || 1
|| 0 |
| 1
|-
| 1 || 2 || 10 || 01
|1
|2
|-
| 2 || 3 || 110 || 001
|2
|3
|-
| 3 || 4 || 1110 || 0001
|3
|4
|-
| 4 || 5 || 11110 || 00001
|4
|5
|-
| 5 || 6 || 111110 || 000001
|5
|6
|-
| 6 || 7 || 1111110 || 0000001
|6
|7
|-
| 7 || 8 || 11111110 || 00000001
|7
|8
|-
| 8 || 9 || 111111110 || 000000001
|8
|9
|-
| 9 || 10 || 1111111110 || 0000000001
|9
|10
|}
 
Line 39 ⟶ 61:
:<math>\operatorname{P}(n) \ge \operatorname{P}(n+1) + \operatorname{P}(n+2)\, </math>
 
for <math>n=1,2,3,...</math>. Although it is the optimal symbol-by-symbol coding for such probability distributions, [[Golomb coding]] achieves better compression capability for the geometric distribution because it does not consider input symbols independently, but rather implicitly groups the inputs. For the same reason, [[arithmetic encoding]] performs better for general probability distributions, as in the last case above.
 
Unary coding is both a [[prefix-free code]] and a [[self-synchronizing code]].
 
==Unary code in use today==