Golomb coding

{{distinguish|Exponential-Golomb coding}}
{{Short description|Lossless data compression method}}
{{More citations needed|section|date=October 2024|talk=Many sections unsourced}}
'''Golomb coding''' is a [[lossless data compression]] method using a family of [[data compression]] codes invented by [[Solomon W.&nbsp;Golomb]] in the 1960s. Alphabets following a [[geometric distribution]] will have a Golomb code as an optimal [[prefix code]],<ref>{{Cite journal | last1 = Gallager | first1 = R. G. |last2 = van Voorhis |first2 = D. C.| title = Optimal source codes for geometrically distributed integer alphabets | journal = [[IEEE Transactions on Information Theory]]| volume = 21 | issue = 2 | pages = 228–230 | year = 1975 | doi=10.1109/tit.1975.1055357}}</ref> making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.
 
Golomb coding uses a tunable parameter {{mvar|M}} to divide an input value {{mvar|x}} into two parts: {{mvar|q}}, the result of a division by {{mvar|M}}, and {{mvar|r}}, the remainder. The quotient is sent in [[unary coding]], followed by the remainder in [[truncated binary encoding]]. When <math>M=1</math>, Golomb coding is equivalent to unary coding.
 
Golomb–Rice codes can be thought of as codes that indicate a number by the position of the ''bin'' ({{mvar|q}}), and the ''offset'' within the ''bin'' ({{mvar|r}}). The example figure shows the position {{mvar|q}} and offset {{mvar|r}} for the encoding of integer {{mvar|x}} using Golomb–Rice parameter {{math|''M'' {{=}} 3}}, with source probabilities following a geometric distribution with {{math|''p''(0) {{=}} 0.2}}.
 
Formally, the two parts are given by the following expressions, where {{mvar|x}} is the nonnegative integer being encoded:
 
:<math display="block">q = \left \lfloor \frac{x}{M} \right \rfloor</math>
 
and
 
:<math display="block">r = x - qM.</math>
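
As a minimal illustrative sketch of this split and of the resulting code word, the following Python function encodes the quotient in unary and the remainder in truncated binary (the function name <code>golomb_encode</code> and the 0/1-string output format are choices made here for illustration, not part of any standard API):

<syntaxhighlight lang="python">
def golomb_encode(x: int, m: int) -> str:
    """Encode a nonnegative integer x with Golomb parameter m as a '0'/'1' string."""
    q, r = divmod(x, m)              # quotient and remainder: x = q*m + r
    code = "1" * q + "0"             # quotient q in unary: q ones followed by a 0 delimiter
    if m == 1:
        return code                  # M = 1 reduces to plain unary coding
    # Remainder r in truncated binary encoding.
    k = (m - 1).bit_length()         # ceil(log2(m)): bit width of the long form
    u = (1 << k) - m                 # the first u remainders use the short (k-1)-bit form
    if r < u:
        code += format(r, "b").zfill(k - 1)
    else:
        code += format(r + u, "b").zfill(k)
    return code

# Example with M = 3: golomb_encode(0, 3) == "00", golomb_encode(4, 3) == "1010"
</syntaxhighlight>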
 
[[File:GolombCodeRedundancy.svg|thumb|upright=1.5|This image shows the redundancy, in bits, of the Golomb code, when {{mvar|M}} is chosen optimally, for {{math| 1 − ''p''(0) &ge; 0.45}}]]
 
The integer {{mvar|x}} treated by Golomb was the run length of a [[Bernoulli process]], which has a [[geometric distribution]] starting at 0. The best choice of parameter {{mvar|M}} is a function of the corresponding Bernoulli process, which is parameterized by <math>p = P(x=0)</math>, the probability of success in a given [[Bernoulli trial]]. {{mvar|M}} is either the median of the distribution or the median ±1. It can be determined by these inequalities:
: <math display="block">(1-p)^M + (1-p)^{M+1} \leq 1 < (1-p)^{M-1} + (1-p)^M,</math>
which are solved by
: <math display="block">M = \left\lceil -\frac{\log(2-p)}{\log(1-p)}\right\rceil.</math>
 
For the example with {{math|''p''(0) {{=}} 0.2}}:
: <math display="block">M = \left\lceil -\frac{\log(1.8)}{\log(0.8)}\right\rceil = \left\lceil 2.634 \right\rceil = 3.</math>
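
A minimal Python sketch of this parameter choice (the function name is illustrative; it assumes <math>0 < p < 1</math>):

<syntaxhighlight lang="python">
import math

def golomb_parameter(p: float) -> int:
    """Optimal Golomb parameter M for a geometric source with P(x = 0) = p, 0 < p < 1."""
    return math.ceil(-math.log(2 - p) / math.log(1 - p))

# For the example above: golomb_parameter(0.2) == 3
</syntaxhighlight>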
 
The Golomb code for this distribution is equivalent to the [[Huffman code]] for the same probabilities, if it were possible to compute the Huffman code for the infinite set of source values.
# Skip the 0 delimiter
# Let <math>b = \lfloor\log_2(M)\rfloor</math>
## Interpret the next ''b'' bits as a binary number <math>r'</math>. If <math>r' < 2^{b+1}-M</math> holds, then the remainder is <math>r = r'</math>
## Otherwise, interpret ''b'' + 1 bits as a binary number <math>r'</math>; the remainder is given by <math>r = r' - 2^{b+1} + M</math>
# Compute <math>N = qM + r</math> (a Python sketch of the full procedure is given after this list)
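
The decoding steps above, together with the unary quotient step, can be sketched in Python as follows (the function name and the '0'/'1'-string input format are illustrative choices, assuming a single complete code word):

<syntaxhighlight lang="python">
def golomb_decode(bits: str, m: int) -> int:
    """Decode one Golomb code word (a '0'/'1' string) with parameter m."""
    # Quotient: count the leading 1s, then skip the 0 delimiter.
    q = 0
    pos = 0
    while bits[pos] == "1":
        q += 1
        pos += 1
    pos += 1                              # skip the 0 delimiter

    if m == 1:
        return q                          # M = 1: the code is pure unary

    # Remainder in truncated binary, with b = floor(log2(m)).
    b = m.bit_length() - 1
    r_prime = int(bits[pos:pos + b], 2)
    if r_prime < (1 << (b + 1)) - m:
        r = r_prime                       # short form: b bits
    else:
        # Long form: read one more bit and subtract the offset 2**(b+1) - m.
        r_prime = int(bits[pos:pos + b + 1], 2)
        r = r_prime - ((1 << (b + 1)) - m)
    return q * m + r                      # N = q*M + r

# Example with M = 3: golomb_decode("1010", 3) == 4
</syntaxhighlight>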
 
Consider using a Rice code with a binary portion having {{mvar|b}} bits to run-length encode sequences where ''P'' has a probability {{mvar|p}}. If <math>\mathbb{P}[\text{bit is part of }k\text{-run}]</math> is the probability that a bit will be part of a {{mvar|k}}-bit run (<math>k-1</math> ''P''s and one ''Q'') and <math>(\text{compression ratio of }k\text{-run})</math> is the compression ratio of that run, then the expected compression ratio is
<!-- below mostly comes from above reference (Kiely), but not exactly, so leave uncited for now -->
:<math display="block">\begin{align}
\mathbb{E}[\text{compression ratio}]
&= \sum_{k=1}^\infty (\text{compression ratio of }k\text{-run}) \cdot \mathbb{P}[\text{bit is part of }k\text{-run}] \\
* [[Elias delta coding]]
* [[Variable-length code]]
* [[Exponential-Golomb coding]]
 
== References ==
{{DEFAULTSORT:Golomb Coding}}
[[Category:Entropy coding]]
[[Category:Data compression]]