Dynamic Markov compression: Difference between revisions

Content deleted Content added
Arithmetic coding: fix linter error (1x missing end tag)
Tags: Mobile edit Mobile web edit Advanced mobile edit
Line 7:
=== Arithmetic coding ===
 
A bitwise arithmetic coder such as DMC has two components, a predictor and an arithmetic coder. The predictor accepts an ''n''-bit input string ''x'' = ''x''<sub>1</sub>''x''<sub>2</sub>...''x''<sub>''n''</sub> and assigns it a probability ''p''(''x''), expressed as a product of a series of predictions, ''p''(''x''<sub>1</sub>)''p''(''x''<sub>2</sub>'''|'''''x''<sub>1</sub>)''p''(''x''<sub>3</sub>'''|'''''x''<sub>1</sub>''x''<sub>2</sub>) ... ''p''(''x''<sub>''n''</sub>'''|''' ''x''<sub>1</sub>''x''<sub>2</sub>...''x''<sub>''n''&ndash;1</sub>). The arithmetic coder maintains two high precision binary numbers, ''p''<sub>low</sub> and ''p''<sub>high</sub>, representing the possible range for the total probability that the model would assign to all strings lexicographically less than ''x'', given the bits of ''x'' seen so far. The compressed code for ''x'' is ''p''<sub>''x''</sub>, the shortest bit string representing a number between ''p''<sub>low</sub> and ''p''<sub>high</sub>. It is always possible to find a number in this range no more than one bit longer than the [[Claude Shannon|Shannon]] limit, log<sub>2</sub>&nbsp;1 '''/''' ''p''(''x''). One such number can be obtained from ''p''<sub>high</sub> by dropping all of the trailing bits after the first bit that differs from ''p''<sub>low</sub>.
 
Compression proceeds as follows. The initial range is set to ''p''<sub>low</sub> = 0, ''p''<sub>high</sub> = 1. For each bit, the predictor estimates ''p''<sub>0</sub> = ''p''(''x''<sub>''i''</sub> = 0'''|'''''x''<sub>1</sub>''x''<sub>2</sub>...''x''<sub>''i''&ndash;1</sub>) and ''p''<sub>1</sub> = 1&nbsp;&minus;&nbsp;''p''<sub>0</sub>, the probability of a 0 or 1, respectively. The arithmetic coder then divides the current range, (''p''<sub>low</sub>,&nbsp;''p''<sub>high</sub>) into two parts in proportion to ''p''<sub>0</sub> and ''p''<sub>1</sub>. Then the subrange corresponding to the next bit ''x''<sub>''i''</sub> becomes the new range.