Content deleted Content added
Simon G Best (talk | contribs) →Range encoding and arithmetic encoding: Added references to what seem to be Martin's original paper. |
Simon G Best (talk | contribs) →Range encoding and arithmetic encoding: On Martin's enlightening paper. |
||
Line 37:
::--[[User:Simon G Best|Simon G Best]] 23:57, 21 July 2006 (UTC)
:::Okay, I've read through Martin's paper, and it would seem to contradict much of what you thought, and the main beliefs that are held (at least by many who refer to range coding) about range coding in relation to arithmetic coding (though the issue of patents is (as is so often the case with patents) not straight-forward).
:::The key points from the paper are:-
:::*Contextual modelling is not really dealt with in the paper (though there are specific examples). It does appear that Martin had context modelling ''in general'' in mind.
:::*''Any'' base can be used. (It's not just binary, neither is it just base 256.)
:::*Renormalization is ''digit''-at-a-time, ''not'' byte-at-a-time, as so often claimed (and often the case in implementations). (Except when bytes are digits, of course.)
:::*There's a postscript, with references, that indicate that arithmetic coding ''predates'' range coding.
:::I'll expand on these, and related, issues, as it's really quite interesting to see how generically Martin was thinking, and what questions these issues raise.
:::===Contextual modelling===
:::Martin leaves the issue of contextual modelling (or whatever you want to call it (statistical modelling, probability modelling, etc)) open. He includes some examples, but they seem to be included more for the purpose of illustrating range encoding itself. It does seem clear that he had context modelling in general in mind, as he makes statements that simply couldn't be true if he only had certain kinds of context models in mind. For example, the very first paragraph is this:-
:::<blockquote>Redundancy in a message can be thought of as consisting of contextual redundancy and alphabetic redundancy. The first is illustrated by the fact that the letter Q is nearly always followed by the letter U, the second by the fact that the letter E is far more common than the letter X. Range encoding is an algorithm for removing both sorts of redundancy.</blockquote>
:::And, a few paragraphs down, he writes this:-
:::<blockquote>Many techniques are almost optimal in a wide variety of situations, but none are universally applicable. ''In contrast, range encoding may be used to remove all the redundancy that we can describe in any digitised message.'' It can produce encodings to any base.</blockquote>
:::(Emphasis mine.) That simply wouldn't be true, unless he intends range coding to be used with any context model.
:::===Renormalization issues===
:::Byte-at-a-time renormalization is often said to be one of the differences between range encoding and arithmetic coding. But this simply isn't in Martin's paper. Instead, he describes ''digit''-at-a-time renormalization, and says that range encoding can be used with "any base" - that includes binary. On this issue, range encoding would seem to be as general as arithmetic coding. The stuff about byte-at-a-time renormalization as a difference would therefore seem to be something of a myth when it comes to range encoding itself, though it does happen to be done in a number of range encoders in practice.
:::===Range encoding and arithmetic coding===
:::As range encoding, as defined by Martin in his paper, is actually very general (any context model may be used (which would include adaptive models, etc), any number base may be used, etc), and as it's described in quite a high-level, mathematical way (rather than a low-level implementational way), it's really difficult to see how range encoding is not the same as arithmetic coding. (He even includes "delayed digits" and bit-stuffing!)
:::===Patents===
:::Since Martin refers to arithmetic coding, and as it seems arithmetic coding predates range coding, the argument that range coding must be patent-free because of its age must also apply to arithmetic coding - but it doesn't. My vague understanding of this issue is that unexpired arithmetic coding related patents cover such things as actual implementations, the uses of certain kinds of context models, and the like. It's not arithmetic coding per se that's patented. If, as really does seem to be the case from Martin's paper itself, range encoding is the same as arithmetic coding, then all those arithmetic coding related patents must also apply to range encoding. '''''But this issue could really do with input from a patent lawyer.'''''
:::===Finishing off===
:::The paper's well worth reading - it's not long, and really does show what range encoding actually is (as it is, after all, held to be the definitive paper). But this, for now, is what I've got to contribute on the matter. Sorry if this is a bit of a long item for the talk page, but I really have managed to get 'into the groove' on range encoding :-)
:::--[[User:Simon G Best|Simon G Best]] 01:38, 22 July 2006 (UTC)
|