Chain rule for Kolmogorov complexity: Difference between revisions

Content deleted Content added
m References: clean up; Typo fixing using AWB
Brbauwens (talk | contribs)
No edit summary
Line 29:
that we know where to separate the two programs for ''x'' and ''y''|''x'' (log(''K''(''x'', ''y'')) upper-bounds this length).
 
TheFor the ≥ direction, isit rathersuffices moreto difficult.show Thethat keyfor toall thek,l proofsuch isthat thek+l = K(x,y) we have that either
construction of the set <math>A = \{(u,z) : K(u,z) \le K(x,y)\}</math>; that is, the
construction of the set of all pairs <math>(u,z)</math> such that the shortest input (for
a [[universal Turing machine]]) that produces <math>(u,z)</math> (and some way to distinguish
<math>u</math> from <math>z</math>) is shorter than the shortest producing <math>(x,y)</math>. We will also
need <math>A_x = \{z: K(x,z) \le K(x,y)\}</math>, the subset of <math>A</math> where <math>x = u</math>.
[[recursively enumerable|Enumerating]] <math>A</math> is not hard (although not fast!). In parallel, simulate all
<math>2^{K(x,y)}</math> programs with length <math>\le K(x,y)</math>. Output each <math>u,z</math> as the
simulated program terminates; eventually, any particular such <math>u,z</math> will be
produced. We can restrict to ''A''<sub>''x''</sub> simply by ignoring any output where <math>u \ne x</math>. We will assume that the inequality does not hold and derive a contradiction by finding a short description for ''K''(''x'') in terms of its index in the enumeration of a subset of&nbsp;''A''.
 
''K(x|k,l) ≤ k + O(1)'' or ''K(y|x,k,l) ≤ l + O(1)''.
In particular, we can describe ''y'' by giving the index in which ''y'' is output when enumerating ''A''<sub>''x''</sub> (which is less than or equal to |''A''<sub>''x''</sub>|) along with ''x'' and the value of ''K''(''x'',&nbsp;''y''). Thus,
 
Consider the list ''(a<sub>1</sub>,b<sub>1</sub>), (a<sub>2</sub>,b<sub>2</sub>), ..., (a<sub>e</sub>,b<sub>e</sub>)'' of all pairs ''(a,b)'' produced by programs of length exactly ''K(x,y)'' [hence K(a,b) ≤ K(x,y)]. Note that this list
:<math> K(y|x) \le \log(|A_x|) + O(\log(K(x,y))) + O(1) \, </math>
* contains the pair ''(x,y)'',
* can be [[recursively enumerable|enumerated]] given ''k'' and ''l'' (by running all programs of length ''K(x,y)'' in parallel),
* has at most ''2<mathsup>2^{K(x,y)}</mathsup>'' elements, since(because there are onlyat thatmost many2<sup>n</sup> programs of length n).
 
First, suppose that ''x'' appears less than ''2<sup>l</sup>'' times as first element. We can specify ''y'' given ''x,k,l'' by enumerating ''(a<sub>1</sub>,b<sub>1</sub>), (a<sub>2</sub>,b<sub>2</sub>), ...'' and then selecting ''(x,y)'' in the sub-list of pairs ''(x,b)''. By assumption, the index of ''(x,y)'' in this sub-list is less than ''2<sup>l</sup>'' and hence, there is a program for ''y'' given ''x,k,l'' of length ''l + O(1)''.
where the O(1) term is from the overhead of the enumerator, which does not depend on ''x'' or&nbsp;''y''.
Now, suppose that ''x'' appears at least ''2<sup>l</sup>'' times as first element. This can happen for at most ''2<sup>K(x,y)-l</sup> = 2<sup>k</sup>'' different strings. These strings can be enumerated given ''k,l'' and hence ''x'' can be specified by its index in this enumeration. The corresponding program for ''x'' has size ''k + O(1)''. Theorem proved.
 
Assume by way of contradiction
 
:<math> K(y|x) > K(x,y) - K(x) + O(\log(K(x,y))) \, </math>
 
for some <math>x,y</math>.
 
Combining this with the above, we have
 
:<math>
\log(|A_x|) + O(\log(K(x,y))) + O(1) \ge K(y|x) > K(x,y) - K(x) + O(\log(K(x,y))) \,
</math>
 
So for any constant ''c'' and some&nbsp;''x'',&nbsp;''y'',
 
:<math> \log(|A_x|) \ge K(x,y) - K(x) + c \log(K(x,y)) \, </math>
 
Let
 
: <math>e = K(x,y) - K(x) + c \log K(x,y);\quad |A_x| \ge 2^e.</math>
 
Now, we can enumerate a set of possible values for <math>x</math> by finding all <math>u</math> such
that <math>|A_u| > 2^e</math> -- list only the <math>u</math> such that <math>y</math> is output after more than
<math>2^e</math> other values when enumerating <math>A_u = \{ z: K(u,z) \le K(x,y) \}</math>. <math>x</math> is
one such <math>u</math>. Let <math>U</math> be the set of all such <math>u</math>. We can enumerate <math>U</math> by
enumerating each <math>A_u</math> (in parallel, as usual) and outputting a <math>u</math> only when
<math>y</math> would be output for <math>A_u</math> and more than <math>2^e</math> other values for the <math>A_u</math>
would be output. Note that given <math>K(x,y)</math>, <math>e</math> and the index of <math>x</math> in <math>U</math> we
can enumerate <math>U</math> to recover <math>x</math>.
 
Note that <math>\{(u,z) : u \in U, z \in A_u\}</math> is a subset of <math>A</math>, and <math>A</math> has at
most <math>2^{K(x,y)}</math> elements, since there are only that many programs of length
<math>K(x,y)</math>. So we have:
 
:<math> |U| < \frac{|A|}{2^e} \le \frac{2^{K(x,y)}}{2^e} </math>
 
where the first inequality follows since each element of <math>U</math> corresponds to an
<math>A_u</math> with strictly more than <math>2^e</math> elements.
 
But this leads to a contradiction:
 
:<math>
K(x) < \log(K(x,y)) + \log(e) + \log(|U|) \le \log(K(x,y)) + \log(e) + K(x,y) - e
</math>
 
which when we substitute in <math>e</math> gives
 
:<math>
K(x) < \log(K(x,y)) + \log(K(x,y) + K(x) + \log(K(x,y))
+ K(x,y) - K(x,y) + K(x) - c \log K(x,y)
</math>
 
which for large enough <math>c</math> gives <math>K(x) < K(x)</math>.
 
==References==
Line 110 ⟶ 53:
| month = February
| isbn = 0-387-94868-6 }}
 
* {{cite article
| last = Kolmogorov
| first = A.N.
| title = Logical basis for information theory and probability theory
| journal = IEEE Transactions on Information Theory
| number = 5
| volume = 14
| pages = 662-664
| year = 1968 }}
 
 
[[Category:Algorithmic information theory|*]]