Revision as of 00:31, 11 July 2012 edit Brightgalrs (talk \| contribs) Extended confirmed users 10,206 edits m →References: clean up; Typo fixing using AWB ← Previous edit		Revision as of 19:45, 11 January 2013 edit undo Brbauwens (talk \| contribs) 39 edits No edit summary Next edit →
Line 29: that we know where to separate the two programs for ''x'' and ''y''\|''x'' (log(''K''(''x'', ''y'')) upper-bounds this length). ~~The~~For the ≥ direction, isit ~~rather~~suffices ~~more~~to ~~difficult.~~show ~~The~~that ~~key~~for toall ~~the~~k,l ~~proof~~such isthat ~~the~~k+l = K(x,y) we have that either ~~construction of the set <math>A = \{(u,z) : K(u,z) \le K(x,y)\}</math>; that is, the~~ ~~construction of the set of all pairs <math>(u,z)</math> such that the shortest input (for~~ ~~a [[universal Turing machine]]) that produces <math>(u,z)</math> (and some way to distinguish~~ ~~<math>u</math> from <math>z</math>) is shorter than the shortest producing <math>(x,y)</math>. We will also~~ ~~need <math>A_x = \{z: K(x,z) \le K(x,y)\}</math>, the subset of <math>A</math> where <math>x = u</math>.~~ ~~[[recursively enumerable\|Enumerating]] <math>A</math> is not hard (although not fast!). In parallel, simulate all~~ ~~<math>2^{K(x,y)}</math> programs with length <math>\le K(x,y)</math>. Output each <math>u,z</math> as the~~ ~~simulated program terminates; eventually, any particular such <math>u,z</math> will be~~ produced. We can restrict to ''A''<sub>''x''</sub> simply by ignoring any output where <math>u \ne x</math>. We will assume that the inequality does not hold and derive a contradiction by finding a short description for ''K''(''x'') in terms of its index in the enumeration of a subset of ''A''. ''K(x\|k,l) ≤ k + O(1)'' or ''K(y\|x,k,l) ≤ l + O(1)''. In particular, we can describe ''y'' by giving the index in which ''y'' is output when enumerating ''A''<sub>''x''</sub> (which is less than or equal to \|''A''<sub>''x''</sub>\|) along with ''x'' and the value of ''K''(''x'', ''y''). Thus, Consider the list ''(a<sub>1</sub>,b<sub>1</sub>), (a<sub>2</sub>,b<sub>2</sub>), ..., (a<sub>e</sub>,b<sub>e</sub>)'' of all pairs ''(a,b)'' produced by programs of length exactly ''K(x,y)'' [hence K(a,b) ≤ K(x,y)]. Note that this list ~~:<math> K(y\|x) \le \log(\|A_x\|) + O(\log(K(x,y))) + O(1) \, </math>~~ * contains the pair ''(x,y)'', * can be [[recursively enumerable\|enumerated]] given ''k'' and ''l'' (by running all programs of length ''K(x,y)'' in parallel), * has at most ''2<~~math~~sup>~~2^{~~K(x,y)}</~~math~~sup>'' elements, ~~since~~(because there are ~~only~~at ~~that~~most ~~many~~2<sup>n</sup> programs of length n). ▼ First, suppose that ''x'' appears less than ''2<sup>l</sup>'' times as first element. We can specify ''y'' given ''x,k,l'' by enumerating ''(a<sub>1</sub>,b<sub>1</sub>), (a<sub>2</sub>,b<sub>2</sub>), ...'' and then selecting ''(x,y)'' in the sub-list of pairs ''(x,b)''. By assumption, the index of ''(x,y)'' in this sub-list is less than ''2<sup>l</sup>'' and hence, there is a program for ''y'' given ''x,k,l'' of length ''l + O(1)''. ~~where the O(1) term is from the overhead of the enumerator, which does not depend on ''x'' or ''y''.~~ Now, suppose that ''x'' appears at least ''2<sup>l</sup>'' times as first element. This can happen for at most ''2<sup>K(x,y)-l</sup> = 2<sup>k</sup>'' different strings. These strings can be enumerated given ''k,l'' and hence ''x'' can be specified by its index in this enumeration. The corresponding program for ''x'' has size ''k + O(1)''. Theorem proved. ~~Assume by way of contradiction~~ ~~:<math> K(y\|x) > K(x,y) - K(x) + O(\log(K(x,y))) \, </math>~~ ~~for some <math>x,y</math>.~~ ~~Combining this with the above, we have~~ ~~:<math>~~ ~~\log(\|A_x\|) + O(\log(K(x,y))) + O(1) \ge K(y\|x) > K(x,y) - K(x) + O(\log(K(x,y))) \,~~ ~~</math>~~ ~~So for any constant ''c'' and some ''x'', ''y'',~~ ~~:<math> \log(\|A_x\|) \ge K(x,y) - K(x) + c \log(K(x,y)) \, </math>~~ ~~Let~~ ~~: <math>e = K(x,y) - K(x) + c \log K(x,y);\quad \|A_x\| \ge 2^e.</math>~~ ~~Now, we can enumerate a set of possible values for <math>x</math> by finding all <math>u</math> such~~ ~~that <math>\|A_u\| > 2^e</math> -- list only the <math>u</math> such that <math>y</math> is output after more than~~ ~~<math>2^e</math> other values when enumerating <math>A_u = \{ z: K(u,z) \le K(x,y) \}</math>. <math>x</math> is~~ ~~one such <math>u</math>. Let <math>U</math> be the set of all such <math>u</math>. We can enumerate <math>U</math> by~~ ~~enumerating each <math>A_u</math> (in parallel, as usual) and outputting a <math>u</math> only when~~ ~~<math>y</math> would be output for <math>A_u</math> and more than <math>2^e</math> other values for the <math>A_u</math>~~ ~~would be output. Note that given <math>K(x,y)</math>, <math>e</math> and the index of <math>x</math> in <math>U</math> we~~ ~~can enumerate <math>U</math> to recover <math>x</math>.~~ ~~Note that <math>\{(u,z) : u \in U, z \in A_u\}</math> is a subset of <math>A</math>, and <math>A</math> has at~~ ▲most <math>2^{K(x,y)}</math> elements, since there are only that many programs of length ~~<math>K(x,y)</math>. So we have:~~ ~~:<math> \|U\| < \frac{\|A\|}{2^e} \le \frac{2^{K(x,y)}}{2^e} </math>~~ ~~where the first inequality follows since each element of <math>U</math> corresponds to an~~ ~~<math>A_u</math> with strictly more than <math>2^e</math> elements.~~ ~~But this leads to a contradiction:~~ ~~:<math>~~ ~~K(x) < \log(K(x,y)) + \log(e) + \log(\|U\|) \le \log(K(x,y)) + \log(e) + K(x,y) - e~~ ~~</math>~~ ~~which when we substitute in <math>e</math> gives~~ ~~:<math>~~ ~~K(x) < \log(K(x,y)) + \log(K(x,y) + K(x) + \log(K(x,y))~~ ~~+ K(x,y) - K(x,y) + K(x) - c \log K(x,y)~~ ~~</math>~~ ~~which for large enough <math>c</math> gives <math>K(x) < K(x)</math>.~~ ==References== Line 110 ⟶ 53: \| month = February \| isbn = 0-387-94868-6 }} * {{cite article \| last = Kolmogorov \| first = A.N. \| title = Logical basis for information theory and probability theory \| journal = IEEE Transactions on Information Theory \| number = 5 \| volume = 14 \| pages = 662-664 \| year = 1968 }} [[Category:Algorithmic information theory\|*]]

Chain rule for Kolmogorov complexity: Difference between revisions