Kolmogorov complexity: Difference between revisions

Content deleted Content added
Citation bot (talk | contribs)
Alter: journal, pages, pmc. Add: bibcode, doi-access. Removed URL that duplicated identifier. | Use this bot. Report bugs. | Suggested by Jay8g | Category:CS1 maint: PMC format | #UCB_Category 2/3
m Compression: {{math}} fix math italic
Line 254:
It is straightforward to compute upper bounds for ''K''(''s'') – simply [[data compression|compress]] the string ''s'' with some method, implement the corresponding decompressor in the chosen language, concatenate the decompressor to the compressed string, and measure the length of the resulting string – concretely, the size of a [[self-extracting archive]] in the given language.
 
A string ''s'' is compressible by a number ''c'' if it has a description whose length does not exceed |''s''| − ''c'' bits. This is equivalent to saying that {{math|''K''(''s'') ≤ {{abs|''s''|}} − ''c''}}. Otherwise, ''s'' is incompressible by ''c''. A string incompressible by 1 is said to be simply ''incompressible''&nbsp;– by the [[pigeonhole principle]], which applies because every compressed string maps to only one uncompressed string, [[incompressible string]]s must exist, since there are 2<sup>''n''</sup> bit strings of length ''n'', but only 2<sup>''n''</sup> − 1 shorter strings, that is, strings of length less than ''n'', (i.e. with length 0, 1, ..., ''n''&nbsp;−&nbsp;1).<ref group=note>As there are {{nobrmath|1=''N''<sub>''L''</sub> = 2<sup>''L''</sup>}} strings of length ''L'', the number of strings of lengths {{nowrapmath|1=''L'' = 0, 1, ..., ''n'' − 1}} is {{nobrmath|''N''<sub>0</sub> + ''N''<sub>1</sub> + ... + ''N''<sub>''n''−1</sub>}} = {{nobrmath|2<sup>0</sup> + 2<sup>1</sup> + ... + 2<sup>''n''−1</sup>}}, which is a finite [[geometric series]] with sum {{nobrmath|2<sup>0</sup> + 2<sup>1</sup> + ... + 2<sup>''n''−1</sup>}} = {{nobrmath|1 = 2<sup>0</sup> × (1 − 2<sup>''n''</sup>) / (1 − 2) = 2<sup>''n''</sup> − 1}}</ref>
 
For the same reason, most strings are complex in the sense that they cannot be significantly compressed&nbsp;– their ''K''(''s'') is not much smaller than |''s''|, the length of ''s'' in bits. To make this precise, fix a value of ''n''. There are 2<sup>''n''</sup> bitstrings of length ''n''. The [[Uniform distribution (discrete)|uniform]] [[probability]] distribution on the space of these bitstrings assigns exactly equal weight 2<sup>−''n''</sup> to each string of length ''n''.
 
'''Theorem''': With the uniform probability distribution on the space of bitstrings of length ''n'', the probability that a string is incompressible by ''c'' is at least {{math|1 − 2<sup>−''c''+1</sup> + 2<sup>−''n''</sup>}}.
 
To prove the theorem, note that the number of descriptions of length not exceeding ''n'' − ''c'' is given by the geometric series: