Complexity function: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 09:12, 20 June 2021 edit Lcdrovers (talk \| contribs) 366 edits add red link to page on morse-hedlund theorem ← Previous edit		Latest revision as of 03:08, 26 March 2025 edit undo Citation bot (talk \| contribs) Bots 5,862,998 edits Altered title. \| Use this bot. Report bugs. \| Suggested by Dominic3203 \| Category:Theoretical computer science \| #UCB_Category 99/137
(3 intermediate revisions by 3 users not shown)
Line 1: {{Short description\|Function that counts distinct factors of a string}} In [[computer science]], the '''complexity function''' of a ''word'' or ''[[string (computer science)\|string]]'' (a finite or infinite sequence of symbols from some [[alphabet (formal languages)\|alphabet]]) is the function that counts the number of distinct ''factors'' (substrings of consecutive symbols) of that string. More generally, the complexity function of a [[formal language]] (a set of finite strings) counts the number of distinct words of given length. Line 13 ⟶ 14: A set ''S'' of finite binary words is ''balanced'' if for each ''n'' the subset ''S''<sub>''n''</sub> of words of length ''n'' has the property that the [[Hamming weight]] of the words in ''S''<sub>''n''</sub> takes at most two distinct values. A '''balanced sequence''' is one for which the set of factors is balanced.<ref name=AS313>Allouche & Shallit (2003) p.313</ref> A balanced sequence has complexity function at most ''n''+1.<ref name=L48>Lothaire (2011) p.48</ref> A [[Sturmian word]] over a binary alphabet is one with complexity function ''n'' + 1.<ref name=PF6>Pytheas Fogg (2002) p.6</ref> A sequence is Sturmian [[if and only if]] it is balanced and aperiodic.<ref name=L46>Lothaire (2011) p.46</ref><ref name=AS318>Allouche & Shallit (2003) p.318</ref> An example is the [[Fibonacci word]].<ref name=PF6/><ref name=deluca1995>{{cite journal \| journal=Information Processing Letters \| year=1995 \| pages=307–312 \| doi=10.1016/0020-0190(95)00067-M \| title=A division property of the Fibonacci word \| first=Aldo \| last=de Luca \| volume=54 \| issue=6 }}</ref> More generally, a Sturmian word over an alphabet of size ''k'' is one with complexity ''n''+''k''−1. An Arnoux-Rauzy word over a ternary alphabet has complexity 2''n'' + 1:<ref name=PF6/> an example is the [[Tribonacci word]].<ref name=PF368>Pytheas Fogg (2002) p.368</ref> For [[recurrent word]]s, those in which each factor appears infinitely often, the complexity function almost characterises the set of factors: if ''s'' is a recurrent word with the same complexity function as ''t'' are then ''s'' has the same set of factors as ''t'' or δ''t'' where δ denotes the letter doubling morphism ''a'' → ''aa''.<ref name=BLRS84>Berstel et al (2009) p.84</ref> Line 31 ⟶ 32: :<math>H_{\mathrm{top}}(u) = \lim_{n \rightarrow \infty} \frac{\log p_u(n)}{n \log k} \ . </math> The limit exists as the logarithm of the complexity function is [[Subadditivity\|subadditive]].<ref name=PF4>Pytheas Fogg (2002) p.4</ref><ref name=AS303>Allouche & Shallit (2003) p.303</ref> Every [[real number]] between 0 and 1 occurs as the topological entropy of some sequence is applicable,<ref name=CN169>Cassaigne & Nicolas (2010) p.169</ref> which may be taken to be [[Uniformly recurrent word\|uniformly recurrent]]<ref name=BR391>Berthé & Rigo (2010) p.391</ref> or even uniquely ergodic.<ref name=BR169>Berthé & Rigo (2010) p.169</ref> For ''x'' a real number and ''b'' an [[integer]] ≥ 2 then the complexity function of ''x'' in base ''b'' is the complexity function ''p''(''x'',''b'',''n'') of the sequence of digits of ''x'' written in base ''b''. If ''x'' is an [[irrational number]] then ''p''(''x'',''b'',''n'') ≥ ''n''+1; if ''x'' is [[rational number\|rational]] then ''p''(''x'',''b'',''n'') ≤ ''C'' for some constant ''C'' depending on ''x'' and ''b''.<ref name=Bug91>Bugeaud (2012) p.91</ref> It is conjectured that for algebraic irrational ''x'' the complexity is ''b''<sup>''n''</sup> (which would follow if all such numbers were [[Normal number\|normal]]) but all that is known in this case is that ''p'' grows faster than any linear function of ''n''.<ref name=BR414>Berthé & Rigo (2010) p.414</ref> The '''abelian complexity function''' ''p''<sup>ab</sup>(''n'') similarly counts the number of occurrences of distinct factors of given length ''n'', where now we identify factors that differ only by a permutation of the positions. Clearly ''p''<sup>ab</sup>(''n'') ≤ ''p''(''n''). The abelian complexity of a Sturmian sequence satisfies ''p''<sup>ab</sup>(''n'') = 2.<ref>{{cite book \| chapter=On the Asymptotic Abelian Complexity of Morphic Words \| title=Developments in Language Theory. Proceedings, 17th International Conference, DLT 2013, Marne-la-Vallée, France, June 18-21, 2013 \| pages=94–105 \| year=2013 \| doi=10.1007/978-3-642-38771-5_10 \| isbn=978-3-642-38770-8 \| series=Lecture Notes in Computer Science \| volume=7907 \| issn=0302-9743 \| publisher=[[Springer-Verlag]] \| ___location=Berlin, Heidelberg \| editor1-first=Marie-Pierre \| editor1-last=Béal \| editor2-first=Olivier \| editor2-last=Carton \| author1-first=Francine \| author1-last=Blanchet-Sadri \| author2-first=Nathan \| author2-last=Fox }}</ref> ==References==