m →Booth's Algorithm: remove python anti-pattern: replace range(len(S)) to enumerate(). see http://lignos.org/py_antipatterns/ |
m Open access bot: arxiv updated in citation with #oabot. |
(24 intermediate revisions by 15 users not shown) | |||
Line 1:
In [[computer science]], the '''lexicographically minimal string rotation''' (LMSR) or '''lexicographically least circular substring''' is the problem of finding the [[String_(computer_science)#Rotations|rotation of a string]] possessing the lowest [[lexicographical order]] of all such rotations. For example, the lexicographically minimal rotation of "bbaaccaadd" is "aaccaaddbb". LMSR is widely used in [[Graph isomorphism|equality checking of graphs]], polygons, automata and chemical structures.<ref name="Wang"/>
It is possible for a string to have multiple | author = Kellogg S. Booth
| last2 = Colbourn | first2 = Charles J.
Line 12 ⟶ 14:
| url = http://epubs.siam.org/sicomp/resource/1/smjcat/v10/i1/p203_s
| doi = 10.1137/0210015
| issn = 0097-5397
}}
</ref>
A common implementation trick when dealing with circular strings is to concatenate the string to itself instead of having to perform [[modular arithmetic]] on the string indices.
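The doubling trick can be illustrated with a minimal Python sketch. The helper names <code>rotation</code> and <code>rotation_mod</code> are hypothetical and chosen only for this illustration; the second function shows the modular-arithmetic formulation for comparison.

<syntaxhighlight lang="python">
def rotation(s: str, k: int) -> str:
    """Return the rotation of s starting at index k, read as a slice of s + s."""
    n = len(s)
    doubled = s + s  # every rotation of s is a length-n substring of s + s
    return doubled[k:k + n]


def rotation_mod(s: str, k: int) -> str:
    """Same rotation, computed with modular arithmetic on the indices."""
    n = len(s)
    return "".join(s[(k + i) % n] for i in range(n))
</syntaxhighlight>

Both functions return the same string for any <code>0 <= k < len(s)</code>; the doubled form trades extra memory for simpler indexing.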
Line 18 ⟶ 21:
==Algorithms==
===The Naive Algorithm===
The naive algorithm for finding the lexicographically minimal rotation of a string is to iterate through successive rotations while keeping track of the most lexicographically minimal rotation encountered. If the string is of length {{mvar|n}}, this algorithm runs in {{math|''O''(''n''<sup>2</sup>)}} time in the worst case.
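A minimal sketch of this naive approach in Python follows. The function name <code>least_rotation_naive</code> is an assumption made for this example, and ties between equal rotations are resolved in favour of the smallest start index.

<syntaxhighlight lang="python">
def least_rotation_naive(s: str) -> int:
    """Return the start index of a lexicographically minimal rotation of s.

    Compares each of the n rotations against the best one seen so far;
    each comparison may inspect up to n characters, hence O(n^2) overall.
    """
    n = len(s)
    doubled = s + s  # rotation k is doubled[k:k + n]
    best = 0
    for k in range(1, n):
        # Strict comparison keeps the earliest index among equal rotations.
        if doubled[k:k + n] < doubled[best:best + n]:
            best = k
    return best
</syntaxhighlight>

For the string "bbaaccaadd" this sketch returns index 2, which corresponds to the rotation "aaccaaddbb".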
===Booth's Algorithm===
An efficient algorithm was proposed by Booth (1980).<ref>{{cite journal
| author = Kellogg S. Booth
Line 28 ⟶ 31:
| publisher = Elsevier
| volume = 10
| number =
| pages = 240–242
| year = 1980
| doi = 10.1016/0020-0190(80)90149-0
| issn = 0020-0190 }}
</ref>
The algorithm uses a modified preprocessing function from the [[Knuth–Morris–Pratt algorithm|Knuth–Morris–Pratt string search algorithm]].
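<syntaxhighlight lang="python">
def least_rotation(s: str) -> int:
    """Booth's lexicographically minimal string rotation algorithm."""
    n = len(s)
    f = [-1] * (2 * n)  # failure function table
    k = 0  # start index of the least rotation found so far
    for j in range(1, 2 * n):
        i = f[j - k - 1]
        while i != -1 and s[j % n] != s[(k + i + 1) % n]:
            if s[j % n] < s[(k + i + 1) % n]:
                k = j - i - 1
            i = f[i]
        if i == -1 and s[j % n] != s[(k + i + 1) % n]:
            if s[j % n] < s[(k + i + 1) % n]:
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k
</syntaxhighlight>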
Of interest is that removing all lines of code which modify the value of {{mvar|k}} results in the original Knuth–Morris–Pratt preprocessing function, as {{mvar|k}} (representing the rotation) will remain zero. Booth's algorithm runs in {{tmath|O(n)}} time, where {{mvar|n}} is the length of the string. The algorithm performs at most {{tmath|3n}} comparisons in the worst case, and requires auxiliary memory of length {{mvar|n}} to hold the failure function table.
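The relationship to the Knuth–Morris–Pratt preprocessing function can be sketched as follows. This is an illustrative reduction, assuming the rotation index stays fixed at zero and the loop is restricted to a single copy of the string; the function name <code>failure_function</code> is hypothetical. Entries use the −1-based convention, so each value is the length of the longest proper border of the corresponding prefix, minus one.

<syntaxhighlight lang="python">
def failure_function(s: str) -> list[int]:
    """KMP-style failure table for s, using -1 to mean 'no border'."""
    n = len(s)
    f = [-1] * n
    for j in range(1, n):
        i = f[j - 1]
        # Fall back through shorter borders until one can be extended by s[j].
        while i != -1 and s[j] != s[i + 1]:
            i = f[i]
        if i == -1 and s[j] != s[i + 1]:
            f[j] = -1  # no border of s[0..j] exists
        else:
            f[j] = i + 1  # the border ending at i is extended by one character
    return f
</syntaxhighlight>

For example, the prefixes of "abab" have longest proper borders of lengths 0, 0, 1 and 2, so the table is <code>[-1, -1, 0, 1]</code>.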
===Shiloach's Fast Canonization Algorithm===
Shiloach (1981)<ref>{{cite journal
| title = Fast canonization of circular strings
Line 70 ⟶ 73:
| issn = 0196-6774
| doi = 10.1016/0196-6774(81)90013-4
| author = Yossi Shiloach }}
</ref>
proposed an algorithm improving on Booth's result in terms of performance. It was observed that if there are
The algorithm is divided into two phases. The first phase is a quick sieve which rules out indices that are obviously not starting locations for the lexicographically minimal rotation. The second phase then finds the lexicographically minimal rotation start index from the indices which remain.
===Duval's Lyndon Factorization Algorithm===
Duval (1983)<ref>{{cite journal
| title = Factorizing words over an ordered alphabet
Line 88 ⟶ 90:
| issn = 0196-6774
| doi = 10.1016/0196-6774(83)90017-2
| author = Jean Pierre Duval }}
</ref>
Line 103 ⟶ 104:
| pages = 236–238
| year = 1979
| doi = 10.1016/0020-0190(79)90114-5
| issn = 0020-0190 }}
</ref>
proposed an algorithm to efficiently compare two circular strings for equality without a normalization requirement. An additional application which arises from the algorithm is the fast generation of certain chemical structures without repetitions.
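
The idea of comparing two circular strings without first normalizing either of them can be illustrated with a well-known two-pointer scheme. The sketch below is not claimed to be the exact procedure from the cited paper, and the function name <code>cyclic_equal</code> is an assumption made for this example. Whenever a mismatch is found after {{mvar|k}} matching characters, the {{math|''k'' + 1}} start positions just examined in the string holding the larger character cannot begin a minimal rotation shared by both strings, so they are skipped.

<syntaxhighlight lang="python">
def cyclic_equal(a: str, b: str) -> bool:
    """Return True if b is a rotation of a, without canonizing either string."""
    n = len(a)
    if n != len(b):
        return False
    if n == 0:
        return True
    i = j = 0  # candidate start positions in a and b
    while i < n and j < n:
        k = 0
        # Extend the match between rotation i of a and rotation j of b.
        while k < n and a[(i + k) % n] == b[(j + k) % n]:
            k += 1
        if k == n:
            return True
        # Skip the start positions ruled out by the mismatch.
        if a[(i + k) % n] > b[(j + k) % n]:
            i += k + 1
        else:
            j += k + 1
    return False
</syntaxhighlight>

Each iteration advances one of the two pointers by the number of characters it compared, so the total work in this sketch is linear in the string length.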
A variant for [[quantum computing]] was proposed by Wang & Ying (2024).<ref name="Wang">{{cite journal
| author = Wang, Q., Ying, M.
| title = Quantum Algorithm for Lexicographically Minimal String Rotation
| journal = Theory Comput Syst
| volume = 68
| pages = 29–74
| year = 2024
| doi = 10.1007/s00224-023-10146-8
| arxiv = 2012.09376
}}
</ref> They show that the quantum algorithm outperforms any (classical) randomized algorithm in both the worst and average cases.
==See also==
* [[Lyndon word]]
* [[
==References==
Line 118 ⟶ 130:
[[Category:Problems on strings]]
[[Category:Lexicography]]
[[Category:Articles with example code]]