Lexicographically minimal string rotation: Difference between revisions

Content deleted Content added
OAbot (talk | contribs)
m Open access bot: url-access updated in citation with #oabot.
OAbot (talk | contribs)
m Open access bot: arxiv updated in citation with #oabot.
 
(4 intermediate revisions by 3 users not shown)
Line 1:
In [[computer science]], the '''lexicographically minimal string rotation''' (LMSR) or '''lexicographically least circular substring''' is the problem of finding the [[String_(computer_science)#Rotations|rotation of a string]] possessing the lowest [[lexicographical order]] of all such rotations. For example, the lexicographically minimal rotation of "bbaaccaadd" would be "aaccaaddbb". LMSR is widely used in [[Graph isomorphism|equality checking of graphs]], polygons, automata and chemical structures.<ref name="Wang"/>

It is possible for a string to have multiple lexicographically minimal rotationsLMSRs, but for most applications this does not matter as the rotations must be equivalent. Finding the lexicographically minimal rotation is useful as a way of [[Text normalization|normalizing]] strings. If the strings represent potentially [[isomorphism|isomorphic]] structures such as [[Graph (discrete mathematics)|graphs]], normalizing in this way allows for simple equality checking.<ref>{{cite journal
| author = Kellogg S. Booth
| last2 = Colbourn | first2 = Charles J.
Line 19 ⟶ 21:
==Algorithms==
 
===The Naivenaive Algorithmalgorithm===
The naive algorithm for finding the lexicographically minimal rotation of a string is to iterate through successive rotations while keeping track of the most lexicographically minimal rotation encountered. If the string is of length {{mvar|n}}, this algorithm runs in {{math|''O''(''n''<sup>2</sup>)}} time in the worst case.
 
===Booth's Algorithmalgorithm===
An efficient algorithm was proposed by Booth (1980).<ref>{{cite journal
| author = Kellogg S. Booth
Line 60 ⟶ 62:
Of interest is that removing all lines of code which modify the value of {{mvar|k}} results in the original Knuth-Morris-Pratt preprocessing function, as {{mvar|k}} (representing the rotation) will remain zero. Booth's algorithm runs in {{tmath|O(n)}} time, where {{mvar|n}} is the length of the string. The algorithm performs at most {{tmath|3n}} comparisons in the worst case, and requires auxiliary memory of length {{mvar|n}} to hold the failure function table.
 
===Shiloach's Fastfast Canonizationcanonization Algorithmalgorithm===
Shiloach (1981)<ref>{{cite journal
| title = Fast canonization of circular strings
Line 77 ⟶ 79:
The algorithm is divided into two phases. The first phase is a quick sieve which rules out indices that are obviously not starting locations for the lexicographically minimal rotation. The second phase then finds the lexicographically minimal rotation start index from the indices which remain.
 
===Duval's Lyndon Factorizationfactorization Algorithmalgorithm===
Duval (1983)<ref>{{cite journal
| title = Factorizing words over an ordered alphabet
Line 106 ⟶ 108:
</ref>
proposed an algorithm to efficiently compare two circular strings for equality without a normalization requirement. An additional application which arises from the algorithm is the fast generation of certain chemical structures without repetitions.
 
A variant for [[quantum computing]] was proposed by Wang & Ying (2024).<ref name="Wang">{{cite journal
| author = Wang, Q., Ying, M.
| title = Quantum Algorithm for Lexicographically Minimal String Rotation
| journal = Theory Comput Syst
| volume = 68
| pages = 29–74
| year = 2024
| doi = 10.1007/s00224-023-10146-8
| arxiv = 2012.09376
}}
</ref> They show that the quantum algorithm outperforms any (classical) randomized algorithms in both worst and average cases.
 
==See also==