Content deleted Content added
Adamant.pwn (talk | contribs) →History: missing "that" |
m Moving Category:Finite automata to Category:Finite-state machines per Wikipedia:Categories for discussion/Speedy |
||
(52 intermediate revisions by 23 users not shown) | |||
Line 1:
{{short description|Deterministic finite automaton accepting set of all suffixes of particular string}}
{{Use shortened footnotes|date=December 2020}}
{{Infobox data structure
|
|
|
|
|
|
|
}}
In [[computer science]], a '''suffix automaton''' is
In terms of [[automata theory]], a suffix automaton is the [[Minimal automaton|minimal]] partial [[deterministic finite automaton]] that recognizes the set of [[Substring|suffixes]] of a given [[String (computer science)|string]] <math>S=s_1 s_2 \dots s_n</math>. The [[state graph]] of a suffix automaton is called a directed acyclic word graph (DAWG), a term that is also sometimes used for any [[deterministic acyclic finite state automaton]].
Suffix automata were introduced in 1983 by a group of scientists from the [[University of Denver]] and the [[University of Colorado Boulder]]. They suggested a [[linear time]] [[online algorithm]] for its construction and showed that the suffix automaton of a string <math>S</math> having length at least two characters has at most <math display="inline">2|S| - 1</math> states and at most <math display="inline">3|S| - 4 </math> transitions. Further works have shown a close connection between suffix automata and [[suffix tree]]s, and have outlined several generalizations of suffix automata, such as compacted suffix automaton obtained by compression of nodes with a single outgoing arc.
Line 23 ⟶ 22:
[[File:Anselm Blumer with DAWG.jpg|thumb|288x288px|Anselm Blumer with a drawing of generalized CDAWG for strings ''ababc'' and ''abcab'']]
The concept of suffix automaton was introduced in 1983<ref name=":16" /> by a group of scientists from [[University of Denver]] and [[University of Colorado Boulder]] consisting of Anselm Blumer, Janet Blumer, [[Andrzej Ehrenfeucht]], [[David Haussler]] and Ross McConnell, although similar concepts had earlier been studied alongside suffix trees in the works of Peter Weiner,<ref name=":5" /> Vaughan Pratt<ref>{{harvp|Pratt|1973}}</ref> and [[Anatol Slissenko]].<ref>{{harvp|Slisenko|1983}}</ref> In their initial work, Blumer ''et al''. showed a suffix automaton built for the string <math>S</math> of length greater than <math>1</math> has at most <math>2|S| -1</math> states and at most <math>3|S| -4</math> transitions, and suggested a linear [[algorithm]] for automaton construction.<ref name=":2">{{harvp|Blumer
In 1983, Mu-Tian Chen and Joel Seiferas independently showed that Weiner's 1973 suffix-tree construction algorithm<ref name=":5">{{harvp|Weiner|1973}}</ref> while building a suffix tree of the string <math>S</math> constructs a suffix automaton of the reversed string <math display="inline">S^R</math> as an auxiliary structure.<ref name=":6">{{harvp|Chen
== Definitions ==
Line 37 ⟶ 36:
* "Concatenation of languages" <math>A</math> and <math>B</math> is denoted as <math>A \cdot B</math> or <math>AB</math> and corresponds to the set of pairwise concatenations <math>AB = \{\alpha \beta : \alpha \in A, \beta \in B\}</math>;
* If the word <math>\omega\in\Sigma^*</math> may be represented as <math>\omega = \alpha\gamma\beta</math>, where <math>\alpha,\beta,\gamma \in \Sigma^*</math>, then words <math>\alpha</math>, <math>\beta</math> and <math>\gamma</math> are called "prefix", "suffix" and "[[subword]]" (substring) of the word <math>\omega</math> correspondingly;
* If <math>T = T_1 \dots T_n</math> and <math>T_l T_{l+1} \dots T_r = S</math> (with <math>1 \leq l \leq r \leq n</math>) then <math>S</math> is said to "occur" in <math>T</math> as a subword. Here <math>l</math> and <math>r</math> are called left and right positions of occurrence of <math>S</math> in <math>T</math> correspondingly.
== Automaton structure ==
Formally, [[deterministic finite automaton]] is determined by [[5-tuple]] <math>\mathcal A = (\Sigma, Q, q_0, F, \delta)</math>, where:
* <math>\Sigma</math> is an "alphabet" that is used to construct words,
Line 48 ⟶ 47:
* <math>\delta : Q \times \Sigma \mapsto Q</math> is a [[Partial function|partial]] "transition" function of automaton, such that <math>\delta(q, \sigma)</math> for <math>q \in Q</math> and <math>\sigma \in \Sigma</math> is either undefined or defines a transition from <math>q</math> over character <math>\sigma</math>.
Most commonly, deterministic finite automaton is represented as a [[directed graph]] ("diagram") such that:<ref name=":14">{{harvp|
* Set of graph [[Vertex (graph theory)|vertices]] corresponds to the state of states <math>Q</math>,
Line 56 ⟶ 55:
* Specifically, every transition <math display="inline">\delta(q_1, \sigma) = q_2</math> is represented by an arc from <math>q_1</math> to <math>q_2</math> marked with the character <math>\sigma</math>. This transition also may be denoted as <math display="inline">q_1 \begin{smallmatrix}{\sigma}\\[-5pt]{\longrightarrow}\end{smallmatrix} q_2</math>.
In terms of its diagram, the automaton recognizes the word <math>\omega=\omega_1 \omega_2 \dots \omega_m</math> only if there is a path from the initial vertex <math>q_0</math> to some final vertex <math>q \in F</math> such that concatenation of characters on this path forms <math>\omega</math>. The set of words recognized by an automaton forms a language that is set to be recognized by the automaton. In these terms, the language recognized by a suffix automaton of <math>S</math> is the language of its (possibly empty) suffixes.<ref name=":1">{{harvp|Crochemore
=== Automaton states ===
{{Main|DFA minimization}}
"Right context" of the word <math>\omega</math> with respect to language <math>L</math> is a set <math>[\omega]_R=\{\alpha : \omega\alpha \in L\}</math> that is a set of words <math>\alpha</math> such that their concatenation with <math>\omega</math> forms a word from <math>L</math>. Right contexts induce a natural [[equivalence relation]] <math>[\alpha]_R = [\beta]_R</math> on the set of all words. If language <math>L</math> is recognized by some deterministic finite automaton, there exists unique up to [[isomorphism]] automaton that recognizes the same language and has the minimum possible number of states. Such an automaton is called a [[minimal automaton]] for the given language <math>L</math>. [[Myhill–Nerode theorem]] allows it to define it explicitly in terms of right contexts:<ref>{{harvp|Рубцов|2019|loc=|pp=89—94}}</ref><ref>{{harvp|Hopcroft
{{Math theorem
|
* Alphabet <math>\Sigma</math> stays the same,
Line 72 ⟶ 71:
}}
In these terms, a "suffix automaton" is the minimal deterministic finite automaton recognizing the language of suffixes of the word <math>S=s_1s_2\dots s_n</math>.
{{Math theorem
|
There is a following bijection between <math>endpos(\omega)</math> and <math>[\omega]_R</math>:
Line 84 ⟶ 83:
For example, for the word <math>S=abacaba</math> and its subword <math>\omega = ab</math>, it holds <math>endpos(ab)=\{2,6\}</math> and <math>[ab]_R = \{a,acaba\}</math>. Informally, <math>[ab]_R</math> is formed by words that follow occurrences of <math>ab</math> to the end of <math>S</math> and <math>endpos(ab)</math> is formed by right positions of those occurrences. In this example, the element <math>x=2 \in endpos(ab)</math> corresponds with the word <math>s_3 s_4 s_5 s_6 s_7 = acaba \in [ab]_R</math> while the word <math>a \in [ab]_R</math> corresponds with the element <math>7-|a|=6 \in endpos(ab)</math>.
It implies several structure properties of suffix automaton states. Let <math>|\alpha|
* If <math>[\alpha]_R</math> and <math>[\beta]_R</math> have at least one common element <math>x</math>, then <math>endpos(\alpha)</math> and <math>endpos(\beta)</math> have a common element as well. It implies <math>\alpha</math> is a suffix of <math>\beta</math> and therefore <math>endpos(\beta) \subset endpos(\alpha)</math> and <math>[\beta]_R \subset [\alpha]_R</math>. In aforementioned example, <math>a \in [ab]_R \cap [cab]_R</math>, so <math>ab</math> is a suffix of <math>cab</math> and thus <math>[cab]_R=\{a\} \subset \{a,acaba\} = [ab]_R</math> and <math>endpos(cab) = \{6\} \subset \{2,6\} = endpos(ab)</math>;
* If <math>[\alpha]_R=[\beta]_R</math>, then <math>endpos(\alpha)=endpos(\beta)</math>, thus <math>\alpha</math> occurs in <math>S</math> only as a suffix of <math>\beta</math>. For example, for <math>\alpha=b</math> and <math>\beta = ab</math> it holds that <math>[b]_R = [ab]_R = \{a,acaba\}</math> and <math>endpos(b)=endpos(ab)=\{2,6\}</math>;
* If <math>[\alpha]_R=[\beta]_R</math> and <math>\gamma</math> is a suffix of <math>\beta</math> such that <math>|\alpha|
Any state <math>q = [\alpha]_R</math> of the suffix automaton recognizes some continuous [[Total order|chain]] of nested suffixes of the longest word recognized by this state.<ref name=":10" />
"Left extension" <math>\overset{\scriptstyle{\leftarrow}}{\gamma}</math> of the string <math>\gamma</math> is the longest string <math>\omega</math> that has the same right context as <math>\gamma</math>. Length <math>|\overset{\scriptstyle{\leftarrow}}{\gamma}|</math> of the longest string recognized by <math>q=[\gamma]_R</math> is denoted by <math>len(q)</math>. It holds:<ref name=":3">{{harvp|Inenaga
{{Math theorem
|
}}
Line 103 ⟶ 102:
{{Math theorem
|
# [[Vertex (graph theory)|Vertices]] <math>V</math> of the tree correspond to left extensions <math>\overleftarrow{\omega}</math> of all <math>S</math> substrings,
# [[Edge (graph theory)|Edges]] <math>E</math> of the tree connect pairs of vertices <math>(\overleftarrow{\omega},\overleftarrow{\alpha\omega})</math>, such that <math>\alpha \in \Sigma</math> and <math>\overleftarrow{\omega} \neq \overleftarrow{\alpha\omega}</math>.
Line 111 ⟶ 110:
[[File:Suffix structures diamond.svg|thumb|Relationship of the suffix trie, suffix tree, DAWG and CDAWG]]
{{Main|Suffix tree}}
A "[[prefix tree]]" (or "trie") is a rooted directed tree in which arcs are marked by characters in such a way no vertex <math>v</math> of such tree has two out-going arcs marked with the same character. Some vertices in trie are marked as final. Trie is said to recognize a set of words defined by paths from its root to final vertices. In this way prefix trees are a special kind of deterministic finite automata if you perceive its root as an initial vertex.<ref>{{harvp|Rubinchik
By its definition, a suffix automaton can be obtained via [[DFA minimization|minimization]] of the suffix trie. It may be shown that a compacted suffix automaton is obtained by both minimization of the suffix tree (if one assumes each string on the edge of the suffix tree is a solid character from the alphabet) and compaction of the suffix automaton.<ref name=":0">{{harvp|Inenaga
Similarly to right contexts one may introduce "left contexts" <math>[\omega]_L = \{\beta \in \Sigma^* : \beta \omega \in L\}</math>, "right extensions" <math>\overset{\scriptstyle{\rightarrow}}{\omega~}</math> corresponding to the longest string having same left context as <math>\omega</math> and the equivalence relation <math>[\alpha]_L = [\beta]_L</math>. If one considers right extensions with respect to the language <math>L</math> of "prefixes" of the string <math>S</math> it may be obtained:<ref name=":3" />
{{Math theorem
|
* Vertices <math>V</math> of the tree correspond to right extensions <math>\overrightarrow{\omega}</math> of all <math>S</math> substrings,
* Edges <math>E</math> correspond to triplets <math>(\overrightarrow{\omega},x\alpha, \overrightarrow{\omega x})</math> such that <math>x \in \Sigma</math> and <math>\overrightarrow{\omega x}=\overrightarrow{\omega} x \alpha</math>.
Line 125 ⟶ 124:
}}, which implies the suffix link tree of the string <math>S</math> and the suffix tree of the string <math>S^R</math> are isomorphic:<ref name=":11" />
▲{| class="wikitable mw-collapsible mw-collapsed"
|-
! Suffix structures of words "abbcbc" and "cbcbba"
Line 135 ⟶ 133:
File:Suffix tree for cbcbba.svg|Suffix tree of the word "cbcbba"<br>(Suffix link tree of the word "abbcbc")
</gallery>
|}
Similarly to the case of left extensions, the following lemma holds for right extensions:<ref name=":3" />
{{Math theorem
|
}}
=== Size ===
A suffix automaton of the string <math>S</math> of length <math>n>1</math> has at most <math>2n-1</math> states and at most <math>3n-4</math> transitions. These bounds are reached on strings <math>abb\dots bb=ab^{n-1}</math> and <math>abb\dots bc=ab^{n-2}c</math> correspondingly.<ref name=":9" /> This may be formulated in a stricter way as <math>|\delta|
{|
|-
! Maximal suffix automata
Line 157 ⟶ 155:
== Construction ==
Initially the automaton only consists of a single state corresponding to the empty word, then characters of the string are added one by one and the automaton is rebuilt on each step incrementally.<ref name=":12">{{harvp|Crochemore
=== State updates ===
Line 163 ⟶ 161:
{{Math theorem
|
* <math>[\alpha]_{R_{\omega x} } = [\alpha]_{R_\omega}x \cup \{\varepsilon\}</math> if <math>\alpha</math> is a suffix of <math>\omega x</math>;
* <math>[\alpha]_{R_{\omega x} } = [\alpha]_{R_\omega}x</math> otherwise.
Line 173 ⟶ 171:
{{Math theorem
|
* If <math>[u]_{R_\omega} = [v]_{R_\omega}</math> and <math>[u]_{R_\omega} \neq [\alpha]_{R_\omega}</math>, then <math>[u]_{R_{\omega x} } = [v]_{R_{\omega x} }</math>;
* If <math>[u]_{R_\omega} = [\alpha]_{R_\omega}</math> and <math>\vert u\vert \leq \vert \alpha\vert </math>, then <math>[u]_{R_{\omega x} } = [\alpha]_{R_{\omega x} }</math>;
Line 184 ⟶ 182:
=== Transitions and suffix links updates ===
After the character <math>x</math> is appended to <math>\omega</math> possible new states of suffix automaton are <math>[\omega x]_{R_{\omega x} }</math> and <math>[\alpha]_{R_{\omega x} }</math>. Suffix link from <math>[\omega x]_{R_{\omega x} }</math> goes to <math>[\alpha]_{R_{\omega x} }</math> and from <math>[\alpha]_{R_{\omega x} }</math> it goes to <math>link([\alpha]_{R_{\omega} })</math>. Words from <math>[\omega x]_{R_{\omega x} }</math> occur in <math>\omega x</math> only as its suffixes therefore there should be no transitions at all from <math>[\omega x]_{R_{\omega x}}</math> while transitions to it should go from suffixes of <math>\omega</math> having length at least <math>\alpha</math> and be marked with the character <math>x</math>. State <math>[\alpha]_{R_{\omega x}}</math> is formed by subset of <math>[\alpha]_{R_\omega}</math>, thus transitions from <math>[\alpha]_{R_{\omega x}}</math> should be same as from <math>[\alpha]_{R_\omega}</math>. Meanwhile, transitions leading to <math>[\alpha]_{R_{\omega x}}</math> should go from suffixes of <math>\omega</math> having length less than <math>|\alpha|</math> and at least <math>len(link([\alpha]_{R_{\omega} }))</math>, as such transitions have
▲{| class="wikitable mw-collapsible mw-collapsed"
|-
! colspan="2" |
|-
|
{|style="text-align:center;width:600px"
|+'''''∅ → a'''''
Line 200 ⟶ 197:
|style="vertical-align:top;"|<small>Similarly, only one leaf is added to the suffix tree.</small>
|}
|
{|style="text-align:center;width:600px"
|+'''''a → ab'''''
Line 210 ⟶ 207:
|}
|-
|
{|style="text-align:center;width:600px"
|+'''''ab → abb'''''
Line 219 ⟶ 216:
|style="vertical-align:top;"|<small>In the suffix tree it corresponds to the split of the edge leading to the vertex 2.</small>
|}
|
{|style="text-align:center;width:600px"
|+'''''abb → abbc'''''
Line 229 ⟶ 226:
|}
|-
|
{|style="text-align:center;width:600px"
|+'''''abbc → abbcb'''''
Line 238 ⟶ 235:
|style="vertical-align:top;"|<small>Correspondingly, new leaf is hanged on the vertex 4 in the suffix tree.</small>
|}
|
{|style="text-align:center;width:600px"
|+'''''abbcb → abbcbc'''''
Line 248 ⟶ 245:
|}
|}
=== Construction algorithm ===
Theoretical results above lead to the following algorithm that takes character
# The state corresponding to the word
# After
# States corresponding to suffixes of
# Once the aforementioned loop is over, there are 3 cases:
## If none of states on the suffix path had a transition by
## If the transition by
## If the transition is found but <math>len(q) > len(p)+1</math>, then words from
# If the previous step was concluded with the creation of
# Transitions that have
The whole procedure is described by the following pseudo-code:
'''function''' {{
'''define''' {{
'''assign''' {{
'''assign''' {{
'''while''' {{
'''assign''' {{
'''define''' {{
'''if''' {{
'''assign''' {{
'''else if''' {{
'''assign''' {{
'''else''':
'''define''' {{
'''assign''' {{
'''assign''' {{
'''assign''' {{
'''while''' {{
'''assign''' {{
Here <math>q_0</math> is the initial state of the automaton and <
=== Complexity ===
Line 296 ⟶ 292:
{{Math theorem
|
* <math>V = \{\overleftrightarrow \omega : \omega \in \Sigma^*\}</math> is a set of automaton states;
Line 302 ⟶ 298:
}}
Two-way extensions induce an equivalence relation <math display="inline">\overset{\scriptstyle\longleftrightarrow}{\alpha} = \overset{\scriptstyle\longleftrightarrow}{\beta}</math> which defines the set of words recognized by the same state of compacted automaton. This equivalence relation is a [[transitive closure]] of the relation defined by <math display="inline">(\overset{\scriptstyle{\rightarrow}}{\alpha\,}=\overset{\scriptstyle{\rightarrow}}{\beta\,}) \vee (\overset{\scriptstyle{\leftarrow}}{\alpha} = \overset{\scriptstyle{\leftarrow}}{\beta})</math>, which highlights the fact that a compacted automaton may be obtained by both gluing suffix tree vertices equivalent via <math>\overset{\scriptstyle{\leftarrow}}{\alpha} = \overset{\scriptstyle{\leftarrow}}{\beta}</math> relation (minimization of the suffix tree) and gluing suffix automaton states equivalent via <math>\overset{\scriptstyle{\rightarrow}}{\alpha\,}=\overset{\scriptstyle{\rightarrow}}{\beta\,}</math> relation (compaction of suffix automaton).<ref name=":13">{{harvp|Blumer
=== Suffix automaton of several strings ===
Consider a set of words <math>T=\{S_1, S_2, \dots, S_k\}</math>. It is possible to construct a generalization of suffix automaton that would recognize the language formed up by suffixes of all words from the set. Constraints for the number of states and transitions in such automaton would stay the same as for a single-word automaton if you put <math>n = |S_1|
This idea is further generalized to the case when <math>T</math> is not given explicitly but instead is given by a [[prefix tree]] with <math>Q</math> vertices. Mohri ''et al''. showed such an automaton would have at most <math>2Q-2</math> and may be constructed in linear time from its size. At the same time, the number of transitions in such automaton may reach <math>O(Q|\Sigma|)</math>, for example for the set of words <math>T=\{\sigma_1, a\sigma_1, a^2\sigma_1, \dots, a^n \sigma_1, a^n \sigma_2, \dots, a^n \sigma_k \}</math> over the alphabet <math>\Sigma=\{a,\sigma_1,\dots,\sigma_k\}</math> the total length of
=== Sliding window ===
Some [[compression algorithms]], such as [[LZ77]] and [[Run-length encoding|RLE]] may benefit from storing suffix automaton or similar structure not for the whole string but for only last <math>k</math> its characters while the string is updated. This is because compressing data is usually expressively large and using <math>O(n)</math> memory is undesirable. In 1985, Janet Blumer developed an algorithm to maintain a suffix automaton on a sliding window of size <math>k</math> in <math>O(nk)</math> worst-case and <math>O(n \log k)</math> on average, assuming characters are distributed independently and [[Uniform distribution (discrete)|uniformly]]. She also showed <math>O(nk)</math> complexity cannot be improved: if one considers words construed as a concatenation of several <math>(ab)^m c (ab)^m d</math> words, where <math>k=6m+2</math>, then the number of states for the window of size <math>k</math> would frequently change with jumps of order <math>m</math>, which renders even theoretical improvement of <math>O(nk)</math> for regular suffix automata impossible.<ref>{{harvp|Blumer|1987|loc=|p=|pp=461—465}}</ref>
The same should be true for the suffix tree because its vertices correspond to states of the suffix automaton of the reversed string but this problem may be resolved by not explicitly storing every vertex corresponding to the suffix of the whole string, thus only storing vertices with at least two out-going edges. A variation of McCreight's suffix tree construction algorithm for this task was suggested in 1989 by Edward Fiala and Daniel Greene;<ref>{{harvp|Fiala
One way to overcome this obstacle is to allow window width to vary a bit while staying <math>O(k)</math>. It may be achieved by an approximate algorithm suggested by Inenaga et al. in 2004. The window for which suffix automaton is built in this algorithm is not guaranteed to be of length <math>k</math> but it is guaranteed to be at least <math>k</math> and at most <math>2k+1</math> while providing linear overall complexity of the algorithm.<ref>{{harvp|Inenaga
== Applications ==
Suffix automaton of the string <math>S</math> may be used to solve such problems as:<ref name=":
* Counting the number of distinct substrings of <math>S</math> in <math>O(|S|)</math> on-line,
* Finding the longest substring of <math>S</math> occurring at least twice in <math>O(|S|)</math>,
* Finding the [[longest common substring]] of <math>S</math> and <math>T</math> in <math>O(|T|)</math>,
* Counting the number of occurrences of <math>T</math> in <math>S</math> in <math>O(|T|)</math>,
* Finding all occurrences of <math>T</math> in <math>S</math> in <math>O(|T|+k)</math>, where <math>k</math> is the number of occurrences.
It is assumed here that <math>T</math> is given on the input after suffix automaton of <math>S</math> is constructed.
Suffix automata are also used in data compression,<ref>{{harvp|Yamamoto
==References==
Line 334 ⟶ 330:
===Bibliography===
{{refbegin|2}}
*{{cite book|doi=10.1007/3-540-13345-3_9 |chapter=Building the minimal DFA for the set of all subwords of a word on-line in linear time |title=Automata, Languages and Programming |series=Lecture Notes in Computer Science |date=1984 |last1=Blumer |first1=A. |last2=Blumer |first2=J. |last3=Ehrenfeucht |first3=A. |last4=Haussler |first4=D. |last5=McConnell |first5=R. |volume=172 |pages=109–118 |isbn=978-3-540-13345-2 }}
*{{cite journal|doi=10.1145/28869.28873 |zbl=1433.68118 |title=Complete inverted files for efficient text retrieval and analysis |date=1987 |last1=Blumer |first1=A. |last2=Blumer |first2=J. |last3=Haussler |first3=D. |last4=McConnell |first4=R. |last5=Ehrenfeucht |first5=A. |journal=Journal of the ACM |volume=34 |issue=3 |pages=578–595 }}
*{{cite journal|doi=10.1016/0196-6774(87)90045-9 |zbl=0636.68109 |title=How much is that DAWG in the window? A moving window algorithm for the directed acyclic word graph |date=1987 |last1=Blumer |first1=Janet A. |journal=Journal of Algorithms |volume=8 |issue=4 |pages=451–469 }}
*{{cite journal|doi=10.3390/A11080118 |doi-access=free |zbl=1458.68043 |title=Sliding Suffix Tree |date=2018 |last1=Brodnik |first1=Andrej |last2=Jekovec |first2=Matevž |journal=Algorithms |volume=11 |issue=8 |page=118 }}
*{{cite book|doi=10.1007/978-3-642-82456-2_7 |chapter=Efficient and Elegant Subword-Tree Construction |title=Combinatorial Algorithms on Words |date=1985 |last1=Chen |first1=M. T. |last2=Seiferas |first2=Joel |pages=97–107 |isbn=978-3-642-82458-6 }}
*{{cite book|doi=10.1007/978-3-662-07675-0_9 |chapter=Automata for Matching Patterns |title=Handbook of Formal Languages |date=1997 |last1=Crochemore |first1=Maxime |last2=Hancart |first2=Christophe |pages=399–462 |isbn=978-3-642-08230-6 }}
*{{cite book|doi=10.1007/3-540-63246-8_12 |chapter=On compact directed acyclic word graphs |title=Structures in Logic and Computer Science |series=Lecture Notes in Computer Science |date=1997 |last1=Crochemore |first1=Maxime |last2=Vérin |first2=Renaud |volume=1261 |pages=192–211 |isbn=978-3-540-63246-7 }}
*{{cite book|doi=10.1007/978-3-540-39984-1_16 |chapter=A Bit-Parallel Suffix Automaton Approach for (δ,γ)-Matching in Music Retrieval |title=String Processing and Information Retrieval |series=Lecture Notes in Computer Science |date=2003 |last1=Crochemore |first1=Maxime |last2=Iliopoulos |first2=Costas S. |last3=Navarro |first3=Gonzalo |last4=Pinzon |first4=Yoan J. |volume=2857 |pages=211–223 |isbn=978-3-540-20177-9 }}
*{{cite book |last1=Serebryakov |first1=Vladimir |last2=Galochkin |first2=Maksim Pavlovich |last3=Furugian |first3=Meran Gabibullaevich |last4=Gonchar |first4=Dmitriy Ruslanovich |year=2006 |title=Теория и реализация языков программирования: Учебное пособие |url=http://trpl7.ru/t-books/_TRYAPBOOK_pdf.pdf |language=Russian |___location=Moscow |publisher=MZ Press |isbn=5-94073-094-9}}
*{{cite book|doi=10.1007/978-3-319-38827-4_12 |chapter=Evaluation and Improvement of Fast Algorithms for Exact Matching on Genome Sequences |title=Algorithms for Computational Biology |series=Lecture Notes in Computer Science |date=2016 |last1=Faro |first1=Simone |volume=9702 |pages=145–157 |isbn=978-3-319-38826-7 }}
*{{cite journal|doi=10.1145/63334.63341 |title=Data compression with finite windows |date=1989 |last1=Fiala |first1=E. R. |last2=Greene |first2=D. H. |journal=Communications of the ACM |volume=32 |issue=4 |pages=490–505 }}
*{{cite book|doi=10.4230/LIPICS.MFCS.2016.38 |doi-access=free |zbl=1398.68703 |date=2016 |last1=Fujishige |first1=Yuta |last2=Tsujimaru |first2=Yuki |last3=Inenaga |first3=Shunsuke |last4=Bannai |first4=Hideo |last5=Takeda |first5=Masayuki |chapter=Computing DAWGs and Minimal Absent Words in Linear Time for Integer Alphabets |title=41st International Symposium on Mathematical Foundations of Computer Science (MFCS 2016) |series=Leibniz International Proceedings in Informatics |pages=38:1–38:14 |publisher=Schloss Dagstuhl – Leibniz-Zentrum für Informatik }}
*{{cite book |last1=Hopcroft |first1=John Edward |last2=Ullman |first2=Jeffrey David |year=1979 |title=Introduction to Automata Theory, Languages, and Computation |edition=1st |___location=Massachusetts |publisher=Addison-Wesley |isbn=978-81-7808-347-6 |ol=9082218M}}
*{{cite journal |last=Inenaga |first=Shunsuke |date=2003 |title=Bidirectional Construction of Suffix Trees |url=https://str.i.kyushu-u.ac.jp/~inenaga/papers/NJC_bidirectional.pdf |journal=Nordic Journal of Computing |volume=10 |issue=1 |pages=52–67 |citeseerx=10.1.1.100.8726}}
*{{cite journal|doi=10.1016/J.DAM.2004.04.012 |zbl=1084.68137 |title=On-line construction of compact directed acyclic word graphs |date=2005 |last1=Inenaga |first1=Shunsuke |last2=Hoshino |first2=Hiromasa |last3=Shinohara |first3=Ayumi |last4=Takeda |first4=Masayuki |last5=Arikawa |first5=Setsuo |last6=Mauri |first6=Giancarlo |last7=Pavesi |first7=Giulio |journal=Discrete Applied Mathematics |volume=146 |issue=2 |pages=156–179 }}
*{{cite book |last1=Inenaga |first1=Shunsuke |last2=Hoshino |first2=Hiromasa |last3=Shinohara |first3=Ayumi |last4=Takeda |first4=Masayuki |last5=Arikawa |first5=Setsuo |year=2001 |chapter=Construction of the CDAWG for a trie |chapter-url=http://www.shino.ecei.tohoku.ac.jp/~ayumi/papers/PSC2001.pdf |title=Prague Stringology Conference. Proceedings |pages=37–48 |citeseerx=10.1.1.24.2637}}
*{{cite journal|doi=10.1016/S1570-8667(03)00064-9 |zbl=1118.68755 |title=Compact directed acyclic word graphs for a sliding window |date=2004 |last1=Inenaga |first1=Shunsuke |last2=Shinohara |first2=Ayumi |last3=Takeda |first3=Masayuki |last4=Arikawa |first4=Setsuo |journal=Journal of Discrete Algorithms |volume=2 |pages=33–51 }}
*{{cite book|doi=10.1109/DCC.1996.488324 |chapter=Extended application of suffix trees to data compression |title=Proceedings of Data Compression Conference - DCC '96 |date=1996 |last1=Larsson |first1=N.J. |pages=190–199 |isbn=0-8186-7358-3 }}
*{{cite journal|doi=10.1016/J.TCS.2009.03.034 |zbl=1194.68143 |title=General suffix automaton construction algorithm and space bounds |date=2009 |last1=Mohri |first1=Mehryar |last2=Moreno |first2=Pedro |last3=Weinstein |first3=Eugene |journal=Theoretical Computer Science |volume=410 |issue=37 |pages=3553–3562 }}
*{{Cite book |last=Паращенко |first=Дмитрий А. |year=2007 |title=Обработка строк на основе суффиксных автоматов |url=http://is.ifmo.ru/diploma-theses/paraschenko/doc.pdf |language=Russian |___location=Saint Petersburg |publisher=ITMO University}}
*{{Cite book |last=Pratt |first=Vaughan Ronald |year=1973 |title=Improvements and applications for the Weiner repetition finder |oclc=726598262}}
*{{Cite book |last=Рубцов |first=Александр Александрович |url=http://rubtsov.su/public/books/zz-a5-online.pdf |year=2019 |title=Заметки и задачи о регулярных языках и конечных автоматах |language=Russian |___location=Moscow |publisher=Moscow Institute of Physics and Technology |isbn=978-5-7417-0702-9}}
*{{cite journal|doi=10.1016/J.EJC.2017.07.021 |zbl=1374.68131 |title=EERTREE: An efficient data structure for processing palindromes in strings |date=2018 |last1=Rubinchik |first1=Mikhail |last2=Shur |first2=Arseny M. |journal=European Journal of Combinatorics |volume=68 |pages=249–265 |arxiv=1506.04862 }}
*{{cite book|doi=10.1007/978-3-540-89097-3_12 |chapter=Sliding CDAWG Perfection |title=String Processing and Information Retrieval |series=Lecture Notes in Computer Science |date=2008 |last1=Senft |first1=Martin |last2=Dvořák |first2=Tomáš |volume=5280 |pages=109–120 |isbn=978-3-540-89096-6 }}
*{{cite journal|doi=10.1007/BF01084395 |zbl=0509.68043 |title=Detection of periodicities and string-matching in real time |date=1983 |last1=Slisenko |first1=A. O. |journal=Journal of Soviet Mathematics |volume=22 |issue=3 |pages=1316–1387 }}
*{{cite book|doi=10.1109/SWAT.1973.13 |chapter=Linear pattern matching algorithms |title=14th Annual Symposium on Switching and Automata Theory (Swat 1973) |date=1973 |last1=Weiner |first1=Peter |pages=1–11 }}
*{{cite book |doi=10.4230/LIPICS.STACS.2014.675 |doi-access=free |zbl=1359.68341 |date=2014 |last1=Yamamoto |first1=Jun'ichi |last2=I |first2=Tomohiro |last3=Bannai |first3=Hideo |last4=Inenaga |first4=Shunsuke |last5=Takeda |first5=Masayuki |chapter=Faster Compact On-Line Lempel-Ziv Factorization |title=31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014) |series=Leibniz International Proceedings in Informatics |pages=675–686 |publisher=Schloss Dagstuhl – Leibniz-Zentrum für Informatik }}
{{refend}}
Line 369 ⟶ 365:
{{Strings |state=collapsed}}
[[Category:
[[Category:
|