Kleene's algorithm: Difference between revisions

Content deleted Content added
m Reverted edits by 38.18.114.116 (talk) to last version by Citation bot
 
(5 intermediate revisions by 4 users not shown)
Line 1:
In [[theoretical computer science]], in particular in [[formal language theory]], '''Kleene's algorithm''' transforms a given [[nondeterministic finite automaton]] (NFA) into a [[regular expression]].
Together with other conversion algorithms, it establishes the equivalence of several description formats for [[regular language]]s. Alternative presentations of the same method include the "elimination method" attributed to [[Janusz Brzozowski (computer scientist)|Brzozowski]] and [[Edward J. McCluskey|McCluskey]], the algorithm of [[Robert McNaughton|McNaughton]] and [[Hisao Yamada|Yamada]],<ref>{{Cite journal|lastlast1=McNaughton|firstfirst1=R.|last2=Yamada|first2=H.|date=March 1960|title=Regular Expressions and State Graphs for Automata|journal=IRE Transactions on Electronic Computers|volume=EC-9|issue=1|pages=39–47|doi=10.1109/TEC.1960.5221603|issn=0367-9950}}</ref> and the use of [[Arden's lemma]].
 
==Algorithm description==
According to Gross and Yellen (2004),<ref name="gross2004handbook">{{cite book| title=Handbook of Graph Theory| year=2004| volume=| publisher=CRC Press| editor=Jonathan L. Gross and Jay Yellen| series=Discrete Mathematics and it Applications| isbn=1-58488-090-2}} Here: sect.2.1, remark R13 on p.65</ref> the algorithm can be traced back to [[Kleene]] (1956).<ref>{{cite journal| author=Kleene, Stephen C.| title=Representation of Events in Nerve Nets and Finite AutomateAutomata| journal=Automata Studies, Annals of Math. Studies| year=1956| volume=34| publisher=Princeton Univ. Press| url=http://www.dlsi.ua.es/~mlf/nnafmc/papers/kleene56representation.pdf}} Here: sect.9, p.37-40</ref> A presentation of the algorithm in the case of [[deterministic finite automata]] (DFAs) is given in Hopcroft and Ullman (1979).<ref>{{cite book| author=John E. Hopcroft, Jeffrey D. Ullman| title=Introduction to Automata Theory, Languages, and Computation| year=1979| publisher=Addison-Wesley| isbn=0-201-02988-X| url-access=registration| url=https://archive.org/details/introductiontoau00hopc}} Here: Section 3.2.1 pages 91-96</ref> The presentation of the algorithm for NFAs below follows Gross and Yellen (2004).<ref name="gross2004handbook" />.
 
Given a [[Nondeterministic finite automaton#Formal definition|nondeterministic finite automaton]] ''M'' = (''Q'', Σ, δ, ''q''<sub>0</sub>, ''F''), with ''Q'' = { ''q''<sub>0</sub>,...,''q''<sub>''n''</sub> } its set of [[nondeterministic finite automaton#Formal definition|states]], the algorithm computes
Line 24:
By induction on ''k'', it can be shown that the length<ref>More precisely, the number of regular-expression symbols, "''a''<sub>''i''</sub>", "ε", "|", "<sup>*</sup>", "·"; not counting parentheses.</ref> of each expression ''R''{{su|p=''k''|b=''ij''}} is at most {{sfrac|1|3}}(4<sup>''k''+1</sup>(6''s''+7) - 4) symbols, where ''s'' denotes the number of characters in Σ.
Therefore, the length of the regular expression representing the language accepted by ''M'' is at most {{sfrac|1|3}}(4<sup>''n''+1</sup>(6''s''+7)''f'' - ''f'' - 3) symbols, where ''f'' denotes the number of final states.
This exponential blowup is inevitable, because there exist families of DFAs for which any equivalent regular expression must be of exponential size.<ref>{{Cite journalbook|lastlast1=Gruber|firstfirst1=Hermann|last2=Holzer|first2=Markus|title=Automata, Languages and Programming |chapter=Finite Automata, Digraph Connectivity, and Regular Expression Size |date=2008|editor-last=Aceto|editor-first=Luca|editor2-last=Damgård|editor2-first=Ivan|editor3-last=Goldberg|editor3-first=Leslie Ann|editor4-last=Halldórsson|editor4-first=Magnús M.|editor5-last=Ingólfsdóttir|editor5-first=Anna|editor6-last=Walukiewicz|editor6-first=Igor|title=Finite Automata, Digraph Connectivity, and Regular Expression Size|journal=Automata, Languages and Programming|volume=5126|series=Lecture Notes in Computer Science|publisher=Springer Berlin Heidelberg|pages=39–50|doi=10.1007/978-3-540-70583-3_4|isbn=9783540705833|urls2cid=https://semanticscholar.org/paper/ae605b2875c9c329fc29c177fb003fb6cd0a35b810975422}}. Theorem 16.</ref>
 
In practice, the size of the regular expression obtained by running the algorithm can be very different depending on the order in which the states are considered by the procedure, i.e., the order in which they are numbered from 0 to ''n''.
Line 313:
 
[[Category:Algorithms]]
[[Category:Finite-state automatamachines]]
[[Category:Regular expressions]]