In theoretical computer science, in particular in formal language theory, Kleene's algorithm transforms a given deterministic finite automaton into a regular expression. Together with other conversion algorithms, it establishes the equivalence of several description formats for regular languages.
Algorithm description
According to Gross and Yellen (2004),[1] the algorithm can be traced back to Kleene (1956).[2]
This description follows Hopcroft and Ullman (1979).[3] Given a deterministic finite automaton M = (Q, Σ, δ, q0, F), with Q = { q0,...,qn } its set of states, the algorithm computes
- the sets Rk
ij of all strings that take M from state qi to qj without going though any state numbered higher than k.
Here, "going through a state" means entering and leaving it, so both i and j may be higher than k, but no intermediate state may.
Each set Rk
ij is represented by a regular expression; the algorithm computes them step by step for k = -1, 0, ..., n. Since there is no state numbered higher than n, the regular expression Rn
0j represents the set of all strings that take M from its start state q0 to qj. If F = { q1,...,qf } is the set of accept states, the regular expression Rn
01 | ... | Rn
0f represents the language accepted by M.
The initial regular expressions, for k = -1, are computed as
- R-1
ij = a1 | ... | am if i≠j, where δ(qi,a1) = ... = δ(qi,am) = qj - R-1
ij = a1 | ... | am | ε, if i=j, where δ(qi,a1) = ... = δ(qi,am) = qj
After that, in each step the expressions Rk
ij are computed from the previous ones by
- Rk
ij = Rk-1
ik (Rk-1
kk)* Rk-1
kj | Rk-1
ij
Example
The automaton shown in the picture can be described as M = (Q, Σ, δ, q0, F) with
- the set of states Q = { q0, q1, q2 },
- the input alphabet Σ = { a, b },
- the transition function δ with δ(q0,a)=q0, δ(q0,b)=q1, δ(q1,a)=q2, δ(q1,b)=q1, δ(q2,a)=q1, and δ(q2,a)=q1,
- the start state q0, and
- set of accept states F = { q1 }.
Kleene's algorithm computes the initial regular expressions as
R-1 00 = a | ε |
R-1 01 = b |
R-1 02 = ∅ |
R-1 10 = ∅ |
R-1 11 = b | ε |
R-1 12 = a |
R-1 20 = ∅ |
R-1 21 = a | b |
R-1 22 = ε |
Step 0:
R0 00 = |
R0 01 = |
R0 02 = |
R0 10 = |
R0 11 = |
R0 12 = |
R0 20 = |
R0 21 = |
R0 22 = |
Step 1:
R1 00 = |
R1 01 = |
R1 02 = |
R1 10 = |
R1 11 = |
R1 12 = |
R1 20 = |
R1 21 = |
R1 22 = |
Step 2:
R2 00 = |
R2 01 = |
R2 02 = |
R2 10 = |
R2 11 = |
R2 12 = |
R2 20 = |
R2 21 = |
R2 22 = |
References
- ^ Jonathan L. Gross and Jay Yellen, ed. (2004). Handbook of Graph Theory. Discrete Mathematics and it Applications. CRC Press. ISBN 1-58488-090-2. Here: sect.2.1, remark R13 on p.65
- ^ Kleene, Stephen C. (1956). "Representation of Events in Nerve Nets and Finite Automate" (PDF). Automata Studies, Annals of Math. Studies. 34. Princeton Univ. Press.
- ^ John E. Hopcroft, Jeffrey D. Ullman (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley. ISBN 0-201-02988-X. Here: Theorem 2.4, p.33-34