Nondeterministic finite automaton: Difference between revisions

Content deleted Content added
No edit summary
Tags: Reverted Mobile edit Mobile web edit
OAbot (talk | contribs)
m Open access bot: url-access=subscription updated in citation with #oabot.
 
(41 intermediate revisions by 21 users not shown)
Line 4:
* each of its transitions is ''uniquely'' determined by its source state and input symbol, and
* reading an input symbol is required for each state transition.
A '''nondeterministic finite automaton''' ('''NFA'''), or '''nondeterministic finite-state machine''', does not need to obey these restrictions. In particular, every DFA is also an NFA. Sometimes the term '''NFA''' is used in a narrower sense, referring to an NFA that is ''not'' a DFA, but the former definition is usednot in this article.
 
Using the [[subset construction algorithm]], each NFA can be translated to an equivalent DFA; i.e., a DFA recognizing the same [[formal language]].<ref>{{cite book |title=Introduction to Languages and the Theory of Computation |last1=Martin |first1=John |year=2010 |page=108 |publisher=McGraw Hill |isbn= 978-0071289429}}</ref>
Like DFAs, NFAs only recognize [[regular language]]s.
NFAs were introduced in 1959 by [[Michael O. Rabin]] and [[Dana Scott]],<ref>{{cite journal sfn|doi=10.1147/rd.32.0114 |last=Rabin |first=M. O. |last2=Scott |first2=D. |title=Finite Automata and Their Decision Problems |journal=IBM Journal of Research and Development |volume=3 |issue=2 |pages=114–125 |date=April 1959 }}</ref> who also showed their equivalence to DFAs. NFAs are used in the implementation of [[regular expression]]s: [[Thompson's construction]] is an algorithm for compiling a regular expression to an NFA that can efficiently perform pattern matching on strings. Conversely, [[Kleene's algorithm]] can be used to convert an NFA into a regular expression (whose size is generally exponential in the input automaton).
 
NFAs have been generalized in multiple ways, e.g., [[#NFA with ε-moves|nondeterministic finite automata with ε-moves]], [[finite-state transducer]]s, [[pushdown automaton|pushdown automata]], [[Alternating finite automaton|alternating automata]], [[ω-automaton|ω-automata]], and [[probabilistic automaton|probabilistic automata]].
Line 18:
==Informal introduction==
 
There are at least two equivalent ways to describe the behavior of an NFA, and both of them are equivalent. The first way makes use of the [[Nondeterministic algorithm|nondeterminism]] in the name of an NFA. For each input symbol, the NFA transitions to a new state until all input symbols have been consumed. In each step, the automaton nondeterministically "chooses" one of the applicable transitions. If there exists at least one "lucky run", i.e. some sequence of choices leading to an accepting state after completely consuming the input, it is accepted. Otherwise, i.e. if no choice sequence at all can consume all the input<ref>A choice sequence may lead into a "dead end" where no transition is applicable for the current input symbol; in this case it is considered unsuccessful.</ref> and lead to an accepting state, the input is rejected.<ref name="Hopcroft.Ullman.1979">{{cite book sfn| isbn=0-201-02988-X Hopcroft| author=John E. Hopcroft and Jeffrey D. Ullman | title=Introduction to Automata Theory, Languages, and Computation | ___location=Reading/MA | publisher=Addison-Wesley | year=1979 | urlpp=https://archive.org/details/introductiontoau00hopc }}</ref>{{rp|19-20}}<ref name="Aho.Hopcroft.Ullman.1974">{{cite book | isbn=0-201-00029-6 | author=Alfred V. Aho and John E. Hopcroft and Jeffrey D. Ullman | title=The Design and Analysis of Computer Algorithms | url=https://archive.org/details/designanalysisof00ahoarich | url-access=registration | ___location=Reading/MA | publisher=Addison-Wesley | year=1974 }}</ref>{{rp|319}}{{sfn|Hopcroft|Motwani|Ullman|2006|pp=55-6}}
 
In the second way, the NFA consumes a string of input symbols, one by one. In each step, whenever two or more transitions are applicable, it "clones" itself into appropriately many copies, each one following a different transition. If no transition is applicable, the current copy is in a dead end, and it "dies". If, after consuming the complete input, any of the copies is in an accept state, the input is accepted, else, it is rejected.<ref name="{{sfn|Hopcroft.|Ullman.|1979"/>{{rp|pp=19–20}}<!---not an error: Hopcroft and Ullman use both informal explanations---><ref name="Sipser.1997">{{cite book sfn| author=Michael Sipser | title=Introduction to the Theory of Computation | ___location=Boston/MA | publisher=PWS Publishing Co. | year=1997 | isbnp=0-534-94728-X | url=https://archive.org/details/introductiontoth00sips 48}}</ref>{{rpsfn|48}}<ref name="Hopcroft.|Motwani.|Ullman.2003">{{cite book | url=http://148.216.38.247/~rrusiles/Fie/Horizontal/Hopcroft_Introduction_to_Automata_Theory_Languages_and_Computation.pdf 2006| isbnpp=055-201-44124-1 | author=John E. Hopcroft and Rajeev Motwani and Jeffrey D. Ullman | title=Introduction to Automata Theory, Languages, and Computation | ___location=Upper Saddle River/NJ | publisher=Addison Wesley | year=2003 }}</ref>{{rp|566}}
 
==Formal definition==
Line 28:
An ''NFA'' is represented formally by a 5-[[tuple]],
<math>(Q, \Sigma, \delta, q_0, F)</math>, consisting of
* a finite [[Set (mathematics)|set]] of [[State (computer science)|states]] <math>Q</math>.,
* a finite set of input symbols called the [[inputAlphabet (computer symbolscience)|alphabet]]s <math>\Sigma</math>.,
* a transition [[function (mathematics)|function]] <math>\delta</math> : <math>Q\times \Sigma \rightarrow \mathcal{P}(Q)</math>.,
* an ''initial'' (or ''start'') state <math>q_0 \in Q</math>., and
* a set of states <math>F</math> distinguished as ''accepting'' (or ''final'') ''states'' <math>F \subseteq Q</math>.
 
Here, <math>\mathcal{P}(Q)</math> denotes the [[power set]] of <math>Q</math>.
Line 44:
*# <math>r_{i+1} \in \delta (r_i, a_{i+1})</math>, for <math>i = 0, \ldots, n-1</math>
*# <math>r_n \in F</math>.
:In words, the first condition says that the machine starts in the start state <math>q_0</math>. The second condition says that given each character of string <math>w</math>, the machine will transition from state to state according to the transition function <math>\delta</math>. The last condition says that the machine accepts <math>w</math> if the last input of <math>w</math> causes the machine to halt in one of the accepting states. In order for <math>w</math> to be accepted by <math>M</math>, it is not required that every state sequence ends in an accepting state, it is sufficient if one does. Otherwise, ''i.e.'' if it is impossible at all to get from <math>q_0</math> to a state from <math>F</math> by following <math>w</math>, it is said that the automaton ''rejects'' the string. The set of strings <math>M</math> accepts is the [[Formal language|language]] ''recognized'' by <math>M</math> and this language is denoted by <math>L(M)</math>.<ref name="Aho.Hopcroft.Ullman.1974"/>{{rp|320}}<!---using "deduction" relation "\vdash"---><ref name="Sipser.1997"/>{{rpsfn|Sipser|1997|p=54}}
 
*Alternatively, <math>w</math> is accepted if <math>\delta^*(q_0, w) \cap F \not = \emptyset</math>, where <math>\delta^*: Q \times \Sigma^* \rightarrow \mathcal{P}(Q)</math> is defined [[recursion (computer science)|recursively]] by:
*# <math>\delta^*(r, \epsilon) = \{r\}</math> where <math>\epsilon</math> is the empty string, and
*# <math>\delta^*(r, xa)= \bigcup_{r' \in \delta^*(r, x)} \delta(r', a)</math> for all <math>x \in \Sigma^*, a \in \Sigma</math>.
:In words, <math>\delta^*(r, x)</math> is the set of all states reachable from state <math>r</math> by consuming the string <math>x</math>. The string <math>w</math> is accepted if some accepting state in <math>F</math> can be reached from the start state <math>q_0</math> by consuming <math>w</math>.<ref name="{{sfn|Hopcroft.|Ullman.|1979"/>{{rp|p=21}}<ref name="{{sfn|Hopcroft.|Motwani.|Ullman.2003"/>{{rp|2006|p=59}}
 
===Initial state===
Line 58:
|-
| VALIGN=TOP | [[File:NFASimpleExample.svg|thumb|250px|The [[state diagram]] for ''M''. It is not deterministic since in state ''p'' reading a 1 can lead to ''p'' or to ''q''.]]
| VALIGN=TOP | [[File:NFASimpleExample Runs10.gif|thumb|250px|All possible runs of ''M'' on input string "10".]]
|-
| COLSPAN=2 | [[File:NFASimpleExample Runs1011.gif|thumb|530px|All possible runs of ''M'' on input string "1011".<br />''Arc label:'' input symbol, ''node label'': state, ''{{color|#00c000|green}}:'' start state, ''{{color|#c00000|red}}:'' accepting state(s).]]
<!---replaced by image:---{| class="wikitable"
|-
! {{diagonal split header|State sequence|Input}}
!Input: || || 1 || || 0 || || 1 || || 1 ||
|-
|State sequence 1: || ''p'' || || ''q'' || ||'''?'''|| || || ||
|-
|State sequence 2: || ''p'' || || ''p'' || || ''p'' || || ''q'' || || '''?'''
|-
|State sequence 3: || ''p'' || || ''p'' || || ''p'' || || ''p'' || || ''q''
|}
]]
The following automaton <math>M</math>, with a binary alphabet, determines if the input ends with a 1.
|}
The following automaton <math>{{mvar|M</math>}}, with a binary alphabet, determines if the input ends with a 1.
Let <math>M = (\{p, q\}, \{0, 1\}, \delta, p, \{q\})</math> where
the transition function <math>\delta</math> can be defined by this [[state transition table]] (cf. upper left picture):
:<math>\begin{array}{|c|cc|}
:{| class="wikitable" style="text-align:center;"
! \bcancel{{diagonal split header|}_\text{State|}\quad {}^\text{Input}} &
0 &
! 0
! 1
\\
|-
\hline
! <math>p</math>
p &
| <math>\{p\}</math>
| <math>\{p,q\}</math> &
\{p,q\}
|-
\\
! <math>q</math>
q &
| <math>\emptyset</math>
| <math>\emptyset</math> &
\emptyset
|}
\end{array}</math>
Since the set <math>\delta(p,1)</math> contains more than one state, <math>{{mvar|M</math>}} is nondeterministic.
The language of <math>{{mvar|M</math>}} can be described by the [[regular language]] given by the [[regular expression]] <code>(0|1)*1</code>.
 
All possible state sequences for the input string "1011" are shown in the lower picture.
 
<!---replaced by image:---{| class="wikitable"
The string is accepted by <math>{{mvar|M</math>}} since one state sequence satisfies the above definition; it doesn'tdoes not matter that other sequences fail to do so.
|-
!Input: || || 1 || || 0 || || 1 || || 1 ||
|-
|State sequence 1: || ''p'' || || ''q'' || ||'''?'''|| || || ||
|-
|State sequence 2: || ''p'' || || ''p'' || || ''p'' || || ''q'' || || '''?'''
|-
|State sequence 3: || ''p'' || || ''p'' || || ''p'' || || ''p'' || || ''q''
|}--->
The string is accepted by <math>M</math> since one state sequence satisfies the above definition; it doesn't matter that other sequences fail to do so.
The picture can be interpreted in a couple of ways:
* In terms of the [[#Informal introduction|above]] "lucky-run" explanation, each path in the picture denotes a sequence of choices of <math>{{mvar|M</math>}}.
* In terms of the "cloning" explanation, each vertical column shows all clones of <math>{{mvar|M</math>}} at a given point in time, multiple arrows emanating from a node indicate cloning, a node without emanating arrows indicating the "death" of a clone.
The feasibility to read the same picture in two ways also indicates the equivalence of both above explanations.
* Considering the first of the [[#Recognized language|above]] formal definitions, "1011" is accepted since when reading it <math>{{mvar|M</math>}} may traverse the state sequence <math>\langle r_0,r_1,r_2,r_3,r_4 \rangle = \langle p, p, p, p, q \rangle</math>, which satisfies conditions 1 to 3.
* Concerning the second formal definition, bottom-up computation shows that <math>\delta^*(p,\epsilon) = \{ p \}</math>, hence <math>\delta^*(p,1) = \delta(p,1) = \{ p,q \}</math>, hence <math>\delta^*(p,10) = \delta(p,0) \cup \delta(q,0) = \{ p \} \cup \{\}</math>, hence <math>\delta^*(p,101) = \delta(p,1) = \{ p,q \}</math>, and hence <math>\delta^*(p,1011) = \delta(p,1) \cup \delta(q,1) = \{ p,q \} \cup \{\}</math>; since that set is not disjoint from <math>\{ q \}</math>, the string "1011" is accepted.
 
In contrast, the string "10" is rejected by <math>{{mvar|M</math>}} (all possible state sequences for that input are shown in the upper right picture), since there is no way to reach the only accepting state, <math>{{mvar|q</math>}}, by reading the final 0 symbol. While <math>{{mvar|q</math>}} can be reached after consuming the initial "1", this does not mean that the input "10" is accepted; rather, it means that an input string "1" would be accepted.
 
==Equivalence to DFA==
Line 108 ⟶ 112:
Conversely, for each NFA, there is a DFA such that it recognizes the same formal language. The DFA can be constructed using the [[powerset construction]].
 
This result shows that NFAs, despite their additional flexibility, are unable to recognize languages that cannot be recognized by some DFA. It is also important in practice for converting easier-to-construct NFAs into more efficiently executable DFAs. However, if the NFA has ''n'' states, the resulting DFA may have up to 2<sup>''n''</sup> states, which sometimes makes the construction impractical for large NFAs.
 
==NFA with ε-moves==
Line 147 ⟶ 151:
The automaton is said to accept a string <math>w</math> if
:<math>\delta^*(q_0,w) \cap F \neq \emptyset ,</math>
that is, if reading <math>w</math> may drive the automaton from its start state <math>q_0</math> to some accepting state in <math>F .</math><ref name="{{sfn|Hopcroft.|Ullman.|1979"/>{{rp|p=25}}
 
===Example===
Line 187 ⟶ 191:
|}
<math>M</math> can be viewed as the union of two [[deterministic finite automaton|DFA]]s: one with states <math>\{S_1, S_2\}</math> and the other with states <math>\{S_3, S_4\}</math>.
The language of <math>M</math> can be described by the [[regular language]] given by this [[regular expression]] <math>(1^{*}01^{*}001^{*})^{*} \cup (0^{*}10^{*}110^{*})^{*}</math>.
We define <math>M</math> using ε-moves but <math>M</math> can be defined without using ε-moves.
 
Line 211 ⟶ 215:
:*If <math>\delta'^*(q_0,w)</math> contains a state in <math>F' \setminus \{ q_0 \} ,</math> then <math>\delta^*(q_0,w)</math> contains the same state, which lies in <math>F</math>.
:*If <math>\delta'^*(q_0,w)</math> contains <math>q_0 ,</math> and <math>q_0 \in F ,</math> then <math>\delta^*(q_0,w)</math> also contains a state in <math>F ,</math> viz. <math>q_0 .</math>
:*If <math>\delta'^*(q_0,w)</math> contains <math>q_0 ,</math> and <math>q_0 \not\in F ,</math> but <math>q_0\in F',</math> then {{clarifythere span|theexists a state in <math>E(q_0) \cap F</math>|reason=Which, one?and Whythe should such asame state exist?|date=April 2022}} must be in <math display=inline>\delta^*(q_0,w) = \bigcup_{r \in \delta^*(q,v)} E(\delta(r,a)) .</math><ref name="{{sfn|Hopcroft.|Ullman.|1979"/>{{rp|pp=26-27}}
 
Since NFA is equivalent to DFA, NFA-ε is also equivalent to DFA.
Line 238 ⟶ 242:
There are many ways to implement a NFA:
* Convert to the equivalent DFA. In some cases this may cause exponential blowup in the number of states.<ref>{{cite web|url=https://cseweb.ucsd.edu//~ccalabro/essays/fsa.pdf|title=NFA to DFA blowup|website=cseweb.ucsd.edu|date=February 27, 2005|access-date=6 March 2023|author=Chris Calabro}}</ref>
* Keep a [[set data structure]] of all states which the NFA might currently be in. On the consumption of an input symbol, [[set union|unite]] the results of the transition function applied to all current states to get the set of next states; if ε-moves are allowed, include all states reachable by such a move (ε-closure). Each step requires at most ''s''<sup>2</sup> computations, where ''s'' is the number of states of the NFA. On the consumption of the last input symbol, if one of the current states is a final state, the machine accepts the string. A string of length ''n'' can be processed in time [[Big O notation#Formal definition|''O'']](''ns''<sup>2</sup>),<ref name="{{sfn|Hopcroft.|Motwani.|Ullman.2003"/>{{rp|1532006|pp=154-5}} and space ''O''(''s'').
* Create multiple copies. For each ''n'' way decision, the NFA creates up to ''n''−1 copies of the machine. Each will enter a separate state. If, upon consuming the last input symbol, at least one copy of the NFA is in the accepting state, the NFA will accept. (This, too, requires linear storage with respect to the number of NFA states, as there can be one machine for every NFA state.)
* Explicitly propagate tokens through the transition structure of the NFA and match whenever a token reaches the final state. This is sometimes useful when the NFA should encode additional context about the events that triggered the transition. (For an implementation that uses this technique to keep track of object references have a look at Tracematches.)<ref>Allan, C., Avgustinov, P., Christensen, A. S., Hendren, L., Kuzins, S., Lhoták, O., de Moor, O., Sereni, D., Sittampalam, G., and Tibble, J. 2005. [http://abc.comlab.ox.ac.uk/papers#oopsla2005 Adding trace matching with free variables to AspectJ] {{Webarchive|url=https://web.archive.org/web/20090918053718/http://abc.comlab.ox.ac.uk/papers#oopsla2005 |date=2009-09-18 }}. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object Oriented Programming, Systems, Languages, and Applications (San Diego, CA, USA, October 16–20, 2005). OOPSLA '05. ACM, New York, NY, 345-364.</ref>
 
* It is [[PSPACE]]-complete to test, given an NFA, whether it is ''universal'', i.e., if there is a string that it does not accept.<ref>Historically shown in: {{Cite journal|last=Meyer|first=A. R.|last2=Stockmeyer|first2=L. J.|date=1972-10-25|title=The equivalence problem for regular expressions with squaring requires exponential space|journal=Proceedings of the 13th Annual Symposium on Switching and Automata Theory (SWAT)|___location=USA|publisher=IEEE Computer Society|pages=125–129|doi=10.1109/SWAT.1972.29}} For a modern presentation, see [http://www.lsv.fr/~jacomme/1718/ca/td4_sol.pdf]</ref> The same is true of the ''inclusion problem'', i.e., given two NFAs, is the language of one a subset of the language of the other.
== Complexity ==
 
* One can solve in linear time the [[emptiness problem]] for NFA, i.e., check whether the language of a given NFA is empty. To do this, we can simply perform a [[depth-first search]] from the initial state and check if some final state can be reached.
* It is [[PSPACE]]-complete to test, given an NFA, whether it is ''universal'', i.e., if there is a string that it does not accept.<ref>Historically shown in: {{Cite journalbook |lastlast1=Meyer |firstfirst1=A. R. |last2=Stockmeyer |first2=L. J. |datetitle=13th Annual Symposium on Switching and Automata Theory (Swat 1972-10-25) |titlechapter=The equivalence problem for regular expressions with squaring requires exponential space |journaldate=Proceedings1972-10-25 of the 13th Annual Symposium on Switching and Automata Theory (SWAT)|___location=USA |publisher=IEEE Computer Society |pages=125–129 |doi=10.1109/SWAT.1972.29}} For a modern presentation, see [http://www.lsv.fr/~jacomme/1718/ca/td4_sol.pdf]</ref> TheAs a consequence, the same is true of the ''inclusion problem'', i.e., given two NFAs, is the language of one a subset of the language of the other.
* Given as input an NFA ''A'' and an integer n, the [[counting problem (complexity)|counting problem]] of determining how many words of length ''n'' are accepted by ''A'' is intractable; it is [[♯P|'''#P'''-hard]]. In fact, this problem is complete (under [[parsimonious reduction]]s) for the complexity class [[SpanL]].<ref>{{Cite journal |last1=Álvarez |first1=Carme |last2=Jenner |first2=Birgit |date=1993-01-04 |title=A very hard log-space counting class |url=https://dx.doi.org/10.1016/0304-3975%2893%2990252-O |journal=Theoretical Computer Science |volume=107 |issue=1 |pages=3–30 |doi=10.1016/0304-3975(93)90252-O |issn=0304-3975|url-access=subscription }}</ref>
 
==Application of NFA==
Line 256 ⟶ 265:
 
== References ==
*{{cite Mjournal |doi=10. O1147/rd.32.0114 |last1=Rabin and|first1=M. DO. |last2=Scott, "|first2=D. |title=Finite Automata and theirTheir Decision Problems", ''|journal=IBM Journal of Research and Development'', '''|volume=3''': |issue=2 (|pages=114–125 |date=April 1959) pp.&nbsp;115–125.}}
* Michael {{Sipser, ''Introduction to the Theory of Computation''. PWS, Boston. 1997. {{isbn|0-534-94728-X}}. ''(see sectionSee §1.2: Nondeterminism, pp. 47–63.)''{{sfn whitelist|CITEREFSipser1997}}
* {{Hopcroft and Ullman 1979|author-link=no|title-link=no}}{{sfn whitelist|CITEREFHopcroftUllman1979}}
* John E. Hopcroft and Jeffrey D. Ullman, ''[[Introduction to Automata Theory, Languages, and Computation]]'', Addison-Wesley Publishing, Reading Massachusetts, 1979. {{isbn|0-201-02988-X}}. ''(See chapter 2.)''
* {{Hopcroft, Motwani, and Ullman 2006}} ''(See chapter 2, "Finite Automata".)''{{sfn whitelist|CITEREFHopcroftMotwaniUllman2006}}
 
{{Formal languages and grammars}}
Line 264 ⟶ 274:
 
{{DEFAULTSORT:Nondeterministic Finite-State Machine}}
[[Category:Finite-state automatamachines]]