Deterministic finite automaton: Difference between revisions

Content deleted Content added
Top: test {sfn}
 
(6 intermediate revisions by 4 users not shown)
Line 25:
| publisher = IEEE / ACM
| title = Proceedings of the 27th International Conference on Program Comprehension, ICPC 2019, Montreal, QC, Canada, May 25-31, 2019
| year = 2019}}</ref>| isbn = 978-1-7281-1519-1
}}</ref>
 
DFAs have been generalized to ''[[nondeterministic finite automata]] (NFA)'' which may have several arrows of the same label starting from a state. Using the [[powerset construction]] method, every NFA can be translated to a DFA that recognizes the same language. DFAs, and NFAs as well, recognize exactly the set of [[regular language]]s.{{sfn|Hopcroft|Motwani|Ullman|2006}}
Line 31 ⟶ 32:
==Formal definition==
A deterministic finite automaton {{mvar|M}} is a 5-[[n-tuple|tuple]], {{math|(''Q'', Σ, ''δ'', ''q''<sub>0</sub>, ''F'')}}, consisting of
* a finite [[Set (mathematics)|set]] of [[State (computer science)|states]] {{mvar|Q}}
* a finite set of input symbols called the [[Alphabet (computer science)|alphabet]] {{math|Σ}}
* a transition [[function (mathematics)|function]] {{math|''δ'' : ''Q'' × Σ → ''Q''}}
* an initial (or [[Finite-state machine#Start state|start) state]] <math>q_0 \in Q</math>
* a set of [[Finite-stateaccepting machine#Accept .28or(or final.29 states|accept) states]] <math>F \subseteq Q</math>
 
Let {{math|1=''w'' = ''a''<sub>1</sub>''a''<sub>2</sub>...''a<sub>n</sub>''}} be a string over the alphabet {{math|Σ}}. The automaton {{mvar|M}} accepts the string {{mvar|w}} if a sequence of states, {{math|''r''<sub>0</sub>, ''r''<sub>1</sub>, ..., ''r<sub>n</sub>''}}, exists in {{mvar|Q}} with the following conditions:
Line 78 ⟶ 79:
===Local automata===
 
A '''local automaton''' is a DFA, not necessarily complete, for which all edges with the same label lead to a single vertex. Local automata accept the class of [[Local language (formal language)|local languages]], those for which membership of a word in the language is determined by a "sliding window" of length two on the word.<ref name=Law129>{{sfn|Lawson (|2004) |p.129</ref><ref name=Sak228>129}}{{sfn|Sakarovitch (|2009) |p.=228</ref>}}
 
A '''Myhill graph''' over an alphabet ''A'' is a [[directed graph]] with [[Vertex (graph theory)|vertex set]] ''A'' and subsets of vertices labelled "start" and "finish". The language accepted by a Myhill graph is the set of directed paths from a start vertex to a finish vertex: the graph thus acts as an automaton.<ref name{{sfn|Lawson|2004|p=Law129/>129}} The class of languages accepted by Myhill graphs is the class of local languages.<ref name=Law128>{{sfn|Lawson (|2004)| p.=128</ref>}}
 
===Randomness===
Line 93 ⟶ 94:
{{columns-list|colwidth=30em|
*Union
*Intersection{{sfn|Hopcroft|Ullman|1979|pp=59–60}} (see picture)
*Intersection<ref>{{cite book | isbn=0-201-02988-X | author=John E. Hopcroft and Jeffrey D. Ullman | title=Introduction to Automata Theory, Languages, and Computation | url=https://archive.org/details/introductiontoau00hopc | url-access=registration | ___location=Reading/MA | publisher=Addison-Wesley | year=1979 }}</ref>{{rp|59–60}} (see picture)
*Concatenation
*[[Complementation of automata#With deterministic finite automata|Complement]]
Line 116 ⟶ 117:
| title = Grammars and languages
| volume = 76
| year = 1969| issue = 4
}}</ref>
*Homomorphism<ref name=rose/><ref name=spanier/>
}}
Line 147 ⟶ 149:
* the DFA with a minimum number of states for a particular regular language (Minimization Problem)
 
DFAs are equivalent in computing power to [[nondeterministic finite automata]] (NFAs). This is because, firstly any DFA is also an NFA, so an NFA can do what a DFA can do. Also, given an NFA, using the [[powerset construction]] one can build a DFA that recognizes the same language as the NFA, although the DFA could have exponentially larger number of states than the NFA.<ref name=Sak105>{{sfn|Sakarovitch (|2009) |p.105</ref><ref name=Law63>105}}{{sfn|Lawson (|2004) |p.=63</ref>}} However, even though NFAs are computationally equivalent to DFAs, the above-mentioned problems are not necessarily solved efficiently also for NFAs. The non-universality problem for NFAs is [[PSPACE complete]] since there are small NFAs with shortest rejecting word in exponential size. A DFA is universal if and only if all states are final states, but this does not hold for NFAs. The Equality, Inclusion and Minimization Problems are also PSPACE complete since they require forming the complement of an NFA which results in an exponential blow up of size.<ref>{{Cite web |last1=Esparza Estaun |first1=Francisco Javier |last2=Sickert |first2=Salomon |last3=Blondin |first3=Michael |date=16 November 2016 |orig-date= |title=Operations and tests on sets: Implementation on DFAs |url=https://www7.in.tum.de/um/courses/auto/ws1718/slides1718/04-Implementations_sets.pdf |url-status=dead |archive-url=https://web.archive.org/web/20180808171506/https://www7.in.tum.de/um/courses/auto/ws1718/slides1718/04-Implementations_sets.pdf |archive-date=8 August 2018 |website=Automata and Formal Languages 2017/18 }}</ref>
 
On the other hand, finite-state automata are of strictly limited power in the languages they can recognize; many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA. The classic example of a simply described language that no DFA can recognize is bracket or [[Dyck language]], i.e., the language that consists of properly paired brackets such as word "(()())". Intuitively, no DFA can recognize the Dyck language because DFAs are not capable of counting: a DFA-like automaton needs to have a state to represent any possible number of "currently open" parentheses, meaning it would need an unbounded number of states. Another simpler example is the language consisting of strings of the form ''a<sup>n</sup>b<sup>n</sup>'' for some finite but arbitrary number of ''a''{{'}}s, followed by an equal number of ''b''{{'}}s.<ref name=Law46>{{sfn|Lawson (|2004) |p.=46</ref>}}
 
==DFA identification from labeled words==
Line 163 ⟶ 165:
Gold's algorithm assumes that <math>S^+</math> and <math>S^-</math> contain a ''[[Characteristic Samples|characteristic set]]'' of the regular language; otherwise, the constructed DFA will be inconsistent either with <math>S^+</math> or <math>S^-</math>.
Other notable DFA identification algorithms include the RPNI algorithm,<ref>{{Cite book | doi=10.1142/9789812797902_0004| chapter=Inferring Regular Languages in Polynomial Updated Time| title=Pattern Recognition and Image Analysis| volume=1| pages=49–61| series=Series in Machine Perception and Artificial Intelligence| year=1992| last1=Oncina| first1=J.| last2=García| first2=P.| isbn=978-981-02-0881-3}}</ref> the Blue-Fringe evidence-driven state-merging algorithm,<ref>{{Cite book |doi = 10.1007/BFb0054059|chapter = Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm|title = Grammatical Inference|volume = 1433|pages = 1–12|series = Lecture Notes in Computer Science|year = 1998|last1 = Lang|first1 = Kevin J.|last2 = Pearlmutter|first2 = Barak A.|last3 = Price|first3 = Rodney A.|isbn = 978-3-540-64776-8|url = http://eprints.maynoothuniversity.ie/10250/1/BP-Results-1998.pdf}}</ref>
and Windowed-EDSM.<ref>{{Cite book | url=https://dl.acm.org/doi/abs/10.5555/645519.655966 | title=Beyond EDSM {{pipe}} Proceedings of the 6th International Colloquium on Grammatical Inference: Algorithms and Applications| date=23 September 2002| pages=37–48| isbn=9783540442394| last1=Adriaans| first1=Pieter| last2=Fernau| first2=Henning| last3=Zaanen| first3=Menno van| publisher=Springer}}</ref>
Another research direction is the application of [[evolutionary algorithm]]s: the smart state labeling evolutionary algorithm<ref>{{Cite journal |doi = 10.1109/TPAMI.2005.143|pmid = 16013754|title = Learning deterministic finite automata with a smart state labeling evolutionary algorithm|journal = IEEE Transactions on Pattern Analysis and Machine Intelligence|volume = 27|issue = 7|pages = 1063–1074|year = 2005|last1 = Lucas|first1 = S.M.|last2 = Reynolds|first2 = T.J.|s2cid = 14062047}}</ref> allowed to solve a modified DFA identification problem in which the training data (sets <math>S^+</math> and <math>S^-</math>) is ''noisy'' in the sense that some words are attributed to wrong classes.
 
Line 261 ⟶ 263:
 
==References==
* {{Hopcroft, Motwani, and Ullman 20061979|author-link=no|title-link=no}}{{sfn whitelist|CITEREFHopcroftUllman1979}}
* {{Hopcroft, Motwani, and Ullman 2006}}{{sfn whitelist|CITEREFHopcroftMotwaniUllman2006}}
* {{cite book | last=Lawson | first=Mark V. | title=Finite automata | publisher=Chapman and Hall/CRC | year=2004 | isbn=1-58488-255-7 | zbl=1086.68074 }}
* {{cite journal
Line 274 ⟶ 277:
|issue= 4
|pages= 115–133
|ref=MP43
|doi= 10.1007/BF02478259
|pmid= 2185863
Line 289 ⟶ 291:
|year= 1959
|pages= 114–125
|ref=RS59
|url=https://www.researchgate.net/publication/230876408|doi= 10.1147/rd.32.0114
}}
* {{cite book | last=Sakarovitch | first=Jacques | title=Elements of automata theory | others=Translated from the French by Reuben Thomas | ___location=Cambridge | publisher=[[Cambridge University Press]] | year=2009 | isbn=978-0-521-84425-3 | zbl=1188.68177 }}
===Further reading===
* {{Sipser 1997}} — '''1.1''': "Finite Automata" pp.&nbsp;31&ndash;47. '''4.1''': "Decidable Languages - Decidable Problems Concerning Regular Languages" pp.&nbsp;152&ndash;155. '''4.4''': DFA can accept only regular language
 
Line 298 ⟶ 300:
 
{{DEFAULTSORT:Deterministic Finite-State Machine}}
[[Category:Finite-state automatamachines]]