Content deleted Content added
Isaidnoway (talk | contribs) →Advantages and disadvantages: ext link not allowed in parameter, only one archive link is needed |
m Moving Category:Finite automata to Category:Finite-state machines per Wikipedia:Categories for discussion/Speedy |
||
(21 intermediate revisions by 12 users not shown) | |||
Line 2:
{{Use dmy dates|date=April 2020}}
{{Redirect-synonym|DFSA|[[drug-facilitated sexual assault]]}}
{{CS1 config|mode=cs1}}
[[File:DFA example multiplies of 3.svg|thumb|upright=1.1|An example of a deterministic finite automaton that accepts only binary numbers that are multiples of 3. The state S<sub>0</sub> is both the start state and an accept state. For example, the string "1001" leads to the state sequence S<sub>0</sub>, S<sub>1</sub>, S<sub>2</sub>, S<sub>1</sub>, S<sub>0</sub>, and is hence accepted.]]
In the [[theory of computation]], a branch of [[theoretical computer science]], a '''deterministic finite automaton''' ('''DFA''')—also known as '''[[Finite-state machine#Acceptors (recognizers)|deterministic finite acceptor]]''' ('''DFA'''), '''deterministic finite-state machine''' ('''DFSM'''), or '''deterministic finite-state automaton''' ('''DFSA''')—is a [[finite-state machine]] that accepts or rejects a given [[String (computer science)|string]] of symbols, by running through a state sequence uniquely determined by the string.
The figure illustrates a deterministic finite automaton using a [[state diagram]]. In this example automaton, there are three states: S<sub>0</sub>, S<sub>1</sub>, and S<sub>2</sub> (denoted graphically by circles). The automaton takes a finite [[sequence]] of 0s and 1s as input. For each state, there is a transition arrow leading out to a next state for both 0 and 1. Upon reading a symbol, a DFA jumps ''deterministically'' from one state to another by following the transition arrow. For example, if the automaton is currently in state S<sub>0</sub> and the current input symbol is 1, then it deterministically jumps to state S<sub>1</sub>. A DFA has a ''start state'' (denoted graphically by an arrow coming in from nowhere) where computations begin, and a [[set (mathematics)|set]] of ''accept states'' (denoted graphically by a double circle) which help define when a computation is successful.
A DFA is defined as an abstract mathematical concept, but is often implemented in hardware and software for solving various specific problems such as [[lexical analysis]] and [[pattern matching]]. For example, a DFA can model software that decides whether or not online user input such as email addresses are syntactically valid.<ref>{{
| last1 = Bai |
| last2 = Clee | first2 = Brian
| last3 = Shrestha | first3 = Nischal
| last4 = Chapman | first4 = Carl
}}</ref>▼
| last5 = Wright | first5 = Cimone
| last6 = Stolee | first6 = Kathryn T.
| editor1-last = Guéhéneuc | editor1-first = Yann-Gaël
| editor2-last = Khomh | editor2-first = Foutse
| editor3-last = Sarro | editor3-first = Federica
| contribution = Exploring tools and strategies used during regular expression composition tasks
| contribution-url = https://par.nsf.gov/servlets/purl/10100320
| doi = 10.1109/ICPC.2019.00039
| pages = 197–208
| publisher = IEEE / ACM
| title = Proceedings of the 27th International Conference on Program Comprehension, ICPC 2019, Montreal, QC, Canada, May 25-31, 2019
| year = 2019| isbn = 978-1-7281-1519-1
DFAs have been generalized to ''[[nondeterministic finite automata]] (NFA)'' which may have several arrows of the same label starting from a state. Using the [[powerset construction]] method, every NFA can be translated to a DFA that recognizes the same language. DFAs, and NFAs as well, recognize exactly the set of [[regular language]]s.
==Formal definition==
A deterministic finite automaton {{mvar|M}} is a 5-[[n-tuple|tuple]], {{math|(''Q'', Σ, ''δ'', ''q''<sub>0</sub>, ''F'')}}, consisting of
* a finite [[Set (mathematics)|set]] of [[State (computer science)|states]] {{mvar|Q}}
* a finite set of input symbols called the [[Alphabet (computer science)|alphabet]] {{math|Σ}}
* a transition [[function (mathematics)|function]] {{math|''δ'' : ''Q'' × Σ → ''Q''}}
* an initial (or
* a set of
Let {{math|1=''w'' = ''a''<sub>1</sub>''a''<sub>2</sub>
# {{math|1=''r''<sub>0</sub> = ''q''<sub>0</sub>}}
# {{math|1=''r''<sub>''i''+1</sub> = ''δ''(''r<sub>i</sub>'', ''a''<sub>''i''+1</sub>)}}, for {{math|1=''i'' = 0,
# <math>r_n \in F</math>.
Line 65 ⟶ 79:
===Local automata===
A '''local automaton''' is a DFA, not necessarily complete, for which all edges with the same label lead to a single vertex. Local automata accept the class of [[Local language (formal language)|local languages]], those for which membership of a word in the language is determined by a "sliding window" of length two on the word.
A '''Myhill graph''' over an alphabet ''A'' is a [[directed graph]] with [[Vertex (graph theory)|vertex set]] ''A'' and subsets of vertices labelled "start" and "finish". The language accepted by a Myhill graph is the set of directed paths from a start vertex to a finish vertex: the graph thus acts as an automaton.
===Randomness===
When the start state and accept states are ignored, a DFA of {{mvar|n}} states and an alphabet of size {{mvar|k}} can be seen as a [[Directed graph|digraph]] of {{mvar|n}} vertices in which all vertices have {{mvar|k}} out-arcs labeled {{math|1,
In a random DFA, the maximum number of vertices reachable from one vertex is very close to the number of vertices in the largest [[strongly connected component|SCC]] with high probability.<ref name=Grusho /><ref>{{cite conference |last1=Carayol |first1=Arnaud |last2=Nicaud |first2=Cyril |date=February 2012 |title=Distribution of the number of accessible states in a random deterministic automaton |volume=14 |pages=194–205 |conference=STACS'12 (29th Symposium on Theoretical Aspects of Computer Science) |___location=Paris, France |url=https://hal.archives-ouvertes.fr/hal-00678213}}</ref> This is also true for the largest [[Glossary of graph theory#Subgraphs|induced sub-digraph]] of minimum in-degree one, which can be seen as a directed version of [[Degeneracy (graph theory)#k-Cores|{{math|1}}-core]].<ref name=Cai />
Line 80 ⟶ 94:
{{columns-list|colwidth=30em|
*Union
*Intersection{{sfn|Hopcroft|Ullman|1979|pp=59–60}} (see picture)
*Concatenation
*[[Complementation of automata#With deterministic finite automata|Complement]]
*[[Kleene closure]]
*Reversal<ref name=rose/>
*Quotient
| last = Rose | first = Gene F.
| doi = 10.1016/S0022-0000(68)80029-7
| issue = 2
| journal = [[Journal of Computer and System Sciences]]
| pages = 148–168
| title = Closures which Preserve Finiteness in Families of Languages
| volume = 2
| year = 1968}}</ref>
*Substitution<ref name=spanier>{{cite journal
| last = Spanier | first = E.
| doi = 10.1080/00029890.1969.12000214
| journal = American Mathematical Monthly
| jstor = 2316423
| mr = 241205
| pages = 335–342
| title = Grammars and languages
| volume = 76
| year = 1969| issue = 4
}}</ref>
*Homomorphism<ref name=rose/><ref name=spanier/>
}}
Line 117 ⟶ 149:
* the DFA with a minimum number of states for a particular regular language (Minimization Problem)
DFAs are equivalent in computing power to [[nondeterministic finite automata]] (NFAs). This is because, firstly any DFA is also an NFA, so an NFA can do what a DFA can do. Also, given an NFA, using the [[powerset construction]] one can build a DFA that recognizes the same language as the NFA, although the DFA could have exponentially larger number of states than the NFA.
On the other hand, finite-state automata are of strictly limited power in the languages they can recognize; many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA. The classic example of a simply described language that no DFA can recognize is bracket or [[Dyck language]], i.e., the language that consists of properly paired brackets such as word "(()())". Intuitively, no DFA can recognize the Dyck language because DFAs are not capable of counting: a DFA-like automaton needs to have a state to represent any possible number of "currently open" parentheses, meaning it would need an unbounded number of states. Another simpler example is the language consisting of strings of the form ''a<sup>n</sup>b<sup>n</sup>'' for some finite but arbitrary number of ''a''
==DFA identification from labeled words==
Line 131 ⟶ 163:
However, Traxbar does not guarantee the minimality of the constructed DFA.
In his work<ref name="Complexity of Automaton Identificat"/> E.M. Gold also proposed a heuristic algorithm for minimal DFA identification.
Gold's algorithm assumes that <math>S^+</math> and <math>S^-</math> contain a ''[[Characteristic Samples|characteristic set]]'' of the regular language; otherwise, the constructed DFA will be inconsistent either with <math>S^+</math> or <math>S^-</math>.
Other notable DFA identification algorithms include the RPNI algorithm,<ref>{{Cite book | doi=10.1142/9789812797902_0004| chapter=Inferring Regular Languages in Polynomial Updated Time| title=Pattern Recognition and Image Analysis| volume=1| pages=49–61| series=Series in Machine Perception and Artificial Intelligence| year=1992| last1=Oncina| first1=J.| last2=García| first2=P.| isbn=978-981-02-0881-3}}</ref> the Blue-Fringe evidence-driven state-merging algorithm,<ref>{{Cite book |doi = 10.1007/BFb0054059|chapter = Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm|title = Grammatical Inference|volume = 1433|pages = 1–12|series = Lecture Notes in Computer Science|year = 1998|last1 = Lang|first1 = Kevin J.|last2 = Pearlmutter|first2 = Barak A.|last3 = Price|first3 = Rodney A.|isbn = 978-3-540-64776-8|url = http://eprints.maynoothuniversity.ie/10250/1/BP-Results-1998.pdf}}</ref>
and Windowed-EDSM.<ref>{{Cite book | url=https://dl.acm.org/doi/abs/10.5555/645519.655966 | title=Beyond EDSM
Another research direction is the application of [[evolutionary algorithm]]s: the smart state labeling evolutionary algorithm<ref>{{Cite journal |doi = 10.1109/TPAMI.2005.143|pmid = 16013754|title = Learning deterministic finite automata with a smart state labeling evolutionary algorithm|journal = IEEE Transactions on Pattern Analysis and Machine Intelligence|volume = 27|issue = 7|pages = 1063–1074|year = 2005|last1 = Lucas|first1 = S.M.|last2 = Reynolds|first2 = T.J.|s2cid = 14062047}}</ref> allowed to solve a modified DFA identification problem in which the training data (sets <math>S^+</math> and <math>S^-</math>) is ''noisy'' in the sense that some words are attributed to wrong classes.
Line 231 ⟶ 263:
==References==
* {{Hopcroft and Ullman 1979|author-link=no|title-link=no}}{{sfn whitelist|CITEREFHopcroftUllman1979}}
* {{Hopcroft, Motwani, and Ullman 2006}}{{sfn whitelist|CITEREFHopcroftMotwaniUllman2006}}
* {{cite book | last=Lawson | first=Mark V. | title=Finite automata | publisher=Chapman and Hall/CRC | year=2004 | isbn=1-58488-255-7 | zbl=1086.68074 }}
* {{cite journal
Line 244 ⟶ 277:
|issue= 4
|pages= 115–133
|doi= 10.1007/BF02478259
|pmid= 2185863
Line 259 ⟶ 291:
|year= 1959
|pages= 114–125
|url=https://www.researchgate.net/publication/230876408|doi= 10.1147/rd.32.0114
}}
* {{cite book | last=Sakarovitch | first=Jacques | title=Elements of automata theory | others=Translated from the French by Reuben Thomas | ___location=Cambridge | publisher=[[Cambridge University Press]] | year=2009 | isbn=978-0-521-84425-3 | zbl=1188.68177 }}
===Further reading===
* {{Sipser 1997}} — '''1.1''': "Finite Automata" pp. 31–47. '''4.1''': "Decidable Languages - Decidable Problems Concerning Regular Languages" pp. 152–155. '''4.4''': DFA can accept only regular language
{{Formal languages and grammars|state=collapsed}}
{{DEFAULTSORT:Deterministic Finite-State Machine}}
[[Category:Finite-state
|