Comparison of parser generators: Difference between revisions

Content deleted Content added
Bender the Bot (talk | contribs)
m HTTP to HTTPS for SourceForge
 
(29 intermediate revisions by 19 users not shown)
Line 1:
{{More citations needed|date=July 2023}}
{{short description|None}}
{{More citations needed|date=July 2023}}
 
This is a list of notable [[lexer generator]]s and [[parser generator]]s for various language classes.
 
== Regular languages ==
[[Regular language]]s are a category of languages (sometimes termed [[Chomsky hierarchy|Chomsky Type 3]]) which can be matched by a state machine (more specifically, by a [[deterministic finite automaton]] or a [[nondeterministic finite automaton]]) constructed from a [[regular expression]]. In particular, a regular language can match constructs like "A follows B", "Either A or B", "A, followed by zero or more instances of B", but cannot match constructs which require consistency between non-adjacent elements, such as "some instances of A followed by the same number of instances of B", and also cannot express the concept of recursive "nesting" ("every A is eventually followed by a matching B"). A classic example of a problem which a regular grammar cannot handle is the question of whether a given string contains correctly- nested parentheses. (This is typically handled by a Chomsky Type 2 grammar, also termed a [[context-free grammar]].)
 
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto;"
Line 47:
| Quex || [[Deterministic finite automaton|DFA]] direct code || [[C (programming language)|C]], [[C++]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[GNU Lesser General Public License|GNU LGPL]]
|-
| [[Ragel]] || [[Deterministic finite automaton|DFA]] || [[Go (programming language)|Go]], [[C (programming language)|C]], [[C++]], [[Java (programming language)|Java]], [[Assembly language|assembly]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[GNU General Public License|GNU GPL]], [[MIT License|MIT]]<ref>{{Cite web|url=http://www.colm.net/open-source/ragel/|title = Ragel State Machine Compiler}}</ref><ref>http://www.colm.net/open-source/ragel/ {{verify source |date=November 2020 |reason=This ref was deleted Special:Diff/987467678 by a bug in VisualEditor and later restored by a bot from the original cite located at Special:Permalink/987467450 cite #1 - verify the cite is accurate and delete this template. [[User:GreenC bot/Job 18]]}}</ref>
|-
| [[RE/flex]] || [[Deterministic finite automaton|DFA]] direct code, DFA table driven, and [[Nondeterministic finite automaton|NFA]] regex libraries || [[C++]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[BSD licenses|BSD]]
|-
| [[re2c]] || [[Deterministic finite automaton|DFA]] direct code || [[C (programming language)|C]], [[C++]], [[Go (programming language)|Go]], [[Rust (programming language)|Rust]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[public ___domain]]
Line 55:
 
== Deterministic context-free languages ==
[[Context-free language]]s are a category of languages (sometimes termed [[Chomsky hierarchy|Chomsky Type 2]]) which can be matched by a sequence of replacement rules, each of which essentially maps each non-terminal element to a sequence of terminal elements and/or other nonterminal elements. Grammars of this type can match anything that can be matched by a [[regular grammar]], and furthermore, can handle the concept of recursive "nesting" ("every A is eventually followed by a matching B"), such as the question of whether a given string contains correctly- nested parentheses. The rules of Context-free grammars are purely local, however, and therefore cannot handle questions that require non-local analysis such as "Does a declaration exist for every variable that is used in a function?". To do so technically would require a more sophisticated grammar, like a Chomsky Type 1 grammar, also termed a [[context-sensitive grammar]]. However, parser generators for context-free grammars often support the ability for user-written code to introduce limited amounts of context-sensitivity. (For example, upon encountering a variable declaration, user-written code could save the name and type of the variable into an external data structure, so that these could be checked against later variable references detected by the parser.)
 
The [[deterministic context-free language]]s are a proper subset of the context-free languages which can be efficiently parsed by [[deterministic pushdown automata]].<!-- More parsing algorithms and output formats supported. See httphttps://gdk.sourceforge.net/gdkref.pdf -->
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto;"
|-
Line 66:
| [[ANTLR]]3 || [[LL parser|LL]](*) || [[Extended Backus–Naur form|EBNF]] || [[ActionScript]], [[Ada (programming language)|Ada95]], [[C (programming language)|C]], [[C++]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[JavaScript]], [[Objective-C]], [[Perl]], [[Python (programming language)|Python]], [[Ruby (programming language)|Ruby]] || {{D-A|Mixed}} || generated || {{Some|[[Java virtual machine]]}} || {{Yes}} || {{Free}}, [[BSD licenses|BSD]]
|-
| APG<ref>{{Cite web cn|title=Survey on Various Syntax Analyzer Tools |url=https://www.ijraset.com/research-paper/various-syntax-analyzer-tools |access-date=2023-09-16January |website=www.ijraset.com |language=en2025}}</ref>|| [[Recursive descent]], [[backtracking]] || [[ABNF]] || [[Python (programming language)|Python]], [[JavaScript]], [[C (programming language)|C]], [[Java (programming language)|Java]] || {{D-P|Separate}} || none || {{Yes|All}} || {{No}} || {{Free}}, [[BSD licenses|BSD]]
|-
| Beaver<ref>{{Cite journal |last1=Boyland |first1=John |last2=Spiewak |first2=Daniel |date=2010-09-17 |title=Tool Paper: ScalaBison Recursive Ascent-Descent Parser Generator |journal=Electronic Notes in Theoretical Computer Science |series=Proceedings of the Ninth Workshop on Language Descriptions Tools and Applications (LDTA 2009) |volume=253 |issue=7 |pages=65–74 |doi=10.1016/j.entcs.2010.08.032 |issn=1571-0661|doi-access=free }}</ref><ref>{{Cite web |title=Beaver - a LALR Parser Generator |url=https://beaver.sourceforge.net/ |access-date=2023-09-16 |website=beaver.sourceforge.net}}</ref>|| [[LALR parser|LALR]](1) || [[Extended Backus–Naur form|EBNF]]|| [[Java (programming language)|Java]]|| {{D-A|Mixed}} || external || {{Some|[[Java virtual machine]]}} || {{No}} || {{Free}}, [[BSD licenses|BSD]]
|-
| [[GNU Bison|Bison]] || [[LALR parser|LALR]](1), [[Canonical LR parser|LR]](1), [[IELR parser|IELR]](1), [[GLR parser|GLR]] || [[Yacc]] || [[C (programming language)|C]], [[C++]], [[D (programming language)|D]], [[Java (programming language)|Java]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]] with exception
|-
| [[BtYacc]] || [[Backtracking]] [[Bottom-up parsing|Bottom-up]] || ? || [[C++]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[public ___domain]]
Line 78:
| CL-Yacc<ref>{{Cite journal |last1=Newton |first1=Jim E. |last2=Demaille |first2=Akim |last3=Verna |first3=Didier |date=2016-05-09 |title=Type-Checking of Heterogeneous Sequences in Common Lisp |url=https://www.lrde.epita.fr/dload/papers/newton.16.rte.report.pdf |journal=Proceedings of the 9th European Lisp Symposium on European Lisp Symposium |series=ELS2016 |___location=Kraków, Poland |publisher=European Lisp Scientific Activities Association |pages=13–20 |isbn=978-2-9557474-0-7}}</ref><ref>{{Cite web |title=CL-Yacc — a LALR(1) parser generator for Common Lisp |url=https://www.irif.fr/~jch/software/cl-yacc/ |access-date=2023-09-16 |website=www.irif.fr}}</ref>|| [[LALR parser|LALR]](1) || [[Lisp (programming language)|Lisp]]|| [[Common Lisp]]|| {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[MIT License|MIT]]
|-
| [[Coco/R]] || [[LL parser|LL]](1) + semantic predicates || [[Extended Backus–Naur form|EBNF]] || [[C (programming language)|C]], [[C++]], [[C Sharp (programming language)|C#]], [[F Sharp (programming language)|F#]], [[Java (programming language)|Java]], [[Ada (programming language)|Ada]], [[Object Pascal]], [[Delphi (software)|Delphi]], [[Modula-2]], [[Oberon (programming language)|Oberon]], [[Ruby (programming language)|Ruby]], [[Swift (programming language)|Swift]], [[Unicon (programming language)|Unicon]], [[Visual Basic .NET]] || {{D-A|Mixed}} || generated || {{Some|[[Java virtual machine]], [[.NET]] framework, [[Microsoft Windows|Windows]], [[POSIX]] (depends on output language)}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| CppCC<ref>{{Cite journal |last1=Hosseinpour |first1=Sahereh |last2=Alavi Milani |first2=Mir Mohammad Reza |last3=Pehlivan |first3=Hüseyin |date=July 2018 |title=A Step-by-Step Solution Methodology for Mathematical Expressions |journal=Symmetry |language=en |volume=10 |issue=7 |pages=285 |doi=10.3390/sym10070285 |bibcode=2018Symm...10..285H |issn=2073-8994 |doi-access=free }}</ref><ref>{{Cite web |title=CppCC's Home Page |url=https://cppcc.sourceforge.net/ |access-date=2023-09-16 |website=cppcc.sourceforge.net}}</ref>|| [[LL parser|LL]](k) || ? || [[C++]]|| {{D-A|Mixed}} || generated || {{Some|[[POSIX]]}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
Line 84:
| CUP<ref>{{Cite web |title=Java Cup |url=https://pages.cs.wisc.edu/~fischer/cs536.s06/course.hold/html/NOTES/4a.JAVA-CUP.html |access-date=2023-09-16 |website=pages.cs.wisc.edu}}</ref><ref>{{Cite web |title=CUP |url=http://www2.cs.tum.edu/projects/cup/docs.php |access-date=2023-09-16 |website=www2.cs.tum.edu}}</ref>|| [[LALR parser|LALR]](1) || ? || [[Java (programming language)|Java]]|| {{D-A|Mixed}} || external || {{Some|[[Java virtual machine]]}} || {{No}} || {{Free}}, [[BSD licenses|BSD]]-like
|-
<!-- Other programming languages may be supported. See httphttps://eli-project.sourceforge.net/ -->
| Eli<ref>{{Cite journal |last1=Thiemann |first1=Peter |last2=Neubauer |first2=Matthias |date=2004-12-31 |title=Parameterized LR Parsing |journal=Electronic Notes in Theoretical Computer Science |series=Proceedings of the Fourth Workshop on Language Descriptions, Tools, and Applications (LDTA 2004) |volume=110 |pages=115–132 |doi=10.1016/j.entcs.2004.06.007 |issn=1571-0661|doi-access=free }}</ref><ref>{{Cite journal |last1=Gray |first1=Robert W. |last2=Levi |first2=Steven P. |last3=Heuring |first3=Vincent P. |last4=Sloane |first4=Anthony M. |last5=Waite |first5=William M. |title=Eli: a complete, flexible compiler construction system |journal=Communications of the ACM |date=1992 |language=en |volume=35 |issue=2 |pages=121–130 |doi=10.1145/129630.129637 |s2cid=5121773 |issn=0001-0782|doi-access=free }}</ref> || [[LALR parser|LALR]](1) || ? || [[C (programming language)|C]] || {{D-A|Mixed}} || generated || {{Some|[[POSIX]]}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]], [[GNU Lesser General Public License|GNU LGPL]]
|-
Line 113:
| Lapg || [[LALR parser|LALR]](1) || ? || [[C (programming language)|C]], [[C++]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[JavaScript]] || {{D-A|Mixed}} || generated || {{Some|[[Java virtual machine]]}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| Lark || [[LALR parser|LALR]](1), [[Earley_parserEarley parser|Earley (SPPF)]] || [[Extended Backus–Naur form|EBNF]] || [[Python (programming language)|Python]], [[JavaScript]] || {{D-A|Mixed}} || generated || {{Yes|All}} || {{Yes}} || {{Free}}, [[MIT license|MIT]]
|-
| [[Lemon (parser generator)|Lemon]] || [[LALR parser|LALR]](1) ||BNF dialect<ref>{{Cite web |title=The Lemon Parser Generator |url=https://sqlite.org/src/doc/trunk/doc/lemon.html#syntax |access-date=2023-11-30 |website=sqlite.org}}</ref>|| [[C (programming language)|C]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[public ___domain]]
Line 164:
|-
| PRECC || [[LL parser|LL]](k) || ? || [[C (programming language)|C]] || {{D-P|Separate}} || generated || {{Some|[[DOS]], [[POSIX]]}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| QLALR || [[LALR parser|LALR]](1) || ? || [[C++]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
|racc<ref>{{Cite web|title=Racc|url=https://i.loveruby.net/en/projects/racc/|access-date=2021-11-26|website=i.loveruby.net}}</ref>
Line 175 ⟶ 177:
|{{Free|[[LGPL]]}}
|-
| REX<ref>{{Cite web |url=https://www.bottlecaps.de/rex/ |title=The REX Parser Generator supports C, C++, Java, JavaScript, C#, Go, Haxe, Python, Scala, Typescript, XQuery, and XSLT}}</ref> || [[LL parser|LL]](1) [[sLL parser|sLL]](k) [[LR parser|LR]](k) [[LALR]](k) [[GLR parser|GLR]] [[Parsing expression grammar|PEG]] [[Deterministic finite automaton|DFA]] [[Context-dependent lexing]] || [[Extended Backus–Naur form|EBNF]] || [[C++]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[JavaScript]], [[Go (programming language)|Go]], [[Haxe]], [[Python (programming language)|Python]], [[Scala (programming language)|Scala]], [[TypeScript|Typescript]], [[XQuery]], [[XSLT]] || {{D-P|Separate}} || generated || {{Yes|All}} || {{No}} || {{Free}}, [[Apache License 2.0]]
| QLALR || [[LALR parser|LALR]](1) || ? || [[C++]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| [[SableCC]] || [[LALR parser|LALR]](1) || ? || [[C (programming language)|C]], [[C++]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[OCaml]], [[Python (programming language)|Python]] || {{D-P|Separate}} || generated || {{Some|[[Java virtual machine]]}} || {{No}} || {{Free}}, [[GNU Lesser General Public License|GNU LGPL]]
|-
| SLK<ref>{{Cite web |url=http://www.slkpg.sitecom/ |title=The SLK Parser Generator supports C, C++, Java, JavaScript, and C#, optional backtracking, free}}</ref> || [[LL parser|LL]](k) [[LR parser|LR]](k) [[LALR]](k) || [[Extended Backus–Naur form|EBNF]] || [[C (programming language)|C]], [[C++]], [[C Sharp (programming language)|C#]], [[Java (programming language)|Java]], [[JavaScript]] || {{D-P|Separate}} || external || {{Yes|All}} || {{No}} || SLK<ref>http://www.slkpg.sitecom/license.txt {{Bare URL plain text|date=March 2022}}</ref>
|-
| SLY<ref>{{Cite web |url=https://sly.readthedocs.io/en/latest/sly.html |title=SLY (Sly Lex Yacc)}}</ref> || [[LALR parser|LALR]](1) || BNF || [[Python (programming language)|Python]] || {{D-A|Mixed}} || generated || {{Yes|All}} || {{No}} || {{Free}}, [[MITBSD License|MITBSD]]
|-
| SP (Simple Parser) || [[Recursive descent]] || [[Python (programming language)|Python]] || [[Python (programming language)|Python]] || {{D-P|Separate}} || generated || {{Yes|All}} || {{No}} || {{Free}}, [[GNU Lesser General Public License|GNU LGPL]]
Line 201 ⟶ 203:
| TP Yacc || [[LALR parser|LALR]](1) || ? || [[Turbo Pascal]] || {{D-A|Mixed}} || external || {{Yes|All}} || {{Yes}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| [[Tree-sitter (parser generator)|Tree-Sitter]]<ref>{{Cite web |url=https://tree-sitter.github.io/ |title=Tree-Sitter - An incremental parsing system for programming tools}}</ref> || [[LR parser|LR]](1), [[GLR parser|GLR]] || [[JavaScript]] [[Domain-specific language|DSL]], [[JSON]] || [[C (programming language)|C]], bindings ([[Rust (programming language)|Rust]], [[WebAssembly]], [[JavaScript]], [[Python (programming language)|Python]], many other) || {{D-P|Separate}} || generated + external || {{Yes|All}} || {{D-A|[[Vim (text editor)#Neovim|Neovim]], [[VisualHelix]], Studio[[GNU CodeEmacs]], [[Lapce]], [[Zed (text editor)|Zed]]}} || {{Free}}, [[MIT License|MIT]]
|-
| Tunnel Grammar Studio || [[Tunnel Parsing]] || [[Augmented Backus–Naur form|ABNF]] || [[C++]] || {{D-P|Separate}} || generated || {{Some|[[Microsoft Windows|Windows]]}} || {{Yes}} || {{Proprietary}}
Line 231 ⟶ 233:
|}
 
==Parsing expression grammars, deterministic booleanBoolean grammars==
 
This table compares parser generators with [[parsing expression grammar]]s, deterministic [[booleanBoolean grammar]]s.
 
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto;"
Line 276 ⟶ 278:
|-
| neotoma || [[Packrat parser|Packrat]] || [[Erlang (programming language)|Erlang]] || {{D-P|Separate}} || {{Yes|All}} || {{Free}}, [[MIT License|MIT]]
|-
| nez<ref>{{Citation |title=Nez: practical open grammar language |date=2015-11-26 |arxiv=1511.08307 |last1=Kuramitsu |first1=Kimio }}</ref> || Parsing machine || [[Java (programming language)|Java]], [[C (programming language)|C]] || {{D-P|Separate}} || {{Some|[[Java virtual machine]]}} || {{Free}}, [[BSD License|BSD]]
|-
| NPEG || Recursive descent || [[C Sharp (programming language)|C#]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[MIT License|MIT]]
Line 281 ⟶ 285:
| [[OMeta]] || [[Packrat parser|Packrat]] (modified, partial memoization) || [[JavaScript]], [[Squeak]], [[Python (programming language)|Python]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[MIT License|MIT]]
|-
| [[PackCC]] || [[Packrat parser|Packrat]] (modified, left-recursion support) || [[C (programming language)|C]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[MIT License|MIT]]
|-
| [[Packrat parser|Packrat]] || [[Packrat parser|Packrat]] || [[Scheme (programming language)|Scheme]] || {{D-A|Mixed}} || {{Yes|All}} || {{Free}}, [[MIT License|MIT]]
Line 330 ⟶ 334:
|}
 
== General context-free, conjunctive, or booleanBoolean languages ==
 
This table compares parser generator languages with a general [[context-free grammar]], a [[conjunctive grammar]], or a [[booleanBoolean grammar]].
 
{| class="wikitable sortable" style="text-align: center; font-size: 85%; width: auto;"
Line 342 ⟶ 346:
| APaGeD || [[GLR parser|GLR]], [[LALR parser|LALR]](1), [[LL parser|LL]](k) || ? || [[D (programming language)|D]] || {{D-A|Mixed}} || generated || {{Yes|All}} || {{No}} || {{Free}}, [[Artistic License|Artistic]]
|-
| [[GNU Bison|Bison]] || [[LALR parser|LALR]](1), [[Canonical LR parser|LR]](1), [[IELR parser|IELR]](1), [[GLR parser|GLR]] || [[Yacc]] || [[C (programming language)|C]], [[C++]], [[D (programming language)|D]],<ref>{{cite web |title=Decl Summary (Bison 3.8.1) |url=https://www.gnu.org/software/bison/manual/html_node/Decl-Summary.html |website=www.gnu.org}}</ref> [[Java (programming language)|Java]], [[XML]] || {{D-A|Mixed}}, except XML || external || {{Yes|All}} || {{No}} || {{Free}}, [[GNU General Public License|GNU GPL]]
|-
| [[DMS Software Reengineering Toolkit]] || [[GLR parser|GLR]] || ? || [[Parlanse]] || {{D-A|Mixed}} || generated || {{Some|[[Microsoft Windows|Windows]]}} || {{No}} || {{Proprietary}}
Line 400 ⟶ 404:
! Name !! Parsing algorithm !! Input grammar notation !! Boolean grammar abilities !! Development platform !! [[Software license|License]]
|-
| [httphttps://sourceforge.net/p/bnf2xml/ bnf2xml] || [[Recursive descent]] (is a text filter output is xml) || simple [[Backus–Naur form|BNF]]{{clarify|Only context-free grammars can be denoted in BNF.|date=January 2018}} grammar (input matching), output is xml || ? || Beta, and not a full EBNF parser || {{Free}}, [[GNU General Public License|GNU GPL]]
|}