Content deleted Content added
(267 intermediate revisions by more than 100 users not shown) | |||
Line 1:
{{short description|Algorithm}}
A '''recursive descent parser''' is a top-down [[parsing|parser]] built from a set of [[Mutual recursion|mutually-recursive]] procedures (or a non-recursive equivalent) where each such [[procedure]] usually implements one of the production rules of the [[formal grammar|grammar]]. Thus the structure of the resulting program closely mirrors that of the grammar it recognises.▼
{{More footnotes|date=February 2009}}
▲
A '''predictive parser''' is a recursive descent parser with no backup. Predictive parsing is possible only for the class of [[LL parser|LL(k)]] grammars, which are the context-free grammars for which there exists some positive integer k that allows a recursive descent parser to decide which production to use by examining only the next k tokens of input. (The [[LL parser|LL(k)]] grammars therefore exclude all ambiguous grammars, as well as all grammars that contain left recursion. Any context-free grammar can be transformed into an equivalent grammar that has no left recursion, but removal of left recursion does not always yield an [[LL parser|LL(k)]] grammar.) A predictive parser runs in [[linear time]].▼
▲A
Recursive descent with backup is a technique that determines which production to use by trying each production in turn. Recursive descent with backup is not limited to [[LL parser|LL(k)]] grammars, but is not guaranteed to terminate unless the grammar is [[LL parser|LL(k)]]. Even when they terminate, parsers that use recursive descent with backup may require [[exponential time]].▼
▲Recursive descent with
Although predictive parsers are widely used, and are frequently chosen if writing a parser by hand, programmers often prefer to use a table-based parser produced by a [[parser generator]],{{Citation needed|date=February 2018}} either for an LL(''k'') language or using an alternative parser, such as [[LALR parser|LALR]] or [[LR parser|LR]]. This is particularly the case if a grammar is not in [[LL parser|LL(''k'')]] form, as transforming the grammar to LL to make it suitable for predictive parsing is involved. Predictive parsers can also be automatically generated, using tools like [[ANTLR]].
Predictive parsers can be depicted using transition diagrams for each non-terminal symbol where the edges between the initial and the final states are labelled by the symbols (terminals and non-terminals) of the right side of the production rule.<ref>{{cite book|last1=Aho|first1=Alfred V.|last2=Sethi|first2=Ravi|last3=Ullman|first3=Jeffrey|authorlink1=Alfred V. Aho|authorlink3=Jeffrey Ullman|title=Compilers: Principles, Techniques and Tools|url=https://archive.org/details/compilers00ahoa|url-access=limited|date=1986|publisher=Addison Wesley|page=[https://archive.org/details/compilers00ahoa/page/n193 183]|edition=first}}</ref>
== Example parser ==▼
The following [[EBNF]] [[grammar]] (for [[Niklaus Wirth]]'s [[PL/0]] programming language, from▼
▲The following [[Extended Backus–Naur Form|EBNF]]-like [[formal grammar|grammar]] (for [[Niklaus Wirth]]'s [[PL/0]] programming language, from ''[[Algorithms + Data Structures = Programs]]'') is in [[LL parser|LL(1)]] form:
<syntaxhighlight lang="ebnf">
program = block "." .
Line 23 ⟶ 24:
statement =
| "call" ident
| "begin" statement {";" statement } "end"
| "if" condition "then" statement
| "while" condition "do" statement .
] .▼
condition =
"odd" expression
| expression ("="|"#"|"<"|"<="|">"|">=") expression .
expression = ["+"|"-"] term {("+"|"-") term} .
Line 39 ⟶ 38:
term = factor {("*"|"/") factor} .
factor =
ident | number | "(" expression ")" . </syntaxhighlight>
[[
===C implementation===
Notice how closely the predictive parser below mirrors the grammar above. There is a procedure for each nonterminal in the grammar. Parsing descends in a top-down manner, until the final nonterminal has been processed. The program fragment depends on a global variable, ''sym'', which contains the next symbol from the input, and the global function ''getsym'', which updates ''sym'' when called.▼
What follows is an implementation of a recursive descent parser for the above language in [[C (programming language)|C]]. The parser reads in source code, and exits with an error message if the code fails to parse, exiting silently if the code parses correctly.
▲Notice how closely the predictive parser below mirrors the grammar above. There is a procedure for each nonterminal in the grammar. Parsing descends in a top-down manner
The implementations of the functions ''nextsym'' and ''error'' are omitted for simplicity.
<syntaxhighlight lang="c">
typedef enum {ident, number, lparen, rparen, times, slash, plus,
minus, eql, neq, lss, leq, gtr, geq, callsym, beginsym, semicolon,
Line 54 ⟶ 61:
Symbol sym;
void
void error(const char msg[]);
int accept(Symbol s) {
if (sym == s) {
return 1;
}
Line 83 ⟶ 89:
} else {
error("factor: syntax error");
}
}
Line 90 ⟶ 96:
factor();
while (sym == times || sym == slash) {
factor();
}
Line 97 ⟶ 103:
void expression(void) {
if (sym == plus || sym == minus)
term();
while (sym == plus || sym == minus) {
term();
}
Line 110 ⟶ 116:
} else {
expression();
if (sym == eql || sym == neq || sym == lss || sym == leq || sym == gtr || sym == geq) {
getsym();▼
expression();
} else {
error("condition: invalid operator");
}
}
Line 140 ⟶ 145:
expect(dosym);
statement();
error("statement: syntax error");
}
}
Line 168 ⟶ 176:
void program(void) {
block();
expect(period);
}
</syntaxhighlight>
*[[TMG (language)|TMG]] – an early compiler-compiler used in the 1960s and early 1970s
*[[JavaCC]]
*[[Coco/R]]
*[[ANTLR]]
*[[Spirit Parser Framework]] – a C++ recursive descent parser generator framework requiring no pre-compile step
*[[parboiled (Java)]] – a recursive descent [[Parsing expression grammar|PEG]] parsing library for [[Java (programming language)|Java]]
The C++ front-end of the [[Clang]] compiler contains a hand-written parser based on the recursive-descent parsing algorithm. <ref>How Clang handles the type / variable name ambiguity of C/C++
▲== See also ==
https://eli.thegreenplace.net/2012/07/05/how-clang-handles-the-type-variable-name-ambiguity-of-cc/</ref>
▲*[[ANTLR]] - a recursive descent parser generator
==See
*[[Parser combinator]] – a higher-order function used in combinatory parsing, a method of factoring recursive descent parser designs
* The [[Dragon book]], Alfred V Aho, Ravi Sethi, and Jeffrey D Ullman, in particular Section 4.4.▼
*[[Parsing expression grammar]] – another form representing recursive descent grammar
* ''Modern Compiler Implementation in Java, Second Edition'', Andrew Appel, 2002, ISBN 0-521-82060-X.▼
*[[Recursive ascent parser]]
* ''Recursive Programming Techniques'', W.H. Burge, 1975, ISBN 0-201-14450-6▼
* ''Crafting a Compiler with C'', Charles N Fischer and Richard J LeBlanc, Jr, 1991, ISBN 0-8053-2166-7.▼
* ''Compiling with C# and Java'', Pat Terry, 2005, ISBN/ISSN: 0-321-26360-X 624▼
* ''Algorithms + Data Structures = Programs'', Niklaus Wirth, 1975, ISBN 0-13-022418-9▼
* ''Compiler Construction'', Niklaus Wirth, 1996, ISBN 0-201-40353-6▼
==References==
[[Category:Parsing algorithms]]▼
{{Reflist}}
===General references===
▲*
▲* ''Modern Compiler Implementation in Java, Second Edition'', Andrew Appel, 2002, {{ISBN
▲* ''Crafting a Compiler with C'', Charles N Fischer and Richard J LeBlanc, Jr, 1991, {{ISBN
==External links==
*[https://compilers.iecc.com/crenshaw/ Jack W. Crenshaw: ''Let's Build A Compiler'' (1988-1995)], in [[Pascal (programming language)|Pascal]], with [[assembly language]] output, using a "keep it simple" approach
{{Parsers}}
▲[[Category:Parsing algorithms]]
[[Category:Articles with example C code]]
|