Content deleted Content added
m robot Adding: ko |
More extensive example, additional references |
||
Line 6:
== Example parser ==
[[Algorithms + Data Structures = Programs]]):
<pre>
program = block "." .
block =
[ "const" ident "=" number {"," ident "=" number} ";"]
[ "var" ident {"," ident} ";"]
{ "procedure" ident ";" block ";" } statement .
statement =
[ ident ":=" expression
| "call" ident
| "begin" statement {";" statement } "end"
| "if" condition "then" statement
| "while" condition "do" statement
].
condition =
"odd" expression
| expression ("="|"#"|"<"|"<="|">"|">=") expression
.
expression = [ "+"|"-"] term { ("+"|"-") term} .
term = factor {("*"|"/") factor} .
factor = ident | number | "(" expression ")" .
</pre>
[[Terminals]] are expressed in quotes (except for the well defined 'ident'
and 'number'). Each [[nonterminal]] is defined by a rule in the grammar.
Notice how closely the code below mirrors the grammar above. There is a procedure for each nonterminal in the grammar. Parsing descends in a top-down manner, until the final nonterminal has been processed. The program fragment depends on a global variable, ''sym'', which contains the next symbol from the input, and the global function ''getsym'', which updates ''sym'' when called.
<pre>
int accept(Symbol s) {
if (sym == s) {
getsym();
return 1;
}
return 0;
}
int expect(Symbol s) {
if (accept(s))
return 1;
error("expect: unexpected symbol");
return 0;
}
void factor(void) {
if (accept(ident)) {
;
} else if (accept(number)) {
;
} else if (accept(lparen)) {
expression();
expect(rparen);
} else {
error("factor: syntax error");
getsym();
}
}
void term(void) {
factor();
while (sym == times || sym == slash) {
getsym();
factor();
}
}
void expression(void) {
if (sym == plus || sym == minus)
getsym();
term();
while (sym == plus || sym == minus) {
getsym();
term();
}
}
void condition(void) {
if (accept(oddsym)) {
expression();
} else {
expression();
if (sym == eql || sym == neq || sym == lss ||
sym == leq || sym == gtr || sym == geq) {
getsym();
expression();
} else {
error("condition: invalid operator");
getsym();
}
}
}
void statement(void) {
if (accept(ident)) {
expect(becomes);
expression();
} else if (accept(callsym)) {
expect(ident);
} else if (accept(beginsym)) {
do {
statement();
} while (accept(semicolon));
expect(endsym);
} else if (accept(ifsym)) {
condition();
expect(thensym);
statement();
} else if (accept(whilesym)) {
condition();
expect(dosym);
statement();
}
}
void block(void) {
if (accept(constsym)) {
do {
expect(ident);
expect(eql);
expect(number);
} while (accept(comma));
expect(semicolon);
}
if (accept(varsym)) {
do {
expect(ident);
} while (accept(comma));
expect(semicolon);
}
while (accept(procsym)) {
expect(ident);
expect(semicolon);
block();
expect(semicolon);
}
statement();
}
void program(void) {
getsym();
block();
expect(period);
}
</pre>
== Formalizing recursive descent parsers ==
Line 34 ⟶ 167:
== Implementation in functional languages ==
Recursive descent parsers are particularly easy to implement in functional languages such as [[Haskell programming language|Haskell]]. See [http://www.cs.nott.ac.uk/~gmh/pearl.pdf Functional Pearls: Monadic Parsing in Haskell] (pdf format).
== See Also ==
*[[PL/0]] - for more about the programming language PL/0
*[[COCO/R]] - a recursive descent parser generator
*[[ANTLR]] - a recursive descent parser generator
== References ==
Line 41 ⟶ 179:
* [http://pyparsing.sourceforge.net/ pyparsing]: A versatile recursive descent [[Python (programming language)|Python]] module.
{{FOLDOC}}
* ''Algorithms + Data Structures = Programs'', Niklaus Wirth, 1975, ISBN 0-13-022418-9
* ''Compiler Construction'', Niklaus Wirth, 1996, ISBN 0-201-40353-6
* ''Compiling with C# and Java'', Pat Terry, 2005, ISBN/ISSN: 0-321-26360-X 624
[[Category:Parsing algorithms]]
|