Operator-precedence parser: Difference between revisions

Content deleted Content added
Add example of putting an operator-precedence parser in the middle of a recursive-descent parser
fix formatting
Line 13:
== Example algorithm to parse infix notation ==
 
An [[EBNF]] grammar that parses infix notation will usually look like this:
 
<pre>
Line 29:
The algorithm is not a pure operator-precedence parser like the Dijkstra [[Reverse Polish Notation#Converting from infix notation|shunting yard]] algorithm. It assumes that the <em>primary</em> nonterminal is parsed in a separate subroutine, like in a [[recursive descent parser]].
 
=== Pseudo-code ===
Precedence levels are greater or equal to 0. The pseudo-code is as follows:
 
The pseudo-code for the algorithm is as follows. The parser starts at function <em>parse_expression</em>. Precedence levels are greater or equal to 0.
parse_expression ()
 
:* return parse_expression_1 (parse_primary (), 0)
<em>parse_expression</em> ()
:* return <em>parse_expression_1</em> (<em>parse_primary</em> (), 0)
<em>parse_expression_1 (lhs, min_precedence)
:* while the next token is a binary operator whose precedence is >= <em>min_precedence</em>
::* <em>op</em> := next token
::* <em>rhs</em> := <em>parse_primary</em> ()
::* while the next token is a binary operator whose precedence is greater than <em>op</em>'s, or a right-associative operator whose precedence is equal to <em>op</em>'s
:::* lookahead := next token
:::* <em>rhs</em> := <em>parse_expression_1</em> (<em>rhs</em>, lookahead's precedence)
::* <em>lhs</em> := the result of applying <em>op</em> with operands <em>lhs</em> and <em>rhs</em>
:* return <em>lhs</em>
 
=== Example execution of the algorithm ===
 
An example execution on the expression 2 + 3 * 4 + 5 == 19 is as follows. We give precedence 0 to equality expressions, 1 to additive expressions, 2 to multiplicative expressions.
 
<em>parse_expression_1</em> (<em>lhs</em> = 2, <em>min_precedence</em> = 0)
* the next token is +, with precedence 1. the while loop is entered.
* <em>op</em> is + (precedence 1)
* <em>rhs</em> is 3
* the next token is *, with precedence 2. recursive invocation.<br><em>parse_expression_1</em> (<em>lhs</em = 3, <em>min_precedence</em> = 2)
:* the next token is *, with precedence 2. the while loop is entered.
::* <em>op</em> is * (precedence 2)
::* <em>rhs</em> is 4
::* the next token is +, with precedence 1. no recursive invocation.
::* <em>lhs</em> is assigned 3*4 = 12
::* the next token is +, with precedence 1. the while loop is left.
:* 12 is returned.
* the next token is +, with precedence 1. no recursive invocation.
* <em>lhs</em> is assigned 2+12 = 14
* the next token is +, with precedence 1. the while loop is not left.
* <em>op</em> is + (precedence 1)
* <em>rhs</em> is 5
* the next token is ==, with precedence 0. no recursive invocation.
* <em>lhs</em> is assigned 14+5 = 19
* the next token is ==, with precedence 0. the while loop is not left.
* <em>op</em> is == (precedence 0)
* <em>rhs</em> is 19
* the next token is <em>end-of-line</em>, which is not an operator. no recursive invocation.
* <em>lhs</em> is assigned the result of evaluating 19 == 19, for example 1 (as in the C standard).
* the next token is <em>end-of-line</em>, which is not an operator. the while loop is left.
1 is returned.
 
</pre>
 
[[Category:Parsing algorithms]]