Content deleted Content added
Corrected typo |
Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5 |
||
(48 intermediate revisions by 38 users not shown) | |||
Line 1:
{{Short description|Bottom-up parser that interprets an operator-precedence grammar}}
In [[computer science]], an '''operator [[Edsger Dijkstra]]'s [[shunting yard algorithm]] is commonly used to implement operator
== Relationship to other parsers ==
An operator-precedence parser is a simple [[shift-reduce parser]] that is capable of parsing a subset of [[LR parser|LR(1)]] grammars. More precisely, the operator-precedence parser can parse all LR(1) grammars where two consecutive [[nonterminal]]s and [[empty string|epsilon]] never appear in the right-hand side of any rule.
Operator-precedence parsers are not used often in practice; however they do have some properties that make them useful within a larger design. First, they are simple enough to write by hand, which is not generally the case with more sophisticated right shift-reduce parsers. Second, they can be written to consult an operator table at [[Run time (program lifecycle phase)|run time]], which makes them suitable for languages that can add to or change their operators while parsing. (An example is [[Haskell (programming language)|Haskell]], which allows user-defined infix operators with custom associativity and precedence;
[[
== Precedence climbing method ==
The precedence climbing method is a compact, efficient, and flexible algorithm for parsing expressions that was first described by Martin Richards and Colin Whitby-
An infix-notation expression grammar in [[EBNF]] format will usually look like this:
<syntaxhighlight lang="
</syntaxhighlight>
Line 30 ⟶ 31:
The algorithm is not a pure operator-precedence parser like the Dijkstra shunting yard algorithm. It assumes that the ''primary'' nonterminal is parsed in a separate subroutine, like in a recursive descent parser.
===
The
'''return
''lookahead'' := peek next token
'''while''' ''lookahead'' is a binary operator whose precedence is >= ''min_precedence''
''op'' := ''lookahead''
advance to next token
''rhs'' := ''parse_primary'' ()
''lookahead'' := peek next token
'''while''' ''lookahead'' is a binary operator whose precedence is greater
than ''op''<nowiki>'</nowiki>s, or a right-associative operator
whose precedence is equal to ''op'''s
''rhs'' := ''parse_expression_1'' (''rhs'', precedence of ''op'' + (1 if ''lookahead'
''lookahead'' := peek next token
''lhs'' := the result of applying ''op'' with operands ''lhs'' and ''rhs''
'''return''' ''lhs''
Note that in the case of a production rule like this (where the operator can only appear once):
<syntaxhighlight lang="
</syntaxhighlight>
Line 90 ⟶ 91:
* the next token is ''end-of-line'', which is not an operator. the outer while loop is left.
1 is returned.
== Pratt parsing ==
{{external links|section|date=August 2023}}
Another precedence parser known as Pratt parsing was first described by [[Vaughan Pratt]] in the 1973 paper "Top Down Operator Precedence",<ref>Pratt, Vaughan. "[https://web.archive.org/web/20151223215421/http://hall.org.ua/halls/wizzard/pdf/Vaughan.Pratt.TDOP.pdf Top Down Operator Precedence]." ''Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages'' (1973).</ref> based on [[Recursive descent parser|recursive descent]]. Though it predates precedence climbing, it can be viewed as a generalization of precedence climbing.<ref>{{cite web |last1=Norvell |first1=Theodore |title=Parsing Expressions by Recursive Descent |url=http://www.engr.mun.ca/~theo/Misc/pratt_parsing.htm |website=www.engr.mun.ca |quote=The purpose of this post is to [... start] with precedence climbing and refactoring it to use the command pattern until we arrive at a Pratt parser. [This is the author who coined the term "precedence climbing".]}}</ref>
Pratt designed the parser originally to implement the [[CGOL]] programming language, and it was treated in much more depth in a Masters Thesis under his supervision.<ref>Van De Vanter, Michael L. "[http://publications.csail.mit.edu/lcs/specpub.php?id=715 A Formalization and Correctness Proof of the CGOL Language System]." (Master's Thesis). MIT Laboratory for Computer Science Technical Report MIT-LCS-TR-147 (Cambridge, Massachusetts). 1975.</ref>
Tutorials and implementations:
* [[Douglas Crockford]] based the JavaScript parser in [[JSLint]] on Pratt parsing.<ref>{{cite web|author=Crockford, D|title=Top Down Operator Precedence|date=2007-02-21|url=http://crockford.com/javascript/tdop/tdop.html}}</ref>
* Comparison between Python implementations of precedence climbing and Pratt parsing: [http://www.oilshell.org/blog/2016/11/01.html "Pratt Parsing and Precedence Climbing Are the Same Algorithm" (2016) by Andy Chu]
* Tutorial using [[Rust (programming language)|Rust]]: [https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html "Simple but Powerful Pratt Parsing" (2020) by Aleksey Kladov]
* Tutorial using [[Rust (programming language)|Rust]]: [https://williamr.dev/posts/pratt-parsing/ "The Pratt Parsing Technique" (2024) by William Rågstad]
* Tutorial using [[Python (programming language)|Python]]: [http://effbot.org/zone/simple-top-down-parsing.htm "Simple Top-Down Parsing in Python" (2008) by Fredrik Lundh] {{Webarchive|url=https://web.archive.org/web/20150228044653/http://effbot.org/zone/simple-top-down-parsing.htm |date=2015-02-28 }}
* Tutorial using [[Java (programming language)|Java]]: [http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/ "Pratt Parsers: Expression Parsing Made Easy" (2011) by Bob Nystrom, author of Crafting Interpreters]
* Implementation in [[C Sharp (programming language)|C#]]: [https://github.com/atifaziz/Gratt "Gratt: A Generic Vaughn Pratt's top-down operator precedence parser for .NET Standard"] (a [[Generic programming|generic]] version inspired by the Java implementation presented by Bob Nystrom in [http://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/ "Pratt Parsers: Expression Parsing Made Easy"])
== Alternative methods ==
Line 97 ⟶ 113:
Such trees do not necessarily need to be implemented using data structures conventionally used for trees. Instead, tokens can be stored in flat structures, such as tables, by simultaneously building a priority list which states what elements to process in which order.
=== Full parenthesization ===
Another approach is to first fully parenthesize the expression, inserting a number of parentheses around each operator, such that they lead to the correct precedence even when parsed with a linear, left-to-right parser. This algorithm was used in the early [[Fortran#FORTRAN|FORTRAN I]] compiler:<ref name="Padua2000">{{cite journal |last1=Padua |first1=David |title=The Fortran I Compiler |year=2000
<blockquote>
Line 105 ⟶ 122:
* add <code>((</code> at the beginning of each expression and after each left parenthesis in the original expression; and
* add <code>))</code> at the end of the expression and before each right parenthesis in the original expression.
Although not obvious, the algorithm was correct, and, in the words of [[Donald Knuth|Knuth]], “The resulting formula is properly parenthesized, believe it or not.”<ref name="Knuth1962">{{cite journal |last1=Knuth |first1=Donald E. |title=A HISTORY OF WRITING COMPILERS |year=1962 |publisher=Edmund C. Berkeley |journal=
</blockquote>
{{missing information|section|why parenthesization works|date=May 2023}}
Example code of a simple C application that handles parenthesisation of basic math operators (<code>+</code>, <code>-</code>, <code>*</code>, <code>/</code>, <code>^</code>, <code>(</code> and <code>)</code><!-- parentheses -->):
<
#include <stdio.h>
#include <string.h>
// The command-line argument boundary is our lexer.
int main(int argc, char *argv[]) {
int i;
printf("((((");
for (i=1; i!=argc; i++) {
if
switch (*argv[i]) {
case '(': printf("(((("); continue; case ')': printf("))))"); continue;
case '^': printf(")^("); continue;
case '*': printf("))*(("); continue;
case '/': printf("))/(("); continue;
case '+':
// unary check: either first or had an operator expecting secondary argument
if (i == 1 || strchr("(^*/+-", *argv[i-1]))
printf("+");
Line 143 ⟶ 164:
return 0;
}
</syntaxhighlight>
First, you need to compile your program. Assuming your program is written in C and the source code is in a file named program.c, you would use the following command:
<pre>gcc program.c -o program</pre>
The above command tells gcc to compile program.c and create an executable named program.
Command to run the program with parameters, For example; a * b + c ^ d / e
<pre>./program a '*' b + c '^' d / e</pre>▼
▲<pre>a * b + c ^ d / e</pre>
it produces
<pre>((((a))*((b)))+(((c)^(d))/((e))))</pre>
Line 158 ⟶ 184:
==References==
{{Reflist}}
==External links==
* {{cite web| last=Clarke | first=Keith | date=1992-05-26 | title=Re: compact recursive-descent parsing of expressions | url=http://groups.google.com/group/comp.compilers/browse_thread/thread/aee24b9504d527ca/1210ef8ae529accd?#1210ef8ae529accd |
* [http://groups.google.com/group/comp.compilers/browse_thread/thread/442ca111b2dfb38d/5390237263b3625b?#5390237263b3625b Example C++ code by Keith Clarke for parsing infix expressions using the precedence climbing method]
* {{cite journal| last=Samelson | first=Klaus |
* [https://web.archive.org/web/20160817143616/http://radovan-augustin.info/techpapers/parser/Parser.pdf Parser for expression with infix notation using precedence lists]
{{Parsers}}
{{DEFAULTSORT:Operator-Precedence Parser}}
|