Shunting yard algorithm: Difference between revisions

Revision as of 01:28, 29 May 2020 by Math Machine 4 (edit summary: There was an incorrect step taken in the examples section, specifically in the second example)
Latest revision as of 15:22, 23 June 2025 by Dgpop (edit summary: →top: phrasing)
(84 intermediate revisions by 57 users not shown)
Line 1:
{{Short description|Algorithm to parse a syntax with infix notation to postfix notation}}
{{More footnotes|date=August 2013}}
{{Infobox algorithm
|name={{PAGENAMEBASE}}
|class=[[Parsing]]
|data=[[Stack (abstract data type)|Stack]]
|time=<math>O(n)</math>
|space=<math>O(n)</math>
}}
In [[computer science]], the '''shunting yard algorithm''' is a method for parsing arithmetical or logical expressions, or a combination of both, specified in [[infix notation]].
It can produce either a postfix notation string, also known as [[reverse Polish notation]] (RPN), or an [[abstract syntax tree]] (AST).<ref>{{cite web|access-date=2020-12-28|title=Parsing Expressions by Recursive Descent|url=http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm|website=www.engr.mun.ca|author=Theodore Norvell|date=1999}}</ref> The [[algorithm]] was invented by [[Edsger Dijkstra]], first published in November 1961,<ref>{{Cite journal |last=Dijkstra |first=Edsger |date=1961-11-01 |title=Algol 60 translation : An Algol 60 translator for the X1 and making a translator for Algol 60 |url=https://ir.cwi.nl/pub/9251 |language=en |journal=Stichting Mathematisch Centrum}}</ref> and is so named because its operation resembles that of a [[classification yard|railroad shunting yard]].

Like the evaluation of RPN, the shunting yard algorithm is [[stack (data structure)|stack]]-based. Infix expressions are the form of mathematical notation most people are used to, for instance {{nowrap|"3 + 4"}} or {{nowrap|"3 + 4 × (2 − 1)"}}. For the conversion there are two text [[Variable (programming)|variables]] ([[string (computer science)|strings]]), the input and the output. There is also a [[stack (data structure)|stack]] that holds operators not yet added to the output queue. To convert, the program reads each symbol in order and acts according to what kind of symbol it is. The result for the above examples would be (in [[reverse Polish notation]]) {{nowrap|"3 4 +"}} and {{nowrap|"3 4 2 1 − × +"}}, respectively.

The shunting yard algorithm will correctly parse all valid infix expressions, but does not reject all invalid expressions. For example, {{nowrap|"1 2 +"}} is not a valid infix expression, but would be parsed as {{nowrap|"1 + 2"}}. The algorithm can however reject expressions with mismatched parentheses.
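
The conversion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: it assumes the input is already split into tokens, handles binary operators and parentheses only (no functions or unary operators), and uses ASCII <code>*</code>, <code>/</code> and <code>-</code> in place of ×, ÷ and −. The names <code>to_rpn</code> and <code>OPERATORS</code>, and the precedence values in the table, are hypothetical choices made for this example.

```python
# Assumed precedence/associativity table for this sketch only.
OPERATORS = {
    "+": (2, "left"),
    "-": (2, "left"),
    "*": (3, "left"),
    "/": (3, "left"),
    "^": (4, "right"),
}


def to_rpn(tokens):
    """Convert a list of infix tokens into a list of RPN tokens."""
    output = []  # output queue
    stack = []   # operator stack
    for token in tokens:
        if token in OPERATORS:
            prec, assoc = OPERATORS[token]
            # Pop operators of higher precedence, or of equal precedence
            # when the incoming operator is left-associative.
            while stack and stack[-1] != "(":
                top_prec = OPERATORS[stack[-1]][0]
                if top_prec > prec or (top_prec == prec and assoc == "left"):
                    output.append(stack.pop())
                else:
                    break
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            # Pop until the matching "(" is found.
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("mismatched parentheses")
            stack.pop()  # discard the "("
        else:
            # Anything else is treated as a number or variable.
            output.append(token)
    # Flush the remaining operators; a leftover "(" was never closed.
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("mismatched parentheses")
        output.append(top)
    return output
```

With these assumptions, <code>to_rpn("3 + 4 * ( 2 - 1 )".split())</code> yields the tokens <code>3 4 2 1 - * +</code>. The invalid input <code>"1 2 +"</code> is accepted and produces the same output as <code>"1 + 2"</code>, while <code>"( 1 + 2"</code> raises a <code>ValueError</code> for the unclosed parenthesis.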
The shunting yard algorithm was later generalized into [[Operator-precedence parser|operator-precedence parsing]].

==A simple conversion==
Line 22 ⟶ 34:
==Graphical illustration==
[[File:Shunting yard.svg|frameless|border|center|500px|]]
Graphical illustration of the algorithm, using a [[wye junction|three-way railroad junction]]. The input is processed one symbol at a time: if a variable or number is found, it is copied directly to the output a), c), e), h). If the symbol is an operator, it is pushed onto the operator stack b), d), f). If the incoming operator's precedence is lower than that of the operator at the top of the stack, or the precedences are equal and the incoming operator is left-associative, then the operator at the top of the stack is popped off and added to the output g). Finally, any remaining operators are popped off the stack and added to the output i).

==The algorithm in detail==
{{for|important terms|token (parser)|function (mathematics)|Operator associativity|Order of operations}}

 {{font color|blue|/* The functions referred to in this algorithm are simple single argument functions such as sine, inverse or factorial. */}}
 {{font color|blue|/* This implementation does not implement composite functions, functions with a variable number of arguments, or unary operators. */}}
 
 '''while''' there are [[token (parser)|token]]s to be read:
     read a token
     '''if''' the token is:
     - a ''number'':
         put it into the output queue
     - a ''[[function (mathematics)|function]]'':
         push it onto the operator stack
     - an ''operator'' ''o''<sub>1</sub>:
         '''while''' (
             there is an operator ''o''<sub>2</sub> at the top of the operator stack which is not a left parenthesis,
             '''and''' (''o''<sub>2</sub> has greater [[Order of operations|precedence]] than ''o''<sub>1</sub> '''or''' (''o''<sub>1</sub> and ''o''<sub>2</sub> have the same precedence '''and''' ''o''<sub>1</sub> is left-associative))
         ):
             pop ''o''<sub>2</sub> from the operator stack into the output queue
         push ''o''<sub>1</sub> onto the operator stack
     - a ''","'':
         '''while''' the operator at the top of the operator stack is not a left parenthesis:
             pop the operator from the operator stack into the output queue
     - a ''left parenthesis'' (i.e. "("):
         push it onto the operator stack
     - a ''right parenthesis'' (i.e. ")"):
         '''while''' the operator at the top of the operator stack is not a left parenthesis:
             {'''assert''' the operator stack is not empty}
             {{font color|blue|/* If the stack runs out without finding a left parenthesis, then there are mismatched parentheses. */}}
             pop the operator from the operator stack into the output queue
         {'''assert''' there is a left parenthesis at the top of the operator stack}
         pop the left parenthesis from the operator stack and discard it
         '''if''' there is a function token at the top of the operator stack, '''then''':
             pop the function from the operator stack into the output queue
 {{font color|blue|/* After the while loop, pop the remaining items from the operator stack into the output queue. */}}
 '''while''' there are tokens on the operator stack:
     {{font color|blue|/* If the operator token on the top of the stack is a parenthesis, then there are mismatched parentheses. */}}
     {'''assert''' the operator on top of the stack is not a (left) parenthesis}
     pop the operator from the operator stack onto the output queue

To analyze the running time complexity of this algorithm, one has only to note that each token will be read once, each number, function, or operator will be printed once, and each function, operator, or parenthesis will be pushed onto the stack and popped off the stack once. There are therefore at most a constant number of operations executed per token, and the running time is O(''n''), that is, linear in the size of the input.

The shunting yard algorithm can also be applied to produce prefix notation (also known as [[Polish notation]]). To do this one would start from the end of the string of tokens to be parsed and work backwards, reverse the output queue (therefore making the output queue an output stack), and flip the left and right parenthesis behavior (remembering that the now-left parenthesis behavior should pop until it finds a now-right parenthesis), while making sure to change the [[Operator associativity|associativity]] condition to right.

==Detailed examples==
Input: {{nowrap|3 + 4 × 2 ÷ ( 1 − 5 ) ^ 2 ^ 3}}
Line 123 ⟶ 142:
:{| class="wikitable"
! Token !! Action !! Output<br>(in [[Reverse Polish Notation|RPN]]) !! Operator<br>stack !! Notes
|-
| align="center" | sin || Push token to stack || || align="right" | sin ||
Line 135 ⟶ 154:
| align="center" | 2 || Add token to output || 2 || align="right" | ( max ( sin ||
|-
| align="center" | , || Ignore || 2 || align="right" | ( max ( sin || The operator at the top of the stack is a left parenthesis
|-
| align="center" | 3 || Add token to output || 2 3 || align="right" | ( max ( sin ||
|-
| align="center" rowspan="3" | ) || Pop stack to output || 2 3 || align="right" | ( max ( sin || Repeated until "(" is at the top of the stack
|-
| Pop stack || 2 3 || align="right" | max ( sin || Discarding matching parentheses
|-
| Pop stack to output || 2 3 max || align="right" | ( sin || Function at top of the stack
|-
| align="center" | ÷ || Push token to stack || 2 3 max || align="right" | ÷ ( sin ||
|-
| align="center" | 3 || Add token to output || 2 3 max 3 || align="right" | ÷ ( sin ||
Line 155 ⟶ 174:
| align="center" | {{pi}} || Add token to output || 2 3 max 3 ÷ {{pi}} || align="right" | × ( sin ||
|-
| align="center" rowspan="3" | ) || Pop stack to output || 2 3 max 3 ÷ {{pi}} × || align="right" | ( sin || Repeated until "(" is at the top of the stack
|-
| Pop stack || 2 3 max 3 ÷ {{pi}} × || align="right" | sin || Discarding matching parentheses
|-
| Pop stack to output || 2 3 max 3 ÷ {{pi}} × sin || || Function at top of the stack
|-
| align="center" | ''end'' || Pop entire stack to output || 2 3 max 3 ÷ {{pi}} × sin || ||
Line 165 ⟶ 186:
* [[Operator-precedence parser]]
* [[Stack-sortable permutation]]

==References==
{{Reflist}}

==External links==
* [http://www.cs.utexas.edu/~EWD/MCReps/MR35.PDF Dijkstra's original description of the Shunting yard algorithm]
* [https://literateprograms.org/shunting_yard_algorithm__c_.html Literate Programs implementation in C]
* [https://github.com/Skarlett/shunting-yard-rs/blob/93bf03b37da611c1d642b6e221597ae095189901/src/main.rs#L220-L300 Demonstration of Shunting yard algorithm in Rust]
* [http://www.chris-j.co.uk/parsing.php Java Applet demonstrating the Shunting yard algorithm]
* [http://www.codeding.com/?article=11 Silverlight widget demonstrating the Shunting yard algorithm and evaluation of arithmetic expressions]
Line 174 ⟶ 199:
* [https://nl.mathworks.com/matlabcentral/fileexchange/68458-evaluation Matlab code, evaluation of arithmetic expressions using the shunting yard algorithm]

{{Parsers}}

[[Category:Parsing algorithms]]
[[Category:Dutch inventions]]