Recursive ascent parser: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 20:09, 5 December 2012 edit Yobot (talk \| contribs) Bots 4,733,870 edits m clean up, References after punctuation per WP:REFPUNC and WP:PAIC using AWB (8748) ← Previous edit		Latest revision as of 23:04, 22 December 2024 edit undo Kvng (talk \| contribs) Extended confirmed users, New page reviewers 116,036 edits avoid unnec redirect, link scope
(19 intermediate revisions by 18 users not shown)
Line 1: In [[computer science]], '''recursive ascent parsing''' is a technique for implementing an [[~~LALR]]~~LR parser]] which uses mutually-recursive functions rather than tables. Thus, the parser is ''directly encoded'' in the host language similar to [[recursive descent]]. Direct encoding usually yields a parser which is faster than its table-driven equivalent<ref>{{cite ~~web~~journal\|title=Very fast LR parsing\|author=Thomas J Penello\|journal=ACM SIGPLAN Notices \|year=1986\|~~url~~volume=~~http://portal~~21 \|issue=7 \|pages=145–151 \|doi=10.~~acm.org~~1145/~~citation.cfm?id=~~13310.13326 \|doi-access=free }}</ref> for the same reason that compilation is faster than interpretation. It is also (nominally) possible to hand edit a recursive ascent parser, whereas a tabular implementation is nigh unreadable to the average human. Recursive ascent was first described by Thomas ~~Penello~~Pennello in his article {{cite ~~web~~book\|~~title~~chapter=Very fast LR parsing\| doi=10.1145/12276.13326 \|chapter-url=http://portal.acm.org/citation.cfm?id=13310.13326\| title=Proceedings of the 1986 SIGPLAN symposium on Compiler construction - SIGPLAN '86 \| year=1986 \| last1=Pennello \| first1=Thomas J. \| pages=145–151 \| isbn=0897911970 \| s2cid=17214407 }} in 1986. He was not intending to create a hand-editable implementation of an LR parser, but rather a maintainable and efficient parser implemented in [[assembly language]]. The technique was later expounded upon by G.H. Roberts<ref>{{cite ~~web~~journal\|title=Recursive ascent: an LR analog to recursive descent\|year=1988\|author=G.H. Roberts\|~~url~~journal=~~http://portal~~ACM SIGPLAN Notices \|volume=23 \|issue=8 \|pages=23–29 \|doi=10.~~acm.org~~1145/~~citation.cfm?id=~~47907.47909 \|s2cid=12740771 \|doi-access=free }}</ref> in 1988 as well as in an article by Leermakers, Augusteijn, Kruseman Aretz<ref>{{cite ~~web~~journal\|title=A functional LR parser\|author=Leermakers, Augusteijn, Kruseman Aretz\|journal=Theoretical Computer Science \|year=1992\|~~url~~volume=~~http://portal~~104 \|issue=2 \|pages=313–323 \|doi=10.~~acm.org~~1016/~~citation.cfm?id~~0304-3975(92)90128-3 \|doi-access=~~146986.146994~~free }}</ref> in 1992 in the journal ''Theoretical Computer Science''. An extremely readable description of the technique was written by Morell and Middleton<ref>{{cite ~~article~~news\|title=Recursive-ascent parsing\|~~author~~author1=Larry Morell ~~and~~ \|author2=David Middleton \|name-list-style=amp \|year=2003\|journal=Journal of Computing Sciences in Colleges\|volume=18\|issue=6\|pages=186–201\|url=http://portal.acm.org/citation.cfm?id=770849}}</ref> in 2003. A good exposition can also be found in a TOPLAS article by Sperber and Thiemann.<ref>{{cite ~~web~~journal\|title=Generation of LR parsers by partial evaluation\|author=Sperber and Thiemann\|journal=ACM Transactions on Programming Languages and Systems \|year=2000\|~~url~~volume=~~http://portal.acm~~22 \|issue=2 \|pages=224–264 \|doi=10.~~org~~1145/~~citation~~349214.~~cfm?id=~~349219~~&dl~~ \|s2cid=~~GUIDE&coll~~14955687 \|doi-access=~~GUIDE&CFID=56087236&CFTOKEN=74111863~~free }}</ref> Recursive ascent has also been merged with recursive descent, yielding a technique known as [[recursive ascent/descent]]. This implementation technique is arguably easier to hand-edit due to the reduction in states and fact that some of these states are more intuitively top-down rather than bottom up. It can also yield some minimal performance improvements over conventional recursive ascent.<ref>{{cite web\|title=ScalaBison Recursive Ascent-Descent Parser Generator\|~~author~~author1=John Boyland ~~and~~ \|author2=Daniel Spiewak \|name-list-style=amp \|year=2009\|url=http://www.cs.uwm.edu/~boyland/papers/scala-bison.pdf\|archive-url=https://web.archive.org/web/20090718161230/http://www.cs.uwm.edu/~boyland/papers/scala-bison.pdf\|archive-date=2009-07-18\|url-status=dead}}</ref> == Summary == Intuitively, recursive ascent is a literal implementation of the [[LR parser\|LR parsing]] concept. Each function in the parser represents a single LR [[Finite -state machine\|automaton]] state]]. Within each function, a multi-branch statement is used to select the appropriate action based on the current token ~~popped~~read ~~off~~from the input ~~stack~~stream. Once the token has been identified, action is taken based on the state being encoded. There are two different fundamental actions which may be taken based on the token in question: * '''Shift''' - Encoded as a function call, effectively jumping to a new automaton state. Line 35: The following is a [[Scala (programming language)\|Scala]] implementation of a recursive ascent parser based on the above grammar: <~~source~~syntaxhighlight lang="scala"> object ExprParser { private type Result = (NonTerminal, Int) Line 302: } } </syntaxhighlight> ~~</source>~~ The following is a [[Prolog]] implementation of a recursive ascent parser based on the above grammar: <syntaxhighlight lang="prolog"> state(S), [S] --> [S]. state(S0, S), [S] --> [S0]. /* 0. S --> E$ 1. E --> E + T 2. E --> E - T 3. E --> T 4. T --> (E) 5. T --> N 6. N --> 0 7. N --> 1 / accept --> state(s([], [e(_)])). r(1) --> state(s(Ts, [t(A1), '+', e(A0)\|Ss]), s(Ts, [e(A0+A1)\|Ss])). r(2) --> state(s(Ts, [t(A1), '-', e(A0)\|Ss]), s(Ts, [e(A0-A1)\|Ss])). r(3) --> state(s(Ts, [t(A)\|Ss]), s(Ts, [e(A)\|Ss])). r(4) --> state(s(Ts, [')', e(A), '('\|Ss]), s(Ts, [t(A)\|Ss])). r(5) --> state(s(Ts, [n(A)\|Ss]), s(Ts, [t(A)\|Ss])). r(6) --> state(s(Ts, ['0'\|Ss]), s(Ts, [n(0)\|Ss])). r(7) --> state(s(Ts, ['1'\|Ss]), s(Ts, [n(1)\|Ss])). t(T) --> state(s([T\|Ts], Ss), s(Ts, [T\|Ss])). / S --> .E$ E --> .E + T E --> .E - T E --> .T T --> .(E) T --> .N N --> .0 N --> .1 / s0 --> t('('), s3, s2, s1. s0 --> t('0'), s11, s10, s2, s1. s0 --> t('1'), s12, s10, s2, s1. / S --> E.$ E --> E. + T E --> E. - T / s1 --> accept. s1 --> t('+'), s7, s1. s1 --> t('-'), s8, s1. / E --> T. / s2 --> r(3). / T --> (.E) E --> .E + T E --> .E - T E --> .T T --> .(E) T --> .N N --> .0 N --> .1 / s3 --> t('('), s3, s2, s4. s3 --> t('0'), s11, s10, s2, s4. s3 --> t('1'), s12, s10, s2, s4. / T --> (E.) E --> E .+ T E --> E .- T / s4 --> t(')'), s9. s4 --> t('+'), s7, s4. s4 --> t('-'), s8, s4. / E --> E + T. / s5 --> r(1). / E --> E - T. / s6 --> r(2). / E --> E + .T T --> .(E) T --> .N N --> .0 N --> .1 / s7 --> t('('), s3, s5. s7 --> t('0'), s11, s10, s5. s7 --> t('1'), s12, s10, s5. / E --> E - .T T --> .(E) T --> .N N --> .0 N --> .1 / s8 --> t('('), s3, s6. s8 --> t('0'), s11, s10, s6. s8 --> t('1'), s12, s10, s6. / T --> (E). / s9 --> r(4). / T --> N. / s10 --> r(5). / N --> '0'. / s11 --> r(6). / N --> '1'. / s12 --> r(7). parser(Cs, T) :- length(Cs, _), phrase(s0, [s(Cs, [])], [s([], [e(T)])]). % state(S0, S), [S] --> [S0, S]. %?- S0 = [s("(1+1)", [])\|_], phrase(s0, S0, [S]), maplist(portray_clause, S0). </syntaxhighlight> ==See also== [[Recursive descent parser]] * [[Recursive ascent/descent parser]] == References == {{reflist}} {{Parsers}} [[Category:Parsing algorithms]]