Simple LR parser: Difference between revisions

{{Short description|Computer mechanic}}
{{more citations needed|date=November 2024}}
In [[computer science]], a '''Simple LR''' or '''SLR parser''' is a type of [[LR parser]] with small [[LR parser#Constructing LR(0) parsing tables|parse table]]s and a relatively simple parser generator algorithm. As with other types of LR(1) parser, an SLR parser is quite efficient at finding the single correct [[bottom-up parsing|bottom-up parse]] in a single left-to-right scan over the input stream, without guesswork or backtracking. The parser is mechanically generated from a formal grammar for the language.
 
SLR and the more general methods [[LALR parser]] and [[Canonical LR parser]] use identical methods and similar tables at parse time; they differ only in the mathematical grammar analysis algorithms used by the parser generator tool. SLR and LALR generators create tables of identical size and identical parser states. SLR generators accept fewer grammars than do LALR generators like [[yacc]] and [[GNU bison|Bison]].{{citation needed|date=November 2024}} Many computer languages do not readily fit the restrictions of SLR as they stand. Bending the language's natural grammar into [[SLR grammar]] form requires more compromises and grammar hackery. So LALR generators have become much more widely used than SLR generators, despite being somewhat more complicated tools. SLR methods remain a useful learning step in college classes on compiler theory.{{citation needed|date=November 2024}}
 
SLR and LALR were both developed by [[Frank DeRemer]] as the first practical uses of [[Donald Knuth]]'s LR parser theory.<ref>{{Cite web| title=Introduction to Computational Linguistics - LR Parsers | url=https://wiki.eecs.yorku.ca/course_archive/2013-14/W/6339/_media/lrk.pdf | archive-url=https://web.archive.org/web/20210415164259/https://wiki.eecs.yorku.ca/course_archive/2013-14/W/6339/_media/lrk.pdf | archive-date=2021-04-15}}</ref><ref>{{Cite web| title=Introduction to LR-Parsing | url=https://www.seas.upenn.edu/~cis5110/notes/cis511-sl9.pdf | archive-url=https://web.archive.org/web/20240629070724/https://www.seas.upenn.edu/~cis5110/notes/cis511-sl9.pdf | archive-date=2024-06-29}}</ref> The tables created for real grammars by full LR methods were impractically large, larger than most computer memories of that decade, with a hundred or more times as many parser states as the SLR and LALR methods.<ref>{{cite book | url=https://books.google.com/books?id=nEA9AAAAIAAJ&pg=PA87 | title=LR Parsing: Theory and Practice | isbn=978-0-521-30413-9 | last1=Chapman | first1=Nigel P. | date=17 December 1987 | publisher=CUP Archive}}</ref>
 
== Lookahead sets ==
To understand the differences between SLR and LALR, it is important to understand their many similarities and how they both make shift-reduce decisions. (See the article [[LR parser]] for that background, up through the section on reductions' '''lookahead sets'''.)
 
The one difference between SLR and LALR is how their generators calculate the '''lookahead sets''' of input symbols that should appear next, whenever some completed [[Formal grammar#The syntax of grammars|production rule]] is found and reduced.
 
SLR generators calculate that lookahead by an easy approximation method based directly on the grammar, ignoring the details of individual parser states and transitions, and therefore the particular context of the current parser state. If some nonterminal symbol ''S'' is used in several places in the grammar, SLR treats those places in the same single way rather than handling them individually. The SLR generator works out <code>Follow(S)</code>, the set of all terminal symbols which can immediately follow some occurrence of ''S''. In the parse table, each reduction to ''S'' uses <code>Follow(S)</code> as its LR(1) lookahead set. Such follow sets are also used by generators for LL top-down parsers. A grammar that has no shift/reduce or reduce/reduce conflicts when using follow sets is called an [[SLR grammar]].{{citation needed|date=November 2024}}
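The follow-set calculation can be sketched in a few lines. The sketch below is only an illustration, not the algorithm of any particular generator: the `rules` encoding and the function name `followSets` are hypothetical, the grammar is the three-rule one used in the Example section (S → E, E → 1 E, E → 1, as read off its item sets), and the code assumes that no rule has an empty right-hand side. It computes follow sets for nonterminals only.

```javascript
// Hypothetical grammar encoding; rule 0 is the start rule and "$" marks end of input.
const rules = [
  { lhs: "S", rhs: ["E"] },
  { lhs: "E", rhs: ["1", "E"] },
  { lhs: "E", rhs: ["1"] },
];

// Assumption: no rule has an empty right-hand side, so FIRST of a rule
// depends only on its first symbol.
function followSets(rules, start) {
  const nts = new Set(rules.map((r) => r.lhs));
  const first = new Map([...nts].map((n) => [n, new Set()]));
  const follow = new Map([...nts].map((n) => [n, new Set()]));
  follow.get(start).add("$");

  // Adds every item to target; reports whether target grew.
  const addAll = (target, items) => {
    let grew = false;
    for (const t of [...items]) {
      if (!target.has(t)) { target.add(t); grew = true; }
    }
    return grew;
  };

  let changed = true;
  while (changed) { // FIRST sets, iterated to a fixed point
    changed = false;
    for (const { lhs, rhs } of rules) {
      const src = nts.has(rhs[0]) ? first.get(rhs[0]) : [rhs[0]];
      if (addAll(first.get(lhs), src)) changed = true;
    }
  }

  changed = true;
  while (changed) { // FOLLOW sets, iterated to a fixed point
    changed = false;
    for (const { lhs, rhs } of rules) {
      rhs.forEach((sym, i) => {
        if (!nts.has(sym)) return;
        const next = rhs[i + 1];
        const src = next === undefined ? follow.get(lhs) // sym ends the rule
          : nts.has(next) ? first.get(next)
          : [next];
        if (addAll(follow.get(sym), src)) changed = true;
      });
    }
  }
  return follow;
}

const follow = followSets(rules, "S");
console.log([...follow.get("S")], [...follow.get("E")]);
```

For this grammar both sets contain only the end marker, which agrees with the follow-set table in the Example section.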
 
LALR generators calculate lookahead sets by a more precise method based on exploring the graph of parser states and their transitions. This method considers the particular context of the current parser state. It customizes the handling of each grammar occurrence of some nonterminal ''S''. See the article [[LALR parser]] for further details of this calculation. The lookahead sets calculated by LALR generators are subsets of (and hence more precise than) the approximate sets calculated by SLR generators. If a grammar has table conflicts when using SLR follow sets, but is conflict-free when using LALR lookahead sets, it is called an LALR grammar.{{citation needed|date=November 2024}}
 
== Example ==
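The item sets below are those of the following grammar, with the rules numbered as in the tables:

: (0) S → E
: (1) E → 1 E
: (2) E → 1
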
Constructing the action and goto table as is done for LR(0) parsers would give the following item sets and tables:
 
;Item set 0
: S → • E
: + E → • 1 E
: + E → • 1
 
;Item set 1
: E → 1 • E
: E → 1 •
: + E → • 1 E
: + E → • 1
 
;Item set 2
: S → E •
 
;Item set 3
: E → 1 E •
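The item sets above can be reproduced with the usual closure and goto operations. The following is a minimal sketch under stated assumptions: the rule list is the grammar read off the item sets, and the names `closure`, `gotoSet` and `show`, as well as the `[ruleIndex, dotPosition]` item encoding, are hypothetical choices made for this illustration.

```javascript
// Grammar read off the item sets: (0) S → E, (1) E → 1 E, (2) E → 1.
const rules = [
  { lhs: "S", rhs: ["E"] },
  { lhs: "E", rhs: ["1", "E"] },
  { lhs: "E", rhs: ["1"] },
];

// An item is [ruleIndex, dotPosition] (hypothetical encoding).
function closure(items) {
  const result = [...items];
  for (let i = 0; i < result.length; i++) { // result grows while it is scanned
    const [r, dot] = result[i];
    const sym = rules[r].rhs[dot]; // symbol after the dot, or undefined at the end
    rules.forEach((rule, k) => {
      const present = result.some(([a, b]) => a === k && b === 0);
      if (rule.lhs === sym && !present) result.push([k, 0]);
    });
  }
  return result;
}

// Moves the dot over `symbol` in every item that allows it, then closes the set.
function gotoSet(items, symbol) {
  const moved = items
    .filter(([r, dot]) => rules[r].rhs[dot] === symbol)
    .map(([r, dot]) => [r, dot + 1]);
  return closure(moved);
}

// Renders items in the notation used above, for example "E → 1 • E".
function show(items) {
  return items.map(([r, dot]) => {
    const rhs = [...rules[r].rhs];
    rhs.splice(dot, 0, "•");
    return `${rules[r].lhs} → ${rhs.join(" ")}`;
  });
}

const set0 = closure([[0, 0]]);
const set1 = gotoSet(set0, "1");
const set2 = gotoSet(set0, "E");
const set3 = gotoSet(set1, "E");
console.log(show(set0), show(set1), show(set2), show(set3));
```

In this sketch `gotoSet(set1, "1")` yields item set 1 again, which corresponds to the shift from state 1 back to state 1 on the terminal 1.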
 
{| class="wikitable"
|- align="center"
! || colspan="2" |''action''||''goto''
|- align="center"
! ''state''
!1||$||E
|- align="center"
!0
|s1|| ||2
|- align="center"
!1
|s1/r2||r2||3
|- align="center"
!2
| ||acc||
|- align="center"
!3
|r1||r1||
|}
 
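In state 1, the column for the terminal 1 holds both a shift and a reduce (s1/r2), which is a shift-reduce conflict. The follow sets of the grammar symbols are used to resolve it:
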
{| class="wikitable"
|- align="center"
! symbol
|S||E||1
|- align="center"
! following
|$||$||1,$
|}
 
A reduce action needs to be added to a particular action column only if that column's terminal is in the follow set of the left-hand side of the rule being reduced. This algorithm describes whether a reduce action must be added to an action column:
 
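 // Returns true if reduceAction belongs in the column of the terminal `action`.
 // `rules` and `followSet` are assumed to be supplied by the parser generator.
 function mustBeAdded(reduceAction, action) {
     const ruleNumber = reduceAction.value;              // e.g. 2 for r2
     const ruleSymbol = rules[ruleNumber].leftHandSide;  // e.g. "E"
     return followSet(ruleSymbol).includes(action);      // followSet returns an array of terminals
 }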
 
For example, {{code|mustBeAdded(r2, "1")}} is false, because the left-hand side of rule 2 is "E", and "1" is not in E's follow set.
Conversely, {{code|mustBeAdded(r2, "$")}} is true, because "$" is in E's follow set.
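To try the check on the example, the function can be given concrete data. This is a sketch only: the `rules` array, the `followSets` object, the `followSet` helper and the shape of the `r1` and `r2` action objects are assumptions made for illustration, filled in from the rule numbers and follow sets of the example.

```javascript
// Hypothetical data: left-hand sides by rule number, and follow sets from the example.
const rules = [
  { leftHandSide: "S" }, // rule 0
  { leftHandSide: "E" }, // rule 1
  { leftHandSide: "E" }, // rule 2
];
const followSets = { S: ["$"], E: ["$"] };
const followSet = (symbol) => followSets[symbol];

function mustBeAdded(reduceAction, action) {
  const ruleSymbol = rules[reduceAction.value].leftHandSide;
  return followSet(ruleSymbol).includes(action);
}

// Assumed representation of the reduce actions r1 and r2.
const r1 = { type: "reduce", value: 1 };
const r2 = { type: "reduce", value: 2 };

// Reduce entries of the LR(0) table: state 1 reduces by rule 2, state 3 by rule 1.
const reduces = { 1: r2, 3: r1 };

// For each state, keep only the terminal columns in which the reduce survives.
const kept = {};
for (const [state, reduce] of Object.entries(reduces)) {
  kept[state] = ["1", "$"].filter((terminal) => mustBeAdded(reduce, terminal));
}

console.log(mustBeAdded(r2, "1"), mustBeAdded(r2, "$"));
console.log(kept);
```

With these inputs the reduce in state 1 and the reduce in state 3 each survive only in the "$" column.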
 
Applying {{code|mustBeAdded}} to each reduce action in the action table yields a conflict-free action table:
{| class="wikitable"
|- align="center"
! || colspan="2" |''action''||''goto''
|- align="center"
! ''state''
!1||$||E
|- align="center"
!0
|s1|| ||2
|- align="center"
!1
|s1||r2||3
|- align="center"
!2
| ||acc||
|- align="center"
!3
| ||r1||
|}
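The conflict-free table can drive a standard shift-reduce loop. The sketch below is illustrative rather than a definitive implementation: the `action` and `gotoTable` objects are transcribed from the table above, the rule lengths and left-hand sides come from the grammar read off the item sets, and the function name `parse` and its return shape are hypothetical.

```javascript
// ACTION and GOTO transcribed from the conflict-free table above.
const action = {
  0: { "1": "s1" },
  1: { "1": "s1", "$": "r2" },
  2: { "$": "acc" },
  3: { "$": "r1" },
};
const gotoTable = { 0: { E: 2 }, 1: { E: 3 } };

// Left-hand side and right-hand-side length of each rule.
const rules = [
  { lhs: "S", length: 1 }, // (0) S → E
  { lhs: "E", length: 2 }, // (1) E → 1 E
  { lhs: "E", length: 1 }, // (2) E → 1
];

// Returns whether the tokens are accepted and the rule numbers reduced, in order.
function parse(tokens) {
  const input = [...tokens, "$"];
  const stack = [0]; // stack of parser states
  const reductions = [];
  let pos = 0;
  while (true) {
    const state = stack[stack.length - 1];
    const act = (action[state] || {})[input[pos]];
    if (act === undefined) return { accepted: false, reductions }; // empty cell: syntax error
    if (act === "acc") return { accepted: true, reductions };
    const n = Number(act.slice(1));
    if (act[0] === "s") {
      stack.push(n); // shift: go to state n and consume the token
      pos++;
    } else {
      const rule = rules[n];
      stack.length -= rule.length; // pop one state per right-hand-side symbol
      const top = stack[stack.length - 1];
      stack.push(gotoTable[top][rule.lhs]);
      reductions.push(n);
    }
  }
}

const result = parse(["1", "1", "1"]);
console.log(result.accepted, result.reductions);
```

For the input 1 1 1 the loop shifts all three tokens, reduces once by rule 2 and twice by rule 1, and then accepts. An empty input is rejected, because state 0 has no entry in the "$" column.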
 
* [[LALR parser]]
* [[SLR grammar]]
 
==References==
{{reflist}}
 
{{Parsers}}
 
[[Category:Parsing algorithms]]