Content deleted Content added
→Compared to regular expressions: Nondeterministic example |
→Indirect left recursion: Practical significance |
||
Line 220:
=== Indirect left recursion ===
A PEG is called ''well-formed''<ref name="For04"/> if it contains no ''[[left recursion|left-recursive]]'' rules, i.e., rules that allow a nonterminal to expand to an expression in which the same nonterminal occurs as the leftmost symbol. For a left-to-right top-down parser, such rules cause infinite regress: parsing will continually expand the same nonterminal without moving forward in the string. Therefore, to allow packrat parsing, left recursion must be eliminated.
==== Practical significance ====
Therefore, to allow packrat parsing, left recursion must be eliminated. For example, in the arithmetic grammar above, it could seem tempting to express operator precedence as a matter of ordered choice — <code>Sum / Product / Value</code> would mean first try viewing as <code>Sum</code> (since we parse top–down), second try viewing as <code>Product</code>, and only third try viewing as <code>Value</code> — rather than via nesting of definitions. This (non-well-formed) grammar seeks to keep precedence order only in one line:▼
Direct recursion, be that left or right, is important in context-free grammars, because there recursion is the only way to describe repetition:
<syntaxhighlight lang="peg">
Sum → Term | Sum '+' Term | Sum '-' Term
Args → Arg | Arg ',' Args
</syntaxhighlight>
People trained in using context-free grammars often come to PEGs expecting to use the same idioms, but parsing expressions can do repetition without recursion:
<syntaxhighlight lang="peg">
Sum ← Term ( '+' Term / '-' Term )*
Args → Arg ( ',' Arg )*
</syntaxhighlight>
A difference lies in the [[abstract syntax tree]]s generated: with recursion each <code>Sum</code> or <code>Args</code> can have at most two children, but with repetition there can be arbitrarily many. If later stages of processing require that such lists of children are recast as trees with bounded [[degree (graph theory)|degree]], for example microprocessor instructions for addition typically only allow two operands, then properties such as [[left-associative|left-associativity]] would be imposed after the PEG-directed parsing stage.
Therefore left-recursion is practically less likely to trouble a PEG packrat parser than, say, an LL(k) context-free parser, unless one insists on using context-free idioms. However, not all cases of recursion are about repetition.
==== Non-repetition left-recursion ====
▲
<syntaxhighlight lang="peg">
|