Revision as of 12:12, 5 January 2024 edit 88.129.117.158 (talk) →Indirect left recursion: Practical significance ← Previous edit		Revision as of 13:50, 13 January 2024 edit undo 37.197.59.228 (talk) →Compared to regular expressions: Simplified example. Elaborated point about nondeterminism. Next edit →
Line 177: The <code>*</code>, <code>+</code>, and <code>?</code> of parsing expressions are similar to those in regular expressions, but a difference is that these operate strictly in a greedy mode. This is ultimately due to <code>/</code> being an ordered choice. A consequence is that something can match as a regular expression which does not match as parsing expression: : <code>~~[oa]+~~[ab]?[~~oa]+[b][ob~~bc]+[ccd]</code> is both a valid regular expression and a valid parsing expression. As regular expression, it matches <code>bc</code>, but as parsing expression it does not match, because the <code>[ab]?</code> will match the <code>b</code>, then <code>[bc]</code> will match the <code>c</code>, leaving nothing for the <code>[cd]</code>, so at that point matching the sequence fails. "Trying again" with having <code>[ab]?</code> match the empty string is explicitly against the semantics of parsing expressions; this is not an edge case of a particular matching algorithm, instead it is the sought behaviour. ~~is both a valid regular expression and a valid parsing expression. As regular expression, it matches~~ ~~: <code>ooaooboooc</code>~~ Even regular expressions that depend on nondeterminism ''can'' be compiled into a parsing expression grammar, by having a separate nonterminal for every state of the corresponding [[deterministic finite automaton\|DFA]] and encoding its transition function into the definitions of these nonterminals — with <code>[ab]</code> against the <code>a</code> and <code>[b]</code> against the <code>b</code>, but as parsing expression it does not match, because <code>[oa]+</code> matches <code>ooaoo</code>, <code>[ab]</code> matches the <code>b</code>, second <code>[oa]+</code> matches <code>ooo</code>, causing <code>[b]</code> to fail against the <code>c</code>. <syntaxhighlight lang="peg"> A ← 'x' B / 'y' C </syntaxhighlight> is effectively saying "from state A transition to state B if the next character is x, but to state C if the next character is y" — but this works because nondeterminism can be eliminated within the realm of regular languages. It would not make use of the parsing expression variants of the repetition operations. === Compared to context-free grammars ===

Parsing expression grammar: Difference between revisions