Content deleted Content added
→Indirect left recursion: Practical significance |
→Compared to regular expressions: Simplified example. Elaborated point about nondeterminism. |
||
Line 177:
The <code>*</code>, <code>+</code>, and <code>?</code> of parsing expressions are similar to those in regular expressions, but a difference is that these operate strictly in a greedy mode. This is ultimately due to <code>/</code> being an ordered choice. A consequence is that something can match as a regular expression which does not match as parsing expression:
: <code>
is both a valid regular expression and a valid parsing expression. As regular expression, it matches <code>bc</code>, but as parsing expression it does not match, because the <code>[ab]?</code> will match the <code>b</code>, then <code>[bc]</code> will match the <code>c</code>, leaving nothing for the <code>[cd]</code>, so at that point matching the sequence fails. "Trying again" with having <code>[ab]?</code> match the empty string is explicitly against the semantics of parsing expressions; this is not an edge case of a particular matching algorithm, instead it is the sought behaviour.
Even regular expressions that depend on nondeterminism ''can'' be compiled into a parsing expression grammar, by having a separate nonterminal for every state of the corresponding [[deterministic finite automaton|DFA]] and encoding its transition function into the definitions of these nonterminals —
<syntaxhighlight lang="peg">
A ← 'x' B / 'y' C
</syntaxhighlight>
is effectively saying "from state A transition to state B if the next character is x, but to state C if the next character is y" — but this works because nondeterminism can be eliminated within the realm of regular languages. It would not make use of the parsing expression variants of the repetition operations.
=== Compared to context-free grammars ===
|