Parsing expression grammar: Difference between revisions

 
<syntaxhighlight lang="peg">
start ← MatchedAB !.
MatchedAB ← ('a' MatchedAB 'b')?
</syntaxhighlight>
 
Here <code>MatchedAB !.</code> is the starting expression. The <code>!.</code> part enforces that the input ends after the <code>MatchedAB</code>, by saying “there is no next character”. Regular expressions need the special anchors <code>$</code> or <code>\Z</code> for this, whereas parsing expressions can express the end of input using only the basic primitives.
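
The grammar above can be sketched as a hand-written recognizer. This is only an illustrative translation into Python, not part of any PEG library: the function names <code>match_ab</code> and <code>start</code> are hypothetical, and each function returns or checks an input position in place of building a parse tree.

```python
def match_ab(text: str, pos: int = 0) -> int:
    """MatchedAB <- ('a' MatchedAB 'b')?  Returns the position after the match."""
    # Try the parenthesised sequence 'a' MatchedAB 'b'.
    if pos < len(text) and text[pos] == "a":
        inner = match_ab(text, pos + 1)  # cannot fail, because the rule is optional
        if inner < len(text) and text[inner] == "b":
            return inner + 1
    # The '?' operator: if the sequence failed, succeed without consuming input.
    return pos


def start(text: str) -> bool:
    """start <- MatchedAB !."""
    end = match_ab(text)
    # '!.' succeeds only when no character remains at the current position.
    return end == len(text)
```

Under these assumptions, <code>start("aabb")</code> is true, while <code>start("aab")</code> is false: the outer sequence fails for lack of a second <code>b</code>, so <code>MatchedAB</code> matches the empty string and <code>!.</code> then fails on the remaining input.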
 
The <code>*</code>, <code>+</code>, and <code>?</code> of parsing expressions are similar to those in regular expressions, but a difference is that they operate strictly in a greedy mode: once a repetition has consumed input, it never gives any of it back to let a later part of the expression succeed. This is ultimately due to <code>/</code> being an ordered choice. A consequence is that something can match as a regular expression but not as a parsing expression:
: <code>[oa]+[ab][oa]+[b][ob]+[c]</code>
is both a valid regular expression and a valid parsing expression. As a regular expression, it matches
: <code>ooaooboooc</code>
with <code>[ab]</code> against the <code>a</code> and <code>[b]</code> against the <code>b</code>, but as a parsing expression it does not match, because the first <code>[oa]+</code> matches <code>ooaoo</code>, <code>[ab]</code> matches the <code>b</code>, and the second <code>[oa]+</code> matches <code>ooo</code>, causing <code>[b]</code> to fail against the <code>c</code>.
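
The difference can be reproduced with a short sketch. The regular-expression side uses Python's standard <code>re</code> module; the parsing-expression side is a hypothetical helper, <code>peg_match</code>, that handles only a sequence of character classes with an optional <code>+</code>, which is all this example needs. It is not a general PEG implementation.

```python
import re

PATTERN = r"[oa]+[ab][oa]+[b][ob]+[c]"

# The same expression as (character class, has '+') pairs for the sketch below.
ITEMS = [("oa", True), ("ab", False), ("oa", True),
         ("b", False), ("ob", True), ("c", False)]


def peg_match(items, text):
    """Match the items in sequence, greedily and without backtracking.

    Returns the end position on success, or None on failure.
    """
    pos = 0
    for chars, plus in items:
        # Every item must match at least one character.
        if pos >= len(text) or text[pos] not in chars:
            return None
        pos += 1
        if plus:
            # Greedy repetition: consume as much as possible, never give back.
            while pos < len(text) and text[pos] in chars:
                pos += 1
    return pos


regex_result = re.fullmatch(PATTERN, "ooaooboooc")  # backtracking finds a match
peg_result = peg_match(ITEMS, "ooaooboooc")         # greedy matching fails
```

Here <code>regex_result</code> is a match object, because the regular-expression engine backtracks until the first <code>[oa]+</code> takes only <code>oo</code>, while <code>peg_result</code> is <code>None</code>.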
 
=== Compared to context-free grammars ===