Revision as of 18:20, 25 December 2023 edit 88.129.117.158 (talk) →Indirect left recursion: Clarify intent of example ← Previous edit		Revision as of 19:37, 25 December 2023 edit undo 88.129.117.158 (talk) →Advantages: Actually write about advantages (as opposed to underhandedly warn about memory requirements, which is covered in another section) Next edit →
Line 153: == Advantages == Compared to pure [[regular expressions]] (i.e., ~~without~~describing ~~back-references~~a language recognisable using a [[finite automaton]]), PEGs are ~~strictly~~vastly more powerful,. ~~but~~In ~~require~~particular ~~significantly~~they ~~more~~can ~~memory.~~handle ~~For~~unbounded ~~example~~recursion, aand ~~regular~~so ~~expression~~match ~~inherently~~parentheses ~~cannot~~down ~~find~~to an arbitrary ~~number~~nesting ofdepth; ~~matched~~regular ~~pairs~~expressions can at best keep track of ~~parentheses,~~nesting ~~because~~down itto issome ~~not~~fixed ~~recursive~~depth, ~~but~~because a ~~PEG~~finite automaton (having a finite set of internal states) can only distinguish finitely many different nesting depths. In ~~However~~more theoretical terms, <math> \{a^n ~~PEG~~b^n\}_{n ~~will~~\geqslant ~~require~~0} an</math> ~~amount~~(the language of ~~memory~~all ~~proportional~~strings toof ~~the~~zero ~~length~~or ofmore ~~the input~~<math>a</math>'s, ~~while~~followed aby ~~regular~~an ~~expression~~''equal ~~matcher~~number'' ~~will~~of ~~require~~<math>b</math>s) ~~only~~is a ~~constant~~parsing ~~amount~~expression oflanguage, ~~memory.~~matched by the grammar <syntaxhighlight lang="peg"> ~~Any PEG can be parsed in linear time by using a packrat parser, as described above.~~ start ← Matched !. Matched ← ('a' Matched 'b')? </syntaxhighlight> Here <code>Matched !.</code> is the starting expression. The <code>!.</code> part enforces that the input ends after the <code>Matched</code>, by saying “there is no next character”; unlike regular expressions, which have magic constraints <code>$</code> or <code>\Z</code> for this, parsing expressions can express the end of input using only its basic primitives. Many CFGs contain ambiguities, even when they're intended to describe unambiguous languages. The "[[dangling else]]" problem in C, C++, and Java is one example. These problems are often resolved by applying a rule outside of the grammar. In a PEG, these ambiguities never arise, because of prioritization. The <code>*</code>, <code>+</code>, and <code>?</code> of parsing expressions are similar to those in regular expressions, but a difference is that these operate strictly in a greedy mode. This is ultimately due to <code>/</code> being an ordered choice. In the strict formal sense, PEGs are likely incomparable to [[context-free grammar]]s (CFGs), but practically there are many things that PEGs can do which pure CFGs cannot, whereas it is difficult to come up with examples of the contrary. In particular PEGs can be crafted to natively resolve ambiguities, such as the "[[dangling else]]" problem in C, C++, and Java, whereas CFG-based parsing often needs a rule outside of the grammar to resolve them. Moreover any PEG can be parsed in linear time by using a packrat parser, as described above, whereas parsing according to a general CFG is asymptotically equivalent<ref>{{cite journal \|last1=Lee \|first1=Lillian \|title=Fast Context-free Grammar Parsing Requires Fast Boolean Matrix Multiplication \|journal=J. ACM \|date=January 2002 \|volume=49 \|issue=1 \|page=1–15 \|doi=10.1145/505241.505242}}</ref> to [[logical matrix\|boolean matrix]] multiplication (thus likely between quadratic and cubic). == Disadvantages ==

Parsing expression grammar: Difference between revisions