Parsing expression grammar: Difference between revisions

Content deleted Content added
Advantages: Added example of non-context-free parsing expression language.
Advantages: Made subsections
Line 153:
 
== Advantages ==
=== Compared to regular expressions ===
Compared to pure [[regular expressions]] (i.e., describing a language recognisable using a [[finite automaton]]), PEGs are vastly more powerful. In particular they can handle unbounded recursion, and so match parentheses down to an arbitrary nesting depth; regular expressions can at best keep track of nesting down to some fixed depth, because a finite automaton (having a finite set of internal states) can only distinguish finitely many different nesting depths. In more theoretical terms, <math> \{a^n b^n\}_{n \geqslant 0} </math> (the language of all strings of zero or more <math>a</math>'s, followed by an ''equal number'' of <math>b</math>s) is not a regular language, but it is easily seen to be a parsing expression language, matched by the grammar
 
Line 163 ⟶ 164:
 
The <code>*</code>, <code>+</code>, and <code>?</code> of parsing expressions are similar to those in regular expressions, but a difference is that these operate strictly in a greedy mode. This is ultimately due to <code>/</code> being an ordered choice.
 
=== Compared to context-free grammars ===
 
In the strict formal sense, PEGs are likely incomparable to [[context-free grammar]]s (CFGs), but practically there are many things that PEGs can do which pure CFGs cannot, whereas it is difficult to come up with examples of the contrary. In particular PEGs can be crafted to natively resolve ambiguities, such as the "[[dangling else]]" problem in C, C++, and Java, whereas CFG-based parsing often needs a rule outside of the grammar to resolve them. Moreover any PEG can be parsed in linear time by using a packrat parser, as described above, whereas parsing according to a general CFG is asymptotically equivalent<ref>{{cite journal |last1=Lee |first1=Lillian |title=Fast Context-free Grammar Parsing Requires Fast Boolean Matrix Multiplication |journal=J. ACM |date=January 2002 |volume=49 |issue=1 |page=1–15 |doi=10.1145/505241.505242}}</ref> to [[logical matrix|boolean matrix]] multiplication (thus likely between quadratic and cubic).