Parsing expression grammar: Difference between revisions

Content deleted Content added
m Reverted edit by 2600:1006:B1A4:8F61:6E8A:2DB7:CAEC:C00E (talk) to last version by 209.94.142.169
OAbot (talk | contribs)
m Open access bot: url-access updated in citation with #oabot.
 
(7 intermediate revisions by 6 users not shown)
Line 33:
A '''parsing expression grammar''' is a collection of named parsing expressions, which may reference each other. The effect of one such reference in a parsing expression is as if the whole referenced parsing expression was given in place of the reference. A parsing expression grammar also has a designated '''starting expression'''; a string matches the grammar if it matches its starting expression.
 
An element of a string matched is called a ''[[terminal symbol]]'', or '''terminal''' for short. Likewise the names assigned to parsing expressions are called ''[[nonterminal symbol]]s'', or '''nonterminals''' for short. These terms would be descriptive for [[Chomsky hierarchy|generative grammars]], but in the case of parsing expression grammars they are merely terminology, kept mostly because of being near ubiquousubiquitous in discussions of [[parsing]] algorithms.
 
=== Syntax ===
Both ''abstract'' and ''concrete'' syntaxes of parsing expressions are seen in the literature, and in this article. The abstract syntax is essentially a [[expression (mathematics)|mathematical formula]] and primarily used in theoretical contexts, whereas concrete syntax parsing expressions could be used directly to control a [[parser]]. The primary concrete syntax is that defined by Ford,<ref name="For04"/>{{rp|Fig.1}} although many tools have their own dialect of this. Other tools<ref>{{cite web |last1=Sirthias |first1=Mathias |title=Parboiled: Rule Construction in Java |website=[[GitHub]] |url=https://github.com/sirthias/parboiled/wiki/Rule-Construction-in-Java |access-date=13 January 2024}}</ref> can be closer to using a programming-language native encoding of abstract syntax parsing expressions as their concrete syntax.
 
==== Atomic parsing expressions ====
Line 187:
|class=cs.PL
| eprint = 2005.06444
}}</ref> uses dynamic programming to apply PEG rules bottom-up and right to left, which is the inverse of the normal recursive descent order of top-down, left to right. Parsing in reverse order solves the left recursion problem, allowing left-recursive rules to be used directly in the grammar without being rewritten into non-left-recursive form, and also conveysconfers optimal error recovery capabilities upon the parser, which historically proved difficult to achieve for recursive descent parsers.
 
== Advantages ==
Line 386:
 
== Practical use ==
* [[Python (programming language)|Python]] reference implementation [[CPython]] introduced a PEG parser in version 3.9 as an alternative to the [[LL parser|LL(1) parser]] and uses just PEG from version 3.10.<ref>{{Cite web |title=PEP 617 – New PEG parser for CPython {{!}} peps.python.org |url=https://peps.python.org/pep-0617/ |access-date=2023-01-16 |website=peps.python.org}}</ref>
 
* The [[Jq_(programming_language)#Parsing_Expression_Grammars|jq programming language]] uses a formalism closely related to PEG.
 
* The [[Lua (programming language)|Lua]] authors created [https://www.inf.puc-rio.br/~roberto/lpeg/ LPeg], a [[pattern-matching]] library that uses PEG instead of [[regular expressions]],<ref>{{cite journal |last1=Ierusalimschy |first1=Roberto |title=A text pattern-matching tool based on Parsing Expression Grammars |journal=Software: Practice and Experience |date=10 March 2009 |volume=39 |issue=3 |pages=221–258 |doi=10.1002/spe.892 |url=https://onlinelibrary.wiley.com/doi/10.1002/spe.892 |language=en |issn=0038-0644|url-access=subscription }}</ref> as well as the [https://www.inf.puc-rio.br/~roberto/lpeg/re.html re] module which implements a regular-expression-like syntax utilizing the LPeg library.<ref>{{cite web |last1=Ierusalimschy |first1=Roberto |title=LPeg.re - Regex syntax for LPEG |url=https://www.inf.puc-rio.br/~roberto/lpeg/re.html |website=www.inf.puc-rio.br}}</ref>
 
== See also ==