Revision as of 16:21, 15 January 2024 edit 130.243.94.123 (talk) →Theory of parsing expression grammars: New section ← Previous edit		Revision as of 17:15, 15 January 2024 edit undo 130.243.94.123 (talk) →Theory of parsing expression grammars: Undecidability of language emptiness. Next edit →
Line 371: The class of parsing expression languages is closed under set intersection and complement, thus also under set union.<ref name="For04" />{{rp\|Sec.3.4}} === Undecidability of emptiness === In stark contrast to the case for context-free grammars, it is not possible to generate elements of a parsing expression language from its grammar. Indeed, it is algorithmically [[undecidable problem\|undecidable]] whether the language recognised by a parsing expression grammar is empty! One reason for this is that any instance of the [[Post correspondence problem]] reduces to an instance of the problem of deciding whether a parsing expression language is empty. Recall that an instance of the Post correspondence problem consists of a list <math> (\alpha_1,\beta_1), (\alpha_2,\beta_2), \dotsc, (\alpha_n,\beta_n) </math> of pairs of strings (of terminal symbols). The problem is to determine whether there exists a sequence <math>\{k_i\}_{i=1}^m</math> of indices in the range <math>\{1,\dotsc,n\}</math> such that <math> \alpha_{k_1} \alpha_{k_2} \dotsb \alpha_{k_m} = \beta_{k_1} \beta_{k_2} \dotsb \beta_{k_m} </math>. To [[reduction (complexity)\|reduce]] this to a parsing expression grammar, let <math> \gamma_0, \gamma_1, \dotsc, \gamma_n </math> be arbitrary pairwise distinct equally long strings of terminal symbols (already with <math>2</math> distinct symbols in the terminal symbol alphabet, length <math>\lceil \log_2(n+1) \rceil</math> suffices) and consider the parsing expression grammar <math display="block"> \begin{aligned} S &\leftarrow \&(A \, !.) \&(B \, !.) (\gamma_1/\dotsb/\gamma_n)^+ \gamma_0 \\ A &\leftarrow \gamma_0 / \gamma_1 A \alpha_1 / \dotsb / \gamma_n A \alpha_n \\ B &\leftarrow \gamma_0 / \gamma_1 B \beta_1 / \dotsb / \gamma_n B \beta_n \end{aligned} </math> Any string matched by the nonterminal <math>A</math> has the form <math>\gamma_{k_m} \dotsb \gamma_{k_2} \gamma_{k_1} \gamma_0 \alpha_{k_1} \alpha_{k_2} \dotsb \alpha_{k_m}</math> for some indices <math>k_1,k_2,\dotsc,k_m</math>. Likewise any string matched by the nonterminal <math>B</math> has the form <math>\gamma_{k_m} \dotsb \gamma_{k_2} \gamma_{k_1} \gamma_0 \beta_{k_1} \beta_{k_2} \dotsb \beta_{k_m}</math>. Thus any string matched by <math>S</math> will have the form <math>\gamma_{k_m} \dotsb \gamma_{k_2} \gamma_{k_1} \gamma_0 \rho</math> where <math> \rho = \alpha_{k_1} \alpha_{k_2} \dotsb \alpha_{k_m} = \beta_{k_1} \beta_{k_2} \dotsb \beta_{k_m}</math>. == Practical use ==

Parsing expression grammar: Difference between revisions