Roman R. Redziejowski

REDZIEJOWSKI, Roman R., 2007. Parsing Expression Grammar as a Primitive Recursive-Descent Parser with Backtracking. Fundamenta Informaticae. 1 January 2007. Vol. 79, no. 3–4, p. 513–524. Two recent developments in the field of formal languages are Parsing Expression Grammar (PEG) and packrat parsing. The PEG formalism is similar to BNF, but defines syntax in terms of recognizing strings, rather than constructing them. It is, in fact, precise specification of a backtracking recursive-descent parser. Packrat parsing is a general method to handle backtracking in recursive-descent parsers. It ensures linear working time, at a huge memory cost. This paper reports an experiment that consisted of defining the syntax of Java 1.5 in PEG formalism, and literally transcribing the PEG definitions into parsing procedures (accidentally, also in Java). The resulting primitive parser shows an acceptable behavior, indicating that packrat parsing might be an overkill for practical languages. The exercise with defining the Java syntax suggests that more work is needed on PEG as a language specification tool.

The structure of a recursive-descent parser follows closely a grammar defined in Backus-Naur Form (BNF) or Extended BNF (EBNF). Each procedure is associated with one symbol of the grammar and attempts to recognize an input string corresponding to that symbol. It either reports ”success” and consumes the string, or reports ”failure”.

[7] Ford, B. Parsing expression grammars: A recognition-based syntactic foundation. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2004 (Venice, Italy, 14–16 January 2004), N. D. Jones and X. Leroy, Eds., ACM, pp. 111–122.

Parsing Expression Grammar (PEG) encodes a recursive-descent parser with limited backtracking. It has been recently noticed that in the situation when the parser is to explore several alternatives one after another, no further alternatives need to be explored after the parser reached certain ”cut point”.