CS 3721 Programming Languages
|
This parser uses a recursive function corresponding to each grammar rule (that is, corresponding to each non-terminal symbol in the language). For simplicity one can just use the non-terminal as the name of the function. The body of each recursive function mirrors the right side of the corresponding rule. In order for this method to work, one must be able to decide which function to call based on the next input symbol.
Perhaps the hardest part of a recursive descent parser is the scanning: repeatedly fetching the next token from the scanner. It is tricky to decide when to scan, and the parser doesn't work at all if there is an extra scan or a missing scan.
P ---> E E ---> E + T | E - T | T T ---> T * S | T / S | S S ---> F ^ S | F F ---> ( E ) | char |
In order to eliminate left recursion, one simple method is to introduce new notation: curley brackets, where {xx} means "zero or more repetitions of xx", and parentheses () used for grouping, along with the or-symbol: |. Because of the many metasymbols, it is a good idea to enclose all terminals in single quotes. Also put a '$' at the end. The resulting grammar looks as follows:
P ---> E '$' E ---> T {('+'|'-') T} T ---> S {('*'|'/') S} S ---> F '^' S | F F ---> '(' E ')' | char |
Now the grammar is suitable for creation of a recursive descent parser. Notice that this is a different grammar that describes the same language, that is the same sentences or strings of terminal symbols. A given sentence will have a similar parse tree to one given by the previous grammar, but not necessarily the same parse tree.
One could alter the first grammar in other ways to make it work for recursive descent. For example, one could write the rule for E as:
E ---> T '+' E | T |
This eliminates the left recursion, and leaves the language the same, but it changes the semantics of the language. With this change, the operator '+' would associate from right to left, instead of from left to right, so this method is not acceptable.
The sequence of function calls that occurs during the parse essentially does a type of traversal of the parse tree, that is, a process of visiting each node of the tree. In this case many nodes are visited more than once. In your data structures course you studied traversals of three kinds: preorder, inorder, and postorder. Which kind of traversal do you think the parse gives for the parse tree?