A. Tiny® Parser: Study the pages:
Tokens (the lexical level): In order to keep things simple, in this and in the next two recitations, all tokens are just single non-whitespace characters, so the scanner is particularly simple: just fetch the next non-whitespace character. This means that identifiers will be a single letter, and the only constants will be a single digit. These restrictions are not as great as you might think, and you can still write complex programs in this language One hard part in writing and debugging a RD parser is in deciding when to fetch the next token (scan). It is tricky to decide when to scan, and the parser doesn't work at all if there is an extra scan or a missing scan. In general, we do an initial scan, and then scan immediately after we have made a decision based on the token scanned and are done with that token. Hints for writing the parser: Notice that the required parser includes much of the example parser for arithmetic expressions as a special case, so that you can just use this earlier parser for these parts. Perhaps the hardest part for students is to decide how to handle alternatives on the right side of a rule. Each such choice must be based on the next input token. For example, the grammar rule for F has three alternatives, but these are easily distinguished because the first starts with a '(', the second with a plus or a minus, the third is a lower-case letter, and the fourth is a digit. At this point, any other token on input is an error. How about the alternatives in the rule for S ? Here the single rule itself does not show what next token to look for. However, following rules show this, so that the alternative A is chosen in case the next token is a lower-case letter, the alternative W is chosen in case the next token is '{', and so forth. The alternatives on S of P and C pose an additional problem, since in both cases the next token is a <. Here you could in effect introduce an extra non-terminal and an extra grammar rule: PorC −−−> P | C. The choice between these two is then based on the next token after a <, that is, an upper-case letter in case of C, and either a lower-cases letter, a digit, or a left paren in case of P. The part of the grammar involving S { S } means "one or statements". You could easily alter the syntax of your version of the language to match { S } instead, that is, "zero or more statements". In handling such a construct, you need to know how to stop calling the function S(), the one that handles a statement. There are two ways: you can keep calling S() as long as the next terminal is one of the five kinds of tokens that can start a statement. The other (equivalent) method is to keep calling S() until the next token is the proper one to end the sequence of statements: '$' for the whole program, ':' or ']' for an if-then or an if-then-else, and '}' for a while. A pure parser will input a sentence and just say whether the sentence was legal or not, without any other output. However, this parser is actually carrying out an implicit complete traversal of the parse tree by the function calls and returns (as will be illustrated in class). As with my examples you should use extra temporary output illustrating the calls and returns, so you can have confidence that your parser is working correctly. Your own temporary output can be simpler than what I used.
Sample inputs to your parser: You should start with very simple input as you are developing your parser -- perhaps just a single assignment statement at first. Then you might try completing the parser for assignment statements (non-terminal A) and for the two kinds of output statements (non-terminals P and C). With these implemented you can use the simple program in part B below as test input. Call this Sample Input 0. (I checked all the sample inputs below with my program, and they seem to have correct syntax.) Here is an artificial "program" just to test features of your parser (it's not supposed to do anything sensible):
Here is a more complicated sample source to use as the input to your parser:
Finally here is a much more complicated sample source to use as the input to your parser. It includes while loops nested three deep.
This program produces the first m Fibonacci numbers along with their prime factorizations. Here is roughly what your parser output might look like with the above input (using a slightly different parser): Parser output. What to Hand In: You should submit whatever sample inputs you managed to parse. If you have troubles with some of the samples presented here, you may submit some of your own, perhaps inputs that exercise the features you managed to implement. B. MIPS Example: Study the pages:
You should partly follow patterns of the example programs and code in the three links above. It is also permissible to use the MIPS registers more heavily. What to Hand In: You should submit for part B of Recitation 6: Revision date: 2013-09-27. (Use ISO 8601, an International Standard.) |