CS 3721 Recitation 6: R-D Parsers

CS 3721
Programming Languages
Fall 2013

Recitation 6. Tiny® Parser

Week 6: Sep 30 - Oct 4

Submit following the directions at: submissions, and the rules at: rules. Deadlines are:

2013-10-11 23:59:59 (that's Fri, 11 Oct 2013, 11:59:59 pm) for full credit.
2013-10-14 23:59:59 (that's Mon, 14 Oct 2013, 11:59:59 pm) for 75% credit.

A. Tiny® Parser: Study the pages:

R-D Parsers (Parsers: "Bare", Debug, Evaluate)
Tiny® (double version)

Tiny®

Tokens (the lexical level):

all tokens are just single non-whitespace characters

single letter

single digit

One hard part of writing and debugging an R-D parser is deciding when to fetch the next token (scan). The parser doesn't work at all if there is an extra scan or a missing scan. In general, we do an initial scan, and then scan immediately after we have made a decision based on the current token and are done with that token.
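The scanning rule above can be sketched in C as follows. This is only a minimal illustration, not the required design: the names src, next, scan, expect, and parse_get are hypothetical, and the rule shown for the input statement G is an assumption based on the sample inputs (such as "> m;").

```c
#include <ctype.h>

static const char *src;  /* rest of the input (hypothetical global) */
static char next;        /* current token: one non-whitespace character */

/* Fetch the next token, skipping whitespace; '\0' means end of input. */
static void scan(void) {
    while (*src != '\0' && isspace((unsigned char)*src)) src++;
    next = *src;
    if (*src != '\0') src++;
}

/* Decide on the current token first, then scan past it. */
static int expect(char c) {
    if (next != c) return 0;
    scan();
    return 1;
}

/* G ---> '>' lower-case ';'   (assumed shape of the input statement) */
static int G(void) {
    if (!expect('>')) return 0;
    if (!islower((unsigned char)next)) return 0;
    scan();                      /* done with the letter, so scan */
    return expect(';');
}

/* Returns 1 if text is exactly one input statement, else 0. */
int parse_get(const char *text) {
    src = text;
    scan();                      /* the one initial scan */
    return G() && next == '\0';
}
```

Every function here is entered with next already holding an unexamined token and returns with next holding the first token it did not use. Keeping to that convention everywhere is one way to avoid extra or missing scans.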

Hints for writing the parser:

Perhaps the hardest part for students is deciding how to handle alternatives on the right side of a rule. Each such choice must be based on the next input token. For example, the grammar rule for F has four alternatives, but these are easily distinguished because the first starts with a '(', the second with a plus or a minus, the third is a lower-case letter, and the fourth is a digit. At this point, any other token on input is an error.
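As a sketch of how the choice in F might look in C, here is a small expression recognizer. The rules written in the comments for E, T, U, and F are assumptions inferred from the description above and from the sample inputs, so check them against the Tiny® grammar page. The names ok and parse_expr are hypothetical.

```c
#include <ctype.h>

static const char *src;
static char next;
static int ok;            /* cleared at the first syntax error */

static void scan(void) {
    while (*src != '\0' && isspace((unsigned char)*src)) src++;
    next = *src;
    if (*src != '\0') src++;
}

static void E(void);

/* F ---> '(' E ')' | ('+' | '-') F | lower-case | digit   (assumed) */
static void F(void) {
    if (next == '(') {
        scan();
        E();
        if (next == ')') scan(); else ok = 0;
    } else if (next == '+' || next == '-') {
        scan();
        F();
    } else if (islower((unsigned char)next)) {
        scan();
    } else if (isdigit((unsigned char)next)) {
        scan();
    } else {
        ok = 0;           /* any other token here is an error */
    }
}

/* U ---> F '^' U | F   (assumed) */
static void U(void) {
    F();
    if (next == '^') { scan(); U(); }
}

/* T ---> U { ('*' | '/' | '%') U }   (assumed) */
static void T(void) {
    U();
    while (next == '*' || next == '/' || next == '%') { scan(); U(); }
}

/* E ---> T { ('+' | '-') T }   (assumed) */
static void E(void) {
    T();
    while (next == '+' || next == '-') { scan(); T(); }
}

/* Returns 1 if text is a single legal expression, else 0. */
int parse_expr(const char *text) {
    src = text;
    ok = 1;
    scan();
    E();
    return ok && next == '\0';
}
```

Each branch of F looks only at next before committing, which is all the four alternatives need.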

How about the alternatives in the rule for S? Here the rule itself does not show which next token to look for. However, the rules that follow it do, so the alternative A is chosen when the next token is a lower-case letter, the alternative W is chosen when the next token is '{', and so forth.

The alternatives P and C of S pose an additional problem, since in both cases the next token is a <. Here you could in effect introduce an extra non-terminal and an extra grammar rule: PorC −−−> P | C. The choice between these two is then based on the next token after the <, that is, an upper-case letter in case of C, and either a lower-case letter, a digit, or a left paren in case of P.
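The decisions for S and for PorC can be sketched as a function that only classifies a statement by its leading tokens, without parsing the rest. The enum and the name statement_kind are hypothetical. The tokens '[' for an if and '>' for input are taken from the sample inputs. The sketch also accepts a plus or minus after '<' for P, on the assumption that an expression may begin with a signed F.

```c
#include <ctype.h>

enum kind { K_ASSIGN, K_WHILE, K_IF, K_GET, K_PUT_EXPR, K_PUT_CHAR, K_ERROR };

static const char *src;
static char next;

static void scan(void) {
    while (*src != '\0' && isspace((unsigned char)*src)) src++;
    next = *src;
    if (*src != '\0') src++;
}

/* Decide which alternative of S applies, looking only at leading tokens. */
enum kind statement_kind(const char *text) {
    src = text;
    scan();
    if (islower((unsigned char)next)) return K_ASSIGN;   /* A */
    switch (next) {
    case '{': return K_WHILE;                            /* W */
    case '[': return K_IF;                               /* if-then(-else) */
    case '>': return K_GET;                              /* input */
    case '<':
        scan();              /* PorC: the choice depends on the token after '<' */
        if (isupper((unsigned char)next)) return K_PUT_CHAR;         /* C */
        if (islower((unsigned char)next) || isdigit((unsigned char)next)
            || next == '(' || next == '+' || next == '-')
            return K_PUT_EXPR;                                       /* P */
        return K_ERROR;
    default:
        return K_ERROR;
    }
}
```

In a full parser the same tests would sit at the top of S(), with each branch calling the function for the chosen alternative instead of returning a code.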

The part of the grammar involving S { S } means "one or more statements". You could easily alter the syntax of your version of the language to match { S } instead, that is, "zero or more statements". In handling such a construct, you need to know when to stop calling the function S(), the one that handles a statement. There are two ways. You can keep calling S() as long as the next token is one of the five kinds of tokens that can start a statement. The other (equivalent) method is to keep calling S() until the next token is the proper one to end the sequence of statements: '$' for the whole program, ':' or ']' for an if-then or an if-then-else, and '}' for a while.
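Both ways of stopping can be sketched side by side in C. To keep the example short it uses a cut-down language, which is an assumption and not the real Tiny® grammar: the only statements are an assignment whose right side is a single letter or digit, and a while whose condition is a single letter. The names more, statements, and parse_program are hypothetical.

```c
#include <ctype.h>
#include <string.h>

static const char *src;
static char next;
static int ok;        /* cleared at the first syntax error */
static int by_first;  /* 1: first method, 0: second method */

static void scan(void) {
    while (*src != '\0' && isspace((unsigned char)*src)) src++;
    next = *src;
    if (*src != '\0') src++;
}

static void expect(char c) {
    if (next == c) scan(); else ok = 0;
}

/* The five kinds of tokens that can start a statement. */
static int starts_statement(char c) {
    return islower((unsigned char)c) || c == '{' || c == '[' || c == '<' || c == '>';
}

/* Should S() be called again? stops lists the tokens that end this sequence. */
static int more(const char *stops) {
    if (by_first) return starts_statement(next);
    return next != '\0' && strchr(stops, next) == NULL;
}

static void S(void);

/* S { S } : one or more statements */
static void statements(const char *stops) {
    S();
    while (ok && more(stops)) S();
}

/* Cut-down S: only the while and the assignment are handled here. */
static void S(void) {
    if (next == '{') {                       /* '{' letter '?' S { S } '}' */
        scan();
        if (islower((unsigned char)next)) scan(); else ok = 0;
        expect('?');
        statements("}");
        expect('}');
    } else if (islower((unsigned char)next)) {   /* letter '=' operand ';' */
        scan();
        expect('=');
        if (islower((unsigned char)next) || isdigit((unsigned char)next)) scan();
        else ok = 0;
        expect(';');
    } else {
        ok = 0;
    }
}

/* Returns 1 if text is a legal program in the cut-down language, else 0. */
int parse_program(const char *text, int use_first_tokens) {
    src = text;
    ok = 1;
    by_first = use_first_tokens;
    scan();
    statements("$");
    expect('$');
    return ok && next == '\0';
}
```

On legal input the two methods make the same calls. They differ only in where an error is noticed: the first stops the loop and lets the caller complain about the unexpected token, while the second calls S() once more and lets S() report it.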

A pure parser will input a sentence and just say whether the sentence was legal or not, without any other output. However, this parser is actually carrying out an implicit complete traversal of the parse tree through its function calls and returns (as will be illustrated in class). As with my examples, you should use extra temporary output illustrating the calls and returns, so you can have confidence that your parser is working correctly. Your own temporary output can be simpler than what I used.
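One simple form of such temporary output is a line printed on entry to and exit from each parsing function, indented by the call depth. The sketch below is an illustration only and does not reproduce the format of the examples referred to above. It uses a cut-down expression grammar (an assumption), and the names enter, leave, depth, calls, and trace_expr are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

static const char *src;
static char next;
static int ok;      /* cleared at the first syntax error */
static int depth;   /* current nesting of parser calls */
static int calls;   /* number of parsing functions entered so far */

static void scan(void) {
    while (*src != '\0' && isspace((unsigned char)*src)) src++;
    next = *src;
    if (*src != '\0') src++;
}

/* Print one indented line when a parsing function is entered. */
static void enter(const char *name) {
    printf("%*sEnter %s, next = %c\n", 2 * depth, "", name, next ? next : ' ');
    depth++;
    calls++;
}

/* Print the matching line when it returns. */
static void leave(const char *name) {
    depth--;
    printf("%*sLeave %s, next = %c\n", 2 * depth, "", name, next ? next : ' ');
}

static void E(void);

/* F ---> '(' E ')' | lower-case | digit   (cut down for the sketch) */
static void F(void) {
    enter("F");
    if (next == '(') {
        scan();
        E();
        if (next == ')') scan(); else ok = 0;
    } else if (islower((unsigned char)next) || isdigit((unsigned char)next)) {
        scan();
    } else {
        ok = 0;
    }
    leave("F");
}

/* T ---> F { ('*' | '/') F } */
static void T(void) {
    enter("T");
    F();
    while (next == '*' || next == '/') { scan(); F(); }
    leave("T");
}

/* E ---> T { ('+' | '-') T } */
static void E(void) {
    enter("E");
    T();
    while (next == '+' || next == '-') { scan(); T(); }
    leave("E");
}

/* Traces the parse of text; returns the number of calls, or -1 on error. */
int trace_expr(const char *text) {
    src = text;
    ok = 1;
    depth = 0;
    calls = 0;
    scan();
    E();
    return (ok && next == '\0') ? calls : -1;
}
```

Read top to bottom, the Enter lines list the nodes of the parse tree in the order the traversal visits them, and the indentation shows each node's depth.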

Sample inputs to your parser:

Sample Input 0

Here is an artificial "program" just to test features of your parser (it's not supposed to do anything sensible):

Sample Input 1
b = 2; c = -3; d = 4; m = 0; a = b^d + d*b^c; [ a-n ? n = n*(a+2); : n = n+(-5); ] { d ? b = (b*a)^2; d = d-1; } [ d-b ? n = n+2; ] $

Here is a more complicated sample source to use as the input to your parser:

Sample Input 2
n = 1; s = 0; > m; { n - m ? s = s + 1/(n*n); < n; < T; < s; < N; n = n + 1; } $

Finally here is a much more complicated sample source to use as the input to your parser. It includes while loops nested three deep.

Sample Input 3

f = 1; g = 2; n = 3; > m;
{ m - n ?
   < n; < T; < g; < T;
   j = g; d = 2; t = 1;
   { t ?
      [ j%d ? e = 0; : e = 1; ]
      { e ?
         j = j/d; < d; < B;
         [ j%d ? e = 0; : e = 1; ]
      }
      [ j - 1 ? t = 1; : t = 0; ]
      [ d - 2 ? d = d + 2; : d = 3; ]
   }
   < N;
   n = n + 1;
   h = f + g;
   f = g; g = h;
}  $

This program produces the first m Fibonacci numbers along with their prime factorizations. Here is roughly what your parser output might look like with the above input (using a slightly different parser): Parser output.

What to Hand In:

C or Java source for the parser.
debug run for Sample Input 0.
debug run for Sample Input 1.
debug run for Sample Input 2.
shortened debug run for Sample Input 3. (Not the whole run -- my output given above is over 439 lines long.)

B. MIPS Example: Study the pages:

MIPS
MIPS Examples
MIPS: Euler Series (MIPS program in Recitation 5)

Sample Input 0

Tiny® code: hand translate to MIPS
Sample Input 0 (for part A above)
a = 8 2; b = 7 * a; c = b + 1; d = a / c; e = 3 + d; < e; < N; $

You should partly follow patterns of the example programs and code in the three links above. It is also permissible to use the MIPS registers more heavily.

What to Hand In:

the Tiny source code,
the MIPS code that does what the Tiny program does, and
the results of running the MIPS code using the SPIM simulator. (The output might be a surprise.)

Revision date: 2013-09-27. (Use ISO 8601, an International Standard.)