CS3723, Exam, 6 March 2014 Partial ANSWERS Your Name:     Dedalus              Stephen     
Last First

Directions: You may write on the exam or use extra paper or both. You are not allowed to use anything except the exam sheets. No crib sheets, no notes, no calculator, no cell phone. [In some parts below you will asked for Java, C, or MIPS code; if you don't remember the syntax or form of what you want to write, for part credit, fake it as well as you can by describing what you want to do.]


  1. (20) Grammars: Consider the following grammar and the sentence to the right of it.

    Grammar
     E  −−>  E + T  |  T
     T  −−>  T * F  |  F
     F  −−>  ( E )  |  id
      
    Sentence
    ( id + id ) * id
    

    1. Is this grammar ambiguous? (Just Yes or No.)
    2. Write out a leftmost derivation for the sentence.
      E ==> T ==> T * F ==> F * F ==> ( E ) * F ==> ( E + T ) * F ==>
      ( T + T ) * F ==> ( F + T ) * F ==> ( id + T ) * F ==>
      ( id + F ) * F ==> ( id + id ) * F ==> ( id + id ) * id
      [I took off 2 points for those who produced the rightmost derivation backwards.
      Also you should use ==> for derivations and −−> for grammar rules.
      The most common mistake was to start off with: E ==> T * F.]
    3. Draw a parse tree for the sentence.
    4. Is there more than one parse tree for the sentence? (Just Yes or No.)


  2. (10) RPN (Reverse Polish Notation):
    1. By any method figure out what the value of the following RPN string is:

        2 3 + 4 * 5 6 * +

      [The blanks are not significant. Each digit is a separate operand, so there is no integer "56". If you show your work and get the wrong answer, you might get part credit. Full credit for just the right answer.] [The calculations below are not required as part of the answer.]

      Stack   Action
      -------------------------------------
      2 3     push 2; push 3;
      5       apply + to top two and push;
      5 4     push 4;
      20      apply * to top two and push;
      20 5 6  push 5; push 6;    
      20 30   apply * to top two and push;
      50      apply + to top two and push;
      
    2. Give a normal arithmetic expression that represents this RPN string and has all the same operands in the same order. (You may have to add parentheses.) This is: ((2+3)*4) + (5*6) = 50, or
      (2+3)*4 + 5*6, with fewest parens.


  3. (10) Regular Expressions: Consider the regular expression:

      (a|bc*)dd*
      = adn, n >= 1
      or
      =bcmdn, m >= 0, n >= 1

    Which of the following strings are described by this RE?

    (i) bcd, (ii) abc, (iii) abd, (iv) ad, (v) b, (vi) bd
    Note: Can't have both an a and a b. Must have at least one d.
    You were only required to identify the correct strins.]


  4. (20) S-R Parsers: Consider the grammar on the left and the corresponding parse table for shift-reduce parsing below it. Below to the right is the start of the shift-reduce parse of the sentence: $ id + id ^ id * id $ . You are to complete this parse, as we did in the course, step-by-step. In case of an r, give the grammar rule used. The rule is shown for the first r below. [Be careful: the standard mistake is to not reduce the longest matching right-hand-side.]

Grammar:
Arithmetic Expressions
P  −−>  E
E  −−>  E + T  |   E - T  |  T
T  −−>  T * S  |   T / S  |  S
S  −−>  F ^ S  |   F
F  −−>  ( E )  |  id


Parser: Shift-Reduce Table
   | id| ^ | */| +-| ( | ) | $ |
---+---+---+---+---+---+---+---+
P  |   |   |   |   |   |   |acc|
E  |   |   |   | s |   | s | r |
T  |   |   | s | r |   | r | r |
S  |   | r | r | r |   | r | r |
F  |   | s | r | r |   | r | r |
id |   | r | r | r |   | r | r |
^  | s |   |   |   | s |   |   |
*/ | s |   |   |   | s |   |   |
+- | s |   |   |   | s |   |   |
(  | s |   |   |   | s |   |   |
)  |   | r | r | r |   | r | r |
$  | s |   |   |   | s |   |   |
---+---+---+---+---+---+---+---+

Shift-Reduce Actions (Fill in side and bottom)
Stack               Curr    Rest of Input     Action
(top at right)      Sym
------------------------------------------------------------
$                   id      + id ^ id * id $    s
$ id                +       id ^ id * id $      r: F --> id
$ F                 +       id ^ id * id $      r: S --> F
$ S                 +       id ^ id * id $      r: T --> S
$ T                 +       id ^ id * id $      r: E --> T
$ E                 +       id ^ id * id $      s
$ E +               id      ^ id * id $         s
$ E + id            ^       id * id $           r:
$ E + F             ^       id * id $           s
$ E + F ^           id      * id $              s
$ E + F ^ id        *       id $                r: F --> id
$ E + F ^ F         *       id $                r: S --> F
$ E + F ^ S         *       id $                r: S --> F ^ S
$ E + S             *       id $                r: T --> S
$ E + T             *       id $                s
$ E + T *           id      $                   s
$ E + T * id        $       -                   r: F --> id
$ E + T * F         $       -                   r: S --> F
$ E + T * S         $       -                   r: T --> T * S
$ E + T             $       -                   r: E --> E + T
$ E                 $       -                   r: P --> E
$ P                 $       -                   acc


  1. (20) Recursive-Descent Parsing: Consider the grammar:

    Grammar: Weird Language
    S −−>  b M b      (S = start symbol)
    M −−>  ( L  |  a
    L −−>  M a )
    

    [Some students are treating the 'a' in the rule for M as an error. The rule for M says: expect either a '(' followed by something recognized by L() or an 'a', but of course not both.
    Some students were worried about the lack of quote marks around the terminals. They wondered if '(' and ')' were also terminals. But we saw this grammar in exactly this form in Recitations 2 and 4, so didn't expect you would have this trouble. All through the course I've sometimes left off quotes on terminals.]

    1. Which symbols are terminal symbols? b ( a )

    2. Which symbols are non-terminal symbols? S M L

    3. Write code for the function M that is part of a recursive-descent parser for this language. (Just the code for M, not the entire parser.) You may write in either Java or C (not much difference). You must include calls to the scanner and calls to report any error. (Assume that the next terminal is always in a variable next, that scan( ) replaces what is in next with the next terminal, and that the next terminal has already been scanned as you enter M. These are the assumptions we usually made in the course.) The parser is at the end of the page at: parser. Possible code for the function M:

    void M (void) {
       if (next == '(') {
          scan();
          L();
          return;
       }
       else if (next == 'a') {
          scan();
          return;
       }
       else error(3);
    }
    void M () { //no err check
       if (next == '(') {
          scan();
          L();
       }
       else if (next == 'a') {
          scan();
       }
    }
    void M (void) {
       if (next == '(') {
          scan();
          L();
          return;
       } // no else needed
       if (next == 'a') {
          scan();
          return;
       } // no else needed
       error(3);
    }

    [Several students started the code off with the incorrect code:
       if (next != '(' || next != 'a') error (1);
    But this should be:
       if (next != '(' && next != 'a') error (1);
    or else it could be:
       if (!(next == '(' || next == 'a')) error (1);
    Formulas:
          !(A || B) is equivalent to: (!A) && (!B),
          !(A && B) is equivalent to: (!A) || (!B)   ]


  2. (20) MIPS R-D Parser: The grammar rule for the Tiny while statement is:

      W  −−>  '{' E '?' S { S } '}'

    This is handled by the parser with a function W. As well as you can, give Java or C code for this function. Make the same assumptions about the next variable and the scan() function as in the previous problem. This is just the code for the bare parser, and should not show any MIPS code produced. Do not check for errors. [As you call W, you can assume that you've already scanned and that next contains the character {.] The code below is from my actual compiler.
    The red part, without the error handling, is one possible answer.
    Easier is not to have my function L(), but instead substitute as indicated below.
    Other variations are discussed below.

       private void L() {
          S(); // each S must start with one of the following
          while (next == '[' || next == '{' || next == '<' ||
                 next == '>' || Character.isLowerCase(next)) {
             S();
          }
       }
    
       private void W() { // while
          int w = nextTempWh();
          emit("WhileStart" + w + ":"); 
          if (next == '{') { // fetch the conditional
             scan();
             int res = E();
             emit("        l.d     $f2, " + (res*8) + "($s1)");
             emit("        l.d     $f4, 0($s1)");
             emit("        c.eq.d  $f2, $f4");
             emit("        bc1t    WhileEnd" + w);
          }
          if (next == '?') {
             scan();
             L(); // or insert 4 lines below in place of L()
             emit("        j       WhileStart" + w);
             emit("WhileEnd" + w + ":");
          }
          else error(7);
          if (next == '}') scan();
          else error(8);
       }
    
    In place of my call to the function L(), you may have used
    the simpler code:
       S();
       while (next != '}') { // inside while, seq of stmts must end with '}'
          S();
       }
    
    Further variations:
    1. The initial check for '{' is not needed in my parser,
    nor is it needed in yours for this exam. Following my directions in the exam,
    you need an initial scan.
    2. Each Tiny statement ends in a specific character, ']', '}', ';', and
    the whole program ends with '$'.  In my parser, I always scanned past this
    final character in the function that handled the statement, but you wouldn't
    have to do this.  Instead S() could have an initial scan.
    
    
    

  3. (20) MIPS Storage for Our Compiler Assignment: Here is how the storage for our assignment is declared in MIPS, where the address of M is stored in $s1:

      M:      .double 0.,1.,2.,3.,4.,5.,6.
              .double 7.,8.,9. # constants
              .space  208  # variables a to z
              .space  1000

    Using our conventions for representing constant digits and single-letter variables in memory, how do you reference the number 5.0 in memory? The number 0. is stored at an offset 0 from M, so 5. will be at an offset 40 from M. (Each double takes 8 bytes.) Thus 5. will be referenced by 40($s1).
    Where would the variable b be stored in memory? The 10 digits as doubles take up 80 bytes, so the letter a will be at offset 80 from M, that is, it will be referenced by 80($s1). b will be referenced by 88($s1).
    Give a MIPS instruction to load 5.0 into register $f4. Give a MIPS instruction to store the value of register $f4 into the variable b.

            l.d  $f4,  40($s1)  # load 5.0 into $f4
            s.d  $f4,  88($s1)  # store $f4 into variable b
       # together these do the assignment: b = 5;
    [I used my own notation of M[5] and M[11] for 5.0 and b, as if M were an array of doubles, which it is not. It's just a starting address of some storage that we treat like an array of doubles. Some students wrote M[88] in place of M[11], but that is not correct.]


  4. (20) MIPS Compiler: Consider the function E(), and suppose it is handling a single addition. What information from outside itself does this function use to output its MIPS statements. Where does this information come from?
    [The answer below is more elaborate and detailed than what you needed.]
    Since this call to E is supposed to handle a single addition, there will be two calls to the function T(). The values returned from these calls give memory addresses of the two operands to the addition. These are the addresses that will be in effect later at run time. Each integer value is an offset from the value in $s1, which is the address of the location M.
    The third thing needed from outside the function, is a new counter value to allow a new temporary memory location to be determined. This the memory location where the sum of the two operands will be stored at run time.

    As specific as you can, say what statements are written out.
    First will come two MIPS load instructions, loading the operand values into two registers, say $f2 and $f4. Then comes a MIPS add instruction to add the contents of the two registers, leaving the result in another register, say $f6. Finally is a store instructions to store the value in $f6 into the new memory location mentioned above. Here are the actual instructions written out by my own compiler, slightly altered and simplified. (Yours will likely differ from this.)
    
             int arg1 = T();
             int arg2 = T();
             temp = nextTemp(); // started at 0, incr. each time called
             res = temp + 36; // get past the digits and identifiers
     // below I multiply everything by 8
             emit("        l.d     $f2, " + (arg1*8) + "($s1)"); 
             emit("        l.d     $f4, " + (arg2*8) + "($s1)");
             emit("        add.d   $f6, $f2, $f4");
             emit("        s.d     $f6, " + (res*8) + "($s1)");
    
    What does E return?
    My compiler returns the value res above, where res*8 is the address at run time of the sum of the two operands above. (My code deals with the number of double words, not the number of bytes, but you may well have used just bytes.).


  5. (20) MIPS Compiler: In the grammar for the Tiny language used in class, the rule for an assignment statement is:

      A  −−>  lower-case  '='  E  ';'

    In as much detail as you can give, say what MIPS instructions should be output by the function A that handles this statement, and explain how they would be output (what information is needed to output them).
    [Again the answer below is more elaborate and detailed than what you needed.]
    The function A uses two pieces of information to output its two MIPS instructions:

    • The particular lower-case letter on the left hand side, used for the second instruction, a store. The value at run time of the expression on the right hand side will be stored in the memory address corresponding to the lc letter.
    • The integer returned by a call to E(), which an offset in memory from the value in the register $s1, referring to a memory location where the value of the expression will be at run time. The first instruction will load (always at run time) that value into a register, so that next the second instruction can store the value from that register into the variable determined by the lc letter.

    Specifically, my own code for A() is:

       private void A() {
          int res = 0, res1 = 0;
          if (Character.isLowerCase(next)) {
             res1 = next - 'a' + 10;
             scan();
          }
          else error(9);
          if (next == '=') scan();
          else error(10);
          res = E();
          if (next == ';') scan();
          else error(11);
          comment("# M[" + res1 + "] = M[" + res + "]");
          emit("        l.d     $f2, " + (res*8) + "($s1)");
          emit("        s.d     $f2, " + (res1*8) + "($s1)");
       }
    
    [It's important to realize that the value returned by the call to E() might be the location of a digit contant, or it might be the location of a single variable, or finally it could be the location of a temporary memory variable.]