CS 3721
Programming Languages  
Fall 2013
Recitation 3.   Grammars
Week 3: Sep 9-13

Note: Dates below changed on 10 Sep 2013 (2 and 3 days later).
Submit following directions at: submissions and rules at: rules. Deadlines are:
  • 2013-09-20  23:59:59 (that's Fri, 20 Sep 2013, 11:59:59 pm) for full credit.
  • 2013-09-23  23:59:59 (that's Mon, 23 Sep 2013, 11:59:59 pm) for 75% credit.


A. Implementing Regular Expressions: Study the pages:

  • Course Overview (first item on Calendar Page, Week 0). This gives an overview of the process of handling a regular expression using NFAs.
  • RE −−> NFA, which illustrates the process using the regular expression:   /.*(ab|aac)/

Start with the regular expression: /a(ab)*|b(ba)*/
This problem asks you to trace through a process of handling this regular expression, following the pattern show in the two pages above, especially Example 1 in the second page.

  1. Break the expression up into a sequence of applications of regular expression operators, one at a time. Anything in parentheses has to be dealt with first, and after that comes, star, concatenation, and or in that order. So above, the first thing to look at would be the (ab). [You should do this problem by hand, and not in some fancy way using RPN or a parser.]

  2. Just as with Example 1, for each operator patch in the ε-moves that are show just above Example 1. The result of this should be an NFA with ε-moves that recognizes the same language that is described by the regular expression we started with. (It would be possible to "save" some ε-moves, that is, leave them off, but instead you should use all the ones recommended. The states of the resulting NFA can be numbered sequentually as you insert them (as shown in Example 1), or you can number them in some arbitrary way (but use numbers in sequence starting at 0).

  3. If we were writing a program to handle the NFA above, we would simulate its actions given an input. Instead I want you to convert your NFA above to a DFA by hand using the generalized subset algorithm. [This isn't as bad as it might seem. The resulting DFA has 7 states plus an error state for the empty set.]

B. Problems on Formal Grammars: Study the following pages, with emphasis on the first two:

  1. Intro. to Grammars,
  2. Derivations, Parse Trees, and Ambiguity, and
  3. Types of Grammars (time permitting)


  1. In each case below give an informal description of the language defined by the given grammar. "Informal" means short and simple, perhaps using an English language description. One way to get started is to try various derivation sequences and get intuition about what sentences can be generated. For example, in 1.a. below, one can write A ==> a A c ==> a b c, or A ==> a A c ==> a a A c c ==> a a b c c, and so forth.

    1.      A ---> a A c                 (A is start symbol)
           A ---> b
         
    2.      O ---> a  |  a E             (O is start symbol)
           E ---> a O                     (Hint: Why use symbols E and O?)
         
    3.      L ---> L A | A               (L is start symbol)
           A ---> a
         
    4.      N ---> ( C )                 (N is start symbol)
           C ---> C , A | A
           A ---> a | b | c | d 
         
    5.      S ---> A B                   (S is start symbol)
           A ---> a A  |  a
           B ---> b B c  |  b c
         

  2. Here is a very simple example grammar:
      
      E  ---> E + E
      E  ---> E * E 
      E  ---> ( E ) | a | b | c | d 
      
    1. Give the leftmost derivation and the parse tree for:

        a * ( b + c )
    2. Give the two distinct leftmost derivations and the two distinct parse trees for:

        a * b + c 

  3. Here is a "standard" simple example grammar that I will often use:
      
      E  ---> E + T |  T
      T  ---> T * F |  F
      F  ---> ( E ) | a | b | c | d 
      
    In each case below, given the sentence, construct the leftmost derivation sequence ( = replacement sequence = derivation) and the parse tree. (These are both unique.)

    1.   a + b + c

    2.   a * b + c * d


  4. Consider the following grammar for an assignment statement (A), with an operator ^ representing exponentiation. (This is a "beefed-up" version of the standard grammar.)
      
      A  ---> L = E
      E  ---> E + T | E - T | T
      T  ---> T * S | T / S | S
      S  ---> F ^ S | F
      F  ---> ( E ) | L
      L  ---> a | b | c | d 
      
    1. What are the terminal symbols? The non-terminal symbols? The metasymbols?

    2. Construct leftmost derivations and parse trees for each of the following sentences: (But only submit the derivations and not the parse trees, since these are pretty complicated.)

      1.    d = a + b + c

      2.    d = a ^ b ^ c

      3.    d = a + b * c ^ d

    3. What do the derivations and trees in ii. and iii. above say about the precedence and associativity of the new operator ^?


  5. Here are three harder problems, not for credit, just "for fun". In each case below give an informal description of the language defined by the given grammar.

    1.   S ---> ( S ) S  |  ε            (S is start symbol.)
                                          (ε is the empty symbol)
      
    2.   S ---> S S  |  ( S )  | ( )     (S is start symbol.)
                                          (Same language as a.)
      
    3.   S ---> a B  |  b A              (S is start symbol.)
        A ---> a S  |  b A A  |  a     
        B ---> b S  |  a B B  |  b     
      

Revision date: 2013-09-13. (Please use ISO 8601, the International Standard.)