Computer Languages History
(Click or use local copy.)
 CS 3723/3721
 Programming Languages
 Spring 2005

 Recitation 4
 Formal Grammars
    Week 4: Feb 7-11
 Due (on time): 2005-02-14  23:59:59
 Due (late):        2005-02-18  23:59:59

Recitation 4 must be submitted following directions at: submissions with deadlines
  • 2005-02-14  23:59:59 (that's Monday, 14 February 2005, 11:59:59 pm) for full credit.
  • 2005-02-18  23:59:59 (that's Friday, 18 February 2005, 11:59:59 pm) for 75% credit.


Overview: This recitation consists of a sequence of problems related to formal grammars. Please refer to formal grammers for reference material.


Recitation Problems:

  1. In each case below give an informal description of the language defined by the given grammar. "Informal" means short and simple. One way to get started is to try various derivation sequences and get intuition about what sentences can be generated. For example, in 1.a. below, one can write A ==> a A c ==> a b c, or A ==> a A c ==> a a A c c ==> a a b c c, and so forth.

    1.            
           A ---> a A c                 (A is start symbol)
           A ---> b
         
    2. 
           O ---> a  |  a E             (O is start symbol)
           E ---> a O
         
    3. 
           S ---> A B                   (S is start symbol)
           A ---> a A  |  a
           B ---> b B c  |  b c
         
    4. (This one is harder to see.)
      
           S ---> ( S ) S  |  e                   (S is start symbol)
                                                  (e is the empty symbol, nothing --
                                                    it just disappears)
         
    5. (This one is really hard and is just to show how complicated grammars can get. Don't spend too much time on it.)
      
           S ---> a B  |  b A           (S is start symbol)
           A ---> a S  |  b A A  |  a
           B ---> b S  |  a B B  |  b   
         
  2. Consider the following grammar for an assignment statement (A), with an operator ^ representing exponentiation:
    1. What are the terminal symbols? The non-terminal symbols? The metasymbols?

    2. Construct parse trees for each of the following sentences: (But don't write them down, since this is too irritating to do in a text file. Instead, just use them to answer question c below.)

      1. d = a + b + c
      2. d = a ^ b ^ c
      3. d = a + b * c ^ d
      4. d = a * (b + c)

    3. What do the trees in ii. and iii. above say about the precedence and associativity of the new operator ^?

  3. This question users the examples in the separate web page Examples.

    1. Show that the grammar in item 2 is ambiguous by giving a sentence (sequence of terminal symbols) with more than one parse tree.

    2. Show that the grammar in item 13 is ambiguous.

    3. Show that the grammar in item 14 is ambiguous. What strange and undesirable property does this grammar have?

    4. What is the problem with the grammar in item 15?

  4. A regular grammar has only rules of the form:

    1. Write a regular grammar for identifiers, using terminals of the form digit for a single digit, and letter for a single letter.
      [Then it is easy to write (where "..." is not part of the notation)
        
         digit  ----> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
         letter ----> a | b | ... | z | A | B | ... | Z
         

      and these rules satisfy the condition for being a regular grammar.]

      As in question 3, you will need to use recursion (or the notation from question 3).

    2. Write a regular grammar for integer constants, using terminals of the form digit for a single digit.

    3. Here is an algorithm to convert any regular grammar to a finite state machine:

      1. For each non-terminal A, construct a circled state labeled with A.

      2. Construct and extra double circled terminal state T.

      3. For each rule A ---> a, draw an arrow labeled with a from the state labeled with A to the state labeled with T.

      4. For each rule A ---> b B, draw an arrow labeled with b from the state labeled with A to the state labeled with B.

      Convert the following regular grammar to a finite state machine:

        
        A ---> a A  |  a B  |  b B  |  b
        
        B ---> b B  |  c
        

    4. There is a very similar algorithm to convert a finite state machine to a regular grammar. Figure out the algorithm (you don't need to state it formally) and use it to convert the finite state machine for a C-style comment to a regular grammar. (See Comment FSM here.)

    5. (This question is also hard!) Consider the following grammar:

        
        A ---> a A b  |  a b
        

      First give the language described by this grammar.

      Then argue that no finite state machine can describe this language. (If there are 10 states in the machine, consider the sentence with 11 a's followed by 11 b's. As the first 11 a's are processed, you must be in the same state twice. From this state you must be able to get to the final state as you use up the remaining a's and all the b's.) Thus this is not a regular grammar.


What you should submit: Refer to the submissions directions and to deadlines at the top of this page. The text file that you submit should first have Your Name, the Course Number, and the Recitation Number. The rest of the file should have answers to the questions above in it.

 Contents of email submission for Recitation 4:

Last Name, First Name; Course Number; Recitation Number (4).

Answers to problems 1, 2, 3, 4, and 5 above (where each has several subparts). Problems should be in the order given and should be clearly labeled, with 1, 2, etc., and a., b., etc.


Key ideas: A formal grammar allows one to describe the syntax of a programming language in a formal (mathematical) way that is not subject to misinterpretations like an English description would be. They allow an infinite collection of objects (a language) to be described by a relatively simple short grammar. These grammars are now essential descriptive tools and are also directly used to construct compilers, as we will see in the next few recitations.


Revision date: 2005-02-03. (Use ISO 8601, an International Standard.)