CS 3723: Semantic Actions

	CS 3723 Programming Languages
Semantic Actions

0. Translation Using Semantic Actions: The idea is to attach semantic code (often called semantic actions) to each grammar rule. During the course of a shift-reduce parse, each time a reduction is carried out, the corresponding semantic action code is executed. One wants to arrange things so that this carries out a translation.

Start with the following simple grammar:

Grammar: Arithmetic Expressions
P −−> E E −−> E + T \| T T −−> T * F \| F F −−> ( E ) \| id

Use the following shift-reduce table for this grammar:

Parser: Shift-Reduce Table

     |  id |  *  |  +  |  (  |  )  |  $  |
−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+
  P  |     |     |     |     |     | acc |  (s = "shift")
  E  |     |     |  s  |     |  s  |  r  |
  T  |     |  s  |  r  |     |  r  |  r  |  (r = "reduce")
  F  |     |  r  |  r  |     |  r  |  r  |
  id |     |  r  |  r  |     |  r  |  r  |  (acc = "accept")
  *  |  s  |     |     |  s  |     |     |
  +  |  s  |     |     |  s  |     |     |    
  (  |  s  |     |     |  s  |     |     |
  )  |     |  r  |  r  |     |  r  |  r  |
  $  |  s  |     |     |  s  |     |     |
−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+−−−−−+

The actual input (with identifiers) is: $ ( a + b ) * c $. To the parser, the input sentence will have pointers to the actual identifiers. We show these as id tags in red:

Input Sentence
$ ( id + id ) * id $ \| \| \| a b c

1. Translate to RPN: See Reverse Polish Notation to understand the target language. Here is the grammar rewritten, with semantic actions in red:

Grammar: Arithmetic Expressions Semantic Actions in red
P −−> E E −−> E + T output("+"); E −−> T T −−> T * F output("*"); T −−> F F −−> ( E ) F −−> id output(id.tag);

Shift-Reduce Actions Semantic Actions in red

Stack               Curr    Rest of Input       Action
(top at right)      Sym
---------------------------------------------------------------------------------
$                   (       id + id ) * id $    shift
$  (                id      + id ) * id $       shift
$  ( id             +       id ) * id $         reduce: F −−> id     output("a");
$  ( F              +       id ) * id $         reduce: T −−> F
$  ( T              +       id ) * id $         reduce: E −−> T
$  ( E              +       id ) * id $         shift
$  ( E  +           id      ) * id $            shift
$  ( E  +  id       )       * id $              reduce: F −−> id     output("b");
$  ( E  +  F        )       * id $              reduce: T −−> F
$  ( E  +  T        )       * id $              reduce: E −−> E + T  output("+");
$  ( E              )       * id $              shift
$  ( E )            *       id $                reduce: F −−> ( E )
$  F                *       id $                reduce: T −−> F
$  T                *       id $                shift
$  T  *             id      $                   shift
$  T  *  id         $                           reduce: F −−> id     output("c");
$  T  *  F          $                           reduce: T −−> T * F  output("*");
$  T                $                           reduce: E −−> T
$  E                $                           reduce: P −−> E
$  P                $                           accept

Arithmetic expression: $ ( a + b ) * c $
Translation to RPN:     a b + c *

Notice that the result of parsing, and carrying out the associated semantic actions, is to accept the input and to translate it to RPN, which is shown afterward in red above. This method will translate an arbitrarily complicated arithmetic expression (one recognized by this grammar) into RPN.

2. Translate to Intermediate Code: Be careful below to note that the lowercase t has nothing to do with the uppercase T. Also, when there are two Es or two Ts in the same grammar rule, the tags on them on the stack will be different, so we need subscripts to keep track of which tag is being created or used.

Grammar: Arithmetic Expressions Semantic Actions in red

P  −−>  E        P.tag = E.tag; output("print(" + P.tag + ")");

E₁ −−>  E₂ + T   E₁.tag = "t" + nextint();
                  output(E₁.tag + "=" E₂.tag + "+" T.tag);

E  −−>  T        E.tag = T.tag;

T₁ −−>  T₂ * F   T₁.tag = "t" + nextint();
                  output(T₁.tag + "=" T₂.tag + "*" F.tag);

T  −−>  F        T.tag = F.tag;

F  −−>  ( E )    F.tag = E.tag;

F  −−>  id       F.tag = id.tag;

The table below shows the shift-reduce parse of the same sentence, showing the stack, current symbol, remaining symbols, and next action to take at each stage. The semantic tags are shown in red below the id items and the stack items.

Shift-Reduce Actions Tag field below stack in red

Stack               Curr    Rest of Input       Action
(top at right)      Sym
-------------------------------------------------------------------
$                   (       id + id ) * id $    shift
$  (                id      + id ) * id $       shift
$  ( id             +       id ) * id $         reduce: F −−> id
     a
$  ( F              +       id ) * id $         reduce: T −−> F
     a
$  ( T              +       id ) * id $         reduce: E −−> T
     a
$  ( E              +       id ) * id $         shift
     a
$  ( E  +           id      ) * id $            shift
     a
$  ( E  +  id       )       * id $              reduce: F −−> id
     a     b
$  ( E  +  F        )       * id $              reduce: T −−> F
     a     b
$  ( E  +  T        )       * id $              reduce: E −−> E + T
     a     b
[output("t1 = a + b;");]
$  ( E              )       * id $              shift
     t1
$  ( E )            *       id $                reduce: F −−> ( E )
     t1
$  F                *       id $                reduce: T −−> F
   t1
$  T                *       id $                shift
   t1
$  T  *             id      $                   shift
   t1
$  T  *  id         $                           reduce: F −−> id
   t1    c
$  T  *  F          $                           reduce: T −−> T * F
   t1    c
[output("t2 = t1 * c;");]
$  T                $                           reduce: E −−> T
   t2
$  E                $                           reduce: P −−> E
   t2
$  P                $                           accept
   t2
[output("print(t2);");]

Arithmetic expression: $ ( a + b ) * c $
Translation to Intermediate code:
t1 = a + b;
t2 = t1 * c;
print(t2);

Notice again that the result of parsing, and carrying out the associated semantic actions, is to accept the input and to translate it to a sequence of intermediate code statements, which would be much closer to a simple machine language. As before, the input arithmetic expression can be arbitrarily complicated. This intermediate code is shown afterward in red above.

Parse Tree with Tags (in red)

                          E-\
                          |  t2
                          T-\
                          |  t2=t1*c
           +--------------+----+
           |              |    |
           T-\            |    |
           |  t1          |    |
           F-\            |    |
           |  t1          |    |
 +---------+----------+   |    |
 |         |          |   |    |
 |         E-\        |   |    |
 |         |  t1=a*b  |   |    |
 |   +-----+----+     |   |    |
 |   |     |    |     |   |    |
 |   E-\   |    |     |   |    |
 |   |  a  |    |     |   |    |
 |   T-\   |    T-\   |   |    |
 |   |  a  |    |  b  |   |    |
 |   F-\   |    F-\   |   |    F-\
 |   |  a  |    |  b  |   |    |  c
 (  id-\   +   id-\   )   *   id-\
        a          b              c

(Revision date: 2014-09-30. Please use ISO 8601, the International Standard.)