CS 3723
 Programming Languages 
   Tiny® Language  


Description: Much of the design is based on a requirement that each token be a single non-whitespace character. The idea is to make various elementary parts of the compiler unnecessary, such as the scanner and the symbol table. While relatively simple to implement, these can be distracting and confusing even in a modest compiler project.

Previous versions of Tiny® were oriented toward 32-bit integer data, while this version uses 64-bit floating point (doubles). These give exact 51-bit integer data, except when conversion to integer is required (truncation, etc.), where the MIPS hardware only allows 32 bits.

Grammar for Tiny® language
M  −−−>  S { S } '$'
S  −−−>  I  |  W  |  A  |  P  |  G
I  −−−>  '[' E '?' S { S } ':' S { S } ']'  |  '[' E '?' S { S } ']'
W  −−−>  '{' E '?' S { S } '}'
A  −−−>  lower-case '=' E ';'
P  −−−>  '<' ( E ';'  |  upper-case ';' )
G  −−−>  '>'  lower-case ';'
E  −−−>  T {('+' | '-') T}
T  −−−>  U {('*' | '/' | '%' | '@') U}
U  −−−>  F '^' U | F
F  −−−>  '(' E ')'  |  lower-case  |  digit
[Here is a Colorful Grammar.]

Here "lower-case" stands for a single lower-case letter, "upper-case" for a single upper-case letter, and "digit" for a single decimal digit (0 to 9).

This grammar (and the language it defines) doesn't have any regular reserved words (keywords), but a few upper-case letters are reserved ('B', 'T', and 'N').

Just to help with understanding, below on the left is the intuitive meaning of each of the above non-terminals. The table on the right outlines the input/output statements.

Non-terminal Description
  MMain Program
  LList of Statements (1 or more)
  SStatement
  IIf-Then-[Else] Statement
  WWhile Statement
  AAssignment Statement
  PPut or Print (double, char)
  GGet (double)
  EExpression (logical or arith)
  TTerm
  U(Hmmm. Ugly Term?)
  FFactor
  
Statement Meaning
  > lower-case ; read into variable
  < expression ; print as a double
  < B ; print a Blank
  < N ; print a Newl:
  < T ; print a Tab:


Semantics: The basic assignment is to be implemented using MIPS assembly code and 64-bit doubles for all numerical data values. (It would be a little easier to implement using 32-bit integers, but using doubles allows strictly more possibilities.)

  • Logical Expressions: These are used at the beginning of an if-then-else or a while. They are just arithmetic expressions, with the convention that a non-zero value is true and a zero value is false. This is exactly the situation with the C language.

  • if-then-else: This takes the form:
      [ expression ? statements : statements ] or [ expression ? statements ]
    This is intended to resemble the "statement" version of an if-then-else in C or Java. In case the expression is non-zero (true) the first statements are executed and otherwise the second ones are. The language uses [ ] for delimiters because this notation is often used in BNF for an optional item. Also the else part is optional

  • while: This takes the form:
      { expression ? statements }
    The language uses { } for delimiters because this notation is often used in BNF for "zero or more repetitions" of an item. The statements are repeatedly executed as long as the expression is non-zero (true). If the expression is zero initially, the statements are not executed.

  • digits 0-9: These represent the double values 0 through 9. They are the only numerical constants. (Well, 0 is numerical false and 1 to 9 are numerical true.) You cannot use a decimal point in the language.
    It might look like -1, -2, ..., -9 are also numerical constants, but the minus sign is a prefix operator, as it is in most languages.

  • lower-case-letters a-z: These are the only variables available for use in the language.

  • upper-case-letters: As shown above these are used to output certain specific characters.

  • Comments: As with MIPS, Python, Ruby, and others, comments go from a # to the next newline.

  • Operators: All operators act on doubles; all results are a double. Precedence is as expected, and is defined by the grammar.

      >  <  (input and output, see table above).
             Chosen because C++ uses cin >> x to input x, and cout << x to output x.

      +  −  *  (binary). Results as expected.

      +  −  (unary). Plus has no effect, while minus reverses the sign.

      (division). Result is the correct double, so that 5 / 3 is 1.666666666666666667.

      (new operator, truncated division). Like / except that the double result is truncated, so that 5 @ 3 is 1 as a double. To truncate a single double x, write x@1.

      (mod operator). If operands p and q are exact integers, the result is the integer p % q.
            Otherwise p % q = p - trunc(p/q)*q.

      (raise-to-power operator). p ^ q = pow(p, trunc(q)), so it cannot be used to compute non-integer powers. Works for negative p and/or negative q.


Examples of Tiny® Programs: You'll be getting lots of other examples. See two complex examples.


Representation of Tiny® Constructs and Data with MIPS Code: This will be presented at length on other pages.


Extensions: There is an extension of Tiny® to the Small® language, which includes relational operators, boolean values and operators, functions and parameters, and arrays. These extensions will be discussed in class.

(Revision date: 2014-10-20. Please use ISO 8601, the International Standard.)