 |
CS 3723
Programming Languages |
"Bare" R-D Parser for
Arithmetic Expressions |
"Bare" Parser for
Arithmetic Expressions (C and Java):
Below is C on the left and Java on the right. The Java code also needs
a GetChar class printed at the end. This parser, like all of them,
just marches through the parse tree, visiting nodes. At the end
it either accepts or rejects the input. It doesn't do anything
else unless code (semantic actions) is added to do something.
Notes on scanning:
- In this example, and in all our subsequent examples, a
token will be the same as a non-whitespace character.
This make the scanner (= lexical analyser) particularly simple:
just fetch the next non-whitespace character.
- This example mostly uses what is called
an early scan: As the program proceeds, tokens
(the same as non-whitespace characters here) are "used up",
that is, decisions are made based on the token,
after which the program is done with that token.
Then the program below usually does an immediate scan, to "scan past"
the token that no longer plays any role in the parse.
This is the simplest and least error-prone strategy, but you
should know that another scanning strategy exists:
the late scan. Here the program scans at the last possible
point before the token is required for a decision. This strategy
is preferred and can be essential for interactive use. Since
we will always be reading in entire program files, which strategy
is used makes no difference. I'm recommending the simplest one,
the one easiest to implement.
Parser in C |
Parser in Java |
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
char next;
void E(void);
void T(void);
void S(void);
void F(void);
void error(int);
void scan(void);
int main() {
scan();
E();
if (next != '$') error(1);
else printf("Success\n");
}
void E(void) {
T();
while (next == '+' || next == '-') {
scan();
T();
}
}
void T(void) {
S();
while (next == '*' || next == '/') {
scan();
S();
}
}
void S(void) {
F();
if (next == '^') {
scan();
S();
}
}
void F(void) {
if (isalpha(next)) {
scan();
}
else if (next == '(') {
scan();
E();
if (next == ')') scan();
else error(2);
}
else {
error(3);
}
}
void scan(void) {
while (isspace(next = getchar()))
;
}
void error(int n) {
printf("\n*** ERROR: %i\n", n);
exit(1);
}
| /* Arith.java: simple parser -- no output
* grammar:
* P ---> E '$'
* E ---> T {('+'|'-') T}
* T ---> S {('*'|'/') S}
* S ---> F '^' S | F
* F ---> char | '(' E ')'
*/
class Arith {
private GetChar getChar = new GetChar();
private char next;
private void P() {
scan();
E();
if (next != '$') error(1);
else System.out.println("Success");
}
private void E() {
T();
while (next == '+' || next == '-') {
scan();
T();
}
}
private void T() {
S();
while (next == '*' || next == '/') {
scan();
S();
}
}
private void S() {
F();
if (next == '^') {
scan();
S();
}
}
private void F() {
if (Character.isLetter(next)) {
scan();
}
else if (next == '(') {
scan();
E();
if (next == ')') scan();
else error(2);
}
else {
error(3);
}
}
private void scan() {
while (Character.isWhitespace(next =
getChar.getNextChar()))
;
}
private void error(int n) {
System.out.println("*** ERROR: " + n);
System.exit(1);
}
public static void main(String[] args) {
Arith arith = new Arith();
arith.P();
}
}
|
The C program above is complete as it is, but the Java
program uses a separate GetChar class printed below.
// GetChar: fetch next char
import java.io.*;
public class GetChar {
private Reader in; // internal file name for input stream
// GetChar: constructor
public GetChar () {
in = new InputStreamReader(System.in);
}
// getNextChar: fetches next char
public char getNextChar() {
char ch = ' '; // = ' ' to keep compiler happy
try {
ch = (char)in.read();
} catch (IOException e) {
System.out.println("Exception reading character");
}
return ch;
}
} |
Revision date: 2014-01-22.
(Please use ISO
8601, the International Standard.)
|