CS 3723
Programming Languages
Compiler Overview
Outline of the Actions of a Compiler
Classic Form -- More Often Implemented as a Hybrid
[See Programming Language Translation
for a restatement of the concepts below.]
In the first statement below, the two identifiers initial
and rate are assumed to have floating point values from
an earlier part of the program. In an actual compiler, the actions
shown below are not separate passes, but each takes output from
the previous step as it is produced and feeds input to the next
step as it is needed. (We will see how this works.)
Lexical Analyzer:
A relatively simple part of the
compiler, this breaks the input sequence of
characters into a sequence of tokens, which are
units of input that will be fed to the next phase of
the compiler. Any identifiers will be looked up in the growing
SYMBOL TABLE
(ST) at the left. The identifier tokens
(denoted id1, id2, id3)
each also carry a pointer into the ST that says which identifier
it is. The actual output of the lexical analyzer is in
internal symbolic form, so it does not output the characters
"id" or any such thing.
See Lexical Analysis
for more detail.
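As a rough illustration of the lexical analysis step, here is a minimal Python sketch. It assumes the statement has the form `position := initial + rate * 60`; the name `position` for the assignment target is an assumption, as are the token kind names (`ID`, `NUM`, `ASSIGN`, `OP`) and the list-based symbol table. Identifier tokens carry a 1-based index into the symbol table instead of the identifier's characters.

```python
import re

# Token patterns; the kind names are illustrative, not from any real compiler.
TOKEN_SPEC = [
    ("NUM",    r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text, symbol_table):
    """Break text into (kind, value) tokens; identifiers point into symbol_table."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue                      # whitespace produces no token
        if kind == "ID":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)   # the symbol table grows as we scan
            tokens.append(("ID", symbol_table.index(lexeme) + 1))
        elif kind == "NUM":
            tokens.append(("NUM", int(lexeme)))
        else:
            tokens.append((kind, lexeme))
    return tokens

symbol_table = []
tokens = tokenize("position := initial + rate * 60", symbol_table)
print(tokens)
print(symbol_table)
```

The token `("ID", 2)` plays the role of id2: it records only which symbol table entry is meant.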
Syntax Analyzer:
This builds a syntax tree that
represents how the statement is put together. In this case,
it means (among other things) that the "*" operator
has the highest precedence, "+" the next highest, and the
":=" assignment the lowest (dealt with last).
With an actual compiler the syntax tree is not explicitly
constructed, but only exists implicitly. (We'll see how this
works.)
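One way to see how precedence shapes the tree is a small precedence-climbing parser. This is only a sketch: it builds the tree explicitly as nested tuples (which, as noted above, an actual compiler need not do), and the token format and the `PRECEDENCE` table are assumptions made for illustration.

```python
# Larger number binds tighter. ":=" is handled separately as the lowest level.
PRECEDENCE = {"+": 1, "*": 2}

def parse_expr(tokens, pos, min_prec):
    """Precedence climbing over left-associative binary operators."""
    if pos >= len(tokens) or tokens[pos][0] not in ("ID", "NUM"):
        raise SyntaxError("expected an identifier or a number")
    left = tokens[pos]                    # leaf: ("ID", n) or ("NUM", value)
    pos += 1
    while (pos < len(tokens) and tokens[pos][0] == "OP"
           and PRECEDENCE[tokens[pos][1]] >= min_prec):
        op = tokens[pos][1]
        # The right operand may only contain operators that bind tighter than op.
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos

def parse_statement(tokens):
    """Parse  id := expression  into a nested-tuple syntax tree."""
    if len(tokens) < 3 or tokens[0][0] != "ID" or tokens[1] != ("ASSIGN", ":="):
        raise SyntaxError("expected 'id := expression'")
    expr, pos = parse_expr(tokens, 2, 1)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return (":=", tokens[0], expr)

# Tokens for a statement of the form  id1 := id2 + id3 * 60
tokens = [("ID", 1), ("ASSIGN", ":="), ("ID", 2), ("OP", "+"),
          ("ID", 3), ("OP", "*"), ("NUM", 60)]
tree = parse_statement(tokens)
print(tree)
```

Because "*" binds tighter than "+", the multiplication ends up deepest in the tree and the assignment at the root.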
Semantic Analyzer:
This takes the meaning of constructs
into account. In this case one can't directly multiply a
float by an int,
so 60 must be converted to float.
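The conversion can be sketched as a tree walk that computes a type for each node and wraps an int operand in an `inttoreal` node wherever it meets a float. The nested-tuple tree format and the function name `annotate` are assumptions for illustration, and the sketch simply treats every identifier as a float.

```python
def annotate(node):
    """Return (node, type), inserting inttoreal where an int meets a float."""
    kind = node[0]
    if kind == "ID":
        return node, "float"          # assumption: every identifier is a float
    if kind == "NUM":
        return node, "int"            # an integer literal such as 60
    op, left, right = node            # an operator node: ":=", "+" or "*"
    left, left_type = annotate(left)
    right, right_type = annotate(right)
    if left_type != right_type:
        # Mixed operands: convert the int side to float.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
    result_type = "float" if "float" in (left_type, right_type) else "int"
    return (op, left, right), result_type

# Tree for a statement of the form  id1 := id2 + id3 * 60
tree = (":=", ("ID", 1), ("+", ("ID", 2), ("*", ("ID", 3), ("NUM", 60))))
checked, statement_type = annotate(tree)
print(checked)
```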
Intermediate Code Generator:
Converts the tree into
a sequence of statements in a simple machine-oriented language.
This step might be skipped.
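A common choice for such a language is three-address code, where each statement has at most one operator. The sketch below walks a nested-tuple tree bottom-up and emits one statement per interior node; the tree format and the temporary names `temp1`, `temp2`, ... are assumptions for illustration.

```python
def gen_three_address(tree):
    """Flatten an assignment tree into a list of three-address statements."""
    code = []
    temp_count = 0

    def new_temp():
        nonlocal temp_count
        temp_count += 1
        return f"temp{temp_count}"

    def operand(node):
        """Emit code for node and return the name that holds its value."""
        kind = node[0]
        if kind == "ID":
            return f"id{node[1]}"
        if kind == "NUM":
            return str(node[1])
        if kind == "inttoreal":
            arg = operand(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node            # binary operator: "+" or "*"
        left_name = operand(left)
        right_name = operand(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree                # root is (":=", target, expression)
    value = operand(expr)
    code.append(f"{operand(target)} := {value}")
    return code

# Tree for  id1 := id2 + id3 * inttoreal(60)
tree = (":=", ("ID", 1),
        ("+", ("ID", 2), ("*", ("ID", 3), ("inttoreal", ("NUM", 60)))))
for line in gen_three_address(tree):
    print(line)
```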
Code Optimizer:
Tries to get the computation done
with as few statements as possible, eliminating two temporary
variables in the process. Here it also converts
the "inttoreal(60)" into the float constant "60.0" at
compile time, so that no conversion is needed at run time.
This stage might be carried out in several places.
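The two improvements described here can be sketched as two small passes over three-address statements: folding `inttoreal` of an integer literal into a float constant, and merging a final copy into the statement that computed the copied temporary. The tuple format `(dest, op, arg1, arg2)`, the `"copy"` operation name, and the temporary names are assumptions for illustration. The sketch also assumes each temporary is used exactly once, which holds for code generated from a tree.

```python
def optimize(code):
    """Fold inttoreal(<int literal>) at compile time and remove a trailing copy."""
    # Pass 1: constant folding and propagation.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)           # substitute known constants
        b = constants.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            constants[dest] = float(a)    # 60 becomes 60.0; no statement is emitted
            continue
        folded.append((dest, op, a, b))

    # Pass 2: merge "x := temp" into the statement that just computed temp.
    result = []
    for dest, op, a, b in folded:
        if (op == "copy" and result and isinstance(a, str)
                and a.startswith("temp") and result[-1][0] == a):
            _, prev_op, prev_a, prev_b = result[-1]
            result[-1] = (dest, prev_op, prev_a, prev_b)
            continue
        result.append((dest, op, a, b))
    return result

code = [
    ("temp1", "inttoreal", 60, None),     # temp1 := inttoreal(60)
    ("temp2", "*", "id3", "temp1"),       # temp2 := id3 * temp1
    ("temp3", "+", "id2", "temp2"),       # temp3 := id2 + temp2
    ("id1", "copy", "temp3", None),       # id1 := temp3
]
optimized = optimize(code)
for statement in optimized:
    print(statement)
```

Four statements become two, and the temporaries `temp1` and `temp3` disappear. The surviving temporary keeps its original name; a real optimizer might renumber it.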
Code Generator: This generates machine code.
Many compilers emit true machine code directly rather than
assembler code as shown here, since feeding the output into a
separate assembler adds to compile time. (In the "old"
(Paleolithic) days the assembler route was sometimes taken, and
some compilers still work that way.)
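For readability, the following sketch emits assembler-like text rather than true machine code. The mnemonics (`MOVF`, `ADDF`, `MULF`), the operand order (source first, destination second), the `#` prefix for constants, and the register names are all hypothetical. The sketch hands out a fresh register for every statement and never reuses one, which a real code generator would not do.

```python
MNEMONIC = {"+": "ADDF", "*": "MULF"}     # hypothetical floating point instructions

def gen_asm(code):
    """Translate (dest, op, arg1, arg2) statements into assembler-like lines."""
    asm = []
    register_of = {}                      # temporary name -> register holding it
    next_register = 1

    def source(operand):
        if isinstance(operand, float):
            return f"#{operand}"          # immediate constant
        return register_of.get(operand, operand)   # register or memory name

    for dest, op, a, b in code:
        register = f"R{next_register}"
        next_register += 1
        asm.append(f"MOVF {source(a)}, {register}")          # load first operand
        asm.append(f"{MNEMONIC[op]} {source(b)}, {register}")  # combine with second
        if dest.startswith("temp"):
            register_of[dest] = register  # temporaries stay in their register
        else:
            asm.append(f"MOVF {register}, {dest}")           # store to a variable
    return asm

# Optimized three-address code for  id1 := id2 + id3 * 60.0
code = [("temp1", "*", "id3", 60.0), ("id1", "+", "id2", "temp1")]
for line in gen_asm(code):
    print(line)
```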
(Revision date: 2014-05-21.
Please use ISO 8601, the International Standard.)