CS 3723
Programming Languages
The MIPS Computer and
SPIM Assembly Language
History:
The MIPS hardware development was started in 1981 at Stanford University
by John Hennessy and his slaves (= graduate students). It was
one of two revolutionary RISC computers (the other being Berkeley RISC,
the forerunner of SPARC, developed at Berkeley by Patterson and another group of
graduate student slaves). Both of these architectures
had a tremendous influence on the computer
industry.
Hennessy used MIPS in his various computer organization and
architecture textbooks. To quote from a manual with a link below:
"The architecture of the MIPS computers is simple and regular,
which makes it easy to learn and understand. The processor contains
32 general-purpose 32-bit registers and a well-designed instruction
set that makes it a propitious target for generating code in a compiler."
In contrast, he refers to the Intel 80x86 as "an architecture
that is difficult to explain and impossible to love."
Resources:
The simulator for MIPS, called SPIM, is available on our
Linux machines (using the command spim as shown below).
It accepts the MIPS assembly language, and
this will be the target of our (small) compiler.
Resources with way too much information are:
- The SPIM Manual.
- Another resource is from "Computer Organization",
by Patterson and Hennessy: Appendix A.
What We Will Use:
For our purposes, MIPS uses 32 bits for integers and instructions.
It has add, sub, mul and div instructions for these integers.
As a first example, suppose you want to find the value of
f = g + 5, supposing g already had a value. You might like
an instruction such as add f, g, 5, but all instructions in
MIPS go through special 32-bit locations called registers
(there are 32 of them). Some registers are $ra, $s1, $t0, and
so forth (they all start with '$'). In order to add two numbers,
first you have to load the two numbers into registers.
Then the add places the sum into a register. Finally you have
to store the result into a location for the sum.
In this course, all our numbers and variables are going to be stored at offsets from
an address held in $s1. Our compiler
will keep all its numbers, variables and temporary locations
in a single "array" M of 32-bit integers.
All offsets are measured in bytes, and a 32-bit integer occupies 4 bytes.
So a reference to M[i] is an offset of i*4 from
the start of M.
Here's a table describing where these
are located inside M:
Type of value | Location in M | # | Formula | Offset
Constants 0,1,...,9 | M[0],M[1],...,M[9] | 10 | M[ch-'0']=M[3] for ch='3' | ((ch-'0')*4)($s1)=12($s1) for ch='3'
Variables a,b,...,z | M[10],M[11],...,M[35] | 26 | M[ch-'a'+10]=M[15] for ch='f' | ((ch-'a'+10)*4)($s1)=60($s1) for ch='f'
Temporary variables | M[36],M[37],...,M[235] | 200 | M[n+36] for temp n>=0 | (n*4+144)($s1) for temp n>=0
In the code below, we will load the address of M
into the register $s1. Then the location for a variable
c, which is in M[2+10] = M[12], will be denoted by
(12*4)($s1)=48($s1)
which means "48 bytes past the start of M, with the
starting address stored in register $s1".
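As a minimal sketch of this addressing scheme (an illustration, not part of the course compiler), the small program below touches one location in each region of M: the constant 3 at 12($s1), the variable c at 48($s1), and temporary 0 at 144($s1). It assumes the same layout of M as the sample program later on this page, and it borrows the print-int syscall (code 1 in $v0) that is explained in the comments at the end.

```mips
# offsets.s: one access in each region of M (illustrative sketch)
        .globl main
main:
        la   $s1, M            # $s1 = address of M[0]
        lw   $t0, 12($s1)      # $t0 = M[3], the constant 3     (3*4  = 12)
        sw   $t0, 48($s1)      # c = $t0, c lives in M[12]      (12*4 = 48)
        lw   $t1, 48($s1)      # $t1 = c
        sw   $t1, 144($s1)     # temp 0 = $t1, temp 0 is M[36]  (0*4 + 144)
        lw   $a0, 144($s1)     # $a0 = temp 0, the value to print
        li   $v0, 1            # service code 1: print the int in $a0
        syscall
        jr   $ra               # return to whoever called main

        .data
M:      .word 0,1,2,3,4,5,6,7,8,9   # constants, M[0]-M[9]
        .space 104                  # 26 variables a to z, M[10]-M[35]
        .space 800                  # temps, M[36]-M[235]
```

Run with spim -file offsets.s (the file name is arbitrary), this should print 3. Since the program makes no function calls, $ra is never overwritten and can be used directly for the return.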
We still want to do the simple example f = g + 5. Here
that is in 5 steps, ending with actual MIPS assembler.
1. f = g + 5
2. add f, g, 5
3. load value of location g into register t1
   load value of 5 into register t2
   add values in t1 and t2, leaving result in register t3
   store value in t3 into location f
4. load M[16] into $t1
   load M[5] into $t2
   add $t1 and $t2, result in $t3
   store the value of $t3 into M[15]
5. lw $t1, 64($s1)
   lw $t2, 20($s1)
   add $t3, $t1, $t2
   sw $t3, 60($s1)
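The same load, operate, store pattern carries over to other statements. As a hedged sketch, here is one way the two statements g = 8; h = g - 3; might be translated, followed by code to print h. The statements are chosen only for illustration, and the offsets come from the table above: g is M[16] at 64($s1), h is M[17] at 68($s1), and the constants 8 and 3 are at 32($s1) and 12($s1).

```mips
# sub.s: g = 8; h = g - 3; then print h (illustrative sketch)
        .globl main
main:
        la   $s1, M            # $s1 = address of M[0]
        # g = 8;
        lw   $t0, 32($s1)      # $t0 = M[8], the constant 8
        sw   $t0, 64($s1)      # g = $t0
        # h = g - 3;
        lw   $t1, 64($s1)      # $t1 = g
        lw   $t2, 12($s1)      # $t2 = M[3], the constant 3
        sub  $t3, $t1, $t2     # $t3 = $t1 - $t2
        sw   $t3, 68($s1)      # h = $t3
        # print h
        li   $v0, 1            # service code 1: print the int in $a0
        lw   $a0, 68($s1)      # $a0 = h
        syscall
        jr   $ra               # return to whoever called main

        .data
M:      .word 0,1,2,3,4,5,6,7,8,9   # constants, M[0]-M[9]
        .space 104                  # 26 variables a to z, M[10]-M[35]
        .space 800                  # temps, M[36]-M[235]
```

Under spim this should print 5. Only the opcode (sub instead of add) and the offsets differ from the f = g + 5 example.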
You should realize that the use of an array M in this way for
all the storage that our compiler will need was my own arbitrary choice,
made to keep the compiler simpler. Essentially, I mapped
all the storage needed at run-time into a single array.
This is called static storage allocation (in a particularly
simple form).
There are other, better ways to handle variables: in general
one should use a symbol table for the variables that actually
occur in the program, and map each variable at
compile time to a memory location at run time. This would
allow arbitrarily many variables.
Similarly one could use the same table or a different table
for constants as they occur.
Alternatively you could handle the constants one-by-one as they arise,
just hard-wiring the constant into the assembly code.
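To illustrate the last alternative, here is a sketch of g = 8; f = g + 5; with the constants hard-wired into the instructions instead of loaded from M[0]-M[9]. It uses the li pseudo-instruction (load immediate) and addi (add immediate). This is one possible encoding, not the scheme used in this course. The layout of M is kept identical to the sample program only so that the variable offsets stay the same.

```mips
# hardwired.s: constants as immediates rather than table entries (sketch)
        .globl main
main:
        la   $s1, M            # $s1 = address of M[0]
        # g = 8;
        li   $t0, 8            # $t0 = 8, constant written into the code
        sw   $t0, 64($s1)      # g = $t0
        # f = g + 5;
        lw   $t1, 64($s1)      # $t1 = g
        addi $t3, $t1, 5       # $t3 = $t1 + 5, the 5 is an immediate operand
        sw   $t3, 60($s1)      # f = $t3
        # print f
        li   $v0, 1            # service code 1: print the int in $a0
        lw   $a0, 60($s1)      # $a0 = f
        syscall
        jr   $ra               # return to whoever called main

        .data
M:      .word 0,1,2,3,4,5,6,7,8,9   # constants (unused here, kept for layout)
        .space 104                  # 26 variables a to z, M[10]-M[35]
        .space 800                  # temps, M[36]-M[235]
```

Under spim this should print 13. One design point to keep in mind: the immediate field of addi is a 16-bit signed value, so larger constants need li (which the assembler expands as required) followed by an ordinary add.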
Example Program:
In the interest of "diving in", here is an initial sample
SPIM program. (Because of the green line numbers in the listing,
here is the source for the program below.)
sample.t:
f = 5;
g = 8;
r = f*f + g*g;
s = g*g + 2*f*g;
< r; < Blank;
< s; < NewL
1 # sample.s: initial sample spim program
2 # everything on a line after a "#" is a comment
3 # SPIM expects to call a function "main"
4 .globl main
5 main: # main: global
6 addu $s7, $ra, $zero # save $ra
7
8 # next get the address of an "array" M of ints
9 # M will hold all the storage for our program
10 la $s1, M
11 ### Start of compiled code
12 # f = 5;
13 lw $t0, 20($s1) # $t0 = 5
14 sw $t0, 60($s1) # f = $t0
15 # g = 8;
16 lw $t0, 32($s1) # $t0 = 8
17 sw $t0, 64($s1) # g = $t0
18 # r = f*f + g*g
19 lw $t0, 60($s1) # $t0 = f
20 mul $t1, $t0, $t0 # $t1 = $t0 * $t0
21 lw $t2, 64($s1) # $t2 = g
22 mul $t3, $t2, $t2 # $t3 = $t2 * $t2
23 add $t4, $t1, $t3 # $t4 = $t1 + $t3
24 sw $t4, 108($s1) # r = $t4
25 # s = g*g + 2*f*g
26 # note: g still in $t2, g*g in $t3, f in $t0
27 mul $t5, $t0, $t2 # $t5 = f * g
28 add $t5, $t5, $t5 # $t5*2, (=2*f*g)
29 add $t6, $t3, $t5 # $t6 = $t3 + $t5
30 # =g*g + 2*f*g
31 sw $t6, 112($s1) # s = $t6
32 # print r
33 li $v0, 1 # magic code: int
34 lw $a0, 108($s1)
35 syscall
36 # print Blank
37 li $v0, 4 # magic code: str
38 la $a0, Blank
39 syscall
40 # print s
41 li $v0, 1 # magic code: int
42 lw $a0, 112($s1)
43 syscall
44 # print NewL
45 li $v0, 4 # magic code: str
46 la $a0, NewL
47 syscall
48
49 ### End of compiled code
50 addu $ra, $s7, $zero
51 jr $ra # return
52 .data # storage for variables
53 M: .word 0,1,2,3,4,5,6,7,8,9 # const
54 .space 104 # 26 variables a to z
55 .space 800 # temps, M[36]-M[235]
56 Blank: .asciiz " "
57 NewL: .asciiz "\n"
58 Tab: .asciiz "\t"
% which spim # % = unix prompt
/usr/bin/spim
% spim -file sample.s # this executes it
SPIM Version 7.4 of January 1, 2009
Copyright 1990-2004 by James R. Larus
All Rights Reserved.
Loaded: /usr/lib/spim/exceptions.s
89 144 # expected output
Comments About the Example:
These refer to the green line numbers.
- Outer Frame: Lines 4-6 and 50-51 enclose the
program in a standard framework. Line 5 is the label
main: which is what the system looks for to put into execution.
At the same time the system places the address from which the program was
called into the special register $ra ("return address").
Line 6 copies this register into another register, $s7.
If the program made any function calls (which ours will
not), each call would overwrite $ra, so it would be
necessary to save its value first. At the other end, line 50
restores $ra from $s7, and line 51 is a "jump register"
instruction that jumps back to where main was called from.
There is no requirement that the label main be at the beginning
or that the return be at the end.
- syscall: The program contains 4 syscalls, for printing
strings and ints (lines 32-47). A syscall is a program-initiated interrupt
(also called an exception). It is often necessary for
the Operating System to interrupt the execution of a program
in order to attend to other matters (such as input/output).
In this case the syscall asks the OS to perform a service.
The OS kernel takes control. It examines the special
register $v0 to determine what should be done.
Here a code 1 says to print an int, and a code 4 says to
print a string. There are a number of other coded services.
In the first case here, the OS looks at register $a0 to get
the integer value to print, and in the second case it
expects $a0 to hold the address of the string to print.
In the second case, the .data section at the end of the
program defines labels such as Blank: and NewL:, each of which
stands for the address of a string.
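Both comments can be seen at work in one small sketch. Unlike the sample program, the program below does make function calls, so the copy of $ra held in $s7 is actually needed: each jal overwrites $ra. The helper print_int_nl is a hypothetical name introduced here for illustration. It uses service codes 1 and 4 exactly as described above.

```mips
# calls.s: why main saves $ra, plus the two print syscalls (sketch)
        .globl main
main:
        addu $s7, $ra, $zero   # save main's return address
        li   $a0, 89           # argument: the int to print
        jal  print_int_nl      # call; jal overwrites $ra
        li   $a0, 144          # argument: the int to print
        jal  print_int_nl      # call; jal overwrites $ra again
        addu $ra, $s7, $zero   # restore main's return address
        jr   $ra               # return to whoever called main

# print_int_nl (hypothetical helper): print the int in $a0, then a newline
print_int_nl:
        li   $v0, 1            # service code 1: print the int in $a0
        syscall
        li   $v0, 4            # service code 4: print the string at address $a0
        la   $a0, NewL         # $a0 = address of the newline string
        syscall
        jr   $ra               # back to the instruction after the jal

        .data
NewL:   .asciiz "\n"
```

Under spim this should print 89 and 144 on separate lines. Without the save in the first line and the restore before the final jr, $ra would still hold the return point of the second jal, and main would jump back into itself instead of returning.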
Revision date: 2013-02-16.
(Please use ISO 8601, the International Standard.)