CS 3723
 Programming Languages 
   The MIPS Computer and  
   SPIM Assembly Language  


History: Development of the MIPS hardware began in 1981 at Stanford University under John Hennessy and his slaves (= graduate students). It was one of two revolutionary RISC designs (the other being Berkeley RISC, the forerunner of SPARC, developed at Berkeley by Patterson and another group of graduate student slaves). Both of these architectures had a tremendous influence on the computer industry.

Hennessy used MIPS in his various computer organization and architecture textbooks. To quote from a manual with a link below: "The architecture of the MIPS computers is simple and regular, which makes it easy to learn and understand. The processor contains 32 general-purpose 32-bit registers and a well-designed instruction set that makes it a propitious target for generating code in a compiler." In contrast, he refers to the Intel 80x86 as "an architecture that is difficult to explain and impossible to love."


Resources: The simulator for MIPS, called SPIM, is available on our Linux machines (using the command spim as shown below). It accepts MIPS assembly language, and this will be the target of our (small) compiler. Resources with way too much information are:


What We Will Use: For our purposes, MIPS uses 32 bits for integers and instructions. It has add, sub, mul and div instructions for these integers. As a first example, suppose you want to find the value of f = g + 5, supposing g already has a value. You might like an instruction such as add f, g, 5, but all arithmetic in MIPS goes through special 32-bit locations called registers (there are 32 of them). Some registers are $ra, $s1, $t0, and so forth (they all start with '$'). In order to add two numbers, first you have to load the two numbers into registers. Then the add places the sum into a register. Finally you have to store the result into the memory location for f.

In this course, all our numbers and variables are going to be at offsets from an address stored in $s1. Our compiler will keep all its numbers, variables and temporary locations in a single "array" M of 32-bit integers. All addresses and offsets are in bytes, and a 32-bit integer occupies 4 bytes. So a reference to M[i] is an offset of i*4 from the start of M. Here's a table describing where these are located inside M:

Type of value          Location in M            Count  Formula                          Offset
Constants 0,1,...,9    M[0],M[1],...,M[9]       10     M[ch-'0'] = M[3] for ch='3'      ((ch-'0')*4)($s1) = 12($s1) for ch='3'
Variables a,b,...,z    M[10],M[11],...,M[35]    26     M[ch-'a'+10] = M[15] for ch='f'  ((ch-'a'+10)*4)($s1) = 60($s1) for ch='f'
Temporary variables    M[36],M[37],...,M[235]   200    M[n+36] for temp n>=0            (n*4+144)($s1) for temp n>=0
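The formulas in the table can be checked mechanically. The following is a minimal Python sketch, not part of the course compiler; the names index_in_M, operand and temp_operand are made up for illustration, and the layout constants are taken straight from the table.

```python
WORD = 4         # bytes per 32-bit integer
FIRST_VAR = 10   # M[10] holds variable a (from the table)
FIRST_TEMP = 36  # M[36] holds temporary number 0 (from the table)
NUM_TEMPS = 200  # temporaries occupy M[36]..M[235]

def index_in_M(ch):
    """Index into M for a constant '0'..'9' or a variable 'a'..'z'."""
    if len(ch) == 1 and "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if len(ch) == 1 and "a" <= ch <= "z":
        return ord(ch) - ord("a") + FIRST_VAR
    raise ValueError("not a constant or variable: %r" % (ch,))

def operand(ch):
    """Assembler operand, e.g. 60($s1), for a constant or variable."""
    return "%d($s1)" % (index_in_M(ch) * WORD)

def temp_operand(n):
    """Assembler operand for temporary number n, 0 <= n < 200."""
    if not 0 <= n < NUM_TEMPS:
        raise ValueError("no such temporary: %r" % (n,))
    return "%d($s1)" % ((n + FIRST_TEMP) * WORD)

print(operand("3"), operand("f"), operand("c"), temp_operand(0))
# prints: 12($s1) 60($s1) 48($s1) 144($s1)
```

Note that temporary 0 lands at 36*4 = 144, which is where the +144 in the last row of the table comes from.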

In the code below, we will load the address of M into the register $s1. Then the location for a variable c, which is in M[2+10] = M[12], will be denoted by (12*4)($s1)=48($s1) which means "48 bytes past the start of M, with the starting address stored in register $s1". We still want to do the simple example f = g + 5. Here that is in 5 steps, ending with actual MIPS assembler.

Step 1 (the source statement):
    f  =  g + 5
Step 2 (the instruction we would like to have):
    add  f,  g,  5
Step 3 (going through registers):
    load value of location  g  into register  t1
    load value of  5  into register  t2
    add values in  t1  and  t2 , leaving result in register  t3
    store value in  t3  into location  f
Step 4 (using locations in M):
    load  M[16]  into  $t1
    load  M[5]  into  $t2
    add  $t1  and  $t2 , result in  $t3
    store the value of  $t3  into  M[15]
Step 5 (actual MIPS assembler):
    lw    $t1, 64($s1)
    lw    $t2, 20($s1)
    add   $t3, $t1, $t2
    sw    $t3, 60($s1)
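A compiler can produce the final assembler form mechanically from the offsets. Here is a small Python sketch of that idea, assuming single-character operands laid out in M as described above; gen_add and operand are hypothetical names, and the fixed use of $t1, $t2, $t3 simply mirrors the example rather than any real register allocation scheme.

```python
def operand(ch):
    """Offset operand for a constant '0'..'9' or a variable 'a'..'z'."""
    if len(ch) == 1 and "0" <= ch <= "9":
        index = ord(ch) - ord("0")
    elif len(ch) == 1 and "a" <= ch <= "z":
        index = ord(ch) - ord("a") + 10
    else:
        raise ValueError("not a constant or variable: %r" % (ch,))
    return "%d($s1)" % (index * 4)

def gen_add(dest, left, right):
    """Assembler lines for dest = left + right, all operands living in M."""
    return [
        "lw    $t1, " + operand(left),   # load the left operand
        "lw    $t2, " + operand(right),  # load the right operand
        "add   $t3, $t1, $t2",           # add in registers
        "sw    $t3, " + operand(dest),   # store the sum
    ]

for line in gen_add("f", "g", "5"):
    print(line)
# prints:
# lw    $t1, 64($s1)
# lw    $t2, 20($s1)
# add   $t3, $t1, $t2
# sw    $t3, 60($s1)
```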

You should realize that the use of an array M in this way for all the storage that our compiler will need was my own arbitrary choice, made to keep the compiler simpler. Essentially, I mapped all the storage needed at run-time into a single array. This is called static storage allocation (in a particularly simple form).

There are other, better ways to handle variables: in general one should use a symbol table for the variables that actually occur in the program, and map each variable at compile time to a memory location at run time. This would allow arbitrarily many variables.

Similarly, one could use the same table or a different table for constants as they occur. Alternatively you could handle the constants one-by-one as they arise, just hard-wiring the constant into the assembly code.
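To make the symbol-table alternative concrete, here is a minimal Python sketch. It is only an illustration under assumed conventions: the class SymbolTable and its methods are hypothetical, and each new name is simply given the next free 4-byte slot in the order in which it is first seen.

```python
class SymbolTable:
    """Maps each variable name to a byte offset, assigned on first use."""

    def __init__(self, word=4):
        self.word = word     # bytes per variable
        self.offsets = {}    # name -> byte offset from the start of storage

    def offset(self, name):
        """Return the offset for name, allocating a new slot if needed."""
        if name not in self.offsets:
            self.offsets[name] = len(self.offsets) * self.word
        return self.offsets[name]

    def size(self):
        """Total bytes to reserve, e.g. with a .space directive."""
        return len(self.offsets) * self.word

table = SymbolTable()
for name in ["count", "total", "count", "x1"]:
    print(name, table.offset(name))
print(".space", table.size())
# prints:
# count 0
# total 4
# count 0
# x1 8
# .space 12
```

With a table like this, variable names are no longer limited to the 26 single letters, and storage is reserved only for the variables that actually occur in the program.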


Example Program: In the interest of "diving in", here is an initial sample SPIM program. (The green line numbers in the listing are there for reference only; they are not part of the source.)

sample.t:  f = 5;
           g = 8;
           r = f*f + g*g;
           s = g*g + 2*f*g;
           < r; < Blank;
           < s; < NewL

1   # sample.s: initial sample spim program
2   # everything on a line after a "#" is a comment
3   # MIPS expects to call a function "main"
4           .globl  main
5   main:                    # main: global
6           addu    $s7, $ra, $zero  # save $ra
7
8   # next get the address of an "array" M of ints
9   # M will hold all the storage for our program
10          la      $s1, M
11  ### Start of compiled code
12  # f = 5;
13          lw      $t0, 20($s1)   # $t0 = 5
14          sw      $t0, 60($s1)   # f = $t0
15  # g = 8;
16          lw      $t0, 32($s1)   # $t0 = 8
17          sw      $t0, 64($s1)   # g = $t0
18  # r = f*f + g*g
19          lw      $t0, 60($s1)   # $t0 = f
20          mul     $t1, $t0, $t0  # $t1 = $t0 * $t0
21          lw      $t2, 64($s1)   # $t2 = g
22          mul     $t3, $t2, $t2  # $t3 = $t2 * $t2
23          add     $t4, $t1, $t3  # $t4 = $t1 + $t3
24          sw      $t4, 108($s1)  # r = $t4
25  # s = g*g + 2*f*g
26  # note: g still in $t2, g*g in $t3, f in $t0
27          mul     $t5, $t0, $t2  # $t5 = f * g
28          add     $t5, $t5, $t5  # $t5*2, (=2*f*g)
29          add     $t6, $t3, $t5  # $t6 = $t3 + $t5
30                                 #     = g*g + 2*f*g
31          sw      $t6, 112($s1)  # s = $t6
32  # print r
33          li      $v0, 1   # magic code: int
34          lw      $a0, 108($s1)
35          syscall
36  # print Blank
37          li      $v0, 4   # magic code: str
38          la      $a0, Blank
39          syscall
40  # print s
41          li      $v0, 1   # magic code: int
42          lw      $a0, 112($s1)
43          syscall
44  # print NewL
45          li      $v0, 4   # magic code: str
46          la      $a0, NewL
47          syscall
48
49  ### End of compiled code
50          addu    $ra, $s7, $zero
51          jr      $ra          # return
52          .data   # storage for variables
53  M:      .word   0,1,2,3,4,5,6,7,8,9 # const
54          .space  104  # 26 variables a to z
55          .space  800  # temps, M[36]-M[235]
56  Blank:  .asciiz " "
57  NewL:   .asciiz "\n"
58  Tab:    .asciiz "\t"

% which spim                 # % = unix prompt
/usr/bin/spim
% spim -file sample.s        # this executes it
SPIM Version 7.4 of January 1, 2009
Copyright 1990-2004 by James R. Larus
All Rights Reserved.
Loaded: /usr/lib/spim/exceptions.s
89 144                       # expected output
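The expected output can be checked by hand, or with a rough Python model of the compiled code. This is only a sketch that mimics the loads, stores and arithmetic of the listing on a Python list standing in for M; it is not a MIPS simulator, and the helper name at is made up.

```python
# M as laid out in the .data section: constants, variables a-z, temporaries
M = list(range(10)) + [0] * 26 + [0] * 200

def at(offset):
    """Turn a byte offset from $s1 into an index into M."""
    if offset % 4 != 0:
        raise ValueError("offset is not a multiple of 4")
    return offset // 4

# f = 5; g = 8;
M[at(60)] = M[at(20)]     # lw $t0, 20($s1) ; sw $t0, 60($s1)
M[at(64)] = M[at(32)]     # lw $t0, 32($s1) ; sw $t0, 64($s1)

# r = f*f + g*g
t0 = M[at(60)]            # f
t1 = t0 * t0              # f*f
t2 = M[at(64)]            # g
t3 = t2 * t2              # g*g
M[at(108)] = t1 + t3      # r

# s = g*g + 2*f*g, reusing t0, t2 and t3
t5 = t0 * t2              # f*g
t5 = t5 + t5              # 2*f*g
M[at(112)] = t3 + t5      # s

print(M[at(108)], M[at(112)])   # r, then s
# prints: 89 144
```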


Comments About the Example: These refer to the green line numbers.

  • Outer Frame: Lines 4-6 and 50-51 enclose the program in a standard framework. Line 5 is the label main:, which is where the system starts execution. At the same time the system places the address from which the program was called into the special register $ra ("return address"). Line 6 then copies this register into another register, $s7. If the program made any function calls (which ours will not), each call would overwrite $ra, so it would be necessary to save its value. At the other end, line 50 restores $ra from $s7, and line 51 is a "jump register" instruction that jumps back to where main was called from. There is no requirement that the label main be at the beginning or that the return be at the end.

  • syscall: The program contains 4 syscalls, for printing strings and ints (lines 32-47). A syscall is a program-initiated interrupt (also called an exception). It is often necessary for the Operating System to interrupt the execution of a program in order to attend to other matters (such as input/output). In this case the syscall asks the OS to perform a service. The OS kernel takes control. It examines the special register $v0 to determine what should be done. Here a code 1 says to print an int, and a code 4 says to print a string. There are a number of other coded services. In the first case here, the OS looks at register $a0 to get the integer value to print, and in the second case it expects $a0 to hold the address of the string to print. The .data section at the end of the program defines labels such as Blank: and NewL:, each of which stands for the address of its string, and the la instruction loads that address into $a0.
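The two printing patterns are regular enough that a compiler can emit them from a template. The sketch below, in Python, is one hypothetical way to do so; gen_print_int and gen_print_string are invented names, and only the service codes 1 and 4 and the instruction sequences come from the listing above.

```python
PRINT_INT = 1      # syscall code: print the integer held in $a0
PRINT_STRING = 4   # syscall code: print the string whose address is in $a0

def gen_print_int(operand):
    """Lines that print the integer stored at operand, e.g. 108($s1)."""
    return [
        "li      $v0, %d" % PRINT_INT,
        "lw      $a0, " + operand,
        "syscall",
    ]

def gen_print_string(label):
    """Lines that print the string found at a .data label, e.g. Blank."""
    return [
        "li      $v0, %d" % PRINT_STRING,
        "la      $a0, " + label,
        "syscall",
    ]

# print r, then a blank, as in lines 32-39 of the listing
for line in gen_print_int("108($s1)") + gen_print_string("Blank"):
    print(line)
# prints:
# li      $v0, 1
# lw      $a0, 108($s1)
# syscall
# li      $v0, 4
# la      $a0, Blank
# syscall
```

The difference between the two templates is the one described above: an int is loaded by value with lw, while a string is passed by address with la.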


Revision date: 2013-02-16. (Please use ISO 8601, the International Standard.)