CS 2733/2731, Org II, Fall 2003 Final Exam, Selected and Partial Answers 1. # Answer to Problem 1 .globl main main: add $s7, $0, $ra # save return address .data A: .word 4, 9, 25, 49, 121, 169, 289 # squares of first 7 primes .text ################# Answer to Problem 2 ########################## la $s1, A # start address of A addi $s2, $0, 0 # running sum addi $s3, $0, 0 # array index of A addi $s4, $0, 7 # constant 7 Loop: lw $t1, 0($s1) # $t1 = A[$s3] add $s2, $s2, $t1 # $s2 = sum of A[] so far addi $s1, $s1, 4 # $s1 += 4 addi $s3, $s3, 1 # $s3 += 1 bne $s3, $s4, Loop # branch back to Loop until $s4 == 7 addi $v0, $0, 1 # print the sum add $a0, $0, $s2 syscall ################# End of Answer to Problem 2 ################### addi $v0, $0, 4 # print a newline la $a0, Newln syscall add $ra, $0, $s7 # restore return address jr $ra .data Newln: .asciiz "\n" ################# Output ####################################### # ten42% spim -file exam1_2.s # 666 ################# End of output ################################ Another anwser: ################# Second Answer to Problem 2 ################### la $s1, A # start address of A addi $s2, $0, 0 # running sum addi $s3, $0, 0 # array index of A addi $s4, $0, 7 # constant 7 Loop: mul $t0, $s3, 4 # $t0 = array index * 4 add $t2, $t0, $s1 # $t2 = start of A + offset lw $t1, 0($t2) # $t1 = contents at start of A + offset add $s2, $s2, $t1 # $s2 = sum of A[] so far addi $s3, $s3, 1 # $s3 += 1 bne $s3, $s4, Loop # branch back to Loop until $s4 == 7 addi $v0, $0, 1 # print the sum add $a0, $0, $s2 syscall ################# End of Second Answer to Problem 2 ############ ------------------------------------------------------------------------- 2. ##### CS 2734, Final Exam, Problem 2 ####### .globl main main: addu $s7, $zero, $ra ########## MAIN FOR PROB 2 ################# addi $a0, $0, 25 # Param1 = 12 addi $a1, $0, 144 # Param2 = 144 jal F # call F add $a0, $0, $v0 # $v0 = ret val li $v0, 1 # print it syscall jal Newl # print newline # Finish main addu $ra, $zero, $s7 # normal end of main jr $ra # return to system ########## END OF MAIN ##################### ########## PROB 2, function F ############ F: add $v0, $a0, $a1 jr $ra ########## END OF FUNCTION F ########## ########## write newline ################ Newl: li $v0, 4 la $a0, Newline syscall jr $ra ########## DATA ######################### .data Newline: .asciiz "\n" ######################################### # Output: # 169 ######################################### ------------------------------------------------------------------------- 3. This instruction is like the first part of lw or sw and the last part of add. No additional data lines or control lines are needed. Fetching the instruction, updating the PC, and fetching registers are the same as for all instructions. The ALU does the same as for lw or sw: ALUresult = (Output of Read data 1) + sign-extend(IR[15-0]); The control settings are the same as for those instructions: ALUSrc = 1, ALUOp = 00. (So that input to ALU becomes 010 (add).) For the rest, we have to route the ALUresult around back into the register file, using IR[20-16] for the register. (Note that add has Reg[IR[15-11]] = ALUresult) Thus we need RegWrite = 1, MemtoReg = 0, and RegDest = 0. (Note that add needs RegDest = 1, because the destination register is in bits 15-11 for add, while it is bits 20-16 for addi.) ------------------------------------------------------------------------- 4. Just the standard lw diagram for the singlecycle implementation, with FIVE cycles. ------------------------------------------------------------------------- 5. Register 2 is forwarded from the sub instruction (WB stage) and register 4 from the and instructions (MEM stage). Both are forwarded to the or instruction (EX stage). Three lines going into the forwarding unit should be marked A and three lines marked B. One line going out to a mulitplexor is marked A and one marked B. In addition, one 32-bit data forwarding line in marked A and one is marked B. (10 lines should be marked altogether!) ------------------------------------------------------------------------- 6. A stall is needed because of the lw instruction. See Section 6.5. The Hazard Detection Unit notices that a lw instruction is in the EX stage (by checking that the MemRead flag is 1). It also checks that the result of the lw is in a register needed by the next instruction (in the ID stage). In this case it inserts a 1-cycle stall, by inserting zeros in for the control signals in the ID stage, and by deasserting the IF/IDWrite control line, so that nothing is written into the IF/ID on that cycle. It also deasserts the PCWrite signal, so that nothing is written into the PC. Thus the next instruction is _not_ written into IF/ID until IF/IDWrite is asserted again, when the flow of instructions can start up. Also the PC value is not updated until the next cycle. Thus a "bubble" is created in the sequence of instructions going through. Two cycles later (see Fig. 6.48) when the lw instruction is in its final stage and when the following and instruction is in its execute stage, the value of $2 must be forwarded into the ALU for use by the and instruction, using the Forwarding unit as for other data hazards. At the same time, lw finishes loading the new value into $2. ------------------------------------------------------------------------- 7. (a) When the system starts your program, it starts at line 29. Then the "jal main" class the main function and starts the earlier code labeled with "main:". Finally, after your program finishes, control returns to line 30 which does a MIPS exit. (b) Start in on line 20 when there is any exception. (c) Lines 20-22 will print the message "Duhh-hhhhh!" (d) Line 25 adds 4 to the EPC value, the address of the offending instruction. (e) Line 26 returns to the next instruction past the instruction that caused the interrupt. ------------------------------------------------------------------------- 8. (a) Uses 10 bits for index (or cache address), so that the cache holds 1024 words of data. (b) Since these are words the low-order 2 bits of any address of a word are always zeros. (c) The tag field holds the remaining bits, after leaving off the low order 2 bits, and the next 10 bits. (10 + 2 + 20 = 32) (d) The hardware extracts the low-order 10 bits (except for the lowest 2 bits), and uses these 10 bits as an address in the cache table. At this address, the hardware checks if the valid bit is set (so it is a valid entry), and checks if the tag field matches the high-order 20 bits in the address being looked up. A match means a cache hit. (e) In case of a cache miss, see page 551 at the bottom. ------------------------------------------------------------------------- 9. See text, chapter 8.