CS 2733/2731
Computer Organization II -- Fall 2003
Review for Final Exam
(Wednesday, 10 December 2003,
1:30-4:15 pm)
Previous Reviews:
Refer to the reviews for Exam 1 and Exam 2 for material covered
up to the second exam:
See Review for Exam 1 and Review for Exam 2.
Previous Exams: Previous exams are available through the individual
web pages. This final may be significantly different from previous ones.
Review topics: Here are the new topics since the second exam,
mainly just Chapter 6 on the pipelined implementation.
Note that the final will emphasize Chapter 6, since that is
the most important topic since the second exam.
The final also covers the new topics of
caches, buses, and the recitation on exceptions.
- Exceptions and the trap handler in MIPS (Lab 10).
- The specifics of how the trap handler works.
- Take-home quiz on exceptions:
quiz
- Pipelined Implementation (Chapter 6)
- Overview (Section 6.1, includes especially discussion of
hazards).
- Pipelined datapath (Section 6.2, ignores hazards).
Note the several ways of graphically representing pipelines
on pages 461-465, especially the series of diagrams
in Figures 6.22-6.24.
- Pipelined control (Section 6.3, ignores hazards).
The control signals from the Control unit are the same as
for the single-cycle implementation, but the pipelined implementation
makes use of extra latched
register storage between pipeline stages to pass along control
information. (See Figures 6.29 and 6.30. See also the
good series of diagrams in Figures 6.31-6.35.)
- Data hazards and forwarding (Section 6.4).
You should understand how forwarding works, with the Forwarding Unit
and extra control and data lines.
This applies to dependencies between the register result of one
instruction and the use of that new register value in subsequent
instructions. (See especially Figure 6.38 and the
series of diagrams in Figures 6.41-6.42. We did not cover
Figure 6.43 or page 488.)
- Data hazards and stalls (Section 6.5).
In case of a dependency involving a lw instruction
(the very next instruction uses the register being loaded),
the machine must stall for one cycle. This uses the
Hazard Detection Unit.
Note how the stall is inserted by
de-asserting write lines to the PC and the IF/ID registers,
and by inserting all zeros on control lines. (These zeros
propagate along from cycle to cycle, as the "bubble" of
a stalled instruction moves along the pipeline.)
See diagrams in Figures 6.47-6.49. Note that the
final forwarding is done by the Forwarding Unit as before.
- Branch hazard (Section 6.6). In one simple implementation,
if the branch is taken, the machine
must stall and wipe out the start of the next instruction.
(Branch handling also needs to move into step 2 of the pipeline.)
See Figures 6.51 and 6.52. Skip Section 6.6 from page 501 on.
- Exceptions (Section 6.7). Example of arithmetic overflow.
The IF.Flush, ID.Flush, and EX.Flush lines put a nop
in the first stage and set the control lines to zeros in the second
and third stages.
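The forwarding and stall conditions from Sections 6.4 and 6.5 can be sketched in a few lines of Python. This is only a study aid, not the hardware: the function names forward_select and must_stall and their parameter names are made up here, although the parameters are meant to mirror pipeline-register fields such as EX/MEM.RegWrite and ID/EX.MemRead. The sketch also assumes every instruction in ID reads both rs and rt, which is a simplification.

```python
def forward_select(id_ex_reg, ex_mem_regwrite, ex_mem_rd,
                   mem_wb_regwrite, mem_wb_rd):
    """Return the 2-bit mux select for one ALU operand.

    id_ex_reg is the source register number (rs or rt) of the
    instruction currently in the EX stage.
    """
    # EX hazard: the instruction one ahead (now in MEM) writes the
    # register we need. Register 0 is never forwarded.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_reg:
        return 0b10  # take the ALU result from the EX/MEM register
    # MEM hazard: the instruction two ahead (now in WB) writes it.
    # Checked second, so the more recent EX/MEM value wins.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_reg:
        return 0b01  # take the value from the MEM/WB register
    return 0b00      # no hazard: use the value read from the register file


def must_stall(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Load-use check: True if the instruction in ID must wait one cycle.

    The instruction in EX is a load (MemRead asserted) and its
    destination rt is a source of the instruction in ID.
    """
    return bool(id_ex_memread and id_ex_rt in (if_id_rs, if_id_rt))
```

For example, with sub $2,$1,$3 followed immediately by and $12,$2,$5, the call forward_select(2, True, 2, False, 0) returns 0b10, so the ALU takes $2 from the EX/MEM register. With lw $2,20($1) followed by and $4,$2,$5, must_stall(True, 2, 2, 5) returns True: forwarding alone cannot help, because the loaded value does not exist until the end of the MEM stage.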
- Caches (Chapter 7)
- The general idea of caching, used not just for memory,
but with disk storage and elsewhere.
- SRAM versus DRAM (B.5, pages B-26 to B-33).
- The idea of hashing, with a hash function and
with some method for resolving collisions:
- Open addressing, that is, using the next available
sequential location after the hash address.
- Bucketing, that is, using a linked list attached
to each hash address.
- Overflow area, that is, using an additional area for data that
collided.
- A very simple example, with 3-bit cache addresses, and 5-bit
memory addresses. Use the low-order 3 bits of the memory address
for the cache address. See Figures 7.5 and 7.6.
- A simple approach to a cache, involving a cache table such
as the one shown in Figure 7.7, with 1024 cache entries
(using a 10-bit cache address), indexes
in the range from 0 to 1023, a valid bit, a 20-bit tag field,
and a 32-bit data field.
For lookup:
- The CPU generates an address.
- Extract a 10-bit index from bits 11-2 of the address.
- For the address of a word, bits 1-0 will be 0.
- Compare the 20-bit quantity from bits 31-12 of the
address with the tag field. If the entry is valid and the two
are equal, you have a hit, and return the data field.
Otherwise, you have a miss; go out to main memory.
- In case of a miss, stall the CPU, fetch the word from
memory, load it into the cache, and restart the instruction
so that this time there will be a cache hit.
- A very similar but slightly more complicated example as
shown in Figure 7.8, with 14-bit cache addresses (16K entries), and
a 16-bit tag field. Once again, bits 0 and 1 are not used,
since the cache is fetching words.
- Using spatial locality
(which means that if an item is referenced, items whose addresses
are close by will tend to be referenced soon):
As illustrated in Figure 7.10,
each cache entry could be a block of 4 words, so that a cache
miss will fetch 4 adjacent words, leading to likely hits afterwards.
- Set-associative caches, as illustrated in Figure 7.19,
where 4 completely distinct words are held for each cache index
(4-way set-associative).
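The lookup steps for the simple 1024-entry cache (10-bit index, 20-bit tag, one word per entry) can be sketched as follows. This is a minimal model for study purposes, assuming 32-bit byte addresses; the names split_address and DirectMappedCache are invented for this sketch, and main memory is stood in for by a Python dictionary.

```python
INDEX_BITS = 10
NUM_ENTRIES = 1 << INDEX_BITS  # 1024 entries


def split_address(addr):
    """Split a 32-bit byte address into (tag, index)."""
    index = (addr >> 2) & (NUM_ENTRIES - 1)  # bits 11-2
    tag = (addr >> 12) & 0xFFFFF             # bits 31-12 (20 bits)
    return tag, index


class DirectMappedCache:
    def __init__(self, memory):
        # memory: dict mapping word address -> word (stand-in for main memory)
        self.memory = memory
        self.valid = [False] * NUM_ENTRIES
        self.tag = [0] * NUM_ENTRIES
        self.data = [0] * NUM_ENTRIES

    def read(self, addr):
        """Return (word, hit) for the word at byte address addr."""
        tag, index = split_address(addr)
        if self.valid[index] and self.tag[index] == tag:
            return self.data[index], True    # hit: return the data field
        # Miss: fetch the word from memory and load it into the cache,
        # replacing whatever was stored at this index.
        word = self.memory.get(addr, 0)
        self.valid[index] = True
        self.tag[index] = tag
        self.data[index] = word
        return word, False
```

Addresses 0x1004 and 0x2004 both map to index 1 but have different tags (1 and 2), so reading them alternately misses every time. This is the kind of conflict that an associative cache is meant to reduce.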
- Buses (Section 8.4)
- Structure: data lines, address lines, control lines.
- Types: processor-memory, I/O buses, backplane.
- Protocols: Synchronous (uses a clock), asynchronous (uses handshaking).
- Example protocol: Figures 8.10 and 8.11. (See also the
handout illustrating these figures.)
- Performance: Synchronous is faster, but cannot be as long
or as versatile as asynchronous.
- Bus arbitration: bus master and slave. With several masters,
arbitration is needed. Four types:
- Daisy-chain: "simple and cheap, but not fair or fast".
Chains through devices from highest to lowest priority.
- Centralized, parallel arbitration: Uses a request line
from each device to the arbiter. (PCI bus)
- Distributed arbitration by self-selection: the devices
requesting bus access determine who gets access.
(Each device wanting bus access places a code indicating
its identity on the bus.) (NuBus)
- Distributed arbitration by collision detection:
Each device wanting to use the bus checks if it is in use,
and if not, starts using it. If two or more devices start
at times close enough together, there will be a collision.
In case of a collision, the devices stop using the bus,
wait a random time interval, and try again. (Ethernet)
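As a small illustration of why daisy-chain arbitration is "not fair", here is a sketch of how the grant passes down the chain. The function name daisy_chain_grant and the list representation of the request lines are assumptions made for this example, with device 0 taken to be the highest-priority device.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that gets the bus, or None.

    requests[i] is True if device i is requesting the bus.
    Device 0 is first in the chain (highest priority).
    """
    for device, wants_bus in enumerate(requests):
        if wants_bus:
            # This device keeps the grant and does not pass it on.
            return device
    return None  # nobody asked for the bus
```

With requests [False, True, True], device 1 gets the bus. As long as device 1 keeps requesting, device 2 never sees the grant.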
Likely final exam questions:
- No questions about the use of CMOS transistors to create
gates.
- Probably several questions about MIPS assembly and machine
language. I might ask for the code for a simple loop
(use of beq or bne), or
for a simple call to a function (jal)
and the code for the function (jr $ra to return), or
access to memory (use of lw or sw),
or saving and restoring a register on the stack.
- Possibly one question about the correspondence between
machine code and assembler code (as in Lab 8:
Hand Assembly of MIPS Code).
- At least one question on either the single-cycle or
multi-cycle implementation. This might even ask you to do
something creative, such as implementing a new instruction.
The text has exercises asking how to implement the
addi instruction in both the single-cycle
and the multi-cycle models, without additional control lines.
(Instruction decode does have to recognize the new instruction.)
- Extra emphasis on the pipelining chapter, the first seven
sections.
- I might ask about the pipelined datapath or pipelined
control (lots of figures in sections 6.2, 6.3).
- I plan to give you a diagram with a forwarding unit,
for you to explain how the forwarding unit works in simple
cases not involving a stall (section 6.4, figures 6.41, 6.42).
- I might ask about the hazard detection unit and stall used
to handle the lw instruction (section 6.5,
figures 6.47, 6.48, 6.49).
- I might ask about handling the beq instruction
by moving everything into step 2, and by putting in a 1-cycle
stall bubble (for successful branch) (section 6.6,
figures 6.51, 6.52).
- I might ask about handling an arithmetic overflow exception
by flushing the pipeline and restarting at hex address
40000040 (section 6.7, figures 6.55, 6.56).
- Probably one question about exception handlers.
- Probably one question about caches.
- Probably one question about buses.
Good Luck!!!
Revision date: 2003-12-02.
(Please use ISO
8601, the International Standard.)