Worksheet 3

Computer Design Supervision 3/5
Robert Mullins
(a) 2011 Paper 5 Question 3
https://www.cl.cam.ac.uk/teaching/exams/pastpapers/y2011p5q3.pdf
(b) What is a branch delay slot and what can it help to avoid? If we can't find a suitable
instruction to move into the branch delay slot what are we forced to fill it with?
(c) Imagine we attempt to boost the performance of a simple 5-stage pipeline with the following
extensions:
(i) The ability to fetch two instructions per cycle
(ii) The execute stage now consists of 3 independent functional units. On each cycle we are
able to issue two instructions from the decode stage to two of these units.
Draw the data forwarding paths required to ensure dependent instructions can execute on
consecutive clock cycles.
What checks would need to be made in the decode stage before the instructions were issued in
parallel?
(d) Every ARM instruction is conditionally executed. How might this feature be used to improve
the code below?
CMP r0, #5
BEQ BYPASS ; if r0!=5 {
ADD r1, r1, r0 ; r1 := r1 + r0 - r2
SUB r1, r1, r2 ; }
BYPASS: ....
(e) Provide an example of a subroutine call in MIPS and x86.
(f) Write an iterative version of Fibonacci in Java and figure out what the disassembled code
means. Run through the code for fib(4). [lecture 9, slide 27]
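The disassembly itself is on the lecture slide; as a starting point, one plausible iterative version is sketched below (the slide's exact code is not reproduced here, so the class and method names are illustrative):

```java
// Iterative Fibonacci: a minimal sketch for question (f).
// fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
public class Fib {
    static int fib(int n) {
        int a = 0, b = 1;          // a = fib(i), b = fib(i+1)
        for (int i = 0; i < n; i++) {
            int next = a + b;      // fib(i+2)
            a = b;
            b = next;
        }
        return a;                  // a now holds fib(n)
    }

    public static void main(String[] args) {
        System.out.println(fib(4)); // fib(4) = 3
    }
}
```

Compile with `javac Fib.java`, then disassemble with `javap -c Fib` and trace the bytecode for fib(4) by hand, tracking the operand stack and local variables each iteration.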
(g) What characteristics of typical memory access patterns enable a hierarchy of memories to be
exploited?
(h) In Figure 1, what type of cache is each of X, Y and Z? The figure highlights a block of main
memory and shows, for each cache, where that block could potentially be stored.
(i) What type of cache is shown in Figure 2?
Figure 1.
Figure 2.
(j) Increasing cache block size to take advantage of spatial locality may initially reduce cache
miss rates, but why might further increases actually begin to increase the cache miss rate?
(k)
Average memory access time = Hit time + Miss rate × Miss penalty
(i) How might the miss rate of a cache be reduced?
(ii) How might the miss penalty of a cache be reduced?
(iii) Which type of cache will have the smallest access time: a direct-mapped cache or a
fully-associative cache? Assume the caches are equal in size (or capacity).
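As a quick sanity check on the formula, here is a small worked example with assumed figures (a 1-cycle hit time, 5% miss rate and 100-cycle miss penalty; these numbers are illustrative, not from the course):

```java
// Worked example of: AMAT = Hit time + Miss rate x Miss penalty.
// All parameter values below are illustrative assumptions.
public class Amat {
    static double amat(double hitTime, double missRate, double missPenalty) {
        return hitTime + missRate * missPenalty;
    }

    public static void main(String[] args) {
        // 1-cycle hit, 5% miss rate, 100-cycle miss penalty:
        // AMAT = 1 + 0.05 * 100 = 6 cycles
        System.out.println(amat(1.0, 0.05, 100.0) + " cycles");
    }
}
```

Note how strongly the miss penalty term dominates even at a 5% miss rate, which motivates sub-questions (i) and (ii) above.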
(l) Describe one advantage of employing a “write back” policy when handling a write to a cache.
(m) Draw a block diagram of a hypothetical processor with a 5-stage pipeline. Clearly label the
stages of the pipeline, major components and data forwarding paths, the L1 instruction cache, L1
data cache and the TLBs.
(i) Why would we prefer not to access our L1 cache with physical addresses?
(ii) What potential issues are there with virtually-indexed, physically-tagged caches?
Further work
Feel free to complete further past papers in addition to the work above, or the lecturer's
exercises:
http://www.cl.cam.ac.uk/teaching/1415/CompDesign/cam-only/exercises.pdf
I suggest you read Chapter 7 (Microarchitecture) and Chapter 8 (Memory Hierarchy) of the
Harris & Harris book.