Project 2 - ACE home page

Department of Electrical Engineering and Computer Science
Ohio University
Spring 2014
EE 4683/5683: Computer Architecture
Project # 2
Due Date: Wednesday April 16, 2014 by 11:59 EST
Instructions:
1) Use the Dinero simulator to solve Questions 1-4. Questions 5-7 are paper-and-pencil problems.
2) Submit a PDF of your analysis of the data and your results. One sample output per question
can be attached at the end of the solution as an appendix. Do not include the output of every
simulation run.
Question 1: Calculate the Local and Global Miss Rates (Use macr.din only)
Calculate the global and local miss rates for a 2-level cache. Use a 16KB level-1 cache and
vary the level-2 cache size over 2KB, 8KB, 32KB, and 128KB, with a block size of 16 bytes,
LRU replacement policy, and 2-way associativity. Compare these values with the miss rates of
a single-level cache.
a) Consider only data caches for simulation and plot the results (Logarithmic scale makes the
plot clearer).
b) What conclusion can you draw from this?
c) Is the local miss rate a good measure of secondary caches?
d) What is the optimum value of the secondary cache size?
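The local and global miss rates differ only in their denominators. A minimal Python sketch of the bookkeeping, assuming you have already read the reference and miss counts off the simulator output. The function name and the counts in the usage lines are hypothetical placeholders, not results for macr.din.

```python
def miss_rates(l1_refs, l1_misses, l2_misses):
    """Return (L1 miss rate, L2 local miss rate, L2 global miss rate).

    l1_refs   -- references issued by the CPU to the level-1 cache
    l1_misses -- level-1 misses (the references that reach level-2)
    l2_misses -- level-2 misses
    """
    l1_rate = l1_misses / l1_refs
    l2_local = l2_misses / l1_misses   # misses per reference reaching L2
    l2_global = l2_misses / l1_refs    # misses per reference issued by the CPU
    return l1_rate, l2_local, l2_global


# Hypothetical counts, for illustration only.
l1_rate, l2_local, l2_global = miss_rates(
    l1_refs=100_000, l1_misses=4_000, l2_misses=1_000
)
print(f"L1 miss rate        = {l1_rate:.4f}")
print(f"L2 local miss rate  = {l2_local:.4f}")
print(f"L2 global miss rate = {l2_global:.4f}")
```

Note that the global rate equals the L1 miss rate multiplied by the L2 local miss rate, which is a useful sanity check on your numbers.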
Question 2: Unified and split caches
(use cc1.din only)
Compare the cache miss ratios of the following two systems:
• a system with a 32K-byte unified cache
• a system with a 16K-byte instruction-only cache and a 16K-byte data-only cache.
Assume the caches are 4-way set associative with LRU replacement policy and a block size of 32
bytes. What conclusions can you draw from this experiment?
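To compare the two systems on equal terms, the split caches need a single overall miss ratio weighted by how many references each cache receives. A small sketch, assuming you take the instruction and data reference and miss counts from the simulator output. The function names and all counts below are hypothetical, not results for cc1.din.

```python
def unified_miss_ratio(refs, misses):
    """Miss ratio of a single unified cache."""
    return misses / refs


def split_miss_ratio(i_refs, i_misses, d_refs, d_misses):
    """Overall miss ratio of a split I-cache/D-cache pair.

    Summing misses and references weights each cache by its share
    of the total reference stream.
    """
    return (i_misses + d_misses) / (i_refs + d_refs)


# Hypothetical counts, for illustration only.
unified = unified_miss_ratio(refs=100_000, misses=3_000)
split = split_miss_ratio(i_refs=75_000, i_misses=1_500,
                         d_refs=25_000, d_misses=2_000)
print(f"unified: {unified:.4f}  split: {split:.4f}")
```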
Question 3: Multi-level L2 Cache
(use spice.din only)
Calculate the average memory access time (AMAT) for a 2-level cache with:
(a) level-1 cache size of 16KB and level-2 cache size of 64KB
(b) level-1 cache size of 32KB and level-2 cache size of 128KB
Assume a direct-mapped level-1 cache and a 4-way set-associative level-2 cache with a block size
of 16 bytes. The hit time for level-1 is 1 clock cycle and for level-2 is 8 clock cycles. Let the
miss penalty for level-2 be 50 clock cycles and the memory access latency be 100 clock cycles.
What can you say about the performance of the two organizations?
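The usual two-level AMAT expression is Hit Time (L1) + Miss Rate (L1) x (Hit Time (L2) + Local Miss Rate (L2) x Miss Penalty (L2)). A minimal sketch of that formula follows. The handout gives both a level-2 miss penalty and a memory access latency, and how you account for them is part of your analysis, so the sketch simply takes the penalty as a parameter. The function name and the miss rates in the usage lines are hypothetical, not results for spice.din.

```python
def amat_two_level(l1_hit_time, l1_miss_rate,
                   l2_hit_time, l2_local_miss_rate, l2_miss_penalty):
    """Average memory access time, in clock cycles, for a 2-level cache.

    l2_local_miss_rate is L2 misses divided by L2 accesses (L1 misses).
    """
    l1_miss_penalty = l2_hit_time + l2_local_miss_rate * l2_miss_penalty
    return l1_hit_time + l1_miss_rate * l1_miss_penalty


# Hit times from the handout; the miss rates are hypothetical placeholders.
amat = amat_two_level(l1_hit_time=1, l1_miss_rate=0.05,
                      l2_hit_time=8, l2_local_miss_rate=0.20,
                      l2_miss_penalty=50)
print(f"AMAT = {amat:.3f} clock cycles")
```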
Question 4: Calculation of CPI (use all 3 traces)
Calculate the effective CPI of a 2-level cache hierarchy with a 4KB direct-mapped level-1 cache
and a 64KB 2-way set-associative level-2 cache, with a block size of 32 bytes for both
caches. The base CPI is assumed to be 1.5 and there are 1.45 memory references per
instruction. The hit time for level-1 is 1 clock cycle and the hit time for level-2 is 15 clock
cycles. The miss penalty for level-2 is assumed to be 60 clock cycles.
[Hint:
Average Memory Stalls/Instruction =
    Misses/Instruction (L1) x Hit Time (L2) + Misses/Instruction (L2) x Miss Penalty (L2)
Misses/Instruction (L1) = Miss Rate (L1) x Memory Accesses/Instruction
Misses/Instruction (L2) = Miss Rate (L2) x Memory Accesses/Instruction]
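The hint can be turned into a short calculation. The sketch below assumes that the effective CPI is the base CPI plus the average memory stalls per instruction, and that Miss Rate (L2) is measured per CPU reference (the global rate), since the hint multiplies it by memory accesses per instruction. Both assumptions, the function name, and the miss rates in the usage lines are illustrative, not values from the traces.

```python
def effective_cpi(base_cpi, refs_per_instr,
                  l1_miss_rate, l2_miss_rate,
                  l2_hit_time, l2_miss_penalty):
    """Effective CPI = base CPI + memory stall cycles per instruction.

    l2_miss_rate is assumed to be the global L2 miss rate
    (L2 misses per CPU memory reference).
    """
    l1_misses_per_instr = l1_miss_rate * refs_per_instr
    l2_misses_per_instr = l2_miss_rate * refs_per_instr
    stalls_per_instr = (l1_misses_per_instr * l2_hit_time
                        + l2_misses_per_instr * l2_miss_penalty)
    return base_cpi + stalls_per_instr


# Base CPI, references per instruction, and timings from the handout;
# the two miss rates are hypothetical placeholders.
cpi = effective_cpi(base_cpi=1.5, refs_per_instr=1.45,
                    l1_miss_rate=0.10, l2_miss_rate=0.02,
                    l2_hit_time=15, l2_miss_penalty=60)
print(f"Effective CPI = {cpi:.3f}")
```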
Question 5: Replacement Algorithms for Cache (Do not use Dinero)
Consider a memory system with a two-level hierarchy with a cache M1 and main memory M2.
The size of the main memory, M2 is 256 bytes with block size of 16 bytes. The size of the
cache, M1 is 64 bytes with the same block size of 16 bytes. The word length is 4 bytes,
implying 4 words per block. A certain trace program generates the following sequence of word
addresses,
0,8,16,1,24,21,20,3,32,61,31,19,16,60,28,21,8,11,19,22,28,42,55,58,59
Note that every time a new block is brought into the cache, up to 4 words are received: if
block 0 is accessed from the main memory, words 0,1,2,3 are obtained; if block 1 is accessed,
words 4,5,6,7 are obtained; if block 2 is accessed, words 8,9,10,11 are obtained; and so on.
The addresses given above are word addresses, not memory block addresses. Assume that the
access time is 2 clocks from the cache (M1) and 50 clocks from main memory (M2), the transfer
rate is 4 bytes per clock, and 25% of the transfers are dirty. The base CPI of a perfect
memory system is 1.75.
(a) Consider a fully associative cache with LRU replacement policy. Determine the hit ratio.
What is the average memory access time?
(b) Consider a direct mapped cache. Determine the hit ratio. What is the average memory
access time?
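Although this question is to be worked by hand, a small simulator is a convenient way to check hit counts. The sketch below maps each word address to its block number (address divided by words per block) and counts hits for a fully associative LRU cache and for a direct-mapped cache with 4 frames of 4 words each. The function names are hypothetical, the short trace in the usage lines is an illustrative example rather than the trace above, and the sketch covers hit counting only, not the access-time calculation.

```python
from collections import OrderedDict

WORDS_PER_BLOCK = 4  # 16-byte blocks, 4-byte words
NUM_FRAMES = 4       # 64-byte cache / 16-byte blocks


def hits_fully_associative_lru(word_addrs, frames=NUM_FRAMES,
                               words_per_block=WORDS_PER_BLOCK):
    """Count hits in a fully associative cache with LRU replacement."""
    cache = OrderedDict()  # block numbers, ordered from LRU to MRU
    hits = 0
    for addr in word_addrs:
        block = addr // words_per_block
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) == frames:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return hits


def hits_direct_mapped(word_addrs, frames=NUM_FRAMES,
                       words_per_block=WORDS_PER_BLOCK):
    """Count hits in a direct-mapped cache (frame = block mod frames)."""
    cache = [None] * frames
    hits = 0
    for addr in word_addrs:
        block = addr // words_per_block
        frame = block % frames
        if cache[frame] == block:
            hits += 1
        else:
            cache[frame] = block
    return hits


# Illustrative trace (not the one in the question): blocks 0, 4, and 8
# all map to frame 0 in the direct-mapped cache.
demo_trace = [0, 1, 16, 2, 32, 0, 16]
fa_hits = hits_fully_associative_lru(demo_trace)
dm_hits = hits_direct_mapped(demo_trace)
print(f"fully associative LRU: {fa_hits}/{len(demo_trace)} hits")
print(f"direct mapped:         {dm_hits}/{len(demo_trace)} hits")
```

The demo trace is chosen so that the direct-mapped cache suffers conflict misses that the fully associative cache avoids.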
Problem 6 (Do not use Dinero)
You purchased an Acme computer with the following features:
• 95% of all memory accesses are found in the cache
• each cache block is two words, and the whole block is read on any miss
• the processor sends references to its cache at the rate of 10^9 words per second
• 25% of those references are writes
• assume that the memory system can support 10^9 words per second, reads or writes
• the bus reads or writes a single word at a time (the memory system cannot read or write
two words at once)
• assume at any one time, 30% of the blocks in the cache have been modified
• the cache uses write allocate on a write miss
You are considering adding a peripheral to the system, and you want to know how much of the
memory system bandwidth is already used. Calculate the percentage of memory system
bandwidth used on average in the two cases below. Be sure to state your assumptions.
a. The cache is write through
b. The cache is write back
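One way to organize the calculation is to count the average number of words moved over the bus per processor reference and divide the resulting traffic by the memory bandwidth. The sketch below does this under one possible set of assumptions, which you should replace with your own: in the write-through case every write sends one word to memory and every miss fetches a whole block; in the write-back case every miss fetches a whole block and a replaced block is written back in full with probability equal to the dirty fraction. The function names and the parameter values in the usage lines are hypothetical and deliberately differ from the problem's.

```python
def write_through_words_per_ref(hit_rate, write_frac, block_words):
    """Average bus words per reference, write-through with write allocate."""
    miss_rate = 1 - hit_rate
    read_frac = 1 - write_frac
    read_miss = read_frac * miss_rate * block_words         # fetch block
    write_hit = write_frac * hit_rate * 1                   # write one word
    write_miss = write_frac * miss_rate * (block_words + 1) # fetch + write
    return read_miss + write_hit + write_miss


def write_back_words_per_ref(hit_rate, dirty_frac, block_words):
    """Average bus words per reference, write-back with write allocate."""
    miss_rate = 1 - hit_rate
    # Every miss fetches a block; a dirty victim is written back in full.
    return miss_rate * (block_words + dirty_frac * block_words)


def bandwidth_used(words_per_ref, ref_rate, mem_bandwidth):
    """Fraction of memory bandwidth consumed (rates in words per second)."""
    return words_per_ref * ref_rate / mem_bandwidth


# Hypothetical parameters, chosen only to exercise the functions.
wt = bandwidth_used(write_through_words_per_ref(0.90, 0.20, 4), 1e9, 1e9)
wb = bandwidth_used(write_back_words_per_ref(0.90, 0.25, 4), 1e9, 1e9)
print(f"write through: {wt:.1%}  write back: {wb:.1%}")
```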
Problem 7 (Do not use Dinero)
The main memory of a computer is organized as 256 blocks, with a block size of 8 words. The
cache has 16 block frames. For the questions below, show the mappings from the numbered
blocks of main memory to block frames of the cache. Draw all lines showing the mappings as
clearly as possible.
a) Show the direct mapping and the address bits that identify the tag field, the index field, and
the word offset field (the bits identifying a word within a block).
b) Show the fully associative mapping and the address bits that identify the tag field and the
word offset field.
c) Show the mapping for the 2-way set associative mapping and the address bits that identify
the tag field, the set number, and the word number.
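The field widths for all three mappings follow from the same arithmetic: the word offset needs log2(words per block) bits, the set number needs log2(number of sets) bits, and the tag takes whatever remains of the block number. A minimal sketch, assuming a word-addressed memory and power-of-two sizes. The function names are hypothetical, and the usage lines use a smaller illustrative memory (64 blocks of 4 words, 8 block frames) rather than the one in this problem.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a positive power of two")
    return n.bit_length() - 1


def address_fields(mem_blocks, words_per_block, cache_frames, ways):
    """Return (tag_bits, set_bits, offset_bits) for a word address.

    ways=1 gives direct mapping; ways=cache_frames gives fully
    associative mapping (a single set, so set_bits is 0).
    """
    offset_bits = log2_exact(words_per_block)
    block_bits = log2_exact(mem_blocks)   # bits in the block number
    set_bits = log2_exact(cache_frames // ways)
    tag_bits = block_bits - set_bits
    return tag_bits, set_bits, offset_bits


# Illustrative configuration, not the one in the problem.
direct = address_fields(mem_blocks=64, words_per_block=4, cache_frames=8, ways=1)
fully = address_fields(mem_blocks=64, words_per_block=4, cache_frames=8, ways=8)
two_way = address_fields(mem_blocks=64, words_per_block=4, cache_frames=8, ways=2)
for name, fields in (("direct", direct), ("fully associative", fully),
                     ("2-way", two_way)):
    print(f"{name}: tag={fields[0]} set/index={fields[1]} offset={fields[2]}")
```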