Department of Electrical Engineering and Computer Science
Ohio University
Spring 2014

EE 4683/5683: Computer Architecture
Project #2
Due Date: Wednesday, April 16, 2014 by 11:59 EST

Instructions:
1) Use the Dinero simulator to solve Problems 1-4. Problems 5-7 are paper-and-pencil problems.
2) Submit a PDF of your analysis of the data and your results. One sample output per question may be attached at the end of the solution as an appendix. Do not include the output of every simulation run.

Question 1: Calculate the Local and Global Miss Rates (use macr.din only)

Calculate the global and local miss rates for a 2-level cache. Use a 16KB level-1 cache and vary the level-2 cache size over 2KB, 8KB, 32KB, and 128KB, with a block size of 16 bytes, LRU replacement policy, and 2-way associativity. Compare these values with the miss rates of a single-level cache.
a) Consider only data caches for simulation and plot the results (a logarithmic scale makes the plot clearer).
b) What conclusion can you draw from this?
c) Is the local miss rate a good measure of secondary caches?
d) What is the optimum value of the secondary cache size?

Question 2: Unified and split caches (use cc1.din only)

Compare the cache miss ratios of the following two systems:
- a system with a 32K-byte unified cache
- a system with a 16K-byte instruction-only cache and a 16K-byte data-only cache
Assume the caches are 4-way set associative with LRU replacement policy and a block size of 32 bytes. What conclusions can you draw from this experiment?

Question 3: Multi-level L2 Cache (use spice.din only)

Calculate the AMAT for a 2-level cache with:
(a) level-1 cache size of 16KB and level-2 cache size of 64KB
(b) level-1 cache size of 32KB and level-2 cache size of 128KB
Assume direct-mapped for level-1 and 4-way set-associative for level-2, with a block size of 16 bytes. The hit time for level-1 is 1 clock cycle and for level-2 is 8 clock cycles.
Let the miss penalty for level-2 be 50 clock cycles and memory access latency be 100 clock cycles. What can you say about the performance of the two organizations?

Question 4: Calculation of CPI (use all 3 traces)

Calculate the effective CPI of a 2-level cache hierarchy with a direct-mapped level-1 cache of 4KB and a 2-way set-associative level-2 cache of 64KB, with a block size of 32 bytes for both caches. The base CPI is assumed to be 1.5 and there are 1.45 memory references per instruction. The hit time for level-1 is 1 clock cycle and the hit time for level-2 is 15 clock cycles. The miss penalty for level-2 is assumed to be 60 clock cycles.

[Hint:
Average memory stalls/instruction = Misses/instruction (L1) x Hit time (L2) + Misses/instruction (L2) x Miss penalty (L2)
Misses/instruction (L1) = Miss rate (L1) x Memory accesses/instruction
Misses/instruction (L2) = Miss rate (L2) x Memory accesses/instruction]

Question 5: Replacement Algorithms for Cache (Do not use Dinero)

Consider a memory system with a two-level hierarchy consisting of a cache, M1, and main memory, M2. The size of the main memory, M2, is 256 bytes with a block size of 16 bytes. The size of the cache, M1, is 64 bytes with the same block size of 16 bytes. The word length is 4 bytes, implying 4 words per block. A certain trace program generates the following sequence of word addresses:

0, 8, 16, 1, 24, 21, 20, 3, 32, 61, 31, 19, 16, 60, 28, 21, 8, 11, 19, 22, 28, 42, 55, 58, 59

Note that every time the cache accesses a new block, up to 4 words are received: if block 0 is accessed from main memory, words 0, 1, 2, 3 are obtained; if block 1 is accessed, words 4, 5, 6, 7 are obtained; if block 2 is accessed, words 8, 9, 10, 11 are obtained; and so on. The addresses given above are word addresses, not memory block addresses. Assume that the access time is 2 clocks from the cache (M1) and 50 clocks from main memory (M2), that the transfer rate is 4 bytes per clock, and that 25% of the transfers are dirty.
The base CPI of a perfect memory system is 1.75.
(a) Consider a fully associative cache with LRU replacement policy. Determine the hit ratio. What is the average memory access time?
(b) Consider a direct-mapped cache. Determine the hit ratio. What is the average memory access time?

Problem 6 (Do not use Dinero)

You purchased an Acme computer with the following features:
- 95% of all memory accesses are found in the cache
- each cache block is two words, and the whole block is read on any miss
- the processor sends references to its cache at the rate of 10^9 words per second
- 25% of those references are writes
- assume that the memory system can support 10^9 words per second, reads or writes
- the bus reads or writes a single word at a time (the memory system cannot read or write two words at once)
- assume at any one time, 30% of the blocks in the cache have been modified
- the cache uses write allocate on a write miss
You are considering adding a peripheral to the system, and you want to know how much of the memory system bandwidth is already used. Calculate the percentage of memory system bandwidth used on average in the two cases below. Be sure to state your assumptions.
a. The cache is write through
b. The cache is write back

Problem 7 (Do not use Dinero)

The main memory of a computer is organized as 256 blocks, with a block size of 8 words. The cache has 16 block frames. For the questions below, show the mappings from the numbered blocks of main memory to block frames of the cache. Draw all lines showing the mappings as clearly as possible.
a) Show the direct mapping and the address bits that identify the tag field, the index field, and the word offset field (the bits identifying a word within a block).
b) Show the fully associative mapping and the address bits that identify the tag field and the word offset field.
c) Show the mapping for the 2-way set associative mapping and the address bits that identify the tag field, the set number, and the word number.
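
As a way to check the address-field breakdown asked for in Problem 7, the following is a minimal Python sketch. It is not part of the assignment. It assumes word-addressed memory and power-of-two sizes. The helper names (log2_exact, address_fields, set_for_block) are hypothetical, chosen only for illustration. Direct mapping is treated as 1-way associativity and fully associative mapping as a single set holding all block frames.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two; reject anything else."""
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError("size must be a positive power of two")
    return n.bit_length() - 1


def address_fields(memory_blocks, words_per_block, cache_frames, ways):
    """Return (tag_bits, index_bits, offset_bits) for a word address.

    ways = 1 models a direct-mapped cache; ways = cache_frames models a
    fully associative cache (one set, so zero index bits).
    """
    address_bits = log2_exact(memory_blocks * words_per_block)
    offset_bits = log2_exact(words_per_block)
    index_bits = log2_exact(cache_frames // ways)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


def set_for_block(block_number, cache_frames, ways):
    """Return the set (or frame, when ways = 1) a memory block maps to."""
    return block_number % (cache_frames // ways)


if __name__ == "__main__":
    # Parameters taken from Problem 7: 256 blocks, 8 words/block, 16 frames.
    for name, ways in (("direct", 1), ("fully associative", 16), ("2-way", 2)):
        tag, index, offset = address_fields(256, 8, 16, ways)
        print(f"{name}: tag={tag} index={index} offset={offset}")
```

The three field widths always sum to the width of the word address, which gives a quick sanity check on a hand-drawn mapping. The set_for_block helper can be used the same way to spot-check individual lines in the mapping diagram.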