Intel Nehalem is the microarchitecture of the Intel Core i7, which was first released in November 2008. A Nehalem processor has 2, 4, or 8 cores. Each core has 64 KB of L1 cache (a 32 KB L1 instruction cache and a 32 KB L1 data cache, both with 256-bit, i.e. 32-byte, cache blocks) and a unified 256 KB L2 cache. All cores share an 8 MB, 16-way associative L3 cache with 64-byte blocks. The L1 instruction cache is 4-way associative with a latency of 3 cycles, while the L1 data cache is 8-way associative with a latency of 4 cycles. The L2 cache access time is 10 cycles, and the L3 access time is 40 cycles. My Core i7 laptop has 1600 MHz DDR3 memory (with a 13.75 ns access time) and a processor clock rate of 2.6 GHz.

Here is a diagram of one core and the shared "uncore" resources (from Wikipedia):

Intel Nehalem arch.svg
By Appaloosa - Own work, CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=3887926

Question 1

Given a 32-bit address, give the breakdown into tag, index, and offset at each level of cache:

  1. L1 instruction cache
  2. L1 data cache
  3. L2 cache
  4. L3 cache
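
The field widths for any one cache follow from its capacity, associativity, and block size. The following is a minimal Python sketch of that arithmetic, assuming power-of-two sizes and a conventional set-associative layout. The function name `address_breakdown` is hypothetical, and the L2 cache is left out because its associativity and block size are not stated above.

```python
def address_breakdown(cache_bytes, ways, block_bytes, addr_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    Assumes cache_bytes, ways, and block_bytes are all powers of two.
    """
    num_sets = cache_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1  # log2 of a power of two
    index_bits = num_sets.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


KB = 1024
MB = 1024 * KB

# Geometry taken from the description above (256-bit blocks = 32 bytes).
l1i = address_breakdown(32 * KB, ways=4, block_bytes=32)
l1d = address_breakdown(32 * KB, ways=8, block_bytes=32)
l3 = address_breakdown(8 * MB, ways=16, block_bytes=64)

for name, fields in [("L1 instruction", l1i), ("L1 data", l1d), ("L3", l3)]:
    tag, index, offset = fields
    print(f"{name}: tag={tag} index={index} offset={offset}")
```

The same function applies to the L2 cache once its associativity and block size are fixed.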

Question 2

Consider a memory read for 32-bit address 0x5F852B24 that is in memory, but has not been previously visited by this core and is not in any cache. Give the address and size of every read issued to any cache or memory, including multiple reads if necessary to fill a block; and the tag, index, and offset checked at each level of cache.
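
Splitting an address into its fields is a matter of shifting and masking. The sketch below assumes the L1 data cache geometry given above (32 KB, 8-way, 32-byte blocks, hence 7 index bits and 5 offset bits). The helper names `split_address` and `block_base` are hypothetical. Other levels use the same code with their own field widths.

```python
def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, offset) for the given field widths."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


def block_base(addr, offset_bits):
    """Address of the first byte of the block containing addr."""
    return addr & ~((1 << offset_bits) - 1)


ADDR = 0x5F852B24

# L1 data cache: 7 index bits, 5 offset bits (assumed from the stated geometry).
tag, index, offset = split_address(ADDR, index_bits=7, offset_bits=5)
base = block_base(ADDR, offset_bits=5)

print(f"tag=0x{tag:05X} index=0x{index:02X} offset=0x{offset:02X}")
print(f"block starts at 0x{base:08X} and spans {1 << 5} bytes")
```

A block fill reads the whole block starting at the base address, so the size of each read depends on the block size of the level being filled.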

Question 3

For a particular program, the L1 data cache miss rate is 0.7%, the L2 miss rate is 4%, and the L3 miss rate is 12%.

  1. What is the average data memory access time in cycles?
  2. What is the actual access time for the access in Question 2?
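
One way to set up the arithmetic for this question is sketched below in Python. It rests on several assumptions that the question does not state: the miss rates are local (each measured against accesses that reach that level), the levels are checked one after another, and the 13.75 ns memory access time is the whole cost of going to memory. The function names are hypothetical.

```python
def ns_to_cycles(ns, clock_ghz):
    """Convert a time in nanoseconds to processor cycles."""
    return ns * clock_ghz


def amat(hit_cycles, miss_rates, memory_cycles):
    """Average memory access time for a multi-level cache.

    hit_cycles and miss_rates are ordered from L1 outward.
    Miss rates are assumed to be local miss rates.
    """
    penalty = memory_cycles
    # Work inward: each level costs its hit time plus miss_rate * next-level cost.
    for hit, miss in zip(reversed(hit_cycles), reversed(miss_rates)):
        penalty = hit + miss * penalty
    return penalty


hit_cycles = [4, 10, 40]          # L1 data, L2, L3 latencies from the text
miss_rates = [0.007, 0.04, 0.12]  # L1 data, L2, L3 miss rates from this question
memory_cycles = ns_to_cycles(13.75, clock_ghz=2.6)

average = amat(hit_cycles, miss_rates, memory_cycles)

# An access that misses at every level, assuming strictly sequential lookups
# and a single memory access (extra reads to fill larger blocks are ignored).
full_miss = sum(hit_cycles) + memory_cycles

print(f"memory latency: {memory_cycles:.2f} cycles")
print(f"average access time: {average:.4f} cycles")
print(f"miss at every level: {full_miss:.2f} cycles")
```

If the miss rates are instead global, or if filling a block takes several memory reads, the formula changes accordingly.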