Computer Architecture/C.A (ETH Zürich, Spring 2020)

Lecture 21b: Memory Hierarchy and Caches

Tony Lim 2021. 7. 27. 22:28

SRAM is used for the L1 and L2 caches; it is more expensive than DRAM but provides lower latency and higher bandwidth.

Cache access

  1. Index into the tag and data stores using the index bits of the address.
  2. Check the valid bit in the tag store.
  3. Compare the tag bits of the address with the stored tag in the tag store.
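These three steps can be sketched in code. Below is a minimal direct-mapped tag-store model; the parameters (8 sets, 4-byte blocks) are my own assumptions for illustration, not from the slides:

```python
# Hypothetical direct-mapped cache parameters (assumed for this sketch).
NUM_SETS = 8
BLOCK_SIZE = 4  # bytes

# Tag store: one (valid, tag) entry per set.
tag_store = [{"valid": False, "tag": None} for _ in range(NUM_SETS)]

def split_address(addr):
    """Split a byte address into (tag, index, block offset)."""
    offset = addr % BLOCK_SIZE
    index = (addr // BLOCK_SIZE) % NUM_SETS
    tag = addr // (BLOCK_SIZE * NUM_SETS)
    return tag, index, offset

def access(addr):
    """Return True on a hit; on a miss, install the block and return False."""
    tag, index, _ = split_address(addr)
    entry = tag_store[index]            # step 1: index into the tag store
    if entry["valid"] and entry["tag"] == tag:   # steps 2 and 3
        return True
    entry["valid"], entry["tag"] = True, tag     # fill on miss
    return False
```

For example, `access(0x40)` misses on the first call and hits on the second; accessing `0x60`, which maps to the same set with a different tag, evicts `0x40`.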

In a direct-mapped cache, every block with the same index bits (for example, the 4 blocks 00010, 01010, 10010, 11010) maps to exactly the same location, and the tag tells us which of those blocks is the one actually stored there. It is also easy to design.

But since every such block competes for the same single location, conflicting blocks evict each other even while the rest of the cache sits free and unused.
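A small worked example of the conflict above, assuming the four listed addresses are 5-bit block addresses split into 3 index bits and 2 tag bits (my assumption, chosen to match the example):

```python
INDEX_BITS = 3

def decompose(block_addr):
    """Split a 5-bit block address into (tag, index)."""
    index = block_addr & 0b111       # low 3 bits select the set
    tag = block_addr >> INDEX_BITS   # remaining high bits are the tag
    return tag, index

for a in [0b00010, 0b01010, 0b10010, 0b11010]:
    tag, index = decompose(a)
    # every address prints index=010; only the tag differs (00, 01, 10, 11)
    print(f"addr={a:05b} tag={tag:02b} index={index:03b}")
```

All four blocks collide on index 010, so in a direct-mapped cache they would keep evicting each other.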

A 2-way set-associative cache gives us 2 entries per index, so two conflicting blocks can now coexist. But now we have one fewer index bit, hence larger tag bits in the address, and a tag comparison per way.
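A sketch of a 2-way set-associative tag store; the set count and the LRU replacement policy are my assumptions:

```python
# Hypothetical parameters for this sketch.
NUM_SETS = 4
WAYS = 2

# Each set holds up to WAYS tags, ordered most- to least-recently used.
sets = [[] for _ in range(NUM_SETS)]

def access(block_addr):
    """Return True on a hit; on a miss, fill the block, evicting LRU if needed."""
    index = block_addr % NUM_SETS
    tag = block_addr // NUM_SETS
    ways = sets[index]
    if tag in ways:              # hit: move tag to the MRU position
        ways.remove(tag)
        ways.insert(0, tag)
        return True
    if len(ways) == WAYS:        # set full: evict the LRU way
        ways.pop()
    ways.insert(0, tag)          # fill the missing block
    return False
```

Two blocks that map to the same set (e.g. block addresses 0 and 4) each miss once and then both hit, which a direct-mapped cache could not do.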

In a fully associative cache we no longer have index bits, only tag bits, so a block can be placed anywhere. But the tag must be compared against every entry, which causes longer latency, and hardware cost and complexity become the issue.

Handling Writes

Write-back = write to the next level only when the block is evicted (kicked out of the cache). Most systems use this.

  • Can combine multiple writes to the same block before eviction, potentially saving bandwidth between cache levels and saving energy.
  • Needs a bit in the tag store indicating that the block is dirty or modified.

Write-through = write to the next level at the time the write happens.

  • Simpler: no dirty bit.
  • All levels are up to date. Consistency: simpler cache coherence, because there is no need to check the tag stores of closer-to-processor caches for presence.
  • More bandwidth intensive.
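The two write policies can be contrasted with a toy model for a single cached block that only counts transfers to the next level; the `Block` class and its counter are hypothetical:

```python
class Block:
    """Toy model of one cached block under a given write policy."""
    def __init__(self, policy):
        self.policy = policy            # "back" or "through"
        self.dirty = False
        self.writes_to_next_level = 0

    def write(self):
        if self.policy == "through":
            self.writes_to_next_level += 1   # every write goes down immediately
        else:
            self.dirty = True                # write-back: just mark the block dirty

    def evict(self):
        if self.policy == "back" and self.dirty:
            self.writes_to_next_level += 1   # one combined write-back on eviction
            self.dirty = False
```

Writing the same block three times and then evicting it costs write-back one transfer but write-through three, which is the bandwidth/energy saving mentioned above.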

Allocate on write miss = bring the entire block into the cache on a write miss.

  • Can combine writes instead of writing each of them individually to the next level.
  • Simpler, because write misses can be treated the same way as read misses.
  • Requires transfer of the whole cache block.

No-allocate on write miss = don't bring the block into the cache; the write goes just to memory (the next level).

  • Conserves cache space if the locality of writes is low (potentially better cache hit rate).
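The two write-miss policies can be sketched as follows; the `write_miss` helper and its transfer counting are hypothetical, and the cache is modeled as a plain set of block addresses:

```python
def write_miss(policy, cache, block_addr):
    """Handle a write miss; return block transfers from the next level."""
    transfers = 0
    if policy == "allocate":
        cache.add(block_addr)   # fetch the whole block into the cache first
        transfers += 1          # one full-block transfer
    # no-allocate: the write goes straight to the next level, cache untouched
    return transfers

cache = set()
write_miss("allocate", cache, 0x10)     # block 0x10 is now cached
write_miss("no-allocate", cache, 0x20)  # block 0x20 stays out of the cache
```

Allocate pays one full-block transfer up front but can absorb later writes to the block; no-allocate avoids polluting the cache when writes have low locality.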