SRAM is used for the L1 and L2 caches; it is more expensive than DRAM but provides higher bandwidth.
Cache access
- index into the tag and data stores with the index bits of the address
- check the valid bit in the tag store
- compare the tag bits of the address with the stored tag in the tag store
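The three lookup steps above can be sketched in code. This is a minimal sketch, not the lecture's material; the block size, set count, and field widths are hypothetical parameters chosen for illustration.

```python
# Sketch of a direct-mapped cache lookup. Hypothetical parameters:
# 64-byte blocks (6 offset bits) and 256 sets (8 index bits).
OFFSET_BITS = 6
INDEX_BITS = 8
NUM_SETS = 1 << INDEX_BITS

# Tag store: one entry per set, each holding a valid bit and a tag.
tag_store = [{"valid": False, "tag": 0} for _ in range(NUM_SETS)]

def split_address(addr):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

def lookup(addr):
    """Return True on a cache hit, False on a miss."""
    tag, index, _ = split_address(addr)
    entry = tag_store[index]      # step 1: index into the tag store
    if not entry["valid"]:        # step 2: check the valid bit
        return False
    return entry["tag"] == tag    # step 3: compare the tags
```

In hardware all three steps happen in parallel with the data-store read; the code just makes the field extraction explicit.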
In a direct-mapped cache, every block whose address has the same index bits (for example 00010, 01010, 10010, 11010 — four addresses sharing the index 010) maps to exactly the same location, and the tag tells us which of these blocks is actually stored there. The design is also easy to build.
But since every such block is forced into the same location, two frequently used blocks with the same index keep evicting each other (conflict misses) even while the rest of the cache sits free and unused.
A 2-way set-associative cache gives us 2 entries per index, reducing conflict misses, but the tag field in the address gets larger (one fewer index bit).
In a fully associative cache we no longer have index bits, only tag bits: a block can go anywhere. This removes conflict misses, but every tag must be compared, causing longer hit latency; hardware cost and complexity become the issue.
Handling Writes
Write back = write the modified block to the next level only when the block is evicted (kicked out of the cache); most systems use this.
- can combine multiple writes to the same block before eviction, potentially saving bandwidth between cache levels + saving energy
- needs a bit in the tag store indicating that the block is dirty (modified)
Write through = write to the next level at the time the write happens
- simpler: no dirty bit
- all levels are up to date. Consistency: simpler cache coherence because there is no need to check close-to-processor caches' tag stores for presence
- more bandwidth intensive
Allocate on write miss = bring the entire block into the cache
can combine writes instead of writing each of them individually to the next level
simpler, because write misses can be treated the same way as read misses
requires transfer of the whole cache block
No allocate on write miss = don't bring the block into the cache; write directly to the next level (e.g., memory)
conserves cache space if locality of writes is low (potentially better cache hit rate)
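The bandwidth difference between the two write policies can be made concrete with a toy traffic count. This is an illustrative sketch, not the lecture's model: a single-level cache large enough to hold every stored block, counting only writes sent to the next level.

```python
def write_through_traffic(stores):
    """Write through: every store goes to the next level immediately."""
    return len(stores)

def write_back_traffic(stores):
    """Write back: stores accumulate in the cache; each dirty block is
    written to the next level once, at eviction (assuming it fits until
    then)."""
    dirty_blocks = set(stores)      # dirty bit marks each written block
    return len(dirty_blocks)        # one write-back per dirty block

# Four stores hitting two blocks: repeated writes to block "A" are
# combined by the write-back policy but not by write-through.
stores = ["A", "A", "A", "B"]
```

Here write-through sends four writes downstream while write-back sends two, which is the bandwidth/energy saving noted above.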