# Quick Recap on Cache
* cache design goal: __copy__ a piece of memory that you use often into faster hardware (the cache) so future accesses are faster :DD
hierarchy, from slowest/largest to fastest/smallest: disk > memory > cache > register
* when you access memory: the CPU gives you a byte address. Now if you go directly to memory, and memory transfers only that one byte you requested, you will get annoyed (way too slow).
So why not, when you access a byte by its byte address, also grab several neighboring bytes that you may use in the future? (spatial locality)
> * these additional bytes are stored in the `cache` !!
> * "these several bytes", as one transfer unit, are called a `block`
With the cache and block design, memory transfers data in chunks, and memory access becomes much more efficient.
* cache size = number of blocks * block size
block size can be given as a number of words or a number of bytes.
Assume the address is 32-bit, your cache size is only 256 KB, and the block size is 16 bytes (i.e. 4 words):
_number of blocks_ = 256 * 2^10 bytes / 16 bytes = 2^14 blocks
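To double-check that arithmetic, here is a minimal C sketch (the 256 KB / 16-byte numbers are just the example above; `log2u` is a tiny helper defined here, not a library function) that derives the block count and the field widths used in the table below:

```c
#include <stdio.h>

/* log2 for powers of two, e.g. 16 -> 4 */
static unsigned log2u(unsigned x) {
    unsigned n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

int main(void) {
    unsigned cache_size = 256 * 1024;   /* 256 KB             */
    unsigned block_size = 16;           /* 16 bytes = 4 words */

    unsigned num_blocks  = cache_size / block_size;        /* 2^14 */
    unsigned offset_bits = log2u(block_size);               /* 4    */
    unsigned index_bits  = log2u(num_blocks);               /* 14   */
    unsigned tag_bits    = 32 - index_bits - offset_bits;   /* 14   */

    printf("blocks=%u offset=%u index=%u tag=%u\n",
           num_blocks, offset_bits, index_bits, tag_bits);
    return 0;
}
```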
Then, to access a certain block with a 32-bit address, the address splits into:
| bit number | 31 to 18 | 17 to 4 | 3 to 0 |
| -------- | -------- | -------- | -------- |
| field | tag (14 bits) | index (14 bits, since there are only 2^14 blocks) | offset (4 bits) |
The above is a direct-mapped cache. This implies that any two addresses with the same bits 4-17 land in the same place: when I access 0...1|__0...0__|0000 and 0...0|__0...0__|0000, I will access the same block slot in the cache, and they evict each other. This is a conflict miss TT
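Here is a hedged C sketch of that conflict (the two addresses are made-up values; any pair sharing bits 4-17 but differing in the tag behaves the same): both decompose to the same index, so in a direct-mapped cache they keep kicking each other out.

```c
#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 4    /* 16-byte blocks           */
#define INDEX_BITS  14   /* 2^14 blocks in the cache */

int main(void) {
    /* two addresses that differ only in the tag bits (31..18) */
    uint32_t a = 0x00040000u;   /* tag = 1, index = 0, offset = 0 */
    uint32_t b = 0x00000000u;   /* tag = 0, index = 0, offset = 0 */

    uint32_t idx_a = (a >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t idx_b = (b >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t tag_a = a >> (OFFSET_BITS + INDEX_BITS);
    uint32_t tag_b = b >> (OFFSET_BITS + INDEX_BITS);

    /* same index, different tag -> they fight over one cache line */
    printf("index: %u vs %u, tag: %u vs %u\n", idx_a, idx_b, tag_a, tag_b);
    return 0;
}
```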
So if you don't want such things to happen (i.e. several blocks fighting over the same slot), why not let a block have several possible locations to reside in? This is exactly the concept of a set!
* n-way set associative
An n-way set associative cache divides all its blocks into m sets of n lines each (n x m = total number of blocks in the cache). This means data whose block number is _k_ resides in set `k % m`, and within that set it may sit in any of the n lines (ways).
i.e. the structure of blocks in the cache will be (each column is one set):
| set 0 | set 1 | ... | set m-1 |
| -------- | -------- | -------- | -------- |
| 0 | 1 | ... | m-1 |
| m | m+1 | ... | 2m-1 |
| ... | ... | ... | ... |
Then how many columns and how many rows? (i.e. since a set is one column above, how big is a set?)
Previously we got the total number of blocks in the cache, so it is straightforward: the number of sets is m = _total number of blocks_ / n, and each set (each column) holds exactly n lines, which is the number of rows.
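Sticking with the 2^14-block cache from before, here is a small C sketch of the mapping, assuming n = 4 ways purely as an example: there are m = 2^12 sets, and a block goes to set `block % m`, where it may occupy any of the 4 lines.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    unsigned total_blocks = 1u << 14;               /* from the earlier example */
    unsigned n_ways       = 4;                      /* assumed associativity    */
    unsigned num_sets     = total_blocks / n_ways;  /* m = 2^12 sets            */

    /* a memory block with block number k lives in set k % num_sets,
       and may occupy any of the n_ways lines inside that set        */
    uint32_t addr  = 0x00040000u;
    uint32_t block = addr / 16;                     /* 16-byte blocks           */
    uint32_t set   = block % num_sets;

    printf("sets=%u, block %u -> set %u (any of %u ways)\n",
           num_sets, block, set, n_ways);
    return 0;
}
```

Note that the two conflicting addresses from the direct-mapped example now land in the same set but can occupy different ways, so they no longer evict each other.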
* About dirty blocks ?? updates ??
* say you want to modify the data you brought into the cache.
* things to note: the cache holds only a temporary __copy__. The real and more permanent data is in memory.
* So how to modify:
    * Option 1 (write-through): you sacrifice efficiency; whenever you modify the data, you modify both the cache and memory. Way too slow, but you ensure memory gets your modification!
    * Option 2 (write-back): you modify the data in the cache for now and continue to work on other tasks. (faster, no need to write to memory yet!)
* problem: your memory holds stale data now !!
e.g. you update A from 3 to 5. Well.. in this case you only modify the A in your cache; the A in memory is not yet updated!
> to keep your modification, you need to write A back to memory sometime in the future.
> ### when ?
> It depends on you. But a good choice is __when you replace that block in the cache with another data block.__
> #### Another trouble: on that replacement you need to
> (1) bring in the new data block, and
> (2) write back the block you modified previously (called a dirty block).
> That's a lot to do on one miss.
> _How about bringing in the new data block first, and delaying (2) until you have time to write?_
* write buffer: exactly the hardware that lets you delay (2)!
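To tie the write-back story together, here is a toy-sized C sketch (the single cache line, the struct names, and the one-entry write buffer are all invented for illustration, not any real hardware interface): a write marks the cached block dirty, and on replacement the dirty block is parked in the write buffer so the new block can be brought in first.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* toy single-line cache with a one-entry write buffer */
struct line { uint32_t tag; uint8_t data[16]; int valid, dirty; };
struct wbuf { uint32_t tag; uint8_t data[16]; int pending; };

static struct line cache_line;
static struct wbuf write_buffer;

/* evict the current line: if it is dirty, park it in the write buffer
   so the new block can be brought in right away (step (1) first,
   step (2) deferred)                                                  */
static void evict_and_refill(uint32_t new_tag, const uint8_t *new_data) {
    if (cache_line.valid && cache_line.dirty) {
        write_buffer.tag = cache_line.tag;
        memcpy(write_buffer.data, cache_line.data, 16);
        write_buffer.pending = 1;          /* memory update happens later */
    }
    cache_line.tag = new_tag;
    memcpy(cache_line.data, new_data, 16);
    cache_line.valid = 1;
    cache_line.dirty = 0;
}

int main(void) {
    uint8_t block_a[16] = {3};             /* A = 3 lives in this block */
    uint8_t block_b[16] = {0};

    evict_and_refill(1, block_a);          /* bring A's block in        */
    cache_line.data[0] = 5;                /* write-back: update cache  */
    cache_line.dirty   = 1;                /* ...and mark it dirty      */

    evict_and_refill(2, block_b);          /* replacement: dirty block  */
    printf("write buffer pending=%d, buffered A=%u\n",
           write_buffer.pending, (unsigned)write_buffer.data[0]);
    return 0;
}
```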
---
* The above is only a 20-minute quick draft of my understanding of caches. Thanks for reading, and any discussion or correction is welcome!
A more complete version will come soon 😅?!
* __if there's any topic in computer architecture you want to discuss, please kindly comment and I can write some notes on it ~__