Memory Hierarchy

# Memory Hierarchy ## Introduction Memory 想要同時兼顧速度快且空間大花費又小幾乎是不可能的，必須在`Speed`與`Space`間`trade off`，因此本篇主要講述如何利用有限的Memory資源，透過Memory Hierarchy 達到Memory Access efficiency ### Principle of Locality * **Temporal locality** items accessed recently are likely to be accessed again soon * **Spatial locality** items near those accessed recently are likely to be accessed soon 利用`Principle of Locality`優勢，Memory可以採用Hierarchy方式存取 * Store everything on disk * Copy recently accessed (and nearby) items from disk to smaller DRAM memory (Main Memory) * Copy more recently accessed (and nearby) items from DRAM to smaller SRAM memory (Cache memory attached to CPU) ![Memory Hierarchy](https://hackmd.io/_uploads/ryT7PbKUke.png) :::warning 越靠近Processor的Memory存取速度越快，但Size也越小Cost越高 ::: ### Memory Hierarchy Levels ![image](https://hackmd.io/_uploads/ryICnZt8yg.png) * **Block (line**) minimum unit of information that can be either present or not present in a two-level hierarchy > ex: two-level hierarchy (cache and main memory) * **Hit** if accessed data is present in an upper level (ex: cache) * Hit rate $=\frac{\# hits}{\# accesses}$ * Hit time the time to access the upper level including the time needed to determine whether the access is a hit or miss * **Miss** if accessed data is absent in an upper level :::warning Miss handle 此時對應的block會從lower level coped到upper level，讓upper level存取 ::: * Miss rate $= \frac{\#\ misses}{\# accesses}= 1 - hit\ rate$ * Miss penalty the time replace a upper level block + the time to deliver block to the processor ## Memory Technologies ### SRAM Technology SRAMS are integrated circuits that are memory arrays * Usually with a **single access port** for either a read or a write * With a **fixed access time** to any datum (though the read and write access time may differ) * Using six (or less) transistors per bit ::: warning 因SRAM需要更多個transistors儲存bit使得單位面積下能夠儲存的data較DRAM少 ::: ### DRAM Technology Data stored as a charge in a capacitor * **Single transistor** used to access the charge * **periodically** be refreshed * Read contents and write back * Performed on a DRAM "row" ![image](https://hackmd.io/_uploads/ryPKqBCU1x.png) ::: warning DRAM在讀取時會先讀取Row到Row buffer，再透過Col addr取得我們真正要的data ::: ::: warning DRAM速度慢(相較SRAM)的原因是因為需要periodically refresh ::: #### Advanced DRAM Organization * Bits in a DRAM are organized as a rectangular array * DRAM accesses an entire row * Burst mode Supply successive words from a row with reduced latency * Double data rate (DDR) SDRAM * With added clocks: Synchronous DRAM * Transfer on rising and falling clock edges ::: warning 假如原本只能在rising edge讀取data，現在多了falling edge，所以每個cycle可以做2個讀取動作使得data rate double ::: #### DRAM Performance Factors * Row buffer Allow serveral words to be read and refreshed in parallel * Synchronous DRAM Allows for consecutive accesses in bursts without needing to send each address * DRAM banking Allows simultaneous access to multiple DRAMs (accessed rotated between banks: address interleveing) ::: warning 假如需要4個banks裡的data，只需要1個access就可以同時讀取4 banks data ::: ::: warning 這些factor可以改善DRAM bandwidth ::: ### Flash **Nonvolatile** semiconductor storage * 100x ~ 1000x faster that disk * smaller, lower power, more robust (compare to disk) #### Flash type * **NOR flash**: bit cell like a NOR gate * Random read/write access * Used for instruction memory in embedded systems * **NAND flash**: bit cell like a NAND gate * Denser (bits/area), but block-at-a-a-time access * Cheaper per GiB * Used for USB keys, media storage, ... :::danger **Problem** Flash bits wear out after 1000's of accesses 如果存取同一個位置的data超過1000次以上，那個位置的data會損壞 **Solution** **Wear leveling**: remap data to less used blocks ::: ### Disk Storage **Nonvolatile**, rotating magnetic storage * Serveral **platters**, with information recorded magnetically on both surfaces (usually) * Bits recorded in **tracks**, which in turn divided into **sectors** (e.g., 512 Bytes). error correction code per sector to find and correct errors ![image](https://hackmd.io/_uploads/Hk86XICUkg.png) ![image](https://hackmd.io/_uploads/BkCd4UAIJl.png) #### Disk Sectors * each sector records * Sector ID * Data (512 to 4096 bytes) * Error correcting code (ECC) * ... * **access to a sector (important)** 1. **Queuing delay** (other accesses are pending) 2. **Seek** move the head to the desired track 3. **Rotational latency** wait for the desired sector to rotate under the head 4. **Data transfer** 5. **Controller overhead** **Example** ::: success Given 512B sector, 15,0000rpm (revolutions per minute), 4ms average seek time, 100MiB/s transfer rate, 0.2ms controller overhead, idle disk ::: * Average read time = 4 ms (seek time) + $\frac{15000}{2\ \times 60}$ (rotation latency) + $\frac{512\ B}{100\ MiB/s}$ (Data transfer) + 0.2 ms (controller delay) = 4 ms + 2 ms + 0.005 ms + 0.2 ms = 6.2 ms * if actual average seek time is 1 ms * Average read time = 3.2 ms #### Disk Performance Issues * Manufacturers quote average seek time * based on all possible seeks :::warning 事實上，因為data locality 和 OS scheduling，<span class="red">actual</span> average seek times會比average seek times還要來的小 ::: * Smart disk controller allocates physical sectors on disk * Presents **logical** sector interface to host * Disk drives include caches * **Prefetch** sectors in anticipation of access * Avoid seek and rotational delay :::warning 這裡的caches是Disk裡面的caches不是CPU的 ::: --- ### Cache Memory * How to we know if the data is present ? * Where do we look ? #### Direct-Mapped Cache * Location determined by address * Direct mapped: only one choice * (Block address) $mod$ (# Blocks in cache) * \# block is a power of 2 => (use low-order address bits) ### Reference [1] P.A.Patterson and J.L. Hennessy, Computer Organization and Design RISC-V ###### tags: `Computer Architecture` <style> @import url('https://fonts.googleapis.com/css2?family=PT+Serif&display=swap'); @import url('https://fonts.googleapis.com/css?family=Noto+Serif+TC&display=swap'); @import url('https://fonts.googleapis.com/css2?family=Fira+Code&family=Noto+Serif+TC&display=swap'); * { font-family: 'PT Serif', 'Noto Serif TC', sans-serif; letter-spacing: 0.02em; /* font-size: 16px; letter-spacing: 0.02em; */ } .easy{ color:green; } .medium{ color:#eda35e; } .red{ color:red; } .proof { padding:10px; background-color: #f7f7f7; } </style>