Phinal Table Entry (PTE / ΦTE)

# Phinal Table Entry (PTE / ΦTE) ###### tags: `計算機組織`, `109-2` :::success ![](https://i.imgur.com/Ji8qoQa.gif =64x64) 頁面施工中 ![](https://i.imgur.com/Ji8qoQa.gif =64x64) ::: :::warning ![](https://i.imgur.com/Ji8qoQa.gif =64x64) **此條目需要補充更多來源。** (2021年6月21日) 請協助補充多方面可靠來源以改善這篇條目，無法查證的內容可能會因為異議提出而移除。 ::: :::info  ![](https://i.imgur.com/Fh1RCp5.png =64x64) J.C.出品，必屬ㄎㄧㄤ品 ((p)章節) ::: :::danger ![](https://i.imgur.com/MOlkBGS.png) ::: [TOC] ### Locality ((p)5-1) - Temporal Locality - Spatial Locality ### Memory Technology ((p)5-2) - 區分 SRAM, DRAM, 磁碟的大小/速度 - DRAM 訊號是方波，有 SDR DDR QDR - 硬碟有 Controller 對 sector 做分配 SCSI ATA SATA > 還有一堆神奇的東西，但我覺得不會考，Omit! > [name=JCxYIS] ### Dick Access Problem ((p)5-2) - ==給一堆東西問 Average Read Time== :::info :::spoiler **Example: Disk Access** Given - `512`B sector - `15000`rpm - `4` ms average seek time (i.e. to position the head over the proper track) - `100`MB/s transfer rate - `0.2`ms controller overhead, idle disk Average Read Time will be: - Seek time = `4` ms - Rotational Latency = `0.5/(15000/60/1000 (ms))` = `2` ms - Transfer Time = `512 (B) / 100 (MB/s)` = `0.005` ms - Controller Overhead = `0.2 ms` - TOTAL = $6.2005 ms$ ::: > OS 教 Dick Scheduling，CO 教 Dick AcceSS > [name=] ### Direct Mapped Cache ((p)5-3) - ==給存取陣列問 hit/ miss== - 增加 block 大小會怎樣 (miss rate--, miss penalty++) :::info :::spoiler **Example: Direct-Mapped Cache** 假設我們有 8-block 的 cache，每個 block 能存 1 word，使用 Direct-Mapped 表格是 (data 已省略) | (Index) | V | Tag | |:-------:|:-------:|:---:| | **000** | `false` | | | **001** | `false` | | | **010** | `false` | | | **011** | `false` | | | **100** | `false` | | | **101** | `false` | | | **110** | `false` | | | **111** | `false` | | *存取 Word Addrress `22`* 1. 轉成 Binary ：`10 110` 2. 前往 index 格：`010` 3. 判斷 tag 是否正確： `10 != -` 不正確 4. 拿資料，寫入 cache，並作 Miss 判定 | 110 | `true` | 10 | | --- | ------ | --- | *存取 Word Addrress `26`* 1. 轉成 Binary ：`11 010` 2. 前往 index 格：`010` 3. 判斷 tag 是否正確： `11 != -` 不正確 4. 拿資料，寫入 cache，並作 Miss 判定 | 010 | `true` | 11 | | --- | ------ | --- | *存取 Word Addrress `22`* 1. 轉成 Binary ：`10 110` 2. 前往 index 格：`110` 3. 判斷 tag 是否正確： `10 == 10` 正確 4. 直接回傳 data，作 Hit 判定 *存取 Word Addrress `18`* 1. 轉成 Binary ：`10 010` 2. 前往 index 格：`010` 3. 判斷 tag 是否正確： `10 != 11` 不正確 4. 拿資料，覆寫 cache，並作 Miss 判定 | 010 | `true` | 10 | | --- | ------ | --- | ::: ### Write Policy ((p)5-3) - Timing/ Why need write policy - Write-Through vs. Write-back > write-around 肯定不會考...吧？ > 知道那是什麼就好了 > [name=JCxYIS] ### Cache Size ((p)5-3) - ==給 Cache 類型、Block Size、Data Size，問各種 bit、cache size== :::info :::spoiler **Example: Cache size** Given - Direct-mapped Cache - `16` KB data - Block size: `4` word - Address: `32-bit` 這個 cache 的大小是？ - Block Count: 16 KB = `16384`B = `4096` words = `1024` blocks - Index = `lg1024`=`10` bits - Tag = `32-10-2-2` = `18` bits - TOTAL = $Block\ Entries \times (tag+valid\ field+data) = 2^{10}\times(18+1+128) = 147$ Kbits ::: ### Cache Performance ((p)5-4) :::info :::spoiler **Example: Cache Performance** Given - I-cache Miss Rate = `2%` - D-cache Miss Rate = `4%` - Miss Penalty = `100` Cycles - Base CPI (ideal cache) = 2 - Load/ Store = `36%` of Instructions Than, - Miss Cycle per Instruction - I-cache: $1 \times 0.02 \times 100 = 2$ (Instruction Fetch) - D-cache: $0.36 \times 0.04 \times 100 = 1.44$ (L/S operation) - Actual CPI = $2+2+1.44 = 5.44$ - Ideal CPU (no penalty) is $5.44/2 = 2.72$ times faster ::: ### AMAT ((p)5-4) - OS 也有喔 :::info :::spoiler **Example: Average Memory Access Time** Given - Clock Time = `1` ns with cache - Hit Time = `1` cycle - Miss Penalty = `20` cycle - Miss Rate = `5%` Than, - AMAT = $1+5\%\times20 = 2$ ns ::: ### Associative Cache ((p)5-4) :::info :::spoiler **Example: Associative Cache** Given - 4-block cache Compare the following mapping methods: - Direct mapped - 2-wat set associative - Fully associative Block Address: `0, 8, 0, 6, 8` -- **Direct Mapped (Index=2 -> last 2 bit)** | Address | Index | Hit? | Data ([0, 1, 2, 3]) | |:-------------:|:-----:|:----:|:-------------------:| | `0` (`00000`) | `00` | miss | `[0,-,-,-]` | | `8` (`01000`) | `00` | miss | `[8,-,-,-]` | | `0` (`00000`) | `00` | miss | `[0,-,-,-]` | | `6` (`00110`) | `10` | miss | `[0,-,6,-]` | | `8` (`01000`) | `00` | miss | `[8,-,6,-]` | -- **2-way set associative (LRU replacement)** | Address | Index | Hit? | Data ([0, 1, 2, 3]) | |:-------------:|:-----:|:----:|:-------------------:| | `0` (`00000`) | `0` | miss | `[0,-,-,-]` | | `8` (`01000`) | `0` | miss | `[0,8,-,-]` | | `0` (`00000`) | `0` | hit | `[0,8,-,-]` | | `6` (`00110`) | `0` | miss | `[0,6,-,-]` | | `8` (`01000`) | `0` | miss | `[8,6,-,-]` | -- **Fully associative** | Address | Index | Hit? | Data ([0, 1, 2, 3]) | |:-------------:|:-----:|:----:|:-------------------:| | `0` (`00000`) | - | miss | `[0,-,-,-]` | | `8` (`01000`) | - | miss | `[0,8,-,-]` | | `0` (`00000`) | - | hit | `[0,8,-,-]` | | `6` (`00110`) | - | miss | `[0,8,6,-]` | | `8` (`01000`) | - | hit | `[0,8,6,-]` | ::: :::info :::spoiler **Example: Associative Cache Size** Given a cache of - 4096 blocks - 4-word block size - 32-bit address Find the total number of tag bits which is - Direct mapped - 2-way set-associative cache -- **2-wat set-associative** Entries: $\lg(4096/2) = 11$ Byte Offset: 4word = 16byte => $\lg16 = 4$ Tag Bits: $32-11-4 = 17$ Total Number of Tag Bits: $17\times2\times2048 = 69632$ bits (70Kb) **Directed mapped** $65536$ bits (66Kb) ::: - 越多的 Associative 能越降低 Miss Rate (但是下降率逐步減少) ### Replacement Policy ((p)5-4) - 看 OS - LRU ### Multilevel Cache ((p)5-4) :::info :::spoiler **Example: Multilevel Cache** Given - CPU base CPI = `1` - CPU clock rate = `4`Ghz => Each Clock = 0.25ns - Miss rate per instruction = `2%` - Main memory access time = `100`ns Whose L1 cache - Miss Penalty = $100/0.25$ = 400 Cycles - Effictive CPI = $1+2\%\times400$ = 9 Then we add L2 cache - Access time = `5`ns - Global Miss Rate (to Main memory) = `0.5%` Primary miss with L2 hit (L1 miss, L2 hit) - Penalty = $5/0.25%$ = 20 Cycles Primary miss with L2 miss (L1, L2 all miss) - Extra Penalty = 400 Cycles We get - CPI = $1+2\%\times 20 + 0.5\%\times400$ = 3.4 Thus, Performance Ratio = $9/3.4$ = 2.6 ::: ### Dependability ((p)5-5) :::info :::spoiler **Example: Dependability** Some disk today are quoted to have - `1,000,000` hour MTTF (about 114 years) Suppose we have a warehouse scale system which has - `50,000` servers - each sever has `2` disks Thus, a disk's Annual Failure Rate (AFR) is - $1/1000000\times24\times365 = 0.876\%$ So, with 100000 disks, we would expect - $876$ disks to fail per year - or $2$ disk failures per day ::: ### Hamming EC Code ((p)5-5) :::info :::spoiler **Example: ECC** Byte Data `1001 1010` in binary The Hamming ECC code: *方法一* - P1: `10X1 1X1X` 總和為 4，為偶數 = 0 - P2: `1X01 X01X` 總和為 3，為奇數 = 1 - P4: `X001 XXX0` 總和為 1，為奇數 = 1 - P8: `XXXX 1010` 總和為 2，為偶數 = 0 插入 Encoded String: `--1- 001- 1010`。獲得結果 `0111 0010 1010` *方法二* - 先把 Encoded String 填入 - `--1- 001- 1010` - 把上面資料裡含有 1 的位數，全部轉成 binary。注意是從左到右： - `3, 7, 9, 11` => `0011, 0111, 1001, 1011` - 把得到的結果放在一起做 XOR (判斷每位總和的基偶) - $0011 \oplus 0111 \oplus 1001 \oplus 1011 = 0110$ - 把上面的數 **從右到左** 依序填入 Encoded String 獲得結果 - `0111 0010 1010` \- 如果某個 bit 發生錯誤，例如資料變成 `0111 0010 1110` 我們檢查 ECC Code: 資料: `1001 1110` 應該要符合 P1/2/4/8: `0110` (方法一) - P1: `10X1 1X1X` 總和為 4，為偶數 = 0，正確 - P2: `1X01 X11X` 總和為 4，為偶數 = 0，**不正確！** - P4: `X001 XXX0` 總和為 1，為奇數 = 1，正確 - P8: `XXXX 1110` 總和為 3，為奇數 = 1，**不正確！** 錯誤的位元： $2+8 = 10$。第 10 位元必定有錯！ ::: ## ==TODO==