--- tags: 進階電腦系統理論與實作, NCKU Linux Kernel Internals, 作業系統 --- # 2020q3 Homework4 (quiz4) contributed by < `RusselCK` > ###### tags: `RusselCK` > [指標 & 數值系統 & Bitwise 練習題](https://hackmd.io/@sysprog/2020-quiz4) ## Q1. LeetCode [461. Hamming Distance](https://leetcode.com/problems/hamming-distance/) * The [Hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) between two integers is the number of positions at which the corresponding bits are different. #### 使用 `__builtin_popcount` 實作 ```cpp= int hammingDistance(int x, int y) { return __builtin_popcount(x OP y); // OP = ^ } ``` * `x ^ y` 可以將 `x` `y`兩者對應位置不同的 bits 顯示出來,其 1 的個數即為 Hamming distance ## Q1. 進階題 #### 不使用 gcc extension 實作 ```cpp int hammingDistance(int x, int y) { x ^= y; int i = 0; while(x){ if (x & 1) i++; x >>= 1; } return i; } ``` #### LeetCode [477. Total Hamming Distance](https://leetcode.com/problems/total-hamming-distance/) * find the total Hamming distance between all pairs of the given numbers. :::spoiler 範例 >Input: 4, 14, 2 > >Output: 6 > >the 4 is 0100, 14 is 1110, and 2 is 0010 >So the answer will be: HammingDistance(4, 14) + HammingDistance(4, 2) + HammingDistance(14, 2) = 2 + 2 + 2 = 6. ::: ##### 超時版 ```cpp int totalHammingDistance(int* nums, int numsSize){ int total = 0; for(int i=0; i < numsSize ; i++){ for (int j= i+1 ; j < numsSize ; j++){ total += __builtin_popcount(nums[i] ^ nums[j]); } } return total; } ``` * 兩兩一組求 Hamming Distance ,最後再加總 * 共需要算 $\binom{n}{2}= \frac{n * (n-1)}{2}$ 次 ##### 不超時版 ```cpp int totalHammingDistance(int* nums, int numsSize){ int total = 0; for(int i=0; i < 31 ; i++){ int count = 0; for (int j= 0 ; j < numsSize ; j++) count += (nums[j] >> i) & (1); total += count * (numsSize - count); } return total; } ``` > 04 = ( 0100 ) > 14 = ( 1110 ) > 02 = ( 0010 ) * 每個 bit 的 #兩兩不同 = ( #1 ) * ( #0 ) = ( #1 ) * ( #nums - #1 ) * 最後加總 32 個 bit 的 #兩兩不同 * [效能評估](https://leetcode.com/submissions/detail/403907220/): ![](https://i.imgur.com/oEHclli.png) #### Hamming Distance 和 Hamming Weight * [錯誤更正碼簡介](https://w3.math.sinica.edu.tw/math_media/d184/18404.pdf) * [抽象代數的實務應用](http://bit.ly/abstract-algebra) * [Hamming codes, Part I](https://youtu.be/X8jsijhllIA) * [Hamming codes, Part II](https://youtu.be/b3NxrZOu_CE) ##### 錯誤更正碼( Error–Correcting Codes ) :::spoiler 由來 * 1948年以前, 大家在做通訊系統的時候, 都只想到要如何使訊號接收得比較好, 比如把天線加大, 或者把訊號傳遞的功率加大, 但這些都是在物理上做改進 * 1948年 Claude Shannon 寫了一篇論文叫做 A mathematical theory of communications, 他說事實上我們不必這樣做, 很多情形只要在資料上做手腳, 只要傳資料的速率小於每個通道特有的量( 通道容量 channel capacity ), 就一定有辦法讓資料錯誤的機率很小, 要多小都可以 * 因為他提出了這件事, 後來造成了整個領域的發展, 這個領域叫做訊息理論 (information theory) ```graphviz digraph{ rank = LR data[shape = box, color = white] Encoder[shape = box] Channel[shape = box] Decoder[shape = box] gooddata[label="good data" shape = box color = white] subgraph{ rank = same data->Encoder Encoder->Channel [label = "redundancy" ] Encoder->Channel [label = "data" color=white] Channel->Decoder [label = "noisy redundancy" ] Channel->Decoder [label = "noisy data" color=white] Decoder->gooddata } } ``` ::: :::spoiler Single–Error Correcting(SEC) Codes / (7, 4) Hamming code * ==只能改一個錯== * (7, 4) : 全部長度是7, 只有4個 bits是原來要傳的資料 * Hamming 在 1950 年所發明的 ![](https://i.imgur.com/AQuV68O.png) ``` Input = d1 d2 d3 d4 p1 p2 p3 A = { d1, d2, d4, p1 } B = { d1, d3, d4, p2 } C = { d2, d3, d4, p3 } ``` :::success * 正確的情形: 任一集合中,`#of 1` is even * d1 d2 d3 d4 是原來要傳的資料 * 若有 1 個 bit 傳錯,根據規則可以很快的找出來並修正 ::: :::spoiler Hamming codes 的數學結構 * 定義 modulo 2 運算 * 先 加起來 或 乘起來 後,再 除以 2 取 餘數 * 由 (7, 4)Hamming code 規則可以得到: * 為了與文章一致, ( d4 d3 d2 d1 p3 p2 p1 ) 對應到 ( x1 x2 x3 x4 x5 x6 x7) $$ \left\{ \begin{array}{c} x1 + x2 + x3 + x5 = 0 \\ x1 + x2 + x4 + x6 = 0 \\ x1 + x3 + x4 + x7 = 0 \\ \end{array} \right. ,\space di,\space pj \in \begin{Bmatrix} 0,1 \end{Bmatrix} $$ 線性方程式寫成矩陣 $$ H = \begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{bmatrix} $$ * $C=\begin{Bmatrix}所有Codewords\end{Bmatrix}=$ ==$H$ 的 $Nullspace$== * $dim(H)=7-rank(H)=7-3=4$ * 只要給定其中 4 個 bits ,另外 3個 bits 會對應而生 * 這 4 個 bits 共有 $2^4=16$ 種可能 $=\begin{vmatrix}C\end{vmatrix}$ ( codewords ) ::: :::spoiler Syndrome ( 徵狀 ) * $x$ : codeword $e$ : 錯誤向量 ```graphviz digraph{ rank = LR x[shape = box, color = white] //Encoder[shape = box] Channel[shape = box] //Decoder[shape = box] y[label="y = x + e" shape = box color = white] subgraph{ rank = same x->Channel //Encoder->Channel [label = "redundancy" ] //Encoder->Channel [label = "data" color=white] //Channel->Decoder [label = "noisy redundancy" ] //Channel->Decoder [label = "noisy data" color=white] Channel->y } } ``` * 計算 syndrome $$ Hy^T = Hx^T + He^T $$ $$ 因為\space x \space是\space codeword,所以\space Hx^T=0 $$ * 代表 syndrome 只跟錯誤向量有關, 跟傳哪個 codeword 無關 * 舉例: $y = x + e $ = ( 1101000 ) = ( 1101010 ) + ( 0000010 ) $$ s^T= Hy^T =\begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} =\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$ * 根據錯誤的 bit 不同, $s^T$ 的結果也會隨之改變,整理如下: | 錯誤的 bit | x1 | x2 | x3 | x4 | x5 | x6 | x7 | |:----------:|:-------------------------------------:|:-------------------------------------:|:-------------------------------------:|:-------------------------------------:|:-------------------------------------:|:-------------------------------------:|:-------------------------------------:| | $e$ | (1000000) | (0100000) | (0010000) | (0001000) | (0000100) | (0000010) | (0000001) | | $s^T$ | $\begin{bmatrix}1\\1\\1\end{bmatrix}$ | $\begin{bmatrix}1\\1\\0\end{bmatrix}$ | $\begin{bmatrix}1\\0\\1\end{bmatrix}$ | $\begin{bmatrix}0\\1\\1\end{bmatrix}$ | $\begin{bmatrix}1\\0\\0\end{bmatrix}$ | $\begin{bmatrix}0\\1\\0\end{bmatrix}$ | $\begin{bmatrix}0\\0\\1\end{bmatrix}$ | * ==每一個 syndrome 就對應到一種錯的情況== * 我們只要知道它的徵狀是什麼, 就可以知道錯在那裡, 然後把它改正過來 ::: #### Hamming Distance & Hamming Weight * Hamming distance = 兩個向量不一樣的地方有幾個 * **Hamming weight** = 一個向量**不是 0 的的地方有幾個** * ==**x + y 的 Hamming weight = x 和 y 的 Hamming distance**== * modulo 2 加法 * 加起來之後不為0 的 bits 的個數 = 它們之間不同的 bits 的個數 * **minimum distance** * 任何在 $C$ 中的兩個不同的 codewords 的 Hamming distance, 它們之中最小的值 $$ d_{min}{C} = min \space d_{H}(x,y) \\ x,y \in C, x \neq y $$ * **minimum weight** * 所有不為0的 codewords, 它的 Hamming weight 的最小值 $$ w_{min}{C} = min \space w_{H}(x) \\ x \in C, x \neq 0 $$ * also, minimum distance = minimum weight, i.e. ==$d_{min}{C}=w_{min}{C}$== $$ H =\begin{bmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1\\ \end{bmatrix} $$ * minimum distance $d_{min}{C}=w_{min}{C}$ 是多少? * 1 ? * NO! 因為 $H$ 的任一個 column 中都不存在 $\overrightarrow{0}$ * 2 ? * NO! 因為 $H$ 的任兩個 column 相加不為 $\overrightarrow{0}$ * ==3== ? * Yes! 取第 1、2、7 個 column 相加,可以得到 $\overrightarrow{0}$ :::success * ==如果要改 $t$ 個錯, 它的 minimum distance 至少要 $2t + 1$== :::spoiler 證明 ![](https://i.imgur.com/OVDUk3m.png) * 兩球的中心分別為 codeword $x$ and $x'$ 球內包含 所有與中心的 Hamming distance 小於等於 $t$ 的 binary vector * 因為有 $t$ 個錯要改,所以兩個球不能碰在一起,因此,兩球距離最小值為 1 ( 交集為空集合 ) * 若兩球碰在一起,會有一點和 $x$ 及 $x'$ 的距離都 小於 $t$ , 此時便不知道要修正為 $x$ 還是 $x'$,導致不能改 $t$ 個錯 $\Rightarrow$ x 和 x' 的距離 至少要 $t+1+t=2t+1$ $\Rightarrow$ 任兩個 codewords 的距離 至少要 $2t+1$ ::: #### 研讀 [Reed–Solomon error correction](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction) #### 閱讀 Linux 核心原始程式碼的 [lib/reed_solomon 目錄](https://github.com/torvalds/linux/tree/master/lib/reed_solomon),抽離程式碼為單獨可執行的程式,作為 ECC 的示範 ## Q2. LeetCode [1483. Kth Ancestor of a Tree Node](https://leetcode.com/problems/kth-ancestor-of-a-tree-node/) * You are given a tree with `n` nodes numbered from `0` to `n-1` in the form of a parent array where `parent[i]` is the parent of node `i`. The root of the tree is node `0`. * Implement the function `getKthAncestor(int node, int k)` to **return the `k`-th ancestor of the given node**. If there is no such ancestor, return `-1`. * The k-th ancestor of a tree node is the `k`-th node in the path from that node to the root. #### [Binary Lifting](https://cp-algorithms.com/graph/lca_binary_lifting.html#toc-tgt-1) 的實作: ```cpp= #include <stdlib.h> typedef struct { AAA; // AAA = int **parent int max_level; // int n; } TreeAncestor; TreeAncestor *treeAncestorCreate(int n, int *parent, int parentSize) { TreeAncestor *obj = malloc(sizeof(TreeAncestor)); obj->parent = malloc(n * sizeof(int *)); int max_level = 32 - __builtin_clz(n) + 1; for (int i = 0; i < n; i++) { obj->parent[i] = malloc(max_level * sizeof(int)); for (int j = 0; j < max_level; j++) obj->parent[i][j] = -1; } for (int i = 0; i < parentSize; i++) obj->parent[i][0] = parent[i]; for (int j = 1;; j++) { int quit = 1; for (int i = 0; i < parentSize; i++) { obj->parent[i][j] = obj->parent[i][j + BBB] == -1 // BBB = (-1) ? -1 : obj->parent[obj->parent[i][j - 1]][j - 1]; if (obj->parent[i][j] != -1) quit = 0; } if (quit) break; } obj->max_level = max_level - 1; // obj->n = n; return obj; } int treeAncestorGetKthAncestor(TreeAncestor *obj, int node, int k) { int max_level = obj->max_level; for (int i = 0; i < max_level && node != -1; ++i) if (k & (CCC)) // CCC = 1 << i node = obj->parent[node][i]; return node; } void treeAncestorFree(TreeAncestor *obj) { for (int i = 0; i < obj->n; i++) free(obj->parent[i]); free(obj->parent); free(obj); } ``` :::spoiler 範例 ![](https://i.imgur.com/SKDA2Jl.png) ```json Input: ["TreeAncestor","getKthAncestor","getKthAncestor","getKthAncestor"] [[7,[-1,0,0,1,1,2,2]],[3,1],[5,2],[6,3]] ``` ``` Output: [null,1,0,-1] ``` ``` Explanation: TreeAncestor treeAncestor = new TreeAncestor(7, [-1, 0, 0, 1, 1, 2, 2]); treeAncestor.getKthAncestor(3, 1); // returns 1 which is the parent of 3 treeAncestor.getKthAncestor(5, 2); // returns 0 which is the grandparent of 5 treeAncestor.getKthAncestor(6, 3); // returns -1 because there is no such ancestor ``` ::: * Binary lifting : * ==the 7th ancestor = the 4th ancestor of the 2nd ancestor of the 1st ancestor== * 與二進制相關 > 7 = 4 + 2 + 1 = $2^2+2^1+2^0$ * ==the $2^j$-th ancestor = the $2^{j-1}$-th ancestor of the $2^{j-1}$-th ancestor== > the 8th ancestor = the 4th ancestor of the 4th ancestor * 可改良`int treeAncestorGetKthAncestor()` * 程式碼解析: #### ` TreeAncestor *treeAncestorCreate(int n, int *parent, int parentSize)` ```cpp=3 typedef struct { AAA; int max_level; } TreeAncestor; TreeAncestor *treeAncestorCreate(int n, int *parent, int parentSize) { reeAncestor *obj = malloc(sizeof(TreeAncestor)); obj->parent = malloc(n * sizeof(int *)); ``` * 由 `#11` 可以知道,AAA 的名字為 `parent` ,而且是**型態為 `(int *)` 的指標** * 由 `#15`、`#17`可以知道,`obj->parent`是 **二維陣列** * 因此 **`AAA = int **parent`** ```cpp=13 int max_level = 32 - __builtin_clz(n) + 1; for (int i = 0; i < n; i++) { obj->parent[i] = malloc(max_level * sizeof(int)); for (int j = 0; j < max_level; j++) obj->parent[i][j] = -1; } ``` * By Binary Lifting, **for each node, 記錄 $2^k$-th ancestor, $k = 0,1,2,...,\left\lfloor log(n) \right\rfloor$** * 因此,我們共需要`n` * $( \left\lfloor log(n) \right\rfloor+1)$ 的二維陣列 > 以 8 bits 為例, `n` = 7 = ( 0000 0111 )~2~ > $\left\lfloor log(n) \right\rfloor+1=\left\lceil log(n) \right\rceil = \left\lceil log(7) \right\rceil$ = 3 = `8 - __builtin_clz(7)` ```graphviz digraph{ node [shape=record, fontcolor=black, fontsize=14, width=4.75, fixedsize=true] B[ label=" 0 | 1 | 2 | 3 |...|log(n)" color=white] A[ label="1st|2nd|4th|8th|...|nth"] B -> A[ style = invis ] } ``` * 將 `obj->parent` 做成 `n` * `max_level` 的 二維陣列:`obj->parent[n][max_level]` * 所有初始值 設為 `-1` ( i.e. no ancestor ) * :::spoiler 範例 ```cpp obj->parent[7][4] = { [-1,-1,-1,-1], [-1,-1,-1,-1], [-1,-1,-1,-1], [-1,-1,-1,-1], [-1,-1,-1,-1], [-1,-1,-1,-1], [-1,-1,-1,-1], } ``` ::: :::warning Why `max_level` 比預想的多 1 ? ::: ```cpp=19 for (int i = 0; i < parentSize; i++) obj->parent[i][0] = parent[i]; ``` > `parent[i]` 存放 node~i~ 的父親 ( i.e. The 1-th ancestor ) 的位置 * `obj->parent[i][0]` 存放 node~i~ 的 1-th ancestor * `for` 迴圈 的終止式 用 `parentSize` 來判斷而非 `n` * 因為 input 提供有關 ancestor 的資料只有 `parent[]` * 若 $n \ne parentSize$ 且以 `n` 來判斷 , 會出現 `parent[]`沒充分使用 或 資料不足 的情況 * :::spoiler 範例 ```c obj->parent[7][4] = { [-1,-1,-1,-1], [0,-1,-1,-1], [0,-1,-1,-1], [1,-1,-1,-1], [1,-1,-1,-1], [2,-1,-1,-1], [2,-1,-1,-1], } ``` ::: ```cpp=22 for (int j = 1;; j++) { int quit = 1; for (int i = 0; i < parentSize; i++) { obj->parent[i][j] = obj->parent[i][j + BBB] == -1 ? -1 : obj->parent[obj->parent[i][j - 1]][j - 1]; if (obj->parent[i][j] != -1) quit = 0; } if (quit) break; } ``` * 設置 `obj->parent[i][1:max_level]` 的資料 * For each node, ==the $2^j$-th ancestor = the $2^{j-1}$-th ancestor of the $2^{j-1}$-th ancestor== * **`obj->parent[i][j] = obj->parent[obj->parent[i][j - 1]][j - 1]`** * 前提: `obj->parent[i][j - 1]` 不為 `-1` * 因此, **BBB = `(-1)`** :::spoiler 範例 ```c obj->parent[7][4] = { [-1,-1,-1,-1], [0,-1,-1,-1], [0,-1,-1,-1], [1,0,-1,-1], [1,0,-1,-1], [2,0,-1,-1], [2,0,-1,-1], } ``` ::: * `#23`、`#30`:若 `quit == 1`,跳出 `for` 迴圈 * `#28`:只要當下檢查的欄位 不為 `-1` ,代表還有欄位可能需要修改,不能跳出 `for` 迴圈,設置 `quit = 0` ```cpp=32 obj->max_level = max_level - 1; ``` :::warning * Why `max_level` 又要 - 1 ? * 代表先前預想的 `max_level` 沒有錯 * 可能是 `#22` ~ `#31` 有改進空間 ::: ```cpp=33 return obj; ``` * 完成設置,回傳 `obj` :::danger 善用 `cpp=行號` 的標注方式,避免逐行輸入行號,後者對之後的編撰無益。 :notes: jserv ::: #### `int treeAncestorGetKthAncestor(TreeAncestor *obj, int node, int k)` ```cpp=36 int treeAncestorGetKthAncestor(TreeAncestor *obj, int node, int k) { int max_level = obj->max_level; for (int i = 0; i < max_level && node != -1; ++i) if (k & (CCC)) // CCC = 1 << i node = obj->parent[node][i]; return node; } ``` > k = 7 = ( 0000 0111 )~2~ > 7th = ( 1th 的 2th ) 的 4th * by Binary Lifting, **CCC = `1 << i`** :::info * [for循环中的i++和++i有什么区别](https://blog.csdn.net/u010188178/article/details/83996538?utm_medium=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param) ::: ## Q2. 進階題 #### 指出可改進的部份 1. 在 `void treeAncestorFree(TreeAncestor *obj)` 中有出現 `obj->n` 代表 `TreeAncestor` struct 等地方需要修改: ```cpp typedef struct { ... int n; } TreeAncestor; ``` ```cpp TreeAncestor* treeAncestorCreate(int n, int* parent, int parentSize) { TreeAncestor *obj = malloc(sizeof(TreeAncestor)); ... obj->n = n; } ``` * [效能評估](https://leetcode.com/submissions/detail/405682490/): ![](https://i.imgur.com/Hn3EMOa.png) 2. 修改 `for` 迴圈中止條件 ```cpp=23 for (int j = 1;; j++) { ``` * 因為沒有中止條件,會導致程式多進一次 `for` 迴圈 * 改進方式: * `#14` 改為 `int max_level = 32 - __builtin_clz(n);` * 設中止條件為 `j < max_level` * `#33` 改為 `obj->max_level = max_level;` * [效能評估](https://leetcode.com/submissions/detail/405683068/): ![](https://i.imgur.com/wmT9w9Z.png) * 在 C 語言項目中,執行時間領先 78.29% 的提交者 :::warning 3. 改用 bit array 來記錄 ::: ## Q3. [FizzBuzz](https://en.wikipedia.org/wiki/Fizz_buzz) * 從 1 數到 n * 如果是 3的倍數,印出 “Fizz” * 如果是 5 的倍數,印出 “Buzz” * 如果是 15 的倍數,印出 “FizzBuzz” * 如果都不是,就印出數字本身 #### 直覺的實作: (`naive.c`) ```c #include <stdio.h> int main() { for (unsigned int i = 1; i < 100; i++) { if (i % 3 == 0) printf("Fizz"); if (i % 5 == 0) printf("Buzz"); if (i % 15 == 0) printf("FizzBuzz"); if ((i % 3) && (i % 5)) printf("%u", i); printf("\n"); } return 0; } ``` #### bitwise 觀察 ```cpp #define MSG_LEN 8 char fmt[MSG_LEN + 1]; strncpy(fmt, &"FizzBuzz%u"[start], length); fmt[length] = '\0'; printf(fmt, i); printf("\n"); ``` 我們若能精準從給定輸入 `i` 的規律去控制 `start` 及 `length`,即可符合 FizzBuzz 題意: ``` string literal: "fizzbuzz%u" offset: 0 4 8 ``` #### bitwise 的實作: (`bitwise.c`) ```c= #define MSG_LEN 8 static inline bool is_divisible(uint32_t n, uint64_t M) { return n * M <= M - 1; } static uint64_t M3 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 3 + 1; static uint64_t M5 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 5 + 1; int main(int argc, char *argv[]) { for (size_t i = 1; i <= 100; i++) { uint8_t div3 = is_divisible(i, M3); uint8_t div5 = is_divisible(i, M5); unsigned int length = (2 << div3) << div5; char fmt[MSG_LEN + 1]; strncpy(fmt, &"FizzBuzz%u"[(MSG_LEN >> KK1) >> (KK2 << KK3)], length); // KK1 = div5 ; KK2 = div3 ; KK3 = 2 ; fmt[length] = '\0'; printf(fmt, i); printf("\n"); } return 0; } ``` * 程式碼解析: ```cpp=3 static inline bool is_divisible(uint32_t n, uint64_t M) { return n * M <= M - 1; } static uint64_t M3 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 3 + 1; static uint64_t M5 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 5 + 1; ``` ```cpp=12 uint8_t div3 = is_divisible(i, M3); uint8_t div5 = is_divisible(i, M5); ``` * `div3`、`div5` 分別可知道 `i` 是否為 3 或 5 的倍數 * 若是,則為 1 * 若否,則為 0 * 除法判別可參考 [quiz2](https://hackmd.io/uoqS2M4ISw2cyN95njNEUQ#2020q3-Homework1-quiz2) [Q3](https://hackmd.io/uoqS2M4ISw2cyN95njNEUQ#Q3-%E9%99%A4%E6%B3%95%E9%81%8B%E7%AE%97%E7%9A%84%E6%94%B9%E9%80%B2%E7%9A%84%E7%AD%96%E7%95%A5) 及 [Q4](https://hackmd.io/uoqS2M4ISw2cyN95njNEUQ#Q4-%E5%91%88-Q3-%EF%BC%8C%E5%88%A4%E6%96%B7%E6%9F%90%E5%80%8B%E6%95%B8%E5%80%BC%E8%83%BD%E5%90%A6%E8%A2%AB%E6%8C%87%E5%AE%9A%E7%9A%84%E9%99%A4%E6%95%B8%E6%89%80%E6%95%B4%E9%99%A4) ```cpp=1 #define MSG_LEN 8 ``` ```cpp=14 unsigned int length = (2 << div3) << div5; char fmt[MSG_LEN + 1]; strncpy(fmt, &"FizzBuzz%u"[(MSG_LEN >> KK1) >> (KK2 << KK3)], length); ``` * 根據題目和程式碼,可得下表: | `div3` | `div5` | `length` | `(( 1000 ) >> KK1) >> (KK2 << KK3)` | |:------:|:------:|:--------:|:-----------------------------------:| | 1 | 0 | 4 | 0 = ( 0000 ) = 8 >> ( 3 or more ) | | 0 | 1 | 4 | 4 = ( 0100 ) = 8 >> 1 | | 1 | 1 | 8 | 0 = ( 0000 ) = 8 >> ( 3 or more ) | | 0 | 0 | 0 | 8 = ( 1000 ) | * 若 div5 = 1 ,則 MSG_LEM >> 1 * 若 div3 = 1 ,則 MSG_LEM >> ( 3 or more ) * 綜合上述,**KK1 = div5, KK2 = div3, KK3 >= 2** ## Q4. 除數為 `PAGE_QUEUES` 的除法 * 整數 `PAGE_QUEUES` 可能有以下數值: * (1 or 2 or 3 or 4) * (5 or 6 or 7 or 8) × ($2^n$), n from 1 to 14 #### 原本的運算: ```cpp #include <stdint.h> size_t blockidx = offset / PAGE_QUEUES; ``` #### 預先進行計算,避免整數除法指令的使用: * (假設在 x86_64 硬體執行 GNU/Linux 並透過 gcc/clang 編譯) ```cpp= #include <stdint.h> #define ffs32(x) ((__builtin_ffs(x)) - 1) size_t blockidx; size_t divident = PAGE_QUEUES; blockidx = offset >> ffs32(divident); divident >>= ffs32(divident); switch (divident) { CASES // case 3: blockidx /= 3; // break; // case 5: blockidx /= 5; // break; // case 7: blockidx /= 7; // break; } ``` 其中, `CASES` 可能形如 ```cpp case 2: blockidx /= 2; break; ``` 這樣的組合,請用最少的 case-break 敘述做出同等 `size_t blockidx = offset / PAGE_QUEUES;` 效果的程式碼。 :::info * **`__builtin_ffs(x)`** * Returns **one plus the index of the least significant 1-bit of x** ( 返回右起第一個 1 的位置 ) * if x is zero, returns zero. ::: * 觀察 `PAGE_QUEUES` 的範圍 * (1 or 2 or 3 or 4) = ($2^0$ or $2^1$ or $3×2^0$ or $2^2$ ) * (5 or 6 or 7 or 8) × ($2^n$), n from 1 to 14 = ($5×2^0$ or $3×2^1$ or $7×2^0$ or $2^3$) × ($2^n$), n from 1 to 14 * `PAGE_QUEUES` 可以歸納出另一種形式: * 1 × ($2^n$), n = 0, 1, 2 and 4 ~ 17 * 3 × ($2^n$), n = 0 and 2 ~ 15 * ( 5 or 7 ) × ($2^n$), n = 1 ~ 14 * 觀察上述形式,可以發現 * 除以 `PAGE_QUEUES` = 除以 (1 or 3 or 5 or 7) × ($2^n$) * 因此,可以先處理 除以 ($2^n$) 部份,再處理 除以 (3 or 5 or 7) * 任何數除以 1 等於 本身,不用再另外計算 * 程式碼解析: ```cpp=2 #define ffs32(x) ((__builtin_ffs(x)) - 1) size_t blockidx; size_t divident = PAGE_QUEUES; blockidx = offset >> ffs32(divident); divident >>= ffs32(divident); ``` * `#2` : `fs32(x)` = $2^n$ 中的 $n$ * `__builtin_ffs(x) - 1` = x 最右的 1 後面有幾個 0 * `#5` : 被除數除以 $2^n$ ( i.e. 右移 $n$ ) * `#6` : 除數除以 $2^n$ ( i.e. 右移 $n$ ) ```cpp=7 switch (divident) { case 3: blockidx /= 3; break; case 5: blockidx /= 5; break; case 7: blockidx /= 7; break; } ``` * 處理 除以 (3 or 5 or 7) ## Q4. 進階題 #### LeetCode [51. N-Queens](https://leetcode.com/problems/n-queens/) > 善用 `__builtin_ffs`,大幅降低記憶體開銷 * The n-queens puzzle is the problem of placing n queens on an n×n chessboard such that no two queens attack each other. :::spoiler 範例 ``` Input: 4 Output: [ [".Q..", // Solution 1 "...Q", "Q...", "..Q."], ["..Q.", // Solution 2 "Q...", "...Q", ".Q.."] ] Explanation: There exist two distinct solutions to the 4-queens puzzle as shown above. ``` ::: ##### [Backtracking](http://web.ntnu.edu.tw/~algo/Backtracking.html) 法: ```cpp= size_t ways[] = {1, 1, 0, 0, 2, 10, 4, 40, 92, 352, 724, 2680, 14200, 73712, 365596, 2279184, 14772512, 95815104, 666090624, 4968057848, 39029188884, 314666222712, 2691008701644, 24233937684440, 227514171973736, 2207893435808352, 22317699616364044, 234907967154122528}; int N; char*** res; int* QueensPos; int w; bool CheckQueen(int row,int col){ for(int i=0; i < row; i++){ if (QueensPos[i] == col || abs(QueensPos[i] - col) == abs(row - i)) return false; } return true; } void PushAnswer(){ char* queen = malloc((N + 1) * sizeof(char)); queen[N] = '\0'; for (int i = 0; i< N; i++){ for(int j = 0; j< N; j++){ queen[j] = (QueensPos[i]==j) ? 'Q' : '.'; } memcpy(res[w][i], queen, N + 1); } free(queen); w++; } void NQueens(int row){ if(w == ways[N]){ return; } for (int j=0; j < N ; j++){ if(CheckQueen(row, j)){ QueensPos[row] = j; (row == N-1) ? PushAnswer() : NQueens(row+1); } } } char *** solveNQueens(int n, int* returnSize, int** returnColumnSizes){ N = n; *returnSize = ways[n]; *returnColumnSizes = malloc(ways[n] * sizeof(int)); for (int i = 0; i < ways[n]; i++) (*returnColumnSizes)[i] = n; QueensPos = malloc(n *sizeof(int)); res = malloc( ways[n] * sizeof(char**) ); for (int i = 0; i < ways[n]; ++i) res[i] = malloc(n * sizeof(char *)); for (int i = 0; i < ways[n]; ++i) for (int j = 0; j < n; ++j) { res[i][j] = malloc((n + 1) * sizeof(char)); } w = 0; // 不加這行 Submit 會 Runtime Error NQueens(0); return res; } ``` ```graphviz digraph{ node [shape = record] q1[label="{Q|x|x|x}|{x|x||}|{x||x|}|{x|||x}"] q2[label="{x|x||}|{Q|x|x|x}|{x|x||}|{x||x|}"] q3[label="{x||x|}|{x|x||}|{Q|x|x|x}|{x|x||}"] q4[label="{x|||x}|{x||x|}|{x|x||}|{Q|x|x|x}"] //q13[label="{Q|x|x|x}|{x|x|x|}|{x|Q|x|x}|{x|x|x|x}"] q1302[label="{Q|x|x|x}|{x|x|x|Q}|{x|Q|x|x}|{x|x|x|x}" color = gray] q14[label="{Q|x|x|x}|{x|x||x}|{x|x|x|}|{x|Q|x|x}"] q24[label="{x|x||}|{Q|x|x|x}|{x|x|x|}|{x|Q|x|x}"] q31[label="{x|Q|x|x}|{x|x|x|}|{Q|x|x|x}|{x|x||}"] q41[label="{x|Q|x|x}|{x|x|x|}|{x|x||x}|{Q|x|x|x}"] q4203[label="{x|x|x|x}|{x|Q|x|x}|{x|x|x|Q}|{Q|x|x|x}" color = gray] //q1302[label="{Q|x|x|x}|{x|x|x|Q}|{x|Q|x|x}|{x|x|x|x}"] q1420[label="{Q|x|x|x}|{x|x|Q|x}|{x|x|x|x}|{x|Q|x|x}" color = gray] q1403[label="{Q|x|x|x}|{x|x|x|x}|{x|x|x|Q}|{x|Q|x|x}" color = gray] q2413[label="{x|x|Q|x}|{Q|x|x|x}|{x|x|x|Q}|{x|Q|x|x}" color= red] q2401[label="{x|x|x|Q}|{Q|x|x|x}|{x|x|x|x}|{x|Q|x|x}" color = gray] q3142[label="{x|Q|x|x}|{x|x|x|Q}|{Q|x|x|x}|{x|x|Q|x}" color= red] q3104[label="{x|Q|x|x}|{x|x|x|x}|{Q|x|x|x}|{x|x|x|Q}" color = gray] q4130[label="{x|Q|x|x}|{x|x|x|x}|{x|x|Q|x}|{Q|x|x|x}" color = gray] q4102[label="{x|Q|x|x}|{x|x|x|Q}|{x|x|x|x}|{Q|x|x|x}" color = gray] root->q1->q1302 q1->q14->q1420 q14->q1403 root->q2->q24->q2413 q24->q2401 root->q3->q31->q3142 q31->q3104 root->q4->q41->q4130 q41->q4102 q4->q4203 } ``` * 觀察上圖的<font color=red>成立解</font> : * 每一列 ( 行 ) 必定要有一個 Q * 放 Q 時 `(i, j)`要考慮 * 左上方 `( i-k, j-k )` * 正上方 `( i-k, j )` * 右上方 `( i-k, j+k )` * $k = 1$ ~ $(i-1)$ * 程式碼解析: ```cpp=47 char *** solveNQueens(int n, int* returnSize, int** returnColumnSizes){ N = n; *returnSize = ways[n]; *returnColumnSizes = malloc(ways[n] * sizeof(int)); for (int i = 0; i < ways[n]; i++) (*returnColumnSizes)[i] = n; QueensPos = malloc(n *sizeof(int)); res = malloc( ways[n] * sizeof(char**) ); for (int i = 0; i < ways[n]; ++i) res[i] = malloc(n * sizeof(char *)); for (int i = 0; i < ways[n]; ++i) for (int j = 0; j < n; ++j) { res[i][j] = malloc((n + 1) * sizeof(char)); } w = 0; // 不加這行 Submit 會 Runtime Error NQueens(0); return res; } ``` * `ways[]` : 記錄 [N-queens 的 互不相同解個數](https://zh.wikipedia.org/wiki/%E6%95%B4%E6%95%B8%E6%95%B8%E5%88%97%E7%B7%9A%E4%B8%8A%E5%A4%A7%E5%85%A8) * `*returnSize` : `*res` 的 行數 ( i.e. 解的數量 `ways[n]` ) * `**returnColumnSizes` : `*res[w]` 的 columns 數 ( i.e. `n` ) * `QueensPos` : 各列 `Q` 所在的行數 * `QueensPos[row] = column` * `res` : a pointer to pointers to pointers to charater arrays - 4-Queens 為例: ```graphviz digraph{ rankdir = LR w1[label="QueensPos1" color=white] w0[label="QueensPos0" color=white] QueensPos0[shape=record label="{{1}|{3}|{0}|{2}}"] QueensPos1[shape=record label="{{2}|{0}|{3}|{1}}"] w1 -> QueensPos1 w0 -> QueensPos0 } ``` ```graphviz digraph{ rankdir = "LR" Q[label="res" color=white] res[ shape=record label="{<res0>res[0]}|{<res1>res[1]}"] res0[shape=record label="{<res00>res[0][0]}|{<res01>res[0][1]}|{<res02>res[0][2]}|{<res03>res[0][3]}"] res1[shape=record label="{<res10>res[1][0]}|{<res11>res[1][1]}|{<res12>res[1][2]}|{<res13>res[1][3]}"] res00[shape=record label="{{.}|{Q}|{.}|{.}|{/0}}"] res01[shape=record label="{{.}|{.}|{.}|{Q}|{/0}}"] res02[shape=record label="{{Q}|{.}|{.}|{.}|{/0}}"] res03[shape=record label="{{.}|{.}|{Q}|{.}|{/0}}"] res10[shape=record label="{{.}|{.}|{Q}|{.}|{/0}}"] res11[shape=record label="{{Q}|{.}|{.}|{.}|{/0}}"] res12[shape=record label="{{.}|{.}|{.}|{Q}|{/0}}"] res13[shape=record label="{{.}|{Q}|{.}|{.}|{/0}}"] Q -> res "res":res0 -> res0 "res":res1 -> res1 "res0":res00 -> res00 "res0":res01 -> res01 "res0":res02 -> res02 "res0":res03 -> res03 "res1":res10 -> res10 "res1":res11 -> res11 "res1":res12 -> res12 "res1":res13 -> res13 } ``` ```cpp=33 void NQueens(int row){ if(w == ways[N]){ return; } for (int j=0; j < N ; j++){ if(CheckQueen(row, j)){ QueensPos[row] = j; (row == N-1) ? PushAnswer() : NQueens(row+1); } } } ``` * 從 `row` = 0 開始進行遞迴 放置 Queen * `#39` ~ `#42`: * 找出合法位置擺放 Queen * 若 `row` = `(N-1)` ,代表成功擺到最後一列 儲存該解 ( i.e. `PushAnswer()` ) 若否,進行下一列的遞迴 ( i.e. `NQueens(row+1)` ) * `#35` : 若 `w` = `ways[N]`,代表已紀錄完所有解,可停止遞迴 ```cpp=12 bool CheckQueen(int row,int col){ for(int i=0; i < row; i++){ if (QueensPos[i] == col || abs(QueensPos[i] - col) == abs(row - i)) return false; } return true; } ``` * 檢查想擺放的位置是否合法 * `QueensPos` 紀錄各列 Q 所在的行數 * 若 `QueensPos[i]` = `col`,則 `( i, j )` 的 上方 有 Q * 若 `QueensPos[i]` = `j-k` or `j+k`,則 `( i, j )` 的 左上 或 右上 有 Q * `k` = `abs( QueensPos[i] - j )` > k = i-(i-k) > j-k = j-(i-(i-k)) = j - i + (i-k) = `(col - row + i)` > j+k = j+(i-(i-k)) = j + i - (i-k) = `(col + row - i)` ```cpp=20 void PushAnswer(){ char* queen = malloc((N + 1) * sizeof(char)); queen[N] = '\0'; for (int i = 0; i< N; i++){ for(int j = 0; j< N; j++){ queen[j] = (QueensPos[i]==j) ? 'Q' : '.'; } memcpy(res[w][i], queen, N + 1); } free(queen); w++; } ``` * 將找到的一組解 ( i.e. `*QueensPos` ) 紀錄到 `res[w]` * [效能評估](https://leetcode.com/submissions/detail/408160954/): :::spoiler 檢測內容 根據 [guaneec](https://hackmd.io/@guaneec/sp2020q3-quiz4#Q4---ffs) 的筆記 ``` 4 1 2 3 5 6 7 8 9 ``` ::: ![](https://i.imgur.com/mwHY2ML.png) ##### 搭配 Bitewise mask ( 參考 [RinHizakura 同學](https://hackmd.io/@RinHizakura/BkoGM5V8v#solution-21) )