2019q1 Homework3 (review)

# 2019q1 Homework3 (review) contributed by < `rebvivi` > ###### tags: `linux2019` * [F05: review](https://hackmd.io/s/S1-wcjavN) ## 作業要求 - 從第 1 週, 第 2 週, 第 3 週(上), 第 3 週(下), 第 4 週(上), 第 4 週(下) 選出你感興趣的 3 個題組進行作答，並且回答附帶的「延伸問題」 - 比照課前測驗參考解答: Q1 和對 Linked List 進行氣泡排序的模式來撰寫共筆，需要詳細分析自己的想法、參閱的材料 (包含 C 語言規格書的章節)，以及==進行相關實驗== - 若有不能理解的部分，請標註出來。 ## 第一周測驗 2 ### Data Alignment - 變數的宣告，會配置其所需的 memory ： * `char`： 1 bytes * `int` ： 4 bytes * `double`： 8 bytes - memory 擺放方式： * 如果一個接著一個放，會影響到 performance 以及 efficiency 的問題 * 在 32 bits 的架構上，一次的資料存取也就是 32 bits ( 4 bytes ) * 這 4 bytes 不是隨便從哪個點抓都可以，而是要從 byte number 為 4 的倍數開始取，一次取 4 bytes * 取 byte 0 ~ byte 3 或是 byte 4 ~ byte 7 * 不會從 byte 3 或是 byte 7 開始抓 4 個 bytes - 如果想要的資料分佈在 byte 1 ~ byte 4 * 取 byte 0 ~ byte 3，再去除 bye 0 的資料，留下 byte 1 ~ byte 3 * 取 byte 4 ~ byte 7，並去除 byte 5 ~ byte 7 的資料，留下 byte 4 * 將 byte 1 ~ 3 和 byte 4 整合為 32 bits - 假如資料分布不在 4 的倍數，會導致存取速度降低， compiler 在配置 memory 時，就會按照宣告的 type 去做 alignment * `int` ： 4 byte alignment ### 程式目的 - 實現 data alignment * 將 memory 的資料放置在 2^N^ (N 為從 0 開始的整數) 的 memory address - 增加電腦存取的效率 ### 題目程式碼 >`x`：我們想要 align 的數字 >`a`：align 成 `a`-byte boundary ```c #define ALIGN_uint32(x, a) __ALIGN_MASK(x, ((uint32_t)(a)) - 1) #define __ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask)) ``` 假設 `x = 0x1006` 而且 `a = 4`，因為是 4-byte boundary，可能的 aligned value 只有 0x1000， 0x1004， 0x1008 ...，所以 0x1006 的 aligned value 就是 0x1008 #### trace 程式碼 - `mask = ((uint32_t)(a)) - 1` * `a = 4`，`mask = 4 - 1 = 0x0003` - `(x) + (mask) = 0x1006 + 0x0003 = 0x1009` - `~(mask) = ~(0x0003) = 0xFFFC` - `((x) + (mask)) & ~(mask) = 0x1009 & 0xFFFC = 0x1008` ### 驗證程式碼 ```c #include <stdint.h> #include <stdlib.h> #include <stdio.h> #define ALIGN_uint32(x, a) __ALIGN_MASK(x, ((int32_t)(a)) - 1) #define __ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask)) int main() { int x = 0x1006; int a = 4; printf("%x\n",ALIGN_uint32(x, a)); } ``` >以下是輸出 ``` 1008 ``` ## 第二周測驗 2 ### Population count - `population count`：計算數值用 binary 表示時，有多少 bit 是 1 * 0F0F~16~ 的 population count 是 8 0F0F~16~ = 0000111100001111~2~，用 binary 表示有 8 個 bit 是 1 ### part 1 題目程式碼 >`v`：我們所要判斷的數字 >`n`：`v`用 binary 表示時，有多少 bit 是 1 ```c unsigned popcnt_naive(unsigned v) { unsigned n = 0; while (v) v &= (v - 1), n = -(~n); return n; } ``` - `v &= (v - 1)`：用來清除最低位的 1 * `v` 的最低位 1 在 `v-1` 中變為了 0 ，所以 `v &= (v - 1)` 就可以清除最低位的 1 - `n = -(~n)`：將 `n` 加 1 * `(~n)`：取 `n` 的 1 補數 * `-(~n)`：再取 `(~n)` 的 2 補數 * `n = -(~n)`：將 `n` 中， `0→1`，`1→0`，運作兩次之後在加上 1，就等同於將 `n` 加 1 ### part 1 驗證程式碼 ```c #include <stdlib.h> #include <stdio.h> unsigned popcnt_naive(unsigned v) { unsigned n = 0; while (v) v &= (v - 1), n = -(~n); return n; } int main() { unsigned a=0x0F0F; printf("%d\n",popcnt_naive(a)); } ``` >以下是輸出 ``` 8 ``` ### part 1 測量執行時間 >可以看出在 bit 數不多的時候，呈現小小的週期性分佈，但執行時間總體相差不大，因為即使 number 增加，但是用 binary 表示時， bit 為 1 的數目不一定會也跟著增加 > ![](https://i.imgur.com/FjHuvdk.png) >但是隨著 bit 為 1 的數目增加，執行時間大致會呈現 O(n) 上升的趨勢 > ![](https://i.imgur.com/KSBFEoc.png) ### part 2 題目程式碼 >`v`：我們所要判斷的數字 ```c= unsigned popcnt(unsigned v) { unsigned n; n = (v >> 1) & 0x77777777; v -= n; n = (n >> 1) & 0x77777777; v -= n; n = (n >> 1) & 0x77777777; v -= n; v = (v + (v >> X)) & Y; v *= 0x01010101; return v >> 24; } ``` 將 `v` 以 4 個 bit 為單位，拆成 8 個單位去計算 - 假設簡稱每個單位為 B - `v` 的 8 個單位分別為 B~7~ B~6~ B~5~ B~4~ B~3~ B~2~ B~1~ B~0~ - `(v >> 1)`：將 `v` 往右移一個 bit - `(v >> 1) & 0x77777777`：去除掉因為右移一個 bit，而跑到下一個單位 B 的 1 * 因為 7~10~=0111~2~，如果一個數字和 0111~2~ 做 `&` ，就能夠去除單位 B 的 msb - `v -= n`：用原本的數字 `v` 減去`(v >> 1) & 0x77777777` - 假設我們現在只看其中一個單位 B ，也就是只看 4 個 bit - 假設這個單位 B 的數字是 `x` - `x` 的 4 個 bit 分別為 b~3~ b~2~ b~1~ b~0~ - 所以 x=b~3~*2^3^+b~2~*2^2^+b~1~*2^1^+b~0~*2^0^ - 在程式碼 4~9 行執行了 3 次相同的運算，如果我們用數學式子表示可以寫成 : x-[x/2]-[x/4]-[x/8] =(b~3~*2^3^+b~2~*2^2^+b~1~*2^1^+b~0~*2^0^)-(b~3~*2^2^+b~2~*2^1^+b~1~*2^0^)-(b~3~*2^1^+b~2~*2^0^)-(b~3~*2^0^) =b~3~+b~2~+b~1~+b~0~ - b~3~+b~2~+b~1~+b~0~ 也就是每個 bit 數字相加之後的結果，也可以看做這個單位 B `x` 所擁有 bit 為 1 的數目，也就是我們所想要求的 population count - `v = (v + (v >> 4)) & 0x0F0F0F0F`: 將 B~7~ 和 B~6~ 、 B~5~ 和 B~4~ 、 B~3~ 和 B~2~ 、 B~1~ 和 B~0~ 的 population count 相加 - `(v >> 4)` : 兩兩相加之前要先將 `v` 往右移 4 個 bit - `(v + (v >> 4))` : 再和原本的數字 `v` 相加，所以兩兩相加 population count 的結果會在原本 B~6~ 、 B~4~、 B~2~、 B~0~ 的位置上 - `v = (v + (v >> 4)) & 0x0F0F0F0F`: 因為我們只要 B~6~ 、 B~4~、 B~2~、 B~0~ 的兩兩相加 population count 的結果，所以我們要將 B~7~ 、 B~5~、 B~3~、 B~1~ mask 掉 - `v *= 0x0101010`: 將 B~6~ 、 B~4~、 B~2~、 B~0~ 相加，最終相加的結果會被存在 B~6~ ![](https://i.imgur.com/na1QsxO.jpg) - `v >> 24` : 因為相加的結果原本是存在 B~6~ ，所以我們要往右移 6 個單位 B ( 6 * 4 = 24 個 bits ) 才是整個數字 `v` 的 population count ### part 2 驗證程式碼 ```c #include <stdlib.h> #include <stdio.h> unsigned popcnt(unsigned v) { unsigned n; n = (v >> 1) & 0x77777777; v -= n; n = (n >> 1) & 0x77777777; v -= n; n = (n >> 1) & 0x77777777; v -= n; v = (v + (v >> 4)) & 0x0F0F0F0F; v *= 0x01010101; return v >> 24; } int main() { unsigned a=0x0F0F; printf("%d\n",popcnt(a)); } ``` >以下為輸出 ``` 8 ``` ### part 2 測量執行時間 >執行時間幾乎為 O(1)，只有在特定幾個數字有 peak > ![](https://i.imgur.com/UxqHHJV.png) >即使 bit 為 1 的數目增加，執行時間也都幾乎差不多，為 O(1) > ![](https://i.imgur.com/0exsNhi.png) ### Time Attack - `side channel attack`: 利用信道外的信息，例如 : 加解密的數據、數據比較時間、密文傳輸的流量或是途徑，進而進行攻擊的方式 - `time attack`: 屬於 `side channel attack` * 通過裝置運算的用時來推斷出所使用的運算操作 * 通過對比運算的時間推定資料位於哪個儲存裝置 * 利用通訊的時間差進行資料竊取假如有一個 function 負責比較 user 輸入的 password 和存放在 system 內的 password 是否相同，如果該 function 是從第一位開始比較，發現不同就立即返回，那麼通過計算返回的速度就知道了大概是哪一位開始不同的，密碼就能輕易的被破解了 >以下是很容易受到 `time attack` 的寫法 ```c if (user.password === password) { return { state: true }; } ``` #### 抵抗 Time Attack - 每次不同的輸入會造成處理時間的不同。為了抵抗 `time attack`，我們需要使字串比較花費相同的時間量，無論輸入的 password 是什麼 >以下是不容易受到 `time attack` 的寫法 >>`user.password` : user 輸入的 password >`password` 為 system 中儲存的正確 password ```c const correctUser = (password, user.password) => { const state = []; for (let i = 0; i < password.length; i) { if (!user.password[i]) { state.push(false); } else { state.push(user.password.charCodeAt(i) ^ password.charCodeAt(i)); } } return state.length !== 0 && state.every(item => !item); } ``` system 中每一個 password 的長度是固定的，每次比較 password 是否相同時，使用正確 password 的長度作為比較次數，比較每一個字元的 Unicode 編碼是否相等，並且把每一次的比較結果存放到一個陣列中，最後再判斷陣列的每一個元素是否為 0（為 0 表示兩個字元相同） ## 第三周(上)測驗 3 ### part 1 題目程式碼 ```c= #include <stddef.h> struct node { int data; struct node *next; } *head; void insert(struct node *newt) { struct node *node = head, *prev = NULL; while (node != NULL && node->data < newt->data) { prev = node; node = node->next; } newt->next = node; if (prev == NULL) head = newt; else prev->next = newt; } ``` `insert` 函式將 linked list 做 insertion sort - 9~12 行的 while 迴圈是要找到 `newt` 要 insert 的位置 - `newt->next = node`：`newt` 要 insert 在 `node` 的前面，所以 `node` 要變成`newt->next` - `if (prev == NULL) head = newt`:如果要 insert 的位置在第一個位置，`head` 就會變成 `newt` - `prev->next = newt`：如果要 insert 的位置不在第一個，那原本 `node` 的前面，也就是 `prev`，`prev->next`現在會是 `newt` ### part 1 驗證程式碼在實際驗證程式碼的時候，我把 `insert` 函式多加上了一個 parameter `head`，才能得知現在 linked list 的開頭在哪 >當在輸入 `void insert(struct node ** head, struct node * newt)` 中 `newt` 的 data 的數值時，我是在 0~9 的範圍中，隨機取 10個數字，但彼此都不重複，讓輸入是隨機的 ```c #include <stddef.h> #include <stdio.h> #include <stdlib.h> #include <time.h> struct node { int data; struct node *next; } * head; struct node *newNode(int new_data) { struct node *new_node = (struct node *) malloc(sizeof(struct node)); new_node->data = new_data; new_node->next = NULL; return new_node; } void insert(struct node **head, struct node *newt) { struct node *node = *head, *prev = NULL; while (node != NULL && node->data < newt->data) { prev = node; node = node->next; } newt->next = node; if (prev == NULL) *head = newt; else prev->next = newt; } void printList(struct node *head) { struct node *temp = head; while (temp != NULL) { printf("%d\n", temp->data); temp = temp->next; } } int main() { struct node *head = NULL; struct node *newt; int rand_num[10]; int i, j, k; srand(time(NULL)); for (i = 0; i < 10; i++) { rand_num[i] = rand() % 10; for (j = i - 1; j >= 0; j--) if (rand_num[i] == rand_num[j]) i--; } for (k = 0; k < 10; k++) { newt = newNode(rand_num[k]); insert(&head, newt); } printList(head); } ``` >以下是輸出 ``` 0 1 2 3 4 5 6 7 8 9 ``` ### part 1 測量執行時間 >如果 input 是： 1~n 的範圍中，隨機取 n 個數字，但彼此都不重複 >隨著 n 的增加，執行時間呈現 O(n^2^) > ![](https://i.imgur.com/re718Kt.png) ### part 2 題目程式碼 ```c= void insert(struct node *newt) { struct node **link = &head; while (*link && (*link)->data < newt->data) link = &(*link)->next; newt->next = *link; *link = newt; } ``` - `struct node **link = &head`: `link` 是一個 pointer to a pointer ，指向 `head` 的 memory address - 程式碼 4~5 行的 while 迴圈，如果現在 `(*link)` 的 `data` 比要 insert 的 `newt` 的 `data` 要小，那就把`link`的改成 `(*link)`下一個 node 的 memory address - 在跑完 while 迴圈之後，`(*link)` 指的是我們所要 insert 的 `next` 的下一個 node - 之後就能將 `newt` insert 在我們想要的位置 ### part 2 驗證程式碼 >當在輸入 `void insert(struct node ** head, struct node * newt)` 中 `newt` 的 data 的數值時，我是在 0~9 的範圍中，隨機取 10個數字，但彼此都不重複，讓輸入是隨機的 ```c #include <stddef.h> #include <stdio.h> #include <stdlib.h> #include <time.h> struct node { int data; struct node *next; } * head; struct node *newNode(int new_data) { struct node *new_node = (struct node *) malloc(sizeof(struct node)); new_node->data = new_data; new_node->next = NULL; return new_node; } void insert(struct node **head, struct node *newt) { struct node **link = &(*head); while (*link && (*link)->data < newt->data) link = &(*link)->next; newt->next = *link; *link = newt; } void printList(struct node *head) { struct node *temp = head; while (temp != NULL) { printf("%d\n", temp->data); temp = temp->next; } } int main() { struct node *head = NULL; struct node *newt; int a = 10; int rand_num[a]; int i, j, k; srand(time(NULL)); for (i = 0; i < a; i++) { rand_num[i] = rand() % a; for (j = i - 1; j >= 0; j--) if (rand_num[i] == rand_num[j]) i--; } for (k = 0; k < a; k++) { newt = newNode(rand_num[k]); insert(&head, newt); } printList(head); } ``` >以下是輸出 ``` 0 1 2 3 4 5 6 7 8 9 ``` ### part 2 測量執行時間 >如果 input 是： 1~n 的範圍中，隨機取 n 個數字，但彼此都不重複 >第二種 `insert` 函式和第一種一樣，隨著 n 的增加，執行時間都呈現 O(n^2^)，但執行時間比第一種快 ![](https://i.imgur.com/1xsVAUP.png)