2024q1 Homework5 (assessment)

# 2024q1 Homework5 (assessment) contributed by < [yy214123](https://github.com/yy214123) > ### CS:APP 3/e #### 第一章從 hello 這個程式開始: ```clike #include <stdio.h> int main() { printf("hello , world"); return 0; } ``` 介紹了該程式的生命週期，過去只知道 `xxx.c` 經過編譯後會產生 `xxx` 的可執行檔，這是第一次認識了整個編譯系統的不同執行階段。我將這個章節看完後，發現可以跟以前計算機組織及作業系統讀過的基本知識作更好的銜接。 #### 第二章以前學計組時很逃避數值表示相關的內容，一直覺得我是人類我只要學十進制就夠了，二進制、十六進制、補數等等是電腦的事。直到看了 [bitwise 操作](https://hackmd.io/@sysprog/c-bitwise)、[數值系統]([/eSVH5k6SQ7OrzZ5cT-uiKg](https://hackmd.io/@sysprog/c-numerics)) 才發現說其實數值表示非常有學問。但在研讀這個章節時，我對這個部分感到好奇（page23）： | 有號數 | 無號數 | 32位元 | 64 位元 | | ------ | ------ | ------ | ------- | | int | unsign | 4 byte | 4 byte | | int32_t | unsign32_t | 4 byte | 4 byte | :::info 兩者在 32 位元或 64 位元皆相同，為何不留下 int 就好？ > LP64, ILP64, `uint_fast32_t` ::: 而在書中的第 38 頁，在描述遮罩時有下方這一段話： >表達式 `~0` 將生成一個全 1 的遮罩，不管機器的 word size 是多少。儘管對於一個 32 位元機器來說，同樣的遮罩可以寫成 `0xFFFFFFFF`，但是這樣的程式碼不是可移植的。 :::info 我想確認我的理解是否正確，此處的不可移植是指在 32 位元機器：`0xFFFFFFFF` 若直接移植到 64 位元機器會變為：`0x00000000FFFFFFFF`，與全 1 的遮罩並不等價？ ::: ### 作業回顧 #### [M01: lab0](https://hackmd.io/@sysprog/linux2024-lab0/%2F%40sysprog%2Flinux2024-lab0-a) >[共筆紀錄](https://hackmd.io/@yy870530/linux2024-homework1) 這份作業對我來說很有挑戰。一開始，我甚至不熟悉Linux的基本操作，甚至從大一（2017年）之後，我就再也沒有接觸過 C 語言了，目前完成了指定的佇列操作，確保能通過 make test 的評分項目，也引入了 list_sort 和 tim_sort 進行效能比較，這對我來說是一次重要的學習和成長機會。然而，我目前遇到的困難是關於測試資料準備以及對應的統計分析，為什麼我設計的測試資料是 1000 筆？而每筆是由 1K 到 1000K 個長度介於 5 到 15 的隨機字串所組成，目前還是無法針對該部份提出對應的統計分析，會不會根本不需要這麼多？但這也讓我知道在統計這塊我有很大的缺失，還須進一步精進。 > 去閱讀 entropy #### [M02: quiz1+2](https://hackmd.io/@sysprog/BkmmEQCn6) >[共筆紀錄](https://hackmd.io/@yy870530/linux2024-homework2) ##### 第 `1` 週表單 - [x] 測驗 1 > 已提出挑選 pivot 的方式，尚未進行效能評比。 - [x] 測驗 2 > 已整合入 [lab0-c](https://github.com/yy214123/lab0-c) 並進行效能評比，但尚未將完整運作邏輯搞懂。 ##### 第 `2` 週表單 - [x] 測驗 1 > 已將完整運作邏輯搞懂，參考教材 [Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable)，目前正著手改進雜湊函數，原先是採 Division method，應可用 Multiplication Method 來予以改進。 - [x] 測驗 2 - [ ] 測驗 3 #### [M03: review](https://hackmd.io/@sysprog/linux2024-review) > [共筆紀錄](https://hackmd.io/yI-Rh8l7Q9uzWhQgBQbsnQ?view) 收穫了其他學員們（`56han`、`Ackerman666`、`164253`）給予的各項建議，大致上均已完成更新，每筆更新皆有整理對應 git commit 連結。我很喜歡 Code review 這個環節，每個人在撰寫程式的思維有所不同，通過這樣的交流，我得以從他人的視角發現並改善自己程式中未曾留意到的細節。 #### [M03: ttt](https://hackmd.io/@sysprog/linux2024-ttt) > [共筆紀錄](https://hackmd.io/yI-Rh8l7Q9uzWhQgBQbsnQ?view#%E5%B0%87-jservttt-%E5%B0%88%E6%A1%88%E6%95%B4%E5%90%88%E9%80%B2-lab0-c) 已將 [jserv/ttt](https://github.com/jserv/ttt) 整合進 [lab0-c](https://github.com/yy214123/lab0-c) 中，並完成在 `qtest` 中新增命令的作業需求，目前正著手進行 MCTS 相關的整合。 #### [M04: quiz3+4](https://hackmd.io/@sysprog/HkatSCZCT) >[共筆紀錄](https://hackmd.io/@yy870530/linux2024-homework4) ##### 第 `3` 週表單 - [x] 測驗 1 - [ ] 測驗 2 - [x] 測驗 3 - [ ] 測驗 4 - [x] 測驗 5 ##### 第 `4` 週表單 - [x] 測驗 1 > 尚為完成針對 LeetCode 477 撰寫出更高效的程式碼。 - [ ] 測驗 2 - [ ] 測驗 3 ___ ## 閱讀[〈因為自動飲料機而延畢的那一年〉](https://blog.opasschang.com/the-story-of-auto-beverage-machine-1/)的啟發是個很振奮人心的故事，讓人深思。 **如果你只會寫程式，而且對原子筆一竅不通，那會發生什麼悲劇就可想而知了。** > 對這段話頗有感觸，也反映了為什麼這門課程特別強調閱讀原始教材的重要性，一直以來我都在舉燭，只會在心中腦補程式碼，還沒著手進行 lab0-c 前，在如今這個生成式AI迅速發展的時代，我常常僅是打開 ChatGPT，描述我的目標，然後是一連串的「ctrl + c >> ctrl + v >> run」。當遇到 bug 時，我再次「ctrl + c >> ctrl + v」尋求 ChatGPT 的幫助來解決問題。剛開始寫 lab0-c 時，我也嘗試過這樣幹，然而，我發現從 ChatGPT 得到的回答往往不切實際。也對應到老師在 Google meeting 上提到的，如果只是一昧的使用這些工具，那並無法與他人拉開差距，其他科系的同學甚至比你還會問 ChatGPT 問題。 **"If I had eight hours to chop down a tree, I’d spend six hours sharpening my axe." – Abraham Lincoln** >逐漸意識到基本功的重要性，目前，我雖然沒辦法按時將每項作業完成，也未能在每週四課程開始前將教材確實看完，更不用說時常需要重複閱讀教材和相關的影片。但我發現，每當我重新閱讀教材時，總能獲得新的理解。那些在第一次閱讀時感到模糊不清的知識點，隨著閱讀次數的增加，逐漸浮現出新的見解和突破。 :::info 硬體設計就是不斷的重複「設計、製造、修正」的循環，想像中的設計會經過有誤差的方式被製造出來，製造出來的產品在現實中運作的效果不如預期，找到問題的根源後再重新修正，機械設計就是重複這樣的循環。只有把東西生出來做實驗，你才會知道該如何改進。 [〈因為自動飲料機而延畢的那一年(7)〉](https://blog.opasschang.com/the-story-of-auto-beverage-machine-7/) ::: :::info 事情如果太順利代表絕對有問題，而問題永遠會從意想不到的地方冒出來。 [〈因為自動飲料機而延畢的那一年(18)〉](https://blog.opasschang.com/the-story-of-auto-beverage-machine-18/) ::: :::info 除了資工系的學生不會寫程式，機械系的學生不會做機械，現在又多一條電工系的學生不會焊電路，這世界到底怎麼了啊。 [〈因為自動飲料機而延畢的那一年(22)〉](https://blog.opasschang.com/the-story-of-auto-beverage-machine-22/) ::: --- IEEE 754 sign bit, exp, m [ldexp](https://man7.org/linux/man-pages/man3/ldexp.3.html) ```c #include <stdio.h> #include <math.h> int main() { float x = 7.0; printf("=> %f\n", ldexpf(x, 1)); return 0; } ``` > 預期得到 `14.0` ```c float float_lshift(float x, int off) { // using bitwise operations, no multiplication // Definition: x * (1 << exp) ; int y = x; int exponet = y | 0x7f900000 ; exponet <<= (1 << off) ; return exponet; } ``` > 這邊從一開始就錯的一蹋糊塗，我將一個型別為 float 的變數值 assign 給另一個型別為 int 的變數，這樣的行為會抹除掉小數點的部分。 TODO: 修正程式碼並解釋 `float_lshift` 函數傳入兩個參數 `x` 及 `off`，最終需回傳 $x *2^{off}$。研讀 [你所不知道的 C 語言: 浮點數運算](https://hackmd.io/@sysprog/c-floating-point) 後發現有很方便的工具 [IEEE-754 Floating Point Converter](https://www.h-schmidt.net/FloatConverter/IEEE754.html)。在 c 語言中浮點數會以 IEEE 754 格式來儲存： ![image](https://hackmd.io/_uploads/rkJRYbmXC.png) 經觀察後可發現： `float_lshift(0.75,1)` $= 0.75 *2^{1} = 1.5$ ![image](https://hackmd.io/_uploads/SyhW9ZX7A.png) `float_lshift(0.75,2)` $= 0.75 *2^{2} = 3.0$ ![image](https://hackmd.io/_uploads/SyAGc-Q70.png) 會發現實際上會變動的只有中間綠色的 **Exponet** 部分，故在此實作中要先將 Exponet 提取出來。然而浮點數型別並不能直接進行 bitwise 操作，故須先進行型別的轉換： ```clike unsigned int *p = (unsigned int *)&x; ``` 接著先提取 exponet 的部分，我的想法是透過左移及右移來使 exponet 變更到較低的 8 bits 中： `S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM` -> `0 00000000 000000000000000EEEEEEEE` 如此便可直接與輸入參數的 `off` 進行加法運算。接著再向左位移，使 exponet 的部分回歸其正確的位置中： `0 EEEEEEEE(+off) 00000000000000000000000` 接著將原先輸入的 `x` 其 exponet 全設置為 0，與 `0 EEEEEEEE(+off) 00000000000000000000000` 進行 OR 運算即可得到正確的輸出結果。修正後的程式碼： ```clike float float_lshift(float x, int off) { // using bitwise operations, no multiplication // Definition: x * (1 << off) ; unsigned int *p = (unsigned int *)&x; int exponet = ((*p << 1) >> 24); exponet += off; exponet <<= 23; *p &= 0x807FFFFF; *p |= exponet; float result = *((float *)p); printf("%f",result); } ``` > 用 union 改寫 > 想了一下發現將 exponet 變更到較低的 8 bits 時，其實向右移 23 bits 即可： ```diff - int exponet = ((*p << 1) >> 24); + int exponet = (*p >> 23); ``` 其呈現的效果為： `S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM` -> `0 00000000 00000000000000SEEEEEEEE` 而與輸入的 `off` 進行加法後再左移恢復正確的位置，接著與 exponet 部分全 0 的 `x` 進行 OR 運算，相比原先的移動方式(需移動25次)，共減少了兩次的移動運算成本。 ### TODO: 第 7 週測驗題 (測驗 1 + 2), 第 9 週測驗題 (測驗 3) ## [第七週測驗](https://hackmd.io/@sysprog/linux2024-quiz7) ### 測驗 `1` ```clike #include <stdint.h> bool both_odd(uint32_t x, uint32_t y) { return (x & 1) && (y & 1); } ``` 如果 lsb 被設置為 1，則其值必為奇數，反之。而為減少運算量，將兩個 32 位元組合為一個 64 位元 ```clike static inline uint64_t bit_compound(uint32_t x, uint32_t y) { return ((0L + x) << 32) | ((y + 0L) & (-1L >> 32)); } ``` `y`= 0xAAAAAAAA `y`+ 0L = `0x00000000 00000000` = `0x00000000 AAAAAAAA` 假設 x 的 16 進制表示為 `0x12345678`，則相加並左移後為：`0x12345678 00000000`。 `x` 的型別是無號 32 位元整數，而 `0L` 是 64 位元，在 c99 有以下描述： :::info The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char. 〈 c99 6.3.1.1 Boolean, characters, and integers 〉 ::: 由此可知 long int 的 rank 比 int 高。 :::info if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type,then the operand with unsigned integer type is converted to the type of the operand with signed integer type. 〈 c99 - 6.3.1.8 Usual arithmetic conversionse 〉 ::: 所以 `x` 的型別會被轉為 64 位元的 long，將其與 `0L` 相加後向左移 32 位元，其結果如下： > 假設 x 的 16 進制表示為 `0x12345678`，則相加並左移後為：`0x12345678 00000000`。 > 假設 y 的 16 進制表示為 `0x12345678`，則相加後 `0x00000000 12345678 `。而 `(y + 0L` 同理。 :::warning 我對右移的部分有疑惑，為甚麼不直接： `return ((0L + x) << 32) | (0L + y);` 還需多做一個 `(-1L >> 32)`? > 提供更簡易程式碼 ::: Linux 核心中的 [lib/string.c](https://github.com/torvalds/linux/blob/master/lib/string.c) 有 [memchr](https://man7.org/linux/man-pages/man3/memchr.3.html) 實作： ```clike /** * memchr - Find a character in an area of memory. * @s: The memory area * @c: The byte to search for * @n: The size of the area. * * returns the address of the first occurrence of @c, or %NULL * if @c is not found */ void *memchr(const void *s, int c, size_t n) { const unsigned char *p = s; while (n-- != 0) { if ((unsigned char)c == *p++) { return (void *)(p - 1); } } return NULL; } ``` 這段程式會逐字元去檢查輸入的字串，直至與 `c` 相符。在規格書提到： :::info The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate type is added to it.) See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. 〈 c99 - 6.5.2.4 Postfix increment and decrement operators 〉 ::: 所以在 while 迴圈中，會先比對 `c` 是否等於 `*p`，接著將其增加 1（即指向下一個字元），故若有匹配，應將其指向上一個字元（`p - 1`）。以 SWAR 改寫 ```clike #include <limits.h> #include <stddef.h> #include <stdint.h> #include <string.h> /* Nonzero if X is not aligned on a "long" boundary */ #define UNALIGNED(X) ((long) X & (sizeof(long) - 1)) /* How many bytes are loaded each iteration of the word copy loop */ #define LBLOCKSIZE (sizeof(long)) /* Threshhold for punting to the bytewise iterator */ #define TOO_SMALL(LEN) ((LEN) < LBLOCKSIZE) #if LONG_MAX == 2147483647L #define DETECT_NULL(X) (((X) - (0x01010101)) & ~(X) & (0x80808080)) #else #if LONG_MAX == 9223372036854775807L /* Nonzero if X (a long int) contains a NULL byte. */ #define DETECT_NULL(X) \ (((X) - (0x0101010101010101)) & ~(X) & (0x8080808080808080)) #else #error long int is not a 32bit or 64bit type. #endif #endif /* @return nonzero if (long)X contains the byte used to fill MASK. */ #define DETECT_CHAR(X, mask) DETECT_NULL(X ^ mask) ``` 前面定義一些巨集，以下逐一說明其功能 `DETECT_NULL` 這個巨集會根據 `long` 型別位元數之差異（32位元、64位元）進行對應的 bitwise 操作。 `(X) - (0x01010101)`： > 以 byte 的觀點出發，當 `X` 有任一個 byte 為 `0`，與 `0x01010101` 相減後對應的 byte 會是 `0xFF`。接著上一步結果與 `~X` 進行 and 運算，當 `X` 有任一個 byte 為 `0`，則 `~X` 中對應的 byte 亦為 `0xFF`。舉兩個例子說明： > 假設 `A`、`B` 皆為 32 位元 long，其中 `A` 為`0x00FF00FF` 、`B` 為 `0x12345678`。首先看 `A`，特意將第一個 byte、第三個 byte 設為為 0： `A` - `0x01010101` = `0xFFFEFFFE` ; `0xFFFEFFFE` & `~A` = `0xFF00FF00` 接著看 `B`，不具為 0 的 byte： `B` - `0x01010101` = `0x11335577` ; `0x11335577` & `~B` = `0x01030107` 最後與 `0x80808080` 進行 and 運算，將 1byte 分為左右 4bits 來看，若原先具 0 byte，左邊的 4bits 其最高位必為 `1`。示意圖如下： ```graphviz digraph G { rankdir=TB; node [shape=record]; context1 [label="1|*|*|*|*|*|*|*"]; } ``` ```clike void *memchr_opt(const void *str, int c, size_t len) { const unsigned char *src = (const unsigned char *) str; unsigned char d = c; while (UNALIGNED(src)) { if (!len--) return NULL; if (*src == d) return (void *) src; src++; } if (!TOO_SMALL(len)) { /* If we get this far, len is large and src is word-aligned. */ /* The fast code reads the source one word at a time and only performs * a bytewise search on word-sized segments if they contain the search * character. This is detected by XORing the word-sized segment with a * word-sized block of the search character, and then checking for the * presence of NULL in the result. */ unsigned long *asrc = (unsigned long *) src; unsigned long mask = d << 8 | d; mask |= mask << 16; for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1) mask |= BBBB; while (len >= LBLOCKSIZE) { if (DETECT_CHAR(CCCC, mask)) break; asrc++; len -= LBLOCKSIZE; } /* If there are fewer than LBLOCKSIZE characters left, then we resort to * the bytewise loop. */ src = (unsigned char *) asrc; } while (len--) { if (*src == d) return (void *) src; src++; } return NULL; } ``` ```clike unsigned long mask = d << 8 | d; mask |= mask << 16; for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1) mask |= BBBB; ``` 這部分會將字元 `d` 重複擴展，假設 `d` 的值為 `0xAB`。首先，將 `d` 向左移動 8 個位元，再與其自身進行 OR 運算，結果為 `0xABAB`，並將其賦值給變數 mask。接著，將 `mask` 向左移動 16 個位元，再與其自身進行 OR 運算，結果為 `0xABABABAB`。最後，利用 `for` 迴圈判斷是否需要進一步擴展。根據前述的運算邏輯，可以推導出最終的結果應為 mask << 32，即 `0xABABABABABABABAB`。