Linux kernel source macros: max, min

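---
tags: LINUX KERNEL, LKI
---

# Linux kernel source macros: `max`, `min`

> Compiled by: [jserv](http://wiki.csie.ncku.edu.tw/User/jserv)

:::success
This article examines how the `max` and `min` macros are implemented in the Linux kernel source. Beyond analyzing the practical considerations, it aims to show how much attention Linux kernel developers pay to engineering detail and to the [C language specification](https://hackmd.io/@sysprog/c-standards), as reflected in the continuous refinement of these macros.
:::

## The starting point

The `max` and `min` macros can simply be written as follows (for ease of discussion, we call this macro ==`MAX0`==):
```c
#define max(x, y) ((x) > (y) ? (x) : (y))
```

Note that the arguments `x` and `y` are each wrapped in parentheses (`(` and `)`) inside the expression, and so is the expression as a whole. This prevents the [ternary operator](https://en.wikipedia.org/wiki/%3F:) (the `?` and `:` above) from being parsed in an unintended way after macro expansion in uses such as `max(x, max(y, z))`. The code above looks thorough enough, but compare it with [linux/include/linux/minmax.h](https://github.com/torvalds/linux/blob/master/include/linux/minmax.h) in the Linux kernel source: surprisingly, `max` and `min` definitions that could be written in 2 lines have grown to many lines, and they do not look like ordinary C code. What is going on?

## Using the [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) keyword

[typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) is a GNU extension provided by the gcc compiler and later supported by the Intel C/C++ compiler (icc) and clang. It yields the type of an object at compile time. Quoting the GCC manual's description of [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html):
> There are two ways of writing the argument to [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html): with an expression or with a type. Here is an example with an expression:
>
> `typeof (x[0](1))`

This is one usage: for a complex expression such as `x[0](1)`, where `x` is an array of pointers to functions, the result is the type of the value those functions return. The manual then describes another usage, which involves exactly the `max` macro this article focuses on:
> [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) is often useful in conjunction with statement expressions. Here is how the two together can be used to define a safe "maximum" macro which operates on any arithmetic type and evaluates each of its arguments exactly once: (==`MAX1`==)
> ```c
> #define max(a, b) ({ \
>     typeof (a) _a = (a); \
>     typeof (b) _b = (b); \
>     _a > _b ? _a : _b; \
> })
> ```

This code first stores the macro arguments `a` and `b` (note: they may be complex expressions) in the variables `_a` and `_b`, so that the macro yields a "value" rather than an "expression". That value comes from the last statement inside `({` and `})`, which is also a GNU extension: `_a > _b ? 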
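_a : _b;`. Concretely, when the C preprocessor (`cpp` for short) expands `max(x++, ++y)`, this guarantees that `x++` and `++y` are each executed exactly **once**. Conversely, when macro expansion causes an expression to be evaluated more than once, we call it **double evaluation**. Consider the following code:
```c
#include <stdio.h>

#define max(a, b) (a > b ? a : b)

void doOneTime() { printf("called doOneTime!\n"); }
int f1() { doOneTime(); return 0; }
int f2() { doOneTime(); return 1; }

int main() {
    int result = max(f1(), f2());
    return 0;
}
```

After running it, we find that `doOneTime` is called **3 times**, although with `max` we expect only 2 calls. This is because, after macro expansion, `max(f1(), f2())` becomes:
```c
int result = (f1() > f2() ? f1() : f2());
```

Hence the `max` macro given in the GCC manual's [typeof](https://gcc.gnu.org/onlinedocs/gcc/Typeof.html) section avoids **double evaluation**. But Linux kernel developers are clearly not satisfied with that. Consider a more advanced concern: before actually comparing for `max` and `min`, first check whether the two data types match, and if they do not, emit a warning at compile time to alert kernel developers. The following code comes from [arch/powerpc/boot/types.h](https://github.com/torvalds/linux/blob/master/arch/powerpc/boot/types.h) in the Linux kernel source (==`MAX2`==):
```cpp=
#define max(x, y) ({ \
    typeof(x) _x = (x); \
    typeof(y) _y = (y); \
    (void) (&_x == &_y); \
    _x > _y ? _x : _y; })
```

Lines 2 and 3, as mentioned earlier, evaluate the arguments once during macro expansion and store the values in the local variables `_x` and `_y`. Line 4, `(void) (&_x == &_y);`, has no effect on the result. Its purpose is to check whether the two variables have the same type: whenever `_x` and `_y` differ in type, the corresponding pointer types differ as well, and comparing two expressions of different pointer types makes the compiler emit the following warning:
> "comparison of distinct pointer types lacks a cast"

so the problem is caught and reported early. Similar techniques appear many times in the Linux kernel source, always to prevent latent errors.

> Further reading: [Linux kernel source macros: BUILD_BUG_ON_ZERO](https://hackmd.io/@sysprog/c-bitfield)

## Avoiding name collisions

C has no direct [namespace](https://en.wikipedia.org/wiki/Namespace) support. Developers typically use `struct` to encapsulate variables and function pointers, and use explicit `static` to control the [visibility of symbols](https://gcc.gnu.org/wiki/Visibility). The macros above have a latent name collision problem. Consider the following usage (file name: `m.c`, using `MAX2`):
```c
int main() {
    int x = 1, _x = 2;
    return max(x , _x);
}
```

Compile and run:
```shell
$ gcc -o m m.c && ./m
$ echo $?
```

What result do we get? Surprisingly `1`, not the expected `2`. The caller's `_x` collides with the `_x` used inside the macro, so the variable `_x` ends up defined more than once. Use `gcc -E` to observe the macro expansion:
```c=
int x = 1, _x = 2;
return ({ typeof(x) _x = (x);
          typeof(_x) _y = (_x);
          (void) (&_x == &_y);
          _x > _y ? 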
_x : _y; });
```

Note that line 1 already defines the variable `_x` with the initial value `2`, but on line 2, inside the scope enclosed by `{` and `}`, `_x` is defined again. The `_x` that actually takes part in the comparison is therefore no longer the `2` we expect from line 1, but `1`. In other words, no meaningful comparison takes place in this usage, and the compiler emits no warning either, since this is legal C. This macro was used up to [Linux v4.8](https://elixir.bootlin.com/linux/v4.8.17/source/include/linux/kernel.h#L745), but [Linux v4.9](https://elixir.bootlin.com/linux/v4.9.275/source/include/linux/kernel.h#L752) changed it to the following definition (==`MAX3`==):
```c
#define __max(t1, t2, max1, max2, x, y) ({ \
    t1 max1 = (x); \
    t2 max2 = (y); \
    (void) (&max1 == &max2); \
    max1 > max2 ? max1 : max2; })
#define max(x, y) \
    __max(typeof(x), typeof(y), \
          __UNIQUE_ID(max1_), __UNIQUE_ID(max2_), \
          x, y)
```

Here `__UNIQUE_ID` is defined in [include/linux/compiler-gcc.h](https://elixir.bootlin.com/linux/v4.9.275/source/include/linux/compiler-gcc.h#L211):
```c
#define __UNIQUE_ID(prefix) \
    __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
```

As for the `__PASTE` macro:
```cpp
#define ___PASTE(a , b) a##b
#define __PASTE(a , b) ___PASTE(a, b)
```

In the C preprocessor, `##` performs [concatenation](https://gcc.gnu.org/onlinedocs/cpp/Concatenation.html), so `__PASTE(a, b)` becomes `ab`, which is also why the macro is named `PASTE`. Why is a macro that could be written in one line split into two? Again, to avoid unexpected macro expansion results: the extra level lets arguments such as `__COUNTER__` be expanded before `##` is applied.

> Further reading: [What you don't know about C: preprocessor applications](https://hackmd.io/@sysprog/c-preprocessor)

`__COUNTER__` is the key to understanding this code. It is another GNU extension. Quoting the GCC manual, [Common Predefined Macros](https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html):
> This macro expands to sequential integral values starting from 0. In conjunction with the ## operator, this provides a convenient means to generate unique identifiers. Care must be taken to ensure that `__COUNTER__` is not expanded prior to inclusion of precompiled headers which use it. Otherwise, the precompiled headers will not be used.
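Each use of `__COUNTER__` yields the next sequence number, so combining `##` with `__COUNTER__` avoids variable name collisions. Use `gcc -E` to observe how code using `MAX3` is expanded:
```c
int x = 1, _x = 2;
return ({ typeof(x) __UNIQUE_ID_max1_0 = (x);
          typeof(_x) __UNIQUE_ID_max2_1 = (_x);
          (void) (&__UNIQUE_ID_max1_0 == &__UNIQUE_ID_max2_1);
          __UNIQUE_ID_max1_0 > __UNIQUE_ID_max2_1 ? __UNIQUE_ID_max1_0 : __UNIQUE_ID_max2_1;
       });
```

Note that the trailing `0` and `1` in the variable names are the values obtained from `__COUNTER__`. They never repeat across expansions of the `max` macro.

## More checks

The evolution of the Linux kernel rarely follows a single path. Work usually proceeds along several lines at once, and sometimes those lines intersect unexpectedly.

One C99 feature is the VLA ([Variable-Length Array](https://en.wikipedia.org/wiki/Variable-length_array)), which lets the size of an array be decided at run time. On the Linux kernel stack, however, this has security implications. For this reason Linux v4.20 removed VLAs and added the `-Wvla` compiler flag, which detects VLAs and emits a compile-time warning, so that VLA code is not reintroduced by accident.

> Further reading: [The Linux Kernel Is Now VLA-Free: A Win For Security, Less Overhead & Better For Clang](https://www.phoronix.com/scan.php?page=news_item&px=Linux-Kills-The-VLA)

Originally this line of work did not conflict with the `max` macro above, but consider the following code:
```c
#define X 1
#define Y 2
int main() {
    char sym[max(X, Y)];
    return 0;
}
```

With `MAX3` and the `-Wvla` flag, we get the following warning:
```
warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
 |     char sym[max(X, Y)];
 |     ^~~~
```

Interestingly, with the original `MAX0` there is no warning at all, because the compiler can deduce that `max(X, Y)` is a constant and can determine the size of `sym` at compile time. In other words, the code above then contains no VLA. Clearly the `max` macro stands in the way of completely removing VLAs from the Linux kernel: how can we keep the checks discussed above while avoiding unnecessary warnings? At first, Linux kernel developer Kees Cook (who also championed removing VLAs) provided a separate macro for constants to avoid the warning from `-Wvla`:
```diff
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -744,8 +744,9 @@ char *resource_string(char *buf, char *end, struct resource *res,
 #define FLAG_BUF_SIZE		(2 * sizeof(res->flags))
 #define DECODED_BUF_SIZE	sizeof("[mem - 64bit pref window disabled]")
 #define RAW_BUF_SIZE		sizeof("[mem - flags 0x]")
-	char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
-		     2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
+#define SIMPLE_MAX(x, y)	((x) > (y) ? 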
(x) : (y))
+	char sym[SIMPLE_MAX(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
+			    2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
 	char *p = sym, *pend = sym + sizeof(sym);
 	int decode = (fmt[0] == 'R') ? 1 : 0;
```

> [[PATCH] vsprintf: Remove accidental VLA usage](https://lkml.org/lkml/2018/3/7/1077)

However, this makes the `max` macro less general. C distinguishes between a value of `const`-qualified type and a constant expression: the former is a variable declared with `const`, while the latter is what the compiler relies on to decide whether an array is a VLA.

In 2018, a fix for this issue landed in [kernel.h: Retain constant expression output for max()/min()](https://github.com/torvalds/linux/commit/3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91), whose commit message says:
> This patch updates the `min()`/`max()` macros to evaluate to a constant expression when called on constant expression arguments. This removes several false-positive stack VLA warnings from an x86 allmodconfig build when `-Wvla` is added.

In addition, in 2020 the `max` and `min` macros, originally defined in the `include/linux/kernel.h` header, were moved to [include/linux/minmax.h](https://github.com/torvalds/linux/blob/master/include/linux/minmax.h). See [kernel.h: split out min()/max() et al. helpers](https://github.com/torvalds/linux/commit/b296a6d53339a79082c1d2c1761e948e8b3def69) for the actual change. The macro definition we see now is as follows (==`MAX4`==):
```c
#define __is_constexpr(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
#define __typecheck(x, y) \
    (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
#define __no_side_effects(x, y) \
    (__is_constexpr(x) && __is_constexpr(y))
#define __safe_cmp(x, y) \
    (__typecheck(x, y) && __no_side_effects(x, y))
#define __cmp(x, y, op) ((x) op (y) ? (x) : (y))
#define __cmp_once(x, y, unique_x, unique_y, op) ({ \
    typeof(x) unique_x = (x); \
    typeof(y) unique_y = (y); \
    __cmp(unique_x, unique_y, op); })
#define __careful_cmp(x, y, op) \
    __builtin_choose_expr(__safe_cmp(x, y), \
        __cmp(x, y, op), \
        __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op))
#define max(x, y) __careful_cmp(x, y, >)
```

Comparing `MAX3` with `MAX4`, the main change is the introduction of the `__is_constexpr` macro, which determines whether a macro argument is a constant expression. Readers may think that code such as `8 ? 
((void *) ((long) (x) *0l)) : (int *) 8` is redundant (and feel secretly pleased about spotting it), or wonder:
> "Surely this is not yet another GNU extension?"

In fact this conditional expression follows the rules of the C99/C11 standard, and there is a great deal to learn from it, so please read on. We prepare the following test program to observe the `__is_constexpr` macro:
```c
#include <stdio.h>

#define Def 10
#define __is_constexpr(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *) ((long) (x) *0l)) : (int *) 8)))

enum test { Enum };

int main() {
    int Val = 10;
    const int Const_val = 10;
    int a = __is_constexpr(Val);
    int b = __is_constexpr(Const_val);
    int c = __is_constexpr(10);
    int d = __is_constexpr(Def);
    int e = __is_constexpr(Enum);
    printf("a:%d b:%d c:%d d:%d e:%d\n", a, b, c, d, e);
    return 0;
}
```

Running it produces the following output:
```
a:0 b:0 c:1 d:1 e:1
```

One by one:
* `Val` in `int Val = 10;` is not a constant expression
* `Const_val` in `const int Const_val = 10;` is not a constant expression
* `10` and `Def` from `#define Def 10` are both constant expressions
* `Enum` in `enum test { Enum };` is a constant expression; with no explicit value, it is `0`

We now analyze the `__is_constexpr` macro, whose body is:
```c
(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8 ) ) )
```

Consider `(long)(x) * 0l`. If `x` is a constant expression, the result is the constant expression `0`, as with `Enum` and `Def` in the test program above. Otherwise, it is merely the value `0`. Note the distinction between the two.

`((void *)((long)(x) * 0l))` casts that value to `void *`. According to C11 §6.3.2.3:
> An integer constant expression with the value 0, or such an expression cast to type `void *`, is called a **null pointer constant**.
> [A.3 Null Pointer Constant](https://www.gnu.org/software/libc/manual/html_node/Null-Pointer-Constant.html)

This is the crux of the `__is_constexpr` macro: casting the constant expression `0` to `(void *)` yields a **null pointer constant**. Conversely, if `x` is not a constant expression to begin with, multiplying it by `0` only gives the value `0`, and casting that does not give a null pointer constant but an ordinary `void *` pointer whose value happens to be `0`. So, when `x` is a constant expression, the following expression:
```c
8 ? ((void *) ((long) (x) *0l)) : (int *) 8
```

is equivalent to:
> 8 ? 
**[null pointer constant]** : (int *) 8

The next step is to analyze the [ternary operator](https://en.wikipedia.org/wiki/%3F:), whose second and third operands together determine the type of the result. This is the most puzzling part of the macro, and also the reason we should know the C language specification well. According to C11 §6.5.15, Conditional operator:
> If both the second and third operands are pointers or one is a null pointer constant and the other is a pointer, the result type is a pointer to a type qualified with all the type qualifiers of the types referenced by both operands. Furthermore, if both operands are pointers to compatible types or to differently qualified versions of compatible types, the result type is a pointer to an appropriately qualified version of the composite type; if one operand is a **null pointer constant**, the result has the type of the other operand; otherwise, one operand is a pointer to void or a qualified version of `void`, in which case the result type is a pointer to an appropriately qualified version of `void`.

In the C specification, the [ternary operator](https://en.wikipedia.org/wiki/%3F:) cannot be regarded as shorthand for if-else, and the types of its operands deserve attention. Given the rule above, consider the following code:
```c
1 ? (int *) 0 : (void *) 0
```

`(int *) 0` and `(void *) 0` have the same value. They differ in type:
* `(int *) 0` is a null pointer
* `(void *) 0` is a ==null pointer constant==

The headers shipped with many C compilers define `NULL` like this:
```cpp
#define NULL ((void *)0)
```

Note that `NULL` is not a built-in keyword of the C language. Its definition is left to the standard library implementer, which means we cannot expect `NULL` to be the same in every environment: in some it is defined as `(0)`. Furthermore, according to C11 §6.3.2.3:
> If a **null pointer constant** is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

In C, both `(void *) 0` and `0` are null pointer constants, but the C++ specification describes things differently ([C++ 03](https://en.wikipedia.org/wiki/C%2B%2B03)):
> A null pointer constant is an integral constant expression (expr.const) rvalue of integer type that evaluates to zero.

This means that in C++ only `0` is a null pointer constant. Once again we see that C and C++ are two quite different programming languages. We can test this with the following program (file name: `c1.c`):
```c
int main() { return 1 ? (int *) 0 : (void *) 0; }
```

Compile and run:
```shell
$ gcc c1.c ; ./a.out
$ echo $? 
```

The output is `0`, which means the value actually returned is a `0` of type `int *`, confirming C11 §6.5.15:
> if one operand is a **null pointer constant**, the result has ==the type of the other operand==.

Another case (file name: `c2.c`):
```c
int main() { return 1 ? (int *) 0 : (void *) 1; }
```

Tested the same way, it still outputs `0`, but the meaning is different. The expression `1 ? (int *) 0 : (void *) 1` contains no null pointer constant, only a null pointer, namely `(int *) 0`. According to the C specification:
> otherwise, one operand is a pointer to void or a qualified version of `void`, in which case the result type is a pointer to an appropriately qualified version of `void`.

Since `int` and `void` are not compatible types and one operand is a pointer to `void`, this last clause applies and the result should be `(void *) 0`. In other words, although `c1.c` and `c2.c` both yield `0` when run, the types differ. But how can we verify that the types differ? The GCC manual, [6.59 Other Built-in Functions Provided by GCC](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html), mentions `__builtin_types_compatible_p`:
> `int __builtin_types_compatible_p (type1, type2)`
> You can use the built-in function `__builtin_types_compatible_p` to determine whether two types are the same.

We prepare the following code:
```c
#include <stdio.h>

int main() {
    printf("%d\n", __builtin_types_compatible_p(
                       typeof(1 ? (int *) 0 : (void *) 0),
                       typeof(1 ? (int *) 0 : (void *) 1)));
    printf("%d\n", __builtin_types_compatible_p(
                       typeof((int *) 0),
                       typeof(0 ? (int *) 0 : (void *) 0)));
    printf("%d\n", __builtin_types_compatible_p(
                       typeof((void *) 0),
                       typeof(1 ? (int *) 0 : (void *) 1)));
    return 0;
}
```

Running it produces the following output:
```
0
1
1
```

That is, the result types of `1 ? (int *) 0 : (void *) 0` and `1 ? (int *) 0 : (void *) 1` are incompatible, because the former is `int *` and the latter is `void *`. The two `1`s that follow confirm the rules above. Having studied the C specification this closely, we finally have enough background to continue analyzing the `__is_constexpr` macro. When `x` is a constant expression, `__is_constexpr(x)` effectively does the following:
> (sizeof(int) == sizeof(`*`**[int pointer type]**)

Recall what C11 §6.5.15 says:
> if one operand is a **null pointer constant**, the result has ==the type of the other operand==.
As long as `x` is a constant expression, the result takes the type of the other side of `:`, namely that of `(int *)8`. Conversely, when `x` is not a constant expression, `__is_constexpr(x)` effectively does this:
> (sizeof(int) == sizeof(`*`**[void pointer type]**)

In gcc, `sizeof(void)` is `1`. This differs from the C specification, which does not allow `sizeof` to be applied to `void`, because `void` is an incomplete type. According to C11 §6.5.3.4:
> The `sizeof` operator shall not be applied to an expression that has function type or an incomplete type, to the parenthesized name of such a type, or to an expression that designates a bit-field member.

For this GNU extension, see [Arithmetic on void- and Function-Pointers](https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html). gcc introduces the nonstandard `sizeof(void) == 1` so that, in pointer arithmetic (the `+` and `-` operations), `void *ptr;` behaves the same as `char *ptr;`. Executing `ptr++;` then gives `ptr` a new value (namely `(char *) ptr + 1`); otherwise we would face the awkward situation where `ptr + 1 == ptr`. Since `sizeof(int)` is not `1`, the comparison is false for a non-constant `x` and true for a constant one, which is exactly what `__is_constexpr` needs. Now back to `MAX4`:
```c
#define __no_side_effects(x, y) \
    (__is_constexpr(x) && __is_constexpr(y))
#define __safe_cmp(x, y) \
    (__typecheck(x, y) && __no_side_effects(x, y))
#define __cmp(x, y, op) ((x) op (y) ? (x) : (y))
#define __cmp_once(x, y, unique_x, unique_y, op) ({ \
    typeof(x) unique_x = (x); \
    typeof(y) unique_y = (y); \
    __cmp(unique_x, unique_y, op); })
#define __careful_cmp(x, y, op) \
    __builtin_choose_expr(__safe_cmp(x, y), \
        __cmp(x, y, op), \
        __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op))
#define max(x, y) __careful_cmp(x, y, >)
```

Here `__builtin_choose_expr` is yet another GNU extension. From the GCC manual, [Other Built-in Functions Provided by GCC](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html):
> Built-in Function: `type __builtin_choose_expr (const_exp, exp1, exp2)`
> You can use the built-in function `__builtin_choose_expr` to evaluate code depending on the value of a constant expression. This built-in function returns exp1 if `const_exp`, which is an integer constant expression, is nonzero. Otherwise it returns exp2.
>
> This built-in function is analogous to the ‘? :’ operator in C, except that the expression returned has its type unaltered by promotion rules. 
Also, the built-in function does not evaluate the expression that is not chosen. For example, if `const_exp` evaluates to true, exp2 is not evaluated even if it has side effects.

It picks one of the given expressions at compile time based on a constant expression. The `max` macro in the Linux kernel checks whether both arguments are constant expressions. If so, it uses `MAX0`, the original form ([back to where it all began](https://www.books.com.tw/products/0010134356)?); otherwise, it uses the `MAX3` form.

Next we analyze `__typecheck`, defined in [linux/include/linux/kernel.h](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/kernel.h?id=3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91). At first glance one might think `__safe_cmp(x, y)` first checks whether the two types are the same and evaluates to false if they are not. Looking at the definition, however, `__typecheck(x, y)` does not work that way. Broken down:
```c
#define __typecheck(x, y) \
    ( !!( sizeof( (typeof(x) *)1 == (typeof(y) *)1 ) ) )
```

Analysis:
1. The two sides of the `==` operator are `(typeof(x) *) 1` and `(typeof(y) *) 1`, which convert the `int` value `1` into a pointer to the type of `x` or of `y`.
2. `==` compares whether the two sides are equal. C11 §6.5.9 requires its operands to satisfy one of the following constraints, and the expression yields the result of `==`.
    > - both operands have arithmetic type;
    > - both operands are pointers to qualified or unqualified versions of compatible types;
    > - one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void; or
    > - one operand is a pointer and the other is a null pointer constant.
3. `sizeof(X)` yields the size of `X`. C11 §6.5.9 paragraph 3 states that, whether or not the operands are equal, the result has type `int`.
    > The `==` (equal to) and `!=` (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false. The result has type int. For any pair of operands, exactly one of the relations is true.
4. 
Finally, `!!` converts a value to 0 or 1. Its operand here is always `sizeof(int)`, which is nonzero, so the final result is always 1.

The purpose of `__typecheck(x, y)` is to check, at compile time, for a type difference between the operands of the comparison. Through the [`-Wcompare-distinct-pointer-types`](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html) option, which is enabled by default, the compiler produces the corresponding warning. This does not affect how the `__cmp` macro works.

## Later changes to `min` and `max`

> [commit 867046](https://github.com/torvalds/linux/commit/867046cc7027703f60a46339ffde91a1970f2901)

```c
/* is_signed_type() isn't a constexpr for pointer types */
#define __is_signed(x) \
    __builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))), \
        is_signed_type(typeof(x)), 0)

/* True for a non-negative signed int constant */
#define __is_noneg_int(x) \
    (__builtin_choose_expr(__is_constexpr(x) && __is_signed(x), x, -1) >= 0)

#define __types_ok(x, y) \
    (__is_signed(x) == __is_signed(y) || \
     __is_signed((x) + 0) == __is_signed((y) + 0) || \
     __is_noneg_int(x) || __is_noneg_int(y))

#define __cmp_op_min <
#define __cmp_op_max >

#define __cmp(op, x, y) ((x) __cmp_op_##op (y) ? 
(x) : (y))

#define __cmp_once(op, x, y, unique_x, unique_y) ({ \
    typeof(x) unique_x = (x); \
    typeof(y) unique_y = (y); \
    static_assert(__types_ok(x, y), \
        #op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
    __cmp(op, unique_x, unique_y); })

#define __careful_cmp(op, x, y) \
    __builtin_choose_expr(__is_constexpr((x) - (y)), \
        __cmp(op, x, y), \
        __cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
```

Note the following macros:
- `__is_signed(x)`
- `__is_noneg_int(x)`
- `__types_ok(x, y)`

Combined with the `static_assert` in `__cmp_once`, they determine at compile time whether the types of `x` and `y` would cause problems in the comparison. If so, the `static_assert` fails and compilation stops with the error message shown.

### `__is_signed`

If the result of `is_signed_type` is a constant expression, `__is_signed` returns that result. Otherwise the query is about whether a pointer is signed, so the result is `0`.
```c
__builtin_choose_expr(
    __is_constexpr( is_signed_type(typeof(x)) ),
    is_signed_type(typeof(x)),
    0
)
```

The `is_signed_type` macro is defined in [`linux/compiler.h`](https://github.com/torvalds/linux/blob/master/include/linux/compiler.h#L242C1-L242C1):
```c
#define is_signed_type(type) (((type)(-1)) < (__force type)1)
```

`__force` marks a cast as intentionally forced, that is, `__attribute__((force))`. The macro checks whether $-1 < 1$ holds. When both values are of an unsigned type, the expression does not hold, because `-1` converts to the largest value of that type. If the argument type is a pointer, C11 §6.5.8 paragraph 5 states that a relational operator such as `<` compares two pointers according to the relative locations of the objects they point to. Since the objects pointed to cannot be determined at compile time, `is_signed_type` is not considered a constant expression in that case.
> When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to.
### `__is_noneg_int(x)`

If `x` is a constant expression of signed type, this tells whether `x` is greater than or equal to 0.
```c
#define __is_noneg_int(x) \
    (__builtin_choose_expr(__is_constexpr(x) && __is_signed(x), x, -1) >= 0)
```

### `__types_ok(x, y)`

The values may be compared as long as any one of the following holds:
- `__is_signed(x) == __is_signed(y)`: checks whether both are signed or both are unsigned; comparing operands of the same signedness causes no problem
- `__is_noneg_int(x)`: if `x` is a non-negative signed constant, then `y` is the unsigned one, and comparing an unsigned value with a non-negative signed value causes no problem
- `__is_noneg_int(y)`: if neither of the above holds, then `x` is unsigned, and we can check whether `y` is a non-negative signed constant
- `__is_signed((x) + 0) == __is_signed((y) + 0)`

```c
#define __types_ok(x, y) \
    (__is_signed(x) == __is_signed(y) || \
     __is_signed((x) + 0) == __is_signed((y) + 0) || \
     __is_noneg_int(x) || __is_noneg_int(y))
```

C11 §6.3.1.1 paragraph 2 defines the **integer promotions**: an operand whose integer type is narrower than `int` is converted to `int` if `int` can represent all values of the original type, and to `unsigned int` otherwise. The binary `+` in `(x) + 0` applies these promotions as part of the usual arithmetic conversions. So, as long as every value of the unsigned type fits within the range of the signed type, it can be treated as signed for the comparison. [[PATCH next v4 4/5] minmax: Allow comparisons of 'int' against 'unsigned char/short'](https://lkml.kernel.org/r/8732ef5f809c47c28a7be47c938b28d4@AcuMS.aculab.com) discusses this exceptional case: when `unsigned short/char` is compared with `signed int`, `unsigned short` and `unsigned char` are converted to `signed int`, and every value they can represent lies within the range of `signed int`, so the conversion does not affect the result of the comparison.
> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. 58) All other types are unchanged by the integer promotions.
>
> 58) The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary `+`, `-`, and `~` operators, and to both operands of the shift operators, as specified by their respective subclauses.

## Why not use an [inline function](https://en.wikipedia.org/wiki/Inline_function)?
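Following the evolution from `MAX0` and `MAX1` up to `MAX4`, we cannot help wondering: "Why not use an [inline function](https://en.wikipedia.org/wiki/Inline_function)?" Intuitively, writing the following code seems to eliminate the issues `MAX3` faces:
```c
static inline int max(int x, int y) {
    return x > y ? x : y;
}
```

This leaves type checking and argument passing (including evaluation) to the compiler. Kernel developers do not adopt this approach, for the following reasons:
1. An [inline function](https://en.wikipedia.org/wiki/Inline_function) is only a "suggestion" from the code to the compiler and is not always expanded. For example, gcc only attempts to inline a `static inline` function into its call site at optimization level `-O1` and above
2. It still cannot overcome the VLA detection issue that `MAX4` addresses
3. Because C lacks the template mechanism of C++, the [inline function](https://en.wikipedia.org/wiki/Inline_function) above would need an implementation per type, combined with the C11 `_Generic` keyword to select the right one, and such enumeration lacks generality
4. By contrast, `MAX4` looks complex but is independent of type, and it can use GNU extensions to perform the necessary checks at compile time. This makes it more flexible than C's [inline function](https://en.wikipedia.org/wiki/Inline_function) mechanism, and it is in fact a common technique in C projects of a certain size

Linux kernel development is closely tied to the GNU toolchain: not only gcc but also [binutils](https://www.gnu.org/software/binutils/) (the assembler, the linker, and related tools). Early Linux kernel developers ran into gcc optimization limitations and strange bugs, one of which was the gap described above between the semantics of `inline` and its actual effect. From the document [We don't use the "inline" keyword because it's broken](https://www.kernel.org/doc/local/inline.html), even before reading the details, we can guess at the "landmines" kernel developers stepped on:
> Current versions of gcc turned "inline" into a request (similar to the old "register" keyword), rendering it essentially useless. These versions of gcc are free to ignore any such requests to inline functions, as well as to inline functions without the keyword.
> ...
> Earlier attempts to work around this breakage by declaring functions "static inline" or "extern inline" (instructing the compiler never to emit a non-inline version of the function, breaking the build if necessary to detect when the compiler wasn't following instructions) worked for a while, but were again broken by newer releases of gcc.

Further reading: [What you don't know about C: compilers and optimization principles](https://hackmd.io/@sysprog/c-compiler-optimization)

## Conclusion

The `max` and `min` macros in the Linux kernel source fully demonstrate the flexibility of the C language and the ingenuity of Linux kernel developers. Although they rely on GNU extensions, they perform as many checks as possible at compile time while keeping run-time behavior efficient. Kernel developers move from discovering a problem, to proposing a solution, to raising it to an overall strategy, which lets Linux keep embracing all kinds of innovation. That is truly the beauty of the kernel.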