sysprog
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Write
        • Owners
        • Signed-in users
        • Everyone
        Owners Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Help
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Write
Owners
  • Owners
  • Signed-in users
  • Everyone
Owners Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    --- tags: linux2024 --- # [2024q1](http://wiki.csie.ncku.edu.tw/linux/schedule) 第 3 週測驗題 :::info 目的: 檢驗學員對 bitwise 的認知 ::: ==[作答表單: 測驗 1-2](https://docs.google.com/forms/d/e/1FAIpQLSdz43uXjXJMAiDc8Z5gFxyVES-r8He5yfviYNvvYzrdDzINvA/viewform)== (針對 Linux 核心「設計」課程) ==[作答表達: 測驗 3-5](https://docs.google.com/forms/d/e/1FAIpQLSevPGVEubDJFqxsUvmHcOpK5xiwa1-whomsqNY64jTOeXE52w/viewform)== (Linux 核心「實作」課程) ### 測驗 `1` 以下程式碼嘗試計算開平方根: (版本一) ```c #include <math.h> int i_sqrt(int N) { int msb = (int) log2(N); int a = 1 << msb; int result = 0; while (a != 0) { if ((result + a) * (result + a) <= N) result += a; a >>= 1; } return result; } ``` 對應的測試程式: ```c #include <stdio.h> int main() { int N = 36; printf("%d\n", i_sqrt(N)); return 0; } ``` 編譯方式: ```shell $ gcc -o i_sqrt i_sqrt.c -lm ``` > 因為呼叫 [log2](https://man7.org/linux/man-pages/man3/log2.3.html),所以需要連結 libm (math library),即 `-lm` 選項 我們可改寫上述 `i_sqrt` 函式,使其不依賴 [log2](https://man7.org/linux/man-pages/man3/log2.3p.html) 並保持原本的函式原型 (prototype) 和精度 (precision)。 (版本二) ```c int i_sqrt(int N) { int msb = 0; int n = N; while (n > 1) { n >>= 1; msb++; } int a = 1 << msb; int result = 0; while (a != 0) { if ((result + a) * (result + a) <= N) result += a; a >>= 1; } return result; } ``` [Digit-by-digit calculation](https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Digit-by-digit_calculation) 提及常見的直式開方法的變形。將要開平方的數拆成 2 的冪相加 (若是十進位則為 10 的冪),也就是如下形式: $$ x = (000b_0b_1b_2\cdots b_{n-1}b_{n})^2 $$ 其中 $$ 000b_0b_1b_2\cdots b_{n-1}b_{n} $$ 為 bits pattern,其中 $b_0$ 為最高位的 1,最後的形式為 $$ [2\left(\sum\limits_{i = 0}^{n-1}{a_i}\right)+a_n]a_n $$ 的相加,其中每項就對應到上述程式碼的每一輪迴圈 ```c if (x >= b) x -= b, y += m; ``` 要判斷並可能減掉的 `b`。 原理是將欲求平方根的 $N^2$ 拆成: $$ N^2=(a_{n}+a_{n-1}+a_{n-2}+...+{a_0})^2, a_m=2^m\ or\ a_m=0 $$ 將其乘開得到 $$ \left[ \begin{array}{ccccc} a_0a_0 & a_1a_0 & a_2a_0 & ... & a_na_0 \\ a_0a_1 & a_1a_1 & a_2a_1 & ... & a_na_1 \\ a_0a_2 & a_1a_2 & a_2a_2 & ... & a_na_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_0a_n & a_1a_n & a_2a_n & ... & a_na_n \\ \end{array} \right] $$ 分成幾個部份觀察 $$ \left[ \begin{array}{c} a_0a_0 \\ \end{array} \right] $$ $$ \left[ \begin{array}{cc} & a_1a_0 \\ a_0a_1 & a_1a_1 \end{array} \right] $$ $$ \left[ \begin{array}{ccc} & & a_2a_0 \\ & & a_2a_1 \\ a_0a_2 & a_1a_2 & a_2a_2 \end{array} \right] $$ 原式可整理成 $$ \begin{split} \\ N^2 =& a_n^2+[2a_n+a_{n-1}]a_{n-1}+[2(a_n+a_{n-1})+a_{n-2}]a_{n-2}+...+[2(\displaystyle\sum_{i=1}^{n}a_i)+a_0]a_0 \\ =& a_n^2+[2P_n+a_{n-1}]a_{n-1}+[2P_{n-1}+a_{n-2}]a_{n-2}+...+[2P_1+a_0]a_0 \\ P_m =& a_n+a_{n-1}+...+a_m \\ \end{split} $$ $P_0=a_n+a_{n-1}+...+a_0$ 即為所求平方根 $N$ 顯然 $$ P_m = P_{m+1} + a_m$$ 我們的目的是從試出所有 $a_m$,$a_m=2^m$ or $a_m=0$。從 $m=n$ 一路試到 $m=0$,而求得 $a_m$ 只要試驗 $P_m^2 \leq N^2$ 是否成立,二種可能狀況: $$ \begin{cases} P_m = P_{m+1} + 2^m, & \text{if $P_m^2 \leq N^2$}\\ P_m = P_{m+1}, & \text{otherwise} \end{cases} $$ 但我們無法每輪去計算 $N^2 - P_m^2$ 以試驗 $a_m$,因為運算成本太高。一種想法是從上一輪的差值 $X_{m+1}$ 減去 $Y_m$ 得到 $X_{m}$ - $X_m = N^2 - P_m^2 = X_{m+1} - Y_m$ - $Y_m = P_m^2 - P_{m+1}^2 = 2P_{m+1}a_m+a_m^2$ 也就是紀錄上一輪的 $P_{m+1}$ 來計算 $Y_m$ 進一步調整,將 $Y_m$ 拆成 $c_m, d_m$ $$ c_m = P_{m+1}2^{m+1} \\ d_m = (2^m)^2 \\ Y_m=\left. \begin{cases} c_m+d_m & \text{if } a_m=2^m \\ 0 & \text{if } a_m=0 \end{cases} \right. $$ 藉由位元運算從 $c_m, d_m$ 推出下一輪 $c_{m-1}, d_{m-1}$ - $c_{m-1}=P_m2^m=(P_{m+1}+a_m)2^m=P_{m+1}2^m+a_m2^m=\begin{cases} c_m/2+d_m & \text{if }a_m=2^m \\ c_m/2 & \text{if }a_m=0 \end{cases}$ - $d_{m-1}=\dfrac{d_m}{4}$ 結合上述方法撰寫演算法來尋求$a_n+a_{n-1}+...+a_0$,假設從 $a_n$ 開始往下試。 因為是從 $a_n$ 開始,所以: - $X_{n+1}=N^2$ - $n$ 是最大的位數,使得 $a_n^2=(2^n)^2= d_n^2 = 4^n \leq N^2$ 所以用迴圈逼近 $d_n$, - $c_n = 0$ - $c_m=P_{m+1}2^{m+1}$。$a_n$ 已是已知最大,所以 $P_{n+1} = 0 \to c_n=0$ 因為 $d_m$ 恆大於 $0$,使用位移操作 $d_m$ 到最後必為 $0$,我們可用 `d != 0` 作為條件。又 $c_{-1} = P_0 = a_n+a_{n-1}+...+a_0$,算到最後的 $c_{-1}$ 即為所求 $N$,$a_n$ 也一併得知。 由上述的推論可知,求出根號需要求得 $c_{-1}$ 即為所求 $N$ 。故寫出迭代的關係,即可求得解。 首先,先判斷輸入的數值,是否符合求解根號的範圍。 ```c if (x <= 1) /* Assume x is always positive */ return x; ``` 此處不考慮針對負數的開根號操作 (涉及虛數的處理),面對 `0` 和 `1` 這樣的單調狀況時,直接回傳。且小數不能被整數型別 `int` 儲存,所以也不用討論。 `int m` 對應上述推論中的 $d_{m}$ , 由於首項 $d_n = (2^n)^2$,可知 ```c int m = 1UL << ((31 - __builtin_clz(x)) & ~1UL) ``` > [`__builtin_clz(x)`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) 為 GCC 的內建函式。 > Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined. 其功能為找到 x 最高位數前面有幾個 0 , x 為 0 時,未定義。 最後,我們將程式碼改寫如下: (版本三,部分遮蔽) ```c int i_sqrt(int x) { if (x <= 1) /* Assume x is always positive */ return x; int z = 0; for (int m = 1UL << ((31 - __builtin_clz(x)) & ~1UL); m; m >>= AAAA) { int b = z + m; z >>= BBBB; if (x >= b) x -= b, z += m; } return z; } ``` :::success 延伸問題: 1. 解釋上述程式碼運作原理並重新整理數學式 (原題目故意略去某些細節),並嘗試用[第 2 週測驗題](https://hackmd.io/@sysprog/linux2024-quiz2)提到的 [ffs](https://man7.org/linux/man-pages/man3/ffs.3.html) / fls 取代 `__builtin_clz`,使程式不依賴 GNU extension,且提供分支和無分支 (branchless) 的實作。 2. 在 Linux 核心原始程式碼找出對整數進行平方根運算的程式碼,並解說其應用案例,至少該包含 [block](https://github.com/torvalds/linux/tree/master/block) 目錄的程式碼。 3. 閱讀〈[Analysis of the lookup table size for square-rooting](https://web.ece.ucsb.edu/~parhami/pubs_folder/parh99-asilo-table-sq-root.pdf)〉和〈[Timing Square Root](https://web.archive.org/web/20210208132927/http://assemblyrequired.crashworks.org/timing-square-root/)〉,嘗試撰寫更快的整數開平分根運算程式。 ::: --- ### 測驗 `2` 針對正整數在相鄰敘述進行 mod 10 和 div 10 操作,如何減少運算成本? ```c static void str_add(char *b, char *a, char *res, size_t size) { int carry = 0; for (int i = 0; i < size; i++) { int tmp = (b[i] - '0') + (a[i] - '0') + carry; carry = tmp / 10; tmp = tmp % 10; res[i] = tmp + '0'; } } ``` 可利用除法原理將 mod 10 和 div 10 合併。根據除法原理: $f = g \times Q + r$ 其中 * $f$: 被除數 * $g$: 除數 * $Q$: 商 * $r$: 餘數 對應的程式碼: ```c carry = tmp / 10; tmp = tmp - carry * 10; ``` 參考《[Hacker's Delight](http://web.archive.org/web/20180517023231/http://www.hackersdelight.org/divcMore.pdf)》,採用 bitwise operation 來實作除法,因為 $10$ 有包含 $5$ 這個因數,無法完全用 $2$ 的冪項來表示,因此結果會是不精確的。然而,觀察上面的程式碼後可以發現, `tmp` 不可能會大於 `19` ,因此只需要考慮 `19`~`0` 的情況即可。 我們的目標是,得到 `tmp / 10` 且直到小數點後第一位都是精確的。 為此,我們可提出一個猜想,假設我目標的最大數是 `n` , `l` 則是比 `n` 還要小的非負整數。假設 $n = ab$ ($a$ 是十位數 b 是個位數),且 $l = cd$ ($c$ 是十位數,$d$ 是個位數),則只要以 $n$ 算出來的除數在 $n$ 除以該除數後在精確度內,則 $l$ 除與該除數仍然會在精確度內。證明如下: 假設 $x$ 是除數, $$ a.b\leq\frac{n}{x}\leq a.b9\\ \Rightarrow\frac{n}{a.b9}\leq x\leq\frac{n}{a.b} $$ 1. 可見 $x$ 的右邊是 $10$,因此一定在精確度內。 2. $x$ 的左邊: $$ c.d\leq l\times\frac{a.b9}{n}\leq c.d9\\ \Rightarrow c.d\leq cd\times\frac{a.b9}{ab}\leq c.d9\\ $$ * $cd\times\frac{a.b9}{ab}$ 的左邊顯然成立 * $cd\times\frac{a.b9}{ab}$ 的右邊: $$ c.d + \frac{0.09cd}{ab}\leq c.d + 0.09 $$ 因為 $ab > cd$ 因此上述式子依然成立。 於是前述猜想也成立,接下來只需要針對 `19` 來做討論即可。 $$ 1.9\leq \frac{19}{x}\leq 1.99\\ \Rightarrow 9.55\leq x\leq10 $$ 接下來只需要找到這之中可以用除法來表示的除數即可。 找除數的方法是,透過 bitwise operation 得到的算式必定是 $\frac{an}{2^N}$ 其中 $N$ 為非負整數, $a$ 正整數。因此只需要透過查看 $2^N$ 再配對適合的 $a$ 即可。 其中,$2^N=128, a=13,\frac{128}{13}\approx9.84$ 便是一個可用的解。 接著,嘗試透過 bitwise operation 來配對數值。 ```c unsigned d2, d1, d0, q, r; d0 = q & 0b1; d1 = q & 0b11; d2 = q & 0b111; q = ((((tmp >> 3) + (tmp >> 1) + tmp) << 3) + d0 + d1 + d2) >> 7; r = tmp - (((q << 2) + q) << 1); ``` 關於這段程式碼,以 $13$ 為例,可發現 $13 = 8 + 4 + 1 = 2^3 + 2^2 + 2^0$ ,因此,首先要用 bitwise operation 得到 $\frac{13tmp}{8}$ $\frac{tmp}{8}+\frac{tmp}{2}+tmp$ ```c (tmp >> 3) + (tmp >> 1) + tmp ``` 這時會發生一個問題,即右移後,會有部分位元被捨棄,因此要另行處理。 ```c d0 = q & 0b1; d1 = q & 0b11; d2 = q & 0b111; ``` 最後合併前述操作。 ```c ((((tmp >> 3) + (tmp >> 1) + tmp) << 3) + d0 + d1 + d2) ``` 考慮 $128$ ,也就是右移 $7$ 個位元 ```c q = ((((tmp >> 3) + (tmp >> 1) + tmp) << 3) + d0 + d1 + d2) >> 7; ``` mod 10 就是 `tmp` 減去 div 10 的結果乘 $10$ ```c r = tmp - (((q << 2) + q) << 1); ``` 其中 `(((q << 2) + q) << 1)` 就是 ($q \times 4 + q) \times 2$,亦即 $10\times q$ 測試程式: ```c #include <stdint.h> #include <stdio.h> int main() { for (int n = 0; n <= 19; n++) { unsigned d2, d1, d0, q, r; d0 = q & 0b1; d1 = q & 0b11; d2 = q & 0b111; q = ((((n >> 3) + (n >> 1) + n) << 3) + d0 + d1 + d2) >> 7; r = n - (((q << 2) + q) << 1); printf("q: %d r: %d\n", q, r); } return 0; } ``` 執行結果: ``` q: 0 r: 0 q: 0 r: 1 q: 0 r: 2 q: 0 r: 3 q: 0 r: 4 q: 0 r: 5 q: 0 r: 6 q: 0 r: 7 q: 0 r: 8 q: 0 r: 9 q: 1 r: 0 q: 1 r: 1 q: 1 r: 2 q: 1 r: 3 q: 1 r: 4 q: 1 r: 5 q: 1 r: 6 q: 1 r: 7 q: 1 r: 8 q: 1 r: 9 ``` 結果符合預期。 可包裝為以下函式: (部分遮蔽) ```c #include <stdint.h> void divmod_10(uint32_t in, uint32_t *div, uint32_t *mod) { uint32_t x = (in | 1) - (in >> 2); /* div = in/10 ==> div = 0.75*in/8 */ uint32_t q = (x >> 4) + x; x = q; q = (q >> 8) + x; q = (q >> 8) + x; q = (q >> 8) + x; q = (q >> 8) + x; *div = (q >> CCCC); *mod = in - ((q & ~0x7) + (*div << DDDD)); } ``` 使用案例: ```c unsigned div, mod; divmod_10(193, &div, &mod); ``` 延伸閱讀: * [Modulus without Division, a tutorial](http://homepage.cs.uiowa.edu/~dwjones/bcd/mod.shtml) :::success 延伸問題: 1. 參照《[Hacker's Delight](https://en.wikipedia.org/wiki/Hacker%27s_Delight)》和 [CS:APP 第二章](https://hackmd.io/@sysprog/CSAPP-ch2),解釋上述程式碼運作原理,並對照 [Instruction tables](https://www.agner.org/optimize/instruction_tables.pdf),以分析最初具備除法操作的程式碼和最終完全不依賴除法和乘法的實作程式碼裡頭,指令序列佔用的 CPU 週期數量。 2. 練習撰寫不依賴任何除法指令的 `% 9` (modulo 9) 和 `% 5` (modulo 5) 程式碼,並證明在 `uint32_t` 的有效範圍可達到等效。 > 實作不同於《[Hacker's Delight](http://web.archive.org/web/20180517023231/http://www.hackersdelight.org/divcMore.pdf)》的程式碼 ::: --- ### 測驗 `3` 考慮 ilog2 可計算以 2 為底的對數,且其輸入和輸出皆為整數。以下是其一可能的實作: (版本一) ```c int ilog2(int i) { int log = -1; while (i) { i >>= 1; log++; } return log; } ``` 我們可改寫為以下程式碼: (版本二) ```c static size_t ilog2(size_t i) { size_t result = 0; while (i >= AAAA) { result += 16; i >>= 16; } while (i >= BBBB) { result += 8; i >>= 8; } while (i >= CCCC) { result += 4; i >>= 4; } while (i >= 2) { result += 1; i >>= 1; } return result; } ``` 亦可運用 GNU extension `__builtin_clz` 來改寫,注意輸入若是 `0` 則無定義: (版本三) ```c int ilog32(uint32_t v) { return (31 - __builtin_clz(DDDD)); } ``` 請補完程式碼,使其運作符合預期。 AAAA, BBBB, CCCC 是十進位數值,而 DDDD 是表示式。 書寫規範: * 以最簡潔的形式撰寫,且符合[作業一](https://hackmd.io/@sysprog/linux2024-lab0)排版規範 (近似 Linux 核心程式碼排版風格) * 書寫 `x + 1` 一類的表示式時,偏好變數在前、常數在後,且二元運算子前後有空白字元 :::success 延伸問題: 1. 解釋上述程式碼運作原理 2. 在 Linux 核心原始程式碼找出 log2 的相關程式碼 (或類似的形式),並解說應用案例 ::: --- ### 測驗 `4` [Exponentially Weighted Moving Average](https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average) (EWMA; 指數加權移動平均) 是種取平均的統計手法,並且使經過時間越久的歷史資料的權重也會越低,以下為 EWMA 的數學定義: $S_t = \left\{ \begin{array}{l} Y_0& t = 0 \\ \alpha Y_t + (1 - \alpha)\cdot S_{t-1}& t > 0 \\ \end{array} \right.$ - $\alpha$ 表示歷史資料加權降低的程度,介在 0 ~ 1 之間,越高的 $\alpha$ 會使歷史資料減少的越快 - $Y_t$ 表示在時間 $t$ 時的資料點 - $S_t$ 表示在時間 $t$ 時計算出的 EWMA 接著將上述的數學式展開,可以得到以下的結果 $\begin{split} S_t &= \alpha Y_t + (1 - \alpha)\cdot S_{t-1} \\ &= \alpha Y_t + (1 - \alpha)(\alpha Y_{t-1} + (1 - \alpha)\cdot S_{t-2}) \\ &= \alpha Y_t + (1 - \alpha)(\alpha Y_{t-1} + (1 - \alpha)(\alpha Y_{t-2} + (1 - \alpha)\cdot S_{t-3})) \\ &= \alpha (Y_t + (1 - \alpha)Y_{t-1} + (1 - \alpha)^2Y_{t-2} + \ldots + (1 - \alpha)^kY_{t-k} + \ldots + Y_0) \end{split}$ 從結果可以得知第前 $k$ 筆資料為 $(1 - \alpha)^kY_{t-k}$ 其加權為 $(1 - \alpha)^k$ 為指數的模樣,這也是為何該手法被稱為 EWMA。 Alpha 的選擇 > 參照 [Exponential Moving Average](https://tttapa.github.io/Pages/Mathematics/Systems-and-Control-Theory/Digital-filters/Exponential%20Moving%20Average/Exponential-Moving-Average.html) 和 [Frequency Response of the Running Average Filter](https://ptolemy.berkeley.edu/eecs20/week12/freqResponseRA.html) 該式是個 IIR 的 Low pass filter,於是可得出 DC 值,而 DC 值就是數值的平均值。另外不同的 $\alpha$ 會對 frequency response 產生影響,造成 lowpass filter 的 passband 的頻寬不同。如下圖所示。 ![image](https://hackmd.io/_uploads/H1ZQAQDTT.png) | | $\alpha$數值 | 頻寬 | 取得正確平均的時間 | 受noise的影響 | | ------------ | ------------ | --- | ------------------ | ------------- | | $\alpha$越小 | $1\over64$ | 越小 | 越慢 | 越小 | | $\alpha$越大 | $1\over4$ | 越大 | 越快 | 越大 | ```c static inline int is_power_of_2(unsigned long n) { return (n != 0 && ((n & (n - 1)) == 0)); } ``` 以下是 EWMA 的可能實作,請閱讀其原始程式碼的註解,嘗試補完實作。(顏重用測驗 `3` 的 `ilog2`) ```c #include <assert.h> /* Exponentially weighted moving average (EWMA) */ struct ewma { unsigned long internal; unsigned long factor; unsigned long weight; }; /* This section contains universal functions designed for computing the EWMA. * It maintains a structure housing the EWMA parameters alongside a scaled * internal representation of the average value, aimed at minimizing rounding * errors. Initialization of the scaling factor and the exponential weight (also * known as the decay rate) is achieved through the ewma_init(). Direct access * to the structure is discouraged; instead, interaction should occur * exclusively via the designated helper functions. */ /** * ewma_init() - Initialize EWMA parameters * @avg: Average structure * @factor: Scaling factor for the internal representation of the value. The * highest achievable average is determined by the formula * ULONG_MAX / (factor * weight). To optimize performance, the scaling * factor must be a power of 2. * @weight: Exponential weight, also known as the decay rate, dictates the rate * at which older values' influence diminishes. For efficiency, the weight * must be set to a power of 2. * * Initialize the EWMA parameters for a given struct ewma @avg. */ void ewma_init(struct ewma *avg, unsigned long factor, unsigned long weight) { if (!is_power_of_2(weight) || !is_power_of_2(factor)) assert(0 && "weight and factor have to be a power of two!"); avg->weight = ilog2(weight); avg->factor = ilog2(factor); avg->internal = 0; } /** * ewma_add() - Exponentially weighted moving average (EWMA) * @avg: Average structure * @val: Current value * * Add a sample to the average. */ struct ewma *ewma_add(struct ewma *avg, unsigned long val) { avg->internal = avg->internal ? (((avg->internal << EEEE) - avg->internal) + (val << FFFF)) >> avg->weight : (val << avg->factor); return avg; } /** * ewma_read() - Get average value * @avg: Average structure * * Returns the average value held in @avg. */ unsigned long ewma_read(const struct ewma *avg) { return avg->internal >> avg->factor; } ``` 書寫規範: * `EEEE` 和 `FFFF` 為表示式 * 以最簡潔的形式撰寫,且符合[作業一](https://hackmd.io/@sysprog/linux2024-lab0)排版規範 (近似 Linux 核心程式碼排版風格) :::success 延伸問題: 1. 解釋上述程式碼運作原理 2. 在 Linux 核心原始程式碼找出 EWMA 的相關程式碼 (或類似的形式),並解說應用案例,至少該涵蓋無線網路裝置驅動程式。 > 可對照〈[圖解傅立葉分析](https://hackmd.io/@sysprog/fourier-transform)〉 ::: --- ### 測驗 `5` 延伸測驗 `3`,已知輸入必為大於 `0` 的數值 (即 `x > 0`),以下程式碼可計算 $\lceil log_2(x) \rceil$ ,也就是 [ceil](https://man7.org/linux/man-pages/man3/ceil.3.html) 和 [log2](https://man7.org/linux/man-pages/man3/log2.3.html) 的組合並轉型為整數: ```cpp int ceil_ilog2(uint32_t x) { uint32_t r, shift; x--; r = (x > 0xFFFF) << 4; x >>= r; shift = (x > 0xFF) << 3; x >>= shift; r |= shift; shift = (x > 0xF) << 2; x >>= shift; r |= shift; shift = (x > 0x3) << 1; x >>= shift; return (r | shift | x > GGG) + 1; } ``` 請補完,使其行為符合預期,GGGG 是十進位整數。 :::success 延伸問題: 1. 解釋上述程式碼運作原理 2. 改進程式碼,使其得以處理 `x = 0` 的狀況,並仍是 branchless 3. 在 Linux 核心原始程式碼找出 ceil 和 log2 的組合 (類似的形式),並解說應用案例 (提示: 在 Linux 核心排程器就有) :::

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully