白耀升
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    # 2022q1 Homework3 (quiz3) contributed by < `jim12312321` > > [測驗題目](https://hackmd.io/@sysprog/linux2022-quiz3) ## 測驗 1 : 連續的 bitmask ```cpp #define GENMASK(h, l) \ (((~0UL) >> (LEFT)) & ((~0UL) >> (l) << (RIGHT))) ``` 解析︰ 先觀察程式碼,`~0UL` 等於 0xFFFFFFFFFFFFFFFF ,有出現 `shift` ,中間用 `AND` 相連,因此我們可以合理猜測本題中應該是想透過將 `~0UL` 經過特定的 `shift` 後再透過 `AND` 將不在 `[h,l]` 中的 1 消除。 > e.g. > 要產生 `GENMASK(6,4)` ,要先產生 0x000000000000007F (`~0UL` 右移 57 個 bits) 和 0xFFFFFFFFFFFFFFF0 (`~0UL` 左移 4 個 bits),再藉由 `AND` 產生 0x0000000000000070 。 由上述例子進行延伸討論,我們可以知道若要透過 `~0UL` 產生 0x000000000000007F 需要右移,反之要產生 0xFFFFFFFFFFFFFFF0 則需要左移。 再與程式碼比較後可知,`(~0UL) >> (LEFT)` 僅有右移,`(~0UL) >> (l) << (RIGHT)` 包含左移操作。因此可知若對應上述例子,`(~0UL) >> (LEFT)` 目的在於產生 0x000000000000007F ,`(~0UL) >> (l) << (RIGHT)` 目的在於產生 0xFFFFFFFFFFFFFFF0 ,不過由於在 `(~0UL) >> (l) << (RIGHT)` 中還含有右移行為,因此產生出來的 bitmask 應該為 0x0FFFFFFFFFFFFFF0 。 ==最後可以發現右移操作中的 57 來自於 64-1-h ,而左移操作中的 4 來自於 l。== `LEFT` $\rightarrow$ 63 - h `RIGHT` $\rightarrow$ l ### 延伸問題 2 > 比較 Linux 核心 GENMASK 巨集的實作,闡述其額外的考量 在 linux kernel v5.16.12 中的 [isst.h](https://elixir.bootlin.com/linux/latest/source/tools/power/x86/intel-speed-select/isst.h#L33) 中可以發現 GENMASK 的程式碼 ```cpp #define GENMASK(h, l) (((~0UL) << (l)) & (~0UL >> (sizeof(long) * 8 - 1 - (h)))) ``` 在這段程式碼中可以發現與本題中的 GENMASK 幾乎一致,僅差在 `(~0UL) << (l)` 的操作中沒有先左移 l ,但這對於答案是沒有差別的。 另外在 linux kernel v5.16.12 中的 [bits.h](https://elixir.bootlin.com/linux/latest/source/include/linux/bits.h#L37) 中也可以發現 GENMASK 的程式碼。 ```cpp #define GENMASK(h, l) \ (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l)) ``` 接著我們分別來看 `GENMASK_INPUT_CHECK(h, l)` 和 `__GENMASK(h, l)` - GENMASK_INPUT_CHECK ```cpp #if !defined(__ASSEMBLY__) #include <linux/build_bug.h> #define GENMASK_INPUT_CHECK(h, l) \ (BUILD_BUG_ON_ZERO(__builtin_choose_expr( \ __is_constexpr((l) > (h)), (l) > (h), 0))) #else /* * BUILD_BUG_ON_ZERO is not available in h files included from asm files, * disable the input check if that is the case. */ #define GENMASK_INPUT_CHECK(h, l) 0 #endif ``` 搭配 [BUILD_BUG_ON_ZERO](https://elixir.bootlin.com/linux/latest/source/tools/include/linux/build_bug.h#L8) 的說明 ```cpp #ifdef __CHECKER__ #define BUILD_BUG_ON_ZERO(e) (0) #else /* __CHECKER__ */ /* * Force a compilation error if condition is true, but also produce a * result (of value 0 and type int), so the expression can be used * e.g. in a structure initializer (or where-ever else comma expressions * aren't permitted). */ #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) #endif /* __CHECKER__ */ ``` 我們可知在 `BUILD_BUG_ON_ZERO(e)`中當 e 這個情況為 0 時沒事,而不為 0 時 (condition is true) ,印出錯誤訊息。 接著來看 [`__builtin_choose_expr`](https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) 的說明 >Built-in Function: type __builtin_choose_expr (const_exp, exp1, exp2) > >You can use the built-in function __builtin_choose_expr to evaluate code depending on the value of a constant expression. This built-in function returns exp1 if const_exp, which is an integer constant expression, is nonzero. Otherwise it returns exp2. 再來看 [`__is_constexpr`](https://elixir.bootlin.com/linux/latest/source/include/linux/const.h#L11) 的說明 ```cpp /* * This returns a constant expression while determining if an argument is * a constant expression, most importantly without evaluating the argument. * Glory to Martin Uecker <Martin.Uecker@med.uni-goettingen.de> */ #define __is_constexpr(x) \ (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8))) ``` 最後透過整理上述三個資訊可知,`GENMASK_INPUT_CHECK(h, l)` 的目的在於==確認 h > l== ,若是輸出 0;若否 (h <= l) ,則印出錯誤訊息。 - `__GENMASK` ```cpp #define __GENMASK(h, l) \ (((~UL(0)) - (UL(1) << (l)) + 1) & \ (~UL(0) >> (BITS_PER_LONG - 1 - (h)))) ``` 我們一樣來舉個例子說明(不舉例給自己看我真的腦袋卡住) > e.g. > 要產生 `GENMASK(6,4)` ,要先產生 0x000000000000007F (`~0UL` 右移 57 個 bits) 和 0xFFFFFFFFFFFFFFF0 (`~0UL` 左移 4 個 bits),再藉由 `AND` 產生 0x0000000000000070 。 這跟上面舉的例子是完全相同的,而觀察程式碼可以發現, `~UL(0) >> (BITS_PER_LONG - 1 - (h))` 的作用等同 `(~0UL) >> (63 - h)` ,一樣是為了左移 4 個 bits 產生 0x000000000000007F 。 接著來看 `(((~UL(0)) - (UL(1) << (l)) + 1)` 想做什麼。 - `(~UL(0))` 產生 0xFFFFFFFFFFFFFFFF - `(~UL(0)) - (UL(1) << (l))` 產生 0xFFFFFFFFFFFFFFEF - `((~UL(0)) - (UL(1) << (l)) + 1)` 產生 0xFFFFFFFFFFFFFFF0 ya! 結果和 `((~0UL) << (l))` 是等價的!只是這裡的想法是先目標位置 (l) 的 bit 設為 0 ,再透過 `+1` 產生我們想要的 bitmask 。(雖然我覺得想法上沒有 `((~0UL) << (l))` 來的簡潔和直覺就是了)。 總結來說在 linux kernel 中的 `GENMASK` 並沒有什麼很特別的技術,僅是在 linux kernel v5.16.12 理的 [bits.h](https://elixir.bootlin.com/linux/latest/source/include/linux/bits.h#L37) 中的 `GENMASK` 多了要求 h 恆大於 l 的限制罷了。 ### 延伸問題 3 >舉出 Linux 核心原始程式碼中二處 GENMASK 巨集和 include/linux/bitfield.h 的應用案例 在 linux 中有不少 GENMASK 巨集和 include/linux/bitfield.h 的應用案例(請看[這裡](https://github.com/torvalds/linux/search?q=%27%23include+%3Clinux%2Fbitfield.h%3E%27+%26%26+%27GENMASK%27&type=))。大部分跟裝置驅動程式或是網路挺有關係的,礙於能力有限,我會盡量理解後留下心得。 :::warning device driver 請稱呼全名「裝置驅動程式」,不要簡稱「驅動」,不然日後涉及到編譯器和資料庫,都有對應的 compiler driver 和 database driver,必然會混淆。 :notes: jserv ::: #### 應用案例 1 本段程式碼可以在 [linux/sound/soc/intel/keembay/kmb_platform.h ](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/sound/soc/intel/keembay/kmb_platform.h) 的第 64 到 66 行中找到。 ```cpp= /*line 64 to 66*/ /* Interrupt Flag */ #define TX_INT_FLAG GENMASK(5, 4) #define RX_INT_FLAG GENMASK(1, 0) ``` 可以猜測這是用來設立 flag 用以判斷該發送數據或是接收數據。 > 參考:[What do ‘TX’ and ‘RX’ refer to in the Network Charts?](https://www.copperegg.com/what-do-tx-and-rx-refer-to-in-the-network-charts/) 接著繼續追蹤 `TX_INT_FLAG` 和 `RX_INT_FLAG` ```cpp static inline void kmb_i2s_irq_trigger(struct kmb_i2s_info *kmb_i2s, u32 stream, int chan_nr, bool trigger) { u32 i, irq; u32 flag; if (stream == SNDRV_PCM_STREAM_PLAYBACK) flag = TX_INT_FLAG; else flag = RX_INT_FLAG; for (i = 0; i < chan_nr / 2; i++) { irq = readl(kmb_i2s->i2s_base + IMR(i)); if (trigger) irq = irq & ~flag; else irq = irq | flag; writel(irq, kmb_i2s->i2s_base + IMR(i)); } } ``` 可發現當 stream 狀態為播放時,將 flag 設定為傳輸模式;否則將 flag 設定為接收模式。 #### 應用案例2 本段程式碼可以在 [ linux/net/dsa/tag_ar9331.c](https://github.com/torvalds/linux/blob/9e9fb7655ed585da8f468e29221f0ba194a5f613/net/dsa/tag_ar9331.c) 的第 13 到 15 行中找到。 ```cpp=13 #define AR9331_HDR_VERSION 1 #define AR9331_HDR_VERSION_MASK GENMASK(15, 14) ``` 接著追蹤 `AR9331_HDR_VERSION_MASK` ,可以在同一個檔案中的第 36 行找到他。(下面程式碼中的第 10 行) ```cpp= static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb, struct net_device *dev) { struct dsa_port *dp = dsa_slave_to_port(dev); __le16 *phdr; u16 hdr; phdr = skb_push(skb, AR9331_HDR_LEN); hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION); hdr |= AR9331_HDR_FROM_CPU | dp->index; /* 0b10 for AR8216 and 0b11 for AR9331 */ hdr |= AR9331_HDR_RESERVED_MASK; phdr[0] = cpu_to_le16(hdr); return skb; } ``` 再來看看 [FIELD_PREP](https://github.com/torvalds/linux/blob/master/include/linux/bitfield.h#L99) 的說明。 ```cpp /** * FIELD_PREP() - prepare a bitfield element * @_mask: shifted mask defining the field's length and position * @_val: value to put in the field * * FIELD_PREP() masks and shifts up the value. The result should * be combined with other fields of the bitfield using logical OR. */ #define FIELD_PREP(_mask, _val) \ ({ \ __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \ ((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask); \ }) ``` 結合上述的內容我們可以推斷在 [ linux/net/dsa/tag_ar9331.c](https://github.com/torvalds/linux/blob/9e9fb7655ed585da8f468e29221f0ba194a5f613/net/dsa/tag_ar9331.c) 中的 `GENMASK` 目的是為了結合 `FIELD_PREP` 將特定的值左移到特定位置。 --- ## 測驗 2 : 記憶體向下對齊 在進入程式碼解析之前,先來複習什麼是記憶體向上對齊和向下對齊,在⟨[你所不知道的 C 語言:記憶體管理、對齊及硬體特性](https://hackmd.io/@sysprog/c-memory#data-alignment)⟩ 中的 [data alignment](https://hackmd.io/@sysprog/c-memory#data-alignment) 中有提到,記憶體對齊就是讓記憶體內的資料落在 4 個 bytes 或是 8 個 btyes 的倍數位置(當然也可以根據需求設置對齊任意位置) ,方便 cpu 抓取資料。 > 以下例子皆以對齊 4 個 bytes 為例。 - 向上對齊 - 將記憶體內的資料對齊到 4 bytes + 1 bytes 的位置上。<br>對應到記憶體定址中,就是最後 2 個 bits clear ,並對第 3 個 bit + 1 。 ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] address[label ="<h>Memory address|<a1>0x0|<a2>0x1|<a3>0x2|<a4>0x3|<a1>0x4|<a2>0x5|<a3>0x6|<a4>0x7",width = 1.8] space_before[label ="<h>Memory space before||<d1>data|<d2>data|<d3>data|<d4>data|||",width = 2.5] space_after[label ="<h>Memory space after|||||<d1>data|<d2>data|<d3>data|<d4>data",width = 2.5] address->space_before->space_after[weight = 10, style = invis] space_before->space_after } ``` > address (after align up) : $0001_2$ > address (before align up) : $0100_2$ - 向下對齊 - 將記憶體內的資料對齊到 4 bytes 的位置上。<br>對應到記憶體定址中,就是最後 2 個 bits clear。 ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] address[label ="<h>Memory address|<a1>0x0|<a2>0x1|<a3>0x2|<a4>0x3|<a1>0x4|<a2>0x5|<a3>0x6|<a4>0x7",width = 1.8] space_before[label ="<h>Memory space before||<d1>data|<d2>data|<d3>data|<d4>data|||",width = 2.5] space_after[label ="<h>Memory space after|<d1>data|<d2>data|<d3>data|<d4>data||||",width = 2.5] address->space_before->space_after[weight = 10, style = invis] space_before->space_after } ``` > address (after align up) : $0001_2$ > address (before align up) : $0000_2$ 看完說明之後就可以來解析程式碼了。 ```cpp= struct fd { struct foo *foo; unsigned int flags; }; static inline struct fd to_fd(unsigned long v) { return (struct fd){EXP1, v & 3}; } ``` 由於本題要將結構體中第一個成員的地址向下對齊 4 個 bytes ,又因為 `forward declaration` 的關係,所以在這個回傳式中要加入對 (struct foo \*) 的操作。 `EXP1` $\rightarrow$ (struct foo \*)(v & ~3) ### 延伸問題 2 >在 Linux 核心原始程式碼找出類似技巧的實作 (例如檔案系統) 並解說 在 [linux/include/uapi/linux/const.h](https://github.com/torvalds/linux/blob/master/include/uapi/linux/const.h) 中可以找到對齊相關的程式碼 > 以下程式碼節錄自 [const.h](https://github.com/torvalds/linux/blob/master/include/uapi/linux/const.h) 第 31,32 行 ```cpp= #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1) #define __ALIGN_KERNEL_MASK(x, mask) (((x) + (mask)) & ~(mask)) ``` - `__ALIGN_KERNEL_MASK` 針對特定的 bitmask (e.g. $15_{10} = 1111_2$) 進行向上對齊 - `__ALIGN_KERNEL` 在 `__ALIGN_KERNEL_MASK` 的基礎上,對特定長度的數字(通常是 $2^n ,n \in N$) 進行向上對齊 接著在 [linux/include/linux/align.h](https://github.com/torvalds/linux/blob/master/include/linux/align.h) 中可以找到向上對齊/向下對齊的程式碼。 > 以下程式碼節錄自 [align.h](https://github.com/torvalds/linux/blob/master/include/linux/align.h) 第 7 至 9 行 ```cpp= /* @a is a power of 2 value */ #define ALIGN(x, a) __ALIGN_KERNEL((x), (a)) #define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a)) ``` - `ALIGN` 本來在 `__ALIGN_KERNEL` 中就是向上對齊了,所以他就是向上對齊。 - `ALIGN_DOWN` 在要對齊的地址中先 `- ((a) - 1)` ,達到把 `a - 1`以下的位元全部 clear 並且不 `+1` ,完成向下對齊。 --- ## 測驗 3 ︰ reverse bits ```cpp= #include <stdint.h> uint8_t rev8(uint8_t x) { x = (x >> 4) | (x << 4); x = ((x & 0xCC) >> 2) | (EXP2); x = ((x & 0xAA) >> 1) | (EXP3); return x; } ``` 觀察程式碼行為可以得知,這段程式碼正在逐步將位元交換,一開始先交換前 4 個 bits 和後 4 個 bits,接著將交換完的 4 個 bits 內的 bits 兩兩互換,最後在將兩個 bits 互換。(有點 mergesort 的感覺) - origin ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] bits_0[label ="{7 |6|5|4|3|2|1|0}",width = 1.8] } ``` - reverse (4 bits) ``` x = (x >> 4) | (x << 4); ``` ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] bits_0[label ="{3|2|1|0|7|6|5|4}",width = 1.8] } ``` - reverse (2 bits) ``` x = ((x & 0xCC) >> 2) | (EXP2); ``` ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] bits_0[label ="{1|0|3|2|5|4|7|6}",width = 1.8] } ``` - reverse (1 bit) ``` x = ((x & 0xAA) >> 1) | (EXP3); ``` ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] bits_0[label ="{0|1|2|3|4|5|6|7}",width = 1.8] } ``` `EXP2` $\rightarrow$ (x & 0x33) << 2 `EXP3` $\rightarrow$ (x & 0x55) << 1 ### 延伸問題 2 > 在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景 在 [linux/lib/bitrev.c ](https://github.com/torvalds/linux/blob/master/lib/bitrev.c) 中可以發現由於反轉 8 bits 僅有 256 種可能,因此開發者直接把 8 bits 反轉的查詢表內建在裡面了。 > 節錄自 [bitrev.c ](https://github.com/torvalds/linux/blob/master/lib/bitrev.c) 中第 11 至 44 行 ```cpp= const u8 byte_rev_table[256] = { 0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0, 0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0, 0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8, 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8, 0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, 0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4, 0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec, 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc, 0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, 0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2, 0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea, 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa, 0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, 0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6, 0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee, 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe, 0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, 0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1, 0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9, 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9, 0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, 0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5, 0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed, 0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd, 0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3, 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3, 0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, 0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb, 0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7, 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7, 0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, 0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff, }; ``` 而在 [linux/include/linux/bitrev.h ](https://github.com/torvalds/linux/blob/master/include/linux/bitrev.h)中則可以看到基於 `byte_rev_table` 延伸出來的 16 位元、 32 位元版本。 ```cpp=16 static inline u8 __bitrev8(u8 byte) { return byte_rev_table[byte]; } static inline u16 __bitrev16(u16 x) { return (__bitrev8(x & 0xff) << 8) | __bitrev8(x >> 8); } static inline u32 __bitrev32(u32 x) { return (__bitrev16(x & 0xffff) << 16) | __bitrev16(x >> 16); } ``` 另外在 [bitrev.h](https://github.com/torvalds/linux/blob/master/include/linux/bitrev.h) 中有提供用於 reverse 編譯時常數的版本,作法就根本題是幾乎一模一樣了。 ```cpp=65 #define __constant_bitrev8(x) \ ({ \ u8 ___x = x; \ ___x = (___x >> 4) | (___x << 4); \ ___x = ((___x & (u8)0xCCU) >> 2) | ((___x & (u8)0x33U) << 2); \ ___x = ((___x & (u8)0xAAU) >> 1) | ((___x & (u8)0x55U) << 1); \ ___x; \ }) ``` 因此在 linux 中真正在使用 bit reverse 時也是要先考慮輸入值是否為編譯時常數。 ```cpp=98 #define bitrev8(x) \ ({ \ u8 __x = x; \ __builtin_constant_p(__x) ? \ __constant_bitrev8(__x) : \ __bitrev8(__x) ; \ }) ``` #### 應用情境 在很多裝置驅動程式相關的程式碼中都能看到 bit reverse 的身影,以下程式碼雖不能完整解釋,就當放上來作為實際應用情境參考。 在 [linux/drivers/media/usb/cx231xx/cx231xx-input.c ](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/media/usb/cx231xx/cx231xx-input.c) 的 `get_key_isdbt` 函式中可以看到以下程式碼。 ```cpp=39 scancode = bitrev8(cmd); dev_dbg(&ir->rc->dev, "cmd %02x, scan = %02x\n", cmd, scancode); ``` 又或者在 [linux/drivers/media/rc/img-ir/img-ir-nec.c](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/media/rc/img-ir/img-ir-nec.c) 中的 `img_ir_nec_scancode` 函式可以看到解析 `raw code` to `scan code` 時用到 `bitrev8` 。 ```cpp=30 request->scancode = bitrev8(addr) << 24 | bitrev8(addr_inv) << 16 | bitrev8(data) << 8 | bitrev8(data_inv); ``` --- ## 測驗 4 : foreach macro foreach 的概念很簡單,就是依序提取出特定 array 內的所有資料。 先來看程式碼。 ```cpp= #include <assert.h> #define _foreach_no_nullval(i, p, arr) \ assert((i) >= sizeof(arr) / sizeof(arr[0]) || (p)) #define foreach_int(i, ...) \ for (unsigned _foreach_i = (((i) = ((int[]){__VA_ARGS__})[0]), 0); \ _foreach_i < sizeof((int[]){__VA_ARGS__}) / sizeof(int); \ (i) = ((int[]){__VA_ARGS__, 0})[EXP4]) #define foreach_ptr(i, ...) \ for (unsigned _foreach_i = \ (((i) = (void *) ((typeof(i)[]){__VA_ARGS__})[0]), 0); \ (i); (i) = (void *) ((typeof(i)[]){__VA_ARGS__, \ NULL})[EXP5], \ _foreach_no_nullval(_foreach_i, i, \ ((const void *[]){__VA_ARGS__}))) ``` 無論是 `foreach_int` 或 `foreach_ptr` 目標都是為了走訪 array 中所有資料,對應到 c 中 for 的虛擬碼就會是: ```cpp= for(take the first element and get first index; assert(index < MAX_LEN_ARRAY); update the element and index) ``` 為求精簡,題目中在一行內就要更新 element (題目中的 `i`) 與 index (題目中的 `_foreach_i`) 因此使用 `++Variable` 的技巧。 `EXP4` $\rightarrow$ ++_foreach_i `EXP5` $\rightarrow$ ++_foreach_i ### 延伸問題 2 >在 Linux 核心原始程式碼找出類似技巧的實作並解說其應用場景 在 [drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.h](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2_tlv_multi.h) 中可以找到類似的實作技巧。 ```cpp=33 #define mlxfw_mfa2_tlv_foreach(mfa2_file, tlv, idx, from_tlv, count) \ for (idx = 0, tlv = from_tlv; idx < (count); \ idx++, tlv = mlxfw_mfa2_tlv_next(mfa2_file, tlv)) ``` 這邊的作法跟 [list.h](https://github.com/sysprog21/lab0-c/blob/master/list.h) 中的 `list_for_each` 之類的實作挺類似的,就不特別解說。 這類型的技巧可以用來檢查特定變數/結構體有沒有符合特定條件。在同個目錄中的 [mlxfw_mfa2.c](https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c) 可以看到應用的場景。 ```cpp=232 /* check that all the devices exist */ mlxfw_mfa2_tlv_foreach(mfa2_file, tlv, idx, mfa2_file->first_dev, mfa2_file->dev_count) { if (!tlv) { pr_err("Device TLV error\n"); return false; } /* Check each device */ if (!mlxfw_mfa2_file_dev_validate(mfa2_file, tlv, idx)) return false; } ``` --- ## 測驗 5 : LeetCode 29. Divide Two Integers ```c= #include <limits.h> int divide(int dividend, int divisor) { int signal = 1; unsigned int dvd = dividend; if (dividend < 0) { signal *= -1; dvd = ~dvd + 1; } unsigned int dvs = divisor; if (divisor < 0) { signal *= -1; dvs = ~dvs + 1; } int shift = 0; while (dvd > (EXP6)) shift++; unsigned int res = 0; while (dvd >= dvs) { while (dvd < (dvs << shift)) shift--; res |= (unsigned int) 1 << shift; EXP7; } if (signal == 1 && res >= INT_MAX) return INT_MAX; return res * signal; } ``` `EXP6` $\rightarrow$ dvs << shift `EXP7` $\rightarrow$ dvd -= dvs << shift --- ## 測驗 6 : LeetCode 149. Max Points on a Line 本題程式的目的在於將可連成一條線的點($\dfrac{y}{x}$ 相同者,在本題中 `x` 與 `y` 都有先除以 `gcd(x,y)` ,以確保 hash 值相同),存在 hash table 同個 hash 值的 linked list 中。 ```cpp= static bool can_insert(struct list_head *head, int p1, int p2) { struct point_node *pn; list_for_each_entry (pn, head, link) return EXP8; return true; } ``` 所以 `EXP8` 的目的在於判斷傳進來的 `(p1,p2)` 與原先 list 中的 `(p1,p2)` 是否一致。 `EXP8` $\rightarrow$ `p1 == pn->p1` or `p2 == pn->p2` ```cpp= if (can_insert(&heads[hash], i, j)) { struct point_node *pn = malloc(sizeof(*pn)); pn->p1 = i; pn->p2 = j; EXP9; } ``` 而 `EXP9` 的目的就在於將新的節點放進 `heads` 中對應 hash 的位置。 `EXP9` $\rightarrow$ list_add(&pn->link, &heads[hash]) --- ## 測驗 7 : ilog32 考慮題目一開始的要求 >針對給定的 32 位元無號數,計算扣除開頭的 0,最少需要多少位元才可儲存 相當於對特定的數字進行 $log_2$ 的操作再取 floor 。因此這題便是對前述概念的實作。 ```cpp= int ilog32(uint32_t v) { int ret = v > 0; int m = (v > 0xFFFFU) << 4; v >>= m; ret |= m; m = (v > 0xFFU) << 3; v >>= m; ret |= m; m = (v > 0xFU) << 2; v >>= m; ret |= m; m = EXP10; v >>= m; ret |= m; EXP11; return ret; } ``` 觀察題目後可以發現有類似的模組重複發生,如下所示。 ```cpp= m = (v > 0xFFU) << 3; v >>= m; ret |= m; ``` 每次對半檢查,判斷對半後較高位的那組中是否有值。若有則記錄判斷了幾個 bits ,反覆檢查直至全部檢查過。 到 `EXP10` 之前的程式碼檢查到 4 個 bits ,因此可知 `EXP10` 要檢查剩下未確定的 4 個 bits 中,較高位的 2 個 bits 是否為 0。 `EXP10` $\rightarrow$ (v > 0x3) << 1 理論上還要再進行一次針對剩下 2 個 bits 的操作, `ret` 的結果才會是最終答案,如下所示。 ```cpp= m = (v > 0x1); v >>= m; ret |= m; ``` 題目中顯然是希望最後這三道指令能濃縮成一道,首先 `v >>= m` ,與最後回傳值無關,不理他。剩下 `m = (v > 0x1)` 跟 `ret |= m` 。不過由於最初就是考慮到每次位移的 bits 都不會重複,因此在這個情況下 `|=` 等價於 `+=` 。 `EXP11` $\rightarrow$ `ret |= v > 1` or `ret += v > 1` --- ## 測驗 8 : binary tree 以下解說只挑有填空的位置講解。 ```cpp= while (*p != 0 && (*p)->data != d) { if (d < (*p)->data) EXP12; else EXP13; } ``` 根據二元搜尋樹的特性,若在當前節點無法匹配到指定值且當前節點不為葉節點,匹配值較當前值大則向右搜尋,反之向左。 `EXP12` $\rightarrow$ p = &(*p)->left `EXP13` $\rightarrow$ p = &(*p)->right ```cpp= tnode **r = &q->right; while ((*r)->left) r = EXP14; q->data = (*r)->data; q = *r; *r = q->right; ``` 這段在處理欲移除節點擁有同時擁有左右子樹的情況,此時應拿全部小於該數值中最接近該數值的值取代(左子樹中最右邊的值),或全部大於該數值中最接近該數值的值取代(右子樹中最左邊的值),而在本題中使用的是後者。 `EXP14` $\rightarrow$ &(*r)->left --- ## 測驗 9 : round up alignment 上面有解釋過何謂向上對齊,這邊簡單複習。 > 以下例子皆以對齊 4 個 bytes 為例。 - 向上對齊 - 將記憶體內的資料對齊到 4 bytes + 1 bytes 的位置上。<br>對應到記憶體定址中,就是最後 2 個 bits clear ,並對第 3 個 bit + 1 。 ```graphviz digraph G { rankdir = LR; splines = false; node[shape = "record"] address[label ="<h>Memory address|<a1>0x0|<a2>0x1|<a3>0x2|<a4>0x3|<a1>0x4|<a2>0x5|<a3>0x6|<a4>0x7",width = 1.8] space_before[label ="<h>Memory space before||<d1>data|<d2>data|<d3>data|<d4>data|||",width = 2.5] space_after[label ="<h>Memory space after|||||<d1>data|<d2>data|<d3>data|<d4>data",width = 2.5] address->space_before->space_after[weight = 10, style = invis] space_before->space_after } ``` > address (after align up) : $0001_2$ > address (before align up) : $0100_2$ 接著來看程式碼。 ```cpp= /* maximum alignment needed for any type on this platform, rounded up to a power of two */ #define MAX_ALIGNMENT 16 /* Given a size, round up to the next multiple of sizeof(void *) */ #define ROUND_UP_TO_ALIGNMENT_SIZE(x) \ (((x) + MAX_ALIGNMENT - MMM) & ~(NNN)) ``` 因此本題希望將 x 向上對齊到 4 的倍數的位置上 (e.g. $0000 0010_2 \rightarrow 00001 0000_2$) ,並且將最低 4 個 bits 清空。 `MMM` $\rightarrow$ 1 `NNN` $\rightarrow$ MAX_ALIGNMENT - 1 --- ## 測驗 10 : DIVIDE_ROUND_CLOSEST ```cpp= #define DIVIDE_ROUND_CLOSEST(x, divisor) \ ({ \ typeof(x) __x = x; \ typeof(divisor) __d = divisor; \ (((typeof(x)) -1) > 0 || ((typeof(divisor)) -1) > 0 || \ (((__x) > 0) == ((__d) > 0))) \ ? ((RRR) / (__d)) \ : ((SSS) / (__d)); \ }) ``` 用來進行判斷進入 `((RRR) / (__d))` 或 `((SSS) / (__d))` 的三個條件,主要目的便是拿來判別運算結果有沒有可能為負,若不可能就進 `((RRR) / (__d))` 否則進 `((SSS) / (__d))` 。 三個條件中須注意的是 `((typeof(x)) -1)` 這是用以將 -1 cast 成 `typeof(x)` ,假設 x 的 type 為 int ,則 `((typeof(x)) -1) > 0` 不成立,若 x 的 type 為 unsigned int ,則 `((typeof(x)) -1) > 0` 成立(之所以要檢查是否為 unsigned ,是因為跟 unsigned 運算的 signed 數值也會變 unsigned ,因此恆大於等於 0)。 考慮到在 c 中除法會無條件捨去,因此對於除完後餘數大於等於 0.5 的數值要補齊這一半,利用==加上==這一半的值讓他進位;而對於小於 0.5 的情況,有沒有補都不會造成進位,因此補上去也沒關係。而在負數的除法中情況相反,對於除完後餘數大於等於 0.5 的數值要補齊這一半,利用==減去==這一半的值讓他進位。 `RRR` $\rightarrow$ (__x) + ((__d) >> 1) `SSS` $\rightarrow$ (__x) - ((__d) >> 1) --- ## 測驗 11 : floor sqrt `fls(x)` 的目的在於找出最高位的 set bit 的位置。 >`fls(1) == 0`, `fls(2) == 1`, `fls(0x8000 0000 0000 0000) == 63` ```cpp= unsigned long i_sqrt(unsigned long x) { unsigned long b, m, y = 0; if (x <= 1) return x; m = 1UL << (fls(x) & ~1UL); while (m) { b = y + m; XXX; if (x >= b) { YYY; y += m; } ZZZ; } return y; } ``` 一直卡住沒什麼想法,在參考 [kdnvt](https://hackmd.io/@5ux520T8Sd-OoYfdxZR55Q/SkO7JHRgq#2022q1-Homework3-quiz3) 同學的解釋之後有了一點頭緒。 用 2 進位的邏輯 $N^2$ 可以被表示為 $(a_n+...+a_0)^2$ ,其中 $a_m$ ($0 \leq m \leq n$) 表示 $2^m$ 或 0 ,因此我們可以從 $a_n$ 找回 $a_0$ 並把結果記錄下來 $P_m = a_n + ... + a_0$ 。 為了決定 $P_m$ 中的 $a_m$ 屬於 $a^m$ 或 0 ,利用以下規則︰ >If $P_m^2 \leq N^2$, $P_{m+1} = P_m - 2^m$ >Else, $P_{m+1} = P_m$ :::info 對應的程式碼 ```cpp= if (x >= b) { YYY; y += m; } ``` ::: 並且為了每次都去計算 $P_m^2$ ,在每一次計算的時候,我們僅紀錄 $X_m = N^2-P_m^2$ ,並用 $X_{m+1} = X_m + Y_m$ 的方式更新 $X_m$ 。(其中 $Y_m = P_m^2 - P_{m+1}^2=2P_{m+1}a_m+a_m^2$) 接著要從最大的 n 開始往下找,也就是 `m = 1UL << (fls(x) & ~1UL)` 這行的作用,並令 $P_n = a_n$ 。然後我們用兩個新變數 $c_m$ 和 $d_m$ 分別儲存 $Y_m$ 中的兩個部份。 >$c_m=P_{m+1}2^{m+1}$ (本題的 `y`) >$d_m=(2^m)^2$ (本題的 `m`) 最後 $c_m$ 和 $d_m$ 會利用以下方式更新。 >$c_{m-1} = P_m2^m = P_{m+1}2^m + a_m2^m$. If $a_m = 2^m$, $c_{m-1} = \dfrac{c_m}{2}+d_m$. Else $c_{m-1} = \dfrac{c_m}{2}$ >$d_{m-1}=\dfrac{d_m}{4}$ `XXX` $\rightarrow$ y >>= 1 `YYY` $\rightarrow$ x -= b `ZZZ` $\rightarrow$ m >>= 2 ---

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully