Yourui
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
      • Invitee
    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Versions and GitHub Sync Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
Invitee
Publish Note

Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

Your note will be visible on your profile and discoverable by anyone.
Your note is now live.
This note is visible on your profile and discoverable online.
Everyone on the web can find and read all notes of this public team.
See published notes
Unpublish note
Please check the box to agree to the Community Guidelines.
View profile
Engagement control
Commenting
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
  • Everyone
Suggest edit
Permission
Disabled Forbidden Owners Signed-in users Everyone
Enable
Permission
  • Forbidden
  • Owners
  • Signed-in users
Emoji Reply
Enable
Import from Dropbox Google Drive Gist Clipboard
   owned this note    owned this note      
Published Linked with GitHub
Subscribed
  • Any changes
    Be notified of any changes
  • Mention me
    Be notified of mention me
  • Unsubscribe
Subscribe
# 2024q1 Homework2 (quiz1+2) contributed by < [`yourui1017`](https://github.com/yourui1017) > ## 第一週測驗題 ### 測驗1 ::: success 1. 解釋程式碼的運作原理,提出改進方案並予以實作。 2. 使用 Linux 核心風格的 List API 改寫上述程式碼,並針對鏈結串列,提出可避免最差狀況的快速排序實作,應設計效能評比的測試程式。 ::: :::spoiler 遮蔽的部分 首先先針對遮蔽的部分進行推敲。 * AAAA ```c node_t *list_tail(node_t **left) { while ((*left) && (*left)->next) left = &(AAAA); return *left; } ``` 可以看到函式的目的是要找出 list 的最後一個 node,因此在指標的指標 left 還沒指到 NULL 之前,left 都應該持續往下一個節點移動,故 AAAA = (*left)->next。 * BBBB ```c int list_length(node_t **left) { int n = 0; while (*left) { ++n; left = &(BBBB); } return n; } ``` 可以看到函式的目的是要找出 list 的長度,因此在指標的指標 left 還沒指到 NULL 之前,left 都應該持續往下一個節點移動,故 BBBB = (*left)->next。 * CCCC ```c while (p) { node_t *n = p; p = CCCC; list_add(n->value > value ? &right : &left, n); } ``` 該段程式碼推測是想要用 left 或 right 指標指向當前的節點(若比 pivot 小,則 left 會指向該節點,相反亦然。),且因為 list_add 會改變當前節點指向的位置,因此需要用 p 指標儲存下一個節點位置,故 CCCC = p->next * DDDD && EEEE ```c begin[i] = left; end[i] = DDDD; begin[i + 1] = pivot; end[i + 1] = pivot; begin[i + 2] = right; end[i + 2] = EEEE; ``` 在 [2024q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2024-quiz1) 中有提到,begin 和 end 這兩個矩陣是為了用堆疊模擬原本的遞迴呼叫。 其中,begin 依照順序會儲存 left、pivot 和 right 的首指標,而 end 應該要儲存 left、pivot 和 right 的尾指標,故 DDDD = list_tail(&left) EEEE = list_tail(&right) ::: #### quick_sort 運作原理 1. pivot 會被指定在第一個節點的位置,並根據 pivot 的值與其他節點進行比較,如果小於 pivot 則會使用到 left 指標,將這些節點串接起來。相反的,如果大於 pivot 則會使用到 right 指標。 2. 將當前 left 和 right 指標指向的位置儲存在 begin 和 end 當中,以堆疊的形式模擬遞迴的寫法。 3. 直到分割成單一節點後,堆疊才會被 pop 出來,並且新增到 result 中。 * 首先 pivot 會被指定在第一個節點的位置。 ```graphviz digraph _graph_name_ { rankdir=LR; # 順序由左至右(上下是"TD") graph [fontname="DFKai-SB"]; # 此三行是設定字型 node [fontname="DFKai-SB"]; # 中文務必指定 edge [fontname="DFKai-SB"]; # 不然可能會出現亂碼 # {rank=same R D} # 同層(強迫上下) # 這邊是 node (節點) head[shape=plaintext,fontcolor = red] pivot[shape = plaintext] A[label = "3"] B[label = "4"] C[label = "1"] # 不給 label 就會直接用名稱 D[label = "2"] E[label = "NULL"] # 這邊是 edge (邊) pivot->A head->A->B->C->D->E # 可以一直連 } ``` * 因為之後要存進堆疊中,所以將 pivot 斷開成單一節點。 ```graphviz digraph _graph_name_ { rankdir=LR; # 順序由左至右(上下是"TD") graph [fontname="DFKai-SB"]; # 此三行是設定字型 node [fontname="DFKai-SB"]; # 中文務必指定 edge [fontname="DFKai-SB"]; # 不然可能會出現亂碼 head[shape=plaintext,fontcolor = red] pivot[shape = plaintext] A[label = "3"] B[label = "4"] C[label = "1"] # 不給 label 就會直接用名稱 D[label = "2"] E[label = "NULL"] F[label = "NULL"] # 這邊是 edge (邊) pivot->A A->F head->A B->C->D->E # 可以一直連 } ``` * 使用 left 指標指向比 pivot 還要小的節點;使用 right 指標指向比 pivot 還要大的節點 ```graphviz digraph _graph_name_ { rankdir=LR; # 順序由左至右(上下是"TD") graph [fontname="DFKai-SB"]; # 此三行是設定字型 node [fontname="DFKai-SB"]; # 中文務必指定 edge [fontname="DFKai-SB"]; # 不然可能會出現亂碼 head[shape=plaintext,fontcolor = red] pivot[shape = plaintext] left[shape=plaintext,fontcolor = blue] right[shape=plaintext,fontcolor = blue] A[label = "3"] B[label = "4"] C[label = "1"] # 不給 label 就會直接用名稱 D[label = "2"] E[label = "NULL"] F[label = "NULL"] G[label = "NULL"] # 這邊是 edge (邊) pivot->A A->E head->A right->B->F left->D->C->G } ``` * 儲存到 begin 和 end 的堆疊中。 ```graphviz digraph _graph_name_ { rankdir=LR; # 順序由左至右(上下是"TD") graph [fontname="DFKai-SB"]; # 此三行是設定字型 node [fontname="DFKai-SB"]; # 中文務必指定 edge [fontname="DFKai-SB"]; # 不然可能會出現亂碼 head[shape=plaintext,fontcolor = red] pivot[shape = plaintext] left[shape=plaintext,fontcolor = blue] right[shape=plaintext,fontcolor = blue] begin0[label = "begin[0]", shape=plaintext] begin1[label = "begin[1]", shape=plaintext] begin2[label = "begin[2]", shape=plaintext] end0[label = "end[0]", shape=plaintext] end1[label = "end[1]", shape=plaintext] end2[label = "end[2]", shape=plaintext] A[label = "3"] B[label = "4"] C[label = "1"] # 不給 label 就會直接用名稱 D[label = "2"] E[label = "NULL"] F[label = "NULL"] G[label = "NULL"] # 這邊是 edge (邊) pivot->A A->E head->A right->B->F left->D->C->G begin0->D end0->C begin1->A end1->A begin2->B end2->B } ``` * 因為 begin[2] 指向單一節點,將begin[2] 節點 pop 出來,並且加入 result 中。 ```graphviz digraph _graph_name_ { rankdir=LR; # 順序由左至右(上下是"TD") graph [fontname="DFKai-SB"]; # 此三行是設定字型 node [fontname="DFKai-SB"]; # 中文務必指定 edge [fontname="DFKai-SB"]; # 不然可能會出現亂碼 head[shape=plaintext,fontcolor = red] pivot[shape = plaintext] left[shape=plaintext,fontcolor = blue] right[shape=plaintext,fontcolor = blue] begin0[label = "begin[0]", shape=plaintext] begin1[label = "begin[1]", shape=plaintext] begin2[label = "begin[2]", shape=plaintext] end0[label = "end[0]", shape=plaintext] end1[label = "end[1]", shape=plaintext] end2[label = "end[2]", shape=plaintext] result[shape=plaintext,fontcolor = red] A[label = "3"] B[label = "4"] C[label = "1"] # 不給 label 就會直接用名稱 D[label = "2"] E[label = "NULL"] F[label = "NULL"] G[label = "NULL"] # 這邊是 edge (邊) pivot->A A->E head->A right->B->F left->D->C->G begin0->D end0->C begin1->A end1->A begin2->B end2->B result->B } ``` 重複以上步驟,直到堆疊中沒有東西即可完成排序。 :::warning 問題: 1. 能否減少堆疊需要使用到的記憶體大小?(在此方法中的最差狀況下的 max_level 為2*n -1,想辦法減少 max_level) 2. 能否藉由改變 pivot 的選擇方式,改善最差狀況的快速排序實作? ::: **max_level** 在單向鍊結串列的 `quicksort` 中,最差狀況下的 max_level 會是 `2*n-1`,因為在堆疊中會儲存到 NULL,導致空間的浪費。 * 解決辦法 * 使用環狀鍊結串列改善,如此一來就不需要儲存 NULL,導致空間浪費,使 max_level 降為 n+1。 **pivot** * 解決辦法 * 隨機選取 pivot 的位置,避免 pivot 選取到最小或最大的值上,導致最差狀況的發生。 **使用 Linux 核心風格的 List API 改寫** * 將 `quick_sort` 使用 List API 改寫,精簡程式碼並增加易讀性。 ```diff node_t *L = begin[i], *R = end[i]; - if (L != R) { - node_t *pivot = L; - value = pivot->value; - node_t *p = pivot->next; - pivot->next = NULL; - - while (p) { - node_t *n = p; - p = p->next; - list_add(n->value > value ? &right : &left, n); + struct list_head *L = begin[i], *R = list_tail(begin[i]); + if (L->next != R) { + + // struct list_head *pivot = L->next; + // element_t *pivot_entry = list_entry(pivot, element_t, list); + // list_del_init(pivot); + + struct list_head *pivot = select_pivot(L); + element_t *pivot_entry = list_entry(pivot, element_t, list); + list_del_init(pivot); + + element_t *entry, *safe; + list_for_each_entry_safe(entry, safe, L, list){ + list_move(&entry->list, ((strcmp(entry->value, pivot_entry->value) > 0) ? right : left)); } ``` **實驗結果** * 使用 `perf stat --repeat 5 -e cache-references,instructions,cycles ./main ` 指令計算效能 * 將 pivot 設為第一個節點 * ::: spoiler 結果 ``` Size 10 Performance counter stats for './main' (5 runs): 63,068 cache-references ( +- 1.08% ) 1,132,332 instructions # 0.83 insn per cycle ( +- 0.40% ) 1,368,200 cycles ( +- 5.47% ) 0.0007218 +- 0.0000499 seconds time elapsed ( +- 6.91% ) ``` ``` Size 100 Performance counter stats for './main' (5 runs): 66,776 cache-references ( +- 3.01% ) 1,484,857 instructions # 0.97 insn per cycle ( +- 0.15% ) 1,530,310 cycles ( +- 5.18% ) 0.0008237 +- 0.0000400 seconds time elapsed ( +- 4.86% ) ``` ``` Size 1000 Performance counter stats for './main' (5 runs): 70,253 cache-references ( +- 2.31% ) 25,101,039 instructions # 2.92 insn per cycle ( +- 0.02% ) 8,585,825 cycles ( +- 1.91% ) 0.0025760 +- 0.0000577 seconds time elapsed ( +- 2.24% ) ``` ``` Size 10000 Performance counter stats for './main' (5 runs): 6,822,270 cache-references ( +- 7.93% ) 2,266,479,072 instructions # 3.46 insn per cycle ( +- 0.00% ) 655,706,507 cycles ( +- 0.98% ) 0.16660 +- 0.00165 seconds time elapsed ( +- 0.99% ) ``` ``` Size 100000 Performance counter stats for './main' (5 runs): 4,499,222,447 cache-references ( +- 0.86% ) 235,180,643,595 instructions # 3.40 insn per cycle ( +- 2.69% ) 69,161,118,332 cycles ( +- 1.74% ) 18.203 +- 0.198 seconds time elapsed ( +- 1.09% ) ``` ::: * 隨機選取 pivot * :::spoiler 結果 ``` Size 10 Performance counter stats for './main' (5 runs): 62,809 cache-references ( +- 1.47% ) 1,135,753 instructions # 0.84 insn per cycle ( +- 0.24% ) 1,352,131 cycles ( +- 3.86% ) 0.0007908 +- 0.0000288 seconds time elapsed ( +- 3.64% ) ``` ``` Size 100 Performance counter stats for './main' (5 runs): 67,972 cache-references ( +- 2.31% ) 1,567,539 instructions # 0.89 insn per cycle ( +- 0.17% ) 1,754,914 cycles ( +- 3.85% ) 0.0010084 +- 0.0000784 seconds time elapsed ( +- 7.77% ) ``` ``` Size 1000 Performance counter stats for './main' (5 runs): 82,764 cache-references ( +- 3.04% ) 28,938,278 instructions # 2.35 insn per cycle ( +- 0.01% ) 12,310,694 cycles ( +- 1.21% ) 0.01314 +- 0.00108 seconds time elapsed ( +- 8.23% ) ``` ``` Size 10000 Performance counter stats for './main' (5 runs): 28,574,071 cache-references ( +- 5.96% ) 2,522,037,077 instructions # 2.46 insn per cycle ( +- 0.00% ) 1,023,736,256 cycles ( +- 1.45% ) 0.26103 +- 0.00528 seconds time elapsed ( +- 2.02% ) ``` ``` Size 100000 Performance counter stats for './main' (5 runs): 9,848,674,522 cache-references ( +- 0.54% ) 246,962,163,428 instructions # 2.39 insn per cycle ( +- 0.00% ) 103,465,962,705 cycles ( +- 0.47% ) 26.567 +- 0.251 seconds time elapsed ( +- 0.95% ) ``` ::: * 發現結果其實沒有變比較好,推測是因為鍊結串列每次在選取隨機點時需要走訪節點,且此走訪的時間會超過 worst case 搜尋的時間,導致效能下降。 ```c struct list_head *select_pivot(struct list_head *L){ struct list_head *pivot = L->next; int listsize = list_length(L); int rnum = rand(); for (int i = 0; i < (rnum % listsize); i++) pivot = pivot->next; return pivot; } ``` ### 測驗2 ::: success 延伸問題: 1. 解釋上述程式碼的運作原理,提出改進方案並予以實作。 2. 將改進過的 Timsort 實作整合進 sysprog21/lab0-c,作為另一個有效的排序命令,應設計效能評比的測試程式,確保測試案例符合 Timsort 設想的資料分布樣式。 ::: :::spoiler 遮蔽的部分 首先先針對遮蔽的部分進行推敲。 * AAAA ```c static struct list_head *merge(void *priv, list_cmp_func_t cmp, struct list_head *a, struct list_head *b) { struct list_head *head; struct list_head **tail = AAAA; ``` 這邊宣告了 head 指標和 tail 為指標的指標,並考量到下面的程式碼是讓 tail 在該串列中進行走訪,故 AAAA = &head。 * BBBB && CCCC ```c for (;;) { /* if equal, take 'a' -- important for sort stability */ if (cmp(priv, a, b) <= 0) { *tail = a; tail = BBBB; a = a->next; if (!a) { *tail = b; break; } } else { *tail = b; tail = CCCC; b = b->next; if (!b) { *tail = a; break; } } } ``` 這邊的程式碼是要讓 tail 根據比較的結果指向 a 或 b 指標,並往下個指標移動,故 BBBB = CCCC = &(*tail)->next * DDDD && EEEE ```c static void build_prev_link(struct list_head *head, struct list_head *tail, struct list_head *list) { tail->next = list; do { list->prev = tail; tail = list; list = list->next; } while (list); /* The final links to make a circular doubly-linked list */ DDDD = head; EEEE = tail; } ``` build_prev_link 函式是要讓指標根據當前的排列順序指向正確的 link,故 DDDD = tail->next,EEEE = head->prev。 * FFFF ```c /* The final merge; rebuild prev links */ struct list_head *stk0 = tp, *stk1 = stk0->prev; while (stk1 && stk1->prev) stk0 = stk0->prev, stk1 = stk1->prev; if (stk_size <= FFFF) { build_prev_link(head, head, stk0); return; } merge_final(priv, cmp, head, stk1, stk0); ``` 這邊是要判斷 stk_size 的大小,如果只有一個以下的 link 就不需要進行 build_prev_link,故 FFFF = 1。 ::: 因為篇幅的緣故所以又開了一個筆記 [Timsort 筆記](https://hackmd.io/TCRWYzfSSjOOo3-ezNEeKA) 紀錄學習過程。 #### Timsort 程式運作原理 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 0"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1"] B[label = "2"] C[label = "8"] D[label = "4"] E[label = "3"] F[label = "5"] G[label = "7"] H[label = "10"] NULL[label = "NULL", shape=plaintext] # 指標 list_tim->A tp_tim->NULL # list_head 的 next head_tim->list_head->A->B->C->D->E->F->G->H->NULL # list_head 的 prev H->G->F->E->D->C->B->A->list_head->H } ``` 進行這段的程式碼。 ```c do { /* Find next run */ struct pair result = find_run(priv, list, cmp); result.head->prev = tp; tp = result.head; list = result.next; stk_size++; tp = merge_collapse(priv, cmp, tp); } while (list); ``` 執行第一次的 while loop ,尋找下一個 run ,把該 run 的長度儲存在 run_head 的 next 的 prev 中,並且讓 run_head 的 prev 指向 NULL 。 除此之外,還宣告了 pair 的結構體儲存當前 run 的 head 和待尋找的節點位置。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 1"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; len_1[label = "len_1 = 3", shape=plaintext] head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1", style=filled, fillcolor = aquamarine1] B[label = "2", style=filled, fillcolor = aquamarine1] C[label = "8", style=filled, fillcolor = aquamarine1] D[label = "4", style=filled, fillcolor = gold] E[label = "3"] F[label = "5"] G[label = "7"] H[label = "10"] // NULL[label = "NULL", shape=plaintext] NULL_1[label = "NULL", shape=plaintext] NULL_2[label = "NULL", shape=plaintext] NULL_3[label = "NULL", shape=plaintext] NULL_4[label = "NULL", shape=plaintext] # 指標 list_tim->D tp_tim->A # list_head 的 next head_tim->list_head->A->B->C->NULL_1 D->E->F->G->H->NULL_2 # list_head 的 prev H->G->F->E->D->C->B->len_1 A->NULL_3 # result result:h->A result:n->D # 區域變數 // list[shape=plaintext, fontcolor =blue] // next[shape=plaintext, fontcolor =blue] // head[shape=plaintext, fontcolor =blue] // prev[shape=plaintext, fontcolor =blue] // list->C // head->A // next->D // prev->NULL_4 } ``` 進行第二次的 while loop ,尋找下一個 run ,並且因為 stack 的長度滿足 $B>C$ ,所以不合併。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 2"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; len_1[label = "len_1 = 3", shape=plaintext] len_2[label = "len_2 = 2", shape=plaintext] head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1", style=filled, fillcolor = coral1] B[label = "2", style=filled, fillcolor = coral1] C[label = "8", style=filled, fillcolor = coral1] D[label = "4", style=filled, fillcolor = aquamarine1] E[label = "3", style=filled, fillcolor = aquamarine1] F[label = "5", style=filled, fillcolor = gold] G[label = "7"] H[label = "10"] // NULL[label = "NULL", shape=plaintext] NULL_1[label = "NULL", shape=plaintext] NULL_2[label = "NULL", shape=plaintext] NULL_3[label = "NULL", shape=plaintext] NULL_4[label = "NULL", shape=plaintext] # 指標 list_tim->F tp_tim->E # list_head 的 next head_tim->list_head->A->B->C->NULL_1 D->NULL_4 E->D F->G->H->NULL_2 # list_head 的 prev H->G->F->E->A D->len_2 C->B->len_1 A->NULL_3 # result result:h->E result:n->F # 區域變數 // list[shape=plaintext, fontcolor =blue] // next[shape=plaintext, fontcolor =blue] // head[shape=plaintext, fontcolor =blue] // prev[shape=plaintext, fontcolor =blue] // list->E // head->E // next->F // prev->D } ``` 進行第三次的 while loop,尋找下一個 run 。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 3"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; len_1[label = "len_1 = 3", shape=plaintext] len_2[label = "len_2 = 2", shape=plaintext] len_3[label = "len_3 = 3", shape=plaintext] head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1", style=filled, fillcolor = coral1] B[label = "2", style=filled, fillcolor = coral1] C[label = "8", style=filled, fillcolor = coral1] D[label = "4", style=filled, fillcolor = coral1] E[label = "3", style=filled, fillcolor = coral1] F[label = "5", style=filled, fillcolor = aquamarine1] G[label = "7", style=filled, fillcolor = aquamarine1] H[label = "10", style=filled, fillcolor = aquamarine1] // NULL[label = "NULL", shape=plaintext] NULL_1[label = "NULL", shape=plaintext] NULL_2[label = "NULL", shape=plaintext] NULL_3[label = "NULL", shape=plaintext] NULL_4[label = "NULL", shape=plaintext] NULL_5[label = "NULL", shape=plaintext] # 指標 list_tim->NULL_2 tp_tim->F # list_head 的 next head_tim->list_head->A->B->C->NULL_1 D->NULL_4 E->D F->G->H->NULL_2 # list_head 的 prev H->G->len_3 F->E E->A D->len_2 C->B->len_1 A->NULL_3 # result result:h->F result:n->NULL_5 # 區域變數 // list[shape=plaintext, fontcolor =blue] // next[shape=plaintext, fontcolor =blue] // head[shape=plaintext, fontcolor =blue] // prev[shape=plaintext, fontcolor =blue] // list->H // head->F // next->NULL_2 // prev->NULL_4 } ``` 進行第三次的 while loop 。 因為堆疊的長度不滿足 $A>B+C$ 所以將 stack[1] 和 stack[2] 合併,完成 `merge_collapse` ,跳出 while 。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 2"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; len_1[label = "len_1 = 3", shape=plaintext] len_2[label = "len_2 = 2", shape=plaintext] len_3[label = "len_3 = 5", shape=plaintext] head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1", style=filled, fillcolor = coral1] B[label = "2", style=filled, fillcolor = coral1] C[label = "8", style=filled, fillcolor = coral1] D[label = "4", style=filled, fillcolor = chartreuse1] E[label = "3", style=filled, fillcolor = chartreuse1] F[label = "5", style=filled, fillcolor = chartreuse1] G[label = "7", style=filled, fillcolor = chartreuse1] H[label = "10", style=filled, fillcolor = chartreuse1] // NULL[label = "NULL", shape=plaintext] NULL_1[label = "NULL", shape=plaintext] NULL_2[label = "NULL", shape=plaintext] NULL_3[label = "NULL", shape=plaintext] NULL_5[label = "NULL", shape=plaintext] # 指標 list_tim->NULL_2 tp_tim->E # list_head 的 next head_tim->list_head->A->B->C->NULL_1 E->D D->F->G->H->NULL_2 # list_head 的 prev H->G->len_3 F->E E->A D->len_2 C->B->len_1 A->NULL_3 # result result:h->F result:n->NULL_5 # 區域變數 // at[shape=plaintext, fontcolor =blue] // prev[shape=plaintext, fontcolor =blue] // list[shape=plaintext, fontcolor =blue] // at->F // prev->A // list->E } ``` 因為已經搜尋完所有的節點,且堆疊只剩下 stack[0] 和 stack[1] ,故將它們合併。 執行 `merge_force_collapse` 。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 1"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; len_1[label = "len_1 = 8", shape=plaintext] len_2[label = "len_2 = 2", shape=plaintext] len_3[label = "len_3 = 5", shape=plaintext] head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1", style=filled, fillcolor = chartreuse1] B[label = "2", style=filled, fillcolor = chartreuse1] C[label = "8", style=filled, fillcolor = chartreuse1] D[label = "4", style=filled, fillcolor = chartreuse1] E[label = "3", style=filled, fillcolor = chartreuse1] F[label = "5", style=filled, fillcolor = chartreuse1] G[label = "7", style=filled, fillcolor = chartreuse1] H[label = "10", style=filled, fillcolor = chartreuse1] // NULL[label = "NULL", shape=plaintext] NULL_2[label = "NULL", shape=plaintext] NULL_3[label = "NULL", shape=plaintext] NULL_5[label = "NULL", shape=plaintext] # 指標 list_tim->NULL_2 tp_tim->A # list_head 的 next head_tim->list_head->A->B->C->E->D->F->G->H->NULL_2 # list_head 的 prev H->G->len_3 F->E E->A D->len_2 C->B->len_1 A->NULL_3 # result result:h->F result:n->NULL_5 // # 區域變數 // at[shape=plaintext, fontcolor =blue] // prev[shape=plaintext, fontcolor =blue] // list[shape=plaintext, fontcolor =blue] // at->E // prev->NULL_3 // list->A } ``` 完成合併,將 prev 指標接上,完成 timsort。 ```graphviz digraph _graph_name_ { # 基本宣告 label = "Minrun = 3, stk_size = 1"; rankdir=LR; # 順序由左至右(上下是"TD") node [shape=record, fontname=Helvetica, fontsize=10] # node list_head[label="{<p> prev | <n> next}"]; result[label="{<h> head | <n> next}"]; head_tim[shape=plaintext,fontcolor = red] list_tim[shape=plaintext,fontcolor = red] tp_tim[shape=plaintext,fontcolor = red] A[label = "1"] B[label = "2"] C[label = "8"] D[label = "4"] E[label = "3"] F[label = "5"] G[label = "7"] H[label = "10"] NULL[label = "NULL", shape=plaintext] NULL_1[label = "NULL", shape=plaintext] # 指標 list_tim->NULL tp_tim->A # list_head 的 next head_tim->list_head->A->B->C->E->D->F->G->H->list_head # list_head 的 prev H->G->F->D->E->C->B->A->list_head->H # result result:h->F result:n->NULL_1 } ``` 在上述的示意圖中可以發現在此程式碼中並沒有規定 Minrun 的大小,run 的大小都是根據升冪 / 降冪的數量決定,導致 run 的數量不固定,無法確保分組接近 2 的冪。 另外,合併的方式也只有使用到合併排序,因此如果輸入資料大量的連續升冪 / 降冪將會導致效能低落,故可以實作 galloping mode。 :::info TODO: 1. 加入 minrun 的判斷,並實作 insertion sort 保證達成 minrun。 2. 實作 exponentioal search 並在 Timsort 中加入 galloping mode。 3. 將改進過的 Timsort 實作整合進 sysprog21/lab0-c,作為另一個有效的排序命令,應設計效能評比的測試程式,確保測試案例符合 Timsort 設想的資料分布樣式。 ::: #### 實驗結果 1. 加入 minrun 的判斷,並實作 insertion sort 保證達成 minrun。 首先,先計算該鍊結串列的 minrun ,其中 minrun 的大小不得超過 64 ,否則會使 insertion sort 的效能低落。 ```c static size_t find_minrun (struct list_head *head) { size_t len = list_length(head); // find first 6 bit & add up remain bits size_t minrun = 0; while(len >> 6) { minrun += (len & 1); len >>= 1; } minrun += len; return minrun; } ``` 在 `find_run` 中加入 insertion sort ,使 run 的大小達到指定的 minrun 。 ```diff - if (cmp(priv, list, next) > 0) { - /* decending run, also reverse the list */ - struct list_head *prev = NULL; - do { - len++; - list->next = prev; - prev = list; - list = next; - next = list->next; - head = list; - } while (next && cmp(priv, list, next) > 0); - list->next = prev; - else { - do { - len++; - list = next; - next = list->next; - } while (next && cmp(priv, list, next) <= 0); - list->next = NULL; + while (len < minrun && next) { + if (cmp(priv, list, next) > 0) { + /* decending run, also reverse the list */ + do { + if (cmp(priv, *ptr, next) > 0) { + len++; + list->next = next->next; + next->next = *ptr; + *ptr = next; + next = list->next; + ptr = &head; + } + else + ptr = &(*ptr)->next; + } while (next && cmp(priv, list, next) > 0 && len < minrun ); + } else { + do { + len++; + list = next; + next = list->next; + } while (next && cmp(priv, list, next) <= 0 && len < minrun ); + } } + + list->next = NULL; head->prev = NULL; + runs++; head->next->prev = (struct list_head *) len; result.head = head, result.next = next; return result; ``` 使用 `perf stat --repeat 3 -e cache-misses,cache- references,instructions,cycles ./main` 針對隨機資料進行測試。 未使用 minrun : ``` Sample size = 10000 runs = 4132 Performance counter stats for './main' (3 runs): 41,237 cache-misses # 20.13% of all cache refs ( +- 8.36% ) 204,818 cache-references ( +- 8.36% ) 10,369,779 instructions # 1.01 insn per cycle ( +- 0.34% ) 10,267,786 cycles ( +- 2.64% ) 0.01182 +- 0.00119 seconds time elapsed ( +- 10.10% ) ``` ``` Sample size = 100000 runs = 41293 Performance counter stats for './main' (3 runs): 444,764 cache-misses # 11.31% of all cache refs ( +- 12.41% ) 3,931,395 cache-references ( +- 3.49% ) 107,925,882 instructions # 0.85 insn per cycle ( +- 0.15% ) 126,984,584 cycles ( +- 1.90% ) 0.0432 +- 0.0103 seconds time elapsed ( +- 23.85% ) ``` ``` Sample size = 1000000 runs = 412970 Performance counter stats for './main' (3 runs): 29,468,058 cache-misses # 36.47% of all cache refs ( +- 6.93% ) 80,794,375 cache-references ( +- 4.52% ) 1,235,257,451 instructions # 0.48 insn per cycle ( +- 0.34% ) 2,549,099,081 cycles ( +- 3.84% ) 0.6463 +- 0.0246 seconds time elapsed ( +- 3.80% ) ``` 使用 minrun : ``` Sample size = 10000 runs = 250 Performance counter stats for './main' (3 runs): 23,796 cache-misses # 12.57% of all cache refs ( +- 38.20% ) 189,267 cache-references ( +- 5.48% ) 14,209,541 instructions # 1.41 insn per cycle ( +- 0.18% ) 10,082,736 cycles ( +- 0.62% ) 0.004440 +- 0.000234 seconds time elapsed ( +- 5.26% ) ``` ``` Sample size = 100000 runs = 1924 Performance counter stats for './main' (3 runs): 418,569 cache-misses # 10.57% of all cache refs ( +- 10.14% ) 3,959,824 cache-references ( +- 0.76% ) 166,587,040 instructions # 1.29 insn per cycle ( +- 0.04% ) 128,848,686 cycles ( +- 1.20% ) 0.033213 +- 0.000508 seconds time elapsed ( +- 1.53% ) ``` ``` Sample size = 1000000 runs = 15873 Performance counter stats for './main' (3 runs): 26,499,448 cache-misses # 34.56% of all cache refs ( +- 1.07% ) 76,686,563 cache-references ( +- 0.06% ) 1,988,304,575 instructions # 0.83 insn per cycle ( +- 0.00% ) 2,400,667,775 cycles ( +- 0.53% ) 0.60511 +- 0.00330 seconds time elapsed ( +- 0.54% ) ``` 可以觀察到加入 minrun 後, run 的數量都會維持在略小於 2 的冪,這樣一來就會使合併的效能提昇。 2. 實作 exponentioal search 並在 Timsort 中加入 galloping mode。 因為 exponentioal search 的效率取決於能夠隨機訪問元素,但對於鍊結串列來說,因為需要走訪所有的節點,導致隨機訪問的效率較低,因此判斷即使加入 exponentioal search 反而可能會讓整體 timsort 的效能降低。 3. 將改進過的 Timsort 實作整合進 sysprog21/lab0-c,作為另一個有效的排序命令,應設計效能評比的測試程式,確保測試案例符合 Timsort 設想的資料分布樣式。 參考 [csotaku0926](https://hackmd.io/@csotaku0926/linux2024-homework1#%E6%95%88%E8%83%BD%E5%88%86%E6%9E%90%E5%AF%A6%E9%A9%97) 同學的方法進行效能測試的實驗。 詳細請看 [commit40fb4](https://github.com/yourui1017/lab0-c/commit/7d2c3debde207bd6b5c22938bb483bda8a4f2295) 。 ## 第二週測驗題 ### 測驗1 ::: success 延伸問題: 1. 解釋上述程式碼的運作,撰寫完整的測試程式,指出其中可改進之處並實作 2. 在 Linux 核心的 cgroups 相關原始程式碼,找出 pre-order walk 程式碼並探討 ::: :::spoiler 遮蔽的部分 首先先針對遮蔽的部分進行推敲。 * AAAA ```diff - n->next = AAAA; + n->next = h->first; ``` 讓新增的 node 指向原本 h->first 指向的位置。 * BBBB ```diff - hlist_for_each (p, BBBB) { + hlist_for_each (p, &heads[hash]) { ``` 針對對應的 heads[hash] 去找內部的 inorder tree 。 * CCCC ```diff - struct order_node *on = CCCC(p, struct order_node, node); + struct order_node *on = container_of(p, struct order_node, node); ``` 使用 container_of 找到包含此 node 的 order_node 。 * DDDD ```diff - hlist_add_head(&on->node, DDDD); + hlist_add_head(&on->node, &heads[hash]); ``` 針對對應的 heads[hash] 加入 node 。 ::: **Construct Binary Tree from Preorder and Inorder Traversal 原理解釋** 本測驗的目的是藉由輸入樹的前序和中序的序列還原完整的樹。 前序遍歷:順序是根節點、左子節點、右子節點。 中序遍歷:順序是左子節點、根節點、右子節點。 後序遍歷:順序是左子節點、右子節點、根節點。 因此可以先透過前序序列找到根節點,並且找到根節點對應到中序序列的 index 。如此一來,就可以藉由中序序列找到根節點、左子樹和右子樹。 並且使用 dfs(Depth-first Search) ,依據順序找到根節點、左子樹和右子樹,在此程式碼中是使用遞迴實作 dfs 。 **示意圖** ```graphviz digraph order { rankdir=TB; inorder [label="Inorder:";shape="none"] preorder[label="Preorder:";shape="none"] {rank=same inorder A_in B_in C_in D_in E_in} {rank=same preorder A_pre B_pre C_pre D_pre E_pre} A_in [label="9"] B_in [label="3"] C_in [label="15"] D_in [label="20"] E_in [label="7"] inorder A_pre [label="3"] B_pre [label="9"] C_pre [label="20"] D_pre [label="15"] E_pre [label="7"] preorder->inorder[style=invis] preorder->A_pre [style=invis] A_pre->B_pre->C_pre->D_pre->E_pre inorder->A_in [style=invis] A_in->B_in->C_in->D_in->E_in } ``` 因此可以找到根節點、左子樹和右子樹。 根節點:3 左子樹:9 右子樹:15、20、7 ```graphviz digraph order { rankdir=TB; inorder [label="Inorder:";shape="none"] preorder[label="Preorder:";shape="none"] // root [label="root";shape="none"] // left [label="left";shape="none"] // right [label="right";shape="none"] {rankdir=TB preorder inorder} {rank=same inorder A_in B_in C_in D_in E_in} {rank=same preorder A_pre B_pre C_pre D_pre E_pre} A_in [label="9"] B_in [label="3"] C_in [label="15"] D_in [label="20"] E_in [label="7"] A_pre [label="3"] B_pre [label="9"] C_pre [label="20"] D_pre [label="15"] E_pre [label="7"] preorder->inorder[style=invis] preorder->A_pre [style=invis] A_pre->B_pre->C_pre->D_pre->E_pre inorder->A_in [style=invis] A_in->B_in->C_in->D_in->E_in A_pre->B_in [color=red] } ``` 接下來就可以針對左右子樹找根節點、左子樹和右子樹。 最後就可以建立原始的樹如下圖: ```graphviz digraph order { rankdir=TB; {rank=TB A B} {rank=same B C} {rank=same D E} A [label="3"] B [label="9"] C [label="20"] D [label="15"] E [label="7"] A->B A->C C->D C->E } ``` **雜湊表示意圖** 另外,因為本測驗是基於鍊結串列做實作,因此為保證查閱時間為 O(1) ,使用 hash table 儲存中序序列,以便在 dfs 函式中可以使用前序序列查閱。 ![image](https://hackmd.io/_uploads/SkOPkX610.png) 上圖取自 [Linux 核心的 hash table 實作](https://hackmd.io/@sysprog/linux-hashtable) :::info TODO 1. 在 Linux 核心的 cgroups 相關原始程式碼,找出 pre-order walk 程式碼並探討。 ::: ### 測驗2 :::success 延伸問題: 1. 解釋上述程式碼的運作,撰寫完整的測試程式,指出其中可改進之處並實作 2. 在 Linux 核心找出 LRU 相關程式碼並探討 ::: :::spoiler 遮蔽的部份 * EEEE ```diff - EEEE = pprev; + next->pprev = pprev; ``` * FFFF ```diff - LRUNode *cache = list_entry(pos, LRUNode, FFFF); + LRUNode *cache = list_entry(pos, LRUNode, link); ``` * GGGG ```diff - list_del(GGGG); + list_del(pos); ``` * HHHH ```diff - LRUNode *cache = list_entry(pos, LRUNode, HHHH); + LRUNode *cache = list_entry(pos, LRUNode, node); ``` * IIII ```diff - list_move(IIII, &obj->dhead); + list_move(&cache->link, &obj->dhead); ``` * JJJJ ```diff - LRUNode *c = list_entry(pos, LRUNode, JJJJ); + LRUNode *c = list_entry(&c->dhead, LRUNode, node); ``` * KKKK ```diff - list_move(KKKK, &obj->dhead); + list_move(&c->link, &obj->dhead); ``` ::: **LRU 原理解釋** Least Recently Used (LRU) 的核心觀念就是將比較常用到的資料放置在記憶體空間的前面,減少對磁盤的存取,讓讀取的速度可以更快。 LRU 使用的是先進先出的概念,所以如果佇列滿了,則會將 tail 的 Node 移除,並在 Head 加上新使用的 Node 。如此一來,佇列就會儲存最近被使用到的資料。 **示意圖** ```graphviz digraph { rankdir=LR; subgraph cluster_1{ node [shape=record]; label = "LRUCache" other[label = "<cap> capacity|<count> count"] list_head [label = "<h> dhead|{<prev> prev|<next> next}", group = list]; hlist_head [label = "<h> hhead[]|<ht0>|<ht1>|<ht2>|<ht3>|<ht4> "]; } subgraph cluster_2{ node [shape=record]; label = "LRUNode 2" other2[label = "<key> key|<val> value"] list_head2 [label = "<h> link|{<prev> prev|<next> next}", group = list]; hlist_node2 [label = "<h> node|{<pprev> pprev|<next> next}", group = list]; } subgraph cluster_3{ node [shape=record]; label = "LRUNode 1" other1[label = "<key> key|<val> value"] list_head1 [label = "<h> link|{<prev> prev|<next> next}", group = list]; hlist_node1 [label = "<h> node|{<pprev> pprev|<next> next}", group = list]; } list_head:next->list_head2:h list_head2:prev->list_head:h list_head2:next->list_head1:h list_head1:prev->list_head2:h hlist_head:ht0->hlist_node2:h hlist_node2:pprev->hlist_head:ht0 hlist_node2:next->hlist_node1:h hlist_node1:pprev->hlist_node2:next } ``` 其中 `count` 代表目前 Cache使用的大小、 `capacity` 代表 hhead 的大小。 除了使用雙向鍊結串列連接以外,還有使用到 hash table 儲存鍊結,這樣一來就可以加快存取鍊結的速度。 **程式碼解釋** ```c int lRUCacheGet(LRUCache *obj, int key) { int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *cache = list_entry(pos, LRUNode, node); if (cache->key == key) { list_move(&cache->link, &obj->dhead); return cache->value; } } return -1; } ``` lRUCacheGet 會將 key 模除得到 hash值,並且根據 hash 值去找對應的 hash table 中的 CacheNode ,如果找到話就回傳該 key 對應的 value。 藉由存取 hash table 加快鍊結串列的存取速度。 ```c void lRUCachePut(LRUCache *obj, int key, int value) { LRUNode *cache = NULL; int hash = key % obj->capacity; struct hlist_node *pos; hlist_for_each (pos, &obj->hhead[hash]) { LRUNode *c = list_entry(pos, LRUNode, node); if (c->key == key) { list_move(&c->link, &obj->dhead); cache = c; } } if (!cache) { if (obj->count == obj->capacity) { cache = list_last_entry(&obj->dhead, LRUNode, link); list_move(&cache->link, &obj->dhead); hlist_del(&cache->node); hlist_add_head(&cache->node, &obj->hhead[hash]); } else { cache = malloc(sizeof(LRUNode)); hlist_add_head(&cache->node, &obj->hhead[hash]); list_add(&cache->link, &obj->dhead); obj->count++; } cache->key = key; } cache->value = value; } ``` lRUCachePut 則是會先判斷 Cache 的容量是否已經滿了。 1. count == capacity 移除 hlist 的最後一個 node ,並且將剛剛存取的 node 移至 hlist 的 head 後。 2. count != capacity 直接將剛剛存取的 node ,加入在hlist 的 head 後。 此處的 hash function 都是使用模除的方式,但針對不知道 key 的範圍以及分佈情形,適用 Multiplication Method 。 :::info TODO: 1. 將 hash function 改成 Multiplication Method 。 2. 在 Linux 核心找出 LRU 相關程式碼並探討。 ::: ### 測驗3 ::: success 延伸問題: 1. 解釋上述程式碼的運作原理 2. 在 Linux 核心原始程式碼找出 find_nth_bit 的應用案例,並予以解說和分析,需要涵蓋 CPU affinity 相關的原始程式碼。 ::: :::spoiler 遮蔽的部份 * AAAA ```diff - if ((word & AAAA) == 0) { + if ((word & 0xfffffff) == 0) { ``` * BBBB ```diff - *p &= BBBB; + *p &= ~mask; ``` * CCCC ```diff - if (sz CCCC BITS_PER_LONG) + if (sz % BITS_PER_LONG) ``` ::: **原理解釋** 參考自 [yuyuan0625](https://hackmd.io/@yuyuan/linux2024-homework2) ```c static inline unsigned long __ffs(unsigned long word) { int num = 0; #if BITS_PER_LONG == 64 if ((word & AAAA) == 0) { num += 32; word >>= 32; } #endif if ((word & 0xffff) == 0) { num += 16; word >>= 16; } if ((word & 0xff) == 0) { num += 8; word >>= 8; } if ((word & 0xf) == 0) { num += 4; word >>= 4; } if ((word & 0x3) == 0) { num += 2; word >>= 2; } if ((word & 0x1) == 0) num += 1; return num; } ``` `__ffs()` 是用來查詢一個 word 中最低位 1 的所在位置, 若 (word & 0xffffffff) == 0 即可知道較低的32位元內沒有任一位元為1, 因此將 word 右移 32 位元再進行檢查。 ```c static inline void __clear_bit(unsigned long nr, volatile unsigned long *addr) { unsigned long mask = BIT_MASK(nr); unsigned long *p = ((unsigned long *) addr) + BIT_WORD(nr); *p &= BBBB; } ``` 此段程式碼是透過 BIT_MASK(nr) 產生出只有第 nr 位為 1 ,其他位皆為 0 的遮罩,並利用此遮罩將該 addr 的第 nr 位清除。 ```c static inline unsigned long fns(unsigned long word, unsigned int n) { while (word) { unsigned int bit = __ffs(word); if (n-- == 0) return bit; __clear_bit(bit, &word); } return BITS_PER_LONG; } ``` 此段程式碼目的為找到第 n 個被設為 1 的位置,它會不斷地呼叫 __ffs 找到最低被設為 1 的位元,若還不是第 n 個就會使用 __clear_bit 將目前的位元清除,再繼續找下一個被設為 1 的位元。 ```c static inline unsigned long find_nth_bit(const unsigned long *addr, unsigned long size, unsigned long n) { if (n >= size) return size; if (small_const_nbits(size)) { unsigned long val = *addr & GENMASK(size - 1, 0); return val ? fns(val, n) : size; } return FIND_NTH_BIT(addr[idx], size, n); } ``` 運作原理: 先利用 `small_const_nbits` 比較要找的位數是否超過限制的 `size` , 並利用 `GENMASK(h, l)` 將 h 到 l 間的位元變為 1和 `addr` 做 & 運算 ,若 `val` 有值代表 `addr` h 到 l 間有位元被設為1,因此再利用 fns 找到為 第 n 個被設為 1 的位元並回傳其位置。若不符合 `small_const_nbits` 就利用 `FIND_NTH_BIT` 來處理。 ```c #define FIND_NTH_BIT(FETCH, size, num) \ ({ \ unsigned long sz = (size), nr = (num), idx, w, tmp; \ \ for (idx = 0; (idx + 1) * BITS_PER_LONG <= sz; idx++) { \ if (idx * BITS_PER_LONG + nr >= sz) \ goto out; \ \ tmp = (FETCH); \ w = hweight_long(tmp); \ if (w > nr) \ goto found; \ \ nr -= w; \ } \ \ if (sz CCCC BITS_PER_LONG) \ tmp = (FETCH) & BITMAP_LAST_WORD_MASK(sz); \ found: \ sz = min(idx * BITS_PER_LONG + fns(tmp, nr), sz); \ out: \ sz; \ }) ``` `FIND_NTH_BIT` 能夠搜尋超出單個 `BITS_PER_LONG `長度的範圍。如果要查找的位元不在目前處理的字節中,能透過迴圈繼續在下一個 `BITS_PER_LONG` 長度的區塊中搜尋,直到找到該位為止。 因為 `size` 可能無法被 `BITS_PER_LONG` 整除,因此搜尋到最後一輪時應避免找到超出 `size` 範圍的字元,因此使用取餘數的方式找到最後一個字節所需要搜尋的字元數量。 :::info TODO: 1. 在 Linux 核心原始程式碼找出 find_nth_bit 的應用案例,並予以解說和分析,需要涵蓋 CPU affinity 相關的原始程式碼。 ::: ## 參考資料 [2024q1 第 1 週測驗題](https://hackmd.io/@sysprog/linux2024-quiz1) [演算法-快速排序法Quick Sort](https://ithelp.ithome.com.tw/articles/10278644) [timsort.txt](https://svn.python.org/projects/python/trunk/Objects/listsort.txt) [V8 內的排序演算法 — Timsort](https://yuanchieh.page/posts/2019/2019-08-09-v8-%E5%85%A7%E7%9A%84%E6%8E%92%E5%BA%8F%E6%BC%94%E7%AE%97%E6%B3%95-timsort/) [排序算法 (六) - Timsort](https://www.sakuratears.top/blog/%E6%8E%92%E5%BA%8F%E7%AE%97%E6%B3%95%EF%BC%88%E5%85%AD%EF%BC%89-TimSort.html) [世界上最快的排序算法——Timsort](https://www.cnblogs.com/sunshuyi/p/12680918.html) [指數搜尋 Exponential Search](https://rust-algo.club/searching/exponential_search/)

Import from clipboard

Paste your markdown or webpage here...

Advanced permission required

Your current role can only read. Ask the system administrator to acquire write and comment permission.

This team is disabled

Sorry, this team is disabled. You can't edit this note.

This note is locked

Sorry, only owner can edit this note.

Reach the limit

Sorry, you've reached the max length this note can be.
Please reduce the content or divide it to more notes, thank you!

Import from Gist

Import from Snippet

or

Export to Snippet

Are you sure?

Do you really want to delete this note?
All users will lose their connection.

Create a note from template

Create a note from template

Oops...
This template has been removed or transferred.
Upgrade
All
  • All
  • Team
No template.

Create a template

Upgrade

Delete template

Do you really want to delete this template?
Turn this template into a regular note and keep its content, versions, and comments.

This page need refresh

You have an incompatible client version.
Refresh to update.
New version available!
See releases notes here
Refresh to enjoy new features.
Your user state has changed.
Refresh to load new user state.

Sign in

Forgot password

or

By clicking below, you agree to our terms of service.

Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
Wallet ( )
Connect another wallet

New to HackMD? Sign up

Help

  • English
  • 中文
  • Français
  • Deutsch
  • 日本語
  • Español
  • Català
  • Ελληνικά
  • Português
  • italiano
  • Türkçe
  • Русский
  • Nederlands
  • hrvatski jezik
  • język polski
  • Українська
  • हिन्दी
  • svenska
  • Esperanto
  • dansk

Documents

Help & Tutorial

How to use Book mode

Slide Example

API Docs

Edit in VSCode

Install browser extension

Contacts

Feedback

Discord

Send us email

Resources

Releases

Pricing

Blog

Policy

Terms

Privacy

Cheatsheet

Syntax Example Reference
# Header Header 基本排版
- Unordered List
  • Unordered List
1. Ordered List
  1. Ordered List
- [ ] Todo List
  • Todo List
> Blockquote
Blockquote
**Bold font** Bold font
*Italics font* Italics font
~~Strikethrough~~ Strikethrough
19^th^ 19th
H~2~O H2O
++Inserted text++ Inserted text
==Marked text== Marked text
[link text](https:// "title") Link
![image alt](https:// "title") Image
`Code` Code 在筆記中貼入程式碼
```javascript
var i = 0;
```
var i = 0;
:smile: :smile: Emoji list
{%youtube youtube_id %} Externals
$L^aT_eX$ LaTeX
:::info
This is a alert area.
:::

This is a alert area.

Versions and GitHub Sync
Get Full History Access

  • Edit version name
  • Delete

revision author avatar     named on  

More Less

Note content is identical to the latest version.
Compare
    Choose a version
    No search result
    Version not found
Sign in to link this note to GitHub
Learn more
This note is not linked with GitHub
 

Feedback

Submission failed, please try again

Thanks for your support.

On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

Please give us some advice and help us improve HackMD.

 

Thanks for your feedback

Remove version name

Do you want to remove this version name and description?

Transfer ownership

Transfer to
    Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

      Link with GitHub

      Please authorize HackMD on GitHub
      • Please sign in to GitHub and install the HackMD app on your GitHub repo.
      • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
      Learn more  Sign in to GitHub

      Push the note to GitHub Push to GitHub Pull a file from GitHub

        Authorize again
       

      Choose which file to push to

      Select repo
      Refresh Authorize more repos
      Select branch
      Select file
      Select branch
      Choose version(s) to push
      • Save a new version and push
      • Choose from existing versions
      Include title and tags
      Available push count

      Pull from GitHub

       
      File from GitHub
      File from HackMD

      GitHub Link Settings

      File linked

      Linked by
      File path
      Last synced branch
      Available push count

      Danger Zone

      Unlink
      You will no longer receive notification when GitHub file changes after unlink.

      Syncing

      Push failed

      Push successfully