contributed by < chensheep
>
Andrushika
根據老師撰寫的資訊科技詞彙翻譯:
「實作」會是符合上述 "implement" 語意的譯詞,取「起」以「創作」的意思,而「製造」本質上是固定輸入並通常產生固定輸出的行為。
目前全篇都寫作「實做」,建議修改。
「閱讀 lib/list_sort.c」小節有錯字:
目標是 merge list a 與 list b,因為 merge 中不「維」 prev 指標
此外,在撰寫文件時,應該注意標點符號與斷句位置,方便閱讀。
如 q_merge
小節可以修正如下(使用 Claude 協助修正):
目前實現方式是依序從第 2 個 queue 開始,將所有的 node 都接到第 1 個 queue 的後方,最後再針對第 1 個 queue 做排序。
這個作法很直覺,但因為最後需要針對第 1 個 queue 做排序,目前使用的是 Merge sort,所以其時間複雜度為。
原文的句子:
從卡方分佈表找合適的 P value,
為 24.762842054728754,因為 14.848 < 24.762842054728754 < 32.007,所以 P value 介於 0.9 和 0.1 之間。
此處使用的是右尾檢定,但推導過程不精準,若顯著水準為 alpha = 0.05,則應該直接使用 0.9
, 0.1
去進行比較。
$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
CPU family: 6
Model: 158
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 1
Stepping: 10
CPU(s) scaling MHz: 18%
CPU max MHz: 4400.0000
CPU min MHz: 800.0000
BogoMIPS: 6000.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi
mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfm
on pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 mo
nitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic mov
be popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault
epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjus
t bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsav
ec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp vnmi md_
clear flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 192 KiB (6 instances)
L1i: 192 KiB (6 instances)
L2: 1.5 MiB (6 instances)
L3: 9 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-5
Vulnerabilities:
Gather data sampling: Mitigation; Microcode
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Mds: Mitigation; Clear CPU buffers; SMT disabled
Meltdown: Mitigation; PTI
Mmio stale data: Mitigation; Clear CPU buffers; SMT disabled
Reg file data sampling: Not affected
Retbleed: Mitigation; IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; IBRS; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affecte
d; BHI Not affected
Srbds: Mitigation; Microcode
Tsx async abort: Mitigation; TSX disabled
參考以下實做
struct list_head *q_new()
{
struct list_head *lh = malloc(sizeof(struct list_head));
INIT_LIST_HEAD(lh);
return lh;
}
malloc(3) — Linux manual page 中RETURN VALUE 提到
加入檢查 malloc
函式回傳結果,當 malloc
發生錯誤時 q_new
回傳 NULL
struct list_head *q_new()
{
struct list_head *lh = malloc(sizeof(struct list_head));
if (!lh) {
return NULL;
}
INIT_LIST_HEAD(lh);
return lh;
}
實做方式
void q_free(struct list_head *head)
{
element_t *e = NULL, *is = NULL;
if (!head) return;
list_for_each_entry_safe (e, is, head, list) {
list_del(&e->list);
q_release_element(e);
}
free(head);
}
透過 qtest
程式測試,產生一個空的 queue 後釋放沒有錯誤訊息
$ ./qtest
cmd> new
l = []
cmd> free
l = NULL
cmd>
但在實做完 q_insert_head
,出現以下錯誤訊息
$ ./qtest
cmd> new
l = []
cmd> ih 1
l = [1]
cmd> free
ERROR: Corruption detected in block with address 0x614c90fac170 when attempting to free it
l = NULL
cmd>
首先 harness.c
程式碼找到打印相關訊息的部份
是由於 footer 與 MAGICFOOTER 不一致所導致
if (footer != MAGICFOOTER) {
report_event(MSG_ERROR,
"Corruption detected in block with address %p when "
"attempting to free it",
p);
error_occurred = true;
}
首先想到的作法 test_free
與 alloc
打印出一些訊息,也確實觀察到 footer 被改動
但無法明確的發現在何時被更改,後來想到之前老師提到可以透過 gdb
工具除錯,步驟如下
harness.c
檔案中的函式 alloc
加入訊息,印出 footer 記憶體位置,後續會用 gdb 追蹤這個位址diff --git a/harness.c b/harness.c
index 4e02891..f2720db 100644
--- a/harness.c
+++ b/harness.c
@@ -161,6 +161,9 @@ static void *alloc(alloc_t alloc_type, size_t size)
// cppcheck-suppress nullPointerRedundantCheck
new_block->prev = NULL;
+ printf("new block p %p magic footer %zx %p\n",
+ p, *find_footer(new_block), &(*find_footer(new_block)));
+
if (allocated)
allocated->prev = new_block;
allocated = new_block;
qtest
程式$./qtest
qtest
的 PID 並 attach 到 gdb
$ ps -aux |grep qtest
$ sudo gdb -p <PID>
harness.c
中 aloc
的最後一行(gdb) break harness.c:173
Breakpoint 1 at 0x6068a22dba01: file harness.c, line 173.
(gdb) continue
Continuing.
qtest
運行的 terminal 執行cmd> new
new block p 0x6096d0f6f2c0 magic footer beefdead 0x5dc240b922d0
l = []
cmd> it 1
new block p 0x6096d0f6f710 magic footer beefdead 0x5dc240b92720
new block p 0x6096d0f6f750 magic footer beefdead 0x5dc240b92752
gdb
terminal 設定觀察 footer 記憶位置(gdb) awatch *(0x5dc240b92720)
(gdb) continue
qtest
運行的 terminal 執行cmd> free
gdb
可以觀察到錯誤訊息,可以透過 bt 打印出 stackHardware access (read/write) watchpoint 2: *(0x5dc240b92720)
Old value = -1091576147
New value = 1085874880
list_add (head=0x5dc240b922c0, node=0x5dc240b92718) at /home/chen/workspace/linux2025/lab0-c/list.h:107
107 node->prev = head;
(gdb) bt
#0 list_add (head=0x5dc240b922c0, node=0x5dc240b92718) at /home/chen/workspace/linux2025/lab0-c/list.h:107
#1 q_insert_head (head=0x5dc240b922c0, s=s@entry=0x5dc22380f138 "123") at queue.c:49
#2 0x00005dc223807f8b in queue_insert (pos=pos@entry=POS_HEAD, argc=<optimized out>, argv=<optimized out>) at qtest.c:234
#3 0x00005dc22380815c in do_ih (argc=<optimized out>, argv=<optimized out>) at qtest.c:282
#4 0x00005dc22380976b in interpret_cmda (argc=argc@entry=2, argv=argv@entry=0x5dc240b93250) at console.c:183
#5 0x00005dc223809d32 in interpret_cmd (cmdline=cmdline@entry=0x5dc240b931e0 "ih 1") at console.c:203
#6 0x00005dc22380a828 in run_console (infile_name=infile_name@entry=0x0) at console.c:661
#7 0x00005dc223808b5a in main (argc=1, argv=<optimized out>) at qtest.c:1451
可以觀察到在呼叫 q_insert_head
最後執行 list_add
時就已經修改到 footer
在審視一下目前的實做方式
bool q_insert_head(struct list_head *head, char *s)
{
element_t *e = malloc(sizeof(struct list_head));
int sl = strlen(s);
e->value = malloc((sl + 1) * sizeof(char));
strncpy(e->value, s, sl + 1);
list_add(&e->list, head);
return true;
}
發現在第 3 行透過 malloc 分配記憶體給 e 時,應取 element_t 的結構大小而非 list_head
@@ -36,7 +36,7 @@ void q_free(struct list_head *head)
/* Insert an element at head of queue */
bool q_insert_head(struct list_head *head, char *s)
{
- element_t *e = malloc(sizeof(struct list_head));
+ element_t *e = malloc(sizeof(element_t));
// printf("malloc e %p\n", e);
int sl = strlen(s);
e->value = malloc((sl + 1) * sizeof(char));
閱讀 queue.h 中關於 q_insert_head
註解提到
觀察到 q_insert_head
的 parameter 包含 head 與 s ,並未輸入字串 s 的長度,由此判斷 s 應最後一個字元應該為 '\0'
實做如下
bool q_insert_head(struct list_head *head, char *s)
{
element_t *e = malloc(sizeof(struct list_head));
int sl = strlen(s);
e->value = malloc((sl + 1) * sizeof(char));
strncpy(e->value, s, sl + 1);
list_add(&e->list, head);
return true;
}
此實做分配錯記憶體大小的問題,已更新,詳細的查測過程
@@ -36,7 +36,7 @@ void q_free(struct list_head *head)
/* Insert an element at head of queue */
bool q_insert_head(struct list_head *head, char *s)
{
- element_t *e = malloc(sizeof(struct list_head));
+ element_t *e = malloc(sizeof(element_t));
// printf("malloc e %p\n", e);
int sl = strlen(s);
e->value = malloc((sl + 1) * sizeof(char));
處理針對 trace-11-malloc 與 trace-12-malloc 錯誤訊息
+++ TESTING trace trace-11-malloc:
# Test of malloc failure on 'q_insert_head': 'q_new', and 'q_insert_head'
Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-11-malloc 0/6
+++ TESTING trace trace-12-malloc:
# Test of malloc failure on 'q_insert_tail': 'q_new', 'q_insert_head', and 'q_insert_tail'
Segmentation fault occurred. You dereferenced a NULL or invalid pointer--- trace-12-malloc 0/6
待處理
參考 q_insert_head 中透過 list_add
函式將新的節點加入到 list 的頭,參考 list.h
中相對有個函式 list_add_tail
將節點加入到 list 的尾,調整使用 list_add_tail
實做如下
bool q_insert_tail(struct list_head *head, char *s)
{
element_t *e = malloc(sizeof(struct list_head));
int sl = strlen(s);
e->value = malloc((sl + 1) * sizeof(char));
strncpy(e->value, s, sl + 1);
list_add_tail(&e->list, head);
return true;
}
使用 qtest
測試結果符合預期
$ ./qtest
cmd> new
l = []
cmd> ih 1
l = [1]
cmd> ih 2
l = [2 1]
cmd> it 3
l = [2 1 3]
cmd>
目前實做方式是依序從第 2 個 queue 開始將所有的 node 都接到第一個 queue 的後方,最後在針對第 1 個 queue 做排序。
這個作法很直覺但因為最後需針對第 1 個 queue 做排序,目前排序時做為 Merge sort 所以其時間複查度為
實做方式透過函式 list_splice_tail
移動所有的 nodes,注意到 list_splice_tail
的註解說明中有提到
The @list head is not modified and has to be initialized to be used as a valid list head/node again.
所以需在呼叫 INIT_LIST_HEAD 才可以再當成可用的 head,參考下方。
diff --git a/queue.c b/queue.c
index 5f398d3..d3c76b5 100644
--- a/queue.c
+++ b/queue.c
@@ -478,6 +478,7 @@ int q_merge(struct list_head *head, bool descend)
list_for_each_entry_safe(e, is, head, chain) {
if (e != first) {
list_splice_tail(e->q, first->q);
+ INIT_LIST_HEAD(e->q);
}
}
配合 q_merge 註解中提到
There is no need to free the 'queue_contex_t' and its member 'q' since they will be released externally.
加入 INIT_LIST_HEAD(e->q);
後續做 q_free 才不會發生問題。
已更新 Commit dc074bd
透過 scripys/driver-shuffle.py
輸出 shuffle 結果如下
Expectation: 4166
Observation: {'1234': 24965, '1243': 0, '1324': 0, '1342': 0, '1423': 0, '1432': 0, '2134': 0, '2143': 0, '2314': 0, '2341': 24894, '2413': 0, '2431': 0, '3124': 0, '3142': 0, '3214': 0, '3241': 0, '3412': 24888, '3421': 0, '4123': 25253, '4132': 0, '4213': 0, '4231': 0, '4312': 0, '4321': 0}
chi square sum: 500101.38214114256
從圖表來看 shuffle 的頻率明顯不符合 Uniform distribution
Commit 3773ea2 已修正此問題,執行 ./scripts/driver-shuffle.py
的結果
$ ./scripts/driver-shuffle.py
Expectation: 4166
Observation: {'1234': 4241, '1243': 4165, '1324': 4115, '1342': 4269, '1423': 4210, '1432': 4143, '2134': 4171, '2143': 4191, '2314': 4130, '2341': 4006, '2413': 4196, '2431': 4185, '3124': 4108, '3142': 4229, '3214': 4232, '3241': 4180, '3412': 4091, '3421': 4109, '4123': 4202, '4132': 4256, '4213': 4157, '4231': 4027, '4312': 4225, '4321': 4162}
chi square sum: 24.762842054728754
因為 P value(0.9~0.1)> alpha(0.05),統計檢定的結果不拒絕虛無假說(
也就是樣本資料不拒絕 shuffle 的結果遵循 Uniform distribution,因為沒有足夠證據推翻。
從圖表來看 shuffle 的頻率是大致符合 Uniform distribution 的。
從 include/linux/types.h
中知到 struct list_head 定義如下
...
struct list_head {
struct list_head *next, *prev;
};
...
list_sort_.c 中 merge 的運作參考以下流程,目標是 merge list a 與 list b,因為 merge 中不維 prev 指標,所以下圖省略
after loop 1
after loop 2
after loop 3
因為 b 為 NULL 所以進一步執行
if (!b) {
*tail = a;
break;
}
最後回傳 head。
接著說明 list_sort
函式,部份程式碼如下
...
do {
size_t bits;
struct list_head **tail = &pending;
/* Find the least-significant clear bit in count */
for (bits = count; bits & 1; bits >>= 1)
tail = &(*tail)->prev;
/* Do the indicated merge */
if (likely(bits)) {
struct list_head *a = *tail, *b = a->prev;
a = merge(priv, cmp, b, a);
/* Install the merged result in place of the inputs */
a->prev = b->prev;
*tail = a;
}
/* Move one element from input list to pending */
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;
} while (list);
...
count 變數從 0 開始每回和加 1 。
程式碼第 6 行,會找尋最小為 0 的 bit,可以參考下方表格,用黃色與紅色表示。
程式碼第 10 行,會判斷剩餘的的 bits 是不是 0,非 0 則進行整並,參考下方紅色的 0 即需整並的情形,對應到 merge 欄位。
count (binary) | merge | initial pending | after merge | after insert a new node |
---|---|---|---|---|
0 (00000000) | NO | NULL | [1] | |
1 (00000001) | NO | [1] | [1, 1] | |
2 (00000010) | YES | [1, 1] | [2] | [2, 1] |
3 (00000011) | NO | [2, 1] | [2, 1, 1] | |
4 (00000100) | YES | [2, 1, 1] | [2, 2] | [2, 2, 1] |
5 (00000101) | YES | [2, 2, 1] | [4, 1] | [4, 1, 1] |
6 (00000110) | YES | [4, 1, 1] | [4, 2] | [4, 2, 1] |
7 (00000111) | NO | [4, 2, 1] | [4, 2, 1, 1] | |
8 (00001000) | YES | [4, 2, 1, 1] | [4, 2, 2] | [4, 2, 2, 1] |
9 (00001001) | YES | [4, 2, 2, 1] | [4, 4, 1] | [4, 4, 1, 1] |
10 (00001010) | YES | [4, 4, 1, 1] | [4, 4, 2] | [4, 4 ,2, 1] |
11 (00001011) | YES | [4, 4, 2, 1] | [8, 2, 1] | [8, 2, 1, 1] |
12 (00001100) | YES | [8, 2, 1, 1] | [8, 2, 2] | [8, 2, 2, 1] |
13 (00001101) | YES | [8, 2, 2, 1] | [8, 4, 1] | [8, 4, 1, 1] |
14 (00001110) | YES | [8, 4, 1, 1] | [8, 4, 2] | [8, 4, 2, 1] |
15 (00001111) | NO | [8, 4, 2, 1] | [8, 4, 2, 1, 1] | |
16 (00010000) | YES | [8, 4, 2, 1, 1] | [8, 4, 2, 2] | [8, 4, 2, 2, 1] |
取當 count 為 9 來舉例說明,初始 pending 列表如下
9 的 binary 為 00001001,因為最小的 0 出現前有一個 1 因此下方會執行 1 次
tail = &(*tail)->prev;
接著
struct list_head *a = *tail, *b = a->prev;
然後
a = merge(priv, cmp, b, a);
/* Install the merged result in place of the inputs */
a->prev = b->prev;
*tail = a;
最後這段程式碼會在放如一個節點
/* Move one element from input list to pending */
list->prev = pending;
pending = list;
list = list->next;
pending->next = NULL;
count++;
接著看 merge_final
函式中這段程式碼
...
do {
/*
* If the merge is highly unbalanced (e.g. the input is
* already sorted), this loop may run many iterations.
* Continue callbacks to the client even though no
* element comparison is needed, so the client's cmp()
* routine can invoke cond_resched() periodically.
*/
if (unlikely(!++count))
cmp(priv, b, b);
b->prev = tail;
tail = b;
b = b->next;
} while (b);
...
待處理
關於 likely 與 unlikely 函式
於 linux-6.14-rc7 的檔案 include/linux/compiler.h
可以找到其定義,在 CONFIG_TRACE_BRANCH_PROFILING 未被開啟的狀態下被定義如下
//有 defined CONFIG_TRACE_BRANCH_PROFILING
# define likely(x) (__branch_check__(x, 1, __builtin_constant_p(x)))
# define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
//沒有 defined CONFIG_TRACE_BRANCH_PROFILING
# define likely(x) __builtin_expect(!!(x), 1)
# define unlikely(x) __builtin_expect(!!(x), 0)
關於 CONFIG_TRACE_BRANCH_PROFILING 可以參考 kernel/trace/Kconfig
中說明
在 linux-6.14-rc7 的資料夾底下
make menuconfig
可以在 Tracers -> Branch Profiling -> Trace likely/unlikely profile 找到說明
這個 config 是用來追蹤 likely 與 unlikely 的情形,打開此選項對效能有很大的影響
因此可以推斷平常執行時 likely 與 unlikely 定義是使用
# define likely(x) __builtin_expect(!!(x), 1)
# define unlikely(x) __builtin_expect(!!(x), 0)
解釋 __builtin_expect,參考 6.64 Other Built-in Functions Provided by GCC
Built-in Function: long __builtin_expect (long exp, long c)
You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.
可以使用這個函式來告知編譯器分支預測的資訊。
當第 c 帶入 0 代表這個分支不被預期發生,帶入 0 則代表會被預期發生,編譯器會針對這個對於產的的 assembly code 做調整。
參考下方實驗
#include <stdlib.h>
#include <stdio.h>
#include "time.h"
int main()
{
int x = !time(NULL);
if(__builtin_expect(x, 0)) {
printf("not expected\n");
} else {
printf("expected\n");
}
return 0;
}
.LC0:
.string "not expected"
.LC1:
.string "expected"
.text
.globl main
.type main, @function
main:
.LFB39:
.cfi_startproc
endbr64
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $0, %edi
call time@PLT
testq %rax, %rax
je .L5
leaq .LC1(%rip), %rdi
call puts@PLT
.L3:
movl $0, %eax
addq $8, %rsp
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
.L5:
.cfi_restore_state
leaq .LC0(%rip), %rdi
call puts@PLT
jmp .L3
.cfi_endproc
#include <stdlib.h>
#include <stdio.h>
#include "time.h"
int main()
{
int x = !time(NULL);
if(__builtin_expect(x, 1)) {
printf("expected\n");
} else {
printf("not expected\n");
}
return 0;
}
.LC0:
.string "expected"
.LC1:
.string "not expected"
.text
.globl main
.type main, @function
main:
.LFB39:
.cfi_startproc
endbr64
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $0, %edi
call time@PLT
testq %rax, %rax
jne .L2
leaq .LC0(%rip), %rdi
call puts@PLT
.L3:
movl $0, %eax
addq $8, %rsp
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
.L2:
.cfi_restore_state
leaq .LC1(%rip), %rdi
call puts@PLT
jmp .L3
.cfi_endproc
可以觀察到 expected 的訊息都被安排到 je 或 jne 的下方。
另外文件中提到可以在編譯時加入 -fprofile-arcs
可以追蹤真實運作中每個分支實際被運作的情形,進一步查閱 3.13 Program Instrumentation Options
-fprofile-arcs
Add code so that program flow arcs are instrumented. During execution the program records how many times each branch and call is executed and how many times it is taken or returns. On targets that support constructors with priority support, profiling properly handles constructors, destructors and C++ constructors (and destructors) of classes which are used as a type of a global variable.
When the compiled program exits it saves this data to a file called auxname.gcda for each source file. The data may be used for profile-directed optimizations (-fbranch-probabilities), or for test coverage analysis (-ftest-coverage). Each object file’s auxname is generated from the name of the output file, if explicitly specified and it is not the final executable, otherwise it is the basename of the source file. In both cases any suffix is removed (e.g. foo.gcda for input file dir/foo.c, or dir/foo.gcda for output file specified as -o dir/foo.o).
Note that if a command line directly links source files, the corresponding .gcda files will be prefixed with the unsuffixed name of the output file. E.g. gcc a.c b.c -o binary would generate binary-a.gcda and binary-b.gcda files.
在編譯時加入這個選項後,運作程式時會自動產生 .gcda
檔案。
The data may be used for profile-directed optimizations (-fbranch-probabilities), or for test coverage analysis (-ftest-coverage)
(待處理)
另外也提到可以透過 gcov 工作來分析。(待處理)
(gdb) set $b=(block_element_t*) ((size_t)p-sizeof(block_element_t))
(gdb) set $t=(size_t*)((size_t)$b+sizeof(block_element_t)+$b->payload_size)
(gdb) p/x *$t
select() can monitor only file descriptors numbers that are less than FD_SETSIZE (1024)
文件中提及 select 只能監視最多 1024 個 file descriptors,超過則可以用 poll(2) 或 epoll(7)。
Thus, if using select() within a loop, the sets must be
reinitialized before each call.