contributed by < leewei05
>
memchr_opt 初步理解為以下:
注意用詞:
尊重資訊科技前輩的篳路藍縷,儘量使用傳統詞彙。
首先會判斷帶入的字串長度是否比 8 個位元組還要長,如果比 8 個位元組還短即比對各個位元組 (byte) 搜尋指定的 char。如果比字串比 8 個位元組還要長,那就會進入 if (!TOO_SMALL(length))
的判斷式。
與其一個位元組接著一個位元組比對,memchr 會根據 unsigned char d
(欲搜尋的字元) 製作一個 8 個位元組的 bytemask。
void *memchr_opt(const void *src_void, int c, size_t length)
{
const unsigned char *src = (const unsigned char *) src_void;
unsigned char d = c;
while (UNALIGNED(src)) {
if (!length--)
return NULL;
if (*src == d)
return (void *) src;
src++;
}
if (!TOO_SMALL(length)) {
/* If we get this far, we know that length is large and
* src is word-aligned.
*/
/* The fast code reads the source one word at a time and only performs
* the bytewise search on word-sized segments if they contain the search
* character, which is detected by XORing the word-sized segment with a
* word-sized block of the search character and then detecting for the
* presence of NULL in the result.
*/
unsigned long *asrc = (unsigned long *) src;
unsigned long mask = d << 8 | d;
mask = mask << 16 | mask;
for (unsigned int i = 32; i < LBLOCKSIZE * 8; i <<= 1)
mask = (mask << i) | mask;
while (length >= LBLOCKSIZE) {
// check if char matches
if (DETECT_CHAR(*asrc, mask)) {
// if yes, check char one by one until we find the char
src = (unsigned char *) asrc;
while (length--) {
if (*src == d)
return (void *) src;
src++;
}
}
// if no char matches, reduce length by LBLOCKSIZE
length -= LBLOCKSIZE;
// move forward by LBLOCKSIZE because there were no matching
// charater in this block
asrc += 1;
// check if source array is shorter than LBLOCKSIZE
if (DETECT_NULL(*asrc))
break;
}
/* If there are fewer than LBLOCKSIZE characters left, then we resort to
* the bytewise loop.
*/
src = (unsigned char *) asrc;
}
while (length--) {
if (*src == d)
return (void *) src;
src++;
}
return NULL;
}
int main()
{
const char str[] = "http://wiki.csie.ncku.edu.tw";
const char ch = '.';
const char str2[] = "http://wiki.csie.ncku.edu.tw";
const char ch2 = 'p';
const char str3[] = "http://wiki.csie.ncku.edu.tw";
const char ch3 = 'u';
char *ret = memchr_opt(str, ch, strlen(str));
printf("String after |%c| is - |%s|\n", ch, ret);
char *ret2 = memchr_opt(str2, ch2, strlen(str2));
printf("String after |%c| is - |%s|\n", ch2, ret2);
char *ret3 = memchr_opt(str3, ch3, strlen(str3));
printf("String after |%c| is - |%s|\n", ch3, ret3);
return 0;
}
執行結果如下:
String after |.| is - |.csie.ncku.edu.tw|
String after |p| is - |p://wiki.csie.ncku.edu.tw|
String after |u| is - |u.edu.tw|
一開始很快就 parse 出 .
,但加了另外兩個 case 發現都會回傳 NULL。透過 gdb 檢查之後發現這行寫錯了。原本是要位移一個 pointer to long,但一開始寫的版本是取回一個位元組,所以才會出錯。也因為為了熟悉 gdb 的關係,花了不少時間,但學會使用 gdb 除錯真的節省不少時間以及排除用感覺寫程式的壞習慣。
// WRONG
*asrc = *asrc + 8;
// CORRECT
asrc += 1;
首先從核心模組的 init function 開始看起。查閱 Linux Kernel 原始碼,dont_trace_init
基本上就是建立一個 work queue,並且設定 work queue 的延遲時間(1 Jiffy)。
exit function 則是停止 work queue 的 routine 並清除先前定義的 work queue。
static int __init dont_trace_init(void)
{
wq = create_workqueue(DONT_TRACE_WQ_NAME);
queue_delayed_work(wq, &dont_trace_task, JIFFIES_DELAY);
loaded = true;
pr_info("Loaded!\n");
return 0;
}
static void __exit dont_trace_exit(void)
{
loaded = false;
/* No new routines will be queued */
cancel_delayed_work(&dont_trace_task);
/* Wait for the completion of all routines */
flush_workqueue(wq);
destroy_workqueue(wq);
pr_info("Unloaded.\n");
}
除了 periodic_routine
以外需要實作,其他四個函式分別為:
kill_task
: 刪除 Kernel space 的行程is_tracer
: 判斷行程是否有追蹤其他行程kill_tracee
: 刪除所有 tracer 追蹤的行程check
: 檢查所有的行程的 ptraced,也就是追蹤的行程。如果 is_tracer
返回 true,則刪除所有追蹤行程以及該行程。/*
* 'ptraced' is the list of tasks this task is using ptrace() on.
*
* This includes both natural children and PTRACE_ATTACH targets.
* 'ptrace_entry' is this task's link on the p->parent->ptraced list.
*/
struct list_head ptraced;
首先,likely
巨集是由 gcc extension __builtin_expect
組成。此用途是可提示編譯器產生對 branch predictor 友善的程式碼,因為工程師可以透過 __builtin_expect
來告知編譯器 !!(x) == 1
的機率很大,也就是 x 很可能為 1。!!
是確保 !!(x)
的結果一定為 0 或是 1。
6.5.3.3 Unary arithmetic operators
The result of the logical negation operator ! is 0 if the value of its operand compares
unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int.
The expression !E is equivalent to (0==E)
# define likely(x) __builtin_expect(!!(x), 1)
什麼時候需要會需要執行 check()
? A: 掛載核心模組的時候。
可以使用 init function 所定義的 loaded = true;
,因為大部分的時間 loaded == true
。
static void periodic_routine(struct work_struct *ws)
{
// check each process if dont_trace module is loaded
if (likely(loaded))
check();
// run work task again after JIFFIES_DELAY
schedule_delayed_work(&dont_trace_task, JIFFIES_DELAY);
}
測試程式碼
# 確認沒有掛載模組
$ lsmod | grep dont
# 編譯模組
$ make
make -C /lib/modules/`uname -r`/build M=quiz3 modules
make[1]: Entering directory '/usr/src/linux-headers-5.13.0-39-generic'
CC [M] quiz3/dont_trace.o
MODPOST quiz3/Module.symvers
CC [M] quiz3/dont_trace.mod.o
LD [M] quiz3/dont_trace.ko
BTF [M] quiz3/dont_trace.ko
Skipping BTF generation for quiz3/dont_trace.ko due to unavailability of vmlinux
make[1]: Leaving directory '/usr/src/linux-headers-5.13.0-39-generic'
$ yes >/dev/null &
[1] 21169
$ sudo gdb -q --pid=`pidof yes`
Attaching to process 21169
Reading symbols from /usr/bin/yes...
(No debugging symbols found in /usr/bin/yes)
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007f399da860a7 in __GI___libc_write (fd=1, buf=0x555c63a3b510, nbytes=8192) at ../sysdeps/unix/sysv/linux/write.c:26
26 ../sysdeps/unix/sysv/linux/write.c: No such file or directory.
(gdb) signal SIGINT
Continuing with signal SIGINT.
Program terminated with signal SIGINT, Interrupt.
The program no longer exists.
(gdb) quit
[1]+ Interrupt yes > /dev/null
$ make load
sudo insmod dont_trace.ko
$ lsmod | grep dont
dont_trace 16384 0
$ yes >/dev/null &
[1] 21450
$ sudo gdb -q --pid=`pidof yes`
Attaching to process 21450
[1]+ Killed yes > /dev/null
Killed
$make unload
sudo rmmod dont_trace
當一個行程追蹤了另一個行程,被追蹤的行程 (tracee) 需要成為追蹤行程 (tracer) 的子行程,tracer 擁有 tracee 的控制權,包括設定 break point,檢視 tracee 的記憶體位址,攔截系統呼叫等等。GDB 即是透過 ptrace 實現上述的功能。
Ptrace 的限制
參考文件
Ptrace manual
Intercepting and Emulating Linux System Calls with Ptrace
Ptrace, Utrace, Uprobes: Lightweight, Dynamic Tracing of User Apps