# Tracing linux kernel with *ftrace* and *virtme* [TOC] ## 實驗環境 - 安裝 [virtme](https://github.com/amluto/virtme) - 詳細的安裝流程可以參照 [測試 Linux 核心的虛擬化環境](https://hackmd.io/@sysprog/linux-virtme) - 注意要使用以下的指令切換到 Linux kernel 5.10 ```c=1 $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git $ cd linux $ git checkout v5.10 ``` ## 啟用核心中的 *ftrace* 模組 - 注意:若要啟用核心中的ftrace,需要在`make ARCH=x86 CROSS_COMPILE=x86_64-linux-gnu- -j8` 指令之前使用 `make menuconfig` 設定選項 - > 使用 virtme 選取預設核心組態並編譯: $ virtme-configkernel --defconfig $ **make menuconfig** $ make ARCH=x86 CROSS_COMPILE=x86_64-linux-gnu- -j8 - 輸入 `make menuconfig`之後會出現以下的畫面  - 進到 `Kernel hacking` 的目錄  - 進到 `Tracers` 的目錄  - 選取 `Kernel Function Tracer` 和 `Kernel fucntion profiler`  - 設定完成後,接著從 `make ARCH=x86 CROSS_COMPILE=x86_64-linux-gnu- -j8` 指令開始編譯核心 ## 用 `ftrace` 中的 `function tracer` 作實驗 - 利用 `function tracer` 紀錄 `ls` 指令產生 `__do_fault` 的次數 - 在 `/tmp` 路徑建立 `traceprocess.sh` 檔案 ```bash= #!/bin/bash # traceprocess.sh TRACE_DIR=/sys/kernel/debug/tracing echo > $TRACE_DIR/trace echo $$ > $TRACE_DIR/set_ftrace_pid echo __do_fault > $TRACE_DIR/set_ftrace_filter echo function > $TRACE_DIR/current_tracer exec $* ``` - 執行 `./traceprocess.sh ls` 之後用 `cat $TRACE_DIR/trace` 看結果 ``` # tracer: function # # entries-in-buffer/entries-written: 52/52 #P:1 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | ls-208 [000] .... 5360.308711: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308774: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308781: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308915: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308935: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308958: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.308983: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309041: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309306: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309332: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309356: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309378: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309399: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309421: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309443: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309779: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309789: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309816: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309839: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309863: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.309888: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.310590: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.310608: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.310691: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.311350: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.311363: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.311391: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.311900: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.312393: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.312464: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.312505: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.313118: __do_fault <-handle_mm_fault ls-208 [000] .... 5360.313145: __do_fault <-handle_mm_fault ``` ## 用 `ftrace` 中的 `function_graph tracer` 作實驗 - 利用 `function tracer` 紀錄 `ls` 指令產生 `scheduler()` 的 `function calling graph` - 在 `/tmp` 路徑建立 `graph.sh` 檔案 ```bash=1 #!/bin/bash # graph.sh TRACE_DIR=/sys/kernel/debug/tracing echo > $TRACE_DIR/trace echo $$ > $TRACE_DIR/set_ftrace_pid echo > $TRACE_DIR/set_ftrace_filter echo function_graph > $TRACE_DIR/current_tracer echo schedule > set_graph_function exec $* ``` - 執行 `./graph.sh ls` 之後用 `cat $TRACE_DIR/trace` 看結果 ```c=1 0) | schedule() { 0) | rcu_note_context_switch() { 0) 0.086 us | rcu_qs(); 0) 0.296 us | } 0) 0.094 us | _raw_spin_lock(); 0) | pick_next_task_fair() { 0) 0.121 us | update_curr(); 0) | pick_next_entity() { 0) 0.094 us | wakeup_preempt_entity.isra.0(); 0) 0.099 us | clear_buddies(); 0) 0.511 us | } 0) | put_prev_entity() { 0) 0.092 us | update_curr(); 0) 0.185 us | __enqueue_entity(); 0) 0.102 us | __update_load_avg_se(); 0) 0.098 us | __update_load_avg_cfs_rq(); 0) 0.941 us | } 0) | set_next_entity() { 0) 0.098 us | __update_load_avg_se(); 0) 0.086 us | __update_load_avg_cfs_rq(); 0) 0.534 us | } 0) 2.594 us | } 0) 0.359 us | __traceiter_sched_switch(); 0) 0.098 us | finish_task_switch(); 0) + 20.474 us | } ``` - 觀察 `ls` 指令中 `schedule()` 所需要的時間 ```bash= #!/bin/bash # graph.sh TRACE_DIR=/sys/kernel/debug/tracing echo > $TRACE_DIR/trace echo $$ > $TRACE_DIR/set_ftrace_pid echo schedule > $TRACE_DIR/set_ftrace_filter echo function_graph > $TRACE_DIR/current_tracer echo schedule > set_graph_function exec $* ``` - 執行 `./graph.sh ls` 之後用 `cat $TRACE_DIR/trace` 看結果 ```c=1 0) ! 285.366 us | schedule(); 0) ! 220.528 us | schedule(); 0) + 61.051 us | schedule(); 0) + 19.200 us | schedule(); 0) ! 124.577 us | schedule(); 0) + 42.002 us | schedule(); 0) + 36.972 us | schedule(); 0) 6.818 us | schedule(); 0) + 34.778 us | schedule(); 0) + 26.441 us | schedule(); 0) + 17.487 us | schedule(); 0) 5.430 us | schedule(); 0) + 36.539 us | schedule(); 0) + 34.682 us | schedule(); 0) + 36.247 us | schedule(); 0) 6.426 us | schedule(); 0) + 33.554 us | schedule(); 0) + 26.195 us | schedule(); 0) + 16.376 us | schedule(); 0) 5.257 us | schedule(); 0) + 33.107 us | schedule(); 0) + 35.430 us | schedule(); 0) 8.142 us | schedule(); 0) + 33.478 us | schedule(); 0) + 27.084 us | schedule(); 0) + 16.468 us | schedule(); 0) 5.328 us | schedule(); 0) + 33.092 us | schedule(); 0) + 35.729 us | schedule(); 0) + 36.055 us | schedule(); 0) 6.621 us | schedule(); 0) + 33.212 us | schedule(); 0) + 24.953 us | schedule(); 0) + 16.413 us | schedule(); 0) 5.218 us | schedule(); ``` ## `Event tracing` - 利用 `ftrace` 追蹤 `sched_switch` event 產生的次數 - 在 `/tmp` 路徑建立 `event.sh` 檔案 ```bash= #!/bin/bash # event.sh TRACE_DIR=/sys/kernel/debug/tracing echo nop > $TRACE_DIR/current_tracer echo 1 > $TRACE_DIR/events/sched/sched_switch/enable ``` - 執行 `./event.sh ls` 之後用 `cat $TRACE_DIR/trace` 看結果 ```c=1 # tracer: nop # # entries-in-buffer/entries-written: 26152/220171 #P:1 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] d... 7518.162516: sched_switch: prev_comm=swappee r/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=vi next_pid=219 next_prii o=120 vi-219 [000] d... 7518.162526: sched_stat_runtime: comm=vi pii d=219 runtime=10834 [ns] vruntime=24019514237 [ns] vi-219 [000] d... 7518.162527: sched_switch: prev_comm=vi pree v_pid=219 prev_prio=120 prev_state=D ==> next_comm=swapper/0 next_pid=0 next_prii o=120 <idle>-0 [000] d.h. 7518.162553: sched_waking: comm=vi pid=219 prio=120 target_cpu=000 <idle>-0 [000] dNh. 7518.162554: sched_wakeup: comm=vi pid=219 ```
×
Sign in
Email
Password
Forgot password
or
By clicking below, you agree to our
terms of service
.
Sign in via Facebook
Sign in via Twitter
Sign in via GitHub
Sign in via Dropbox
Sign in with Wallet
Wallet (
)
Connect another wallet
New to HackMD?
Sign up