# Kernel Study – Kdump and Crash

## Reference link

+ [Kernel Crash Dump](https://ubuntu.com/server/docs/kernel-crash-dump)
+ [Ubuntu 20.04 Kdump + Crash 初体验](https://www.ebpf.top/post/ubuntu_kdump_crash/) (a first look at Kdump and Crash on Ubuntu 20.04, in Chinese)

## Installation

Install kdump, kexec, and crash.

```bash
$ sudo apt install linux-crashdump
$ sudo apt install crash
```

**Manually install the latest crash**

An older crash cannot read dumps from a newer kernel and fails with a segmentation fault, so build and install the latest crash by hand.

```bash
# Download the latest crash
# https://github.com/crash-utility/crash/releases
$ wget https://github.com/crash-utility/crash/archive/refs/tags/8.0.0.tar.gz
# Unpack the source before building
$ tar -xzf 8.0.0.tar.gz
$ cd crash-8.0.0
$ make
$ sudo make install
```

Reconfigure kexec and kdump.

```bash
$ sudo dpkg-reconfigure kexec-tools
$ sudo dpkg-reconfigure kdump-tools
```

Configuration files for kexec and kdump:

+ /etc/default/kexec
+ /etc/default/kdump-tools

Confirm that kdump is working:

```bash
$ sudo kdump-config show
```

Check the status of the kdump-tools service:

```bash
$ service kdump-tools status
```

## Verification

```bash
# The redirection must run as root, so pipe through sudo tee
# instead of "sudo echo ... > file".
$ echo 1 | sudo tee /proc/sys/kernel/sysrq
# Warning: this crashes the kernel immediately.
$ echo c | sudo tee /proc/sysrq-trigger
```

After the command runs successfully, a directory named after the current date is created under `/var/crash`. It contains two files, `dmesg.x` and `dump.x`: `dmesg.x` is the kernel log at the time of the crash, and `dump.x` is the dumped kernel snapshot.

## Install vmlinux with debug info

```bash
$ echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
$ sudo apt install ubuntu-dbgsym-keyring
$ sudo apt-get update
$ sudo apt -y install linux-image-$(uname -r)-dbgsym
# After installation, check the files
$ sudo ls -hl /usr/lib/debug/boot/
```

## Hardlockup/Softlockup

ref : [Softlockup detector and hardlockup detector (aka nmi_watchdog)](https://www.kernel.org/doc/html/latest/admin-guide/lockup-watchdogs.html)

ref : [Documentation for /proc/sys/kernel/](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html)

+ A ‘softlockup’: ==a bug that causes the kernel to loop in kernel mode for more than 20 seconds==, without giving
other tasks a chance to run.
+ A ‘hardlockup’: ==a bug that causes the CPU to loop in kernel mode for more than 10 seconds==, without letting other interrupts have a chance to run.

First check whether the lockup detectors are enabled in the kernel config.

Then use the following commands to make the kernel panic when a hardlockup, softlockup, or hung task is detected, so that kdump can capture a dump.

```bash
# Writing under /proc/sys requires root, hence sudo tee.
$ echo 1 | sudo tee /proc/sys/kernel/softlockup_panic
$ echo 1 | sudo tee /proc/sys/kernel/hardlockup_panic
$ echo 1 | sudo tee /proc/sys/kernel/hung_task_panic
$ echo 30 | sudo tee /proc/sys/kernel/hung_task_timeout_secs
```

Alternatively, edit /etc/sysctl.conf:

```bash
# /etc/sysctl.conf
kernel.hung_task_panic = 1
kernel.softlockup_panic = 1
kernel.hardlockup_panic = 1
kernel.hung_task_timeout_secs = 30
```

Reload the system parameters:

```bash
$ sudo sysctl -p
```

## GitKernelBuild

ref : [GitKernelBuild](https://wiki.ubuntu.com/KernelTeam/GitKernelBuild)

### Prerequisites

```bash
$ sudo apt-get install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache bison flex libelf-dev dwarves
```

### Kernel Build and Installation

#### Download the source and use the old config

```bash
$ git clone https://github.com/torvalds/linux.git
$ cd linux
$ cp /boot/config-$(uname -r) .config
$ make oldconfig # prompts for every new config option
# or
$ make olddefconfig # new config options take their default values
```

#### Modify the .config

Clear the following keys, because the files they point to do not exist.

```bash
$ ./scripts/config --set-str SYSTEM_TRUSTED_KEYS ""
$ ./scripts/config --set-str SYSTEM_REVOCATION_KEYS ""
$ ./scripts/config --disable SYSTEM_REVOCATION_KEYS
$ ./scripts/config --disable SYSTEM_TRUSTED_KEYS
```

#### Build and Install kernel

```bash
$ time make -j$(nproc) olddefconfig bindeb-pkg # builds .deb packages in the parent directory
$ sudo dpkg -i ../linux-image-<version>.deb
$ sudo dpkg -i ../linux-headers-<version>.deb
$ sudo reboot
```

## Crash

###### tags: `kernel`
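
For the Crash step, here is a minimal sketch that locates the newest dump and prints the matching `crash` invocation. It assumes the `/var/crash/<date>/dump.<date>` layout from the Verification section, and that the debug image is named `/usr/lib/debug/boot/vmlinux-$(uname -r)` (an assumption about the dbgsym package layout; check it with the `ls` command shown earlier). `latest_dump` and `crash_command` are hypothetical helper names, not part of kdump-tools or crash.

```bash
# Hypothetical helper: print the path of the newest dump file under a crash directory.
# Assumes date-named subdirectories, so the glob's sorted order is chronological.
latest_dump() {
    crash_dir="${1:-/var/crash}"
    newest=""
    for f in "$crash_dir"/*/dump.*; do
        # Skip the literal pattern when nothing matches.
        [ -f "$f" ] && newest="$f"
    done
    # Returns non-zero when no dump was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Hypothetical helper: print (not run) the crash command for the newest dump.
# The vmlinux path is an assumption; adjust it to what is actually installed.
crash_command() {
    dump="$(latest_dump "$1")" || return 1
    printf 'sudo crash /usr/lib/debug/boot/vmlinux-%s %s\n' "$(uname -r)" "$dump"
}
```

Running `crash_command /var/crash` prints a command line that can be reviewed and then pasted into the shell. Printing rather than executing keeps the sketch safe to try on a machine without any dumps.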