[TOC]
## Introduction
For Ubuntu, [Kernel crash dump](https://ubuntu.com/server/docs/kernel-crash-dump) describes how to make a machine generate a dump file when it crashes. The instructions don't always work and usually need extra tweaks. They also only cover generating the dump file, not preparing an environment for reading it. A few more steps are needed before you can read the dump file properly with `crash(8)`.
This article covers both aspects. The first half describes how to set up `kdump` on the crashing machine. The second half describes how to set up an environment for reading the dump file.
## Setup `kdump` on crashed machine
### Step 1: Install `kdump-tools`
This should be straightforward. The `linux-crashdump` metapackage pulls in `kdump-tools`:
```
$ sudo apt install linux-crashdump
```
### (Issue 1) kdump is not enabled by default
Note that the documentation states:
*...Starting with 16.04, the kernel crash dump mechanism is enabled by default.*
In fact, it is not. At least, it was not enabled on my `lima` virtual machine, nor on Ubuntu 22.04 desktop for amd64. It has to be enabled manually; see the next step.
### Step 2: `dpkg-reconfigure`
If this is the first time you install these packages, you'll see prompts asking whether to hand reboots over to `kexec-tools`:
```
|------------------------| Configuring kexec-tools |------------------------|
| |
| |
| If you choose this option, a system reboot will trigger a restart into a |
| kernel loaded by kexec instead of going through the full system boot |
| loader process. |
| |
| Should kexec-tools handle reboots (sysvinit only)? |
| |
| <Yes> <No> |
| |
|---------------------------------------------------------------------------|
```
and whether to enable `kdump`:
```
|------------------------| Configuring kdump-tools |------------------------|
| |
| |
| If you choose this option, the kdump-tools mechanism will be enabled. A |
| reboot is still required in order to enable the crashkernel kernel |
| parameter. |
| |
| Should kdump-tools be enabled be default? |
| |
| <Yes> <No> |
| |
|---------------------------------------------------------------------------|
```
To enable `kdump`, answer `Yes` to both. You can also change these answers after installation with `dpkg-reconfigure` (run as root):
```
$ sudo dpkg-reconfigure kexec-tools
```
And:
```
$ sudo dpkg-reconfigure kdump-tools
```
### (Issue 2): `grub` failed to update due to `memtest86+`
Here is one issue you may face. If you are running Ubuntu 22.04 on an x86-64 machine, `update-grub` may fail because `memtest86+` does not support EFI boot:
```
...
Memtest86+ needs a 16-bit boot, that is not available on EFI, exiting
...
```
If that happens, the `dpkg-reconfigure` will not take effect, and `kdump` will not be enabled. The issue is discussed in [Update-grub broken, Memtest86+](https://askubuntu.com/questions/1421573/update-grub-broken-memtest86). You can work around it with:
```
$ sudo chmod -x /etc/grub.d/20_memtest86+
```
Then run the `dpkg-reconfigure` commands again. This should make it work on x86 machines.
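To tell in advance whether a machine is likely to hit this issue, you can check for an EFI boot and an executable `memtest86+` hook. This is only a sketch: `hook_enabled` is a hypothetical helper name, and it assumes that the presence of `/sys/firmware/efi` indicates an EFI boot.

```shell
# hook_enabled FILE: scripts under /etc/grub.d are only run when they are
# executable, so "enabled" here means the file exists and has an execute bit.
# (Hypothetical helper, not part of grub or kdump-tools.)
hook_enabled() {
    [ -f "$1" ] && [ -x "$1" ]
}

# /sys/firmware/efi only exists when the system was booted through EFI.
if [ -d /sys/firmware/efi ] && hook_enabled /etc/grub.d/20_memtest86+; then
    echo "EFI boot with memtest86+ hook enabled: update-grub may fail"
fi
```

If the message is printed, applying the `chmod -x` workaround before running `dpkg-reconfigure` saves one round trip.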
### Step 3: reboot
After reboot, if you run:
```
$ kdump-config show
```
You should see:
```
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x
/var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-5.19.0-45-generic
kdump initrd:
/var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-5.19.0-45-generic
current state: ready to kdump
kexec command:
/sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-5.19.0-45-generic root=UUID=94ffbc88-f087-4bc9-8d8f-dc365280fc1c ro console=tty1 console=ttyS0 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
```
Note the `current state: ready to kdump` line. If `kdump` is not set up properly, it will show `current state: Not ready to kdump` instead.
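If you want to check this from a script (for example while provisioning a test VM), the state line can be tested directly. A minimal sketch, assuming the output format shown above; `kdump_ready` is a hypothetical helper name, not something shipped with `kdump-tools`.

```shell
# kdump_ready: read `kdump-config show` output on stdin and succeed only if
# the "current state" line reports "ready to kdump".
# A "Not ready to kdump" line does not match, because "Not" follows the colon.
kdump_ready() {
    grep -q '^current state:[[:space:]]*ready to kdump'
}

# Only run the real check where kdump-tools is installed.
if command -v kdump-config >/dev/null 2>&1; then
    if kdump-config show 2>/dev/null | kdump_ready; then
        echo "kdump is armed"
    else
        echo "kdump is NOT armed" >&2
    fi
fi
```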
### (Optional) Set dump file format in `/etc/default/kdump-tools`
If `kdump` is set up using these steps, the default output is a compressed format (the kdump-compressed format written by `makedumpfile`). `gdb` is not able to interpret this format, but `crash(8)` is. If you'd like to do post-mortem analysis using `gdb` (or `drgn`), you'll need to dump in ELF format. This can be done by editing the `MAKEDUMP_ARGS` parameter in [`/etc/default/kdump-tools`](https://manpages.debian.org/unstable/kdump-tools/kdump-tools.5.en.html):
```
$ sudo vim /etc/default/kdump-tools
```
The `MAKEDUMP_ARGS` parameter specifies what options are passed to `makedumpfile` when a dump file is generated. By default, it is commented out:
```
# ---------------------------------------------------------------------------
# Makedumpfile options:
# MAKEDUMP_ARGS - extra arguments passed to makedumpfile (8). The default,
# if unset, is to pass '-c -d 31' telling makedumpfile to use compression
# and reduce the corefile to in-use kernel pages only.
#MAKEDUMP_ARGS="-c -d 31"
```
Uncomment that option and replace `-c` with `-E`, so that instead of using the compressed (`-c`) format, `makedumpfile` stores the dump as an ELF file (`-E`):
```
# ---------------------------------------------------------------------------
# Makedumpfile options:
# MAKEDUMP_ARGS - extra arguments passed to makedumpfile (8). The default,
# if unset, is to pass '-c -d 31' telling makedumpfile to use compression
# and reduce the corefile to in-use kernel pages only.
MAKEDUMP_ARGS="-E -d 31"
```
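The same edit can be scripted with `sed`. This is a sketch that assumes the stock commented-out `MAKEDUMP_ARGS` line shown above; `set_makedump_elf` is a hypothetical helper name, and the demo below runs on a scratch file rather than on the live configuration.

```shell
# set_makedump_elf FILE: uncomment MAKEDUMP_ARGS (if it is commented out) and
# set it to "-E -d 31". A backup of the original is kept as FILE.bak.
# The descriptive comment line "# MAKEDUMP_ARGS - extra arguments ..." is left
# alone because it has no "=" directly after the variable name.
set_makedump_elf() {
    sed -i.bak -E 's|^#?[[:space:]]*MAKEDUMP_ARGS=.*|MAKEDUMP_ARGS="-E -d 31"|' "$1"
}

# Demonstrate on a scratch file that mimics the relevant part of the config.
tmp=$(mktemp)
printf '%s\n' '# Makedumpfile options:' '#MAKEDUMP_ARGS="-c -d 31"' > "$tmp"
set_makedump_elf "$tmp"
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

To apply it for real, call `set_makedump_elf /etc/default/kdump-tools` from a root shell.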
### Step 4: increase reserved memory size
`kdump` reserves a portion of memory for the kdump kernel. Determining how much it should reserve is a topic of its own. Instead of starting with a low value (as the documentation does), I suggest making it large enough at the very beginning, even if it seems too large, and then gradually decreasing it. It's better to make sure it works before optimizing the size. So edit `/etc/default/grub.d/kdump-tools.cfg`, and set it to (as the documentation suggests):
```
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=384M-:512M"
```
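Here, `384M-:512M` means that on machines with at least 384 MB of RAM, 512 MB is reserved for the kdump kernel. Then regenerate the GRUB configuration: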
```
$ sudo update-grub
```
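After the next reboot, you can cross-check the reservation through `/proc/cmdline` and `/sys/kernel/kexec_crash_size`, which reports the reserved size in bytes. A small sketch; `has_crashkernel` and `bytes_to_mib` are hypothetical helper names.

```shell
# has_crashkernel CMDLINE: succeed if the kernel command line (passed as one
# string) contains a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# bytes_to_mib BYTES: convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

if has_crashkernel "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "crashkernel= is on the kernel command line"
else
    echo "crashkernel= is missing from the kernel command line"
fi

# kexec_crash_size holds the size of the reserved crash kernel region.
if [ -r /sys/kernel/kexec_crash_size ]; then
    echo "reserved: $(bytes_to_mib "$(cat /sys/kernel/kexec_crash_size)") MiB"
fi
```

A reported size of 0 MiB means nothing was reserved, in which case `kdump-config show` will not report `ready to kdump` either.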
### Step 5: test `kdump`
Crash the kernel with SysRq:
```
$ sudo -s
[sudo] password for ubuntu:
# echo c > /proc/sysrq-trigger
```
After reboot, if `kdump` is properly configured, there should be a directory named after the date and time of the crash:
```
$ tree /var/crash/
/var/crash/
├── 202306211421
│ ├── dmesg.202306211421
│ └── dump.202306211421
├── kdump_lock
├── kexec_cmd
└── linux-image-5.19.0-45-generic-202306211421.crash
```
The `dmesg.<datetime>` file is the `dmesg` log preserved at the time of the crash; the `dump.<datetime>` file is the dump file, which can be analyzed with `crash(8)` later on.
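When several crashes have accumulated, picking the newest dump by hand gets tedious. The following sketch relies on the directory layout shown above, where directory names are timestamps; `latest_dump` is a hypothetical helper name.

```shell
# latest_dump [DIR]: print the path of the newest dump.<datetime> file under
# DIR (default: /var/crash). The names are timestamps (YYYYMMDDhhmm), so a
# plain lexical sort puts the most recent crash last.
latest_dump() {
    find "${1:-/var/crash}" -mindepth 2 -maxdepth 2 -type f -name 'dump.*' 2>/dev/null \
        | sort | tail -n 1
}

dump=$(latest_dump)
if [ -n "$dump" ]; then
    echo "latest dump: $dump"
else
    echo "no dump found"
fi
```

Depending on the permissions of `/var/crash`, this may need to be run as root.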
## Setup `crash` utilities
For generating the dump files, the above steps are sufficient. However, if you are going to *analyze* the dump file, you also need:
1. An uncompressed kernel image, i.e. `vmlinux`, with kernel debug symbols. This is called the `NAMELIST` in `crash(8)` terminology.
2. A working `crash(8)` binary.
Usually, 1. can only be generated if certain config options are enabled during kernel compilation. Fortunately, on Ubuntu it can be installed as a separate package. As for 2., although it can be installed with `apt install`, that version doesn't always work (it didn't work on my Ubuntu 22.04), so you will likely have to compile it manually from [source](https://github.com/crash-utility/crash/releases).
### Step 1: Install kernel debug symbol
Follow the instructions in [CrashdumpRecipe](https://wiki.ubuntu.com/Kernel/CrashdumpRecipe):
```
$ echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | \
sudo tee -a /etc/apt/sources.list.d/ddebs.list
```
```
$ sudo apt install ubuntu-dbgsym-keyring
```
```
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622
```
Finally, refresh the package index with `sudo apt-get update` and install the debug symbol package. Note that `crash(8)` is particular about the kernel version: the `vmlinux` with debug symbols must have the same version as the kernel that generated the dump file. If the machine you analyze on runs that same kernel (say, you are debugging your own machine), the package can be installed by:
```
$ sudo apt-get install linux-image-`uname -r`-dbgsym
```
otherwise, replace the `uname -r` part with the full release string of the kernel that produced the dump, including the flavour suffix such as `-generic`:
```
$ sudo apt-get install linux-image-5.19.0-43-generic-dbgsym
```
On success, a `vmlinux-<kernel release>` file appears under `/lib/debug/boot/`, for example:
```
$ ls /lib/debug/boot/
vmlinux-5.19.0-43-generic
```
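The naming scheme used above can be captured in two small helpers, which is handy when the dump comes from a different machine. This is only a sketch of the convention shown in this section; `dbgsym_pkg` and `debug_vmlinux` are hypothetical names.

```shell
# dbgsym_pkg RELEASE: name of the debug-symbol package for a kernel release
# string as printed by `uname -r`, e.g. 5.19.0-43-generic.
dbgsym_pkg() {
    echo "linux-image-$1-dbgsym"
}

# debug_vmlinux RELEASE: where that package places the uncompressed kernel.
debug_vmlinux() {
    echo "/lib/debug/boot/vmlinux-$1"
}

# Report the state for the currently running kernel.
rel=$(uname -r)
echo "package: $(dbgsym_pkg "$rel")"
if [ -e "$(debug_vmlinux "$rel")" ]; then
    echo "debug kernel present: $(debug_vmlinux "$rel")"
else
    echo "debug kernel missing: $(debug_vmlinux "$rel")"
fi
```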
As a side note, if the `vmlinux` version doesn't match that of the dump file, `crash(8)` will later complain:
```
crash: invalid kernel virtual address: 1fbc0 type: "current_task (per_cpu)"
crash: invalid kernel virtual address: 1fbc0 type: "current_task (per_cpu)"
crash: invalid kernel virtual address: 1fbc0 type: "current_task (per_cpu)"
crash: invalid kernel virtual address: 1fbc0 type: "current_task (per_cpu)"
crash: seek error: kernel virtual address: ffff892d3bc1fbc0 type: "current_task (per_cpu)"
crash: seek error: kernel virtual address: ffff892d3bc9fbc0 type: "current_task (per_cpu)"
crash: seek error: kernel virtual address: ffff892d3bd1fbc0 type: "current_task (per_cpu)"
crash: seek error: kernel virtual address: ffff892d3bd9fbc0 type: "current_task (per_cpu)"
crash: page excluded: kernel virtual address: ffffffff9463abc0 type: "current_task (per_cpu)"
crash: page excluded: kernel virtual address: ffffffff9463abc0 type: "current_task (per_cpu)"
crash: page excluded: kernel virtual address: ffffffff9463abc0 type: "current_task (per_cpu)"
```
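To catch a mismatch before starting `crash(8)`, you can compare the `vmlinux` file name with the kernel release recorded in the `dmesg.<datetime>` file saved next to the dump. This sketch assumes the log still contains the `Linux version ...` boot banner (it may be missing if the ring buffer wrapped); `release_from_dmesg` and `check_pair` are hypothetical helper names, and the sample log line is made up for the demo.

```shell
# release_from_dmesg FILE: print the kernel release taken from the
# "Linux version <release> ..." boot banner, or nothing if it is absent.
release_from_dmesg() {
    sed -n 's/.*Linux version \([^ ]*\).*/\1/p' "$1" | head -n 1
}

# check_pair VMLINUX DMESG_FILE: succeed if the vmlinux file name matches the
# release found in the saved dmesg log.
check_pair() {
    rel=$(release_from_dmesg "$2")
    [ -n "$rel" ] && [ "$(basename "$1")" = "vmlinux-$rel" ]
}

# Demo on a scratch file with an illustrative banner line.
sample=$(mktemp)
echo '[    0.000000] Linux version 5.19.0-45-generic (buildd@host) #46-Ubuntu SMP' > "$sample"
if check_pair /lib/debug/boot/vmlinux-5.19.0-43-generic "$sample"; then
    echo "vmlinux matches the dump"
else
    echo "mismatch: dump was produced by $(release_from_dmesg "$sample")"
fi
rm -f "$sample"
```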
### Step 2: get a working `crash(8)`
When you install `linux-crashdump`, a version of `crash(8)` is included. The basic usage of `crash(8)` is:
```
$ crash <debug kernel> <crash dump>
```
So in this case:
```
$ crash /lib/debug/boot/vmlinux-5.19.0-45-generic /var/crash/202306211421/dump.202306211421
```
However, this `crash` doesn't necessarily work. For example, on Ubuntu 22.04, if you are using `crash 8.0.0`, it fails during initialization:
```
crash 8.0.0
Copyright (C) 2002-2021 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2021 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
please wait... (gathering kmem slab cache data)
crash: invalid structure member offset: kmem_cache_s_num
FILE: memory.c LINE: 9619 FUNCTION: kmem_cache_init()
[/usr/bin/crash] error trace: 556099d8969e => 556099d5d2f4 => 556099e2b11b => 556099e2b09c
```
To deal with it, you have to compile a newer version of `crash(8)` from source. At the time of writing, the latest available version is [8.0.3](https://github.com/crash-utility/crash/releases).
To compile it, first install the dependencies:
```
$ sudo apt install bison texinfo libz-dev
```
And then get the source code and compile it:
```
$ wget "https://github.com/crash-utility/crash/archive/refs/tags/8.0.3.zip"
$ unzip 8.0.3.zip
$ cd crash-8.0.3/
$ make
```
Finally, use the compiled `crash`:
```
$ ./crash /lib/debug/boot/vmlinux-5.19.0-45-generic /var/crash/202306211421/dump.202306211421
```
If everything works well, the `crash> ` prompt will appear:
```
crash 8.0.3
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
/home/lima.linux/.gdbinit:1: Error in sourced command file:
~/.gef-2b72f5d0d9f0f218a91cd1ca5148e45923b950d5.py:52: Error in sourced command file:
Undefined command: "import". Try "help".
KERNEL: /lib/debug/boot/vmlinux-5.19.0-45-generic
DUMPFILE: /var/crash/202306211421/dump.202306211421 [PARTIAL DUMP]
CPUS: 4
DATE: Wed Jun 21 14:20:48 UTC 2023
UPTIME: 00:25:24
LOAD AVERAGE: 0.06, 0.04, 0.05
TASKS: 220
NODENAME: lima-default
RELEASE: 5.19.0-45-generic
VERSION: #46-Ubuntu SMP PREEMPT_DYNAMIC Wed Jun 7 09:08:58 UTC 2023
MACHINE: x86_64 (1000 Mhz)
MEMORY: 4 GB
PANIC: "Kernel panic - not syncing: sysrq triggered crash"
PID: 3753
COMMAND: "bash"
TASK: ffff8e51874bc8c0 [THREAD_INFO: ffff8e51874bc8c0]
CPU: 3
STATE: TASK_RUNNING (PANIC)
crash>
```
And you are good to go!
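Once the prompt works, a first triage can also be run non-interactively by feeding `crash(8)` a command file with `-i`. The sketch below only writes such a file; the choice of commands (`sys`, `bt`, `log`, `quit`) and the helper name `write_triage_cmds` are my own picks for illustration.

```shell
# write_triage_cmds FILE: write a small crash(8) command script that prints
# the system summary, the backtrace of the panic task and the kernel log,
# then leaves crash.
write_triage_cmds() {
    cat > "$1" <<'EOF'
sys
bt
log
quit
EOF
}

cmds=$(mktemp)
write_triage_cmds "$cmds"
# With the compiled crash binary and a dump at hand, this would be used as:
#   ./crash -s -i "$cmds" <debug kernel> <crash dump> > triage.txt
cat "$cmds"
rm -f "$cmds"
```

The `-s` option suppresses the version and copyright banner, which keeps the saved report short.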