Chapter 14. Configuring kdump on the command line - RHEL9
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/managing_monitoring_and_updating_the_kernel/configuring-kdump-on-the-command-line_managing-monitoring-and-updating-the-kernel
Chapter 14. Configuring kdump on the command line
The memory for kdump is reserved during the system boot. You can configure the memory size in the system’s Grand Unified Bootloader (GRUB) configuration file. The memory size depends on the crashkernel= value specified in the configuration file and the size of the physical memory of the system.
14.1. Estimating the kdump size
When planning and building your kdump environment, it is important to know the space required by the crash dump file.
The makedumpfile --mem-usage command estimates the space required by the crash dump file. It generates a memory usage report. The report helps you decide the dump level and the pages that are safe to exclude.
Procedure
Enter the following command to generate a memory usage report:
# makedumpfile --mem-usage /proc/kcore
TYPE            PAGES       EXCLUDABLE  DESCRIPTION
-------------------------------------------------------------
ZERO            501635      yes         Pages filled with zero
CACHE           51657       yes         Cache pages
CACHE_PRIVATE   5442        yes         Cache pages + private
USER            16301       yes         User process pages
FREE            77738211    yes         Free pages
KERN_DATA       1333192     no          Dumpable kernel data
Important
The makedumpfile --mem-usage command reports required memory in pages. To obtain the size of the memory in use, multiply the page count by the kernel page size.
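As a rough worked example of that calculation, the following Python sketch converts the page counts from the sample report above into sizes. It assumes 4 KiB pages, as on AMD64 and Intel 64 systems; the names REPORT, PAGE_SIZE, and pages_to_mib are illustrative only and are not part of any kdump tool.

```python
# Page counts copied from the sample report above: (pages, excludable).
REPORT = {
    "ZERO": (501635, True),
    "CACHE": (51657, True),
    "CACHE_PRIVATE": (5442, True),
    "USER": (16301, True),
    "FREE": (77738211, True),
    "KERN_DATA": (1333192, False),
}

PAGE_SIZE = 4096  # bytes; assumed 4 KiB pages (AMD64 and Intel 64)


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to mebibytes."""
    return pages * page_size / (1024 * 1024)


# Pages that cannot be excluded set a floor for the uncompressed dump size.
required = sum(pages for pages, excludable in REPORT.values() if not excludable)
total = sum(pages for pages, _ in REPORT.values())

print(f"non-excludable: {pages_to_mib(required):.0f} MiB")
print(f"all pages:      {pages_to_mib(total):.0f} MiB")
```

With the sample numbers, the non-excludable kernel data comes to roughly 5.1 GiB before compression. On a system with 64 KB pages, set PAGE_SIZE to 65536 instead.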
By default, the RHEL kernel uses 4 KB sized pages on AMD64 and Intel 64 CPU architectures, and 64 KB sized pages on IBM POWER architectures.
14.2. Configuring kdump memory usage on RHEL 9
The kexec-tools package maintains the default crashkernel= memory reservation values. The kdump service uses the default value to reserve the crash kernel memory for each kernel. The default value can also serve as the reference base value to estimate the required memory size when you set the crashkernel= value manually. The minimum size of the crash kernel can vary depending on the hardware and machine specifications.
The automatic memory allocation for kdump also varies based on the system hardware architecture and available memory size. For example, on AMD64 and Intel 64-bit architectures, the default value for the crashkernel= parameter will work only when the available memory is more than 1 GB. The kexec-tools utility configures the following default memory reserves on AMD64 and Intel 64-bit architecture:
crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M
You can also run kdumpctl estimate to get an approximate value without triggering a crash. The estimated crashkernel= value might not be an exact one but can serve as a reference to set an appropriate crashkernel= value.
Note
The crashkernel=auto option in the boot command line is no longer supported on RHEL 9 and later releases.
Prerequisites
You have root permissions on the system.
You have fulfilled kdump requirements for configurations and targets. For details, see Supported kdump configurations and targets.
On IBM Z systems, you have installed the zipl utility.
Procedure
Configure the default value for crash kernel:
# kdumpctl reset-crashkernel --kernel=ALL
When configuring the crashkernel= value, test the configuration by rebooting the system with kdump enabled. If the kdump kernel fails to boot, increase the memory size gradually to set an acceptable value.
To use a custom crashkernel= value:
Configure the required memory reserve.
crashkernel=192M
Optionally, you can set the amount of reserved memory to a variable depending on the total amount of installed memory by using the syntax crashkernel=<range1>:<size1>,<range2>:<size2>. For example:
crashkernel=1G-4G:192M,4G-64G:256M
The example reserves 192 MB of memory if the total amount of system memory is 1 GB or higher and lower than 4 GB. If the total amount of memory is 4 GB or higher and lower than 64 GB, 256 MB is reserved for kdump.
Optional: Offset the reserved memory.
Some systems require the crash kernel memory to be reserved at a fixed offset, because the crashkernel reservation happens very early in the boot process and some memory areas must stay available for special usage. If the offset is set, the reserved memory begins there. To offset the reserved memory, use the following syntax:
crashkernel=192M@16M
The example reserves 192 MB of memory starting at 16 MB (physical address 0x01000000). If you offset to 0 or do not specify a value, kdump offsets the reserved memory automatically. You can also offset memory when setting a variable memory reservation by specifying the offset as the last value. For example, crashkernel=1G-4G:192M,4G-64G:256M@16M.
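To check which reservation a given crashkernel= value selects for a particular amount of memory, the range and offset syntax described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the lower bound of a range is treated as inclusive and the upper bound as exclusive (as in the explanation above), the first matching range is used, and the kernel's own parser is not reproduced.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Parse a size such as '192M' or '4G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def select_reservation(value, total_ram):
    """Return (size, offset) in bytes for a crashkernel= value.

    size is 0 when no range matches; offset is None when it is not given.
    """
    value, _, offset_text = value.partition("@")
    offset = parse_size(offset_text) if offset_text else None

    if ":" not in value:  # plain form, for example "192M"
        return parse_size(value), offset

    for entry in value.split(","):
        range_text, size_text = entry.split(":")
        start_text, _, end_text = range_text.partition("-")
        start = parse_size(start_text)
        # An empty upper bound ("64G-") means the range is open-ended.
        end = parse_size(end_text) if end_text else float("inf")
        if start <= total_ram < end:  # assumed: first matching range wins
            return parse_size(size_text), offset
    return 0, offset


GIB = 1 << 30
size, offset = select_reservation("1G-4G:192M,4G-64G:256M@16M", 8 * GIB)
print(size >> 20, "MiB at offset", hex(offset))  # 256 MiB at offset 0x1000000
```

A result of 0 for the size means that no range covers the given amount of memory, in which case no crash kernel memory would be reserved by that value.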
Update the boot loader configuration:
# grubby --update-kernel ALL --args "crashkernel=<custom-value>"
The <custom-value> must contain the custom crashkernel= value that you have configured for the crash kernel.
Reboot for changes to take effect:
# reboot
Verification
The commands to test kdump configuration will cause the kernel to crash with data loss. Follow the instructions with care. You must not use an active production system to test the kdump configuration.
Cause the kernel to crash by activating the sysrq key. The address-YYYY-MM-DD-HH:MM:SS/vmcore file is saved to the target location as specified in the /etc/kdump.conf file. If you select the default target location, the vmcore file is saved in the partition mounted under /var/crash/.
Activate the sysrq key to boot into the kdump kernel:
# echo c > /proc/sysrq-trigger
The command causes the kernel to crash and, if required, reboots the system.
Display the /etc/kdump.conf file and check if the vmcore file is saved in the target destination.
Additional resources
How to manually modify the boot parameter in grub before the system boots.
grubby(8) man page on your system.
14.3. Configuring the kdump target
The crash dump is usually stored as a file in a local file system or written directly to a device. Optionally, you can send the crash dump over a network by using the NFS or SSH protocols. Only one of these options to preserve a crash dump file can be set at a time. The default behaviour is to store it in the /var/crash/ directory of the local file system.
Prerequisites
You have root permissions on the system.
You have fulfilled kdump requirements for configurations and targets. For details, see Supported kdump configurations and targets.
Procedure
To store the crash dump file in the /var/crash/ directory of the local file system, edit the /etc/kdump.conf file and specify the path:
path /var/crash
The option path /var/crash represents the path to the file system in which kdump saves the crash dump file.
Note
When you specify a dump target in the /etc/kdump.conf file, then the path is relative to the specified dump target.
When you do not specify a dump target in the /etc/kdump.conf file, then the path represents the absolute path from the root directory.
Depending on the file system mounted in the current system, the dump target and the adjusted dump path are configured automatically.
To secure the crash dump file and the accompanying files produced by kdump, you should set up proper attributes for the target destination directory, such as user permissions and SELinux contexts. Additionally, you can define a script, for example kdump_post.sh in the kdump.conf file as follows:
kdump_post <path_to_kdump_post.sh>
The kdump_post directive specifies a shell script or a command that executes after kdump has completed capturing and saving a crash dump to the specified destination. You can use this mechanism to extend the functionality of kdump to perform actions including the adjustments in file permissions.
The kdump target configuration
# grep -v ^# /etc/kdump.conf | grep -v ^$
ext4 /dev/mapper/vg00-varcrashvol
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31
The dump target is specified (ext4 /dev/mapper/vg00-varcrashvol), and, therefore, it is mounted at /var/crash. The path option is also set to /var/crash. Therefore, kdump saves the vmcore file in the /var/crash/var/crash directory.
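The rule from the preceding note, and the resulting /var/crash/var/crash directory, can be illustrated with a short Python sketch. The function resolve_dump_dir is hypothetical and models only the relative-versus-absolute rule stated above; it does not reproduce how kdump itself detects or mounts the target.

```python
import posixpath


def resolve_dump_dir(path, target_mount=None):
    """Return the directory in which the dump would be saved.

    path         -- value of the path directive in /etc/kdump.conf
    target_mount -- mount point of the dump target, or None when no
                    dump target is specified
    """
    if target_mount is None:
        # No dump target: the path is absolute from the root directory.
        return posixpath.normpath(path)
    # Dump target given: the path is relative to the target file system.
    return posixpath.normpath(posixpath.join(target_mount, path.lstrip("/")))


print(resolve_dump_dir("/var/crash"))                # /var/crash
print(resolve_dump_dir("/var/crash", "/var/crash"))  # /var/crash/var/crash
```

If the doubled directory is not what you want, setting the path option to / for a target that is already mounted at /var/crash gives /var/crash in this model.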
To change the local directory for saving the crash dump, edit the /etc/kdump.conf configuration file as a root user:
Remove the hash sign (#) from the beginning of the #path /var/crash line.
Replace the value with the intended directory path. For example:
path /usr/local/cores
Important
In RHEL 9, the directory defined as the kdump target using the path directive must exist when the kdump systemd service starts to avoid failures. Unlike in earlier versions of RHEL, the directory is no longer created automatically if it does not exist when the service starts.
To write the file to a different partition, edit the /etc/kdump.conf configuration file:
Remove the hash sign (#) from the beginning of one of the #ext4 lines, depending on how you want to identify the partition:
device name (the #ext4 /dev/vg/lv_kdump line)
file system label (the #ext4 LABEL=/boot line)
UUID (the #ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937 line)
Change the file system type and the device name, label or UUID, to the required values. Both UUID="correct-uuid" and UUID=correct-uuid are valid syntax for specifying UUID values. For example:
ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937
Important
It is recommended to specify storage devices by using a LABEL= or UUID=. Disk device names such as /dev/sda3 are not guaranteed to be consistent across reboots.
When you use Direct Access Storage Device (DASD) on IBM Z hardware, ensure the dump devices are correctly specified in /etc/dasd.conf before proceeding with kdump.
To write the crash dump directly to a device, edit the /etc/kdump.conf configuration file:
Remove the hash sign (#) from the beginning of the #raw /dev/vg/lv_kdump line.
Replace the value with the intended device name. For example:
raw /dev/sdb1
To store the crash dump to a remote machine by using the NFS protocol:
Remove the hash sign (#) from the beginning of the #nfs my.server.com:/export/tmp line.
Replace the value with a valid hostname and directory path. For example:
nfs penguin.example.com:/export/cores
Restart the kdump service for the changes to take effect:
sudo systemctl restart kdump.service
Note
While using the NFS directive to specify the NFS target, kdump.service automatically attempts to mount the NFS target to check the disk space. There is no need to mount the NFS target in advance. To prevent kdump.service from mounting the target, use the dracut_args --mount directive in kdump.conf. This will enable kdump.service to call the dracut utility with the --mount argument to specify the NFS target.
To store the crash dump to a remote machine by using the SSH protocol:
Remove the hash sign (#) from the beginning of the #ssh user@my.server.com line.
Replace the value with a valid username and hostname.
Include your SSH key in the configuration.
Remove the hash sign from the beginning of the #sshkey /root/.ssh/kdump_id_rsa line.
Change the value to the location of a key valid on the server you are trying to dump to. For example:
ssh john@penguin.example.com
sshkey /root/.ssh/mykey
Additional resources
Files produced by kdump after system crash.
14.4. Configuring the kdump core collector
The kdump service uses a core_collector program to capture the crash dump image. In RHEL, the makedumpfile utility is the default core collector. It helps shrink the dump file by:
Compressing the size of a crash dump file and copying only necessary pages by using various dump levels.
Excluding unnecessary crash dump pages.
Filtering the page types to be included in the crash dump.
Note
Crash dump file compression is enabled by default in RHEL 7 and later.
If you need to customise the crash dump file compression, follow this procedure.
Syntax
core_collector makedumpfile -l --message-level 1 -d 31
Options
-c, -l, or -p: compresses the dump file page by page, by using zlib for the -c option, lzo for the -l option, or snappy for the -p option.
-d (dump_level): excludes pages so that they are not copied to the dump file.
--message-level: specifies the message types. You can restrict the printed output by specifying message_level with this option. For example, specifying 7 as message_level prints the progress indicator, common messages, and error messages. The maximum value of message_level is 31.
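Because message_level is a sum of bit values, it can help to decode a value before putting it in the configuration. The following Python sketch does that. The bit assignments are an assumption based on the table in the makedumpfile(8) man page, so verify them against the man page on your system; the function name is illustrative.

```python
# Bit values of --message-level, assumed from the makedumpfile(8) man page.
MESSAGE_TYPES = {
    1: "progress indicator",
    2: "common message",
    4: "error message",
    8: "debug message",
    16: "report message",
}


def decode_message_level(level):
    """Return the message types that a --message-level value enables."""
    if not 0 <= level <= 31:
        raise ValueError("message_level must be between 0 and 31")
    return [name for bit, name in MESSAGE_TYPES.items() if level & bit]


print(decode_message_level(1))  # ['progress indicator']
print(decode_message_level(7))
# ['progress indicator', 'common message', 'error message']
```

Under this assumption, the value 1 used in the syntax line above limits the output to the progress indicator, and 31 enables every message type.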
Prerequisites
You have root permissions on the system.
You have fulfilled kdump requirements for configurations and targets. For details, see Supported kdump configurations and targets.
Procedure
As the root user, edit the /etc/kdump.conf configuration file and remove the hash sign (#) from the beginning of the #core_collector makedumpfile -l --message-level 1 -d 31 line.
Ensure that the line reads as follows to enable crash dump file compression:
core_collector makedumpfile -l --message-level 1 -d 31
The -l option compresses the dump file by using lzo. The -d option specifies the dump level as 31. The --message-level option specifies the message level as 1.
Also, consider the following examples with the -c and -p options:
To compress a crash dump file by using -c:
core_collector makedumpfile -c -d 31 --message-level 1
To compress a crash dump file by using -p:
core_collector makedumpfile -p -d 31 --message-level 1
Additional resources
makedumpfile(8) man page on your system
Configuration file for kdump
14.5. Configuring the kdump default failure responses
By default, when kdump fails to create a crash dump file at the configured target location, the system reboots and the dump is lost in the process. You can change the default failure response and configure kdump to perform a different operation when it fails to save the core dump to the primary target. The additional actions are:
dump_to_rootfs: Saves the core dump to the root file system.
reboot: Reboots the system, losing the core dump in the process.
halt: Stops the system, losing the core dump in the process.
poweroff: Powers the system off, losing the core dump in the process.
shell: Runs a shell session from within the initramfs, so that you can record the core dump manually.
final_action: Enables additional operations such as reboot, halt, and poweroff after a successful kdump or when the shell or dump_to_rootfs failure action completes. The default is reboot.
failure_action: Specifies the action to perform when the dump fails after a kernel crash. The default is reboot.
Prerequisites
You have root permissions on the system.
You have fulfilled kdump requirements for configurations and targets. For details, see Supported kdump configurations and targets.
Procedure
As a root user, remove the hash sign (#) from the beginning of the #failure_action line in the /etc/kdump.conf configuration file.
Replace the value with a required action.
failure_action poweroff
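A mistyped action value is easy to miss until a dump actually fails. As a minimal sketch, the following Python function checks failure_action and final_action lines against the values listed in this section. The function name and the error wording are hypothetical, and the check is not part of kdump.

```python
# Values listed in this section for each directive.
FAILURE_ACTIONS = {"dump_to_rootfs", "reboot", "halt", "poweroff", "shell"}
FINAL_ACTIONS = {"reboot", "halt", "poweroff"}


def check_actions(conf_text):
    """Return a list of problems found in failure_action/final_action lines."""
    allowed = {"failure_action": FAILURE_ACTIONS, "final_action": FINAL_ACTIONS}
    problems = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        # Ignore comments, so a line such as "#failure_action shell" is skipped.
        fields = line.split("#", 1)[0].split()
        if not fields or fields[0] not in allowed:
            continue
        value = fields[1] if len(fields) > 1 else ""
        if len(fields) != 2 or value not in allowed[fields[0]]:
            problems.append(f"line {number}: invalid {fields[0]} value {value!r}")
    return problems


print(check_actions("failure_action poweroff\nfinal_action halt\n"))  # []
```

To check a real configuration, pass the contents of /etc/kdump.conf to the function; an empty list means that no problem was found in these two directives.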
Additional resources
Configuring the kdump target
14.6. Configuration file for kdump
The configuration file for the kdump kernel is /etc/sysconfig/kdump. This file controls the kdump kernel command line parameters. For most configurations, use the default options. However, in some scenarios you might need to modify certain parameters to control the kdump kernel behaviour. For example, you can modify the KDUMP_COMMANDLINE_APPEND option to append arguments to the kdump kernel command line to obtain detailed debugging output, or the KDUMP_COMMANDLINE_REMOVE option to remove arguments from the kdump command line.
KDUMP_COMMANDLINE_REMOVE
This option removes arguments from the current kdump command line. It removes parameters that can cause kdump errors or kdump kernel boot failures. These parameters might have been parsed from the previous KDUMP_COMMANDLINE process or inherited from the /proc/cmdline file.
When this variable is not configured, it inherits all values from the /proc/cmdline file. Configuring this option also provides information that is helpful in debugging an issue.
To remove certain arguments, add them to KDUMP_COMMANDLINE_REMOVE as follows:
KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swiotlb"
KDUMP_COMMANDLINE_APPEND
This option appends arguments to the current command line. These arguments might have been parsed by the previous KDUMP_COMMANDLINE_REMOVE variable.
For the kdump kernel, disabling certain modules such as mce, cgroup, numa, hest_disable can help prevent kernel errors. These modules can consume a significant part of the kernel memory reserved for kdump or cause kdump kernel boot failures.
To disable memory cgroups on the kdump kernel command line, set the variable as follows:
KDUMP_COMMANDLINE_APPEND="cgroup_disable=memory"
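The combined effect of the two variables can be pictured as a remove step followed by an append step. The Python sketch below illustrates that idea only; it does not reproduce the actual logic of the kdump scripts, the function name and the sample command line are made up, and it assumes that no argument contains quoted spaces.

```python
def build_kdump_cmdline(current, remove="", append=""):
    """Drop the arguments named in remove, then add the append arguments."""
    names = set(remove.split())
    kept = [
        arg for arg in current.split()
        # Compare only the name, so both "quiet" and "hugepages=64" match.
        if arg.split("=", 1)[0] not in names
    ]
    return " ".join(kept + append.split())


# Hypothetical contents of /proc/cmdline, for illustration only.
cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/mapper/rhel-root ro quiet hugepages=64 crashkernel=256M"
print(build_kdump_cmdline(cmdline, remove="hugepages quiet",
                          append="cgroup_disable=memory"))
# BOOT_IMAGE=/vmlinuz root=/dev/mapper/rhel-root ro crashkernel=256M cgroup_disable=memory
```

Matching on the argument name is why the KDUMP_COMMANDLINE_REMOVE example lists bare names such as hugepages without a value.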
Additional resources
The Documentation/admin-guide/kernel-parameters.txt file
The /etc/sysconfig/kdump file
14.7. Testing the kdump configuration
After configuring kdump, you must manually test a system crash and ensure that the vmcore file is generated in the defined kdump target. The vmcore file is captured from the context of the freshly booted kernel. Therefore, vmcore has critical information for debugging a kernel crash.
Warning
Do not test kdump on active production systems. The commands to test kdump will cause the kernel to crash with loss of data. Depending on your system architecture, ensure that you schedule significant maintenance time because kdump testing might require several reboots with a long boot time.
If the vmcore file is not generated during the kdump test, identify and fix the issues before you run the test again.
If you make any manual system modifications, you must test the kdump configuration at the end of each modification. For example, test the kdump configuration after any of the following changes:
Package upgrades.
Hardware level changes, for example, storage or networking changes.
Firmware upgrades.
New installation and application upgrades that include third party modules.
Memory added through the hot-plugging mechanism, on hardware that supports this mechanism.
Changes to the /etc/kdump.conf or /etc/sysconfig/kdump file.
Prerequisites
You have root permissions on the system.
You have saved all important data. The commands to test kdump cause the kernel to crash with loss of data.
You have scheduled significant machine maintenance time depending on the system architecture.
Procedure
Enable the kdump service:
# kdumpctl restart
Check the status of the kdump service by using kdumpctl:
# kdumpctl status
kdump: Kdump is operational
Optionally, if you use the systemctl command, the output prints in the systemd journal.
Start a kernel crash to test the kdump configuration. The sysrq-trigger key combination causes the kernel to crash and might reboot the system if required.
# echo c > /proc/sysrq-trigger
On a kernel reboot, the address-YYYY-MM-DD-HH:MM:SS/vmcore file is created at the location you have specified in the /etc/kdump.conf file. The default is /var/crash/.
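When several test crashes accumulate under the target directory, the address-YYYY-MM-DD-HH:MM:SS naming pattern makes it possible to pick out the most recent dump. The Python sketch below parses that pattern. The function names and the sample directory names are illustrative assumptions, and the address part depends on your dump target.

```python
import re
from datetime import datetime

# Directory name pattern described above: address-YYYY-MM-DD-HH:MM:SS
DUMP_DIR = re.compile(
    r"(?P<address>.+)-(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2})"
)


def parse_dump_dir(name):
    """Split a dump directory name into (address, datetime)."""
    match = DUMP_DIR.fullmatch(name)
    if match is None:
        raise ValueError(f"not a kdump directory name: {name!r}")
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d-%H:%M:%S")
    return match.group("address"), stamp


def newest_dump(names):
    """Return the directory name with the latest timestamp."""
    return max(names, key=lambda name: parse_dump_dir(name)[1])


# Made-up directory names, for illustration only.
names = ["127.0.0.1-2024-05-01-10:15:30", "127.0.0.1-2024-05-03-08:00:00"]
print(newest_dump(names))  # 127.0.0.1-2024-05-03-08:00:00
```

The vmcore file for the selected crash is then the vmcore entry inside that directory.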
Additional resources
Configuring the kdump target
14.8. Files produced by kdump after system crash
After your system crashes, the kdump service captures the kernel memory in a dump file (vmcore) and it also generates additional diagnostic files to aid in troubleshooting and postmortem analysis.
Files produced by kdump:
vmcore - main kernel memory dump file containing system memory at the time of the crash. It includes data as per the configuration of the core_collector program specified in the kdump configuration. By default, it contains the kernel data structures, process information, stack traces, and other diagnostic information.
vmcore-dmesg.txt - contents of the kernel ring buffer log (dmesg) from the primary kernel that panicked.
kexec-dmesg.log - kernel and system log messages from the execution of the secondary kexec kernel that collects the vmcore data.
Additional resources
What is the kernel ring buffer
14.9. Enabling and disabling the kdump service
You can enable or disable the kdump functionality on a specific kernel or on all installed kernels. You must routinely test the kdump functionality and validate that it operates correctly.
Prerequisites
You have root permissions on the system.
You have completed kdump requirements for configurations and targets. See Supported kdump configurations and targets.
All configurations for installing kdump are set up as required.
Procedure
Enable the kdump service for multi-user.target:
# systemctl enable kdump.service
Start the service in the current session:
# systemctl start kdump.service
Stop the kdump service:
# systemctl stop kdump.service
Disable the kdump service:
# systemctl disable kdump.service
Warning
It is recommended to set kptr_restrict=1 as default. When kptr_restrict is set to 1 as default, the kdumpctl service loads the crash kernel regardless of whether Kernel Address Space Layout Randomization (KASLR) is enabled.
If kptr_restrict is not set to 1 and KASLR is enabled, the contents of the /proc/kcore file are generated as all zeros. The kdumpctl service fails to access the /proc/kcore file and load the crash kernel. The kexec-kdump-howto.txt file displays a warning message, which recommends that you set kptr_restrict=1. Verify the following in the sysctl.conf file to ensure that the kdumpctl service loads the crash kernel:
The kernel.kptr_restrict=1 setting is present in the sysctl.conf file.
14.10. Preventing kernel drivers from loading for kdump
You can prevent the capture kernel from loading certain kernel drivers by setting the KDUMP_COMMANDLINE_APPEND= variable in the /etc/sysconfig/kdump configuration file. By using this method, you can prevent the kdump initial RAM disk image initramfs from loading the specified kernel module. This helps to prevent out-of-memory (OOM) killer errors or other crash kernel failures.
You can set the KDUMP_COMMANDLINE_APPEND= variable by using one of the following configuration options:
rd.driver.blacklist=<modules>
modprobe.blacklist=<modules>
Prerequisites
You have root permissions on the system.
Procedure
Display the list of modules that are loaded to the currently running kernel. Select the kernel module that you intend to block from loading:
$ lsmod
Module                  Size  Used by
fuse                  126976  3
xt_CHECKSUM            16384  1
ipt_MASQUERADE         16384  1
uinput                 20480  1
xt_conntrack           16384  1
Update the KDUMP_COMMANDLINE_APPEND= variable in the /etc/sysconfig/kdump file. For example:
KDUMP_COMMANDLINE_APPEND="rd.driver.blacklist=hv_vmbus,hv_storvsc,hv_utils,hv_netvsc,hid-hyperv"
Also, consider the following example by using the modprobe.blacklist=<modules> configuration option:
KDUMP_COMMANDLINE_APPEND="modprobe.blacklist=emcp modprobe.blacklist=bnx2fc modprobe.blacklist=libfcoe modprobe.blacklist=fcoe"
Restart the kdump service:
# systemctl restart kdump
Additional resources
dracut.cmdline man page on your system.
14.11. Running kdump on systems with encrypted disk
When you run a LUKS encrypted partition, the system requires a certain amount of available memory. If the system has less than the required amount of available memory, the cryptsetup utility fails to mount the partition. As a result, capturing the vmcore file to an encrypted target location fails in the second kernel (capture kernel).
The kdumpctl estimate command helps you estimate the amount of memory you need for kdump. kdumpctl estimate prints the recommended crashkernel value, which is the most suitable memory size required for kdump.
The recommended crashkernel value is calculated based on the current kernel size, kernel module, initramfs, and the LUKS encrypted target memory requirement.
If you are using the custom crashkernel= option, kdumpctl estimate prints the LUKS required size value. The value is the memory size required for LUKS encrypted target.
Procedure
Print the estimated crashkernel= value:
# kdumpctl estimate
Encrypted kdump target requires extra memory, assuming using the keyslot with minimum memory requirement
Reserved crashkernel: 256M
Recommended crashkernel: 652M
Kernel image size: 47M
Kernel modules size: 8M
Initramfs size: 20M
Runtime reservation: 64M
LUKS required size: 512M
Large modules: <none>
WARNING: Current crashkernel size is lower than recommended size 652M.
Configure the amount of required memory by increasing the crashkernel= value.
Reboot the system.
Note
If the kdump service still fails to save the dump file to the encrypted target, increase the crashkernel= value as required.
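To see how far the current reservation falls short of the recommendation, you can compare the two values from the report. The Python sketch below parses the relevant lines and computes the difference. It assumes that the output keeps the label: <number>M layout of the sample above, which might differ between versions; the function names are illustrative.

```python
import re

# Lines copied from the sample kdumpctl estimate output above.
SAMPLE = """\
Reserved crashkernel:    256M
Recommended crashkernel: 652M
LUKS required size:      512M
"""


def parse_estimate(output):
    """Collect 'label: <number>M' lines into a dict of label -> MiB."""
    values = {}
    for line in output.splitlines():
        match = re.fullmatch(r"\s*([^:]+):\s*(\d+)M\s*", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def shortfall_mib(output):
    """Return how many MiB the reservation is below the recommendation."""
    values = parse_estimate(output)
    return max(0, values["Recommended crashkernel"] - values["Reserved crashkernel"])


print(shortfall_mib(SAMPLE))  # 396
```

For the sample report, the reservation is 396 MB below the recommended 652 MB, which is why the report ends with a warning.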