Why does fstrim fail on Red Hat Enterprise Linux VMs on VMware hypervisors?
https://access.redhat.com/solutions/773263
Solution Verified - Updated August 5 2024 at 7:23 AM - English
Environment
Red Hat Enterprise Linux VM (so far, the problem has been observed in Red Hat Enterprise Linux 6 and 7; in theory, it can affect all versions)
VMware hypervisors
Issue
fstrim fails on VMware guests with messages similar to the following:
On ext4 filesystems:
fstrim: /path/to/mountpoint: the discard operation is not supported
fstrim: /path/to/mountpoint: FITRIM ioctl failed: Input/output error
fstrim: /path/to/mountpoint: FITRIM ioctl failed: Operation not supported
On XFS filesystems:
fstrim: /path/to/mountpoint: the discard operation is not supported
fstrim: /path/to/mountpoint: FITRIM ioctl failed: Input/output error
The exact error message depends on the filesystem type, system state and version.
Resolution
For Trim/Discard to work, a series of external components needs to support it. This includes the storage used by the hypervisor and the hypervisor software:
There are version dependencies for VMware hypervisors, in order to properly support discards, depending on the features that are used. For more details, the hypervisor documentation and vendor need to be consulted.
Similarly, for the backing storage of the hypervisor, the corresponding documentation and vendor need to be consulted to ensure that the storage properly supports discards.
Within the VM, the discard needs to be properly handled by both the disk devices (typically /dev/sdX) and any layers built by device mapper on top of the disks (e.g. LVs).
Once the prerequisites from the hypervisor and the backing storage are properly configured, then a reboot of the VM is expected to auto detect the correct parameters and discards are expected to work properly. This is the simplest and fastest resolution, but it requires downtime.
If the VM cannot be rebooted after the hypervisor and the backing storage have been properly configured, then it is necessary to manually review the point at which discards fail, in order to determine what steps are necessary.
In some cases, the disks and device mapper structures are not properly set up in the VM because the VM has not been rebooted after the configuration on the hypervisor was updated. In such situations, manually setting the correct parameters on the disks and updating the device mapper maps may allow discards to succeed.
An example of such a situation, along with potential workarounds is:
The following output is the result of a previously failed fstrim command. The failure has triggered disabling of discards on both /dev/sdd and /vg_name/lv_name (See the Root Cause section for more details on how this can happen).
lsblk -s --output NAME,MAJ:MIN,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,MOUNTPOINT /dev/vg_name/lv_name
NAME            MAJ:MIN DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO MOUNTPOINT
vg_name-lv_name 253:2          0      512B       0B         0 /path/to/mountpoint
└─sdd           8:48           0      512B       0B         0
cat /sys/block/sdd/device/scsi_disk/3:0:0:2/provisioning_mode
disabled
After correcting the misconfiguration that was causing the hypervisor to reject discard commands, it is possible to configure sdd to use a specific command for discards. This requires knowing which command the hypervisor supports for discards. If the hypervisor supports Unmap commands for discards, then provisioning_mode can be set to “unmap” using:
echo -n unmap > /sys/block/sdd/device/scsi_disk/3:0:0:2/provisioning_mode
cat /sys/block/sdd/device/scsi_disk/3:0:0:2/provisioning_mode
unmap
lsblk -s --output NAME,MAJ:MIN,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,MOUNTPOINT /dev/vg_name/lv_name
NAME            MAJ:MIN DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO MOUNTPOINT
vg_name-lv_name 253:2          0      512B       0B         0 /path/to/mountpoint
└─sdd           8:48           0      512B      32M         0
sdd is now using unmap for discards. However, the map for vg_name/lv_name still does not accept discards, so fstrim still fails.
Updating the LV map is also necessary. The straightforward way to do this is to deactivate and reactivate the LV. However, this requires unmounting the filesystem and therefore downtime.
umount /path/to/mountpoint
lvchange -an vg_name/lv_name
lvchange -ay vg_name/lv_name
mount /dev/vg_name/lv_name /path/to/mountpoint
Alternatively, a change in the structure of the LV forces LVM to update the map. For example, extending the LV, even by a single extent (lvextend -l+1 vg_name/lv_name), causes a map update that resets the discard parameters based on the underlying PV.
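The examples above hard-code the SCSI address 3:0:0:2 in the sysfs path. The following is a minimal sketch, not part of the original solution, that lists the provisioning mode of a disk without knowing its H:B:T:L address in advance. The function name show_provisioning_modes and its optional second argument (an alternative sysfs root, used only to try the function against a test directory) are assumptions made for illustration.

```shell
# Print "<H:B:T:L> <provisioning_mode>" for every scsi_disk entry of a disk.
# $1: disk name, e.g. sdd
# $2: sysfs root (hypothetical parameter for testing; defaults to /sys)
show_provisioning_modes() {
    local disk=$1 root=${2:-/sys} f
    for f in "$root"/block/"$disk"/device/scsi_disk/*/provisioning_mode; do
        # Skip the unexpanded glob pattern when no entry exists.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}
```

For the situation shown earlier, show_provisioning_modes sdd would be expected to print "3:0:0:2 disabled" before the change and "3:0:0:2 unmap" after it.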
Root Cause
Such failures are typically caused by inconsistent information reported by the virtual disks at the moment the VM boots (or when the disks are presented to the VM). Such inconsistent information can lead to wrong limits being set up for discards, or to the use of a SCSI command that is not supported by the virtual disks, triggering a failure. This is frequently caused by older hypervisor versions. For exact versions and updates required to support discards, the hypervisor documentation and the hypervisor vendor need to be consulted.
When the kernel in the VM detects certain types of failures, it disables discards on the devices that failed. Subsequent attempts then fail only because discards have been disabled, without any request being issued to the virtual disks.
This automatic disabling of discards happens during the completion of the discard request (in function sd_done) when the failure is caused by sense data matching “Illegal Request/Invalid command operation code” or “Illegal Request/Invalid field in cdb”.
The code disabling discards in Red Hat Enterprise Linux 7 can be seen below:
static int sd_done(struct scsi_cmnd *SCpnt)
{
…
    switch (sshdr.sense_key) {
…
    case ILLEGAL_REQUEST:
        if (sshdr.asc == 0x10)  /* DIX: Host detected corruption */
            good_bytes = sd_completed_bytes(SCpnt);
        /* INVALID COMMAND OPCODE or INVALID FIELD IN CDB */
        if (sshdr.asc == 0x20 || sshdr.asc == 0x24) {
            switch (op) {
            case UNMAP:
                sd_config_discard(sdkp, SD_LBP_DISABLE);
                break;
            case WRITE_SAME_16:
            case WRITE_SAME:
                if (unmap)
                    sd_config_discard(sdkp, SD_LBP_DISABLE);
                else {
…
Configuration changes and updates on the hypervisor to provide support for discards will not automatically change parameters within the VM. It is possible that the disk configuration in the VM will still be inconsistent after the changes made on the hypervisor if the VM is not rebooted as well.
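The decision taken in the sd_done excerpt can be restated as a small lookup. This is only an illustrative sketch of the branches visible in that excerpt, assuming the sense key is already Illegal Request; the function name sd_done_disables_discard and its arguments are hypothetical, and the branch that the excerpt elides is reported as "unknown" rather than guessed.

```shell
# $1: additional sense code (ASC), e.g. 0x24
# $2: operation: UNMAP, WRITE_SAME or WRITE_SAME_16
# $3: 1 if the Write Same command carried the unmap flag, else 0
# Assumes the sense key is ILLEGAL_REQUEST.
sd_done_disables_discard() {
    local asc=$(( $1 )) op=$2 unmap=${3:-0}
    # Only ASC 0x20 (invalid command opcode) and 0x24 (invalid field in CDB)
    # reach the code that disables discards.
    if [ "$asc" -ne $(( 0x20 )) ] && [ "$asc" -ne $(( 0x24 )) ]; then
        echo no
        return
    fi
    case "$op" in
        UNMAP)
            echo yes ;;
        WRITE_SAME|WRITE_SAME_16)
            # The else branch is elided in the excerpt, so it is not modeled.
            if [ "$unmap" -eq 1 ]; then echo yes; else echo unknown; fi ;;
        *)
            echo no ;;
    esac
}
```

For example, sd_done_disables_discard 0x24 WRITE_SAME_16 1 prints "yes", while an Unmap rejected with ASC 0x26 (Invalid field in parameter list) prints "no", because that code is not one of the two tested in the excerpt.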
Diagnostic Steps
Review and confirm, along with the hypervisor vendor and the vendor of the backing storage, that they both support discards and that they are properly configured for this purpose.
Review what disks report within the VM. This can be done with direct queries to the disks. A list of these queries is:
sg_inq -vvvv /dev/sdX
sg_inq -H /dev/sdX
sg_readcap --16 -vvvv /dev/sdX
sg_readcap --16 -H /dev/sdX
sg_vpd -vvvv -p lbpv /dev/sdX
sg_vpd -H -p lbpv /dev/sdX
sg_vpd -vvvv -p bl /dev/sdX
sg_vpd -H -p bl /dev/sdX
Note: Each query appears twice above, once producing human-readable output and once with the -H flag, which prints the raw bytes in hexadecimal format.
Note: The values returned by these commands are reported directly by the disk. Any discrepancies originate from lower layers (from within the VM, it is impossible to tell whether they originate from the hypervisor or from the backing storage).
The Read Capacity(16) (sg_readcap --16) command reveals whether logical block provisioning is supported. If the LBPME bit is set, then the device is expected to support either the Unmap command or a variant of the Write Same commands with the unmap bit set.
The Logical Block Provisioning VPD page (sg_vpd -p lbpv) provides more information on the commands that are supported. Its output is self-descriptive. If one of the Write Same variants with the unmap bit is reported to be supported and the Logical block provisioning read zeros (LBPRZ) bit is set, then the Write Same SCSI command (with the unmap flag) will be the preferred command for discards. Not all disks will support the Logical Block Provisioning VPD page.
The Block Limits VPD page (sg_vpd -p bl) reports the maximum number of blocks that can be specified for discard within the Unmap/Write Same commands mentioned above.
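The LBPME check described above can be scripted. The sketch below assumes that sg_readcap --16 prints a line containing "lbpme=1" or "lbpme=0", which may differ between sg3_utils versions; the function name lbpme_state is hypothetical.

```shell
# Reads the output of "sg_readcap --16 /dev/sdX" on stdin and prints
# "set", "clear" or "unknown" depending on the reported LBPME bit.
# Assumption: the tool prints the bit as "lbpme=<0|1>".
lbpme_state() {
    local out
    out=$(cat)
    case "$out" in
        *lbpme=1*) echo set ;;
        *lbpme=0*) echo clear ;;
        *)         echo unknown ;;
    esac
}
```

A possible use is: sg_readcap --16 /dev/sdX | lbpme_state. A result of "unknown" means the expected field was not found and the raw output should be inspected by hand.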
Review what is reported by the kernel as the interpretation of the values reported by the disks:
cat /sys/block/sdX/device/scsi_disk/H:B:T:L/thin_provisioning
cat /sys/block/sdX/device/scsi_disk/H:B:T:L/provisioning_mode
lsblk -s --output NAME,MAJ:MIN,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,MOUNTPOINT
thin_provisioning and provisioning_mode reveal whether the disk is considered thin provisioned and which command will be used for discards. VMware hypervisors mostly support the Unmap command.
The output of lsblk reveals the limits set for discards. If, for some device, DISC-MAX is 0 then it is an indication that discards are disabled for this device.
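The DISC-MAX check can be automated across all block devices. This is a minimal sketch, assuming lsblk is run in raw mode without headings so that each line contains only a device name and its DISC-MAX value; the function name discard_disabled_devices is hypothetical.

```shell
# Reads "NAME DISC-MAX" pairs on stdin and prints the names whose DISC-MAX
# is zero, i.e. the devices on which discards appear to be disabled.
# Accepts both the human-readable "0B" and the byte-count "0" forms.
discard_disabled_devices() {
    awk '$2 == "0" || $2 == "0B" { print $1 }'
}

# Intended use (requires lsblk from util-linux):
#   lsblk --raw --noheadings --output NAME,DISC-MAX | discard_disabled_devices
```

With the values from the earlier example, both vg_name-lv_name and sdd would be listed before the provisioning mode is corrected, and only vg_name-lv_name afterwards.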
Depending on the command-set that the disks support, the hdparm command can also provide information on discard support (mostly for devices supporting SATA commands):
hdparm -I /dev/sd |grep -i trim
Supporting hardware will return output similar to the following:
hdparm -I /dev/sdb |grep -i trim
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read data after TRIM
Review system logs while running a failing fstrim command. Depending on the way fstrim fails, details about failed SCSI commands related to fstrim may appear.
If there are no messages printed in logs when fstrim fails, this is an indication that it failed before a request was sent to the disk, most likely because discards were disabled by an earlier failure.
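When such messages are present, the relevant lines carry a "CDB:" field naming the failed command, as in the log samples that follow. The sketch below, with the hypothetical function name failed_discard_devices, extracts the disks that logged a failed Unmap or Write Same command; it assumes the log line layout matches those samples.

```shell
# Reads kernel log lines on stdin and prints the unique sdX names that
# appear on lines carrying an Unmap or Write Same CDB.
failed_discard_devices() {
    grep -E 'CDB: (Unmap|Write same)' \
        | sed -n 's/.*\[\(sd[a-z]*\)\].*/\1/p' \
        | sort -u
}
```

It can be fed from any source of kernel messages, for example dmesg | failed_discard_devices. An empty result is consistent with the case described above, where fstrim fails before any request reaches the disk.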
The following output reveals a Write Same(16) command failing. The error reported (Illegal Request/Invalid field in cdb) is a strong indication that the Write Same(16) command with the unmap flag is not supported by the disk. However, this error is returned by the disk (the hypervisor). Therefore, its cause needs to be investigated on the hypervisor.
kernel: sd 2:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
kernel: sd 2:0:5:0: [sdf] Sense Key : Illegal Request [current]
kernel: sd 2:0:5:0: [sdf] Add. Sense: Invalid field in cdb
kernel: sd 2:0:5:0: [sdf] CDB: Write same(16) 93 08 00 00 00 00 19 24 07 58 00 01 00 00 00 00
This sense data causes the kernel to disable discards on disk sdf. Any later fstrim command fails without issuing a Write same(16) request, only because discards have already been disabled.
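The CDB printed in the log can be decoded by hand. The following sketch assumes the standard WRITE SAME(16) layout (opcode 0x93 in byte 0, the UNMAP bit as bit 3 of byte 1, the starting LBA in bytes 2-9 and the number of blocks in bytes 10-13); the function name decode_write_same16 is hypothetical.

```shell
# Decodes a Write Same(16) CDB given as 16 two-digit hex bytes.
# Prints the unmap flag, the starting LBA and the number of blocks.
decode_write_same16() {
    [ "$#" -eq 16 ] || { echo "expected 16 CDB bytes" >&2; return 1; }
    [ "$1" = "93" ] || { echo "not a Write Same(16) CDB" >&2; return 1; }
    local unmap=$(( (0x$2 >> 3) & 1 ))          # byte 1, bit 3
    local lba=$(( 0x$3$4$5$6$7$8$9${10} ))      # bytes 2-9, big endian
    local blocks=$(( 0x${11}${12}${13}${14} ))  # bytes 10-13, big endian
    echo "unmap=$unmap lba=$lba blocks=$blocks"
}

decode_write_same16 93 08 00 00 00 00 19 24 07 58 00 01 00 00 00 00
# prints: unmap=1 lba=421791576 blocks=65536
```

For the logged command, this shows a Write Same(16) with the unmap flag set, covering 65536 blocks starting at LBA 421791576.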
The following output reveals an Unmap command failing. The error reported (Illegal Request/Invalid field in parameter list) is an indication that the disk didn’t accept the description of the area requested by the discard command. As in the case of the Write Same(16) commands above, this is a reply sent from the disk (the hypervisor) and therefore details on the cause of this failure can only be found on the hypervisor.
kernel: sd 34:0:5:0: [sdi] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
kernel: sd 34:0:5:0: [sdi] Sense Key : Illegal Request [current]
kernel: sd 34:0:5:0: [sdi] Add. Sense: Invalid field in parameter list
kernel: sd 34:0:5:0: [sdi] CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
In cases where Unmap requests are rejected by the target and it is necessary to review the request being sent (e.g. to see whether Unmap requests more segments than the target allows, or a bigger block than the target allows), this can be done with the attached print_unmap.stp systemtap script. To use this script, the system needs to be prepared according to “What is SystemTap and how to use it?”. For Write Same(16) requests, the CDB printed in system logs is enough to provide the full command description, so no additional steps are needed to reveal the request sent to the target.
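The Unmap CDB itself reveals only how much parameter data was sent, not its content. The sketch below assumes the standard UNMAP layout (opcode 0x42, parameter list length in bytes 7-8, an 8-byte parameter list header followed by 16-byte block descriptors) and derives the number of descriptors; the function name unmap_descriptor_count is hypothetical.

```shell
# Takes an Unmap CDB as 10 two-digit hex bytes and prints how many
# 16-byte block descriptors fit in the announced parameter list.
unmap_descriptor_count() {
    [ "$#" -eq 10 ] || { echo "expected 10 CDB bytes" >&2; return 1; }
    [ "$1" = "42" ] || { echo "not an Unmap CDB" >&2; return 1; }
    local len=$(( 0x$8$9 ))     # bytes 7-8: parameter list length
    if [ "$len" -lt 8 ]; then   # shorter than the header: no descriptors
        echo 0
        return
    fi
    echo $(( (len - 8) / 16 ))
}

unmap_descriptor_count 42 00 00 00 00 00 00 00 18 00
# prints: 1
```

The logged request therefore carries a single block descriptor, but the LBA and block count inside that descriptor are not visible in the CDB, which is why a tool such as the systemtap script is needed to inspect them.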
Attachments
print_unmap.stp