FSTRIM/discards on SAN storage array causing extreme IO issues
https://access.redhat.com/solutions/3669411
Solution Unverified - Updated November 10 2023 at 4:39 PM - English

Environment

Red Hat Enterprise Linux 9
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 7

Issue

FSTRIM/discards can cause severe performance hits to database I/O.
Resolution

Engage your storage vendor for assistance.
The common cause of this issue is storage that reports back a very large maximum discard size (see Diagnostic Steps for an example where the storage reports a maximum discard size of 2G). Each individual discard I/O (UNMAP or WRITE SAME) is allowed to be as large as this value provided by storage. Very large discards sent to storage often result in very long storage hardware latencies, which affect not only other discards but all other I/O as well.
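In addition to lsblk -D (shown in Diagnostic Steps), the advertised limits can be read directly from sysfs. The sketch below assumes the standard kernel queue attributes discard_max_hw_bytes (the limit reported by the hardware) and discard_max_bytes (the value the kernel actually uses); numfmt is only used here to make the byte counts human-readable:

```shell
#!/bin/sh
# Print the advertised maximum discard size for every block device.
# Devices without discard attributes are skipped silently.
for q in /sys/block/*/queue; do
    dev=$(basename "$(dirname "$q")")
    hw=$(cat "$q/discard_max_hw_bytes" 2>/dev/null) || continue
    cur=$(cat "$q/discard_max_bytes" 2>/dev/null) || continue
    printf '%s: hardware max %s, current max %s\n' \
        "$dev" "$(numfmt --to=iec "$hw")" "$(numfmt --to=iec "$cur")"
done
```

On the example array from this article, a device would show a hardware max of 2.0G, since 2147483648 bytes is 2 GiB.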
This type of issue was discussed upstream, and any kernel changes to ignore, reject, or otherwise manipulate the values returned by storage hardware in order to avoid these problems were rejected. The consensus was that the hardware should return a reasonable maximum discard I/O size, one that does not cause very large storage hardware latencies when used.
To avoid huge latency issues due to poor hardware storage design choices, the /sys/block/<device>/queue/discard_max_bytes value can be lowered so that each discard request sent to the device is capped at a smaller size.
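As a sketch of that workaround, the tunable can be lowered at runtime and reapplied at boot via a udev rule. The 128 MiB cap and the rule filename below are example values only, not a Red Hat recommendation; the right value depends on your array, so confirm it with your storage vendor:

```shell
#!/bin/sh
# Cap individual discards on sda at 128 MiB (example value only; run as root).
attr=/sys/block/sda/queue/discard_max_bytes
bytes=$((128 * 1024 * 1024))        # 134217728

if [ -w "$attr" ]; then
    echo "$bytes" > "$attr"
    echo "sda discard_max_bytes is now $(cat "$attr")"
fi

# The setting does not persist across reboots. A udev rule such as
# /etc/udev/rules.d/99-discard-max.rules (hypothetical filename) can reapply it:
#   ACTION=="add|change", KERNEL=="sda", \
#       ATTR{queue/discard_max_bytes}="134217728"
```

Note that the kernel will silently clamp any value written here to the hardware limit in discard_max_hw_bytes, so the effective cap can only be lowered, not raised.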
Note that some storage vendors state that they have no recommended maximum discard size, leaving it up to individual users to set a value that works best for them. This issue is reported in the bugs below:
Bug 1643824 - Severe performance problems when running fstrim to a large LUN (closed as NOTABUG)
Bug 2160828 - RFE - Request for optimal discard size parameter (closed due to a refusal from the upstream kernel block/SCSI maintainers; the thread can be found at [PATCH] block: set reasonable default for discard max)

After discussion, the upstream maintainers concluded that this is a hardware issue: it is the hardware's responsibility to export a reasonable maximum discard value. Since the bugs above are private, please contact Red Hat Support for full details on them. It all distills back to the fact that when the Linux code uses the maximum discard values provided by storage hardware, the storage hardware ends up having large performance issues, and the solution is for storage vendors to choose a better maximum discard size that prevents such problems. Trying to second-guess what that better value should be in some automatic way within the kernel could guess wrong for vendors whose hardware has high-performance discard capability and no such issues.

Root Cause

Depending on how a device handles them internally, huge discards can introduce massive latencies (hundreds of milliseconds) on the device side.
Diagnostic Steps

The command lsblk -D will show discard_max_bytes under the DISC-MAX column:

Raw

lsblk -D
NAME          DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda                  0      512B       2G         0
├─sda1               0      512B       2G         0
└─sda2               0      512B       2G         0
  ├─rhel-root        0      512B       2G         0
  ├─rhel-swap        0      512B       2G         0
  └─rhel-home        0      512B       2G         0
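If you need to demonstrate the latency impact to your storage vendor, a trim workload generator can reproduce it on demand. The sketch below is a hypothetical fio job file (the device path, sizes, and counts are placeholders; fio's rw=trim mode issues discard requests and reports their completion latency). A trim run destroys data on the target, so point it only at a scratch device:

```ini
; discard-latency.fio -- hypothetical example job; DESTROYS data on the target
[trim-test]
filename=/dev/sdX      ; scratch device only, never a production LUN
rw=trim                ; issue discard (TRIM/UNMAP) requests
bs=2g                  ; one discard at the advertised 2G maximum
number_ios=16          ; a handful of huge discards is enough to show latency
ioengine=libaio
direct=1
```

Comparing the reported completion latencies of this job against a run with a much smaller bs (for example 128m) should show whether huge discards are the source of the stalls.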