XFS Corruption Detected - Unmount and Run xfs_repair

https://access.redhat.com/solutions/1194613

Environment

Red Hat Enterprise Linux 10
Red Hat Enterprise Linux 9
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 7
Red Hat Enterprise Linux 6 (with the Scalable File System Add-on)
Red Hat Enterprise Linux 5.6+ (with the Scalable File System Add-on)
XFS Filesystem

Issue

An XFS filesystem has encountered corruption, and the kernel recommends a filesystem repair.

Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption detected. Unmount and run xfs_repair

XFS errors spotted in /var/log/messages:

2023-04-18T12:27:26-04:00 [kern.alert] kernel: XFS (dm-1): Metadata corruption detected at xfs_dinode_verify.part.8+0x155/0x690 [xfs], inode 0xa03d21 dinode
2023-04-18T12:27:26-04:00 [kern.alert] kernel: XFS (dm-1): Unmount and run xfs_repair

Resolution

Before starting, make sure the device is unmounted and that you have a valid backup of the filesystem. If you do not have a backup, try mounting the filesystem with the norecovery option, which skips log recovery and must be combined with a read-only mount. If the mount succeeds, you should be able to take a backup from it:
```
# mount /dev/device /path/to/mount/point -o ro,norecovery -vvv
```
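The first precondition above (the device must not be mounted) can be checked before any repair is attempted. The following is a minimal sketch and not part of the original solution: the function name `xfs_dev_is_mounted` and its optional second argument are hypothetical, and it assumes a Linux mounts table such as /proc/mounts and a `readlink` that supports `-f`.

```shell
# Hypothetical helper: succeed (exit 0) if the given block device appears
# as a mount source in a mounts table. The table defaults to /proc/mounts;
# the second argument exists only so the function can be tried on a copy.
xfs_dev_is_mounted() {
    local dev="$1"
    local table="${2:-/proc/mounts}"
    local want src rest real

    # Resolve symlinks so /dev/mapper/<name> and /dev/dm-N compare equal.
    want=$(readlink -f -- "$dev" 2>/dev/null) || want=$dev

    while read -r src rest; do
        # Skip sources that are not paths (proc, sysfs, tmpfs, ...).
        case $src in /*) ;; *) continue ;; esac
        real=$(readlink -f -- "$src" 2>/dev/null) || real=$src
        [ "$real" = "$want" ] && return 0
    done < "$table"
    return 1
}

# Example (replace /dev/device as in the commands in this article):
#   xfs_dev_is_mounted /dev/device && echo "still mounted, unmount first"
```

The check only compares mount sources, so it is a convenience and does not replace verifying the mount state by hand.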
Run the xfs_repair utility, replacing /dev/device with the block device that holds the XFS filesystem to be repaired (e.g. /dev/dm-1):
```
# xfs_repair -v /dev/device 2>&1 | tee /tmp/xfs_repair-v.out
```
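Before letting xfs_repair modify the device, it can be run in no-modify mode to preview what it would change. The sketch below is an addition and not part of the original solution. It relies on the `-n` option and on the exit status that the xfs_repair(8) manual page describes for `-n` (1 when corruption was detected, 0 when none was found). The function name `xfs_check_only`, the `XFS_REPAIR` override variable and the default output path are hypothetical.

```shell
# Hypothetical wrapper around "xfs_repair -n" (no-modify mode).
# XFS_REPAIR may name another command; it defaults to xfs_repair.
xfs_check_only() {
    local dev="$1"
    local out="${2:-/tmp/xfs_repair-n.out}"
    local rc=0

    # -n: scan and report only, never write to the device.
    "${XFS_REPAIR:-xfs_repair}" -n "$dev" > "$out" 2>&1 || rc=$?

    case $rc in
        0) echo "no corruption reported on $dev" ;;
        1) echo "corruption reported on $dev, see $out" ;;
        *) echo "xfs_repair -n exited with status $rc, see $out" ;;
    esac
    return "$rc"
}

# Example (the device must be unmounted):
#   xfs_check_only /dev/device
```

Keeping the dry-run output in a file makes it possible to compare it with the output of the real repair afterwards.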
The xfs_repair utility cannot repair an XFS filesystem that has a dirty transaction log. To clear the log, mount the filesystem so that the log is replayed, then unmount it again before re-running xfs_repair:
```
# mount /dev/device /path/to/mount/point
# umount /path/to/mount/point
```
If the log is corrupt and cannot be replayed, use the -L option to clear the log. For more details, please refer to: Unable to mount or check XFS filesystem.
```
# xfs_repair -Lv /dev/device |& tee /tmp/xfs_repair-Lv.out
```
Important: Forcibly clearing the log may result in further corruption or data loss, so make sure that a backup is available before running xfs_repair with the -L option.
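The choice between replaying the log and clearing it with -L can be tied to the exit status of xfs_repair. The sketch below is an assumption-laden addition: it follows the exit statuses described in the xfs_repair(8) manual page of recent xfsprogs releases (0 for success, 1 for a runtime error, 2 for a dirty log), which older releases may not implement. The function name `xfs_repair_next_step` is hypothetical.

```shell
# Hypothetical helper: translate an xfs_repair exit status into the next
# step described in this article. Assumes the status values documented in
# recent xfs_repair(8) manual pages; older versions may behave differently.
xfs_repair_next_step() {
    case "$1" in
        0) echo "repair completed" ;;
        1) echo "runtime error: run xfs_repair again" ;;
        2) echo "dirty log: mount and unmount to replay it, use -L only as a last resort" ;;
        *) echo "unexpected status $1: check the xfs_repair output" ;;
    esac
}

# Example:
#   xfs_repair -v /dev/device; xfs_repair_next_step "$?"
```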
If xfs_repair cannot repair the filesystem, a metadata image can be taken to help troubleshoot why repair was unsuccessful. For details, refer to: How to create a metadata dump from an XFS filesystem?
```
# xfs_metadump -gwa /dev/device /tmp/device.metadump
```
It might be helpful to include the file and directory names as well (which are obfuscated by default) in the metadata dump, in which case the command will look like this:
```
# xfs_metadump -gwao /dev/device /tmp/device.metadump
```
Important: xfs_metadump may only be used on filesystems that are unmounted or mounted read-only.

Restore the metadata dump onto a sparse file, and then run xfs_repair on that file:

# truncate -s 10G /tmp/disk.img
# xfs_mdrestore -g /tmp/device.metadump /tmp/disk.img
# xfs_repair -fv /tmp/disk.img

Root Cause

XFS and other journaled filesystems can usually recover from system crashes by just replaying the transaction log, without the need for a filesystem check. However, if the XFS kernel driver detects that metadata on disk has been corrupted, it may request that the filesystem be repaired.

Note: To find out which device a dm-X name in the log refers to, see: How do I determine which dm-X device maps to each device mapper device on Red Hat Enterprise Linux?
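As a quick illustration of that mapping, the kernel exposes the device-mapper name of each dm-X node in sysfs. The sketch below is an addition and not taken from the referenced article: the function name `dm_name` and its optional sysfs-root argument are hypothetical, and it assumes the /sys/block/dm-X/dm/name attribute is available on the running kernel.

```shell
# Hypothetical helper: print the /dev/mapper path behind a dm-N node.
# The sysfs root is a parameter only so the function can be tried on a copy.
dm_name() {
    local node="${1#/dev/}"          # accept "dm-1" or "/dev/dm-1"
    local sysfs="${2:-/sys}"
    local name_file="$sysfs/block/$node/dm/name"

    if [ -r "$name_file" ]; then
        printf '/dev/mapper/%s\n' "$(cat "$name_file")"
    else
        echo "no device-mapper entry for $node under $sysfs" >&2
        return 1
    fi
}

# Example: dm_name dm-1
```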
An XFS filesystem can become corrupted for a variety of reasons, the most notable of which are:
- Connection failure(s) during write
- Bad hardware (intermittent hardware failure)
- Bad cables/fabric
- Power loss
- Faulty network connections
- Flapping on NIC
- Software/firmware bugs
- Incorrect file system resize operations, such as logical volume resizing
For more information, please refer to the following documents:
- Red Hat Enterprise Linux 8 Managing file systems: Chapter 13. Checking and repairing a file system
- Red Hat Enterprise Linux 7 Storage Administration Guide: Chapter 12. File System Check

Diagnostic Steps

Check for error messages from the XFS driver such as the following, for example while booting:

Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption detected. Unmount and run xfs_repair
Jan  1 12:42:12 server kernel: XFS (dm-1): Internal error xfs_trans_cancel at line 1948 of file fs/xfs/xfs_trans.c.  Caller 0xffffffffa02dd6af
Jan  1 12:42:12 server kernel: 
Jan  1 12:42:12 server kernel: Pid: 12345, comm: xxxx Not tainted 2.6.32-431.23.3.el6.x86_64 #1
Jan  1 12:42:12 server kernel: Call Trace:
Jan  1 12:42:12 server kernel: [<ffffffffa02bae5f>] ? xfs_error_report+0x3f/0x50 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffffa02dd6af>] ? xfs_create+0x1ef/0x640 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffffa02d86b5>] ? xfs_trans_cancel+0xf5/0x120 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffffa02dd6af>] ? xfs_create+0x1ef/0x640 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffffa02eaa5d>] ? xfs_vn_mknod+0xad/0x1c0 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffffa02eaba0>] ? xfs_vn_create+0x10/0x20 [xfs]
Jan  1 12:42:12 server kernel: [<ffffffff81198416>] ? vfs_create+0xe6/0x110
Jan  1 12:42:12 server kernel: [<ffffffff8119c27e>] ? do_filp_open+0xa8e/0xd20
Jan  1 12:42:12 server kernel: [<ffffffff8118e7a4>] ? cp_new_stat+0xe4/0x100
Jan  1 12:42:12 server kernel: [<ffffffff8128f87a>] ? strncpy_from_user+0x4a/0x90
Jan  1 12:42:12 server kernel: [<ffffffff811a8bd2>] ? alloc_fd+0x92/0x160
Jan  1 12:42:12 server kernel: [<ffffffff81185c39>] ? do_sys_open+0x69/0x140
Jan  1 12:42:12 server kernel: [<ffffffff81185d50>] ? sys_open+0x20/0x30
Jan  1 12:42:12 server kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Jan  1 12:42:12 server kernel: XFS (dm-1): xfs_do_force_shutdown(0x8) called from line 1949 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa02d86ce
Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption of in-memory data detected.  Shutting down filesystem
Jan  1 12:42:12 server kernel: XFS (dm-1): Please umount the filesystem and rectify the problem(s)
Jan  1 12:42:16 server kernel: XFS (dm-1): xfs_log_force: error 5 returned.
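
To find which devices are affected, the log can be searched for the repair request shown above. The following is a minimal sketch and not part of the original solution: the function name `xfs_corrupt_devices` is hypothetical, the default path /var/log/messages is the one mentioned in the Issue section, and the pattern only covers the "Unmount and run xfs_repair" wording seen in the messages in this article.

```shell
# Hypothetical helper: list each device for which the kernel logged a
# request to unmount and run xfs_repair. Prints one device name per line.
xfs_corrupt_devices() {
    local log="${1:-/var/log/messages}"

    # Capture the name inside "XFS (...)" on lines that carry the request.
    sed -nE 's/.*XFS \(([^)]+)\): .*[Uu]nmount and run xfs_repair.*/\1/p' "$log" |
        LC_ALL=C sort -u
}

# Example: xfs_corrupt_devices /var/log/messages
```

Other XFS messages, such as the xfs_log_force errors, are deliberately ignored so that the list contains only devices for which a repair was requested.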

Product(s)
Red Hat Enterprise Linux
Component
kernel
Category
Troubleshoot
Tags
file_systems
xfs
