XFS Corruption Detected - Unmount and Run xfs_repair

https://access.redhat.com/solutions/1194613

Environment

  • Red Hat Enterprise Linux 10
  • Red Hat Enterprise Linux 9
  • Red Hat Enterprise Linux 8
  • Red Hat Enterprise Linux 7
  • Red Hat Enterprise Linux 6 (with the Scalable File System Add-on)
  • Red Hat Enterprise Linux 5.6+ (with the Scalable File System Add-on)
  • XFS Filesystem

Issue

  • The XFS filesystem has detected corruption, and the kernel log recommends a filesystem repair:

    Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption detected. Unmount and run xfs_repair
    
  • XFS corruption errors appear in /var/log/messages:

    2023-04-18T12:27:26-04:00 [kern.alert] kernel: XFS (dm-1): Metadata corruption detected at xfs_dinode_verify.part.8+0x155/0x690 [xfs], inode 0xa03d21 dinode
    2023-04-18T12:27:26-04:00 [kern.alert] kernel: XFS (dm-1): Unmount and run xfs_repair
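
Messages like these can be located with a short filter. The following is a minimal sketch, not part of the original solution: the function name xfs_corruption_devices is hypothetical, and the pattern only covers the three message texts quoted in this article.

```shell
# Hypothetical helper: list the devices that logged an XFS corruption message.
# Reads log text on stdin and prints one device name per line, de-duplicated.
# Only the message texts quoted in this article are matched.
xfs_corruption_devices() {
    grep -E 'XFS \([^)]+\): .*(Corruption detected|Metadata corruption detected|Unmount and run xfs_repair)' \
        | sed -E 's/.*XFS \(([^)]+)\):.*/\1/' \
        | sort -u
}

# Example usage (log path taken from the article):
#   xfs_corruption_devices < /var/log/messages
```

The device names printed are kernel names (such as dm-1), exactly as they appear between the parentheses in the log.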
    

Resolution

  • Before starting, make sure the device is unmounted and that you have a current backup of the filesystem. If you do not have a backup, try mounting the filesystem read-only with the norecovery option, which skips log recovery. If the mount succeeds, take a backup from it:

    # mount /dev/device /path/to/mount/point -o ro,norecovery -vvv
    
  • Run the xfs_repair utility, replacing /dev/device with the block device that holds the XFS filesystem to be repaired (e.g. /dev/dm-1):

    # xfs_repair -v /dev/device 2>&1 | tee /tmp/xfs_repair-v.out
    
  • The xfs_repair utility cannot repair an XFS filesystem that has a dirty transaction log. To clear the log, mount the filesystem so that the log is replayed, then unmount it again before rerunning xfs_repair:

    # mount /dev/device /path/to/mount/point
    # umount /path/to/mount/point
    
  • If the log is corrupt and cannot be replayed, use the -L option to clear the log. For more details, please refer to: Unable to mount or check XFS filesystem.

    # xfs_repair -Lv /dev/device |& tee /tmp/xfs_repair-Lv.out
    

    Important: Forcibly clearing the log may result in further corruption or data loss, so make sure that a backup is available before running xfs_repair with the -L option.

  • If xfs_repair cannot repair the filesystem, a metadata image can be taken to help troubleshoot why repair was unsuccessful. For details, refer to: How to create a metadata dump from an XFS filesystem?

    # xfs_metadump -gwa /dev/device /tmp/device.metadump
    

    It can be helpful to include file and directory names in the metadata dump (they are obfuscated by default). To do so, add the -o option:

    # xfs_metadump -gwao /dev/device /tmp/device.metadump
    

    Important: xfs_metadump may only be used to dump metadata from filesystems that are unmounted or mounted read-only.

  • Restore the metadata dump into a sparse image file, then run xfs_repair against the image:

    # truncate -s 10G /tmp/disk.img
        
    # xfs_mdrestore -g /tmp/device.metadump /tmp/disk.img
        
    # xfs_repair -fv /tmp/disk.img
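
Because the steps above require the device to be unmounted, a small pre-flight check can guard against running a repair on a mounted filesystem. The following is a minimal sketch under stated assumptions: the function names device_is_mounted and xfs_check_only are hypothetical, and the check compares the device path as a plain string against the first column of the mounts table, so an alias such as /dev/dm-1 will not match an entry listed as /dev/mapper/<name>. The -n option runs xfs_repair in no-modify mode, which only reports what would be repaired.

```shell
# Hypothetical helper: exit 0 if the device appears as a mount source in a
# mounts table. The table defaults to /proc/mounts; it is a parameter only so
# the check can be tried against a sample file.
# Assumption: plain string comparison, so device aliases are NOT resolved.
device_is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical wrapper: refuse to continue while the device is mounted, then
# run xfs_repair in no-modify mode (-n), which scans the filesystem and
# reports problems without changing anything.
xfs_check_only() {
    dev=$1
    table=${2:-/proc/mounts}
    if device_is_mounted "$dev" "$table"; then
        echo "refusing: $dev is mounted; unmount it first" >&2
        return 1
    fi
    xfs_repair -n "$dev"
}

# Example usage:
#   xfs_check_only /dev/device
```

Running the no-modify pass first gives a preview of the damage before committing to a real repair or to the -L option.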
    

Root Cause

Diagnostic Steps

  • Check the kernel log (for example, during boot) for error messages like the following from the XFS driver:

    Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption detected. Unmount and run xfs_repair
    Jan  1 12:42:12 server kernel: XFS (dm-1): Internal error xfs_trans_cancel at line 1948 of file fs/xfs/xfs_trans.c.  Caller 0xffffffffa02dd6af
    Jan  1 12:42:12 server kernel: 
    Jan  1 12:42:12 server kernel: Pid: 12345, comm: xxxx Not tainted 2.6.32-431.23.3.el6.x86_64 #1
    Jan  1 12:42:12 server kernel: Call Trace:
    Jan  1 12:42:12 server kernel: [<ffffffffa02bae5f>] ? xfs_error_report+0x3f/0x50 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffffa02dd6af>] ? xfs_create+0x1ef/0x640 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffffa02d86b5>] ? xfs_trans_cancel+0xf5/0x120 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffffa02dd6af>] ? xfs_create+0x1ef/0x640 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffffa02eaa5d>] ? xfs_vn_mknod+0xad/0x1c0 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffffa02eaba0>] ? xfs_vn_create+0x10/0x20 [xfs]
    Jan  1 12:42:12 server kernel: [<ffffffff81198416>] ? vfs_create+0xe6/0x110
    Jan  1 12:42:12 server kernel: [<ffffffff8119c27e>] ? do_filp_open+0xa8e/0xd20
    Jan  1 12:42:12 server kernel: [<ffffffff8118e7a4>] ? cp_new_stat+0xe4/0x100
    Jan  1 12:42:12 server kernel: [<ffffffff8128f87a>] ? strncpy_from_user+0x4a/0x90
    Jan  1 12:42:12 server kernel: [<ffffffff811a8bd2>] ? alloc_fd+0x92/0x160
    Jan  1 12:42:12 server kernel: [<ffffffff81185c39>] ? do_sys_open+0x69/0x140
    Jan  1 12:42:12 server kernel: [<ffffffff81185d50>] ? sys_open+0x20/0x30
    Jan  1 12:42:12 server kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
    Jan  1 12:42:12 server kernel: XFS (dm-1): xfs_do_force_shutdown(0x8) called from line 1949 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa02d86ce
    Jan  1 12:42:12 server kernel: XFS (dm-1): Corruption of in-memory data detected.  Shutting down filesystem
    Jan  1 12:42:12 server kernel: XFS (dm-1): Please umount the filesystem and rectify the problem(s)
    Jan  1 12:42:16 server kernel: XFS (dm-1): xfs_log_force: error 5 returned.
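
The log identifies the filesystem by kernel name (dm-1 here), while the repair commands need a device path. The following is a minimal sketch for translating one into the other, assuming the standard sysfs layout in which a device-mapper device exposes its name in /sys/block/dm-N/dm/name. The function name kname_to_device is hypothetical.

```shell
# Hypothetical helper: translate a kernel name such as dm-1 into a device path.
# For device-mapper devices the name is read from <sysfs>/block/dm-N/dm/name;
# any other kernel name is assumed to live directly under /dev.
# The sysfs root is a parameter (default /sys) so the function can be tried
# against a sample directory tree.
kname_to_device() {
    kname=$1
    sysfs=${2:-/sys}
    name_file="$sysfs/block/$kname/dm/name"
    if [ -r "$name_file" ]; then
        printf '/dev/mapper/%s\n' "$(cat "$name_file")"
    else
        printf '/dev/%s\n' "$kname"
    fi
}

# Example usage:
#   kname_to_device dm-1
```

The resulting path can then be substituted for /dev/device in the commands from the Resolution section.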
    
  • Product(s): Red Hat Enterprise Linux
  • Component: kernel
  • Category: Troubleshoot
  • Tags: file_systems, xfs

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
