Super Block - Kernel Docs
https://docs.kernel.org/next/filesystems/ext4/super.html
3.1. Super Block
The superblock records various information about the enclosing filesystem, such as block counts, inode counts, supported features, maintenance information, and more.
If the sparse_super feature flag is set, redundant copies of the superblock and group descriptors are kept only in the groups whose group number is either 0 or a power of 3, 5, or 7. If the flag is not set, redundant copies are kept in all groups.
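The placement rule above can be sketched in a few lines of Python. This is an illustration only, not the kernel's implementation; the function names `is_power_of` and `has_superblock_backup` are hypothetical, and group 1 is treated as a power (3^0 = 1).

```python
def is_power_of(n: int, base: int) -> bool:
    """Return True if n == base**k for some integer k >= 0."""
    if n < 1:
        return False
    while n % base == 0:
        n //= base
    return n == 1


def has_superblock_backup(group: int, sparse_super: bool = True) -> bool:
    """Apply the sparse_super rule described in the text to one group number."""
    if not sparse_super:
        return True  # without the flag, every group keeps a copy
    if group == 0:
        return True
    return any(is_power_of(group, base) for base in (3, 5, 7))


# Groups below 50 that would carry a copy when sparse_super is set.
groups_with_copies = [g for g in range(50) if has_superblock_backup(g)]
```

Calling `has_superblock_backup(g, sparse_super=False)` returns True for every group, matching the second sentence of the paragraph.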
The superblock checksum is calculated against the superblock structure, which includes the FS UUID.
The ext4 superblock is laid out as follows in struct ext4_super_block:
Offset
Size
Name
Description
0x0
__le32
s_inodes_count
Total inode count.
0x4
__le32
s_blocks_count_lo
Total block count.
0x8
__le32
s_r_blocks_count_lo
This number of blocks can only be allocated by the super-user.
0xC
__le32
s_free_blocks_count_lo
Free block count.
0x10
__le32
s_free_inodes_count
Free inode count.
0x14
__le32
s_first_data_block
First data block. This must be at least 1 for 1k-block filesystems and is typically 0 for all other block sizes.
0x18
__le32
s_log_block_size
Block size is 2 ^ (10 + s_log_block_size).
0x1C
__le32
s_log_cluster_size
Cluster size is 2 ^ (10 + s_log_cluster_size) blocks if bigalloc is enabled. Otherwise s_log_cluster_size must equal s_log_block_size.
0x20
__le32
s_blocks_per_group
Blocks per group.
0x24
__le32
s_clusters_per_group
Clusters per group, if bigalloc is enabled. Otherwise s_clusters_per_group must equal s_blocks_per_group.
0x28
__le32
s_inodes_per_group
Inodes per group.
0x2C
__le32
s_mtime
Mount time, in seconds since the epoch.
0x30
__le32
s_wtime
Write time, in seconds since the epoch.
0x34
__le16
s_mnt_count
Number of mounts since the last fsck.
0x36
__le16
s_max_mnt_count
Number of mounts beyond which a fsck is needed.
0x38
__le16
s_magic
Magic signature, 0xEF53
0x3A
__le16
s_state
File system state. See super_state for more info.
0x3C
__le16
s_errors
Behaviour when detecting errors. See super_errors for more info.
0x3E
__le16
s_minor_rev_level
Minor revision level.
0x40
__le32
s_lastcheck
Time of last check, in seconds since the epoch.
0x44
__le32
s_checkinterval
Maximum time between checks, in seconds.
0x48
__le32
s_creator_os
Creator OS. See the table super_creator for more info.
0x4C
__le32
s_rev_level
Revision level. See the table super_revision for more info.
0x50
__le16
s_def_resuid
Default uid for reserved blocks.
0x52
__le16
s_def_resgid
Default gid for reserved blocks.
These fields are for EXT4_DYNAMIC_REV superblocks only.
Note: the difference between the compatible feature set and the incompatible feature set is that if there is a bit set in the incompatible feature set that the kernel doesn’t know about, it should refuse to mount the filesystem.
e2fsck’s requirements are more strict; if it doesn’t know about a feature in either the compatible or incompatible feature set, it must abort and not try to meddle with things it doesn’t understand…
0x54
__le32
s_first_ino
First non-reserved inode.
0x58
__le16
s_inode_size
Size of inode structure, in bytes.
0x5A
__le16
s_block_group_nr
Block group # of this superblock.
0x5C
__le32
s_feature_compat
Compatible feature set flags. Kernel can still read/write this fs even if it doesn’t understand a flag; fsck should not do that. See the super_compat table for more info.
0x60
__le32
s_feature_incompat
Incompatible feature set. If the kernel or fsck doesn’t understand one of these bits, it should stop. See the super_incompat table for more info.
0x64
__le32
s_feature_ro_compat
Readonly-compatible feature set. If the kernel doesn’t understand one of these bits, it can still mount read-only. See the super_rocompat table for more info.
0x68
__u8
s_uuid[16]
128-bit UUID for volume.
0x78
char
s_volume_name[16]
Volume label.
0x88
char
s_last_mounted[64]
Directory where filesystem was last mounted.
0xC8
__le32
s_algorithm_usage_bitmap
For compression (Not used in e2fsprogs/Linux)
Performance hints. Directory preallocation should only happen if the EXT4_FEATURE_COMPAT_DIR_PREALLOC flag is on.
0xCC
__u8
s_prealloc_blocks
Number of blocks to try to preallocate for … files? (Not used in e2fsprogs/Linux)
0xCD
__u8
s_prealloc_dir_blocks
Number of blocks to preallocate for directories. (Not used in e2fsprogs/Linux)
0xCE
__le16
s_reserved_gdt_blocks
Number of reserved GDT entries for future filesystem expansion.
Journalling support is valid only if EXT4_FEATURE_COMPAT_HAS_JOURNAL is set.
0xD0
__u8
s_journal_uuid[16]
UUID of journal superblock
0xE0
__le32
s_journal_inum
inode number of journal file.
0xE4
__le32
s_journal_dev
Device number of journal file, if the external journal feature flag is set.
0xE8
__le32
s_last_orphan
Start of list of orphaned inodes to delete.
0xEC
__le32
s_hash_seed[4]
HTREE hash seed.
0xFC
__u8
s_def_hash_version
Default hash algorithm to use for directory hashes. See super_def_hash for more info.
0xFD
__u8
s_jnl_backup_type
If this value is 0 or EXT3_JNL_BACKUP_BLOCKS (1), then the s_jnl_blocks field contains a duplicate copy of the inode’s i_block[] array and i_size.
0xFE
__le16
s_desc_size
Size of group descriptors, in bytes, if the 64bit incompat feature flag is set.
0x100
__le32
s_default_mount_opts
Default mount options. See the super_mountopts table for more info.
0x104
__le32
s_first_meta_bg
First metablock block group, if the meta_bg feature is enabled.
0x108
__le32
s_mkfs_time
When the filesystem was created, in seconds since the epoch.
0x10C
__le32
s_jnl_blocks[17]
Backup copy of the journal inode’s i_block[] array in the first 15 elements and i_size_high and i_size in the 16th and 17th elements, respectively.
64bit support is valid only if EXT4_FEATURE_INCOMPAT_64BIT is set.
0x150
__le32
s_blocks_count_hi
High 32-bits of the block count.
0x154
__le32
s_r_blocks_count_hi
High 32-bits of the reserved block count.
0x158
__le32
s_free_blocks_count_hi
High 32-bits of the free block count.
0x15C
__le16
s_min_extra_isize
All inodes have at least # bytes.
0x15E
__le16
s_want_extra_isize
New inodes should reserve # bytes.
0x160
__le32
s_flags
Miscellaneous flags. See the super_flags table for more info.
0x164
__le16
s_raid_stride
RAID stride. This is the number of logical blocks read from or written to the disk before moving to the next disk. This affects the placement of filesystem metadata, which will hopefully make RAID storage faster.
0x166
__le16
s_mmp_interval
Number of seconds to wait in multi-mount prevention (MMP) checking. In theory, MMP is a mechanism to record in the superblock which host and device have mounted the filesystem, in order to prevent multiple mounts. This feature does not seem to be implemented…
0x168
__le64
s_mmp_block
Block # for multi-mount protection data.
0x170
__le32
s_raid_stripe_width
RAID stripe width. This is the number of logical blocks read from or written to the disk before coming back to the current disk. This is used by the block allocator to try to reduce the number of read-modify-write operations in a RAID5/6.
0x174
__u8
s_log_groups_per_flex
Size of a flexible block group is 2 ^ s_log_groups_per_flex.
0x175
__u8
s_checksum_type
Metadata checksum algorithm type. The only valid value is 1 (crc32c).
0x176
__u8
s_encryption_level
Versioning level for encryption.
0x177
__u8
s_reserved_pad
Padding to next 32bits.
0x178
__le64
s_kbytes_written
Number of KiB written to this filesystem over its lifetime.
0x180
__le32
s_snapshot_inum
inode number of active snapshot. (Not used in e2fsprogs/Linux.)
0x184
__le32
s_snapshot_id
Sequential ID of active snapshot. (Not used in e2fsprogs/Linux.)
0x188
__le64
s_snapshot_r_blocks_count
Number of blocks reserved for active snapshot’s future use. (Not used in e2fsprogs/Linux.)
0x190
__le32
s_snapshot_list
inode number of the head of the on-disk snapshot list. (Not used in e2fsprogs/Linux.)
0x194
__le32
s_error_count
Number of errors seen.
0x198
__le32
s_first_error_time
First time an error happened, in seconds since the epoch.
0x19C
__le32
s_first_error_ino
inode involved in first error.
0x1A0
__le64
s_first_error_block
Number of the block involved in the first error.
0x1A8
__u8
s_first_error_func[32]
Name of function where the error happened.
0x1C8
__le32
s_first_error_line
Line number where error happened.
0x1CC
__le32
s_last_error_time
Time of most recent error, in seconds since the epoch.
0x1D0
__le32
s_last_error_ino
inode involved in most recent error.
0x1D4
__le32
s_last_error_line
Line number where most recent error happened.
0x1D8
__le64
s_last_error_block
Number of the block involved in the most recent error.
0x1E0
__u8
s_last_error_func[32]
Name of function where the most recent error happened.
0x200
__u8
s_mount_opts[64]
ASCIIZ string of mount options.
0x240
__le32
s_usr_quota_inum
Inode number of user quota file.
0x244
__le32
s_grp_quota_inum
Inode number of group quota file.
0x248
__le32
s_overhead_blocks
Overhead blocks/clusters in fs. (Huh? This field is always zero, which means that the kernel calculates it dynamically.)
0x24C
__le32
s_backup_bgs[2]
Block groups containing superblock backups (if sparse_super2)
0x254
__u8
s_encrypt_algos[4]
Encryption algorithms in use. There can be up to four algorithms in use at any time; valid algorithm codes are given in the super_encrypt table below.
0x258
__u8
s_encrypt_pw_salt[16]
Salt for the string2key algorithm for encryption.
0x268
__le32
s_lpf_ino
Inode number of lost+found
0x26C
__le32
s_prj_quota_inum
Inode that tracks project quotas.
0x270
__le32
s_checksum_seed
Checksum seed used for metadata_csum calculations. This value is crc32c(~0, $orig_fs_uuid).
0x274
__u8
s_wtime_hi
Upper 8 bits of the s_wtime field.
0x275
__u8
s_mtime_hi
Upper 8 bits of the s_mtime field.
0x276
__u8
s_mkfs_time_hi
Upper 8 bits of the s_mkfs_time field.
0x277
__u8
s_lastcheck_hi
Upper 8 bits of the s_lastcheck field.
0x278
__u8
s_first_error_time_hi
Upper 8 bits of the s_first_error_time field.
0x279
__u8
s_last_error_time_hi
Upper 8 bits of the s_last_error_time field.
0x27A
__u8
s_first_error_errcode
0x27B
__u8
s_last_error_errcode
0x27C
__le16
s_encoding
Filename charset encoding.
0x27E
__le16
s_encoding_flags
Filename charset encoding flags.
0x280
__le32
s_orphan_file_inum
Orphan file inode number.
0x284
__le32
s_reserved[94]
Padding to the end of the block.
0x3FC
__le32
s_checksum
Superblock checksum.
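As an illustration of how the offsets and types in the table can be used, here is a minimal Python sketch that decodes a few fields from a 1024-byte superblock buffer with the standard `struct` module. The function name `parse_superblock` and the returned dictionary keys are hypothetical. Combining each `_hi` field with its low counterpart by simple shifting is an assumption based on the field descriptions.

```python
import struct

EXT4_SUPER_MAGIC = 0xEF53
INCOMPAT_64BIT = 0x80  # value from the incompatible feature table


def parse_superblock(sb: bytes) -> dict:
    """Decode a handful of little-endian fields from a raw superblock."""
    if len(sb) < 1024:
        raise ValueError("expected at least 1024 bytes of superblock data")

    (magic,) = struct.unpack_from("<H", sb, 0x38)           # s_magic
    if magic != EXT4_SUPER_MAGIC:
        raise ValueError(f"bad magic 0x{magic:04X}")

    (blocks_lo,) = struct.unpack_from("<I", sb, 0x4)        # s_blocks_count_lo
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)  # s_log_block_size
    (wtime,) = struct.unpack_from("<I", sb, 0x30)           # s_wtime
    (incompat,) = struct.unpack_from("<I", sb, 0x60)        # s_feature_incompat
    (blocks_hi,) = struct.unpack_from("<I", sb, 0x150)      # s_blocks_count_hi
    wtime_hi = sb[0x274]                                    # s_wtime_hi (__u8)

    blocks = blocks_lo
    if incompat & INCOMPAT_64BIT:
        # The high 32 bits are only meaningful with the 64bit feature.
        blocks |= blocks_hi << 32

    # s_volume_name is a 16-byte field; stop at the first NUL if present.
    label = sb[0x78:0x88].split(b"\0", 1)[0].decode("utf-8", "replace")

    return {
        "block_size": 1 << (10 + log_block_size),
        "blocks_count": blocks,
        "wtime": (wtime_hi << 32) | wtime,
        "volume_name": label,
    }
```

On a typical ext4 image the primary superblock starts at byte offset 1024, so a caller would usually seek there and read 1024 bytes before passing them to this function.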
The superblock state is some combination of the following:
Value
Description
0x0001
Cleanly unmounted
0x0002
Errors detected
0x0004
Orphans being recovered
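Because s_state is a combination of bits, a reader has to test each bit separately. A minimal sketch follows; the names `STATE_FLAGS` and `decode_state` are hypothetical, and the labels are copied from the table above.

```python
# Bit values and descriptions taken from the superblock state table.
STATE_FLAGS = {
    0x0001: "cleanly unmounted",
    0x0002: "errors detected",
    0x0004: "orphans being recovered",
}


def decode_state(s_state: int) -> list:
    """Return the description of every state bit set in s_state."""
    return [name for bit, name in STATE_FLAGS.items() if s_state & bit]
```

For example, `decode_state(0x0003)` reports both the clean-unmount bit and the error bit.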
The superblock error policy is one of the following:
Value
Description
1
Continue
2
Remount read-only
3
Panic
The filesystem creator is one of the following:
Value
Description
0
Linux
1
Hurd
2
Masix
3
FreeBSD
4
Lites
The superblock revision is one of the following:
Value
Description
0
Original format
1
v2 format w/ dynamic inode sizes
Note that EXT4_DYNAMIC_REV refers to a revision 1 or newer filesystem.
The superblock compatible features field is a combination of any of the following:
Value
Description
0x1
Directory preallocation (COMPAT_DIR_PREALLOC).
0x2
“imagic inodes”. Not clear from the code what this does (COMPAT_IMAGIC_INODES).
0x4
Has a journal (COMPAT_HAS_JOURNAL).
0x8
Supports extended attributes (COMPAT_EXT_ATTR).
0x10
Has reserved GDT blocks for filesystem expansion (COMPAT_RESIZE_INODE). Requires RO_COMPAT_SPARSE_SUPER.
0x20
Has directory indices (COMPAT_DIR_INDEX).
0x40
“Lazy BG”. Not in Linux kernel, seems to have been for uninitialised block groups? (COMPAT_LAZY_BG)
0x80
“Exclude inode”. Not used. (COMPAT_EXCLUDE_INODE).
0x100
“Exclude bitmap”. Seems to be used to indicate the presence of snapshot-related exclude bitmaps? Not defined in kernel or used in e2fsprogs (COMPAT_EXCLUDE_BITMAP).
0x200
Sparse Super Block, v2. If this flag is set, the SB field s_backup_bgs points to the two block groups that contain backup superblocks (COMPAT_SPARSE_SUPER2).
0x400
Fast commits supported. Although fast commit blocks are backward incompatible, they are not always present in the journal. If fast commit blocks are present in the journal, the JBD2 incompat feature (JBD2_FEATURE_INCOMPAT_FAST_COMMIT) gets set (COMPAT_FAST_COMMIT).
0x1000
Orphan file allocated. This is the special file for more efficient tracking of unlinked but still open inodes. When there may be any entries in the file, the corresponding rocompat feature is additionally set (RO_COMPAT_ORPHAN_PRESENT).
The superblock incompatible features field is a combination of any of the following:
Value
Description
0x1
Compression (INCOMPAT_COMPRESSION).
0x2
Directory entries record the file type. See ext4_dir_entry_2 below (INCOMPAT_FILETYPE).
0x4
Filesystem needs recovery (INCOMPAT_RECOVER).
0x8
Filesystem has a separate journal device (INCOMPAT_JOURNAL_DEV).
0x10
Meta block groups. See the earlier discussion of this feature (INCOMPAT_META_BG).
0x40
Files in this filesystem use extents (INCOMPAT_EXTENTS).
0x80
Enable a filesystem size of 2^64 blocks (INCOMPAT_64BIT).
0x100
Multiple mount protection (INCOMPAT_MMP).
0x200
Flexible block groups. See the earlier discussion of this feature (INCOMPAT_FLEX_BG).
0x400
Inodes can be used to store large extended attribute values (INCOMPAT_EA_INODE).
0x1000
Data in directory entry (INCOMPAT_DIRDATA). (Not implemented?)
0x2000
Metadata checksum seed is stored in the superblock. This feature enables the administrator to change the UUID of a metadata_csum filesystem while the filesystem is mounted; without it, the checksum definition requires all metadata blocks to be rewritten (INCOMPAT_CSUM_SEED).
0x4000
Large directory >2GB or 3-level htree (INCOMPAT_LARGEDIR). Prior to this feature, directories could not be larger than 4GiB and could not have an htree more than 2 levels deep. If this feature is enabled, directories can be larger than 4GiB and have a maximum htree depth of 3.
0x8000
Data in inode (INCOMPAT_INLINE_DATA).
0x10000
Encrypted inodes can be present. (INCOMPAT_ENCRYPT).
0x20000
Directories can be marked case-insensitive. (INCOMPAT_CASEFOLD).
The superblock read-only compatible features field is a combination of any of the following:
Value
Description
0x1
Sparse superblocks. See the earlier discussion of this feature (RO_COMPAT_SPARSE_SUPER).
0x2
This filesystem has been used to store a file greater than 2GiB (RO_COMPAT_LARGE_FILE).
0x4
Not used in kernel or e2fsprogs (RO_COMPAT_BTREE_DIR).
0x8
This filesystem has files whose sizes are represented in units of logical blocks, not 512-byte sectors. This implies a very large file indeed! (RO_COMPAT_HUGE_FILE)
0x10
Group descriptors have checksums. In addition to detecting corruption, this is useful for lazy formatting with uninitialised groups (RO_COMPAT_GDT_CSUM).
0x20
Indicates that the old ext3 32,000 subdirectory limit no longer applies (RO_COMPAT_DIR_NLINK). A directory’s i_links_count will be set to 1 if it is incremented past 64,999.
0x40
Indicates that large inodes exist on this filesystem (RO_COMPAT_EXTRA_ISIZE).
0x80
This filesystem has a snapshot (RO_COMPAT_HAS_SNAPSHOT).
0x100
Quota (RO_COMPAT_QUOTA).
0x200
This filesystem supports “bigalloc”, which means that file extents are tracked in units of clusters (of blocks) instead of blocks (RO_COMPAT_BIGALLOC).
0x400
This filesystem supports metadata checksumming. (RO_COMPAT_METADATA_CSUM; implies RO_COMPAT_GDT_CSUM, though GDT_CSUM must not be set)
0x800
Filesystem supports replicas. This feature is neither in the kernel nor e2fsprogs. (RO_COMPAT_REPLICA)
0x1000
Read-only filesystem image; the kernel will not mount this image read-write and most tools will refuse to write to the image. (RO_COMPAT_READONLY)
0x2000
Filesystem tracks project quotas. (RO_COMPAT_PROJECT)
0x8000
Verity inodes may be present on the filesystem. (RO_COMPAT_VERITY)
0x10000
Indicates orphan file may have valid orphan entries and thus we need to clean them up when mounting the filesystem (RO_COMPAT_ORPHAN_PRESENT).
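The three feature sets lead to a simple decision rule for an implementation that mounts the filesystem, as described in the field descriptions and the note on feature sets earlier. The sketch below is an assumption-laden illustration: `mount_decision` and its `known_*` parameters are hypothetical and stand for whatever bits a given implementation understands.

```python
def mount_decision(incompat: int, ro_compat: int,
                   known_incompat: int, known_ro_compat: int) -> str:
    """Decide how a filesystem may be mounted, given the bits we understand.

    Unknown compatible bits are deliberately ignored here: the text says the
    kernel can still read and write the filesystem in that case.
    """
    if incompat & ~known_incompat:
        return "refuse"       # unknown incompatible feature: do not mount
    if ro_compat & ~known_ro_compat:
        return "read-only"    # unknown ro-compatible feature: no writes
    return "read-write"
```

A checker such as e2fsck would need a stricter rule, since the note above says it must also stop on unknown bits in the compatible set.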
The s_def_hash_version field is one of the following:
Value
Description
0x0
Legacy.
0x1
Half MD4.
0x2
Tea.
0x3
Legacy, unsigned.
0x4
Half MD4, unsigned.
0x5
Tea, unsigned.
The s_default_mount_opts field is any combination of the following:
Value
Description
0x0001
Print debugging info upon (re)mount. (EXT4_DEFM_DEBUG)
0x0002
New files take the gid of the containing directory (instead of the fsgid of the current process). (EXT4_DEFM_BSDGROUPS)
0x0004
Support userspace-provided extended attributes. (EXT4_DEFM_XATTR_USER)
0x0008
Support POSIX access control lists (ACLs). (EXT4_DEFM_ACL)
0x0010
Do not support 32-bit UIDs. (EXT4_DEFM_UID16)
0x0020
All data and metadata are committed to the journal. (EXT4_DEFM_JMODE_DATA)
0x0040
All data are flushed to the disk before metadata are committed to the journal. (EXT4_DEFM_JMODE_ORDERED)
0x0060
Data ordering is not preserved; data may be written after the metadata has been written. (EXT4_DEFM_JMODE_WBACK)
0x0100
Disable write flushes. (EXT4_DEFM_NOBARRIER)
0x0200
Track which blocks in a filesystem are metadata and therefore should not be used as data blocks. This option will be enabled by default on 3.18, hopefully. (EXT4_DEFM_BLOCK_VALIDITY)
0x0400
Enable DISCARD support, where the storage device is told about blocks becoming unused. (EXT4_DEFM_DISCARD)
0x0800
Disable delayed allocation. (EXT4_DEFM_NODELALLOC)
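One detail in the table above is easy to miss: 0x0060 is the union of 0x0020 and 0x0040, so the journalling mode has to be read as a two-bit field rather than as independent flags. A minimal sketch, in which `JMODE_MASK`, `decode_default_mount_opts`, and the short lowercase labels are names chosen for this example:

```python
JMODE_MASK = 0x0060  # covers the three EXT4_DEFM_JMODE_* values
JMODES = {0x0020: "data", 0x0040: "ordered", 0x0060: "writeback"}

# Independent single-bit options from the table.
SIMPLE_OPTS = {
    0x0001: "debug",
    0x0002: "bsdgroups",
    0x0004: "xattr_user",
    0x0008: "acl",
    0x0010: "uid16",
    0x0100: "nobarrier",
    0x0200: "block_validity",
    0x0400: "discard",
    0x0800: "nodelalloc",
}


def decode_default_mount_opts(opts: int) -> list:
    """Decode s_default_mount_opts, treating the journal mode as one field."""
    names = [name for bit, name in SIMPLE_OPTS.items() if opts & bit]
    mode = JMODES.get(opts & JMODE_MASK)
    if mode is not None:
        names.append("journal_mode=" + mode)
    return names
```

Testing 0x0020 and 0x0040 as separate flags would misreport a writeback filesystem as both "data" and "ordered".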
The s_flags field is any combination of the following:
Value
Description
0x0001
Signed directory hash in use.
0x0002
Unsigned directory hash in use.
0x0004
To test development code.
The s_encrypt_algos list can contain any of the following:
Value
Description
0
Invalid algorithm (ENCRYPTION_MODE_INVALID).
1
256-bit AES in XTS mode (ENCRYPTION_MODE_AES_256_XTS).
2
256-bit AES in GCM mode (ENCRYPTION_MODE_AES_256_GCM).
3
256-bit AES in CBC mode (ENCRYPTION_MODE_AES_256_CBC).
Total size of the superblock is 1024 bytes.
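Since s_checksum sits at offset 0x3FC, the last four bytes of the 1024-byte structure, a checksum over "the superblock structure" plausibly covers the preceding 0x3FC bytes. The sketch below checks that with a plain bitwise CRC32C. It rests on assumptions the text above does not spell out: the seed is taken to be ~0, no final inversion is applied, and the check only makes sense when metadata checksumming is enabled. The function names are hypothetical.

```python
import struct

CRC32C_POLY = 0x82F63B78   # reflected Castagnoli polynomial
CHECKSUM_OFFSET = 0x3FC    # offset of s_checksum in the field table


def crc32c_raw(crc: int, data: bytes) -> int:
    """Bitwise CRC32C update, with no initial or final inversion applied."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (CRC32C_POLY if crc & 1 else 0)
    return crc & 0xFFFFFFFF


def superblock_checksum(sb: bytes) -> int:
    # Assumption: seed of ~0 over bytes [0, 0x3FC), result used as-is.
    return crc32c_raw(0xFFFFFFFF, sb[:CHECKSUM_OFFSET])


def checksum_matches(sb: bytes) -> bool:
    """Compare the stored s_checksum with the value computed here."""
    (stored,) = struct.unpack_from("<I", sb, CHECKSUM_OFFSET)
    return stored == superblock_checksum(sb)
```

A pure-Python loop like this is slow but adequate for a single 1024-byte structure; a table-driven or hardware-accelerated CRC32C would be the usual choice for bulk metadata.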