From 55f6d464364acd30c9e0c92589c66b608076f233 Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Sat, 25 Oct 2025 15:26:57 +0800 Subject: [PATCH 01/17] jbd2: avoid bug_on in jbd2_journal_get_create_access() when file system corrupted ANBZ: #31323 commit 71bbe06c40fc59b5b15661eca8ff307f4176d7f9 stable. commit 986835bf4d11032bba4ab8414d18fce038c61bb4 upstream. There's issue when file system corrupted: ------------[ cut here ]------------ kernel BUG at fs/jbd2/transaction.c:1289! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 5 UID: 0 PID: 2031 Comm: mkdir Not tainted 6.18.0-rc1-next RIP: 0010:jbd2_journal_get_create_access+0x3b6/0x4d0 RSP: 0018:ffff888117aafa30 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff88811a86b000 RCX: ffffffff89a63534 RDX: 1ffff110200ec602 RSI: 0000000000000004 RDI: ffff888100763010 RBP: ffff888100763000 R08: 0000000000000001 R09: ffff888100763028 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88812c432000 R14: ffff88812c608000 R15: ffff888120bfc000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f91d6970c99 CR3: 00000001159c4000 CR4: 00000000000006f0 Call Trace: __ext4_journal_get_create_access+0x42/0x170 ext4_getblk+0x319/0x6f0 ext4_bread+0x11/0x100 ext4_append+0x1e6/0x4a0 ext4_init_new_dir+0x145/0x1d0 ext4_mkdir+0x326/0x920 vfs_mkdir+0x45c/0x740 do_mkdirat+0x234/0x2f0 __x64_sys_mkdir+0xd6/0x120 do_syscall_64+0x5f/0xfa0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The above issue occurs with us in errors=continue mode when accompanied by storage failures. There have been many inconsistencies in the file system data. In the case of file system data inconsistency, for example, if the block bitmap of a referenced block is not set, it can lead to the situation where a block being committed is allocated and used again. As a result, the following condition will not be satisfied then trigger BUG_ON. Of course, it is entirely possible to construct a problematic image that can trigger this BUG_ON through specific operations. In fact, I have constructed such an image and easily reproduced this issue. Therefore, J_ASSERT() holds true only under ideal conditions, but it may not necessarily be satisfied in exceptional scenarios. Using J_ASSERT() directly in abnormal situations would cause the system to crash, which is clearly not what we want. So here we directly trigger a JBD abort instead of immediately invoking BUG_ON. Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd") Signed-off-by: Ye Bin Reviewed-by: Jan Kara Message-ID: <20251025072657.307851-1-yebin@huaweicloud.com> Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/jbd2/transaction.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index a0d90f87da98..8ad96da7c98b 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1280,14 +1280,23 @@ int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh) * committing transaction's lists, but it HAS to be in Forget state in * that case: the transaction must have deleted the buffer for it to be * reused here. + * In the case of file system data inconsistency, for example, if the + * block bitmap of a referenced block is not set, it can lead to the + * situation where a block being committed is allocated and used again. + * As a result, the following condition will not be satisfied, so here + * we directly trigger a JBD abort instead of immediately invoking + * bugon. */ spin_lock(&jh->b_state_lock); - J_ASSERT_JH(jh, (jh->b_transaction == transaction || - jh->b_transaction == NULL || - (jh->b_transaction == journal->j_committing_transaction && - jh->b_jlist == BJ_Forget))); + if (!(jh->b_transaction == transaction || jh->b_transaction == NULL || + (jh->b_transaction == journal->j_committing_transaction && + jh->b_jlist == BJ_Forget)) || jh->b_next_transaction != NULL) { + err = -EROFS; + spin_unlock(&jh->b_state_lock); + jbd2_journal_abort(journal, err); + goto out; + } - J_ASSERT_JH(jh, jh->b_next_transaction == NULL); J_ASSERT_JH(jh, buffer_locked(jh2bh(jh))); if (jh->b_transaction == NULL) { -- Gitee From d10f72ac5a957c1594eadf88454b2515240a07b5 Mon Sep 17 00:00:00 2001 From: Deepanshu Kartikey Date: Mon, 20 Oct 2025 14:09:36 +0800 Subject: [PATCH 02/17] ext4: refresh inline data size before write operations ANBZ: #31323 commit 54ab81ae5f218452e64470cd8a8139bb5880fe2b stable. commit 892e1cf17555735e9d021ab036c36bc7b58b0e3b upstream. The cached ei->i_inline_size can become stale between the initial size check and when ext4_update_inline_data()/ext4_create_inline_data() use it. Although ext4_get_max_inline_size() reads the correct value at the time of the check, concurrent xattr operations can modify i_inline_size before ext4_write_lock_xattr() is acquired. This causes ext4_update_inline_data() and ext4_create_inline_data() to work with stale capacity values, leading to a BUG_ON() crash in ext4_write_inline_data(): kernel BUG at fs/ext4/inline.c:1331! BUG_ON(pos + len > EXT4_I(inode)->i_inline_size); The race window: 1. ext4_get_max_inline_size() reads i_inline_size = 60 (correct) 2. Size check passes for 50-byte write 3. [Another thread adds xattr, i_inline_size changes to 40] 4. ext4_write_lock_xattr() acquires lock 5. ext4_update_inline_data() uses stale i_inline_size = 60 6. Attempts to write 50 bytes but only 40 bytes actually available 7. BUG_ON() triggers Fix this by recalculating i_inline_size via ext4_find_inline_data_nolock() immediately after acquiring xattr_sem. This ensures ext4_update_inline_data() and ext4_create_inline_data() work with current values that are protected from concurrent modifications. This is similar to commit a54c4613dac1 ("ext4: fix race writing to an inline_data file while its xattrs are changing") which fixed i_inline_off staleness. This patch addresses the related i_inline_size staleness issue. Reported-by: syzbot+f3185be57d7e8dda32b8@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=f3185be57d7e8dda32b8 Cc: stable@kernel.org Signed-off-by: Deepanshu Kartikey Message-ID: <20251020060936.474314-1-kartikey406@gmail.com> Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/inline.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index 2e48fe693c0c..7db84a209433 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -410,7 +410,12 @@ static int ext4_prepare_inline_data(handle_t *handle, struct inode *inode, return -ENOSPC; ext4_write_lock_xattr(inode, &no_expand); - + /* + * ei->i_inline_size may have changed since the initial check + * if other xattrs were added. Recalculate to ensure + * ext4_update_inline_data() validates against current capacity. + */ + (void) ext4_find_inline_data_nolock(inode); if (ei->i_inline_off) ret = ext4_update_inline_data(handle, inode, len); else -- Gitee From 98ed72d6c7f751a87ffd86afd97d8459432146f9 Mon Sep 17 00:00:00 2001 From: Alexey Nepomnyashih Date: Tue, 4 Nov 2025 17:33:25 +0800 Subject: [PATCH 03/17] ext4: add i_data_sem protection in ext4_destroy_inline_data_nolock() ANBZ: #31323 commit b322bac9f01d03190b5abc52be5d9dd9f22a2b41 stable. commit 0cd8feea8777f8d9b9a862b89c688b049a5c8475 upstream. Fix a race between inline data destruction and block mapping. The function ext4_destroy_inline_data_nolock() changes the inode data layout by clearing EXT4_INODE_INLINE_DATA and setting EXT4_INODE_EXTENTS. At the same time, another thread may execute ext4_map_blocks(), which tests EXT4_INODE_EXTENTS to decide whether to call ext4_ext_map_blocks() or ext4_ind_map_blocks(). Without i_data_sem protection, ext4_ind_map_blocks() may receive inode with EXT4_INODE_EXTENTS flag and triggering assert. kernel BUG at fs/ext4/indirect.c:546! EXT4-fs (loop2): unmounting filesystem. invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:ext4_ind_map_blocks.cold+0x2b/0x5a fs/ext4/indirect.c:546 Call Trace: ext4_map_blocks+0xb9b/0x16f0 fs/ext4/inode.c:681 _ext4_get_block+0x242/0x590 fs/ext4/inode.c:822 ext4_block_write_begin+0x48b/0x12c0 fs/ext4/inode.c:1124 ext4_write_begin+0x598/0xef0 fs/ext4/inode.c:1255 ext4_da_write_begin+0x21e/0x9c0 fs/ext4/inode.c:3000 generic_perform_write+0x259/0x5d0 mm/filemap.c:3846 ext4_buffered_write_iter+0x15b/0x470 fs/ext4/file.c:285 ext4_file_write_iter+0x8e0/0x17f0 fs/ext4/file.c:679 call_write_iter include/linux/fs.h:2271 [inline] do_iter_readv_writev+0x212/0x3c0 fs/read_write.c:735 do_iter_write+0x186/0x710 fs/read_write.c:861 vfs_iter_write+0x70/0xa0 fs/read_write.c:902 iter_file_splice_write+0x73b/0xc90 fs/splice.c:685 do_splice_from fs/splice.c:763 [inline] direct_splice_actor+0x10f/0x170 fs/splice.c:950 splice_direct_to_actor+0x33a/0xa10 fs/splice.c:896 do_splice_direct+0x1a9/0x280 fs/splice.c:1002 do_sendfile+0xb13/0x12c0 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x1cf/0x210 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: c755e251357a ("ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Alexey Nepomnyashih Message-ID: <20251104093326.697381-1-sdl@nppct.ru> Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/inline.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index 7db84a209433..c7391508bf26 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -443,9 +443,13 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle, if (!ei->i_inline_off) return 0; + down_write(&ei->i_data_sem); + error = ext4_get_inode_loc(inode, &is.iloc); - if (error) + if (error) { + up_write(&ei->i_data_sem); return error; + } error = ext4_xattr_ibody_find(inode, &i, &is); if (error) @@ -483,6 +487,7 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle, brelse(is.iloc.bh); if (error == -ENODATA) error = 0; + up_write(&ei->i_data_sem); return error; } -- Gitee From cc87cc6f098760919656be8ec57c041786773629 Mon Sep 17 00:00:00 2001 From: Eric Whitney Date: Sat, 23 Jul 2022 00:39:10 +0800 Subject: [PATCH 04/17] ext4: minor defrag code improvements ANBZ: #31323 commit ce8d8540ddbc575e0beacad3691777ce19633771 stable. commit d412df530f77d0f61c41b83f925997452fc3944c upstream. Modify the error returns for two file types that can't be defragged to more clearly communicate those restrictions to a caller. When the defrag code is applied to swap files, return -ETXTBSY, and when applied to quota files, return -EOPNOTSUPP. Move an extent tree search whose results are only occasionally required to the site always requiring them for improved efficiency. Address a few typos. Signed-off-by: Eric Whitney Link: https://lore.kernel.org/r/20220722163910.268564-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o Stable-dep-of: a2e5a3cea4b1 ("ext4: correct the checking of quota files before moving extents") Signed-off-by: Sasha Levin Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/move_extent.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 886d2b161c23..4da9e8946a9e 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -466,19 +466,17 @@ mext_check_arguments(struct inode *orig_inode, if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode)) return -EPERM; - /* Ext4 move extent does not support swapfile */ + /* Ext4 move extent does not support swap files */ if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) { - ext4_debug("ext4 move extent: The argument files should " - "not be swapfile [ino:orig %lu, donor %lu]\n", + ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n", orig_inode->i_ino, donor_inode->i_ino); - return -EBUSY; + return -ETXTBSY; } if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) { - ext4_debug("ext4 move extent: The argument files should " - "not be quota files [ino:orig %lu, donor %lu]\n", + ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n", orig_inode->i_ino, donor_inode->i_ino); - return -EBUSY; + return -EOPNOTSUPP; } /* Ext4 move extent supports only extent based file */ @@ -626,11 +624,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, if (ret) goto out; ex = path[path->p_depth].p_ext; - next_blk = ext4_ext_next_allocated_block(path); cur_blk = le32_to_cpu(ex->ee_block); cur_len = ext4_ext_get_actual_len(ex); /* Check hole before the start pos */ if (cur_blk + cur_len - 1 < o_start) { + next_blk = ext4_ext_next_allocated_block(path); if (next_blk == EXT_MAX_BLOCKS) { o_start = o_end; ret = -ENODATA; @@ -659,7 +657,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, donor_page_index = d_start >> (PAGE_SHIFT - donor_inode->i_blkbits); offset_in_page = o_start % blocks_per_page; - if (cur_len > blocks_per_page- offset_in_page) + if (cur_len > blocks_per_page - offset_in_page) cur_len = blocks_per_page - offset_in_page; /* * Up semaphore to avoid following problems: -- Gitee From a2f9c4c038ad05ddc1fe34e8649527b1a1c01f27 Mon Sep 17 00:00:00 2001 From: Zhang Yi Date: Mon, 13 Oct 2025 09:51:17 +0800 Subject: [PATCH 05/17] ext4: correct the checking of quota files before moving extents ANBZ: #31323 commit 296bbf46b5082c04bd04ed7ed1993d5f42714590 stable. commit a2e5a3cea4b18f6e2575acc444a5e8cce1fc8260 upstream. The move extent operation should return -EOPNOTSUPP if any of the inodes is a quota inode, rather than requiring both to be quota inodes. Fixes: 02749a4c2082 ("ext4: add ext4_is_quota_file()") Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Message-ID: <20251013015128.499308-2-yi.zhang@huaweicloud.com> Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/move_extent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 4da9e8946a9e..c72fccb08a6b 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -473,7 +473,7 @@ mext_check_arguments(struct inode *orig_inode, return -ETXTBSY; } - if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) { + if (ext4_is_quota_file(orig_inode) || ext4_is_quota_file(donor_inode)) { ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n", orig_inode->i_ino, donor_inode->i_ino); return -EOPNOTSUPP; -- Gitee From b2288dee06fb79e7ede0e91dfffcf005f6b74862 Mon Sep 17 00:00:00 2001 From: Kemeng Shi Date: Fri, 5 Jan 2024 17:20:54 +0800 Subject: [PATCH 06/17] ext4: remove unused return value of __mb_check_buddy ANBZ: #31323 commit 14c5a790b4a0cd533bfe1e7109bf2dbec781378d stable. commit 133de5a0d8f8e32b34feaa8beae7a189482f1856 upstream. Remove unused return value of __mb_check_buddy. Signed-off-by: Kemeng Shi Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20240105092102.496631-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o Stable-dep-of: d9ee3ff810f1 ("ext4: improve integrity checking in __mb_check_buddy by enhancing order-0 validation") Signed-off-by: Sasha Levin Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/mballoc.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 14a5f468ba9b..ac261c0af9d5 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -602,7 +602,7 @@ do { \ } \ } while (0) -static int __mb_check_buddy(struct ext4_buddy *e4b, char *file, +static void __mb_check_buddy(struct ext4_buddy *e4b, char *file, const char *function, int line) { struct super_block *sb = e4b->bd_sb; @@ -621,7 +621,7 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file, void *buddy2; if (e4b->bd_info->bb_check_counter++ % 10) - return 0; + return; while (order > 1) { buddy = mb_find_buddy(e4b, order, &max); @@ -686,7 +686,7 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file, grp = ext4_get_group_info(sb, e4b->bd_group); if (!grp) - return NULL; + return; list_for_each(cur, &grp->bb_prealloc_list) { ext4_group_t groupnr; struct ext4_prealloc_space *pa; @@ -696,7 +696,6 @@ static int __mb_check_buddy(struct ext4_buddy *e4b, char *file, for (i = 0; i < pa->pa_len; i++) MB_CHECK_ASSERT(mb_test_bit(k + i, buddy)); } - return 0; } #undef MB_CHECK_ASSERT #define mb_check_buddy(e4b) __mb_check_buddy(e4b, \ -- Gitee From 65e0aacf551605b2a4b6e4657b505f0a4d314ac2 Mon Sep 17 00:00:00 2001 From: Yongjian Sun Date: Thu, 6 Nov 2025 14:06:14 +0800 Subject: [PATCH 07/17] ext4: improve integrity checking in __mb_check_buddy by enhancing order-0 validation ANBZ: #31323 commit 1074a1474745c0a4bc9c154641002190eb7f9914 stable. commit d9ee3ff810f1cc0e253c9f2b17b668b973cb0e06 upstream. When the MB_CHECK_ASSERT macro is enabled, we found that the current validation logic in __mb_check_buddy has a gap in detecting certain invalid buddy states, particularly related to order-0 (bitmap) bits. The original logic consists of three steps: 1. Validates higher-order buddies: if a higher-order bit is set, at most one of the two corresponding lower-order bits may be free; if a higher-order bit is clear, both lower-order bits must be allocated (and their bitmap bits must be 0). 2. For any set bit in order-0, ensures all corresponding higher-order bits are not free. 3. Verifies that all preallocated blocks (pa) in the group have pa_pstart within bounds and their bitmap bits marked as allocated. However, this approach fails to properly validate cases where order-0 bits are incorrectly cleared (0), allowing some invalid configurations to pass: corrupt integral order 3 1 1 order 2 1 1 1 1 order 1 1 1 1 1 1 1 1 1 order 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Here we get two adjacent free blocks at order-0 with inconsistent higher-order state, and the right one shows the correct scenario. The root cause is insufficient validation of order-0 zero bits. To fix this and improve completeness without significant performance cost, we refine the logic: 1. Maintain the top-down higher-order validation, but we no longer check the cases where the higher-order bit is 0, as this case will be covered in step 2. 2. Enhance order-0 checking by examining pairs of bits: - If either bit in a pair is set (1), all corresponding higher-order bits must not be free. - If both bits are clear (0), then exactly one of the corresponding higher-order bits must be free 3. Keep the preallocation (pa) validation unchanged. This change closes the validation gap, ensuring illegal buddy states involving order-0 are correctly detected, while removing redundant checks and maintaining efficiency. Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4") Suggested-by: Jan Kara Signed-off-by: Yongjian Sun Reviewed-by: Baokun Li Reviewed-by: Jan Kara Message-ID: <20251106060614.631382-3-sunyongjian@huaweicloud.com> Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/mballoc.c | 49 +++++++++++++++++++++++++++++++---------------- 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index ac261c0af9d5..5b9a0f8adfbb 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -602,6 +602,24 @@ do { \ } \ } while (0) +/* + * Perform buddy integrity check with the following steps: + * + * 1. Top-down validation (from highest order down to order 1, excluding order-0 bitmap): + * For each pair of adjacent orders, if a higher-order bit is set (indicating a free block), + * at most one of the two corresponding lower-order bits may be clear (free). + * + * 2. Order-0 (bitmap) validation, performed on bit pairs: + * - If either bit in a pair is set (1, allocated), then all corresponding higher-order bits + * must not be free (0). + * - If both bits in a pair are clear (0, free), then exactly one of the corresponding + * higher-order bits must be free (0). + * + * 3. Preallocation (pa) list validation: + * For each preallocated block (pa) in the group: + * - Verify that pa_pstart falls within the bounds of this block group. + * - Ensure the corresponding bit(s) in the order-0 bitmap are marked as allocated (1). + */ static void __mb_check_buddy(struct ext4_buddy *e4b, char *file, const char *function, int line) { @@ -646,15 +664,6 @@ static void __mb_check_buddy(struct ext4_buddy *e4b, char *file, continue; } - /* both bits in buddy2 must be 1 */ - MB_CHECK_ASSERT(mb_test_bit(i << 1, buddy2)); - MB_CHECK_ASSERT(mb_test_bit((i << 1) + 1, buddy2)); - - for (j = 0; j < (1 << order); j++) { - k = (i * (1 << order)) + j; - MB_CHECK_ASSERT( - !mb_test_bit(k, e4b->bd_bitmap)); - } count++; } MB_CHECK_ASSERT(e4b->bd_info->bb_counters[order] == count); @@ -670,15 +679,21 @@ static void __mb_check_buddy(struct ext4_buddy *e4b, char *file, fragments++; fstart = i; } - continue; + } else { + fstart = -1; } - fstart = -1; - /* check used bits only */ - for (j = 0; j < e4b->bd_blkbits + 1; j++) { - buddy2 = mb_find_buddy(e4b, j, &max2); - k = i >> j; - MB_CHECK_ASSERT(k < max2); - MB_CHECK_ASSERT(mb_test_bit(k, buddy2)); + if (!(i & 1)) { + int in_use, zero_bit_count = 0; + + in_use = mb_test_bit(i, buddy) || mb_test_bit(i + 1, buddy); + for (j = 1; j < e4b->bd_blkbits + 2; j++) { + buddy2 = mb_find_buddy(e4b, j, &max2); + k = i >> j; + MB_CHECK_ASSERT(k < max2); + if (!mb_test_bit(k, buddy2)) + zero_bit_count++; + } + MB_CHECK_ASSERT(zero_bit_count == !in_use); } } MB_CHECK_ASSERT(!EXT4_MB_GRP_NEED_INIT(e4b->bd_info)); -- Gitee From d43eed45336433b16e21a452d038c2561a9cb46c Mon Sep 17 00:00:00 2001 From: Karina Yankevich Date: Wed, 22 Oct 2025 17:32:53 +0800 Subject: [PATCH 08/17] ext4: xattr: fix null pointer deref in ext4_raw_inode() ANBZ: #31323 commit b72a3476f0c97d02f63a6e9fff127348d55436f6 stable. commit b97cb7d6a051aa6ebd57906df0e26e9e36c26d14 upstream. If ext4_get_inode_loc() fails (e.g. if it returns -EFSCORRUPTED), iloc.bh will remain set to NULL. Since ext4_xattr_inode_dec_ref_all() lacks error checking, this will lead to a null pointer dereference in ext4_raw_inode(), called right after ext4_get_inode_loc(). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: c8e008b60492 ("ext4: ignore xattrs past end") Cc: stable@kernel.org Signed-off-by: Karina Yankevich Reviewed-by: Sergey Shtylyov Reviewed-by: Baokun Li Message-ID: <20251022093253.3546296-1-k.yankevich@omp.ru> Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/xattr.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index fa8ce1c66d12..8a7045d8a555 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1134,7 +1134,11 @@ ext4_xattr_inode_dec_ref_all(handle_t *handle, struct inode *parent, if (block_csum) end = (void *)bh->b_data + bh->b_size; else { - ext4_get_inode_loc(parent, &iloc); + err = ext4_get_inode_loc(parent, &iloc); + if (err) { + EXT4_ERROR_INODE(parent, "parent inode loc (error %d)", err); + return; + } end = (void *)ext4_raw_inode(&iloc) + EXT4_SB(parent->i_sb)->s_inode_size; } -- Gitee From 343bf543ef3fc13875ac862913fc3e635b9dc31b Mon Sep 17 00:00:00 2001 From: Yongjian Sun Date: Thu, 6 Nov 2025 14:06:13 +0800 Subject: [PATCH 09/17] ext4: fix incorrect group number assertion in mb_check_buddy ANBZ: #31323 commit 3fb5562eedea75bddaecda3590120f1a49c7f91c stable. commit 3f7a79d05c692c7cfec70bf104b1b3c3d0ce6247 upstream. When the MB_CHECK_ASSERT macro is enabled, an assertion failure can occur in __mb_check_buddy when checking preallocated blocks (pa) in a block group: Assertion failure in mb_free_blocks() : "groupnr == e4b->bd_group" This happens when a pa at the very end of a block group (e.g., pa_pstart=32765, pa_len=3 in a group of 32768 blocks) becomes exhausted - its pa_pstart is advanced by pa_len to 32768, which lies in the next block group. If this exhausted pa (with pa_len == 0) is still in the bb_prealloc_list during the buddy check, the assertion incorrectly flags it as belonging to the wrong group. A possible sequence is as follows: ext4_mb_new_blocks ext4_mb_release_context pa->pa_pstart += EXT4_C2B(sbi, ac->ac_b_ex.fe_len) pa->pa_len -= ac->ac_b_ex.fe_len __mb_check_buddy for each pa in group ext4_get_group_no_and_offset MB_CHECK_ASSERT(groupnr == e4b->bd_group) To fix this, we modify the check to skip block group validation for exhausted preallocations (where pa_len == 0). Such entries are in a transitional state and will be removed from the list soon, so they should not trigger an assertion. This change prevents the false positive while maintaining the integrity of the checks for active allocations. Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4") Signed-off-by: Yongjian Sun Reviewed-by: Baokun Li Reviewed-by: Jan Kara Message-ID: <20251106060614.631382-2-sunyongjian@huaweicloud.com> Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/mballoc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 5b9a0f8adfbb..40ca411bc654 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -706,6 +706,8 @@ static void __mb_check_buddy(struct ext4_buddy *e4b, char *file, ext4_group_t groupnr; struct ext4_prealloc_space *pa; pa = list_entry(cur, struct ext4_prealloc_space, pa_group_list); + if (!pa->pa_len) + continue; ext4_get_group_no_and_offset(sb, pa->pa_pstart, &groupnr, &k); MB_CHECK_ASSERT(groupnr == e4b->bd_group); for (i = 0; i < pa->pa_len; i++) -- Gitee From cbb21d4490f82becfbf5bf99da16ee9367f541e3 Mon Sep 17 00:00:00 2001 From: Byungchul Park Date: Fri, 24 Oct 2025 15:39:40 +0800 Subject: [PATCH 10/17] jbd2: use a weaker annotation in journal handling ANBZ: #31323 commit 544e55f02687d6745aea7472b117da05c67c2348 stable. commit 40a71b53d5a6d4ea17e4d54b99b2ac03a7f5e783 upstream. jbd2 journal handling code doesn't want jbd2_might_wait_for_commit() to be placed between start_this_handle() and stop_this_handle(). So it marks the region with rwsem_acquire_read() and rwsem_release(). However, the annotation is too strong for that purpose. We don't have to use more than try lock annotation for that. rwsem_acquire_read() implies: 1. might be a waiter on contention of the lock. 2. enter to the critical section of the lock. All we need in here is to act 2, not 1. So trylock version of annotation is sufficient for that purpose. Now that dept partially relies on lockdep annotaions, dept interpets rwsem_acquire_read() as a potential wait and might report a deadlock by the wait. Replace it with trylock version of annotation. Signed-off-by: Byungchul Park Reviewed-by: Jan Kara Cc: stable@kernel.org Message-ID: <20251024073940.1063-1-byungchul@sk.com> Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/jbd2/transaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 8ad96da7c98b..7361e0134026 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -450,7 +450,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle, read_unlock(&journal->j_state_lock); current->journal_info = handle; - rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_); + rwsem_acquire_read(&journal->j_trans_commit_map, 0, 1, _THIS_IP_); jbd2_journal_free_transaction(new_transaction); /* * Ensure that no allocations done while the transaction is open are -- Gitee From 140306553a9a21bddfa271ef38c7ce076ff47826 Mon Sep 17 00:00:00 2001 From: Fedor Pchelkin Date: Tue, 30 Dec 2025 09:26:14 +0800 Subject: [PATCH 11/17] ext4: fix string copying in parse_apply_sb_mount_options() ANBZ: #31323 commit 52ac96c4a2dd7bc47666000440b0602d9742e820 stable. commit ee5a977b4e771cc181f39d504426dbd31ed701cc upstream. strscpy_pad() can't be used to copy a non-NUL-term string into a NUL-term string of possibly bigger size. Commit 0efc5990bca5 ("string.h: Introduce memtostr() and memtostr_pad()") provides additional information in that regard. So if this happens, the following warning is observed: strnlen: detected buffer overflow: 65 byte read of buffer size 64 WARNING: CPU: 0 PID: 28655 at lib/string_helpers.c:1032 __fortify_report+0x96/0xc0 lib/string_helpers.c:1032 Modules linked in: CPU: 0 UID: 0 PID: 28655 Comm: syz-executor.3 Not tainted 6.12.54-syzkaller-00144-g5f0270f1ba00 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:__fortify_report+0x96/0xc0 lib/string_helpers.c:1032 Call Trace: __fortify_panic+0x1f/0x30 lib/string_helpers.c:1039 strnlen include/linux/fortify-string.h:235 [inline] sized_strscpy include/linux/fortify-string.h:309 [inline] parse_apply_sb_mount_options fs/ext4/super.c:2504 [inline] __ext4_fill_super fs/ext4/super.c:5261 [inline] ext4_fill_super+0x3c35/0xad00 fs/ext4/super.c:5706 get_tree_bdev_flags+0x387/0x620 fs/super.c:1636 vfs_get_tree+0x93/0x380 fs/super.c:1814 do_new_mount fs/namespace.c:3553 [inline] path_mount+0x6ae/0x1f70 fs/namespace.c:3880 do_mount fs/namespace.c:3893 [inline] __do_sys_mount fs/namespace.c:4103 [inline] __se_sys_mount fs/namespace.c:4080 [inline] __x64_sys_mount+0x280/0x300 fs/namespace.c:4080 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x64/0x140 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e Since userspace is expected to provide s_mount_opts field to be at most 63 characters long with the ending byte being NUL-term, use a 64-byte buffer which matches the size of s_mount_opts, so that strscpy_pad() does its job properly. Return with error if the user still managed to provide a non-NUL-term string here. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 8ecb790ea8c3 ("ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin Reviewed-by: Baokun Li Reviewed-by: Jan Kara Message-ID: <20251101160430.222297-1-pchelkin@ispras.ru> Signed-off-by: Theodore Ts'o [ goto failed_mount instead of return ] Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/super.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 155e39c13cb3..3bfe54bf938c 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4309,10 +4309,11 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) } if (sbi->s_es->s_mount_opts[0]) { - char s_mount_opts[65]; + char s_mount_opts[64]; - strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts, - sizeof(s_mount_opts)); + if (strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts, + sizeof(s_mount_opts)) < 0) + goto failed_mount; if (!parse_options(s_mount_opts, sb, &journal_devnum, &journal_ioprio, 0)) { ext4_msg(sb, KERN_WARNING, -- Gitee From a56794a0aa68dd860dfef6b528e207bc31d36f67 Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Tue, 30 Dec 2025 09:06:38 +0800 Subject: [PATCH 12/17] jbd2: fix the inconsistency between checksum and data in memory for journal sb ANBZ: #31323 commit 9f48638b2f7e5e8393e061b21e66ebfb3a4bca49 stable. commit 6abfe107894af7e8ce3a2e120c619d81ee764ad5 upstream. Copying the file system while it is mounted as read-only results in a mount failure: [~]# mkfs.ext4 -F /dev/sdc [~]# mount /dev/sdc -o ro /mnt/test [~]# dd if=/dev/sdc of=/dev/sda bs=1M [~]# mount /dev/sda /mnt/test1 [ 1094.849826] JBD2: journal checksum error [ 1094.850927] EXT4-fs (sda): Could not load journal inode mount: mount /dev/sda on /mnt/test1 failed: Bad message The process described above is just an abstracted way I came up with to reproduce the issue. In the actual scenario, the file system was mounted read-only and then copied while it was still mounted. It was found that the mount operation failed. The user intended to verify the data or use it as a backup, and this action was performed during a version upgrade. Above issue may happen as follows: ext4_fill_super set_journal_csum_feature_set(sb) if (ext4_has_metadata_csum(sb)) incompat = JBD2_FEATURE_INCOMPAT_CSUM_V3; if (test_opt(sb, JOURNAL_CHECKSUM) jbd2_journal_set_features(sbi->s_journal, compat, 0, incompat); lock_buffer(journal->j_sb_buffer); sb->s_feature_incompat |= cpu_to_be32(incompat); //The data in the journal sb was modified, but the checksum was not updated, so the data remaining in memory has a mismatch between the data and the checksum. unlock_buffer(journal->j_sb_buffer); In this case, the journal sb copied over is in a state where the checksum and data are inconsistent, so mounting fails. To solve the above issue, update the checksum in memory after modifying the journal sb. Fixes: 4fd5ea43bc11 ("jbd2: checksum journal superblock") Signed-off-by: Ye Bin Reviewed-by: Baokun Li Reviewed-by: Darrick J. Wong Reviewed-by: Jan Kara Message-ID: <20251103010123.3753631-1-yebin@huaweicloud.com> Signed-off-by: Theodore Ts'o Cc: stable@kernel.org [ Changed jbd2_superblock_csum(sb) to jbd2_superblock_csum(journal, sb) ] Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/jbd2/journal.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 9b5500eb54a9..1457183e4b4a 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -2642,6 +2642,12 @@ int jbd2_journal_set_features(journal_t *journal, unsigned long compat, sb->s_feature_compat |= cpu_to_be32(compat); sb->s_feature_ro_compat |= cpu_to_be32(ro); sb->s_feature_incompat |= cpu_to_be32(incompat); + /* + * Update the checksum now so that it is valid even for read-only + * filesystems where jbd2_write_superblock() doesn't get called. + */ + if (jbd2_journal_has_csum_v2or3(journal)) + sb->s_checksum = jbd2_superblock_csum(journal, sb); unlock_buffer(journal->j_sb_buffer); journal->j_revoke_records_per_block = journal_revoke_records_per_block(journal); @@ -2672,9 +2678,17 @@ void jbd2_journal_clear_features(journal_t *journal, unsigned long compat, sb = journal->j_superblock; + lock_buffer(journal->j_sb_buffer); sb->s_feature_compat &= ~cpu_to_be32(compat); sb->s_feature_ro_compat &= ~cpu_to_be32(ro); sb->s_feature_incompat &= ~cpu_to_be32(incompat); + /* + * Update the checksum now so that it is valid even for read-only + * filesystems where jbd2_write_superblock() doesn't get called. + */ + if (jbd2_journal_has_csum_v2or3(journal)) + sb->s_checksum = jbd2_superblock_csum(journal, sb); + unlock_buffer(journal->j_sb_buffer); journal->j_revoke_records_per_block = journal_revoke_records_per_block(journal); } -- Gitee From e8cae44cb3dc7dcfaf42d8762d9d91689cad299a Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Fri, 9 Jan 2026 23:23:13 +0800 Subject: [PATCH 13/17] ext4: introduce ITAIL helper MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ANBZ: #31323 commit 5872c0ff12f2f7d7b95cdd15d5b916ad53b86831 stable. commit 69f3a3039b0d0003de008659cafd5a1eaaa0a7a4 upstream. Introduce ITAIL helper to get the bound of xattr in inode. Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://patch.msgid.link/20250208063141.1539283-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Signed-off-by: David Nyström Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/xattr.c | 10 +++++----- fs/ext4/xattr.h | 3 +++ 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 8a7045d8a555..84c94f11e06e 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -599,7 +599,7 @@ ext4_xattr_ibody_get(struct inode *inode, int name_index, const char *name, return error; raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); - end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size; + end = ITAIL(inode, raw_inode); error = xattr_check_inode(inode, header, end); if (error) goto cleanup; @@ -744,7 +744,7 @@ ext4_xattr_ibody_list(struct dentry *dentry, char *buffer, size_t buffer_size) return error; raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); - end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size; + end = ITAIL(inode, raw_inode); error = xattr_check_inode(inode, header, end); if (error) goto cleanup; @@ -826,7 +826,7 @@ int ext4_get_inode_usage(struct inode *inode, qsize_t *usage) goto out; raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); - end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size; + end = ITAIL(inode, raw_inode); ret = xattr_check_inode(inode, header, end); if (ret) goto out; @@ -2219,7 +2219,7 @@ int ext4_xattr_ibody_find(struct inode *inode, struct ext4_xattr_info *i, header = IHDR(inode, raw_inode); is->s.base = is->s.first = IFIRST(header); is->s.here = is->s.first; - is->s.end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size; + is->s.end = ITAIL(inode, raw_inode); if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) { error = xattr_check_inode(inode, header, is->s.end); if (error) @@ -2743,7 +2743,7 @@ int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize, */ base = IFIRST(header); - end = (void *)raw_inode + EXT4_SB(inode->i_sb)->s_inode_size; + end = ITAIL(inode, raw_inode); min_offs = end - base; total_ino = sizeof(struct ext4_xattr_ibody_header) + sizeof(u32); diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h index e5e36bd11f05..9a596e19c2b1 100644 --- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -68,6 +68,9 @@ struct ext4_xattr_entry { ((void *)raw_inode + \ EXT4_GOOD_OLD_INODE_SIZE + \ EXT4_I(inode)->i_extra_isize)) +#define ITAIL(inode, raw_inode) \ + ((void *)(raw_inode) + \ + EXT4_SB((inode)->i_sb)->s_inode_size) #define IFIRST(hdr) ((struct ext4_xattr_entry *)((hdr)+1)) /* -- Gitee From 501b1cdcd20160adf9b5663ad9b29dc167b42455 Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Fri, 9 Jan 2026 23:23:14 +0800 Subject: [PATCH 14/17] ext4: fix out-of-bound read in ext4_xattr_inode_dec_ref_all() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ANBZ: #31323 commit 27202452b0bc942fdc3db72a44c4dcdab96d5b56 stable. commit 5701875f9609b000d91351eaa6bfd97fe2f157f4 upstream. There's issue as follows: BUG: KASAN: use-after-free in ext4_xattr_inode_dec_ref_all+0x6ff/0x790 Read of size 4 at addr ffff88807b003000 by task syz-executor.0/15172 CPU: 3 PID: 15172 Comm: syz-executor.0 Call Trace: __dump_stack lib/dump_stack.c:82 [inline] dump_stack+0xbe/0xfd lib/dump_stack.c:123 print_address_description.constprop.0+0x1e/0x280 mm/kasan/report.c:400 __kasan_report.cold+0x6c/0x84 mm/kasan/report.c:560 kasan_report+0x3a/0x50 mm/kasan/report.c:585 ext4_xattr_inode_dec_ref_all+0x6ff/0x790 fs/ext4/xattr.c:1137 ext4_xattr_delete_inode+0x4c7/0xda0 fs/ext4/xattr.c:2896 ext4_evict_inode+0xb3b/0x1670 fs/ext4/inode.c:323 evict+0x39f/0x880 fs/inode.c:622 iput_final fs/inode.c:1746 [inline] iput fs/inode.c:1772 [inline] iput+0x525/0x6c0 fs/inode.c:1758 ext4_orphan_cleanup fs/ext4/super.c:3298 [inline] ext4_fill_super+0x8c57/0xba40 fs/ext4/super.c:5300 mount_bdev+0x355/0x410 fs/super.c:1446 legacy_get_tree+0xfe/0x220 fs/fs_context.c:611 vfs_get_tree+0x8d/0x2f0 fs/super.c:1576 do_new_mount fs/namespace.c:2983 [inline] path_mount+0x119a/0x1ad0 fs/namespace.c:3316 do_mount+0xfc/0x110 fs/namespace.c:3329 __do_sys_mount fs/namespace.c:3540 [inline] __se_sys_mount+0x219/0x2e0 fs/namespace.c:3514 do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x67/0xd1 Memory state around the buggy address: ffff88807b002f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88807b002f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88807b003000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88807b003080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88807b003100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff Above issue happens as ext4_xattr_delete_inode() isn't check xattr is valid if xattr is in inode. To solve above issue call xattr_check_inode() check if xattr if valid in inode. In fact, we can directly verify in ext4_iget_extra_inode(), so that there is no divergent verification. Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://patch.msgid.link/20250208063141.1539283-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o Signed-off-by: David Nyström Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/inode.c | 5 +++++ fs/ext4/xattr.c | 26 +------------------------- fs/ext4/xattr.h | 7 +++++++ 3 files changed, 13 insertions(+), 25 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 38f7b920deb7..35479a62a135 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4609,6 +4609,11 @@ static inline int ext4_iget_extra_inode(struct inode *inode, *magic == cpu_to_le32(EXT4_XATTR_MAGIC)) { int err; + err = xattr_check_inode(inode, IHDR(inode, raw_inode), + ITAIL(inode, raw_inode)); + if (err) + return err; + ext4_set_inode_state(inode, EXT4_STATE_XATTR); err = ext4_find_inline_data_nolock(inode); if (!err && ext4_has_inline_data(inode)) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 84c94f11e06e..1e0344e24266 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -263,7 +263,7 @@ __ext4_xattr_check_block(struct inode *inode, struct buffer_head *bh, __ext4_xattr_check_block((inode), (bh), __func__, __LINE__) -static int +int __xattr_check_inode(struct inode *inode, struct ext4_xattr_ibody_header *header, void *end, const char *function, unsigned int line) { @@ -280,9 +280,6 @@ __xattr_check_inode(struct inode *inode, struct ext4_xattr_ibody_header *header, return error; } -#define xattr_check_inode(inode, header, end) \ - __xattr_check_inode((inode), (header), (end), __func__, __LINE__) - static int xattr_find_entry(struct inode *inode, struct ext4_xattr_entry **pentry, void *end, int name_index, const char *name, int sorted) @@ -600,9 +597,6 @@ ext4_xattr_ibody_get(struct inode *inode, int name_index, const char *name, raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); end = ITAIL(inode, raw_inode); - error = xattr_check_inode(inode, header, end); - if (error) - goto cleanup; entry = IFIRST(header); error = xattr_find_entry(inode, &entry, end, name_index, name, 0); if (error) @@ -734,7 +728,6 @@ ext4_xattr_ibody_list(struct dentry *dentry, char *buffer, size_t buffer_size) struct ext4_xattr_ibody_header *header; struct ext4_inode *raw_inode; struct ext4_iloc iloc; - void *end; int error; if (!ext4_test_inode_state(inode, EXT4_STATE_XATTR)) @@ -744,14 +737,9 @@ ext4_xattr_ibody_list(struct dentry *dentry, char *buffer, size_t buffer_size) return error; raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); - end = ITAIL(inode, raw_inode); - error = xattr_check_inode(inode, header, end); - if (error) - goto cleanup; error = ext4_xattr_list_entries(dentry, IFIRST(header), buffer, buffer_size); -cleanup: brelse(iloc.bh); return error; } @@ -815,7 +803,6 @@ int ext4_get_inode_usage(struct inode *inode, qsize_t *usage) struct ext4_xattr_ibody_header *header; struct ext4_xattr_entry *entry; qsize_t ea_inode_refs = 0; - void *end; int ret; lockdep_assert_held_read(&EXT4_I(inode)->xattr_sem); @@ -826,10 +813,6 @@ int ext4_get_inode_usage(struct inode *inode, qsize_t *usage) goto out; raw_inode = ext4_raw_inode(&iloc); header = IHDR(inode, raw_inode); - end = ITAIL(inode, raw_inode); - ret = xattr_check_inode(inode, header, end); - if (ret) - goto out; for (entry = IFIRST(header); !IS_LAST_ENTRY(entry); entry = EXT4_XATTR_NEXT(entry)) @@ -2221,9 +2204,6 @@ int ext4_xattr_ibody_find(struct inode *inode, struct ext4_xattr_info *i, is->s.here = is->s.first; is->s.end = ITAIL(inode, raw_inode); if (ext4_test_inode_state(inode, EXT4_STATE_XATTR)) { - error = xattr_check_inode(inode, header, is->s.end); - if (error) - return error; /* Find the named attribute. */ error = xattr_find_entry(inode, &is->s.here, is->s.end, i->name_index, i->name, 0); @@ -2747,10 +2727,6 @@ int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize, min_offs = end - base; total_ino = sizeof(struct ext4_xattr_ibody_header) + sizeof(u32); - error = xattr_check_inode(inode, header, end); - if (error) - goto cleanup; - ifree = ext4_xattr_free_space(base, &min_offs, base, &total_ino); if (ifree >= isize_diff) goto shift; diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h index 9a596e19c2b1..cbf235422aec 100644 --- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -210,6 +210,13 @@ extern int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode, extern struct mb_cache *ext4_xattr_create_cache(void); extern void ext4_xattr_destroy_cache(struct mb_cache *); +extern int +__xattr_check_inode(struct inode *inode, struct ext4_xattr_ibody_header *header, + void *end, const char *function, unsigned int line); + +#define xattr_check_inode(inode, header, end) \ + __xattr_check_inode((inode), (header), (end), __func__, __LINE__) + #ifdef CONFIG_EXT4_FS_SECURITY extern int ext4_init_security(handle_t *handle, struct inode *inode, struct inode *dir, const struct qstr *qstr); -- Gitee From 7e4ceec1deaa84e79d3bcad7962976b1036537a3 Mon Sep 17 00:00:00 2001 From: Yang Erkun Date: Sat, 13 Dec 2025 13:57:06 +0800 Subject: [PATCH 15/17] ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref ANBZ: #31323 commit 7c9f059c3d531a12d7ad96cd34a44b8af7c00d5f stable. commit d250bdf531d9cd4096fedbb9f172bb2ca660c868 upstream. The error branch for ext4_xattr_inode_update_ref forget to release the refcount for iloc.bh. Find this when review code. Fixes: 57295e835408 ("ext4: guard against EA inode refcount underflow in xattr update") Signed-off-by: Yang Erkun Reviewed-by: Baokun Li Reviewed-by: Zhang Yi Link: https://patch.msgid.link/20251213055706.3417529-1-yangerkun@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/xattr.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 1e0344e24266..bbdacb5b9c0a 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -980,6 +980,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, ext4_error_inode(ea_inode, __func__, __LINE__, 0, "EA inode %lu ref wraparound: ref_count=%lld ref_change=%d", ea_inode->i_ino, ref_count, ref_change); + brelse(iloc.bh); ret = -EFSCORRUPTED; goto out; } -- Gitee From 66fdb5a7e41d13b3269abd4114f631bd757d2f09 Mon Sep 17 00:00:00 2001 From: Zhang Yi Date: Sat, 29 Nov 2025 18:32:37 +0800 Subject: [PATCH 16/17] ext4: don't cache extent during splitting extent ANBZ: #31323 commit 8302b5b4aacdbb378f7b1216bb2ee782b5142415 stable. commit 8b4b19a2f96348d70bfa306ef7d4a13b0bcbea79 upstream. Caching extents during the splitting process is risky, as it may result in stale extents remaining in the status tree. Moreover, in most cases, the corresponding extent block entries are likely already cached before the split happens, making caching here not particularly useful. Assume we have an unwritten extent, and then DIO writes the first half. [UUUUUUUUUUUUUUUU] on-disk extent U: unwritten extent [UUUUUUUUUUUUUUUU] extent status tree |<- ->| ----> dio write this range First, when ext4_split_extent_at() splits this extent, it truncates the existing extent and then inserts a new one. During this process, this extent status entry may be shrunk, and calls to ext4_find_extent() and ext4_cache_extents() may occur, which could potentially insert the truncated range as a hole into the extent status tree. After the split is completed, this hole is not replaced with the correct status. [UUUUUUU|UUUUUUUU] on-disk extent U: unwritten extent [UUUUUUU|HHHHHHHH] extent status tree H: hole Then, the outer calling functions will not correct this remaining hole extent either. Finally, if we perform a delayed buffer write on this latter part, it will re-insert the delayed extent and cause an error in space accounting. In adition, if the unwritten extent cache is not shrunk during the splitting, ext4_cache_extents() also conflicts with existing extents when caching extents. In the future, we will add checks when caching extents, which will trigger a warning. Therefore, Do not cache extents that are being split. Signed-off-by: Zhang Yi Reviewed-by: Ojaswin Mujoo Reviewed-by: Baokun Li Cc: stable@kernel.org Message-ID: <20251129103247.686136-6-yi.zhang@huaweicloud.com> Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/extents.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index b285aff62591..25f6b7160d9b 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -3169,6 +3169,9 @@ static int ext4_split_extent_at(handle_t *handle, BUG_ON((split_flag & (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2)) == (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2)); + /* Do not cache extents that are in the process of being modified. */ + flags |= EXT4_EX_NOCACHE; + ext_debug(inode, "logical block %llu\n", (unsigned long long)split); ext4_ext_show_leaf(inode, path); @@ -3339,6 +3342,9 @@ static int ext4_split_extent(handle_t *handle, ee_len = ext4_ext_get_actual_len(ex); unwritten = ext4_ext_is_unwritten(ex); + /* Do not cache extents that are in the process of being modified. */ + flags |= EXT4_EX_NOCACHE; + if (map->m_lblk + map->m_len < ee_block + ee_len) { split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT; flags1 = flags | EXT4_GET_BLOCKS_PRE_IO; -- Gitee From 8201a17639e546c28769c5c5f2cfd393e6cd0ca4 Mon Sep 17 00:00:00 2001 From: Zilin Guan Date: Thu, 25 Dec 2025 16:48:00 +0800 Subject: [PATCH 17/17] ext4: fix memory leak in ext4_ext_shift_extents() ANBZ: #31323 commit 7e807cb8603b7664fa630a696cd891d9a03c248d stable. commit ca81109d4a8f192dc1cbad4a1ee25246363c2833 upstream. In ext4_ext_shift_extents(), if the extent is NULL in the while loop, the function returns immediately without releasing the path obtained via ext4_find_extent(), leading to a memory leak. Fix this by jumping to the out label to ensure the path is properly released. Fixes: a18ed359bdddc ("ext4: always check ext4_ext_find_extent result") Signed-off-by: Zilin Guan Reviewed-by: Zhang Yi Reviewed-by: Baokun Li Link: https://patch.msgid.link/20251225084800.905701-1-zilin@seu.edu.cn Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Xiao Long Signed-off-by: 528167 --- fs/ext4/extents.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 25f6b7160d9b..19552c8e6dbd 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -5262,7 +5262,8 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle, if (!extent) { EXT4_ERROR_INODE(inode, "unexpected hole at %lu", (unsigned long) *iterator); - return -EFSCORRUPTED; + ret = -EFSCORRUPTED; + goto out; } if (SHIFT == SHIFT_LEFT && *iterator > le32_to_cpu(extent->ee_block)) { -- Gitee