]> git.baikalelectronics.ru Git - kernel.git/commitdiff
ext4: fix extent status tree race in writeback error recovery path
authorEric Whitney <enwlinux@gmail.com>
Wed, 15 Jun 2022 16:05:30 +0000 (12:05 -0400)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 17 Aug 2022 12:24:26 +0000 (14:24 +0200)
[ Upstream commit 2aade4676daeabddedddb2b237181a9216668fc0 ]

A race can occur in the unlikely event ext4 is unable to allocate a
physical cluster for a delayed allocation in a bigalloc file system
during writeback.  Failure to allocate a cluster forces error recovery
that includes a call to mpage_release_unused_pages().  That function
removes any corresponding delayed allocated blocks from the extent
status tree.  If a new delayed write is in progress on the same cluster
simultaneously, resulting in the addition of an new extent containing
one or more blocks in that cluster to the extent status tree, delayed
block accounting can be thrown off if that delayed write then encounters
a similar cluster allocation failure during future writeback.

Write lock the i_data_sem in mpage_release_unused_pages() to fix this
problem.  Ext4's block/cluster accounting code for bigalloc relies on
i_data_sem for mutual exclusion, as is found in the delayed write path,
and the locking in mpage_release_unused_pages() is missing.

Cc: stable@kernel.org
Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
fs/ext4/inode.c

index f0350d60ba507c50345d1064f8432236b9668b41..149377c849ee8b9a6be61cb4eb8313aa763b6756 100644 (file)
@@ -1560,7 +1560,14 @@ static void mpage_release_unused_pages(struct mpage_da_data *mpd,
                ext4_lblk_t start, last;
                start = index << (PAGE_SHIFT - inode->i_blkbits);
                last = end << (PAGE_SHIFT - inode->i_blkbits);
+
+               /*
+                * avoid racing with extent status tree scans made by
+                * ext4_insert_delayed_block()
+                */
+               down_write(&EXT4_I(inode)->i_data_sem);
                ext4_es_remove_extent(inode, start, last - start + 1);
+               up_write(&EXT4_I(inode)->i_data_sem);
        }
 
        pagevec_init(&pvec);