Re: kernel BUG in __clear_extent_bit

From: David Sterba <hidden>
Date: 2021-09-24 15:16:31
Also in: lkml

On Thu, Sep 23, 2021 at 10:24:51AM +0800, Hao Sun wrote:

On Wed, Sep 15, 2021 at 1:33 PM, Qu Wenruo [off-list ref] wrote:

On 2021/9/15 10:20 AM, Hao Sun wrote:

Hello,

When using Healer to fuzz the latest Linux kernel, the following crash
was triggered.

HEAD commit: 6880fa6c5660 Linux 5.15-rc1
git tree: upstream
console output:
https://drive.google.com/file/d/1-9wwV6-OmBcJvHGCbMbP5_uCVvrUdTp3/view?usp=sharing
kernel config: https://drive.google.com/file/d/1rUzyMbe5vcs6khA3tL9EHTLJvsUdWcgB/view?usp=sharing
C reproducer: https://drive.google.com/file/d/1eXePTqMQ5ZA0TWtgpTX50Ez4q9ZKm_HE/view?usp=sharing
Syzlang reproducer:
https://drive.google.com/file/d/11s13louoKZ7Uz0mdywM2jmE9B1JEIt8U/view?usp=sharing

If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <redacted>

loop1: detected capacity change from 0 to 32768
BTRFS info (device loop1): disk space caching is enabled
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): enabling ssd optimizations
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 1 PID: 25852 Comm: syz-executor Not tainted 5.15.0-rc1 #16
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
  fail_dump lib/fault-inject.c:52 [inline]
  should_fail+0x13c/0x160 lib/fault-inject.c:146
  should_failslab+0x5/0x10 mm/slab_common.c:1328
  slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
  slab_alloc_node mm/slub.c:3120 [inline]
  slab_alloc mm/slub.c:3214 [inline]
  kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
  alloc_extent_state+0x1e/0x1c0 fs/btrfs/extent_io.c:340

This is one of the core systems btrfs uses, and we really don't want
it to fail.

That is why it does some preallocation to prevent failure.

But in the error injection case, we can still hit the BUG_ON() that is
used to catch ENOMEM.
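
To make the preallocation point concrete, here is a minimal userspace
sketch of the pattern described above. It is not the btrfs code:
`range_state`, `alloc_state()`, `set_range()` and the `fail_allocs`
fault-injection counter are hypothetical names, and `BUG_ON()` is
emulated with `assert()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's BUG_ON(): abort instead of panic. */
#define BUG_ON(cond) assert(!(cond))

/* Hypothetical, simplified record for a byte range; not the btrfs struct. */
struct range_state {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical fault injection: the next fail_allocs allocations fail. */
static int fail_allocs;

static struct range_state *alloc_state(void)
{
	if (fail_allocs > 0) {
		fail_allocs--;
		return NULL;	/* simulated ENOMEM */
	}
	return calloc(1, sizeof(struct range_state));
}

/*
 * Record [start, end]. The first allocation happens before the (imaginary)
 * lock is taken; the second attempt is only a fallback inside the critical
 * section, where the error can no longer be returned cleanly.
 */
static struct range_state *set_range(unsigned long start, unsigned long end)
{
	struct range_state *prealloc = alloc_state();	/* may be NULL */

	/* a lock would be taken here */
	if (!prealloc)
		prealloc = alloc_state();	/* last-chance attempt */
	BUG_ON(!prealloc);			/* both attempts failed */

	prealloc->start = start;
	prealloc->end = end;
	/* the lock would be dropped here */
	return prealloc;
}
```

With `fail_allocs = 1` the preallocation fails but the fallback succeeds,
so the caller never notices. With `fail_allocs = 2` both attempts fail
and the `BUG_ON()` stand-in aborts, which is the situation that fault
injection provokes.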

Hello,

The fuzzer repeatedly triggered the following crashes when `fault
injection` was enabled.

HEAD commit: 92477dd1faa6 Merge tag 's390-5.15-ebpf-jit-fixes'
git tree: upstream
kernel config: https://drive.google.com/file/d/1KgvcM8i_3hQiOL3fUh3JFpYNQM4itvV4/view?usp=sharing
[1] kernel BUG in btrfs_free_tree_block (fs/btrfs/extent-tree.c:3297):
https://paste.ubuntu.com/p/ZtzVKWbcGm/
[2] kernel BUG in clear_state_bit (fs/btrfs/extent_io.c:658):
https://paste.ubuntu.com/p/hps2wXPG2b/
[3] kernel BUG in set_extent_bit (fs/btrfs/extent_io.c:1021):
https://paste.ubuntu.com/p/dcptjYYxgd/
[4] kernel BUG in set_state_bits (fs/btrfs/extent_io.c:939):
https://paste.ubuntu.com/p/NV9qtKB4KZ/

All of the above crashes were triggered directly by the `BUG_ON()` macro
at the corresponding location.
Most of these `BUG_ON()`s were hit because of an `ENOMEM` caused by the
injected fault.
Would it be better for btrfs to handle the `ENOMEM` error, e.g., by
returning gracefully, rather than panicking the kernel?

If it were that easy, we would have done it already. Unfortunately, in
some deep call chains, under locks, or in contexts where the whole
operation is split across subsystems or threads, it's not always
possible to roll back. Some tricks like preallocation can bail out early,
but we can't preallocate everything. The allocations are done under
GFP_NOFS, which still has the no-fail semantics. The errors you report do
not normally happen because the allocator tries hard to return some memory.
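
As an illustration of the "bail out early" case, here is a minimal
userspace sketch, assuming an operation whose memory needs are known
before it changes anything. The names (`struct range`, `alloc_range()`,
`split_range()`, `fail_allocs`) are hypothetical and do not come from
btrfs.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical singly linked list of half-open ranges [start, end). */
struct range {
	int start;
	int end;
	struct range *next;
};

/* Hypothetical fault injection: the next fail_allocs allocations fail. */
static int fail_allocs;

static struct range *alloc_range(void)
{
	if (fail_allocs > 0) {
		fail_allocs--;
		return NULL;	/* simulated ENOMEM */
	}
	return calloc(1, sizeof(struct range));
}

/*
 * Split r at 'at' into [start, at) and [at, end).
 * Returns 0 on success, -EINVAL for a bad split point, -ENOMEM if the new
 * node cannot be allocated. The allocation is done before r is touched,
 * so on failure there is nothing to roll back.
 */
static int split_range(struct range *r, int at)
{
	struct range *tail;

	if (at <= r->start || at >= r->end)
		return -EINVAL;

	tail = alloc_range();
	if (!tail)
		return -ENOMEM;	/* r is still unmodified */

	tail->start = at;
	tail->end = r->end;
	tail->next = r->next;
	r->end = at;
	r->next = tail;
	return 0;
}
```

This only works because the single allocation can be made up front. Once
an operation has already modified shared state, or needs allocations
whose number is not known in advance, there is no such clean point to
return `-ENOMEM` from.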
