Re: [5.15-rc1 regression] io_uring: fsstress hangs in do_coredump() on exit
From: Dave Chinner <david@fromorbit.com>
Date: 2021-09-21 21:36:15
Also in:
linux-fsdevel
On Tue, Sep 21, 2021 at 08:19:53AM -0600, Jens Axboe wrote:
On 9/21/21 7:25 AM, Jens Axboe wrote:quoted
On 9/21/21 12:40 AM, Dave Chinner wrote:quoted
Hi Jens, I updated all my trees from 5.14 to 5.15-rc2 this morning and immediately had problems running the recoveryloop fstest group on them. These tests have a typical pattern of "run load in the background, shutdown the filesystem, kill load, unmount and test recovery". Whent eh load includes fsstress, and it gets killed after shutdown, it hangs on exit like so: # echo w > /proc/sysrq-trigger [ 370.669482] sysrq: Show Blocked State [ 370.671732] task:fsstress state:D stack:11088 pid: 9619 ppid: 9615 flags:0x00000000 [ 370.675870] Call Trace: [ 370.677067] __schedule+0x310/0x9f0 [ 370.678564] schedule+0x67/0xe0 [ 370.679545] schedule_timeout+0x114/0x160 [ 370.682002] __wait_for_common+0xc0/0x160 [ 370.684274] wait_for_completion+0x24/0x30 [ 370.685471] do_coredump+0x202/0x1150 [ 370.690270] get_signal+0x4c2/0x900 [ 370.691305] arch_do_signal_or_restart+0x106/0x7a0 [ 370.693888] exit_to_user_mode_prepare+0xfb/0x1d0 [ 370.695241] syscall_exit_to_user_mode+0x17/0x40 [ 370.696572] do_syscall_64+0x42/0x80 [ 370.697620] entry_SYSCALL_64_after_hwframe+0x44/0xae It's 100% reproducable on one of my test machines, but only one of them. That one machine is running fstests on pmem, so it has synchronous storage. Every other test machine using normal async storage (nvme, iscsi, etc) and none of them are hanging. A quick troll of the commit history between 5.14 and 5.15-rc2 indicates a couple of potential candidates. The 5th kernel build (instead of ~16 for a bisect) told me that commit 15e20db2e0ce ("io-wq: only exit on fatal signals") is the cause of the regression. I've confirmed that this is the first commit where the problem shows up.Thanks for the report Dave, I'll take a look. Can you elaborate on exactly what is being run? And when killed, it's a non-fatal signal?
It's whatever kill/killall sends by default. Typical behaviour that causes a hang is something like: $FSSTRESS_PROG -n10000000 -p $PROCS -d $load_dir >> $seqres.full 2>&1 & .... sleep 5 _scratch_shutdown $KILLALL_PROG -q $FSSTRESS_PROG wait _shutdown_scratch is typically just an 'xfs_io -rx -c "shutdown" /mnt/scratch' command that shuts down the filesystem. Other tests in the recoveryloop group use DM targets to fail IO that trigger a shutdown, others inject errors that trigger shutdowns, etc. But the result is that all hang waiting for fsstress processes that have been using io_uring to exit. Just run fstests with "./check -g recoveryloop" - there's only a handful of tests and it only takes about 5 minutes to run them all on a fake DRAM based pmem device..
quoted hunk ↗ jump to hunk
Can you try with this patch?diff --git a/fs/io-wq.c b/fs/io-wq.c index b5fd015268d7..1e55a0a2a217 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c@@ -586,7 +586,8 @@ static int io_wqe_worker(void *data) if (!get_signal(&ksig)) continue; - if (fatal_signal_pending(current)) + if (fatal_signal_pending(current) || + signal_group_exit(current->signal)) { break; continue; }
Cleaned up so it compiles and the tests run properly again. But playing whack-a-mole with signals seems kinda fragile. I was pointed to this patchset by another dev on #xfs overnight who saw the same hangs that also fixed the hang: https://lore.kernel.org/lkml/cover.1629655338.git.olivier@trillion01.com/ (local) It was posted about a month ago and I don't see any response to it on the lists... Cheers, Dave, -- Dave Chinner david@fromorbit.com