Re: 5.15+, blocked tasks, folio_wait_bit_common
From: Chris Murphy <hidden>
Date: 2021-11-12 18:06:33
Also in: linux-fsdevel
On Fri, Nov 12, 2021 at 1:55 AM Nikolay Borisov [off-list ref] wrote:
On 11.11.21 г. 22:57, Chris Murphy wrote:
On Thu, Nov 11, 2021 at 3:24 PM Chris Murphy [off-list ref] wrote:
Soon after logging in and launching some apps, I get a hang. Although there's lots of btrfs stuff in the call traces, I think we're stuck in writeback, so everything else just piles up and it all hangs indefinitely.

Happening since at least 5.16.0-0.rc0.20211109gitd2f38a3c6507.9.fc36.x86_64 and is still happening with 5.16.0-0.rc0.20211111gitdebe436e77c7.11.fc36.x86_64

Full dmesg including sysrq+w when the journal becomes unresponsive, and then a bunch of blocked tasks > 120s roll in on their own.
https://bugzilla-attachments.redhat.com/attachment.cgi?id=1841283

The btrfs traces in this one don't look interesting. What's interesting is that you have a bunch of tasks, including the btrfs transaction commit, which are stuck waiting to get a tag from the underlying block device in the blk_mq_get_tag function. This indicates something's going on with the underlying block device.
Well the hang doesn't ever happen with 5.14.x or 5.15.x kernels, only the misc-next (Fedora rc0) kernels. And also I just discovered that it's not happening (or not as quickly) with IO scheduler none. I've been using kyber and when I switch back to it, the hang happens almost immediately.

--
Chris Murphy
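
For anyone trying to reproduce the none vs. kyber difference, here is a rough sketch of checking and switching the IO scheduler through sysfs. It assumes the usual /sys/block/<dev>/queue/scheduler file, where the active scheduler is shown in square brackets. The device name "sda" and the helper names are only placeholders for illustration, not something taken from the report above.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    The kernel marks the active scheduler with square brackets,
    e.g. "mq-deadline [kyber] bfq none".
    """
    active = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def scheduler_path(device):
    # "device" is a block device name such as "sda" (placeholder).
    return Path("/sys/block") / device / "queue" / "scheduler"


def read_scheduler(device):
    """Return (active, available) for the given block device."""
    return parse_scheduler_line(scheduler_path(device).read_text())


def set_scheduler(device, name):
    """Switch the scheduler; needs root, and the scheduler must be available."""
    active, available = read_scheduler(device)
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    scheduler_path(device).write_text(name)


if __name__ == "__main__":
    # Parsing a sample line only, so this runs without touching sysfs.
    print(parse_scheduler_line("mq-deadline [kyber] bfq none\n"))
```

Calling set_scheduler("sda", "none") and then set_scheduler("sda", "kyber") as root would mirror the switch described above. The change applies per device and does not persist across reboots.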