Thread (6 messages), 3 authors, 2021-11-15

Re: 5.15+, blocked tasks, folio_wait_bit_common

From: Nikolay Borisov <hidden>
Date: 2021-11-12 18:46:46
Also in: linux-fsdevel

[CC'ing Omar as Kyber is mentioned]

On 12.11.21 г. 20:06, Chris Murphy wrote:
On Fri, Nov 12, 2021 at 1:55 AM Nikolay Borisov [off-list ref] wrote:


On 11.11.21 г. 22:57, Chris Murphy wrote:
On Thu, Nov 11, 2021 at 3:24 PM Chris Murphy [off-list ref] wrote:
Soon after logging in and launching some apps, I get a hang. Although
there's lots of btrfs stuff in the call traces, I think we're stuck in
writeback so everything else just piles up and it all hangs
indefinitely.

Happening since at least
5.16.0-0.rc0.20211109gitd2f38a3c6507.9.fc36.x86_64 and is still happening with
5.16.0-0.rc0.20211111gitdebe436e77c7.11.fc36.x86_64

Full dmesg, including sysrq+w from when the journal becomes unresponsive;
after that, a bunch of blocked tasks > 120s roll in on their own.

https://bugzilla-attachments.redhat.com/attachment.cgi?id=1841283

The btrfs traces in this one don't look interesting. What's
interesting is that you have a bunch of tasks, including the btrfs
transaction commit, which are stuck waiting to get a tag from the
underlying block device (the blk_mq_get_tag function). This indicates
something's going on with the underlying block device.
Well the hang doesn't ever happen with 5.14.x or 5.15.x kernels, only
the misc-next (Fedora rc0) kernels. And also I just discovered that
it's not happening (or not as quickly) with IO scheduler none. I've
been using kyber and when I switch back to it, the hang happens almost
immediately.
Well, I see a bunch of WARN_ONs being triggered, so is it possible that
this is an issue which is going to be fixed in some future RC? Omar,
what steps should be taken to try and debug this from the Kyber side of
things?
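
For anyone repeating the kyber versus none comparison described above, here is a minimal sketch of reading and switching the active IO scheduler through sysfs. It assumes the standard /sys/block/<device>/queue/scheduler file, where the active scheduler is shown in square brackets. The device name "nvme0n1" and the helper names are hypothetical and not taken from this thread; writing the file requires root.

```python
from pathlib import Path
from typing import List, Tuple


def parse_scheduler_line(line: str) -> Tuple[str, List[str]]:
    """Parse a sysfs scheduler line such as 'mq-deadline [kyber] bfq none'.

    Returns (active, available); the active scheduler is the bracketed one.
    """
    active = ""
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def scheduler_path(device: str) -> Path:
    # Standard sysfs location of the per-device scheduler knob.
    return Path("/sys/block") / device / "queue" / "scheduler"


def read_scheduler(device: str) -> Tuple[str, List[str]]:
    return parse_scheduler_line(scheduler_path(device).read_text())


def set_scheduler(device: str, name: str) -> None:
    # Requires root; the kernel rejects names that are not available.
    _, available = read_scheduler(device)
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    scheduler_path(device).write_text(name)


if __name__ == "__main__":
    device = "nvme0n1"  # hypothetical device name, adjust for your system
    print(read_scheduler(device))
```

Calling set_scheduler(device, "none") and then set_scheduler(device, "kyber") would reproduce the switch described in the report, so only run it on a machine where a hang is acceptable.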

