Thread (58 messages), 10 authors, 2022-09-06

Re: stalling IO regression since linux 5.12, through 5.18

From: Jan Kara <jack@suse.cz>
Date: 2022-08-17 11:49:42
Also in: linux-block, lkml

On Wed 17-08-22 11:52:54, Holger Hoffstätte wrote:
> On 2022-08-16 17:34, Chris Murphy wrote:
> > On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote:
> > > How about changing the scheduler either mq-deadline or noop, just
> > > to see if this is also reproducible with a different scheduler. I
> > > guess noop would imply the blk cgroup controller is going to be
> > > disabled
> >
> > I already reported on that: always happens with bfq within an hour or
> > less. Doesn't happen with mq-deadline for ~25+ hours. Does happen
> > with bfq with the above patches removed. Does happen with
> > cgroup.disabled=io set.
> >
> > Sounds to me like it's something bfq depends on and is somehow
> > becoming perturbed in a way that mq-deadline does not, and has
> > changed between 5.11 and 5.12. I have no idea what's under bfq that
> > matches this description.
>
> Chris, just a shot in the dark but can you try the patch from
>
> https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/
>
> on top of something more recent than 5.12? Ideally 5.19 where it applies
> cleanly.
>
> No guarantees, I just remembered this patch and your problem sounds like
> a lost wakeup. Maybe BFQ just drives the sbitmap in a way that triggers the
> symptom.

Yes, the symptoms look similar, and it happens for devices with shared
tagsets (which megaraid sas is), but that problem usually appeared when
there were lots of LUNs sharing the tagset, so that the number of tags
available per LUN was rather low. Not sure if that is the case here, but
that patch is probably worth a try.

Another thing worth trying is to compile the kernel without
CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in
BFQ so we will see whether the problem may be cgroup related or not.
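
Before rebuilding, it may help to confirm how the currently running kernel
was configured. The following is a minimal sketch, not part of the original
message: it assumes the kernel config is exposed at /proc/config.gz or at
/boot/config-<release> (either may be absent on a given distribution), and
the helper names option_state and read_running_config are made up for
illustration.

```python
import gzip
import os
import platform

OPTION = "CONFIG_BFQ_GROUP_IOSCHED"


def option_state(config_text, option=OPTION):
    """Return the value after '=' if the option is set, 'n' if it is
    explicitly marked 'is not set', and None if it is not mentioned."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# " + option + " is not set":
            return "n"
    return None


def read_running_config():
    """Try the usual config locations (an assumption; distros differ)."""
    candidates = ["/proc/config.gz", "/boot/config-" + platform.release()]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt") as f:
                return f.read()
        except OSError:
            continue  # unreadable here, try the next candidate
    return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print(OPTION, "->", option_state(text))
```

A result of "y" means BFQ was built with cgroup support; "n" or None means
the test kernel already lacks it.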

Another interesting thing might be to dump
/sys/kernel/debug/block/<device>/hctx*/{sched_tags,sched_tags_bitmap,tags,tags_bitmap}
as the system is hanging. That should tell us whether tags are in fact in
use or not when processes are blocking waiting for tags.
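
One possible way to capture those files in a single pass while the hang is
in progress is sketched below. It is only an illustration: it assumes
debugfs is mounted at /sys/kernel/debug and that the script runs as root;
the function name collect_tag_state and the default device "sda" are
placeholders, not something from this thread.

```python
import glob
import os
import sys

# The four per-hctx files named above.
FILES = ("sched_tags", "sched_tags_bitmap", "tags", "tags_bitmap")


def collect_tag_state(device, root="/sys/kernel/debug/block"):
    """Return {path: contents} for every hctx*/ tag file of one device."""
    out = {}
    for hctx in sorted(glob.glob(os.path.join(root, device, "hctx*"))):
        for name in FILES:
            path = os.path.join(hctx, name)
            try:
                with open(path, "r", errors="replace") as f:
                    out[path] = f.read()
            except OSError as e:
                # Keep going: a missing file is itself useful information.
                out[path] = "<unreadable: %s>" % e
    return out


if __name__ == "__main__":
    dev = sys.argv[1] if len(sys.argv) > 1 else "sda"  # placeholder device
    for path, text in collect_tag_state(dev).items():
        print("==>", path, "<==")
        print(text)
```

Redirecting the output to a file on a device that is not affected by the
stall keeps the dump from blocking on the hung queue.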

								Honza
-- 
Jan Kara
SUSE Labs, CR