Re: stalling IO regression since linux 5.12, through 5.18
From: Chris Murphy <hidden>
Date: 2022-08-17 15:04:16
Also in:
linux-block, linux-btrfs, lkml

On Wed, Aug 17, 2022, at 10:53 AM, Ming Lei wrote:
> On Wed, Aug 17, 2022 at 10:34:38AM -0400, Chris Murphy wrote:
>> On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote:
>>> blk-mq debugfs log is usually helpful for io stall issue, care to post
>>> the blk-mq debugfs log:
>>>
>>> (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;)
>>
>> This is only sda
>> https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing
>
> From the log, there isn't any in-flight IO request. So please confirm
> that it is collected after the IO stall is triggered.

Yes, iotop reports no reads or writes at the time of collection. IO pressure 99% for auditd, systemd-journald, rsyslogd, and postgresql, with increasing pressure from all the qemu processes. Keep in mind this is a raid10, so maybe it's enough for just one block device IO to stall and the whole thing stops? That's why I included all block devices.
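
For reference, a minimal sketch of how the same debugfs dump could be collected for every block device in one pass. It assumes debugfs is mounted at /sys/kernel/debug and that it is run as root; the function name dump_blk_debugfs and the optional root-directory argument are illustrative only, not part of any existing tool.

```shell
#!/bin/sh
# Sketch: dump the blk-mq debugfs state of every block device under a root
# directory. The default root assumes debugfs is mounted at /sys/kernel/debug.
# dump_blk_debugfs is a hypothetical helper name used only for illustration.
dump_blk_debugfs() {
    root="${1:-/sys/kernel/debug/block}"
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root is empty or missing.
        [ -d "$dir" ] || continue
        echo "=== $(basename "$dir") ==="
        # Same command as quoted earlier in the thread, run once per device.
        (cd "$dir" && find . -type f -exec grep -aH . {} \;)
    done
}
```

Something like `dump_blk_debugfs > blk-mq-debugfs.txt`, run while the stall is in progress, would then capture all raid10 members in a single file.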

> If yes, the issue may not be related with BFQ, and should be related
> with blk-cgroup code.

Problem happens with cgroup.disable=io, does this setting affect blk-cgroup?

-- 
Chris Murphy