Thread: 7 messages, 3 authors, 2021-12-13

Re: Random high CPU utilization in blk-mq with the none scheduler

From: Jens Axboe <axboe@kernel.dk>
Date: 2021-12-11 02:05:14
Also in: lkml

On 12/10/21 6:29 PM, Dexuan Cui wrote:
>> From: Dexuan Cui
>> Sent: Thursday, December 9, 2021 7:30 PM
>>
>> Hi all,
>> I found a random high CPU utilization issue with some database benchmark
>> program running on a 192-CPU virtual machine (VM). Originally the issue
>> was found with RHEL 8.4 and Ubuntu 20.04, and further tests show that the
>> issue also reproduces with the latest upstream stable kernel v5.15.7, but
>> *not* with v5.16-rc1. It looks like someone resolved the issue in v5.16-rc1
>> recently?
>
> I did git-bisect on the linux-block tree's for-5.16/block branch and this patch
> resolves the random high CPU utilization issue (I'm not sure how):
> 	dc5fc361d891 ("block: attempt direct issue of plug list")
> 	https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.16/block&id=dc5fc361d891e089dfd9c0a975dc78041036b906
>
> Do you think if it's easy to backport it to earlier versions like 5.10?
> It looks like there are a lot of prerequisite patches.

It's more likely the real fix is avoiding the repeated plug list scan,
which I guess makes sense. That is this commit:

commit d38a9c04c0d5637a828269dccb9703d42d40d42b
Author: Jens Axboe [off-list ref]
Date:   Thu Oct 14 07:24:07 2021 -0600

    block: only check previous entry for plug merge attempt

If that's the case, try 5.15.x again and do:

echo 2 > /sys/block/<dev>/queue/nomerges

for each drive you are using in the IO test, and see if that gets
rid of the excess CPU usage.
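
With several drives in the test, the same setting can be applied in a loop. This is only a sketch under assumptions: the helper name set_nomerges and the SYSFS_ROOT override are made up for illustration, and sdb/sdc are placeholder device names, not the ones from the report. Per the kernel's queue sysfs documentation, 0 enables all merges (the default), 1 tries only simple one-hit merges, and 2 tries no merging at all.

```shell
#!/bin/sh
# Sketch: write a nomerges value for each listed block device.
# SYSFS_ROOT is a hypothetical override so the loop can be dry-run against
# a scratch directory; by default it points at the real /sys/block.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"

set_nomerges() {
    value="$1"
    shift
    for dev in "$@"; do
        f="$SYSFS_ROOT/$dev/queue/nomerges"
        if [ -w "$f" ]; then
            # 2 = do not attempt any request merging on this queue
            echo "$value" > "$f"
            echo "$dev: nomerges=$(cat "$f")"
        else
            echo "$dev: $f not writable, skipped" >&2
        fi
    done
}

# Example (as root; replace the placeholder names with the drives under test):
# set_nomerges 2 sdb sdc
```

The setting does not persist across reboots, so writing 0 back (or rebooting) restores the default merge behaviour once the test is done.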

-- 
Jens Axboe