Thread: 17 messages, 2 authors, 2021-12-02

Re: Write I/O queue hangup at random on recent Linus' kernels

From: Jens Axboe <axboe@kernel.dk>
Date: 2021-11-23 23:27:57
Also in: linux-bcache, linux-ext4, linux-nvme

On 11/23/21 2:05 PM, Kenneth R. Crudup wrote:
> (Please forgive the SPAMmy nature of the To: list; I'm not exactly sure whose
> subsystem this issue belongs to, so please trim as appropriate).
>
> I've got a Kioxia NVMe SSD on my Dell XPS-7390 2-in-1 running an i7-1065G7 CPU
> with 32GB RAM.  If you need more info (and I suspect so), please let me know.
>
> I'm sorry I don't have a better description of the problem, but I run Linus'
> master branch (and sometimes I weed out problems like this). I'm current as of
> his commit 1360572566 (the 5.16-rc2 tag).
>
> For about two weeks now every now and then my block/NVMe/...? subsystem comes to
> a total halt on writes, and I get a system that can no longer issue writes
> (reads/pageins still seem to work) until I reboot. SysRq-S/U/B still leaves a
> dirty ext4 filesystem requiring recovery on reboot.
>
> It happens at random- twice today as a matter of fact- and there doesn't seem to
> be any particular action that causes it:

It looks like some missed accounting. You can just disable wbt for now; it
would be a useful data point to see if that fixes it. Just do:

echo 0 > /sys/block/nvme0n1/queue/wbt_lat_usec

and that will disable writeback throttling on that device.
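
As a rough sketch of the same step with a before/after check (not part of the
original suggestion): the helper name disable_wbt is made up for illustration,
and the device name nvme0n1 is taken from the command above and may differ on
other systems. It has to run as root when pointed at real sysfs. Per the
kernel's queue-sysfs documentation, writing 0 to wbt_lat_usec disables the
feature and writing -1 resets it to the default.

```shell
#!/bin/sh
# Hypothetical helper: disable writeback throttling (wbt) via a given
# wbt_lat_usec attribute file and print the value before and after.

# disable_wbt FILE
#   FILE is the wbt_lat_usec attribute of the device, for example
#   /sys/block/nvme0n1/queue/wbt_lat_usec
disable_wbt() {
    wbt_file=$1

    # Bail out early if the attribute is missing or not writable
    # (not running as root, or wbt not available for this device).
    if [ ! -w "$wbt_file" ]; then
        echo "cannot write $wbt_file" >&2
        return 1
    fi

    echo "before: $(cat "$wbt_file")"
    echo 0 > "$wbt_file"          # 0 disables writeback throttling
    echo "after:  $(cat "$wbt_file")"
}

# Usage (as root):
#   disable_wbt /sys/block/nvme0n1/queue/wbt_lat_usec
# To go back to the kernel default later:
#   echo -1 > /sys/block/nvme0n1/queue/wbt_lat_usec
```

The setting does not persist across reboots, so it needs to be reapplied
after each boot while testing.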

I'll take a look at this, but most likely not until the start of next week...

-- 
Jens Axboe