Thread: 3 messages, 2 authors, 2020-03-04

Re: "I/O 8 QID 0 timeout, reset controller" on 5.6-rc2

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: 2020-03-04 11:19:48
Also in: linux-nvme

On Wed, Mar 4, 2020 at 6:02 AM Keith Busch [off-list ref] wrote:
> On Mon, Mar 02, 2020 at 10:03:39AM +0800, Jason A. Donenfeld wrote:
> > Hi,
> >
> > My torrent client was doing some I/O when the below happened. I'm
> > wondering if this is a known thing that's been fixed during the rc
> > cycle, a regression, or if my (pretty new) NVMe drive is failing.
> >
> > Thanks,
> > Jason
> >
> > Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting
> > Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller
> > Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller
> > Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset
> > Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371
>
> Sorry to say, this indicates the controller has become unresponsive.
> You usually see "timeout" messages in batches, though, so I wonder if
> only the one IO command timed out or if the controller just doesn't
> support an abort command limit.
>
> You can try throttling the queue depth and see if the problem goes away.
> The lowest possible depth can be set with kernel param
> "nvme.io_queue_depth=2".

I was unfortunately never able to reproduce. This happened while
downloading a torrent, and torrent clients have a history of creating
"interesting" I/O patterns. Hardware is "Samsung SSD 970 EVO Plus
2TB".
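
For anyone who hits similar messages and wants to check whether the
timeouts arrive in batches or involve a single command, a rough sketch
along the following lines may help. It groups the "timeout" lines by
controller, queue ID and command tag. The regular expression, the
function name summarize_timeouts and the embedded LOG sample are
illustrative assumptions, not part of any kernel tooling; the message
format is taken only from the excerpt quoted in this thread and may
differ on other kernel versions.

```python
import re
from collections import defaultdict

# Assumed message shape, based only on the log excerpt in this thread:
#   "nvme nvme1: I/O 852 QID 15 timeout, aborting"
TIMEOUT_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) "
    r"timeout, (?P<action>.+)$"
)


def summarize_timeouts(lines):
    """Group timeout actions by (controller, QID, command tag), in log order."""
    summary = defaultdict(list)
    for line in lines:
        match = TIMEOUT_RE.search(line.strip())
        if match:
            key = (match["ctrl"], int(match["qid"]), int(match["tag"]))
            summary[key].append(match["action"])
    return dict(summary)


# Sample input: the lines from the original report.
LOG = """\
Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting
Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller
Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller
Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset
Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371
"""

if __name__ == "__main__":
    for (ctrl, qid, tag), actions in summarize_timeouts(LOG.splitlines()).items():
        # In NVMe, queue ID 0 is the admin queue; all others are I/O queues.
        queue = "admin" if qid == 0 else "I/O"
        print(f"{ctrl} {queue} queue {qid}, command {tag}: {', '.join(actions)}")
```

Run against the excerpt above, this reports two distinct commands: tag
852 on I/O queue 15 (aborted, then followed by a controller reset) and
tag 8 on the admin queue (QID 0). To use it on a live system, the lines
could be fed from the kernel log instead of the LOG sample.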