Thread: 9 messages, 4 authors, 2018-10-11

Re: Hard lockup in blk_mq_free_request() / wbt_done() / wake_up_all()

From: Chris Boot <hidden>
Date: 2018-06-12 16:20:19
Also in: lkml

On 12/06/18 17:09, Jens Axboe wrote:
> On 6/12/18 9:38 AM, Chris Boot wrote:
>> Hi folks,
>>
>> I maintain a large (to me) system with 112 threads (4x Intel E7-4830 v4)
>> which has a MegaRAID SAS 9361-24i controller. This system is currently
>> running Debian's 4.16.12 kernel (from stretch-backports) with blk_mq
>> enabled.
>>
>> I've run into a lockup which appears to involve blk_mq and writeback
>> throttling. It's hard to tell if I've run into this same thing with
>> older kernels; I'm trying to track down a deadlock but so far I've been
>> fairly certain that involved the OOM killer, but this doesn't seem to.
>> [snip]
>
> Hmm that's really weird, I don't see how we could be spinning on the
> waitqueue lock like that. I haven't seen any wbt bug reports like this
> before.
>
> Are things generally stable if you just turn off wbt? You can do that
> for sda, for instance, by doing:
>
> # echo 0 > /sys/block/sda/queue/wbt_lat_usec
>
> It'd be interesting to get this data point. Eg leave blk-mq enabled, and
> then just disable wbt.

Hi Jens,

Thanks for the speedy response. I'll see if I can get that tested soon;
if the system is stable without blk_mq I can see the users wanting to
keep it that way for a while. I'll let you know.

> Is anything disabling wbt in the system otherwise?

Not that I'm aware of, no.
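
For reference, a minimal sketch of how the wbt state could be checked
across all block devices, assuming the kernel exposes the same
wbt_lat_usec file used in the command quoted above and that a value of 0
means wbt is off. The function name wbt_status and its optional sysfs
root argument are hypothetical, added only so the check can be pointed
at a different directory:

```shell
# Print the writeback throttling state of every block device.
# $1 is an optional sysfs root (hypothetical parameter, defaults to /sys).
wbt_status() {
    root="${1:-/sys}"
    for f in "$root"/block/*/queue/wbt_lat_usec; do
        # Skip devices without the file (and the unexpanded glob).
        [ -r "$f" ] || continue
        # Extract the device name from .../block/<dev>/queue/wbt_lat_usec.
        dev=${f#"$root"/block/}
        dev=${dev%%/*}
        val=$(cat "$f")
        if [ "$val" = "0" ]; then
            echo "$dev: wbt disabled"
        else
            echo "$dev: wbt enabled (target $val usec)"
        fi
    done
}
```

Running wbt_status with no argument reads /sys. It only reports the
current value and cannot tell what, if anything, changed it.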

Thanks,
Chris

-- 
Chris Boot
bootc@boo.tc