Thread: 35 messages, 6 authors, 2018-08-22

Re: [PATCH] blk-wbt: Avoid lock contention and thundering herd issue in wbt_wait

From: Jens Axboe <axboe@kernel.dk>
Date: 2018-08-22 19:17:03

On 8/22/18 1:12 PM, Holger Hoffstätte wrote:
On 08/22/18 19:28, Jens Axboe wrote:
On 8/22/18 8:27 AM, Jens Axboe wrote:
On 8/22/18 6:54 AM, Holger Hoffstätte wrote:
On 08/22/18 06:10, Jens Axboe wrote:
[...]
If you have time, please look at the 3 patches I posted earlier today.
Those are for mainline, so should be OK :-)
I'm just playing along at home but with those 3 I get repeatable
hangs & writeback not starting at all, but curiously *only* on my btrfs
device; for inexplicable reasons some other devices with ext4/xfs flush
properly. Yes, that surprised me too, but it's repeatable.
Now this may or may not have something to do with some of my in-testing
patches for btrfs itself, but if I remove those 3 wbt fixes, everything
is golden again. Not eager to repeat since it hangs sync & requires a
hard reboot.. :(
Just thought you'd like to know.
Thanks, that's very useful info! I'll see if I can reproduce that.
Any chance you can try with and see which patch is causing the issue?
I can't reproduce it here, seems solid.

Either that, or a reproducer would be great...
It's a hacked up custom tree but the following things have emerged so far:

- it's not btrfs.

- it also happens with ext4.

- I first suspected bfq on a nonrotational device disabling WBT by default,
but using deadline didn't help either. Can't even mkfs.ext4.

- I suspect - but do not know - that using xfs everywhere else is the
reason I got lucky, because xfs. :D

- it immediately happens with only the first patch
("move disable check into get_limit()")

So the obvious suspect is the new return of UINT_MAX from get_limit() to
__wbt_wait(). I first suspected that I mispatched something, but it's all
like in mainline or your tree. Even the recently moved-around atomic loop
inside rq_wait_inc_below() is 1:1 the same and looks like it should.
Now building mainline to see where that leads me.
I wonder if it's a signedness thing? Can you try and see if using INT_MAX
instead changes anything?
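
To illustrate the signedness concern, here is a minimal standalone sketch, not
the kernel code. inc_below_signed() and inc_below_unsigned() are hypothetical
stand-ins for a check-and-increment helper, and plain integers replace the
atomic counter. It assumes the limit gets converted to a signed int somewhere
on the way to the comparison, which is only a suspicion at this point.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: admit one more request only while the in-flight
 * count is below the limit. The limit is a signed int here. */
static bool inc_below_signed(int *inflight, int below)
{
	if (*inflight >= below)
		return false;
	(*inflight)++;
	return true;
}

/* Same helper, but with the limit kept unsigned end to end. */
static bool inc_below_unsigned(unsigned int *inflight, unsigned int below)
{
	if (*inflight >= below)
		return false;
	(*inflight)++;
	return true;
}

/* Walks through the three cases discussed above. */
static void demo(void)
{
	unsigned int limit = UINT_MAX;	/* "no limit" sentinel */
	int a = 0, c = 0;
	unsigned int b = 0;

	/* Converting UINT_MAX to int is implementation-defined; on common
	 * two's-complement compilers it yields -1, so 0 >= -1 and the
	 * increment is refused every time. */
	assert(!inc_below_signed(&a, (int)limit));
	assert(a == 0);

	/* With an unsigned limit the sentinel behaves as intended. */
	assert(inc_below_unsigned(&b, limit));
	assert(b == 1);

	/* INT_MAX survives the signed parameter unchanged. */
	assert(inc_below_signed(&c, INT_MAX));
	assert(c == 1);
}
```

If the limit does pass through a signed int, UINT_MAX would block every
increment while INT_MAX would not, which is what the INT_MAX experiment is
meant to tell apart.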

-- 
Jens Axboe