Thread: 10 messages, 3 authors, 2016-01-29

Re: bcache_gc: BUG: soft lockup

From: Johannes Thumshirn <hidden>
Date: 2016-01-29 11:54:51

[ +cc Kent ]

On Wed, Jan 27, 2016 at 02:57:25PM +0000, Yannis Aribaud wrote:
Hi,

After several weeks using the 4.2.6 kernel + patches from Ewheeler we just ran into a crash again.
This time the kernel was still running and the server was responsive but not able to do any IO on the bcache devices.

[696983.683498] bcache_writebac D ffffffff810643df     0  5741      2 0x00000000
[696983.683505]  ffff88103d01f180 0000000000000046 ffff88107842d000 ffffffff811a95cd
[696983.683510]  0000000000000000 ffff8810388c4000 ffff88103d01f180 0000000000000001
[696983.683514]  ffff882034ae0c10 0000000000000000 ffff882034ae0000 ffffffff8139601e
[696983.683518] Call Trace:
[696983.683530]  [<ffffffff811a95cd>] ? blk_queue_bio+0x262/0x279
[696983.683539]  [<ffffffff8139601e>] ? schedule+0x6b/0x78
[696983.683553]  [<ffffffffa032ce9b>] ? closure_sync+0x66/0x91 [bcache]
[696983.683563]  [<ffffffffa033c89f>] ? bch_writeback_thread+0x622/0x6b5 [bcache]
[696983.683569]  [<ffffffff8100265c>] ? __switch_to+0x1de/0x3f7
[696983.683578]  [<ffffffffa033c89f>] ? bch_writeback_thread+0x622/0x6b5 [bcache]
[696983.683586]  [<ffffffffa033c27d>] ? write_dirty_finish+0x1bf/0x1bf [bcache]
[696983.683594]  [<ffffffff810589d6>] ? kthread+0x99/0xa1
[696983.683598]  [<ffffffff8105893d>] ? kthread_parkme+0x16/0x16
[696983.683603]  [<ffffffff813986df>] ? ret_from_fork+0x3f/0x70
[696983.683607]  [<ffffffff8105893d>] ? kthread_parkme+0x16/0x16

Don't know if this helps.
Unfortunately I think that we will roll back and stop using Bcache unless this is really fixed :/

Hi Yannis,

Do you have a machine with a bcache setup running where you can reproduce the
error? Or do you know a method to reproduce the error?

What I'd be interested in is which locks are held when it locks up (you can
acquire this information with SysRq+d or echo d > /proc/sysrq-trigger).
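
For anyone who wants to script that step, here is a minimal sketch in Python.
It assumes root privileges and a kernel built with lock debugging
(CONFIG_LOCKDEP), since the SysRq "d" command is only available there. The
helper names trigger_sysrq and read_kernel_log and the path parameter are
made up for illustration; they are not part of bcache or any existing tool.

```python
import subprocess

# Standard location of the SysRq trigger file on Linux.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def trigger_sysrq(key: str, path: str = SYSRQ_TRIGGER) -> None:
    """Write a single SysRq command character to the trigger file.

    'd' asks the kernel to dump all currently held locks to the kernel log
    (assumption: the running kernel has CONFIG_LOCKDEP enabled).
    The `path` parameter is hypothetical and exists only so the helper
    can be exercised against an ordinary file.
    """
    if len(key) != 1:
        raise ValueError("SysRq command must be a single character")
    with open(path, "w") as f:
        f.write(key)


def read_kernel_log() -> str:
    """Return the kernel ring buffer as text by running dmesg."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Typical use, as root, while the bcache devices are stuck:
#   trigger_sysrq("d")
#   print(read_kernel_log())
```

The lock dump ends up in the kernel log, so capturing dmesg right after the
trigger keeps the output next to the hung-task trace.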

Kent, do you have an idea what's happening here?
Regards,

On 7 December 2015 at 11:35, "Yannis Aribaud" [off-list ref] wrote:
Hi everyone,

I have been using a 4.2.6 kernel merged with the Bcache patches from Ewheeler for one week now,
with no sign of the trouble I had before.
Thus it seems your patches fix my soft lockup issue.
It's currently running on one of my ceph nodes, I will certainly push it on the others during the
next weeks.

It would be great to merge those patches upstream since it seems that using Bcache in production
requires those fixes.

Anyway, thanks to all of you for your time, advice and work on Bcache. I'll keep you updated.

Regards,
-- 
Open is better
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850