Thread: 4 messages, 3 authors, 2014-05-13

Re: bcache: btree_split() couldn't split

From: Slava Pestov <hidden>
Date: 2014-05-13 17:14:28

Hi Zhe and Mariusz,

Based on my understanding of the code, this problem only occurs with
3.14 and older kernels. I believe Kent fixed this bug in v3.15-rc1
with this patch:

commit 0a63b66db566cffdf90182eb6e66fdd4d0479e63
Author: Kent Overstreet [off-list ref]
Date:   Mon Mar 17 17:15:53 2014 -0700

    bcache: Rework btree cache reserve handling

    This changes the bucket allocation reserves to use _real_ reserves -
    separate freelists - instead of watermarks, which if nothing else makes
    the current code saner to reason about and is going to be important in
    the future when we add support for multiple btrees.

    It also adds btree_check_reserve(), which checks (and locks) the
    reserves for both bucket allocation and memory allocation for btree
    nodes; the old code just kinda sorta assumed that since (e.g. for btree
    node splits) it had the root locked and that meant no other threads
    could try to make use of the same reserve; this technically should have
    been ok for memory allocation (we should always have a reserve for
    memory allocation (the btree node cache is used as a reserve and we
    preallocate it)), but multiple btrees will mean that locking the root
    won't be sufficient anymore, and for the bucket allocation reserve it
    was technically possible for the old code to deadlock.

    Signed-off-by: Kent Overstreet [off-list ref]
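
To make the difference between watermarks and separate freelists
concrete, here is a minimal userspace sketch. It is not the kernel code:
the struct names, the fixed-size list, and the helpers wm_alloc_bucket(),
fl_alloc_bucket() and fl_check_reserve() are all hypothetical and only
illustrate the idea described in the commit message. With watermarks,
every caller draws from one shared list and is merely refused below a
threshold; with separate freelists, each reserve owns its buckets, so a
caller can check up front that its reserve really holds what it needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reserve classes for this sketch. */
enum reserve { RESERVE_BTREE, RESERVE_PRIO, RESERVE_NONE, RESERVE_NR };

#define LIST_CAP 8

/* A tiny fixed-size stack of free bucket numbers. */
struct freelist {
	long buckets[LIST_CAP];
	size_t used;
};

static bool fl_push(struct freelist *f, long b)
{
	if (f->used == LIST_CAP)
		return false;
	f->buckets[f->used++] = b;
	return true;
}

static bool fl_pop(struct freelist *f, long *b)
{
	if (!f->used)
		return false;
	*b = f->buckets[--f->used];
	return true;
}

/*
 * Watermark scheme: one shared list. A caller of class r may allocate
 * only while more than watermark[r] buckets remain on the shared list.
 */
struct wm_alloc {
	struct freelist free;
	size_t watermark[RESERVE_NR];
};

static bool wm_alloc_bucket(struct wm_alloc *a, enum reserve r, long *b)
{
	assert(r < RESERVE_NR);
	if (a->free.used <= a->watermark[r])
		return false;
	return fl_pop(&a->free, b);
}

/*
 * Separate-freelist scheme: each class owns its own list. The general
 * pool (RESERVE_NONE) is tried first, then the caller's own reserve.
 */
struct fl_alloc {
	struct freelist free[RESERVE_NR];
};

static bool fl_alloc_bucket(struct fl_alloc *a, enum reserve r, long *b)
{
	assert(r < RESERVE_NR);
	if (fl_pop(&a->free[RESERVE_NONE], b))
		return true;
	return r != RESERVE_NONE && fl_pop(&a->free[r], b);
}

/*
 * Check that reserve r holds at least `need` buckets before starting a
 * multi-bucket operation such as a node split. A real implementation
 * would also have to lock the reserve; that part is omitted here.
 */
static bool fl_check_reserve(const struct fl_alloc *a, enum reserve r,
			     size_t need)
{
	return a->free[r].used >= need;
}
```

In the freelist variant, a caller would run fl_check_reserve() for the
number of buckets a split needs and only then start allocating, instead
of discovering halfway through that the shared pool has run dry.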

On Mon, May 12, 2014 at 4:53 AM, Mariusz Paradowski
[off-list ref] wrote:
Confirmed on kernel 3.14.3 from kernel.org:

May 11 17:43:16 x kernel: ------------[ cut here ]------------
May 11 17:43:16 x kernel: WARNING: CPU: 3 PID: 376101 at drivers/md/bcache/btree.c:1979 0xffffffffa00d65ab()
May 11 17:43:16 x kernel: bcache: btree split failed
May 11 17:43:16 x kernel: Modules linked in: e1000e ptp pps_core microcode firmware_class unix mpt2sas raid_class scsi_transport_sas bcache fuse hid_generic usbhid hid xhci_hcd ehci_pci ehci_hcd usbcore usb_common msr cpuid
May 11 17:43:16 x kernel: CPU: 3 PID: 376101 Comm: kworker/3:2 Not tainted 3.14.3 #1
May 11 17:43:16 x kernel: Hardware name:                  /DH87MC, BIOS MCH8710H.86A.0047.2013.0606.1508 06/06/2013
May 11 17:43:16 x kernel: Workqueue: events 0xffffffffa00e8fa0
May 11 17:43:16 x kernel: 0000000000000009 ffffffff81303a63 ffff88040c24b988 ffffffff8104c2fd
May 11 17:43:16 x kernel: ffff8801056f2400 ffff88040c24b9d8 ffff88040c24ba00 ffff88040c24bd10
May 11 17:43:16 x kernel: ffffffffffffffe4 ffffffff8104c367 ffffffffa00ea33b ffff880400000018
May 11 17:43:16 x kernel: Call Trace:
May 11 17:43:16 x kernel: [<ffffffff81303a63>] ? 0xffffffff81303a63
May 11 17:43:16 x kernel: [<ffffffff8104c2fd>] ? 0xffffffff8104c2fd
May 11 17:43:16 x kernel: [<ffffffff8104c367>] ? 0xffffffff8104c367
May 11 17:43:16 x kernel: [<ffffffffa00d65ab>] ? 0xffffffffa00d65ab
May 11 17:43:16 x kernel: [<ffffffff810752c3>] ? 0xffffffff810752c3
May 11 17:43:16 x kernel: [<ffffffffa00d669d>] ? 0xffffffffa00d669d
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d753b>] ? 0xffffffffa00d753b
May 11 17:43:16 x kernel: [<ffffffffa00d4bce>] ? 0xffffffffa00d4bce
May 11 17:43:16 x kernel: [<ffffffffa00d12a9>] ? 0xffffffffa00d12a9
May 11 17:43:16 x kernel: [<ffffffffa00d4975>] ? 0xffffffffa00d4975
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d4c65>] ? 0xffffffffa00d4c65
May 11 17:43:16 x kernel: [<ffffffff811bc9c4>] ? 0xffffffff811bc9c4
May 11 17:43:16 x kernel: [<ffffffffa00d7d2c>] ? 0xffffffffa00d7d2c
May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
May 11 17:43:16 x kernel: [<ffffffffa00d7e98>] ? 0xffffffffa00d7e98
May 11 17:43:16 x kernel: [<ffffffff81079110>] ? 0xffffffff81079110
May 11 17:43:16 x kernel: [<ffffffffa00e914a>] ? 0xffffffffa00e914a
May 11 17:43:16 x kernel: [<ffffffff81054cb1>] ? 0xffffffff81054cb1
May 11 17:43:16 x kernel: [<ffffffff81054b9d>] ? 0xffffffff81054b9d
May 11 17:43:16 x kernel: [<ffffffff81054eaf>] ? 0xffffffff81054eaf
May 11 17:43:16 x kernel: [<ffffffff8105e9a1>] ? 0xffffffff8105e9a1
May 11 17:43:16 x kernel: [<ffffffff8105c9f3>] ? 0xffffffff8105c9f3
May 11 17:43:16 x kernel: [<ffffffff8105f566>] ? 0xffffffff8105f566
May 11 17:43:16 x kernel: [<ffffffff8105f450>] ? 0xffffffff8105f450
May 11 17:43:16 x kernel: [<ffffffff81064621>] ? 0xffffffff81064621
May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
May 11 17:43:16 x kernel: [<ffffffff8130853c>] ? 0xffffffff8130853c
May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
May 11 17:43:16 x kernel: ---[ end trace 4fa5a49292304c0d ]---
May 11 17:43:16 x kernel: bcache: bch_btree_insert() error -12
--
Mariusz Paradowski