Re: [PATCH 0/4 v2] BDI lifetime fix
From: Thiago Jung Bauermann <hidden>
Date: 2017-02-06 14:48:42
On Tuesday, 31 January 2017, 13:54:25 BRST, Jan Kara wrote:
Hello,

this is a second version of the patch series that attempts to solve the problems with the life time of a backing_dev_info structure. Currently it lives inside request_queue structure and thus it gets destroyed as soon as request queue goes away. However the block device inode still stays around and thus inode_to_bdi() call on that inode (e.g. from flusher worker) may happen after request queue has been destroyed resulting in oops.

This patch set tries to solve these problems by making backing_dev_info independent structure referenced from block device inode. That makes sure inode_to_bdi() cannot ever oops.

I gave some basic testing to the patches in KVM and on a real machine, Dan was running them with libnvdimm test suite which was previously triggering the oops and things look good. So they should be reasonably healthy.

Laurent, if you can give these patches testing in your environment where you were triggering the oops, it would be nice.

I know you posted a v3, but we are seeing this crash on v2 and looking at v3's changelog it doesn't seem it would make a difference:

6:mon> th
[c000000003e6b940] c00000000037d15c writeback_sb_inodes+0x30c/0x590
[c000000003e6ba50] c00000000037d4c4 __writeback_inodes_wb+0xe4/0x150
[c000000003e6bab0] c00000000037d91c wb_writeback+0x2fc/0x440
[c000000003e6bb80] c00000000037e778 wb_workfn+0x268/0x580
[c000000003e6bc90] c0000000000f3890 process_one_work+0x1e0/0x590
[c000000003e6bd20] c0000000000f3ce8 worker_thread+0xa8/0x660
[c000000003e6bdc0] c0000000000fd124 kthread+0x154/0x1a0
[c000000003e6be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74
--- Exception: 0 at 0000000000000000
6:mon> r
R00 = c00000000037d15c R16 = c0000001fca60160
R01 = c000000003e6b8e0 R17 = c0000001fca600d8
R02 = c0000000014c3800 R18 = c0000001fca601c8
R03 = c0000001fca600d8 R19 = 0000000000000000
R04 = c0000000036478d0 R20 = 0000000000000000
R05 = 0000000000000000 R21 = c000000003e68000
R06 = 00000001fee70000 R22 = c0000001f49d17c0
R07 = 0001c6ce3a83dfca R23 = c0000001f49d17a0
R08 = 0000000000000000 R24 = 0000000000000000
R09 = 0000000000000000 R25 = c0000001fca60160
R10 = 0000000080000006 R26 = 0000000000000000
R11 = c0000000fb627b68 R27 = 0000000000000000
R12 = 0000000000002200 R28 = 0000000000000001
R13 = c00000000fb83600 R29 = c0000001fca600d8
R14 = c0000000000fcfd8 R30 = c000000003e6bbe0
R15 = 0000000000000000 R31 = 0000000000000000
pc = c0000000003799a0 locked_inode_to_wb_and_lock_list+0x50/0x290
cfar= c0000000005f5568 iowrite16+0x38/0xb0
lr = c00000000037d15c writeback_sb_inodes+0x30c/0x590
msr = 800000000280b033 cr = 24e62882
ctr = c00000000012c110 xer = 0000000000000000 trap = 300
dar = 0000000000000000 dsisr = 40000000
6:mon> sh
[312489.344110] INFO: rcu_sched detected stalls on CPUs/tasks:
[312489.396998] INFO: rcu_sched detected stalls on CPUs/tasks:
[312489.397003] 3-...: (4 ticks this GP) idle=59b/140000000000001/0 softirq=18323196/18323196 fqs=2
[312489.397005] 6-...: (1 GPs behind) idle=86f/140000000000001/0 softirq=18012373/18012374 fqs=2
[312489.397005] (detected by 2, t=47863798 jiffies, g=9340524, c=9340523, q=170)
[312489.505361] rcu_sched kthread starved for 47863823 jiffies! g9340524 c9340523 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[312489.537334] 3-...: (26 ticks this GP) idle=59b/140000000000000/0 softirq=18323196/18323196 fqs=2
[312489.537395] 6-...: (1 GPs behind) idle=86f/140000000000001/0 softirq=18012373/18012374 fqs=2
[312489.537454] (detected by 0, t=47863836 jiffies, g=9340524, c=9340523, q=170)
[312489.537528] rcu_sched kthread starved for 47863832 jiffies! g9340524 c9340523 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[312489.672967] Unable to handle kernel paging request for data at address 0x00000000
[312489.673028] Faulting instruction address: 0xc0000000003799a0
cpu 0x6: Vector: 300 (Data Access) at [c000000003e6b660]
pc: c0000000003799a0: locked_inode_to_wb_and_lock_list+0x50/0x290
lr: c00000000037d15c: writeback_sb_inodes+0x30c/0x590
sp: c000000003e6b8e0
msr: 800000000280b433
dar: 0
dsisr: 40000000
current = 0xc000000003646e00
paca = 0xc00000000fb83600 softe: 0 irq_happened: 0x01
pid = 8569, comm = kworker/u16:5
Linux version 4.10.0-rc3jankarav2+ (bauermann@u1604le) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #3 SMP Wed Feb 1 13:22:47 BRST 2017
enter ? for help
6:mon>
It took more than a day under an I/O stress test to crash, so it seems to be a
hard-to-hit race condition. PC is at:
$ addr2line -e /usr/lib/debug/vmlinux-4.10.0-rc3jankarav2+ c0000000003799a0
wb_get at /home/bauermann/src/linux/./include/linux/backing-dev-defs.h:218
(inlined by) locked_inode_to_wb_and_lock_list at /home/bauermann/src/linux/fs/fs-writeback.c:281
Which is:
216 static inline void wb_get(struct bdi_writeback *wb)
217 {
218         if (wb != &wb->bdi->wb)
219                 percpu_ref_get(&wb->refcnt);
220 }
So it looks like wb->bdi is NULL.
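
To make the memory accesses on line 218 concrete, here is a minimal user-space
sketch of the address arithmetic involved. The two structs are hypothetical,
heavily simplified stand-ins, and the helper names are made up for
illustration. In particular, placing bdi as the first member of
struct bdi_writeback is an assumption that should be checked against the
headers in this tree. The sketch only shows that evaluating &wb->bdi->wb
performs a single load (of wb->bdi, through wb) followed by an offset
addition, with no load through wb->bdi itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-ins for the kernel structures.
 * Member order and sizes are assumptions for illustration only.
 */
struct backing_dev_info;

struct bdi_writeback {
	struct backing_dev_info *bdi;	/* ASSUMED to be the first member */
	long refcnt;			/* stands in for struct percpu_ref */
};

struct backing_dev_info {
	long pad[4];			/* placeholder for earlier members */
	struct bdi_writeback wb;	/* embedded root writeback structure */
};

/* Address the CPU loads from in order to fetch wb->bdi. */
static uintptr_t bdi_load_address(uintptr_t wb_addr)
{
	return wb_addr + offsetof(struct bdi_writeback, bdi);
}

/* Value of &bdi->wb: pure arithmetic on the bdi pointer, no load. */
static uintptr_t embedded_wb_address(uintptr_t bdi_addr)
{
	return bdi_addr + offsetof(struct backing_dev_info, wb);
}
```

Under these assumptions, bdi_load_address(0) is 0, so a NULL wb would fault at
address 0 while loading wb->bdi, whereas a NULL wb->bdi alone would only yield
a small non-NULL address from embedded_wb_address(0) for the comparison. With
dar = 0 in the dump above, the real layout decides which pointer was NULL.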
--
Thiago Jung Bauermann
IBM Linux Technology Center