Re: [PATCH 0/4 v2] BDI lifetime fix
From: Tejun Heo <tj@kernel.org>
Date: 2017-02-07 17:21:01
Hello, On Tue, Feb 07, 2017 at 01:33:31PM +0100, Jan Kara wrote:
quoted
We can see above that inode->i_wb is in r31, and the machine crashed at 0xc0000000003799a0 so it was trying to dereference wb and crashed. r31 is NULL in the crash information.Thanks for report and the analysis. After some looking into the code I see where the problem is. Writeback code assumes inode->i_wb can never become invalid once it is set however we still call inode_detach_wb() from __blkdev_put(). So in a way this is a different problem but closely related.
Heh, it feels like we're chasing our own tails.
It seems to me that instead of calling inode_detach_wb() in __blkdev_put() we may just switch blkdev inode to bdi->wb (it is now guaranteed to stay around). That way bdi_unregister() can complete (destroying all writeback structures except for bdi->wb) while block device inode can still live with a valid i_wb structure.
So, the problem there would be synchronizing get_wb against the transition. We can do that and inode_switch_wbs_work_fn() already does it but it is a bit nasty. I'm getting a bit confused here, so the reason we added inode_detach_wb() in __blkdev_put() was because the root wb might go away because it's embedded in the bdi which is embedded in the request_queue which is put and may be released by put_disk(). Are you saying that we changed the behavior so that bdi->wb stays around? If so, we can just remove the inode_detach_wb() call, no? Thanks. -- tejun