Re: [PATCH] swapoff tmpfs radix_tree: remember to rcu_read_unlock
From: Hugh Dickins <hughd@google.com>
Date: 2014-02-15 23:53:52
Also in:
lkml
On Thu, 13 Feb 2014, Andrew Morton wrote:
> On Wed, 12 Feb 2014 18:45:07 -0800 (PST) Hugh Dickins [off-list ref] wrote:
>
> > Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of
> >
> > BUG: sleeping function called from invalid context at kernel/fork.c:606
> > in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
> > 1 lock held by swapoff/1394:
> >  #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
> >
> > followed by
> >
> > ================================================
> > [ BUG: lock held when returning to user space! ]
> > 3.14.0-rc1 #3 Not tainted
> > ------------------------------------------------
> > swapoff/1394 is leaving the kernel with locks still held!
> > 1 lock held by swapoff/1394:
> >  #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
> >
> > after which the system recovered nicely.
> >
> > Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.
> >
> > Fixes: e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")
>
> huh. Venerable. I'm surprised that such an obvious blooper wasn't spotted at review. Why didn't anyone else hit this.
No surprise that it missed review, obvious though it is in the fix.

And not much surprise that no one else hit this: for most people, even those using tmpfs and pushing out to swap, swapoff is just something that happens shortly before the screen goes blank when you shut down (and, I haven't noticed how distros order it these days, but swapoff is anyway better done after unmounting the tmpfs filesystems, to avoid its slowness). And it does need the swapped tmpfs file to be truncated or unlinked while swapoff is searching through it racily with RCU lookups.

What puzzled me more was, why hadn't I seen it before? I don't run that fsx test particularly often, but have certainly run it dozens of times between then and now.

I think the answer must be where I said "after which the system recovered nicely": I probably did hit it before, but wasn't attending to the screen at the time, the warnings got scrolled off by timestamps I was printing, and I failed to check dmesg or /var/log/messages afterwards.
> > Of course, the truth is that I had been hoping to break Johannes's patchset in mmotm, was thrilled to get this on that, then despondent to realize that the only bug I had found was mine. Surprised I've not seen it before in 2.5 years: tried again on 3.14-rc1, got the same after 25 minutes.
> >
> > Probably not serious enough for -stable, but please can we slip the fix into 3.14 - sorry, Johannes's mm-keep-page-cache-radix-tree-nodes-in-check.patch will need a refresh.
>
> I fixed it up.
Thanks!

Hugh
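
For readers following along, the class of bug discussed in this thread (an rcu_read_unlock() forgotten on one rarely taken branch) can be illustrated with a small userspace sketch. This is not the kernel code or the actual patch: rcu_read_lock() and rcu_read_unlock() are replaced here by a plain nesting counter, and struct toy_root, locate_item_buggy() and locate_item_fixed() are hypothetical names invented for the illustration. It also assumes, without confirmation from the thread, that the unlikely branch is an early exit taken when the tree has shrunk to a single direct entry.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel primitives (assumption: a simple
 * nesting counter, not a real RCU read-side critical section). */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical, heavily simplified "tree": either one direct entry
 * or a flat array of slots. */
struct toy_root {
	int is_direct;      /* tree has collapsed to a single entry */
	void *direct;       /* the entry, when is_direct is set */
	void **slots;       /* otherwise, the slots to scan */
	size_t nr_slots;
};

#define NOT_FOUND ((size_t)-1)

/* Buggy shape: the early exit returns with the read lock still held. */
static size_t locate_item_buggy(struct toy_root *root, void *item)
{
	size_t found = NOT_FOUND;

	rcu_read_lock();
	if (root->is_direct) {
		if (root->direct == item)
			found = 0;
		return found;   /* BUG: no rcu_read_unlock() on this path */
	}
	for (size_t i = 0; i < root->nr_slots; i++) {
		if (root->slots[i] == item) {
			found = i;
			break;
		}
	}
	rcu_read_unlock();
	return found;
}

/* Fixed shape: every path leaves through the same unlock. */
static size_t locate_item_fixed(struct toy_root *root, void *item)
{
	size_t found = NOT_FOUND;

	rcu_read_lock();
	if (root->is_direct) {
		if (root->direct == item)
			found = 0;
		goto out;
	}
	for (size_t i = 0; i < root->nr_slots; i++) {
		if (root->slots[i] == item) {
			found = i;
			break;
		}
	}
out:
	rcu_read_unlock();
	return found;
}
```

Calling locate_item_buggy() on a root with is_direct set leaves rcu_depth at 1 after the call, which is the userspace analogue of the "lock held when returning to user space" report quoted above; locate_item_fixed() leaves it at 0 on every path.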