Thread (8 messages) 8 messages, 4 authors, 2011-02-24

Re: PowerPC BUG: using smp_processor_id() in preemptible code

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: 2010-12-30 10:45:44
Also in: lkml

On Wed, 2010-12-29 at 14:54 -0800, Hugh Dickins wrote:
With recent 2.6.37-rc, with CONFIG_PREEMPT=y CONFIG_DEBUG_PREEMPT=y
on the PowerPC G5, I get spammed by BUG warnings each time I swapoff:

BUG: using smp_processor_id() in preemptible [00000000] code: swapoff/3974
caller is .hpte_need_flush+0x4c/0x2e8
Call Trace:
[c0000001b4a3f830] [c00000000000f3cc] .show_stack+0x6c/0x16c (unreliable)
[c0000001b4a3f8e0] [c00000000023eda0] .debug_smp_processor_id+0xe4/0x11c
[c0000001b4a3f970] [c00000000002f2f4] .hpte_need_flush+0x4c/0x2e8
[c0000001b4a3fa30] [c0000000000e7ef8] .vunmap_pud_range+0x148/0x200
[c0000001b4a3fb10] [c0000000000e8058] .vunmap_page_range+0xa8/0xd4
[c0000001b4a3fbb0] [c0000000000e80a4] .free_unmap_vmap_area+0x20/0x38
[c0000001b4a3fc40] [c0000000000e8138] .remove_vm_area+0x7c/0xb4
[c0000001b4a3fcd0] [c0000000000e8308] .__vunmap+0x50/0x104
[c0000001b4a3fd60] [c0000000000ef3fc] .SyS_swapoff+0x59c/0x6a8
[c0000001b4a3fe30] [c0000000000075a8] syscall_exit+0x0/0x40

I notice hpte_need_flush() itself acknowledges
 * Must be called from within some kind of spinlock/non-preempt region...
Yes, we assume that the PTE lock is always held when modifying page
tables...
Though I didn't actually bisect, I believe this is since Jeremy's
64141da587241301ce8638cc945f8b67853156ec "vmalloc: eagerly clear ptes
on vunmap", which moves a call to vunmap_page_range() from one place
(which happened to be inside a spinlock) to another (where it's not).

I guess my warnings would be easily silenced by moving that call to
vunmap_page_range() down just inside the spinlock below it; but I'm
dubious that that's the right fix - it looked as if there are other
paths through vmalloc.c where vunmap_page_range() has been getting
called without preemption disabled, long before Jeremy's change,
just paths that I never happen to go down in my limited testing.

For the moment I'm using the obvious patch below to keep it quiet;
but I doubt that this is the right patch either.  I'm hoping that
ye who understand the importance of hpte_need_flush() will be best
able to judge what to do.  Or might there be other architectures
expecting to be unpreemptible there?
Well, it looks like our kernel mappings tend to take some nasty
shortcuts with the PTE locking, which I suppose are legit but do break
some of my assumptions there. I need to have a closer look. Thanks for
the report !

Cheers,
Ben.
quoted hunk ↗ jump to hunk
Thanks,
Hugh
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -37,11 +37,13 @@ static void vunmap_pte_range(pmd_t *pmd,
 {
 	pte_t *pte;
 
+	preempt_disable(); /* Stop __vunmap() triggering smp_processor_id() in preemptible from hpte_need_flush() */
 	pte = pte_offset_kernel(pmd, addr);
 	do {
 		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
 	} while (pte++, addr += PAGE_SIZE, addr != end);
+	preempt_enable();
 }
 
 static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help