Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with... | linuxppc-dev

[PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-07-22
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-07-22
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Thomas Gleixner <hidden> · 2010-07-22
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-10
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Benjamin Herrenschmidt <benh@kernel.crashing.org> · 2010-07-22
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-07-22
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Vaidyanathan Srinivasan <hidden> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Benjamin Herrenschmidt <benh@kernel.crashing.org> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Vaidyanathan Srinivasan <hidden> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-05
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Vaidyanathan Srinivasan <hidden> · 2010-08-05
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Thomas Gleixner <hidden> · 2010-08-05
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Benjamin Herrenschmidt <benh@kernel.crashing.org> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-06
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Vaidyanathan Srinivasan <hidden> · 2010-08-06
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-06
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Will Schmidt <hidden> · 2010-07-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-04
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Ankita Garg <hidden> · 2010-08-19
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Will Schmidt <hidden> · 2010-08-19
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-23
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-08-31
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Michael Ellerman <hidden> · 2010-09-01
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-09-01
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-09-01
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Steven Rostedt <rostedt@goodmis.org> · 2010-09-01
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-09-01
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Michael Neuling <hidden> · 2010-09-02
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Steven Rostedt <rostedt@goodmis.org> · 2010-09-02
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-09-02
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Will Schmidt <hidden> · 2010-09-03
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Michael Neuling <hidden> · 2010-09-02
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Darren Hart <hidden> · 2010-09-03
Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries · Michael Neuling <hidden> · 2010-09-02

Re: [PATCH][RFC] preempt_count corruption across H_CEDE call with CONFIG_PREEMPT on pseries

From: Vaidyanathan Srinivasan <hidden>
Date: 2010-07-23 05:08:31

* Darren Hart [off-list ref] [2010-07-22 21:44:04]:

On 07/22/2010 04:57 PM, Darren Hart wrote:

quoted

On 07/22/2010 03:25 PM, Benjamin Herrenschmidt wrote:

quoted

On Thu, 2010-07-22 at 11:24 -0700, Darren Hart wrote:

quoted

1) How can the preempt_count() get mangled across the H_CEDE hcall?
2) Should we call preempt_enable() in cpu_idle() prior to cpu_die() ?

The preempt count is on the thread info at the bottom of the stack.

Can you check the stack pointers ?

Hi Ben, thanks for looking.

I instrumented the area around extended_cede_processor() as follows
(please confirm I'm getting the stack pointer correctly).

while (get_preferred_offline_state(cpu) == CPU_STATE_INACTIVE) {
	asm("mr %0,1" : "=r" (sp));
	printk("before H_CEDE current->stack: %lx, pcnt: %x\n", sp, preempt_count());
	extended_cede_processor(cede_latency_hint);
	asm("mr %0,1" : "=r" (sp));
	printk("after H_CEDE current->stack: %lx, pcnt: %x\n", sp, preempt_count());
  }


On Mainline (2.6.33.6, CONFIG_PREEMPT=y) I see this:
Jul 22 18:37:08 igoort1 kernel: before H_CEDE current->stack: c00000010e9e3ce0, pcnt: 1
Jul 22 18:37:08 igoort1 kernel: after H_CEDE current->stack:  c00000010e9e3ce0, pcnt: 1

This surprised me as preempt_count is 1 before and after, so no
corruption appears to occur on mainline. This makes the pcnt of 65 I see
without the preempt_count()=0 hack very strange. I ran several hundred
off/on cycles. The issue of preempt_count being 1 is still addressed by
this patch however.

On PREEMPT_RT (2.6.33.5-rt23 - tglx, sorry, rt/2.6.33 next time, promise):
Jul 22 18:51:11 igoort1 kernel: before H_CEDE current->stack: c000000089bcfcf0, pcnt: 1
Jul 22 18:51:11 igoort1 kernel: after H_CEDE current->stack:  c000000089bcfcf0, pcnt: ffffffff

In both cases the stack pointer appears unchanged.

Note: there is a BUG triggered in between these statements as the
preempt_count causes the printk to trigger:
Badness at kernel/sched.c:5572

At Steven's suggestion I updated the instrumentation to display the
local_save_flags and irqs_disabled_flags:

while (get_preferred_offline_state(cpu) == CPU_STATE_INACTIVE) {
        local_save_flags(flags);
        printk("local flags: %lx, irqs disabled: %d\n", flags, irqs_disabled_flags(flags));
        asm("mr %0,1" : "=r" (sp));
        printk("before H_CEDE current->stack: %lx, pcnt: %x\n", sp, preempt_count());
        extended_cede_processor(cede_latency_hint);
        asm("mr %0,1" : "=r" (sp));
        printk("after H_CEDE current->stack: %lx, pcnt: %x\n", sp, preempt_count());
}


Jul 22 23:36:58 igoort1 kernel: local flags: 0, irqs disabled: 1
Jul 22 23:36:58 igoort1 kernel: before H_CEDE current->stack: c00000010e9e3ce0, pcnt: 1
Jul 22 23:36:58 igoort1 kernel: after H_CEDE current->stack: c00000010e9e3ce0, pcnt: 1

I'm not sure if I'm reading that right, but I believe interrupts are
intended to be disabled here. If accomplished via the
spin_lock_irqsave() this would behave differently on RT. However, this
path disables the interrupts handled by xics, all but the IPIs anyway.
On RT I disabled the decrementer as well.

Is it possible for RT to be receiving other interrupts here?

Yes.  extended_cede_processor() will return with interrupts enabled in
the cpu. (This is done by the hypervisor).  Under normal cases we
cannot be interrupted because no IO interrupts are routed to us after
xics_teardown_cpu() and since the CPU is out of the map, nobody will
send us IPIs.

Though H_CEDE will return with interrupts enabled, it is unlikely that
an interrupt can be delivered in this context.

--Vaidy

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help