Re: [PATCH RFC] v7 expedited "big hammer" RCU grace periods
From: Paul E. McKenney <hidden>
Date: 2009-05-27 04:30:22
Also in: lkml, netfilter-devel
On Wed, May 27, 2009 at 09:57:19AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> > I am concerned about the following sequence of events:
> >
> > o  synchronize_sched_expedited() disables preemption, thus blocking
> >    offlining operations.
> >
> > o  CPU 1 starts offlining CPU 0.  It acquires the CPU-hotplug lock,
> >    and proceeds, and is now waiting for preemption to be enabled.
> >
> > o  synchronize_sched_expedited() disables preemption, sees that
> >    CPU 0 is online, so initializes and queues a request, does a
> >    wake_up_process(), and finally does a preempt_enable().
> >
> > o  CPU 0 is currently running a high-priority real-time process,
> >    so the wakeup does not immediately happen.
> >
> > o  The offlining process completes, including the kthread_stop()
> >    to the migration task.
> >
> > o  The migration task wakes up, sees kthread_should_stop(), and so
> >    exits without checking its queue.
> >
> > o  synchronize_sched_expedited() waits forever for CPU 0 to respond.
> >
> > I suppose that one way to handle this would be to check for the CPU
> > going offline before doing the wait_for_completion(), but I am
> > concerned about races affecting this check as well.
> >
> > Or is there something in the CPU-offline process that makes the
> > above sequence of events impossible?
> >
> >                                                 Thanx, Paul
>
> I realized this, I wrote this:
> > The coupling of synchronize_sched_expedited() and migration_req
> > is greatly increased:
> >
> > 1) The offline cpu's per_cpu(rcu_migration_req, cpu) is handled.
> >    See migration_call::CPU_DEAD
>
> synchronize_sched_expedited() will not wait for CPU#0, because
> migration_call()::case CPU_DEAD wakes up the requestors.
>
> migration_call()
> {
>         ...
>         case CPU_DEAD:
>         case CPU_DEAD_FROZEN:
>                 ...
>                 /*
>                  * No need to migrate the tasks: it was best-effort if
>                  * they didn't take sched_hotcpu_mutex. Just wake up
>                  * the requestors.
>                  */
>                 spin_lock_irq(&rq->lock);
>                 while (!list_empty(&rq->migration_queue)) {
>                         struct migration_req *req;
>
>                         req = list_entry(rq->migration_queue.next,
>                                          struct migration_req, list);
>                         list_del_init(&req->list);
>                         spin_unlock_irq(&rq->lock);
>                         complete(&req->done);
>                         spin_lock_irq(&rq->lock);
>                 }
>                 spin_unlock_irq(&rq->lock);
>                 ...
>         ...
> }
>
> My approach depends on the requestors being woken up in every case.
> migration_call() does that for us, but the coupling is greatly
> increased.
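The difference between the hang scenario and the CPU_DEAD cleanup can be modelled in user space. What follows is only a minimal single-threaded sketch, not kernel code: every toy_* name is hypothetical, a plain bool stands in for struct completion, and there is no locking, so it does not reproduce the unlock/relock around complete() in the quoted loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct migration_req: a queued request whose
 * "done" flag plays the role of the kernel's struct completion. */
struct toy_req {
	struct toy_req *next;
	bool done;
};

/* Hypothetical stand-in for the per-CPU rq->migration_queue. */
struct toy_queue {
	struct toy_req *head;
};

/* Queue a request, as a requestor would before waiting on it. */
void toy_queue_req(struct toy_queue *q, struct toy_req *req)
{
	assert(q != NULL && req != NULL);
	req->done = false;
	req->next = q->head;
	q->head = req;
}

/* Models a worker that sees "should stop" and exits without looking at
 * its queue: pending requests stay pending, so waiters would hang. */
void toy_worker_exit_without_drain(struct toy_queue *q)
{
	(void)q;	/* deliberately leaves the queue untouched */
}

/* Models the CPU_DEAD cleanup: unlink every pending request and mark it
 * done so that no waiter is left behind. Returns the number completed. */
int toy_drain_on_cpu_dead(struct toy_queue *q)
{
	int completed = 0;

	while (q->head != NULL) {
		struct toy_req *req = q->head;

		q->head = req->next;	/* analogous to list_del_init() */
		req->next = NULL;
		req->done = true;	/* analogous to complete() */
		completed++;
	}
	return completed;
}

/* Count how many of the n requests are still not done. */
int toy_count_hung(const struct toy_req *reqs, int n)
{
	int hung = 0;

	for (int i = 0; i < n; i++)
		if (!reqs[i].done)
			hung++;
	return hung;
}
```

In this model, queueing requests and then calling only toy_worker_exit_without_drain() leaves every request not done, which corresponds to the "waits forever" step. Calling toy_drain_on_cpu_dead() afterwards marks them all done, which is the property the requestors are relying on.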
OK, good point!  I do need to think about this.

In the meantime, where do you see a need to run
synchronize_sched_expedited() from within a hotplug CPU notifier?

                                                Thanx, Paul