Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat <hidden>
Date: 2012-12-20 14:08:46
Also in:
lkml
On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> On 12/20, Srivatsa S. Bhat wrote:
>> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
>>> We need 2 helpers for writer, the 1st one does synchronize_sched() and
>>> the 2nd one takes rwlock. A generic percpu_write_lock() simply calls
>>> them both.
>>
>> Ah, that's the problem no? Users of reader-writer locks expect to run in
>> atomic context (i.e., they don't want to sleep).
>
> Ah, I misunderstood.
>
> Sure, percpu_write_lock() should be might_sleep(), and this is not
> symmetric to percpu_read_lock().
>
>> We can't expose an API that can make the task go to sleep under the
>> covers!
>
> Why? Just this should be documented. However I would not worry until we
> find another user. Until then we do not even need to add
> percpu_write_lock or try to generalize this code too much.

Hmm.. But considering the disable_nonboot_cpus() case you mentioned below, I'm only getting farther away from using synchronize_sched() ;-) And that also makes it easier to expose a generic percpu rwlock API, like Tejun was suggesting. So I'll give it a shot.
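
To make that direction a bit more concrete, here is a rough userspace model (C11 atomics, not kernel code) of a percpu rwlock whose writer never calls synchronize_sched(). Everything in it is an assumption made for illustration: the model_* names, NR_SLOTS and the two variables are hypothetical and not taken from the patchset. The sequentially consistent atomics stand in for the full barrier on the reader side, which is the cost being traded against synchronize_sched() in the writer.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of possible CPUs */

static atomic_int reader_count[NR_SLOTS];	/* models one per-CPU reader refcount per slot */
static atomic_int writer_active;		/* nonzero while a writer wants or holds the lock */

/* Reader fast path: bump our own slot first, then check for a writer. */
static bool model_read_trylock(int slot)
{
	/* seq_cst RMW: models a per-CPU increment followed by a full barrier */
	atomic_fetch_add(&reader_count[slot], 1);
	if (!atomic_load(&writer_active))
		return true;
	/* A writer is around: back out so that it can make progress. */
	atomic_fetch_sub(&reader_count[slot], 1);
	return false;
}

static void model_read_unlock(int slot)
{
	atomic_fetch_sub(&reader_count[slot], 1);
}

/* Writer step 1: announce the writer; fails if another writer is active. */
static bool model_write_announce(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&writer_active, &expected, 1);
}

/* Writer step 2: true once no slot has a reader inside its critical section. */
static bool model_readers_drained(void)
{
	for (int i = 0; i < NR_SLOTS; i++)
		if (atomic_load(&reader_count[i]) != 0)
			return false;
	return true;
}

/* Non-sleeping writer: announce, then spin until all readers have left. */
static void model_write_lock(void)
{
	while (!model_write_announce())
		;	/* busy-wait; kernel code would use cpu_relax() here */
	while (!model_readers_drained())
		;
}

static void model_write_unlock(void)
{
	atomic_store(&writer_active, 0);
}
```

In this model the writer only spins, so it stays usable from atomic context; the price is the barrier in every read-side fast path. A blocking reader would simply retry model_read_trylock() until it succeeds.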
>>> To me, the main question is: can we use synchronize_sched() in
>>> cpu_down? It is slow.
>>
>> Haha :-) So we don't want smp_mb() in the reader,
>
> We need mb() + rmb(). Plus cli/sti unless this arch has optimized
> this_cpu_add() like x86 (as you pointed out).
>
>> *and* also don't want synchronize_sched() in the writer! Sounds like
>> saying we want to have the cake and eat it too ;-) :P
>
> Personally I'd vote for synchronize_sched() but I am not sure. And I do
> not really understand the problem space.
>
>> (And moreover, since I'm still not convinced about the writer API part
>> if we use synchronize_sched(), I'd rather avoid synchronize_sched().)
>
> Understand. And yes, synchronize_sched() adds more problems. For example,
> where should we call it? I do not think _cpu_down() should do this; in
> this case, say, disable_nonboot_cpus() needs num_online_cpus()
> synchronize_sched() calls.

Ouch! I should have seen that coming!
> So probably cpu_down() should call it before cpu_maps_update_begin(), this makes the locking even less obvious.
True.
> In short. What I am trying to say is, don't ask me I do not know ;)
OK then, I'll go with what I believe is a reasonably good way (not necessarily the best way) to deal with this: I'll avoid the use of synchronize_sched(), expose a decent-looking percpu rwlock implementation, use it in CPU hotplug and get rid of stop_machine(). That would certainly be a good starting base, IMHO.

Regards,
Srivatsa S. Bhat