Thread (38 messages) 38 messages, 3 authors, 2012-12-11

Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context

From: Srivatsa S. Bhat <hidden>
Date: 2012-12-11 13:15:31
Also in: lkml

On 12/10/2012 10:54 PM, Oleg Nesterov wrote:
On 12/10, Srivatsa S. Bhat wrote:
quoted
On 12/10/2012 01:52 AM, Oleg Nesterov wrote:
quoted
On 12/10, Srivatsa S. Bhat wrote:
quoted
On 12/10/2012 12:44 AM, Oleg Nesterov wrote:
quoted
But yes, it is easy to blame somebody else's code ;) And I can't suggest
something better at least right now. If I understand correctly, we can not
use, say, synchronize_sched() in _cpu_down() path
We can't sleep in that code.. so that's a no-go.
But we can?

Note that I meant _cpu_down(), not get_online_cpus_atomic() or take_cpu_down().
Maybe I'm missing something, but how would it help if we did a
synchronize_sched() so early (in _cpu_down())? Another bunch of preempt_disable()
sections could start immediately after our call to synchronize_sched() no?
How would we deal with that?
Sorry for confusion. Of course synchronize_sched() alone is not enough.
But we can use it to synchronize with preempt-disabled section and avoid
the barriers/atomic in the fast-path.

For example,

	bool writer_pending;
	DEFINE_RWLOCK(writer_rwlock);
	DEFINE_PER_CPU(int, reader_ctr);

	void get_online_cpus_atomic(void)
	{
		preempt_disable();
	
		if (likely(!writer_pending) || __this_cpu_read(reader_ctr)) {
			__this_cpu_inc(reader_ctr);
			return;
		}

		read_lock(&writer_rwlock);
		__this_cpu_inc(reader_ctr);
		read_unlock(&writer_rwlock);
	}

	// lacks release semantics, but we don't care
	void put_online_cpus_atomic(void)
	{
		__this_cpu_dec(reader_ctr);
		preempt_enable();
	}

Now, _cpu_down() does

	writer_pending = true;
	synchronize_sched();

before stop_one_cpu(). When synchronize_sched() returns, we know that
every get_online_cpus_atomic() must see writer_pending == T. And, if
any CPU incremented its reader_ctr we must see it is not zero.

take_cpu_down() does

	write_lock(&writer_rwlock);

	for_each_online_cpu(cpu) {
		while (per_cpu(reader_ctr, cpu))
			cpu_relax();
	}

and takes the lock.

However. This can lead to the deadlock we already discussed. So
take_cpu_down() should do

 retry:
	write_lock(&writer_rwlock);

	for_each_online_cpu(cpu) {
		if (per_cpu(reader_ctr, cpu)) {
			write_unlock(&writer_rwlock);
			goto retry;
		}
	}

to take the lock. But this is livelockable. However, I do not think it
is possible to avoid the livelock.

Just in case, the code above is only for illustration, perhaps it is not
100% correct and perhaps we can do it better. cpu_hotplug.active_writer
is ignored for simplicity, get/put should check current == active_writer.
This approach (of using synchronize_sched()) also looks good. It is simple,
yet effective, but unfortunately inefficient at the writer side (because
he'll have to wait for a full synchronize_sched()).

So I would say that we should keep this approach as a fallback, if we don't
come up with any better synchronization scheme (in terms of efficiency) at
a comparable level of simplicity/complexity.

I have come up with a v4 that simplifies several aspects of the synchronization
and makes it look a lot more simpler than v3. (Lesser race windows to take
care of, implicit ACKs, no atomic ops etc..)

I'll post it soon. Let me know what you think...

Regards,
Srivatsa S. Bhat
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help