Thread (40 messages), 3 authors, 2012-12-24

Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context

From: Srivatsa S. Bhat <hidden>
Date: 2012-12-22 20:19:25
Also in: lkml

On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> On 12/20, Srivatsa S. Bhat wrote:
>> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
>>> We need 2 helpers for writer, the 1st one does synchronize_sched() and the
>>> 2nd one takes rwlock. A generic percpu_write_lock() simply calls them both.
>>
>> Ah, that's the problem no? Users of reader-writer locks expect to run in
>> atomic context (ie., they don't want to sleep).
>
> Ah, I misunderstood.
>
> Sure, percpu_write_lock() should be might_sleep(), and this is not
> symmetric to percpu_read_lock().
>
>> We can't expose an API that
>> can make the task go to sleep under the covers!
>
> Why? Just this should be documented. However I would not worry until we
> find another user. Until then we do not even need to add percpu_write_lock
> or try to generalize this code too much.
>
>>> To me, the main question is: can we use synchronize_sched() in cpu_down?
>>> It is slow.
>>
>> Haha :-) So we don't want smp_mb() in the reader,
>
> We need mb() + rmb(). Plus cli/sti unless this arch has optimized
> this_cpu_add() like x86 (as you pointed out).

Hey, IIUC, we actually don't need mb() in the reader!! Just an rmb() will do.

This is the reader code I have so far:

#define reader_nested_percpu()						\
	(__this_cpu_read(reader_percpu_refcnt) & READER_REFCNT_MASK)

#define writer_active()							\
	(__this_cpu_read(writer_signal))

#define READER_PRESENT		(1UL << 16)
#define READER_REFCNT_MASK	(READER_PRESENT - 1)

void get_online_cpus_atomic(void)
{
	preempt_disable();

	/*
	 * First and foremost, make your presence known to the writer.
	 */
	this_cpu_add(reader_percpu_refcnt, READER_PRESENT);

	/*
	 * If we are already using per-cpu refcounts, it is not safe to switch
	 * the synchronization scheme. So continue using the refcounts.
	 */
	if (reader_nested_percpu()) {
		this_cpu_inc(reader_percpu_refcnt);
	} else {
		smp_rmb();
		if (unlikely(writer_active())) {
			... //take hotplug_rwlock
		}
	}

	...

	/* Prevent reordering of any subsequent reads of cpu_online_mask. */
	smp_rmb();
}
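
As a side illustration of the counter layout that the code above relies on, here is a minimal userspace sketch (not kernel code). It assumes reader_percpu_refcnt is an unsigned long whose low 16 bits hold the nesting count and whose upper bits record "reader present" marks, as the two #defines suggest. The helpers mark_present(), is_nested() and check_layout() are hypothetical names used only for this sketch, and the sketch says nothing about memory ordering.

```c
#include <assert.h>
#include <stdbool.h>

#define READER_PRESENT		(1UL << 16)
#define READER_REFCNT_MASK	(READER_PRESENT - 1)

/* Hypothetical helper: counter value after a reader announces itself. */
static unsigned long mark_present(unsigned long refcnt)
{
	return refcnt + READER_PRESENT;
}

/* Hypothetical helper: the reader_nested_percpu() test on a plain value. */
static bool is_nested(unsigned long refcnt)
{
	return (refcnt & READER_REFCNT_MASK) != 0;
}

/* Hypothetical self-check of the layout assumptions stated above. */
static void check_layout(void)
{
	/* First-time reader: presence is recorded, nesting field stays 0. */
	assert(!is_nested(mark_present(0)));

	/* Reader that already holds one per-cpu reference still looks nested. */
	assert(is_nested(mark_present(1)));

	/* Adding READER_PRESENT never changes the low 16 bits. */
	assert((mark_present(5) & READER_REFCNT_MASK) == 5);
}
```

Under these assumptions, the this_cpu_add() of READER_PRESENT cannot make a first-time reader look nested, which is why the nesting check can safely come after it. The split only holds while the nesting count stays below 65536; beyond that the low field would carry into the presence bits.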

The smp_rmb() before writer_active() ensures that LOAD(writer_signal) follows
LOAD(reader_percpu_refcnt) (at the 'if' condition). And in turn, that load is
automatically going to follow the STORE(reader_percpu_refcnt) (at this_cpu_add())
due to the data dependency. So it is something like a transitive relation.

So the result is that we mark ourselves as active in reader_percpu_refcnt before
we check writer_signal. This is exactly what we wanted to do, right?
And luckily, due to the dependency, we can achieve it without using the heavy
smp_mb(). And, we can't crib about the smp_rmb() because it is unavoidable anyway
(because we want to prevent reordering of the reads to cpu_online_mask, like you
pointed out earlier).

I hope I'm not missing anything...

Regards,
Srivatsa S. Bhat