Re: [PATCH RFC] v5 expedited "big hammer" RCU grace periods

From: Paul E. McKenney <hidden>
Date: 2009-05-18 15:14:37
Also in: lkml, netfilter-devel

On Mon, May 18, 2009 at 09:56:30AM +0200, Ingo Molnar wrote:

* Paul E. McKenney [off-list ref] wrote:


+void sched_expedited_wake(void *unused)
+{
+	mutex_lock(&__get_cpu_var(sched_expedited_done_mutex));
+	if (__get_cpu_var(sched_expedited_done_qs) ==
+	    SCHED_EXPEDITED_QS_DONE_QS) {
+		__get_cpu_var(sched_expedited_done_qs) =
+			SCHED_EXPEDITED_QS_NEED_QS;
+		wake_up(&__get_cpu_var(sched_expedited_qs_wq));
+	}
+	mutex_unlock(&__get_cpu_var(sched_expedited_done_mutex));
+}

( hm, IPI handlers are supposed to be atomic. )

<red face>
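
For readers following along: the problem Ingo points at is that mutex_lock() may sleep, and a function run via smp_call_function() executes in interrupt context, where sleeping is not allowed. One non-blocking way to express the same state change is a single compare-and-swap. The following is a minimal userspace sketch using C11 atomics, not the fix that was eventually posted. The names done_qs, NR_SIM_CPUS and expedited_wake_nonblocking() are hypothetical, and the array stands in for per-CPU data.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Mirrors the two states used in the quoted patch. */
enum { QS_DONE_QS = 0, QS_NEED_QS = 1 };

/* Hypothetical stand-in for per-CPU state: one word per simulated CPU. */
#define NR_SIM_CPUS 4
static atomic_int done_qs[NR_SIM_CPUS];	/* zero-initialized: QS_DONE_QS */

/*
 * Move one CPU from DONE_QS to NEED_QS without taking a sleeping lock.
 * Returns true if this call made the transition, in which case the
 * caller would go on to wake that CPU's waiter.
 */
static bool expedited_wake_nonblocking(int cpu)
{
	int expected = QS_DONE_QS;

	return atomic_compare_exchange_strong(&done_qs[cpu], &expected,
					      QS_NEED_QS);
}
```

A second call for the same CPU returns false, which corresponds to the "if" test in the quoted handler failing.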


+/*
+ * Kernel thread that processes synchronize_sched_expedited() requests.
+ * This is implemented as a separate kernel thread to avoid the need
+ * to mess with other tasks' cpumasks.
+ */
+static int krcu_sched_expedited(void *arg)
+{
+	int cpu;
+	int mycpu;
+	int nwait;
+
+	do {
+		wait_event_interruptible(need_sched_expedited_wq,
+					 need_sched_expedited);
+		smp_mb(); /* In case we didn't sleep. */
+		if (!need_sched_expedited)
+			continue;
+		need_sched_expedited = 0;
+		get_online_cpus();
+		preempt_disable();
+		mycpu = smp_processor_id();
+		smp_call_function(sched_expedited_wake, NULL, 1);
+		preempt_enable();

I might be missing something fundamental here, but why not just have
per-CPU helper threads, all on the same waitqueue, and wake them up
via a single wake_up() call? That would remove the SMP cross call
(wakeups do immediate cross-calls already).

My concern with this is that the cache misses accessing all the processes
on this single waitqueue would be serialized, slowing things down.
In contrast, the bitmask that smp_call_function() traverses delivers on
the order of a thousand CPUs' worth of bits per cache miss.  I will give
it a try, though.
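
The "thousand CPUs' worth of bits per cache miss" figure can be checked with a little arithmetic. The sketch below assumes one bit per CPU, a cold cache, and a mask that starts on a cache-line boundary; the cache-line sizes are assumptions, not something stated in this thread, and both helper names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Number of CPUs whose mask bits fit in one cache line of line_bytes. */
static size_t cpus_per_cache_line(size_t line_bytes)
{
	return line_bytes * CHAR_BIT;
}

/* Cache lines (hence cold misses) touched when scanning nr_cpus bits. */
static size_t cpumask_cache_misses(size_t nr_cpus, size_t line_bytes)
{
	size_t per_line = cpus_per_cache_line(line_bytes);

	return (nr_cpus + per_line - 1) / per_line;	/* round up */
}
```

With a 128-byte line this gives 1024 CPUs per miss, and 512 with a 64-byte line. By contrast, a waitqueue with one entry per CPU could cost a miss per entry if each entry sits on its own cache line, which is the serialization being worried about above.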

Even more - we already have a per-CPU, high RT priority helper
thread that could be reused: the per-CPU migration threads. Couldn't
we queue these requests to them? RCU is arguably closely related to
scheduling, so there's no layering violation IMO.

There's already a struct migration_req machinery that performs 
something quite similar. (do work on behalf of another task, on a 
specific CPU, and then signal completion)

Also, per CPU workqueues have similar features as well.

Good points!!!

I will post a working patch using my current approach, then try out some
of these approaches.
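
The pattern Ingo describes for struct migration_req (queue work to a helper thread, have it run on the requester's behalf, then signal completion) can be illustrated in userspace. This is a rough pthreads sketch of that general pattern only; it is not the kernel's migration_req code, and every name in it (struct helper, helper_run_sync(), bump(), and so on) is hypothetical. A real per-CPU version would have one such helper bound to each CPU.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: work to run on the helper, plus a done flag. */
struct helper_req {
	void (*fn)(void *arg);
	void *arg;
	bool done;
	struct helper_req *next;
};

/* Hypothetical helper: one thread serving one request list. */
struct helper {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t wake;		/* signalled when work is queued */
	pthread_cond_t complete;	/* broadcast when a request finishes */
	struct helper_req *head;
	bool stop;
};

static void *helper_main(void *p)
{
	struct helper *h = p;

	pthread_mutex_lock(&h->lock);
	for (;;) {
		while (!h->head && !h->stop)
			pthread_cond_wait(&h->wake, &h->lock);
		if (!h->head)
			break;		/* stop requested, list drained */
		struct helper_req *req = h->head;
		h->head = req->next;
		pthread_mutex_unlock(&h->lock);
		req->fn(req->arg);	/* do the work for the requester */
		pthread_mutex_lock(&h->lock);
		req->done = true;	/* completion, published under lock */
		pthread_cond_broadcast(&h->complete);
	}
	pthread_mutex_unlock(&h->lock);
	return NULL;
}

static int helper_start(struct helper *h)
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->wake, NULL);
	pthread_cond_init(&h->complete, NULL);
	h->head = NULL;
	h->stop = false;
	return pthread_create(&h->thread, NULL, helper_main, h);
}

/* Queue fn(arg) on the helper and block until it has run. */
static void helper_run_sync(struct helper *h, void (*fn)(void *), void *arg)
{
	struct helper_req req = { .fn = fn, .arg = arg };

	pthread_mutex_lock(&h->lock);
	req.next = h->head;		/* push onto the request list */
	h->head = &req;
	pthread_cond_signal(&h->wake);
	while (!req.done)
		pthread_cond_wait(&h->complete, &h->lock);
	pthread_mutex_unlock(&h->lock);
}

static void helper_stop(struct helper *h)
{
	pthread_mutex_lock(&h->lock);
	h->stop = true;
	pthread_cond_signal(&h->wake);
	pthread_mutex_unlock(&h->lock);
	pthread_join(h->thread, NULL);
}

/* Example work item: increment the int that arg points to. */
static void bump(void *arg)
{
	++*(int *)arg;
}
```

Because the request lives on the requester's stack and the helper never touches it after setting done under the lock, no allocation is needed per request.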

							Thanx, Paul
