Thread: 21 messages, 5 authors, 2012-12-05

Re: [RFC PATCH 01/10] CPU hotplug: Introduce "stable" cpu online mask, for atomic hotplug readers

From: Michael Wang <hidden>
Date: 2012-12-05 03:29:12
Also in: lkml

On 12/05/2012 10:56 AM, Michael Wang wrote:
[...]
quoted
I wonder about the cpu-online case.  A typical caller might want to do:


/*
 * Set each online CPU's "foo" to "bar"
 */

int global_bar;

void set_cpu_foo(int bar)
{
	get_online_cpus_stable_atomic();
	global_bar = bar;
	for_each_online_cpu_stable()
		cpu->foo = bar;
	put_online_cpus_stable_atomic();
}

void cpu_online_notifier_handler(void)
{
	cpu->foo = global_bar;
}
Oh, forgive me for misunderstanding your question :(

In this case, we have to prevent hotplug from happening, not just
ensure that the online mask is correct.

Hmm..., this needs more consideration.

Regards,
Michael Wang
quoted
And I think that set_cpu_foo() would be buggy, because a CPU could come
online before global_bar was altered, and that newly-online CPU would
pick up the old value of `bar'.

So what's the rule here?  global_bar must be written before we run
get_online_cpus_stable_atomic()?

Anyway, please have a think and spell all this out?
That's right. Actually, this relates to one question: should hotplug be
blocked between get_online and put_online?

The answer is YES according to the old API, which uses a mutex: hotplug
won't happen in the critical section, but the cost is that get_online()
can block, which kills performance.

So we designed this mechanism to speed things up, but as you pointed
out, although the result will never be wrong, the 'stable' mask is not
really stable, since it can change inside the critical section.

And we have two solutions.

One is from Srivatsa, using 'read_lock' and 'write_lock'. It prevents
hotplug from happening, just like the old rule. The cost is that we need
a global 'rw_lock', which performs badly on NUMA systems, and no doubt
get_online() will block for a short time while hotplug is in progress.

The other is to maintain a per-cpu cached mask. This mask is only
updated in get_online() and is used in the critical section, so we get a
really stable mask. One flaw is that CPUs which are in their critical
sections at the same time may see different online masks.
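
A rough, single-threaded sketch of the cached-mask idea follows, only to
show the flaw mentioned above. All names (cpu_online_bits, cached_mask[],
get_online_cpus_cached(), cpu_online_cached(), hotplug_cpu_up()) are
hypothetical, and a real version would need per-cpu variables and proper
memory ordering, which are left out here.

```c
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical live online mask; bit N set means CPU N is online. */
static uint32_t cpu_online_bits = 0x1;
/* Hypothetical per-cpu snapshot, one slot per reader CPU. */
static uint32_t cached_mask[NR_CPUS];

/* Entering the critical section: snapshot the live mask for this CPU. */
void get_online_cpus_cached(int this_cpu)
{
	cached_mask[this_cpu] = cpu_online_bits;
}

/* Inside the critical section: consult only this CPU's snapshot. */
int cpu_online_cached(int this_cpu, int cpu)
{
	return (cached_mask[this_cpu] >> cpu) & 1;
}

/* Hotplug is not blocked; it changes the live mask at any time. */
void hotplug_cpu_up(int cpu)
{
	cpu_online_bits |= (uint32_t)1 << cpu;
}
```

If CPU 0 takes its snapshot, then CPU 1 comes online and takes its own
snapshot, CPU 0 still believes CPU 1 is offline while CPU 1 sees itself
as online. Each view is stable, but the two views disagree.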

We would appreciate any comments on which one to use in the next
version.

Regards,
Michael Wang
quoted
quoted
 struct take_cpu_down_param {
 	unsigned long mod;
 	void *hcpu;
@@ -246,7 +351,9 @@ struct take_cpu_down_param {
 static int __ref take_cpu_down(void *_param)
 {
 	struct take_cpu_down_param *param = _param;
-	int err;
+	int err, cpu = (long)(param->hcpu);
+
Like this please:

	int err;
	int cpu = (long)(param->hcpu);
quoted
+	prepare_cpu_take_down(cpu);
 
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();

...