Re: [PATCH 2/2] mm: page allocator: Do not drain per-cpu lists via IPI from page allocator context

From: Mel Gorman <mgorman@suse.de>
Date: 2012-01-20 08:48:48
Also in: linux-fsdevel, lkml

On Fri, Jan 20, 2012 at 03:16:58AM +0530, Srivatsa S. Bhat wrote:
> [Reinstating the original Cc list]
>
> On 01/19/2012 09:50 PM, Mel Gorman wrote:
> > On a different x86-64 machine with an intel-specific MCE, I have
> > also noted that the value of num_online_cpus() can change while
> > stop_machine() is running.
>
> That is expected and intentional right? Meaning, it is during the
> stop_machine() thing itself that a CPU is actually taken offline.
> And at the same time, it is removed from the cpu_online_mask.

It's intentional sometimes and not others. The machine does halt
sometimes and stays there.

> On Intel boxes, essentially, the following gets executed on the dying
> CPU, as set up by the stop_machine stuff.
>
> __cpu_disable()
>     native_cpu_disable()
>         cpu_disable_common()
>             remove_cpu_from_maps()
>                 set_cpu_online(cpu, false)
>                         ^^^^^^
> So, set_cpu_online will remove this CPU from the cpu_online_mask.
> And all this runs while still under the stop machine context.
> And this is exactly what we want right?

We don't want it to halt in stop_machine forever waiting on acknowledges
that are never received until the NMI handler fires.
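
To make the effect of the call chain quoted above concrete, here is a
minimal userspace sketch, not kernel code. It models the online mask as a
plain bitmask (the kernel uses struct cpumask) and every name ending in
_model, as well as stale_count_after_offline(), is hypothetical and exists
only for this illustration. It shows why a count sampled before the dying
CPU clears its bit no longer matches num_online_cpus() afterwards.

```c
#include <stdbool.h>

/* Userspace model only. NR_CPUS_MODEL and all *_model names are
 * hypothetical; the kernel's cpu_online_mask is a struct cpumask. */
#define NR_CPUS_MODEL 8

/* Start with all modelled CPUs online. */
static unsigned long online_mask_model = (1UL << NR_CPUS_MODEL) - 1;

/* Models set_cpu_online(cpu, online): set or clear one bit in the mask. */
static void set_cpu_online_model(unsigned int cpu, bool online)
{
    if (online)
        online_mask_model |= 1UL << cpu;
    else
        online_mask_model &= ~(1UL << cpu);
}

/* Models num_online_cpus(): count the bits currently set. */
static unsigned int num_online_cpus_model(void)
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
        if (online_mask_model & (1UL << cpu))
            n++;
    return n;
}

/* Sample the count, let the "dying" CPU clear its bit, and return how far
 * the earlier sample is now off from the live count. */
static unsigned int stale_count_after_offline(unsigned int dying_cpu)
{
    unsigned int before = num_online_cpus_model();

    set_cpu_online_model(dying_cpu, false);
    return before - num_online_cpus_model();
}
```

Anything that sampled the count before the bit was cleared and then waits
for that many responders is off by the value this helper returns, assuming
no other hotplug activity in between.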

> > This is sensitive to timing and part of
> > the problem seems to be due to cmci_rediscover() running without the
> > CPU hotplug mutex held. This is not related to the IPI mess and is
> > unrelated to memory pressure but is just to note that CPU hotplug in
> > general can be fragile in parts.
>
> For the cmci_rediscover() part, I feel a simple get/put_online_cpus()
> around it should work.

Yeah, that's the first thing I tried too. Doesn't work though.

-- 
Mel Gorman
SUSE Labs