Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's
From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2017-07-25 12:01:00
Nicholas Piggin [off-list ref] writes:
> On Mon, 24 Jul 2017 23:46:44 +1000 Michael Ellerman [off-list ref] wrote:
>> Nicholas Piggin [off-list ref] writes:
>>> On Mon, 24 Jul 2017 14:28:02 +1000 Benjamin Herrenschmidt [off-list ref] wrote:
>>>> Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask.
>>> ...
>>>
>>> Also does it make sense to define it based on NR_CPUS > BITS_PER_LONG? If it's <= then it should be similar load and compare, no?
>>
>> Do we make a machine with that few CPUs? ;) I don't think it's worth special casing, all the distros run with much much larger NR_CPUS than that.
>
> Not further special-casing, but just casing it based on NR_CPUS rather than BOOK3S.
The problem is the mm_context_t is defined based on BookE vs BookS etc. not based on NR_CPUS. So we'd have to add the atomic_t to all mm_context_t's, but #ifdef'ed based on NR_CPUS.

But then some platforms don't support SMP, so it's a waste there. The existing cpumask check compiles to ~= nothing on UP.

cheers
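
For readers following the thread, the counter idea from the quoted patch description can be modelled in plain userspace C. This is only a rough sketch, not the patch itself: `struct mm_model`, `MODEL_NR_CPUS`, and all the function names are hypothetical stand-ins, a plain `int` takes the place of the `atomic_t` discussed above, and no locking or memory ordering is modelled. It contrasts a whole-mask comparison with the "counter is 1 and my bit is set" test.

```c
/* Userspace model only; every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_NR_CPUS 256u  /* assumed value, standing in for a large NR_CPUS */
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define MASK_WORDS    ((MODEL_NR_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct mm_model {
    unsigned long cpumask[MASK_WORDS]; /* CPUs that have used this mm */
    int active_cpus;                   /* number of bits set in cpumask */
};

/* Return true if 'cpu' is set in the mm's mask. */
static bool mask_test(const struct mm_model *mm, unsigned int cpu)
{
    assert(cpu < MODEL_NR_CPUS);
    return (mm->cpumask[cpu / BITS_PER_WORD] >> (cpu % BITS_PER_WORD)) & 1UL;
}

/* Record that 'cpu' uses the mm; the counter is bumped once per new CPU. */
static void mm_model_mark_cpu(struct mm_model *mm, unsigned int cpu)
{
    if (!mask_test(mm, cpu)) {
        mm->cpumask[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
        mm->active_cpus++;
    }
}

/* Whole-mask check: build a one-bit mask and compare every word. */
static bool mm_is_local_by_mask(const struct mm_model *mm, unsigned int cpu)
{
    unsigned long only_me[MASK_WORDS] = { 0 };

    assert(cpu < MODEL_NR_CPUS);
    only_me[cpu / BITS_PER_WORD] = 1UL << (cpu % BITS_PER_WORD);
    return memcmp(mm->cpumask, only_me, sizeof(only_me)) == 0;
}

/* Counter-based check: one integer compare plus one bit test. */
static bool mm_is_local_by_count(const struct mm_model *mm, unsigned int cpu)
{
    return mm->active_cpus == 1 && mask_test(mm, cpu);
}
```

In this model both checks agree as long as the counter is kept in step with the mask. The counter version touches one integer and one mask word regardless of `MODEL_NR_CPUS`, whereas the whole-mask version scans every word, which is why the question of NR_CPUS <= BITS_PER_LONG came up: with a single-word mask the two are roughly the same cost.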