Re: [PATCH] memcg: remove unneeded preempt_disable
From: Peter Zijlstra <peterz@infradead.org>
Date: 2011-08-25 18:40:40
Also in:
linux-arch, lkml
On Thu, 2011-08-25 at 11:31 -0500, Christoph Lameter wrote:
> On Thu, 25 Aug 2011, James Bottomley wrote:
> > On Thu, 2011-08-25 at 10:11 -0500, Christoph Lameter wrote:
> > > On Thu, 25 Aug 2011, Peter Zijlstra wrote:
> > > > On Thu, 2011-08-18 at 14:40 -0700, Andrew Morton wrote:
> > > > > I think I'll apply it, as the call frequency is low (correct?)
> > > > > and the problem will correct itself as other architectures
> > > > > implement their atomic this_cpu_foo() operations.
> > > >
> > > > Which leads me to wonder, can anything but x86 implement that
> > > > this_cpu_* muck? I doubt any of the RISC chips can actually do
> > > > all this. Maybe Itanic, but then that seems to be dying fast.
> > >
> > > The cpu needs to have an RMW instruction that does something to a
> > > variable relative to a register that points to the per cpu base.
> > > That's generally possible. The problem is how expensive the RMW is
> > > going to be.
> >
> > RISC systems generally don't have a single instruction for this,
> > that's correct. Obviously we can do it as a non-atomic sequence: read
> > variable, compute relative, read, modify, write ... but there's
> > absolutely no point hand crafting that in asm since the compiler can
> > usually work it out nicely. And, of course, to have this atomic, we
> > have to use locks, which ends up being very expensive.
>
> ARM seems to have these LDREX/STREX instructions for that purpose which
> seem to be used for generating atomic instructions without locks. I
> guess other RISC architectures have similar means of doing it?

Even with LL/SC and the CPU base in a register you need to do something
like:

again:
	LL $target-reg, $cpubase-reg + offset
	<foo>
	SC $ret, $target-reg, $cpubase-reg + offset
	if !$ret goto again

It's the +offset that's problematic: it either doesn't exist or is very
limited (a quick look at the MIPS instruction set gives a limit of 64k).

Without the +offset you need:

again:
	$tmp-reg = $cpubase-reg
	$tmp-reg += offset;
	LL $target-reg, $tmp-reg
	<foo>
	SC $ret, $target-reg, $tmp-reg
	if !$ret goto again

Which is wide open to migration races.

Also, very often there are constraints on LL/SC that mandate we use
preempt_disable/enable around its use, which pretty much voids the whole
purpose, since if we disable preemption we might as well just use C (ARM
belongs in this class).

It does look like POWERPC's lwarx/stwcx is sane enough, although the
instruction reference I found doesn't list what happens if the LL/SC
doesn't use the same effective address or has other loads/stores in
between. If it's OK with those and simply fails the SC, it should be good.

Still, creating atomic ops for per-cpu ops might be more expensive than
simply doing the preempt-disable/rmw/enable dance. Dunno, I don't know
these archs that well.
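
For illustration only, the shape of the LL/SC retry loop can be approximated in userspace C11. This is a rough sketch, not kernel code: `NR_CPUS_SKETCH`, `percpu_counter`, `percpu_add_llsc()` and `percpu_read()` are hypothetical names, the per-cpu area is modelled as a plain array, and the weak compare-exchange stands in for the LL/SC pair (compilers typically build it from LL/SC on architectures that have it). The address is computed before the loop, as in the second sequence above; nothing here models or prevents a migration between that computation and the update.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count, illustration only */

/* Stand-in for a per-cpu variable: one slot per CPU. */
static _Atomic long percpu_counter[NR_CPUS_SKETCH];

/*
 * LL/SC-style add. A failed weak compare-exchange plays the role of
 * a failed SC: 'old' is reloaded with the current value and we retry.
 */
static void percpu_add_llsc(int cpu, long val)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);

	/* $tmp-reg = $cpubase-reg + offset, computed once up front */
	_Atomic long *slot = &percpu_counter[cpu];
	long old = atomic_load_explicit(slot, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(slot, &old, old + val,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;	/* 'old' was refreshed on failure; try again */
}

/* Read back one CPU's slot. */
static long percpu_read(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return atomic_load_explicit(&percpu_counter[cpu], memory_order_relaxed);
}
```

Calling `percpu_add_llsc(0, 5)` and then `percpu_add_llsc(0, 7)` leaves slot 0 holding 12 while the other slots stay at 0. Whether such a loop beats the preempt-disable/rmw/enable sequence on a given architecture is exactly the open question above.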