Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul E. McKenney <hidden>
Date: 2007-08-18 23:19:44
Also in:
linux-arch, lkml
On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:
> On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> > One of the gcc guys claimed that he thought that the two-instruction sequence would be faster on some x86 machines. I pointed out that there might be a concern about code size. I chose not to point out that people might also care about the other x86 machines. ;-)
>
> Some (very few) x86 uarchs do tend to prefer "load-store" like code generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can actually be faster on some of them. Not any that are relevant today, though.

;-)

> Also, that has nothing to do with volatile, and should be controlled by optimization flags (like -mtune). In fact, I thought there was a separate flag to do that (ie something like "-mload-store"), but I can't find it, so maybe that's just my fevered brain..

Good point, will suggest this if the need arises.

Thanx, Paul
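
For readers following along, here is a minimal sketch of the two code shapes being discussed. The names (counter, vcounter, add_plain, add_volatile) are hypothetical and not taken from the patch series, and the assembly in the comments is hand-written illustration, not the output of any particular gcc version. The point is only that both functions compute the same value, while the compiler may pick either an "op [mem]" form or a "mov [mem],reg + op reg" form to do it.

```c
/* Hypothetical variables for illustration only. */
static int counter = 40;
static volatile int vcounter = 40;

/*
 * Plain access. A compiler may fold the load into the arithmetic
 * instruction ("op [mem]"), roughly:
 *
 *     add  eax, [counter]
 */
int add_plain(int x)
{
	return x + counter;
}

/*
 * Volatile access. The load must be performed exactly once, but that
 * alone does not require a separate instruction. The two-instruction
 * "load-store" shape discussed above would look roughly like:
 *
 *     mov  edx, [vcounter]
 *     add  eax, edx
 */
int add_volatile(int x)
{
	return x + vcounter;
}
```

To see what a given compiler actually emits, something like "gcc -O2 -S" on this file shows the generated assembly, and -mtune=<cpu> selects the microarchitecture the code is tuned for. Which form comes out depends on the compiler version and flags, so no specific output is claimed here.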