Thread (305 messages) 305 messages, 27 authors, 2007-09-11

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney <hidden>
Date: 2007-08-18 23:19:44
Also in: linux-arch, lkml

On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:

On Sat, 18 Aug 2007, Paul E. McKenney wrote:
quoted
One of the gcc guys claimed that he thought that the two-instruction
sequence would be faster on some x86 machines.  I pointed out that
there might be a concern about code size.  I chose not to point out
that people might also care about the other x86 machines.  ;-)
Some (very few) x86 uarchs do tend to prefer "load-store" like code 
generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can 
actually be faster on some of them. Not any that are relevant today, 
though.
;-)
Also, that has nothing to do with volatile, and should be controlled by 
optimization flags (like -mtune). In fact, I thought there was a separate 
flag to do that (ie something like "-mload-store"), but I can't find it, 
so maybe that's just my fevered brain..
Good point, will suggest this if the need arises.

							Thanx, Paul
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help