Thread (305 messages) 305 messages, 27 authors, 2007-09-11

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney <hidden>
Date: 2007-08-17 05:18:29
Also in: lkml, netdev

On Thu, Aug 16, 2007 at 08:42:23PM -0700, Linus Torvalds wrote:

On Fri, 17 Aug 2007, Paul Mackerras wrote:
quoted
I'm really surprised it's as much as a few K.  I tried it on powerpc
and it only saved 40 bytes (10 instructions) for a G5 config.
One of the things that "volatile" generally screws up is a simple

	volatile int i;

	i++;

which a compiler will generally get horribly, horribly wrong.

In a reasonable world, gcc should just make that be (on x86)

	addl $1,i(%rip)

on x86-64, which is indeed what it does without the volatile. But with the 
volatile, the compiler gets really nervous, and doesn't dare do it in one 
instruction, and thus generates crap like

        movl    i(%rip), %eax
        addl    $1, %eax
        movl    %eax, i(%rip)
Blech.  Sounds like a chat with some gcc people is in order.  Will
see what I can do.

						Thanx, Paul
instead. For no good reason, except that "volatile" just doesn't have any 
good/clear semantics for the compiler, so most compilers will just make it 
be "I will not touch this access in any way, shape, or form". Including 
even trivially correct instruction optimization/combination.

This is one of the reasons why we should never use "volatile". It 
pessimises code generation for no good reason - just because compilers 
don't know what the heck it even means! 

Now, people don't do "i++" on atomics (you'd use "atomic_inc()" for that), 
but people *do* do things like

	if (atomic_read(..) <= 1)
		..

On ppc, things like that probably don't much matter. But on x86, it makes 
a *huge* difference whether you do

	movl i(%rip),%eax
	cmpl $1,%eax

or if you can just use the value directly for the operation, like this:

	cmpl $1,i(%rip)

which is again a totally obvious and totally safe optimization, but is 
(again) something that gcc doesn't dare do, since "i" is volatile.

In other words: "volatile" is a horribly horribly bad way of doing things, 
because it generates *worse*code*, for no good reason. You just don't see 
it on powerpc, because it's already a load-store architecture, so there is 
no "good code" for doing direct-to-memory operations.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help