Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: 2007-08-17 03:51:45
Also in:
lkml, netdev
On Fri, 17 Aug 2007, Nick Piggin wrote:
I'm surprised too. Numbers were from the "...use asm() like the other atomic operations already do" thread. According to them, text data bss dec hex filename 3434150 249176 176128 3859454 3ae3fe atomic_normal/vmlinux 3436203 249176 176128 3861507 3aec03 atomic_volatile/vmlinux The first one is a stock kenel, the second is with atomic_read/set cast to volatile. gcc-4.1 -- maybe if you have an earlier gcc it won't optimise as much?
No, see my earlier reply. "volatile" really *is* an incredible piece of
crap.
Just try it yourself:
volatile int i;
int j;
int testme(void)
{
return i <= 1;
}
int testme2(void)
{
return j <= 1;
}
and compile with all the optimizations you can.
I get:
testme:
movl i(%rip), %eax
subl $1, %eax
setle %al
movzbl %al, %eax
ret
vs
testme2:
xorl %eax, %eax
cmpl $1, j(%rip)
setle %al
ret
(now, whether that "xorl + setle" is better than "setle + movzbl", I don't
really know - maybe it is. But that's not thepoint. The point is the
difference between
movl i(%rip), %eax
subl $1, %eax
and
cmpl $1, j(%rip)
and imagine this being done for *every* single volatile access.
Just do a
git grep atomic_read
to see how atomics are actually used. A lot of them are exactly the above
kind of "compare against a value".
Linus