Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma <hidden>
Date: 2007-08-16 01:06:35
Also in: linux-arch, lkml

Hi Herbert,

On Thu, 16 Aug 2007, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
> > > The udelay itself certainly should have some form of cpu_relax in it.
> >
> > Yes, a form of barrier() must be present in mdelay() or udelay() itself
> > as you say, having it in __const_udelay() is *not* enough (superfluous
> > actually, considering it is already a separate translation unit and
> > invisible to the compiler).
>
> As long as __const_udelay does something which has the same effect as
> barrier it is enough even if it's in the same unit.

Only if __const_udelay() is inlined. But as I said, __const_udelay() -- although marked "inline" -- will never actually be inlined anywhere in the kernel. It's an exported symbol, and is never inlined from modules. Even for built-in targets, the definition of __const_udelay() is invisible when gcc compiles the compilation units of those callsites. The compiler has no idea whether that function has barriers or not, so we're saved here _only_ by the lucky fact that __const_udelay() is in a different compilation unit.
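
To make the inlined case concrete, here is a minimal sketch (not kernel code) of a delay helper whose body is visible at the callsite. The names `ready`, `visible_delay()` and `poll_ready()` are made up for illustration, and the barrier() definition is the usual GCC-style one, assumed here rather than taken from any header. When the helper's body is visible like this, the barrier() inside it is the only thing that forces the flag to be re-read on every iteration:

```c
/* GCC-style compiler barrier: emits no instructions, but the "memory"
 * clobber tells the compiler that memory may have changed. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical flag, imagined to be set by an interrupt handler. */
int ready;

/* Hypothetical delay helper whose body IS visible at the callsite.
 * If the barrier() were removed, the compiler would be free to inline
 * this and hoist the load of `ready` out of the caller's loop. */
static inline void visible_delay(void)
{
	barrier();
}

/* Spin until `ready` is non-zero or max_spins is exhausted.
 * Returns the number of spins performed. */
int poll_ready(int max_spins)
{
	int spins = 0;

	while (!ready && spins < max_spins) {
		visible_delay();	/* `ready` is re-read after this */
		spins++;
	}
	return spins;
}
```

If visible_delay() instead lived in another compilation unit, the call itself would have the same effect on `ready`, whatever the callee contained.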

> As a matter of fact it does on i386 where __delay either uses rep_nop
> or asm/volatile.

__delay() can be either delay_tsc() or delay_loop() on i386. delay_tsc() uses rep_nop() there for its own little busy loop, actually. But for a call site that inlines __const_udelay() -- if it were ever moved to a .h file and marked inline -- the call to __delay() will _still_ be across compilation units. So, again, in this case it does not matter whether the callee has compiler barriers or not (it would've been a different story if we were discussing real/CPU barriers, I think). What saves us here is just the fact that a call is made to a function from a different compilation unit, which is invisible to the compiler when compiling the callsite, and hence acts as the compiler barrier.

Regarding delay_loop(), it uses "volatile" for the "asm", which has quite different semantics from the C language "volatile" type-qualifier keyword and does not imply any compiler barrier at all.

Satyam
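
The delay_loop() point above can be sketched as follows. This is a simplified illustration, not the i386 implementation: the function names and the `done` flag are hypothetical, and the asm bodies are deliberately empty so the sketch is not tied to one architecture. The only difference between the two delay helpers is the "memory" clobber, and that clobber (not the "volatile" on the asm) is what makes the statement a compiler barrier:

```c
/* Hypothetical flag, imagined to be set asynchronously. */
int done;

/* "asm volatile" WITHOUT a "memory" clobber: the asm is not deleted,
 * so the loop survives optimization, but the compiler may still keep
 * values loaded from memory cached in registers across it. */
static inline void delay_no_clobber(unsigned long loops)
{
	while (loops--)
		__asm__ __volatile__("" : "+r" (loops));
}

/* Same loop WITH a "memory" clobber: now every iteration also acts as
 * a compiler barrier, so memory must be re-read afterwards. */
static inline void delay_with_clobber(unsigned long loops)
{
	while (loops--)
		__asm__ __volatile__("" : "+r" (loops) : : "memory");
}

/* If the delay is inlined, nothing here forces `done` to be re-read. */
int wait_done_unsafe(int max_tries)
{
	int tries = 0;

	while (!done && tries < max_tries) {
		delay_no_clobber(100);
		tries++;
	}
	return tries;
}

/* The clobber forces `done` to be re-read on every iteration. */
int wait_done(int max_tries)
{
	int tries = 0;

	while (!done && tries < max_tries) {
		delay_with_clobber(100);
		tries++;
	}
	return tries;
}
```

In a single-threaded run both variants return the same counts; the difference only shows up in what the compiler is permitted to do when `done` changes behind its back.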