Thread (305 messages) 305 messages, 27 authors, 2007-09-11

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Satyam Sharma <hidden>
Date: 2007-08-16 01:06:35
Also in: linux-arch, lkml

Hi Herbert,


On Thu, 16 Aug 2007, Herbert Xu wrote:
On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
quoted
quoted
The udelay itself certainly should have some form of cpu_relax in it.
Yes, a form of barrier() must be present in mdelay() or udelay() itself
as you say, having it in __const_udelay() is *not* enough (superflous
actually, considering it is already a separate translation unit and
invisible to the compiler).
As long as __const_udelay does something which has the same
effect as barrier it is enough even if it's in the same unit.
Only if __const_udelay() is inlined. But as I said, __const_udelay()
-- although marked "inline" -- will never be inlined anywhere in the
kernel in reality. It's an exported symbol, and never inlined from
modules. Even from built-in targets, the definition of __const_udelay
is invisible when gcc is compiling the compilation units of those
callsites. The compiler has no idea that that function has barriers
or not, so we're saved here _only_ by the lucky fact that
__const_udelay() is in a different compilation unit.

As a matter of fact it does on i386 where __delay either uses
rep_nop or asm/volatile.
__delay() can be either delay_tsc() or delay_loop() on i386.

delay_tsc() uses the rep_nop() there for it's own little busy
loop, actually. But for a call site that inlines __const_udelay()
-- if it were ever moved to a .h file and marked inline -- the
call to __delay() will _still_ be across compilation units. So,
again for this case, it does not matter if the callee function
has compiler barriers or not (it would've been a different story
if we were discussing real/CPU barriers, I think), what saves us
here is just the fact that a call is made to a function from a
different compilation unit, which is invisible to the compiler
when compiling the callsite, and hence acting as the compiler
barrier.

Regarding delay_loop(), it uses "volatile" for the "asm" which
has quite different semantics from the C language "volatile"
type-qualifier keyword and does not imply any compiler barrier
at all.


Satyam
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help