Thread (305 messages) 305 messages, 27 authors, 2007-09-11

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney <hidden>
Date: 2007-08-15 19:07:39
Also in: linux-arch, lkml

On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote:
Hi Paul,
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
quoted
On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
quoted
[...]
No, I'd actually prefer something like what Christoph Lameter suggested,
i.e. users (such as above) who want "volatile"-like behaviour from atomic
ops can use alternative functions. How about something like:

#define atomic_read_volatile(v)			\
	({					\
		forget(&(v)->counter);		\
		((v)->counter);			\
	})
Wouldn't the above "forget" the value, throw it away, then forget
that it forgot it, giving non-volatile semantics?
Nope, I don't think so. I wrote the following trivial testcases:
[ See especially tp4.c and tp4.s (last example). ]
Right.  I should have said "wouldn't the compiler be within its rights
to forget the value, throw it away, then forget that it forgot it".
The value coming out of the #define above is an unadorned ((v)->counter),
which has no volatile semantics.
==============================================================================
$ cat tp1.c # Using volatile access casts

#define atomic_read(a)	(*(volatile int *)&a)

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp1.c; cat tp1.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	movl	a, %eax
	testl	%eax, %eax
	jne	.L2
	popl	%ebp
	ret
==============================================================================
$ cat tp2.c # Using nothing; gcc will optimize the whole loop away

#define forget(x)

#define atomic_read(a)		\
	({			\
		forget(&(a));	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp2.c; cat tp2.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	popl	%ebp
	movl	$0, a
	ret
==============================================================================
$ cat tp3.c # Using a full memory clobber barrier

#define forget(x)	asm volatile ("":::"memory")

#define atomic_read(a)		\
	({			\
		forget(&(a));	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp3.c; cat tp3.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	cmpl	$0, a
	jne	.L2
	popl	%ebp
	ret
==============================================================================
$ cat tp4.c # Using a forget(var) macro

#define forget(a)	__asm__ __volatile__ ("" :"=m" (a) :"m" (a))

#define atomic_read(a)		\
	({			\
		forget(a);	\
		(a);		\
	})

int a;

void func(void)
{
	a = 0;
	while (atomic_read(a))
		;
}
==============================================================================
$ gcc -Os -S tp4.c; cat tp4.s

func:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$0, a
.L2:
	cmpl	$0, a
	jne	.L2
	popl	%ebp
	ret
==============================================================================

Possibly these were too trivial to expose any potential problems that you
may have been referring to, so would be helpful if you could write a more
concrete example / sample code.
The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers.  If the value is non-volatile,
it will refetch it (and expect it not to have changed, possibly being
disappointed by an interrupt handler running on that same CPU).
quoted
quoted
Or possibly, implement these "volatile" atomic ops variants in inline asm
like the patch that Sebastian Siewior has submitted on another thread just
a while back.
Given that you are advocating a change (please keep in mind that
atomic_read() and atomic_set() had volatile semantics on almost all
platforms), care to give some example where these historical volatile
semantics are causing a problem?
[...]
Can you give even one example
where the pre-existing volatile semantics are causing enough of a problem
to justify adding yet more atomic_*() APIs?
Will take this to the other sub-thread ...
OK.
quoted
quoted
Of course, if we find there are more callers in the kernel who want the
volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions to
somethine else, say "atomic_read_nonvolatile" for all I care.
Do we really need another set of APIs?
Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.
Well, if the second set doesn't care, they should be OK with the volatile
behavior in this case.

							Thanx, Paul
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help