Thread: 305 messages, 27 authors, 2007-09-11

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

From: Paul E. McKenney <hidden>
Date: 2007-08-16 16:35:08
Also in: lkml, netdev

On Thu, Aug 16, 2007 at 06:42:50PM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:31:03PM +0200, Stefan Richter wrote:
> > PS:  Just to clarify, I'm not speaking for the volatile modifier.  I'm
> > not speaking for any particular implementation of atomic_t and its
> > accessors at all.  All I am saying is that
> >   - we use atomically accessed data types because we concurrently but
> >     locklessly access this data,
> >   - hence a read access to this data that could be optimized away
> >     makes *no sense at all*.
>
> No sane compiler can optimise away an atomic_read per se.
> That's only possible if there's a preceding atomic_set or
> atomic_read, with no barriers in the middle.
>
> If that's the case, then one has to conclude that doing
> away with the second read is acceptable, as otherwise
> a memory (or at least a compiler) barrier should have been
> used.

The compiler can also reorder non-volatile accesses.  For an example
patch that cares about this, please see:

	http://lkml.org/lkml/2007/8/7/280

This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
rcu_read_unlock() to ensure that accesses aren't reordered with respect
to interrupt handlers and NMIs/SMIs running on that same CPU.
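
To make the compiler-ordering point concrete, here is a minimal userspace sketch. It is not the code from the patch linked above: VOLATILE_ACCESS(), nesting, and the demo_* functions are hypothetical names invented for illustration, and the macro relies on the GCC/Clang __typeof__ extension. The idea is only that an access made through a volatile-qualified lvalue must actually be emitted, and must stay in program order relative to other volatile accesses.

```c
#include <assert.h>

/* Hypothetical helper (an assumption, not the patch's macro): force a real
 * access to x by going through a volatile-qualified lvalue. */
#define VOLATILE_ACCESS(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical read-side nesting depth that a handler on the same CPU
 * might inspect. */
static int nesting;

static void demo_read_lock(void)
{
	/* Volatile load and store: the compiler may not elide them or
	 * reorder them against other volatile accesses. */
	VOLATILE_ACCESS(nesting) = VOLATILE_ACCESS(nesting) + 1;
}

static void demo_read_unlock(void)
{
	assert(VOLATILE_ACCESS(nesting) > 0);
	VOLATILE_ACCESS(nesting) = VOLATILE_ACCESS(nesting) - 1;
}

/* What a handler on the same CPU would check. */
static int demo_handler_sees_reader(void)
{
	return VOLATILE_ACCESS(nesting) != 0;
}
```

Note that this constrains only the compiler. It says nothing about ordering as seen by other CPUs, which would still need memory barriers.
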
> In fact, volatile doesn't guarantee that the memory gets
> read anyway.  You might be reading some stale value out
> of the cache.  Granted this doesn't happen on x86 but
> when you're coding for the kernel you can't make such
> assumptions.
>
> So the point here is that if you don't mind getting a stale
> value from the CPU cache when doing an atomic_read, then
> surely you won't mind getting a stale value from the compiler
> "cache".

Absolutely disagree.  An interrupt/NMI/SMI handler running on the CPU
will see the same value (whether in cache or in store buffer) that
the mainline code will see.  In this case, we don't care about CPU
misordering, only about compiler misordering.  It is easy to see
other uses that combine communication with handlers on the current
CPU with communication among CPUs -- again, see prior messages in
this thread.
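
A rough userspace analogue of the same-CPU case is a signal handler in a single-threaded program: the handler runs on the same thread as the mainline code, so hardware ordering is not the issue, only what the compiler does with the load. The sketch below uses only standard C (signal(), raise(), volatile sig_atomic_t); handler_ran, on_signal(), and wait_for_handler() are hypothetical names, and the synchronous raise() merely stands in for a truly asynchronous interrupt.

```c
#include <signal.h>

/* Written by the handler, read by mainline code on the same thread. */
static volatile sig_atomic_t handler_ran;

static void on_signal(int signo)
{
	(void)signo;
	handler_ran = 1;
}

/* Returns the number of loop iterations taken before the handler's
 * store became visible to the mainline code. */
static int wait_for_handler(void)
{
	int spins = 0;

	handler_ran = 0;
	signal(SIGINT, on_signal);

	/* Because handler_ran is volatile, the compiler must reload it on
	 * every iteration and cannot hoist the load out of the loop. */
	while (!handler_ran) {
		spins++;
		raise(SIGINT);	/* stand-in for an asynchronous interrupt */
	}
	return spins;
}
```

With a real asynchronous interrupt there would be no function call inside the loop, so the volatile qualifier would be the only thing keeping the compiler from reading the flag once and spinning forever on the cached value.
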
> > So, the architecture guys can implement atomic_read however they want
> > --- as long as it cannot be optimized away.*
>
> They can implement it however they want as long as it stays
> atomic.

Precisely.  And volatility is a key property of "atomic".  Let's please
not throw it away.

						Thanx, Paul