Thread (274 messages) 274 messages, 15 authors, 2014-03-07

Re: [RFC][PATCH 0/5] arch: atomic rework

From: Torvald Riegel <hidden>
Date: 2014-02-12 05:14:31
Also in: lkml

On Sun, 2014-02-09 at 19:51 -0800, Paul E. McKenney wrote:
On Mon, Feb 10, 2014 at 01:06:48AM +0100, Torvald Riegel wrote:
quoted
On Thu, 2014-02-06 at 20:20 -0800, Paul E. McKenney wrote:
quoted
On Fri, Feb 07, 2014 at 12:44:48AM +0100, Torvald Riegel wrote:
quoted
On Thu, 2014-02-06 at 14:11 -0800, Paul E. McKenney wrote:
quoted
On Thu, Feb 06, 2014 at 10:17:03PM +0100, Torvald Riegel wrote:
quoted
On Thu, 2014-02-06 at 11:27 -0800, Paul E. McKenney wrote:
quoted
On Thu, Feb 06, 2014 at 06:59:10PM +0000, Will Deacon wrote:
quoted
There are also so many ways to blow your head off it's untrue. For example,
cmpxchg takes a separate memory model parameter for failure and success, but
then there are restrictions on the sets you can use for each. It's not hard
to find well-known memory-ordering experts shouting "Just use
memory_model_seq_cst for everything, it's too hard otherwise". Then there's
the fun of load-consume vs load-acquire (arm64 GCC completely ignores consume
atm and optimises all of the data dependencies away) as well as the definition
of "data races", which seem to be used as an excuse to miscompile a program
at the earliest opportunity.
Trust me, rcu_dereference() is not going to be defined in terms of
memory_order_consume until the compilers implement it both correctly and
efficiently.  They are not there yet, and there is currently no shortage
of compiler writers who would prefer to ignore memory_order_consume.
Do you have any input on
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448?  In particular, the
language standard's definition of dependencies?
Let's see...  1.10p9 says that a dependency must be carried unless:

— B is an invocation of any specialization of std::kill_dependency (29.3), or
— A is the left operand of a built-in logical AND (&&, see 5.14) or logical OR (||, see 5.15) operator,
or
— A is the left operand of a conditional (?:, see 5.16) operator, or
— A is the left operand of the built-in comma (,) operator (5.18);

So the use of "flag" before the "?" is ignored.  But the "flag - flag"
after the "?" will carry a dependency, so the code fragment in 59448
needs to do the ordering rather than just optimizing "flag - flag" out
of existence.  One way to do that on both ARM and Power is to actually
emit code for "flag - flag", but there are a number of other ways to
make that work.
And that's what would concern me, considering that these requirements
seem to be able to creep out easily.  Also, whereas the other atomics
just constrain compilers wrt. reordering across atomic accesses or
changes to the atomic accesses themselves, the dependencies are new
requirements on pieces of otherwise non-synchronizing code.  The latter
seems far more involved to me.
Well, the wording of 1.10p9 is pretty explicit on this point.
There are only a few exceptions to the rule that dependencies from
memory_order_consume loads must be tracked.  And to your point about
requirements being placed on pieces of otherwise non-synchronizing code,
we already have that with plain old load acquire and store release --
both of these put ordering constraints that affect the surrounding
non-synchronizing code.
I think there's a significant difference.  With acquire/release or more
general memory orders, it's true that we can't order _across_ the atomic
access.  However, we can reorder and optimize without additional
constraints if we do not reorder.  This is not the case with consume
memory order, as the (p + flag - flag) example shows.
Agreed, memory_order_consume does introduce additional restrictions.
quoted
quoted
This issue got a lot of discussion, and the compromise is that
dependencies cannot leak into or out of functions unless the relevant
parameters or return values are annotated with [[carries_dependency]].
This means that the compiler can see all the places where dependencies
must be tracked.  This is described in 7.6.4.
I wasn't aware of 7.6.4 (but it isn't referred to as an additional
constraint--what it is--in 1.10, so I guess at least that should be
fixed).
Also, AFAIU, 7.6.4p3 is wrong in that the attribute does make a semantic
difference, at least if one is assuming that normal optimization of
sequential code is the default, and that maintaining things such as
(flag-flag) is not; if optimizing away (flag-flag) would require the
insertion of fences unless there is the carries_dependency attribute,
then this would be bad I think.
No, the attribute does not make a semantic difference.  If a dependency
flows into a function without [[carries_dependency]], the implementation
is within its right to emit an acquire barrier or similar.
So you can't just ignore the attribute when generating code -- you would
at the very least have to do this consistently across all the compilers
compiling parts of your code.  Which tells me that it does make a
semantic difference.  But I know that what should or should not be an
attribute is controversial.
quoted
IMHO, the dependencies construct (carries_dependency, kill_dependency)
seem to be backwards to me.  They assume that the compiler preserves all
those dependencies including (flag-flag) by default, which prohibits
meaningful optimizations.  Instead, I guess there should be a construct
for explicitly exploiting the dependencies (e.g., a
preserve_dependencies call, whose argument will not be optimized fully).
Exploiting dependencies will be special code and isn't the common case,
so it can be require additional annotations.
If you are compiling a function that has no [[carries_dependency]]
attributes on it arguments and return value, and none on any of the
functions that it calls, and contains no memomry_order_consume loads,
then you can break dependencies all you like within that function.

That said, I am of course open to discussing alternative approaches.
Especially those that ease the migration of the existing code in the
Linux kernel that rely on dependencies.  ;-)
quoted
quoted
If a dependency chain
headed by a memory_order_consume load goes into or out of a function
without the aid of the [[carries_dependency]] attribute, the compiler
needs to do something else to enforce ordering, e.g., emit a memory
barrier.
I agree that this is a way to see it.  But I can't see how this will
motivate compiler implementers to not just emit a stronger barrier right
away.
That certainly has been the most common approach.
quoted
quoted
From a Linux-kernel viewpoint, this is a bit ugly, as it requires
annotations and use of kill_dependency, but it was the best I could do
at the time.  If things go as they usually do, there will be some other
reason why those are needed...
Did you consider something along the "preserve_dependencies" call?  If
so, why did you go for kill_dependency?
Could you please give more detail on what a "preserve_dependencies" call
would do and where it would be used?
Demarcating regions of code (or particular expressions) that require the
preservation of source-code-level dependencies (eg, "p + flag - flag"),
and which thus constraints optimizations allowed on normal code.

What we have right now is a "blacklist" of things that kill
dependencies, which still requires compilers to not touch things like
"flag - flag".  Doing so isn't useful in most code, so it would be
better to have a "whitelist" of things in which dependencies are
strictly preserved.  The list of relevant expressions found in the
current RCU uses might be one such list.  Another would be to require
programmers to explicitly annotate the expressions where those
dependencies should be specially preserved, with something like a
"preserve_dependencies(p + flag - flag)" call.

I agree that this might be harder when dependency tracking is scattered
throughout a larger region of code, as you pointed out today.  But I'm
just looking for a setting that is more likely to see good support by
compilers.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help