Re: perf events ring buffer memory barrier on powerpc
From: Victor Kaplansky <hidden>
Date: 2013-11-01 13:13:12
Also in: lkml
"Paul E. McKenney" [off-list ref] wrote on 10/31/2013 08:16:02 AM:
> > BTW, it is why you also don't need ACCESS_ONCE() around @tail, but only around @head read.
Just to be sure that we are talking about the same code: I was considering
the ACCESS_ONCE() around @tail at point AAA in the following example from
Documentation/circular-buffers.txt for the CONSUMER:
	unsigned long head = ACCESS_ONCE(buffer->head);
	unsigned long tail = buffer->tail;      /* AAA */

	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
		/* read index before reading contents at that index */
		smp_read_barrier_depends();

		/* extract one item from the buffer */
		struct item *item = buffer[tail];

		consume_item(item);

		smp_mb(); /* finish reading descriptor before incrementing tail */

		buffer->tail = (tail + 1) & (buffer->size - 1); /* BBB */
	}
> If you omit the ACCESS_ONCE() calls around @tail, the compiler is within its rights to combine adjacent operations and also to invent loads and stores, for example, in cases of register pressure.
Right. And I was fully aware of these possible transformations when I said that the ACCESS_ONCE() around @tail at point AAA is redundant. Reads of @tail in the consumer code that are moved, or even eliminated entirely, are not a problem at all, since @tail is written exclusively by the CONSUMER side.
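A minimal sketch of the single-writer argument, in user-space C rather than kernel code. The names `struct ring`, `consume_one`, and `RING_SIZE`, as well as the local `ACCESS_ONCE` and `CIRC_CNT` definitions, are assumptions for illustration (the macro mirrors the kernel's volatile-cast idiom and relies on the GNU `__typeof__` extension). Memory barriers are left out because the sketch is meant to run single-threaded; it only shows that a plain read of @tail on the consumer side can observe nothing but the consumer's own last store.

```c
/* Assumption: user-space stand-ins for the kernel macros (GNU C). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define CIRC_CNT(head, tail, size) (((head) - (tail)) & ((size) - 1))

#define RING_SIZE 8UL /* must be a power of two */

struct ring {
	unsigned long head;   /* written only by the producer */
	unsigned long tail;   /* written only by the consumer */
	int slot[RING_SIZE];
};

/* Consumer side: returns 1 and stores one item in *out, or 0 if empty. */
static int consume_one(struct ring *r, int *out)
{
	/* The producer may update head concurrently, so read it exactly once. */
	unsigned long head = ACCESS_ONCE(r->head);
	/* AAA: plain read; the consumer is the only writer of tail. */
	unsigned long tail = r->tail;

	if (CIRC_CNT(head, tail, RING_SIZE) < 1)
		return 0;

	*out = r->slot[tail];
	r->tail = (tail + 1) & (RING_SIZE - 1); /* BBB */
	return 1;
}
```

Calling `consume_one()` repeatedly on a ring with head == 3 and tail == 0 returns the three stored items in order and then returns 0, whether or not the read at AAA is wrapped in `ACCESS_ONCE()`.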
> It is also within its rights to do piece-at-a-time loads and stores, which might sound unlikely, but which has actually happened when the compiler figures out exactly what is to be stored at compile time, especially on hardware that only allows small immediate values.
As for writes to @tail, the ACCESS_ONCE() around @tail at point AAA does not in any way prevent an imaginary super-optimizing compiler from moving the store to @tail (which appears in the code at point BBB). That is why ACCESS_ONCE() at point AAA is completely redundant.

-- Victor
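The store-side point can be sketched as follows, again in user-space C with a locally defined `ACCESS_ONCE` (an assumption that mirrors the kernel's volatile cast and needs the GNU `__typeof__` extension). `struct ring_idx`, `advance_tail_plain`, and `advance_tail_once` are hypothetical names. The volatile qualification applies only to the access it wraps, so annotating the load at AAA leaves the store at BBB an ordinary store; if the store were the concern, the macro would have to wrap the store itself.

```c
/* Assumption: user-space stand-in for the kernel macro (GNU C). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct ring_idx {
	unsigned long head;
	unsigned long tail;
	unsigned long size; /* must be a power of two */
};

/* Volatile load at AAA, plain store at BBB: the store stays unconstrained. */
static void advance_tail_plain(struct ring_idx *r)
{
	unsigned long tail = ACCESS_ONCE(r->tail);          /* AAA */

	r->tail = (tail + 1) & (r->size - 1);               /* BBB: ordinary store */
}

/* Plain load at AAA, volatile store at BBB: the store is a volatile access. */
static void advance_tail_once(struct ring_idx *r)
{
	unsigned long tail = r->tail;                       /* AAA */

	ACCESS_ONCE(r->tail) = (tail + 1) & (r->size - 1);  /* BBB */
}
```

Both functions compute the same new @tail; they differ only in which of the two accesses the compiler must treat as volatile.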