Re: perf events ring buffer memory barrier on powerpc
From: Peter Zijlstra <peterz@infradead.org>
Date: 2013-11-01 16:11:48
Also in: lkml
On Wed, Oct 30, 2013 at 11:40:15PM -0700, Paul E. McKenney wrote:
> > void kbuf_write(int sz, void *buf)
> > {
> > 	u64 tail = ACCESS_ONCE(ubuf->tail); /* last location userspace read */
> > 	u64 offset = kbuf->head; /* we already know where we last wrote */
> > 	u64 head = offset + sz;
> >
> > 	if (!space(tail, offset, head)) {
> > 		/* discard @buf */
> > 		return;
> > 	}
> >
> > 	/*
> > 	 * Ensure that if we see the userspace tail (ubuf->tail) such
> > 	 * that there is space to write @buf without overwriting data
> > 	 * userspace hasn't seen yet, we won't in fact store data before
> > 	 * that read completes.
> > 	 */
> >
> > 	smp_mb(); /* A, matches with D */
> >
> > 	write(kbuf->data + offset, buf, sz);
> > 	kbuf->head = head % kbuf->size;
> >
> > 	/*
> > 	 * Ensure that we write all the @buf data before we update the
> > 	 * userspace visible ubuf->head pointer.
> > 	 */
> > 	smp_wmb(); /* B, matches with C */
> >
> > 	ubuf->head = kbuf->head;
> > }
> >
> > Now the whole crux of the question is if we need barrier A at all, since
> > the STORES issued by the @buf writes are dependent on the ubuf->tail
> > read.
>
> The dependency you are talking about is via the "if" statement?
> Even C/C++11 is not required to respect control dependencies.

But surely we must be able to make it so; otherwise you'd never be able
to write:
void *ptr = obj1;
void foo(void)
{
	/* create obj2, obj3 */

	smp_wmb(); /* ensure the objs are complete */

	/* expose either obj2 or obj3 */
	if (x)
		ptr = obj2;
	else
		ptr = obj3;

	/* free the unused one */
	if (x)
		free(obj3);
	else
		free(obj2);
}
Earlier you said that 'volatile' or '__atomic' avoids speculative
writes; so would:
volatile void *ptr = obj1;
Make the compiler respect control dependencies again? If so, could we
somehow mark that !space() condition volatile?
Currently the above would be considered a valid pattern. But you're
saying it's not because the compiler is free to expose both obj2 and obj3
(for however short a time) and thus the free of the 'unused' object is
incorrect and can cause use-after-free.
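
A minimal userspace sketch of the 'volatile' idea, applied to the foo()
pattern above. ACCESS_ONCE_SKETCH(), expose_one() and the int-typed
placeholder objects are hypothetical names used only for illustration, and
__typeof__ is a GCC/Clang extension. The sketch makes the access to ptr
itself volatile; a declaration of the form 'volatile void *ptr' qualifies
the pointed-to object rather than the pointer. Since volatile accesses
count as observable behaviour in C, the compiler should not be able to
invent a store to ptr on a path that does not perform one. This says
nothing about ordering against non-volatile accesses, nor about what the
CPU does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's ACCESS_ONCE(). */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

int obj1, obj2, obj3;		/* placeholder objects, assumed for the sketch */
void *ptr = &obj1;

/*
 * Expose exactly one of obj2/obj3 through ptr and return the other one,
 * i.e. the object the caller would be allowed to free.
 */
void *expose_one(int x)
{
	void *unused;

	if (x) {
		ACCESS_ONCE_SKETCH(ptr) = &obj2;	/* volatile store */
		unused = &obj3;
	} else {
		ACCESS_ONCE_SKETCH(ptr) = &obj3;	/* volatile store */
		unused = &obj2;
	}

	/* The invariant the pattern relies on before freeing anything. */
	assert(unused != ACCESS_ONCE_SKETCH(ptr));
	return unused;
}
```

Whether this is sufficient for the !space() case is exactly the open
question; the sketch only shows where the volatile access would sit.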
In fact; how can we be sure that:
void *ptr = NULL;
void bar(void)
{
	void *obj = malloc(...);

	/* fill obj */

	if (!err)
		rcu_assign_pointer(ptr, obj);
	else
		free(obj);
}
Does not get 'optimized' into:
void bar(void)
{
	void *obj = malloc(...);
	void *old_ptr = ptr;

	/* fill obj */

	rcu_assign_pointer(ptr, obj);
	if (err) { /* because runtime profile data says this is unlikely */
		ptr = old_ptr;
		free(obj);
	}
}
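
For comparison, a hedged sketch of how bar() might look with C11 atomics
in userspace. bar_sketch() and ptr_sketch are hypothetical names, the int
payload is a placeholder, and the release store is only a rough analogue
of rcu_assign_pointer(), not the kernel's implementation. Under the C11
model a store to an atomic object can be observed by other threads, so a
conforming compiler should not be able to hoist it above the err check
and undo it afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

_Atomic(void *) ptr_sketch = NULL;	/* hypothetical published pointer */

/* Returns 0 if the object was published, -1 if it was discarded. */
int bar_sketch(int err)
{
	int *obj = malloc(sizeof(*obj));

	assert(obj != NULL);	/* sketch: treat allocation failure as fatal */
	*obj = 42;		/* fill obj (placeholder payload) */

	if (!err) {
		/* Publish only on the success path; release orders the fill first. */
		atomic_store_explicit(&ptr_sketch, (void *)obj,
				      memory_order_release);
		return 0;
	}

	free(obj);		/* never published, so safe to free */
	return -1;
}
```

A reader would pair this with an acquire (or consume) load of ptr_sketch
before dereferencing the object.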
We _MUST_ be able to rely on control flow, otherwise we might as well
all go back to writing kernels in asm.
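
To make the barrier pairing in the quoted kbuf_write() concrete, here is a
rough single-producer/single-consumer sketch in C11. Everything in it is an
assumption made for illustration: struct ring, ring_write(), ring_read()
and RING_SIZE are hypothetical names, the space check uses free-running
counters instead of space(), the consumer side is inferred only from the
"matches with C/D" comments, and the C11 fences are stand-ins that do not
map one-to-one onto smp_mb()/smp_wmb()/smp_rmb(). In this model fence A
cannot simply be dropped, because C11 gives no ordering guarantee from the
control dependency alone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u	/* hypothetical buffer size in bytes */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

struct ring {
	_Atomic uint64_t head;	/* free-running, written by the producer */
	_Atomic uint64_t tail;	/* free-running, written by the consumer */
	unsigned char data[RING_SIZE];
};

/* Producer side; returns 1 if @buf was written, 0 if it was discarded. */
int ring_write(struct ring *r, const void *buf, size_t sz)
{
	const unsigned char *src = buf;
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	if (sz > RING_SIZE - (head - tail))
		return 0;	/* no space: discard @buf */

	/* A: the tail read must complete before the data stores. */
	atomic_thread_fence(memory_order_seq_cst);

	for (size_t i = 0; i < sz; i++)
		r->data[(head + i) % RING_SIZE] = src[i];

	/* B: data stores before the visible head update. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->head, head + sz, memory_order_relaxed);
	return 1;
}

/* Consumer side; copies up to @max bytes, returns the number copied. */
size_t ring_read(struct ring *r, void *buf, size_t max)
{
	unsigned char *dst = buf;
	uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	size_t n = (size_t)(head - tail);

	if (n > max)
		n = max;

	/* C: the head read must complete before the data loads. */
	atomic_thread_fence(memory_order_acquire);

	for (size_t i = 0; i < n; i++)
		dst[i] = r->data[(tail + i) % RING_SIZE];

	/* D: data loads before the tail update that frees the space. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&r->tail, tail + n, memory_order_relaxed);
	return n;
}
```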