Re: perf events ring buffer memory barrier on powerpc
From: Oleg Nesterov <oleg@redhat.com>
Date: 2013-10-28 20:17:12
Also in:
lkml
On 10/28, Paul E. McKenney wrote:
On Mon, Oct 28, 2013 at 02:26:34PM +0100, Peter Zijlstra wrote:quoted
--- linux-2.6.orig/kernel/events/ring_buffer.c +++ linux-2.6/kernel/events/ring_buffer.c@@ -87,10 +87,31 @@ static void perf_output_put_handle(struc goto out; /* - * Publish the known good head. Rely on the full barrier implied - * by atomic_dec_and_test() order the rb->head read and this - * write. + * Since the mmap() consumer (userspace) can run on a different CPU: + * + * kernel user + * + * READ ->data_tail READ ->data_head + * smp_rmb() (A) smp_rmb() (C)Given that both of the kernel's subsequent operations are stores/writes, doesn't (A) need to be smp_mb()?
Yes, this is my understanding^Wfeeling too, but I have to admit that
I can't really explain to myself why _exactly_ we need mb() here.
And let me copy-and-paste the artificial example from my previous
email,
bool BUSY;
data_t DATA;
bool try_to_get(data_t *data)
{
if (!BUSY)
return false;
rmb();
*data = DATA;
mb();
BUSY = false;
return true;
}
bool try_to_put(data_t *data)
{
if (BUSY)
return false;
mb(); // XXXXXXXX: do we really need it? I think yes.
DATA = *data;
wmb();
BUSY = true;
return true;
}
(just in case, the code above obviously assumes that _get or _put
can't race with itself, but they can race with each other).
Could you confirm that try_to_put() actually needs mb() between
LOAD(BUSY) and STORE(DATA) ?
I am sure it actually needs, but I will appreciate it if you can
explain why. IOW, how it is possible that without mb() try_to_put()
can overwrite DATA before try_to_get() completes its "*data = DATA"
in this particular case.
Perhaps this can happen if, say, reader and writer share a level of
cache or something like this...
Oleg.