Re: [PATCH v1 2/2] perf auxtrace: Optimize barriers with load-acquire and store-release
From: Peter Zijlstra <peterz@infradead.org>
Date: 2021-06-01 06:59:00
Also in:
lkml
On Tue, Jun 01, 2021 at 02:33:42PM +0800, Leo Yan wrote:
> 32-bit perf wants to access 64-bit value atomically, I think it tries to
> avoid the issue caused by scenario:
>
>         CPU0 (64-bit kernel)           CPU1 (32-bit user)
>
>                                        read head_lo
>         WRITE_ONCE(head)
>                                        read head_hi

Right; so I think Mark and I once spent a bunch of time on this for the
regular ring buffer, but my memory is vague.

It was supposed to be that the high word would always be zero on 32bit,
but it turns out that that is not in fact the case and we get to have
this race that's basically unfixable :/

Or maybe that was only the compat case..

Ah yes, so see the kernel uses unsigned long, so on 32bit the high word
is empty and we always read/write 0s, unless you're explicitly doing
daft things. But on compat, the high word can be non-zero and we get to
have 'fun'.
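
For illustration only, the interleaving in the quoted diagram can be replayed deterministically in plain C. This is a minimal sketch, not the perf code: `struct fake_page` and `torn_read()` are hypothetical names, and the "reader" and "writer" are just sequential steps in one function rather than real CPUs. It shows why a zero high word is harmless while a non-zero one (the compat case) can produce a value that was never written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared 64-bit head word (not the real mmap page). */
struct fake_page {
    uint64_t head;
};

static uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Replay the three steps from the diagram in order:
 *   reader: read head_lo
 *   writer: WRITE_ONCE(head)   (modelled as one 64-bit assignment)
 *   reader: read head_hi
 * Returns the 64-bit value the 32-bit reader ends up assembling.
 */
static uint64_t torn_read(struct fake_page *pg, uint64_t new_head)
{
    /* Assumption of this sketch: the head only ever advances. */
    assert(pg->head <= new_head);

    uint32_t lo = lo32(pg->head);   /* reader, step 1: low half of the old value */
    pg->head = new_head;            /* writer, step 2: whole word replaced */
    uint32_t hi = hi32(pg->head);   /* reader, step 3: high half of the new value */

    return ((uint64_t)hi << 32) | lo;
}
```

With an old head of 0xFFFFFFF0 and a new head of 0x100000010, the reader assembles 0x1FFFFFFF0, which is neither value. If both values fit in 32 bits, the high half is 0 either way and the reader simply sees the old head.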