Thread (22 messages), 3 authors, 2021-06-01

Re: [PATCH v1 1/2] perf auxtrace: Change to use SMP memory barriers

From: Adrian Hunter <adrian.hunter@intel.com>
Date: 2021-05-27 09:23:57
Also in: lkml

On 27/05/21 11:25 am, Adrian Hunter wrote:
On 27/05/21 11:11 am, Peter Zijlstra wrote:
On Thu, May 27, 2021 at 10:54:56AM +0300, Adrian Hunter wrote:
On 19/05/21 5:03 pm, Leo Yan wrote:
The AUX ring buffer's head and tail can be accessed from multiple CPUs
on an SMP system, so change to use SMP memory barriers to replace the
uniprocessor barriers.
I don't think user space should attempt to be SMP-aware.
Uhh, what? It pretty much has to. Since userspace cannot assume UP, it
must assume SMP.
Yeah that is what I meant, but consequently we generally shouldn't be
using functions called smp_<anything>
For perf tools, on __x86_64__ it looks like smp_rmb() is only a compiler barrier, whereas
rmb() is a "lfence" memory barrier instruction, so this patch does not
seem to do what the commit message says at least for x86.
The commit message is somewhat confused; *mb() are not UP barriers
(although they are available and useful on UP). They're device/dma
barriers.
With regard to the AUX area, we don't know in general how data gets there,
so using memory barriers seems sensible.
IIRC (but I didn't check) the rule was that the kernel needs to ensure
the AUX area is complete before it updates the head pointer. So if
userspace can observe the head pointer, it must then also be able to
observe the data. This is not something userspace can fix up anyway.

The ordering here is between the head pointer and the data, and from a
userspace perspective that's a regular smp ordering. Similar for the
tail update, that's between our reading the data and writing the tail,
regular cache coherent smp ordering.

So ACK on the patch, it's sane and an optimization for both x86 and ARM.
Just the Changelog needs work.
If all we want is a compiler barrier, then shouldn't that be what we use?
i.e. barrier()
I guess you are saying we still need to stop potential re-ordering across
CPUs, so please ignore my comments.
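
The ordering discussed above (head pointer versus data on the read side, data
reads versus the tail store on the update side) can be sketched in portable C11
atomics. This is only an illustration under stated assumptions, not the perf
code: struct demo_ring, demo_produce() and demo_consume() are hypothetical
names, the producer stands in for whatever fills the AUX area, and
acquire/release is used here as a rough analogue of smp_rmb() after the head
read and of the barrier before the tail store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 64u /* hypothetical size, must be a power of two */

/* Hypothetical stand-in for the shared head/tail/data area. */
struct demo_ring {
	_Atomic uint64_t head;              /* advanced by the producer */
	_Atomic uint64_t tail;              /* advanced by the consumer */
	unsigned char data[DEMO_RING_SIZE];
};

/* Producer side: write the data first, then publish it by advancing head. */
static void demo_produce(struct demo_ring *r, const unsigned char *src, size_t len)
{
	uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	assert(len <= DEMO_RING_SIZE - (head - tail)); /* caller must leave room */
	for (size_t i = 0; i < len; i++)
		r->data[(head + i) & (DEMO_RING_SIZE - 1)] = src[i];

	/* Release: the data stores above are visible before the new head is. */
	atomic_store_explicit(&r->head, head + len, memory_order_release);
}

/* Consumer side: returns the number of bytes copied into dst. */
static size_t demo_consume(struct demo_ring *r, unsigned char *dst, size_t cap)
{
	uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire: the data reads below cannot move before this head read. */
	uint64_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	size_t n = 0;

	while (tail + n < head && n < cap) {
		dst[n] = r->data[(tail + n) & (DEMO_RING_SIZE - 1)];
		n++;
	}

	/* Release: the data reads above finish before the new tail is visible. */
	atomic_store_explicit(&r->tail, tail + n, memory_order_release);
	return n;
}
```

A plain compiler barrier would stop the compiler from reordering these
accesses but would not constrain the CPU on weakly ordered architectures,
which is why the sketch uses acquire and release operations rather than
something like barrier().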
Signed-off-by: Leo Yan <redacted>
---
 tools/perf/util/auxtrace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 472c0973b1f1..8bed284ccc82 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -452,7 +452,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm)
 	u64 head = READ_ONCE(pc->aux_head);
 
 	/* Ensure all reads are done after we read the head */
-	rmb();
+	smp_rmb();
 	return head;
 }
 
@@ -466,7 +466,7 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
 #endif
 
 	/* Ensure all reads are done after we read the head */
-	rmb();
+	smp_rmb();
 	return head;
 }
 
@@ -478,7 +478,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
 #endif
 
 	/* Ensure all reads are done before we write the tail out */
-	mb();
+	smp_mb();
 #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
 	pc->aux_tail = tail;
 #else
  