Thread (5 messages) 5 messages, 3 authors, 2021-12-22

Re: [PATCH] asm-generic: introduce io_stop_wc() and add implementation for ARM64

From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2021-12-17 11:59:24
Also in: linux-arch, lkml

On Fri, Dec 17, 2021 at 04:56:11PM +0800, Xiongfeng Wang wrote:
For memory accesses with Normal-Non Cacheable attributes, the CPU may do
You may want to mention "arm64 Normal Non-Cacheable" as other
architectures have a different meaning of NC.
quoted hunk ↗ jump to hunk
write combining. But in some situation, this is bad for the performance
because the prior access may wait too long just to be merged.

We introduce io_stop_wc() to prevent the Normal-NC memory accesses before
this instruction to be merged with memory accesses after this
instruction.

We add implementation for ARM64 using DGH instruction and provide NOP
implementation for other architectures.

Signed-off-by: Xiongfeng Wang <redacted>
---
 Documentation/memory-barriers.txt |  9 +++++++++
 arch/arm64/include/asm/barrier.h  |  9 +++++++++
 include/asm-generic/barrier.h     | 11 +++++++++++
 3 files changed, 29 insertions(+)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7367ada13208..b868b51b1801 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1950,6 +1950,15 @@ There are some more advanced barrier functions:
      For load from persistent memory, existing read memory barriers are sufficient
      to ensure read ordering.
 
+ (*) io_stop_wc();
+
+     For memory accesses with Normal-Non Cacheable attributes (e.g. those
+     returned by ioremap_wc()), the CPU may do write combining. But in some
+     situation, this is bad for the performance because the prior access may
+     wait too long just to be merged. io_stop_wc() can be used to prevent
+     merging memory accesses with Normal-Non Cacheable attributes before this
+     instruction with any memory accesses appearing after this instruction.
I'm fine with the patch in general but the comment here and in
asm-generic/barrier.h should avoid Normal Non-Cacheable as that's an
arm-specific term. Looking at Documentation/driver-api/device-io.rst, we
could simply say "write-combining". Something like:

     For memory accesses with write-combining attributes (e.g. those
     returned by ioremap_wc()), the CPU may wait for prior accesses to
     be merged with subsequent ones. io_stop_wc() can be used to prevent
     the merging of write-combining memory accesses before this macro
     with those after it when such wait has performance implications.

(feel free to rephrase it but avoid Normal NC here)
quoted hunk ↗ jump to hunk
+
 ===============================
 IMPLICIT KERNEL MEMORY BARRIERS
 ===============================
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 1c5a00598458..62217be36217 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -26,6 +26,14 @@
 #define __tsb_csync()	asm volatile("hint #18" : : : "memory")
 #define csdb()		asm volatile("hint #20" : : : "memory")
 
+/*
+ * Data Gathering Hint:
+ * This instruction prevents merging memory accesses with Normal-NC or
+ * Device-GRE attributes before the hint instruction with any memory accesses
+ * appearing after the hint instruction.
+ */
+#define dgh()		asm volatile("hint #6" : : : "memory")
This is fine, arm-specific code.
quoted hunk ↗ jump to hunk
+
 #ifdef CONFIG_ARM64_PSEUDO_NMI
 #define pmr_sync()						\
 	do {							\
@@ -46,6 +54,7 @@
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
+#define io_stop_wc()	dgh()
 
 #define tsb_csync()								\
 	do {									\
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 640f09479bdf..083be6d34cb9 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -251,5 +251,16 @@ do {									\
 #define pmem_wmb()	wmb()
 #endif
 
+/*
+ * ioremap_wc() maps I/O memory as memory with Normal-Non Cacheable attributes.
+ * The CPU may do write combining for this kind of memory access. io_stop_wc()
+ * prevents merging memory accesses with Normal-Non Cacheable attributes
+ * before this instruction with any memory accesses appearing after this
+ * instruction.
Please change this as well along the lines of my comment above.
+ */
+#ifndef io_stop_wc
+#define io_stop_wc do { } while (0)
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
Thanks.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help