Re: [PATCH] asm-generic: introduce io_stop_wc() and add implementation for ARM64
From: Catalin Marinas <catalin.marinas@arm.com>
Date: 2021-12-17 11:59:24
Also in:
linux-arch, lkml
On Fri, Dec 17, 2021 at 04:56:11PM +0800, Xiongfeng Wang wrote:
For memory accesses with Normal-Non Cacheable attributes, the CPU may do
You may want to mention "arm64 Normal Non-Cacheable" as other architectures have a different meaning of NC.
quoted hunk ↗ jump to hunk
write combining. But in some situation, this is bad for the performance because the prior access may wait too long just to be merged. We introduce io_stop_wc() to prevent the Normal-NC memory accesses before this instruction to be merged with memory accesses after this instruction. We add implementation for ARM64 using DGH instruction and provide NOP implementation for other architectures. Signed-off-by: Xiongfeng Wang <redacted> --- Documentation/memory-barriers.txt | 9 +++++++++ arch/arm64/include/asm/barrier.h | 9 +++++++++ include/asm-generic/barrier.h | 11 +++++++++++ 3 files changed, 29 insertions(+)diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index 7367ada13208..b868b51b1801 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt@@ -1950,6 +1950,15 @@ There are some more advanced barrier functions: For load from persistent memory, existing read memory barriers are sufficient to ensure read ordering. + (*) io_stop_wc(); + + For memory accesses with Normal-Non Cacheable attributes (e.g. those + returned by ioremap_wc()), the CPU may do write combining. But in some + situation, this is bad for the performance because the prior access may + wait too long just to be merged. io_stop_wc() can be used to prevent + merging memory accesses with Normal-Non Cacheable attributes before this + instruction with any memory accesses appearing after this instruction.
I'm fine with the patch in general but the comment here and in
asm-generic/barrier.h should avoid Normal Non-Cacheable as that's an
arm-specific term. Looking at Documentation/driver-api/device-io.rst, we
could simply say "write-combining". Something like:
For memory accesses with write-combining attributes (e.g. those
returned by ioremap_wc()), the CPU may wait for prior accesses to
be merged with subsequent ones. io_stop_wc() can be used to prevent
the merging of write-combining memory accesses before this macro
with those after it when such wait has performance implications.
(feel free to rephrase it but avoid Normal NC here)
quoted hunk ↗ jump to hunk
+ =============================== IMPLICIT KERNEL MEMORY BARRIERS ===============================diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h index 1c5a00598458..62217be36217 100644 --- a/arch/arm64/include/asm/barrier.h +++ b/arch/arm64/include/asm/barrier.h@@ -26,6 +26,14 @@ #define __tsb_csync() asm volatile("hint #18" : : : "memory") #define csdb() asm volatile("hint #20" : : : "memory") +/* + * Data Gathering Hint: + * This instruction prevents merging memory accesses with Normal-NC or + * Device-GRE attributes before the hint instruction with any memory accesses + * appearing after the hint instruction. + */ +#define dgh() asm volatile("hint #6" : : : "memory")
This is fine, arm-specific code.
quoted hunk ↗ jump to hunk
+ #ifdef CONFIG_ARM64_PSEUDO_NMI #define pmr_sync() \ do { \@@ -46,6 +54,7 @@ #define dma_rmb() dmb(oshld) #define dma_wmb() dmb(oshst) +#define io_stop_wc() dgh() #define tsb_csync() \ do { \diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h index 640f09479bdf..083be6d34cb9 100644 --- a/include/asm-generic/barrier.h +++ b/include/asm-generic/barrier.h@@ -251,5 +251,16 @@ do { \ #define pmem_wmb() wmb() #endif +/* + * ioremap_wc() maps I/O memory as memory with Normal-Non Cacheable attributes. + * The CPU may do write combining for this kind of memory access. io_stop_wc() + * prevents merging memory accesses with Normal-Non Cacheable attributes + * before this instruction with any memory accesses appearing after this + * instruction.
Please change this as well along the lines of my comment above.
+ */
+#ifndef io_stop_wc
+#define io_stop_wc do { } while (0)
+#endif
+
#endif /* !__ASSEMBLY__ */
#endif /* __ASM_GENERIC_BARRIER_H */Thanks. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel