Thread (15 messages), 6 authors, 2012-08-13

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

From: Jan Beulich <hidden>
Date: 2012-08-09 15:22:14
Also in: linux-mips, linux-mm, linuxppc-dev, lkml, sparclinux

On 09.08.12 at 17:03, "Kirill A. Shutemov" [off-list ref] wrote:
From: Andi Kleen <redacted>

Add a cache avoiding version of clear_page. Straight forward integer variant
of the existing 64bit clear_page, for both 32bit and 64bit.
While on 64-bit this is fine, I fail to see how you avoid using the
SSE2 instruction on non-SSE2 systems.
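One way to address the non-SSE2 concern is a runtime feature check with a plain fallback. The following is only a minimal user-space sketch in C, not the patch's code: `clear_page_nocache_checked`, `clear_page_movnti` and `PAGE_BYTES` are hypothetical names, `_mm_stream_si32` is the compiler intrinsic that emits `movnti`, and the GCC/Clang builtin `__builtin_cpu_supports` stands in for whatever CPU feature test the kernel would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <emmintrin.h>          /* _mm_stream_si32 (movnti), _mm_sfence */
#define SKETCH_X86 1
#endif

#define PAGE_BYTES 4096         /* hypothetical name; 4 KiB page assumed */

#ifdef SKETCH_X86
/* Only call this after confirming SSE2: movnti is an SSE2 instruction. */
static __attribute__((target("sse2"))) void clear_page_movnti(void *page)
{
    int *p = page;

    for (size_t i = 0; i < PAGE_BYTES / sizeof(int); i++)
        _mm_stream_si32(&p[i], 0);  /* non-temporal 4-byte store */
    _mm_sfence();                   /* order the streaming stores */
}
#endif

/* Zero one page, using cache-avoiding stores only when the CPU has SSE2. */
static void clear_page_nocache_checked(void *page)
{
    assert(page != NULL);
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("sse2")) {
        clear_page_movnti(page);
        return;
    }
#endif
    memset(page, 0, PAGE_BYTES);    /* fallback: ordinary cached clear */
}
```

The fallback keeps the function correct on CPUs without SSE2 at the cost of polluting the cache there, which is the behaviour the existing clear_page already has.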
Also add the necessary glue for highmem including a layer that non cache
coherent architectures that use the virtual address for flushing can
hook in. This is not needed on x86 of course.

If an architecture wants to provide a cache-avoiding version of clear_page,
it should define ARCH_HAS_USER_NOCACHE to 1 and implement
clear_page_nocache() and clear_user_highpage_nocache().

Signed-off-by: Andi Kleen <redacted>
Signed-off-by: Kirill A. Shutemov <redacted>
---
 arch/x86/include/asm/page.h          |    2 ++
 arch/x86/include/asm/string_32.h     |    5 +++++
 arch/x86/include/asm/string_64.h     |    5 +++++
 arch/x86/lib/Makefile                |    1 +
 arch/x86/lib/clear_page_nocache_32.S |   30 ++++++++++++++++++++++++++++++
 arch/x86/lib/clear_page_nocache_64.S |   29 +++++++++++++++++++++++++++++
Couldn't this more reasonably go into clear_page_{32,64}.S?
 arch/x86/mm/fault.c                  |    7 +++++++
 7 files changed, 79 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_nocache_32.S
 create mode 100644 arch/x86/lib/clear_page_nocache_64.S
...
--- /dev/null
+++ b/arch/x86/lib/clear_page_nocache_32.S
@@ -0,0 +1,30 @@
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+
+/*
+ * Zero a page avoiding the caches
+ * rdi	page
Wrong comment.
+ */
+ENTRY(clear_page_nocache)
+	CFI_STARTPROC
+	mov    %eax,%edi
You need to pick a different register here (e.g. %edx), since
%edi has to be preserved by all functions called from C.
+	xorl   %eax,%eax
+	movl   $4096/64,%ecx
+	.p2align 4
+.Lloop:
+	decl	%ecx
+#define PUT(x) movnti %eax,x*8(%edi) ; movnti %eax,x*8+4(%edi)
Is doing twice as much unrolling as on 64-bit really worth it?
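To make the unrolling question concrete, here is a small arithmetic sketch in C. It assumes each loop iteration covers 64 bytes, as the loop count `$4096/64` suggests (the quoted hunk is cut off before the loop body), and that the 64-bit variant uses 8-byte `movnti` stores. All names are hypothetical.

```c
#include <stddef.h>

enum {
    SKETCH_PAGE_BYTES     = 4096,
    SKETCH_BYTES_PER_ITER = 64,  /* assumption: implied by movl $4096/64,%ecx */
    SKETCH_STORE_32       = 4,   /* movnti %eax,... writes 4 bytes */
    SKETCH_STORE_64       = 8    /* movnti %rax,... writes 8 bytes */
};

/* Number of loop iterations needed to clear one page. */
static int loop_iterations(void)
{
    return SKETCH_PAGE_BYTES / SKETCH_BYTES_PER_ITER;
}

/* Number of movnti instructions unrolled into one iteration. */
static int stores_per_iteration(int store_bytes)
{
    return SKETCH_BYTES_PER_ITER / store_bytes;
}
```

Under those assumptions the 32-bit loop body holds `stores_per_iteration(SKETCH_STORE_32)` = 16 stores against 8 on 64-bit, with the same 64 iterations in both cases.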

Jan