Re: [PATCH 1/1] Fixup write permission of TLB on powerpc e500 core


From: Shan Hai <hidden>
Date: 2011-07-18 07:24:17
Also in: lkml

On 07/18/2011 03:01 PM, Benjamin Herrenschmidt wrote:

On Mon, 2011-07-18 at 14:48 +0800, Shan Hai wrote:

It does not fix the problem; see the following reply for
the reasons.

  .../...


diff --git a/kernel/futex.c b/kernel/futex.c
index fe28dc2..02adff7 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c

@@ -355,8 +355,8 @@ static int fault_in_user_writeable(u32 __user *uaddr)
   	int ret;

   	down_read(&mm->mmap_sem);
-	ret = get_user_pages(current, mm, (unsigned long)uaddr,
-			     1, 1, 0, NULL, NULL);
+	ret = __get_user_pages(current, mm, (unsigned long)uaddr, 1,
+			       FOLL_WRITE | FOLL_FIXFAULT, NULL, NULL, NULL);

FOLL_FIXFAULT is filtered out by the following code:

get_user_pages()
      if (write)
            flags |= FOLL_WRITE;

I'm not sure what you're talking about here. You may notice that I'm
calling __get_user_pages(), not get_user_pages(). Make sure you get my
-second- post of the patch (the one with a proper description & s-o-b),
since the first one was a mis-send of a WIP version.
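
To make the distinction between the two entry points concrete, here is a minimal userspace sketch. It assumes a wrapper that builds its flag word from the `write` and `force` integers, as the snippet quoted above suggests. The `toy_*` functions are hypothetical stand-ins, the flag values are illustrative, and FOLL_FIXFAULT exists only in the proposed patch.

```c
#include <assert.h>

/* Illustrative values only: the real FOLL_* constants are defined in
 * include/linux/mm.h, and FOLL_FIXFAULT exists only in the proposed patch. */
#define FOLL_WRITE    0x01u
#define FOLL_FORCE    0x10u
#define FOLL_FIXFAULT 0x80u

/* Stand-in for __get_user_pages(): receives the raw flag word. For this
 * sketch it just reports which flags reached it. */
static unsigned int toy_lowlevel_gup(unsigned int gup_flags)
{
	return gup_flags;
}

/* Stand-in for get_user_pages(): the flag word is built from two ints,
 * so a caller cannot ask for anything beyond write/force. */
static unsigned int toy_wrapper_gup(int write, int force)
{
	unsigned int flags = 0;

	if (write)
		flags |= FOLL_WRITE;
	if (force)
		flags |= FOLL_FORCE;
	return toy_lowlevel_gup(flags);
}

/* Self-check: the wrapper never delivers FOLL_FIXFAULT, the direct call does. */
static void toy_flag_check(void)
{
	assert(!(toy_wrapper_gup(1, 0) & FOLL_FIXFAULT));
	assert(toy_lowlevel_gup(FOLL_WRITE | FOLL_FIXFAULT) & FOLL_FIXFAULT);
}
```

In this model the extra flag can only reach the low-level function when the caller invokes it directly, which is what the futex hunk above does.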

I am sorry, I had not tried your newer patch. I have tried it now, but
it still does not work in my test environment. I will dig into it and
tell you later why it failed.


+
+	if (flags & FOLL_FIXFAULT)
+		handle_pte_sw_young_dirty(vma, address, ptep,
+					  flags & FOLL_WRITE);
   	if (flags & FOLL_TOUCH) {
   		if ((flags & FOLL_WRITE) &&
   		    !pte_dirty(pte) && !PageDirty(page))

Calling handle_pte_sw_young_dirty() before the !pte_dirty(pte)
check might cause problems.

No this is on purpose.

My initial patch was only calling it under the same condition as the
FOLL_TOUCH case, but I got concerned by this whole
flush_tlb_fix_spurious_fault() business.

Basically, our generic code is designed so that relaxing write
protection on a PTE can be done without flushing the TLB on all CPUs, so
that a "spurious" fault on a secondary CPU will flush the TLB at that
point.

I don't know which arch relies on this feature (ARM maybe?) but if we
are going to be semantically equivalent to a real fault, we must also do
that, so the right thing to do here is to always call in there if
FOLL_FIXFAULT is set.

It's up to the caller to only set FOLL_FIXFAULT when really trying to
deal with an -EFAULT, to avoid possible unnecessary overhead, but in
this case I think we are fine, this is all a fallback slow path.
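
As a rough illustration of what software young/dirty tracking implies, here is a toy userspace model. It assumes an architecture where the TLB entry grants an access only if the software "young" bit is set, and grants writes only if the "dirty" bit is also set. All names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy PTE for an architecture that tracks young/dirty in software.
 * Hypothetical layout, not a kernel definition. */
struct toy_pte {
	bool present;
	bool writable;	/* the mapping permits writes */
	bool young;	/* software "accessed" bit */
	bool dirty;	/* software "modified" bit */
};

/* What the TLB would permit: any access needs young, writes also need
 * writable and dirty. */
static bool toy_hw_allows(const struct toy_pte *pte, bool write)
{
	if (!pte->present || !pte->young)
		return false;
	return write ? (pte->writable && pte->dirty) : true;
}

/* Shared fixup in the spirit of handle_pte_sw_young_dirty(): returns
 * true if the access is a pure tracking fault and has been fixed up. */
static bool toy_fixup_young_dirty(struct toy_pte *pte, bool write)
{
	if (!pte->present || (write && !pte->writable))
		return false;	/* a real fault, not a tracking fault */
	pte->young = true;
	if (write)
		pte->dirty = true;
	return true;
}

/* Self-check: a writable but clean and old PTE refuses a write until
 * the fixup has run. */
static void toy_tracking_check(void)
{
	struct toy_pte pte = { .present = true, .writable = true };

	assert(!toy_hw_allows(&pte, true));
	assert(toy_fixup_young_dirty(&pte, true));
	assert(toy_hw_allows(&pte, true));
}
```

The point of the model is that a PTE can be present and writable yet still refuse a write, so a fixup path has to set both bits, not just check the write permission.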

  .../...


So what about the following?

diff --git a/mm/memory.c b/mm/memory.c
index 9b8a01d..fb48122 100644
--- a/mm/memory.c
+++ b/mm/memory.c

@@ -1442,6 +1442,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsig
          spinlock_t *ptl;
          struct page *page;
          struct mm_struct *mm = vma->vm_mm;
+       int fix_write_permission = false;

          page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
          if (!IS_ERR(page)) {

@@ -1519,6 +1520,11 @@ split_fallthrough:
                  if ((flags & FOLL_WRITE) &&
                      !pte_dirty(pte) && !PageDirty(page))
                          set_page_dirty(page);
+
+#ifdef CONFIG_FIXUP_WRITE_PERMISSION
+               if ((flags & FOLL_WRITE) && !pte_dirty(pte))
+                       fix_write_permission = true;
+#endif
                  /*
                   * pte_mkyoung() would be more correct here, but atomic care
                   * is needed to avoid losing the dirty bit: it is easier to use

@@ -1551,7 +1557,7 @@ split_fallthrough:
   unlock:
          pte_unmap_unlock(ptep, ptl);
   out:
-       return page;
+       return (fix_write_permission == true) ? NULL: page;

   bad_page:
          pte_unmap_unlock(ptep, ptl);

Your patch is not only uglier (more ifdefs) but also incomplete, since
it handles neither the young case nor the spurious fault case.

Yes, I know that scattering ifdefs everywhere is not good, but if
there were some other way (I don't know of one so far) to determine
that the arch needs to fix up the write permission, we could get rid
of the ugly ifdefs here.

I think handle_mm_fault() could do all the dirty/young tracking,
because the purpose of making follow_page() return NULL to
its caller is to have handle_mm_fault() called on a
write-permission protection fault.

Thanks
Shan Hai

What the futex code is trying to do is use gup() as a way to fix up
from a fault, which essentially means having the -exact- same semantics
as a normal fault would have.

Thus, by factoring out the common fault fixup code and using that exact
same code in gup(), we get a much more robust guarantee that this will
work in the long run.

I don't expect gup to be that commonly used to fix up an access after
attempting a user access with page faults disabled; only those cases
will need to be modified to use the new flag.


From CONFIG_FIXUP_WRITE_PERMISSION and
(flags & FOLL_WRITE) && !pte_dirty(pte), follow_page()
can figure out that the caller wants to write to a
(present && writable && non-dirty) pte, and that the architecture
has asked for a fixup by selecting CONFIG_FIXUP_WRITE_PERMISSION.
So let follow_page() return NULL to __get_user_pages(), and
let handle_mm_fault() fix up the dirty/young tracking.

From the following code we can conclude that handle_mm_fault()
is ready to handle such faults, and a write permission violation is
a kind of fault, so why not let handle_mm_fault()
handle it?

__get_user_pages()
      while (!(page = follow_page(vma, start, foll_flags))) {
          ...
          ret = handle_mm_fault(mm, vma, start, fault_flags);
          ...
      }
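
The retry loop above can be modelled in userspace to show the control flow being proposed. This is only a sketch under the assumption that the lookup returns NULL for a write to a non-dirty mapping. The `toy_*` names are hypothetical, and the model ignores the young bit and spurious-fault TLB flushing raised earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page { int id; };

/* Hypothetical one-page mapping with a software dirty bit. */
struct toy_mapping {
	bool writable;
	bool dirty;
	struct toy_page page;
	int faults_handled;
};

/* Models the proposed follow_page() change: a write to a non-dirty
 * mapping is reported as "no page" so the caller takes the fault path. */
static struct toy_page *toy_follow_page(struct toy_mapping *m, bool write)
{
	if (write && !m->dirty)
		return NULL;
	return &m->page;
}

/* Models handle_mm_fault(): sets the dirty bit on a permitted write,
 * returns nonzero on a genuine permission violation. */
static int toy_handle_fault(struct toy_mapping *m, bool write)
{
	if (write && !m->writable)
		return -1;
	if (write)
		m->dirty = true;
	m->faults_handled++;
	return 0;
}

/* Models the __get_user_pages() loop: retry the lookup after each
 * successfully handled fault, give up on a genuine one. */
static struct toy_page *toy_gup(struct toy_mapping *m, bool write)
{
	struct toy_page *page;

	while (!(page = toy_follow_page(m, write))) {
		if (toy_handle_fault(m, write) != 0)
			return NULL;
	}
	return page;
}

/* Self-check: one fault fixes a clean writable mapping. */
static void toy_gup_check(void)
{
	struct toy_mapping m = { .writable = true };

	assert(toy_gup(&m, true) == &m.page);
	assert(m.dirty && m.faults_handled == 1);
}
```

In this model a writable, non-dirty mapping takes exactly one pass through the fault handler before the lookup succeeds, while a read-only mapping makes the loop bail out.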

Cheers,
Ben.
