RE: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared
From: Justin He (Arm Technology China) <hidden>
Date: 2019-09-19 15:02:57
Also in:
linux-mm, lkml
Hi Kirill

Thanks for the detailed explanation.

--
Cheers,
Justin (Jia He)

-----Original Message-----
From: Kirill A. Shutemov <redacted>
Sent: September 19, 2019 22:58
To: Jia He <redacted>
Cc: Justin He (Arm Technology China) <redacted>; Catalin Marinas [off-list ref]; Will Deacon [off-list ref]; Mark Rutland [off-list ref]; James Morse [off-list ref]; Marc Zyngier [off-list ref]; Matthew Wilcox [off-list ref]; Kirill A. Shutemov [off-list ref]; linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org; Suzuki Poulose [off-list ref]; Punit Agrawal [off-list ref]; Anshuman Khandual [off-list ref]; Jun Yao [off-list ref]; Alex Van Brunt [off-list ref]; Robin Murphy [off-list ref]; Thomas Gleixner [off-list ref]; Andrew Morton [off-list ref]; Jérôme Glisse [off-list ref]; Ralph Campbell [off-list ref]; Kaly Xin (Arm Technology China) [off-list ref]
Subject: Re: [PATCH v4 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

On Thu, Sep 19, 2019 at 10:16:34AM +0800, Jia He wrote:
> Hi Kirill
>
> [On behalf of justin.he@arm.com because some mails are filtered...]
>
> On 2019/9/18 22:00, Kirill A. Shutemov wrote:
> > On Wed, Sep 18, 2019 at 09:19:14PM +0800, Jia He wrote:
> > > When we tested pmdk unit test [1] vmmalloc_fork TEST1 in arm64 guest, there
> > > will be a double page fault in __copy_from_user_inatomic of cow_user_page.
> > >
> > > Below call trace is from arm64 do_page_fault for debugging purpose
> > > [ 110.016195] Call trace:
> > > [ 110.016826] do_page_fault+0x5a4/0x690
> > > [ 110.017812] do_mem_abort+0x50/0xb0
> > > [ 110.018726] el1_da+0x20/0xc4
> > > [ 110.019492] __arch_copy_from_user+0x180/0x280
> > > [ 110.020646] do_wp_page+0xb0/0x860
> > > [ 110.021517] __handle_mm_fault+0x994/0x1338
> > > [ 110.022606] handle_mm_fault+0xe8/0x180
> > > [ 110.023584] do_page_fault+0x240/0x690
> > > [ 110.024535] do_mem_abort+0x50/0xb0
> > > [ 110.025423] el0_da+0x20/0x24
> > >
> > > The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
> > > [ffff9b007000] pgd=000000023d4f8003, pud=000000023da9b003, pmd=000000023d4b3003, pte=360000298607bd3
> > >
> > > As told by Catalin: "On arm64 without hardware Access Flag, copying from
> > > user will fail because the pte is old and cannot be marked young. So we
> > > always end up with zeroed page after fork() + CoW for pfn mappings. we
> > > don't always have a hardware-managed access flag on arm64."
> > >
> > > This patch fix it by calling pte_mkyoung. Also, the parameter is
> > > changed because vmf should be passed to cow_user_page()
> > >
> > > [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork
> > >
> > > Reported-by: Yibo Cai <redacted>
> > > Signed-off-by: Jia He <redacted>
> > > ---
> > >  mm/memory.c | 35 ++++++++++++++++++++++++++++++-----
> > >  1 file changed, 30 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index e2bb51b6242e..d2c130a5883b 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -118,6 +118,13 @@ int randomize_va_space __read_mostly =
> > >  					2;
> > >  #endif
> > >
> > > +#ifndef arch_faults_on_old_pte
> > > +static inline bool arch_faults_on_old_pte(void)
> > > +{
> > > +	return false;
> > > +}
> > > +#endif
> > > +
> > >  static int __init disable_randmaps(char *s)
> > >  {
> > >  	randomize_va_space = 0;
> > > @@ -2140,8 +2147,12 @@ static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> > >  	return same;
> > >  }
> > >
> > > -static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> > > +static inline void cow_user_page(struct page *dst, struct page *src,
> > > +				 struct vm_fault *vmf)
> > >  {
> > > +	struct vm_area_struct *vma = vmf->vma;
> > > +	unsigned long addr = vmf->address;
> > > +
> > >  	debug_dma_assert_idle(src);
> > >
> > >  	/*
> > > @@ -2152,20 +2163,34 @@ static inline void cow_user_page(struct page *dst, struct page *src, unsigned lo
> > >  	 */
> > >  	if (unlikely(!src)) {
> > >  		void *kaddr = kmap_atomic(dst);
> > > -		void __user *uaddr = (void __user *)(va & PAGE_MASK);
> > > +		void __user *uaddr = (void __user *)(addr & PAGE_MASK);
> > > +		pte_t entry;
> > >
> > >  		/*
> > >  		 * This really shouldn't fail, because the page is there
> > >  		 * in the page tables. But it might just be unreadable,
> > >  		 * in which case we just give up and fill the result with
> > > -		 * zeroes.
> > > +		 * zeroes. On architectures with software "accessed" bits,
> > > +		 * we would take a double page fault here, so mark it
> > > +		 * accessed here.
> > >  		 */
> > > +		if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
> > > +			spin_lock(vmf->ptl);
> > > +			if (likely(pte_same(*vmf->pte, vmf->orig_pte))) {
> > > +				entry = pte_mkyoung(vmf->orig_pte);
> > > +				if (ptep_set_access_flags(vma, addr,
> > > +							  vmf->pte, entry, 0))
> > > +					update_mmu_cache(vma, addr, vmf->pte);
> > > +			}
> >
> > I don't follow.
> >
> > So if pte has changed under you, you don't set the accessed bit, but never
> > the less copy from the user.
> >
> > What makes you think it will not trigger the same problem?
> >
> > I think we need to make cow_user_page() fail in this case and caller --
> > wp_page_copy() -- return zero. If the fault was solved by other thread, we
> > are fine. If not userspace would re-fault on the same address and we will
> > handle the fault from the second attempt.
>
> Thanks for the pointing. How about make cow_user_page() be returned
> VM_FAULT_RETRY? Then in do_page_fault(), it can retry the page fault?

No. VM_FAULT_RETRY has different semantics: we have to drop mmap_sem(), so
let's try to take it again and handle the fault. In this case the more
likely scenario is that other thread has already handled the fault and we
don't need to do anything. If it's not the case, the fault will be
triggered again on the same address.

--
 Kirill A. Shutemov
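
For readers following the thread, here is a rough userspace model of the control flow Kirill suggests: cow_user_page() reports failure when the PTE has changed, and the caller returns zero so the fault is either already resolved or simply taken again. This is only a sketch under assumptions, not kernel code. Every model_* name, the bit position of MODEL_PTE_AF, and the "copied" out-parameter are hypothetical; the page-table lock and the actual copy are left out.

```c
/* Userspace model only: none of these types or names come from the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PTE_AF (1ULL << 10)	/* hypothetical "accessed" bit */

typedef uint64_t model_pte_t;

struct model_vmf {
	model_pte_t *pte;	/* live page-table entry */
	model_pte_t orig_pte;	/* snapshot taken when the fault started */
};

/* Stand-in for arch_faults_on_old_pte(); true models a CPU without hardware AF. */
static bool model_arch_faults_on_old_pte = true;

static bool model_pte_young(model_pte_t pte)
{
	return (pte & MODEL_PTE_AF) != 0;
}

/*
 * Returns true if the copy may go ahead, false if the PTE changed since
 * the snapshot and the caller should give up without copying.
 */
static bool model_cow_user_page(struct model_vmf *vmf)
{
	assert(vmf && vmf->pte);

	if (model_arch_faults_on_old_pte && !model_pte_young(vmf->orig_pte)) {
		/* Real code would hold the page-table lock around this check. */
		if (*vmf->pte != vmf->orig_pte)
			return false;		/* changed under us: do not copy */
		*vmf->pte |= MODEL_PTE_AF;	/* mark young so the copy cannot fault */
	}
	/* The copy from the user mapping would happen here. */
	return true;
}

/*
 * Simplified caller: returns 0 in both cases. When nothing was copied,
 * either another thread already handled the fault or userspace faults
 * again on the same address.
 */
static int model_wp_page_copy(struct model_vmf *vmf, bool *copied)
{
	*copied = model_cow_user_page(vmf);
	return 0;
}
```

In this model, a call with an unchanged old PTE sets the accessed bit and reports a copy; a call where the live PTE differs from orig_pte leaves the PTE untouched and reports no copy, while still returning 0 rather than a retry code.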