Thread (36 messages) 36 messages, 5 authors, 2024-04-12

Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

From: Andrew Morton <akpm@linux-foundation.org>
Date: 2024-03-22 20:48:19
Also in: linux-arm-kernel, linux-mm, linux-riscv, lkml

On Thu, 21 Mar 2024 18:08:02 -0400 peterx@redhat.com wrote:
From: Peter Xu <peterx@redhat.com>

Now follow_page() is ready to handle hugetlb pages in whatever form, and
over all architectures.  Switch to the generic code path.

Time to retire hugetlb_follow_page_mask(), following the previous
retirement of follow_hugetlb_page() in 4849807114b8.

There may be a slight difference of how the loops run when processing slow
GUP over a large hugetlb range on cont_pte/cont_pmd supported archs: each
loop of __get_user_pages() will resolve one pgtable entry with the patch
applied, rather than relying on the size of hugetlb hstate, the latter may
cover multiple entries in one loop.

A quick performance test on an aarch64 VM on M1 chip shows 15% degrade over
a tight loop of slow gup after the path switched.  That shouldn't be a
problem because slow-gup should not be a hot path for GUP in general: when
page is commonly present, fast-gup will already succeed, while when the
page is indeed missing and require a follow up page fault, the slow gup
degrade will probably buried in the fault paths anyway.  It also explains
why slow gup for THP used to be very slow before 57edfcfd3419 ("mm/gup:
accelerate thp gup even for "pages != NULL"") lands, the latter not part of
a performance analysis but a side benefit.  If the performance will be a
concern, we can consider handle CONT_PTE in follow_page().

Before that is justified to be necessary, keep everything clean and simple.
mm/gup.c:33:21: warning: 'follow_hugepd' declared 'static' but never defined [-Wunused-function]
   33 | static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
      |                     ^~~~~~~~~~~~~
--- a/mm/gup.c~mm-gup-handle-hugepd-for-follow_page-fix
+++ a/mm/gup.c
@@ -30,10 +30,12 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
+#ifdef CONFIG_HAVE_FAST_GUP
 static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
 				  unsigned long addr, unsigned int pdshift,
 				  unsigned int flags,
 				  struct follow_page_context *ctx);
+#endif
 
 static inline void sanity_check_pinned_pages(struct page **pages,
 					     unsigned long npages)
_


This looks inelegant.

That's two build issues so far.  Please be more expansive in the
Kconfig variations when testing.  Especially when mucking with pgtable
macros.

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help