Thread (36 messages), 5 authors, 2024-04-12

Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2024-03-22 16:08:53
Also in: linux-arm-kernel, linux-mm, linux-riscv, lkml

On Fri, Mar 22, 2024 at 11:55:11AM -0400, Peter Xu wrote:
Jason,

On Fri, Mar 22, 2024 at 10:30:12AM -0300, Jason Gunthorpe wrote:
On Thu, Mar 21, 2024 at 06:08:02PM -0400, peterx@redhat.com wrote:
A quick performance test on an aarch64 VM on an M1 chip shows a 15%
degradation over a tight loop of slow gup after the path switched.  That
shouldn't be a problem because slow-gup should not be a hot path for GUP in
general: when the page is commonly present, fast-gup will already succeed,
while when the page is indeed missing and requires a follow-up page fault,
the slow gup degradation will probably be buried in the fault paths anyway.
It also explains why slow gup for THP used to be very slow before
57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
landed, the latter not part of a performance analysis but a side benefit.
If the performance becomes a concern, we can consider handling CONT_PTE in
follow_page().
I think this is probably fine for the moment, at least for this
series, as CONT_PTE is still very new.

But it will need to be optimized. "slow" GUP is the only GUP that is
used by FOLL_LONGTERM and it still needs to be optimized because you
can't assume a FOLL_LONGTERM user will be hitting the really slow
fault path. There are enough important cases where it is just reading
already populated page tables, and these days, often with large folios.
Ah, I thought FOLL_LONGTERM should work in most cases for fast-gup,
especially for hugetlb, but maybe I missed something?  
Ah, no this is my bad memory, there was a time where that was true,
but it is not the case now. Oh, it is a really bad memory because it
seems I removed parts of it :)
I do see that devmap skips fast-gup for LONGTERM, we also have that
writeback issue but none of those that I can find applies to
hugetlb.  This might be a problem indeed if we have hugetlb cont_pte
pages that will constantly fall back to slow gup.
Right, DAX would be the main use case I can think of. Today the
intersection of DAX and contig PTE is non-existent so let's not worry.
OTOH, I also agree with you that such batching would be nice to have for
slow-gup; likely devmap or many fs (excluding shmem/hugetlb) file mappings
can at least benefit from it due to the above.  But then that'll be a more
generic issue to solve, IOW, we still don't do that for !hugetlb cont_pte
large folios, before or after this series.
Right, improving contig pte is going to be a process and eventually it
will make sense to optimize this regardless of hugetlbfs.

Jason