Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively... | linux-doc

[PATCH v7 mm-new 0/9] mm, bpf: BPF based THP order selection · Yafang Shao <hidden> · 2025-09-10
[PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lance Yang <lance.yang@linux.dev> · 2025-09-10
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lance Yang <lance.yang@linux.dev> · 2025-09-10
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · kernel test robot <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lance Yang <lance.yang@linux.dev> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Zi Yan <ziy@nvidia.com> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Yafang Shao <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lance Yang <lance.yang@linux.dev> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Yafang Shao <hidden> · 2025-09-14
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 01/10] mm: thp: remove disabled task from khugepaged_mm_slot · Yafang Shao <hidden> · 2025-09-14
[PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-10
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-10
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-10
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-14
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lorenzo Stoakes <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-25
Re: [PATCH v7 mm-new 02/10] mm: thp: add support for BPF based THP order selection · Yafang Shao <hidden> · 2025-09-25
[PATCH v7 mm-new 03/10] mm: thp: decouple THP allocation between swap and page fault paths · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 03/10] mm: thp: decouple THP allocation between swap and page fault paths · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 03/10] mm: thp: decouple THP allocation between swap and page fault paths · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 03/10] mm: thp: decouple THP allocation between swap and page fault paths · Lorenzo Stoakes <hidden> · 2025-09-12
[PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Lance Yang <lance.yang@linux.dev> · 2025-09-11
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Lorenzo Stoakes <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged · Yafang Shao <hidden> · 2025-09-14
[PATCH v7 mm-new 05/10] bpf: mark mm->owner as __safe_rcu_or_null · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 05/10] bpf: mark mm->owner as __safe_rcu_or_null · Lorenzo Stoakes <hidden> · 2025-09-11
[PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Liam R. Howlett <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Lorenzo Stoakes <hidden> · 2025-09-11
Re: [PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Yafang Shao <hidden> · 2025-09-12
Re: [PATCH v7 mm-new 06/10] bpf: mark vma->vm_mm as __safe_trusted_or_null · Yafang Shao <hidden> · 2025-09-12
[PATCH v7 mm-new 07/10] selftests/bpf: add a simple BPF based THP policy · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 07/10] selftests/bpf: add a simple BPF based THP policy · Alexei Starovoitov <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 07/10] selftests/bpf: add a simple BPF based THP policy · Yafang Shao <hidden> · 2025-09-11
[PATCH v7 mm-new 08/10] selftests/bpf: add test case to update THP policy · Yafang Shao <hidden> · 2025-09-10
[PATCH v7 mm-new 09/10] selftests/bpf: add test cases for invalid thp_adjust usage · Yafang Shao <hidden> · 2025-09-10
[PATCH v7 mm-new 10/10] Documentation: add BPF-based THP policy management · Yafang Shao <hidden> · 2025-09-10
Re: [PATCH v7 mm-new 0/9] mm, bpf: BPF based THP order selection · Lance Yang <lance.yang@linux.dev> · 2025-09-10

Re: [PATCH v7 mm-new 04/10] mm: thp: enable THP allocation exclusively through khugepaged

From: Yafang Shao <hidden>
Date: 2025-09-14 02:20:31
Also in: bpf, linux-mm

On Fri, Sep 12, 2025 at 9:48 PM Lorenzo Stoakes
[off-list ref] wrote:

On Fri, Sep 12, 2025 at 02:17:01PM +0800, Yafang Shao wrote:

quoted

On Thu, Sep 11, 2025 at 11:58 PM Lorenzo Stoakes
[off-list ref] wrote:

quoted

On Wed, Sep 10, 2025 at 10:44:41AM +0800, Yafang Shao wrote:

quoted

Currently, THP allocation cannot be restricted to khugepaged alone while
being disabled in the page fault path. This limitation exists because
disabling THP allocation during page faults also prevents the execution of
khugepaged_enter_vma() in that path.

This is quite confusing, I see what you mean - you want to be able to disable
page fault THP but not khugepaged THP _at the point of possibly faulting in a
THP aligned VMA_.

It seems this patch makes khugepaged_enter_vma() unconditional for an anonymous
VMA, rather than depending on the return value specified by
thp_vma_allowable_order().

The functions thp_vma_allowable_order(TVA_PAGEFAULT) and
thp_vma_allowable_order(TVA_KHUGEPAGED) are functionally equivalent
within the page fault handler; they always yield the same result.
Consequently, their execution order is irrelevant.

It seems hard to definitely demonstrate that by checking !in_pf vs not in this
situation :) but it seems broadly true afaict.

So they differ only in that one starts khugepaged, the other tries to
establish a THP on fault via create_huge_pmd().

right

quoted

The change reorders these two calls and, in doing so, also moves the
call to vmf_anon_prepare(vmf). This alters the control flow:
- before this change:  The logic checked the return value of
vmf_anon_prepare() between the two thp_vma_allowable_order() calls.

    thp_vma_allowable_order(TVA_PAGEFAULT);
    ret = vmf_anon_prepare(vmf);
    if (ret)
        return ret;
    thp_vma_allowable_order(TVA_KHUGEPAGED);

I mean it's also _only if_ the TVA_PAGEFAULT invocation succeeds that the
TVA_KHUGEPAGED one happens.

quoted

 - after this change: The logic now executes both
thp_vma_allowable_order() calls first and does not check the return
value of vmf_anon_prepare().

    thp_vma_allowable_order(TVA_KHUGEPAGED);
    thp_vma_allowable_order(TVA_PAGEFAULT);
    ret = vmf_anon_prepare(vmf); // Return value 'ret' is ignored.

Hm this is confusing, your code does:

+       if (pmd_none(*vmf.pmd)) {
+               if (vma_is_anonymous(vma))
+                       khugepaged_enter_vma(vma, vm_flags);
+               if (thp_vma_allowable_order(vma, vm_flags, TVA_PAGEFAULT, PMD_ORDER)) {
+                       ret = create_huge_pmd(&vmf);
+                       if (!(ret & VM_FAULT_FALLBACK))
+                               return ret;
+               }

So the ret is absolutely not ignored, but whether it succeeds or not, we still
invoke khugepaged_enter_vma().

Previously we would not have one this had vmf_anon_prepare() failed in
do_huge_pmd_anonymous_page().

Which I guess is what you mean?

quoted

This change is safe because the return value of vmf_anon_prepare() can
be safely ignored. This function checks for transient system-level
conditions (e.g., memory pressure, THP availability) that might
prevent an immediate THP allocation. It does not guarantee that a
subsequent allocation will succeed.

This behavior is consistent with the policy in hugepage_madvise(),
where a VMA is queued for khugepaged before a definitive allocation
check. If the system is under pressure, khugepaged will simply retry
the allocation at a more opportune time.

OK. I do note though that the khugepaged being kicked off is at mm_struct level.

The unit of operation for khugepaged is the mm_struct itself. It
processes the entire mm even when only a single VMA within it is a
candidate for a THP.

So us trying to invoke khugepaged on the mm again is about.. something having
changed that would previously have prevented us but now doesn't?

That is, a product of thp_vma_allowable_order() right?

So probably a sysfs change or similar?

But I guess it makes sense to hook in BPF whenever this is the case because this
_could_ be the point at which khugepaged enters the mm, and we want to select
the allowable order at this time.

So on basis of the two checks being effectively equivalent (on assumption this
is always the case) then the change is fairly reasonable.

Yes, that is exactly what I mean.

Though I would put this information, that the checks are equivalent, in the
commit message so it's really clear.

will add it.

-- 
Regards
Yafang

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help