Re: [PATCH mm-unstable v19 11/14] mm/khugepaged: Introduce mTHP collapse support
From: Nico Pache <npache@redhat.com>
Date: 2026-06-09 10:49:47
Also in:
linux-doc, linux-mm, lkml
On Tue, Jun 9, 2026 at 4:37 AM Lance Yang [off-list ref] wrote:
On 2026/6/9 17:32, Nico Pache wrote:quoted
On Tue, Jun 9, 2026 at 3:26 AM David Hildenbrand (Arm) [off-list ref] wrote:quoted
On 6/9/26 11:06, Nico Pache wrote:quoted
On Mon, Jun 8, 2026 at 8:57 AM David Hildenbrand (Arm) [off-list ref] wrote:quoted
On 6/6/26 12:28, Lance Yang wrote:quoted
Looks broken for swap PTEs in PMD collapse ... collapse_scan_pmd() allows them up to max_ptes_swap and record them in unmapped, but they don't get a bit in mthp_present_ptes. And then mthp_collapse() does the check above:Right. I assumed this is implicitly handled by the optimization in collapse_scan_pmd: if (enabled_orders != BIT(HPAGE_PMD_ORDER)) max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT; But we perform the check a second time.quoted
nr_occupied_ptes >= nr_ptes - max_ptes_none So max_ptes_none=0 + 511 present PTEs + one allowed swap PTE won't even call collapse_huge_page() for PMD order. Shouldn't we account for them in the PMD-order check? Something like: if (is_pmd_order(order)) nr_occupied_ptes += unmapped;This solution seems good for a temporary fixup. but longterm we may want something else. I'm still not sure how we plan on supporting swapin without causing creep. So I'd be ok with adding a fix for legacy PMD behavior until we know how to handle mTHP creep correctly.quoted
As an alternative, we could either 1) skip the check there for pmd order (as the check was already done); or 2) introduce+maintain a bitmap that tracks non-present PTEs.@@ -1475,7 +1477,9 @@ static enum scan_result mthp_collapse(struct mm_struct *mm, nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset, offset + nr_ptes); - if (nr_occupied_ptes >= nr_ptes - max_ptes_none) { + /* Check was already done in the caller. */ + if (is_pmd_order(order) || + nr_occupied_ptes >= nr_ptes - max_ptes_none) { enum scan_result ret; collapse_address = address + offset * PAGE_SIZE;2) would probably be cleanest long-term.That would be best for future swapin support in mTHP, but I still don't think it solves the creep issue.It wouldn't, we'd simply maintain the state we collect + rely on in separate bitmaps. On swapin, we'd have to update/refresh bitmaps I guess.Yeah, I'm saying for the future, it obviously solves this issue here as well, but if we have positional tracking of the swapout, shared, and none PTEs, I think we can use this to determine whether the collapse would lead to creep. If we detect creep would happen it may be best to automatically collapse to the N+1 (or greater) candidate. Just thinking outloud here.quoted
quoted
Perhaps we could combine the two bitmaps to determine if it would make the future collapse eligible again? Not sure but ill start thinking about it. Should I send a fixup for this using Lance's solution? Or does Lance want to send a patch out with the fixes tag?If Lance could send a fixup, explaining the situation, that would be nice.Sure, happy to send a fixup :P Should I send it as a fixup to be folded into this patch, or as a separate patch with a Fixes: tag?
Id assume a seperate patch so you can keep credit for the discovery :) Thank you for all the review you provided on this series, its been really helpful! -- Nico
Will get one out soon :)quoted
OK, I'd appreciate that :)Cheers!