Thread: 89 messages, 12 authors, 2025-09-11

Re: [PATCH v10 00/13] khugepaged: mTHP support

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Date: 2025-08-28 09:46:56
Also in: linux-doc, linux-mm, lkml

(Sorry for chiming in late)

On 2025/8/22 22:10, David Hildenbrand wrote:
quoted
quoted
One could also easily support the value 255 (HPAGE_PMD_NR / 2 - 1),
but not sure if we have to add that for now.
Yeah not so sure about this, this is a 'just have to know' too, and
yes you might add it to the docs, but people are going to be mightily
confused, esp if it's a calculated value.

I don't see any other way around having a separate tunable if we don't
just have something VERY simple like on/off.
Yeah, not advocating that we add support for other values than 0/511,
really.
quoted
Also the mentioned issue sounds like something that needs to be fixed
elsewhere honestly in the algorithm used to figure out mTHP ranges (I
may be wrong - and happy to stand corrected if this is somehow
inherent, but really feels that way).
I think the creep is unavoidable for certain values.

If you have the first two pages of a PMD area populated, and you allow 
for at least half of the #PTEs to be none/zero, you'd collapse first an
order-2 folio, then an order-3 ... until you reached PMD order.
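
To make the creep concrete, here is a rough sketch of that scenario. It
is a toy model, not the khugepaged code: the function name, the
per-order scaling by right shift, and the 4 KiB page / order-9 PMD
geometry are all assumptions made for illustration.

```c
#define HPAGE_PMD_ORDER 9   /* assumed: 2 MiB PMD with 4 KiB pages */
#define MIN_MTHP_ORDER  2   /* assumed: smallest collapse order tried */

/*
 * Hypothetical model: start with `present` populated PTEs at the start
 * of an otherwise empty PMD range and try each order from
 * MIN_MTHP_ORDER upwards. An order is collapsed when the number of
 * none PTEs in its range is at most max_ptes_none scaled down to that
 * order. A collapse populates the whole range.
 * Returns the highest order reached, or 0 if nothing was collapsed.
 */
static unsigned int creep_final_order(unsigned int present,
				      unsigned int max_ptes_none)
{
	unsigned int final_order = 0;
	unsigned int order;

	for (order = MIN_MTHP_ORDER; order <= HPAGE_PMD_ORDER; order++) {
		unsigned int nr = 1u << order;
		unsigned int allowed_none =
			max_ptes_none >> (HPAGE_PMD_ORDER - order);
		unsigned int none = present >= nr ? 0 : nr - present;

		if (none > allowed_none)
			break;		/* too many holes: the creep stops */

		present = nr;		/* collapsed: whole range is now used */
		final_order = order;
	}
	return final_order;
}
```

In this model, creep_final_order(2, 256) walks all the way up to order
9, since every collapse leaves exactly half of the next larger range
populated, while creep_final_order(2, 0) never collapses anything.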

So for now we really should just support 0 / 511 to say "don't collapse 
if there are holes" vs. "always collapse if there is at least one pte 
used".
If we only allow setting 0 or 511, as Nico mentioned before, "At 511, no 
mTHP collapses would ever occur anyway, unless you have 2MB disabled and 
other mTHP sizes enabled. Technically, at 511, only the highest enabled 
order would ever be collapsed."

In other words, for the scenario you described, although there are only 
2 PTEs present in a PMD, it would still get collapsed into a PMD-sized 
THP. In reality, what we probably need is just an order-2 mTHP collapse.

If 'khugepaged_max_ptes_none' is set to 255, I think this would achieve 
the desired result: when there are only 2 PTEs present in a PMD, an 
order-2 mTHP collapse would succeed, but it wouldn’t creep up to an 
order-3 mTHP collapse. That’s because:
when attempting an order-3 mTHP collapse, 'threshold_bits' = 1, while 
'bits_set' = 1 (meaning only 1 chunk is present), so 'bits_set > 
threshold_bits' is false, and an order-3 mTHP collapse wouldn’t be 
attempted. No?
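
For reference, the arithmetic above can be sketched as follows. The
formula for 'threshold_bits' here is an assumption, chosen so that it
reproduces the values quoted above (threshold_bits = 1 for order 3 with
max_ptes_none = 255); the helper names and the chunk order of 2 are
likewise hypothetical, and the patch itself may compute this
differently.

```c
#include <stdbool.h>

#define HPAGE_PMD_ORDER 9                        /* assumed: 512 PTEs per PMD */
#define HPAGE_PMD_NR    (1u << HPAGE_PMD_ORDER)
#define MIN_MTHP_ORDER  2                        /* assumed: one bit per 4-PTE chunk */

/*
 * Assumed form of the threshold: scale the number of PTEs that must be
 * present down to the number of chunks covered by `order`.
 * Valid for MIN_MTHP_ORDER <= order <= HPAGE_PMD_ORDER and
 * max_ptes_none <= HPAGE_PMD_NR - 1.
 */
static unsigned int threshold_bits(unsigned int order,
				   unsigned int max_ptes_none)
{
	unsigned int rel_order = order - MIN_MTHP_ORDER;

	return (HPAGE_PMD_NR - max_ptes_none - 1) >>
	       (HPAGE_PMD_ORDER - rel_order);
}

/* A collapse of `order` is attempted only if enough chunks are set. */
static bool collapse_attempted(unsigned int order, unsigned int bits_set,
			       unsigned int max_ptes_none)
{
	return bits_set > threshold_bits(order, max_ptes_none);
}
```

With one chunk set and max_ptes_none = 255, collapse_attempted(2, 1,
255) is true and collapse_attempted(3, 1, 255) is false, which is the
"order-2 but no creep to order-3" behaviour described above.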

So I have some concerns that if we only allow setting 0 or 511, it may 
not meet the goal we have for mTHP collapsing.
quoted
quoted
Because, as raised in the past, I'm afraid nobody on this earth has a
clue how to set this parameter to values different to 0 (don't waste
memory with khugepaged) and 511 (page fault behavior).
Yup
quoted

If any other value is set, essentially
    pr_warn("Unsupported 'max_ptes_none' value for mTHP collapse");

for now and just disable it.
Hmm but under what circumstances? I would just say "unsupported value"
and not mention mTHP, or people who don't use mTHP might find that
confusing.
Well, we can check whether any mTHP size is enabled while the value is 
set to something unexpected. We can then even print the problematic 
sizes if we have to.

We could also just say that if the value is set to something other 
than 511 (which is the default), it will be treated as being "0" when 
collapsing mTHP, instead of doing any scaling.
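
A minimal sketch of that last option, under stated assumptions: the
helper name is hypothetical, PMD-order collapse is assumed to keep
using the tunable unchanged, and 511 is read as "any number of holes
as long as one PTE is used" for the smaller orders.

```c
#define HPAGE_PMD_ORDER 9                        /* assumed: 512 PTEs per PMD */
#define HPAGE_PMD_NR    (1u << HPAGE_PMD_ORDER)

/*
 * Hypothetical helper: the max_ptes_none limit to apply when
 * collapsing to `order`, with no scaling of intermediate values.
 */
static unsigned int effective_max_ptes_none(unsigned int order,
					    unsigned int max_ptes_none)
{
	/* PMD collapse: assumed to honour the tunable as it does today. */
	if (order == HPAGE_PMD_ORDER)
		return max_ptes_none;

	/* Default 511: collapse whenever at least one PTE is used. */
	if (max_ptes_none == HPAGE_PMD_NR - 1)
		return (1u << order) - 1;

	/* Anything else is treated as 0: no holes allowed for mTHP. */
	return 0;
}
```

So an order-4 collapse would tolerate up to 15 holes at the default of
511, and none at all for a value such as 255.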
  