Thread: 79 messages, 8 authors, 2025-09-15

Re: [PATCH v11 00/15] khugepaged: mTHP support

From: Lorenzo Stoakes <hidden>
Date: 2025-09-15 12:15:33
Also in: linux-doc, linux-mm, lkml

On Mon, Sep 15, 2025 at 01:01:26PM +0100, Kiryl Shutsemau wrote:
On Mon, Sep 15, 2025 at 01:45:39PM +0200, David Hildenbrand wrote:
On 15.09.25 13:35, Lorenzo Stoakes wrote:
On Mon, Sep 15, 2025 at 01:29:22PM +0200, David Hildenbrand wrote:
On 15.09.25 13:23, Lorenzo Stoakes wrote:
On Mon, Sep 15, 2025 at 01:14:32PM +0200, David Hildenbrand wrote:
On 15.09.25 13:02, Lorenzo Stoakes wrote:
On Mon, Sep 15, 2025 at 12:52:03PM +0200, David Hildenbrand wrote:
On 15.09.25 12:43, Lorenzo Stoakes wrote:
On Mon, Sep 15, 2025 at 12:22:07PM +0200, David Hildenbrand wrote:
0 -> ~100% used (~0% none)
1 -> ~50% used (~50% none)
2 -> ~25% used (~75% none)
3 -> ~12.5% used (~87.5% none)
4 -> ~11.25% used (~88.75% none)
...
10 -> ~0% used (~100% none)
Oh and shouldn't this be inverted?

0 eagerness = we eat up all none PTE entries? Isn't that pretty eager? :P
10 eagerness = we aren't eager to eat up none PTE entries at all?

Or am I being dumb here?
Good question.

For swappiness it's: 0 -> no swap (conservative)

So intuitively I assumed: 0 -> no pte_none (conservative)

You're the native speaker, so you tell me :)
To me this is about 'eagerness to consume empty PTE entries' so 10 is more
eager, 0 is not eager at all, i.e. inversion of what you suggest :)
Just so we are on the same page: it is about "eagerness to collapse", right?

Wouldn't a 0 mean "I am not eager, I will not waste any memory, I am very
careful and bail out on any pte_none" vs. 10 meaning "I am very eager, I
will collapse no matter what I find in the page table, waste as much memory
as I want"?
Yeah, this is my understanding of your scale, or is my understanding also
inverted? :)

Right now it's:

eagerness max_ptes_none

0 -> 511
...
10 -> 0

Right?
Just so we are on the same page, this is what I had:

0 -> ~100% used (~0% none)

So "0" -> 0 pte_none or 512 used.

(note the used vs. none)
OK right so we're talking about the same thing, I guess?

I was confused partly because of the scale, because weren't people setting
this parameter to low values in practice?

And now we make it so we have equivalent of:

0 -> 0
1 -> 256
2 -> 384
Ah, there is the problem, that's not what I had in mind.

0 -> ~100% used (~0% none)
...
8 -> ~87.5% used (~12.5% none)
9 -> ~75% used (~25% none)
9 -> ~50% used (~50% none)
10 -> ~0% used (~100% none)

Hopefully I didn't mess it up again.
I think this kind of table is fine for initial implementation of the
knob, but we don't want to document it to userspace like this.
I think we want to be strategically ambiguous on what the knob does
exactly, so kernel could evolve the meaning of the knob over time.

We don't want to repeat the problem we have with max_ptes_none, which is too
prescriptive and got additional meaning with the introduction of the shrinker.

As the kernel evolves, we want the ability to adjust the meaning and keep the
knob useful.
I mean, having said this exact thing several times in the thread obviously
I agree... FWIW...

To repeat, I think it should be an abstraction that we entirely control and
whose meaning we can vary over time.
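
For readers following the numbers, here is a minimal sketch of how the
"% used" tables above relate to none-PTE counts. It assumes 512 PTEs per
PMD (the "512 used" figure) and a ceiling of 511, as in the existing
max_ptes_none scale. The function and constant names are hypothetical and
are not kernel symbols. halving_scale() reproduces only the
"0 -> 0, 1 -> 256, 2 -> 384" reading from earlier in the thread. It is not
the mapping the series implements, and per the point above no such table
should be treated as a userspace contract.

```python
# Illustrative only: names here are hypothetical, not kernel symbols.
PTES_PER_PMD = 512               # assumption: the "512 used" figure from the thread
MAX_PTES_NONE_CEILING = PTES_PER_PMD - 1   # 511, the top of the existing scale


def none_ptes_from_used_percent(used_percent: float) -> int:
    """Convert a "~X% used" table entry into a count of none PTEs."""
    if not 0 <= used_percent <= 100:
        raise ValueError("used_percent must be within [0, 100]")
    none = round(PTES_PER_PMD * (100 - used_percent) / 100)
    # "~0% used" would be 512 none PTEs; clamp to the 511 ceiling.
    return min(none, MAX_PTES_NONE_CEILING)


def halving_scale(eagerness: int) -> int:
    """The "0 -> 0, 1 -> 256, 2 -> 384" reading: each step halves the used share."""
    if eagerness < 0:
        raise ValueError("eagerness must be non-negative")
    none = PTES_PER_PMD - (PTES_PER_PMD >> eagerness)
    return min(none, MAX_PTES_NONE_CEILING)


if __name__ == "__main__":
    for used in (100, 87.5, 75, 50, 0):
        print(f"~{used}% used -> {none_ptes_from_used_percent(used)} none PTEs")
    for level in range(4):
        print(f"halving scale {level} -> {halving_scale(level)} none PTEs")
```

Under these assumptions, "~87.5% used" corresponds to 64 none PTEs and
"~50% used" to 256, which is why the two readings of the scale in the
thread disagree about what a value of 1 means.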