Thread (80 messages) 80 messages, 4 authors, 2024-02-13

Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings

From: Ryan Roberts <ryan.roberts@arm.com>
Date: 2024-02-13 15:29:30
Also in: linux-arm-kernel, linux-mm, lkml

On 12/02/2024 16:24, David Hildenbrand wrote:
On 12.02.24 16:34, Ryan Roberts wrote:
quoted
On 12/02/2024 15:26, David Hildenbrand wrote:
quoted
On 12.02.24 15:45, Ryan Roberts wrote:
quoted
On 12/02/2024 13:54, David Hildenbrand wrote:
quoted
quoted
quoted
If so, I wonder if we could instead do that comparison modulo the
access/dirty
bits,
I think that would work - but will need to think a bit more on it.
quoted
and leave ptep_get_lockless() only reading a single entry?
I think we will need to do something a bit less fragile. ptep_get() does
collect
the access/dirty bits so its confusing if ptep_get_lockless() doesn't
IMHO. So
we will likely want to rename the function and make its documentation
explicit
that it does not return those bits.

ptep_get_lockless_noyoungdirty()? yuk... Any ideas?

Of course if I could convince you the current implementation is safe, I
might be
able to sidestep this optimization until a later date?
As discussed (and pointed out abive), there might be quite some callsites
where
we don't really care about uptodate accessed/dirty bits -- where ptep_get() is
used nowadays.

One way to approach that I had in mind was having an explicit interface:

ptep_get()
ptep_get_uptodate()
ptep_get_lockless()
ptep_get_lockless_uptodate()
Yes, I like the direction of this. I guess we anticipate that call sites
requiring the "_uptodate" variant will be the minority so it makes sense to use
the current names for the "_not_uptodate" variants? But to do a slow migration,
it might be better/safer to have the weaker variant use the new name - that
would allow us to downgrade one at a time?
Yes, I was primarily struggling with names. Likely it makes sense to either have
two completely new function names, or use the new name only for the "faster but
less precise" variant.
quoted
quoted
Especially the last one might not be needed.
I've done a scan through the code and agree with Mark's original conclusions.
Additionally, huge_pte_alloc() (which isn't used for arm64) doesn't rely on
access/dirty info. So I think I could migrate everything to the weaker variant
fairly easily.
quoted
Futher, "uptodate" might not be the best choice because of PageUptodate() and
friends. But it's better than "youngdirty"/"noyoungdirty" IMHO.
Certainly agree with "noyoungdirty" being a horrible name. How about "_sync" /
"_nosync"?
I could live with

ptep_get_sync()
ptep_get_nosync()

with proper documentation :)
but could you live with:

ptep_get()
ptep_get_nosync()
ptep_get_lockless_nosync()

?

So leave the "slower, more precise" version with the existing name.
Sure.
I'm just implementing this (as a separate RFC), and had an alternative idea for
naming/semantics:

ptep_get()
ptep_get_norecency()
ptep_get_lockless()
ptep_get_lockless_norecency()

The "_norecency" versions explicitly clear the access/dirty bits. This is useful
for the "compare to original pte to check we are not racing" pattern:

pte = ptep_get_lockless_norecency(ptep)
...
<lock>
if (!pte_same(pte, ptep_get_norecency(ptep)))
	// RACE!
...
<unlock>

With the "_nosync" semantic, the access/dirty bits may or may not be set, so the
user has to explicitly clear them to do the comparison. (although I considered a
pte_same_nosync() that would clear the bits for you - but that name is pretty naff).

Although the _norecency semantic requires always explicitly clearing the bits,
so may be infinitesimally slower, it gives a very clear expectation that the
access/dirty bits are always clear and I think that's conveyed well in the name too.

Thoughts?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help