Thread (35 messages) 35 messages, 5 authors, 2024-06-04

Re: [PATCH v4 2/7] mm: multi-gen LRU: Have secondary MMUs participate in aging

From: Yu Zhao <hidden>
Date: 2024-05-31 20:31:57
Also in: kvm, kvm-riscv, kvmarm, linux-arm-kernel, linux-doc, linux-kselftest, linux-mips, linux-mm, linux-riscv, lkml, loongarch

On Fri, May 31, 2024 at 1:24 AM Oliver Upton [off-list ref] wrote:
On Wed, May 29, 2024 at 03:03:21PM -0600, Yu Zhao wrote:
quoted
On Wed, May 29, 2024 at 12:05 PM James Houghton [off-list ref] wrote:
quoted
Secondary MMUs are currently consulted for access/age information at
eviction time, but before then, we don't get accurate age information.
That is, pages that are mostly accessed through a secondary MMU (like
guest memory, used by KVM) will always just proceed down to the oldest
generation, and then at eviction time, if KVM reports the page to be
young, the page will be activated/promoted back to the youngest
generation.
Correct, and as I explained offline, this is the only reasonable
behavior if we can't locklessly walk secondary MMUs.

Just for the record, the (crude) analogy I used was:
Imagine a large room with many bills ($1, $5, $10, ...) on the floor,
but you are only allowed to pick up 10 of them (and put them in your
pocket). A smart move would be to survey the room *first and then*
pick up the largest ones. But if you are carrying a 500 lbs backpack,
you would just want to pick up whichever that's in front of you rather
than walk the entire room.

MGLRU should only scan (or lookaround) secondary MMUs if it can be
done lockless. Otherwise, it should just fall back to the existing
approach, which existed in previous versions but is removed in this
version.
Grabbing the MMU lock for write to scan sucks, no argument there. But
can you please be specific about the impact of read lock v. RCU in the
case of arm64? I had asked about this before and you never replied.

My concern remains that adding support for software table walkers
outside of the MMU lock entirely requires more work than just deferring
the deallocation to an RCU callback. Walkers that previously assumed
'exclusive' access while holding the MMU lock for write must now cope
with volatile PTEs.

Yes, this problem already exists when hardware sets the AF, but the
lock-free walker implementation needs to be generic so it can be applied
for other PTE bits.
Direct reclaim is multi-threaded and each reclaimer can take the mmu
lock for read (testing the A-bit) or write (unmapping before paging
out) on arm64. The fundamental problem of using the readers-writer
lock in this case is priority inversion: the readers have lower
priority than the writers, so ideally, we don't want the readers to
block the writers at all.

Using my previous (crude) analogy: puting the bill right in front of
you (the writers) profits immediately whereas searching for the
largest bill (the readers) can be futile.

As I said earlier, I prefer we drop the arm64 support for now, but I
will not object to taking the mmu lock for read when clearing the
A-bit, as long as we fully understand the problem here and document it
clearly.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help