Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
From: Will Deacon <will@kernel.org>
Date: 2022-06-07 10:44:29
Also in:
linux-doc, linux-mm, lkml
On Tue, Jun 07, 2022 at 10:37:46AM +1200, Barry Song wrote:
On Tue, Jun 7, 2022 at 10:21 PM Will Deacon [off-list ref] wrote:quoted
On Tue, Jun 07, 2022 at 07:37:10PM +1200, Barry Song wrote:quoted
I can't really explain why we are getting a random app/java vm crash in monkey test by using ptep_test_and_clear_young() only in lru_gen_look_around() on an armv8-a machine without hardware PTE young support. Moving to ptep_clear_flush_young() in look_around can make the random hang disappear according to zhanyuan(Cc-ed). On x86, ptep_clear_flush_young() is exactly ptep_test_and_clear_young() after 'commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB")' But on arm64, they are different. according to Will's comments in this thread which tried to make arm64 same with x86, https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1793881.html " This is blindly copied from x86 and isn't true for us: we don't invalidate the TLB on context switch. That means our window for keeping the stale entries around is potentially much bigger and might not be a great idea. If we roll a TLB invalidation routine without the trailing DSB, what sort of performance does that get you? " We shouldn't think ptep_clear_flush_young() is safe enough in LRU to clear PTE young? Any comments from Will?Given that this issue is specific to the multi-gen LRU work, I think Yu is the best person to comment. However, looking quickly at your analysis above, I wonder if the code is relying on this sequence: ptep_test_and_clear_young(vma, address, ptep); ptep_clear_flush_young(vma, address, ptep); to invalidate the TLB. On arm64, that won't be the case, as the invalidation in ptep_clear_flush_young() is predicated on the pte being young (and this patches the generic implementation in mm/pgtable-generic.c. In fact, that second function call is always going to be a no-op unless the pte became young again in the middle.thanks for your reply, sorry for failing to let you understand my question. my question is actually as below, right now lru_gen_look_around() is using ptep_test_and_clear_young() only without flush to clear pte for a couple of pages including the specific address: void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { ... for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) { ... if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i)) continue; ... } I wonder if it is safe to arm64. Do we need to move to ptep_clear_flush_young() in the loop?
I don't know what this code is doing, so Yu is the best person to answer
that. There's nothing inherently dangerous about eliding the TLB
maintenance; it really depends on the guarantees needed by the caller.
However, the snippet you posted from folio_referenced_one():
| if (pvmw.pte) {
| + if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
| + !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
| + lru_gen_look_around(&pvmw);
| + referenced++;
| + }
| +
| if (ptep_clear_flush_young_notify(vma, address,
Does seem to call lru_gen_look_around() *and*
ptep_clear_flush_young_notify(), which is what prompted my question as it
looks pretty suspicious to me.
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel