Re: [PATCH v3 06/13] mm: introduce generic lazy_mmu helpers
From: David Hildenbrand <hidden>
Date: 2025-10-24 13:27:39
Also in:
linux-arm-kernel, linux-mm, lkml, sparclinux, xen-devel
On 24.10.25 14:13, Kevin Brodsky wrote:
On 23/10/2025 21:52, David Hildenbrand wrote:quoted
On 15.10.25 10:27, Kevin Brodsky wrote:quoted
[...] * madvise_*_pte_range() call arch_leave() in multiple paths, some followed by an immediate exit/rescheduling and some followed by a conditional exit. These functions assume that they are called with lazy MMU disabled and we cannot simply use pause()/resume() to address that. This patch leaves the situation unchanged by calling enable()/disable() in all cases.I'm confused, the function simply does (a) enables lazy mmu (b) does something on the page table (c) disables lazy mmu (d) does something expensive (split folio -> take sleepable locks, flushes tlb) (e) go to (a)That step is conditional: we exit right away if pte_offset_map_lock() fails. The fundamental issue is that pause() must always be matched with resume(), but as those functions look today there is no situation where a pause() would always be matched with a resume().
We have matches enable/disable, so my question is rather "why" you are even thinking about using pause/resume? What would be the benefit of that? If there is no benefit then just drop this from the patch description as it's more confusing than just ... doing what the existing code does :)
quoted
Why would we use enable/disable instead?quoted
* x86/Xen is currently the only case where explicit handling is required for lazy MMU when context-switching. This is purely an implementation detail and using the generic lazy_mmu_mode_* functions would cause trouble when nesting support is introduced, because the generic functions must be called from the current task. For that reason we still use arch_leave() and arch_enter() there.How does this interact with patch #11?It is a requirement for patch 11, in fact. If we called disable() when switching out a task, then lazy_mmu_state.enabled would (most likely) be false when scheduling it again. By calling the arch_* helpers when context-switching, we ensure lazy_mmu_state remains unchanged. This is consistent with what happens on all other architectures (which don't do anything about lazy_mmu when context-switching). lazy_mmu_state is the lazy MMU status *when the task is scheduled*, and should be preserved on a context-switch.
Okay, thanks for clarifying. That whole XEN stuff here is rather horrible.
quoted
quoted
Note: x86 calls arch_flush_lazy_mmu_mode() unconditionally in a few places, but only defines it if PARAVIRT_XXL is selected, and we are removing the fallback in <linux/pgtable.h>. Add a new fallback definition to <asm/pgtable.h> to keep things building.I can see a call in __kernel_map_pages() and arch_kmap_local_post_map()/arch_kmap_local_post_unmap(). I guess that is ... harmless/irrelevant in the context of this series?It should be. arch_flush_lazy_mmu_mode() was only used by x86 before this series; we're adding new calls to it from the generic layer, but existing x86 calls shouldn't be affected.
Okay, I'd like to understand the rules when arch_flush_lazy_mmu_mode() would actually be required in such arch code, but that's outside of the scope of your patch series. -- Cheers David / dhildenb