Re: [PATCH v4 0/7] introduce vm_flags modifier functions
From: Suren Baghdasaryan <surenb@google.com>
Date: 2023-03-17 23:05:19
Also in:
linux-mm, lkml
On Fri, Mar 17, 2023 at 3:41 PM Alex Williamson [off-list ref] wrote:
> On Fri, 17 Mar 2023 12:08:32 -0700 Suren Baghdasaryan [off-list ref] wrote:
> > On Tue, Mar 14, 2023 at 1:11 PM Alex Williamson [off-list ref] wrote:
> > > On Thu, 26 Jan 2023 11:37:45 -0800 Suren Baghdasaryan [off-list ref] wrote:
> > > > This patchset was originally published as a part of per-VMA
> > > > locking [1] and was split after a suggestion that it's viable on
> > > > its own and to facilitate the review process. It is now a
> > > > prerequisite for the next version of per-VMA lock patchset, which
> > > > reuses vm_flags modifier functions to lock the VMA when vm_flags
> > > > are being updated.
> > > >
> > > > VMA vm_flags modifications are usually done under exclusive
> > > > mmap_lock protection because this attribute affects other
> > > > decisions like VMA merging or splitting and races should be
> > > > prevented. Introduce vm_flags modifier functions to enforce
> > > > correct locking.
> > > >
> > > > The patchset applies cleanly over mm-unstable branch of mm tree.
> > >
> > > With this series, vfio-pci developed a bunch of warnings around not
> > > holding the mmap_lock write semaphore while calling
> > > io_remap_pfn_range() from our fault handler, vfio_pci_mmap_fault().
> > >
> > > I suspect vdpa has the same issue for their use of remap_pfn_range()
> > > from their fault handler, JasonW, MST, FYI.
> > >
> > > It also looks like gru_fault() would have the same issue, Dimitri.
> > >
> > > In all cases, we're preemptively setting vm_flags to what
> > > remap_pfn_range_notrack() uses, so I thought we were safe here as I
> > > specifically remember trying to avoid changing vm_flags from the
> > > fault handler. But apparently that doesn't take into account
> > > track_pfn_remap() where VM_PAT comes into play.
> > >
> > > The reason for using remap_pfn_range() on fault in vfio-pci is that
> > > we're mapping device MMIO to userspace, where that MMIO can be
> > > disabled and we'd rather zap the mapping when that occurs so that we
> > > can sigbus the user rather than allow the user to trigger
> > > potentially fatal bus errors on the host.
> > >
> > > Peter Xu has suggested offline that a non-lazy approach to reinsert
> > > the mappings might be more in line with mm expectations relative to
> > > touching vm_flags during fault.
> > >
> > > What's the right solution here? Can the fault handling be salvaged,
> > > is proactive remapping the right approach, or is there something
> > > better? Thanks,
> >
> > Hi Alex,
> > If in your case it's safe to change vm_flags without holding
> > exclusive mmap_lock, maybe you can use __vm_flags_mod() the way I
> > used it in
> > https://lore.kernel.org/all/20230126193752.297968-7-surenb@google.com
> > while explaining why this should be safe?
>
> Hi Suren,
>
> Thanks for the reply, but I'm not sure I'm following. Are you
> suggesting a bool arg added to io_remap_pfn_range(), or some new
> variant of that function to conditionally use __vm_flags_mod() in
> place of vm_flags_set() across the call chain? Thanks,
I think either way could work but after taking a closer look, both ways would be quite ugly. If we could somehow identify that we are handling a page fault and use __vm_flags_mod() without additional parameters it would be more palatable IMHO... Peter's suggestion to avoid touching vm_flags during fault would be much cleaner but I'm not sure how easily that can be done.
> Alex
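
For readers following the thread, the trade-off between vm_flags_set() and __vm_flags_mod() can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every name ending in `_model`, the toy flag values, and the `lock_warnings` counter are hypothetical stand-ins invented for illustration. The model assumes, as described above, that the checked modifier expects the mmap_lock to be held for write and that the unchecked variant skips that check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long vm_flags_t;

/* Toy flag values for illustration only; they are NOT the kernel's values. */
#define VM_IO_MODEL     0x1UL
#define VM_PFNMAP_MODEL 0x2UL
#define VM_PAT_MODEL    0x4UL

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct mm_model {
    bool write_locked;      /* true when "mmap_lock" is held for write */
};

struct vma_model {
    struct mm_model *mm;
    vm_flags_t flags;
};

/* Counts locking violations instead of emitting a kernel warning. */
static int lock_warnings;

/* Models the locking assertion made by the checked modifier. */
static inline void assert_write_locked_model(const struct mm_model *mm)
{
    if (!mm->write_locked)
        lock_warnings++;
}

/* Checked modifier: models vm_flags_set(). */
static inline void vm_flags_set_model(struct vma_model *vma, vm_flags_t flags)
{
    assert(vma != NULL && vma->mm != NULL);
    assert_write_locked_model(vma->mm);
    vma->flags |= flags;
}

/*
 * Unchecked modifier: models __vm_flags_mod(). The caller takes on the
 * burden of explaining why skipping the lock check is safe.
 */
static inline void vm_flags_mod_unchecked_model(struct vma_model *vma,
                                                vm_flags_t set,
                                                vm_flags_t clear)
{
    assert(vma != NULL);
    vma->flags = (vma->flags | set) & ~clear;
}

/*
 * Models a fault handler, which does not hold the lock for write, ending
 * up in a helper that sets one more flag (VM_PAT in the report above).
 */
static inline void fault_path_model(struct vma_model *vma, bool use_unchecked)
{
    if (use_unchecked)
        vm_flags_mod_unchecked_model(vma, VM_PAT_MODEL, 0);
    else
        vm_flags_set_model(vma, VM_PAT_MODEL);
}
```

Calling `fault_path_model(&vma, false)` with `write_locked == false` bumps `lock_warnings`, mirroring the warnings reported for vfio-pci, while passing `true` updates the flags silently. The `use_unchecked` parameter corresponds to the "bool arg" idea raised in the thread, which is part of why both variants were described as ugly: the flag has to be threaded through every caller.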