Re: [PATCH RFC v3 13/13] uprobes: add speculative lockless VMA to inode resolution
From: Suren Baghdasaryan <surenb@google.com>
Date: 2024-08-13 15:36:16
Also in:
bpf, linux-mm, lkml
On Mon, Aug 12, 2024 at 11:18 PM Mateusz Guzik [off-list ref] wrote:
On Mon, Aug 12, 2024 at 09:29:17PM -0700, Andrii Nakryiko wrote:quoted
Now that files_cachep is SLAB_TYPESAFE_BY_RCU, we can safely access vma->vm_file->f_inode lockless only under rcu_read_lock() protection, attempting uprobe look up speculatively. We rely on newly added mmap_lock_speculation_{start,end}() helpers to validate that mm_struct stays intact for entire duration of this speculation. If not, we fall back to mmap_lock-protected lookup. This allows to avoid contention on mmap_lock in absolutely majority of cases, nicely improving uprobe/uretprobe scalability.Here I have to admit to being mostly ignorant about the mm, so bear with me. :> I note the result of find_active_uprobe_speculative is immediately stale in face of modifications. The thing I'm after is that the mmap_lock_speculation business adds overhead on archs where a release fence is not a de facto nop and I don't believe the commit message justifies it. Definitely a bummer to add merely it for uprobes. If there are bigger plans concerning it that's a different story of course. With this in mind I have to ask if instead you could perhaps get away with the already present per-vma sequence counter?
per-vma sequence counter does not implement acquire/release logic, it relies on vma->vm_lock for synchronization. So if we want to use it, we would have to add additional memory barriers here. This is likely possible but as I mentioned before we would need to ensure the pagefault path does not regress. OTOH mm->mm_lock_seq already halfway there (it implements acquire/release logic), we just had to ensure mmap_write_lock() increments mm->mm_lock_seq. So, from the release fence overhead POV I think whether we use mm->mm_lock_seq or vma->vm_lock, we would still need a proper fence here.