Re: [RFC PATCH RESEND 00/28] per-VMA locks proposal
From: Vlastimil Babka <hidden>
Date: 2022-09-11 09:35:47
Also in:
linux-arm-kernel, linux-mm, lkml
On 9/2/22 01:26, Suren Baghdasaryan wrote:
On Thu, Sep 1, 2022 at 1:58 PM Kent Overstreet [off-list ref] wrote:quoted
On Thu, Sep 01, 2022 at 10:34:48AM -0700, Suren Baghdasaryan wrote:quoted
Resending to fix the issue with the In-Reply-To tag in the original submission at [4]. This is a proof of concept for per-vma locks idea that was discussed during SPF [1] discussion at LSF/MM this year [2], which concluded with suggestion that “a reader/writer semaphore could be put into the VMA itself; that would have the effect of using the VMA as a sort of range lock. There would still be contention at the VMA level, but it would be an improvement.” This patchset implements this suggested approach. When handling page faults we lookup the VMA that contains the faulting page under RCU protection and try to acquire its lock. If that fails we fall back to using mmap_lock, similar to how SPF handled this situation. One notable way the implementation deviates from the proposal is the way VMAs are marked as locked. Because during some of mm updates multiple VMAs need to be locked until the end of the update (e.g. vma_merge, split_vma, etc). Tracking all the locked VMAs, avoiding recursive locks and other complications would make the code more complex. Therefore we provide a way to "mark" VMAs as locked and then unmark all locked VMAs all at once. This is done using two sequence numbers - one in the vm_area_struct and one in the mm_struct. VMA is considered locked when these sequence numbers are equal. To mark a VMA as locked we set the sequence number in vm_area_struct to be equal to the sequence number in mm_struct. To unlock all VMAs we increment mm_struct's seq number. This allows for an efficient way to track locked VMAs and to drop the locks on all VMAs at the end of the update.I like it - the sequence numbers are a stroke of genuius. For what it's doing the patchset seems almost small.Thanks for reviewing it!quoted
Two complaints so far: - I don't like the vma_mark_locked() name. To me it says that the caller already took or is taking the lock and this function is just marking that we're holding the lock, but it's really taking a different type of lock. But this function can block, it really is taking a lock, so it should say that. This is AFAIK a new concept, not sure I'm going to have anything good either, but perhaps vma_lock_multiple()?I'm open to name suggestions but vma_lock_multiple() is a bit confusing to me. Will wait for more suggestions.
Well, it does act like a vma_write_lock(), no? So why not that name. The checking function for it is even called vma_assert_write_locked(). We just don't provide a single vma_write_unlock(), but a vma_mark_unlocked_all(), that could be instead named e.g. vma_write_unlock_all(). But it's called on a mm, so maybe e.g. mm_vma_write_unlock_all()?