Thread: 181 messages, 12 authors, 2023-11-22

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

From: Jann Horn <jannh@google.com>
Date: 2023-01-17 22:46:59
Also in: linux-arm-kernel, linux-mm, lkml

On Tue, Jan 17, 2023 at 10:28 PM Suren Baghdasaryan [off-list ref] wrote:
> On Tue, Jan 17, 2023 at 10:03 AM Jann Horn [off-list ref] wrote:
> > +locking maintainers
>
> Thanks! I'll CC the locking maintainers in the next posting.
>
> > On Mon, Jan 9, 2023 at 9:54 PM Suren Baghdasaryan [off-list ref] wrote:
> > > Introduce a per-VMA rw_semaphore to be used during page fault handling
> > > instead of mmap_lock. Because there are cases when multiple VMAs need
> > > to be exclusively locked during VMA tree modifications, instead of the
> > > usual lock/unlock pattern we mark a VMA as locked by taking per-VMA lock
> > > exclusively and setting vma->lock_seq to the current mm->lock_seq. When
> > > mmap_write_lock holder is done with all modifications and drops mmap_lock,
> > > it will increment mm->lock_seq, effectively unlocking all VMAs marked as
> > > locked.
> > [...]
> > > +static inline void vma_read_unlock(struct vm_area_struct *vma)
> > > +{
> > > +       up_read(&vma->lock);
> > > +}
> >
> > One thing that might be gnarly here is that I think you might not be
> > allowed to use up_read() to fully release ownership of an object -
> > from what I remember, I think that up_read() (unlike something like
> > spin_unlock()) can access the lock object after it's already been
> > acquired by someone else. So if you want to protect against concurrent
> > deletion, this might have to be something like:
> >
> > rcu_read_lock(); /* keeps vma alive */
> > up_read(&vma->lock);
> > rcu_read_unlock();
>
> But for deleting VMA one would need to write-lock the vma->lock first,
> which I assume can't happen until this up_read() is complete. Is that
> assumption wrong?

__up_read() does:

rwsem_clear_reader_owned(sem);
tmp = atomic_long_add_return_release(-RWSEM_READER_BIAS, &sem->count);
DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);
if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
      RWSEM_FLAG_WAITERS)) {
  clear_nonspinnable(sem);
  rwsem_wake(sem);
}

The atomic_long_add_return_release() is the point where we are doing
the main lock-releasing.

So if a reader dropped the read-lock while someone else was waiting on
the lock (RWSEM_FLAG_WAITERS) and no other readers were holding the
lock together with it, the reader also does clear_nonspinnable() and
rwsem_wake() afterwards.
But in rwsem_down_write_slowpath(), after we've set
RWSEM_FLAG_WAITERS, we can return successfully immediately once
rwsem_try_write_lock() sees that there are no active readers or
writers anymore (if RWSEM_LOCK_MASK is unset and the cmpxchg
succeeds). We're not necessarily waiting for the "nonspinnable" bit or
the wake.

So yeah, I think down_write() can return successfully before up_read()
is done with its memory accesses.
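
To make the ordering concrete, here is a rough userspace sketch that
replays that interleaving in a single thread. It is only a model, not
kernel code: the bit layout is what I recall from kernel/locking/rwsem.c
and should be treated as an assumption, and struct model_sem, the
"freed" flag and all function names are made up for illustration. The
writer side is reduced to "no holders, then cmpxchg", ignoring handoff
and the wait list.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed count layout, modeled on kernel/locking/rwsem.c. */
#define RWSEM_WRITER_LOCKED (1L << 0)
#define RWSEM_FLAG_WAITERS  (1L << 1)
#define RWSEM_READER_BIAS   (1L << 8)
#define RWSEM_READER_MASK   (~(RWSEM_READER_BIAS - 1))
#define RWSEM_LOCK_MASK     (RWSEM_WRITER_LOCKED | RWSEM_READER_MASK)

/* Hypothetical stand-in for a lock embedded in a freeable object. */
struct model_sem {
	atomic_long count;
	bool freed;              /* set once the writer "frees" the object */
	bool touched_after_free; /* set if the reader touches it afterwards */
};

/* First half of __up_read(): the releasing add. Returns the new count. */
static long reader_release(struct model_sem *sem)
{
	return atomic_fetch_sub_explicit(&sem->count, RWSEM_READER_BIAS,
					 memory_order_release) - RWSEM_READER_BIAS;
}

/* Condition guarding the second half of __up_read(). */
static bool reader_needs_wake(long tmp)
{
	return (tmp & (RWSEM_LOCK_MASK | RWSEM_FLAG_WAITERS)) ==
	       RWSEM_FLAG_WAITERS;
}

/* Simplified writer try-lock: succeeds as soon as nobody holds the lock. */
static bool writer_try_lock(struct model_sem *sem)
{
	long old = atomic_load_explicit(&sem->count, memory_order_relaxed);

	if (old & RWSEM_LOCK_MASK)
		return false;
	return atomic_compare_exchange_strong_explicit(&sem->count, &old,
			old | RWSEM_WRITER_LOCKED,
			memory_order_acquire, memory_order_relaxed);
}

/* Returns true if the reader touched the object after it was "freed". */
static bool replay_interleaving(void)
{
	struct model_sem sem = { .freed = false, .touched_after_free = false };
	long tmp;

	/* One reader holds the lock, one writer is already queued. */
	atomic_init(&sem.count, RWSEM_READER_BIAS | RWSEM_FLAG_WAITERS);

	tmp = reader_release(&sem);   /* reader: lock is free from here on */

	if (writer_try_lock(&sem))    /* writer: wins without being woken */
		sem.freed = true;     /* writer: e.g. goes on to free the VMA */

	if (reader_needs_wake(tmp)) { /* reader: still inside __up_read() */
		/* stands in for clear_nonspinnable(sem); rwsem_wake(sem); */
		if (sem.freed)
			sem.touched_after_free = true;
	}
	return sem.touched_after_free;
}
```

In this model replay_interleaving() returns true: the writer's cmpxchg
succeeds right after the reader's atomic add, and the reader then still
takes the wake path on an object the writer may already have freed.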

(Spinlocks are different - the kernel relies on being able to drop
references via spin_unlock() in some places.)