Re: [PATCH v9 17/24] mm: Protect mm_rb tree with a rwlock
From: Peter Zijlstra <peterz@infradead.org>
Date: 2018-03-14 08:52:06
Also in: linuxppc-dev, lkml
On Tue, Mar 13, 2018 at 06:59:47PM +0100, Laurent Dufour wrote:
> This change is inspired by the Peter's proposal patch [1] which was
> protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
> that particular case, and it is introducing major performance degradation
> due to excessive scheduling operations.
Do you happen to have a little more detail on that?
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 34fde7111e88..28c763ea1036 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -335,6 +335,7 @@ struct vm_area_struct {
>  	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>  #ifdef CONFIG_SPECULATIVE_PAGE_FAULT
>  	seqcount_t vm_sequence;
> +	atomic_t vm_ref_count;		/* see vma_get(), vma_put() */
>  #endif
>  } __randomize_layout;
>
> @@ -353,6 +354,9 @@ struct kioctx_table;
>  struct mm_struct {
>  	struct vm_area_struct *mmap;		/* list of VMAs */
>  	struct rb_root mm_rb;
> +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
> +	rwlock_t mm_rb_lock;
> +#endif
>  	u32 vmacache_seqnum;			/* per-thread vmacache */
>  #ifdef CONFIG_MMU
>  	unsigned long (*get_unmapped_area) (struct file *filp,
When I tried this, it simply traded contention on mmap_sem for contention
on these two cachelines.

This was for the concurrent fault benchmark, where mmap_sem is only ever
acquired for reading (so no blocking ever happens) and the bottleneck was
really pure cacheline access.

Only by using RCU can you avoid that thrashing.

Also note that if your database allocates one giant mapping, it'll be
_one_ VMA and that vm_ref_count gets _very_ hot indeed.
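To make the cacheline point concrete, here is a minimal userspace sketch, not kernel code. The names fake_vma, fake_vma_get(), fake_vma_put() and run_fake_faults() are hypothetical stand-ins for the patch's vm_ref_count, vma_get() and vma_put(), and the sketch assumes pthreads and C11 atomics rather than the kernel's atomic_t. It only illustrates that every logically read-only lookup still performs two atomic writes to the same word, so with all threads on one VMA that line has to move between CPUs on each access.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_READERS 64

/* Hypothetical stand-in for struct vm_area_struct; not the kernel type. */
struct fake_vma {
	atomic_int ref_count;	/* plays the role of vm_ref_count */
	unsigned long start;
	unsigned long end;
};

struct reader_arg {
	struct fake_vma *vma;
	unsigned long addr;
	long iters;
	long hits;		/* private to each thread, never shared */
};

/* An atomic read-modify-write: the CPU needs the line exclusively. */
static void fake_vma_get(struct fake_vma *vma)
{
	atomic_fetch_add_explicit(&vma->ref_count, 1, memory_order_relaxed);
}

/* Dropping the reference is a second write to the same cacheline. */
static void fake_vma_put(struct fake_vma *vma)
{
	atomic_fetch_sub_explicit(&vma->ref_count, 1, memory_order_release);
}

/* One simulated "fault" per iteration: pin the VMA, check the range, unpin. */
static void *reader(void *p)
{
	struct reader_arg *arg = p;
	long i;

	for (i = 0; i < arg->iters; i++) {
		fake_vma_get(arg->vma);
		if (arg->addr >= arg->vma->start && arg->addr < arg->vma->end)
			arg->hits++;
		fake_vma_put(arg->vma);
	}
	return NULL;
}

/*
 * Run nthreads readers against a single VMA. Returns the final reference
 * count and stores the total number of successful range checks.
 */
int run_fake_faults(int nthreads, long iters, long *total_hits)
{
	struct fake_vma vma = { .start = 0x1000, .end = 0x2000 };
	pthread_t tid[MAX_READERS];
	struct reader_arg args[MAX_READERS];
	long hits = 0;
	int i, started = 0;

	assert(nthreads > 0 && nthreads <= MAX_READERS);
	atomic_init(&vma.ref_count, 1);	/* the owner's reference */

	for (i = 0; i < nthreads; i++) {
		args[i] = (struct reader_arg){
			.vma = &vma, .addr = 0x1800, .iters = iters,
		};
		if (pthread_create(&tid[i], NULL, reader, &args[i]) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		hits += args[i].hits;
	}
	*total_hits = hits;
	return atomic_load(&vma.ref_count);
}
```

The final count is unremarkable; what matters is that no reader ever modifies the VMA's range, yet each one writes to the shared ref_count word twice per lookup. An RCU-style reader would perform the same range check without storing to anything shared.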