Re: [RFC v2 PATCH] mm, sl[au]b: Introduce lockless cache
From: Matthew Wilcox <willy@infradead.org>
Date: 2021-09-21 02:05:19
Also in:
linux-block, linux-mm, lkml
On Mon, Sep 20, 2021 at 03:48:16PM +0000, Hyeonggon Yoo wrote:
+#define KMEM_LOCKLESS_CACHE_QUEUE_SIZE 64
I would suggest that, to be nice to the percpu allocator, this be one less than 2^n.
+struct kmem_lockless_cache {
+ void *queue[KMEM_LOCKLESS_CACHE_QUEUE_SIZE];
+ unsigned int size;
+};I would also suggest that 'size' be first as it is going to be accessed every time, and then there's a reasonable chance that queue[size - 1] will be in the same cacheline. CPUs will tend to handle that better.
+/**
+ * kmem_cache_alloc_cached - try to allocate from cache without lock
+ * @s: slab cache
+ * @flags: SLAB flags
+ *
+ * Try to allocate from cache without lock. If fails, fill the lockless cache
+ * using bulk alloc API
+ *
+ * Be sure that there's no race condition.
+ * Must create slab cache with SLAB_LOCKLESS_CACHE flag to use this function.
+ *
+ * Return: a pointer to free object on allocation success, NULL on failure.
+ */
+void *kmem_cache_alloc_cached(struct kmem_cache *s, gfp_t gfpflags)
+{
+ struct kmem_lockless_cache *cache = this_cpu_ptr(s->cache);
+
+ BUG_ON(!(s->flags & SLAB_LOCKLESS_CACHE));
+
+ if (cache->size) /* fastpath without lock */
+ return cache->queue[--cache->size];
+
+ /* slowpath */
+ cache->size = kmem_cache_alloc_bulk(s, gfpflags,
+ KMEM_LOCKLESS_CACHE_QUEUE_SIZE, cache->queue);Go back to the Bonwick paper and look at the magazine section again. You have to allocate _half_ the size of the queue, otherwise you get into pathological situations where you start to free and allocate every time.
+void kmem_cache_free_cached(struct kmem_cache *s, void *p)
+{
+ struct kmem_lockless_cache *cache = this_cpu_ptr(s->cache);
+
+ BUG_ON(!(s->flags & SLAB_LOCKLESS_CACHE));
+
+ /* Is there better way to do this? */
+ if (cache->size == KMEM_LOCKLESS_CACHE_QUEUE_SIZE)
+ kmem_cache_free(s, cache->queue[--cache->size]);
Yes.
if (cache->size == KMEM_LOCKLESS_CACHE_QUEUE_SIZE) {
kmem_cache_free_bulk(s, KMEM_LOCKLESS_CACHE_QUEUE_SIZE / 2,
&cache->queue[KMEM_LOCKLESS_CACHE_QUEUE_SIZE / 2));
cache->size = KMEM_LOCKLESS_CACHE_QUEUE_SIZE / 2;
(check the maths on that; it might have some off-by-one)