Thread: 76 messages, 9 authors, 2024-10-09

Re: [PATCH 00/14] replace call_rcu by kfree_rcu for simple kmem_cache_free callback

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: 2024-06-13 00:32:02
Also in: bridge, kernel-janitors, kvm, linux-block, linux-can, linux-nfs, linux-trace-kernel, linuxppc-dev, lkml, netfilter-devel

On Thu, Jun 13, 2024 at 01:31:57AM +0200, Jason A. Donenfeld wrote:
On Wed, Jun 12, 2024 at 03:37:55PM -0700, Paul E. McKenney wrote:
On Wed, Jun 12, 2024 at 02:33:05PM -0700, Jakub Kicinski wrote:
On Sun,  9 Jun 2024 10:27:12 +0200 Julia Lawall wrote:
Since SLOB was removed, it is not necessary to use call_rcu
when the callback only performs kmem_cache_free. Use
kfree_rcu() directly.

The changes were done using the following Coccinelle semantic patch.
This semantic patch is designed to ignore cases where the callback
function is used in another way.
How does the discussion on:
  [PATCH] Revert "batman-adv: prefer kfree_rcu() over call_rcu() with free-only callbacks"
  https://lore.kernel.org/all/20240612133357.2596-1-linus.luessing@c0d3.blue/
reflect on this series? IIUC we should hold off..
We do need to hold off for the ones in kernel modules (such as 07/14)
where the kmem_cache is destroyed during module unload.

OK, I might as well go through them...

[PATCH 01/14] wireguard: allowedips: replace call_rcu by kfree_rcu for simple kmem_cache_free callback
	Needs to wait, see wg_allowedips_slab_uninit().
Right, this has exactly the same pattern as the batman-adv issue:

    void wg_allowedips_slab_uninit(void)
    {
            rcu_barrier();
            kmem_cache_destroy(node_cache);
    }

I'll hold off on sending that up until this matter is resolved.
BTW, I think this whole thing might be caused by:

    a35d16905efc ("rcu: Add basic support for kfree_rcu() batching")

The commit message there mentions:

    There is an implication with rcu_barrier() with this patch. Since the
    kfree_rcu() calls can be batched, and may not be handed yet to the RCU
    machinery in fact, the monitor may not have even run yet to do the
    queue_rcu_work(), there seems no easy way of implementing rcu_barrier()
    to wait for those kfree_rcu()s that are already made. So this means a
    kfree_rcu() followed by an rcu_barrier() does not imply that memory will
    be freed once rcu_barrier() returns.
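
To make the implication in that commit message concrete, here is a rough userspace sketch, not kernel code. All names (sim_call_rcu, sim_kfree_rcu, sim_monitor_run, sim_rcu_barrier) are made up for illustration, and the model assumes the barrier only sees what has already been handed to the callback queue:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

static void *rcu_queue[MAX_PENDING];	/* already handed to the "RCU machinery" */
static size_t rcu_len;
static void *batch_queue[MAX_PENDING];	/* batched, not yet handed over */
static size_t batch_len;
static size_t freed_count;		/* objects released so far */

/* Stand-in for call_rcu(): goes straight onto the callback queue. */
static void sim_call_rcu(void *obj)
{
	assert(rcu_len < MAX_PENDING);
	rcu_queue[rcu_len++] = obj;
}

/* Stand-in for a batching kfree_rcu(): only parked in the batch. */
static void sim_kfree_rcu(void *obj)
{
	assert(batch_len < MAX_PENDING);
	batch_queue[batch_len++] = obj;
}

/* Stand-in for the monitor: hands the batch over to the callback queue. */
static void sim_monitor_run(void)
{
	for (size_t i = 0; i < batch_len; i++)
		sim_call_rcu(batch_queue[i]);
	batch_len = 0;
}

/* Stand-in for rcu_barrier(): completes only what is on the callback queue. */
static void sim_rcu_barrier(void)
{
	freed_count += rcu_len;	/* pretend each queued object is freed */
	rcu_len = 0;
}
```

In this model, an object passed to sim_kfree_rcu() is still outstanding after sim_rcu_barrier() returns unless sim_monitor_run() happened first, which is the window in which a following kmem_cache_destroy() would see live objects.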

Before that, a kfree_rcu() used to just add a normal call_rcu() into the
list, but with the function offset < 4096 as a special marker. So the
kfree_rcu() calls would be treated alongside the other call_rcu() ones
and thus affected by rcu_barrier(). Looks like that behavior is no more
since this commit.
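
A minimal userspace sketch of that older scheme, as I understand it: a small offset is stored where the callback pointer would normally go, and the drain loop tells the two apart by value. The names here (cb_head, queue_cb, queue_free, drain_all) are hypothetical, and storing a function address in a uintptr_t is an assumption that holds on common platforms rather than something ISO C guarantees:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	uintptr_t func;	/* real callback address, or a small offset as marker */
};

typedef void (*cb_fn)(struct cb_head *);

#define OFFSET_LIMIT 4096	/* values below this are treated as offsets */

static struct cb_head *cb_list;	/* one shared list for both kinds of entry */
static int freed_by_offset;	/* counters, for demonstration only */
static int called_by_func;

/* Analogue of call_rcu(): queue a real callback. */
static void queue_cb(struct cb_head *head, cb_fn fn)
{
	head->func = (uintptr_t)fn;
	head->next = cb_list;
	cb_list = head;
}

/* Analogue of the old kfree_rcu(): queue the head's offset within its object. */
static void queue_free(struct cb_head *head, size_t offset)
{
	assert(offset < OFFSET_LIMIT);
	head->func = (uintptr_t)offset;
	head->next = cb_list;
	cb_list = head;
}

/* Drain the shared list; both kinds of entry are handled in the same pass. */
static void drain_all(void)
{
	while (cb_list) {
		struct cb_head *head = cb_list;

		cb_list = head->next;
		if (head->func < OFFSET_LIMIT) {
			/* Marker case: step back to the object start and free it. */
			free((char *)head - head->func);
			freed_by_offset++;
		} else {
			((cb_fn)head->func)(head);
			called_by_func++;
		}
	}
}

/* Example object embedding the head at a nonzero offset. */
struct item {
	int value;
	struct cb_head rcu;
};

/* Explicit callback that frees the enclosing item. */
static void item_release(struct cb_head *head)
{
	free((char *)head - offsetof(struct item, rcu));
}
```

Because both kinds of entry sit on the same list, anything that waits for the list to drain covers the free-only entries too, which is the property that went away with batching.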

Rather than getting rid of the batching, which seems good for
efficiency, I wonder if the right fix would be to add a
`should_destroy` boolean to kmem_cache, which kmem_cache_destroy() sets
to true. Right after setting it, kmem_cache_destroy() would check
`if (number_of_allocations == 0) actually_destroy()`, and likewise each
kmem_cache_free() would check `if (should_destroy &&
number_of_allocations == 0) actually_destroy()`. This way, the actual
teardown is delayed until it's safe. This might also mitigate other
lurking bugs in code that calls kmem_cache_destroy() before its last
kmem_cache_free().
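
Roughly what I have in mind, as a single-threaded userspace sketch rather than a patch against the slab code. The struct and function names (cache_sim, cache_alloc, cache_free, cache_destroy) are invented for illustration; only should_destroy, number_of_allocations and actually_destroy() come from the description above. A real version would need proper locking or atomics around the counter and the flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kmem_cache. */
struct cache_sim {
	size_t object_size;
	size_t number_of_allocations;	/* objects currently live */
	bool should_destroy;		/* set once destroy was requested */
};

static int caches_destroyed;	/* counter, for demonstration only */

/* The real teardown; only reached once no objects remain. */
static void actually_destroy(struct cache_sim *c)
{
	assert(c->number_of_allocations == 0);
	free(c);
	caches_destroyed++;
}

static struct cache_sim *cache_create(size_t object_size)
{
	struct cache_sim *c = calloc(1, sizeof(*c));

	if (c)
		c->object_size = object_size;
	return c;
}

static void *cache_alloc(struct cache_sim *c)
{
	void *obj;

	assert(!c->should_destroy);	/* no allocations after destroy */
	obj = malloc(c->object_size);
	if (obj)
		c->number_of_allocations++;
	return obj;
}

/* Free one object; finish a pending destroy if this was the last one. */
static void cache_free(struct cache_sim *c, void *obj)
{
	assert(c->number_of_allocations > 0);
	free(obj);
	c->number_of_allocations--;
	if (c->should_destroy && c->number_of_allocations == 0)
		actually_destroy(c);
}

/* Request destruction; tear down now only if nothing is outstanding. */
static void cache_destroy(struct cache_sim *c)
{
	c->should_destroy = true;
	if (c->number_of_allocations == 0)
		actually_destroy(c);
}
```

With this shape, a cache_destroy() that races ahead of late frees just marks the cache, and the last cache_free() does the teardown. The caller must not touch the cache pointer after the final free, since it may already be gone.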

Jason