Re: [PATCH 00/14] replace call_rcu by kfree_rcu for simple kmem_cache_free callback
From: Vlastimil Babka <hidden>
Date: 2024-06-18 17:21:45
Also in:
bridge, kernel-janitors, kvm, linux-block, linux-can, linux-nfs, linux-trace-kernel, lkml, netdev, netfilter-devel
On 6/18/24 6:48 PM, Paul E. McKenney wrote:
On Tue, Jun 18, 2024 at 11:31:00AM +0200, Uladzislau Rezki wrote:quoted
quoted
On 6/17/24 8:42 PM, Uladzislau Rezki wrote:quoted
quoted
+ + s = container_of(work, struct kmem_cache, async_destroy_work); + + // XXX use the real kmem_cache_free_barrier() or similar thing hereIt implies that we need to introduce kfree_rcu_barrier(), a new API, which i wanted to avoid initially.I wanted to avoid new API or flags for kfree_rcu() users and this would be achieved. The barrier is used internally so I don't consider that an API to avoid. How difficult is the implementation is another question, depending on how the current batching works. Once (if) we have sheaves proven to work and move kfree_rcu() fully into SLUB, the barrier might also look different and hopefully easier. So maybe it's not worth to invest too much into that barrier and just go for the potentially longer, but easier to implement?Right. I agree here. If the cache is not empty, OK, we just defer the work, even we can use a big 21 seconds delay, after that we just "warn" if it is still not empty and leave it as it is, i.e. emit a warning and we are done. Destroying the cache is not something that must happen right away.OK, I have to ask... Suppose that the cache is created and destroyed by a module and init/cleanup time, respectively. Suppose that this module is rmmod'ed then very quickly insmod'ed. Do we need to fail the insmod if the kmem_cache has not yet been fully cleaned up?
We don't have any such link between kmem_cache and module to detect that, so we would have to start tracking that. Probably not worth the trouble.
If not, do we have two versions of the same kmem_cache in /proc during the overlap time?
Hm could happen in /proc/slabinfo but without being harmful other than perhaps confusing someone. We could filter out the caches being destroyed trivially. Sysfs and debugfs might be more problematic as I suppose directory names would clash. I'll have to check... might be even happening now when we do detect leaked objects and just leave the cache around... thanks for the question.
Thanx, Paulquoted
quoted
quoted
Since you do it asynchronous can we just repeat and wait until it a cache is furry freed?The problem is we want to detect the cases when it's not fully freed because there was an actual read. So at some point we'd need to stop the repeats because we know there can no longer be any kfree_rcu()'s in flight since the kmem_cache_destroy() was called.Agree. As noted above, we can go with 21 seconds(as an example) interval and just perform destroy(without repeating). -- Uladzislau Rezki