Re: [PATCH 10/10] nf_conntrack: Use rcu_barrier().
From: Jesper Dangaard Brouer <hidden>
Date: 2009-06-24 09:33:35
Also in:
linux-ext4, linux-nfs, linux-wireless, lkml, netfilter-devel
On Tue, 2009-06-23 at 18:23 +0200, Patrick McHardy wrote:
> Jesper Dangaard Brouer wrote:
>> I'm not sure which is the most optimal place to call rcu_barrier().
>> The patch probably calls rcu_barrier() too much, but it's a
>> better-safe-than-sorry approach. There are some embedded comments
>> that I would like Patrick McHardy to look at.
>>
>> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
>> index 5f72b94..cea4537 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -1084,6 +1084,8 @@ static void nf_conntrack_cleanup_init_net(void)
>>  {
>>  	nf_conntrack_helper_fini();
>>  	nf_conntrack_proto_fini();
>> +	rcu_barrier();
>> +	/* Need to wait for call_rcu() before dealloc the kmem_cache */
>>  	kmem_cache_destroy(nf_conntrack_cachep);
>
> Which call_rcu() is this referring to?
It is the call_rcu() in nf_conntrack_expect.c (which is linked into nf_conntrack.ko). But that also means the slab cache we should have waited for is "nf_ct_expect_cachep"... (I also notice that "nf_ct_expect_cachep" is missing the SLAB_DESTROY_BY_RCU flag, and that the SLAB_DESTROY_BY_RCU flag should be removed from "nf_conntrack_cachep".)
> If it's the conntrack destruction, that one is gone in the current kernel, and I think destruction is handled properly by the sl*b allocators (SLAB_DESTROY_BY_RCU).
I just dived into the slab.c code and noticed that it is also flawed, ARGH! When the SLAB_DESTROY_BY_RCU flag is set, it only calls synchronize_rcu() and not rcu_barrier() as it should! I'll fix that up in another patch series...

Looking into slub and slob at the moment, it seems that they schedule another call_rcu() callback for freeing when the SLAB_DESTROY_BY_RCU flag is set. That seems racy to me... Paul?
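As a rough illustration of why the distinction matters, here is a small userspace toy model (not kernel code). Every toy_* name is made up for this sketch, and the model assumes the simplest possible behaviour: toy_synchronize_rcu() only lets a grace period elapse, while toy_rcu_barrier() returns after all queued callbacks have run.

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-threaded model of deferred frees. All toy_* names are
 * hypothetical; none of this is the kernel RCU implementation. */

#define TOY_MAX_CALLBACKS 16

struct toy_cache {
	int live_objects;	/* objects not yet returned to the cache */
};

struct toy_callback {
	void (*func)(struct toy_cache *cache);
	struct toy_cache *cache;
};

static struct toy_callback toy_pending[TOY_MAX_CALLBACKS];
static size_t toy_npending;
static unsigned long toy_grace_periods;

/* Deferred free: the object goes back to its cache only when this runs. */
static void toy_free_object(struct toy_cache *cache)
{
	cache->live_objects--;
}

/* Models call_rcu(): queue the callback and return immediately. */
static void toy_call_rcu(void (*func)(struct toy_cache *),
			 struct toy_cache *cache)
{
	assert(toy_npending < TOY_MAX_CALLBACKS);
	toy_pending[toy_npending].func = func;
	toy_pending[toy_npending].cache = cache;
	toy_npending++;
}

/* Models synchronize_rcu(): a grace period elapses, but callbacks that
 * are already queued are not guaranteed to have run on return. */
static void toy_synchronize_rcu(void)
{
	toy_grace_periods++;
}

/* Models rcu_barrier(): return only after every queued callback ran. */
static void toy_rcu_barrier(void)
{
	size_t i;

	for (i = 0; i < toy_npending; i++)
		toy_pending[i].func(toy_pending[i].cache);
	toy_npending = 0;
}

/* Models kmem_cache_destroy(): returns the number of objects that were
 * still outstanding, so 0 means the destroy was safe. */
static int toy_cache_destroy(struct toy_cache *cache)
{
	return cache->live_objects;
}

/* Allocate two objects, defer both frees, wait with one of the two
 * primitives, then destroy the cache. */
int toy_demo(int use_barrier)
{
	struct toy_cache cache = { .live_objects = 2 };

	toy_npending = 0;	/* forget callbacks left by an earlier run */
	toy_call_rcu(toy_free_object, &cache);
	toy_call_rcu(toy_free_object, &cache);

	if (use_barrier)
		toy_rcu_barrier();
	else
		toy_synchronize_rcu();

	return toy_cache_destroy(&cache);
}
```

In this model toy_demo(0) reports two objects still outstanding at destroy time, while toy_demo(1) reports none.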
>> @@ -1118,6 +1120,9 @@ void nf_conntrack_cleanup(struct net *net)
>>  	/* This makes sure all current packets have passed through netfilter framework. Roll on, two-stage module delete... */
>> +	/* hawk@comx.dk 2009-06-20: Think this should be replaced by a
>> +	   rcu_barrier() ???
>> +	 */
>>  	synchronize_net();
>
> AFAICT this one is used to make sure the old value of the ip_ct_attach hook is not visible anymore before beginning cleanup and is not needed for anything else.
Fine!
>>  	nf_conntrack_cleanup_net(net);
>> diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
>> index 1935153..29c6cd0 100644
>> --- a/net/netfilter/nf_conntrack_standalone.c
>> +++ b/net/netfilter/nf_conntrack_standalone.c
>> @@ -500,6 +500,8 @@ static void nf_conntrack_net_exit(struct net *net)
>>  	nf_conntrack_standalone_fini_sysctl(net);
>>  	nf_conntrack_standalone_fini_proc(net);
>>  	nf_conntrack_cleanup(net);
>> +	/* hawk@comx.dk: Think rcu_barrier() should to be called earlier? */
>> +	rcu_barrier(); /* Wait for completion of call_rcu()'s */
>>  }
>
> Which call_rcu() is this referring to? We should place them in the cleanup sub-functions to make this clearly visible.
This call_rcu() is the one done in nf_conntrack_extend.c:114 (notice "_extend", NOT "_expect"), which calls __nf_ct_ext_free_rcu().

I guess this rcu_barrier() should then be moved to nf_ct_extend_unregister(), right? (It already invokes a synchronize_rcu(), which should be replaced by rcu_barrier().) Although this means rcu_barrier() will be invoked three times from nf_conntrack_cleanup_net(), since nf_ct_extend_unregister() is called once each when unregistering ecache, acct and expect.

Thank you for your feedback :-) ... I'll post a new v2 patch...

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer
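To make the placement trade-off concrete, here is a userspace toy model (not kernel code) that counts barrier calls for the two options: a barrier inside each unregister call versus a single barrier at the end of cleanup. All demo_* names are hypothetical, and the model assumes that exactly one deferred free is still queued each time an extension type is unregistered.

```c
#include <assert.h>

/* Userspace toy model; every demo_* name is hypothetical. */

enum demo_ext_id {
	DEMO_EXT_ECACHE,
	DEMO_EXT_ACCT,
	DEMO_EXT_EXPECT,
	DEMO_EXT_NUM
};

static int demo_pending_frees;	/* queued deferred frees not yet run */
static int demo_barrier_calls;	/* how often a barrier was paid for */

/* Stands in for rcu_barrier(): runs every queued deferred free. */
static void demo_rcu_barrier(void)
{
	demo_barrier_calls++;
	demo_pending_frees = 0;
}

/* Stands in for nf_ct_extend_unregister(). Assumption: one deferred
 * free for this extension type is still queued at unregister time. */
static void demo_extend_unregister(enum demo_ext_id id, int barrier_inside)
{
	assert(id < DEMO_EXT_NUM);
	demo_pending_frees++;
	if (barrier_inside)
		demo_rcu_barrier();
}

/* Stands in for nf_conntrack_cleanup_net(): unregisters ecache, acct
 * and expect, and returns the number of barrier calls it needed. */
int demo_cleanup_net(int barrier_inside)
{
	demo_pending_frees = 0;
	demo_barrier_calls = 0;

	demo_extend_unregister(DEMO_EXT_ECACHE, barrier_inside);
	demo_extend_unregister(DEMO_EXT_ACCT, barrier_inside);
	demo_extend_unregister(DEMO_EXT_EXPECT, barrier_inside);

	if (!barrier_inside)
		demo_rcu_barrier();	/* one barrier covers all three */

	/* Either way, nothing may be left queued after cleanup. */
	assert(demo_pending_frees == 0);
	return demo_barrier_calls;
}
```

Both variants leave no deferred frees queued. The per-unregister variant makes the dependency visible at the call site but pays for three barriers per cleanup, while the single trailing barrier pays for one.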