Re: [PATCH] net: add a synchronize_net() in netdev_rx_handler_unregister()
From: Steven Rostedt <rostedt@goodmis.org>
Date: 2013-03-29 13:17:24
Also in:
lkml
On Fri, 2013-03-29 at 06:01 -0700, Eric Dumazet wrote:
From: Eric Dumazet <edumazet@google.com> On Fri, 2013-03-29 at 10:48 +0100, Jiri Pirko wrote:quoted
Hmm. I think that this might be issue introduced by: commit a9b3cd7f323b2e57593e7215362a7b02fc933e3a Author: Stephen Hemminger [off-list ref] Date: Mon Aug 1 16:19:00 2011 +0000 rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER Because, if rcu_dereference(dev->rx_handler) is null, rcu_dereference(dev->rx_handler_data) is never done. Therefore I believe we are hitting following scenario: CPU0 CPU1 ---- ---- dev->rx_handler_data = NULL rcu_read_lock() dev->rx_handler = NULL CPU0 will see rx_handler set and yet, rx_handler_data nulled. Write barrier in rcu_assign_pointer() might prevent this reorder from happening. Therefore I suggest:diff --git a/net/core/dev.c b/net/core/dev.c index 0caa38e..c16b829 100644 --- a/net/core/dev.c +++ b/net/core/dev.c@@ -3332,8 +3332,8 @@ void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); - RCU_INIT_POINTER(dev->rx_handler, NULL); - RCU_INIT_POINTER(dev->rx_handler_data, NULL); + rcu_assign_pointer(dev->rx_handler, NULL); + rcu_assign_pointer(dev->rx_handler_data, NULL); } EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);Nope this changes nothing at all.
Exactly! In fact, the bug triggered on an older kernel that had the original rcu_assign_pointer()
However, we can fix the bug in a different way, if we want to avoid a test in fast path. With following patch, we can make sure that a reader seeing a non NULL rx_handler has a guarantee to see a non NULL rx_handler_data
[..]
quoted hunk ↗ jump to hunk
We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <redacted> Cc: Paul E. McKenney <redacted> --- net/core/dev.c | 6 ++++++ 1 file changed, 6 insertions(+)diff --git a/net/core/dev.c b/net/core/dev.c index b13e5c7..56932a4 100644 --- a/net/core/dev.c +++ b/net/core/dev.c@@ -3314,6 +3314,7 @@ int netdev_rx_handler_register(struct net_device *dev, if (dev->rx_handler) return -EBUSY; + /* Note: rx_handler_data must be set before rx_handler */ rcu_assign_pointer(dev->rx_handler_data, rx_handler_data); rcu_assign_pointer(dev->rx_handler, rx_handler);@@ -3334,6 +3335,11 @@ void netdev_rx_handler_unregister(struct net_device *dev) ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); + /* a reader seeing a non NULL rx_handler in a rcu_read_lock() + * section has a guarantee to see a non NULL rx_handler_data + * as well. + */ + synchronize_net();
I've thought about this too, but I wasn't sure we wanted two synchronize_*() functions, as the caller does a synchronize as well. That said, I think this is the more robust solution and it lets all rx_handler() functions assume that their rx_handler_data is set. And it removes the check from the fast path which outweighs an added synchronization in the slow path. Acked-by: Steven Rostedt <rostedt@goodmis.org> Thanks! -- Steve
RCU_INIT_POINTER(dev->rx_handler_data, NULL); } EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);