Thread (18 messages) 18 messages, 8 authors, 2013-03-30

Re: [PATCH] net: add a synchronize_net() in netdev_rx_handler_unregister()

From: Steven Rostedt <rostedt@goodmis.org>
Date: 2013-03-29 13:17:24
Also in: lkml

On Fri, 2013-03-29 at 06:01 -0700, Eric Dumazet wrote:
From: Eric Dumazet <edumazet@google.com>

On Fri, 2013-03-29 at 10:48 +0100, Jiri Pirko wrote:
quoted
Hmm. I think that this might be issue introduced by:
commit a9b3cd7f323b2e57593e7215362a7b02fc933e3a
Author: Stephen Hemminger [off-list ref]
Date:   Mon Aug 1 16:19:00 2011 +0000

    rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER


Because, if rcu_dereference(dev->rx_handler) is null,
rcu_dereference(dev->rx_handler_data) is never done. Therefore I believe
we are hitting following scenario:


   CPU0				CPU1
   ----				----
  			    dev->rx_handler_data = NULL
 rcu_read_lock()
 			    dev->rx_handler = NULL


CPU0 will see rx_handler set and yet, rx_handler_data nulled. Write
barrier in rcu_assign_pointer() might prevent this reorder from happening.
Therefore I suggest:
diff --git a/net/core/dev.c b/net/core/dev.c
index 0caa38e..c16b829 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3332,8 +3332,8 @@ void netdev_rx_handler_unregister(struct net_device *dev)
 {
 
 	ASSERT_RTNL();
-	RCU_INIT_POINTER(dev->rx_handler, NULL);
-	RCU_INIT_POINTER(dev->rx_handler_data, NULL);
+	rcu_assign_pointer(dev->rx_handler, NULL);
+	rcu_assign_pointer(dev->rx_handler_data, NULL);
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
 
Nope this changes nothing at all.
Exactly! In fact, the bug triggered on an older kernel that had the
original rcu_assign_pointer()
However, we can fix the bug in a different way, if we want to avoid a
test in fast path.

With following patch, we can make sure that a reader seeing a non NULL
rx_handler has a guarantee to see a non NULL rx_handler_data
[..]
quoted hunk ↗ jump to hunk
We can fix bug this in two ways. First is adding a test in
bond_handle_frame() and others to check if rx_handler_data is NULL.

A second way is adding a synchronize_net() in
netdev_rx_handler_unregister() to make sure that a rcu protected reader
has the guarantee to see a non NULL rx_handler_data.

The second way is better as it avoids an extra test in fast path.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <redacted>
Cc: Paul E. McKenney <redacted>
---
 net/core/dev.c |    6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index b13e5c7..56932a4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3314,6 +3314,7 @@ int netdev_rx_handler_register(struct net_device *dev,
 	if (dev->rx_handler)
 		return -EBUSY;
 
+	/* Note: rx_handler_data must be set before rx_handler */
 	rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
 	rcu_assign_pointer(dev->rx_handler, rx_handler);
 
@@ -3334,6 +3335,11 @@ void netdev_rx_handler_unregister(struct net_device *dev)
 
 	ASSERT_RTNL();
 	RCU_INIT_POINTER(dev->rx_handler, NULL);
+	/* a reader seeing a non NULL rx_handler in a rcu_read_lock()
+	 * section has a guarantee to see a non NULL rx_handler_data
+	 * as well.
+	 */
+	synchronize_net();
I've thought about this too, but I wasn't sure we wanted two
synchronize_*() functions, as the caller does a synchronize as well.
That said, I think this is the more robust solution and it lets all
rx_handler() functions assume that their rx_handler_data is set. And it
removes the check from the fast path which outweighs an added
synchronization in the slow path.

Acked-by: Steven Rostedt <rostedt@goodmis.org>

Thanks!

-- Steve
 	RCU_INIT_POINTER(dev->rx_handler_data, NULL);
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help