Thread (28 messages) 28 messages, 8 authors, 2020-12-10

Re: Correct usage of dev_base_lock in 2020

From: Vladimir Oltean <olteanv@gmail.com>
Date: 2020-11-30 19:04:48

On Mon, Nov 30, 2020 at 08:00:40PM +0100, Eric Dumazet wrote:

On 11/30/20 7:48 PM, Vladimir Oltean wrote:
quoted
On Mon, Nov 30, 2020 at 10:14:05AM -0800, Jakub Kicinski wrote:
quoted
On Mon, 30 Nov 2020 11:41:10 +0100 Eric Dumazet wrote:
quoted
quoted
So dev_base_lock dates back to the Big Kernel Lock breakup back in Linux 2.4
(ie before my time). The time has come to get rid of it.

The use is sysfs is because could be changed to RCU. There have been issues
in the past with sysfs causing lock inversions with the rtnl mutex, that
is why you will see some trylock code there.

My guess is that dev_base_lock readers exist only because no one bothered to do
the RCU conversion.
I think we did, a long time ago.

We took care of all ' fast paths' already.

Not sure what is needed, current situation does not bother me at all ;)
Perhaps Vladimir has a plan to post separately about it (in that case
sorry for jumping ahead) but the initial problem was procfs which is
(hopefully mostly irrelevant by now, and) taking the RCU lock only
therefore forcing drivers to have re-entrant, non-sleeping
.ndo_get_stats64 implementations.
Right, the end reason why I'm even looking at this is because I want to
convert all callers of dev_get_stats to use sleepable context and not
atomic. This makes it easier to gather statistics from devices that have
a firmware, or off-chip devices behind a slow bus like SPI.

Like Jakub pointed out, some places call dev_get_stats while iterating
through the list of network interfaces - one would be procfs, but not
only. These callers are pure readers, so they use RCU protection. But
that gives us atomic context when calling dev_get_stats. The naive
solution is to convert all those callers to hold the RTNL mutex, which
is the writer-side protection for the network interface lists, and which
is sleepable. In fact I do have a series of 8 patches where I get that
done. But there are some weirder cases, such as the bonding driver,
where I need to do this:

-----------------------------[cut here]-----------------------------
From 369a0e18a2446cda8ff52d72c02aa144ae6687ec Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Mon, 30 Nov 2020 02:39:46 +0200
Subject: [PATCH] net: bonding: retrieve device statistics under RTNL, not RCU

In the effort of making .ndo_get_stats64 be able to sleep, we need to
ensure the callers of dev_get_stats do not use atomic context.

The bonding driver uses an RCU read-side critical section to ensure the
integrity of the list of network interfaces, because the driver iterates
through all net devices in the netns to find the ones which are its
configured slaves. We still need some protection against an interface
registering or deregistering, and the writer-side lock, the RTNL mutex,
is fine for that, because it offers sleepable context.

We are taking the RTNL this way (checking for rtnl_is_locked first)
because the RTNL is not guaranteed to be held by all callers of
ndo_get_stats64, in fact there will be work in the future that will
avoid as much RTNL-holding as possible.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/bonding/bond_main.c | 18 +++++++-----------
 include/net/bonding.h           |  1 -
 2 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e0880a3840d7..1d44534e95d2 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3738,21 +3738,17 @@ static void bond_get_stats(struct net_device *bond_dev,
 			   struct rtnl_link_stats64 *stats)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
+	bool rtnl_locked = rtnl_is_locked();
 	struct rtnl_link_stats64 temp;
 	struct list_head *iter;
 	struct slave *slave;
-	int nest_level = 0;

+	if (!rtnl_locked)
+		rtnl_lock();
Gosh, do not do that.

Convert the bonding ->stats_lock to a mutex instead.

Adding more reliance to RTNL is not helping cases where
access to stats should not be blocked by other users of RTNL (which can be abused)
I can't, Eric. The bond_for_each_slave() macro needs protection against
net devices being registered and unregistered.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help