Re: Correct usage of dev_base_lock in 2020
From: Eric Dumazet <edumazet@google.com>
Date: 2020-11-30 20:43:56
On Mon, Nov 30, 2020 at 9:36 PM Vladimir Oltean [off-list ref] wrote:
> On Mon, Nov 30, 2020 at 09:29:15PM +0100, Eric Dumazet wrote:
> > On Mon, Nov 30, 2020 at 9:26 PM Vladimir Oltean [off-list ref] wrote:
> > > On Mon, Nov 30, 2020 at 12:21:29PM -0800, Stephen Hemminger wrote:
> > > > if device is in a private list (in bond device), the way to handle this is to use dev_hold() to keep a ref count.
> > >
> > > Correct, dev_hold is a tool that can also be used. But it is a tool that does not solve the general problem - only particular ones. See the other interesting callers of dev_get_stats in parisc, appldata, net_failover. We can't ignore that RTNL is used for write-side locking forever.
> >
> > dev_base_lock is used to protect the list of devices (eg for /proc/net/devices), so this will need to be replaced by something. dev_hold() won't protect the 'list' from changing under us.
>
> Yes, so as I was saying. I was thinking that I could add another locking mechanism, such as struct net::netdev_lists_mutex or something like that. A mutex does not really have a read-side and a write-side, but logically speaking, this one would. So as long as I take this mutex from all places that also take the write-side of dev_base_lock, I should get equivalent semantics on the read side as if I were to take the RTNL mutex. I don't even need to convert all instances of RTNL-holding, that could be spread out over a longer period of time. It's just that I can hold this new netdev_lists_mutex in new code that calls for_each_netdev and friends, and doesn't otherwise need the RTNL.
>
> Again, the reason why I opened this thread was that I wanted to get rid of dev_base_lock first, before I introduced the struct net::netdev_lists_mutex.

Understood, but really dev_base_lock can only be removed _after_ we convert all usages to something else (mutex based, and preferably not the global RTNL).

Focusing on dev_base_lock seems a distraction really.
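
For readers following the thread, here is a rough userspace sketch of the locking scheme being discussed: a per-namespace mutex taken by every path that modifies the device list, and also by readers that walk it. This is not kernel code and not a patch from this thread. It uses pthreads in place of the kernel mutex API, and every fake_* name is hypothetical; only the netdev_lists_mutex name comes from the proposal quoted above.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device (illustration only). */
struct fake_netdev {
	char name[16];
	unsigned long rx_packets;
	struct fake_netdev *next;
};

/* Hypothetical stand-in for struct net. The mutex name follows the
 * proposal in the thread; the rest is assumed for the sketch. */
struct fake_net {
	pthread_mutex_t netdev_lists_mutex;
	struct fake_netdev *dev_base_head;
};

/* Write side: every path that changes the list takes the mutex. */
static void fake_list_netdevice(struct fake_net *net, struct fake_netdev *dev)
{
	pthread_mutex_lock(&net->netdev_lists_mutex);
	dev->next = net->dev_base_head;
	net->dev_base_head = dev;
	pthread_mutex_unlock(&net->netdev_lists_mutex);
}

static void fake_unlist_netdevice(struct fake_net *net, struct fake_netdev *dev)
{
	struct fake_netdev **pp;

	pthread_mutex_lock(&net->netdev_lists_mutex);
	for (pp = &net->dev_base_head; *pp; pp = &(*pp)->next) {
		if (*pp == dev) {
			*pp = dev->next;
			dev->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&net->netdev_lists_mutex);
}

/* "Read" side: holds the same mutex, so the list cannot change
 * under the walker, and the walker is allowed to sleep. */
static unsigned long fake_sum_rx_packets(struct fake_net *net)
{
	struct fake_netdev *dev;
	unsigned long sum = 0;

	pthread_mutex_lock(&net->netdev_lists_mutex);
	for (dev = net->dev_base_head; dev; dev = dev->next)
		sum += dev->rx_packets;
	pthread_mutex_unlock(&net->netdev_lists_mutex);
	return sum;
}
```

The point of the sketch is only that readers and writers serialize on one lock that is narrower than a global one; unlike a rwlock, concurrent readers also exclude each other here.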