Re: [net-next 10/11] net/mlx5e: Implement queue mgmt ops and single channel swap
From: Jakub Kicinski <kuba@kernel.org>
Date: 2025-01-27 19:27:46
On Fri, 24 Jan 2025 11:34:54 -0800 Saeed Mahameed wrote:
On 24 Jan 07:26, Jakub Kicinski wrote:quoted
quoted
Are you expecting drivers to hold netdev_lock internally? I was thinking something more scalable, queue_mgmt API to take netdev_lock, and any other place in the stack that can access "netdev queue config" e.g ethtool/netlink/netdev_ops should grab netdev_lock as well, this is better for the future when we want to reduce rtnl usage in the stack to protect single netdev ops where netdev_lock will be sufficient, otherwise you will have to wait for ALL drivers to properly use netdev_lock internally to even start thinking of getting rid of rtnl from some parts of the core stack.Agreed, expecting drivers to get the locking right internally is easier short term but messy long term. I'm thinking opt-in for drivers to have netdev_lock taken by the core. Probably around all ops which today hold rtnl_lock, to keep the expectations simple.Why opt-in? I don't see any overhead of taking netdev_lock by default in rtnl_lock flows.
We could, depends on how close we take the dev lock to the ndo vs to rtnl_lock. Some drivers may call back into the stack so if we're not careful enough we'll get flooded by static analysis reports saying that we had deadlocked some old Sun driver :( Then there are SW upper drivers like bonding which we'll need at the very least lockdep nesting allocations for. Would be great to solve all these issues, but IMHO not a hard requirement, we can at least start with opt in. Unless always taking the lock gives us some worthwhile invariant I haven't considered?