Thread (17 messages) 17 messages, 4 authors, 2020-01-14

Re: WARNING: bad unlock balance in sch_direct_xmit

From: Cong Wang <hidden>
Date: 2020-01-11 21:53:48

On Thu, Jan 9, 2020 at 10:02 PM Taehee Yoo [off-list ref] wrote:
ndo_get_lock_subclass() was used to calculate subclass which was used by
netif_addr_lock_nested().

-static inline void netif_addr_lock_nested(struct net_device *dev)
-{
-       int subclass = SINGLE_DEPTH_NESTING;
-
-       if (dev->netdev_ops->ndo_get_lock_subclass)
-               subclass = dev->netdev_ops->ndo_get_lock_subclass(dev);
-
-       spin_lock_nested(&dev->addr_list_lock, subclass);
-}

The most important thing about nested lock is to get the correct subclass.
nest_level was used as subclass and this was calculated by
->ndo_get_lock_subclass().
But, ->ndo_get_lock_subclass() didn't calculate correct subclass.
After "master" and "nomaster" operations, nest_level should be updated
recursively, but it didn't. So incorrect subclass was used.

team3 <-- subclass 0

"ip link set team3 master team2"

team2 <-- subclass 0
team3 <-- subclass 1

"ip link set team2 master team1"

team1 <-- subclass 0
team3 <-- subclass 1
team3 <-- subclass 1

"ip link set team1 master team0"

team0 <-- subclass 0
team1 <-- subclass 1
team3 <-- subclass 1
team3 <-- subclass 1

After "master" and "nomaster" operation, subclass values of all lower or
upper interfaces would be changed. But ->ndo_get_lock_subclass()
didn't update subclass recursively, lockdep warning appeared.
In order to fix this, I had two ways.
1. use dynamic keys instead of static keys.
2. fix ndo_get_lock_subclass().

The reason why I adopted using dynamic keys instead of fixing
->ndo_get_lock_subclass() is that the ->ndo_get_lock_subclass() isn't
a common helper function.
So, driver writers should implement ->ndo_get_lock_subclass().
If we use dynamic keys, ->ndo_get_lock_subclass() code could be removed.
The details you provide here are really helpful for me to understand
the reasons behind your changes. Let me think about this and see how
I could address both problems. This appears to be harder than I originally
thought.
What I fixed problems with dynamic lockdep keys could be fixed by
nested lock too. I think if the subclass value synchronization routine
works well, there will be no problem.
Great! We are on the same page.

Thanks for all the information and the reproducer too!
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help