Re: [PATCH net-next v3 03/13] net: introduce ndo_set_rx_mode_async and dev_rx_mode_work
From: Stanislav Fomichev <hidden>
Date: 2026-03-25 17:35:00
Also in:
intel-wired-lan, linux-doc, linux-kselftest, linux-rdma, linux-wireless, lkml
On 03/25, Stanislav Fomichev wrote:
On 03/24, Jakub Kicinski wrote:quoted
On Tue, 24 Mar 2026 15:49:27 -0700 Stanislav Fomichev wrote:quoted
quoted
quoted
Not sure why cancel+release, maybe you're thinking about the unregister path? This is rtnl_unlock -> netdev_run_todo -> __rtnl_unlock + some extras. And the flush is here to plumb the addresses to the real devices before we return to the callers. Mostly because of the following things we have in the tests: # TEST: team cleanup mode lacp [FAIL] # macvlan unicast address not found on a slave Can you explain a bit more on the suggestion?Oh, I thought it's here for unregister! Feels like it'd be cleaner to add the flush in dev_*c_add() and friends? How hard would it be to identify the callers in atomic context?Not sure we can do it in dev_xc_add because it runs under rtnl :-( I currently do flush in netdev_run_todo because that's the place that doesn't hold rtnl. Otherwise flush will get stuck because the work handler grabs it...I was thinking of something a'la linkwatch. We can "steal" / "flush" the pending work inline. I guess linkwatch is a major source of races over the years... Does the macvlan + team problem still happens with the current implementation minus the flush? We are only flushing once so only pushing the addresses thru one layer of async callbacks.Yes, it does happen consistently when I remove the flush. It also happens with my internal v4, so I need to look again at what's going on. Not sure whether it's my internal regression or I was just sloppy/lucky (since you're correct in pointing out that we flush only once).
Hmm, the test does 'team -d' in the background. That's why it works for bonding, but not the teaming. I'll update the test to a bunch of 'ip' commands instead of starting a daemon..
Before I went down the workqueue route, I had a simple net_todo_list-like approach: `list_add_tail` on enqueue and `while(!list_empty) run_work()` on rtnl_unlock. This had a nice properly of tracking re-submissions (by checking whether the device's list_head is linked into the list or not) and it was relatively easy to do the recursive flush. Let me try get back to this approach and see whether it solves the flush? Not sure what wq buys us at this point.
Will still look into that, maybe something similar to the linkwatch as you mentioned.