Thread: 5 messages, 4 authors, 2025-05-29

Re: [syzbot] [net?] possible deadlock in rtnl_newlink

From: Joe Damato <hidden>
Date: 2025-05-29 23:54:16
Also in: lkml

On Thu, May 29, 2025 at 09:45:10AM -0700, Stanislav Fomichev wrote:
> On 05/29, Jakub Kicinski wrote:
> > On Thu, 29 May 2025 08:59:43 -0700 Stanislav Fomichev wrote:
> > > So this is internal WQ entry lock that is being reordered with rtnl
> > > lock. But looking at process_one_work, I don't see actual locks, mostly
> > > lock_map_acquire/lock_map_release calls to enforce some internal WQ
> > > invariants. Not sure what to do with it, will try to read more.
> >
> > Basically a flush_work() happens while holding rtnl_lock,
> > but the work itself takes that lock. It's a driver bug.
>
> e400c7444d84 ("e1000: Hold RTNL when e1000_down can be called") ?
> I think similar things (but wrt netdev instance lock) are happening
> with iavf: iavf_remove calls cancel_work_sync while holding the
> instance lock and the work callbacks grab the instance lock as well :-/

I think this is probably the same thread as:

 https://lore.kernel.org/netdev/CAP=Rh=OEsn4y_2LvkO3UtDWurKcGPnZ_NPSXK=FbgygNXL37Sw@mail.gmail.com/

I posted a response there about how to possibly avoid the problem
(based on my rough reading of the driver code), but am still
thinking more on this.
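
For illustration only, the pattern described in the quoted discussion
(waiting for a work item to finish while holding a lock that the work
item itself takes) can be modelled in userspace. The sketch below is a
Python analogy, not kernel or driver code: threading.Lock stands in for
rtnl_lock (or the netdev instance lock), a thread stands in for the
queued work, and waiting on an Event stands in for
flush_work()/cancel_work_sync(). All names are hypothetical, and
timeouts are used so the buggy ordering reports a failure instead of
hanging forever as a real deadlock would.

```python
import threading


def flush_completes(flush_while_holding_lock: bool, timeout: float = 0.5) -> bool:
    """Return True if the 'work item' finished before the flusher gave up.

    Hypothetical userspace model; 'rtnl' is only a stand-in for rtnl_lock.
    """
    rtnl = threading.Lock()
    done = threading.Event()

    def work() -> None:
        # The work callback needs the same lock the flusher may be holding.
        # A long timeout lets the thread exit cleanly once the lock is freed.
        if rtnl.acquire(timeout=timeout * 4):
            try:
                done.set()
            finally:
                rtnl.release()

    rtnl.acquire()
    worker = threading.Thread(target=work)
    worker.start()

    if flush_while_holding_lock:
        # Buggy ordering: wait for the work while still holding the lock.
        # The work cannot make progress, so this wait times out.
        finished = done.wait(timeout)
        rtnl.release()
    else:
        # Safe ordering: drop the lock first, then wait for the work.
        rtnl.release()
        finished = done.wait(timeout)

    worker.join()
    return finished
```

In this model, flush_completes(True) gives up without the work having
run, while flush_completes(False) sees the work finish. In the kernel
there is no timeout on the wait, which is why lockdep flags the
ordering as a possible deadlock.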