Thread (6 messages) 6 messages, 4 authors, 2025-06-02

Re: [PATCH iwl-net] e1000: Move cancel_work_sync to avoid deadlock

From: Stanislav Fomichev <hidden>
Date: 2025-05-30 15:07:31
Also in: intel-wired-lan, lkml

On 05/30, Joe Damato wrote:
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.

As reported by users and syzbot, a deadlock is possible due to lock
inversion in the following scenario:

CPU 0:
  - RTNL is held
  - e1000_close
  - e1000_down
  - cancel_work_sync (takes the work queue mutex)
  - e1000_reset_task

CPU 1:
  - process_one_work (takes the work queue mutex)
  - e1000_reset_task (takes RTNL)
nit: as Jakub mentioned in another thread, it seems more about the
flush_work waiting for the reset_task to complete rather than
wq mutexes (which are fake)?

CPU 0:
  - RTNL is held
  - e1000_close
  - e1000_down
  - cancel_work_sync
  - __flush_work
  - <wait here for the reset_task to finish>

CPU 1:
  - process_one_work
  - e1000_reset_task (takes RTNL)
  - <but cpu 0 already holds rtnl>

The fix looks good!

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help