Re: net: usb: r8152: resume-reset deadlock
From: Sergey Senozhatsky <senozhatsky@chromium.org>
Date: 2026-01-28 05:58:51
Also in:
linux-usb, lkml
Subsystem:
networking drivers, the rest, usb networking drivers · Maintainers:
Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds
On (25/09/02 10:36), Paolo Abeni wrote:
On 8/26/25 11:55 AM, Sergey Senozhatsky wrote:quoted
I'm looking into the following deadlock <4>[ 1596.492101] schedule_preempt_disabled+0x15/0x30 <4>[ 1596.492170] __mutex_lock_common+0x256/0x490 <4>[ 1596.492209] __mutex_lock_slowpath+0x18/0x30 <4>[ 1596.492249] __rtl8152_set_mac_address+0x80/0x1f0 [r8152 (HASH:ce6f 4)] <4>[ 1596.492327] dev_set_mac_address+0x7d/0x150 <4>[ 1596.492395] rtl8152_post_reset+0x72/0x150 [r8152 (HASH:ce6f 4)] <4>[ 1596.492438] usb_reset_device+0x1ce/0x220 <4>[ 1596.492507] rtl8152_resume+0x99/0xc0 [r8152 (HASH:ce6f 4)] <4>[ 1596.492550] usb_resume_interface+0x3c/0xc0 <4>[ 1596.492619] usb_resume_both+0x104/0x150 <4>[ 1596.492657] ? usb_dev_suspend+0x20/0x20 <4>[ 1596.492725] usb_resume+0x22/0x110 <4>[ 1596.492763] ? usb_dev_suspend+0x20/0x20 <4>[ 1596.492800] dpm_run_callback+0x83/0x1d0 <4>[ 1596.492873] device_resume+0x35f/0x3d0 <4>[ 1596.492912] ? pm_verb+0xa0/0xa0 <4>[ 1596.492951] async_resume+0x1d/0x30 <4>[ 1596.493019] async_run_entry_fn+0x2b/0xd0 <4>[ 1596.493060] worker_thread+0x2ce/0xef0 <4>[ 1596.493101] ? cancel_delayed_work+0x2d0/0x2d0 <4>[ 1596.493170] kthread+0x16d/0x190 <4>[ 1596.493209] ? cancel_delayed_work+0x2d0/0x2d0 <4>[ 1596.493247] ? kthread_associate_blkcg+0x80/0x80 <4>[ 1596.493316] ret_from_fork+0x1f/0x30 rtl8152_resume() seems to be tricky, because it's under tp->control mutex, when it can see RTL8152_INACCESSIBLE and initiate a full device reset via usb_reset_device(), which eventually re-enters rtl8152, at which point it calls __rtl8152_set_mac_address() and deadlocks on tp->control (I assume) mutex.Decoding the above stack trace will tell for sure.
Apologies for the delay, somehow this slipped through.
quoted
__rtl8152_set_mac_address() has in_resume flag (added by Takashi in 776ac63a986d), which is set only in "reset_resume" case, wheres what we have is "resume_reset". Moreover in_resume flag is not for tp->control mutex, as far as I can tell, but for PM lock. When we set_mac_address from resume_reset, we lose in_resume flat, so not only we deadlock on tp->control mutex, but also we may (I guess) deadlock on the PM lock. Also, we still call rtl8152_resume() even in reset_resume, which I assume still can end up resetting device and hence in set_mac_address() in non-in_resume mode, potentially triggering the same deadlock that Takashi fixed. Well, unless I'm missing something. So I don't think I want to add another flag to mark "current owns tp->control mutex" so that we can handle re-entry. How about moving usb reset outside of tp->control scope? Is there any harm in doing that?According to commit 4933b066fefbee4f1d2d708de53c4ab7f09026ad, the usb_reset_device() call is intentionally under tp->control protection to vs double reset. At very least the proposed code here could end-up causing an unexpected reset when SELECTIVE_SUSPEND is set. I *guess* you should track the current status explicitly to restrict the reset at in the !test_bit(SELECTIVE_SUSPEND) scenario and explicitly avoid delayed reset during resume.
Something like this? ---
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 30f937527cd2..0a570b8ac7dd 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c@@ -8542,14 +8542,9 @@ static int rtl8152_system_resume(struct r8152 *tp) * reset. This is important because the usb_lock_device_for_reset() * that happens as a result of usb_queue_reset_device() will silently * fail if the device was suspended or if too much time passed. - * - * NOTE: The device is locked here so we can directly do the reset. - * We don't need usb_lock_device_for_reset() because that's just a - * wrapper over device_lock() and device_resume() (which calls us) - * does that for us. */ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags)) - usb_reset_device(tp->udev); + return -ENODEV; return 0; }
@@ -8674,6 +8669,14 @@ static int rtl8152_resume(struct usb_interface *intf) mutex_unlock(&tp->control); + /* + * This only happens for !SELECTIVE_SUSPEND and RTL8152_INACCESSIBLE, + * drop tp->control, because reset path can re-acquire it (e.g. for + * MAC address/policy update). + */ + if (ret == -ENODEV) + ret = usb_reset_device(tp->udev); + return ret; } ---
Ad very least you should add a fixes tag, a proper Sob and use canonical commit references.
Well, sure, when we have a fix that we agree on I'll submit it properly.