Re: [PATCH] net: usb: r8152: fix resume reset deadlock
From: Sergey Senozhatsky <senozhatsky@chromium.org>
Date: 2026-01-29 03:06:39
Also in: linux-usb, lkml
Hi Doug,

On (26/01/28 10:05), Doug Anderson wrote:
> > rtl8152 can trigger device reset during reset which
> > potentially can result in a deadlock:
> >
> >  **** DPM device timeout after 10 seconds; 15 seconds until panic ****
> >  Call Trace:
> >  <TASK>
> >  schedule+0x483/0x1370
> >  schedule_preempt_disabled+0x15/0x30
> >  __mutex_lock_common+0x1fd/0x470
> >  __rtl8152_set_mac_address+0x80/0x1f0
> >  dev_set_mac_address+0x7f/0x150
> >  rtl8152_post_reset+0x72/0x150
> >  usb_reset_device+0x1d0/0x220
> >  rtl8152_resume+0x99/0xc0
> >  usb_resume_interface+0x3e/0xc0
> >  usb_resume_both+0x104/0x150
> >  usb_resume+0x22/0x110
> >
> > The problem is that rtl8152 resume calls reset under
> > tp->control mutex while reset basically re-enters rtl8152
> > and attempts to acquire the same tp->control lock once
> > again.
> >
> > Reset INACCESSIBLE device outside of tp->control mutex
> > scope to avoid recursive mutex_lock() deadlock.
> >
> > Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
> > Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> > ---
> >  drivers/net/usb/r8152.c | 27 ++++++++++++++-------------
> >  1 file changed, 14 insertions(+), 13 deletions(-)
>
> This is effectively v2 of:
>
> https://lore.kernel.org/r/20241018141337.316807-1-danielgeorgem@chromium.org/
>
> ...and you've incorporated my feedback there. Thanks! :-)

Oh, nice :)
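
For anyone following along, the recursive-lock shape from the commit message can be modelled in user space. This is only a sketch under assumptions: `control`, `fake_post_reset()` and the two `resume_*` helpers are hypothetical stand-ins, not driver code, and a POSIX error-checking mutex is swapped in for the kernel mutex so that a relock reports EDEADLK instead of hanging the way the DPM trace shows.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for tp->control. An error-checking mutex is
 * used so that relocking from the same thread returns EDEADLK rather
 * than blocking forever.
 */
static pthread_mutex_t control;

static void control_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&control, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
}

/* Models the post-reset path, which takes the control lock itself. */
static int fake_post_reset(void)
{
	int rc = pthread_mutex_lock(&control);

	if (rc)
		return rc;	/* EDEADLK when the caller already holds it */
	pthread_mutex_unlock(&control);
	return 0;
}

/* Old shape: the reset runs while the lock is still held. */
static int resume_reset_under_lock(void)
{
	int rc;

	pthread_mutex_lock(&control);
	rc = fake_post_reset();
	pthread_mutex_unlock(&control);
	return rc;
}

/* Patched shape: drop the lock first, then reset. */
static int resume_reset_after_unlock(void)
{
	pthread_mutex_lock(&control);
	/* ... resume work that needs the lock would go here ... */
	pthread_mutex_unlock(&control);
	return fake_post_reset();
}
```

After `control_init()`, the first helper reports the self-deadlock and the second completes, which is the same reordering the patch applies to rtl8152_resume().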

> > @@ -8674,6 +8662,19 @@ static int rtl8152_resume(struct usb_interface *intf)
> >
> >  	mutex_unlock(&tp->control);
> >
> > +	/* If the device is RTL8152_INACCESSIBLE here then we should do a
> > +	 * reset. This is important because the usb_lock_device_for_reset()
> > +	 * that happens as a result of usb_queue_reset_device() will silently
> > +	 * fail if the device was suspended or if too much time passed.
> > +	 *
> > +	 * NOTE: The device is locked here so we can directly do the reset.
> > +	 * We don't need usb_lock_device_for_reset() because that's just a
> > +	 * wrapper over device_lock() and device_resume() (which calls us)
> > +	 * does that for us.
> > +	 */
> > +	if (system_resume && test_bit(RTL8152_INACCESSIBLE, &tp->flags))
> > +		ret = usb_reset_device(tp->udev);
> > +
> >  	return ret;
>
> Question when looking at the above again: have you thought about the
> consequences of clobbering `ret` above? I guess it's fine since
> rtl8152_system_resume() always returns 0, but it looks a little
> awkward. It's been long enough since I thought through all this code
> that I'm not 100% sure what it _should_ do if rtl8152_system_resume()
> was ever changed to return an error. Shouldn't it honor the existing
> error instead of trying to reset the device and clearing the error?

Right... so that "ret" thing: I thought about it and in the end I
just decided that returning an actual device reset error from resume
is still better than "return 0 but device is inaccessible" ("mission
failed successfully" kind of a thing). I'm not entirely sure what
would be the best way to handle this. Like you said, for the time
being, rtl8152_system_resume() always returns 0. Do we expect this
to change in the future? Probably not. On the other hand, if the
RTL8152_INACCESSIBLE bit is not cleared then user-space will
figure it out eventually (ioctl calls will fail, etc). So maybe I
can just keep the existing code and ignore the usb_reset_device()
return value.
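
To make the options concrete, here is a small user-space model of the tail of the resume path. It is a sketch under assumptions, not driver code: `reset_fn`, `reset_ok()`, `reset_fail()` and the three `finish_*` helpers are hypothetical names, `inaccessible` stands for the combined `system_resume && test_bit(RTL8152_INACCESSIBLE, ...)` check, and the "honor the error" variant is only one reading of the question above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for usb_reset_device(tp->udev). */
typedef int (*reset_fn)(void);

static int reset_ok(void)   { return 0; }
static int reset_fail(void) { return -EIO; }

/* As posted: the reset result replaces whatever ret held before. */
static int finish_as_posted(int ret, bool inaccessible, reset_fn reset)
{
	if (inaccessible)
		ret = reset();
	return ret;
}

/* One reading of the review question: keep an earlier error and
 * skip the reset entirely in that case.
 */
static int finish_honor_error(int ret, bool inaccessible, reset_fn reset)
{
	if (!ret && inaccessible)
		ret = reset();
	return ret;
}

/* Alternative floated in the reply: still reset, but drop its result. */
static int finish_ignore_reset(int ret, bool inaccessible, reset_fn reset)
{
	if (inaccessible)
		(void)reset();
	return ret;
}
```

The three variants differ only when an earlier error exists or the reset fails; while rtl8152_system_resume() keeps returning 0, the first two behave identically.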

> Also: I guess you've added the `system_resume` variable here, which
> is different than the earlier patch. It seems fine to me, though
> maybe you want to consistently use the `system_resume` variable
> earlier in the function too?

Sounds good!

> In any case, both of the above are pretty nitty, so I'm OK with:
>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>

Thanks!