Re: [PATCH 11/11] usb: core: fix a race with usb_queue_reset_device()
From: Oliver Neukum <hidden>
Date: 2015-01-20 13:48:52
Also in:
lkml
On Tue, 2015-01-20 at 13:29 +0100, Olivier Sobrie wrote:
When usb_queue_reset_device() is called, it schedules a work item to
reset the USB interface. While the reset work is running, it can be
scheduled again (e.g. by the USB disconnect method of the driver).
Consider that the reset work is queued again while it is running,
and that the running work leads to a forced unbinding of the USB
interface (e.g. because a driver is bound to the interface and has
no pre_reset/post_reset methods; see usb_reset_device()).
In that case, usb_unbind_interface() gets called, and this function
calls usb_cancel_queued_reset(), which does nothing because the
flag "reset_running" is set to 1. The second reset work that has
been scheduled is therefore not cancelled.
Later, usb_reset_device() tries to rebind the interface.
If that fails, the USB interface context, which contains the
reset work struct, is freed, and the kernel will most likely crash
when the second reset work runs.
The following flow shows the problem:
 * usb_queue_reset_device()
 * __usb_queue_reset_device() <- If the reset work is queued after here, then
     reset_running = 1           it will never be cancelled.
     usb_reset_device()
       usb_forced_unbind_intf()
         usb_driver_release_interface()
           usb_unbind_interface()
             driver->disconnect()
               usb_queue_reset_device() <- second reset

That is the sledgehammer approach. Wouldn't it be better to guarantee
that usb_queue_reset_device() be a nop when reset_running==1?

             usb_cancel_queued_reset() <- does nothing because
                                          the flag reset_running
                                          is set
       usb_unbind_and_rebind_marked_interfaces()
         usb_rebind_intf()
           device_attach()
             driver->probe() <- fails (no more drivers hold a reference to
                                the usb interface)
     reset_running = 0
 * hub_event()
     usb_disconnect()
       usb_disable_device()
         kobject_release()
           device_release()
             usb_release_interface()
               kfree(intf) <- usb interface context is released
                              while we still have a pending reset
                              work that should be run
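
For illustration only, the flow above can be replayed in a small
userspace model. This is a sketch, not kernel code: struct intf_model
and the model_*() helpers are hypothetical names that only mimic the
bookkeeping described above (a "reset_running" flag and a pending work
item). The "guard" parameter is likewise an assumption; it models the
idea of making usb_queue_reset_device() a nop while reset_running is
set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reset bookkeeping of a USB interface. */
struct intf_model {
	bool reset_running;	/* set while the reset work executes */
	bool work_pending;	/* a reset work is queued but has not run yet */
};

/*
 * Models usb_queue_reset_device(). With guard set, a request made while
 * the reset is running is ignored. Returns true if work was queued.
 */
static bool model_queue_reset(struct intf_model *intf, bool guard)
{
	if (guard && intf->reset_running)
		return false;
	if (intf->work_pending)
		return false;	/* already queued */
	intf->work_pending = true;
	return true;
}

/* Models usb_cancel_queued_reset(): does nothing while the reset runs. */
static void model_cancel_queued_reset(struct intf_model *intf)
{
	if (!intf->reset_running)
		intf->work_pending = false;
}

/*
 * Replays the flow: the reset work starts, the driver's disconnect
 * method queues a second reset, and the unbind path tries to cancel it.
 * Returns true if a reset work is still pending once the first reset
 * has finished, i.e. at the point where the interface may be freed.
 */
static bool model_reset_flow(bool guard)
{
	struct intf_model intf = { false, false };

	model_queue_reset(&intf, guard);	/* first reset request */
	intf.work_pending = false;		/* worker picks up the work... */
	intf.reset_running = true;		/* ...and starts running it */

	model_queue_reset(&intf, guard);	/* driver->disconnect() */
	model_cancel_queued_reset(&intf);	/* usb_unbind_interface() */

	intf.reset_running = false;		/* reset work finishes */
	return intf.work_pending;
}
```

In this model, model_reset_flow(false) leaves a work item pending after
the reset has finished, which corresponds to the work that would later
run against freed memory; model_reset_flow(true) leaves nothing pending.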
To avoid this problem, we use a delayed work so that, if the reset
work is currently running, we can prevent further runs of the
__usb_queue_reset_device() work by using cancel_delayed_work().
Unfortunately it increases the size of the usb_interface structure...

Regards
Oliver

--
Oliver Neukum [off-list ref]