Re: [PATCH] hid: usbhid: fix possible deadlock in __usbhid_submit_report

From: Alan Stern <stern@rowland.harvard.edu>
Date: 2012-04-23 15:42:12

On Sun, 22 Apr 2012, Ming Lei wrote:

On Sun, Apr 22, 2012 at 8:50 PM, Alan Stern [off-list ref] wrote:

On Sun, 22 Apr 2012, Ming Lei wrote:

Although the kerneldoc doesn't actually say so, it should be safe to
assume that usb_unlink_urb calls the completion routine directly _only_
in cases where the unlink succeeded.  (We could add this to the
kerneldoc.)

Therefore: If the URB completes with status other than -ECONNRESET then
you can safely take the lock for resubmission.  If the URB completes
with status == -ECONNRESET then you know it was unlinked, so you don't
need to take the lock -- the race has already been lost.

Does that solve your problem?

Not sure if that does work.

If the URB completes asynchronously after unlinking, its status is still
 -ECONNRESET, so extra race may be caused without holding the lock
because complete handler will access some global data.

That would be a completely separate race, right?  So maybe it can use a

Not sure, at least in both usbnet and usbhid cases, the lock is held before
usb_unlink_urb, and the same lock is to be acquired in the URB complete
handler.

different lock for protection -- and this other lock could be dropped
before usb_unlink_urb is called.

If the lock which is to be acquired in the URB complete handler is dropped
before calling usb_unlink_urb, one new submitted URB in complete handler
may be unlinked, as mentioned by Oliver already.

We are now talking about two locks.  One of them is held during the 
call to usb_unlink_urb; the completion handler does not acquire that 
lock if the URB's status is -ECONNRESET.  The other lock is dropped 
before usb_unlink_urb is called, so the completion handler can safely 
grab it.
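
To make the two-lock arrangement concrete, here is a rough single-threaded
userspace sketch.  It is not usbhid code: struct hid_sim, struct fake_lock
and every sim_* function are hypothetical stand-ins, the "locks" are plain
flags that assert on double acquisition (where a real spinlock would
deadlock), and sim_unlink() simply assumes the unlink succeeds and gives
the URB back synchronously with -ECONNRESET.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: asserts instead of deadlocking. */
struct fake_lock { bool held; };

static void lock_acquire(struct fake_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct hid_sim {
	struct fake_lock unlink_lock;	/* held across the unlink call */
	struct fake_lock data_lock;	/* dropped before the unlink call */
	int data_updates;		/* shared data touched by the handler */
	int resubmits;			/* resubmissions done by the handler */
};

/* Completion handler: data_lock is always safe to take here. */
static void sim_complete(struct hid_sim *s, int status)
{
	lock_acquire(&s->data_lock);
	s->data_updates++;
	lock_release(&s->data_lock);

	/* Unlinked URB: do not touch unlink_lock, do not resubmit. */
	if (status == -ECONNRESET)
		return;

	lock_acquire(&s->unlink_lock);
	s->resubmits++;			/* stand-in for resubmission */
	lock_release(&s->unlink_lock);
}

/* Stand-in for usb_unlink_urb() giving the URB back synchronously. */
static void sim_unlink(struct hid_sim *s)
{
	sim_complete(s, -ECONNRESET);
}

/* Timeout path: data_lock is released before unlink_lock is taken. */
static void sim_timeout(struct hid_sim *s)
{
	lock_acquire(&s->data_lock);
	/* ... inspect shared data ... */
	lock_release(&s->data_lock);

	lock_acquire(&s->unlink_lock);
	sim_unlink(s);			/* handler runs with unlink_lock held */
	lock_release(&s->unlink_lock);
}
```

The sketch only shows that neither lock is taken recursively on the
synchronous path.  It says nothing about the asynchronous case raised
elsewhere in this thread.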


On Mon, 23 Apr 2012, Oliver Neukum wrote:

If the URB completes asynchronously after unlinking, its status is still
 -ECONNRESET, so extra race may be caused without holding the lock
because complete handler will access some global data.

That is the race. And you need not invoke global data. The original
race opens again if you are submitting a new URB without the lock
held.
This is because we cannot be sure that the same URB is unlinked
only once. A subsequent timeout may kill the wrong URB if the
first is unlinked so that the callback really comes in interrupt.

But the basic idea is brilliant. It's just that the one way logical implication:
recursive direct call of the callback -> status == -ECONNRESET
is not strong enough. But that is very easy to fix. As we know whether
the callback is directly called or not, all we need to do is differentiate
the cases in urb->status, by introducing a new error code.

I don't like the idea of changing the status codes.  It would mean 
changing usb_kill_urb too.

Instead of changing return codes or adding locks, how about
implementing a small state machine for each URB?

	Initially the state is ACTIVE.

	When the URB times out, acquire the lock.  If the state is not
	equal to ACTIVE, drop the lock and return immediately (the URB
	is being unlinked concurrently).  Otherwise set the state to 
	UNLINK_STARTED, drop the lock, call usb_unlink_urb, and
	reacquire the lock.  If the state hasn't changed, set it back
	to ACTIVE.  But if the state has changed to UNLINK_FINISHED,
	set it to ACTIVE and resubmit.

	In the completion handler, grab the lock.  If the state
	is ACTIVE, resubmit.  But if the state is UNLINK_STARTED, 
	change it to UNLINK_FINISHED and don't resubmit.

This is a better approach, in that it doesn't make any assumptions 
regarding synchronous vs. asynchronous unlinks.  If you want, you could 
have two different ACTIVE substates, one for URBs which haven't yet 
been unlinked and one for URBs which have been.  Then you could avoid 
unlinking the same URB twice.
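
The state machine described above could be sketched roughly as follows.
This is a single-threaded userspace simulation, not driver code: struct
tracked_urb, the URB_* state names, fake_unlink() and fake_resubmit() are
all hypothetical, the lock is a flag that asserts on double acquisition
(where a real spinlock would deadlock), and the sync_giveback field is a
simulation knob that selects whether the unlink calls the completion
handler directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-URB states, named after the description above. */
enum urb_state { URB_ACTIVE, URB_UNLINK_STARTED, URB_UNLINK_FINISHED };

struct tracked_urb {
	enum urb_state state;
	bool locked;		/* stand-in for a spinlock */
	bool sync_giveback;	/* knob: unlink calls the handler directly */
	int resubmits;		/* number of resubmissions so far */
};

static void urb_lock(struct tracked_urb *u)
{
	assert(!u->locked);	/* a real spinlock would deadlock here */
	u->locked = true;
}

static void urb_unlock(struct tracked_urb *u)
{
	assert(u->locked);
	u->locked = false;
}

/* Stand-in for resubmitting the URB. */
static void fake_resubmit(struct tracked_urb *u)
{
	u->resubmits++;
}

/* Completion handler: resubmit only if no unlink is in progress. */
static void completion_handler(struct tracked_urb *u)
{
	urb_lock(u);
	if (u->state == URB_ACTIVE)
		fake_resubmit(u);
	else if (u->state == URB_UNLINK_STARTED)
		u->state = URB_UNLINK_FINISHED;
	urb_unlock(u);
}

/* Stand-in for usb_unlink_urb(): may or may not complete synchronously. */
static void fake_unlink(struct tracked_urb *u)
{
	if (u->sync_giveback)
		completion_handler(u);
}

/* Timeout path: the lock is never held across the unlink call. */
static void timeout_handler(struct tracked_urb *u)
{
	bool finished;

	urb_lock(u);
	if (u->state != URB_ACTIVE) {
		urb_unlock(u);	/* already being unlinked concurrently */
		return;
	}
	u->state = URB_UNLINK_STARTED;
	urb_unlock(u);

	fake_unlink(u);

	urb_lock(u);
	finished = (u->state == URB_UNLINK_FINISHED);
	u->state = URB_ACTIVE;
	if (finished)
		fake_resubmit(u);	/* the handler skipped it, so do it here */
	urb_unlock(u);
}
```

With sync_giveback set, the handler runs inside fake_unlink(), moves the
state to UNLINK_FINISHED, and the timeout path does the resubmission.  With
it clear, the timeout path restores ACTIVE and the later completion
resubmits as usual.  Either way the URB is resubmitted exactly once.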

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html