Thread: 13 messages, 3 authors, 2020-09-03

Re: [PATCH 0/5] media: uvcvideo: Fix race conditions

From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Date: 2020-08-30 21:36:48
Also in: linux-media, lkml

Hi Guenter,

On Sun, Aug 30, 2020 at 01:48:24PM -0700, Guenter Roeck wrote:
On 8/30/20 8:58 AM, Laurent Pinchart wrote:
On Sun, Aug 30, 2020 at 08:04:38AM -0700, Guenter Roeck wrote:
The uvcvideo code has no lock protection against USB disconnects
while video operations are ongoing. This has resulted in random
error reports, typically pointing to a crash in usb_ifnum_to_if(),
called from usb_hcd_alloc_bandwidth(). A typical traceback is as
follows.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

While this problem is rarely observed in the field, it is relatively easy
to reproduce by adding msleep() calls into the code.

I don't presume to claim that I found every issue, but this patch series
should fix at least the major problems.

The patch series was tested extensively on a Chromebook running chromeos-4.19
and on a Linux system running a v5.8.y based kernel.

I'll review each patch individually, but I think 2/5, 4/5 and 5/5 should
be handled in the V4L2 core, not the uvcvideo driver. Otherwise we would
have to replicate that logic in all drivers, while I think it can easily
be implemented in a generic fashion as previously discussed.

The problem is that the v4l2 core already does support locking. There is
a global lock, in struct video_device, a queue lock in struct v4l2_m2m_ctx,
and another queue lock in struct vb2_queue. However, all of those have
to be initialized from the driver. The uvcvideo driver uses its own locks and
does not set the lock pointers in the various generic structures. I was able
to figure out how to use the uvcvideo specific locks in the uvcvideo
driver, but all my attempts to initialize and use the generic locks failed.
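
As an illustration of the paragraph above, here is a minimal userspace
sketch of a core helper that serializes a driver operation only when the
driver has supplied a lock pointer. It is not kernel code: struct
model_vdev, model_core_call() and model_op_lock_held() are hypothetical
names invented for this sketch, and a pthread mutex stands in for the
kernel mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a device structure with an optional lock pointer. */
struct model_vdev {
	pthread_mutex_t *lock;	/* NULL when the driver does its own locking */
};

typedef int (*model_op)(struct model_vdev *vdev);

/* Model of a core helper: take the lock around the op only if one was set. */
int model_core_call(struct model_vdev *vdev, model_op op)
{
	int ret;

	if (vdev->lock)
		pthread_mutex_lock(vdev->lock);
	ret = op(vdev);
	if (vdev->lock)
		pthread_mutex_unlock(vdev->lock);
	return ret;
}

/* Example op: returns 1 if the device lock is held during the call, else 0. */
int model_op_lock_held(struct model_vdev *vdev)
{
	if (!vdev->lock)
		return 0;
	if (pthread_mutex_trylock(vdev->lock) == EBUSY)
		return 1;
	/* trylock succeeded, so nobody held the lock: release it again. */
	pthread_mutex_unlock(vdev->lock);
	return 0;
}
```

With the lock pointer left NULL, the op runs without any core-provided
serialization. That is the situation described above for a driver that
keeps its own locks and never sets the pointers in the generic structures.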

It may well be that the generic code isn't entirely clean - for example
I am not sure if the lock protection in v4l2_open() is complete since
it doesn't handle disconnects after checking if the video device is still
registered (and I don't really see the point of the second video_is_registered()
call in v4l2_open). However, that may just be a lack of understanding on my
side on how the code is supposed to work. Maybe the actual device open function
is expected to have its own protection against underlying hardware removal
and video device unregistration while opening the device.

[ Regarding the second call to video_is_registered() in v4l2_open():
  Add msleep(5000) between it and the call to the driver open function,
  disconnect the device during the sleep, and it will happily call the device
  open function on a non-registered video device. That is what patch 5/5 tries
  to fix for the uvcvideo driver.
  The same problem applies to other file operations in v4l2-dev.c: They all
  check if the video device is registered before calling the device
  specific code, but I don't really see the point of doing that because
  there is no protection against unregistration after the check was made
  and before/while the device specific code is running.
  Patch 4/5 tries to fix this for the uvcvideo driver.
  If that is a bug in the v4l2 code, I'll be happy to work on a fix,
  but the only generic fix I could think of would be to utilize the lock in
  struct video_device ... but that lock isn't initialized by the uvcvideo
  driver.
]
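
The check-then-call window described in the bracketed note can be closed
by doing the registration check and the driver call under one lock that
unregistration also takes. The following is a minimal userspace sketch of
that idea only, not the V4L2 implementation and not what the patches do:
struct model_dev, model_open(), model_driver_open() and model_unregister()
are hypothetical names, and a pthread mutex stands in for a kernel mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device model: one lock covers the flag and the driver call. */
struct model_dev {
	pthread_mutex_t lock;
	bool registered;
	int opens;		/* number of times the driver open ran */
};

/* Stand-in for the device specific open function. */
int model_driver_open(struct model_dev *dev)
{
	dev->opens++;
	return 0;
}

/* Check and call under the same lock, so unregistration cannot slip between. */
int model_open(struct model_dev *dev)
{
	int ret;

	pthread_mutex_lock(&dev->lock);
	if (!dev->registered)
		ret = -ENODEV;
	else
		ret = model_driver_open(dev);
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Unregistration takes the lock, so it waits for any open in progress. */
void model_unregister(struct model_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->registered = false;
	pthread_mutex_unlock(&dev->lock);
}
```

In this model a sleep placed between the check and the driver call would
only delay model_unregister(); it could no longer let the driver open run
on an unregistered device.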

In either case, I don't think my understanding of the interaction between
v4l2 and uvcvideo is good enough to make more invasive changes. I _think_
any generic improvement should start with refactoring the uvcvideo code to
use the v4l2 locking mechanism. However, from the exchange here, my
understanding is that this locking mechanism is not used on purpose. That
means we'll have a uvcvideo specific locking mechanism, period, and I don't
think it is even possible to solve the problem without utilizing this locking
mechanism.

Of course, it may as well be that I am completely off track and clueless.
After all, the first time I looked into this code was about two weeks ago.
So please bear with me if I talk nonsense.

It would be rather impolite to claim you're clueless, given that you
managed to write this patch series only two weeks after first looking
into the problem :-)

I'll try to prototype what I envision would be a good solution in the
V4L2 core. If stars align, I may even try to push it one level up, to
the chardev layer. Would you then be able to test it ?
----------------------------------------------------------------
Guenter Roeck (5):
      media: uvcvideo: Cancel async worker earlier
      media: uvcvideo: Lock video streams and queues while unregistering
      media: uvcvideo: Release stream queue when unregistering video device
      media: uvcvideo: Protect uvc queue file operations against disconnect
      media: uvcvideo: In uvc_v4l2_open, check if video device is registered

 drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
 drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
 drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 5 files changed, 93 insertions(+), 8 deletions(-)
-- 
Regards,

Laurent Pinchart