Thread: 33 messages, 6 authors, 2021-11-25

Re: [PATCH V5 1/4] virtio_ring: validate used buffer length

From: Michael Ellerman <mpe@ellerman.id.au>
Date: 2021-11-24 01:30:38
Also in: lkml
Subsystem: the rest, virtio balloon, virtio core · Maintainers: Linus Torvalds, "Michael S. Tsirkin", David Hildenbrand, Jason Wang

"Michael S. Tsirkin" [off-list ref] writes:
> On Tue, Nov 23, 2021 at 10:25:20AM +0800, Jason Wang wrote:
>> On Tue, Nov 23, 2021 at 4:24 AM Halil Pasic [off-list ref] wrote:
>>> On Mon, 22 Nov 2021 14:25:26 +0800
>>> Jason Wang [off-list ref] wrote:
>>>> I think the fixes are:
>>>>
>>>> 1) fixing the vhost vsock
>>>> 2) use suppress_used_validation=true to let vsock driver to validate
>>>> the in buffer length
>>>> 3) probably a new feature so the driver can only enable the validation
>>>> when the feature is enabled.
>>>
>>> I'm not sure, I would consider a F_DEV_Y_FIXED_BUG_X a perfectly good
>>> feature. Frankly the set of such bugs is device implementation
>>> specific and it makes little sense to specify a feature bit
>>> that says the device implementation claims to adhere to some
>>> aspect of the specification. Also what would be the semantic
>>> of not negotiating F_DEV_Y_FIXED_BUG_X?
>>
>> Yes, I agree. Rethink of the feature bit, it seems unnecessary,
>> especially considering the driver should not care about the used
>> length for tx.
>>
>>> On the other hand I see no other way to keep the validation
>>> permanently enabled for fixed implementations, and get around the problem
>>> with broken implementations. So we could have something like
>>> VHOST_USED_LEN_STRICT.
>>
>> It's more about a choice of the driver's knowledge. For vsock TX it
>> should be fine. If we introduce a parameter and disable it by default,
>> it won't be very useful.
>>
>>> Maybe, we can also think of 'warn and don't alter behavior' instead of
>>> 'warn' and alter behavior. Or maybe even not having such checks on in
>>> production, but only when testing.
>>
>> I think there's an agreement that virtio drivers need more hardening,
>> that's why a lot of patches were merged. Especially considering the
>> new requirements came from confidential computing, smart NIC and
>> VDUSE. For virtio drivers, enabling the validation may help to
>>
>> 1) protect the driver from the buggy and malicious device
>> 2) uncover the bugs of the devices (as vsock did, and probably rpmsg)
>> 3) force the have a smart driver that can do the validation itself
>> then we can finally remove the validation in the core
>>
>> So I'd like to keep it enabled.
>>
>> Thanks
>
> Let's see how far we can get. But yes, maybe we were too aggressive in
> breaking things by default, a warning might be a better choice for a
> couple of cycles.

This series appears to break the virtio_balloon driver as well.

The symptom is hung task warnings, e.g.:

  INFO: task kworker/1:1:109 blocked for more than 614 seconds.
        Not tainted 5.16.0-rc2-gcc-10.3.0 #21
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:kworker/1:1     state:D stack:12496 pid:  109 ppid:     2 flags:0x00000800
  Workqueue: events_freezable update_balloon_size_func
  Call Trace:
  [c000000003cef7c0] [c000000003cef820] 0xc000000003cef820 (unreliable)
  [c000000003cef9b0] [c00000000001e238] __switch_to+0x1e8/0x2f0
  [c000000003cefa10] [c000000000f0a00c] __schedule+0x2cc/0xb50
  [c000000003cefae0] [c000000000f0a8fc] schedule+0x6c/0x140
  [c000000003cefb10] [c00000000095b6c4] tell_host+0xe4/0x130
  [c000000003cefba0] [c00000000095d234] update_balloon_size_func+0x394/0x3f0
  [c000000003cefc70] [c000000000178064] process_one_work+0x2c4/0x5b0
  [c000000003cefd10] [c0000000001783f8] worker_thread+0xa8/0x640
  [c000000003cefda0] [c000000000185444] kthread+0x1b4/0x1c0
  [c000000003cefe10] [c00000000000cee4] ret_from_kernel_thread+0x5c/0x64

Similar backtrace reported here by Luis:

  https://lore.kernel.org/lkml/YY2duTi0wAyAKUTJ@bombadil.infradead.org/

Bisect points to:

  # first bad commit: [939779f5152d161b34f612af29e7dc1ac4472fcf] virtio_ring: validate used buffer length

Setting suppress_used_validation in the virtio balloon driver "fixes" it, e.g.:

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index c22ff0117b46..a14b82ceebb2 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -1150,6 +1150,7 @@ static unsigned int features[] = {
 };
 
 static struct virtio_driver virtio_balloon_driver = {
+	.suppress_used_validation = true,
 	.feature_table = features,
 	.feature_table_size = ARRAY_SIZE(features),
 	.driver.name =	KBUILD_MODNAME,
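
For anyone following along, here is a minimal userspace sketch of the kind
of check under discussion and of the three behaviours mentioned in the
thread (reject, 'warn and don't alter behavior', suppressed). It is a model
only, not the kernel code: model_vq, model_add_buf, model_get_buf and the
used_len_policy enum are hypothetical names made up for illustration, and
the warn-only mode is the suggestion from the thread, not something that
exists in the series. The scenario it models (a buffer with no
device-writable part for which the device reports a nonzero used length) is
an assumption that would be consistent with the hang above, not something
confirmed here.

```c
/* Userspace model only; all names are hypothetical, not the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_QUEUE_SIZE 8

enum used_len_policy {
	USED_LEN_STRICT,     /* reject the buffer when the check fails */
	USED_LEN_WARN_ONLY,  /* log, but hand the buffer back anyway */
	USED_LEN_SUPPRESSED, /* driver opted out; no check at all */
};

struct model_vq {
	enum used_len_policy policy;
	/* Device-writable bytes recorded per descriptor head when queued. */
	unsigned int in_len[MODEL_QUEUE_SIZE];
	void *token[MODEL_QUEUE_SIZE];
	unsigned int warnings;
};

/* Queue a buffer: remember its token and how much the device may write. */
static void model_add_buf(struct model_vq *vq, unsigned int id,
			  void *token, unsigned int in_len)
{
	assert(id < MODEL_QUEUE_SIZE);
	vq->token[id] = token;
	vq->in_len[id] = in_len;
}

/*
 * Fetch a completed buffer. used_len is what the device claims it wrote.
 * Returns the token, or NULL if the policy rejects the completion.
 */
static void *model_get_buf(struct model_vq *vq, unsigned int id,
			   unsigned int used_len)
{
	assert(id < MODEL_QUEUE_SIZE);

	if (vq->policy != USED_LEN_SUPPRESSED && used_len > vq->in_len[id]) {
		vq->warnings++;
		fprintf(stderr, "used len %u is larger than in buflen %u\n",
			used_len, vq->in_len[id]);
		if (vq->policy == USED_LEN_STRICT)
			return NULL; /* the caller never sees its buffer */
	}
	return vq->token[id];
}
```

In the strict mode a caller that sleeps until model_get_buf() returns
non-NULL would wait forever for such a buffer, which is the shape of the
tell_host() hang in the trace above. The warn-only mode would still flag the
device while letting the driver make progress.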

cheers