Re: [PATCH V6 8/9] virtio: harden vring IRQ

From: Jason Wang <jasowang@redhat.com>
Date: 2022-06-15 01:41:46
Also in: linux-s390, lkml, virtualization

On Wed, Jun 15, 2022 at 12:46 AM Cristian Marussi
[off-list ref] wrote:

On Tue, Jun 14, 2022 at 03:40:21PM +0800, Jason Wang wrote:

quoted

On Mon, Jun 13, 2022 at 5:28 PM Michael S. Tsirkin [off-list ref] wrote:

quoted

Hi Jason,

quoted

On Mon, Jun 13, 2022 at 05:14:59PM +0800, Jason Wang wrote:

quoted

On Mon, Jun 13, 2022 at 5:08 PM Jason Wang [off-list ref] wrote:

quoted

On Mon, Jun 13, 2022 at 4:59 PM Michael S. Tsirkin [off-list ref] wrote:

quoted

On Mon, Jun 13, 2022 at 04:51:08PM +0800, Jason Wang wrote:

quoted

On Mon, Jun 13, 2022 at 4:19 PM Michael S. Tsirkin [off-list ref] wrote:

quoted

On Mon, Jun 13, 2022 at 04:07:09PM +0800, Jason Wang wrote:

quoted

On Mon, Jun 13, 2022 at 3:23 PM Michael S. Tsirkin [off-list ref] wrote:

quoted

On Mon, Jun 13, 2022 at 01:26:59PM +0800, Jason Wang wrote:

quoted

On Sat, Jun 11, 2022 at 1:12 PM Michael S. Tsirkin [off-list ref] wrote:

quoted

On Fri, May 27, 2022 at 02:01:19PM +0800, Jason Wang wrote:

quoted

This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:

1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
   that is used by some device such as virtio-blk
2) done only for PCI transport

The vq->broken is re-used in this patch for implementing the IRQ
hardening. The vq->broken is set to true during both initialization
and reset. And the vq->broken is set to false in
virtio_device_ready(). Then vring_interrupt() can check and return
when vq->broken is true. And in this case, switch to return IRQ_NONE
to make the interrupt core aware of such an invalid interrupt and
prevent an IRQ storm.
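
A minimal userspace sketch of the flag logic described above, for illustration only. The `model_*` names and the `MODEL_IRQ_*` values are hypothetical stand-ins, not the kernel's definitions, and the model leaves out the memory ordering and callback synchronization that the real code needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for IRQ_NONE / IRQ_HANDLED. */
enum model_irq_ret { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical model of a virtqueue; only the fields the sketch needs. */
struct model_vq {
    bool broken;            /* true from init/reset until the device is ready */
    unsigned int callbacks; /* how many times the driver callback ran */
};

/* Initialization: the queue starts out "broken". */
void model_vq_init(struct model_vq *vq)
{
    vq->broken = true;
    vq->callbacks = 0;
}

/* Models virtio_device_ready(): clear the flag on every queue. */
void model_device_ready(struct model_vq *vqs, size_t n)
{
    assert(vqs != NULL);
    for (size_t i = 0; i < n; i++)
        vqs[i].broken = false;
}

/* Models a device reset: set the flag on every queue again. */
void model_reset_device(struct model_vq *vqs, size_t n)
{
    assert(vqs != NULL);
    for (size_t i = 0; i < n; i++)
        vqs[i].broken = true;
}

/* Models vring_interrupt(): refuse to run the callback while broken. */
enum model_irq_ret model_vring_interrupt(struct model_vq *vq)
{
    if (vq->broken)
        return MODEL_IRQ_NONE; /* lets the IRQ core see the interrupt as unhandled */
    vq->callbacks++;
    return MODEL_IRQ_HANDLED;
}
```

In this model an interrupt that arrives before `model_device_ready()` or after `model_reset_device()` is reported as unhandled and never reaches the callback.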

The reason of using a per queue variable instead of a per device one
is that we may need it for per queue reset hardening in the future.

Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a024d7
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.

Cc: Thomas Gleixner <redacted>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Vineeth Vijayan <vneethv@linux.ibm.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>


Jason, I am really concerned by all the fallout.
I propose adding a flag to suppress the hardening -
this will be a debugging aid and a work around for
users if we find more buggy drivers.

suppress_interrupt_hardening ?

I can post a patch but I'm afraid if we disable it by default, it
won't be used by the users so there's no way for us to receive the bug
report. Or we need a plan to enable it by default.

It's rc2, how about waiting for one or two more rcs? Or it might be
better to simply warn instead of disabling it by default.

Thanks

I meant more like a flag in struct virtio_driver.
For now, could you audit all drivers which don't call _ready?
I found 5 of these:

drivers/bluetooth/virtio_bt.c

This driver seems to be fine, it doesn't use the device/vq in its probe().


But it calls hci_register_dev and that in turn queues all kind of
work. Also, can linux start using the device immediately after
it's registered?

So I think the driver is allowed to queue before DRIVER_OK.

it's not allowed to kick

Yes.

quoted

If yes,
the only side effect is the delay of the tx interrupt after DRIVER_OK
for a well behaved device.

your patches drop the interrupt though, it won't be just delayed.

For a well behaved device, it can only trigger the interrupt after DRIVER_OK.

So for virtio bt, it works like:

1) driver queues a buffer and kicks
2) driver sets DRIVER_OK
3) device starts to process the buffer
4) device sends a notification

The only risk is that the virtqueue could be filled before DRIVER_OK,
or anything I missed?
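
The sequence above can be sketched as a small single-threaded model. Everything here is hypothetical (`struct model_dev` and the `model_*` helpers are not kernel APIs), and it assumes a well behaved device that only looks at the queue once DRIVER_OK is set and then picks up whatever was queued earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one device with a single TX queue. */
struct model_dev {
    bool driver_ok;         /* DRIVER_OK as seen by the device */
    bool vq_broken;         /* driver-side flag, cleared when DRIVER_OK is set */
    unsigned int pending;   /* buffers queued by the driver, not yet processed */
    unsigned int delivered; /* notifications that reached the driver callback */
    unsigned int dropped;   /* notifications dropped because vq_broken was set */
};

void model_dev_init(struct model_dev *d)
{
    assert(d != NULL);
    d->driver_ok = false;
    d->vq_broken = true;
    d->pending = 0;
    d->delivered = 0;
    d->dropped = 0;
}

/* Step 1: the driver queues a buffer (allowed before DRIVER_OK). */
void model_queue_buffer(struct model_dev *d)
{
    d->pending++;
}

/* Step 2: the driver sets DRIVER_OK; the hardening flag is cleared with it. */
void model_set_driver_ok(struct model_dev *d)
{
    d->vq_broken = false;
    d->driver_ok = true;
}

/* Driver-side interrupt path: drop the notification while the flag is set. */
void model_notify(struct model_dev *d)
{
    if (d->vq_broken)
        d->dropped++;
    else
        d->delivered++;
}

/* Steps 3 and 4: a well behaved device processes and notifies only after DRIVER_OK. */
void model_device_poll(struct model_dev *d)
{
    if (!d->driver_ok)
        return;
    while (d->pending > 0) {
        d->pending--;
        model_notify(d);
    }
}
```

Under these assumptions the notification for a buffer queued before DRIVER_OK is delayed until after DRIVER_OK rather than dropped; a drop only happens in the model if something calls `model_notify()` while `vq_broken` is still set.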

btw, hci has an open and close method and we do rx refill in
hdev->open, so we're probably fine here.

Thanks


Sounds good. Now to audit the rest of them from this POV ;)

Adding maintainers.

quoted

 drivers/i2c/busses/i2c-virtio.c

It looks to me the device could be used immediately after
i2c_add_adapter() returns. So we probably need to add
virtio_device_ready() before that. Fortunately, there's no rx vq in
i2c and the callback looks safe if the callback is called before the
i2c registration and after virtio_device_ready().

quoted

 drivers/net/caif/caif_virtio.c

A networking device, RX is backed by vringh so we don't need to
refill. TX is backed by virtio and is not available until ndo_open. So
it's fine to let the core set DRIVER_OK after probe().

quoted

 drivers/nvdimm/virtio_pmem.c

It doesn't use interrupt so far, so it has nothing to do with the IRQ hardening.

But the device could be used by the subsystem immediately after
nvdimm_pmem_region_create(), which means a flush could be issued
before DRIVER_OK. We need virtio_device_ready() before that. We don't
have an RX virtqueue and the callback looks safe if it is called
after virtio_device_ready() but before the nvdimm region is created.

And it looks to me there's a race between the assignment of
provider_data and virtio_pmem_flush(). If the flush was issued before
the assignment we will end up with a NULL pointer dereference. This is
something we need to fix.
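
The window described above can be illustrated with a deterministic, single-threaded model. This is only a sketch: `struct model_region`, `model_flush()` and the two setup helpers are hypothetical, a flush is injected by hand at the point where the region becomes usable, and the ordered variant is one possible ordering rather than the fix that was actually applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the region object the flush path looks at. */
struct model_region {
    void *provider_data; /* stands in for the pointer the flush dereferences */
    bool visible;        /* true once the subsystem may issue flushes */
};

/* Returns 0 on success, -1 where the real code would hit a NULL dereference. */
int model_flush(const struct model_region *r)
{
    assert(r != NULL);
    if (!r->visible)
        return 0; /* nobody can reach the region yet */
    return r->provider_data ? 0 : -1;
}

/* Racy order: the region becomes usable before provider_data is assigned. */
int model_setup_racy(struct model_region *r, void *dev)
{
    r->provider_data = NULL;
    r->visible = true;       /* region created: flushes may arrive from now on */
    int rc = model_flush(r); /* a flush that lands inside the window */
    r->provider_data = dev;  /* assignment comes too late for that flush */
    return rc;
}

/* Ordered variant: assign provider_data first, then make the region usable. */
int model_setup_ordered(struct model_region *r, void *dev)
{
    r->provider_data = dev;
    r->visible = true;
    return model_flush(r);   /* the same early flush now finds valid data */
}
```

In the model the racy setup reports the failed flush while the ordered setup does not, which is the whole point: the pointer has to be in place before anything can call the flush.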

quoted

 arm_scmi

It looks to me the singleton device could be used by SCMI immediately after

        /* Ensure initialized scmi_vdev is visible */
        smp_store_mb(scmi_vdev, vdev);

So we probably need to do virtio_device_ready() before that. It has an
optional rx queue but the filling is done after the above assignment,
so it's safe. And the callback looks safe if a callback is triggered
after virtio_device_ready() but before the above assignment.

I wanted to give this series a go, testing it in the context of
SCMI, but it does not apply

- not on v5.18:

17:33 $ git rebase -i v5.18
17:33 $ git am ./v6_20220527_jasowang_rework_on_the_irq_hardening_of_virtio.mbx
Applying: virtio: use virtio_device_ready() in virtio_device_restore()
Applying: virtio: use virtio_reset_device() when possible
Applying: virtio: introduce config op to synchronize vring callbacks
Applying: virtio-pci: implement synchronize_cbs()
Applying: virtio-mmio: implement synchronize_cbs()
error: patch failed: drivers/virtio/virtio_mmio.c:345
error: drivers/virtio/virtio_mmio.c: patch does not apply
Patch failed at 0005 virtio-mmio: implement synchronize_cbs()

- nor on v5.19-rc2:

17:33 $ git rebase -i v5.19-rc2
17:35 $ git am ./v6_20220527_jasowang_rework_on_the_irq_hardening_of_virtio.mbx
Applying: virtio: use virtio_device_ready() in virtio_device_restore()
error: patch failed: drivers/virtio/virtio.c:526
error: drivers/virtio/virtio.c: patch does not apply
Patch failed at 0001 virtio: use virtio_device_ready() in virtio_device_restore()
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".

... what should I take as base?

It should have already been included in rc2, so there's no need to
apply patch manually.

Thanks

Thanks,
Cristian
